#ceph IRC Log


IRC Log for 2013-03-21

Timestamps are in GMT/BST.

[0:00] * Kioob (~kioob@luuna.daevel.fr) Quit ()
[0:01] <gregaf> there's the quorum features; I'm not sure if they're exposed to clients or not
[0:01] <sjustlaptop> ugh
[0:01] <gregaf> a quick grep says they are in the map so we could expose them if we wanted to
[0:03] <gregaf> oh, no, they aren't
[0:03] <gregaf> damn
[0:03] <gregaf> just used in encoding the map
[0:04] <sjustlaptop> harumph
[0:04] <joshd> why does it need to check the features? wouldn't older monitors just ignore the new message, and the osd would timeout waiting for the reply and shutdown?
[0:04] <sjustlaptop> joshd: that's the other approach
[0:05] <sjustlaptop> seemed tidier to just check the feature bits though
[0:05] <gregaf> yeah
[0:06] <dmick> maybe you should invent a new message type to get features :)
[0:06] <gregaf> this is why I wanted one message type, since they're all clients to the monitor :p
[0:06] * tnt (~tnt@54.211-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[0:07] <gregaf> but passing daemon-specific information through that interface would be annoying, because who wants to manually encode and decode bufferlists
[0:07] <gregaf> sjust: exposing the connected monitor's feature bits through the MonClient is easy if we decide that's appropriate — and if we think this message setup is appropriate, then so is exposing the feature bits
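The feature-bit check sjustlaptop and gregaf are debating boils down to bitwise capability masks: the OSD ANDs the monitor quorum's advertised features against the bits it needs before sending a newer message type. A minimal sketch of that negotiation — the bit positions and constant names here are hypothetical, not Ceph's real assignments:

```python
# Hypothetical feature bits; real ones live in Ceph's feature headers.
CEPH_FEATURE_MONENC  = 1 << 10
CEPH_FEATURE_OSD_MSG = 1 << 11

def quorum_supports(quorum_features: int, required: int) -> bool:
    """True if every required feature bit is set in the quorum's mask."""
    return (quorum_features & required) == required

# An OSD could check the connected monitor's features before sending a
# new-style message, instead of timing out waiting for a reply.
quorum = CEPH_FEATURE_MONENC | CEPH_FEATURE_OSD_MSG
```

This is the "tidier" alternative sjustlaptop mentions: an explicit capability check up front rather than relying on the older monitor silently dropping the unknown message.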
[0:08] * The_Bishop (~bishop@2001:470:50b6:0:d5fb:59b7:82d8:4fd3) has joined #ceph
[0:10] <gregaf> if you examine the way the quorum_features work I think you might find that the featureset won't regress and that it's what the Monitor exports…but I'm not certain
[0:11] <gregaf> but now I need to go back to getting my brain smashed into a pulp :D
[0:11] * rturk is now known as rturk-away
[0:20] * stefunel (~stefunel@static. Quit (Server closed connection)
[0:20] * stefunel (~stefunel@static. has joined #ceph
[0:28] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[0:29] * jlogan1 (~Thunderbi@2600:c00:3010:1:8c00:81c9:796a:9e97) Quit (Ping timeout: 480 seconds)
[0:32] * rturk-away is now known as rturk
[0:32] * scalability-junk (uid6422@id-6422.tooting.irccloud.com) Quit (Server closed connection)
[0:33] * scalability-junk (uid6422@id-6422.tooting.irccloud.com) has joined #ceph
[0:58] * markbby (~Adium@ Quit (Quit: Leaving.)
[1:01] * jjgalvez (~jjgalvez@ Quit (Quit: Leaving.)
[1:08] * alram (~alram@ Quit (Quit: leaving)
[1:14] * jtang1 (~jtang@ has joined #ceph
[1:14] * Cube (~Cube@ Quit (Quit: Leaving.)
[1:18] * jskinner_ (~jskinner@ has joined #ceph
[1:18] * jskinner (~jskinner@ Quit (Read error: Connection reset by peer)
[1:22] * rturk is now known as rturk-away
[1:26] * jskinner (~jskinner@ has joined #ceph
[1:27] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[1:31] * jskinner_ (~jskinner@ Quit (Ping timeout: 480 seconds)
[1:40] * mcclurmc (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[1:41] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[1:41] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) Quit (Quit: Leaving.)
[1:42] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[1:43] * jtang1 (~jtang@ has joined #ceph
[1:43] * jtang1 (~jtang@ Quit ()
[1:44] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) Quit ()
[1:44] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[1:44] * noob2 (~cjh@ has joined #ceph
[1:44] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[1:44] <noob2> question for the ceph gurus
[1:45] * vata (~vata@2607:fad8:4:6:1a8:d976:3b01:35f5) Quit (Quit: Leaving.)
[1:45] <noob2> let's say I had 3 ceph nodes in one rack connected to one top-of-rack switch, and a client machine in the same rack. If the client is writing to the cluster in-rack and the rack switch dies, what will happen to my writes?
[1:46] <noob2> so i don't have redundant network and it dies all of a sudden
[1:46] <noob2> i know ceph will likely recover but will the client see the mount as corrupted?
[1:47] <gregaf> it'll be fine — if the client doesn't lose power or get rebooted it'll replay all its uncommitted writes to the cluster once networking is restored
[1:47] <noob2> ok
[1:47] <noob2> so the kernel rbd client is smart enough to handle this
[1:47] <gregaf> if the client does go away then nobody else will be able to tell; it'll just be like if the client had shut down without flushing its page cache
[1:48] <noob2> right
[1:48] <noob2> then at that point you have potential for lost information
[1:48] <noob2> wouldn't the client go away though since he's been separated from the cluster with the switch down?
[1:48] <gregaf> ah, kernel rbd might be a bit different in the specifics — but yes, the memory will remain marked as dirty and it'll keep trying to flush and eventually it will reconnect and the flushing will succeed
[1:49] <noob2> ok
[1:49] <noob2> is there a timeout on that?
[1:49] * BillK (~BillK@124-148-238-28.dyn.iinet.net.au) has joined #ceph
[1:49] <gregaf> don't think so
[1:49] <noob2> awesome
[1:49] <noob2> writes will just suspend until it is reconnected then
[1:49] <noob2> reads also
[1:49] <gregaf> I could be missing some in-kernel mechanism, but yeah, that's what should happen
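gregaf's description of the client behavior amounts to an unbounded retry loop over dirty pages: writes stay marked dirty and the client keeps trying to flush until the network comes back. A toy model of that write-back logic — the function names and structure are illustrative, not the kernel client's actual code:

```python
import time

def flush_dirty_pages(pages, send, retry_delay=0.0):
    """Keep retrying dirty pages until the cluster acknowledges them.

    Deliberately no timeout: pages stay dirty and flushing keeps being
    retried until connectivity returns, matching the behavior described
    above (writes simply suspend until reconnection).
    """
    dirty = list(pages)
    while dirty:
        remaining = []
        for page in dirty:
            if not send(page):      # send() fails while the switch is down
                remaining.append(page)
        dirty = remaining
        if dirty:
            time.sleep(retry_delay)
    return True
```

The important property is the absence of a give-up path: data is only lost if the client itself dies (reboot, power loss) while pages are still dirty.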
[1:49] <noob2> ok
[1:50] <noob2> i'll test it to be extra sure just in case :)
[1:50] <gregaf> elder (when he's around) or joshd might be able to provide more certainty :)
[1:50] <noob2> fire up clients and then reboot the network switch
[1:51] <joshd> you have to be a bit more careful with vms, whose kernels may cause problems after 120s of blocked i/o
[1:51] <noob2> ok
[1:52] <noob2> i don't think we're going to have vms hooked up to it
[1:52] <noob2> i'm almost certain of it :)
[1:54] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has left #ceph
[1:54] <noob2> so basically this is a non issue unless the client is rebooted while he has dirty pages
[1:55] <joshd> yup
[1:55] <gregaf> other than that, stuff that depends on IO going to disk and gets hung is going to be unhappy
[1:55] <gregaf> ;)
[1:55] <noob2> right
[1:55] <noob2> the applications are going to have to be built to handle that interruption
[1:56] <noob2> if that's the path we want to go down
[1:56] <noob2> i'm urging they look into spreading it across racks since ceph was built to do that :)
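noob2's suggestion of spreading replicas across racks maps to CRUSH placement rules: the failure domain is raised from host to rack so no two replicas of a PG land behind the same top-of-rack switch. A hedged sketch of such a crushmap rule (the ruleset number and bucket names are illustrative; the actual map must define the rack buckets):

```
rule replicated_across_racks {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type rack
    step emit
}
```

With a rule like this, a single switch failure leaves the other replicas reachable instead of blocking all I/O.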
[2:04] * noob2 (~cjh@ Quit (Quit: Leaving.)
[2:21] * sagelap (~sage@ Quit (Read error: Operation timed out)
[2:26] * sagelap (~sage@ has joined #ceph
[2:31] * LeaChim (~LeaChim@5e0d7853.bb.sky.com) Quit (Ping timeout: 480 seconds)
[2:35] * timmytwot (~TimmyTwoT@pool-100-40-57-129.prvdri.fios.verizon.net) has joined #ceph
[2:36] <timmytwot> http://www.youtube.com/watch?v=8WPafswrC64
[2:37] * timmytwot (~TimmyTwoT@pool-100-40-57-129.prvdri.fios.verizon.net) Quit (Remote host closed the connection)
[2:37] <dmick> what is with the children today?
[2:42] * eschnou (~eschnou@223.86-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[2:54] * eschnou (~eschnou@223.86-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[3:00] <jluis> 2013 just got better
[3:07] <dmick> as in "the calendar year"?
[3:09] <jluis> yeah, vanilla ice is back, and next year a new TMNT is being released
[3:10] <jluis> and I'm not in the business of believing in coincidences
[3:10] <jluis> *TMNT movie
[3:11] <dmick> this is an interesting definition of "better"
[3:13] <jluis> oh, I was trying to be sarcastic
[3:14] <jluis> I can no longer think of tmnt without thinking about both the movie and vanilla ice; they kind of ruined a big chunk of my childhood
[3:15] <janos> the new tmnt cartoon isn't super-bad, if that helps
[3:16] <janos> thanks for th ehorrid earworm, though
[3:16] <janos> *the horrid
[3:16] <janos> "ice ice baby"
[3:16] <jluis> it might, as long as they don't use the 'go ninja, go ninja, go' bit
[3:17] <janos> yeah i have no idea how anyone can take tmnt and do the things they did to it
[3:17] <jluis> http://www.youtube.com/watch?v=GFLGRidfFo4
[3:17] <jluis> still, the first movie was the best of the lot
[3:17] <janos> is that link going to hurt?
[3:17] <jluis> yes
[3:18] <janos> hahahah oh man
[3:18] <janos> AHHHH
[3:19] * janos is mesmerized
[3:20] <dmick> I never had any love for TMNT to be ruined, but that still hurts
[3:21] * masterpe_ (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[3:22] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (Read error: Connection reset by peer)
[3:22] <janos> dang i went back to look at the eastman and laird stuff. my mental image was better. more along the lines of frank miller
[3:23] <janos> mental image was actually somewhere between miller and akira
[3:24] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[3:25] <janos> gives me urge to draw again though!
[3:27] * Cube (~Cube@66-87-66-67.pools.spcsdns.net) has joined #ceph
[3:39] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[3:42] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[3:57] * noob2 (~cjh@pool-96-249-204-90.snfcca.dsl-w.verizon.net) has joined #ceph
[4:02] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[4:03] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 19.0.2/20130307023931])
[4:10] * The_Bishop (~bishop@2001:470:50b6:0:d5fb:59b7:82d8:4fd3) Quit (Ping timeout: 480 seconds)
[4:17] * The_Bishop (~bishop@2001:470:50b6:0:c9b0:8b83:73f8:f5a2) has joined #ceph
[4:39] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:43] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:44] * noob2 (~cjh@pool-96-249-204-90.snfcca.dsl-w.verizon.net) has left #ceph
[4:58] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[5:00] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[5:46] * The_Bishop (~bishop@2001:470:50b6:0:c9b0:8b83:73f8:f5a2) Quit (Ping timeout: 480 seconds)
[5:50] * ScOut3R (~ScOut3R@c83-249-245-183.bredband.comhem.se) has joined #ceph
[5:51] * ScOut3R (~ScOut3R@c83-249-245-183.bredband.comhem.se) Quit (Remote host closed the connection)
[5:55] * The_Bishop (~bishop@2001:470:50b6:0:c9b0:8b83:73f8:f5a2) has joined #ceph
[6:12] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:23] * dmick (~dmick@2607:f298:a:607:30ef:9d9d:28d6:4654) Quit (Quit: Leaving.)
[6:25] * nhm (~nh@184-97-180-204.mpls.qwest.net) has joined #ceph
[6:27] * nhm_ (~nh@184-97-137-60.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[6:43] * The_Bishop (~bishop@2001:470:50b6:0:c9b0:8b83:73f8:f5a2) Quit (Ping timeout: 480 seconds)
[6:52] * The_Bishop (~bishop@2001:470:50b6:0:d5fb:59b7:82d8:4fd3) has joined #ceph
[6:53] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) has joined #ceph
[7:07] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[7:12] * KindTwo (KindOne@h253.24.131.174.dynamic.ip.windstream.net) has joined #ceph
[7:18] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[7:18] * KindTwo is now known as KindOne
[7:23] * NaioN_ (stefan@andor.naion.nl) has joined #ceph
[7:24] * NaioN (stefan@andor.naion.nl) Quit (Read error: Connection reset by peer)
[7:42] * jks (~jks@3e6b5724.rev.stofanet.dk) Quit (Remote host closed the connection)
[7:46] * loicd (~loic@magenta.dachary.org) has joined #ceph
[7:50] * sleinen (~Adium@user-23-12.vpn.switch.ch) has joined #ceph
[7:53] * tnt (~tnt@82.195-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:08] * jks (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[8:10] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:29] * jks (~jks@3e6b5724.rev.stofanet.dk) Quit (Quit: jks)
[8:37] * nz_monkey_ (~nz_monkey@ Quit (Remote host closed the connection)
[8:40] * nz_monkey (~nz_monkey@ has joined #ceph
[8:44] * Cube (~Cube@66-87-66-67.pools.spcsdns.net) Quit (Quit: Leaving.)
[8:55] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[9:05] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[9:07] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: Why is the alphabet in that order? Is it because of that song?)
[9:09] * gerard_dethier (~Thunderbi@ has joined #ceph
[9:12] <joelio> morning! put the order through for our prod cluster yesterday, can't wait to build it :)
[9:17] * loicd (~loic@lvs-gateway1.teclib.net) has joined #ceph
[9:21] * tnt (~tnt@82.195-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:24] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[9:38] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:41] * xiaoxi (~xiaoxiche@jfdmzpr03-ext.jf.intel.com) Quit (Ping timeout: 480 seconds)
[9:43] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[9:46] * eschnou (~eschnou@ has joined #ceph
[9:47] * madkiss1 is now known as madkiss
[9:47] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[9:49] * Cube (~Cube@cpe-76-95-217-215.socal.res.rr.com) has joined #ceph
[9:49] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[9:51] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[9:52] * dosaboy (~gizmo@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[9:55] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[9:58] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[10:09] * ssejour (~sebastien@ has joined #ceph
[10:10] * mcclurmc (~mcclurmc@firewall.ctxuk.citrix.com) has joined #ceph
[10:10] <ssejour> hello all
[10:12] <ssejour> I hope it's the right place to ask this. I'm looking for a TCO comparison between a ceph architecture and a legacy NAS/SAN architecture. Are there any examples available somewhere? thanks
[10:12] * LeaChim (~LeaChim@5e0d7853.bb.sky.com) has joined #ceph
[10:15] * dosaboy (~gizmo@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Read error: Connection reset by peer)
[10:16] <absynth> quite certainly not
[10:24] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[10:31] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[10:32] * dosaboy (~gizmo@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[10:46] * janisg (~troll@ Quit (Ping timeout: 480 seconds)
[10:47] * sleinen (~Adium@user-23-12.vpn.switch.ch) Quit (Quit: Leaving.)
[10:47] * sleinen (~Adium@ has joined #ceph
[10:48] * janisg (~troll@ has joined #ceph
[10:52] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[10:53] <absynth> all services at dreamhost operating again, i read?
[10:55] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[10:56] <Gugge-47527> im really curious what went wrong :)
[10:57] <absynth> sounded like blackouts and ups overwhelmed or something
[10:57] <absynth> aren't blackouts quite common in california?
[11:00] <Gugge-47527> that part i got
[11:00] <Gugge-47527> but why did it take so long to get dreamobjects (ceph) online again :)
[11:01] <absynth> from what i know, DO is a petabyte-scale ceph instance
[11:01] <absynth> so there are *lots* of OSDs
[11:01] <absynth> and mons
[11:01] <absynth> i would presume after everything came back up, there was a massive recovery underway
[11:02] <absynth> or the radosgw cluster was just fried
[11:02] <absynth> oh, and i read in the status blog that network hardware was broken, so maybe the ceph backnet was done for
[11:05] <Gugge-47527> Assumptions are the root of all evil
[11:06] <absynth> no, women are
[11:06] <absynth> http://www.anvari.org/db/fun/Gender/Proof_that_Girls_are_Evil.jpg
[11:07] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[11:08] * Morg (b2f95a11@ircip2.mibbit.com) has joined #ceph
[11:08] <Gugge-47527> Women doing assumptions must be _really_ bad then :P
[11:10] * stacker100 (~stacker66@206.pool85-61-191.dynamic.orange.es) Quit (Ping timeout: 480 seconds)
[11:14] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[11:15] * Qt3n (~Qten@qten.qnet.net.au) has joined #ceph
[11:17] * ShaunR (~ShaunR@staff.ndchost.com) Quit ()
[11:19] <absynth> Gugge-47527: some detailed information is here: http://www.datacenterknowledge.com/archives/2013/03/20/power-outage-knocks-dreamhost-customers-offline/?utm-source=feedburner&utm-medium=feed&utm-campaign=Feed%3A+DataCenterKnowledge+%28Data+Center+Knowledge%29
[11:19] <absynth> i apologize for the link monster
[11:21] <absynth> and a statement by the ceo: http://www.dreamhoststatus.com/2013/03/19/power-disruption-affecting-us-west-data-center-irvine-ca/#more-14269
[11:23] <Gugge-47527> maybe im missing it, but neither ceph or object is mentioned there, so i cant really find the technical info about what made the startup of ceph/dreamobjects take so long
[11:23] <Gugge-47527> And that is what im curious about :)
[11:23] * leseb (~leseb@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[11:24] <absynth> i think the secret is the network part
[11:24] * psomas (~psomas@inferno.cc.ece.ntua.gr) Quit (Ping timeout: 480 seconds)
[11:25] <Gugge-47527> i appreciate your guesses, but im not really interested in guessing :)
[11:25] <Gugge-47527> im curious about the facts :)
[11:26] <Gugge-47527> just to learn from it, and try to secure myself about something similar
[11:27] <absynth> good luck with that. ceph is not designed to survive a full-blown data center power outage, as it is very reliant on low network latency.
[11:28] <absynth> and good luck finding that detailed information
[11:29] <Gugge-47527> I dont really expect them to give that information
[11:29] <Gugge-47527> Im stil curious though
[11:33] * scuttlemonkey (~scuttlemo@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) has joined #ceph
[11:33] * ChanServ sets mode +o scuttlemonkey
[11:33] * mib_e4oaoh (57ee8a78@ircip3.mibbit.com) has joined #ceph
[11:33] <mib_e4oaoh> hello
[11:33] <mib_e4oaoh> i have a question
[11:34] <tnt> yehudasa: ping
[11:34] <mib_e4oaoh> are there any rbd_cache recommendations for a rbd windows image file?
[11:34] <absynth> Off
[11:35] <absynth> what are you using for virtualization, qemu?
[11:35] <mib_e4oaoh> qemu/kvm
[11:35] <absynth> in current versions, the rbd_cache seems to cause massive issues due to lack of parallelization
[11:36] <absynth> i.e. if you have disk I/O inside a VM, all other i/o to that VM is stalled
[11:36] <absynth> we have rbd_cache off for all (not only windows) vms
[11:36] <mib_e4oaoh> ok
[11:36] <mib_e4oaoh> do you have poor windows performance too?
[11:37] <absynth> no, without rbd_cache it's fine
[11:37] <absynth> we have only few windows vms, though
[11:37] <mib_e4oaoh> so the setting have look like her for caching_off? <source protocol='rbd' name='test/test-v1:rbd_cache=false'>
[11:38] <mib_e4oaoh> i mean like this
[11:39] <absynth> yeah
[11:39] <absynth> looks ok for me
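For context, the source element mib_e4oaoh pasted sits inside a libvirt disk definition; a fuller illustrative fragment (pool/image names taken from the paste, the surrounding attributes assumed):

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='test/test-v1:rbd_cache=false'/>
  <target dev='vda' bus='virtio'/>
</disk>
```

Appending `:rbd_cache=false` to the image name passes the option through to librbd; libvirt's `<driver cache=...>` attribute is a separate knob for qemu's own caching layer.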
[11:39] <mib_e4oaoh> ok
[11:39] <mib_e4oaoh> don't understand why the performance is still poor :(
[11:40] <mib_e4oaoh> have installed windows 2008 standard server R2 with newest virtio drivers for the network card and harddrive
[11:40] <mib_e4oaoh> gave the machine enough ram and cpu's
[11:40] <mib_e4oaoh> and its still not slow
[11:40] <mib_e4oaoh> and it still slow :)
[11:44] <Gugge-47527> did you test what is slow, diskio or network io?
[11:44] <Gugge-47527> or both
[11:44] <mib_e4oaoh> both
[11:44] <Gugge-47527> how do you do the test?
[11:45] <mib_e4oaoh> if i install windows updates, it takes to long to download and to install it
[11:45] <absynth> yeah, we had that issue too
[11:45] <absynth> let me ask olli
[11:45] <mib_e4oaoh> service pack installation takes about 5 hours
[11:45] <Gugge-47527> mib_e4oaoh: dont test network io with something that writes to the disk :)
[11:45] <Gugge-47527> but it sounds like slow diskio allright :)
[11:46] <mib_e4oaoh> yeah :(
[11:46] <absynth> what qemu version?
[11:47] <mib_e4oaoh> 1.1.2+dfsg-5
[11:47] <absynth> try 1.4
[11:47] <absynth> that fixed those issues for us
[11:48] <absynth> Gugge-47527: is that an ASN in your nick?
[11:49] <mib_e4oaoh> hmm ok
[11:49] <mib_e4oaoh> i will try to update
[11:53] <Gugge-47527> absynth: yes
[11:54] <absynth> i see
[11:54] <absynth> where is Herning?
[11:54] <Gugge-47527> Middle of denmark
[11:54] <absynth> Sealand?
[11:54] <Gugge-47527> Nope
[11:54] <Gugge-47527> Jutland
[11:54] <absynth> ah
[12:00] * markbby (~Adium@ has joined #ceph
[12:03] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[12:07] * dosaboy (~gizmo@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Ping timeout: 480 seconds)
[12:08] * ScOut3R (~scout3r@5401D8E4.dsl.pool.telekom.hu) has joined #ceph
[12:12] * ScOut3R (~scout3r@5401D8E4.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[12:13] * ScOut3R (~scout3r@5401D8E4.dsl.pool.telekom.hu) has joined #ceph
[12:14] * xiaoxi (~xiaoxiche@ has joined #ceph
[12:14] <xiaoxi> hi
[12:14] * ScOut3R (~scout3r@5401D8E4.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[12:14] <xiaoxi> 0.59 released?
[12:15] * ScOut3R (~scout3r@5401D8E4.dsl.pool.telekom.hu) has joined #ceph
[12:17] * ScOut3R (~scout3r@5401D8E4.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[12:18] * joelio likes the look of docker.io
[12:19] * JohansGlock (~quassel@kantoor.transip.nl) Quit (Server closed connection)
[12:19] * JohansGlock (~quassel@kantoor.transip.nl) has joined #ceph
[12:29] * stacker100 (~stacker66@ has joined #ceph
[12:36] <xiaoxi> I cannot start the monitor for v0.59...any one also upgrade ceph to that version?
[12:36] <jluis> xiaoxi, was about to hit reply on your email
[12:36] <jluis> xiaoxi, can you get me some logs with debug info?
[12:37] <jluis> say, --debug-mon 20 ?
[12:37] <jluis> --debug-auth 10 would also be nice
[12:41] <jluis> btw, xiaoxi, you're not really upgrading from v0.58 are you?
[12:41] <jluis> if you were, there would be no reason for the monitor to be running a store conversion
[12:41] <xiaoxi> jluis:I am upgrading from 0.56.3,but actually,it's a clean cluster,I re mkcephfsed
[12:42] <xiaoxi> and I even delete the old monitor data by rm -f -r /data/mon.ceph1
[12:42] <xiaoxi> in the email ,you say it's auth related,but I disabled the cephx by auth cluster required = none
[12:42] <xiaoxi> auth service required = none
[12:42] <xiaoxi> auth client required = none
[12:42] <jluis> so the steps were 'delete /data/mon.ceph1', 'mkcephfs', 'upgrade to v0.58+' and then run the monitor?
[12:44] <jluis> xiaoxi, I say that it looks to be somewhere on the auth subsystem, but without more debug infos I can't really pinpoint where and why, or if it is the monitor itself that has gone nuts
[12:45] <xiaoxi> the setps were : apt-get remove ceph && apt-get autoremove && reboot && upgrade to 0.59 && rm /data/mon.ceph1 && mkcephfs --mkfs && /etc/init.d/ceph -a start
[12:45] <xiaoxi> jluis:OK, I can have the log you want now
[12:46] <absynth> i'd love sushi now
[12:46] <jluis> okay, the log would be nice to have; can you drop it somewhere? or send it to my email 'joao.luis@inktank.com' or something of the sorts
[12:47] * jluis is now known as joao
[12:47] <xiaoxi> ok,just now
[12:49] <xiaoxi> 2013-03-21 11:50:38.517134 7f02ec1ee780 -1 end of key=val line 22 reached, no "=val" found...missing =?
[12:49] <xiaoxi> 2013-03-21 11:50:38.652541 7eff53a60780 -1 Errors while parsing config file!
[12:49] <xiaoxi> 2013-03-21 11:50:38.652549 7eff53a60780 -1 end of key=val line 21 reached, no "=val" found...missing =?
[12:49] <xiaoxi> 2013-03-21 11:50:38.652550 7eff53a60780 -1 end of key=val line 22 reached, no "=val" found...missing =?
[12:49] <xiaoxi> 2013-03-21 11:50:38.656290 7fe83bcde780 -1 Errors while parsing config file!
[12:49] <xiaoxi> 2013-03-21 11:50:38.656293 7fe83bcde780 -1 end of key=val line 21 reached, no "=val" found...missing =?
[12:49] <xiaoxi> 2013-03-21 11:50:38.656294 7fe83bcde780 -1 end of key=val line 22 reached, no "=val" found...missing =?
[12:49] <xiaoxi> the log is full of this
[12:49] <joao> what did you add to your ceph.conf?
[12:51] * scuttlemonkey (~scuttlemo@HSI-KBW-46-237-220-11.hsi.kabel-badenwuerttemberg.de) Quit (Ping timeout: 480 seconds)
[12:52] <xiaoxi> [mon]
[12:52] <xiaoxi> mon data = /data/$name
[12:52] <xiaoxi> debug mon 20
[12:52] <xiaoxi> debug auth 20
[12:52] <xiaoxi> should i write debug mon=20?
[12:52] <joao> missing the = :)
[12:52] <joao> debug mon = 20
[12:52] <joao> debug auth = 20
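The parse errors xiaoxi pasted come from ini-style lines missing the `=`; with joao's fix applied, the `[mon]` section of ceph.conf would read:

```ini
[mon]
    mon data = /data/$name
    debug mon = 20
    debug auth = 20
```

Every `key = value` line in ceph.conf needs the `=`; a bare `debug mon 20` is what triggers the repeated `no "=val" found` errors in the log above.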
[12:55] <xiaoxi> sent
[12:57] <joao> nope, that's definitely in the monitor
[12:57] <joao> looking, thanks
[12:58] <xiaoxi> is 0.59 released? I cannot find a release note for it :)
[12:58] <joao> not officially I think
[12:58] <xiaoxi> but I have a happy experience with 0.58 (in a different setup)
[12:59] <xiaoxi> well, but the develop-testing is really provide package of that version
[13:10] <joao> xiaoxi, I'm not really sure about this, but did you by any chance only disable cephx after mkcephfs?
[13:11] <joao> and I'm guessing you only disabled it on the monitors?
[13:11] <xiaoxi> [global]
[13:11] <xiaoxi> ; allow ourselves to open a lot of files
[13:11] <xiaoxi> max open files = 131072
[13:11] <xiaoxi> auth cluster required = none
[13:11] <xiaoxi> auth service required = none
[13:11] <xiaoxi> auth client required = none
[13:11] <xiaoxi> ; set log file
[13:11] <xiaoxi> log file = /var/log/ceph/$name.log
[13:11] <xiaoxi> ; set up pid files
[13:11] <xiaoxi> pid file = /var/run/ceph/$name.pid
[13:11] <xiaoxi> I think I disabled all
[13:11] * lupine_85 (~lupine@eboracum.office.bytemark.co.uk) has joined #ceph
[13:12] <xiaoxi> sent my ceph.conf to you
[13:12] <joao> that would disable it for everyone, yes, but did you change that file after the upgrade? were the osds restarted with that new configuration ?
[13:13] * stacker100 (~stacker66@ Quit (Read error: Operation timed out)
[13:13] <joao> afaict, there's something nasty going on in the monitor that appears to be caused by someone contacting it with auth enabled
[13:13] <xiaoxi> no, I even never change it..this is the ceph.conf I used for 0.56.3
[13:13] <joao> ah, okay
[13:13] <xiaoxi> but I would like to restart all the daemon to see what happened
[13:14] <joao> can I bother you to spin the monitor again, but this time also with 'debug ms = 1' and 'debug paxos = 10' ?
[13:14] <xiaoxi> joao:well,but it should be a bug :) you can always break someone's cluster by using a misconfigured ceph
[13:14] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[13:15] <joao> xiaoxi, it is a bug, I'm just not sure where or why
[13:15] <xiaoxi> joao:my pleasure, but I have to go out for about 1 hour and will get back to you :) my dog is complaining...
[13:15] <joao> asserting certainly is not a feature :p
[13:15] <janos> that is no way to speak about a wife
[13:15] <joao> eheh, sure
[13:16] <joao> I'll be here most of the day anyway
[13:17] * janisg (~troll@ Quit (Read error: Operation timed out)
[13:17] * janisg (~troll@ has joined #ceph
[13:20] * loicd (~loic@lvs-gateway1.teclib.net) Quit (Ping timeout: 480 seconds)
[13:24] <xiaoxi> joao:sent..I have to go...the dog is barking...
[13:24] <joao> cool thanks
[13:28] <joao> xiaoxi, whenever you're back, the log appears to have turned into a blob of data instead of plain text :\
[13:31] * xiaoxi (~xiaoxiche@ Quit (Remote host closed the connection)
[13:36] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[13:38] * markbby (~Adium@ Quit (Remote host closed the connection)
[13:46] * sagewk (~sage@2607:f298:a:607:14f7:6b1e:46e1:933a) Quit (Server closed connection)
[13:47] * sagewk (~sage@2607:f298:a:607:2055:f0c4:52ca:ba9) has joined #ceph
[13:54] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[14:12] * xiaoxi (~xiaoxiche@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[14:12] <xiaoxi> hi joao
[14:12] <xiaoxi> I am back
[14:12] <joao> cool
[14:14] <joao> looks like your last log turned into a big pile of data instead of text :\
[14:14] <joao> joao@tardis:~/inktank/issues/mon-crash/xiaoxi/2013-03-21$ tar xf ceph_log.tar
[14:14] <joao> joao@tardis:~/inktank/issues/mon-crash/xiaoxi/2013-03-21$ file mon.ceph1.log
[14:14] <joao> mon.ceph1.log: data
[14:14] <joao> could you please resend it again?
[14:15] * BManojlovic (~steki@ has joined #ceph
[14:18] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[14:18] * stacker100 (~stacker66@206.pool85-61-191.dynamic.orange.es) has joined #ceph
[14:28] <xiaoxi> sent, see if it's right... I was just in too much of a hurry to go out that time, my dog wanted to go out and play with his dog friends :)
[14:28] <joao> eheh
[14:28] <joao> I wish my dog was somewhat sociable
[14:33] <xiaoxi> your dog doesn't enjoy playing with other dogs outside?
[14:37] <xiaoxi> joao:seems everytime I try ceph -s, the mon will die,and from the log it seems client related..
[14:37] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) has joined #ceph
[14:37] <joao> xiaoxi, my dog doesn't get along very well with other dogs
[14:39] <joao> xiaoxi, how does the monitor die when you try 'ceph -s'? same way?
[14:39] <xiaoxi> yes
[14:39] <joao> what does 'ceph -v' report?
[14:40] <xiaoxi> root@ceph-1:/var/log/ceph# ceph -v
[14:40] <xiaoxi> ceph version 0.59 (cbae6a435c62899f857775f66659de052fb0e759)
[14:43] <xiaoxi> well, if no one touches the monitor, it seems good, but once I try to start an OSD or run ceph -s, the monitor will die
[14:43] <joao> okay, I'm just going to grab something for lunch and be back real quick
[14:43] <joao> btw, this is a single-monitor setup right?
[14:43] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:43] <xiaoxi> yes
[14:43] * loicd (~loic@ has joined #ceph
[14:44] <joao> kay thanks
[14:44] <joao> brb
[14:44] <xiaoxi> root@ceph-1:/var/log/ceph# dpkg -l | grep ceph
[14:44] <xiaoxi> ii ceph 0.59-1quantal amd64 distributed storage and file system
[14:44] <xiaoxi> ii ceph-common 0.59-1quantal amd64 common utilities to mount and interact with a ceph storage cluster
[14:44] <xiaoxi> ii ceph-fs-common 0.59-1quantal amd64 common utilities to mount and interact with a ceph file system
[14:44] <xiaoxi> ii ceph-fuse 0.59-1quantal amd64 FUSE-based client for the Ceph distributed file system
[14:44] <xiaoxi> ii ceph-mds 0.59-1quantal amd64 metadata server for the ceph distributed file system
[14:44] <xiaoxi> ii libcephfs1 0.59-1quantal amd64 Ceph distributed file system client library
[14:44] <xiaoxi> I think I have every thing 0.59 already..
[14:53] <xiaoxi> tried to reinstall ceph, but it doesn't solve the problem
[14:57] * PerlStalker (~PerlStalk@ has joined #ceph
[15:14] <absynth> just a wild shot, do you have any stray ceph binaries somewhere? in /usr/local/bin or so?
[15:15] <joao> xiaoxi, did you turn down 'debug mon' and 'debug auth' during your last run (when you increased 'debug paxos' and 'debug ms')?
[15:17] <xiaoxi> yes, I disable it
[15:17] <xiaoxi> do you also want it ?
[15:17] <joao> yeah, I wanted all four debug messages :p
[15:18] <joao> if you can manage it, I'd appreciate; otherwise I'll figure it out using the two logs
[15:19] <xiaoxi> debug ms = 1
[15:19] <xiaoxi> debug paxos = 10
[15:19] <xiaoxi> debug mon = 10
[15:19] <xiaoxi> debug auth = 10
[15:19] <xiaoxi> does it look good to you?
[15:25] <joao> crank up mon and auth to 20?
[15:27] <sstan> Hi, I've got a strange error :
[15:27] <sstan> filestore(/var/lib/ceph/osd/ceph-0) _detect_fs unable to create /var/lib/ceph/osd/ceph-0/xattr_test: (28) No space left on device
[15:28] <sstan> however the device isn't running out of space
[15:30] <sstan> I remounted /var/lib/ceph/osd/ceph-0 and it's now up again ...
[15:30] * jlogan (~Thunderbi@2600:c00:3010:1:8c00:81c9:796a:9e97) has joined #ceph
[15:32] <sstan> any ideas ?
[15:34] <xiaoxi> maybe your filesystem was in an inconsistent state so it became read-only; after the remount it's back to normal ... just a guess
[15:35] <absynth> i have no idea, but i admire the problem ;)
[15:35] <absynth> 100% of inodes full or something like that?
[15:39] * loicd (~loic@ Quit (Quit: Leaving.)
[15:39] * markbby (~Adium@ has joined #ceph
[15:44] * xiaoxi (~xiaoxiche@jfdmzpr05-ext.jf.intel.com) Quit (Remote host closed the connection)
[15:44] * wschulze (~wschulze@rrcs-108-176-12-3.nyc.biz.rr.com) has joined #ceph
[15:45] * wschulze (~wschulze@rrcs-108-176-12-3.nyc.biz.rr.com) Quit ()
[15:49] * l0nk (~alex@ Quit (Quit: Leaving.)
[15:49] * l0nk (~alex@ has joined #ceph
[15:55] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[15:56] <sstan> absynth, no , just (28) No space left on device
[15:56] * tchmnkyz (~jeremy@0001638b.user.oftc.net) has joined #ceph
[15:56] <sstan> on 2 osds / 3
[15:56] <sstan> third one was okay
[15:57] <tchmnkyz> hey guys, i am having a problem with rbd when i try to create a image on my cluster, it will not allow me to use the --format 2 even though it says it is a valid way to do it in the man pages. Am i doing something wrong?
[15:57] <sstan> what kernel version are you using ?
[15:58] <tchmnkyz> 2.6.32-5-amd64
[15:58] <sstan> type 2 can be useful only if you have kernel >= 3.8 (?)
[15:59] <tchmnkyz> i see
[15:59] <tchmnkyz> so basically users of debian based systems are SOL on type 2
[15:59] <tchmnkyz> until debian moves into a 3.x series kernel
[16:00] <sstan> most people use ubuntu for CEPH
[16:00] <tchmnkyz> ya i dont like ubuntu
[16:00] <tchmnkyz> lol
[16:01] <tchmnkyz> and being that i already have 272 tb in this cluster i would rather not format and move over to debian
[16:01] <tchmnkyz> err ubuntu
[16:01] * wer (~wer@168.sub-70-192-194.myvzw.com) has joined #ceph
[16:01] <sstan> 272 TB on a ceph cluster?
[16:02] <tchmnkyz> yea about to expand out to 1.5 pb
[16:02] <sstan> as long as you do not format your OSD it should be fine
[16:02] <tchmnkyz> i am using ceph as a backend storage to a proxmox cluster
[16:02] <tchmnkyz> that is true
[16:02] <tchmnkyz> i did not think about it that way
[16:02] <tchmnkyz> ok
[16:03] <sstan> so yeah as far as I know .. no type 2 available on such a kernel
[16:03] <tchmnkyz> i guess i am going to go format now
[16:03] <tchmnkyz> thnx
[16:03] <Gugge-47527> format 2 isnt even available on 3.8
[16:04] <Gugge-47527> but you should be able to create them fine with rbd, just not map them :)
[16:04] <sstan> but what's the use?
[16:05] <Gugge-47527> no idea :)
[16:05] <Gugge-47527> i guess you could use it for something with rbd-fuse
[16:06] <Gugge-47527> but i cant imagine what :)
[16:06] <Gugge-47527> basically format 2 images are not useful with the kernel client, but they are with qemu
[16:07] <Gugge-47527> i dont know if the 3.9-rc kernels support format 2 images though
[16:07] <tchmnkyz> yes that is what i am using them with
[16:07] <tchmnkyz> i am using them for proxmox vm disks
[16:07] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[16:07] <Gugge-47527> if you use qemu, you should be able to use format 2 images fine
[16:07] <tchmnkyz> and proxmox hard coded the format to type 2 when it creates my images
[16:07] * denken (~denken@dione.pixelchaos.net) Quit (Server closed connection)
[16:07] <tchmnkyz> so it is a PITA
[16:07] <tchmnkyz> lol
[16:07] * denken (~denken@dione.pixelchaos.net) has joined #ceph
[16:08] <tchmnkyz> ok i am going to go format my nodes now thnx
[16:08] <Gugge-47527> why do you need to map them with rbd?
[16:08] <Gugge-47527> and why would you format your nodes, what would that help you?
[16:08] <tchmnkyz> taking them from debian to ubuntu
[16:08] <Gugge-47527> why?
[16:08] <sstan> wait ...
[16:08] <tchmnkyz> rbd: create error: (22) Invalid argument when i use --format 2
[16:09] <tnt> tchmnkyz: what do you OSD look like ? (hw wise)
[16:09] <tchmnkyz> when i use --format 1 it works fine
[16:09] <sstan> wait that should work no matter what kernel
[16:09] * stacker100 (~stacker66@206.pool85-61-191.dynamic.orange.es) Quit (Ping timeout: 480 seconds)
[16:09] <tchmnkyz> they are dual Xeon 5620, 48 GB ram with a hardware raid50 55tb array
[16:09] <Gugge-47527> tchmnkyz: create does not use the kernel, and should work fine
[16:09] <sstan> ... that doesn't matter
[16:09] <tchmnkyz> ok then why would i not be able to create type 2 images
[16:10] <tchmnkyz> ceph version 0.56.3
[16:10] <Gugge-47527> paste the command you run, and the error somewhere
[16:11] <tchmnkyz> http://pastebin.com/LycsW781
[16:11] <sstan> --image-format
[16:11] <tchmnkyz> not --format?
[16:11] <sstan> format is for "Specifies output formatting "
[16:11] <sstan> ...
[16:11] * jskinner (~jskinner@ has joined #ceph
[16:11] <sstan> http://ceph.com/docs/master/man/8/rbd/
[16:12] <mikedawson> joao: did your MON changes land in 0.58 or 0.59? Sage's original 0.58 release notes and your blog list them for 0.58, but the new 0.59 release notes and http://ceph.com/docs/master/release-notes/#v0-59 say they landed in 0.59?
[16:12] <tchmnkyz> rbd: error parsing command '--image-format'
[16:12] <tchmnkyz> i dont think my release has --image-format
[16:13] <Gugge-47527> "rbd --format 2 --size 1000 create test01" works fine on my 0.56.3
[16:13] <sstan> rbd create --image-format 2 --size 1000 test
[16:13] <sstan> works fine for me
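The two spellings in this exchange depend on the ceph version: Gugge's 0.56.3 understands `--format 2` on create, while sstan's 0.58 documents `--image-format 2` (where `--format` selects output formatting instead). A small helper picking the flag by version, with the 0.57 cutoff being a guess inferred from this conversation rather than a documented fact:

```shell
# Pick the image-format flag spelling for a given ceph version string.
# Assumption (inferred from this discussion, not from release notes):
# 0.56.x and earlier use --format, roughly 0.57+ use --image-format.
rbd_format_flag() {
    v=$1
    major=${v%%.*}
    rest=${v#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 0 ] || [ "$minor" -ge 57 ]; then
        echo "--image-format"
    else
        echo "--format"
    fi
}

# e.g.:  rbd create $(rbd_format_flag 0.56.3) 2 --size 1000 mypool/test01
```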
[16:13] <tchmnkyz> on 56.3?
[16:13] <sstan> ceph version 0.58
[16:14] <Gugge-47527> tchmnkyz: what version does rbd -v output?
[16:14] <tchmnkyz> ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)
[16:15] <Gugge-47527> i have no idea then :)
[16:15] <tchmnkyz> great
[16:15] <Gugge-47527> why are you trying to create format 2 images with the rbd command?
[16:15] <tchmnkyz> that is how proxmox creates the image
[16:15] <Gugge-47527> proxmox uses rbd to create the image, and fails?
[16:16] <tchmnkyz> yup
[16:16] <tchmnkyz> i assembled the command from their perl script to duplicate the error
[16:16] <Gugge-47527> so basically you are not using your cluster yet?
[16:16] <Gugge-47527> or you have some proxmox hosts that works fine?
[16:16] <tchmnkyz> i was using it until this latest update
[16:17] <Gugge-47527> a proxmox update?
[16:17] <tchmnkyz> proxmox pushed new features that rely on type 2
[16:17] <tchmnkyz> yes
[16:17] <Gugge-47527> so all the old images are format 1?
[16:17] <tchmnkyz> yup
[16:17] <tchmnkyz> and those vms are fine
[16:17] <tchmnkyz> i just cant create any new vm's because of it
[16:18] <Gugge-47527> and all your osd/mon nodes run 0.56.3 too?
[16:18] <joao> mikedawson, iirc, they landed on 0.58
[16:18] <tchmnkyz> yea
[16:18] <tchmnkyz> everything is debian squeeze with 56.3
[16:18] <Gugge-47527> does "rbd --format 2 --size 1000 create test123" work on one of the storage nodes?
[16:18] <joao> mikedawson, I'll clear that out with sage
[16:18] <Gugge-47527> without all the other options
[16:19] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) Quit (Remote host closed the connection)
[16:19] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Server closed connection)
[16:19] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[16:19] <mikedawson> joao: Thanks. Because of the nature of your commits, it seems the release notes matter here more than most.
[16:19] <tchmnkyz> nop
[16:20] <tchmnkyz> i get the following error librbd: error setting image id: (22) Invalid argument
[16:20] <joao> mikedawson, completely agree
[16:20] <tchmnkyz> same problem as on the proxmox nodes
[16:22] <Gugge-47527> tchmnkyz: what command did you run exactly?
[16:22] <tchmnkyz> rbd -p pool --format 2 --size 1000 create test123
[16:23] <Gugge-47527> you dont have a pool named rbd?
[16:23] * wschulze (~wschulze@rrcs-108-176-12-3.nyc.biz.rr.com) has joined #ceph
[16:23] <tchmnkyz> i do but i dont use it for my vms
[16:24] <tchmnkyz> i have a separate pool for different customers
[16:24] <Gugge-47527> super, now try my command (it will create the image in rbd)
[16:24] <tchmnkyz> no it wont: librbd: error setting image id: (5) Input/output error
[16:25] <Gugge-47527> but the id listed is the pool id :)
[16:26] <tchmnkyz> http://pastebin.com/uU1v8qxx
[16:26] <Gugge-47527> did you start your ceph cluster as 0.56.x, or as an older version?
[16:26] <tchmnkyz> may have been an older version
[16:26] <tchmnkyz> i dont remember what version it was when i started
[16:27] <Gugge-47527> my first guess is that some part somewhere is still running an old version
[16:27] <tchmnkyz> how can i check that
[16:27] <Gugge-47527> ceph --admin-daemon <path-to-admin-socket> version
[16:28] <Gugge-47527> for all your ceph daemons
[16:29] <Gugge-47527> something like "ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok version" on all your mons and osds
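Gugge's per-daemon check can be wrapped in a loop over the socket directory (a sketch; /var/run/ceph is the usual location, pass another directory to override):

```shell
# Ask every local ceph daemon for its running version via its admin socket.
daemon_versions() {
    dir=${1:-/var/run/ceph}
    for sock in "$dir"/*.asok; do
        [ -e "$sock" ] || continue            # skip if the glob matched nothing
        printf '%s: ' "$(basename "$sock" .asok)"
        ceph --admin-daemon "$sock" version 2>/dev/null || echo "(no reply)"
    done
}

daemon_versions
```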
[16:29] <tchmnkyz> apparently it was running on v 48.x
[16:30] <Gugge-47527> and now you know why you cant create format 2 images :)
[16:30] <tchmnkyz> lol
[16:30] <tchmnkyz> so just reload the daemons?
[16:30] <Gugge-47527> i cant help you there, i never ran that old versions
[16:30] <Gugge-47527> and dont know how to upgrade safely
[16:30] <tchmnkyz> great
[16:31] <tchmnkyz> that kinda scares me
[16:31] <tchmnkyz> lol
[16:31] <Gugge-47527> i guess there is documentation on ceph.com somewhere though :)
[16:32] <tchmnkyz> looks like just a restart of all of them fixed it
[16:32] <tchmnkyz> lol
[16:32] <tchmnkyz> sorry to be a PITA
[16:32] <Gugge-47527> Im just glad you got it fixed :)
[16:34] * stacker666 (~stacker66@206.pool85-61-191.dynamic.orange.es) has joined #ceph
[16:34] <tchmnkyz> me too
[16:35] <tchmnkyz> i have a rather large customer who wants me to turn up vms today
[16:35] <tchmnkyz> lol
[16:35] <tchmnkyz> went to do it and was like ummmm
[16:36] * wer (~wer@168.sub-70-192-194.myvzw.com) Quit (Remote host closed the connection)
[16:39] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Quit: Ex-Chat)
[16:42] * stacker666 (~stacker66@206.pool85-61-191.dynamic.orange.es) Quit (Read error: Operation timed out)
[16:46] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[16:55] * stacker666 (~stacker66@206.pool85-61-191.dynamic.orange.es) has joined #ceph
[16:58] * l0nk (~alex@ Quit (Quit: Leaving.)
[16:59] * joshd1 (~jdurgin@2602:306:c5db:310:9db2:7a44:69b5:93e) has joined #ceph
[17:03] <sstan> Does a pool object size affect performance?
[17:04] <sstan> I'm trying to optimize ceph for small writes
[17:04] * gerard_dethier (~Thunderbi@ Quit (Quit: gerard_dethier)
[17:06] * mib_e4oaoh (57ee8a78@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[17:07] * aliguori (~anthony@ has joined #ceph
[17:08] * Cube (~Cube@cpe-76-95-217-215.socal.res.rr.com) Quit (Quit: Leaving.)
[17:08] * BillK (~BillK@124-148-238-28.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:10] <nhm> sstan: It may, I haven't done a lot of testing on changing the default object size yet.
[17:10] * diegows (~diegows@ has joined #ceph
[17:11] * sagelap (~sage@ has joined #ceph
[17:13] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[17:14] * wschulze (~wschulze@rrcs-108-176-12-3.nyc.biz.rr.com) Quit (Quit: Leaving.)
[17:15] <sstan> ok I'll try then
[17:19] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[17:19] * sagelap (~sage@ Quit (Read error: Operation timed out)
[17:19] * davidz1 (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[17:20] * sagelap (~sage@ has joined #ceph
[17:21] <jamespage> sagewk, gregaf: hey - whats the plan (if any) around 0.56.4? Just seeing whether it fits pre-release for Ubuntu raring
[17:22] * BillK (~BillK@124-148-230-197.dyn.iinet.net.au) has joined #ceph
[17:22] * sagelap1 (~sage@2600:1012:b014:9419:215:ffff:fe36:60) has joined #ceph
[17:23] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[17:23] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:24] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[17:26] * Morg (b2f95a11@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[17:27] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[17:28] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[17:37] * eschnou (~eschnou@ Quit (Remote host closed the connection)
[17:38] <sstan> nhm : I tried .. and reducing the order doesn't increase the performance. Reducing it too much decreases the performance
[17:40] <sstan> 1.5 MB/s for bs=4k
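The "order" sstan is tuning maps to object size as a power of two: an rbd image's objects are 2^order bytes, so the default order of 22 gives 4 MiB objects, and something like `rbd create --order 25 ...` would use 32 MiB objects (command shown as a sketch only):

```shell
# Convert an rbd image order into the object size in bytes (2^order).
order_to_bytes() {
    echo $((1 << $1))
}

order_to_bytes 22    # → 4194304 (4 MiB, the default)
```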
[17:44] * leseb (~leseb@ has joined #ceph
[17:45] <nhm> sstan: interesting
[17:45] <sstan> all the fun features require type 2 images
[17:46] <nhm> sstan: Yeah, sounds like type 2 images are a hot feature request. :)
[17:50] * sagelap1 (~sage@2600:1012:b014:9419:215:ffff:fe36:60) Quit (Ping timeout: 480 seconds)
[17:52] * Cube (~Cube@cpe-76-95-217-215.socal.res.rr.com) has joined #ceph
[17:53] * chutzpah (~chutz@ has joined #ceph
[17:55] * leseb (~leseb@ Quit (Remote host closed the connection)
[17:59] * Cube (~Cube@cpe-76-95-217-215.socal.res.rr.com) Quit (Quit: Leaving.)
[17:59] <joelio> sstan: yes, I accidentally created a pool with 8 pgs (stupid defaults if you ask me) - speed wasn't the greatest, at all
[18:00] <sstan> hah it's 192 by default I think?
[18:00] <sstan> I was talking about the order when one creates an image
[18:00] <sstan> (default 22 (i.e 4M))
[18:01] <joelio> sstan: http://ceph.com/docs/master/rados/operations/pools/ says 8
[18:02] <sstan> hmm indeed
[18:02] <sstan> it's 192 for default pools (rbd, etc.)
[18:02] <joelio> Next to "The default value 8 is NOT suitable for most systems."
[18:02] <joelio> why put it then???
[18:02] <joelio> a default should be suitable FOR most systems :)
[18:02] <sstan> yeah
[18:03] <gregaf> jamespage: point releases like that are kind of erratic, but Sage says he's hoping this week — given the day and scheduling I'd count on next week some time
[18:03] <jamespage> gregaf, great - that fits nicely :-)
[18:03] <gregaf> just wondering if it'll be packaged in time for a milestone, or something else?
[18:03] <gregaf> (wasn't clear to me from the question)
[18:04] <jamespage> gregaf, final beta freeze is 28th march
[18:04] <gregaf> ah, got it
[18:04] <jamespage> so as long as its before then ++
[18:04] <gregaf> coolio
[18:05] <jamespage> gregaf, I can get a point release in after then, but it becomes more difficult
[18:06] * noob2 (~cjh@ has joined #ceph
[18:06] <gregaf> yeah
[18:06] <gregaf> I'll make sure the powers that be keep that in mind :)
[18:06] <jamespage> gregaf, thanks!
[18:07] <jamespage> gregaf, that said I did get a provisional minor release exception so I can push ceph point releases into 12.10 onwards
[18:07] <gregaf> sweet
[18:07] * jamespage must get the automated testing setup for that
[18:09] * wer_ (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[18:11] <noob2> so i was wondering why rados is built on top of a filesystem
[18:11] <noob2> wouldn't it be 'better' to use just raw storage blocks instead of all the filesystem overhead?
[18:12] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Ping timeout: 480 seconds)
[18:12] <gregaf> you have to implement many of the hard parts of a filesystem to handle block allocation et al even if you're just writing an object store
[18:13] <gregaf> if a stable one existed we'd probably use it, but it doesn't
[18:13] <noob2> i see
[18:13] <noob2> so rather than reinvent the wheel
[18:13] <gregaf> Ceph in the past used to use a custom thing called EBOFS for storing objects, and that did do direct block management
[18:13] <noob2> why was that abandoned ?
[18:13] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:13] <gregaf> but it had some limitations, the most important being that it needed to be able to hold all metadata in-memory
[18:13] <noob2> ah
[18:14] <noob2> so you looked around and said hey there's already stuff that does this :D
[18:14] <gregaf> and btrfs emerged with support for all the important features of EBOFS, the critical advantage of being maintained by somebody else, and the minor weakness of having to implement POSIX as well
[18:14] <gregaf> yeah
[18:14] <noob2> haha yes that's a great development
[18:15] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[18:15] * leseb (~leseb@ has joined #ceph
[18:16] <noob2> was there a great performance difference between ebofs and xfs?
[18:16] * BillK (~BillK@124-148-230-197.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[18:17] * vata (~vata@2607:fad8:4:6:358d:3c90:6bc1:af08) has joined #ceph
[18:19] <gregaf> dunno — ebofs was already gone by the time I joined
[18:19] <gregaf> I suspect not (I doubt it was really that optimized), but I'm not sure
[18:19] <noob2> gotcha
[18:19] <noob2> i found an old thread from 2009 with sage saying he expected better performance out of btrfs than ebofs
[18:19] <gregaf> it did support snapshotting, unlike xfs, so if it was better that would be why
[18:19] <gregaf> yeah, btrfs definitely
[18:20] <gregaf> it has *other people* making it work well ;)
[18:20] <noob2> hehe
[18:21] <noob2> well that's interesting. ceph has come a long way
[18:22] * joshd1 (~jdurgin@2602:306:c5db:310:9db2:7a44:69b5:93e) Quit (Quit: Leaving.)
[18:23] * leseb (~leseb@ Quit (Ping timeout: 480 seconds)
[18:24] * Cube (~Cube@ has joined #ceph
[18:32] * tnt (~tnt@82.195-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:34] <sjustlaptop> sage, gregaf: anyone want to take a look at the wip_osd_shutdown pull request?
[18:35] <gregaf> I thought sage had already done that...
[18:35] <sjustlaptop> ohi
[18:35] <gregaf> I could be wrong
[18:35] <gregaf> just saw some comments go by yesterday
[18:35] <sjustlaptop> ah
[18:35] <sjustlaptop> looking
[18:35] * dmick (~dmick@2607:f298:a:607:3da2:8104:76b2:7174) has joined #ceph
[18:35] <joao> mikedawson, just spoke with sage, and sage tells me that the monitor rework was merged right after v0.58 was frozen, so v0.59 is in fact the first release with the rework (the blog post must be updated to reflect that)
[18:36] <dmick> indeed:
[18:36] <dmick> $ git tag --contains cb85fb7d9a1da5a8f194bd9406c7df49da3c2e33
[18:36] <dmick> v0.59
[18:37] <joao> yeah
[18:37] <gregaf> *stare*
[18:37] <gregaf> dmick, I am in love with you right now
[18:37] <dmick> um, wow
[18:37] <gregaf> sjust: that's how you find out what tags or branches a commit is in without going into gitk
[18:37] <joao> gregaf was clearly unaware of tag --contains :p
[18:37] <gregaf> dmick: I don't think you understand how long I've been annoyed at not being able to find a command like that in the manpages
[18:38] * BillK (~BillK@58-7-212-96.dyn.iinet.net.au) has joined #ceph
[18:38] <mikedawson> joao: sounds good. thanks for the clarification. 0.58 -> 0.59 worked for me this am on a test deployment
[18:38] <gregaf> I've asked people and they didn't know either
[18:38] <nhm> ooh, I've been looking for that too.
[18:38] <dmick> ah. caveat: for some stupid reason tags may not be pulled by default; you gotta git fetch -t to be sure
[18:38] <nhm> dmick: you are the talk of the town now! ;)
[18:38] * eschnou (~eschnou@37.90-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:38] <joao> mikedawson, let us know if you bump into something weird
[18:39] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[18:40] <mikedawson> joao: I'll be scaling it up to 20+ nodes in the next few days, so I'll let you know if I run into anything
[18:40] * ssejour (~sebastien@ Quit (Quit: Leaving.)
[18:40] <gregaf> awww, bummer that branch --contains only works on your local branches, not remote refs
[18:40] <gregaf> I wonder if there's a modifier for that
[18:40] <joao> gregaf, -a ?
[18:41] <gregaf> yay
[18:42] <dmick> Tonight on Discovery: 20,000 Leagues Beneath the Git
[18:46] * leseb (~leseb@ has joined #ceph
[18:46] * eschnou (~eschnou@37.90-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[18:48] <davidz1> dmick: I do "git remote update -p origin" which fetches all branches, tags, and with -p prunes deleted branches.
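dmick's trick reproduces in any throwaway repo: `git tag --contains <sha>` lists every tag whose history includes that commit, and `git branch -a --contains` does the same for branches including remote refs.

```shell
# Demonstrate `git tag --contains` in a scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello > f
git add f
git -c user.name=demo -c user.email=demo@example.com commit -qm initial
git tag v0.59
git tag --contains HEAD        # prints: v0.59
# branches work too; -a includes remote-tracking refs:
git branch -a --contains HEAD
```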
[18:51] <t0rn> on a totally idle 0.56.3 cluster, i got an inconsistent pg. I have 3 copies; i did a getfattr -d on the objects, and did a diff on the three copies. The only difference was one had a 'user.cephos.seq' line at the bottom; on that copy, i did a setfattr -x user.cephos.seq {object} and then performed a pg scrub on the PG.. ceph pg scrub pg_id , i then had health_ok. Anyone seen that before? It's at least the 2nd time i've seen it
[18:51] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[18:51] <gregaf> sjust: ^
[18:53] * NuxRo (~nux@ Quit (Remote host closed the connection)
[18:54] * NuxRo (~nux@ has joined #ceph
[18:55] <sjustlaptop> hmm
[18:55] <sjustlaptop> t0rn: there should have been an explanation of why the object was considered inconsistent in ceph.log
[18:55] <sjustlaptop> what was it?
[18:58] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: If your not living on the edge, you're taking up too much space)
[19:00] <t0rn> sjustlaptop: this is the only reference to the said pg in my ceph.log on my mon: http://paste.debian.net/243450/ i manually fixed it around 13:22
[19:01] <t0rn> osd order being [345] for that pg if it matters
[19:02] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[19:03] <sjustlaptop> t0rn: did you compare the contents of the files?
[19:03] <sjustlaptop> ectory/head//0 digest 1387885116 != known digest 3316512973
[19:03] <sjustlaptop> means that the hash of the object on the primary did not match one or both of the replicas
[19:03] <t0rn> yes, they did differ, the part i dont understand is why
[19:03] <sjustlaptop> you most likely did a shallow scrub which superficially appeared to fix it
[19:04] <sjustlaptop> if you re-deep scrub the pg, it should show up as inconsistent again
[19:04] <sjustlaptop> t0rn: most likely, it was due to one of the bugs fixed in current bobtail
[19:04] <sjustlaptop> we'll have a 56.4 out at some point
[19:05] <gregaf> we're shooting for before the 28th now, apparently :D
[19:05] <sjustlaptop> to repair it, you'll want to compare checksums of the files to determine which doesn't match
[19:05] <sjustlaptop> and copy over the file contents from a healthy one to one which doesn't match
[19:05] <t0rn> i did a 'ceph pg scrub pgid' the first two had this: http://paste.debian.net/243454/ the one that differed: http://paste.debian.net/243455/ so it was line 3, which is why i did the setfattr -x user.cephos.seq on that one
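The repair check sjustlaptop describes can be sketched as: checksum each replica's copy of the object file and flag the odd one out (the three paths here are hypothetical; on a real cluster they live under each osd's data directory):

```shell
# Given three replica file paths, print the one whose md5 differs
# from the other two (prints nothing if all three match).
find_odd_replica() {
    a=$(md5sum "$1" | cut -d' ' -f1)
    b=$(md5sum "$2" | cut -d' ' -f1)
    c=$(md5sum "$3" | cut -d' ' -f1)
    if [ "$a" = "$b" ]; then
        [ "$c" = "$a" ] || echo "$3"
    elif [ "$a" = "$c" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}
# after copying the healthy contents over the bad replica, re-verify:
#   ceph pg deep-scrub <pgid>
```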
[19:05] <gregaf> (well, Sage keeps saying this week, so maybe it'll appear tonight at 10:30 or something)
[19:06] <sjustlaptop> yeah, that seq xattr is a red herring
[19:06] <sjustlaptop> it's an internal filestore thing which will frequently be different
[19:06] <sjustlaptop> gregaf: it's how we prevent replay prior to a non-idempotent operation
[19:07] <gregaf> clear as mud :p
[19:07] <sjustlaptop> write a
[19:07] <sjustlaptop> clone a b
[19:07] <gregaf> but part of the replay guards around multiple updates
[19:07] <sjustlaptop> write b
[19:07] <gregaf> yeah
[19:07] <sjustlaptop> now replay
[19:07] <sjustlaptop> so prior to clone a b
[19:08] <sjustlaptop> we mark the seq attr and fsync to prevent replay prior to the sequence number of clone a b
[19:08] <sjustlaptop> oops, it's actually on b
[19:08] * Cube (~Cube@ Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * aliguori (~anthony@ Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * janisg (~troll@ Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * al (d@niel.cx) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * wogri (~wolf@nix.wogri.at) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * mo- (~mo@2a01:4f8:141:3264::3) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * jochen (~jochen@laevar.de) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * DLange (~DLange@dlange.user.oftc.net) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * juuva (~juuva@dsl-hkibrasgw5-58c05e-231.dhcp.inet.fi) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * tore_ (~tore@ Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * Tribaal (uid3081@hillingdon.irccloud.com) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * brambles (lechuck@s0.barwen.ch) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * wido (~wido@2a00:f10:104:206:9afd:45af:ae52:80) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * lurbs (user@uber.geek.nz) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * Lennie`away (~leen@lennie-1-pt.tunnel.tserv11.ams1.ipv6.he.net) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * samppah (hemuli@namibia.aviation.fi) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * MK_FG (~MK_FG@00018720.user.oftc.net) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] * jefferai (~quassel@quassel.jefferai.org) Quit (reticulum.oftc.net charon.oftc.net)
[19:08] <sjustlaptop> yeah, we mark the seq attr on b to prevent replay prior to the clone operation
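sjustlaptop's write/clone/write sequence can be modeled as a toy: the clone is non-idempotent, so its sequence number is recorded on the target ("b"), and journal replay skips any op on that target at or below the guard. Plain files stand in for the user.cephos.seq xattr here; this is purely illustrative, not the real FileStore format.

```shell
# Toy replay-guard model: write a; clone a->b; write b; then replay.
cd "$(mktemp -d)"

apply_op() {
    seq=$1; op=$2; target=$3; src=$4      # src only used by clone
    guard=0
    if [ -f "$target.guard" ]; then guard=$(cat "$target.guard"); fi
    if [ "$seq" -le "$guard" ]; then
        echo "replay: skipping op $seq on $target"
        return 0
    fi
    case $op in
        write) echo "data@$seq" > "$target" ;;
        clone) cp "$src" "$target"
               echo "$seq" > "$target.guard" ;;   # mark the guard on b
    esac
}

run_journal() {
    apply_op 1 write a
    apply_op 2 clone b a
    apply_op 3 write b
}

run_journal    # initial application
run_journal    # replay: the clone (op 2) is skipped, later ops re-apply
```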
[19:09] <dmick> davidz1: ah, I often use remote update, but hadn't realized it did tags. cool.
[19:10] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) has joined #ceph
[19:10] * Cube (~Cube@ has joined #ceph
[19:10] * aliguori (~anthony@ has joined #ceph
[19:10] * janisg (~troll@ has joined #ceph
[19:10] * al (d@niel.cx) has joined #ceph
[19:10] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) has joined #ceph
[19:10] * wogri (~wolf@nix.wogri.at) has joined #ceph
[19:10] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[19:10] * mo- (~mo@2a01:4f8:141:3264::3) has joined #ceph
[19:10] * jochen (~jochen@laevar.de) has joined #ceph
[19:10] * DLange (~DLange@dlange.user.oftc.net) has joined #ceph
[19:10] * juuva (~juuva@dsl-hkibrasgw5-58c05e-231.dhcp.inet.fi) has joined #ceph
[19:10] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) has joined #ceph
[19:10] * tore_ (~tore@ has joined #ceph
[19:10] * Tribaal (uid3081@hillingdon.irccloud.com) has joined #ceph
[19:10] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[19:10] * brambles (lechuck@s0.barwen.ch) has joined #ceph
[19:10] * wido (~wido@2a00:f10:104:206:9afd:45af:ae52:80) has joined #ceph
[19:10] * samppah (hemuli@namibia.aviation.fi) has joined #ceph
[19:10] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[19:10] * jefferai (~quassel@quassel.jefferai.org) has joined #ceph
[19:10] * lurbs (user@uber.geek.nz) has joined #ceph
[19:10] * Lennie`away (~leen@lennie-1-pt.tunnel.tserv11.ams1.ipv6.he.net) has joined #ceph
[19:16] * leseb (~leseb@ Quit (Remote host closed the connection)
[19:21] * ChanServ sets mode +o dmick
[19:21] * ChanServ changes topic to 'v0.56.3 has been released -- http://goo.gl/f3k3U || argonaut v0.48.3 released -- http://goo.gl/80aGP || New Ceph Monitor Changes http://ow.ly/ixgQN'
[19:21] * ChanServ sets mode +o joao
[19:21] * leseb (~leseb@ has joined #ceph
[19:24] <Karcaw> i'm getting this on one of my mons with the new 0.59 code:
[19:24] <Karcaw> Invalid argument: /data/mon/store.db: does not exist (create_if_missing is false)
[19:24] <Karcaw> mon/Monitor.cc: In function 'bool Monitor::StoreConverter::needs_conversion()' thread 7f50ee495760 time 2013-03-21 11:23:58.073139
[19:24] <Karcaw> mon/Monitor.cc: 4109: FAILED assert(0 == "Existing store has not been converted to 0.52 format")
[19:24] <Karcaw> any help?
[19:27] * mcclurmc (~mcclurmc@firewall.ctxuk.citrix.com) Quit (Ping timeout: 480 seconds)
[19:30] * nhorman (~nhorman@2001:470:8:a08:7aac:c0ff:fec2:933b) has joined #ceph
[19:31] * sagelap1 (~sage@2600:1010:b102:c499:91fd:e33f:5ac3:854) has joined #ceph
[19:31] <sagelap1> dmick: want to review https://github.com/ceph/ceph/pull/129, since i'm making your life more difficult by adding more monitor commands?
[19:32] * BillK (~BillK@58-7-212-96.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[19:35] * scuttlemonkey (~scuttlemo@ has joined #ceph
[19:35] * ChanServ sets mode +o scuttlemonkey
[19:35] * leseb (~leseb@ Quit (Read error: Connection reset by peer)
[19:43] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[19:49] * sagelap1 is now known as sagelap
[19:52] * noahmehl (~noahmehl@ip-64-134-66-149.public.wayport.net) has joined #ceph
[19:58] * sagelap (~sage@2600:1010:b102:c499:91fd:e33f:5ac3:854) Quit (Quit: Leaving.)
[20:16] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) Quit (Ping timeout: 480 seconds)
[20:18] * noahmehl (~noahmehl@ip-64-134-66-149.public.wayport.net) Quit (Quit: noahmehl)
[20:22] <mikedawson> joshd: using Grizzly with leftover conf files from Folsom, "Driver path cinder.volume.driver.RBDDriver is deprecated, update your configuration to the new path." Any idea?
[20:22] * sagelap (~sage@2600:1010:b102:c499:f8f2:94cf:ac59:4a89) has joined #ceph
[20:24] <joshd> mikedawson: yeah, that'll still work, but the drivers were rearranged a little so now cinder.volume.drivers.rbd.RBDDriver is the new location
[20:25] * scuttlemonkey (~scuttlemo@ Quit (Ping timeout: 480 seconds)
[20:28] <mikedawson> joshd: thanks
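For reference, the rename joshd mentions is a one-line change in cinder.conf (sketch; only the driver path changes, the other RBD settings stay the same):

```ini
# Folsom-era path, deprecated under Grizzly:
#   volume_driver = cinder.volume.driver.RBDDriver
# Grizzly location:
volume_driver = cinder.volume.drivers.rbd.RBDDriver
```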
[20:28] * sagelap (~sage@2600:1010:b102:c499:f8f2:94cf:ac59:4a89) Quit (Quit: Leaving.)
[20:30] * jtangwk1 (~Adium@2001:770:10:500:9560:c54d:4ebb:72f0) has joined #ceph
[20:32] * jtangwk (~Adium@2001:770:10:500:8576:dd71:b404:785) Quit (Read error: Connection reset by peer)
[21:16] <mikedawson> joao: on 0.59, I'm seeing ceph-create-keys processes start and never seem to end http://pastebin.com/bATtHm2g Is that expected?
[21:17] * eschnou (~eschnou@37.90-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:22] * ScOut3R (~ScOut3R@5401D8E4.dsl.pool.telekom.hu) has joined #ceph
[21:27] * drokita (~drokita@ has joined #ceph
[21:31] * janisg (~troll@ Quit (Ping timeout: 480 seconds)
[21:32] * nhorman (~nhorman@2001:470:8:a08:7aac:c0ff:fec2:933b) Quit (Quit: Leaving)
[21:38] * nwat (~Adium@eduroam-233-33.ucsc.edu) has joined #ceph
[21:40] <nwat> In ceph Java wrappers we have a class that represents <type, name> string pairs from the CRUSH map. What name describes this pair? An item, bucket, xyz ?
[21:43] <gregaf> they're buckets
[21:43] <gregaf> …unless they're items :p
[21:43] * PerlStalker (~PerlStalk@ Quit (Read error: Connection reset by peer)
[21:43] <gregaf> devices are the actual things which store data
[21:44] <gregaf> and are always leaves
[21:44] <gregaf> buckets are interior nodes in the tree
[21:44] <nwat> hah.. so bucket vs item.
[21:44] <gregaf> items are any node
[21:44] <nwat> ahh got it
[21:44] <nwat> so I need a class for item
[21:44] <gregaf> although honestly I don't know that it's defined that clearly anywhere
[21:45] <gregaf> we mostly talk about buckets, and item is just a keyword in the crush language
[21:45] <gregaf> but if I had to draw distinctions that's where they'd be
[21:47] <dmick> sagelap1: I'm dying to
[21:48] <dmick> mikedawson: I'm not joao, but no, that's not normal
[21:49] * PerlStalker (~PerlStalk@ has joined #ceph
[19:49] <nwat> gregaf: a topology path might be root/rack/host — i was thinking that host was a device, but in fact it's a bucket :) thanks
[21:50] <gregaf> right
[21:56] <mikedawson> dmick: When I do a "ceph auth list", I don't have a mon. entry
[21:56] <dmick> that would be a problem
[21:57] <mikedawson> this is a new install on 0.58 with mkcephfs, then upgraded to 0.59
[21:57] <mikedawson> when is that entry typically created? during mkcephfs?
[21:58] <gregaf> I thought the mon. keyring was stored separately and didn't get dumped as part of the auth list (though I could be misremembering)
[21:59] <dmick> just broke my 'ceph' command or I'd look; hold on
[21:59] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[22:00] <gregaf> yeah, just checked and it shouldn't be there
[22:00] <dmick> maybe auth export?
[22:00] <gregaf> what are you looking for it for?
[22:01] <dmick> probably trying to figure out why ceph-create-keys is hanging
[22:01] <gregaf> I don't think you should be able to get them to tell you — the mon. keyring is shared symmetric and is how you prove you're a monitor, basically
[22:02] <mikedawson> gregaf: having trouble with an 0.58 mkcephfs install upgraded to 0.59 with ceph-create-keys instances hanging http://pastebin.com/bATtHm2g
[22:03] <gregaf> yeah, check and see if the keys it's trying to create exist, make sure the monitors are in a quorum, make sure the key it's using can successfully connect to the cluster
[22:03] <gregaf> would be my starting points
[22:06] <mikedawson> gregaf: testing right now, so I only have 1 mon. every time I restart ceph, I get a new /etc/ceph/ceph.client.admin.keyring.<pid>.tmp
[22:06] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[22:07] <gregaf> weird, sounds like maybe it's creating the temporary keyring and then failing to move it into place — do you have an out-of-date one in /etc/ceph/ceph.client.admin.keyring?
[22:08] <gregaf> unfortunately I'm booked; somebody else will need to run through the rest of this
[22:08] <gregaf> dmick, you got any spare brain cycles?
[22:08] <dmick> perhaps in a minute
[22:08] <mikedawson> gregaf: no, just tmps
[22:13] * markl_ (~mark@tpsit.com) Quit (Quit: leaving)
[22:13] * markl (~mark@tpsit.com) has joined #ceph
[22:14] * drokita (~drokita@ Quit (Quit: Leaving.)
[22:15] <mikedawson> dmick: looks like it is stuck in a while loop in ceph_create_keys function get_key()
[22:18] <mikedawson> dmick: http://pastebin.com/k5FdHnYZ
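A hypothetical sketch (not the real ceph-create-keys source; function and parameter names are made up for illustration) of the kind of retry loop being described: get_key() keeps re-running an auth command until it succeeds, so if the command always fails with "access denied" the process never exits.

```python
# Hypothetical sketch of a get_key()-style retry loop; the real tool
# retries `ceph auth get-or-create` indefinitely, so a command that can
# never succeed means the process hangs forever.
import subprocess
import time

def get_key(cmd, attempts=5, delay=1.0):
    """Retry cmd until it exits 0; return its stdout, or None if every
    attempt fails (the real loop has no attempt limit)."""
    for _ in range(attempts):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        time.sleep(delay)  # back off and retry, exactly like the hang seen above
    return None

# `false` stands in for the always-denied get-or-create:
print(get_key(["false"], attempts=2, delay=0.0))  # -> None
```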
[22:18] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[22:20] * janisg (~troll@ has joined #ceph
[22:23] <mikedawson> dmick: root@node1:/var/log/ceph# ceph --cluster=ceph --name=mon. --keyring=/var/lib/ceph/mon/ceph-a/keyring auth get-or-create client.admin mon allow * osd allow * mds allow
[22:23] <mikedawson> access denied
[22:27] * mcclurmc (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[22:29] <dmick> does /var/lib/ceph/mon/ceph-a/keyring have a client.admin key, and does that key match the keyring the client is using?
[22:29] <dmick> s/match the key/match the client.admin key in the keyring/
[22:33] <mikedawson> dmick: /var/lib/ceph/mon/ceph-a/keyring has mon. m but no client.admin
[22:34] <dmick> hm. because mkcephfs should have put it there, by my reading
[22:37] <dmick> no, I'm confused
[22:37] <dmick> is the mon. key in the keyring file that the *client* is using
[22:38] <mikedawson> since we've been looking at this, I killed the cluster and started over to make sure it wasn't related to 0.58. So against 0.59, I ran mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.keyring and the same problem exists
[22:39] <dmick> so in the two different clusters I have, the keyring used by the client has both the mon key and the client key in it
[22:40] <mikedawson> dmick: http://pastebin.com/Kq5YGyVV looks like client.admin and mon. have two different keys
[22:40] <dmick> yeah, they do, that's fine
[22:40] <dmick> but the keyring file that the client uses
[22:40] <dmick> (which is probably /etc/ceph/ceph.keyring)
[22:41] <dmick> should have the same mon. key that the monitor keyring has
[22:41] <dmick> I believe
[22:41] <mikedawson> yes, /etc/ceph/ceph.keyring has the valid key for ceph.admin
[22:42] <dmick> that's not the question
[22:43] <mikedawson> no, /etc/ceph/ceph.keyring does not list anything else besides ceph.admin
[22:43] <dmick> so I think for whatever reason that's the issue. I think it needs the mon key in it
[22:43] <dmick> let me test that theory by removing mine and seeing if things break
[22:45] <dmick> hm. well mine also has the client.admin key installed already
[22:49] <dmick> so without the client.admin key present, I get a failure from any ceph invocation that includes "unable to authenticate as client.admin"
[22:49] <mikedawson> so if I copy the /etc/ceph/ceph.keyring that lists a valid ceph.admin to /etc/ceph/ceph.client.admin.keyring, the problem goes away
[22:50] <dmick> have you changed your keyring paths in ceph.conf?
[22:50] <mikedawson> no
[22:50] <mikedawson> perhaps instead of mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.keyring, I need mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
[22:50] <dmick> no; the ceph.client.admin.keyring name is weird
[22:51] <dmick> the default search order is
[22:51] <dmick> /etc/ceph/$cluster.$name.keyring,/etc/ceph/$cluster.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin
[22:51] <dmick> so indeed that path is the first one
[22:51] <dmick> but if it's not found, it should be searching the second one
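The fallback dmick describes can be sketched as a first-match search over the path list he quotes (a simplified model, not librados's actual lookup code; the injectable `exists` predicate is just for demonstration):

```python
# Sketch of the default client keyring search order quoted above:
# try each candidate path in turn and use the first one that exists.
import os

SEARCH_ORDER = [
    "/etc/ceph/{cluster}.{name}.keyring",
    "/etc/ceph/{cluster}.keyring",
    "/etc/ceph/keyring",
    "/etc/ceph/keyring.bin",
]

def find_keyring(cluster="ceph", name="client.admin", exists=os.path.exists):
    for pattern in SEARCH_ORDER:
        path = pattern.format(cluster=cluster, name=name)
        if exists(path):
            return path
    return None

# With only /etc/ceph/ceph.keyring present (mikedawson's situation),
# the second entry should win:
fake_fs = {"/etc/ceph/ceph.keyring"}
print(find_keyring(exists=fake_fs.__contains__))  # -> /etc/ceph/ceph.keyring
```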
[22:52] <dmick> can you try removing the one with client.admin in its name, and then strace -t open the ceph client invocation?
[22:53] <mikedawson> dmick: I will, but I'm not sure what to do. Can you be more explicit?
[22:53] <dmick> with only /etc/ceph/ceph.keyring in place
[22:54] <dmick> actually, before strace, try
[22:55] <dmick> ceph auth get-or-create client.admin mon 'allow *' osd 'allow *' mds allow
[22:55] <mikedawson> root@node1:/etc/ceph# ceph auth get-or-create client.admin mon 'allow *' osd 'allow *' mds allow
[22:55] <mikedawson> [client.admin]
[22:55] <mikedawson> key = AQBqdUtRUMWNIBAANHfgcsnNq/U3/2FtcKA3cw==
[22:55] <dmick> if you weren't quoting the '*', the access denied was probably for some filename access
[22:56] <mikedawson> root@node1:/etc/ceph# ceph auth get-or-create client.admin mon allow * osd allow * mds allow
[22:56] <mikedawson> key for client.admin exists but cap mon does not match
[22:56] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[22:56] <dmick> yeah, the shell will expand those globs
[22:56] <dmick> if you don't quote them
[22:58] <dmick> so I think this was unconnected to the original problem with ceph-create-keys
[22:59] <mikedawson> http://pastebin.com/pMKxZxLr
[23:00] <mikedawson> when I service ceph stop, the ceph-create-keys processes stick around, and the get-or-create's show up without the quotes
[23:01] <mikedawson> perhaps the ceph-create-keys needs to inject the quotes into the get-or-create lines?
[23:01] <dmick> the quoting is only for the shell
[23:01] <mikedawson> ahh
[23:01] <dmick> you say "when you service ceph stop"...you mean "and then start", I assume?
[23:01] <dmick> no...that's not in the paste. Hm. why is it running create-keys after stop?...
[23:03] <mikedawson> no, I stop it and get the output in the last pastebin. when I start it back up, I just get another ceph-create-keys process (but no associated get-or-create). The get-or-create spawned from the ceph-create-keys only shows up after stopping ceph
[23:04] <dmick> I think upstart is fighting with at least my understanding :) and maybe with mkcephfs/service
[23:05] <dmick> do any of /var/lib/ceph/*/* contain an 'upstart' file?
[23:06] <mikedawson> one more wrinkle: this is a nightly of ubuntu raring with the ceph development repo for quantal (because there isn't yet a build for raring)
[23:07] <mikedawson> no files there called upstart
[23:08] * ivoks (~ivoks@jupiter.init.hr) Quit (Remote host closed the connection)
[23:09] <dmick> so in your last pastebin, those ceph-create-keys are never going to complete, because there's no monitor to talk to
[23:09] <dmick> those are only supposed to be started when a ceph-mon is started
[23:09] <dmick> I wonder if raring has changed the way the upstart tasks run
[23:09] <dmick> somehow
[23:10] <dmick> perhaps exposing a latent bug in our tasks
[23:10] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[23:11] <mikedawson> dmick: are you saying because ceph-create-keys has a lower pid than ceph-mon?
[23:11] <dmick> no, because there is no ceph-mon
[23:11] <dmick> stop kills mon/osd, and then ceph-create-keys start, but no mon
[23:12] <dmick> so how it ought to work, I think, is:
[23:12] <dmick> ceph-all.conf starts whenever there's a filesystem and a network interface
[23:12] * BillK (~BillK@124-149-78-131.dyn.iinet.net.au) has joined #ceph
[23:13] <dmick> that gates ceph-{mds,mon,osd}-all and radosgw-all
[23:13] <dmick> ceph-mon-all enables ceph-mon
[23:13] <mikedawson> actually those are hanging out from old service ceph starts. Here's a more explicit paste after I kill all the old cruft http://pastebin.com/csmDNB49
[23:14] <mikedawson> that one was started with /etc/init.d/ceph start to avoid the upstart scripts, same issue appears
[23:15] <dmick> I take back "ceph-mon-all enables ceph-mon"
[23:15] <dmick> ceph-mon-all enables ceph-mon-all-starter
[23:16] <dmick> ceph-mon-all-starter looks for all the mon directories, and only if they contain both a 'done' and an 'upstart' file does it emit a ceph-mon event, which enables ceph-mon
[23:16] <dmick> that's the only thing that's supposed to start ceph-create-keys
[23:16] <dmick> is when the ceph-mon event happens
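The gating dmick describes can be sketched as follows (in Python rather than upstart job syntax; `mons_to_start` is a made-up name standing in for ceph-mon-all-starter's directory scan): a monitor directory is only eligible if it contains both a 'done' and an 'upstart' marker file, which matches why mikedawson's mon dirs with no 'upstart' file never trigger the event.

```python
# Sketch of ceph-mon-all-starter's gating: scan the mon directories and
# emit a ceph-mon start event only for dirs with both marker files.
import os
import tempfile

def mons_to_start(base):
    started = []
    for d in sorted(os.listdir(base)):
        path = os.path.join(base, d)
        if (os.path.exists(os.path.join(path, "done"))
                and os.path.exists(os.path.join(path, "upstart"))):
            started.append(d)  # the real job would `initctl emit` here
    return started

base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "ceph-a"))
open(os.path.join(base, "ceph-a", "done"), "w").close()
# No 'upstart' marker, so ceph-a is skipped and no ceph-mon event fires:
print(mons_to_start(base))  # -> []
```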
[23:17] <dmick> now I wonder if, say, upstart has changed such that "starting a process named ceph-mon" is now considered "an event 'start ceph-mon' happened"
[23:18] <dmick> because I think we believe the event name and any proc name are distinct. I'm only waving my hands.
[23:18] * markbby1 (~Adium@ has joined #ceph
[23:18] <dmick> if I knew more about Upstart I'd say "watch upstart events when you use /etc/init.d/ceph to restart" and see which you get
[23:18] * vata (~vata@2607:fad8:4:6:358d:3c90:6bc1:af08) Quit (Quit: Leaving.)
[23:18] <mikedawson> I can try to reproduce with a quantal box, if that will help
[23:18] <dmick> there is information on the net about debugging upstart. I'd look at that
[23:19] <dmick> but I have to get back to my day job here
[23:19] <mikedawson> dmick: sure thing. I have a work around, and if I can nail down the root cause, I'll get a bug filed
[23:19] <mikedawson> thanks
[23:20] <dmick> yw
[23:22] * ScOut3R (~ScOut3R@5401D8E4.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[23:23] <sstan> is every write acknowledged as soon as it's written to at least two journals?(replica size 2)
[23:23] * markbby (~Adium@ Quit (Ping timeout: 480 seconds)
[23:26] <sstan> that seems logical ... but it doesn't explain why small writes are slow (even though the journal isn't the bottleneck)
[23:26] <gregaf> yes in general — writes are acknowledged when written to disk on all nodes in the acting set
[23:27] <sstan> you mean journals ?
[23:27] <gregaf> for most setups, yes :)
[23:27] <sstan> hmm then for small writes, the bottleneck must be the latency (network + access to the journal )
[23:28] <gregaf> latency is generally the bottleneck for small writes on any storage system
[23:28] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:29] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[23:29] <sstan> I compared my ceph setup (journal on RAM) to a MSA ... the MSA's small writes are 10x faster :/
[23:29] <sstan> so , more specifically , it must be the network latency .. right?
[23:30] <dmick> what do "on RAM" and "MSA" mean?
[23:31] <sstan> journal is a file on /dev/shm. By MSA I meant iSCSI machine dedicated to storage
[23:32] <sstan> I'll test the network latency hypothesis by reducing replica size from 2 to 1. It should improve small writes greatly
[23:32] <dmick> is the "iSCSI machine dedicated to storage" being accessed across the network?
[23:32] <sstan> yes it is
[23:33] <dmick> that's surprisingly slow IMO, then, yes.
[23:34] <dmick> size == 1 is an interesting test
[23:34] <sstan> what's slow?
[23:34] <dmick> ceph 10x slower than iSCSI remote
[23:34] <sstan> ah
[23:35] <sstan> iSCSI : 1120 kB/s 4k block size
[23:35] <sstan> ceph : 120
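A quick back-of-the-envelope check of those numbers supports the latency hypothesis: for small sequential writes, throughput is roughly block size divided by per-write latency, so 120 kB/s at 4k blocks implies about 33 ms per write versus roughly 3.6 ms for the iSCSI box.

```python
# Convert the quoted 4k-block throughputs into per-write latencies,
# since latency-bound small writes obey throughput = block_size / latency.
BLOCK_SIZE_KB = 4

def iops(throughput_kb_s):
    return throughput_kb_s / BLOCK_SIZE_KB

def latency_ms(throughput_kb_s):
    return 1000.0 / iops(throughput_kb_s)

print(iops(120), latency_ms(120))    # ceph: 30 IOPS, ~33 ms per write
print(iops(1120), latency_ms(1120))  # iSCSI: 280 IOPS, ~3.6 ms per write
```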
[23:35] * The_Bishop (~bishop@2001:470:50b6:0:d5fb:59b7:82d8:4fd3) Quit (Ping timeout: 480 seconds)
[23:39] * aliguori (~anthony@ Quit (Remote host closed the connection)
[23:39] <dmick> and you're not filling the journals, right?...
[23:43] * Cube (~Cube@ Quit (Quit: Leaving.)
[23:43] * tnt (~tnt@82.195-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[23:43] * Cube (~Cube@ has joined #ceph
[23:43] * The_Bishop (~bishop@2001:470:50b6:0:a883:f83e:a532:aa6a) has joined #ceph
[23:43] * Cube (~Cube@ Quit ()
[23:44] <PerlStalker> I'm seeing this error: rbd: error opening image bricolage-root: (2) No such file or directory2013-03-21 16:41:20.366045 7f4c041dc780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
[23:44] <PerlStalker> I get that when I try to run rbd info bricolage-root
[23:45] <PerlStalker> rbd ls shows the image.
[23:47] * diegows (~diegows@ has joined #ceph
[23:47] * alram (~alram@ has joined #ceph
[23:54] * mcclurmc (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.