#ceph IRC Log


IRC Log for 2013-05-31

Timestamps are in GMT/BST.

[0:00] <elder> And noticing these other things.
[0:02] <sagewk> tnt: was mon logging up over the high mon cpu period?
[0:02] * BillK (~BillK@124-148-124-185.dyn.iinet.net.au) has joined #ceph
[0:03] <tnt> sagewk: no
[0:04] <tnt> and mon cpu usage is still pretty high now. ~ 75% of a core or so. (while usually it's like a couple %)
[0:04] <tnt> it does get higher with logging enabled of course.
[0:07] * Maskul (~Maskul@host-89-241-174-13.as13285.net) Quit (Quit: Maskul)
[0:09] <tnt> mm, actually, looking at the graphs mon cpu usage is higher than I thought when idle. And the async compaction patch seems to have made it higher.
[0:09] * BManojlovic (~steki@fo-d- has joined #ceph
[0:13] <sagewk> hmm, we can turn down the trim interval some.
[0:14] <sagewk> try --paxos-service-trim-min 1000 --paxos-service-trim-max 5000
[0:14] <tnt> well, honestly, I'd rather not touch it ATM ... it's filling the new OSD and the VMs are still running so I'm good.
[0:15] <sagewk> k. i'll experiment here.
[0:15] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:16] * nhm (~nhm@ma22436d0.tmodns.net) has joined #ceph
[0:17] * aliguori (~anthony@ Quit (Remote host closed the connection)
[0:17] <sagewk> oh.. it's trimming the paxos way too often.
[0:18] * MooingLe1ur (~troy@phx-pnap.pinchaser.com) Quit (Ping timeout: 480 seconds)
[0:19] * terje-_ (~terje@63-154-145-97.mpls.qwest.net) has joined #ceph
[0:21] * dxd828_ (~dxd828@host-92-24-117-118.ppp.as43234.net) Quit (Quit: Textual IRC Client: www.textualapp.com)
[0:22] * MooingLemur (~troy@phx-pnap.pinchaser.com) has joined #ceph
[0:24] * terje- (~terje@63-154-145-97.mpls.qwest.net) has joined #ceph
[0:26] * terje- (~terje@63-154-145-97.mpls.qwest.net) Quit (Read error: Operation timed out)
[0:27] * terje-_ (~terje@63-154-145-97.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[0:28] * terje_ (~joey@63-154-145-97.mpls.qwest.net) has joined #ceph
[0:30] * mnash_ (~chatzilla@vpn.expressionanalysis.com) has joined #ceph
[0:30] * Cube1 (~Cube@ has joined #ceph
[0:30] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[0:30] * mikedawson_ (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[0:30] <tnt> sagewk: if you want an idea of the activity Disk: http://i.imgur.com/DwbaK4T.png IO: http://i.imgur.com/rn5KbRw.png CPU: http://i.imgur.com/dN5hcTK.png
[0:31] * Tamil1 (~tamil@ has joined #ceph
[0:31] <jlogan1> mon.a@-1(probing) e0 preinit fsid 00000000-0000-0000-0000-000000000000
[0:31] <jlogan1> mon.a@-1(probing) e0 check_fsid cluster_uuid contains '6ac903d6-8f9e-4768-9531-64fdfc2c3061'
[0:31] * Fetch (fetch@gimel.cepheid.org) has joined #ceph
[0:31] <jlogan1> mon.a@-1(probing) e0 error: cluster_uuid file exists with value '6ac903d6-8f9e-4768-9531-64fdfc2c3061', != our uuid 00000000-0000-0000-0000-000000000000
[0:31] * darkfaded (~floh@ has joined #ceph
[0:31] * PerlStalker (~PerlStalk@ Quit (Quit: ...)
[0:31] * jjgalvez1 (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[0:32] * iggy (~iggy@theiggy.com) Quit (Remote host closed the connection)
[0:33] * Fetch__ (fetch@gimel.cepheid.org) Quit (Read error: Connection reset by peer)
[0:33] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[0:33] * darkfader (~floh@ Quit (Read error: Connection reset by peer)
[0:33] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 484 seconds)
[0:34] * redeemed (~redeemed@static-71-170-33-24.dllstx.fios.verizon.net) Quit (Quit: bia)
[0:34] * joshd (~joshd@2607:f298:a:607:ac93:ff05:d54d:d7b4) Quit (Ping timeout: 484 seconds)
[0:34] * iggy___ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[0:34] * Tamil (~tamil@ Quit (Ping timeout: 484 seconds)
[0:34] * saaby_ (~as@mail.saaby.com) has joined #ceph
[0:34] * joshd (~joshd@2607:f298:a:607:91e2:9879:95d9:fc7a) has joined #ceph
[0:35] * iggy (~iggy@theiggy.com) has joined #ceph
[0:35] * mnash (~chatzilla@vpn.expressionanalysis.com) Quit (Ping timeout: 484 seconds)
[0:35] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 484 seconds)
[0:35] * mikedawson_ is now known as mikedawson
[0:35] * mnash_ is now known as mnash
[0:36] * tnt_ (~tnt@ has joined #ceph
[0:36] * rturk-away (~rturk@ds2390.dreamservers.com) Quit (Ping timeout: 484 seconds)
[0:36] * rturk-away (~rturk@ds2390.dreamservers.com) has joined #ceph
[0:36] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 484 seconds)
[0:36] * MK_FG (~MK_FG@00018720.user.oftc.net) Quit (Ping timeout: 484 seconds)
[0:36] * Cube (~Cube@ Quit (Ping timeout: 484 seconds)
[0:36] * saaby (~as@mail.saaby.com) Quit (Ping timeout: 484 seconds)
[0:36] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 484 seconds)
[0:37] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) has joined #ceph
[0:37] * terje_ (~joey@63-154-145-97.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[0:37] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[0:37] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[0:41] * tnt (~tnt@ Quit (Ping timeout: 480 seconds)
[0:41] * tnt_ is now known as tnt
[0:41] * jamespag` (~jamespage@culvain.gromper.net) has joined #ceph
[0:42] * jamespage (~jamespage@culvain.gromper.net) Quit (Remote host closed the connection)
[0:44] * phantomcircuit (~phantomci@covertinferno.org) Quit (Ping timeout: 480 seconds)
[0:44] * phantomcircuit (~phantomci@covertinferno.org) has joined #ceph
[1:02] <joao> saaby_, still around?
[1:12] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:14] <paravoid> 2013-05-30 23:04:36.258643 mon.0 1200417 : [INF] osdmap e181868: 144 osds: 142 up, 142 in
[1:14] <paravoid> 2013-05-30 23:04:39.771210 mon.0 1200427 : [INF] osdmap e181869: 144 osds: 136 up, 142 in
[1:14] <paravoid> 2013-05-30 23:04:48.815916 mon.0 1200484 : [INF] osdmap e181872: 144 osds: 114 up, 142 in
[1:14] <paravoid> fun
[1:14] <paravoid> sagewk: I know it was (probably) the thing you fixed, but it makes me wonder how I hit it every other week or so but no one else is complaining
[1:15] <sagewk> because you set the min failure reporters to a larger number
[1:15] <sagewk> it only happened when you have reports but not enough to mark someone down
[1:15] <paravoid> hm
[1:16] <paravoid> and I did this because of #4552
[1:17] <paravoid> not sure if it's the same case
[1:17] <paravoid> all I did was to re-add an osd
[1:17] <sagewk> yeah. it's the right thing to do.. would like to make it the default somehow, but need to figure out how to describe it in terms of the crush hierarchy instead of a raw #
[1:18] <paravoid> nod
[1:18] <paravoid> I'm not sure why it got all these reports now though
[1:19] <paravoid> doesn't matter that much I guess
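The osdmap lines paravoid pasted earlier in this exchange can be summarised with a small parser. This is a minimal sketch in Python, assuming only the line format visible in the paste; the names `OSDMAP_RE` and `parse_osdmap` are hypothetical helpers, not part of any Ceph tool.

```python
import re

# Matches the tail of a monitor log line such as:
#   "... [INF] osdmap e181868: 144 osds: 142 up, 142 in"
OSDMAP_RE = re.compile(r"osdmap e(\d+): (\d+) osds: (\d+) up, (\d+) in")


def parse_osdmap(line):
    """Return epoch/total/up/in counts from one log line, or None if absent."""
    match = OSDMAP_RE.search(line)
    if match is None:
        return None
    epoch, total, up, in_count = (int(group) for group in match.groups())
    return {"epoch": epoch, "total": total, "up": up, "in": in_count}


# The three lines from the chat, copied verbatim.
lines = [
    "2013-05-30 23:04:36.258643 mon.0 1200417 : [INF] osdmap e181868: 144 osds: 142 up, 142 in",
    "2013-05-30 23:04:39.771210 mon.0 1200427 : [INF] osdmap e181869: 144 osds: 136 up, 142 in",
    "2013-05-30 23:04:48.815916 mon.0 1200484 : [INF] osdmap e181872: 144 osds: 114 up, 142 in",
]

summaries = [parse_osdmap(line) for line in lines]
for summary in summaries:
    down = summary["total"] - summary["up"]
    print(f"e{summary['epoch']}: {summary['up']} up, {down} not up")
```

Run over the pasted lines, this makes the drop in the "up" count across the three epochs easy to see while the "in" count stays unchanged.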
[1:24] * terje-_ (~terje@63-154-145-97.mpls.qwest.net) has joined #ceph
[1:26] * The_Bishop (~bishop@2001:470:50b6:0:c9df:6318:fdca:966) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[1:28] * denken (~denken@dione.pixelchaos.net) Quit (Quit: poof)
[1:30] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[1:33] * terje-_ (~terje@63-154-145-97.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[1:38] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[1:40] * denken (~denken@dione.pixelchaos.net) has joined #ceph
[1:41] * denken (~denken@dione.pixelchaos.net) Quit ()
[1:59] * KindOne (~KindOne@0001a7db.user.oftc.net) has joined #ceph
[2:02] * denken (~denken@dione.pixelchaos.net) has joined #ceph
[2:03] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[2:04] * nhm (~nhm@ma22436d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[2:05] * andrei (~andrei@host86-155-31-94.range86-155.btcentralplus.com) has joined #ceph
[2:06] <andrei> evening guys
[2:06] <andrei> i was wondering if anyone has used caching mechanisms such as bcache or EnhanceIO with ceph?
[2:06] <andrei> if so, how did it go? is it working okay for you?
[2:09] * redeemed (~redeemed@cpe-192-136-224-78.tx.res.rr.com) has joined #ceph
[2:11] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[2:11] * loicd (~loic@2a01:e35:2eba:db10:7d41:74db:3711:985a) has joined #ceph
[2:16] * The_Bishop (~bishop@e179012236.adsl.alicedsl.de) has joined #ceph
[2:20] * terje- (~terje@63-154-144-245.mpls.qwest.net) has joined #ceph
[2:28] * terje- (~terje@63-154-144-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[2:35] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[2:45] * Tamil1 (~tamil@ Quit (Quit: Leaving.)
[2:45] * terje-_ (~terje@63-154-144-245.mpls.qwest.net) has joined #ceph
[2:53] * terje-_ (~terje@63-154-144-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[3:00] * terje- (~terje@63-154-144-245.mpls.qwest.net) has joined #ceph
[3:08] <mech422> Does ceph not like having the cluster name set as something other than 'ceph' ?
[3:08] * terje- (~terje@63-154-144-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[3:20] * JohansGlock (~quassel@kantoor.transip.nl) Quit (Read error: Connection reset by peer)
[3:22] * andrei (~andrei@host86-155-31-94.range86-155.btcentralplus.com) Quit (Quit: Ex-Chat)
[3:28] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[3:29] * DarkAce-Z (~BillyMays@ has joined #ceph
[3:34] * terje (~joey@63-154-131-182.mpls.qwest.net) has joined #ceph
[3:37] * terje (~joey@63-154-131-182.mpls.qwest.net) Quit (Read error: Operation timed out)
[3:43] * iggy_ (~iggy@theiggy.com) has joined #ceph
[3:44] * iggy is now known as iggy__
[3:44] * iggy_ is now known as iggy
[3:44] * Cube1 (~Cube@ Quit (Ping timeout: 480 seconds)
[3:44] * iggy__ is now known as iggy_
[3:53] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[4:04] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[4:04] * portante (~user@c-24-63-226-65.hsd1.ma.comcast.net) has joined #ceph
[4:10] * madkiss1 (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[4:14] * diegows (~diegows@ Quit (Read error: Operation timed out)
[4:18] <tnt> HEALTH_OK 16 osds: 16 up, 16 in ... yeah !
[4:25] <mech422> grats!
[4:26] <mech422> I only have 5 nodes to play with :-P
[4:29] * tnt (~tnt@ Quit (Read error: Operation timed out)
[4:36] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Ping timeout: 480 seconds)
[4:49] * julian (~julian@ has joined #ceph
[4:54] * terje_ (~joey@63-154-131-182.mpls.qwest.net) has joined #ceph
[5:02] * terje_ (~joey@63-154-131-182.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[5:02] * redeemed (~redeemed@cpe-192-136-224-78.tx.res.rr.com) Quit (Quit: bia)
[5:07] * Vanony_ (~vovo@ has joined #ceph
[5:12] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Remote host closed the connection)
[5:13] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[5:14] * Vanony (~vovo@ Quit (Ping timeout: 480 seconds)
[5:16] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[5:16] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[5:18] * san_ (~san@ has joined #ceph
[5:19] * houkouonchi (~linux@ Quit (Quit: Client exiting)
[5:37] * mrjack_ (mrjack@office.smart-weblications.net) has joined #ceph
[5:37] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[5:41] * terje- (~terje@63-154-139-245.mpls.qwest.net) has joined #ceph
[5:44] * mrjack (mrjack@office.smart-weblications.net) Quit (Ping timeout: 480 seconds)
[5:49] * terje- (~terje@63-154-139-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[5:51] * terje- (~terje@63-154-139-245.mpls.qwest.net) has joined #ceph
[5:59] * terje- (~terje@63-154-139-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[6:06] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Read error: Connection reset by peer)
[6:07] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[6:19] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[6:24] * tkensiski1 (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has joined #ceph
[6:25] * terje_ (~joey@63-154-139-245.mpls.qwest.net) has joined #ceph
[6:33] * terje_ (~joey@63-154-139-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[6:36] * loicd (~loic@2a01:e35:2eba:db10:7d41:74db:3711:985a) Quit (Quit: Leaving.)
[6:37] * loicd (~loic@magenta.dachary.org) has joined #ceph
[6:37] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[6:45] * jjgalvez1 (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[6:45] * houkouonchi-home (~linux@pool-71-160-127-158.lsanca.fios.verizon.net) has joined #ceph
[6:47] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[6:57] * nhm (~nhm@65-128-142-169.mpls.qwest.net) has joined #ceph
[7:27] * terje-_ (~terje@63-154-139-245.mpls.qwest.net) has joined #ceph
[7:30] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[7:35] * terje-_ (~terje@63-154-139-245.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[7:48] * tkensiski1 (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[7:50] * san_ (~san@ Quit (Quit: Ex-Chat)
[7:50] * san (~san@ Quit (Quit: Ex-Chat)
[7:59] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[8:00] * terje (~joey@63-154-128-158.mpls.qwest.net) has joined #ceph
[8:08] * terje (~joey@63-154-128-158.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[8:13] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:30] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) has joined #ceph
[8:31] <- *stxShadow* morning ! .... I'm available now
[8:41] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Remote host closed the connection)
[8:41] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[8:44] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[8:52] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Read error: Connection reset by peer)
[8:52] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[8:54] * san (~san@ has joined #ceph
[9:02] * eschnou (~eschnou@ has joined #ceph
[9:02] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[9:05] * loicd (~loic@87-231-103-102.rev.numericable.fr) has joined #ceph
[9:06] * dcasier (~dcasier@ has joined #ceph
[9:09] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[9:09] * ChanServ sets mode +v andreask
[9:10] * oliver1 (~oliver@p4FD06190.dip0.t-ipconnect.de) has joined #ceph
[9:10] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:11] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has left #ceph
[9:17] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[9:17] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[9:17] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Remote host closed the connection)
[9:18] <loicd> good morning ceph
[9:19] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[9:20] * dcasier (~dcasier@ Quit (Ping timeout: 480 seconds)
[9:28] * juriskr (~oftc-webi@ has joined #ceph
[9:29] * dcasier (~dcasier@ has joined #ceph
[9:29] * sha (~kvirc@ has joined #ceph
[9:30] <juriskr> Hi everybody. We're considering using Ceph as a storage platform. But we have a couple of questions we'd like to clarify.
[9:31] <juriskr> 1. Has anybody used Ceph rbd and exposed these block devices using iSCSI ? Any problems/issues with that ?
[9:31] <juriskr> 2. Has anybody used Ceph in a geo replication configuration, I mean the same cluster spread between 2 sites ?
[9:32] * terje- (~terje@63-154-128-158.mpls.qwest.net) has joined #ceph
[9:32] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[9:32] <juriskr> 3. Can a Ceph cluster mix machines with different archs: 32 and 64 bits ?
[9:40] * terje- (~terje@63-154-128-158.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[9:46] <nigwil> I have an OSD in status 'down' attempting to bring it 'up' is getting an assertion failure
[9:47] <nigwil> http://pastebin.com/JXqFBrN1
[9:49] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[9:49] * ScOut3R (~ScOut3R@ has joined #ceph
[9:49] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[9:52] * bergerx_ (~bekir@ has joined #ceph
[9:56] <stxShadow> juriskr: q2 -> only if you have very low latency between the sites .... and a lot of bandwidth
[9:56] <stxShadow> q3: yes ... it's working
[9:56] <stxShadow> q1: has to be answered by someone else
[9:57] * jksM (~jks@3e6b5724.rev.stofanet.dk) Quit (Ping timeout: 480 seconds)
[10:04] <juriskr> stxShadow: Thanks for the answer. Any experience how low latency should be ? Maybe certain numbers ? Right now we have 50Mbit/s between sites.
[10:05] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[10:05] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[10:07] <stxShadow> as low as possible ..... not quite slower than ethernet !! we have tested with 10 and 20 ms
[10:07] <stxShadow> 20 ms is significantly slower
[10:07] <stxShadow> 50 Mbit : depends on your storage size
[10:08] <stxShadow> you can also get in trouble with quorum with only 2 sites
[10:09] * LeaChim (~LeaChim@ has joined #ceph
[10:10] <juriskr> so let's say we can start with 10ms as a starting point. Right now we have ~40TB of raw data to store, but of course it depends on how frequently data will be changed and so on.
[10:11] <juriskr> can you explain a bit about quorum ? I thought from the architecture point of view there will be single Ceph cluster (with OSD >2) spread over 2 sites ?
[10:12] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:12] <juriskr> The only possible situation to lose quorum I can imagine is when you have an equal number of OSDs on both sides and you lose connection.
[10:14] <juriskr> that's probably the reason we should always keep an odd number of OSDs.
[10:14] <juriskr> if we talk about 2 site configuration.
[10:15] <stxShadow> not odd number of osds -> but odd number of monitors ;)
[10:17] <stxShadow> so .... if you build up a site with 1 monitor .... and the other one with 2
[10:17] <stxShadow> -> if you lose the connection your cluster will stall
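The monitor quorum point stxShadow makes can be sketched as a simple majority check. This is a minimal illustration in Python, assuming the usual rule that monitors need a strict majority of the total to form a quorum; the function names and site labels are hypothetical and not part of any Ceph API.

```python
def has_quorum(reachable_monitors, total_monitors):
    """True when the reachable monitors are a strict majority of all monitors."""
    return reachable_monitors > total_monitors // 2


def quorum_after_split(site_monitors):
    """For each site, report whether it keeps quorum if the inter-site link drops.

    site_monitors maps a (hypothetical) site label to its monitor count.
    """
    total = sum(site_monitors.values())
    return {
        site: has_quorum(count, total)
        for site, count in site_monitors.items()
    }


# The 1 + 2 layout from the chat, and an even 2 + 2 layout for comparison.
print(quorum_after_split({"site_a": 1, "site_b": 2}))
print(quorum_after_split({"site_a": 2, "site_b": 2}))
```

Under that assumption, a 1 + 2 split leaves only the two-monitor site with a quorum, so anything depending on the one-monitor side stalls, and losing the two-monitor site stalls everything. An even 2 + 2 split leaves neither side with a majority.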
[10:18] <juriskr> hmm
[10:19] * tnt (~tnt@ has joined #ceph
[10:19] * schlitzer|work (~schlitzer@ Quit (Quit: Leaving)
[10:19] * nlopes (~nlopes@a95-92-0-12.cpe.netcabo.pt) Quit (Remote host closed the connection)
[10:19] <juriskr> that's strange behavior.
[10:19] <stxShadow> the statement of the developers is: ceph might run on a two site setup .... but it is not designed for that case
[10:20] <stxShadow> -> at the moment
[10:21] * BManojlovic (~steki@ has joined #ceph
[10:21] <juriskr> ok, thanks. I probably need to go deeper and study more manuals.
[10:21] * cking (~king@cpc3-craw6-2-0-cust180.croy.cable.virginmedia.com) has joined #ceph
[10:22] * cking (~king@cpc3-craw6-2-0-cust180.croy.cable.virginmedia.com) Quit ()
[10:22] <juriskr> you probably have ceph cluster up and running ? any other issues, problems or some specific things we have to look at ?
[10:22] <juriskr> you would recommend
[10:23] <stxShadow> my cluster uses 30 TB of data .... in case of recovering an osd ....
[10:23] <stxShadow> we use 2,4 Gbit of bandwidth
[10:23] <stxShadow> first hint: never use the last stable ;) there are mostly lots of unknown bugs
[10:24] <stxShadow> cuttlefish is the last stable .... so we are happy with bobtail
[10:24] <juriskr> ok. what about access protocols ?
[10:25] <stxShadow> second: use ssd space for journals ....
[10:25] <juriskr> clusterfs ? rbd ? any other ?
[10:25] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[10:25] <juriskr> i mean cephfs
[10:25] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[10:25] <stxShadow> we only use rbd .... cephfs is not stable till now
[10:25] * nlopes (~nlopes@a95-92-0-12.cpe.netcabo.pt) has joined #ceph
[10:26] <stxShadow> we tested cephfs 2-3 months ago .... in some cases we were not able to access the data anymore (until restart)
[10:26] * jks (~jks@4810ds1-ns.2.fullrate.dk) has joined #ceph
[10:27] <stxShadow> at the moment rbd and rados are tagged stable
[10:30] <juriskr> ok. this is interesting. so basically we will end up with 2 options right now (meaning access protocols): rbd and rados. Which distro do you use for the cluster nodes ?
[10:31] <stxShadow> Debian and Ubuntu
[10:31] <juriskr> any suggestions on how to use/access Ceph from Windows systems ?
[10:32] <juriskr> this is the reason why I asked about iSCSI. this is the only way I can imagine this right now.
[10:32] <stxShadow> i might be wrong ... but there is no client at the moment
[10:34] <stxShadow> so you have to setup some kind of iscsi <-> rbd / rados gateway
[10:34] <juriskr> a couple more questions: 1. have you tried CentOS/RHEL as a distro for cluster nodes ? 2. Have you experienced any problems with locking, accessing data in Ceph from multiple clients ?
[10:36] <stxShadow> 1) -> no ... Ubuntu 12.04 is the recommended distro .... so i use it ;)
[10:37] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[10:37] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[10:38] <stxShadow> 2) i think you are a little bit wrong here .... ceph is not a cluster fs like lustre etc .... it's a distributed storage ....
[10:38] <stxShadow> some guys use ocfs2 on top of ceph to do that
[10:41] <juriskr> i know. i mean when you access the same object (let's say a file) through the REST interface from multiple clients, you can end up in a situation where you access the same piece of data from different clients. Regarding cluster/shared file systems, we've experienced lots of problems with GFS2 + some specific applications, and the root cause of all those problems was locking.
[10:42] <juriskr> And by the way one more question, what kind of data do you store in your cluster ?
[10:45] <stxShadow> juriskr: sorry ... no experience with that ..... we use it only for our vms (rbd)
[10:45] <stxShadow> actually we provide storage space to a little more than 1000 vms over ceph and rbd
[10:46] <juriskr> and you access those rbd's from hardware nodes or use qemu driver ?
[10:46] <stxShadow> we use the kernel driver
[10:46] <stxShadow> and provide it to qemu
[10:47] <stxShadow> so something like that:
[10:47] <stxShadow> qemu rbd:12345/test.rbd
[10:47] <stxShadow> 12345 is the pool
[10:47] <stxShadow> test.rbd is the vm image
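The pool/image string stxShadow describes can be assembled with a tiny helper. This is a minimal sketch in Python, assuming only the `rbd:<pool>/<image>` form shown in the chat; the helper name `qemu_rbd_spec` is hypothetical, and any further options a real setup might append are not covered here.

```python
def qemu_rbd_spec(pool, image):
    """Build an 'rbd:<pool>/<image>' string in the form shown in the chat.

    Hypothetical helper for illustration only; it just checks that both
    parts are non-empty and that the pool name cannot be confused with
    the pool/image separator.
    """
    if not pool or not image:
        raise ValueError("pool and image must both be non-empty")
    if "/" in pool:
        raise ValueError("pool name must not contain '/'")
    return f"rbd:{pool}/{image}"


# The example from the chat: pool "12345", VM image "test.rbd".
print(qemu_rbd_spec("12345", "test.rbd"))
```

Keeping the string construction in one place makes it harder to swap the pool and image names by accident when many VM images are provisioned.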
[10:47] <juriskr> ok i got it.
[10:48] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[10:48] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[10:49] <juriskr> probably offtopic question, do you use any cloud/virtualization management platform ? any experience how those pieces of software work with Ceph ?
[10:51] <stxShadow> we've tried openstack
[10:51] <stxShadow> works nice
[10:52] <stxShadow> but we've developed our own middleware -> cause some of our requirements were not met
[10:55] <juriskr> and the last question :) any experience with inktank as a company that supports the ceph project ?
[10:55] <stxShadow> yes .... we have a support contract .... really nice guys ....
[10:56] <stxShadow> they have access to our cluster
[10:56] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[10:57] <stxShadow> if we report a bug -> they react very fast and try to solve the problem (worked all the time till now)
[10:58] <juriskr> ok. this is good. so that's it. thanks for the answers and detailed explanation. we'll probably need to try things out to see how it works.
[10:59] <stxShadow> you are welcome :)
[11:03] * rahmu (~rahmu@ip-147.net-81-220-131.standre.rev.numericable.fr) has joined #ceph
[11:06] * terje_ (~joey@63-154-157-244.mpls.qwest.net) has joined #ceph
[11:09] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[11:09] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[11:13] * mschiff (~mschiff@port-29027.pppoe.wtnet.de) has joined #ceph
[11:14] * terje_ (~joey@63-154-157-244.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[11:18] <mschiff> Hi, I am trying to use ceph-deploy on gentoo. When executing "ceph-deploy install" I only get a python Traceback, any Idea anyone how to fix that? ( http://privatepaste.com/84f9022aa8 )
[11:20] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Quit: Konversation terminated!)
[11:21] <mschiff> basically I only wanted to use it to configure ceph, not to install, but a command like "mon create" gives a similar Traceback
[11:21] <mech422> I've noticed pushy just sorta dies...
[11:22] <mech422> if i say mon create mon1 mon2 mon3 mon4 - it always dies on one of them
[11:22] <mech422> usually just rerunning the command works
[11:27] <sha> hi...what does this mean. HEALTH_WARN mds a is laggy
[11:28] <nigwil> sha: check NTP (clocks are sync'd) and network connectivity
[11:31] <sha> nigwil: clocks are sync`d.
[11:31] <stxShadow> so one of your mds is down ?
[11:32] * todin (tuxadero@kudu.in-berlin.de) has joined #ceph
[11:32] <nigwil> sha: what version of Ceph? (ceph -v)
[11:33] <sha> mdsmap e8499: 1/1/1 up {0=a=up:active(laggy or crashed)}
[11:33] <sha> 0.63
[11:33] * dcasier (~dcasier@ Quit (Ping timeout: 480 seconds)
[11:35] <nigwil> sha: anything interesting in tail /var/log/ceph/ceph-mds.a.log
[11:36] <stxShadow> i think your mds is down ..... check for the process
[11:37] * terje_ (~joey@63-154-153-156.mpls.qwest.net) has joined #ceph
[11:37] <nigwil> service ceph status mds.a
[11:37] <sha> 1 min please
[11:41] * xinxinsh (~xinxin.sh@jfdmzpr04-ext.jf.intel.com) has joined #ceph
[11:41] * xinxinsh (~xinxin.sh@jfdmzpr04-ext.jf.intel.com) Quit ()
[11:42] <sha> he was turned off. when we start it ....HEALTH_WARN mds cluster is degraded
[11:43] * xinxinsh (~xinxinsh@jfdmzpr04-ext.jf.intel.com) has joined #ceph
[11:45] * terje_ (~joey@63-154-153-156.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[11:45] * xinxinsh (~xinxinsh@jfdmzpr04-ext.jf.intel.com) Quit ()
[11:47] * jks (~jks@4810ds1-ns.2.fullrate.dk) Quit (Remote host closed the connection)
[11:47] * jks (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[11:50] <sha> nigwil: tail of ceph-mds.a.log http://pastebin.com/8Fw83Ap2
[11:57] <nigwil> sha: hmmm, I thought the Finisher bug had been squashed, not sure what to suggest ehre
[11:57] <nigwil> here
[12:00] * Tamil (~tamil@ has joined #ceph
[12:03] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[12:05] <sha> ok. maybe someone else knows...or has an idea what`s wrong here
[12:07] * terje-_ (~terje@63-154-153-156.mpls.qwest.net) has joined #ceph
[12:08] * tnt (~tnt@ Quit (Ping timeout: 480 seconds)
[12:10] <nigwil> sha: I wonder if adding a second MDS might be an option?
[12:11] <nigwil> are you using ceph-deploy?
[12:12] * andrei (~andrei@host217-46-236-49.in-addr.btopenworld.com) has joined #ceph
[12:14] <sha> mkcephfs
[12:16] * terje-_ (~terje@63-154-153-156.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[12:16] <nigwil> is this a test environment? as in it doesn't matter if you lose it?
[12:17] * rahmu (~rahmu@ip-147.net-81-220-131.standre.rev.numericable.fr) Quit (Remote host closed the connection)
[12:19] <sha> not really. data that are there - are important
[12:19] * Tamil (~tamil@ Quit (Read error: Connection reset by peer)
[12:21] <nigwil> ok. I'm too inexperienced to provide suggestions for fixing a production deployment of Ceph. I suggest waiting until someone from the Ceph team is online or instead post to the newsgroup comp.file-systems.ceph.user to get guidance
[12:34] * dosaboy_ (~dosaboy@host86-161-201-199.range86-161.btcentralplus.com) has joined #ceph
[12:35] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[12:36] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[12:36] * The_Bishop (~bishop@e179012236.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[12:36] * diegows (~diegows@ has joined #ceph
[12:37] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[12:39] * Vjarjadian (~IceChat77@ Quit (Quit: There's nothing dirtier then a giant ball of oil)
[12:41] * dosaboy (~dosaboy@host86-164-82-60.range86-164.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[12:46] * dcasier (~dcasier@ADijon-653-1-18-15.w86-213.abo.wanadoo.fr) has joined #ceph
[12:46] * DarkAce-Z (~BillyMays@ Quit (Ping timeout: 480 seconds)
[12:50] * sha (~kvirc@ Quit (Read error: Connection reset by peer)
[12:50] * san (~san@ Quit (Read error: Connection reset by peer)
[12:57] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:03] * juriskr (~oftc-webi@ Quit (Quit: Page closed)
[13:08] * DarkAce-Z (~BillyMays@ has joined #ceph
[13:09] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[13:23] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:28] * terje-_ (~terje@63-154-153-156.mpls.qwest.net) has joined #ceph
[13:35] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[13:35] * ChanServ sets mode +v andreask
[13:36] * terje-_ (~terje@63-154-153-156.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[13:38] <mrjack_> where can i get libleveldb-dev and libsnappy-dev for squeeze?
[13:38] <mrjack_> could someone add this info to http://ceph.com/docs/master/install/build-prerequisites/#debian ?
[13:42] * terje (~joey@63-154-157-112.mpls.qwest.net) has joined #ceph
[13:45] * fridudad (~oftc-webi@fw-office.allied-internet.ag) has joined #ceph
[13:46] <fridudad> since updating to upstream/cuttlefish my log is full of caller_ops.size 3002 > log size 3001 while restarting an osd
[13:50] * terje (~joey@63-154-157-112.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[13:53] * sha (~kvirc@ has joined #ceph
[13:58] * fridudad (~oftc-webi@fw-office.allied-internet.ag) Quit (Quit: Page closed)
[13:58] <sha> pinh
[13:58] * fridudad (~oftc-webi@fw-office.allied-internet.ag) has joined #ceph
[14:07] * jeff-YF (~jeffyf@stt192137.broadband.vi) has joined #ceph
[14:08] * jeff-YF (~jeffyf@stt192137.broadband.vi) Quit ()
[14:09] <sha> who can tell what this is http://pastebin.com/hTENKXzS
[14:09] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[14:09] * san (~san@ has joined #ceph
[14:10] <joao> sha, ?
[14:10] <sha> see the log http://pastebin.com/hTENKXzS
[14:11] <joao> sha, what appears to be the problem with that message?
[14:11] <joao> afaict, everything is good in mon-land
[14:11] <sha> why it`s 90% available....
[14:11] <joao> mon is in quorum, and has 90% available disk space in its store disk
[14:12] <joao> well, I suppose 10% are taken?
[14:12] <sha> ooo thnx) sounds good
[14:15] <fridudad> since cuttlefish my whole ceph cluster becomes unstable when an osd is missing. The log is full of osd.54 [ERR] 4.97d caller_ops.size 3002 > log size 3001
[14:15] <fridudad> can someone help?
[14:17] <mrjack_> fridudad: i have observed this, but what do you mean by unstable? no IO?
[14:18] <fridudad> mrjack_ right after these messages i get a lot of slow request messages and I/O gets slow / stalled on clients / qemu rbd
[14:19] <fridudad> osd.3 [WRN] slow request 45.152716 seconds old, received at 2013-05-31 13:42:55.579955: osd_op(client.9439559.0:3387071 rbd_data.5039256b8b4567.000000000000290f [write 2568192~12288] 4.f4a3cae1 RETRY=1 snapc 4da2=[] e90858) v4 currently reached pg
[14:20] <tnt> I had that same error yesterday ... twice for the same PG.
[14:22] <fridudad> tnt: did you also have slow requests and stalled I/O?
[14:22] <fridudad> the strange thing is these slow request messages are from the osd which is down
[14:22] <fridudad> so it's clear that they don't get answered
[14:23] <tnt> fridudad: yeah, yesterday evening was epic :)
[14:23] <fridudad> tnt: so it's a bug in cuttlefish?
[14:23] <fridudad> anybody from ceph here who can comment on that?
[14:24] <tnt> Well, that message is either a bug, or maybe it detects an abnormal condition and corrects for it ... no idea what it really means.
[14:24] <mrjack_> fridudad: i have seen the same, i can force that when i rbd rm <largeimage>..
[14:24] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[14:25] <nhm> tnt: fridudad I don't suppose you guys have submitted a bug in the tracker yet?
[14:25] <fridudad> mrjack_ tnt: I've posted to the ceph devel mailing list. It would be nice if you could comment on that so Inktank knows that there are more users facing this problem with cuttlefish
[14:25] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[14:25] <fridudad> nhm: i've seen this 5 minutes ago the first time => so no
[14:25] * portante (~user@c-24-63-226-65.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[14:25] * julian_ (~julian@ has joined #ceph
[14:25] <fridudad> nhm: don't know about tnt and mrjack_
[14:25] <nhm> fridudad: ok, I'm not the right person to help with this probably, but the more info we can get the better
[14:26] <tnt> nhm: well ... it recovered just fine, so I'm not even sure it's a bug. And as for the issues I had with IO last night, they're the general high-monitor load/IO issue.
[14:26] <fridudad> tnt: ok i've no high load monitor issue (SSDs here)
[14:26] <fridudad> nhm: ok i'll open up one in the tracker
[14:26] <nhm> tnt: yeah, Sage and I have been profiling.
[14:26] <nhm> tnt: Sage had some ideas
[14:27] <tnt> nhm: yes, I read his post this morning. I actually also tried setting some non-zero cache value for leveldb read, but that didn't seem to change all that much.
[14:27] <fridudad> nhm: what to set for target version? cuttlefish?
[14:28] <tnt> nhm: I also tried enabling compression. which drastically reduced the IO load ... and bumped the CPU load even higher.
[14:29] * JonTheNiceGuy (~oftc-webi@ has joined #ceph
[14:30] <nhm> fridudad: hrm, probably if it lets you.
[14:30] <fridudad> nhm: oh ok no just 0.64 so I'll keep it empty
[14:30] <nhm> fridudad: yeah, not sure why it does that
[14:31] <tnt> isn't there a backport field ?
[14:31] <fridudad> tnt: yes but no pull down it's a free text field
[14:31] <JonTheNiceGuy> Hi. I have a few very basic questions about Ceph, which I'm not getting the answers from the Getting Started guides, and I wasn't sure whether I was just missing the point or not :)
[14:32] <JonTheNiceGuy> If I've got three servers which are going to provide the storage, do I end up with 3 x OSDs, 2 x Monitors and 3 x MDS?
[14:32] * julian (~julian@ Quit (Ping timeout: 480 seconds)
[14:32] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[14:33] <fridudad> mrjack_ tnt: i've added http://tracker.ceph.com/issues/5216 so it would be great if you can report it too
[14:33] <fridudad> or just say that you've seen the same
[14:33] * terje- (~terje@63-154-157-112.mpls.qwest.net) has joined #ceph
[14:33] <JonTheNiceGuy> Also, if each of those 3 boxes has got 100GB storage (just for example), how much approximate space will I end up with? Is it 100GB (I saw an image showing primary OSD, secondary OSD and tertiary OSD), or is it different?
[14:33] * tnt replied on the ml.
[14:34] <stxShadow> JonTheNiceGuy: you need an odd number of monitors to build quorum
[14:34] <JonTheNiceGuy> stxShadow: D'oh, that was a typo - meant to put 3 on monitors also.
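A rough illustration of why odd monitor counts are usually recommended, assuming the monitors need a strict majority to form quorum. The function names are made up for this sketch and are not part of any Ceph tool.

```python
def quorum_size(monitor_count):
    """Smallest strict majority of the monitors (assumption: majority quorum)."""
    return monitor_count // 2 + 1


def tolerated_failures(monitor_count):
    """How many monitors can be lost while a majority still remains."""
    return monitor_count - quorum_size(monitor_count)


# Compare a few cluster sizes: monitors, quorum size, failures tolerated.
for count in (1, 2, 3, 4, 5):
    print(count, quorum_size(count), tolerated_failures(count))
```

Under that assumption, going from 3 to 4 monitors does not buy any extra failure tolerance, which is the usual argument for 3 or 5.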
[14:34] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[14:34] <stxShadow> the space amount depends on the replication level
[14:34] <tnt> JonTheNiceGuy: in general the available space is the total disk space / replication level (which is 2 by default)
[14:34] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[14:35] <stxShadow> if its got 2 -> 300 GB / 2 = 150 GB usable
[14:35] <stxShadow> approximately
[14:35] <stxShadow> if its 3 -> 300/3 etc
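The rule of thumb above (total disk space divided by the replication level) can be sketched as follows. This is only the approximation given in the channel; it ignores journals, filesystem overhead and uneven data distribution, and the function name is hypothetical.

```python
def usable_capacity_gb(raw_per_osd_gb, osd_count, replication_level):
    """Approximate usable space: total raw space divided by the replica count."""
    if replication_level < 1:
        raise ValueError("replication level must be at least 1")
    total_raw_gb = raw_per_osd_gb * osd_count
    return total_raw_gb / replication_level


# Three servers with 100 GB each, as in the question above.
for level in (2, 3):
    print(level, usable_capacity_gb(100, 3, level))
```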
[14:35] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[14:36] <stxShadow> you can start with 2 osd, 1 mon and 1 mds (only needed for cephfs)
[14:36] <stxShadow> if you dont use cephfs -> mds is not required
[14:36] <stxShadow> so minimal setup is 2 osd and 1 mon
[14:36] <JonTheNiceGuy> stxShadow: I plan to be proxying the connections to the store via a NFS mount, so I suspect I'll need the MDS
[14:37] <stxShadow> ok .... but at the moment -> cephfs is not stable
[14:38] <stxShadow> or has something changed ?
[14:38] <JonTheNiceGuy> Should I be using RBD then instead?
[14:38] <stxShadow> RBD is stable for sure
[14:39] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[14:41] <JonTheNiceGuy> Should the admin node be kept separate from the OSD nodes?
[14:41] * terje- (~terje@63-154-157-112.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[14:41] <stxShadow> hmm ? what do you mean by "admin node" ?
[14:42] <tnt> the one with an admin keyring
[14:42] <stxShadow> ahhh ... ok sorry ;)
[14:43] <stxShadow> there is no reason why it should be separated ... right ?
[14:43] <tnt> well, personally I use a separate machine for admin tasks and the OSDs don't have the admin key. No reason for them to have it.
[14:43] <JonTheNiceGuy> Sorry about all these questions, it's just hard trying to get my head around the deployment strategy. I've only been looking since yesterday though, so that's good :)
[14:44] <tnt> but you could just pick one OSD to use as admin and just put the key on there.
[14:45] <stxShadow> so ... he asked for the minimal setup ..... and that implies .... that the admin keyring could be on one of the osds
[14:46] * tziOm (~bjornar@ti0099a340-dhcp0745.bb.online.no) has joined #ceph
[14:48] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:50] * andrei (~andrei@host217-46-236-49.in-addr.btopenworld.com) Quit (Quit: Ex-Chat)
[14:56] <JonTheNiceGuy> If I go with CephFS, should I have the same number of MDS as MONs?
[14:57] <sig_wal1> hello
[14:58] <sig_wal1> can META_COLL be placed on a separate drive from the other osd files? I used mount --bind but the osd became inconsistent after 2 restarts
[14:58] <mrjack_> joao: i have a log of rbd rm <bigimage> which kicked out one osd.. i have a mon log with debug and the osd log of the failing osd... which issues were these? do you remember the issue id?
[14:59] <stxShadow> JonTheNiceGuy: No ... I don't think so. Actually there is no active / active setup possible for mds
[14:59] * jahkeup (~jahkeup@ has joined #ceph
[15:01] <JonTheNiceGuy> So this is what I'm aiming for: http://pastebin.com/pAZPJWJp
[15:02] <JonTheNiceGuy> Does the allocation of roles look right?
[15:02] <JonTheNiceGuy> Should/can I add an OSD role to the storage proxy?
[15:07] * rahmu (~rahmu@ has joined #ceph
[15:15] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[15:16] <mrjack_> is there already a ticket for "slow rbd rm" and "wrongly marked osd down"?
[15:19] * terje-_ (~terje@63-154-157-112.mpls.qwest.net) has joined #ceph
[15:19] * andrei (~andrei@host217-46-236-49.in-addr.btopenworld.com) has joined #ceph
[15:20] <andrei> hello guys
[15:20] <andrei> was wondering if someone could help me with this: health HEALTH_WARN mds cluster is degraded; clock skew detected on mon.a
[15:20] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[15:21] <andrei> i do have ntp running on all 3 monitor servers
[15:22] <fridudad> andrei: this one gets cleared when you restart mon.a
[15:23] <mrjack_> there is no need to restart mon.a
[15:23] <tnt> mrjack_: I think there is already a slow rbd rm fix.
[15:23] <mrjack_> tnt: that is rbd parallel rm patch...
[15:23] <fridudad> mrjack_ oh how to solve it? I had the same yesterday and i was only able to clear it with restarting the mon
[15:23] <mrjack_> tnt: but it kicks out osds as well on my setup so it might be interesting as i captured this one luckily with full logging ;)
[15:24] <mikedawson> andrei: I've seen the 'clock skew detected' issue with clocks properly synced via ntp, too. Like fridudad, I just had to restart monitors.
[15:24] <mrjack_> fridudad: you don't need to restart monitors, just syncing the clocks and waiting for ceph to check for clock skew again fixes it, too
[15:24] <fridudad> mrjack_ sure as the parallel one does not reduce I/O it only speeds up the rbd process itself
[15:24] <andrei> thanks guys, i will try that
[15:24] <fridudad> mrjack_ how often does it check? i waited 10 min and the message did not disappear
[15:24] <mrjack_> fridudad: don't know...
[15:25] <mrjack_> fridudad: a mon-restart will work, but i also noticed that this goes away automagically..
[15:25] <fridudad> mrjack_ after restarting it was away and never came back ;-) also restarting a mon does not shut my cluster down like an OSD ;-) so it was low risk
[15:26] <mrjack_> tnt: it does not make sense to me...
[15:26] <mrjack_> tnt: the osd gets marked down with rbd rm.. but if i put normal load on the cluster i can't get the osd to fail..
[15:27] <tnt> mrjack_: What's the load like on the mon ?
[15:27] <mrjack_> tnt: when there is no load on the cluster and i do rbd rm, it always kicks out the osd, i think there are some more errors here
[15:27] * capri (~capri@ Quit (Read error: Connection reset by peer)
[15:27] * terje-_ (~terje@63-154-157-112.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[15:27] <tnt> (IO / CPU)
[15:27] <mrjack_> tnt: almost no load
[15:27] * capri (~capri@ has joined #ceph
[15:27] <mrjack_> tnt i can see from my log, if the osd gets marked down falsely, it tries to reconnect again, but these reconnects look like DoS to me ... ;)
[15:28] <mrjack_> >> pipe(0x95b0a00 sd=31 :39951 s=2 pgs=155 cs=25 l=0).fault, initiating reconnect
[15:28] <mrjack_> tnt: these are the reconnects the osd sends..
[15:29] <mrjack_> zcat /var/log/ceph/ceph-osd.6.log.1.gz |grep "2013-05-30 22:34:05" |grep reconne |wc -l
[15:29] <mrjack_> 5774
[15:29] <mrjack_> almost 6000 retries in a second.. seems too much to me...
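The zcat/grep/wc pipeline above counts reconnects for one hand-picked second. A minimal Python sketch of the same idea, counting for every second at once, might look like this. It assumes each log line starts with a 19-character `YYYY-MM-DD HH:MM:SS` timestamp and contains the text "initiating reconnect" as in the message quoted above; the function names and the sample lines are illustrative only.

```python
import gzip
from collections import Counter


def reconnects_per_second(lines):
    """Count 'initiating reconnect' messages, keyed by the timestamp second."""
    counts = Counter()
    for line in lines:
        if "initiating reconnect" in line:
            # Assumption: the first 19 characters are 'YYYY-MM-DD HH:MM:SS'.
            counts[line[:19]] += 1
    return counts


def reconnects_in_gz_log(path):
    """Same count, reading a gzip-compressed log file line by line."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return reconnects_per_second(handle)


# Made-up sample lines, shaped like the message quoted in the channel.
sample = [
    "2013-05-30 22:34:05.000001 >> pipe(0x1 sd=31).fault, initiating reconnect",
    "2013-05-30 22:34:05.000002 >> pipe(0x1 sd=31).fault, initiating reconnect",
    "2013-05-30 22:34:06.000001 >> pipe(0x1 sd=31).fault, initiating reconnect",
    "2013-05-30 22:34:06.000002 some unrelated message",
]
for second, count in reconnects_per_second(sample).most_common():
    print(second, count)
```

Calling `reconnects_in_gz_log("/var/log/ceph/ceph-osd.6.log.1.gz").most_common(5)` would then list the busiest seconds, assuming the file exists and follows that layout.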
[15:30] <tnt> yeah, I've seen that as well.
[15:31] <tnt> I know I reported it here, but I don't remember if there was a ticket # for that
[15:31] <mrjack_> tnt: yeah, this seems not right to me, there should be a timeout or so, so that it does not send so many reconnects per second
[15:32] <tnt> yup
[15:35] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Quit: Ex-Chat)
[15:35] * dcasier (~dcasier@ADijon-653-1-18-15.w86-213.abo.wanadoo.fr) Quit (Read error: Connection reset by peer)
[15:38] <fridudad> tnt nhm mrjack_ to me it seems that my stalled I/O problem is that after restarting the OSD it sets itself online when it isn't really ready to serve I/O. Then it gets requests which it never answers.
[15:39] <mrjack_> fridudad: hm
[15:39] <mrjack_> is there a way to find out which rbd images consume which amount of IO?
[15:40] <Kioob> with kernel client, on client side, yes
[15:40] <mrjack_> Kioob: no not on client side..
[15:40] * gurubert (~gurubert@pD9FF7F9B.dip0.t-ipconnect.de) has joined #ceph
[15:40] <Kioob> (with "iostat" for example)
[15:40] <gurubert> hi
[15:41] <fridudad> Kioob: with iostat, how would you see which rbd image?
[15:41] <gurubert> are there recommendations for client capablities when the client uses CephFS ?
[15:41] <mrjack_> Kioob: ceph -w does show totals: 14915KB/s rd, 1840KB/s wr, 1991op/s - i'd like to have that e.g. rbd iostats that lists io for each image..
[15:41] <gurubert> something like mon allow r, osd allow rw, mds allow rw?
[15:42] <Kioob> fridudad: for example : http://pastebin.com/tb4SRGYt
[15:43] <tnt> no, there is no per-image IO accounting in ceph AFAIK
[15:43] <fridudad> Kioob: ah OK, kernel driven. I'm going the qemu way and thought mrjack_ would like to see the load on the osd servers
[15:43] <Kioob> mrjack_: I understand... but I didn't find how to do that
[15:43] <niklas> Where would I find the meaning of the error codes, librados passes when some call goes wrong?
[15:43] <mrjack_> Kioob: yeah, this would be a feature request ;)
[15:44] <Kioob> Yes. I see that Xen will support librbd, to have better performance. Then I will lose IOstats :(
[15:46] <fridudad> mrjack_: you don't believe my theory? At least in my case if i start an osd it takes up to 200% CPU for at least 30-90 seconds - i think it does some scanning or flushing journal or whatever. But it says immediately - hey i'm here give me I/O requests but it can't serve them...
[15:47] <tnt> Kioob: actually my xen librbd driver maintains some stats ... and it wouldn't be that hard to extend to whatever you want to monitor. (currently it's mostly IOPs)
[15:47] <Kioob> great :)
[15:48] <tnt> fridudad: and so if you just shut it down but don't start it again you don't have slow requests ?
[15:48] <niklas> wido: do you know where to look for the explanation of the error codes, librados throws at me?
[15:49] <andrei> after a reboot of one of the servers I am seeing the following error while starting osds
[15:49] <andrei> 2013-05-31 14:45:46.346709 7febfe91c780 -1 filestore(/var/lib/ceph/osd/ceph-0) Error initializing leveldb: Corruption: CURRENT file does not end with newline
[15:49] <andrei> 2013-05-31 14:45:46.346804 7febfe91c780 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-0: (1) Operation not permitted
[15:49] <wido> niklas: Yes, /usr/include/errno.h
[15:49] <tnt> niklas: they're most likely standard error codes like -EIO, -EACCES, -E... see errno.h
[15:49] <wido> tnt: niklas: Java doesn't support looking up those error codes, so it just throws an integer back
[15:49] <andrei> anyone have an idea what's wrong with the osd?
[15:50] * diegows (~diegows@ has joined #ceph
[15:50] <tnt> andrei: fs seems to have been corrupted
[15:50] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) has joined #ceph
[15:50] <fridudad> tnt: not sure if i want to try that ... i've formatted my testing cluster so i've only a production system right now. But it seems i then only get the caller_ops.size 3002 > log size 3001 messages
[15:50] <andrei> tnt: on all 8 osds?
[15:50] <andrei> how could this happen?
[15:51] <niklas> wido: I know, I'll have a look at errno.h
[15:51] <niklas> thx
[15:51] <andrei> i've just done a server reboot
[15:51] <andrei> with init 6
[15:51] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) Quit ()
[15:52] <andrei> all my /var/lib/ceph/osd/ceph-* are mounted
[15:52] <andrei> and I can see data there
[15:53] * redeemed (~redeemed@static-71-170-33-24.dllstx.fios.verizon.net) has joined #ceph
[15:53] <andrei> the other strange thing is that ceph -s is showing that 16 out of 17 osds are up; however, that's not the case as one server with 9 osds is not up. these are the osds that I can't start
[15:54] <niklas> Hmm, /usr/include/errno.h does not help, but I guess they are the same as http://www.virtsync.com/c-error-codes-include-errno?
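If the return codes really are negated standard errno values, as suggested above, they can be decoded without reading errno.h by hand. A small Python sketch using only the standard library; the function name is hypothetical and the mapping depends on the platform it runs on.

```python
import errno
import os


def describe_error_code(return_code):
    """Map a (possibly negative) errno-style return code to a name and message."""
    code = abs(return_code)
    # errno.errorcode maps numbers to symbolic names such as 'EIO'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's human-readable message for the number.
    return name, os.strerror(code)


for value in (-errno.ENOENT, -errno.EIO, -errno.EACCES):
    print(value, *describe_error_code(value))
```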
[15:56] * JonTheNiceGuy (~oftc-webi@ Quit (Quit: Page closed)
[15:59] * gurubert (~gurubert@pD9FF7F9B.dip0.t-ipconnect.de) Quit (Quit: Leaving)
[15:59] * jluis (~JL@ has joined #ceph
[16:05] * joao (~JL@89-181-159-84.net.novis.pt) Quit (Ping timeout: 480 seconds)
[16:07] * Jahkeup_ (~jahkeup@ has joined #ceph
[16:07] * jahkeup (~jahkeup@ Quit (Read error: Connection reset by peer)
[16:12] * fridudad (~oftc-webi@fw-office.allied-internet.ag) Quit (Remote host closed the connection)
[16:14] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[16:14] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[16:15] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[16:22] * xinxinsh (~xinxinsh@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[16:24] * xinxinsh (~xinxinsh@jfdmzpr05-ext.jf.intel.com) Quit ()
[16:31] <Kioob> Hi, I have "corrupted objects", which do not exist in "rados ls"
[16:31] <Kioob> (an old image)
[16:32] <Kioob> Is there a way to safely remove those files on OSD ?
[16:37] <andrei> does anyone know if ceph startup script automatically unmounts osds when you shutdown the server?
[16:38] <nhm> andrei: the OSD volumes? I don't think so.
[16:38] <tnt> That's the OS job ... fstab is there for that
[16:38] <andrei> nhm: it automatically mounts it during the startup
[16:38] <andrei> so i assumed it will automatically unmount it as well
[16:38] <andrei> )))
[16:38] <tnt> imho it should do neither
[16:39] <Kioob> automatically mounts ??? how ?
[16:39] <andrei> Kioob: not really sure
[16:39] <andrei> it just does for me
[16:39] <andrei> i have osds in ceph.conf
[16:40] <andrei> and when service ceph start osd.0
[16:40] <andrei> it would automatically mount it if it is not mounted
[16:40] <Kioob> the blockdevice and FS to use are not in ceph.conf
[16:40] <andrei> here is what i have in ceph.conf
[16:40] <andrei> [osd.0]
[16:40] <andrei> host = arh-ibstorage1-ib
[16:40] <andrei> devs = /dev/disk/by-id/scsi-35000cca01a9acbac
[16:40] <andrei> osd journal size = 15360
[16:40] <andrei> osd journal = /dev/disk/by-id/scsi-SATA_INTEL_SSDSC2CW2CVCV206304HL240CGN-part5
[16:40] <andrei> please let me know if this is not correct
[16:41] <andrei> as i am learning
[16:41] <Kioob> oh. Never seen that before
[16:41] <andrei> i am specifying block device for osd
[16:41] <tnt> osd journal size is only for journal as file IIRC. If you have a device, it's not needed.
[16:41] <Kioob> I can't say...
[16:41] <andrei> and partition for the journal
[16:41] <andrei> and it would automatically mount it
[16:42] <andrei> now, the question is does it automatically unmount it?
[16:42] <andrei> it seems not
[16:42] <andrei> as I am having issues with starting osds on one of the servers
[16:42] <andrei> none of them would start
[16:42] <andrei> and I can't figure out why
[16:42] <andrei> i can see that there is a mount issue with one of the disks
[16:43] <andrei> however all other 7 osds are mounted okay
[16:43] <tnt> most distrib I know will umount all mounted FS on shutdown anyway.
[16:43] <andrei> tnt: thanks
[16:43] <andrei> i should have thought of that
[16:43] <andrei> that is true
[16:44] <andrei> so, that doesn't explain why one of the drives is corrupted
[16:44] * vipr_ (~vipr@78-23-113-4.access.telenet.be) has joined #ceph
[16:44] * vipr_ (~vipr@78-23-113-4.access.telenet.be) Quit ()
[16:46] <andrei> i wonder if it has anything to do with the firmware upgrade of the lsi controller which i've done today?
[16:48] * terje_ (~joey@63-154-130-33.mpls.qwest.net) has joined #ceph
[16:49] * xinxinsh (~xinxinsh@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[16:49] * vata (~vata@2607:fad8:4:6:7030:ba13:af9d:b4c) has joined #ceph
[16:56] * terje_ (~joey@63-154-130-33.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[16:57] * xinxinsh (~xinxinsh@jfdmzpr05-ext.jf.intel.com) Quit (Quit: Leaving)
[16:59] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[17:06] * loicd (~loic@87-231-103-102.rev.numericable.fr) Quit (Quit: Leaving.)
[17:06] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:09] <ron-slc> Hello all. Are there any upcoming point-releases for cuttlefish? I'm about to upgrade from Bobtail.
[17:09] <ron-slc> And if there's going to be a point-release coming soon, I'll wait the week or so.
[17:10] <mrjack_> ron-slc: then wait
[17:10] <mrjack_> ron-slc: there will be 0.61.3 soon
[17:11] <ron-slc> mrJack_: perfect! thanks. I've seen a bit of mention on back-porting a few fixes. Thanks for the info
[17:12] <jluis> most fixes have already been backported to the cuttlefish branch, so I suppose .3 is Coming Real Soon Now
[17:14] * terje- (~terje@63-154-130-33.mpls.qwest.net) has joined #ceph
[17:17] * bergerx_ (~bekir@ Quit (Quit: Leaving.)
[17:17] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:19] * tnt (~tnt@212-166-48-236.win.be) Quit (Read error: Operation timed out)
[17:19] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit ()
[17:20] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[17:21] * eschnou (~eschnou@ Quit (Remote host closed the connection)
[17:22] * terje- (~terje@63-154-130-33.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[17:27] * yehudasa (~yehudasa@2607:f298:a:607:dc9c:c9bf:9554:d22d) Quit (Ping timeout: 480 seconds)
[17:35] * yehudasa (~yehudasa@2607:f298:a:607:3d09:4ac6:3111:56c6) has joined #ceph
[17:35] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[17:39] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:39] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[17:40] * tnt (~tnt@ has joined #ceph
[17:42] <joelio> It may be worth adding python-setuptools as a dependency for ceph-deploy on 13.04 at least.. my minimal pxe doesn't ship with it
[17:43] <joelio> also support for proxies to wget the key
[17:46] * san|2 (~kvirc@ has joined #ceph
[17:49] * The_Bishop (~bishop@f052103195.adsl.alicedsl.de) has joined #ceph
[17:51] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Quit: Leaving)
[17:59] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[18:01] * ScOut3R (~ScOut3R@ Quit (Read error: Operation timed out)
[18:08] * BillK (~BillK@124-148-124-185.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[18:09] * Jahkeup_ (~jahkeup@ Quit (Ping timeout: 480 seconds)
[18:10] <san|2> Hello, can somebody explain this: osd.10 [ERR] 3.3c osd.10: soid 1a9b83c/rbd_data.154c6b8b4567.000000000005b79e/head//3 digest 427726981 != known digest 211948696
[18:10] <san|2> 2013-05-31 22:06:17.791397 osd.10 [ERR] 3.3c osd.14: soid 1a9b83c/rbd_data.154c6b8b4567.000000000005b79e/head//3 digest 427726981 != known digest 211948696
[18:10] <san|2> 2013-05-31 22:06:20.492229 osd.10 [ERR] 3.3c repair 0 missing, 2 inconsistent objects
[18:10] <san|2> 2013-05-31 22:06:20.492256 osd.10 [ERR] 3.3c repair 4 errors, 4 fixed
[18:14] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[18:14] <Kioob> if I read it correctly: it found 2 inconsistent objects, then fixed them
[18:15] * oliver1 (~oliver@p4FD06190.dip0.t-ipconnect.de) has left #ceph
[18:15] * PerlStalker (~PerlStalk@ has joined #ceph
[18:15] <san|2> good
[18:15] <san|2> Our cluster stood all night in a state of degradation at the level of 1.728%
[18:15] <Kioob> your cluster is in HEALTH_OK state ?
[18:16] <san|2> health HEALTH_ERR 72 pgs degraded; 93 pgs inconsistent; 3 pgs repair; 101 pgs stuck unclean; recovery 34608/2003142 degraded (1.728%); 177 scrub errors; mds cluster is degraded; mds a is laggy
[18:16] <Kioob> ouch
[18:17] <san|2> not ok ))
[18:17] <san|2> sorry for my English
[18:17] <san|2>
[18:18] <san|2> I cannot understand why the recovery process hangs at 1.728%
[18:21] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[18:23] <jluis> san|2, any down osds?
[18:29] <san|2> Yes, an HDD failed one week ago (after the failover the cluster state was OK). We upgraded to 0.63, and only after the update removed the drive from the CRUSH map. Then the degradation started at about 9% and got stuck at 1.728%
[18:29] <Kioob> when I do a "ceph pg scrub 3.1", no scrub is launched : the state of the PG doesn't change (active+clean+inconsistent). Why ? :S
[18:34] * terje- (~root@ has joined #ceph
[18:34] * rahmu (~rahmu@ Quit (Remote host closed the connection)
[18:39] <san|2> Do I understand correctly that ceph cannot find the remaining 1.728%?
[18:39] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Quit: Konversation terminated!)
[18:40] <Kioob> (Answering myself: because of osd_scrub_load_threshold. Manual scrubs seem to follow this as well. After changing it, the scrub starts)
[18:42] <Kioob> san|2: some PG are not in "active+clean" state, and those PG seems to represent 1.728% of the objects
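The 1.728% figure is just the ratio printed in the health line quoted earlier (`recovery 34608/2003142 degraded`). A small sketch that pulls the two numbers out of such a line and recomputes the percentage; the regular expression assumes the exact wording seen in this log, and the function name is made up.

```python
import re


def degraded_percent(health_line):
    """Recompute the degraded percentage from a 'recovery X/Y degraded' status."""
    match = re.search(r"recovery (\d+)/(\d+) degraded", health_line)
    if match is None:
        return None
    degraded, total = (int(group) for group in match.groups())
    return 100.0 * degraded / total


line = ("health HEALTH_ERR 72 pgs degraded; 93 pgs inconsistent; "
        "recovery 34608/2003142 degraded (1.728%); 177 scrub errors")
print(round(degraded_percent(line), 3))
```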
[18:43] <terje-> Hey sage, I missed your im earlier.. you still around?
[18:44] <san|2> Kioob: thanks for the reply. Is there a way to find out why they are not in "active+clean" state?
[18:45] <Kioob> "ceph health detail" give more detail
[18:45] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:46] <Kioob> then, when you have a PG id, you can do : ceph pg 8.372 query
[18:46] <Kioob> it's very verbose, and you'll often find more help here
[18:46] <Kioob> (but not always)
[18:47] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[18:49] <san|2> Ok I got it. Thank you
[18:50] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (Ping timeout: 480 seconds)
[18:51] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[18:53] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has joined #ceph
[18:54] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[18:56] * tkensiski (~tkensiski@ has joined #ceph
[18:56] * tkensiski (~tkensiski@ has left #ceph
[18:56] <san|2> http://pastebin.com/XuQ5Q6SM ceph pg query
[18:57] <san|2> any ideas?
[19:02] * rturk-away is now known as rturk
[19:04] <loicd> sjust: I don't understand the rationale for the merge_old_entry at https://github.com/dachary/ceph/blob/563acd2092e26960b1a67aef7e30aa3c2e454291/src/osd/PGLog.cc#L279 as tested in https://github.com/dachary/ceph/blob/563acd2092e26960b1a67aef7e30aa3c2e454291/src/test/osd/TestPGLog.cc#L252 . A hint would be much appreciated :-)
[19:05] <Kioob> san|2: it's truncated, what do you see in last lines ? "ceph pg XXX query | tail -n20"
[19:06] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[19:08] <san|2> http://pastebin.com/0NLZbCes
[19:10] <san|2> To my regret I do not see any reason why the cluster is stopped at 1.728 percent
[19:11] <Kioob> oh ok, so it was not truncated :S
[19:12] <Kioob> san|2 : and all your OSD are up ?
[19:12] <san|2> yes
[19:13] <Kioob> well, I'm not able to help you here... sorry
[19:13] <cjh_> when clients write information into the ceph cluster they first download a copy of the current crush map from the monitor right?
[19:16] * andrei (~andrei@host217-46-236-49.in-addr.btopenworld.com) Quit (Ping timeout: 480 seconds)
[19:19] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[19:25] * jluis is now known as joao
[19:29] * san|2 (~kvirc@ Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[19:29] <cjh_> loicd: i've been keeping an eye on your pg questions
[19:30] <cjh_> could you explain how an object goes from an object to a placement group?
[19:30] <cjh_> i understand it's broken up into chunks but then what happens?
[19:32] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Quit: Leaving.)
[19:32] <sagewk> loicd, sjust: should this one be closed too? https://github.com/ceph/ceph/pull/327
[19:33] <cjh_> sagewk: if i want to find out how an object gets into the cluster and where it goes should i continue to reference the phd paper or is there more up to date docs now somewhere else?
[19:35] * Vjarjadian (~IceChat77@ has joined #ceph
[19:39] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[19:39] <sagewk> cjh_: there's lots of random docs for internals, and some arch docs, on ceph.com/docs/master
[19:39] <cjh_> yeah i'm moving through them slowly trying to put the pieces together. i'm getting asked some deep questions about how an object gets stored in the cluster and i don't know how to answer
[19:42] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[19:46] * Tamil (~tamil@ has joined #ceph
[19:48] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[19:48] <loicd> cjh_: I'm not sure I understand your question. When an object enters the cluster, it is written to the primary OSD. This OSD then copies it to the replicas. Is that what you're asking ?
[19:48] <cjh_> loicd: so i guess my question is around what crush uses to calculate the primary osd
[19:49] <cjh_> the cluster map as it is referred to on the wiki is a collection of many things. do clients always download the entire cluster map before they start communicating?
[19:49] <tnt> well, it uses crush :P that's the name of the algorithm to map PG to OSD.
[19:49] <cjh_> my PG map for instance is 119MB. does it download all of that?
[19:49] <cjh_> tnt: oh i know :)
[19:50] <cjh_> so a new client to the cluster contacts the monitor, what does he download ?
[19:50] <tnt> it doesn't need the pgmap, just the crushmap and osdmap.
[19:51] <cjh_> ah the osdmap, that's the piece i was missing i think
[19:51] <cjh_> can i dump the osd map from the running cluster to look at it?
[19:51] <tnt> basically which OSD are up/down.
[19:51] <cjh_> and the monitors maintain the PG map right?
[19:51] <tnt> ceph osd dump -o osdmap IIRC
[19:51] <cjh_> awesome :)
[19:51] <cjh_> that's 47K
[19:51] <tnt> yes AFAIK the pgmap is mon stuff.
[19:51] <cjh_> + my 7.3K for my crush map
[19:52] <cjh_> which is nothing
[19:52] <tnt> and the extracted osdmap may actually contain the crushmap inside it btw.
[19:52] <gregaf> the crush map is actually embedded in the osdmap
[19:52] <cjh_> oh ok
[19:53] <cjh_> so it's actually 47K for the whole payload it downloads
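A toy sketch of the two-step lookup discussed above: the object name is hashed to a placement group, and the placement group is then mapped to an ordered list of OSDs, the first being the primary. This is NOT Ceph's real code: `zlib.crc32` stands in for Ceph's own hash, and a simple highest-score ranking stands in for CRUSH, which additionally uses the hierarchy and weights from the crushmap. All names here are hypothetical.

```python
import zlib


def object_to_pg(object_name, pg_num):
    """Stand-in for step 1: hash the object name into one of pg_num groups."""
    return zlib.crc32(object_name.encode()) % pg_num


def pg_to_osds(pool_id, pg, up_osds, replicas):
    """Stand-in for step 2 (not CRUSH): rank the up OSDs by a per-PG score."""
    def score(osd):
        return zlib.crc32(f"{pool_id}.{pg}.{osd}".encode())

    ranked = sorted(up_osds, key=score, reverse=True)
    return ranked[:replicas]  # first entry plays the role of the primary


# A client only needs the map contents (here: pg_num and the up OSDs)
# to compute the placement locally, without asking anyone per object.
pg = object_to_pg("rbd_data.5039256b8b4567.000000000000290f", pg_num=64)
acting = pg_to_osds(pool_id=4, pg=pg, up_osds=[0, 1, 2, 3], replicas=2)
print(pg, acting)
```

The point of the sketch is only that placement is computed from a small map rather than looked up in a central table, which fits the observation that the client downloads tens of kilobytes and not the 119MB pgmap.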
[19:53] <cjh_> ok next question i got was what happens if while talking to the primary osd it dies before the write is complete?
[19:54] <loicd> sagewk: https://github.com/ceph/ceph/pull/327 is still to be merged. I'll rebase to make sure it can be. I accidentally left two TEST(pg_missing_t, constructor) in the PGLog patch my bad :-( It should be harmless since it passes make check and does not modify the pg_missing_t implementation.
[19:54] <sagewk> cool
[19:55] <joao> sagewk, want me to push the rebased cached-versions patches to wip-mon-trim or to some other branch?
[19:55] <sagewk> on top
[19:55] <joao> kay
[19:55] <tnt> Does anyone know if you can enable the HASH_POOL feature bit for a given pool after it's created ?
[19:56] <sagewk> i think the basic approach is right, but i want to make a more generic PaxosService::refresh() method that is called instead
[19:56] <sagewk> and have it call update_from_paxos()
[19:56] <sagewk> then we can drop the update_from_paxos() call in dispatch()
[19:56] <joao> sagewk, yeah, that would be far more elegant
[19:56] <sagewk> the trick is, we need to make it get called after paxos does any update to the underlying store
[19:57] * dosaboy_ (~dosaboy@host86-161-201-199.range86-161.btcentralplus.com) Quit (Remote host closed the connection)
[19:57] <sagewk> it is a bit trickier to get right, but will have a much cleaner result
[19:57] <joao> sagewk, best way for that is to call it on PaxosService::_active()
[19:57] <sagewk> yeah, but that isn't called after a normal update, right? only after an election
[19:57] <sagewk> oh, maybe it is..
[19:57] <joao> should be called whenever the service goes active
[19:58] <joao> well, *maybe*
[19:58] <joao> afaict, it will only be called if we have a callback waiting for it to go active
[19:58] <joao> s/afaict/as far as I recall/
[19:58] <joao> so maybe that's not such a good option
[19:59] * terje_ (~joey@63-154-157-208.mpls.qwest.net) has joined #ceph
[19:59] <joao> I'll look into it again
[19:59] <sagewk> yeah, i think we want an explicit call to all paxosservice refresh() methods after paxos recovery and commit without relying on them asking for callbacks
[20:00] <sjust> sagewk: ok to backport 0289c445 to cuttlefish?
[20:00] <sagewk> but lets look at the sync thing first, i think this can wait
[20:00] <joao> sagewk, we shouldn't do that from Paxos though
[20:00] <sagewk> sjust: yeah
[20:00] <sjust> k
[20:00] <joao> hmm
[20:01] <sagewk> joao: yeah, that's where i got stuck last night :). maybe paxos calls mon->notify_updated_state() or refresh_paxos_state() or something
[20:01] <joao> sagewk, we should be able to do just that from C_Committed or something
[20:01] <sagewk> and the mon tells the paxosservices
[20:01] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[20:01] <sagewk> only the services that proposed will have callbacks scheduled.. and i think that route is too fragile.
[20:01] <sagewk> gotta run, ttyl
[20:02] <joao> sagewk, from the PaxosService's PoV, it only cares about his own state being updated
[20:02] <joao> then again, recovery and sync are special cases that should be handled
[20:02] <joao> argh
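The design sagewk and joao discuss above can be sketched as a toy model: Paxos commits to the store, tells the monitor, and the monitor explicitly calls refresh() on every service rather than relying on per-service callbacks. This is only an illustration in Python, not Ceph's C++ code; the class layout, the dict-based store, and the method bodies are assumptions. Only the names refresh(), update_from_paxos() and refresh_paxos_state() come from the conversation.

```python
class PaxosService:
    """Toy stand-in for a monitor service; not Ceph's real class."""

    def __init__(self, name, store):
        self.name = name
        self.store = store   # shared dict standing in for the mon store (assumption)
        self.version = 0     # last committed version this service has loaded

    def update_from_paxos(self):
        # Reload in-memory state from whatever is committed in the store.
        self.version = self.store.get(self.name, 0)

    def refresh(self):
        # Generic entry point, so callers need not know service details.
        self.update_from_paxos()


class Monitor:
    def __init__(self):
        self.store = {}
        self.services = []

    def add_service(self, name):
        svc = PaxosService(name, self.store)
        self.services.append(svc)
        return svc

    def refresh_paxos_state(self):
        # Tell every service, whether or not it proposed anything.
        for svc in self.services:
            svc.refresh()


class Paxos:
    def __init__(self, mon):
        self.mon = mon

    def commit(self, updates):
        # Apply the update to the store, then notify the monitor.
        self.mon.store.update(updates)
        self.mon.refresh_paxos_state()


mon = Monitor()
osdmap = mon.add_service("osdmap")
auth = mon.add_service("auth")
Paxos(mon).commit({"osdmap": 5})
```

In this sketch the auth service is refreshed too, even though it proposed nothing, which is the property sagewk wants when he calls the callback route too fragile.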
[20:07] <loicd> https://github.com/ceph/ceph/pull/327 has been rebased against master and is ready for review :-)
[20:07] * terje_ (~joey@63-154-157-208.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[20:07] * tkensiski1 (~tkensiski@ has joined #ceph
[20:09] <cjh_> on the wiki it says that cephx is kerberos 'like', why do we say that?
[20:09] <cjh_> because it is distributed authentication vs kerberos which is centralized?
[20:10] <cjh_> cephx also doesn't have time limits like kerberos does right?
[20:10] * tkensiski1 (~tkensiski@ has left #ceph
[20:20] <tnt> cjh_: I think cephx has something like session keys that are periodically renewed.
[20:22] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[20:24] <cjh_> tnt: ok thanks that's good info
[20:27] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[20:27] * ChanServ sets mode +v andreask
[20:28] <mech422> so I figured out my big problem with ceph-deploy last night - the 'shortname' must be THE hostname of a node, not _a_ hostname. The fact that a name is valid and returns the correct IP address isn't enough - it MUST match the hostname output
[20:28] <mech422> but now my osd's don't want to fire up
[20:28] * fridudad (~oftc-webi@p5B09CFD8.dip0.t-ipconnect.de) has joined #ceph
[20:29] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit ()
[20:30] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[20:30] * ChanServ sets mode +v andreask
[20:32] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has left #ceph
[20:42] <redeemed> mech422, i've run into that problem too (short hostname)
[20:42] <mech422> hehe - it's a bit of a nuisance as the machines are multi-homed
[20:43] <mech422> I had the hostnames set for a different network from the storage network
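The pitfall mech422 describes (a name that resolves to the right address but is not the node's own hostname) can be checked before running ceph-deploy. A minimal sketch, assuming the comparison is against the short form of what the node itself reports; the function names are hypothetical and this is not how ceph-deploy performs its check.

```python
def short_hostname(hostname: str) -> str:
    """Return the part of a hostname before the first dot."""
    return hostname.split(".", 1)[0]


def matches_node_hostname(name: str, node_hostname: str) -> bool:
    """True only if `name` equals the node's own short hostname.

    Resolving to the correct IP address is not enough, per the
    discussion above; the name has to be the node's hostname.
    """
    return name == short_hostname(node_hostname)
```

On the node itself, `socket.gethostname()` from the standard library returns the value to pass as `node_hostname`. On a multi-homed machine, an alias on the storage network would fail this check even though it resolves correctly.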
[20:44] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[20:48] * bergerx_ (~bekir@ has joined #ceph
[20:49] * terje_ (~joey@63-154-157-208.mpls.qwest.net) has joined #ceph
[20:49] * BManojlovic (~steki@fo-d- has joined #ceph
[20:52] * eschnou (~eschnou@168.176-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:52] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[20:53] <Kioob> should "ceph pg force_create_pg 4.5c" fix the problem of an "incomplete" PG ?
[20:57] * terje_ (~joey@63-154-157-208.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[21:00] <sjust> no
[21:00] <paravoid> sjust: hey
[21:00] <sjust> paravoid: hi
[21:00] <sjust> the test I started didn't work
[21:00] <paravoid> we had a severe outage last night that i'm unsure if it's related to #5084
[21:01] <paravoid> we actually had to remove ceph from production...
[21:01] <sjust> what happened?
[21:01] <paravoid> so I added a new OSD last night
[21:01] <paravoid> and it started syncing
[21:02] <paravoid> and hours after that, some pgs started peering and were stuck peering
[21:02] <paravoid> until I restarted the osds
[21:03] <paravoid> I still haven't had the time to fully parse our logs
[21:03] <paravoid> but this has happened before
[21:03] * Cube (~Cube@ has joined #ceph
[21:04] <nhm> paravoid: no good. :(
[21:04] <paravoid> nope
[21:05] <sjust> loicd: regarding your question, the log previously contained a delete (so the store does not have that object), but the new entry is not a delete, so the store *should* contain that object, and at the version of that entry
[21:06] * Tamil (~tamil@ Quit (Quit: Leaving.)
[21:06] <tnt> paravoid: btw, does dmesg show anything ? like related to hung process or xfs ?
[21:07] <paravoid> no
[21:07] <tnt> paravoid: I had the same stuck peering issue a couple of times, and some times (but not always) there is some dmesg in there as well.
[21:08] <tnt> I've never been able to reproduce it reliably though and when it happened I was more focused on restoring service than collecting log and debugging ...
[21:08] <paravoid> exactly the same here
[21:08] <paravoid> I've had this happen during recovery from an osd 3-4 times
[21:08] <paravoid> but I was always quickly outing/restarting osds
[21:10] * Tamil (~tamil@ has joined #ceph
[21:11] <paravoid> sjust: so, you were saying? the fix for 5084 didn't work?
[21:11] <paravoid> more people are reporting this now
[21:11] <sjust> paravoid: no, the test didn't run properly, I've put it in master for the moment
[21:11] <sjust> there is very little chance that that caused the stuck peering problem
[21:12] <sjust> they are likely distinct issues
[21:12] <paravoid> too bad
[21:12] <sjust> the pg_epoch change merely makes peering a bit more efficient
[21:12] <paravoid> ok
[21:13] <paravoid> so, now that ceph is out of production, I feel a bit more comfortable upgrading to 0.61
[21:13] <sjust> you should wait for the mon stuff to stabilize, I think
[21:13] <sjust> there should be a point release soon
[21:13] <paravoid> yep, I'm waiting for that
[21:13] <sjust> which should resolve the bulk of the problems
[21:13] <paravoid> I have enough bugs of my own :)
[21:13] <sjust> fwiw, I would bet that cuttlefish makes the stuck peering problem go away
[21:13] <paravoid> good to hear
[21:13] <paravoid> what is "soon"?
[21:14] <sjust> I think we kicked off tests this morning, so soon
[21:14] <paravoid> so the #5084 fix won't land in .3
[21:15] <paravoid> I'm a bit worried by the two people in that bug saying they also experience this with cuttlefish
[21:15] <sjust> no, but you could try the wip branch
[21:15] <nhm> sjust: did Sage/joao do the mon fix described on the ML?
[21:15] <sjust> nhm: not sure
[21:16] <paravoid> I guess I could now...
[21:16] <tnt> nhm: which one ? The "not read from leveldb all the time" thing ?
[21:16] <paravoid> is there a wip branch with 0.61.3 (or something that contains the mon fixes anyway) + #5084?
[21:17] <sjust> paravoid: yeah, but I would wait on upgrading until reports are positive
[21:17] <sjust> there's also a bobtail version of the branch
[21:17] <paravoid> oh trust me, I'll wait
[21:17] <paravoid> nah, I think at this point it doesn't make sense to stay on bobtail
[21:17] <paravoid> we've depooled production traffic, which is an expensive process for us
[21:17] <paravoid> (we need to resync data stores to put it back in)
[21:18] <paravoid> so this is an opportunity to do this now, rather than risk it later
[21:18] <sjust> paravoid: yeah
[21:18] <paravoid> what's the branch name?
[21:18] * dosaboy (~dosaboy@host86-161-201-199.range86-161.btcentralplus.com) has joined #ceph
[21:19] <paravoid> btw, is 450MB a reasonable size for a bobtail mon?
[21:19] <sjust> wip_(bobtail|cuttlefish)_pg_epoch
[21:19] <sjust> seems reasonable
[21:20] <paravoid> okay
[21:20] <paravoid> sounded a lot to me, but I have nothing to compare it with :)
[21:20] * todin (tuxadero@kudu.in-berlin.de) Quit (Ping timeout: 480 seconds)
[21:20] <sjust> just repushed those two rebased on bobtail/cuttlefish respectively
[21:20] <paravoid> many thanks!
[21:20] <sjust> sure
[21:22] <paravoid> okay, I'll wait until next week to make sure
[21:22] <paravoid> in the meantime, I'm going to go have a weekend
[21:22] <sjust> enjoy!
[21:22] <paravoid> thanks again :)
[21:24] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[21:27] <joao> bit again by thunderbird's inability to refresh my inbox, missed most of today's emails, including that one nhm
[21:29] <Tamil> mech422: which distro are you trying ceph-deploy on?
[21:29] * eschnou (~eschnou@168.176-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[21:29] <mech422> Debian Wheezy
[21:32] <Tamil> which ceph branch are you using?
[21:32] <Tamil> try with --dev=cuttlefish
[21:32] <Tamil> this should work on debian-wheezy
[21:32] <mech422> err...I was using ceph-deploy install --testing
[21:33] <mech422> so i should use 'ceph-deploy install --dev=cuttlefish' instead ?
[21:33] <Tamil> mech422: yes, that will work
[21:33] <mech422> Ahh cool - Thanks!
[21:34] <Tamil> mech422: there was a fix for osd create issue sometime last week for debian
[21:34] <Tamil> mech422: np
[21:34] <mech422> oh - that might fix me then :-)
[21:37] <joao> sagewk, ping me when you're around?
[21:38] <Tamil> mech422: yes
[21:41] * bergerx_ (~bekir@ Quit (Remote host closed the connection)
[21:49] * dxd828_ (~dxd828@host-92-24-117-118.ppp.as43234.net) has joined #ceph
[21:49] * dxd828_ (~dxd828@host-92-24-117-118.ppp.as43234.net) Quit ()
[21:52] * dxd828_ (~dxd828@host-92-24-117-118.ppp.as43234.net) has joined #ceph
[21:54] * terje-_ (~terje@63-154-141-182.mpls.qwest.net) has joined #ceph
[21:55] * eschnou (~eschnou@168.176-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:57] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[21:58] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[22:00] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[22:02] * terje-_ (~terje@63-154-141-182.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[22:07] <fridudad> joao sjust there seem to be some people having stalled i/o when starting an OSD while peering / recovering. I see stalled I/O too with cuttlefish. Didn't have a problem with bobtail. Is there anything i can try?
[22:09] * jcsp (~john@ has joined #ceph
[22:16] <sjust> fridudad: can you describe the experiment and the result?
[22:18] <fridudad> sjust it just happens if i restart an osd. When the osd is started it consumes a lot of CPU for 90-120s; i think it scans its OSD folder or does something else. During this time it starts to produce a lot of slow request log entries
[22:19] <fridudad> sjust and I/O gets stalled on my qemu clients
[22:19] <fridudad> i've seen more users reporting this
[22:20] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[22:21] <sjust> how long does it stall for?
[22:22] <sjust> it's not scanning the OSD folder, btw, that happens long before that point
[22:23] <sjust> fridudad: some pause in IO is to be expected as PGs flip over to the new osd
[22:23] <sjust> we should be able to make it faster though
[22:23] <sjust> and it may be related to the current mon issues
[22:23] <fridudad> sjust started at 2013-05-31 22:03:47.18257 last slow / stalled log entry 22:04:47.181901
[22:23] * jcsp (~john@ Quit (Ping timeout: 480 seconds)
[22:23] <sjust> so about 60s?
[22:24] <sjust> there should be a new point release which may improve matters soon
[22:24] <mech422> Tamil: still no osds - I prepared and activated them, but ceph -s shows '0 osd'
[22:24] <Tamil> mech422: are your disks mounted ?
[22:25] <Tamil> mech422: trying to figure out where exactly it is failing
[22:25] <fridudad> sjust: yes i'm already using current upstream/cuttlefish
[22:25] <mech422> yes - mounted at default location with journal on partition 1:
[22:25] <mech422> /dev/sdb2 872G 200M 827G 1% /var/lib/ceph/osd/ceph-2
[22:25] <mech422> (/dev/sdb1 is journal)
[22:26] <fridudad> sjust another osd started at 13:54:04.046823 last slow I/O entry at 13:55:13.374592
[22:27] <fridudad> sjust this is while having all mons on SSDs and using upstream/cuttlefish branch
[22:28] * loicd contemplating http://lists.ceph.com/private.cgi/ceph-qa-ceph.com/2013-May/000705.html which is run against https://github.com/ceph/ceph/commit/fbf5a242d91e293e4e24fbb94e31e163374c7912 and trying to figure out if something originates from the patch
[22:28] <sjust> fridudad: what sha1?
[22:29] <loicd> comparing to http://lists.ceph.com/private.cgi/ceph-qa-ceph.com/2013-May/000689.html
[22:29] <sjust> loicd: I can't see that
[22:29] <elder> joshd, do you have a clear idea of what ctl_mutex is intended to protect in the kernel rbd code?
[22:30] <loicd> sjust: no worries, I'm talking to myself :-) You need to authenticate to see the archives.
[22:30] <sjust> yeah, not sure how, I'll have to ask someone
[22:30] <joshd> elder: generally, common state for the rbd driver, like the ids, the clients, and the sysfs stuff
[22:31] <fridudad> sjust: f87a19d34f9a03493eaca654dd176992676c5812 + cherry-picked e6ad9da03a17e9bfa12386ecaf0a2ab257327da6 + upstream/wip_cuttlefish_pg_epoch you pushed half an hour ago but it also happens with version from yesterday or two days ago
[22:31] <joshd> elder: iirc it's also used a bit for image-specific things, which could be changed to an image-specific mutex
[22:31] <sjust> fridudad: and it still happens with those patches?
[22:32] <sjust> 60s from when you add an osd to io resuming?
[22:32] <Tamil> mech422: is your osd process running?
[22:32] <fridudad> sjust yes
[22:33] <joshd> elder: not long ago it was the only mutex, so it's purpose is probably still a bit murky
[22:33] <elder> I'm just trying to figure out whether I can avoid taking it in certain cases. For example, now that I have maps and unmaps protected by the open count and REMOVING flag, whether that barrier before proceeding is sufficient, or whether more protection is needed.
[22:33] <fridudad> sjust i have just two osds with these patches; all others are on a version from two days ago. but it still happens for these osds if i restart them. or do i need to upgrade all of them first?
[22:34] <sjust> fridudad: how many pgs?
[22:34] <sjust> how many osds?
[22:34] <elder> No unmap will proceed if it's in use. And no open will be allowed if it's begun getting unmapped.
[22:34] <fridudad> sjust: regarding 60s yes - 4096 pgs and 24 osds
[22:35] <mech422> tamil: here's the ps output:
[22:35] <mech422> root@storage2:/var/log/ceph# ps -ef | grep ceph
[22:35] <mech422> root 15694 1 0 13:20 ? 00:00:00 /usr/bin/ceph-mon -i storage2 --pid-file /var/run/ceph/mon.storage2.pid -c /etc/ceph/ceph.conf
[22:35] <mech422> root 16148 16112 0 13:34 pts/0 00:00:00 grep ceph
[22:35] <elder> The ids are protected by a spinlock now.
[22:35] <sjust> when you are adding an osd, do you mean 1 osd, or 1 host?
[22:36] <fridudad> sjust i just stop ONE osd and start it again so i mean just ONE osd not the whole host
[22:36] <sjust> k
[22:36] <joshd> elder: I'm not sure it's needed for protecting map/unmap anymore
[22:36] <sjust> how many osds/host?
[22:36] <Tamil> mech422: so, you are trying a 3 node cluster
[22:37] <fridudad> sjust: 4 osds / host with journals on SSD
[22:37] <mech422> Tamil: 5 nodes plus the ceph-deploy 'admin' machine
[22:37] <elder> OK, well I'm trying to work through that right now. Thanks for the input joshd.
[22:37] <fridudad> sjust 2 journals per SSD
[22:37] <fridudad> sjust: i didn't have any problems with bobtail, it started when upgrading to cuttlefish
[22:38] <mech422> tamil: they all have the same 1TB /dev/sdb, with 50G /dev/sdb1 journal and rest in /dev/sdb2 for data
[22:38] <sjust> fridudad: ok, I just updated cuttlefish to include all of the patches you just mentioned
[22:38] <joshd> elder: it seems to me that we should be taking the id spinlock in __rbd_get_dev, and not the ctl_mutex in the caller
[22:38] <sjust> when it finishes building, can you install the cuttlefish package from our builder and retest
[22:38] <sjust> ?
[22:38] <sjust> then we can enable logging
[22:38] <joshd> elder: nevermind, it's already taking the rbd_dev_list_lock
[22:38] <elder> Yes.
[22:38] <Tamil> mech422: give me a minute
[22:41] <joshd> elder: the only race I'm still worried about is two concurrent requests to remove the same device
[22:41] <fridudad> sjust it misses e6ad9da03a17e9bfa12386ecaf0a2ab257327da6
[22:41] <fridudad> sjust or i mean it misses the backport for e6ad9da03a17e9bfa12386ecaf0a2ab257327da6
[22:41] <sjust> sagewk: ok to backport e6ad9da03a17e9bfa12386ecaf0a2ab257327da6?
[22:42] <elder> joshd that could be resolved by checking to see if the REMOVING flag is already set.
[22:42] <Tamil> mech422: ceph osd tree?
[22:42] <elder> I hadn't thought about that.
[22:42] <sagewk> yeah definitely
[22:42] <sjust> k
[22:42] <fridudad> sjust what's the advantage of using your build instead of mine?
[22:42] <joshd> I think that's what ctl_mutex is really protecting against there
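The rule elder and joshd converge on (no unmap while the device is open, no open once removal has begun, and a second remove request backs off if REMOVING is already set) can be modelled in a few lines. This is a Python toy with a single lock, not the kernel rbd code; the class and method names are hypothetical.

```python
import threading


class MappedDevice:
    """Toy model of a mapped device guarded by an open count and a REMOVING flag."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for whatever lock guards the flags
        self.open_count = 0
        self.removing = False

    def open(self):
        with self._lock:
            if self.removing:
                return False       # no open once unmapping has begun
            self.open_count += 1
            return True

    def release(self):
        with self._lock:
            self.open_count -= 1

    def begin_remove(self):
        with self._lock:
            if self.open_count > 0:
                return False       # no unmap while the device is in use
            if self.removing:
                return False       # a concurrent remove already claimed it
            self.removing = True
            return True
```

Because the flag is tested and set under the same lock, only one of two concurrent begin_remove() callers can succeed, which is the race joshd says he is still worried about.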
[22:42] <sjust> merely so that I have an easy base line
[22:42] <fridudad> sjust: ok i understand
[22:43] <fridudad> sjust how long does building take? i had to go to bed to get some sleep ;-(
[22:43] <sjust> 20min or so
[22:44] <mech422> Tamil: sorry - how do I get the osd tree ?
[22:44] <phantomcircuit> mmhhmmm emerging boost
[22:45] <fridudad> sjust ok, not sure if i'll suddenly fall asleep before then ;-) may i ask you to give me details on what i should do, so i can ping you maybe tomorrow or on monday with logs?
[22:45] <Tamil> mech422:i suspect your osd prepare/activate command
[22:46] <Tamil> mech422: it should be "ceph-deploy osd prepare host:disk:<disk for journal>"
[22:46] <sjust> fridudad: just pushed
[22:47] <sjust> so, I'll want the ceph log from the duration of the test
[22:47] <sjust> also, if you can do 'ceph pg dump' with timestamps every few seconds during the test, that would also be good
[22:47] <Tamil> mech422: "ceph-deploy osd prepare host:<disk for data>:<disk for journal>"
[22:47] <sjust> after that, we'll pick probably 2 osds to enable logging on and redo the test
[22:48] <sjust> debug osd = 20
[22:48] <sjust> debug filestore = 20
[22:48] <mech422> Tamil : so 'ceph-deploy osd prepare storage2:/dev/sdb2:/dev/sdb1' ? and it will mount it to default location auto-magically ?
[22:48] <sjust> debug ms = 1
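sjust's request for 'ceph pg dump' with timestamps every few seconds could be scripted along these lines. A rough sketch, assuming the `ceph` CLI is on the PATH of the machine running it; the function names, the header format, and the 5 second default are choices made here, not anything specified in the conversation.

```python
import datetime
import subprocess
import time


def run_pg_dump():
    """Run `ceph pg dump` and return its stdout as text."""
    result = subprocess.run(
        ["ceph", "pg", "dump"], capture_output=True, text=True, check=True
    )
    return result.stdout


def sample_pg_dump(out, samples, interval=5.0, run=run_pg_dump, sleep=time.sleep):
    """Write `samples` timestamped dumps to `out`, pausing `interval` seconds between them."""
    for i in range(samples):
        stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        out.write(f"=== {stamp} ===\n")   # UTC timestamp header for each sample
        out.write(run())
        out.write("\n")
        if i + 1 < samples:
            sleep(interval)
```

Opening a file and calling `sample_pg_dump(f, samples=60)` just before restarting the OSD would cover roughly five minutes. The `run` and `sleep` parameters exist only so the loop can be exercised without a cluster.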
[22:48] <Tamil> mech422: yes
[22:48] <mech422> ahh - let me change my clean & build script and re-run it...
[22:49] <Tamil> mech422: run purge followed by purgedata on all nodes, it will cleanup
[22:50] * redeemed (~redeemed@static-71-170-33-24.dllstx.fios.verizon.net) Quit (Quit: bia)
[22:51] <loicd> in ceph-qa "[Ceph-qa] 5 failed, 0 hung, 113 passed in teuthology-2013-05-29_01:00:07-rados-master-testing-basic", should I be concerned about the valgrind issues ? If so, is there a way to get the valgrind output ?
[22:51] <sjust> loicd: no, it's a known problem
[22:51] * loicd away for an hour
[22:51] <sjust> we recently fixed valgrind and haven't fixed the issues yet
[22:51] <loicd> sjust: :-) cool
[22:52] <sjust> loicd: it's on the list!
[22:52] * loicd looking again
[22:52] <mech422> tamil: ok - rebuilding now
[22:52] <Tamil> mech422: kool
[22:52] <fridudad> sjust OK thanks i copied everything down. is it OK for you when i come back in the next 1-3 days? really need to sleep
[22:52] * loicd curses himself for not taking time to become more familiar with the work related to ceph-qa
[22:52] <sjust> certainly
[22:53] <fridudad> sjust oh which is the debian repo URL for squeeze? right now i always use my custom builds so i never needed it
[22:54] <sjust> hmm, not sure
[22:54] <sjust> http://gitbuilder.sepia.ceph.com/gitbuilder-squeeze-amd64/ ?
[22:55] <fridudad> sjust great that looks good to me
[22:56] <sjust> http://gitbuilder.ceph.com/ceph-deb-squeeze-x86_64-basic/ref/
[22:56] <sjust> or that I guess
[22:58] <mech422> Tamil: Nice! osdmap e10: 3 osds: 3 up, 3 in
[22:58] <mech422> but still missing 2 OSDs
[23:01] <fridudad> sjust thanks a lot. i have to get up in 6 hours again. so good night.
[23:01] <sjust> fridudad: yep, have a good weekend
[23:02] <fridudad> sjust thanks you too will ping you and see if you're available. Which time zone?
[23:02] <sjust> PST
[23:02] <fridudad> sjust bye
[23:02] <sjust> yep
[23:02] * fridudad (~oftc-webi@p5B09CFD8.dip0.t-ipconnect.de) Quit (Quit: Page closed)
[23:07] <Tamil> mech422: :)
[23:08] <Tamil> mech422: missing 2 osds?
[23:08] <mech422> yeah - should be 5 - I'm rebuilding the cluster now to see if it was just something I missed
[23:09] <Tamil> mech422: ok, you could also use "ceph-deploy disk list" to figure out available disks and their state
[23:14] * eschnou (~eschnou@168.176-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[23:14] <Kioob> I have 6 PGs which have been stuck in "creating" state for hours. I suppose I should wait for the 0.61.3 release, with the commit 0289c445. Right ?
[23:14] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[23:15] <mech422> Tamil: Hmm - this time no OSDs came up - I used 'osd create' instead of the prepare/activate combo
[23:21] * loicd can't figure out where to find the output of "27640: (7034s) collection:singleton-nomsgr all:filestore-idempotent-aio-journal.yaml ./run_seed_to_range.sh errored out" ( from http://lists.ceph.com/private.cgi/ceph-qa-ceph.com/2013-May/000705.html )
[23:23] <mech422> Tamil: rebuild again with osd prepare/activate - all 5 OSDs came up perfect :-)
[23:23] * mech422 goes to get a coffee/smoke to celebrate!
[23:24] <loicd> mech422: :-)
[23:28] * yehuda_hm (~yehuda@2602:306:330b:1410:6885:1334:8c26:70e9) Quit (Read error: Connection timed out)
[23:29] * yehuda_hm (~yehuda@2602:306:330b:1410:5183:e6bc:8046:69b) has joined #ceph
[23:35] <Tamil> mech422: kool
[23:36] <Tamil> loicd: are you looking for the teuthology log?
[23:37] <loicd> Tamil: yes I am :-)
[23:37] <Tamil> loicd: it should be in the archive folder on teuthology machine
[23:37] <Tamil> loicd: do you have access to the teuthology machine?
[23:38] <loicd> Tamil: I don't, that explains why I can't find the logs :-D
[23:38] <Tamil> loicd: :)
[23:39] <loicd> They are somewhere at inktank
[23:39] <dmick> yes, on the teuthology machine
[23:43] <loicd> if anyone around with access to the machine that ran http://lists.ceph.com/private.cgi/ceph-qa-ceph.com/2013-May/000705.html can send me the output of "27640: (7034s) collection:singleton-nomsgr all:filestore-idempotent-aio-journal.yaml ./run_seed_to_range.sh errored out" , i would be more than happy to analyse it
[23:45] <sjust> Kioob: you can use the force create command to fix those
[23:45] <sjust> loicd: that is pretty much guaranteed to be unrelated to your changes
[23:45] <sjust> it doesn't actually involve an osd
[23:46] <loicd> :-)
[23:46] <loicd> I'm overstressed.
[23:46] <sjust> I didn't see anything related to your changes in the failures, fwow
[23:46] <sjust> *fwiw
[23:47] <sjust> most of the failures are due to my fd cache changes
[23:47] <Tamil> loicd: sure will do that
[23:47] <loicd> Tamil: great :-) Please send all failed output. I'll sort out what's unrelated to the PGLog changes.
[23:48] <Tamil> loicd: w.r.to rados suite?
[23:48] <Tamil> loicd: I agree with sjust, do you still need it?
[23:48] <loicd> this one I don't need
[23:48] <loicd> indeed
[23:49] <loicd> the others I'm unsure
[23:49] * loicd referring to http://lists.ceph.com/private.cgi/ceph-qa-ceph.com/2013-May/000705.html errors
[23:49] <loicd> 27630: (1667s) collection:singleton all:osd-recovery-incomplete.yaml fs:btrfs.yaml msgr-failures:many.yaml Command failed on with status 3: '/home/ubuntu/cephtest/27630/enable-coredump ceph-coverage /home/ubuntu/cephtest/27630/archive/coverage ceph --concise tell osd.1 flush_pg_stats'
[23:49] <loicd> 27657: (2410s) collection:thrash clusters:fixed-2.yaml fs:xfs.yaml msgr-failures:few.yaml thrashers:default.yaml workloads:radosbench.yaml Command failed on with status 1: '/home/ubuntu/cephtest/27657/enable-coredump ceph-coverage /home/ubuntu/cephtest/27657/archive/coverage sudo /home/ubuntu/cephtest/27657/daemon-helper kill ceph-osd -f -i 1'
[23:50] <sjust> 27630 is a problem with handle_osd_ping, unrelated
[23:51] <loicd> ok :-)
[23:51] <loicd> 27630: (1667s) collection:singleton all:osd-recovery-incomplete.yaml fs:btrfs.yaml msgr-failures:many.yaml Command failed on with status 3: '/home/ubuntu/cephtest/27630/enable-coredump ceph-coverage /home/ubuntu/cephtest/27630/archive/coverage ceph --concise tell osd.1 flush_pg_stats'
[23:51] <sjust> same with 27657
[23:51] <loicd> (sorry repeat)
[23:52] <loicd> 27603: (11456s) collection:osd-powercycle clusters:3osd-1per-target.yaml fs:xfs.yaml powercycle:default.yaml tasks:cfuse_workunit_misc.yaml Failed to revive osd.2 via ipmi
[23:52] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[23:52] <loicd> that one seems completely unrelated because it refers to IPMI & restarting osds.
[23:53] <sjust> loicd: yeah, we generally go through these
[23:53] * houkouonchi-home (~linux@pool-71-160-127-158.lsanca.fios.verizon.net) Quit (Read error: Connection reset by peer)
[23:54] <loicd> 27574: (597s) collection:monthrash clusters:fixed-2.yaml fs:xfs.yaml msgr-failures:osd-delay.yaml thrashers:mon-thrasher.yaml workloads:rados_mon_workunits.yaml Command failed on with status 22: 'mkdir -p -- ... es" /home/ubuntu/cephtest/27574/enable-coredump ceph-coverage /home/ubuntu/cephtest/27574/archive/coverage /home/ubuntu/cephtest/27574/workunit.client.0/mon/crush_ops.sh'
[23:54] <loicd> is the last one
[23:55] <loicd> seems to be related to mon and not osd
[23:55] * houkouonchi-home (~linux@pool-71-160-127-158.lsanca.fios.verizon.net) has joined #ceph
[23:58] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[23:59] * cjh_ (~cjh@ps123903.dreamhost.com) Quit (Remote host closed the connection)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.