#ceph IRC Log

IRC Log for 2013-05-16

Timestamps are in GMT/BST.

[0:03] <paravoid> sagelap: speaking of packaging, did you see my mail on ceph-dev about radosgw 0.61/librados2 0.56?
[0:04] <masterpe> sagelap: http://pastebin.com/Gu4zWL58
[0:06] * BillK (~BillK@124-169-186-145.dyn.iinet.net.au) has joined #ceph
[0:07] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[0:09] <sagelap> ah, i think you hit a known bug.. upgrade those 2 mons to 0.56.6 first, restart ceph-mons, and *then* upgrade them to 0.61.x
[0:09] <sagelap> 0.56.4 just wasn't writing down on disk that it was in the right format.
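
    A rough sketch of the sequence sagelap describes, assuming Ubuntu packages and the
    sysvinit-style /etc/init.d/ceph script (deployments using the upstart jobs would
    restart ceph-mon differently):

        # on each monitor still on 0.56.4: first move to the latest bobtail point release
        apt-get update && apt-get install --only-upgrade ceph ceph-common   # 0.56.6 from the bobtail repo
        service ceph restart mon
        # only then switch the apt repo to cuttlefish and upgrade to 0.61.x
        apt-get update && apt-get install --only-upgrade ceph ceph-common
        service ceph restart mon
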
[0:09] * DarkAceZ (~BillyMays@50.107.54.92) Quit (Ping timeout: 480 seconds)
[0:11] * frank9999 (~frank@kantoor.transip.nl) has joined #ceph
[0:15] <masterpe> sagelap: mon.c and mon.b are converting to 0.61.x, but mon.e is the problem
[0:15] <masterpe> All were on the same version before the upgrade
[0:15] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 481 seconds)
[0:16] <sagelap> oh sorry, it's just one mon :)
[0:16] <sagelap> gregaf, joao: will restarting one mon on 0.56.6 that is out of quorum update the compatset properly?
[0:18] <masterpe> sagelap: so mon.e needs to be downgraded to 0.56.6?
[0:18] <sagelap> yeah
[0:18] <sagelap> gregaf: that will do the trick, right?
[0:19] <sagelap> alternatively you can nuke and recreate mon.e and it'll resync with the other 2 (which are in quorum and healthy now, right?)
[0:21] <masterpe> sagelap: that was an option that had crossed my mind.
[0:21] <sagelap> once the other 2 mons are done and healthy, that might be simpler.
[0:24] <masterpe> Is it normal that it takes about 2 hours to convert the mon?
[0:24] <gregaf> it'll only update the store on a quorum start, so if you've already upgraded the rest of them I believe you're best off removing, wiping, and re-adding the monitor
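
    A hedged sketch of the remove/wipe/re-add path gregaf suggests, assuming the failing
    monitor is mon.e with its store under /var/lib/ceph/mon/ceph-e and that the remaining
    monitors form a healthy quorum (paths and temporary filenames are illustrative):

        service ceph stop mon.e
        ceph mon remove e                          # drop it from the monmap via the healthy quorum
        rm -rf /var/lib/ceph/mon/ceph-e            # wipe the stale store
        ceph auth get mon. -o /tmp/mon.keyring     # grab the monitor keyring
        ceph mon getmap -o /tmp/monmap             # and the current monmap
        ceph-mon -i e --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
        ceph mon add e <mon.e ip>:6789             # re-register it, then start it so it resyncs
        service ceph start mon.e
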
[0:28] * diegows (~diegows@200.68.116.185) Quit (Read error: Operation timed out)
[0:30] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:34] * tziOm (~bjornar@ti0099a340-dhcp0870.bb.online.no) Quit (Remote host closed the connection)
[0:37] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[0:43] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[0:47] * tnt (~tnt@91.177.224.32) Quit (Ping timeout: 480 seconds)
[0:49] * coyo (~unf@pool-71-170-191-140.dllstx.fios.verizon.net) has joined #ceph
[0:51] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[0:58] * DarkAceZ (~BillyMays@50.107.54.92) has joined #ceph
[1:01] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[1:02] * sjustlaptop (~sam@2607:f298:a:697:dd2a:5cd3:66ae:e1f4) has joined #ceph
[1:05] * kyle_ (~kyle@216.183.64.10) has joined #ceph
[1:10] * coyo (~unf@00017955.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:16] * kyle_ (~kyle@216.183.64.10) Quit (Quit: Leaving)
[1:29] * rustam (~rustam@94.15.91.30) has joined #ceph
[1:29] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[1:34] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit (Read error: Operation timed out)
[1:38] * tkensiski (~tkensiski@66.sub-70-197-10.myvzw.com) has joined #ceph
[1:38] * tkensiski (~tkensiski@66.sub-70-197-10.myvzw.com) has left #ceph
[1:38] <jmlowe> ok, I've got another one
[1:39] <jmlowe> health HEALTH_WARN osdmap e10714: 18 osds: 18 up, 18 in 2700 pgs: 2700 active+clean
[1:41] <jmlowe> what's up with the HEALTH_WARN ?
[1:42] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:42] <dmick> try ceph health detail
[1:42] <lurbs> jmlowe: After an upgrade to 0.61? Might be disk space on the monitors.
[1:43] <jmlowe> yeah, I grepped for WRN and found a warning about 30% free
[1:43] * tkensiski1 (~tkensiski@87.sub-70-197-6.myvzw.com) has joined #ceph
[1:44] <lurbs> 0.61.1 or 0.61.2? .1 had an issue with some .tdump logfiles getting too big.
[1:45] <lurbs> http://tracker.ceph.com/issues/5024
[1:45] <lurbs> That's what I ran into, anyway.
[1:45] * tkensiski1 (~tkensiski@87.sub-70-197-6.myvzw.com) has left #ceph
[1:47] <jmlowe> hmm, looks like my telling the osd to turn down logging didn't take
[1:48] <jmlowe> restart did it
[1:49] <jmlowe> health HEALTH_OK
[1:56] <joao> sagelap, from what I've been noticing from some users experiences, that will sometimes indeed trigger the compact
[1:56] <joao> more often than not
[1:56] <joao> and the funny thing is that it might even trigger a compact across the whole quorum
[1:56] <joao> I can only speculate why though
[2:04] * themgt (~themgt@24-177-232-33.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[2:04] * themgt (~themgt@24-177-232-33.dhcp.gnvl.sc.charter.com) has joined #ceph
[2:05] * yehuda_hm (~yehuda@2602:306:330b:1410:7849:6691:3662:529c) Quit (Ping timeout: 480 seconds)
[2:06] * sjustlaptop (~sam@2607:f298:a:697:dd2a:5cd3:66ae:e1f4) Quit (Ping timeout: 480 seconds)
[2:06] * LeaChim (~LeaChim@176.250.188.136) Quit (Ping timeout: 480 seconds)
[2:08] * sagelap (~sage@38.122.20.226) Quit (Read error: Operation timed out)
[2:12] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[2:12] * tkensiski (~tkensiski@189.sub-70-197-5.myvzw.com) has joined #ceph
[2:13] * tkensiski (~tkensiski@189.sub-70-197-5.myvzw.com) has left #ceph
[2:13] <nigwil> jmlowe: I'm getting the same HEALTH_WARN about the 30% free
[2:13] * rustam (~rustam@94.15.91.30) has joined #ceph
[2:13] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[2:14] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[2:15] <jmlowe> nigwil: I had all the debugging turned up when we were figuring out the issue that triggered 61.1
[2:15] <sagewk> joao: the compact is encoded in the transaction on the primary, so it gets applied on all mons
[2:15] <sagewk> it was the easiest way to plumb it down to leveldb
[2:18] * sagelap (~sage@2607:f298:a:607:598c:d480:4af:b6ce) has joined #ceph
[2:29] <sagewk> sjust: 5020 looks good
[2:35] * tkensiski (~tkensiski@90.sub-70-197-7.myvzw.com) has joined #ceph
[2:35] * tkensiski (~tkensiski@90.sub-70-197-7.myvzw.com) has left #ceph
[2:41] * jjgalvez (~jjgalvez@12.248.40.138) Quit (Ping timeout: 480 seconds)
[2:44] * yehuda_hm (~yehuda@2602:306:330b:1410:baac:6fff:fec5:2aad) has joined #ceph
[2:45] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[2:53] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[3:00] * sagelap (~sage@2607:f298:a:607:598c:d480:4af:b6ce) Quit (Quit: Leaving.)
[3:05] * danieagle (~Daniel@186.214.76.12) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[3:09] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[3:16] * noahmehl_ (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[3:17] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Read error: Operation timed out)
[3:19] * noahmehl_ (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Read error: Operation timed out)
[3:22] * JohansGlock (~quassel@kantoor.transip.nl) Quit (Read error: Connection reset by peer)
[3:29] * diegows (~diegows@190.190.2.126) has joined #ceph
[3:29] * Cube (~Cube@12.248.40.138) Quit (Read error: Connection reset by peer)
[3:29] * Cube (~Cube@12.248.40.138) has joined #ceph
[3:30] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[3:33] * xiaoxi1 (~xiaoxi@shzdmzpr02-ext.sh.intel.com) has joined #ceph
[3:34] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[3:37] * alex_ (~chatzilla@d24-141-198-231.home.cgocable.net) has joined #ceph
[3:37] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[3:53] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[3:57] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:00] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[4:03] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[4:04] * lerrie (~Larry@remote.compukos.nl) Quit ()
[4:08] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:08] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:11] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:11] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:14] * alex_ (~chatzilla@d24-141-198-231.home.cgocable.net) has left #ceph
[4:19] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:19] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:20] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Quit: noahmehl)
[4:24] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:24] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:28] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:28] * moli (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:28] * moli is now known as lmh
[4:30] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:30] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:33] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit (Remote host closed the connection)
[4:33] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) has joined #ceph
[4:33] * lmh (~moli@inet-cnmc02-pri-ext.oracle.co.jp) Quit ()
[4:36] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[4:37] * malcolm (~malcolm@silico24.lnk.telstra.net) has joined #ceph
[4:48] <malcolm> So, I have an odd question. If I were to manually down an OSD, to say, backup the ceph on-disk stuff, how long would the rest of the cluster allow it to be away, and come back, before it would invalidate all the data on the OSD?
[4:49] <malcolm> I am aware 'that's not how you back up ceph' but still.. how long?
[4:49] * FroMaster (~DM@static-98-119-19-146.lsanca.fios.verizon.net) has joined #ceph
[4:50] <FroMaster> Just setup Ceph and radosgw. I'm using Cloudberry explorer to connect and i keep getting "no such file or directory"... After creating a new user, how do i create a bucket?
[4:51] <lurbs> malcolm: Not sure it's exactly what you're asking, but the cluster will decide the OSD's out, and start shuffling objects around to cope, after "mon osd down out interval" - which defaults to 300 seconds.
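
    For reference, the knob lurbs mentions can be raised ahead of planned maintenance;
    a sketch, with 900 seconds as an arbitrary example value:

        # ceph.conf, on the monitors
        [mon]
            mon osd down out interval = 900

        # or injected at runtime (the injectargs form used later in this log):
        ceph mon tell \* injectargs '--mon-osd-down-out-interval 900'
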
[4:54] <FroMaster> Is there a way to create a bucket via the radosgw-admin command?
[4:55] <lurbs> malcolm: BTW, in order for the reboot of a node to come in under that five minute window I'd suggest looking at kexec-tools.
[4:56] <lurbs> 'Just' re-execs a new kernel at the end of shutdown, instead of going all the way back through BIOS etc.
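
    A minimal kexec sketch of what lurbs describes, assuming a stock Ubuntu kernel layout;
    on Debian/Ubuntu the kexec-tools package can also hook the normal reboot path via
    /etc/default/kexec, which is the safer way to get the "at the end of shutdown" behaviour:

        apt-get install kexec-tools
        kexec -l /boot/vmlinuz-$(uname -r) --initrd=/boot/initrd.img-$(uname -r) --reuse-cmdline
        kexec -e    # jumps straight into the loaded kernel, skipping BIOS/firmware (no clean shutdown!)
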
[4:57] <malcolm> Ok, well we wouldn't be shutting the node down. Just stopping the service on that node. So 5 mins.
[4:58] <malcolm> I guess then the question is, if you were to do a rolling down then up across the whole lot, all inside the timeout, and you did get a copy of the disks, could it be used to 'rebuild' in the event of disaster?
[5:02] <lurbs> I doubt it. Most of the raw data would be there, but the state of each node would be out of sync. You'd need to speak to a dev though, which I'm not.
[5:05] * Rorik_ (~rorik@199.182.216.68) Quit (Read error: Connection reset by peer)
[5:05] * Rorik (~rorik@199.182.216.68) has joined #ceph
[5:08] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[5:10] * klnlnll (~DW-10297@dhcp92.cmh.ee.net) has joined #ceph
[5:12] * xiaoxi1 (~xiaoxi@shzdmzpr02-ext.sh.intel.com) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * humbolt (~elias@91-113-103-253.adsl.highway.telekom.at) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * ShaunR (ShaunR@ip72-211-231-130.oc.oc.cox.net) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * Teduardo (~DW-10297@dhcp92.cmh.ee.net) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * psomas (~psomas@inferno.cc.ece.ntua.gr) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * eternaleye (~eternaley@2607:f878:fe00:802a::1) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * sileht (~sileht@gizmo.sileht.net) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * capri_wk (~capri@212.218.127.222) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * med (~medberry@ec2-50-17-21-207.compute-1.amazonaws.com) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * dxd828 (~dxd828@195.191.107.205) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * ivoks (~ivoks@jupiter.init.hr) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * ggreg_ (~ggreg@int.0x80.net) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * Meths (rift@2.25.193.124) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * sbadia (~sbadia@yasaw.net) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * soren (~soren@hydrogen.linux2go.dk) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * infernix (nix@5ED33947.cm-7-4a.dynamic.ziggo.nl) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * wonko_be (bernard@november.openminds.be) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * nyerup (irc@jespernyerup.dk) Quit (reticulum.oftc.net solenoid.oftc.net)
[5:12] * xiaoxi1 (~xiaoxi@shzdmzpr02-ext.sh.intel.com) has joined #ceph
[5:12] * humbolt (~elias@91-113-103-253.adsl.highway.telekom.at) has joined #ceph
[5:12] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[5:12] * ShaunR (ShaunR@ip72-211-231-130.oc.oc.cox.net) has joined #ceph
[5:12] * Teduardo (~DW-10297@dhcp92.cmh.ee.net) has joined #ceph
[5:12] * psomas (~psomas@inferno.cc.ece.ntua.gr) has joined #ceph
[5:12] * eternaleye (~eternaley@2607:f878:fe00:802a::1) has joined #ceph
[5:12] * med (~medberry@ec2-50-17-21-207.compute-1.amazonaws.com) has joined #ceph
[5:12] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[5:12] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) has joined #ceph
[5:12] * sileht (~sileht@gizmo.sileht.net) has joined #ceph
[5:12] * capri_wk (~capri@212.218.127.222) has joined #ceph
[5:12] * dxd828 (~dxd828@195.191.107.205) has joined #ceph
[5:12] * ggreg_ (~ggreg@int.0x80.net) has joined #ceph
[5:12] * Meths (rift@2.25.193.124) has joined #ceph
[5:12] * nyerup (irc@jespernyerup.dk) has joined #ceph
[5:12] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) has joined #ceph
[5:12] * infernix (nix@5ED33947.cm-7-4a.dynamic.ziggo.nl) has joined #ceph
[5:12] * soren (~soren@hydrogen.linux2go.dk) has joined #ceph
[5:12] * ivoks (~ivoks@jupiter.init.hr) has joined #ceph
[5:12] * sbadia (~sbadia@yasaw.net) has joined #ceph
[5:12] * wonko_be (bernard@november.openminds.be) has joined #ceph
[5:12] * Teduardo (~DW-10297@dhcp92.cmh.ee.net) Quit (Ping timeout: 480 seconds)
[5:13] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[5:13] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) has joined #ceph
[5:14] <FroMaster> argh... [Wed May 15 20:13:01 2013] [error] [client 127.0.0.1] (2)No such file or directory: FastCGI: failed to connect to server "/var/www/s3gw.fcgi": connect() failed
[5:15] <FroMaster> Does that mean FastCGI can't connect or is that the response?
[5:22] <malcolm> Thanks :D
[5:43] <FroMaster> Is there another guide to show what to do after setting up a radosgw? I'm trying to create my first bucket but can't seem to get it to work
[5:50] <FroMaster> seriously... radosgw-admin create user is BROKEN....
[5:51] <FroMaster> I had to create the user 6 times before I got a secret_key that didn't have a \ in it
[6:08] * FroMaster (~DM@static-98-119-19-146.lsanca.fios.verizon.net) Quit ()
[6:25] * rektide (~rektide@192.73.236.68) has joined #ceph
[6:37] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[6:41] * themgt (~themgt@24-177-232-33.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[6:53] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:54] * matt_ (~matt@220-245-1-152.static.tpgi.com.au) has joined #ceph
[7:09] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[7:26] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[7:59] * tnt (~tnt@91.177.224.32) has joined #ceph
[8:02] * glowell (~glowell@38.122.20.226) Quit (Quit: Leaving.)
[8:05] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[8:05] * ChanServ sets mode +v andreask
[8:10] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:11] * sagelap (~sage@2600:1012:b023:ec30:598c:d480:4af:b6ce) has joined #ceph
[8:11] * loicd (~loic@magenta.dachary.org) Quit ()
[8:15] * sagelap1 (~sage@2600:1012:b018:d2d7:a5fa:483e:7130:faef) has joined #ceph
[8:19] * sagelap (~sage@2600:1012:b023:ec30:598c:d480:4af:b6ce) Quit (Ping timeout: 480 seconds)
[8:21] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 481 seconds)
[8:38] <tnt> Argh! My mons have grown insanely!
[8:38] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[8:38] * ChanServ sets mode +v andreask
[8:39] * sagelap1 (~sage@2600:1012:b018:d2d7:a5fa:483e:7130:faef) Quit (Quit: Leaving.)
[8:41] <tnt> Anyone knows what I can do ?
[8:45] * glowell (~glowell@ip-64-134-236-4.public.wayport.net) has joined #ceph
[8:46] <tnt> Pfff, restarting seems to have worked for at least 2 mons out of 3, but that was close, that almost filled all 3 mons !
[8:51] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Quit: noahmehl)
[9:02] * malcolm (~malcolm@silico24.lnk.telstra.net) Quit (Ping timeout: 480 seconds)
[9:11] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) has joined #ceph
[9:16] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit (Remote host closed the connection)
[9:26] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:30] * jerker (jerker@Psilocybe.Update.UU.SE) has joined #ceph
[9:30] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[9:35] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:36] <nigwil> not sure I understand why I am in HEALTH_WARN:
[9:37] <nigwil> http://pastebin.com/WRRRKa21
[9:38] <tnt> because / has low disk space and it's the drive that stores monitor data.
[9:38] <nigwil> I thought nearfull was at 0.85 though
[9:38] <tnt> do a "du -sh /var/lib/ceph/mon"
[9:38] <tnt> that's for OSD
[9:39] <nigwil> is it tunable for the MON?
[9:39] <nigwil> I'd be ok with 10% margin
[9:39] <tnt> can you run the command I pasted above.
[9:39] <nigwil> root@ceph0:~# du -sh /var/lib/ceph/mon
[9:39] <nigwil> 4.7G /var/lib/ceph/mon
[9:40] <tnt> Ok, so you're hitting the same bug I hit this morning ... the mon store grows uncontrollably until it fills the disk and suicides itself.
[9:40] <tnt> try restarting the mons.
[9:40] <tnt> http://tracker.ceph.com/issues/4895
[9:41] <nigwil> restarting...
[9:42] <tnt> once done, re-run the du
[9:44] <nigwil> oh wow...
[9:44] <nigwil> root@ceph0:/var/log/ceph# du -sh /var/lib/ceph/mon
[9:44] <nigwil> 62M /var/lib/ceph/mon
[9:44] <nigwil> HEALTH_OK now, thanks for the tip
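
    A hedged sketch of the restart-all-mons workaround just walked through above; whether
    the sysvinit script or the upstart jobs apply depends on how the cluster was deployed:

        # sysvinit-style deployments
        service ceph restart mon
        # upstart-style (Ubuntu / ceph-deploy) deployments
        restart ceph-mon-all
        # then confirm the store actually shrank
        du -sh /var/lib/ceph/mon
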
[9:45] <tnt> nigwil: please report that you're hitting the issue as well on the tracker or ml
[9:49] * JohansGlock (~quassel@kantoor.transip.nl) has joined #ceph
[9:50] <agh> Hello to all
[9:50] <agh> Pleease, I need help !
[9:50] <agh> I'm trying to make OpenStack (Grizzly) works with Ceph
[9:50] <agh> but... it's hard
[9:51] <agh> My cinder-volume seems to absolutely need to speak with LVM
[9:51] <nigwil> tnt: done
[9:52] <tnt> nigwil: thanks. You're under ubuntu 12.04 ?
[9:52] <nigwil> Description: Ubuntu 12.04.2 LTS
[9:54] * leseb (~Adium@83.167.43.235) has joined #ceph
[10:02] <tnt> Unfortunately it seems that once you hit the issue, you're condemned to hit it again, because my mon grew like 300 MB over the last hour ...
[10:05] <nigwil> ok, good that we have a workaround
[10:05] * LeaChim (~LeaChim@176.250.188.136) has joined #ceph
[10:07] <tnt> well unfortunately I just tried restarting one of the mons and that did nothing ...
[10:07] <nigwil> I am running on quite low-end hardware and the restart wedged the server for some time while freeing the space (a couple of minutes), then it was happy again
[10:07] <tnt> seems you need to restart them all at once for it to work.
[10:15] * dxd828 (~dxd828@195.191.107.205) Quit (Read error: Connection reset by peer)
[10:18] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[10:18] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[10:21] * dxd828 (~dxd828@195.191.107.205) has joined #ceph
[10:27] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has joined #ceph
[10:39] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has left #ceph
[10:42] * alex_ (~chatzilla@d24-141-198-231.home.cgocable.net) has joined #ceph
[10:49] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) Quit (Remote host closed the connection)
[10:50] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) has joined #ceph
[11:04] * xiaoxi1 (~xiaoxi@shzdmzpr02-ext.sh.intel.com) Quit (Ping timeout: 480 seconds)
[11:07] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[11:07] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[11:08] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[11:13] * malcolm (~malcolm@101.165.48.42) has joined #ceph
[11:17] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Remote host closed the connection)
[11:18] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[11:18] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit ()
[11:21] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Remote host closed the connection)
[11:27] * malcolm (~malcolm@101.165.48.42) Quit (Ping timeout: 480 seconds)
[11:40] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[11:40] <loicd> ccourtaut: hi :-)
[11:43] <loicd> ccourtaut: I'm curious to know where the s3 / swift tests are located.
[11:47] * todin (tuxadero@kudu.in-berlin.de) Quit (Remote host closed the connection)
[11:50] <mrjack> re
[11:50] <mrjack> tnt: yes
[11:51] <tnt> yes ?
[11:51] <tnt> can you remind me the question :p
[11:52] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[11:58] <mrjack> hi
[11:58] <mrjack> tnt: yes
[11:58] <mrjack> tnt: but i saw you already got it fixed by restarting mon
[11:59] <ccourtaut> loicd: the test related to the s3/swift REST api are located on other repositories
[11:59] <loicd> ccourtaut: do you have the URL of those repositories ?
[12:00] <ccourtaut> loicd: https://github.com/ceph/s3-tests https://github.com/ceph/swift
[12:00] <loicd> thanks :-)
[12:00] <ccourtaut> loicd: the swift tests seem to be the test suite used for openstack
[12:01] <loicd> ccourtaut: https://github.com/ceph/swift is a fork of https://github.com/openstack/swift indeed
[12:02] <tnt> mrjack: yes, although restarting my mon whenever they go crazy hardly seems like a "fix" :p
[12:02] <loicd> https://github.com/ceph/swift/commits/master says it's fairly outdated too
[12:03] <ccourtaut> loicd: indeed
[12:04] <loicd> did yehudasa tell you how it is used to test against rgw ?
[12:04] <ccourtaut> loicd: which tests?
[12:05] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[12:06] <loicd> https://github.com/ceph/swift is used to test the swift API of ceph ? Maybe I misunderstood what you said earlier ;-)
[12:08] <ccourtaut> yes that's it, they used the swift functional test suite against rgw swift
[12:09] <loicd> ok. If using the swift repository, that will not test ceph. My question was to know if yehudasa told you how swift is used to test against rgw ?
[12:09] <ccourtaut> nop
[12:10] <loicd> the teuthology project is designed to run integration tests
[12:10] <loicd> maybe swift is used in this context
[12:11] <loicd> I described how I installed it here but it changed since then : http://dachary.org/?p=1788
[12:13] <ccourtaut> there are also tests related to swift inside the teuthology repo, but they do not seem to be related to the swift repo
[12:14] <loicd> :-D that does not make it any easier
[12:16] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[12:17] <ccourtaut> loicd: yep
[12:21] * Tamil (~tamil@38.122.20.226) Quit (Read error: Connection reset by peer)
[12:23] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) Quit (Quit: jlogan1)
[12:25] <ay> How do I limit which machines can mount which rbd device and cephfs?
[12:28] <mrjack> tnt: it is a workaround
[12:30] * diegows (~diegows@190.190.2.126) has joined #ceph
[12:31] <Gugge-47527> ay: you can limit access to pools with cephx
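
    A hedged example of the cephx approach Gugge-47527 means; the client name, pool name,
    and keyring path are made up for illustration:

        # create a key that can only read the mon map and only touch the vm-images pool
        ceph auth get-or-create client.vmhost1 mon 'allow r' osd 'allow rwx pool=vm-images' \
            -o /etc/ceph/ceph.client.vmhost1.keyring
        # a machine holding only that key can then use RBD images in vm-images and nothing else
        rbd --id vmhost1 -p vm-images ls
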
[12:34] * ShaunR (ShaunR@ip72-211-231-130.oc.oc.cox.net) Quit ()
[12:36] <tnt> mrjack: what I don't get is that if it can reduce the store size on start ... why doesn't it just do it periodically ?
[12:39] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[12:42] <mrjack> tnt: it tries to do so
[12:42] <mrjack> tnt: but maybe the mon is too loaded to get the compaction finished in time... what is the fs you run your mon data-dir on? is there other IO on that fs/disk?
[12:43] * Volture (~Volture@office.meganet.ru) has joined #ceph
[12:43] <tnt> yes, the 'compact on trim', but why does it fail during operation and succeed on start?
[12:43] <tnt> The FS is XFS and there is no other heavy IO on that disk.
[12:43] <tnt> It grew overnight when there is virtually nothing going on at all.
[12:44] <Volture> hi all
[12:47] <mrjack> tnt: how many mons are there?
[12:47] <tnt> 3
[12:47] <mrjack> tnt: on another ceph installation with 3 mons and mons on ssd i don't see store grow
[12:47] <mrjack> tnt: how many elections did you have the last 24 hours?
[12:48] <tnt> how can I know that ?
[12:48] <mrjack> zcat /var/log/ceph/ceph-mon.?.log.1.gz |grep elec ;)
[12:48] <tnt> And now they're not growing anymore. Nor were they just after the upgrade.
[12:49] <Volture> I want to upgrade ceph from 0.56.4 to 0.61.1. What problems may arise while carrying out the upgrade?
[12:50] <joao> Volture, upgrade to 0.61.2 instead
[12:50] <joao> there were a couple of significant bug fixes since .1
[12:51] <tnt> mrjack: 150
[12:51] <mrjack> tnt: hm..
[12:52] <mrjack> they are not growing anymore? how big is the store?
[12:52] <tnt> 200M
[12:52] <tnt> and it reached 9.2G at 8am when I first logged in this morning.
[12:52] <tnt> (5h ago)
[12:52] <mrjack> hm
[12:53] <mrjack> i have seen it grow like 100mb/hour on our servers
[12:53] <mrjack> but have installations where this bug does not show up...
[12:53] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[12:53] <tnt> yes, just after I dropped it from 9.2G to 200M by restarting them, they started growing again by ~100M/h as well. When they reached 400M, I restarted them all again and then it stopped growing
[12:53] <tnt> for now ..
[12:56] <mrjack> interesting...
[12:57] <ay> Gugge-47527: Ok. But does that apply directly to rbd and cephfs as well?
[12:58] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Read error: Operation timed out)
[13:03] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[13:03] * uli (~uli@mail1.ksfh-bb.de) has joined #ceph
[13:07] <uli> hi there, trying to setup ceph bobtail and i get this error in osd-logfile: mount: enabling WRITEAHEAD journal mode: btrfs not detected
[13:07] <tnt> it's not an error
[13:07] <uli> btrfs is installed and mount says: /dev/sdb1 on /var/lib/ceph/osd/ceph-0 type btrfs (rw,noatime,space_cache)
[13:08] <uli> ah ok
[13:08] <joao> nhm_, around?
[13:09] <tnt> ah well, if you use btrfs then it's not normal
[13:10] <uli> :)
[13:10] <uli> ok
[13:10] <uli> strange things here....
[13:10] <joao> old version of btrfs maybe?
[13:10] <uli> v0.19
[13:11] <joao> iirc, those checks are made by checking some fs capabilities
[13:11] <joao> such as ioctls and the sorts
[13:11] <uli> gonna check out xfs... maybe better option
[13:12] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:16] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:24] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[13:27] <uli> now I get the next error... ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
[13:28] <uli> on all nodes so no osds are startet
[13:28] <uli> *started
[13:28] * tnt (~tnt@91.177.224.32) Quit (Ping timeout: 480 seconds)
[13:29] <absynth> well, does /var/lib/ceph/osd exist?
[13:30] <uli> es
[13:30] <uli> yes
[13:30] * tziOm (~bjornar@194.19.106.242) has joined #ceph
[13:30] <uli> /dev/sdb1 on /var/lib/ceph/osd/ceph-0 type btrfs (rw,noatime,space_cache)
[13:30] <uli> result of `mount`
[13:31] <uli> but directory is completely empty....
[13:31] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Remote host closed the connection)
[13:31] <uli> ahh, one moment, I think I forgot --mkfs ... could this be the prob?
[13:33] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[13:33] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[13:33] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[13:34] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[13:36] <joao> uli, definitely, yes
[13:37] <tnt> mrjack: interestingly, since the cuttlefish update I have a lot more mon elections. In the logs before the update there isn't a single one ... since the update there are 200 or so.
[13:38] <joao> tnt, are you coming from bobtail?
[13:40] <uli> joao, ceph health
[13:40] <uli> HEALTH_OK
[13:43] <tnt> joao: I did argonaut -> bobtail 0.56.3 -> bobtail 0.56.6 -> cuttlefish 0.61.2
[13:44] <joao> tnt, the election issue may somehow be connected with leveldb, which was only introduced in cuttlefish (from you point-of-view)
[13:44] <joao> *your
[13:44] <mrjack_> tnt: hm... i have upgraded another cluster from argonaut -> bobtail -> cuttlefish, and replaced mon disks with xfs and ssd, and had 4 elections so far since the upgrade (3 days)
[13:44] <mrjack_> tnt: you could try and place monitors on faster storage..
[13:44] <tnt> joao: Yes, I guess that's linked with the fact it grew to 9G last night as well :p
[13:45] <joao> tnt, yeah :\
[13:45] <tnt> mrjack_: they're on 15k SAS drive
[13:45] <mrjack_> tnt: hm, ok.. is there other IO?
[13:46] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[13:47] <tnt> There shouldn't be anything IO intensive. It's on the same disk as the OS (different partition) but there is no swap on the disk and no other service running so there shouldn't be much IO on that drive at all.
[13:50] * markbby (~Adium@168.94.245.2) has joined #ceph
[13:51] * humbolt (~elias@91-113-103-253.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[13:51] * markbby (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[13:51] * markbby (~Adium@168.94.245.2) has joined #ceph
[13:53] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[13:54] <tnt> Is there a way to see why there was a new election held ?
[13:58] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:01] <joao> tnt, with debugging, yeah
[14:01] <joao> usually it's only when a monitor tries to join the quorum
[14:02] * humbolt (~elias@62-46-149-101.adsl.highway.telekom.at) has joined #ceph
[14:02] <tnt> joao: do you know which debug option and how high I must set it ?
[14:02] <joao> debug mon = 10
[14:02] <joao> that should do it
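
    If set persistently rather than injected at runtime, that option would live in ceph.conf
    on the monitor hosts; a sketch (the paxos line is an assumption, anticipating the
    paxos-side debugging discussed further down):

        [mon]
            debug mon = 10
            # debug paxos = 10
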
[14:04] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[14:04] <tnt> Ok, I'll try that and wait for a random election.
[14:07] <uli> does someone have an idea what the problem could be? I mounted cephfs, I can set an xattr (setfattr -n user.test -v test file.txt) and read it back without any errors, but when I try to copy a file with windows attributes (from a samba share: cp -a /samba/sahre/text.txt /ceph/share/.) I get an error (trying to translate from German to English): cp: preserving permissions for ../file.txt: operation not supported
[14:09] <uli> also when I try to set attributes from windows (ceph is mounted in a samba share) I get an 'access denied' error
[14:09] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[14:09] <uli> but i can create files....
[14:19] * ivan` (~ivan`@000130ca.user.oftc.net) Quit (Quit: ERC Version 5.3 (IRC client for Emacs))
[14:19] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 481 seconds)
[14:22] <tnt> joao: http://pastebin.com/raw.php?i=0fc1KMMd does that tell you why an election was called ?
[14:24] <tnt> "handle_propose from mon.2" Oh, so I suppose mon.2 called for it and I should check its logs
[14:25] <jerker> uli: trying to set ACLs?
[14:25] * ivan` (~ivan`@000130ca.user.oftc.net) has joined #ceph
[14:27] <joao> tnt, yeah, mon.2 called the election
[14:27] <joao> and mon.0 (i.e., mon.a) went along with it
[14:28] <tnt> Yup, I've restarted all the monitors with debug enabled now and waiting for it to happen again, shouldn't be too long ...
[14:29] <joao> I'll be around, but am waiting for lunch to arrive, so just poke me whenever you get the logs and I'll look at them as soon as I'm back
[14:29] <tnt> joao: thanks, will do.
[14:30] <jerker> uli: (CephFS does not AFAIK support ACLs.)
[14:31] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[14:31] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 481 seconds)
[14:42] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[14:45] * dosaboy_ (~dosaboy@AMontsouris-651-1-43-174.w82-123.abo.wanadoo.fr) has joined #ceph
[14:46] <uli> jerker, bad news
[14:46] <uli> jerker, but you are right there seems to be a problem with acls
[14:46] <uli> can't set acl
[14:46] <uli> ...
[14:48] <nhm_> uli: btw, this is to get samba to work?
[14:50] <uli> nhm_, samba is working fine, but it needs acl.... on other partitions and so on it's working fine...
[14:50] <uli> nhm_, i'd like to have a distributed filesystem for my samba-machines in two different locations and so i wanted to give ceph a try....
[14:51] <nhm_> uli: just curious if you've tried the ceph/samba integration stuff?
[14:52] <nhm_> uli: this is outside my area of expertise, but: https://github.com/ceph/samba
[14:53] <nhm_> I guess the vfs module is the important bit maybe.
[14:53] * dosaboy_ (~dosaboy@AMontsouris-651-1-43-174.w82-123.abo.wanadoo.fr) Quit (Quit: leaving)
[14:54] <uli> nhm_, the name of the link on its own is promising :)
[14:54] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[14:55] <jerker> uli: http://www.samba.org/samba/docs/man/manpages-3/vfs_acl_xattr.8.html
[14:56] <jerker> uli: I have not tried
[14:58] <uli> jerker, thanks gonna check that out
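
    For what jerker links, a minimal smb.conf sketch of the acl_xattr module; the share name
    and path are placeholders, and whether this behaves on top of CephFS is exactly the open
    question here:

        [cephshare]
            path = /ceph/share
            vfs objects = acl_xattr
            # acl_xattr stores the NT ACL in a security.NTACL extended attribute,
            # so the underlying filesystem must accept that xattr
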
[15:06] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Quit: Page closed)
[15:08] * ghartz (~ghartz@ill67-1-82-231-212-191.fbx.proxad.net) has joined #ceph
[15:09] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[15:11] <tnt> joao: http://pastebin.com/raw.php?i=8FaFTeNi
[15:14] * klnlnll (~DW-10297@dhcp92.cmh.ee.net) Quit ()
[15:16] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 481 seconds)
[15:19] * themgt (~themgt@96-37-28-221.dhcp.gnvl.sc.charter.com) has joined #ceph
[15:22] <joao> tnt, thanks
[15:24] <tnt> joao: I hope it speaks to you because it doesn't really tell me much. So far I think the "bootstrap" line is what triggered the election.
[15:25] <joao> yeah, bootstrap() eventually triggers an election
[15:25] <joao> the question now is why the hell we're bootstrapping
[15:25] <tnt> http://pastebin.com/bJQHRT9Y
[15:25] <joao> the good thing is that we don't bootstrap from that many places
[15:26] <tnt> That's the only place I found that didn't print something else just before.
[15:26] <tnt> (AFAICT)
[15:28] <joao> tnt, you're running it without debug ms right?
[15:29] <tnt> yes
[15:29] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[15:29] <tnt> I don't have that much space on /var/log so can't enable too much stuff for too long.
[15:30] <joao> yeah, that's okay
[15:30] <joao> just that debug ms would show us which message had been received prior to that
[15:30] <joao> but this is a great start
[15:34] <joao> tnt, can you check your logs from other monitors for elections around that timestamp?
[15:34] <joao> looks like this monitor might have missed an election, thus it's bootstrapping to catch up on the cluster's epoch
[15:35] <tnt> I can re-check but I was running a tail -f on all logs and only saw one.
[15:35] <joao> I'd say it would probably have happened between 13:00:20 and 13:00:24
[15:39] <tnt> These are the logs from mon.b http://pastebin.com/raw.php?i=3h4gLqgR for that time
[15:39] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[15:40] <tnt> And the only mention of "election" in the log dates back to 34 minutes before that, when I restarted it with logging enabled ...
[15:40] <joao> yeah
[15:40] <joao> there goes my theory
[15:41] <joao> it's unlikely mon.2 was 30 minutes behind on election epoch :\
[15:42] * themgt (~themgt@96-37-28-221.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[15:50] <mikedawson> tnt: did you get all three monitors in quorum or is one still out of quorum? If one will not rejoin, is its store.db significantly larger than the others?
[15:52] <tnt> mikedawson: most of the time the 3 are in quorum. From the logs I can see that in some elections one monitor was excluded, but right after a new election is held and it comes back into quorum.
[15:53] <tnt> This morning after restarting the 3 mons that were at 9G size, one of them didn't want to size down and never joined again so I just deleted the data and recreated it.
[15:53] <mrjack_> re
[15:54] <mikedawson> tnt: I see. When you restarted the mons, did you have any OSDs running?
[15:54] <tnt> yes, all of them.
[15:55] <tnt> Last time I restarted the OSD that was the upgrade to cuttlefish on tuesday.
[15:55] <mikedawson> tnt: see my notes at http://tracker.ceph.com/issues/4895#change-22035
[15:56] <tnt> Yes, I've seen those.
[15:56] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[15:56] <mikedawson> tnt: I've fought these issues for several weeks. If a monitor will not compact on startup, you likely will need to stop all OSDs and try again
[15:56] <tnt> For me shutting down all mon was sufficient. (I need to do it all at once, restarting 1 by 1 doesn't work)
[15:57] <mikedawson> tnt: yep, all mons at once is key, one at a time will not work
[15:58] <mikedawson> tnt: and if a monitor doesn't compact, you're stuck stopping the OSDs (or killing it and recreating like you did)
[15:58] <absynth> uhm
[15:58] <absynth> "shut down all mons"
[15:58] <absynth> that does not really sound like a production recipe
[15:58] <mikedawson> absynth: yeah, it's an absolutely critical bug that has been tough to diagnose
[15:58] <tnt> yes, and surprisingly during the like 5/10 min where all mons were down ... the RBD backed VMs actually kept running ...
[15:59] * ScOut3R_ (~ScOut3R@212.96.47.215) has joined #ceph
[15:59] * tziOm (~bjornar@194.19.106.242) Quit (Remote host closed the connection)
[15:59] * wschulze (~wschulze@38.98.115.249) has joined #ceph
[15:59] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[16:01] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 481 seconds)
[16:02] <mikedawson> Attempting to diagnose this issue is why the transaction dump config option exists. It accidentally shipped with 0.61.1 enabled by default. 0.61.2 fixed that.
[16:02] <tnt> and did that debug info turn out anything useful ?
[16:03] <absynth> mikedawson: the reason for 0.62?
[16:03] <mikedawson> Several people have seen HEALTH_WARN in the past few days due to the tdump file (which grows unbounded in 0.61.1). Although it is disabled by default in 0.61.2, it likely needs to be manually deleted.
[16:04] <absynth> scuttlemonkey: around?
[16:04] <mrjack_> tnt: i also noticed that my VMs kept running without ceph -w showing mon quorum...
[16:05] <scuttlemonkey> absynth: moo
[16:05] <scuttlemonkey> :)
[16:05] <mikedawson> tnt: no. My monitors were 36GB when I started the tdump. There was way too much noise for Sage to analyze what happened. He wants a snapshot of a monitor store.db when it is small. Then a tdump while it is growing unbounded, and a snapshot of the large store.db
[16:05] <absynth> scuttlemonkey: i think i want to sponsor a resident geek
[16:05] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[16:05] <scuttlemonkey> yeah?
[16:05] <absynth> err
[16:05] <absynth> not resident
[16:06] <absynth> geek on duty
[16:06] <scuttlemonkey> hehe
[16:06] <scuttlemonkey> sounds great to me
[16:06] <absynth> you got slots?
[16:06] <absynth> i'll coerce oliver or myself :)
[16:06] <scuttlemonkey> sure, anything that isn't taken on the geek-on-duty page on ceph.com
[16:06] <scuttlemonkey> if you send a note to community at inktank with a logo, times, and the irc handles of whomever needs voicing I can get it posted
[16:07] <scuttlemonkey> http://ceph.com/help/community/
[16:07] <absynth> i'll run it by the guys and see what they think. I for one think it's a really good idea that we want to support
[16:07] <scuttlemonkey> sweet
[16:07] <absynth> and since i'm kind of the boss... :P
[16:07] <scuttlemonkey> haha
[16:07] <scuttlemonkey> it's good to be king :)
[16:07] <mrjack_> lol
[16:07] <scuttlemonkey> but yeah, it's a pretty informal process
[16:08] <scuttlemonkey> just need what you see on that page and irc nicks to add to voice so you get the fancy gold bubble (or +v for dedicated term hackers)
[16:09] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) Quit (Remote host closed the connection)
[16:09] <joao> scuttlemonkey, gold bubble for those using xchat I suppose :p
[16:09] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[16:10] <absynth> +
[16:10] <absynth> for me it's a +
[16:10] <scuttlemonkey> joao: the only way to go!
[16:10] <joao> scuttlemonkey, I miss mIRC :(
[16:10] <absynth> pff, irssi ftw
[16:10] <absynth> WHAT?
[16:10] <scuttlemonkey> hehe
[16:10] <joao> lol
[16:10] <absynth> did you just say you miss mIRC?
[16:10] <mrjack_> wine + mIRC ;)
[16:10] <scuttlemonkey> yeah, actually mIRC was my first non-text-based client
[16:11] <joao> I haven't used mIRC for a good part of a decade now
[16:11] <absynth> ...and I applaud you for that, Sir.
[16:11] <joao> I feel a bit nostalgic about it
[16:11] <scuttlemonkey> hehe, funny how nostalgia can erase all of ther turribleness? :)
[16:11] <absynth> i haven't used mirkforce for a good part of a decade now
[16:11] <absynth> ...feel kinda nostalgic about it
[16:12] <joao> scuttlemonkey, true!
[16:12] <absynth> wonder if it still compiles
[16:12] <tnt> mikedawson: do you also have spurious re-elections ? During the period it grew, I had a lot of these .. right now only 1 every 30 min or so but before cuttlefish it was never.
[16:16] <mikedawson> tnt: with 0.58 and 0.59, there were several issues achieving quorum. If you look back through bugs I originated or commented on, you'll get an idea of how bad it was in the weeks leading up to Cuttlefish.
[16:17] <mikedawson> tnt: I am not seeing any election or quorum issues any more as long as my leveldb's aren't growing unbounded
[16:19] * anthonyacquanita (~anthonyac@pool-173-50-173-226.pghkny.fios.verizon.net) Quit (Quit: Leaving)
[16:19] <tnt> joao: anything else I can do to narrow down the issue ?
[16:23] <joao> tnt, not at the moment
[16:24] <joao> I guess at the moment the hardest part is to make sense of what might be going on
[16:24] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[16:34] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: my troubles seem so far away, now yours are too...)
[16:34] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[16:34] * ChanServ sets mode +o scuttlemonkey
[16:34] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[16:36] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Quit: Leaving)
[16:37] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[16:43] <tnt> joao: ceph-mon.a.log:2013-05-16 13:00:24.954574 7f99d6699700 10 mon.a@0(leader).elector(562) bump_epoch 562 to 563
[16:43] <tnt> ceph-mon.a.log:2013-05-16 13:00:24.974705 7f99d6699700 10 mon.a@0(electing).elector(563) bump_epoch 563 to 564
[16:43] <tnt> is it normal there is two bump_epoch ?
[16:44] <joao> epoch is bumped to mark election start, and bumped again to mark election finish
[16:45] <joao> odd election epochs represent an ongoing election
[16:46] <tnt> ok i see, thanks.
[16:46] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[16:49] <tnt> Oh, wait, all the calls to bootstrap in Paxos.cc are in another debug domain ...
[16:49] * nooky (~nooky@190.221.31.66) has joined #ceph
[16:50] <tnt> Is it possible to set the debug level at runtime or is a restart needed ?
[16:52] <joao> I believe we can adjust it on-the-fly with 'ceph mon tell \* '--debug-paxos 10'
[16:53] * ccourtaut (~ccourtaut@2a01:e0b:1:119:88e6:75e4:2c00:3af4) Quit (Ping timeout: 480 seconds)
[16:54] <tnt> the doc gives as an example " ceph osd tell 0 injectargs '--debug-osd 20 --debug-ms 1' " but neither version seems to work.
[16:55] <joao> yeah, was missing 'injectargs'
[16:55] <joao> well
[16:55] * alrs (~lars@209.144.63.76) has joined #ceph
[16:55] <joao> for monitors you can simply run 'ceph -m IP:PORT injectargs <foo>'
[16:55] <joao> then again, the monitor must be in the quorum for this to work
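
    Putting joao's corrections together, a hedged example of raising monitor debug levels at
    runtime; the IP:PORT is a placeholder, and the single-monitor form only works while that
    monitor is in quorum:

        ceph mon tell \* injectargs '--debug-mon 10 --debug-paxos 10'
        # or against one specific monitor:
        ceph -m 192.168.0.11:6789 injectargs '--debug-mon 10 --debug-paxos 10'
        # drop the levels again once the logs are captured, e.g.:
        ceph mon tell \* injectargs '--debug-mon 1 --debug-paxos 1'
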
[16:56] <tnt> ah yes, that works, I was looking at ceph.log and not ceph-mon.a.log ...
[16:56] <joao> be back shortly
[17:03] <tnt> of course when you need it to happen ... it doesn't ...
[17:04] * bergerx_ (~bekir@78.188.101.175) Quit (Quit: Leaving.)
[17:04] * wschulze (~wschulze@38.98.115.249) Quit (Quit: Leaving.)
[17:05] <mrjack> re
[17:05] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:06] * vata (~vata@2607:fad8:4:6:f9b0:68a5:e595:8675) has joined #ceph
[17:07] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:09] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[17:11] * wschulze (~wschulze@38.98.115.249) has joined #ceph
[17:12] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[17:13] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[17:17] * Neptu (~Hej@mail.avtech.aero) has joined #ceph
[17:17] <Neptu> hej
[17:18] <Neptu> I'm just trying to understand if I could use ceph as a distributed replica among 3 locations
[17:19] <scuttlemonkey> neptu: only if those 3 locations are connected by a high-bandwidth, low-latency connection
[17:19] <Neptu> so far I understand it scales quickly, replicates and heals, but I'm unsure if it can treat a location as a different place that should contain a replica of all the files
[17:19] <tnt> yes, you can specify how to replicate.
[17:20] <scuttlemonkey> otherwise you'll have to a) run 2 clusters and use the RBD incremental snapshotting or b) wait for dumpling and use the RGW DR stuff being built
[17:21] * julienhuang (~julienhua@AMontsouris-553-1-3-234.w92-151.abo.wanadoo.fr) has joined #ceph
[17:21] <scuttlemonkey> Neptu: Ceph is network-topology-aware, so within a datacenter you can tell it to keep replicas on different rows/racks/etc...we just don't recommend telling it to put replicas on different continents yet :)
[17:24] <Neptu> coolish
[17:24] <Neptu> another question
[17:25] <Neptu> can I keep the last 2 months of files (all data) in 2 datacenters and 10 years of files in another datacenter?
[17:25] <Neptu> so the 10 year one can keep scaling horizontally
[17:26] <Neptu> I mean, can each location scale in size and in what needs to be replicated?
[17:26] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[17:26] <tnt> no, ceph won't do that for you automatically. you'll need some higher level logic for that.
[17:27] <scuttlemonkey> yeah we don't automatically deal with the concept of data "heat" (yet)
[17:27] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[17:27] <tnt> you can define different storage pools with different replication levels / targets but you'll need something to move files between pools.
[17:27] <scuttlemonkey> ^^
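
    A sketch of the per-pool approach tnt and scuttlemonkey describe; the pool names,
    placement-group counts, and sizes are illustrative, and moving objects between the pools
    remains the application's job:

        ceph osd pool create recent 512              # hot data, 3 copies
        ceph osd pool set recent size 3
        ceph osd pool create archive 512             # long-term data
        ceph osd pool set archive size 2
        ceph osd pool set archive crush_ruleset 2    # optionally point it at a different CRUSH rule / set of hosts
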
[17:27] * barryo (~borourke@cumberdale.ph.ed.ac.uk) Quit (Ping timeout: 480 seconds)
[17:27] <mtanski> Is there a way to build cephfs as an out-of-kernel module? I'd like to add fscache support to ceph
[17:28] <Neptu> ok so you can have a pool called long storage and another frequent acces but you need to move files from one pool to another yourself
[17:28] <Neptu> ok
[17:29] <Neptu> and how is synchronization between locations? do I need a big pipe or, if it's not so critical, can I get by with a slower line?
[17:29] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[17:29] * barryo (~borourke@cumberdale.ph.ed.ac.uk) has joined #ceph
[17:30] <tnt> Neptu: you need a fast line, all replication is synchronous and ceph doesn't like latency much.
[17:31] <Neptu> but will catch up?
[17:31] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[17:32] <tnt> It just won't work properly I think.
[17:32] <scuttlemonkey> mtanski: only options I know are kernel and FUSE
[17:32] <scuttlemonkey> mtanski: but I am far from the authoritative resource :)
[17:33] <Neptu> tnt: that is kind of a deal breaker, in the sense that for me the update time is not important; more important are the pool storage and the reliability and healing.... if I let it sync at night, no problem
[17:34] <Neptu> maybe I should have 3 ceph clusters and rsync them
[17:34] * markbby1 (~Adium@168.94.245.2) has joined #ceph
[17:34] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[17:34] * julienhuang (~julienhua@AMontsouris-553-1-3-234.w92-151.abo.wanadoo.fr) Quit (Quit: julienhuang)
[17:35] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[17:35] * ChanServ sets mode +v andreask
[17:36] * markbby (~Adium@168.94.245.2) Quit (Ping timeout: 480 seconds)
[17:38] * Romeo1979 (~roman@046-057-022-241.dyn.orange.at) has joined #ceph
[17:38] * markbby (~Adium@168.94.245.3) has joined #ceph
[17:38] * markbby1 (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[17:40] <Neptu> does any NAS brand come with "Ceph inside"??
[17:40] <Neptu> kinda
[17:40] * spicewiesel (~spicewies@2a01:4f8:191:316b:dcad:caff:feff:ee19) has joined #ceph
[17:40] * glowell (~glowell@ip-64-134-236-4.public.wayport.net) Quit (Quit: Leaving.)
[17:40] * ScOut3R_ (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[17:40] <spicewiesel> hi all.
[17:40] * Romeo_ (~romeo@198.144.195.85) has joined #ceph
[17:41] * jksM (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[17:41] * Cube1 (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:42] * markbby (~Adium@168.94.245.3) Quit (Remote host closed the connection)
[17:42] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[17:42] * Fetch__ (fetch@gimel.cepheid.org) has joined #ceph
[17:43] <spicewiesel> If ceph is set to store 3 copies of each object, where will these copies be stored in the cluster? are they copied to 3 (randomly chosen) osds? are they mirrored to every osd? could anyone help me to understand that? thanks in advance.
[17:43] * iggy2 (~iggy@theiggy.com) has joined #ceph
[17:43] <jksM> spicewiesel, you can set that up using the crushmap configuration
[17:44] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[17:44] * jks (~jks@3e6b5724.rev.stofanet.dk) Quit (Read error: Connection reset by peer)
[17:44] * Romeo (~romeo@198.144.195.85) Quit (Read error: Connection reset by peer)
[17:44] * Fetch (fetch@gimel.cepheid.org) Quit (Read error: Connection reset by peer)
[17:44] * iggy (~iggy@theiggy.com) Quit (Remote host closed the connection)
[17:44] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (Quit: ZNC - http://znc.sourceforge.net)
[17:44] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[17:44] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:44] <spicewiesel> ok, so it depends on the configuration? Can't I assume that I need 3 GB of storage to store 1 GB in the cluster, if the copies are set to 3?
[17:45] <jksM> spicewiesel, as far as I know, yes you can assume that (roughly)
[17:45] * mistur (~yoann@kewl.mistur.org) Quit (Remote host closed the connection)
[17:45] <spicewiesel> ok :)
[17:45] * mistur (~yoann@kewl.mistur.org) has joined #ceph
[17:45] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[17:45] <jmlowe> spicewiesel: plus some overhead
[17:45] <spicewiesel> so, the copies are not mirrored in the cluster. We could say: 3 Copies are stored on 3 OSDs
[17:45] <jksM> spicewiesel, what I meant is that with the crush map settings, you can decide how that data is spread across your cluster.. for example to make it one copy per rack
[17:45] <spicewiesel> jmlowe: okay
[17:45] * markbby (~Adium@168.94.245.2) has joined #ceph
[17:45] <jksM> spicewiesel, yes copies are not "mirrored" in that sense
[17:46] <spicewiesel> jksM: okay, then I have to check the crushmap documenation.
[17:46] <spicewiesel> okay
[17:46] <jksM> spicewiesel, you can see the available options here: http://ceph.com/docs/master/rados/operations/crush-map/
[17:46] <spicewiesel> It's just that a customer asked how much storage is needed to store a specified amount of data
[17:46] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[17:46] * The_Bishop (~bishop@2001:470:50b6:0:80d2:31ad:4852:4a37) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[17:47] <Neptu> tnt what is the preferred speed line speed for cephs if i have to shuffle around 30GB per day
[17:47] <jmlowe> spicewiesel: I have mine split across racks and have .4% overhead on 4164G
[17:47] <spicewiesel> jmlowe: thanks, nice to have those examples
[17:47] <jmlowe> spicewiesel: that's at 2x replication
[17:48] <jksM> I have overhead in the same range on my setup... the overhead is so small it is of no concern really
[17:48] <jmlowe> spicewiesel: 4164 GB data, 8345 GB used, 74529 GB / 82874 GB avail
[17:48] <darkfader> Neptu: that's like 728KB/s at 2x replikation
[17:48] <spicewiesel> so, in the crushmap I could define how many copies should be stored and where?
[17:49] <jmlowe> yep
[17:49] <jmlowe> step take default
[17:49] <jmlowe> step chooseleaf firstn 0 type rack
[17:49] <jmlowe> step emit
[17:49] <darkfader> Neptu: trying to say that you will need to mention how quick you want the data to be stored
[17:49] <spicewiesel> but in general, the copy option defines how much storage is used in the cluster. So I could tell them: if you want to work with 4 copies, then calculate your data*4
[17:49] <jmlowe> grabs a host from each rack for me
[17:50] <Neptu> darkfader, I don't really need quick, I need reliable basically
[17:50] <jmlowe> make that osd not host
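
    Assembled into a complete rule for reference, with jmlowe's osd-not-host correction folded
    into the comment; the rule name, ruleset number, and min/max sizes are illustrative:

        rule rack-replicated {
            ruleset 1
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type rack   # one OSD under each of N distinct racks
            step emit
        }
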
[17:50] <spicewiesel> ok
[17:50] <Neptu> darkfader, another thing would be to know whether the data is encrypted during transfer
[17:50] <darkfader> Neptu: i think the people who tried using 100mbit were really unhappy. so gigabit, and that'll do at that rate i guess
[17:50] * sagelap (~sage@2600:1012:b001:827d:9d3b:a069:66e9:a44d) has joined #ceph
[17:51] <jmlowe> spicewiesel: there is a crushmap simulation tool, dmick showed it to me awhile ago, it will simulate the placement of a bunch of random objects so you can see if they are distributed as you expect
[17:51] * dpippenger (~riven@cpe-76-166-221-185.socal.res.rr.com) Quit (Quit: Leaving.)
[17:52] <spicewiesel> It's just that everyone is still thinking in RAID. So they asked: is my data mirrored to _every_ osd and how much storage do I need to store X GB of data?
[17:52] <jksM> jmlowe, oh! - really nice! - would be interested to try that tool some day
[17:52] <spicewiesel> jmlowe: nice!
[17:53] <jmlowe> jksM: yeah, that's what I thought
[17:53] <jksM> spicewiesel, it's probably best not to use the term "mirror" as it kind of implies a RAID-1 type of setup with two copies of the data
[17:53] <spicewiesel> exactly
[17:54] * rturk-away is now known as rturk
[17:54] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[17:55] <jmlowe> the important feature is that the primary gets the data from the client and doesn't return success until it has forwarded the write on to the secondary/secondaries and they have returned success; in other words, the client writes once and isn't notified that the write was successful until all the replicas finish
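A simple way to observe that from a client, assuming a test pool (here called 'data') already exists: rados put only returns once the write has been acknowledged by all replicas.

    dd if=/dev/zero of=/tmp/obj bs=4M count=1
    rados -p data put testobj /tmp/obj   # blocks until every replica has acknowledged the write
    rados -p data stat testobj           # confirm the object landed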
[17:57] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Quit: Leaving)
[17:57] * themgt (~themgt@96-37-28-221.dhcp.gnvl.sc.charter.com) has joined #ceph
[17:57] <jmlowe> crushtool --test --output-csv -i crushmap
[17:58] <jmlowe> assuming you've dumped your crushmap as "./crushmap"
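A rough end-to-end sketch of what jmlowe describes (commands as quoted here; the output filenames are just placeholders):

    ceph osd getcrushmap -o crushmap           # fetch the compiled crushmap from the cluster
    crushtool --test --output-csv -i crushmap  # simulate placements; writes a set of .csv files
    crushtool -d crushmap -o crushmap.txt      # optional: decompile it to inspect/edit the rules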
[17:58] * wschulze (~wschulze@38.98.115.249) Quit (Quit: Leaving.)
[17:59] <spicewiesel> ok, I think my little questions are answered, thanks guys!
[17:59] <spicewiesel> I will check the crushmap config now and try to play around with crushtool :)
[18:02] <jmlowe> dumped or compiled, I should have said
[18:02] <jksM> jmlowe, just trying it out - got a tonne of csv files now ;-)
[18:03] <jksM> gotta go, but I'll check it out a bit later - thanks!
[18:04] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:04] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Remote host closed the connection)
[18:04] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[18:04] * glowell (~glowell@ip-64-134-236-4.public.wayport.net) has joined #ceph
[18:05] * aliguori (~anthony@32.97.110.51) has joined #ceph
[18:06] <sagelap> joao: ping!
[18:07] * glowell (~glowell@ip-64-134-236-4.public.wayport.net) Quit ()
[18:08] * BillK (~BillK@124-169-186-145.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[18:09] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[18:13] * tnt (~tnt@91.177.224.32) has joined #ceph
[18:14] <joao> sage, pong
[18:14] <joao> sagelap, ^
[18:17] * markbby1 (~Adium@168.94.245.2) has joined #ceph
[18:17] * markbby (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[18:20] * sagelap1 (~sage@2600:1012:b02d:3677:e8fd:760b:f439:1f3) has joined #ceph
[18:22] * markbby (~Adium@168.94.245.2) has joined #ceph
[18:23] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[18:24] * markbby1 (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[18:24] * sagelap (~sage@2600:1012:b001:827d:9d3b:a069:66e9:a44d) Quit (Ping timeout: 480 seconds)
[18:24] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[18:24] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[18:25] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[18:31] * sagelap1 (~sage@2600:1012:b02d:3677:e8fd:760b:f439:1f3) Quit (Ping timeout: 481 seconds)
[18:34] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) has joined #ceph
[18:35] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) Quit ()
[18:35] * rturk is now known as rturk-away
[18:39] * Cube1 (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[18:54] * tkensiski1 (~tkensiski@15.sub-70-197-9.myvzw.com) has joined #ceph
[18:55] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Remote host closed the connection)
[18:55] * tkensiski1 (~tkensiski@15.sub-70-197-9.myvzw.com) has left #ceph
[18:57] <alex_> ceph-deploy list disks isn't listing disks on the OS controller (there are 3 other disks on it that are not formatted), but it shows the disks on the other controllers?
[18:57] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[18:58] * alram (~alram@38.122.20.226) has joined #ceph
[19:00] * matt_ (~matt@220-245-1-152.static.tpgi.com.au) Quit (Ping timeout: 480 seconds)
[19:01] * iggy2 is now known as iggy
[19:04] * Cube (~Cube@12.248.40.138) has joined #ceph
[19:09] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[19:10] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[19:17] * jjgalvez (~jjgalvez@12.248.40.138) has joined #ceph
[19:19] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[19:21] * markbby (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[19:22] * Cube1 (~Cube@12.248.40.138) has joined #ceph
[19:24] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[19:24] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[19:26] * mtanski (~mtanski@69.193.178.202) Quit (Quit: mtanski)
[19:27] * markbby (~Adium@168.94.245.2) has joined #ceph
[19:27] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[19:29] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[19:30] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[19:32] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[19:32] * glowell (~glowell@12.248.40.138) has joined #ceph
[19:33] <mjblw> When I ask this question, I am referring to http://ceph.com/docs/master/architecture/, #4 Replication and the workflow illustration contained in that section. When we are talking about writing to an OSD in this illustration, are we talking about writing to the OSD's *journal* or the actual underlying OSD device?
[19:37] <paravoid> sagewk: so having librados2 0.61.2 with ceph 0.56.6 sounds like a road less traveled and kinda scary :)
[19:38] <sagewk> should not be any problems
[19:38] <paravoid> maybe I should just upgrade to 0.61
[19:38] <sagewk> ceph just has the daemons, and they are all statically linked.
[19:38] <sagewk> well, you should do that too. :)
[19:39] <paravoid> restarting osds results in ~5 minutes of peering, instability, slow requests
[19:39] <paravoid> not sure how I'll fully migrate to 0.61 without bringing down the site :)
[19:39] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:41] <sagewk> restarting all osds you mean? or just a single one?
[19:41] <sagewk> i would do one at a time...
[19:41] <gregaf1> mjblw: the journal is part of the OSD ;) and writing to an OSD usually means getting it committed to the journal
[19:41] <paravoid> just a single one
[19:42] <sagewk> really?!? that is not right :/. is logging off?
[19:43] <paravoid> yes
[19:43] <paravoid> it's not very nice, no :)
[19:43] <sagewk> also: i futzed around with udev on wheezy for a while last night and can't figure out what is wrong with the rule. the events are coming through with the right properties so i'm not sure why the rule isn't triggering.
[19:43] <mjblw> so, the ack is sent after the write is committed to the journal? Is that right gregaf1?
[19:44] <paravoid> we lost a disk the other day, it escalated to a full-blown outage for a minute
[19:44] <paravoid> pages and everything
[19:44] <paravoid> (did I mention ceph is in production now?)
[19:45] <gregaf1> mjblw: there's a "safe" ack and an "applied" (to the filesystem) ack, and the safe one gets sent after the write is durable to the journal, yes
[19:45] <cjh_> nhm_: i tried making a large raid6 volume and cutting it up into partitions, running an osd on each partition. seems to really be bad for performance haha.
[19:46] <cjh_> nhm_: i started with 10 partitions and i'm going to reduce it to 5 and see what that does
[19:46] <mjblw> gregaf1: Ok, so which ack is it in the context of the diagram at http://ceph.com/docs/master/architecture/ in the replication section?
[19:48] <gregaf1> yeah, that's the safe ack for the journal
[19:48] <paravoid> sagewk: can I see the rules?
[19:49] <sagewk> paravoid: eek! this is 0.56.4 right?
[19:49] <paravoid> mons are 0.56.6, osds are 0.56.4
[19:49] <paravoid> because I can't easily restart them
[19:49] <sagewk> it's what udev 175-7.2 installs at /lib/udev/rules.d/60-persistent-storage.rules
[19:50] <sagewk> the rules are the same as udev on precise (which work fine), modulo whitespace.
[19:50] <sagewk> how heavily loaded are the machines?
[19:51] <paravoid> sagewk: it's PART_ENTRY_{NAME,TYPE} instead of ID_PART_ENTRY_{NAME,TYPE}
[19:52] <sagewk> ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_UUID}=="?*", \
[19:52] <sagewk> SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}"
[19:52] <sagewk> is what i have..
[19:52] <paravoid> yeah, that doesn't work with udev <= 178
[19:52] <sagewk> and udevadm monitor --property shows those with the ID_ prefix when the events come thru
[19:52] <paravoid> because of http://git.kernel.org/cgit/linux/hotplug/udev.git/commit/?id=1b9e13e2e2c4755752e1e9fd8ff4399af7329ab8
[19:53] <paravoid> try "udevadm test /block/sda/sda1"
[19:53] <paravoid> | grep PART_ENTRY_UUID
[19:53] <mjblw> gregaf1: so I'm clear then, the safe ack being received by the client is sufficient for the client to send its next I/O request, right? The client isn't waiting for committal of data to the backing OSD before being able to go about its business, right?
[19:54] <gregaf1> right
[19:54] <gregaf1> the client can forget about the request once it gets a safe ack
[19:54] <sagewk> ID_PART_ENTRY_SCHEME=gpt
[19:54] <sagewk> ID_PART_ENTRY_SIZE=1950349279
[19:54] <sagewk> ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-062c0ceff05d
[19:54] <sagewk> ID_PART_ENTRY_UUID=4f08794b-4b1b-448e-ad70-f1e07b927740
[19:54] <sagewk> ID_PART_TABLE_TYPE=gpt
[19:54] <sagewk> ...
[19:54] <paravoid> huh
[19:54] <paravoid> that's strange
[19:54] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[19:54] <sagewk> yeah :)
[19:54] <paravoid> is that stock wheezy?
[19:54] <sagewk> afaics
[19:55] <sagewk> i've apt-get install --reinstall'd udev
[19:55] <mjblw> gregaf1: thanks for clearing that up for me.
[19:56] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[19:56] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[19:56] <paravoid> https://ganglia.wikimedia.org/latest/?c=Ceph%20eqiad&h=ms-be1002.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2
[19:56] <paravoid> that's how much the machines are loaded
[19:57] <paravoid> (not at all)
[19:57] * JackW (~oftc-webi@staff.ndchost.com) has joined #ceph
[19:57] * JackW (~oftc-webi@staff.ndchost.com) Quit ()
[20:02] * alrs (~lars@209.144.63.76) Quit (Ping timeout: 480 seconds)
[20:06] <tnt> joao: http://pastebin.com/8PGJL8Sn
[20:06] <loicd> sjust: PG::merge_log updates info.log_tail and info.stats ( under certain conditions ) around https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L605
[20:06] <tnt> 2013-05-16 17:41:29.757033 7f09093cb700 5 mon.c@2(peon).paxos(paxos updating c 27572138..27572157) lease_timeout -- calling new election
[20:07] <loicd> I could extract the change from merge_log so that it only has side effects on log & missing. What do you think sjust sjustlaptop ?
[20:07] <paravoid> sagewk: I'm trying to find a testing setup with wheezy to test those rules but no luck
[20:07] <paravoid> won't work with losetup
[20:07] <sjusthm> loicd: that sounds reasonable
[20:08] <loicd> there also is https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L678 updating info.last_update & info.purged_snaps
[20:08] <loicd> sjusthm: I missed the one nick you're using :-D
[20:09] <sjustlaptop> loicd: no worries
[20:09] <loicd> ahaha
[20:09] <paravoid> lol
[20:11] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:13] <sagewk> paravoid: can you attach some info about the cluster (ceph osd tree, ceph osd dump) and maybe a snippet of ceph.log after an osd restart to this bug? http://tracker.ceph.com/issues/5084
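The commands behind that request are straightforward; a sketch, with the output filenames and the log path being the usual defaults rather than anything sagewk specified:

    ceph osd tree > osd-tree.txt
    ceph osd dump > osd-dump.txt
    # plus the window of /var/log/ceph/ceph.log covering the osd restart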
[20:13] <paravoid> oh heh
[20:13] <paravoid> I can :)
[20:13] <sagewk> tnx :)
[20:14] <paravoid> we've had a bit more traffic since this week and peering was better the day before yesterday
[20:14] <paravoid> let's see how much downtime an osd restart cycle will bring
[20:15] <paravoid> sagewk: can you try sed -i s/ID_PART_ENTRY/PART_ENTRY/ /lib/udev/rules.d/60-persistent-storage.rules
[20:16] <paravoid> then udevadm control --reload-rules; udevadm trigger ?
[20:19] <tnt> Anyone knows what a lease_timeout is in the context of paxos ? Timeout seems to be 10s by default, but how often is it renewed ?
[20:22] * alrs (~lars@cpe-142-129-65-37.socal.res.rr.com) has joined #ceph
[20:24] * tkensiski (~tkensiski@209.66.64.134) has joined #ceph
[20:24] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[20:24] * tkensiski (~tkensiski@209.66.64.134) has left #ceph
[20:26] <paravoid> sagewk: done
[20:27] * rustam (~rustam@94.15.91.30) has joined #ceph
[20:28] <gregaf1> tnt: I think it's every lease_timeout/2 time units, but I'm not certain — joao probably has it in his head
[20:28] <tnt> "handle_lease_ack from mon.1 -- stray (probably since revoked)"
[20:29] <tnt> I have these all the time. "stray" doesn't sound like normal behavior.
[20:30] <paravoid> sagewk: I also think it has gotten a bit better this week with the increased traffic
[20:30] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[20:30] <paravoid> but that's just a feeling
[20:31] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:37] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[20:40] <tnt> Something looks weird in the way Paxos::extend_lease() works.
[20:40] * coyo (~unf@71.21.193.106) has joined #ceph
[20:41] <tnt> Each call to it will schedule another renew event for later, so that it's periodically called once the cycle is initiated.
[20:41] * markbby (~Adium@168.94.245.2) Quit (Quit: Leaving.)
[20:42] <tnt> but extend_lease is also called from other places in the code, and previously scheduled events are not cancelled, so won't that create more and more renew events?
[20:43] <gregaf1> I believe there's a trap somewhere in there for the extra renewals
[20:45] <tnt> Well I don't see it.
[20:45] <sagewk> paravoid: wth, now it is all working fine (without any changes). grr.
[20:46] <paravoid> without the sed?
[20:46] <sagewk> well i switched it back.
[20:46] <sagewk> oh, it's bc /dev/disk/by-partuuid is now populated. blerg.
[20:47] <sagewk> yeah nm, still broken.
[20:47] <gregaf1> tnt: it's only called from lease_renew_timeout() and on events when Paxos moves into active (from electing)
[20:49] <tnt> gregaf1: well, if it was only called on timeout it should be called ~ once every 5 sec or so, but in the logs, I see it called several times per second.
[20:49] <gregaf1> and lease_renew_timeout is only called by the C_LeaseRenew context, created in extend_lease itself
[20:49] <tnt> http://pastebin.com/eLj8Yt7s
[20:49] <gregaf1> do you have a bunch of elections going on?
[20:49] <tnt> No.
[20:50] <paravoid> sagewk: tried with the sed above?
[20:50] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:50] * b1tbkt (~quassel@24-216-67-250.dhcp.stls.mo.charter.com) Quit (Read error: No route to host)
[20:50] * b1tbkt_ (~quassel@24-216-67-250.dhcp.stls.mo.charter.com) has joined #ceph
[20:51] <tnt> gregaf1: I have some spurious ones (i.e. happening without reason since all mons are up) and traced it back to a lease timeout on one of the peons. And when trying to see why that was, I found that the master was calling renew a lot ... except right before the peon timeout where it just stops renewing ...
[20:51] <tnt> but the last election was over 1 hour ago ...
[20:51] <loicd> sjusthm: merge_log also indirectly modifies snap_mapper https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L422 . I'll assume it also has to happen outside of the merge logic unless you tell me otherwise ;-)
[20:52] <tnt> gregaf1: so AFAICT, something weird is going on.
[20:54] * markbby (~Adium@168.94.245.2) has joined #ceph
[20:54] <sjustlaptop> loicd: yeah
[20:55] <gregaf1> that does seem odd, but the code all looks good to me
[20:55] <gregaf1> what version?
[20:56] <gregaf1> and you should make a ticket I guess; I'm not even sure if this is actually a problem or not but I don't have the time to dig into it so I'll have to pass this to joao when he's around
[21:01] <tnt> 0.61.2
[21:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[21:04] <tnt> gregaf1: do you know joao 's timezone ?
[21:04] * lofejndif (~lsqavnbok@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[21:10] * Cube1 (~Cube@12.248.40.138) Quit (Quit: Leaving.)
[21:17] <mjblw> In a scenario where there is heavy read access (basically a maxed-out cluster of reads), how long does a write that is committed to the journal sit in the journal before the write to the backing OSD is forced?
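For reference, the journal-to-filestore flush cadence is controlled by the filestore sync interval options; a ceph.conf sketch with what I believe are the defaults (values may differ between releases, so double-check before relying on them):

    [osd]
        filestore min sync interval = 0.01   # don't start a sync more often than this (seconds)
        filestore max sync interval = 5      # force a sync of the backing filesystem at least this often (seconds)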
[21:23] * LeaChim (~LeaChim@176.250.188.136) Quit (Ping timeout: 480 seconds)
[21:32] * themgt (~themgt@96-37-28-221.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[21:34] * rustam (~rustam@94.15.91.30) has joined #ceph
[21:34] * LeaChim (~LeaChim@176.250.188.136) has joined #ceph
[21:35] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[21:38] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:38] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:43] <sagewk> paravoid: ok, i have a stripped down rules file that seems to work. is there an easy way to include it only in certain distributions for the deb, or should we just include it always (since it's effectively a no-op on a newer udev)?
[21:44] * saaby (~as@mail.saaby.com) Quit (Remote host closed the connection)
[21:47] <sagewk> btw, i think the bug is just the rule ordering.. the by-partuuid rules are above the blkid rule that does the probe.
[21:49] <paravoid> what do you mean the ordering?
[21:49] <paravoid> it's not the ID_ thing?
[21:50] <paravoid> sagewk: ^
[21:50] <sagewk> nope, not ID_
[21:51] <sagewk> KERNEL!="sr*", IMPORT{program}="/sbin/blkid -o udev -p $tempnode"
[21:51] <sagewk> happens before the by-partuuid rules in the working rules file, and after them in the broken one.
[21:51] <sagewk> just moving them to the bottom in wheezy's rule file fixes it
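So the fix sagewk describes amounts to making sure the blkid import runs before any rule that consumes its ID_PART_ENTRY_* properties; schematically (both lines as quoted earlier in this log, the ordering being the only change):

    # probe first, so the ID_PART_ENTRY_* properties get populated...
    KERNEL!="sr*", IMPORT{program}="/sbin/blkid -o udev -p $tempnode"
    # ...and only then the rules that depend on them
    ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_UUID}=="?*", \
      SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}"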
[21:53] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[21:53] * ChanServ sets mode +v andreask
[21:53] <paravoid> ha!
[21:54] <paravoid> *just* that fixes it?
[21:54] <paravoid> are you sure?
[21:54] <paravoid> I wonder what those udev commits were about then
[21:54] <paravoid> but if that's the case, this is a Debian-specific bug
[21:54] <paravoid> (the rules are different in upstream udev)
[21:55] <sagewk> let me double- dboule check :)
[21:55] <paravoid> :-)
[21:57] <sagewk> yup
[21:57] <sagewk> http://fpaste.org/12630/73422513/
[21:57] <paravoid> ha
[21:57] <paravoid> good catch!
[21:58] <sagewk> there are a bunch of other rules in there that *might* also be affected.. they're doing stuff with serial numbers. not sure if that comes from the blkid probe or not
[21:58] <sagewk> fwiw also, just copying the precise rules file works. :)
[21:59] <sagewk> 175-0ubuntu1 vs 175-7.2
[22:00] * themgt (~themgt@24-177-232-33.dhcp.gnvl.sc.charter.com) has joined #ceph
[22:04] <sagewk> paravoid: i wonder if the deep scrubs are slowing things down. or if the slow peering corresponds to a single osd.
[22:04] <sagewk> commented on http://tracker.ceph.com/issues/5084
[22:05] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:09] * glowell (~glowell@12.248.40.138) Quit (Quit: Leaving.)
[22:09] * Cube (~Cube@12.248.40.138) has joined #ceph
[22:09] * glowell (~glowell@12.248.40.138) has joined #ceph
[22:14] <paravoid> sagewk: but it's definitely not normal?
[22:14] <paravoid> this has been my experience all along
[22:14] <sagewk> it should be 10s of seconds at most
[22:14] <sagewk> not minutes
[22:15] <paravoid> I remember in my early days you fixing a bug where idle pgs weren't peered or something
[22:15] <sagewk> they weren't persisting their epoch to disk.. that was a bit different.
[22:16] <paravoid> noscrub/nodeepscrub aren't in bobtail
[22:17] <paravoid> tell osd-max-scrubs 0?
[22:19] * loicd trying to figure out if PG::missing_loc (which seems to be a soid => OSD map) should stay in PG or be moved away with PG::missing. PG::missing & PG::log will apparently not need to know anything about OSDs and I'm tempted to leave missing_loc in PG, along with search_for_missing & discover_all_missing
[22:21] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[22:22] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[22:22] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[22:23] <joao> tnt, I'm on GMT
[22:24] <joao> also, gregaf1, what should I be looking at?
[22:25] <paravoid> sagewk: replied to the Debian bug report with your findings (and your name :), let's see if this gets the maintainer's attention
[22:25] <tnt> joao: I was looking at why mon.2 requested an election. Turned out to be because of a "mon.c@2(peon).paxos(paxos updating c 27572138..27572157) lease_timeout -- calling new election"
[22:25] <joao> hmm
[22:25] <tnt> joao: and when looking at the leader which is supposed to send the lease, I found some weird messages.
[22:26] <joao> please do share
[22:26] <tnt> joao: is there a mail where I can send you the full logs of the 3 mons ? It's a bit big to pastebin :p
[22:28] <tnt> First weird stuff is that extend_lease is called several times per second.
[22:29] <tnt> except when mon.c times out where suddenly it's not called for like 20 sec.
[22:29] <tnt> There is also a bunch of "handle_lease_ack from xxx -- stray (probably since revoked)".
[22:30] <sagewk> paravoid: 'osd scrub {min,max} interval = 86400000'
[22:30] <sagewk> ought to do it.
[22:30] <paravoid> max-scrubs 0 seems to be doing it
[22:30] <sagewk> k
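Putting the two approaches mentioned here side by side (the exact tell syntax varies a bit between releases, so treat this as a sketch): disabling scrubs at runtime via injectargs, or pushing the scrub intervals out via ceph.conf as sagewk suggests.

    ceph osd tell \* injectargs '--osd-max-scrubs 0'   # stop scheduling new scrubs cluster-wide
    ceph osd tell \* injectargs '--osd-max-scrubs 1'   # put the default back afterwards

    # persistent alternative in ceph.conf, per sagewk's suggestion above:
    [osd]
        osd scrub min interval = 86400000
        osd scrub max interval = 86400000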
[22:30] <tnt> joao: so I'm wondering if Paxos::cancel_events() isn't called a bit early and the auto extend timeout is canceled before a new one is really ready to be sent.
[22:31] <joao> tnt, before, we used to hit the lease_timeout stuff when clocks were out of sync, there was a considerable latency or a monitor was overloaded, but then the timeout was 0.5s or something; now it's set at 10 seconds, so we shouldn't be hitting that for those reasons
[22:31] <joao> my guess is that there's a compaction going on on the leader and it might be blocked trying to read/write to leveldb
[22:31] <tnt> Yes, I've seen the 10s timeout and it should renew every 3s or 5s I think.
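For reference, the monitor lease knobs being discussed (option names as in the cuttlefish-era code; the values below are the defaults as I recall them, so double-check before relying on them):

    [mon]
        mon lease = 5                  # how long a peon's lease is valid, in seconds
        mon lease renew interval = 3   # how often the leader re-extends the lease
        mon lease ack timeout = 10     # timeout on lease traffic before an election is called (the 10s tnt mentions)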
[22:32] <PerlStalker> And I just added another 70TB to my ceph cluster. :-)
[22:32] <joao> that's the only thing that would take long enough to trigger that and the only thing that comes to mind that was recently introduced
[22:32] <via> my monitor is failing with 0.61.2, with this assert: https://pastee.org/8me59
[22:32] * Tamil (~tamil@38.122.20.226) has joined #ceph
[22:32] <joao> (also, I'm sensing a pattern in my tendencies to blame it all on leveldb)
[22:33] <joao> sagewk, considering I haven't yet looked at logs, does what I said above make sense to you?
[22:34] <joao> via, that feels an awful lot like #4999
[22:34] <sagewk> joao: yeah, my guess is compaction
[22:35] <sagewk> and my guess with that is its related to load on the disk.. probably leveldb is starving the compaction or something
[22:35] <joao> yeah, we either get smarter on when to compact, or we fix leveldb
[22:35] <joao> I just wish I could reproduce that locally
[22:36] <sagewk> joao: what do you use to load test it?
[22:36] <joao> well, dinner is ready and there's an angry mob of three shouting from across the house waiting for me
[22:36] <sagewk> i wonder if we could reproduce it if we run something else on the same disk (iozone or dbench or whatever)
[22:36] <joao> sagewk, been overloading it with the workloadgen
[22:36] <sagewk> joao: enjoy!
[22:36] <sagewk> k
[22:37] <joao> aside from that, created a standalone tool to reflect the monitor's workload
[22:37] <joao> but that failed in reproducing the issue
[22:37] <joao> might take another crack at it though
[22:37] <joao> well, brb
[22:37] <tnt> joao: I'll try to add instrumentation on the disk IO, but it's a pretty fast disk and the cluster was pretty much idle, so why would the mon generate that much traffic?
[22:39] <paravoid> sagewk: so, 2 minutes for peering to finish, another 2 minutes with 2 pgs in recovery_wait and slow requests
[22:39] <paravoid> let me upload peering.txt
[22:39] <tnt> also, when the lease renew timeout should have kicked in, the monitor was still logging "normally" AFAICT.
[22:40] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[22:41] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[22:43] <tnt> joao: actually I might have misidentified the reason for the new election ( for the one at 17h41 at least ). Argh, I'm getting lost in those logs.
[22:46] <tnt> ok, no, that was correct except it's mon.b and not mon.c that timed out its lease first.
[22:52] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[22:53] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[22:53] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[22:58] * madkiss (~madkiss@2001:6f8:12c3:f00f:f4d2:eae4:fd37:f894) has joined #ceph
[23:00] <pioto> so, what's the limiting factor behind the magic "100 PGs per OSD" rule? cpu/ram/disk throughput/...?
[23:01] <sagewk> paravoid: it looks like osd.3 is your problem
[23:01] <paravoid> osd.3 is the one I restarted :)
[23:01] <sagewk> oh, hmm.
[23:02] * BillK (~BillK@124-169-186-145.dyn.iinet.net.au) has joined #ceph
[23:02] * markbby (~Adium@168.94.245.2) Quit (Quit: Leaving.)
[23:02] <pioto> since, with the cephfs security blueprint stuff... pgs are a limiting factor unless/until different cephfs dirs can have their location set to use different namespaces, not just pools
[23:04] <sagewk> cpu/memory/network overhead.
[23:04] <sagewk> right.
[23:04] <sagewk> namespaces are coming soon :)
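To make the rule of thumb concrete, the usual back-of-the-envelope sizing (common guidance rather than a hard rule) is pg_count ≈ (number of OSDs × 100) / replica count, rounded up to a power of two. For example:

    # 18 osds at 2x replication: 18 * 100 / 2 = 900, round up to 1024
    ceph osd pool create mypool 1024 1024    # pg_num and pgp_num; 'mypool' is just a placeholder name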
[23:05] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:c6d:6b04:701b:2b1b) Quit (Ping timeout: 480 seconds)
[23:07] <tnt> joao / sagewk : it's indeed the compaction. There is "mon.a@0(leader).paxosservice(pgmap) trim from 13720090 to 13720340" then nothing in the logs for 4 seconds. Once done, the first thing that runs is not the renew timeout (it would still be time to renew), but a call to propose_queued which actually cancels the lease_renew timeout, but doesn't renew the lease directly; that will only happen 3 more seconds later when the update is effectively processed, but by then the peon's lease has already timed out
[23:07] <sagewk> yup.
[23:07] <dmick> sagewk: it looks like mon tell never took anything but a numeric "who"?
[23:07] <dmick> (and *)
[23:08] <sagewk> right
[23:08] <sagewk> we should extend it to take a name
[23:08] <dmick> ok.
[23:12] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[23:14] <tnt> wouldn't more elections trigger more store.db writes? And then get into a bad cycle where elections cause writes, which cause timeouts, causing even more elections, etc ...?
[23:19] <joao> tnt, I wouldn't say it would cause more writes than usual, but it may cause a spurious burst of proposals if a significant amount of operations were queued during the election (which would otherwise have been proposed in due time)
[23:19] * Romeo1979 (~roman@046-057-022-241.dyn.orange.at) has left #ceph
[23:19] <joao> but that is not a problem; it has been like that since forever really
[23:19] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[23:19] <joao> well, kind of
[23:20] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[23:21] <tnt> ok. And wouldn't it make sense that in propose_queued, instead of just canceling the renew timeout, it should check if there is enough spare time for an update cycle before the renew, or if it should issue a renew right now?
[23:23] * alex_ (~chatzilla@d24-141-198-231.home.cgocable.net) Quit (Ping timeout: 480 seconds)
[23:26] <sagewk> jamespage: around?
[23:27] <joao> tnt, it absolutely makes sense
[23:27] <joao> at least I'm not seeing anything against it
[23:27] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:28] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[23:29] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[23:31] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:33] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[23:33] <mrjack> joao: i hit another bug
[23:33] <mrjack> mon/Monitor.cc: In function 'void Monitor::sync_timeout(entity_inst_t&)' thread 7fa02ed6f700 time 2013-05-16 23:31:57.308633
[23:33] <mrjack> mon/Monitor.cc: 1098: FAILED assert(sync_role == SYNC_ROLE_REQUESTER
[23:37] <jamespage> sagewk, yes
[23:37] <sagewk> jamespage: sent you an email.. we're wondering how to go about getting google-perftools built for arm
[23:38] <sagewk> upstream says it works, but there are only amd64 and i386 packages...
[23:38] <jamespage> sagewk, I'll take a look - I have an arm box locally I can test on
[23:39] <sagewk> sweet, thanks! we're specifically after armv7l
[23:39] <jamespage> sagewk, that matches OK with the armhf arch in Ubuntu I think
[23:40] <sagewk> cool
[23:40] <joao> mrjack, looks like #4999
[23:41] <joao> do you have logs with 'mon debug = 10' by any chance? :)
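For anyone wanting to capture that, a minimal sketch of the logging joao is asking for (the option is usually spelled 'debug mon'; put the section in ceph.conf on the affected monitor and restart it, and the companion levels here are a reasonable guess rather than his exact request):

    [mon]
        debug mon = 10
        debug paxos = 10
        debug ms = 1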
[23:44] * jjgalvez (~jjgalvez@12.248.40.138) Quit (Read error: Connection reset by peer)
[23:54] * tnt (~tnt@91.177.224.32) Quit (Read error: Operation timed out)
[23:57] * Tamil (~tamil@38.122.20.226) Quit (Quit: Leaving.)
[23:57] * Tamil (~tamil@38.122.20.226) has joined #ceph
[23:59] * spicewiesel (~spicewies@2a01:4f8:191:316b:dcad:caff:feff:ee19) has left #ceph
[23:59] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:59] * spicewiesel (~spicewies@2a01:4f8:191:316b:dcad:caff:feff:ee19) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.