#ceph IRC Log


IRC Log for 2013-03-12

Timestamps are in GMT/BST.

[0:00] * wer (~wer@wer.youfarted.net) Quit (Remote host closed the connection)
[0:00] * wer_ (~wer@wer.youfarted.net) has joined #ceph
[0:01] <fghaas> mo-: pleasure, happy to help
[0:08] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[0:12] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[0:26] * jlogan1 (~Thunderbi@ has joined #ceph
[0:26] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[0:26] * leseb (~leseb@ Quit (Remote host closed the connection)
[0:28] * fghaas (~florian@vpn13.hotsplots.net) Quit (Quit: Leaving.)
[0:31] * jlogan (~Thunderbi@2600:c00:3010:1:3500:efc8:eaed:66fd) Quit (Ping timeout: 480 seconds)
[0:38] <jlk> hey folks - any idea why this osd tree doesn't have a single top branch? http://pastebin.com/ufmbfvHv
[0:40] * alram (~alram@ has joined #ceph
[0:43] * lofejndif (~lsqavnbok@09GAAALO5.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[0:50] * yanzheng (~zhyan@ has joined #ceph
[0:51] * Cube (~Cube@ Quit (Quit: Leaving.)
[0:54] * stackevil (~stackevil@ Quit (Quit: There are 10 types of people. Those who understand binary and those who don't.)
[0:59] <noob2> prob cause your crush map is oddly setup
[1:00] <noob2> decompile it and take a look at it
[1:00] * BManojlovic (~steki@fo-d- Quit (Quit: I'm off, and you do whatever you want...)
[1:00] <jlk> noob2: the map's pretty basic http://pastebin.com/M7ZCVT3b
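[Editor's note: for readers following along, a minimal bobtail-era crush map with a single top-level root looks roughly like the fragment below. Names, ids, and weights are illustrative only, not taken from jlk's paste; an `osd tree` only shows one top branch when every host bucket hangs off a common root that the rules then `step take`.]

```
# Illustrative fragment only -- not jlk's actual map.
root default {
        id -10                  # negative ids are buckets, not osds
        alg straw
        hash 0                  # rjenkins1
        item host-a weight 1.000
        item host-b weight 1.000
}

rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
```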
[1:06] <noob2> yeah it seems ok
[1:06] <noob2> that is odd
[1:07] <noob2> can someone else take a look at his map?
[1:19] * xmltok_ (~xmltok@pool101.bizrate.com) has joined #ceph
[1:20] * xmltok_ (~xmltok@pool101.bizrate.com) Quit ()
[1:20] * xmltok (~xmltok@pool101.bizrate.com) Quit (Read error: Operation timed out)
[1:22] <jlk> I'm not sure if it's affecting performance or anything...
[1:24] * chutzpah (~chutz@ Quit (Quit: Leaving)
[1:25] * Merv31000 (~Merv.rent@gw.logics.net.au) has joined #ceph
[1:25] * jtang1 (~jtang@ has joined #ceph
[1:25] * sagelap1 (~sage@2600:1012:b010:c482:d879:2821:956b:44ae) has joined #ceph
[1:25] * jtang1 (~jtang@ Quit (Read error: Connection reset by peer)
[1:25] * sagelap (~sage@2607:f298:a:607:6845:ba75:64c3:82a8) Quit (Ping timeout: 480 seconds)
[1:25] * jtang1 (~jtang@ has joined #ceph
[1:25] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) Quit (Quit: Pogoapp - http://www.pogoapp.com)
[1:25] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[1:27] <noob2> does the storage count of data in there look correct for your replica level?
[1:28] * jtang1 (~jtang@ Quit ()
[1:29] <jlk> not sure how I figure that out?
[1:32] <noob2> i believe just a ceph -s
[1:32] <noob2> that shows you the health
[1:32] <noob2> and you can see the data and cluster counts for usage
[1:33] <noob2> i don't have a ceph cluster to work with at the moment so i'm just going by memory
[1:34] <jlk> my rep sizes are all 2 (ceph osd dump -o -|grep 'rep size') so that should work
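[Editor's note: the sanity check noob2 is suggesting can be sketched as follows. The raw usage `ceph -s` reports should be roughly the logical data size times the replica count; the figures below are made-up placeholders for the real `ceph -s` / `ceph osd dump` output, only the relationship matters.]

```shell
# Made-up figures standing in for real cluster output;
# the point is only: expected raw used ~= data * replicas.
data_gb=100     # logical "data" size (assumed, as reported by ceph -s)
rep_size=2      # as seen in: ceph osd dump -o - | grep 'rep size'
expected_raw_gb=$((data_gb * rep_size))
echo "expected raw usage: ${expected_raw_gb} GB"
```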
[1:35] * alram (~alram@ Quit (Quit: leaving)
[1:36] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[1:41] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[1:43] * noob2 (~cjh@ Quit (Quit: Leaving.)
[1:50] * stackevil (~stackevil@ has joined #ceph
[1:51] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[1:58] * dpippenger (~riven@ Quit (Remote host closed the connection)
[2:10] * sagelap1 (~sage@2600:1012:b010:c482:d879:2821:956b:44ae) Quit (Ping timeout: 480 seconds)
[2:11] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[2:24] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[2:25] * sagelap (~sage@ has joined #ceph
[2:26] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[2:27] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[2:27] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[2:30] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[2:32] * jlogan1 (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[2:34] * jlogan1 (~Thunderbi@ has joined #ceph
[2:37] * yanzheng (~zhyan@ has joined #ceph
[2:40] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[2:42] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[2:44] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[2:44] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[2:44] * rturk-away is now known as rturk
[3:00] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[3:06] * LeaChim (~LeaChim@b0faa428.bb.sky.com) Quit (Ping timeout: 480 seconds)
[3:06] * rturk is now known as rturk-away
[3:22] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[3:32] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[3:33] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[3:35] * sagelap (~sage@ has joined #ceph
[3:40] * dpippenger (~riven@cpe-75-85-17-224.socal.res.rr.com) has joined #ceph
[4:00] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[4:08] * The_Bishop__ (~bishop@f052101142.adsl.alicedsl.de) Quit (Quit: Who the hell is this Peer? If I catch him I'll reset his connection!)
[4:15] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[4:39] * lx0 is now known as lxo
[4:41] * jlogan1 (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[4:41] * sagelap (~sage@ Quit (Read error: Operation timed out)
[4:45] * jlogan1 (~Thunderbi@ has joined #ceph
[4:46] * sagelap (~sage@ has joined #ceph
[5:05] <MrNPP> looks like i took out my ceph
[5:05] <MrNPP> http://bit.ly/YX3O0N
[5:05] <MrNPP> journal failure?
[5:18] <dmick> looks like there's a problem with the OSD data store itself
[5:19] <dmick> what happened?
[5:53] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[5:57] * sagelap (~sage@ has joined #ceph
[6:04] <MrNPP> i'm not sure
[6:04] <MrNPP> i rebooted
[6:04] <MrNPP> it didn't come back up
[6:04] <MrNPP> HEALTH_WARN 507 pgs degraded; 535 pgs stale; 535 pgs stuck stale; 510 pgs stuck unclean; recovery 19440/79632 degraded (24.412%); 6/14 in osds are down
[6:05] <MrNPP> we are in the process of testing before we expand, this cluster has a gentoo vm on it, and some random files, nothing valuable, but i would love to learn to recover it
[6:05] <MrNPP> if its possible anyway
[6:05] <MrNPP> and find out why it happened :)
[6:11] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:11] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[6:13] * esammy (~esamuels@host-2-102-70-24.as13285.net) Quit (Quit: esammy)
[6:13] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[6:52] * jlogan1 (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[6:56] * jlogan (~Thunderbi@ has joined #ceph
[7:05] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[7:08] * sagelap (~sage@ has joined #ceph
[7:35] * stackevil (~stackevil@ Quit (Quit: This Mac has gone to sleep!)
[7:40] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Read error: Connection reset by peer)
[7:45] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) has joined #ceph
[7:46] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[7:49] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[7:51] * fghaas (~florian@ has joined #ceph
[8:06] * stackevil (~stackevil@ has joined #ceph
[8:14] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) Quit (Quit: Leaving.)
[8:14] * fghaas (~florian@ Quit (Quit: Leaving.)
[8:17] * Merv31000 (~Merv.rent@gw.logics.net.au) Quit (Quit: Leaving)
[8:17] * stackevil (~stackevil@ Quit (Quit: This Mac has gone to sleep!)
[8:29] * leseb (~leseb@ has joined #ceph
[8:52] <absynth> i can see someone else stumbled over the rbd_cache issue that we reported over two months ago
[8:52] * leseb (~leseb@ Quit (Read error: Connection reset by peer)
[8:52] * leseb (~leseb@ has joined #ceph
[8:52] <absynth> here's hoping it will finally get someone looking at it, after us pointing nearly everyone in the team to that issue
[8:56] * Morg (b2f95a11@ircip3.mibbit.com) has joined #ceph
[8:59] * jlogan1 (~Thunderbi@ has joined #ceph
[9:02] * jlogan (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[9:04] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[9:05] * Morg (b2f95a11@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[9:08] * jtang1 (~jtang@ has joined #ceph
[9:08] * sleinen (~Adium@2001:620:0:25:fc4b:81d9:af5a:ae78) has joined #ceph
[9:12] * BManojlovic (~steki@ has joined #ceph
[9:12] <nz_monkey_> absynth: You mean the one where when caching is on network performance goes to shit ?
[9:13] <nz_monkey_> absynth: If so, then yes we found that too. We thought it was just our 1gbit bonded NIC's until we went to 10Gbit and it was still happening
[9:16] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[9:16] <absynth> nz_monkey_: yeah
[9:17] <absynth> sucks balls, we are talking about latency in the multiple-seconds area
[9:17] * leseb_ (~leseb@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[9:18] <nz_monkey_> absynth: Good to see it confirmed by the users
[9:18] * Morg (b2f95a11@ircip2.mibbit.com) has joined #ceph
[9:18] <nz_monkey_> absynth: hopefully we see it before 0.61 as we want to move in to production soon
[9:18] <nz_monkey_> the fix that is!
[9:19] <Morg> morning
[9:20] <nz_monkey_> evening
[9:20] <Morg> ;]
[9:21] <absynth> nz_monkey_: and actually, we were the first to report the issue ;)
[9:21] <absynth> over 2 months ago
[9:21] * capri (~capri@ has joined #ceph
[9:22] <absynth> http://tracker.ceph.com/issues/3737
[9:22] <nz_monkey_> absynth: I'm not trying to claim that one! ;) We first noticed it early December, but thought it was our particular hardware.
[9:23] <Morg> i see that discussion about computing avail space in cephfs depending on replication lvl has died :/
[9:27] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[9:27] <nz_monkey_> absynth: Have you guys tested the patch ?
[9:28] <absynth> no, but i figure we might be able to do that today, our ceph wizard just came back to work after a week's holiday
[9:28] <absynth> we need to erase our test cluster soon, though
[9:30] * eschnou (~eschnou@46.85-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[9:31] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[9:34] <nz_monkey_> a holiday would be nice !
[9:36] <nz_monkey_> absynth: I think by the time we sort out the BIOS issues with our test cluster, the QEMU/RBD patch will hopefully be in the stable branch
[9:36] <absynth> we have been running a qemu/kvm/rbd cluster in prod for, err... i think it's a year now
[9:45] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[9:45] <absynth> nz_monkey_: from what i can see in the bug ticket, there is currently no patch for the issue
[9:45] <absynth> it's in progress for .60 and not marked as fixed in the bug tracker
[9:47] <absynth> and that's a high priority ticket :D
[9:47] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: If at first you don't succeed, skydiving is not for you)
[9:48] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[9:52] * jtang1 (~jtang@ has joined #ceph
[9:56] * l0nk (~alex@ has joined #ceph
[10:00] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[10:03] * yanzheng (~zhyan@ Quit (Quit: Leaving)
[10:06] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[10:08] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:10] * Philip__ (~Philip@hnvr-4dbd242e.pool.mediaWays.net) has joined #ceph
[10:10] * mcclurmc_laptop (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Quit: Leaving)
[10:14] * barryo (~borourke@cumberdale.ph.ed.ac.uk) Quit (Quit: Leaving.)
[10:18] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[10:18] * stackevil (~stackevil@cpe90-146-43-165.liwest.at) has joined #ceph
[10:25] * ninkotech (~duplo@ip-89-102-24-167.net.upcbroadband.cz) has joined #ceph
[10:32] * LeaChim (~LeaChim@b0faa428.bb.sky.com) has joined #ceph
[10:45] * ScOut3R (~ScOut3R@c83-249-233-227.bredband.comhem.se) has joined #ceph
[10:47] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[11:05] * jlogan1 (~Thunderbi@ Quit (Read error: Connection reset by peer)
[11:06] * jlogan (~Thunderbi@2600:c00:3010:1:25f1:2bb4:463f:d4d5) has joined #ceph
[11:07] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[11:11] * Philip_ (~Philip@hnvr-4d0797a7.pool.mediaWays.net) has joined #ceph
[11:13] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[11:16] * esammy (~esamuels@host-2-102-70-24.as13285.net) has joined #ceph
[11:18] * Philip__ (~Philip@hnvr-4dbd242e.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[11:31] * ramonskie (ab15507e@ircip1.mibbit.com) has joined #ceph
[11:32] * gerard_dethier (~Thunderbi@154.25-240-81.adsl-dyn.isp.belgacom.be) has joined #ceph
[11:34] <ramonskie> just tried the ceph barclamp, it seems it's installing the latest version, 0.58, but there seem to be problems when the chef script is executed, it fails on the "ceph-mon --mkfs" command with the error "AdminSocketConfigObs failed to bind the UNIX domain socket"
[11:35] <absynth> what the hell is a barclamp?
[11:35] <ramonskie> does someone have any ideas? i searched for this error and this problem was known in an older argonaut release
[11:36] <ramonskie> barclamp is a chef-crowbar script https://github.com/ceph/barclamp-ceph
[11:37] <ramonskie> lets say we just use the chef script :)
[11:42] <absynth> as you will have guessed, i have no clue of ceph :)
[11:42] <absynth> err
[11:42] <absynth> that was freudian
[11:42] <absynth> i meant chef
[11:42] <ramonskie> ah no problem
[11:43] <ramonskie> but my guess is it's not a problem in chef but a problem in the latest packages
[11:50] <joao> ramonskie, yeah, that's a problem with the chef scripts
[11:50] <joao> looks like they weren't updated according to the most recent mon requirement: the data store must be created prior to --mkfs
[11:51] <joao> I have no idea how chef works though
[11:51] <joao> I'll take a quick look to see if I can do anything wrt that
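[Editor's note: the ordering joao describes can be sketched as the two commands below. This is a hypothetical command fragment, not a tested recipe; the mon id "a" and the paths are illustrative, and it is what the cookbook needs to arrange, with the data directory created before `--mkfs` runs.]

```shell
# 1. create the mon data store first (recent mons require it to exist)
mkdir -p /var/lib/ceph/mon/ceph-a
# 2. only then run mkfs for monitor "a"
ceph-mon --mkfs -i a --keyring /tmp/keyring.bin
```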
[11:53] <BillK> with .58 and a small 3-osd cluster, I have to mark nodown/noout or the third osd keeps timing out and flapping until it settles after starting the cluster - known issue? (it was fine on .56)
[11:59] <joao> ramonskie, I've submitted a patch to ceph's cookbooks repository; I'll have someone who understands how this work better than I do have a look later today
[12:00] <absynth> BillK: does it assert and crash or what happens?
[12:00] <ramonskie> okay thanks
[12:00] <ramonskie> will take a look :)
[12:02] <joao> ramonskie, 'wip-mon-fix-mkfs' branch on 'ceph-cookbooks' repository
[12:02] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Read error: Connection reset by peer)
[12:02] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[12:03] <ramonskie> joao: just wanted to ask whihc branch..
[12:03] <BillK> nolan, osd seems quiet (low cpu/mem) compared to the other two, but eventually gets marked down and data starts to move. Try to mark it in, it won't stay in; nodown/noout keeps it in until the
[12:03] <BillK> startup load is past, then I can reset the flags and it works fine
[12:05] <BillK> absynth: load etc was fine on .56 and earlier, no sign of this problem.
[12:07] <absynth> uhm
[12:07] <absynth> do you see scrubbing?
[12:10] * leseb_ (~leseb@3.46-14-84.ripe.coltfrance.com) Quit (Remote host closed the connection)
[12:14] * ashish (~ashish@ has joined #ceph
[12:16] <BillK> absynth: only once it's settled down, but that's normal anyway?
[12:16] <absynth> yeah, but there are massive issues with scrubbing in bobtail
[12:16] <absynth> we have seen memleaks during deepscrub that brought OSDs to a grinding halt
[12:18] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[12:21] * ashish (~ashish@ Quit ()
[12:21] * ashish (~ashish.ch@ has joined #ceph
[12:23] <ashish> hi
[12:24] <ashish> Hi guys, just want to know is there any limit on the number of buckets/accout
[12:31] * leseb_ (~leseb@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[12:31] * barryo (~borourke@cumberdale.ph.ed.ac.uk) has joined #ceph
[12:39] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[12:45] * eschnou (~eschnou@46.85-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[12:48] * mib_sucb6r (c5579f12@ircip1.mibbit.com) has joined #ceph
[12:52] * mib_sucb6r (c5579f12@ircip1.mibbit.com) Quit ()
[12:53] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[12:56] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[12:57] * markbby (~Adium@ has joined #ceph
[12:58] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) has joined #ceph
[13:08] * dosaboy (~user1@host86-164-227-220.range86-164.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[13:09] * dosaboy (~user1@host86-164-227-220.range86-164.btcentralplus.com) has joined #ceph
[13:10] * dosaboy (~user1@host86-164-227-220.range86-164.btcentralplus.com) Quit ()
[13:10] * jlogan1 (~Thunderbi@2600:c00:3010:1:1416:2740:5081:c4f5) has joined #ceph
[13:11] * dosaboy (~user1@host86-164-227-220.range86-164.btcentralplus.com) has joined #ceph
[13:12] * jlogan (~Thunderbi@2600:c00:3010:1:25f1:2bb4:463f:d4d5) Quit (Ping timeout: 480 seconds)
[13:18] <ramonskie> joao: just tried it and that problem is now solved but now another problem occurs. "getting monitor state failed"
[13:21] * mcclurmc_laptop (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[13:21] <joao> ramonskie, getting out of my depth here, so bear with me for a sec
[13:21] <joao> ramonskie, I'm assuming your cluster has been deployed?
[13:21] <ramonskie> "/var/run/ceph/" does not exists so that is the problem i guess
[13:21] <joao> ah
[13:21] <joao> okay
[13:22] <joao> yeah, that would cause a problem, as the command being run on that step does require the admin socket that should be in /var/run/ceph/ceph-mon.foo.asok
[13:23] <ramonskie> one of the deb packages should arrange this right?
[13:24] <joao> I suppose; not sure which though
[13:25] <joao> I thought it would be created when installing 'ceph'
[13:25] <ramonskie> yes
[13:25] <ramonskie> humm okay i will try it with an older package, 0.56.3
[13:25] <ramonskie> i'm using 0.58 now
[13:26] <joao> ramonskie, would you mind filing a ticket for that?
[13:26] <ramonskie> will test with an older version first
[13:26] <joao> cool
[13:26] <joao> thanks
[13:29] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[13:33] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[13:33] * eschnou (~eschnou@ has joined #ceph
[13:34] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) has joined #ceph
[13:38] <ramonskie> joao: same problem with the older package
[13:38] <ashish> hello everyone can anybody here answer my following questions I am in real need:
[13:38] <ashish> 1) can the number of buckets/account be set to unlimited or a very high number which is practically not possible to achieve.
[13:38] <ashish> 2) if 1) is yes, then can it degrade the performance if the number of buckets/account goes really high.
[13:38] <ashish> 3) How many buckets/account you suggest.
[13:38] <ashish> 4) How many objects/bucket you suggest.
[13:38] <ashish> 5) After what number of objects/bucket performance will start degrading.
[13:43] * JohansGlock_ (~quassel@kantoor.transip.nl) has joined #ceph
[13:48] * mcclurmc_laptop (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[13:50] * JohansGlock (~quassel@kantoor.transip.nl) Quit (Ping timeout: 480 seconds)
[13:57] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) Quit (Quit: Leaving.)
[14:06] * BillK (~BillK@58-7-239-bcast.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[14:09] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) has joined #ceph
[14:09] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) Quit ()
[14:15] * CephLogBot (~PircBot@rockbox.widodh.nl) has joined #ceph
[14:15] * Topic is 'v0.56.3 has been released -- http://goo.gl/f3k3U || argonaut v0.48.3 released -- http://goo.gl/80aGP || New Ceph Monitor Changes http://ow.ly/ixgQN'
[14:15] * Set by scuttlemonkey!~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net on Fri Mar 08 00:25:28 CET 2013
[14:17] * tziOm (~bjornar@ has joined #ceph
[14:18] * fghaas (~florian@dhcp-admin-217-66-51-168.pixelpark.com) has joined #ceph
[14:18] * fghaas (~florian@dhcp-admin-217-66-51-168.pixelpark.com) has left #ceph
[14:20] * drokita (~drokita@ has joined #ceph
[14:23] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[14:28] * noahmehl (~noahmehl@cpe-75-186-45-161.cinci.res.rr.com) Quit (Read error: Operation timed out)
[14:32] * mcclurmc_laptop (~mcclurmc@firewall.ctxuk.citrix.com) has joined #ceph
[14:39] <scuttlemonkey> ashish: Ceph is meant to scale to very large numbers
[14:39] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[14:39] <scuttlemonkey> speaking specifically about the rados gateway for S3-type access and buckets/accounts, you can actually spin up multiple gateway machines and load balance them like you would any multi-headed web setup
[14:42] <scuttlemonkey> I know that doesn't specifically answer your questions...but I haven't heard of any meaningful upward bounding on Ceph or RGW
[14:42] <scuttlemonkey> ashish: what are you planning on using it for that has you concerned?
[14:43] <scuttlemonkey> the only thing you'll want to consider is buckets/account...since there is no way to limit/filter the ls request
[14:45] * The_Bishop (~bishop@e179013244.adsl.alicedsl.de) has joined #ceph
[14:45] <scuttlemonkey> also, it doesn't relate to one of your questions...but tangentially related: I did some playing with s3fs against a ceph gateway and it (s3fs) tends to fail if there are large numbers of files when you ls
[14:46] * diegows (~diegows@ has joined #ceph
[14:48] <Azrael> scuttlemonkey: rbd needs to communicate directly with the ceph mon + osd's, correct? one does not use rbd via radosgw, right?
[14:49] <scuttlemonkey> right, rbd and rgw serve different purposes
[14:50] <scuttlemonkey> in fact, you don't even need to spin up a gateway if all you plan on using is rbd
[14:50] <Azrael> yep
[14:51] <Azrael> i saw some posts about geo-repl with rbd or radosgw that had me confused
[14:51] <Azrael> on ceph-devel
[14:51] <scuttlemonkey> oh
[14:51] <scuttlemonkey> yeah, there is no "real" geo replication with Ceph yet
[14:51] <Azrael> yep
[14:51] <scuttlemonkey> that is one of the two major tasks being worked on at the moment
[14:52] <Azrael> you could do what i think they were talking about, which was georepl if all your objects are accessed via radosgw
[14:52] <scuttlemonkey> but there have been some workarounds using the gateway for eventually-consistent DR-type stuff
[14:52] * Morg (b2f95a11@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[14:52] <scuttlemonkey> I think Sebastien had a decent writeup on it
[14:52] <Azrael> they modified radosgw to also send objects (and actions?) into an amqp system, with another ceph cluster feeding off that queue and reproducing the results
[14:53] <scuttlemonkey> ahh, yeah that was one way
[14:54] <Azrael> scuttlemonkey: what do you think of the chef cookbook for chef? i'm evaluating chef (and then the ceph cookbook) as we speak
[14:54] <scuttlemonkey> there was another experiment here:
[14:54] <scuttlemonkey> http://www.sebastien-han.fr/blog/2013/01/28/ceph-geo-replication-sort-of/
[14:54] <Azrael> and also i keep accidentally typing "chef" when i mean "ceph", and vice versa, on the command line + config files heh
[14:54] <scuttlemonkey> Azrael: well, at last count I think there were 3 different ceph cookbooks
[14:55] <ashish> scuttlemonkey: I am planning to use ceph as the backend of my app, a Document Management System, but I don't want to create a ceph account for each of my app users. I will only use one ceph account and each bucket of the ceph account will map to a db schema. As I already have 1000s of schemas, that means there are going to be 1000s of buckets, and this number will always keep on rising. That raises a concern about
[14:55] <ashish> the performance of ceph; I have no idea whether there will be any hit on the performance of my app when the number of buckets goes really high.
[14:55] * PerlStalker (~PerlStalk@ has joined #ceph
[14:55] <Azrael> scuttlemonkey: thanks for the link. looks like a good read.
[14:56] <scuttlemonkey> Azrael: but to answer your question more directly...at least 2 of the chef cookbooks for Ceph are being actively maintained
[14:56] <Azrael> i think i only looked at the inktank one
[14:57] <scuttlemonkey> yeah, that's probably the best bet
[14:57] <scuttlemonkey> https://github.com/ceph/ceph-cookbooks
[14:57] <scuttlemonkey> they'll be getting all of the updated work Sage has been pouring in to ceph-deploy
[14:57] <Azrael> very nice
[14:57] <Azrael> do you think there will be knife commands for working with ceph
[14:57] <Azrael> or no need
[14:58] <scuttlemonkey> ashish: 1000s of buckets per user? Or just many buckets and many users?
[14:58] <scuttlemonkey> Azrael: I'm guessing that will come once the management API is more solidified for Ceph
[14:58] <scuttlemonkey> (which is the other of the two major tasks being hammered on at present)
[14:59] <Azrael> yup
[14:59] <Azrael> plus once ceph is "api-ify'd"
[14:59] <ashish> scuttlemonkey 1000s of buckets per user
[14:59] <Azrael> in reference to sage's email about ceph management apiness
[15:00] <scuttlemonkey> ashish: that seems quite high...you may see some performance hits if users are constantly listing out their buckets
[15:01] <ramonskie> Azrael: i'm trying the chef-cookbook now but i ran into some problems, let me know if it works for you..
[15:01] <Azrael> ramonskie: sure
[15:01] <ramonskie> thanks
[15:01] <Azrael> ramonskie: i read the code and thought a few areas needed some attention, but i dont know yet if it works
[15:02] <ramonskie> well i'm using crowbar/chef to install it so maybe there are some conflicts there
[15:02] <Azrael> oh yes i looked at crowbar
[15:03] <Azrael> you like it?
[15:03] <ashish> scuttlemonkey: but what if I say users will not list their buckets at all, or let's say my app will always know in advance which bucket to connect to.
[15:03] <scuttlemonkey> ramonskie: I know several production folks are using crowbar to deploy Ceph
[15:04] <Azrael> we decided not to use crowbar because its one more thing to add to our list of things to manage. we'd go crowbar if we did ceph, hadoop, and everything else via crowbar.
[15:04] <ramonskie> scuttlemonkey: i tried the one from inktank and that one works but its a 0.43 version
[15:05] <ramonskie> Azrael: we are using it to deploy openstack and really want to integrate it with ceph.
[15:05] <scuttlemonkey> really!?
[15:05] <scuttlemonkey> hrm
[15:05] <scuttlemonkey> lemme see where the bobtail barclamps live
[15:06] * vata (~vata@2607:fad8:4:6:b4a4:7a9e:be41:b0c5) has joined #ceph
[15:06] <Azrael> we may switch to openstack at some point
[15:06] <Azrael> and if that happens
[15:06] <Azrael> thus having hadoop, ceph, openstack....
[15:06] <Azrael> proooobably should think about crowbar heh
[15:06] <Azrael> since we also <3 dell
[15:08] <ramonskie> crowbar is still a bit buggy sometimes but crowbar 2.0 seems a really good improvement, still waiting for that release....
[15:09] <Azrael> nice
[15:09] <ramonskie> scuttlemonkey: this is the latest barclamp that i used https://github.com/ceph/barclamp-ceph with https://github.com/ceph/package-ceph-barclamp
[15:10] <scuttlemonkey> hmmm
[15:10] <scuttlemonkey> I'm not as familiar with chef/barclamp
[15:11] <scuttlemonkey> but I thought there was a way to specify the source location
[15:11] <ramonskie> never seen that
[15:13] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[15:13] * cashmont (~cashmont@c-76-18-76-30.hsd1.nm.comcast.net) has left #ceph
[15:16] * jlogan (~Thunderbi@2600:c00:3010:1:28a6:90b2:f1cc:3d54) has joined #ceph
[15:17] * jlogan1 (~Thunderbi@2600:c00:3010:1:1416:2740:5081:c4f5) Quit (Ping timeout: 480 seconds)
[15:21] * livekcats (~stackevil@cpe90-146-43-165.liwest.at) has joined #ceph
[15:21] * stackevil (~stackevil@cpe90-146-43-165.liwest.at) Quit (Read error: Connection reset by peer)
[15:22] <scuttlemonkey> https://github.com/ceph/ceph-cookbooks/blob/master/recipes/apt.rb
[15:24] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[15:24] * BillK (~BillK@58-7-234-139.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:25] <ramonskie> ah yes i disabled apt. we don't have internet on our test setup at the moment
[15:26] <scuttlemonkey> ahh
[15:27] <scuttlemonkey> I'd imagine there is a way to do it from tarball...but that is pure speculation from a crowbar-n00b
[15:28] <scuttlemonkey> could always spin up your own local apt server for ceph stuff....but that sounds like a lot of work :P
[15:29] <ramonskie> it is
[15:30] <ramonskie> next week we have a new test setup with internet so i think i should wait before i shoot myself... :P
[15:31] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:31] <scuttlemonkey> haha
[15:31] <scuttlemonkey> a wise plan
[15:33] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[15:34] * b1tbkt (~Peekaboo@68-184-193-142.dhcp.stls.mo.charter.com) has joined #ceph
[15:58] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[16:00] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit ()
[16:02] * aliguori (~anthony@ has joined #ceph
[16:02] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[16:03] * joelio has a much happier and more robust cluster now that the 9 bare-disk OSDs per host have been consolidated into 3 RAID0s, one per OSD, i.e. 3 OSDs per host (only 4GB of RAM in the OSD heads)
[16:04] <joelio> seemed to be falling over with heavy threaded I/O inside RBD backed VMs
[16:05] * Cube (~Cube@ has joined #ceph
[16:05] <joelio> on another note, has the recommended RAM usage for OSDs doubled recently... sure it was 1/2 GB, now it's a full GB
[16:05] <jmlowe> joelio: I've had better luck since I dropped btrfs raid 10 and used my raid controller the right way
[16:05] <janos> for heavy use i thought the recommended RAM per osd was 1-2GB
[16:06] <joelio> jmlowe: Well, I have no btrfs and I have a SAS HBA with no h/w RAID
[16:06] <joelio> so I guess I am using everything in the correct way
[16:08] <joelio> this is a test cluster btw - production will have adequate resource
[16:08] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[16:12] <joelio> janos: not sure, maybe I misread (quite likely)
[16:14] <janos> i had a faulty machine in my test that would (and still does) only see a little less than 2GB of its RAM, and it would drop out constantly
[16:14] <janos> was really frustrating until i realized what was going on
[16:16] <joelio> Sure thing. Wanted to ensure that even under abysmal load the RBD devs wouldn't go *pop* - we're now there in testing, which gives us massively more confidence in the product
[16:16] <joelio> and if I can make it go on this pile of inherited crap, then all the better :)
[16:16] <janos> yeah, once i found my faults/bottlenecks my test cluster has been great
[16:17] <janos> shoot, my test is a home cluster i'm dogfooding and replacing my home storage system with
[16:17] <janos> a mish-mash of machines. running great
[16:17] <janos> 3 hosts
[16:17] <joelio> I think it's quite like a journey of personal development too.. you need to see/experience the bad times to know when the good times are
[16:18] <joelio> and gain maintenance experience in the process
[16:18] <janos> yeah, had plenty of that. prefer it when it's not in production ;)
[16:18] <joelio> indeed!!
[16:22] * The_Bishop (~bishop@e179013244.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[16:24] * ramonskie (ab15507e@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[16:27] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[16:29] * nick5 (~nick@ Quit (Remote host closed the connection)
[16:29] * nick5 (~nick@ has joined #ceph
[16:30] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[16:33] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) has joined #ceph
[16:35] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[16:40] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) Quit (Quit: Leaving.)
[16:44] <iggy> required ram per osd depends on OSD size
[16:45] * eschnou (~eschnou@ Quit (Remote host closed the connection)
[16:45] <iggy> and I've always used the 1G/1T rule of thumb
[16:45] <iggy> it can increase a good bit during recovery
[16:47] <joelio> That's definitely not mentioned in the documentation - I wonder if it would help someone in future?
[16:47] <joelio> there is a note about recovery and handling the filesystem's metadata and what have you
[16:48] <joelio> but nothing hard and fast from what I see. I quite like the 1T/1G rule
[16:48] <iggy> it's been mentioned on the mailing list a few times
[16:48] <iggy> I haven't read the new docs well enough to know what they say
[16:48] <joelio> sure, but do most people read the mailing lists before setup or the docs (generally) :)
[16:49] <iggy> most people don't read the docs before setup
[16:49] <janos> haha
[16:49] <joelio> heh, ok, you know what I mean though
[16:49] <iggy> I forget who the docs monkey is, but you could open a ticket asking for clarification
[16:50] <janos> yeah, first-timers are much less likely to read mailing lists i would think
[16:50] <joelio> no problem, was more of a nice to mention kinda thing. Would have helped me out a little (although for the life of me I don't know where I got the 1/2 GB per OSD from)
[16:51] <iggy> the hardware recommendations page does mention that the OSDs need more memory during recovery
[16:51] <iggy> but it also has silly low recommendations
[16:55] <joelio> yea, and XFS filesystem fscks (if that's your poison) are big
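The 1 GB per 1 TB rule of thumb iggy describes, plus extra headroom for recovery, can be sketched as a back-of-envelope helper. The function name and the recovery multiplier below are illustrative only, not official Ceph guidance:

```python
def recommended_osd_ram_gb(osd_size_tb, recovery_headroom=2.0):
    """Rough RAM budget per OSD daemon: ~1 GB per 1 TB of storage
    (the rule of thumb discussed above), scaled up because usage can
    grow a good bit during recovery. recovery_headroom is an
    illustrative multiplier, not a documented figure."""
    baseline_gb = max(1.0, float(osd_size_tb))  # never budget below 1 GB
    return baseline_gb * recovery_headroom

# e.g. a 3 TB OSD: 3 GB steady state, 6 GB budgeted for recovery spikes
```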
[17:29] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (Remote host closed the connection)
[17:30] * diegows (~diegows@ Quit (Read error: Operation timed out)
[17:31] * sleinen (~Adium@2001:620:0:25:fc4b:81d9:af5a:ae78) Quit (Quit: Leaving.)
[17:31] * sleinen (~Adium@ has joined #ceph
[17:36] * gerard_dethier (~Thunderbi@154.25-240-81.adsl-dyn.isp.belgacom.be) Quit (Quit: gerard_dethier)
[17:38] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[17:39] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[17:47] * noob2 (~cjh@ has joined #ceph
[17:53] * eschnou (~eschnou@46.85-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[17:56] * chutzpah (~chutz@ has joined #ceph
[17:56] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[17:57] * sleinen (~Adium@2001:620:0:25:459c:91da:6bb2:fb69) has joined #ceph
[18:09] * eschnou (~eschnou@46.85-201-80.adsl-dyn.isp.belgacom.be) Quit (Read error: Operation timed out)
[18:12] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[18:13] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[18:14] * mjblw (~mbaysek@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[18:20] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[18:22] <mjblw> Hi all. I am curious about the status of the collectd plugin for ceph. The best I can tell, there was a version of collectd that was forked to support ceph performance metric logging. The git repo suggests that it was to be 'upstreamed soon' as of 18 months ago. Is there anyone working on integrating perf metric collection into upstream collectd?
[18:22] * l0nk (~alex@ Quit (Remote host closed the connection)
[18:24] <dmick> mjblw: I know that Dreamhost is using that plugin for DreamObjects; you might try contacting them about it. I can ask if they feel ownership about it if you like.
[18:24] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Ping timeout: 480 seconds)
[18:27] <mjblw> dmick: anything would be helpful. I'm trying to figure out whether I have to implement something myself or not. I'm already running collectd 5.1 and rather prefer not to have to use a fork of an older version. I mostly wonder about the status of the upstreaming.
[18:35] * leseb_ (~leseb@3.46-14-84.ripe.coltfrance.com) Quit (Remote host closed the connection)
[18:41] * alram (~alram@ has joined #ceph
[18:50] <nhm> XFS fsck on a 500tb volume is scary. :)
[18:50] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[18:52] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[18:54] <infernix> https://www.redhat.com/archives/libguestfs/2013-March/msg00047.html - libguestfs patch for (rough) rbd support
[18:54] <infernix> in case anyone's interested
[18:55] * livekcats (~stackevil@cpe90-146-43-165.liwest.at) Quit (Ping timeout: 480 seconds)
[18:55] <dmick> cool!
[18:55] <joelio> nice
[18:57] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[18:59] <infernix> next email in thread shows how to use it
[18:59] <infernix> it's rough but usable
[19:02] <dmick> that could be quite handy. I need more experience with guestfs
[19:03] <infernix> readonly support should be added
[19:03] <infernix> although maybe snapshot=on might work in the config -set line
[19:03] <infernix> haven't tried that yet
[19:03] <dmick> that's surely just generic code though? you mean libguestfs currently doesn't do it for any subsidiary fs?
[19:04] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:04] <infernix> well this is just a quick way of passing file=rbd: to qemu. the correct way is to add full api support, but that's more involved
[19:04] <infernix> that's why i mention that caveat
[19:06] <dmick> mjblw: sage replies that he's sent it several times to the collectd list, but (perhaps) because we're using a different JSON library, it hasn't been picked up yet. If you want to chime in on the list requesting the patch, it may help
[19:08] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[19:11] * diegows (~diegows@ has joined #ceph
[19:12] * jtang1 (~jtang@ has joined #ceph
[19:13] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[19:16] * terje (~Adium@c-67-176-69-220.hsd1.co.comcast.net) has joined #ceph
[19:17] <terje> hi, I have a test node where I have: health HEALTH_WARN 384 pgs stuck unclean
[19:17] <terje> I can't figure out how to clear that up
[19:17] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[19:19] * nhorman (~nhorman@2001:470:8:a08:7aac:c0ff:fec2:933b) has joined #ceph
[19:22] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[19:22] <mjblw> dmick: ty
[19:22] * mikedawson_ is now known as mikedawson
[19:22] * barryo1 (~barry@host86-128-180-76.range86-128.btcentralplus.com) has joined #ceph
[19:37] * stackevil (~stackevil@ has joined #ceph
[19:42] * mauilion (~root@ has joined #ceph
[19:43] <mauilion> Hi everyone. I am having a problem with an mds daemon. The storage node has a public and a backend ip address and the mds node is binding to the backend ip. Thus nodes are failing to connect to the mds.
[19:43] <mauilion> http://nopaste.linux-dev.org/?71077
[19:44] <mauilion> that's my ceph.conf
[19:45] <mauilion> http://nopaste.linux-dev.org/?71077 <-- that's the ss -ln output you can see that port 6805 is tied to the address specifically instead of the
[19:45] <mauilion> http://nopaste.linux-dev.org/?71078 is my ss ln output
[19:48] <mauilion> I don't know why the daemon is binding to a specific ip
[19:48] <mauilion> that's the odd part
[19:48] <dmick> terje: have you read http://ceph.com/docs/master/rados/operations/troubleshooting-osd/#troubleshooting-pg-errors ?
[19:49] * leseb_ (~leseb@ has joined #ceph
[19:51] * b1tbkt (~Peekaboo@68-184-193-142.dhcp.stls.mo.charter.com) Quit (Remote host closed the connection)
[19:51] <dmick> mauilion: the mds specifically attempts to bind to the public net, so that is odd to start with
[19:51] <dmick> looking at your conf file...where are you setting the address or network for the mds?
[19:56] <scuttlemonkey> dmick: I was just going to refer him to doc
[19:56] <scuttlemonkey> but all of our doc just says to specify host =
[19:57] <dmick> well, if there are two networks, one would expect to see global settings for the subnets
[19:57] <dmick> and you *can* set an address on any daemon
[19:59] <dmick> but I'm pretty sure that, in the absence of either, it'll try to listen on IPADDR_ANY
[19:59] * leseb (~leseb@ Quit (Read error: Connection reset by peer)
[20:00] <dmick> so mauilion, not sure where you went, but I think you probably want to set public network and cluster network in [global]
[20:00] <scuttlemonkey> yeah
[20:00] <scuttlemonkey> mauilion: Have you looked here: http://ceph.com/docs/master/rados/configuration/network-config-ref/
[20:01] <scuttlemonkey> specifically: "public network = {public-network-ip-address/netmask}" and "cluster network = {enter cluster-network-ip-address/netmask}" for [global]
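A minimal [global] fragment matching what scuttlemonkey quotes might look like the sketch below. The two subnets are placeholders borrowed from the addresses discussed later in this log; substitute your own:

```ini
[global]
    ; placeholder subnets -- substitute your own
    public network =
    cluster network =
```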
[20:02] * fghaas (~florian@ has joined #ceph
[20:06] * ScOut3R (~ScOut3R@c83-249-233-227.bredband.comhem.se) Quit (Remote host closed the connection)
[20:17] * stackevil (~stackevil@ Quit (Quit: This Mac has gone to sleep!)
[20:17] * mcclurmc_laptop (~mcclurmc@firewall.ctxuk.citrix.com) Quit (Ping timeout: 480 seconds)
[20:24] * ScOut3R (~ScOut3R@c83-249-233-227.bredband.comhem.se) has joined #ceph
[20:25] <terje> hey dmick
[20:25] <terje> thanks for the link, I did read that
[20:25] <terje> I figured out that you can't have a single node in a ceph pool which was my problem.
[20:27] <dmick> ah, the old "one OSD with replication level two" problem?
[20:29] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[20:29] <terje> that's the one. :)
[20:32] <dmick> looking to see just where we document that
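For reference, one way to let a one-OSD test node reach active+clean is to drop the default replication before creating the cluster. This is a test-only sketch, not production advice:

```ini
[global]
    ; single-OSD test node only: with the default size of 2 there is no
    ; second OSD to hold a replica, so PGs sit "stuck unclean"
    osd pool default size = 1
```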
[20:37] * nwat (~Adium@eduroam-242-170.ucsc.edu) has joined #ceph
[20:38] * ScOut3R (~ScOut3R@c83-249-233-227.bredband.comhem.se) Quit (Remote host closed the connection)
[20:39] <nwat> gregaf: been poking around in MDS land. Is the MDS single threaded, and all the threads I see in the MDS process from the messenger?
[20:39] <gregaf> nwat: essentially, yes :(
[20:40] <gregaf> anything in the MDS with an open fd/sd is definitely in the messenger
[20:40] <gregaf> except I suppose the main thread (which opens the config, but doesn't actually do any of the processing) and whoever owns the admin socket
[20:41] <nwat> gregaf: that's what i figured. i was more curious about how some of the pieces fit together. hope that info on the tracker helps a bit.
[20:41] <gregaf> yeah
[20:41] <gregaf> my bet (I posted there) is that one of the recent refactors means that the MDS is no longer closing old sockets when they get replaced
[20:42] <gregaf> it follows different code paths than the OSDs and monitors do thanks to being stateful connections
[20:42] <nwat> I see, less tested paths
[20:43] <nwat> Turns out ulimit unlimited is actually unlimited :)
[20:43] <nwat> ^not
[20:43] <gregaf> heh
[20:44] <gregaf> how high did it let you go?
[20:44] <nwat> A bit over 2M
[20:44] <gregaf> heh, ouch
[20:44] <dmick> ulimit what?
[20:44] <gregaf> that might be reaching up into the kernel's absolute internal limits
[20:44] <dmick> fds?
[20:44] <gregaf> yeah
[20:44] <nwat> yeh
[20:45] <gregaf> well, sockets, in this case
[20:45] <dmick> same same
[20:45] <dmick> wow, 2Megadescriptors. That's impressive
[20:45] <nwat> 1009 mounts will do the trick
[20:45] <gregaf> I'm a bit confused about how quickly you're building them up, actually
[20:46] <dmick> that seems like a lot of descriptors per mount to the uneducated observer
[20:46] <nwat> the behavior i was seeing is that for a single mount, all messenger threads ended up with a +1 open socket
[20:51] * mcclurmc_laptop (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[20:51] <dmick> nwat: does "ended up with a +1 open socket" mean each thread got one more socket assigned per mount?
[20:52] <nwat> After one mount there are 20 threads with 1 open socket. After 2 mounts there are 22 threads with 2 open sockets. After 3 mounts, there are 24 threads with 3 sockets.
[20:53] * eschnou (~eschnou@46.85-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:53] <nwat> So, each mount added 2 extra threads, plus 1 open socket for each thread.
[20:54] <dmick> hm....are they distinct sockets for each thread?
[20:54] <gregaf> that's more than one client mounting though, right?
[20:55] <gregaf> because one client only gets one socket...
[20:55] <nwat> that's a single process in a mount()/unmount() loop.
[20:55] <gregaf> actually, hang on a sec
[20:55] <gregaf> so each logical connection has two threads, a reader and writer
[20:56] <gregaf> so if before mounting you have some connections, that's probably the MDS talking to the OSDs
[20:56] <gregaf> or other client connections, or something
[20:56] <gregaf> and you're adding two threads each time, that makes sense because that's the reader/writer for the new mount
[20:56] <gregaf> but the other threads shouldn't be adding anything at all; that's just bizarre
[20:57] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[20:57] <gregaf> so just to be clear, that was each thread adding a socket on each mount, right?
[20:58] <dmick> is the 'new thread' operation cloning all existing fds, and should it?
[20:58] <nwat> gregaf: that's the pattern I saw diffing lsof output for 1,2,3,..5 mounts. I assume that pattern continues
[20:59] <gregaf> dmick: they shouldn't have any relationship to the other threads; I really don't see how it can be happening
[21:00] <gregaf> although hmm, I suppose the Accepter thread is creating the sockets and then passing them off to the Pipe reader/writer threads, so maybe it's getting them accounted against itself too even though it's no longer using them
[21:00] <gregaf> I dunno how lsof works
[21:01] <gregaf> maybe it's accounting every socket available in the address space to the thread, and they're all in the same space so it's double-counting?
[21:01] <dmick> that's why I was asking if they were distinct
[21:01] <dmick> can lsof print the fd number?
[21:01] <nwat> dmick: good point. fd is unique to the process not thread, eh?
[21:02] <dmick> I believe so, yes
[21:02] <gregaf> yeah, it's per-process
[21:02] <gregaf> so actually I don't know what a per-thread display would even begin to mean
[21:02] <nwat> if the numbers i'm seeing are the actual file descriptors, then double counting seems like the behavior i'm seeing
[21:02] <dmick> (argh lsof's manpage *still* horribly broken)
[21:02] <gregaf> dmick: what did you mean by cloning here though? it's not like they're ref-counted or anything; closing an fd in one thread closes it for all threads
[21:03] <dmick> I'm not sure about the thread vs. process division in Linux
[21:03] <gregaf> nwat: so if that's the case this is the behavior I'd expect to see, and then if you wait 15 minutes from the close then you should see threads shutting down and sockets getting closed
[21:03] <dmick> there's tasks, and the multiple view in ps, and I'm not certain I know what's shared. It seems to me threads should share the same process view of fds, yes
[21:04] <dmick> but if they're not doubling that seems like it should be a lot longer way to 2Mdescs
[21:04] <gregaf> yeah
[21:04] <dmick> s/doubling/multiplying badly/
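The point dmick and gregaf settle on (file descriptors belong to the process, so summing per-thread lsof rows counts the same sockets repeatedly) is easy to demonstrate. This small sketch uses a pipe rather than a real messenger socket:

```python
import os
import threading

# File descriptors live in a per-process table shared by all threads, so a
# descriptor opened on the main thread is directly usable from a worker
# thread. That is why per-thread lsof output double-counts open sockets.
r, w = os.pipe()          # opened on the main thread
seen = {}

def worker():
    os.write(w, b"x")     # same fd numbers work here, nothing is duplicated
    seen["data"] = os.read(r, 1)

t = threading.Thread(target=worker)
t.start()
t.join()
assert seen["data"] == b"x"   # the worker used fds opened by the main thread
os.close(r)
os.close(w)
```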
[21:05] <nwat> i'll give the timeout a shot, but I think I waited a long time when I was reproducing this on the cluster. Regardless, this just doesn't work for Hadoop (or HPC w/o IO forwarding optimizations). Is there an optimization for shutdown() causing an early release of the sockets?
[21:05] <gregaf> I'm thinking it was actually at 1000 descriptors (*2000 threads)
[21:07] <gregaf> nwat: well, you can adjust the "ms_tcp_read_timeout" option
[21:07] <fghaas> joshd: if you were trying to attach an rbd volume in folsom, and it would just fail after the cinder rpc_request_timeout, and you did observe the RPC call from nova-api to cinder-volume, and to nova-compute on your compute node, and it would apparently just sit on the compute node and do nothing, and you have previously verified that "virsh attach-device" works for that very volume and the exact client id that you configured your rbd_user and rbd_secret_uuid for
[21:07] <gregaf> nwat: it defaults to 900, 15 minutes, and is the time between when the socket stops receiving data and when the socket gets closed
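If you did want idle sockets reaped sooner, the option gregaf names can be lowered in ceph.conf. The 60-second value below is purely illustrative:

```ini
[global]
    ; illustrative: reap idle client sockets after 60 s instead of the
    ; 900 s (15 min) default
    ms tcp read timeout = 60
```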
[21:07] <nwat> gregaf: what about an explicit shutdown message from the client?
[21:07] <gregaf> and we have a bug in the tracker to do a proper close sequence between the client and MDS instead of (basically) just going silent
[21:08] <gregaf> but we haven't done it yet because it's actually shockingly difficult to do properly and in a way that…doesn't suck
[21:08] <nwat> ahh, well, i'm a fan of that ticket ;)
[21:08] <dmick> gregaf: are any of the issues discussed in the bug? I haz interest
[21:09] <nwat> ok, i'm gonna do some testing. looks like there is enough now to work around the problem
[21:09] <gregaf> that said, you will also need to figure out why it's shutting you down at 1000 descriptors (we can get well past that in our systems) because 1000 clients mounted at the same time is not exactly unlikely
[21:09] <gregaf> dmick: not sure
[21:09] <nwat> gregaf: ahh right, there is still a problem :) — ok i'll poke around with that
[21:10] <dmick> ulimit -n *does* say (most systems do not allow this value to be set) but I'm fairly sure Linux is not one of them
[21:10] <dmick> but some strace'ing might still be useful
[21:10] <gregaf> let's see, there's http://tracker.ceph.com/issues/3630 but it's not the one I was thinking of
[21:10] <gregaf> dmick: nwat: yeah, it's possible you need to set an explicit limit on descriptors?
[21:11] <gregaf> they're a bit more dangerous to the system than eg core being unlimited is
[21:12] <nwat> alright. ulimit returns unlimited, but ulimit -n is 1024. i thought the former covered all the categories. looks like that explains the problem
[21:13] * stackevil (~stackevil@ has joined #ceph
[21:13] <joshd> fghaas: I'd check the libvirt logs to make sure compute actually told it to attach, and if so what the error was (and verify that 'virsh attach-device' was run as the same unix user that's running nova-compute)
[21:14] <dmick> If no option is given, then -f
[21:14] <dmick> is assumed.
[21:14] * oddover (~oddover@glados.colorado.edu) Quit (Ping timeout: 480 seconds)
[21:15] <dmick> that's "size of files writable by the process". oops :)
[21:15] <dmick> good, mystery solved
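The mix-up is reproducible from Python's resource module, which exposes the same per-process limits the shell's ulimit builtin reports: bare `ulimit` shows the file-size limit (RLIMIT_FSIZE, usually unlimited), while the limit that actually bit nwat is the open-file limit (RLIMIT_NOFILE, i.e. `ulimit -n`, commonly 1024):

```python
import resource

# Bare "ulimit" reports -f, the max size of files the process may write,
# which is usually unlimited and easy to misread as "everything unlimited".
fsize_soft, _ = resource.getrlimit(resource.RLIMIT_FSIZE)

# The open-descriptor limit is a separate resource: "ulimit -n",
# commonly capped at a soft limit of 1024.
nofile_soft, nofile_hard = resource.getrlimit(resource.RLIMIT_NOFILE)

print("file size:", "unlimited" if fsize_soft == resource.RLIM_INFINITY else fsize_soft)
print("open fds (soft/hard):", nofile_soft, nofile_hard)
```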
[21:16] <gregaf> nwat: dmick: here we go http://tracker.ceph.com/issues/1803
[21:18] <nwat> the hard limit for -n on my box is somewhere a bit above 1024000. That should tide us over for Hadoop testing, but big clusters could easily max this out fast.
[21:19] * eschnou (~eschnou@46.85-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[21:20] <fghaas> joshd: nothing in libvirtd.log with log_level 2, and and avalanche at log_level 1. can you tell me a libvirt function or exec callout that I should be looking for?
[21:21] <joshd> fghaas: anything with attach or hotplug in it
[21:21] * jjgalvez (~jjgalvez@ has joined #ceph
[21:21] <fghaas> nothing. just the qemu -help dump that libvirt spews out at level 1
[21:22] <joshd> fghaas: better yet, with debugging level logs, the 'rbd:pool/image...' will show up
[21:22] <fghaas> yeah, no trace of that at all
[21:22] <fghaas> we are talking about libvirtd.log, correct?
[21:22] <joshd> yeah
[21:23] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: my troubles seem so far away, now yours are too...)
[21:23] <fghaas> it is nova-compute that is supposed to tell libvirt, right? as opposed to any of the api services
[21:23] <joshd> yes
[21:25] <fghaas> yeah, I see it create a cinder client connection, then making an async cast on the compute service, then it just sits there
[21:26] <fghaas> which has me believe that the client node making the "nova volume-attach" call is doing the right thing, and cinder is doing the right thing, just nova-compute never actually attaches
[21:27] <fghaas> or even tries to
[21:28] * nhorman (~nhorman@2001:470:8:a08:7aac:c0ff:fec2:933b) Quit (Quit: Leaving)
[21:33] * LeaChim (~LeaChim@b0faa428.bb.sky.com) Quit (Ping timeout: 480 seconds)
[21:40] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[21:43] * LeaChim (~LeaChim@02da1ea0.bb.sky.com) has joined #ceph
[21:48] * leseb (~leseb@ has joined #ceph
[21:49] * livekcats (~stackevil@ has joined #ceph
[21:52] <mikedawson> joao: are your Mon changes detailed in the blog related to the release note for 0.58 "osd: move pg info, log into leveldb (== better performance) (David Zafman)"?
[21:53] * stackevil (~stackevil@ Quit (Ping timeout: 480 seconds)
[21:54] * leseb_ (~leseb@ Quit (Ping timeout: 480 seconds)
[21:56] <dmick> joao may well be asleep, but no. the monitor store is growing a key-value service, but the OSD changes are separate (even though they also use the leveldb key-value store for pginfo and logs now)
[21:57] <mikedawson> thanks dmick
[22:02] * markbby (~Adium@ Quit (Quit: Leaving.)
[22:03] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[22:03] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) has joined #ceph
[22:08] * ScOut3R (~ScOut3R@c83-249-233-227.bredband.comhem.se) has joined #ceph
[22:20] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[22:20] * dosaboy (~user1@host86-164-227-220.range86-164.btcentralplus.com) Quit (Remote host closed the connection)
[22:21] * leseb (~leseb@ Quit (Ping timeout: 480 seconds)
[22:22] <yehuda_hm> so, apparently libcurl is horribly broken
[22:24] * brambles (lechuck@s0.barwen.ch) Quit (Remote host closed the connection)
[22:25] * leseb (~leseb@ has joined #ceph
[22:25] <dmick> yehuda_hm: color me shocked, but, how so?
[22:26] <yehuda_hm> "When using multiple threads you should set the CURLOPT_NOSIGNAL option to 1 for all handles."
[22:26] <yehuda_hm> thread safety... the thread safe api is not thread safe unless you set some option on the handles
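In C, the workaround yehuda_hm quotes amounts to one extra curl_easy_setopt per easy handle. This is a hedged sketch of libcurl's documented easy API, nothing Ceph-specific; the URL is a placeholder:

```c
#include <curl/curl.h>

/* Sketch: in multi-threaded programs, set CURLOPT_NOSIGNAL on every easy
 * handle so libcurl never installs signal handlers, which are process-wide
 * and therefore not thread safe. */
int fetch_quietly(const char *url)
{
    CURL *h = curl_easy_init();
    CURLcode rc;

    if (!h)
        return -1;
    curl_easy_setopt(h, CURLOPT_URL, url);          /* placeholder target */
    curl_easy_setopt(h, CURLOPT_NOSIGNAL, 1L);      /* the crucial option */
    curl_easy_setopt(h, CURLOPT_TIMEOUT, 30L);      /* avoid hanging forever */
    rc = curl_easy_perform(h);
    curl_easy_cleanup(h);
    return rc == CURLE_OK ? 0 : -1;
}
```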
[22:27] <dmick> ew
[22:27] <dmick> mjblw: fwiw, Dreamhost responded about the collectd plugin that they've been preferring Diamond recently: https://github.com/BrightcoveOS/Diamond/blob/master/src/collectors/ceph/ceph.py
[22:33] <nhm> yehuda_hm: that's awful
[22:33] <yehuda_hm> yeah, and that's not even the entire issue
[22:33] <yehuda_hm> http://curl.haxx.se/mail/lib-2008-09/0197.html
[22:34] <yehuda_hm> so I was trying to chase the problem where our fastcgi socket crashed internally in accept(), and it ended up being libcurl thing
[22:34] <dmick> I'm terrified by the prospect of a resolver call without a timeout
[22:35] <yehuda_hm> yeah
[22:35] <dmick> but I guess that's common
[22:36] * Cube1 (~Cube@ has joined #ceph
[22:36] * Cube (~Cube@ Quit (Read error: Connection reset by peer)
[22:37] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Read error: Connection reset by peer)
[22:39] * loicd (~loic@AAnnecy-257-1-112-254.w90-36.abo.wanadoo.fr) has joined #ceph
[22:40] * hybrid5121 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[22:41] * oddover_ (~oddover@glados.colorado.edu) has joined #ceph
[22:44] * hybrid512 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[22:47] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[22:47] <mjblw> dmick: ty. i'll look into diamond.
[22:47] <dmick> but do also bug them on collectd
[22:49] * mauilion_ (~dcooley@c-71-198-86-127.hsd1.ca.comcast.net) has joined #ceph
[22:49] * oddover_ (~oddover@glados.colorado.edu) Quit (Ping timeout: 480 seconds)
[22:49] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[22:49] * dosaboy (~user1@host86-164-136-44.range86-164.btcentralplus.com) has joined #ceph
[22:50] <mjblw> my goal is getting the data into graphite anyway. i'm not opposed to just using what is preferred. just need to look into it a bit to see if diamonn works the way i need it to.
[22:50] * vata (~vata@2607:fad8:4:6:b4a4:7a9e:be41:b0c5) Quit (Quit: Leaving.)
[22:52] <dmick> yeah, but collectd is still important as well, and more voices == better. If you have a chance it can't hurt.
[22:52] * barryo1 (~barry@host86-128-180-76.range86-128.btcentralplus.com) Quit (Quit: Leaving.)
[22:53] * mauilion (~root@ Quit (Remote host closed the connection)
[22:53] * mauilion_ is now known as mauilion
[22:54] <mauilion> so I am sorry about wandering off. Anyone have any idea about the mds daemon binding to a specific ip address ?
[22:54] <mauilion> is there an option (that I can't find) that I can use to specify what ip to bind to?
[22:54] <mauilion> in the mds config sub?
[22:54] <gregaf> public addr
[22:55] <mauilion> I can put that in the mds sub?
[22:55] <mauilion> or do you mean the general public addr part
[22:56] <dmick> mauilion: yes, I said a lot above
[22:56] <mauilion> dammit
[22:56] <mauilion> let me go look for the log
[22:56] <mauilion> thanks
[22:57] <dmick> but I'm still not certain we understand the problem; if I guessed right your MDS should be listening on, so should accept connections wherever it has to
[22:57] <mauilion> dmick: that's what I expect as well
[22:57] <mauilion> dmick: from the paste you can see that port 6805 is binding to an address
[22:57] <mauilion> it's odd
[22:59] <dmick> how do you know which of those are which processes?
[22:59] <mauilion> dmick: http://nopaste.linux-dev.org/?71080
[22:59] <dmick> mds and osd will both listen on ports starting at 6800
[22:59] <mauilion> dmick: from the log of a box trying to connect I can see that it's looking fo port 6805 at
[23:00] <dmick> trying to connect to what, in response to what?
[23:02] <mauilion> dmick: http://nopaste.linux-dev.org/?71080
[23:02] <mauilion> when trying to mount ceph fs
[23:03] <mauilion> I see in the log that it's unable to connect to the mds.0 service on port 6805 at
[23:03] <dmick> that's the same paste, and
[23:03] <dmick> ok
[23:03] <mauilion> dmick: my bad http://nopaste.linux-dev.org/?71081
[23:03] <mauilion> that's an lsof -i :6805 on the server
[23:04] * noahmehl (~noahmehl@cpe-75-186-45-161.cinci.res.rr.com) has joined #ceph
[23:04] <mauilion> ah
[23:04] <mauilion> dmick: I see that the bind to port 6805 is ceph-osd
[23:04] <mauilion> dmick: give me a sec to grab the log
[23:06] <mauilion> Mar 11 18:01:41 server141 kernel: [ 1745.414906] libceph: mds0 connect error
[23:06] <mauilion> that is specifically what I am seeing
[23:06] <dmick> it wasn't clear from your ceph.conf whether the MDS node has interfaces on both nets or not, but if you want to constrain where MDS listens, you can set a 'public network' address in global, and then it will listen on interfaces on that net
[23:06] <mauilion> on both 80.10 and 10.10
[23:06] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[23:06] <mauilion> dmick:
[23:07] <mauilion> dmick: the problem I have with that is that these storage node are on a bunch of l3 networks
[23:07] <dmick> are 10.34.10.x and 10.34.80.x your two subnets?
[23:07] <mauilion> dmick: if you look at the config you can see that each set of 2 nodes are on 2 different subnets
[23:08] <dmick> not without knowing subnet masks I can't, but I might be able to infer it, and I might further be able to infer that your mds's are on similarly-configured hosts
[23:08] <mauilion> the 10.10 vs 80.10 thing is when I shutdown the active mds.0 and failed over. The connection error persisted
[23:08] <mauilion> I see your point
[23:09] * MarkN1 (~nathan@ has joined #ceph
[23:09] * barryo1 (~barry@host86-128-180-76.range86-128.btcentralplus.com) has joined #ceph
[23:09] * MarkN1 (~nathan@ has left #ceph
[23:09] <mauilion> so the way I see it I can make it so that every 2 nodes have a unique ceph.conf that sets the public addr thing correctly for them
[23:09] * BillK (~BillK@124-169-35-9.dyn.iinet.net.au) has joined #ceph
[23:10] <mauilion> or can I set the public addr only in the mds.0 config sub?
[23:10] <dmick> you can do that too
[23:10] <mauilion> does it have to be in both places?
[23:10] <dmick> any daemon will use 1) its configured addr, or 2) choose from its configured interfaces from the appropriate 'network' definition
[23:11] <dmick> (or 3) listen on
[23:11] <dmick> but I guess there's "advertising its address to the cluster once it connects to the monitors" that's another wrinkle in this
[23:13] <mauilion> there isn't a mds addr = concept right?
[23:14] <dmick> public addr and cluster addr are valid, I believe
[23:14] <mauilion> ok
[23:14] <mauilion> I will try that.
[23:14] <mauilion> thanks
[23:15] <dmick> fwiw, you can also specify a list of subnets in [global] public network
[23:15] <dmick> that might save you some typing in your config
[23:15] * MarkN (~nathan@ has joined #ceph
[23:15] * MarkN (~nathan@ has left #ceph
[23:16] <dmick> (and also in [global] cluster network, of course)
[23:16] <mauilion> i can't specify more than one network in the global set right?
[23:16] <dmick> a "list of subnets" is definitely specifying more than one network, yes...
[23:17] <mauilion> like global: public oh
[23:17] <mauilion> oh that's cool
[23:17] <dmick> http://ceph.com/docs/master/rados/configuration/network-config-ref/?highlight=public%20addr#network-config-settings
[23:17] <mauilion> I thought that was limited to one network defined as the "public" net
[23:17] <mauilion> sweet!
[23:18] <mauilion> thanks I have been looking for that.
[23:18] <dmick> the implication, of course, is that they all route to one another
[23:18] <dmick> but if that's true...
[23:18] <mauilion> they can all reach each other.
[23:18] <mauilion> but the public stuff is the default gw
[23:18] <mauilion> on each server
[23:30] * drokita (~drokita@ Quit (Ping timeout: 480 seconds)
[23:30] * BillK (~BillK@124-169-35-9.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[23:34] * jlogan (~Thunderbi@2600:c00:3010:1:28a6:90b2:f1cc:3d54) Quit (Ping timeout: 480 seconds)
[23:35] * jlogan (~Thunderbi@ has joined #ceph
[23:39] * BillK (~BillK@124-149-92-238.dyn.iinet.net.au) has joined #ceph
[23:52] * leseb (~leseb@ Quit (Remote host closed the connection)
[23:53] <elder> joshd, are you around?
[23:55] * PerlStalker (~PerlStalk@ Quit (Quit: ...)
[23:55] <elder> Sage wanted to reproduce 4079 yesterday and was having trouble doing so. I thought maybe I had hit it, but now it seems like it might have resolved itself again.
[23:56] <elder> Anyway, I wanted to give you a chance to look at the state of things if you wanted to.
[23:56] <joshd> elder: yeah
[23:56] <elder> Are you interested?
[23:57] <joshd> yeah

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.