#ceph IRC Log


IRC Log for 2013-05-03

Timestamps are in GMT/BST.

[0:02] * jmlowe (~Adium@ Quit (Quit: Leaving.)
[0:03] <elder> sagewk, flattening takes time. It seems like I need to know when it has begun, and when it has ended.
[0:04] <elder> My original question was, at what point does the clone get marked as a normal image with no parent?
[0:04] <sagewk> don't need to know when it begins.. just need to check for the parent on completion always
[0:04] <sagewk> when it finishes copying down all the objects, it removes the parent and the refresh happens.
[0:04] <elder> So the data in the parent won't disappear until the flatten is complete.
[0:04] * brady (~brady@rrcs-64-183-4-86.west.biz.rr.com) Quit (Quit: Konversation terminated!)
[0:04] <elder> OK.
[0:04] <sagewk> yeah
[0:05] <elder> I thought I heard you say before that the data might disappear while it is underway.
[0:05] <elder> But now I know what you must have meant.
[0:05] <sagewk> :)
[0:05] <elder> So the atomic switch is at the end, when the flatten is done.
[0:05] <elder> Prior to that, the parent continues to act like a parent.
[0:05] <dmick> right.
[0:05] <elder> Got it.
[0:05] <dmick> progressively less and less so as the data is flattened down.
[0:05] * aliguori (~anthony@ Quit (Remote host closed the connection)
[0:06] <elder> Well but from the client's perspective it doesn't matter. As long as it has a parent the client needs to check if the target exists for a write.
[0:06] <elder> Once it is known it doesn't have one, it can simply write.
[0:06] <gregaf> how does the flatten prevent write races losing updates?
[0:06] <elder> copyup
[0:07] <elder> When the target doesn't exist, it fetches the whole object from the parent.
[0:07] <elder> Then it supplies that whole object, plus the original write, to the original (child) target object in a copyup.
[0:07] <elder> The copyup operation will write the full object data first, then the write request on top of it.
[0:07] <gregaf> right, but if they both fetch the whole object at the same time? oh, copyup is a class op with failure modes in case the data got added, isn't it
[0:08] <elder> But if at the time of the copyup, the target has come into existence, the full-object is ignored.
[0:08] <elder> Correct.
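
The copyup behaviour elder describes above can be sketched as a small in-memory model. This is only an illustration of the semantics discussed in the channel, not the actual RBD class op: the `ChildImage` class, its method names, and the dict-based object store are all hypothetical.

```python
class ChildImage:
    """Toy model of a cloned image whose objects may still live in a parent."""

    def __init__(self, parent_objects):
        # Hypothetical stand-in for the parent image; None would mean "flattened".
        self.parent = dict(parent_objects)
        self.objects = {}

    def _apply(self, name, offset, data):
        # Plain write: extend the object with zeros if needed, then overwrite.
        obj = self.objects.setdefault(name, bytearray())
        end = offset + len(data)
        if len(obj) < end:
            obj.extend(b"\0" * (end - len(obj)))
        obj[offset:end] = data

    def copyup(self, name, parent_data, offset, data):
        # The full-object data is installed only if the target does not exist
        # yet; if another writer got there first, it is ignored.
        if name not in self.objects:
            self.objects[name] = bytearray(parent_data)
        # The original write always lands on top.
        self._apply(name, offset, data)

    def write(self, name, offset, data):
        # While there is a parent, a write to a missing target becomes a copyup.
        if self.parent is not None and name not in self.objects:
            self.copyup(name, self.parent.get(name, b""), offset, data)
        else:
            self._apply(name, offset, data)


# Two clients race: both fetched the parent object before either wrote.
img = ChildImage({"obj0": b"AAAAAAAA"})
stale = img.parent["obj0"]
img.copyup("obj0", stale, 0, b"xx")   # first copyup installs the parent data
img.copyup("obj0", stale, 4, b"yy")   # second copyup's parent data is ignored
print(bytes(img.objects["obj0"]))     # b'xxAAyyAA'
```

Because the parent data is dropped once the target exists, the second writer cannot overwrite the first writer's update with stale bytes, which is the race gregaf asks about.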
[0:08] <mikedawson> sagewk: all mons are acting the same, do you want to see more than one?
[0:08] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) has joined #ceph
[0:09] <sagewk> all hung?
[0:10] <sagewk> mikedawson: ^?
[0:10] <mikedawson> yep, they never progress to a useful state
[0:11] <sagewk> can you dump the threads in gdb to confirm it is stuck where we want it?
[0:11] <sagewk> and if so, tar it all up?
[0:12] <mikedawson> yeah, I got a tdump, and I'm tarring it up now; once done, I'll start it again and do the backtrace
[0:12] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[0:12] <sagewk> k
[0:12] * portante is now known as portante|afk
[0:12] * jtang2 (~jtang@ Quit (Quit: Leaving.)
[0:13] * jtang1 (~jtang@ has joined #ceph
[0:15] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[0:16] * alram (~alram@ Quit (Quit: leaving)
[0:17] * portante` (~user@ Quit (Read error: Operation timed out)
[0:17] <mikedawson> sagewk: mikedawson-ceph-mon.a-tdump.tar.bz2
[0:17] <sagewk> sweet
[0:17] * verwilst (~verwilst@dD576F6A2.access.telenet.be) has joined #ceph
[0:18] * rustam (~rustam@ has joined #ceph
[0:20] * rustam (~rustam@ Quit (Remote host closed the connection)
[0:24] <sjust> sagewk: added replay
[0:25] <mikedawson> sagewk: backtrace http://pastebin.com/raw.php?i=uii6XxBJ (note the mon_status never returns, so ceph-create-keys never finishes, either)
[0:27] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[0:29] * sstan (~chatzilla@modemcable016.164-202-24.mc.videotron.ca) has joined #ceph
[0:33] * verwilst (~verwilst@dD576F6A2.access.telenet.be) Quit (Quit: Ex-Chat)
[0:33] <sagewk> mikedawson: the -bak is before starting ceph-mon?
[0:33] <mikedawson> yes
[0:33] <eagen> mikedawson: I have the exact same problem. I've been struggling to get a working configuration going and trying all the various documented options. I run into the infinite create-keys when using ceph-deploy. (Other problems when using mkcephfs...)
[0:34] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[0:38] <sstan> eagen: you can always try to create osds/mons by hand..
[0:38] <mikedawson> eagen: Yeah, there have been several problems. A handful have been fixed. Unfortunately 0.59/0.60 didn't get critical mass of real-world testers. There are now several of us helping the testing effort leading up to Cuttlefish
[0:39] <eagen> That's what I'll do next, once I find out what to do and in what order. The docs are pretty poor for that. I'm using 0.56.4 that's in the Ubuntu 13.04 apt repositories.
[0:40] <sstan> I did it, if that can encourage you. I'll try to answer any question you might have
[0:40] <mikedawson> me too
[0:40] <eagen> Thanks. Unfortunately I won't be able to work on it for a few hours. :)
[0:42] <sstan> mikedawson: is ceph a good alternative to raid1 (on one computer)?
[0:44] * nhm (~nhm@ma02536d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[0:44] <mikedawson> sstan: I value ceph for use with many servers, haven't spent the time to weigh it against raid1
[0:45] * yehuda_hm (~yehuda@2602:306:330b:1410:a455:4b9b:5d5a:bdd9) Quit (Remote host closed the connection)
[0:45] <tnt_> that's a whole lot more complicated than RAID1 ...
[0:46] <tnt_> and since you can't use the rbd kernel client on the same machine as an OSD, things get tricky ...
[0:47] <sstan> actually, nothing prevents one from doing so
[0:47] * gmason_ (~gmason@hpcc-fw.net.msu.edu) Quit (Quit: Computer has gone to sleep.)
[0:47] * yehudasa_ (~yehudasa@2602:306:330b:1410:ea03:9aff:fe98:e8ff) has joined #ceph
[0:48] <sstan> I guess it's not recommended because it puts more stress on the machine, but other than that, it's ok
[0:48] <sstan> but yeah ceph's forte is doing storage on a large number of machines
[0:49] * imjustmatthew_ (~imjustmat@pool-173-53-54-223.rcmdva.fios.verizon.net) has joined #ceph
[0:50] <tnt_> mmm, actually I'm pretty sure there was some subtle kernel interaction that could cause issues because of pages being paged in/out of the cache.
[0:51] * davidzlap1 (~Adium@ has joined #ceph
[0:51] <tnt_> like, the kernel needs to flush cache to get more ram for a userspace process, this triggers a write on the block device, but that write may block because it needs to send data to the OSD which was exactly the process the kernel was trying to make room for ...
[0:51] <gregaf> yeah, you might not run into trouble if you use a kclient on an OSD, but the potential is always there and if you do run into issues the response will be "yeah, duh"
[0:53] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[0:53] * glowell1 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[0:56] * PerlStalker (~PerlStalk@ Quit (Quit: ...)
[0:58] <mikedawson> sagewk: I'm thinking about going back to 0f7d951003b09973b75b99189a560c3a308fef23, that worked for me on Tuesday. Thoughts?
[0:59] <sagewk> mikedawson: i don't think we've changed anything substantially. setting 'mon compact on trim = false' will get you back to the old behavior
[0:59] <sagewk> at least wrt leveldb
[0:59] <sagewk> sorry if i asked this already, but: did you confirm that --mon-leveldb-paranoid doesn't help here?
[0:59] <sagewk> having a hard time keeping everyone straight :)
[1:01] <mikedawson> sagewk: I have 'mon leveldb paranoid = true' in ceph.conf, but not "--mon-leveldb-paranoid" as a startup option
[1:01] <sagewk> same thing
[1:01] <sagewk> :( ok
[1:01] <mikedawson> that's what I thought
[1:02] <mikedawson> sagewk: to me it looks like the compact isn't working which is borking everything else the mon typically does at startup
[1:03] <sagewk> yeah. what happens when you run with 'mon compact on trim = false'?
[1:03] <sagewk> i think that was the main thing we changed since tuesday
[1:03] <sagewk> (related to leveldb at least)
[1:03] <sagewk> that and the block size.. but the 4MB we had was apparently nonsense
[1:07] <mikedawson> sagewk: with 'mon compact on trim = false' set, mon_status and ceph_create_keys don't hang, so that seems better. Still no compaction at startup with 'mon compact on bootstrap' either true or false
[1:08] <sagewk> the startup compaction is 'mon compact on start = true'
[1:08] <mikedawson> ahh, that changed on me!
[1:08] <sagewk> bootstrap happens before first election, when mons are added, etc.
[1:09] <mikedawson> sagewk: much better!
[1:09] * Havre (~Havre@2a01:e35:8a2c:b230:2d8b:cae5:ff86:48e6) Quit ()
[1:10] <sagewk> aaah, i think the leveldb background thread is dying.
[1:10] <sagewk> it's missing from both wido's and mikedawson's backtraces
[1:13] <sagewk> mikedawson: can you try one new thing?
[1:13] <sagewk> run a ceph-mon with -f (does not fork) and get it to hang
[1:14] <sagewk> and see if anything appears on stderr
[1:15] <mikedawson> will do, but changing to 'mon compact on start = true' got my mons back to quorum, so will we see the error?
[1:17] <sagewk> you seem to have little trouble getting it to hang... :)
[1:20] <mikedawson> sagewk: '/usr/bin/ceph-mon -i c --pid-file /var/run/ceph/mon.c.pid -c /etc/ceph/ceph.conf -f', right?
[1:20] * loicd1 (~loic@2a01:e35:2eba:db10:75bf:fd54:c3ed:54ab) Quit (Quit: Leaving.)
[1:20] <sagewk> yeah
[1:21] <mikedawson> nothing output, except typical starting mon.c rank 2 at mon_data /var/lib/ceph/mon/ceph-c fsid 2f2730c5-0504-4433-ae0b-331dd41d99a4
[1:21] <sagewk> yeah, it should be running tho (unless it is stuck already)
[1:23] <mikedawson> sagewk: gdb -> http://pastebin.com/raw.php?i=RyhXPfbc
[1:24] <mikedawson> this one doesn't have the right stuff for 'mon compact on start = true', I can do it again with the right config to compact, if that helps
[1:25] <sagewk> sure. that bt doesn't make much sense. not sure why it's stuck
[1:27] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[1:28] <mikedawson> sagewk: started it again with 'mon compact on start = true', compacted, then hung. Last output was 2013-05-02 23:26:12.331456 7f5176ce77c0 -1 done compacting. Logs show it is probing
[1:28] * sagewk (~sage@2607:f298:a:607:c54c:c777:f195:239a) has left #ceph
[1:28] * sagewk (~sage@2607:f298:a:607:c54c:c777:f195:239a) has joined #ceph
[1:29] <mikedawson> sagewk: mon.a compacted, but mon.b hasn't. I bet this one is hung due to mon.b being non-compacted and out to lunch
[1:34] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[1:35] <imjustmatthew_> sagewk: wip_mon_dump_ops is the branch you're testing now for the monitor bug?
[1:35] <sagewk> yeah
[1:35] <sagewk> it lets you capture a trace of the leveldb workload
[1:36] <imjustmatthew_> k, it's unfortunately not hanging for me, even without the paranoid option
[1:38] * tnt_ (~tnt@ Quit (Read error: Operation timed out)
[1:42] <sagewk> pushed a new patch to wip_mon_dump_ops that lets you specify 'mon leveldb log = <filename>'
[1:42] <sagewk> which catches all of leveldb's internal logging. that ought to provide some clues!
[1:44] * imjustmatthew_ (~imjustmat@pool-173-53-54-223.rcmdva.fios.verizon.net) Quit (Remote host closed the connection)
[1:44] * imjustmatthew_ (~imjustmat@pool-173-53-54-223.rcmdva.fios.verizon.net) has joined #ceph
[1:53] <mikedawson> sagewk: I'm back to HEALTH_OK. Process was compact the 3 mons on start. This will not get a quorum, but it is necessary to start with compacted mons. Then start three mons without compact on start. quorum. Start all osds
[1:53] <sagewk> mikedawson: weird
[1:54] <mikedawson> sagewk: And I have default settings for "mon compact on trim", "mon debug dump transactions", and "mon leveldb paranoid"
[1:55] <sagewk> mikedawson: wip_mon_dump_ops (with leveldb logging) just built.. can you give it a go, and add 'mon leveldb log = /var/log/ceph/ceph-$name.leveldb.log'?
[1:56] <mikedawson> I think that's what I did on Tuesday, too. But I was starting daemons with the --compact-on-bootstrap (or whatever it was called at the time)
[1:57] <sagewk> it's good to know that we can get the cluster up semi-reliably, but it's all skating around the real problem.. all of these combinations should work without problems. :(
[1:57] <sagewk> do you have time still to give the leveldb log a go?
[1:57] <sagewk> i'm hoping that will have the info we need to see what leveldb is doing wrong
[1:57] <mikedawson> yeah, I'm grabbing it now
[1:58] <sagewk> thanks :)
[2:00] <mikedawson> sagewk: here it is when it achieves quorum without issue http://pastebin.com/raw.php?i=SQzePEn2
[2:00] <mikedawson> ignore the top 22 lines
[2:00] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[2:03] <mikedawson> sagewk: heading home, let me know if you need anything else
[2:03] <mikedawson> I'll be back on in ~2h
[2:04] <sagewk> actually need a log when it hits the hang... when you get back! :)
[2:04] <sagewk> thanks for all the help!
[2:04] <mikedawson> sure thing. I'll grab it if imjustmatthew doesn't beat me
[2:04] <sagewk> (success will be helpful too as a reference)
[2:07] * scuttlemonkey (~scuttlemo@74-130-236-21.dhcp.insightbb.com) has joined #ceph
[2:07] * ChanServ sets mode +o scuttlemonkey
[2:12] <sagewk> imjustmatthew: around?
[2:12] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[2:13] <imjustmatthew_> sagewk: yes
[2:13] <imjustmatthew_> I just updated to the new commit, we'll see if I can get it to hang for you
[2:14] <imjustmatthew_> We haven't figured out how to reliably trigger a hang yet have we?
[2:15] * eternaleye (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) Quit (Remote host closed the connection)
[2:16] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[2:16] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has left #ceph
[2:17] * eternaleye (~eternaley@cl-43.lax-02.us.sixxs.net) has joined #ceph
[2:20] <sagewk> you guys seem to be pretty good at it :). we haven't been able to do it locally, though.
[2:20] <sagewk> make sure you turn on the leveldb log via 'mon leveldb log = ... '
[2:20] <sagewk> thanks!
[2:21] <imjustmatthew_> yeap, it's logging, and yeah, we are on a roll with these :)
[2:22] <sagewk> great thanks!
[2:22] * rustam (~rustam@ has joined #ceph
[2:23] * rustam (~rustam@ Quit (Remote host closed the connection)
[2:24] <imjustmatthew_> np, I hope it helps. Only weird thing is a funny character in the leveldb output lines: "2013/05/02-20:22:19.098474 7f104f651700 Manual compaction at level-0 from 'paxos .. 'paxos^A' @ 0 : 0; will stop at (end)"
[2:24] <imjustmatthew_> (that's 'control character' A)
[2:25] <imjustmatthew_> might just be my terminal being dumb
[2:25] <sagewk> that's normal
[2:28] <sagewk> brb
[2:32] * sagelap1 (~sage@32.sub-70-197-75.myvzw.com) has joined #ceph
[2:33] * scuttlemonkey (~scuttlemo@74-130-236-21.dhcp.insightbb.com) Quit (Ping timeout: 480 seconds)
[2:33] <sagelap1> back
[2:33] <cjh_> has anyone tried booting a physical server off of rbd?
[2:34] * kiran (~kiran@proxy-rtp-1.cisco.com) Quit (Ping timeout: 480 seconds)
[2:34] * dmick (~dmick@2607:f298:a:607:d933:5785:4663:fa77) Quit (Quit: Leaving.)
[2:38] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[2:40] <imjustmatthew_> mikedawson: You're still well in the running, I haven't been able to kill a mon yet :)
[2:41] <mikedawson> imjustmatthew_: I'll give it a shot
[2:41] <sagelap1> try turning off 'mon leveldb paranoid = true' (remove it or set it to false)
[2:41] * sagelap1 is now known as sagelap
[2:45] <mikedawson> sagewk: working -> http://pastebin.com/raw.php?i=SQzePEn2 hung -> http://pastebin.com/raw.php?i=cjzDTFS9
[2:46] <mikedawson> difference was starting with 'mon compact on start = true' made it hang
[2:46] <imjustmatthew_> nice :)
[2:47] <sagelap> sjust: ^
[2:48] <mikedawson> but for the record, if the mon leveldb balloons, you need to do a compact on start as an offline process on the mons to get them to a manageable size, they will hang, stop them, then start without compact on start to get quorum
[2:49] <mikedawson> that seems fairly reproducible on my deployment
[2:50] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[2:50] <sagelap> for this hung log, can you see where it ends up blocked?
[2:50] <sagelap> whether it's CompactRange or WaitForNotFull?
[2:50] <sagelap> or just capture the full thread dump from gdb?
[2:50] <sagelap> to match the log?
[2:55] <mikedawson> sagelap: tail -f /var/log/ceph/ceph-mon.c.log stuck, then gdb, then ceph-mon.c.leveldb.log
[2:55] <mikedawson> http://pastebin.com/raw.php?i=mmVeZ4ik
[2:56] <sagelap> perfect. thanks!
[2:56] * houkouonchi-work (~linux@ Quit (Remote host closed the connection)
[2:57] <sagelap> sjust: i wonder if the bg thread is never getting started at all?
[2:58] <mikedawson> sagelap: and then I disable compact on start, and restart the mon... segfault http://pastebin.com/raw.php?i=H3pTGpHC
[3:03] <sagelap> mikedawson: that one is a quick fix :)
[3:03] <sagelap> gregaf: still there?
[3:03] <gregaf> yeah
[3:03] <sagelap> did sjust leave?
[3:03] <gregaf> yeah
[3:03] <sagelap> k
[3:04] <mikedawson> sagelap: good stuff
[3:04] <gregaf> I imagine he'll be back on when he gets home, from the way he left the office ;)
[3:05] <sagelap> k
[3:05] <sagelap> gregaf: peek at wip-mon-null?
[3:06] * Qten (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) Quit (Read error: Connection reset by peer)
[3:06] * Qten (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[3:07] <gregaf> heh
[3:07] <gregaf> sagelap: we probably want that to be unconditional output (ie, "NULL or NONE" if the pointer is null)
[3:08] * iggy (~iggy@theiggy.com) Quit (Remote host closed the connection)
[3:08] * iggy (~iggy@theiggy.com) has joined #ceph
[3:08] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) has joined #ceph
[3:09] <imjustmatthew_> sagelap: okay, I have a live hung mon, do you need anything besides the logs and the backtrace?
[3:15] <sagelap> ooooh i know whats going on
[3:15] <sagelap> we are using leveldb prior to the fork() and it is losing its background thread
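
The hypothesis sagelap states here rests on a general POSIX property: after fork(), the child contains only the thread that called fork(), so any background thread a library started earlier no longer exists there. The sketch below demonstrates that property in Python on a POSIX system. It has nothing to do with leveldb or ceph-mon themselves, and the function name is hypothetical.

```python
import os
import threading


def background_thread_after_fork():
    """Return (alive_in_parent, alive_in_child) for a thread started before fork()."""
    stop = threading.Event()
    # Stand-in for a library's background worker; it just waits to be told to stop.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: only the forking thread was carried over.
        os.close(read_fd)
        os.write(write_fd, b"1" if worker.is_alive() else b"0")
        os._exit(0)

    # Parent process: collect the child's answer, then clean up.
    os.close(write_fd)
    alive_in_child = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    alive_in_parent = worker.is_alive()
    stop.set()
    worker.join()
    return alive_in_parent, alive_in_child


if __name__ == "__main__":
    print(background_thread_after_fork())
```

On a POSIX system the worker is expected to be reported as alive in the parent and not alive in the child. Newer Python versions may emit a DeprecationWarning when fork() is called with other threads running; that warning is about this same hazard.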
[3:16] <mikedawson> sagelap: i like where this is going
[3:16] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[3:16] <sagelap> mikedawson, imjustmatthew_: my guess is that if you run ceph-mon with -f you will be unable to make it hang
[3:17] <imjustmatthew_> I think I've always been running it -f using upstart
[3:19] <sagelap> maybe we have 2 problems, then.
[3:19] <sagelap> the 'mon compact on start' is broken, though. i'll fix that up.
[3:20] * jbd_ (~jbd_@34322hpv162162.ikoula.com) Quit (Ping timeout: 480 seconds)
[3:21] <mikedawson> sagelap: I'm not starting it with -f. where would i put that in if I want 'service ceph start' to do it?
[3:22] <sagelap> hmm... not anywhere convenient without hacking up your init script
[3:22] <sagelap> i'd just run the daemon from the command line
[3:22] <sagelap> ceph-mon -f -i a
[3:22] <sagelap> should do the trick (with whatever mon name you have besides a)
[3:22] <sagelap> and & :)
[3:23] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[3:24] <mikedawson> sagelap: I was going to hack the init, but I'm not finding it /etc/init/? /etc/init.d/ceph?
[3:26] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) Quit (Remote host closed the connection)
[3:27] <mikedawson> runarg?
[3:27] <imjustmatthew_> Mine is at /etc/init/ceph-mon.conf
[3:35] * sagelap1 (~sage@ has joined #ceph
[3:35] <sagelap1> dinner.. back in a bit
[3:35] <sagelap1> sjust: see wip-leveldb-reopen.. pbly need to do something smarter than re-calling init(), but basically reopening leveldb ought to solve the problem
[3:35] <sagelap1> best if we can ensure the old one is shut down before the fork too...
[3:35] <imjustmatthew_> sagelap1: Or maybe upstart isn't doing what I thought it was doing, running directly it hangs without '-f' and works with it? Either way, ttyl
[3:35] <sagelap1> ttyl!
[3:35] <sagelap1> thanks all for your help :)
[3:35] * sagelap (~sage@32.sub-70-197-75.myvzw.com) Quit (Ping timeout: 480 seconds)
[3:36] * scuttlemonkey (~scuttlemo@74-130-236-21.dhcp.insightbb.com) has joined #ceph
[3:36] * ChanServ sets mode +o scuttlemonkey
[3:37] * wido (~wido@2a00:f10:104:206:9afd:45af:ae52:80) Quit (Ping timeout: 480 seconds)
[3:37] * wido (~wido@rockbox.widodh.nl) has joined #ceph
[3:47] * scuttlemonkey (~scuttlemo@74-130-236-21.dhcp.insightbb.com) Quit (Ping timeout: 480 seconds)
[3:50] * tkensiski (~tkensiski@2600:1010:b02d:204d:51f0:7429:93a1:fbfb) has joined #ceph
[3:50] * tkensiski (~tkensiski@2600:1010:b02d:204d:51f0:7429:93a1:fbfb) has left #ceph
[3:51] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[4:09] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[4:15] * davidzlap1 (~Adium@ Quit (Ping timeout: 480 seconds)
[4:19] <mikedawson> sagelap1: so far, ceph-mon -f has not hung! I'll give wip-leveldb-reopen a try when it's ready, looks like the build failed last try
[4:20] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:22] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[4:27] * sjustlaptop1 (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[4:32] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[4:34] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[4:46] * rustam (~rustam@ has joined #ceph
[4:46] * themgt (~themgt@96-37-28-221.dhcp.gnvl.sc.charter.com) has joined #ceph
[4:47] * rustam (~rustam@ Quit (Remote host closed the connection)
[4:57] * sjustlaptop1 (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[4:58] * sjustlaptop (~sam@66-214-187-119.dhcp.gldl.ca.charter.com) has joined #ceph
[5:01] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[5:05] * jamespage (~jamespage@culvain.gromper.net) Quit (Quit: Coyote finally caught me)
[5:05] * jamespage (~jamespage@culvain.gromper.net) has joined #ceph
[5:09] * sjustlaptop (~sam@66-214-187-119.dhcp.gldl.ca.charter.com) Quit (Read error: Operation timed out)
[5:10] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has joined #ceph
[5:10] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has left #ceph
[5:18] * sjustlaptop (~sam@m8f2636d0.tmodns.net) has joined #ceph
[5:24] * jskinner (~jskinner@ has joined #ceph
[5:26] * sjustlaptop1 (~sam@66-214-187-119.dhcp.gldl.ca.charter.com) has joined #ceph
[5:29] * sjustlaptop (~sam@m8f2636d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[5:36] * sjustlaptop1 (~sam@66-214-187-119.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[5:39] * sjustlaptop (~sam@m8f2636d0.tmodns.net) has joined #ceph
[5:40] * rustam (~rustam@ has joined #ceph
[5:41] * rustam (~rustam@ Quit (Remote host closed the connection)
[5:47] * themgt (~themgt@96-37-28-221.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[5:49] * jtang1 (~jtang@ has joined #ceph
[5:54] * sjustlaptop (~sam@m8f2636d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[5:54] * sjustlaptop (~sam@m8f2636d0.tmodns.net) has joined #ceph
[6:07] * yehudasa_ (~yehudasa@2602:306:330b:1410:ea03:9aff:fe98:e8ff) Quit (Remote host closed the connection)
[6:09] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) has joined #ceph
[6:10] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[6:20] * rustam (~rustam@ has joined #ceph
[6:21] * rustam (~rustam@ Quit (Remote host closed the connection)
[6:24] * ScOut3R (~ScOut3R@BC065770.dsl.pool.telekom.hu) has joined #ceph
[6:27] * sjustlaptop (~sam@m8f2636d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[6:35] * ScOut3R (~ScOut3R@BC065770.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[6:35] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[6:36] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:36] * rustam (~rustam@ has joined #ceph
[6:37] * rustam (~rustam@ Quit (Remote host closed the connection)
[6:41] * wca_ (~will@ has joined #ceph
[6:41] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[6:41] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[6:41] * wca (~will@ Quit (Read error: Connection reset by peer)
[6:52] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[6:59] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[7:02] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 20.0.1/20130409194949])
[7:10] * jtang1 (~jtang@ Quit (Ping timeout: 480 seconds)
[7:14] * coyo (~unf@ has joined #ceph
[7:18] * rustam (~rustam@ has joined #ceph
[7:19] * rustam (~rustam@ Quit (Remote host closed the connection)
[7:26] * Travis (~oftc-webi@ has joined #ceph
[7:26] * Travis is now known as Guest4190
[7:27] <Guest4190> Is anyone around?
[7:27] <Guest4190> :-/
[7:27] * Guest4190 (~oftc-webi@ has left #ceph
[7:33] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[7:43] * Havre (~Havre@2a01:e35:8a2c:b230:eccc:1c47:8d:220b) has joined #ceph
[7:59] * kiran (~kiran@proxy-rtp-1.cisco.com) has joined #ceph
[8:05] * rustam (~rustam@ has joined #ceph
[8:06] * rustam (~rustam@ Quit (Remote host closed the connection)
[8:07] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[8:14] * madkiss (~madkiss@2001:6f8:12c3:f00f:547c:581c:76bf:88ef) has joined #ceph
[8:17] * kiran (~kiran@proxy-rtp-1.cisco.com) Quit (Quit: Leaving)
[8:19] * eternaleye (~eternaley@cl-43.lax-02.us.sixxs.net) Quit (Read error: Connection reset by peer)
[8:19] * eternaleye_ (~eternaley@cl-43.lax-02.us.sixxs.net) has joined #ceph
[8:19] * eternaleye_ is now known as eternaleye
[8:19] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[8:22] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:1ce8:919e:8ca9:d55c) has joined #ceph
[8:24] * madkiss (~madkiss@2001:6f8:12c3:f00f:547c:581c:76bf:88ef) Quit (Ping timeout: 480 seconds)
[8:26] * tnt (~tnt@ has joined #ceph
[8:26] * uli (~uli@mail1.ksfh-bb.de) has joined #ceph
[8:27] <uli> hi there
[8:28] <uli> is there a way to mount a cephfs on multiple clients?
[8:30] <uli> I'd like to have a distributed filesystem on two departments and want to be able to access the cluster from the one department and from the other department at the same time....
[8:30] * Vjarjadian (~IceChat77@ Quit (Quit: Easy as 3.14159265358979323846... )
[8:38] * rustam (~rustam@ has joined #ceph
[8:43] * fghaas (~florian@212095007007.public.telering.at) has joined #ceph
[8:46] * rustam (~rustam@ Quit (Remote host closed the connection)
[8:49] * rustam (~rustam@ has joined #ceph
[8:51] * rustam (~rustam@ Quit (Remote host closed the connection)
[8:56] * bergerx_ (~bekir@ has joined #ceph
[9:00] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:02] * ScOut3R (~ScOut3R@BC065770.dsl.pool.telekom.hu) has joined #ceph
[9:04] * eschnou (~eschnou@ has joined #ceph
[9:08] * ScOut3R (~ScOut3R@BC065770.dsl.pool.telekom.hu) Quit (Read error: Operation timed out)
[9:11] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:16] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:20] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[9:20] * trond (~trond@trh.betradar.com) Quit (Read error: Connection reset by peer)
[9:21] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[9:23] * fghaas (~florian@212095007007.public.telering.at) Quit (Ping timeout: 480 seconds)
[9:23] <wido> uli: That is what CephFS is for
[9:24] <wido> so yes, you can mount it on multiple, even thousands of clients
[9:27] <uli> ok, perfect thing here ;)
[9:28] <uli> and how to? yesterday i did a first try, just using the 5 min quick start, but mounting on two clients seemed not to be consistent... (mounting rbd)...
[9:36] <Gugge-47527> rbd is not cephfs
[9:36] <Gugge-47527> rbd is a blockdevice, and using a normal filesystem on top you can not mount it multiple places
[9:37] <Gugge-47527> you need some of the cluster filesystems on top of rbd to do that
[9:38] * leseb (~Adium@ has joined #ceph
[9:39] * ScOut3R (~ScOut3R@ has joined #ceph
[9:40] <uli> ok, thanks for the moment, I think it's a good idea to dig into documentation ;) but thanks so far (be sure I'm gonna be back ;) )
[9:48] * joshd2 (~joshd@108-93-176-49.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[9:48] * joshd2 (~joshd@108-93-176-49.lightspeed.irvnca.sbcglobal.net) Quit ()
[9:56] * l0nk (~alex@ has joined #ceph
[9:57] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[10:02] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[10:04] * vipr_ is now known as vipr
[10:08] <topro> wido: not so long ago I got "slapped" for mounting cephfs multiple times simultaneously. that was with bobtail, so have things changed recently?
[10:09] <wido> topro: CephFS isn't stable yet, but it is designed for just that
[10:09] <wido> do not confuse it with RBD though! Like Gugge-47527 mentioned
[10:10] <topro> thats clear, I'm using cephfs, no rbd. and I know it's fun (aka. experimental)
[10:11] <topro> btw. has cuttlefish been see anywhere?
[10:11] <topro> s/see/seen/
[10:12] <Gugge-47527> you can use the "next" packages
[10:12] <topro> well, I think I can wait to use stable-unstable instead of next-unstable ;)
[10:13] <topro> unstable refers to cephfs though
[10:13] <topro> not ceph as a whole
[10:14] <Gugge-47527> http://article.gmane.org/gmane.comp.file-systems.ceph.user/1000 <- if you want to test cuttlefish :)
[10:16] <topro> I feel like I really should have helped with that, i'm just missing a real test cluster setup
[10:16] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[10:18] * LeaChim (~LeaChim@ has joined #ceph
[10:18] <Gugge-47527> topro: i know the feeling :)
[10:18] <Gugge-47527> im considering just doing a test setup on amazon :)
[10:26] <wido> I'd recommend staying away from the next branch for now
[10:27] <wido> Some monitor issues, which seem to be resolved (for me) with a patch from last night
[10:28] <leseb> wido: is it? :)
[10:28] <wido> leseb: Ha!
[10:28] <leseb> wido: :)
[10:28] <wido> leseb: Yes, right now it's working, but I'm waiting for a couple of hours to see if it keeps working
[10:28] <wido> But for now, it works
[10:28] <leseb> wido: sure :)
[10:29] <topro> wido: would cuttlefish be affected by those monitor issues as well if last night's patch won't help?
[10:29] <wido> topro: Cuttlefish won't be release before this is fixed
[10:29] <wido> released*
[10:29] <topro> ^^ so the answer is yes ;)
[10:29] <wido> topro: Cuttlefish currently lives in the "next" branch and that one is kind of broken
[10:30] <wido> But these fixes live in a different branch which probably will be merged into next later today
[10:30] <topro> ok so next will become cuttlefish at release, like debian testing-->stable, right?
[10:31] <wido> topro: correct
[10:31] <wido> It will then go into the branch 'cuttlefish'
[10:31] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[10:38] * uli (~uli@mail1.ksfh-bb.de) Quit (Quit: Verlassend)
[10:39] * rustam (~rustam@ has joined #ceph
[10:42] * rustam (~rustam@ Quit (Remote host closed the connection)
[10:53] * uli (~uli@mail1.ksfh-bb.de) has joined #ceph
[10:54] * vo1d (~v0@213-240-75-196.adsl.highway.telekom.at) has joined #ceph
[10:56] <uli> is there a mirror for debian wheezy... ceph-deploy install [server] gives me a: http://ceph.com/debian-cuttlefish/dists/wheezy/main/binary-amd64/Packages failed: 404 Not Found
[10:58] <wido> uli debian/cuttlefish doesn't exist yet
[10:58] <wido> cuttlefish isn't there yet, so right now you can only deploy bobtail
[10:59] <uli> ok, just used the defaults....
[11:01] * v0id (~v0@91-115-228-104.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[11:06] <topro> btw. with cuttlefish will MDS default behaviour still be to have exactly one active MDS and subsequent as hot-spare?
[11:33] * erdem (~erdem@ has joined #ceph
[11:39] <erdem> Hi all, we have just recovered from some issues caused by a single OSD complaining with "heartbeat_map is_healthy 'OSD::op_tp thread 0x7fdcea38f700' had timed out after 15"
[11:39] <erdem> I tried to explain it in http://thread.gmane.org/gmane.comp.file-systems.ceph.user/1112/focus=1117
[11:40] <erdem> If anyone can offer some help we'll be grateful
[11:40] <wido> topro: Yes, not so much CephFS work has been done
[11:40] <wido> don't expect multi-mds soon
[11:46] <topro> wido: that's fine for me, I just wanted to make sure that cuttlefish won't make that default
[11:58] * tnt (~tnt@ Quit (Ping timeout: 480 seconds)
[12:08] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[12:10] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:11] * diegows (~diegows@ has joined #ceph
[12:14] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[12:17] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[12:18] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[12:19] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[12:21] * __jt__ (~james@rhyolite.bx.mathcs.emory.edu) has joined #ceph
[12:22] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[12:23] * __jt___ (~james@rhyolite.bx.mathcs.emory.edu) Quit (Ping timeout: 480 seconds)
[12:25] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:1ce8:919e:8ca9:d55c) Quit (Ping timeout: 480 seconds)
[12:42] * rustam (~rustam@ has joined #ceph
[12:44] * rustam (~rustam@ Quit (Remote host closed the connection)
[12:50] * fridad (~fridad@b.clients.kiwiirc.com) has joined #ceph
[13:06] * athrift (~nz_monkey@ Quit (Remote host closed the connection)
[13:07] * ScOut3R (~ScOut3R@ Quit (Remote host closed the connection)
[13:08] * ScOut3R (~ScOut3R@ has joined #ceph
[13:09] * athrift (~nz_monkey@ has joined #ceph
[13:11] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:20] <mikedawson> wido: which branch are you running? wip-leveldb-reopen?
[13:23] * l0nk1 (~alex@ has joined #ceph
[13:24] * nhm (~nhm@65-128-150-185.mpls.qwest.net) has joined #ceph
[13:27] * l0nk (~alex@ Quit (Ping timeout: 480 seconds)
[13:29] * madkiss (~madkiss@089144192030.atnat0001.highway.a1.net) has joined #ceph
[13:30] * ScOut3R_ (~ScOut3R@ has joined #ceph
[13:30] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:32] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:34] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[13:36] * ScOut3R (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[13:37] * madkiss (~madkiss@089144192030.atnat0001.highway.a1.net) Quit (Quit: Leaving.)
[13:44] * l0nk1 (~alex@ Quit (Quit: Leaving.)
[13:44] * l0nk (~alex@ has joined #ceph
[13:50] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[14:02] * ScOut3R (~ScOut3R@ has joined #ceph
[14:03] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[14:06] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[14:10] * ScOut3R_ (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[14:12] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:20] * eternaleye (~eternaley@cl-43.lax-02.us.sixxs.net) Quit (Remote host closed the connection)
[14:21] * eternaleye_ (~eternaley@cl-43.lax-02.us.sixxs.net) has joined #ceph
[14:21] * eternaleye_ is now known as eternaleye
[14:25] <joao> okay, so, anyone using wip-leveldb-reopen?
[14:26] <joao> wido ^
[14:26] <wido> joao: Yes, me
[14:26] <wido> See #4851
[14:26] <joao> has it worked?
[14:26] <wido> my comment :)
[14:26] <wido> http://tracker.ceph.com/issues/4851
[14:26] <wido> joao: http://tracker.ceph.com/issues/4851
[14:27] <joao> so, still issues with the whole MakeRoomForWrite() :\
[14:27] <joao> good god, leveldb is killing me
[14:27] <wido> joao: Have to note, it has been running just fine now
[14:27] <wido> Past 4 hours
[14:28] <joao> how's the store size doing?
[14:29] <joao> also, wido, how long was the mon stuck in MakeRoomForWrite() before you noticed it? any idea?
[14:30] <joao> we may have to report this to the leveldb mailing list
[14:30] <joao> I just wish it was reproducible
[14:30] <joao> (accurately reproducible without using ceph that is)
[14:38] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:38] <wido> joao: About 30 min or so
[14:39] <joao> wido, kay, thanks; I'll try to reproduce this again
[14:39] <joao> mikedawson, have you tried wip-leveldb-reopen?
[14:41] * rustam (~rustam@ has joined #ceph
[14:42] * ScOut3R_ (~ScOut3R@ has joined #ceph
[14:42] <imjustmatthew> joao: I just moved my cluster to that branch and got a Segfault on one of the three mons
[14:42] <joao> no trace whatsoever?
[14:42] <imjustmatthew> This mon had been out of quorum before moving to the new branch
[14:42] <imjustmatthew> I have a trace, one sec
[14:42] <joao> thanks
[14:44] * rustam (~rustam@ Quit (Remote host closed the connection)
[14:46] <imjustmatthew> joao: https://docs.google.com/file/d/0B-VPvwEe43FVZFdYLVBqVWVlNms/edit?usp=sharing
[14:46] <joao> thanks
[14:46] <imjustmatthew> np
[14:47] <imjustmatthew> that looks like the same segfault mikedawson hit last night
[14:47] <wido> joao: So it hung once after 30 mins running that branch, but after a restart of that mon it didn't hang anymore
[14:47] <joao> ah
[14:47] <wido> And all is running just fine now
[14:47] <joao> okay
[14:47] <wido> HEALTH_OK it says :)
[14:47] * ScOut3R (~ScOut3R@ Quit (Read error: Operation timed out)
[14:47] <mikedawson> imjustmatthew: if that is the same segfault, Sage thought the fix was trivial. Not sure if he implemented it yet
[14:47] <joao> so Sage is probably right; the fork was messing with leveldb's threads
[14:48] <joao> mikedawson, context on what fix Sage was trivial?
[14:48] <joao> err
[14:48] <joao> Sage thought was trivial
[14:48] <joao> :p
[14:48] <joao> ah
[14:49] <mikedawson> imjustmatthew: I haven't tried wip-leveldb-reopen yet.
[14:49] <joao> ought to be simple indeed
[14:49] <joao> let me take a look
[14:49] <mikedawson> joao: context is at 2:58 here http://irclogs.ceph.widodh.nl/index.php?date=2013-05-03
[14:49] <joao> mikedawson, thanks
[14:50] <joao> this timezone offset sometimes becomes a nuisance to keep up with the previous night work :x
[14:51] <joao> eh, indeed; now I just wonder whether sage fixed it or what :p
[14:51] <jerker> Is there any way of using a single ceph cluster but separating two different CephFS file systems for two different classes of systems? Like one development system and one live system, for example. Just like there can be multiple RBDs, can there be multiple CephFS?
[14:51] <mikedawson> joao: he made (at least?) two wip branches right around that time
[14:53] <joao> wip-mon-null it seems
[14:54] <imjustmatthew> Whoa, just got a new segfault on that leveldb_reopen branch when starting a mon with "mon compact on start = true"; mon log is at: https://docs.google.com/file/d/0B-VPvwEe43FVbUF4QVlGWUduUGs/edit?usp=sharing
[14:55] <joao> lol
[14:55] <imjustmatthew> Did that branch remove leveldb logging?
[14:56] <joao> sorry, this bug is annoying, it's not even funny
[14:56] <joao> not much info on that trace
[14:56] <joao> imjustmatthew, which branch? wip-mon-null?
[14:56] <imjustmatthew> oh, it's pretty funny :)
[14:57] <joao> wip-mon-null only contains the segfault fix
[14:57] <joao> apparently
[14:57] <joao> I can push a branch for a build with that fix in if you'd prefer
[14:57] <imjustmatthew> joao: wip-leveldb-reopen
[14:58] <joao> if you're going to build from source, then you'd just have to cherry-pick the fix to wip-leveldb-reopen
[14:58] <joao> ah
[14:58] <joao> I think it added it, didn't it?
[14:58] <joao> let me check
[14:58] <joao> wip-leveldb-reopen added leveldb logging
[14:58] <joao> on 7ec01513970b5a977bdbdf60052b6f6e257d267e
[14:59] <imjustmatthew> Hmm, then it sure didn't log much
[14:59] <joao> did you specify the log file?
[14:59] <imjustmatthew> http://pastebin.com/hKtY8aQg is the leveldb log
[14:59] <joao> looks like it is unset by default
[15:00] <imjustmatthew> note this is from a second restart, the times won't quite match the other log
[15:00] <joao> right
[15:00] <imjustmatthew> There's also a core file allegedly dumped if you want that
[15:01] <imjustmatthew> service cpeh restart mon
[15:01] <imjustmatthew> ^ sorry, wrong window and wrong command, I must need more coffee
[15:02] <imjustmatthew> joao: this is the terminal output, I just noticed it's different from the log output: http://pastebin.com/KQiMfMAH
[15:04] <joao> imjustmatthew, grab me the symbol on main()+0xda2 ?
[15:04] <imjustmatthew> joao: sure, what do I need to tell gdb?
[15:05] <joao> gdb 'path-to/ceph-mon'
[15:05] <joao> list * main+0xda2
[15:05] <joao> that should do it
[15:06] <joao> well, I have to grab something to eat
[15:06] <imjustmatthew> joao: No debugging symbols found
[15:06] <joao> I start looking into dumps and whatnot and forget to eat
[15:06] <joao> imjustmatthew, okay, np
[15:06] <joao> can't be segfaulting in a lot of places after the 'compacting' string
[15:06] <joao> :)
[15:06] <imjustmatthew> joao: go grab some food, it'll still be broken when you get back :)
[15:07] <joao> eh :p
[15:07] <joao> brb
[15:12] <imjustmatthew> joao: with ceph-dbg: "0x48bba2 is in main(int, char const**) (mon/MonitorDBStore.h:496). 491 mon/MonitorDBStore.h: No such file or directory."
[15:18] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Read error: Operation timed out)
[15:18] <joao> that helps a lot :) thanks
[15:18] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[15:19] <imjustmatthew> np
[15:20] * smangade (~shardul@71-223-35-14.phnx.qwest.net) has joined #ceph
[15:35] * ScOut3R (~ScOut3R@ has joined #ceph
[15:35] * rahmu (~rahmu@ has joined #ceph
[15:41] * ScOut3R_ (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[15:43] <LeaChim> Is there anyone around who might be able to help me with a hbase/hadoop/ceph/mds issue? (I'm not entirely sure in which component the problem lies)
[15:45] * yehudasa_ (~yehudasa@2602:306:330b:1410:ea03:9aff:fe98:e8ff) has joined #ceph
[15:45] * Havre (~Havre@2a01:e35:8a2c:b230:eccc:1c47:8d:220b) Quit (Ping timeout: 480 seconds)
[15:45] * aliguori (~anthony@ has joined #ceph
[15:48] <imjustmatthew> LeaChim: Most of the devs are on US-Pacific time, you might have more luck in a few hours
[15:50] * fridad (~fridad@b.clients.kiwiirc.com) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[15:54] <LeaChim> ok, cheers
[15:55] * loicd (~loic@ has joined #ceph
[15:56] * bergerx_ (~bekir@ has left #ceph
[15:56] * aliguori_ (~anthony@ has joined #ceph
[15:56] * PerlStalker (~PerlStalk@ has joined #ceph
[15:56] * aliguori (~anthony@ Quit (Read error: Connection reset by peer)
[15:57] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[16:02] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[16:08] * portante|ltp (~user@ has joined #ceph
[16:11] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Quit: leaving)
[16:12] * erdem (~erdem@ Quit (Quit: Leaving)
[16:13] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[16:14] * PerlStalker (~PerlStalk@ Quit (Remote host closed the connection)
[16:14] * vata (~vata@2607:fad8:4:6:221:5aff:fe2a:d1dd) has joined #ceph
[16:16] * PerlStalker (~PerlStalk@ has joined #ceph
[16:32] * Wolff_John (~jwolff@vpn.monarch-beverage.com) has joined #ceph
[16:47] * eschnou (~eschnou@ Quit (Remote host closed the connection)
[16:57] * noahmehl (~noahmehl@cpe-184-57-16-227.columbus.res.rr.com) has joined #ceph
[17:05] * rustam (~rustam@ has joined #ceph
[17:07] * rustam (~rustam@ Quit (Remote host closed the connection)
[17:08] * jtangwk (~Adium@2001:770:10:500:39f1:f2be:6be0:794) Quit (Ping timeout: 480 seconds)
[17:08] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[17:09] <sage> good morning
[17:09] <joao> imjustmatthew, just confirming, you're on wip-leveldb-reopen right?
[17:10] <joao> sage, pushing a fix to wip-leveldb-reopen
[17:10] <imjustmatthew> joao: yes
[17:10] <joao> also, morning sage :)
[17:10] <joao> pushed
[17:11] <sage> cool
[17:11] <sage> still seeing a hang after that tho, i think the close isn't working right
[17:12] <joao> sage, I'll take a look
[17:12] <joao> sage, any way to reproduce said hang?
[17:14] <sage> restart mon with --mon-compact-on-start is doing it for me
[17:14] <joao> kay
[17:19] * drokita (~drokita@ has joined #ceph
[17:21] <drokita> What would be some situations where one has a failed disk, plenty of room to rebalance and noout unset, BUT the cluster is not rebalancing?
[17:22] <jmlowe> crushmap problems?
[17:24] <drokita> I guess it could be... my map is pretty much the same as it has always been though.
[17:24] <drokita> Never had the problem before
[17:24] <drokita> ceph osd tree
[17:24] <drokita> whoops
[17:24] <drokita> wrong window ;)
[17:25] <jmlowe> there is a rebalance timeout, I'm assuming you've waited long enough
[17:26] * sagelap2 (~sage@2600:1012:b02f:8cf8:b499:ff7c:ce05:7e3f) has joined #ceph
[17:26] * sagelap1 (~sage@ Quit (Ping timeout: 480 seconds)
[17:30] * rahmu (~rahmu@ Quit (Remote host closed the connection)
[17:30] <imjustmatthew> joao: that new commit looks better, it starts, but isn't joining quorum. I'm running with "mon compact on start = true" and "mon leveldb paranoid" unset. I'm fairly sure ceph-mon is being started with "-f" by upstart when I call "service ceph restart mon"
[17:31] <imjustmatthew> I'm assuming this branch should quorum fine with the other two mons that are on 0.60-803-g7544e86
[17:32] <joao> imjustmatthew, point me to logs please? :)
[17:32] <imjustmatthew> k, one sec, do you also want thread backtraces?
[17:33] <joao> if possible, couldn't hurt
[17:34] <drokita> jmlowe: I would think that I waited long enough. I took the OSD down and went to bed. Woke up this morning and the cluster was in the same degraded state as it was last night
[17:35] <joao> sage, sagelap2, haven't been able to hang the mons with --mon-compact-on-start
[17:35] <joao> :\
[17:35] <sagelap2> it's not doing it on my laptop but it was at home
[17:35] <sagelap2> i'm hunting down the real issue tho.. on reopen leveldb is not creating its worker thread
[17:36] <joao> hmm... then again...
[17:37] <joao> it didn't hang when I first compacted a 5.5 GB store on two mons
[17:37] <joao> restarting them with the flag on seems to have done it for me
[17:38] <joao> waiting somewhere under CompactRange()
[17:38] <joao> on TEST_Compact() it would seem
[17:40] <joao> looks as if it's still working though
[17:40] <joao> store sizes keep on growing
[17:40] <joao> (which usually happens during compact, only to be freed once the thing is all done)
[17:42] <sagelap2> totally confused.. when i call open again i'm getting back the same pointer.
[17:43] <joao> maybe shared_ptr or scoped_ptr magic?
[17:43] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[17:44] <joao> sagelap2, is that on LevelDBStore?
[17:44] * noahmehl (~noahmehl@cpe-184-57-16-227.columbus.res.rr.com) Quit (Quit: noahmehl)
[17:45] * brian_appscale (~brian@wsip-72-215-161-77.sb.sd.cox.net) has joined #ceph
[17:46] <sagelap2> i'm dumping the pointer that leveldb gives me back...
[17:47] <gregaf> LevelDB is supposed to be fairly aware; have you checked the interface docs?
[17:49] <sagelap2> i'm reading the source code
[17:49] <sagelap2> DB::Open() allocates a DBImpl on the heap
[17:49] <sagelap2> but it is returning the same pointer twice
[17:51] <gregaf> it doesn't have a bypass for checking if one's already allocated? Because I thought you were okay opening from multiple threads and would get the same thing back
[17:51] <sagelap2> oh, its just the allocator being consistent.
[17:51] <sagelap2> it doesn't
[17:51] <gregaf> hmm, guess I'm misremembering then
[17:55] <joao> sagelap2, wrt http://tracker.ceph.com/issues/4895 did mikedawson upload his store?
[17:55] <joao> any idea?
[17:55] <sagelap2> yeah
[17:56] <joao> to cephdrop?
[17:56] * leseb (~Adium@ Quit (Quit: Leaving.)
[17:57] <joao> well, there's a bunch of them over there
[17:57] <joao> mikedawson, still around?
[17:57] * allsystemsarego (~allsystem@5-12-37-186.residential.rdsnet.ro) has joined #ceph
[17:58] <mikedawson> joao: yep
[17:58] <imjustmatthew> joao: Sorry these took a few minutes, people walked in: https://docs.google.com/file/d/0B-VPvwEe43FVOVRXdHBVbGwwZ28/edit?usp=sharing https://docs.google.com/file/d/0B-VPvwEe43FVYVJWSjV0WkxtaGc/edit?usp=sharing https://docs.google.com/file/d/0B-VPvwEe43FVUEswSnpWckJtSUk/edit?usp=sharing
[17:59] <joao> do you remember which was the store you uploaded (I'm assuming) last night?
[17:59] <joao> mikedawson, ^
[17:59] <joao> imjustmatthew, that's okay; thanks :)
[18:00] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:00] <mikedawson> joao: mikedawson-ceph-mon.a-tdump.tar.bz2 has the store before, the store after, and some leveldb dump stuff sage put in
[18:00] * ScOut3R (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[18:00] <joao> cool
[18:00] <joao> thanks
[18:02] <sagelap2> pushed my current code that reproduces it on every ceph-mon start..
[18:02] <sagelap2> joao, gregaf: any clues? on the open after fork(), no worker thread gets created and it hangs on compact every time
[18:02] * sagelap2 (~sage@2600:1012:b02f:8cf8:b499:ff7c:ce05:7e3f) has left #ceph
[18:03] * sagelap2 (~sage@2600:1012:b02f:8cf8:b499:ff7c:ce05:7e3f) has joined #ceph
[18:03] <sagelap2> may need to link leveldb statically to debug :(
[18:03] * sagelap2 is now known as sagelap
[18:03] <mikedawson> imjustmatthew: if you still can't get quorum, start all mons with "mon compact on start = true", it will get stuck. stop mons after you confirm the leveldb has been compacted to a small size. then start all mons with "mon compact on start = false". That works for me
[18:04] <joao> sagelap, let me give it a spin and see if it hangs here
[18:04] <gregaf> sagelap: on a call and haven't looked at it, sorry
[18:05] <sagelap> oh, i think the posix env stuff is hiding some fixed state
[18:05] <sagelap> np
[18:07] <imjustmatthew> mikedawson: thanks, my other two mons are on sage's code from last night and working relatively well without compact on start; it's just the one with compact on start that hangs; they are using a lot of CPU, but that could just be from elevated logging
[18:08] <mikedawson> imjustmatthew: that *should* work for the single mon out of quorum while the other two are still running (unless perhaps their mon leveldb stores are big)
[18:08] * portante|ltp (~user@ Quit (Ping timeout: 480 seconds)
[18:10] <mikedawson> imjustmatthew: basically, in the state you, me, and wido are able to reproduce, compact on start will hang. -f or not compacting on start appear to be workarounds
[18:12] * tnt (~tnt@ has joined #ceph
[18:12] <sagewk> static Env* default_env;
[18:12] <sagewk> static in a shared library... damn you, leveldb!!
[18:13] * Wolff_John (~jwolff@vpn.monarch-beverage.com) Quit (Ping timeout: 480 seconds)
[18:13] * aliguori_ (~anthony@ Quit (Quit: Ex-Chat)
[18:13] * aliguori (~anthony@ has joined #ceph
[18:14] <joao> sagelap, fyi mons started just fine but segfault on shutdown somewhere on the scoped_ptr's destructor
[18:14] <joao> I mean, on LevelDBStore's scoped_ptr
[18:15] <joao> sagewk, lol
[18:15] <joao> hmm
[18:16] <joao> wait a sec, so monitors would only hang with --mon-compact-on-start if run as daemons?
[18:16] <sagewk> no
[18:16] * bergerx (~bergerx@ has joined #ceph
[18:16] <sagewk> they would hang if they happened to call any async (background thread) functions before the fork()
[18:16] <sagewk> but we do gobs of stuff before the fork. compact() guaranteed we would, but other things might cause it..
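A toy Python model of the failure sagewk describes above, offered as an illustration rather than as leveldb's actual code. It assumes, as the discussion suggests, that a process-wide environment remembers having started its background worker, while fork() copies only the calling thread into the child. The names ToyEnv, schedule, job_runs and demo are hypothetical, and the sketch needs a POSIX system with os.fork().

```python
import os
import queue
import threading
import time


class ToyEnv:
    """Toy stand-in for a process-wide environment that lazily starts one worker."""

    def __init__(self):
        self.jobs = queue.Queue()
        self.started = False  # this flag survives fork(); the thread does not

    def schedule(self, job):
        # Start the worker only once, then hand it the job.
        if not self.started:
            self.started = True
            threading.Thread(target=self._run, daemon=True).start()
        self.jobs.put(job)

    def _run(self):
        # Worker loop: take jobs off the queue and call them.
        while True:
            self.jobs.get()()


def job_runs(env, timeout=0.5):
    """Schedule a job and report whether it completed within the timeout."""
    done = threading.Event()
    env.schedule(done.set)
    return done.wait(timeout)


def demo():
    env = ToyEnv()
    ran_before_fork = job_runs(env)  # starts the worker in the parent
    time.sleep(0.1)                  # let the worker go back to waiting
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the flag says "started", but the worker thread was not copied,
        # so the queued job is never picked up.
        os.write(write_fd, b"1" if job_runs(env) else b"0")
        os._exit(0)
    os.close(write_fd)
    ran_after_fork = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    return ran_before_fork, ran_after_fork
```

In this model the job scheduled in the child waits until the timeout because nothing is left to run it, which mirrors a compaction that blocks on background work after the daemon has forked. Forking before any background work is scheduled, or creating a fresh environment in the child, avoids the stale flag.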
[18:17] * sagelap (~sage@2600:1012:b02f:8cf8:b499:ff7c:ce05:7e3f) Quit (Ping timeout: 480 seconds)
[18:18] <sagewk> now regretting our decision to dynamically link to leveldb :)
[18:20] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[18:21] <sagewk> they go to all this effort to abstract away the environment and then don't let you create one.. except via their static deduped Default() function
[18:25] * l0nk (~alex@ Quit (Quit: Leaving.)
[18:25] * dxd828 (~dxd828@ Quit (Remote host closed the connection)
[18:27] <joao> sage, their source states
[18:27] <joao> Sophisticated users may wish to provide their own Env
[18:27] <joao> // implementation instead of relying on this default environment.
[18:27] <joao> any chance we are that sophisticated that implementing our own Env would do the trick?
[18:29] * Wolff_John (~jwolff@vpn.monarch-beverage.com) has joined #ceph
[18:29] <sagewk> ah good call
[18:30] <sagewk> joao: can you see if all the qa hangs are because of this?
[18:30] <joao> sure
[18:30] <sagewk> just gdb attach to hung monitors and see if they are blocked in leveldb
[18:30] <joao> looking
[18:33] * yehuda_hm (~yehuda@2602:306:330b:1410:5c3f:8e44:656a:fa4c) has joined #ceph
[18:33] * yehudasa_ (~yehudasa@2602:306:330b:1410:ea03:9aff:fe98:e8ff) Quit (Quit: Leaving)
[18:46] <joao> sage, none of the 34 hangs were due to this
[18:47] <joao> all waiting on machines; a couple of them managed to eventually get the machines and finish successfully but not in time it appears
[18:55] <sagewk> look at which jobs are holding locks to see which are hung
[18:55] <sagewk> teuthology-lock --list --owner scheduled_teuthology@teuthology |grep desc | sort
[18:58] * Tamil (~tamil@ has joined #ceph
[19:00] * tkensiski (~tkensiski@ has joined #ceph
[19:00] * tkensiski (~tkensiski@ has left #ceph
[19:00] <cjh_> i wonder if ceph could take advantage of native crc32 on the cpu? when i run perf top it seems like the only thing it's doing is crc32c_le
[19:02] <sagewk> sjust: want to look at wip-env?
[19:02] <sagewk> pita just to make it build
[19:02] <sagewk> might be simpler to fork early
[19:03] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[19:03] * joshd1 (~jdurgin@2602:306:c5db:310:881a:87fc:52ea:35ce) Quit (Ping timeout: 480 seconds)
[19:03] <gregaf> cjh_: that is something at the top of our mind in terms of performance features
[19:03] <cjh_> gregaf: awesome :)
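For reference, a bitwise Python sketch of CRC-32C (the Castagnoli polynomial), which is the checksum that the SSE4.2 crc32 instruction computes in hardware. It assumes that the crc32c_le routine in cjh_'s perf output is a software implementation of the same polynomial; Ceph's own seed and finalisation conventions may differ from the standard ones used here. The function name crc32c is just a label for this sketch.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C (reflected polynomial 0x82F63B78), standard init and final XOR."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

The inner loop performs eight shift-and-XOR steps per input byte, which is why a software CRC can dominate a profile under heavy I/O; table-driven variants and the hardware instruction process a byte or more per step instead.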
[19:04] <cjh_> i changed over to btrfs for testing and the performance is pretty decent
[19:04] <cjh_> about 2GB/s sustained over 10 mins with rados bench
[19:04] <cjh_> a single client sees about 750MB/s
[19:04] <cjh_> make that 815MB/s.
[19:08] <imjustmatthew> sagewk: do you want those newer commits you made to wip-leveldb-reopen running or are you pretty set with reproducing the issue locally now?
[19:08] <sagewk> all set, thanks
[19:08] <imjustmatthew> k, good luck
[19:10] * houkouonchi-work (~linux@ has joined #ceph
[19:10] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[19:13] * kyle_ (~kyle@ has joined #ceph
[19:13] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[19:14] * joshd1 (~jdurgin@2602:306:c5db:310:29c4:15ac:5ad:9e70) has joined #ceph
[19:16] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[19:16] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[19:17] <kyle_> hello all. i'm having problems getting placement groups right after an osd failure. Can these placement groups be fixed... HEALTH_WARN 386 pgs down; 94 pgs incomplete; 386 pgs peering; 480 pgs stale; 480 pgs stuck inactive; 576 pgs stuck unclean
[19:18] * Wolff_John (~jwolff@vpn.monarch-beverage.com) Quit (Ping timeout: 480 seconds)
[19:18] * jluis (~JL@ has joined #ceph
[19:18] <kyle_> the cluster is not in production yet, and has been in this state for a few days (i was hoping they would begin to recover, but nothing has moved in days)
[19:18] * rturk-away is now known as rturk
[19:20] * sjustlaptop (~sam@2607:f298:a:697:2015:b217:f5be:3ab1) has joined #ceph
[19:21] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[19:22] <kyle_> if there is anyone available for some PG troubleshooting, i would be very appreciative.
[19:22] <gregaf> kyle_: for help with this you should pastebin the output of "ceph osd tree" and "ceph pg dump"
[19:23] <kyle_> will do.
[19:23] * jskinner (~jskinner@ has joined #ceph
[19:23] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:23] * joao (~JL@89-181-146-10.net.novis.pt) Quit (Ping timeout: 480 seconds)
[19:24] <kyle_> http://pastebin.com/83XHiVNP
[19:25] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[19:26] * Tamil (~tamil@ has left #ceph
[19:27] * Svedrin (svedrin@ketos.funzt-halt.net) Quit (Ping timeout: 480 seconds)
[19:28] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[19:28] <sagewk> wip-mon-forker
[19:28] * Havre (~Havre@2a01:e35:8a2c:b230:d433:a843:4749:9772) has joined #ceph
[19:29] <kyle_> okay so here is my "ceph osd tree" http://pastebin.com/83XHiVNP
[19:29] <kyle_> and here is a sample of my "ceph pg dump" http://pastebin.com/08BGrJ3K
[19:29] <kyle_> the pg dump was cut down since it was very large
[19:29] <sagewk> er, repushed
[19:30] * eschnou (~eschnou@104.207-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[19:30] * barryohome (~barry@host86-179-27-71.range86-179.btcentralplus.com) has joined #ceph
[19:33] * sjustlaptop (~sam@2607:f298:a:697:2015:b217:f5be:3ab1) Quit (Ping timeout: 480 seconds)
[19:34] <gregaf> kyle_: what's your replication level set to? and your crush rules?
[19:35] <gregaf> it looks like most of your PGs are mapping to a single OSD, but they're stuck in down+peering, which indicates they can't find the data
[19:35] <gregaf> if you have a size of 1 and you lost an OSD, then you lost all the data that was on it
[19:35] <kyle_> that makes sense. my crush map was incorrect during the osd failure
[19:35] <kyle_> is there a way to sort of wipe the data and start fresh?
[19:36] <gregaf> you can reformat the whole cluster, or delete and recreate the pools, or you can mark the OSD as "lost" (that's covered in the docs)
[19:36] * Svedrin (svedrin@ketos.funzt-halt.net) has joined #ceph
[19:36] <gregaf> marking lost will preserve the data that's available but give up on finding what's not
[19:36] <kyle_> okay. if i delete and recreate the pool. will it recover the space that was previously being used?
[19:36] <mikedawson> kyle_: look at the output of ceph pg dump | grep "^[0-9]\.[0-9a-f]*" | awk '{ print $14 }'
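A Python equivalent of the grep/awk pipeline mikedawson suggests, as a minimal sketch. It keeps only lines that begin with a PG id matching the same pattern and returns their 14th whitespace-separated field; the function name field_14 is hypothetical, and which column that is depends on the "ceph pg dump" output of the version in use.

```python
import re

# Same pattern as the grep: one digit, a dot, then hex digits.
PG_LINE = re.compile(r"^[0-9]\.[0-9a-f]*")


def field_14(lines):
    """Return the 14th whitespace-separated field of every line that starts with a PG id."""
    out = []
    for line in lines:
        if PG_LINE.match(line):
            fields = line.split()
            # Like awk, yield an empty string when the field is missing.
            out.append(fields[13] if len(fields) >= 14 else "")
    return out
```

To use it on live output, the lines can be read from standard input, for example by piping "ceph pg dump" into a script that calls field_14(sys.stdin). Note that the pattern only matches single-digit pool ids, so PGs in pools numbered 10 or higher are filtered out, just as they are by the original grep.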
[19:37] <gregaf> in the meantime, you probably want to fix things up so you have multiple copies of all your PGs...
[19:37] <gregaf> yes, deleting the pool will clean up all the space
[19:37] <kyle_> yeah i have since rectified the settings and crush map.
[19:37] <kyle_> i think i will just start with a new pool
[19:37] <sjust> sagewk: I think that looks ok
[19:37] <gregaf> are you sure? because most of the sample you provided are still mapping to a single OSD
[19:37] <kyle_> i appreciate how helpful you guys always are
[19:38] <kyle_> and that mapping would be based on the crush map right?
[19:38] <gregaf> yeah
[19:38] <sagewk> sjust: pushed a few fixes
[19:38] <gregaf> and some per-pool settings like the "size"
[19:39] <kyle_> does this look like it would be right?
[19:39] <kyle_> http://pastebin.com/D4xvpfb4
[19:40] <kyle_> i want to spread across 3 osds
[19:40] <sagewk> mikedawson, imjustmatthew: around?
[19:40] <imjustmatthew> sagewk: yes
[19:40] * eschnou (~eschnou@104.207-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[19:40] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[19:40] <mikedawson> sagewk: me, too
[19:41] <gregaf> kyle_: yeah, but you'll need to set the pool size parameters appropriately (eg, 3)
[19:41] <sjust> sagewk: looks more right
[19:41] <gregaf> actually, wait, no
[19:41] <gregaf> your current rules…are very strangely formed, but I guess are fine, so yeah, just set the size appropriately
[19:42] <sagewk> have a branch ready to test that should resolve the hangs properly. building now..
[19:42] <imjustmatthew> awesome, wip-mon-forker?
[19:42] <sagewk> yeah. it pushed twice, so need to make sure you get the second build (not 07914db3b302095912b563e14c896007b357f118)
[19:43] <gregaf> kyle_: urgh, keep changing my mind, that's not going to work
[19:43] <kyle_> okay here is what my conf looks like... http://pastebin.com/ryqgbCfM
[19:43] <gregaf> you've specified that the granularity for each replica is to split across racks with that "type rack" in the "step chooseleaf firstn 0 type rack" line
[19:43] <mikedawson> sagewk: wip-no-more-major-problems-before-cuttlefish, right?
[19:44] * alram (~alram@ has joined #ceph
[19:44] <sagewk> mikedawson: fingers crossed! :)
[19:44] <sagewk> because i'm going camping this weekend :)
[19:45] <kyle_> oh i see. i guess i was confused about the chooseleaf option. i'll have to review that again
[19:45] <sagewk> joao: did you verify the hangs?
[19:45] <sagewk> or did someone else?
[19:45] <gregaf> kyle_: you've got a crush map quite a bit different from the default ones, but if you change that line to "step chooseleaf firstn 0 type osd" it should work out
[19:46] <gregaf> the ones we construct by default include a "pool root" which holds a "rack unknownrack", each of the "host"s under that and each "device" in the appropriate "host"
[19:46] <gregaf> and then the rules are more like "take root" "step chooseleaf firstn 0 type host"
[19:46] <kyle_> okay. that sounds good. i'll give it a try. yeah the default one seemed a bit too granular. but it's likely a lack of understanding on my part.
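A toy Python model of why the bucket type in the chooseleaf step matters, as discussed above. This is not the CRUSH algorithm: it only assumes that a rule of the form "step chooseleaf firstn 0 type X" places each replica under a distinct bucket of type X, so the number of such buckets caps the replica count. The layout below (one rack, one host, three OSDs) is hypothetical and not taken from kyle_'s pastebin; replicas_placed is likewise a made-up name.

```python
def replicas_placed(tree, failure_domain, pool_size):
    """Toy model: one replica per distinct bucket of the chosen type, capped at pool size."""
    counts = {
        "rack": len(tree),
        "host": sum(len(hosts) for hosts in tree.values()),
        "osd": sum(len(osds) for hosts in tree.values() for osds in hosts.values()),
    }
    return min(pool_size, counts[failure_domain])


# Hypothetical layout: rack -> host -> list of OSDs.
tree = {"rack0": {"host0": ["osd.0", "osd.1", "osd.2"]}}
```

With this layout and a pool size of 3, replicas_placed(tree, "rack", 3) gives 1 while replicas_placed(tree, "osd", 3) gives 3, which matches the symptom of PGs mapping to a single OSD when the rule separates replicas by rack and only one rack exists.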
[19:48] * madkiss (~madkiss@2001:6f8:12c3:f00f:4d7c:af8c:ed4f:254e) has joined #ceph
[19:51] * davidzlap (~Adium@ has joined #ceph
[19:54] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[19:54] * fudida (~oftc-webi@p4FC2C57C.dip0.t-ipconnect.de) has joined #ceph
[19:55] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[19:55] <mikedawson> gregaf: I have 10GB journal partitions, but my osd admin sockets report osd_journal_size = 5120 (I don't set it in ceph.conf). Is that a concern, or is osd_journal_size unused when attaching to a journal partition?
[19:56] * eschnou (~eschnou@104.207-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[19:58] <gregaf> mikedawson: ah, that's a default of 5GB, and it's ignored if there's a pre-existing block device
[19:58] * Wolff_John (~jwolff@vpn.monarch-beverage.com) has joined #ceph
[19:58] <mikedawson> thx
[20:07] <sagewk> wip-mon-forker is built
[20:07] <sagewk> test away!
[20:07] * mjblw (~mbaysek@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[20:12] <imjustmatthew> sagewk: Do you happen to know if there is a simple way to tell apt that this version (v0.60-801-gb343b44) is actually newer than the "0.60-804-g62a37f0" installed now?
[20:14] * eschnou (~eschnou@104.207-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[20:15] * Cube (~Cube@ has joined #ceph
[20:15] <sagewk> apt-get install ceph=theversionyouwant
[20:16] * dwt (~dwt@128-107-239-234.cisco.com) has joined #ceph
[20:19] <imjustmatthew> awesome, and that branch is working for me
[20:20] <sjust> sagewk: that means we can get rid of the manual compaction?
[20:20] <sjust> or rather, the auto compaction on trim?
[20:20] <sagewk> no.. it just means we won't hang
[20:20] <sagewk> or.. oh, hmm! maybe
[20:21] <sagewk> mikedawson: can you test? turn off compaction ('mon compact on trim = false') with this branch and see if it stops the crazy growth??
[20:21] <gregaf> I would assume it would
[20:22] * eternaleye (~eternaley@cl-43.lax-02.us.sixxs.net) Quit (Remote host closed the connection)
[20:22] * eternaleye_ (~eternaley@cl-43.lax-02.us.sixxs.net) has joined #ceph
[20:23] * eternaleye_ is now known as eternaleye
[20:23] <via> why does a new point release for a stable series get a new dependency on el6
[20:23] <joshd> via: apparently a merge error - should be fixed shortly
[20:24] <via> oh... cool
[20:24] <via> thanks for responding so quickly
[20:24] <sagewk> imjustmatthew: sweet, thanks!
[20:26] <wido> fyi, my mons are running happily
[20:26] <wido> only for a bit over 8 hours now. Saw one hang again (reported in 4851), but after the restart nothing
[20:26] <wido> HEALTH_OK!
[20:32] * fudida (~oftc-webi@p4FC2C57C.dip0.t-ipconnect.de) Quit (Remote host closed the connection)
[20:33] <mjblw> I have a cluster with 12 storage nodes each with 14 OSDs on spinning disks. For a while I had a down and out OSD and just got it replaced. I started the osd process and the cluster went nuts. 9 of the osd processes across the cluster crashed and system loads shot up over 100.
[20:33] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[20:34] <benner> is there any more descriptive documentation about crush rules?
[20:34] <mjblw> i ran ceph osd tell \* injectargs '--osd-max-backfills 1' knowing that would reduce the backfill load and after restarting all the crashed osd processes, things seem to have stabilised
[20:35] <mjblw> It seems to me that there is some serious problem here that I need to address. Adding an OSD should not cause the cluster to fall over.
[20:36] <mjblw> does someone have some insight as to what might be happening here?
[20:36] * Havre (~Havre@2a01:e35:8a2c:b230:d433:a843:4749:9772) Quit (Ping timeout: 480 seconds)
[20:37] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:e07c:a2b0:6a21:5622) has joined #ceph
[20:38] * Havre (~Havre@2a01:e35:8a2c:b230:11cc:1cbe:c87:5139) has joined #ceph
[20:38] <benner> mjblw: try add new OSD with low weight
[20:38] <sagewk> mikedawson: if this fixes growth, then we can turn off the explicit compaction too and we're golden. let us know when you have a chance to try it!
[20:38] <mjblw> well, this is re-adding an OSD whose disk failed
[20:39] <mjblw> all disks in the cluster are the same size
[20:40] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[20:41] <benner> mjblw: if i understood correctly you're suffering rebuilding damage. adding the new OSD with weight 0.1 and increasing it from time to time up to 1 (or whatever the others have) will minimize this damage. just check when one weight step has finished and then increase it
[20:41] <mjblw> maybe I am misconceptualizing the purpose of weight. I thought weight was the proportionality of desired data distribution to that bucket
[20:41] * madkiss (~madkiss@2001:6f8:12c3:f00f:4d7c:af8c:ed4f:254e) Quit (Ping timeout: 480 seconds)
[20:42] * diegows (~diegows@ has joined #ceph
[20:42] <jmlowe> any chance we could get some raring 0.56.5 builds?
[20:42] <benner> weight means how many objects an osd can take. in many cases it's related to storage capacity, in others to how fast the osd is, but you can always do a little rebalancing at the start by adding weight to it online
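Editor's sketch of the gradual re-weighting benner describes: start the replaced OSD at a low CRUSH weight and raise it in steps, waiting for each rebalance to finish before the next. The step size, the OSD id 42 and the helper names are hypothetical. `ceph osd crush reweight` is the command as the editor understands it; its exact syntax on 0.56 is an assumption worth verifying first.

```python
def weight_steps(start_tenths=1, target_tenths=10, step_tenths=3):
    """Return the weights to apply in order, ending exactly at the target.

    Weights are tracked in tenths (integers) so repeated addition does not
    accumulate floating-point error.
    """
    steps = []
    current = start_tenths
    while current < target_tenths:
        steps.append(current / 10)
        current += step_tenths
    steps.append(target_tenths / 10)
    return steps


def reweight_commands(osd_id, steps):
    """Render one reweight command per step; run them one at a time."""
    return [
        "ceph osd crush reweight osd.%d %.1f" % (osd_id, weight)
        for weight in steps
    ]


# Hypothetical OSD id; wait for the cluster to settle between commands.
commands = reweight_commands(42, weight_steps())
```

Smaller steps mean less data moves per round at the cost of a longer overall ramp, which is the trade-off being suggested in the channel.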
[20:42] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:e07c:a2b0:6a21:5622) Quit (Quit: Leaving.)
[20:44] <mjblw> I see. So, this 'rebuilding damage' can result in cascading failures of osd processes across the cluster if those osd processes are starved long enough for I/O?
[20:49] * kyle__ (~kyle@ has joined #ceph
[20:49] * wag2 (~wag2@node001.ds.geneseo.edu) has joined #ceph
[20:50] <benner> mjblw: what version of ceph are you using?
[20:50] * Havre_ (~Havre@2a01:e35:8a2c:b230:a968:c921:7f3e:4fc9) has joined #ceph
[20:51] <mjblw> ceph version 0.56.3 (6eb7e15a4783b122e9b0c85ea9ba064145958aa5)
[20:52] * Havre (~Havre@2a01:e35:8a2c:b230:11cc:1cbe:c87:5139) Quit (Ping timeout: 480 seconds)
[20:52] <benner> and you're using rbd, cephfs, just rados?
[20:53] <mjblw> rbd
[20:53] <jmlowe> hmm, 0.56.5 seems to have broken my upstart
[20:53] <jmlowe> service ceph -a stop mon.alpha
[20:53] <jmlowe> /etc/init.d/ceph: mon.alpha not found (/etc/ceph/ceph.conf defines "", /var/lib/ceph defines "")
[20:55] <benner> mjblw: in my setup i got the same results for now but i think it's related to my setup :)
[20:56] * kyle_ (~kyle@ Quit (Ping timeout: 480 seconds)
[20:59] <pioto> rturk: hi, i'm listed as the "owner" of the CephFS Security blueprint... i think i should be able to be on irc for that session, but i may or may not be able to be on G+. would that be okay?
[21:00] * drokita (~drokita@ Quit (Ping timeout: 480 seconds)
[21:00] <rturk> pioto: hi :) hmm, someone will have to be in the hangout to moderate the discussion
[21:01] <rturk> is there someone who can moderate if you can't make it?
[21:01] <rturk> perhaps gregaf?
[21:02] <jmlowe> sagewk: you might want to pull back 0.56.5, the init scripts don't look quite right
[21:02] <sagewk> jmlowe: what's wrong with them?
[21:02] <sagewk> oh
[21:03] <jmlowe> this worked yesterday but didn't work after the point release "service ceph -a restart mon.alpha"
[21:03] <sagewk> jmlowe: can you pastebin your ceph.conf?
[21:03] <jmlowe> standby
[21:05] <jmlowe> http://pastebin.com/Xw0ReXmw
[21:06] <sagewk> jmlowe: config is in /etc/ceph.conf, and exists on the remote hosts?
[21:06] <sagewk> /etc/ceph/ceph.conf rather
[21:06] <jmlowe> yes
[21:06] <sagewk> k, i'll reproduce
[21:07] <jmlowe> I did a stop and start yesterday as part of a hardware upgrade
[21:07] <jmlowe> no problems, no changes to ceph.conf
[21:08] <sagewk> yeah, there was a change to the init script. (wasn't supposed to break tho! :)
[21:10] <jmlowe> well, at least it didn't let me stop and not start
[21:11] <mjblw> so, benner, you also saw cascading crashes of osd processes? I think for now, I will just keep my osd_max_backfills set to 1 since it seems to alleviate this problem.
[21:12] <sagewk> fwiw you should be able to start locally.. i think its just -a that's broken
[21:12] <mjblw> I wonder if future versions of ceph will be smarter about not DOSing itself when an osd is added while the system is configured with a high max-backfill or the weight of the bucket is wrong.
[21:12] <jmlowe> hmm, also broken without the -a
[21:13] * eschnou (~eschnou@104.207-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:17] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[21:17] * loicd (~loic@ Quit (Read error: Operation timed out)
[21:18] <sjust> mjblw: max_backfill is the main way it avoids DOSing itself
[21:18] <sjust> we probably have a non-sane default though
[21:19] <sagewk> bah, it's a oneliner.
[21:20] <benner> mjblw: no, i saw performance problems.
[21:21] <benner> or stuck in IO operations
[21:21] <benner> i saw kernel panics on client nodes when i changed tunables settings but that's another story :-))
[21:23] * brady (~brady@rrcs-64-183-4-86.west.biz.rr.com) has joined #ceph
[21:24] <sagewk> jmlowe: pushed a fix to bobtail branch
[21:25] <sagewk> http://fpaste.org/10327/67609128/
[21:25] <sagewk> if you put that in /usr/lib/ceph/ceph_common.sh it ought to resolve it
[21:26] * bmjason (~bmjason@ has joined #ceph
[21:29] <bmjason> anyone else having problems getting to the package apt repo? I am getting 404's on ceph.com
[21:29] <sagewk> bmjason: took the packages down until we build new ones that fix the init script problem
[21:30] <jmlowe> well almost
[21:30] <bmjason> alright.. we were in the middle of upgrading our bobtail.. just wanted to make sure nothing crazy happened
[21:30] <jmlowe> it has this warning /etc/init.d/ceph: 1: /usr/lib/ceph/ceph_common.sh: !/bin/sh: not found
[21:31] <jmlowe> nm, pasting error
[21:38] <bmjason> having another issue with copy-on-write clone not working between an image pool and volume pool when I create a cinder volume from a glance image.. image in glance is raw format and when i create a new volume from it, it seems to copy the whole image into the new volume instead of using copy on write. I am using grizzly and .56 ceph with rbd and I have followed this url: http://ceph.com/docs/master/rbd/rbd-openstack/ to set it up
[21:39] <sagewk> jmlowe: all better?
[21:39] <jmlowe> yep
[21:39] <sagewk> great, thanks!
[21:40] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[21:41] <benner> does anyone know when the C release will be out?
[21:44] <benner> blog says "Cuttlefish, which is due out in 4 weeks (around May 1)." :)
[21:44] <sagewk> benner: looks like monday. but the 'cuttlefish' branch is now there, so you can test via the autobuilt packages!
[21:46] <tnt> Did I understand correctly that I should first go to 0.56.5 then 0.61 ?
[21:48] <gregaf> only if your cluster was formatted pre-Bobtail
[21:48] <gregaf> otherwise you can go straight
[21:48] <tnt> yes, cluster was created on 0.48
[21:48] <tnt> why does that matter ? Isn't everything updated when upgrading ?
[21:49] <benner> someone was saying that hes got 404 in http://ceph.com/debian/dists/precise/main/binary-amd64/Packages . I'm suffering from this too...
[21:49] <gregaf> there was a bug in earlier bobtail releases so even they were doing what they needed to do a cuttlefish upgrade, some of the monitors didn't *write down* that they'd done it
[21:49] <gregaf> *even though
[21:50] <benner> gregaf: i think it's a matter of backwards compatibility, not the upgrade itself
[21:50] * Tamil (~tamil@ has joined #ceph
[21:51] <gregaf> cuttlefish involves a monitor disk and logic change that requires some extra data to be present; bobtail maintains the extra data but argonaut did not
[21:51] <gregaf> unfortunately the bobtail monitor had a bug preventing it from noting "hey, I'm maintaining this information" for upgraded clusters
[21:52] <gregaf> so then you turn on cuttlefish and it goes "hey, I don't have the info I need"
[21:53] * drokita (~drokita@ has joined #ceph
[21:57] * lightspeed (~lightspee@ has joined #ceph
[21:58] * allsystemsarego (~allsystem@5-12-37-186.residential.rdsnet.ro) Quit (Quit: Leaving)
[22:10] * n1md4 (~nimda@anion.cinosure.com) has joined #ceph
[22:11] <n1md4> hi. is it known that ceph.com/debian/ 404s ?
[22:11] * Cube (~Cube@ has joined #ceph
[22:12] <gregaf> we nuked it temporarily due to an error in the release
[22:12] <gregaf> see the mailing lists
[22:13] * drokita (~drokita@ Quit (Quit: Leaving.)
[22:14] * fridad (~oftc-webi@p4FC2C57C.dip0.t-ipconnect.de) has joined #ceph
[22:16] * Cube (~Cube@ Quit ()
[22:17] * madkiss (~madkiss@2001:6f8:12c3:f00f:1df1:3b3c:b807:450d) has joined #ceph
[22:20] <fridad> sagewk: when will we see 0.56.6? i don't see the tag in git?
[22:21] <sagewk> glowell is building it now. we usually push the tag along with the packages
[22:23] <fridad> but will it be 6dbdcf5a210febb5e0dd585e0e599ac807642210 ?
[22:25] <mjblw> another problem I have since adding back that osd, is that, even after setting osd_max_backfills to 1, I am *still* seeing slow requests that are blocking for like, 1500 seconds.
[22:26] * madkiss (~madkiss@2001:6f8:12c3:f00f:1df1:3b3c:b807:450d) Quit (Quit: Leaving.)
[22:30] <cjh_> how are you guys feeling about btrfs with kernel 3.9? stable enough to start using for real workloads maybe?
[22:33] <fridad> cjh_: i tried btrfs with 3.8.11 and i've seen crashing hosts (hanging tasks) and massively rising iop/s over 2 weeks. I started at 1000 iop/s per host and ended up with 5000 iop/s with the same workload. with xfs i have now had a stable 500 iop/s for weeks
[22:33] <fridad> i won't recommend it
[22:33] * drokita (~drokita@ has joined #ceph
[22:34] <cjh_> ok
[22:34] <mikedawson> sagewk: I'm ready to test the new build. wip-mon-forker still, right?
[22:35] <sagewk> or next
[22:35] <sagewk> we want to see that it will stop growing with 'mon compact on trim = false'
[22:35] <mikedawson> sagewk: I assume that means someone else liked it and its been merged already?
[22:35] <sagewk> the hang bug likely is also responsible for that misbehavior
[22:35] <sagewk> yeah it fixed it for imjustmatthew and us (once we figured out how to reproduce the hang)
[22:36] <sagewk> wido: you should also try latest next; ought to clear things up
[22:36] <sagewk> (until then there's some possibility it'll happen after each mon restart)
[22:37] * drokita (~drokita@ Quit ()
[22:38] <mikedawson> sagewk: for the record, on ceph version 0.60-800-g037bca5 (037bca59bee61d6e269ba5842d3084147be677fe) with 'mon compact on trim = false' my mons grew from ~400MB to ~2.5GB since yesterday
[22:39] <sagewk> cool, that makes sense given what we found
[22:40] <mikedawson> sagewk: actually, the socket reports, "mon_compact_on_trim": "true", so I must have mis-understood the default setting
[22:41] <bmjason> re my issue with copy on write not working earlier.. i found the solution after digging with strace.. cinder.conf was missing glance_api_version=2 which made it use v1 api that doesn't support it
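Editor's sketch of a quick check for the setting bmjason found missing. It reads cinder.conf text and reports the effective `glance_api_version`, assuming the option lives in `[DEFAULT]` and that an unset option means the v1 API, as the channel discussion implies. The function name is hypothetical.

```python
import configparser


def glance_api_version(conf_text):
    """Return the glance_api_version set in [DEFAULT], or 1 if absent."""
    # Interpolation is disabled so literal '%' characters in other
    # options do not raise errors while parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return int(parser.defaults().get("glance_api_version", "1"))


def supports_cow_clone(conf_text):
    """Per the discussion, copy-on-write cloning needs the v2 API."""
    return glance_api_version(conf_text) >= 2


before_fix = "[DEFAULT]\n"
after_fix = "[DEFAULT]\nglance_api_version=2\n"
```

With `before_fix` the check reports version 1, matching the behaviour where the whole image was copied into the new volume.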
[22:41] * Cube (~Cube@ has joined #ceph
[22:42] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:42] <mjblw> it cuttlefish being released today?
[22:42] <mjblw> *is
[22:43] <davidzlap> mjblw: Plan is Monday
[22:44] <cjh_> sweet
[22:44] <cjh_> i'll be installing as soon as it lands
[22:46] * rustam (~rustam@ has joined #ceph
[22:46] <n1md4> gregaf: being a naturally sysadmin not wanting to sieve the mailinglist .. is there a plan to get the debian download back online?
[22:46] <n1md4> s/naturally/naturally lazy/
[22:47] <sagewk> where is cuttlefish, you ask? https://www.google.com/search?q=cuttlefish+camouflage&hl=en&tbm=isch&source=lnms&sa=X&ei=HCKEUfOrOYrFiwKuy4CgAw&ved=0CAcQ_AUoAQ&biw=1916&bih=1082#hl=en&tbm=isch&sa=1&q=cuttlefish+camouflage&oq=cuttlefish+camouflage&gs_l=img.3...,or.r_cp.r_qf.&bvm=bv.45960087,d.cGE&fp=a6a15a51585ce5a8&biw=1916&bih=1082
[22:48] * rustam (~rustam@ Quit (Remote host closed the connection)
[22:49] * Wolff_John (~jwolff@vpn.monarch-beverage.com) Quit (Read error: Connection reset by peer)
[22:49] * Wolff_John (~jwolff@ has joined #ceph
[22:50] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[22:53] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) has joined #ceph
[22:53] <jmlowe> how long have you been waiting to do that?
[22:54] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:57] * Wolff_John (~jwolff@ Quit (Quit: ChatZilla 0.9.90 [Firefox 20.0.1/20130409194949])
[22:58] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Quit: Leaving.)
[22:59] <sagewk> heh
[23:00] <mjblw> what causes
[23:00] <mjblw> "2013-05-03 20:59:27.519636 osd.68 1333 : [WRN] 4 slow requests, 4 included below; oldest blocked for > 3520.357452 secs"? 3500 seconds is a *long* time
[23:00] * sagewk can't wait for v release
[23:00] <sagewk> mjblw: what version?
[23:00] <mjblw> 0.56.3
[23:00] <mjblw> no stuck inactive pgs
[23:00] <sagewk> known bug fixed in 0.56.4
[23:00] <mikedawson> sagewk: I'm on next ceph version 0.60-803-gb2501e9 (b2501e91bb8f2d28dc744f61b60052dff2acbe00) with 'mon compact on trim = false' and I already have a mon that ballooned from 2.5GB to 3.3GB in less than an hour
[23:01] <sagewk> marking osd.68 down (ceph osd down 68) should unwedge it tho as a workaround
[23:01] <mikedawson> half hour that is
[23:01] <mjblw> ty
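Editor's sketch for spotting long-blocked requests like the one mjblw quotes. The regular expression is fitted to that single log line, so treating it as the general format of slow-request warnings is an assumption. The 300-second threshold and the helper names are hypothetical, and the generated command is the `ceph osd down` workaround sagewk gives.

```python
import re

# Fitted to the warning quoted in the channel; other releases may differ.
SLOW_RE = re.compile(
    r"(?P<osd>osd\.\d+) \d+ : \[WRN\] (?P<count>\d+) slow requests, "
    r"\d+ included below; oldest blocked for > (?P<secs>[0-9.]+) secs"
)


def parse_slow_request(line):
    """Return (osd name, request count, oldest block in seconds) or None."""
    match = SLOW_RE.search(line)
    if match is None:
        return None
    return match.group("osd"), int(match.group("count")), float(match.group("secs"))


def workaround(line, threshold_secs=300.0):
    """Suggest marking the OSD down if its oldest request exceeds the threshold."""
    parsed = parse_slow_request(line)
    if parsed is None or parsed[2] <= threshold_secs:
        return None
    osd_id = parsed[0].split(".", 1)[1]
    return "ceph osd down " + osd_id


line = (
    "2013-05-03 20:59:27.519636 osd.68 1333 : [WRN] 4 slow requests, "
    "4 included below; oldest blocked for > 3520.357452 secs"
)
```

Marking the OSD down is only the workaround mentioned above; the underlying bug is described as fixed in 0.56.4.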
[23:01] * Cube (~Cube@ Quit (Quit: Leaving.)
[23:01] <sagewk> mikedawson: damn. ok, thanks for checking!
[23:03] <mikedawson> sagewk: Is there any value for you or the leveldb guys in letting it grow? If not, I'll use the work around 'mon compact on trim = true'
[23:03] * smangade (~shardul@71-223-35-14.phnx.qwest.net) Quit (Ping timeout: 480 seconds)
[23:10] <mikedawson> sagewk: I just realized 'mon compact on trim = true' on a single monitor (in this case the leader) has the effect of making the other mon's (in this case the peons) leveldb shrink, too. Is it preferable to only have a single mon compact on trim, or have all do it?
[23:12] * rustam (~rustam@ has joined #ceph
[23:13] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) has joined #ceph
[23:14] * rustam (~rustam@ Quit (Remote host closed the connection)
[23:22] * TravisSoCal (~oftc-webi@rrcs-173-196-182-162.west.biz.rr.com) has joined #ceph
[23:23] <TravisSoCal> Anyone know if the ceph.com apt repos are down right now?
[23:24] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:24] * vata (~vata@2607:fad8:4:6:221:5aff:fe2a:d1dd) Quit (Quit: Leaving.)
[23:24] <TravisSoCal> I get the following when I do apt-get update with ceph http://pastebin.com/qtcBQ4d7
[23:28] <TravisSoCal> The following URL gets a 404 :-( http://ceph.com/debian/dists/precise/main/binary-amd64/Packages
[23:29] <jmlowe> I think they are working on it, something about a bad release script
[23:29] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Read error: Operation timed out)
[23:29] <TravisSoCal> Ok, cool. Wasn't sure if it was on my end.
[23:29] <TravisSoCal> Thank you jmlowe
[23:29] <jmlowe> np
[23:29] <TravisSoCal> Did you perchance hear about an ETA? Should I go do something else for a while?
[23:30] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[23:31] <jmlowe> I don't recall hearing one, they did a quick point release after 0.56.5 and it was up before that
[23:36] * Tamil (~tamil@ Quit (Quit: Leaving.)
[23:38] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[23:40] * bmjason (~bmjason@ Quit (Quit: Leaving.)
[23:41] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[23:45] * fridad (~oftc-webi@p4FC2C57C.dip0.t-ipconnect.de) Quit (Quit: Page closed)
[23:50] * Tamil (~tamil@ has joined #ceph
[23:55] * TravisSoCal (~oftc-webi@rrcs-173-196-182-162.west.biz.rr.com) Quit (Remote host closed the connection)
[23:58] <sagewk> mikedawson: if you don't mind, can you try leaving compaction off for a while and see if the growth stabilizes? it may be that the non-compacted steady-state is quite a bit higher than the compacted steady-state

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.