#ceph IRC Log


IRC Log for 2011-01-18

Timestamps are in GMT/BST.

[3:30] * bchrisman (~Adium@70-35-37-146.static.wiline.com) Quit (Quit: Leaving.)
[3:36] * MarkN1 (~nathan@ has joined #ceph
[3:36] * MarkN (~nathan@ Quit (Read error: Connection reset by peer)
[3:44] * MarkN1 (~nathan@ Quit (Ping timeout: 480 seconds)
[3:47] * MarkN (~nathan@ has joined #ceph
[3:53] * MarkN (~nathan@ Quit (Quit: Leaving.)
[3:57] * MarkN (~nathan@ has joined #ceph
[4:19] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) has joined #ceph
[4:53] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[5:31] * votz (~votz@dhcp0020.grt.resnet.group.upenn.edu) Quit (Quit: Leaving)
[6:10] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) has joined #ceph
[6:45] * ijuz__ (~ijuz@p4FFF7FE4.dip.t-dialin.net) Quit (Ping timeout: 480 seconds)
[6:54] * ijuz__ (~ijuz@p4FFF607D.dip.t-dialin.net) has joined #ceph
[7:01] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit (Quit: Leaving.)
[7:08] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[7:54] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[8:33] * MichielM (~michiel@mike.dwaas.org) Quit (Remote host closed the connection)
[9:14] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[9:38] * allsystemsarego (~allsystem@ has joined #ceph
[9:46] * Meths_ (rift@ has joined #ceph
[9:51] * Meths (rift@ Quit (Ping timeout: 480 seconds)
[10:16] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) has joined #ceph
[10:42] * Yoric (~David@ has joined #ceph
[13:28] <stingray> ugh
[15:08] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[15:12] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit ()
[15:16] * verwilst (~verwilst@router.begen1.office.netnoc.eu) has joined #ceph
[15:22] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[15:56] * Meths_ is now known as Meths
[16:31] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) Quit (Quit: Leaving.)
[16:34] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[16:50] * greglap (~Adium@ has joined #ceph
[17:34] * greglap (~Adium@ Quit (Quit: Leaving.)
[17:38] * verwilst (~verwilst@router.begen1.office.netnoc.eu) Quit (Quit: Ex-Chat)
[17:44] * Yoric_ (~David@ has joined #ceph
[17:44] * Yoric (~David@ Quit (Read error: Connection reset by peer)
[17:44] * Yoric_ is now known as Yoric
[17:56] <stingray> huh
[17:57] <stingray> gregaf: did you get anything useful out of my journal?
[17:57] <gregaf> just started looking at it this morning :)
[17:57] <stingray> ah
[17:57] <stingray> ok
[17:57] <gregaf> I'm not but this is the first time I've tried and I'm about to point sagewk at it ;)
[18:04] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:12] * stingray is building 2.6.37 with btrfs fixes
[18:36] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:37] * cmccabe (~cmccabe@ has joined #ceph
[18:37] * Tv|work (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:44] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[18:45] * panitaliemom (87f50a05@ircip3.mibbit.com) has joined #ceph
[18:47] <panitaliemom> hi, i have a problem with mds where mds seems stuck at "up:creating" on boot. can someone tell me what's supposed to happen between "up:creating" and "up:active"?
[18:48] <panitaliemom> just trying to figure out what prevents it from becoming "active"
[18:51] <sagewk> stingray: did you see any osd crashes around the same time?
[18:52] <panitaliemom> no all osds are up fine
[18:53] <panitaliemom> that's what "ceph -w" is telling me
[18:53] <gregaf> panitaliemom: sagewk was asking stingray about a separate issue
[18:54] <panitaliemom> oops i'll shut up
[18:54] <gregaf> heh, don't do that
[18:54] <gregaf> was that your email about "can't read superblock"?
[18:55] <panitaliemom> no that's not mine, but his problem seems similar to mine
[18:55] <gregaf> ah, k
[18:56] <gregaf> what's the output you get of ceph -w?
[18:59] <gregaf> panitaliemom: also, what version are you running?
[19:00] <panitaliemom> http://pastebin.com/eiJsfjH6
[19:00] <panitaliemom> what does "blacklisted MDSes" mean?
[19:00] <gregaf> ah, that's interesting
[19:01] <gregaf> it's an MDS that the monitors have deemed dead, so the OSDs refuse to talk to it
[19:01] <panitaliemom> it's running 0.21.3
[19:01] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[19:02] <panitaliemom> monitor thinks it's dead because it's stuck at creating?
[19:02] <gregaf> more probably it's stuck at creating because the monitor thinks it's dead
[19:02] <panitaliemom> oh
[19:03] <gregaf> I notice you've got an OSD flapping there too
[19:03] <panitaliemom> yeah..
[19:03] <gregaf> did you just try to set it up today?
[19:03] <panitaliemom> do you know why monitor think mds is dead?
[19:04] <gregaf> or has this cluster been working previously?
[19:04] <gregaf> nope, don't know why, trying to figure out the possibilities
[19:04] <panitaliemom> i set it up some time ago, and it worked okay at that time, but since yesterday this problem started to happen..
[19:05] <gregaf> huh
[19:06] <panitaliemom> does monitor have to receive some kinda heart beat to detect mds is alive?
[19:07] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:08] <gregaf> yeah
[19:08] <panitaliemom> and mds still sends out heart beat while it's stuck at creating?
[19:09] <gregaf> it gets blacklisted if it goes too long without the monitor receiving a beacon from the MDS
[19:09] <panitaliemom> k
[19:10] <gregaf> but it shouldn't do that unless it can find a replacement MDS
[19:10] <panitaliemom> just wondering if causal relationship is like: stuck at creating -> no beacon out -> blacklisted
[19:10] <gregaf> your standby ought to have taken over and started up
[19:10] <panitaliemom> but i only have one mds
[19:12] <gregaf> hmmm, maybe I'm getting bitten by our odd output style
[19:12] <gregaf> but your MDS shouldn't get blacklisted unless the monitor thinks it's got a replacement
[19:13] <gregaf> or you explicitly tell it to
[19:13] <panitaliemom> oh hmm
[19:13] <panitaliemom> thats odd. what's the way to tell it to?
[19:14] <gregaf> ummm
[19:15] <gregaf> ./ceph mds tell 0 fail
[19:15] <gregaf> I think
[19:15] <panitaliemom> k
[19:17] <stingray> sagewk: I don't remember, sorry. It was 2 weeks ago.
[19:17] <gregaf> stingray: what was the patch you applied to your tree, again?
[19:17] <panitaliemom> perhaps osd flapping may cause mds to get stuck... i should check the vms running osds
[19:17] <stingray> sagewk: I was adding new osds but the timing is unclear. at any time I wasn't adding more than one osd, though
[19:18] <stingray> gregaf: http://stingr.net/d/stuff/what-i-added.diff
[19:18] * Yoric (~David@ Quit (Quit: Yoric)
[19:19] <gregaf> panitaliemom: I'm not quite sure how this situation could be happening, but an issue with the MDS talking to the OSDs is the most likely
[19:19] <gregaf> and yes, if it gets stuck on creating I think it could get marked laggy, at least
[19:19] <gregaf> panitaliemom: so one thing you can do is attempt to run a RADOS bench and see if all the OSDs are responding
[19:20] <panitaliemom> sorry but how do i run a rados bench?
[19:20] <gregaf> try it from the MDS node to make sure the connection works from there and not just your monitor
[19:21] <gregaf> you should have an executable called "rados"
[19:21] <stingray> http://koji.fedoraproject.org/koji/taskinfo?taskID=2729175 yeah
[19:21] <gregaf> so on your MDS node run "rados -p metadata bench 10 write"
[19:22] <panitaliemom> ok, i'll try that
[19:22] <panitaliemom> thanks for your help, greg
[19:22] <gregaf> np :)
[19:23] <gregaf> also, when did this start and have you tried just restarting your MDS?
[19:23] <gregaf> once it gets blacklisted it stays that way for 24 hours, but a simple restart should let it connect again
[19:24] <panitaliemom> so /etc/init.d/ceph -a restart should get it unblacklisted. right?
[19:24] <gregaf> well, that's overkill since it restarts all the nodes, but yet
[19:24] <gregaf> *but yes
[19:25] <panitaliemom> k
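
Editor's note: the beacon and blacklist behaviour gregaf describes above can be sketched as a toy model. This is only an illustration of the rules stated in the conversation (a missed beacon marks the MDS laggy; blacklisting happens only when a standby exists or an operator forces it; a blacklist entry lasts 24 hours). The class name, method names, and the 15-second grace period are hypothetical and are not taken from the Ceph source.

```python
from dataclasses import dataclass, field

BEACON_GRACE = 15.0               # hypothetical grace period, in seconds
BLACKLIST_DURATION = 24 * 3600    # 24 hours, as stated in the conversation


@dataclass
class ToyMonitor:
    """Toy model of a monitor tracking MDS beacons (not Ceph's real code)."""
    last_beacon: dict = field(default_factory=dict)  # mds name -> last beacon time
    standbys: set = field(default_factory=set)       # names of standby MDSes
    blacklist: dict = field(default_factory=dict)    # mds name -> expiry time

    def beacon(self, name, now):
        # Record that a beacon from this MDS arrived at time `now`.
        self.last_beacon[name] = now

    def check(self, name, now, forced=False):
        # An MDS is laggy once its last beacon is older than the grace period.
        laggy = now - self.last_beacon.get(name, now) > BEACON_GRACE
        # Blacklist only if a replacement exists or the operator asked for it.
        if forced or (laggy and self.standbys):
            self.blacklist[name] = now + BLACKLIST_DURATION
            return "blacklisted"
        return "laggy" if laggy else "ok"

    def is_blacklisted(self, name, now):
        return self.blacklist.get(name, 0) > now
```

With a single MDS and no standby, `check` in this model only ever reports "laggy", which is why gregaf finds the blacklisting in panitaliemom's cluster surprising.
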
[19:32] <sjust> anyone using cosd?
[19:33] <sagewk> would be cmccabe or dallas, if anyone
[19:33] <sjust> ok
[19:37] * panitaliemom (87f50a05@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[19:42] <cmccabe> sjust: not using right now. Dallas isn't in yet, so I assume he's not.
[20:10] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[20:51] * fzylogic (~fzylogic@ has joined #ceph
[20:53] <fzylogic> the playground appears to be hung up
[20:54] <fzylogic> nasty btrfs error on ballpit0
[21:07] <cmccabe> I seem to see a bug when I mark an MDS that is still up as failed with 'ceph mds fail <mds-name>'
[21:28] <gregaf> cmccabe: I think you need to call ./ceph mds fail mds.a
[21:28] <gregaf> at least, maybe?
[21:28] <cmccabe> http://tracker.newdream.net/issues/720
[21:28] <cmccabe> gregaf: if that were the case, I'd expect an error message from cephtool
[21:29] <gregaf> okay, just my first thought looking at your bug
[21:29] <cmccabe> gregaf: yeah, understand
[21:29] <cmccabe> gregaf: initial description was vague
[21:36] <sagewk> fzylogic: rebooting ballpit0. that's a known btrfs bug :(
[21:36] <fzylogic> ok, cool.
[22:41] <fzylogic> still hung, looks like timeouts on every osd now
[22:41] <sagewk> ugh yeah
[22:44] <sagewk> huh network is weird?
[22:49] <fzylogic> there was a brief outage around 11am, but I'm not aware of any existing issues
[23:44] <bchrisman> I have 4-drive nodes and have previously been getting things running by grouping them under md. But putting that in raid-1 wastes a lot of space, putting them in raid-0 means drive failure=node failure, and raid-5 means a performance penalty… I could also put four OSDs on each node, but I saw a recommendation against doing that (I'm guessing there are some buffers that should be shared?). Is there a way to have a single osd deal with fou
[23:45] <bchrisman> Or is osd-per-drive worth looking at?
[23:45] <cmccabe> bchrisman: even if a single cosd could deal with 4 drives, you would still have drive failure = node failure
[23:46] <sagewk> bchrisman: i would do cosd per drive.
[23:46] <sagewk> and structure your crush map appropriately (so that replicas span nodes, not drives)
[23:46] <bchrisman> cmccabe: drive failure takes cosd offline then?
[23:46] <bchrisman> sagewk: ok.. will test on that.
[23:47] <yehudasa> wido: are you there?
[23:47] <cmccabe> bchrisman: if the drive it depends on doesn't work, cosd isn't going to work
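
Editor's note: sagewk's advice (one cosd per drive, with replicas spanning nodes rather than drives) can be illustrated with a small placement sketch. This is not the CRUSH algorithm and not a CRUSH map; it is a simplified hash-based stand-in that only shows the constraint "each replica lands on a different node". The function name `place` and the host and OSD names in the usage example are hypothetical.

```python
import hashlib


def place(obj, osds_by_host, replicas=2):
    """Pick one OSD on each of `replicas` distinct hosts for object `obj`.

    Simplified stand-in for CRUSH: hosts and OSDs are ranked by a
    deterministic hash of the object name, so placement is repeatable.
    """
    def score(*parts):
        return hashlib.sha256("/".join(parts).encode()).hexdigest()

    # Choose distinct hosts first, so no two replicas share a node.
    hosts = sorted(osds_by_host, key=lambda h: score(obj, h))[:replicas]
    # Then choose a single OSD (one drive) within each chosen host.
    return [(h, min(osds_by_host[h], key=lambda o: score(obj, h, o)))
            for h in hosts]


# Three 4-drive nodes, one OSD per drive (names are made up).
cluster = {
    "node0": ["osd0", "osd1", "osd2", "osd3"],
    "node1": ["osd4", "osd5", "osd6", "osd7"],
    "node2": ["osd8", "osd9", "osd10", "osd11"],
}
placement = place("some-object", cluster, replicas=2)
```

Because hosts are selected before drives, losing one drive, or even a whole node, leaves the other replica intact in this model, which is the property the suggested crush map layout is meant to give.
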
[23:51] * verwilst (~verwilst@dD576FAAE.access.telenet.be) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.