#ceph IRC Log

IRC Log for 2010-11-22

Timestamps are in GMT/BST.

[1:10] <jantje> and Hi again!
[1:10] <jantje> :)
[1:11] <sage1> hi
[1:11] * sage1 is now known as sage
[1:17] <jantje> i'm about to go to sleep, was my report of the journal crash of any use?
[1:18] <jantje> I think you already opened a bug for that one (reported by someone else, but it could be the same)
[1:19] <sage> which report was it?
[1:28] <jantje> email to you with 'journal reset'
[1:28] <sage> oh right. sorry, lost in the shuffle
[1:28] <sage> gotta run. will take a look!
[1:28] <jantje> http://tracker.newdream.net/issues/531 <= could be the same
[1:28] <jantje> no problem
[1:29] <jantje> bye!
[1:30] <jantje> sage: i'm going to sleep, let me know if you think it's the same one, and i'll test the fix
[1:31] <jantje> sage: and #549 is fixed and can be closed
[2:23] * m3thos (~mindblast@bl7-118-207.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[2:31] * m3thos (~mindblast@bl7-118-207.dsl.telepac.pt) has joined #ceph
[3:27] * m3thos (~mindblast@bl7-118-207.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[3:52] * lidongyang (~lidongyan@222.126.194.154) Quit (Read error: Connection reset by peer)
[3:52] * lidongyang (~lidongyan@222.126.194.154) has joined #ceph
[4:06] * lidongyang (~lidongyan@222.126.194.154) Quit (Read error: Operation timed out)
[4:14] * lidongyang_ (~lidongyan@222.126.194.154) has joined #ceph
[4:32] * lidongyang_ (~lidongyan@222.126.194.154) Quit (Remote host closed the connection)
[4:40] * lidongyang_ (~lidongyan@222.126.194.154) has joined #ceph
[4:45] * lidongyang_ (~lidongyan@222.126.194.154) Quit (Remote host closed the connection)
[5:02] * Jiaju (~jjzhang@222.126.194.154) Quit (Ping timeout: 480 seconds)
[7:08] * Jiaju (~jjzhang@222.126.194.154) has joined #ceph
[7:09] * Ifur_ (~osm@big.aksis.uib.no) has joined #ceph
[7:10] * Ifur (~osm@big.aksis.uib.no) Quit (Remote host closed the connection)
[7:41] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has joined #ceph
[7:41] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has left #ceph
[7:45] * bbigras (quasselcor@bas11-montreal02-1128536101.dsl.bell.ca) has joined #ceph
[7:49] * Guest1216 (quasselcor@bas11-montreal02-1128536101.dsl.bell.ca) Quit (Ping timeout: 480 seconds)
[7:56] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has joined #ceph
[7:56] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has left #ceph
[8:05] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has joined #ceph
[8:05] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has left #ceph
[8:12] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has joined #ceph
[8:12] * CUTEESGIRL (~c3_LuVLy@189.44.160.146) has left #ceph
[8:24] * [[KORCH-away]] (~Female-De@194.116.32.115) has joined #ceph
[8:24] * [[KORCH-away]] (~Female-De@194.116.32.115) has left #ceph
[8:25] * cclien (~cclien@60-250-103-120.HINET-IP.hinet.net) Quit (Quit: leaving)
[8:25] * [[KORCH-away]] (~Female-De@194.116.32.115) has joined #ceph
[8:25] * [[KORCH-away]] (~Female-De@194.116.32.115) has left #ceph
[8:25] * cclien (~cclien@ec2-175-41-146-71.ap-southeast-1.compute.amazonaws.com) has joined #ceph
[8:40] * laycoz (~HITLER@196.214.92.114) has joined #ceph
[8:40] * laycoz (~HITLER@196.214.92.114) has left #ceph
[8:45] * laycoz (~HITLER@196.214.92.114) has joined #ceph
[8:45] * laycoz (~HITLER@196.214.92.114) has left #ceph
[9:01] * bbigras is now known as Guest0
[9:11] * allsystemsarego (~allsystem@188.26.32.15) has joined #ceph
[10:07] * Yoric (~David@213.144.210.93) has joined #ceph
[13:09] * Yoric_ (~David@213.144.210.93) has joined #ceph
[13:09] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[13:09] * Yoric_ is now known as Yoric
[14:11] * FeliXdk (~felix@217.195.176.49) has joined #ceph
[16:01] * allsystemsarego (~allsystem@188.26.32.15) Quit (Quit: Leaving)
[16:33] * greglap (~Adium@cpe-76-90-74-194.socal.res.rr.com) Quit (Quit: Leaving.)
[16:52] * greglap (~Adium@166.205.139.43) has joined #ceph
[17:00] * greglap1 (~Adium@166.205.138.106) has joined #ceph
[17:02] * greglap (~Adium@166.205.139.43) Quit (Ping timeout: 480 seconds)
[17:36] * fred_ (~fred@80-219-183-100.dclient.hispeed.ch) has joined #ceph
[17:37] <fred_> hi, I'm around for 45 minutes if you need something about #590
[17:54] <sagewk> fred_: i think we're good; should have the fix pushed today
[17:59] <fred_> sweet, are you going to update the rc branch accordingly?
[17:59] <sagewk> unstable
[17:59] <sagewk> yeah
[18:00] <fred_> you mean both ?
[18:01] <sagewk> i haven't been updating the rc branch until a new release is imminent and there's stuff in unstable targeted toward that. testing is the bugfix branch for the previous release.
[18:03] <fred_> yes, sorry I meant the testing branch, so I will try to test that again tomorrow
[18:04] <sagewk> k
[18:05] <fred_> have a nice day
[18:05] * fred_ (~fred@80-219-183-100.dclient.hispeed.ch) Quit (Quit: Leaving)
[18:46] <wido> sagewk: I've checked my cluster, the whole weekend it stayed at "120231/1410393 degraded (8.525%)" with 10 of the 12 OSD's up. There are no hanging btrfs nor cosd processes, everything seems idle
[18:46] * greglap1 (~Adium@166.205.138.106) Quit (Read error: Connection reset by peer)
[18:48] <sagewk> there was a regression in recovery last week, not sure if you got the fix.
[18:48] <sagewk> or if it's pushed yet. let me look.
[18:48] <sagewk> do you have 8566c5cd7195cdafc19232c5daf7b090d535b1d0 ?
[18:48] <sagewk> it's #585
[18:50] <wido> yes, I'm at the unstable of 20-11-2010, or 11-20-2010 ;)
[18:50] <wido> Last saturday
[18:50] * cmccabe (~cmccabe@dsl081-243-128.sfo1.dsl.speakeasy.net) has joined #ceph
[18:50] <wido> cmccabe: helped me with #585, and I'm running that fix
[18:50] <sagewk> hmm ok. need to check your logs.
[18:51] <wido> The only thing that seems to happen is the scrub and the OSD's talking with each other
[18:51] <sagewk> we do, i mean :)
[18:51] <sagewk> k
[18:51] <wido> debug osd is at 20
[18:51] <sagewk> perfect
[18:51] <cmccabe> hi all
[18:52] <cmccabe> wido: if you run unstable, you can get the fix for #585 with just git pull
[18:52] <cmccabe> wido: I made the change manually on your source tree but you can just pull and get the same thing in a more "official" way.
[18:53] <sagewk> cmccabe: looks like he has it, but recovery is still stalled
[18:53] <wido> cmccabe: Yes, tnx! I have a build script that pulls the latest unstable every morning and builds new packages
[18:54] <cmccabe> hmm. I think I turned up osd logging to 20 last time I was on wido's cluster
[18:54] <cmccabe> unless ceph.conf was overwritten later
[18:56] <wido> No, it wasn't :)
[18:58] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[19:04] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:13] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:13] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)
[19:25] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[19:56] <cmccabe> wido: I just pushed a recovery fix that might have some bearing on what you're seeing
[19:56] <cmccabe> wido: can you pull unstable and restart the cluster?
[19:57] <sagewk> wido: 0856f57e2593 was also probably contributing to your problem; pulling will get you that one too
[19:58] <cmccabe> yeah
[20:01] * sig (~sig@c-67-169-42-1.hsd1.ca.comcast.net) has joined #ceph
[20:01] <sig> Hello!
[20:03] <cmccabe> hi
[20:06] <wido> cmccabe & sagewk: I'll try!
[20:07] <gregaf> anything we can help you with, sig? :)
[20:08] <sig> Oh just saying hello and fly on the wall for a bit.
[20:09] <wido> well, there are enough bugs here ;)
[20:09] <gregaf> cool
[20:15] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[20:27] <wido> sagewk: Fix seems to work partially, degraded state went from 8.75% to 1.007% ( 14202/1410393 ), doesn't seem to move any further right now
[20:28] <cmccabe> wido: where do the logs go on this cluster?
[20:29] <cmccabe> I mean, er, I'm guessing they go to the node called "logger", but where do they end up
[20:29] <wido> oh cmccabe , they are still on the nodes locally, waiting for the syslog feature to be working
[20:30] <cmccabe> i see
[20:30] <wido> the logger node is now the place where I dump my core dumps
[20:35] <wido> I'm going afk for today, if I need to test some commits, just let me know here, I'll read back the IRC logs
[20:35] <wido> tnx again!
[20:35] <cmccabe> can I restart stuff?
[20:37] <wido> cmccabe: Yes sure, no problem
[20:37] <cmccabe> thanks
[20:37] <wido> ttyl
[20:55] * michael-ndn (~michael-n@12.248.40.138) has joined #ceph
[21:05] * alexxy[home] (~alexxy@79.173.81.171) Quit (Remote host closed the connection)
[21:16] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[21:30] * Meths_ (rift@91.106.195.140) has joined #ceph
[21:36] * Meths (rift@91.106.160.1) Quit (Ping timeout: 480 seconds)
[21:37] * Meths_ is now known as Meths
[21:53] * eternaleye_ (~eternaley@195.215.30.181) has joined #ceph
[21:59] * eternaleye (~eternaley@195.215.30.181) Quit (Ping timeout: 480 seconds)
[22:17] * jantje_ (~jan@paranoid.nl) has joined #ceph
[22:24] * jantje (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[23:24] * shdb (~shdb@217-162-231-62.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.