#ceph IRC Log


IRC Log for 2013-02-02

Timestamps are in GMT/BST.

[15:40] -larich.oftc.net- *** Looking up your hostname...
[15:40] -larich.oftc.net- *** Checking Ident
[15:40] -larich.oftc.net- *** No Ident response
[15:40] -larich.oftc.net- *** Found your hostname
[15:40] * CephLogBot (~PircBot@rockbox.widodh.nl) has joined #ceph
[15:40] * Topic is 'v0.56.2 has been released -- http://goo.gl/WqGvE || argonaut v0.48.3 released -- http://goo.gl/80aGP || argonaut vs bobtail performance preview -- http://goo.gl/Ya8lU'
[15:41] * Set by joao!~JL@ on Thu Jan 31 18:45:54 CET 2013
[16:06] * Ryan_Lane (~Adium@2001:67c:11f0:cafe:c8df:bb93:4485:ebad) Quit (Quit: Leaving.)
[16:28] * Akendo (~akendo@cable-89-16-142-69.cust.telecolumbus.net) has joined #ceph
[16:29] * The_Bishop (~bishop@2001:470:50b6:0:7c36:bc71:ffe:418f) has joined #ceph
[16:35] * scuttlemonkey (~scuttlemo@ has joined #ceph
[16:35] * ChanServ sets mode +o scuttlemonkey
[16:36] * Akendo (~akendo@cable-89-16-142-69.cust.telecolumbus.net) Quit (Ping timeout: 480 seconds)
[16:48] <Kioob> question : since I compile my own ceph deb packages, "/etc/init.d/ceph status" doesn't show the version
[16:48] <Kioob> any idea why ?
[16:49] <Kioob> for example, I have : osd.18: running {"version":""}
[17:05] <Kioob> oh, error messages in 0.56.2 are better. "currently waiting for ondisk"
[17:05] <Kioob> I like that one
[17:06] <Kioob> so "waiting for ondisk" = change your disk ? :D
[17:07] <Kioob> and if I have that message for 4 disks in the same OSD, should I change disks, or the disk controller ? :D
[17:08] * BillK (~BillK@124-148-95-134.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:11] <Kioob> health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
[17:11] <Kioob> outch
[17:11] <Kioob> mmm
[17:15] <jmlowe> Kioob: using btrfs?
[17:16] <Kioob> no, XFS
[17:16] <Kioob> how am I supposed to fix that ?
[17:16] <Kioob> I think I know which replica is the good one
[17:17] <jmlowe> depends on how it was made inconsistent, my problem that I fixed this week after a year of tripping over it was that btrfs was truncating some of the files that comprise my objects
[17:17] <jmlowe> is the good one the primary?
[17:19] <Kioob> yes, the good one is the primary
[17:19] <jmlowe> a repair will instruct the primary to overwrite the secondary
[17:19] <Kioob> I did a "ceph pg repair 6.26e", and it worked
[17:19] <jmlowe> well there you go
[17:19] <Kioob> :)
[17:19] <Kioob> thanks
[17:20] <jmlowe> just a fellow user, glad to be of service
[17:20] <Kioob> and so, btrfs isn't ready to be used with ceph ?
[17:22] <jmlowe> from what I've seen this week it's not ready for any use, sparse files can lose data until this patch is applied https://git.kernel.org/?p=linux/kernel/git/josef/btrfs-next.git;a=commit;h=d468abec6b9fd7132d012d33573ecb8056c7c43f
[17:23] <jmlowe> I'm a little bitter, I think this bug has been causing me problems for over a year
[17:24] <jmlowe> I need to use btrfs because I have a flaky disk enclosure backplane, I need to crc all the bits coming and going to make sure it doesn't do its biweekly bit flip on me
[17:26] * loicd (~loic@2001:67c:11f0:cafe:120b:a9ff:feb7:cce0) Quit (Quit: Leaving.)
[17:26] <Kioob> outch
[17:27] <Kioob> I often have problems with btrfs, but I was thinking it was because of snapshots
[17:28] <jmlowe> yeah, everything went to hell on me, snapshots would cause deadlocks
[17:29] <jmlowe> I'm really hoping 3.8 will have a rock solid btrfs
[17:29] <Kioob> mmm, I have another inconsistent PG, but... it has 1 primary + 2 replicas. It should know which version is the good one... except if there are 3 versions :/
[17:29] <Kioob> yes, I have a looot of deadlocks with btrfs
[17:30] <jmlowe> when you do a repair the primary is always marked as the correct one, even if it is the one that is broken
[17:30] <Kioob> outch
[17:31] <jmlowe> I believe there is some talk among the devs of doing end to end crc32 on objects so you always know which copy is the real one
[17:32] <Kioob> 2013-02-02 17:22:02.891914 7f935d6aa700 0 log [ERR] : 3.3f osd.37: soid e1f1c43f/rb.0.133f.238e1f29.00000001b994/head//3 digest 1997392597 != known digest 2076752419
[17:32] <Kioob> 2013-02-02 17:24:34.236888 7f935d6aa700 0 log [ERR] : 3.3f osd.37: soid 45f1eb3f/rb.0.2d766.238e1f29.000000000dc9/head//3 digest 3781526986 != known digest 1022335059
[17:32] <Kioob> 2013-02-02 17:24:44.723208 7f935d6aa700 0 log [ERR] : 3.3f osd.37: soid 95948c3f/rb.0.133f.238e1f29.000000021959/head//3 digest 2645010232 != known digest 1034101767
[17:33] <Kioob> so, the osd.37 is the problem
[17:33] <Kioob> (like the other PG)
[17:36] <jmlowe> not necessarily, osd.37 is the secondary and it doesn't match the primary, if you have another copy and this is the only one that is complaining then I'd say odds were better that osd.37 is bad
[17:37] <jmlowe> I should mention this caveat, we are nearing the edge of my limited knowledge
[17:37] <Kioob> yes, there is another replica, and I have only complaints about osd.37
[17:37] <Kioob> so I suppose that osd.37 is the problem, like for the previous PG
[17:38] <jmlowe> that would be my guess, I can't stress enough that it's only a guess
[17:38] <Kioob> ;)
[17:39] <jmlowe> <- has trashed several ceph instances with guesswork
[17:39] <jmlowe> I figure I'm probably on my 6th cluster incarnation
[17:39] <Kioob> mm
[17:40] <Kioob> :p
[17:40] <jmlowe> over the past 18 months
[17:41] <Kioob> anyway, I did "ceph pg repair 3.3f", it showed "instructing pg 3.3f on osd.10 to repair", but... nothing happened. Maybe I need to wait for the running scrubs to finish ?
[17:41] <jmlowe> I don't know
[17:42] <Kioob> ok
[17:42] <Kioob> I retried after the end of a "deep-scrub" on the same (primary) OSD, and now I have a PG in active+clean+scrubbing+deep+inconsistent+repair state
[17:42] <jmlowe> some of the inktank guys were on earlier today, so you might get lucky and be able to talk to somebody who knows the mechanics of repairs better
[17:44] <Kioob> yes, and I hope they will help me maintain that cluster later. But before that, I try to fix problems
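Editor's sketch: the scrub errors pasted at 17:32 can be tallied per PG and OSD to see whether a single OSD accounts for every complaint, which is the check Kioob and jmlowe do by eye. The regular expression is an assumption derived from those three pasted lines only, not from a documented Ceph log format, and a high count only makes one OSD the more likely suspect, as jmlowe cautions.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the three lines pasted at 17:32.
SCRUB_ERR = re.compile(
    r"log \[ERR\] : (?P<pg>\S+) (?P<osd>osd\.\d+): soid (?P<soid>\S+) "
    r"digest (?P<digest>\d+) != known digest (?P<known>\d+)"
)

def tally_mismatches(lines):
    """Count digest-mismatch errors per (pg, osd) pair."""
    counts = Counter()
    for line in lines:
        match = SCRUB_ERR.search(line)
        if match:
            counts[(match.group("pg"), match.group("osd"))] += 1
    return counts

# The three log lines from the channel, copied verbatim.
sample = [
    "2013-02-02 17:22:02.891914 7f935d6aa700 0 log [ERR] : 3.3f osd.37: soid e1f1c43f/rb.0.133f.238e1f29.00000001b994/head//3 digest 1997392597 != known digest 2076752419",
    "2013-02-02 17:24:34.236888 7f935d6aa700 0 log [ERR] : 3.3f osd.37: soid 45f1eb3f/rb.0.2d766.238e1f29.000000000dc9/head//3 digest 3781526986 != known digest 1022335059",
    "2013-02-02 17:24:44.723208 7f935d6aa700 0 log [ERR] : 3.3f osd.37: soid 95948c3f/rb.0.133f.238e1f29.000000021959/head//3 digest 2645010232 != known digest 1034101767",
]

for (pg, osd), n in tally_mismatches(sample).items():
    print(f"pg {pg} {osd}: {n} digest mismatches")
```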
[17:54] <Kioob> is there a way to disable a complete host (8 OSD), without blocking all the cluster ?
[17:55] <Kioob> (= 20% of the data)
[18:01] <Kioob> for now, I tried to reweight those 8 OSDs in steps of 0.01
[18:04] <Kioob> but.. all the cluster is slow :/
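Editor's sketch: the gradual drain Kioob describes could be scripted roughly as follows. The log does not say which command was used, so `ceph osd reweight <id> <weight>` is an assumption here, and the OSD ids 0 to 7 are hypothetical placeholders for the 8 OSDs on the host. The function only builds command strings; it runs nothing and does not wait for the cluster to settle between steps, which a real drain would need.

```python
def reweight_steps(osd_ids, start=1.0, stop=0.0, step=0.01):
    """Yield `ceph osd reweight` command strings, lowering every OSD one step at a time."""
    osd_ids = list(osd_ids)
    n_steps = round((start - stop) / step)
    for i in range(1, n_steps + 1):
        # Clamp at `stop` so float error never produces a weight below it.
        weight = max(stop, round(start - i * step, 2))
        for osd in osd_ids:
            # In practice, pause here until recovery settles before the next step.
            yield f"ceph osd reweight {osd} {weight:.2f}"

# Hypothetical ids for the 8 OSDs on the host being drained.
commands = list(reweight_steps(range(8)))
for cmd in commands[:8]:
    print(cmd)
```

A larger `step` moves more data per round and finishes sooner, at the cost of more recovery load at once; the slowdown reported at 18:04 suggests even small steps are noticeable when the host holds about 20% of the data.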
[18:14] * loicd (~loic@2001:67c:11f0:cafe:8827:b708:d41e:a2d7) has joined #ceph
[18:14] * loicd (~loic@2001:67c:11f0:cafe:8827:b708:d41e:a2d7) Quit ()
[18:15] <Kioob> great, I have PG on same host :S
[18:24] * BillK (~BillK@124-148-252-188.dyn.iinet.net.au) has joined #ceph
[18:46] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[18:52] * ScOut3R (~scout3r@540079A1.dsl.pool.telekom.hu) has joined #ceph
[18:52] * ScOut3R (~scout3r@540079A1.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[19:03] * ScOut3R (~ScOut3R@540079A1.dsl.pool.telekom.hu) has joined #ceph
[19:03] * ScOut3R (~ScOut3R@540079A1.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[19:05] <via> i have a completely reproducible ceph osd crash that's preventing me from keeping the osd up for any real period of time
[19:05] <via> https://pastee.org/eqgg6
[19:07] * BillK (~BillK@124-148-252-188.dyn.iinet.net.au) Quit (Read error: Connection reset by peer)
[19:20] * ScOut3R (~ScOut3R@540079A1.dsl.pool.telekom.hu) has joined #ceph
[19:58] * wschulze1 (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[19:58] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Ping timeout: 480 seconds)
[19:59] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[20:02] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[20:07] * ScOut3R (~ScOut3R@540079A1.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[20:27] * wschulze (~wschulze@cpe-74-72-250-104.nyc.res.rr.com) has joined #ceph
[20:34] * wschulze1 (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Ping timeout: 480 seconds)
[20:34] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[20:38] * wschulze1 (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[20:41] * wschulze (~wschulze@cpe-74-72-250-104.nyc.res.rr.com) Quit (Ping timeout: 480 seconds)
[20:57] * The_Bishop (~bishop@2001:470:50b6:0:7c36:bc71:ffe:418f) Quit (Ping timeout: 480 seconds)
[21:03] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[21:05] * The_Bishop (~bishop@2001:470:50b6:0:3d48:4c8b:fa51:d56e) has joined #ceph
[21:23] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[21:27] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.1/20130116073211])
[21:42] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:51] * scuttlemonkey (~scuttlemo@ Quit (Quit: This computer has gone to sleep)
[21:53] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[21:58] * allsystemsarego_ (~allsystem@5-12-241-55.residential.rdsnet.ro) Quit (Quit: Leaving)
[22:10] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[22:18] * mdrnstm (~mdrnstm@238.sub-70-197-133.myvzw.com) has joined #ceph
[22:37] * mdrnstm (~mdrnstm@0001a920.user.oftc.net) has left #ceph
[22:52] <TMM> via, what do you do to trigger that crash?
[22:53] <via> nothing
[22:53] <via> i mean, there might be some misc cephfs activity
[22:54] <via> nothing more than a couple of reads
[22:54] <TMM> is it always the same number of reads of the same file?
[22:54] <via> but now it happens whenever i start the osd
[22:55] <via> and no
[22:55] <TMM> immediately?
[22:55] <via> its nothing specific
[22:55] <via> the first two times it took about 5-10 minutes
[22:55] <via> i'm starting it again now
[22:58] <via> it hasn't died yet
[23:00] <via> maybe it stopped being reproducible.. i'll let you know if it does
[23:01] <via> fwiw all that is happening is a scrub because load is low
[23:03] <TMM> via, maybe you can describe your problem and put that error log you pasted in here : http://tracker.ceph.com/projects/ceph
[23:03] * loicd (~loic@stat-217-145-38-50.xdsl.toledo.be) has joined #ceph
[23:07] <via> done, although i don't have a lot of information since now it doesn't seem to be happening
[23:08] <via> there are a couple of kernel messages in the ring about btrfs checksum failures, maybe it's related
[23:08] <via> oh wait, it just died
[23:08] <TMM> via, you may want to put the btrfs failures in the ticket as well
[23:10] <via> done
[23:14] <TMM> thank you, I can't really help you with this issue , but now I'm sure someone who can will see it :)
[23:16] <via> yeah
[23:16] <via> thankfully ceph is still working even with the down osd
[23:17] * The_Bishop (~bishop@2001:470:50b6:0:3d48:4c8b:fa51:d56e) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[23:21] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[23:31] * Matt (matt@matt.netop.oftc.net) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.