#ceph IRC Log

IRC Log for 2010-11-04

Timestamps are in GMT/BST.

[1:02] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[1:28] * greglap (~Adium@166.205.139.149) has joined #ceph
[1:38] * greglap (~Adium@166.205.139.149) Quit (Quit: Leaving.)
[1:49] * jantje_ (~jan@paranoid.nl) Quit (Read error: Connection reset by peer)
[1:55] * jantje (~jan@paranoid.nl) has joined #ceph
[2:08] * sjust (~sam@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[2:52] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[3:31] * jantje (~jan@paranoid.nl) Quit (Read error: Connection reset by peer)
[3:31] * jantje (~jan@paranoid.nl) has joined #ceph
[4:58] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[5:14] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[5:31] * cmccabe (~cmccabe@adsl-76-199-101-63.dsl.pltn13.sbcglobal.net) has joined #ceph
[5:42] * greglap (~Adium@cpe-76-90-74-194.socal.res.rr.com) has joined #ceph
[6:11] * terang (~me@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:32] * cmccabe (~cmccabe@adsl-76-199-101-63.dsl.pltn13.sbcglobal.net) Quit (Quit: Leaving.)
[7:23] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) has joined #ceph
[8:20] * hijacker_ (~hijacker@213.91.163.5) has joined #ceph
[8:20] * hijacker_ (~hijacker@213.91.163.5) Quit (Read error: Connection reset by peer)
[8:35] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[8:37] * yehudasa_hm (~yehuda@ppp-69-228-129-75.dsl.irvnca.pacbell.net) Quit (Ping timeout: 480 seconds)
[9:17] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[10:17] * Yoric (~David@213.144.210.93) has joined #ceph
[10:23] * allsystemsarego (~allsystem@188.27.167.113) has joined #ceph
[13:11] * jantje_ (~jan@paranoid.nl) has joined #ceph
[13:11] * jantje (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[13:28] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[13:44] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[14:14] * ghaskins_mobile (~ghaskins_@130.57.22.201) has joined #ceph
[14:41] * ghaskins_mobile (~ghaskins_@130.57.22.201) Quit (Quit: This computer has gone to sleep)
[14:57] * ghaskins_mobile (~ghaskins_@130.57.22.201) has joined #ceph
[15:21] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[15:23] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[15:26] * alexxy (~alexxy@79.173.81.171) Quit (Ping timeout: 480 seconds)
[15:45] * f4m8 is now known as f4m8_
[15:46] * failboat (~stingray@stingr.net) has joined #ceph
[15:47] <failboat> o hai.
[16:16] * ghaskins_mobile (~ghaskins_@130.57.22.201) Quit (Quit: This computer has gone to sleep)
[16:30] * ghaskins_mobile (~ghaskins_@130.57.22.201) has joined #ceph
[16:30] * yehudasa_hm (~yehuda@ppp-69-228-129-75.dsl.irvnca.pacbell.net) has joined #ceph
[16:39] * greglap (~Adium@cpe-76-90-74-194.socal.res.rr.com) Quit (Quit: Leaving.)
[16:50] * greglap (~Adium@166.205.136.122) has joined #ceph
[16:52] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[16:56] <greglap> failboat: hi
[17:04] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[17:04] * Yoric (~David@213.144.210.93) has joined #ceph
[17:15] * cmccabe (~cmccabe@adsl-76-199-101-63.dsl.pltn13.sbcglobal.net) has joined #ceph
[17:18] * Yoric_ (~David@213.144.210.93) has joined #ceph
[17:18] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[17:18] * Yoric_ is now known as Yoric
[17:18] <jantje_> sage: the backout of the commit you mentioned did not fix the bonnie++ problem
[17:25] <cmccabe> jantje: I think sage is afk now
[17:27] <jantje_> could be
[17:27] <jantje_> removing a 600MB directory is just really slow :(
[17:28] <jantje_> shouldn't it be just a metadata operation? (and clean the PG in the background?)
[17:28] <jantje_> or something like that, don't know about the specifics
[17:28] <cmccabe> how many files are in the directory?
[17:29] <greglap> file deletion is just slow right now, it's something we're aware of but haven't gotten to fixing yet :/
[17:30] <jantje_> cmccabe: maybe max 20 to 30 for each subdir
[17:31] * ghaskins_mobile (~ghaskins_@130.57.22.201) Quit (Ping timeout: 480 seconds)
[17:31] <greglap> it's the total number that matters
[17:31] <jantje_> 17230
[17:32] <greglap> unlike the OSD ops the metadata ops don't have any sort of split acknowledgement, so IIRC the basic problem is that every single delete requires an mds journal entry to go on-disk
[17:32] <cmccabe> split acknowledgement?
[17:33] <greglap> safe versus ack
[17:33] <cmccabe> k
[17:34] <greglap> and nobody ever spins off AIO threads to handle deletion because in a local FS it's pretty much instantaneous
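The "safe versus ack" distinction mentioned above can be illustrated with a toy model. This is only a sketch under my reading of the conversation: "ack" is taken to mean the operation has been applied in memory, and "safe" that it has reached disk. The classes `ToyOp`, `ToyOSD`, and `ToyMDS` are hypothetical and are not Ceph code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyOp:
    """Hypothetical operation carrying two completion callbacks."""
    name: str
    on_ack: Callable[[str], None]   # fired once the op is applied in memory
    on_safe: Callable[[str], None]  # fired once the op is durable on disk


class ToyOSD:
    """Toy model of a split acknowledgement: ack early, safe later."""

    def __init__(self) -> None:
        self.on_disk: List[str] = []
        self._pending: List[ToyOp] = []

    def submit(self, op: ToyOp) -> None:
        # The caller hears back immediately and can keep going.
        self._pending.append(op)
        op.on_ack(op.name)

    def flush(self) -> None:
        # One flush makes every pending op durable at once.
        for op in self._pending:
            self.on_disk.append(op.name)
            op.on_safe(op.name)
        self._pending.clear()


class ToyMDS:
    """Toy model of the behaviour described for deletes: no early ack,
    so each unlink waits for its own journal flush."""

    def __init__(self) -> None:
        self.journal: List[Tuple[str, str]] = []
        self.flushes = 0

    def unlink(self, path: str) -> None:
        self.journal.append(("unlink", path))
        self.flushes += 1  # one synchronous flush per delete
```

In this model, deleting 17230 files through `ToyMDS.unlink` costs 17230 flushes, whereas `ToyOSD` could acknowledge all of them and flush once.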
[17:35] <cmccabe> could be interesting to benchmark with journal turned off (is that possible for mds?)
[17:36] <greglap> hopefully it's not! then you'd have to hit random disks for every op instead of streaming
[17:37] <cmccabe> why does streaming require MDS journalling
[17:38] <cmccabe> it's probably obvious but I'm not seeing it yet
[17:39] <greglap> I mean the MDS journal is a streaming write
[17:39] <greglap> generally the disks are already where they need to be
[17:39] <greglap> without it you're doing random read/writes to random disks to adjust the inode data in-place
[17:40] <cmccabe> oh, because we later go back and apply the journal
[17:41] <greglap> yeah, when we run out of journal space we apply the tail end to the on-disk inodes
[17:41] <greglap> (well, I think it's a little smarter than just "crap, outta space", but you get the idea)
[17:41] <greglap> train's in the station, be back in 20
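The journaling idea discussed above (stream updates into a log, then apply the oldest entries to the in-place inodes when space runs low) can be sketched roughly as follows. This is a toy model, not the MDS implementation: `ToyJournaledStore`, its capacity measured in entries, and the trimming policy are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


class ToyJournaledStore:
    """Hypothetical store: sequential journal in front of in-place inode data."""

    def __init__(self, journal_capacity: int) -> None:
        self.journal_capacity = journal_capacity
        self.journal: List[Tuple[int, str]] = []  # append-only, streaming writes
        self.inodes: Dict[int, str] = {}          # in-place state, random writes
        self.in_place_writes = 0

    def log(self, inode: int, value: str) -> None:
        # Every update is a cheap sequential append.
        self.journal.append((inode, value))
        overflow = len(self.journal) - self.journal_capacity
        if overflow > 0:
            self.trim(overflow)

    def trim(self, count: int) -> None:
        # Apply the oldest `count` entries to the in-place inodes.
        oldest, self.journal = self.journal[:count], self.journal[count:]
        latest: Dict[int, str] = {}
        for inode, value in oldest:
            latest[inode] = value  # later entries win, so each inode is written once
        for inode, value in latest.items():
            self.inodes[inode] = value
            self.in_place_writes += 1
```

Without the journal, every call to `log` would turn directly into a random in-place write; with it, those writes are deferred and can be coalesced per inode.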
[17:48] * jantje_ got another core dump
[17:48] <jantje_> and .. I don't know what went wrong
[17:48] <cmccabe> which process
[17:49] <jantje_> monitor
[17:49] * greglap (~Adium@166.205.136.122) Quit (Ping timeout: 480 seconds)
[17:50] <jantje_> i'm uploading it
[17:50] * jantje_ not that handy with gdb
[17:50] <cmccabe> ok
[17:51] <jantje_> i also have another one from last week, but didn't see that one
[17:51] <jantje_> also mon
[17:53] <cmccabe> where is it uploaded to
[17:54] <jantje_> jan.sin.khk.be/ceph-core-04112010 , done in 3 min
[17:57] <jantje_> core-osd-29102010 is there in 5min
[17:58] * jantje_ got to go
[17:59] <cmccabe> so how do I look at the core?
[17:59] <cmccabe> also, what git revision were you on when you built it
[17:59] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:00] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) has joined #ceph
[18:00] <cmccabe> ok, I can access the core file
[18:00] <cmccabe> in order to have symbols I need to know what revision you were on
[18:08] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:45] <failboat> is anybody aware of grave bugs in 2.6.35.6 btrfs that will be triggered by .22.2 osd?
[18:46] <failboat> yesterday I had my filesystem turn into a pumpkin
[18:47] <cmccabe> what was the osd configuration
[18:47] <cmccabe> raw block device or directory?
[18:48] <failboat> on a freshly created cluster, writes don't go past 50 megabytes, then the osd doesn't shut down - it hangs as a zombie with parent init, and the osd fs is in use, so the only way to umount was echo b > /proc/sysrq-trigger
[18:49] <cmccabe> so kill -9 did not stop the cosd
[18:50] <failboat> the config was btrfs devs = /dev/sda3 /dev/sdb2 /dev/sdc2 /dev/sdd2
[18:50] <failboat> cmccabe: it was an obvious kernel bug, since wchan was do_exit and for all other purposes it was gone
[18:51] <failboat> I would assume some of the ioctls deadlocked in btrfs
[18:51] <failboat> anyway, I switched it to ext4 and this problem was gone
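The kind of check described above (process state plus wchan) can be done by reading `/proc` on Linux. The sketch below is only an illustration: the helper names `parse_proc_stat` and `describe_process` are hypothetical, and `/proc/<pid>/wchan` may be unavailable or unreadable depending on kernel configuration and permissions.

```python
from typing import Tuple


def parse_proc_stat(line: str) -> Tuple[int, str, str]:
    """Extract (pid, comm, state) from one /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    head, _, tail = line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]  # e.g. 'S' sleeping, 'D' uninterruptible, 'Z' zombie
    return int(pid_str), comm, state


def describe_process(pid: int) -> str:
    """Summarise state and wait channel for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        _, comm, state = parse_proc_stat(stat_file.read())
    try:
        with open(f"/proc/{pid}/wchan") as wchan_file:
            wchan = wchan_file.read().strip() or "-"
    except OSError:
        wchan = "unavailable"
    return f"{comm} (pid {pid}): state={state} wchan={wchan}"
```

A process that reports state `Z` with a wait channel of `do_exit` would match the symptom described in the conversation.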
[18:51] <failboat> another problem stays, however
[18:51] <cmccabe> there were some new ioctls introduced in cosd but I don't know the timeframe
[18:52] <failboat> when I dump a shitload of small files in there, at some point metadata server goes nuts - it shows directory but readdir fails inside directory with "permission denied"
[18:52] <cmccabe> I mean for btrfs
[18:52] <failboat> I tried to parse mds.log but failed the first time, and now I'm trying to test large files
[18:53] <cmccabe> yeah, you should ask sage if he's seen this before. He also made a bunch of MDS fixes recently, so he probably would suggest a newer version to retest with.
[18:54] <failboat> ok, I'll probably grab a fresh git
[18:54] <cmccabe> actually he's going to create an RC today
[18:54] <failboat> and redeploy it there ->
[18:54] <cmccabe> so once the RC comes out that would probably be a good starting point
[18:56] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[18:57] <sagewk> failboat: there was a bug in the clone ioctl, but it usually only triggered during recovery
[18:58] <sagewk> any chance you can try with the latest mainline? (2.6.37-rc1+) if you have problems there then we definitely have problems.
[18:58] <failboat> sagewk: you will start laughing now, but I tried to disable all btrfs-specific features in config
[18:58] <sagewk> there were also some space cache problems that josef fixed. we hit some of those (even at low utilizations)
[18:58] <failboat> namely, clones and transactions
[18:59] <failboat> anyway
[18:59] <sagewk> yeah, that would avoid the clone issue, but not the stuff josef fixed.
[18:59] <failboat> I have it stable on ext4 for now, except for this stupid metadata server issue
[18:59] <sagewk> one of those options was added specifically to work around the clone bug
[19:00] <sagewk> the permission denied thing isn't something i've seen. is it reproducible? what was the workload?
[19:00] <failboat> it may be just number of files or it may be number of hardlinks as this particular tree has enough of them
[19:01] <failboat> sagewk: I reproduced it twice. I'm going to reproduce it once more now so you could see it
[19:01] <failboat> the workload is rsync of mirrors.sgu.ru::m to ceph :)
[19:01] <failboat> (rsync -avAXHP to be precise)
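For anyone trying to reproduce the workload, the flags in `-avAXHP` break down as shown in this sketch. It only assembles the argument list and does not run rsync. The destination path `/mnt/ceph/m/` is a made-up example, and the helper `build_rsync_argv` is hypothetical. To my knowledge `-A` and `-X` arrived with the rsync 3.0 series, so older 2.6.x builds may reject them.

```python
from typing import List

# Meaning of each short option used in the conversation.
RSYNC_FLAGS = {
    "a": "archive mode (recursive; preserves permissions, times, symlinks, etc.)",
    "v": "verbose output",
    "A": "preserve ACLs",
    "X": "preserve extended attributes",
    "H": "preserve hard links",
    "P": "keep partially transferred files and show progress",
}


def build_rsync_argv(source: str, dest: str, flags: str = "avAXHP") -> List[str]:
    """Return an argv list for rsync, refusing flags not documented above."""
    unknown = [flag for flag in flags if flag not in RSYNC_FLAGS]
    if unknown:
        raise ValueError(f"undocumented flags: {unknown}")
    return ["rsync", "-" + flags, source, dest]


if __name__ == "__main__":
    # The destination is an assumed mount point, not taken from the log.
    print(" ".join(build_rsync_argv("mirrors.sgu.ru::m", "/mnt/ceph/m/")))
```

The `-H` flag is the one relevant to the hard-link hypothesis: it makes rsync recreate hard links on the destination instead of copying each link as a separate file.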
[19:06] * yehudasa_hm (~yehuda@ppp-69-228-129-75.dsl.irvnca.pacbell.net) Quit (Ping timeout: 480 seconds)
[19:11] * ghaskins_mobile (~ghaskins_@12.157.84.42) has joined #ceph
[19:32] <sagewk> i'll try it here too
[20:30] * Meths_ (rift@91.106.152.214) has joined #ceph
[20:35] * Meths__ (rift@91.106.193.61) has joined #ceph
[20:36] * Meths (rift@91.106.214.122) Quit (Ping timeout: 480 seconds)
[20:36] * Meths__ is now known as Meths
[20:40] * Meths_ (rift@91.106.152.214) Quit (Ping timeout: 480 seconds)
[20:54] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[21:07] * ghaskins_mobile (~ghaskins_@12.157.84.42) Quit (Quit: This computer has gone to sleep)
[21:08] <sagewk> failboat: btw i get 'rsync: -avAXHP: unknown option' on rsync 2.6.9.. am i running an old version?
[21:19] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) has joined #ceph
[21:33] * ghaskins_mobile (~ghaskins_@12.157.84.42) has joined #ceph
[21:42] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) Quit (Quit: julienhuang)
[21:46] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[22:00] * Meths_ (rift@91.106.149.24) has joined #ceph
[22:06] * Meths (rift@91.106.193.61) Quit (Ping timeout: 480 seconds)
[22:09] * Meths_ is now known as Meths
[22:29] * ghaskins_mobile (~ghaskins_@12.157.84.42) Quit (Quit: This computer has gone to sleep)
[22:33] <failboat> sorry
[22:34] <failboat> I was doing some stuff
[22:34] <failboat> sagewk: it's still working here, which is a surprise
[22:34] <failboat> I'm going to apply more weight to it
[22:35] <failboat> and yes, your rsync is probably ancient
[22:35] <failboat> mine is 3.0.7
[22:52] * ghaskins_mobile (~ghaskins_@12.157.84.42) has joined #ceph
[23:06] * ghaskins_mobile (~ghaskins_@12.157.84.42) Quit (Quit: This computer has gone to sleep)
[23:51] * allsystemsarego (~allsystem@188.27.167.113) Quit (Quit: Leaving)
[23:57] * terang (~me@ip-66-33-206-8.dreamhost.com) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.