#ceph IRC Log

IRC Log for 2011-03-20

Timestamps are in GMT/BST.

[1:12] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[2:38] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) has joined #ceph
[4:17] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) Quit (Quit: Leaving.)
[4:20] <lxo> I thought I read ceph could work on top of ext4, but cosd --mkfs says it can't create the filesystem (0.25.1). bug or misunderstanding?
[4:26] <bchrisman> it'll auto-setup btrfs if you tell it to… or you can pre-setup any other filesystem.. depending on what options are in the osd section of your configuration file.
[4:31] <lxo> by “pre-setup” I figured it sufficed to create an empty directory with the given name, and leave the device option alone. I also set up a journal. and then cosd -i # --mkfs --mkjournal failed
[4:32] <lxo> it created the journal file, the fsid, the current directory, and then errored out
[4:33] <lxo> hmm, it looks like the error was right after it failed to [sg]etxattr(".../fsid", "user.test", ...) (errno = -EOPNOTSUPP). odd... ext4 is supposed to support xattrs, no?
[4:34] <lxo> I guess that's just nature's way of telling me not to chicken out of the btrfs problems I'm running into ;-)
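
Editor's note: the failure lxo describes is a set/get of a "user.test" extended attribute returning EOPNOTSUPP. The same kind of probe can be run by hand before trying cosd --mkfs. The sketch below is a minimal illustration and not the check cosd itself performs; the function name probe_user_xattr is hypothetical, and os.setxattr/os.getxattr are available only on Linux. One possible cause, assumed here and not stated in the log, is an ext4 filesystem mounted without the user_xattr option.

```python
import errno
import os
import tempfile


def probe_user_xattr(directory):
    """Return None if user.* xattrs work in `directory`, else the errno seen.

    Hypothetical helper: it mimics the set/get of "user.test" mentioned
    in the log, but it is not taken from the Ceph source.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.setxattr(path, "user.test", b"1")   # write a small test attribute
        os.getxattr(path, "user.test")         # read it back
        return None
    except OSError as exc:
        return exc.errno                       # e.g. EOPNOTSUPP when unsupported
    finally:
        os.unlink(path)                        # always remove the scratch file


result = probe_user_xattr(tempfile.gettempdir())
if result is None:
    print("user xattrs supported")
else:
    print("user xattrs failed:", errno.errorcode.get(result, result))
```

Pointing the probe at the intended OSD data directory, instead of the temporary directory used here, shows whether that particular mount accepts user.* attributes.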
[4:35] <bchrisman> heh...
[4:36] <bchrisman> btrfs works for it basically in my exp with very recent kernels/fsprogs at least.
[4:37] <lxo> I'm running 2.6.38, but I had to disable journal to avoid freezes, and even then, it still hangs every now and then, or bugs at vmtruncate
[4:37] <bchrisman> yeah.. something I haven't run into I guess… different workloads maybe.
[4:37] <lxo> that said, I suspect a compiler bug for the latter, 'cause it doesn't make sense. the code path within btrfs that leads to that BUG_ON requires the caller to return non-zero after another BUG_ON, which I don't see
[4:38] <lxo> I'm installing kernel-debuginfo and kernel-debug to investigate
[4:39] <lxo> shrug. I had destroyed a filesystem onto which I'd already loaded 260GB to start over with ext4. it helped me decide that, after the last freeze, the two mons wouldn't sync any more (and the machine with the 3rd mon is at the repair shop :-(
[4:40] <lxo> besides, my wife was getting annoyed with the frequent freezes on her own machine :-(
[4:42] <lxo> been trying to rsync 1TB of data onto ceph for weeks now. 0.25.1 is much better, but btrfs isn't helping
[4:45] <lxo> I wonder if the btrfs problems I'm experiencing have to do with running it on X6 machines
[4:46] <lxo> the days of trivial SMP bugs in the kernel are pretty much gone, but maybe 6 is an odd (oddly :-) number of cores, or something
[5:20] <bchrisman> heh…
[5:20] <bchrisman> home ceph cluster huh? :)
[5:43] * Dantman (~dantman@S0106001eec4a8147.vs.shawcable.net) Quit (Quit: http://daniel.friesen.name or ELSE!)
[5:46] <lxo> yeah
[5:46] <lxo> kind of tired of keeping everything in raid 1 on all machines, or of thinking “I should have more copies of this”, so I figured I'd dump everything into ceph and let it take care of it for me ;-)
[5:48] <lxo> still wondering whether ceph is going to be a good “transport” for distributed builds, but that's not much of a priority
[5:49] <lxo> of course, now that I threatened the kernel with replacement with a -debug version, it won't misbehave any more ;-)
[5:50] <lxo> I think I never had it work for 7 hours straight, without any hiccups!
[6:02] <lxo> oh, good, not a compiler bug, just a patch installed on top of the pristine source tree I was looking at, that offset another BUG_ON onto the one that couldn't possibly hit
[6:03] * Dantman (~dantman@S0106001eec4a8147.vs.shawcable.net) has joined #ceph
[6:32] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) Quit (Read error: Operation timed out)
[6:33] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:33] * sjust (~sam@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:34] * yehuda_wk (~quassel@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:34] <bchrisman> distributed builds would be last quantities of metadata operations not too much throughput… seems reasonable.. and ceph will continue to progress in that area as they reintegrate distributed mds stuff.
[6:35] <bchrisman> (last -> large)
[6:39] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:40] * yehudasa (~quassel@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:41] <lxo> ccache+distcceph is going to be kind of cool ;-)
[6:42] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:42] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) has joined #ceph
[8:21] * allsystemsarego (~allsystem@188.25.130.175) has joined #ceph
[8:40] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[10:21] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[10:54] <lxo> first hiccup (btrfs issue with vmtruncate) after 11 hours! 50GB transferred, ~950GB to go
[10:55] <lxo> just *seconds* before I installed some systemtap probes to help tell what the btrfs problem was :-(
[11:39] * eternaleye_ (~eternaley@195.215.30.181) has joined #ceph
[11:39] * eternaleye (~eternaley@195.215.30.181) Quit (Read error: Connection reset by peer)
[12:15] * Meths_ (rift@91.106.128.216) has joined #ceph
[12:20] * Meths (rift@91.106.251.162) Quit (Ping timeout: 480 seconds)
[13:12] * Meths (rift@91.106.167.9) has joined #ceph
[13:18] * Meths_ (rift@91.106.128.216) Quit (Ping timeout: 480 seconds)
[15:46] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[15:55] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[16:40] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[18:03] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[18:12] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[18:19] <lightspeed> I have a question relating to Ceph and multihomed systems...
[18:19] <lightspeed> if I have some Ceph storage hosts (osd/mds/mon) which are interconnected by a very fast network (in this case, it's IPoIB)
[18:20] <lightspeed> but the Ceph clients are on a separate, standard GigE network, to which the storage hosts are also connected
[18:20] <lightspeed> and there is no routing between the IPoIB and Ethernet networks (so the clients cannot reach the IPs used by the storage hosts on their private IPoIB network)
[18:21] <lightspeed> is there a way to ensure that the storage hosts will use the very fast private network for all Ceph related traffic that they're sending between each other, whilst still letting the clients access them via the other network?
[18:26] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[18:41] <bchrisman> lightspeed: yeah… I remember asking this and the answer was yes, in the latest source common/config.cc, the options are 'public addr' and 'cluster addr'
[18:41] <bchrisman> lightspeed: those are the options that I remember being referred to when I asked this question previously.. yes.. the osd communications for rebuilds etc can be on a backplane private network.
[18:43] <lightspeed> great, I'll look into that
[18:43] <lightspeed> thanks for the response!
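
Editor's note: bchrisman names two options, 'public addr' and 'cluster addr', for splitting client traffic from traffic between storage hosts. The sketch below renders what a per-OSD section using them might look like. It is an illustration under assumptions: the section name osd.0, both IP addresses, and the helper render_osd_section are hypothetical, and the exact syntax should be checked against common/config.cc for the version in use.

```python
import configparser
import io


def render_osd_section(osd_id, public_ip, cluster_ip):
    """Render a ceph.conf-style section for one OSD with split networks."""
    cfg = configparser.ConfigParser()
    section = f"osd.{osd_id}"
    cfg[section] = {
        "public addr": public_ip,    # address clients can reach (GigE side)
        "cluster addr": cluster_ip,  # address used between storage hosts (IPoIB side)
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Hypothetical addresses, chosen only for illustration.
print(render_osd_section(0, "192.0.2.10", "10.0.0.10"))
```

Each storage host would get its own pair of addresses, so that clients only ever see the Ethernet side while replication and rebuild traffic stays on the private network.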
[18:50] * lxo (~aoliva@201.82.54.5) Quit (Read error: Connection reset by peer)
[18:50] * lxo (~aoliva@201.82.54.5) has joined #ceph
[19:32] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[19:35] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[19:56] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[20:02] * Juul (~Juul@c-24-130-50-84.hsd1.ca.comcast.net) has joined #ceph
[20:36] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Quit: Ex-Chat)
[20:40] * Juul (~Juul@c-24-130-50-84.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[20:44] * sakib (~sakib@enjoyable.benefit.volia.net) has joined #ceph
[21:41] * sakib (~sakib@enjoyable.benefit.volia.net) has left #ceph
[23:00] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[23:03] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit ()
[23:46] * allsystemsarego (~allsystem@188.25.130.175) Quit (Quit: Leaving)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.