#ceph IRC Log


IRC Log for 2011-03-13

Timestamps are in GMT/BST.

[0:42] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[0:51] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) has joined #ceph
[1:22] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) has joined #ceph
[1:44] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[1:59] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) has joined #ceph
[2:09] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[2:10] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) has joined #ceph
[3:22] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[3:23] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) has joined #ceph
[3:25] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) Quit ()
[3:27] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) has joined #ceph
[4:24] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[4:59] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) has joined #ceph
[5:08] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[6:34] * raso (~raso@debian-multimedia.org) Quit (Ping timeout: 480 seconds)
[7:45] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) Quit (Quit: Leaving.)
[8:43] * allsystemsarego (~allsystem@ has joined #ceph
[9:41] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[10:43] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[14:52] * MK_FG (~MK_FG@ Quit (Ping timeout: 480 seconds)
[15:01] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[15:25] * MK_FG (~MK_FG@ has joined #ceph
[15:47] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[16:44] * verwilst_ (~verwilst@dD576FAAE.access.telenet.be) has joined #ceph
[18:11] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[18:22] * allsystemsarego (~allsystem@ has joined #ceph
[18:27] * lalaa (~baz@kssl-4db0c3c5.pool.mediaWays.net) has joined #ceph
[18:31] <lalaa> "Ceph may not be ready for production environments" <- is this because of data loss occurring, or because of protocol or format changes?
[18:31] <sage> the former.
[18:32] <DeHackEd> will 1.0 be considered production ready?
[18:32] <lalaa> i see
[18:34] <lalaa> i guess someone with little insight into file systems (especially distributed ones) can't do much to help out code-wise?
[18:34] <sage> dehacked: by some, for some purposes. we are focusing on stabilizing the object store and moving up the stack. don't expect to build a 1000-node system with 50 mds's.
[18:34] <lalaa> or are there easy-to-squash bugs that might increase stability
[18:34] <sage> the biggest need currently is testing, not coding
[18:35] <sage> so you can definitely help! :)
[18:35] <lalaa> my setup will be _very_ small scale though
[18:36] * verwilst_ (~verwilst@dD576FAAE.access.telenet.be) Quit (Quit: Ex-Chat)
[18:41] <lalaa> as in pretty much only one host and a few disks (whose count will increase over time)
[18:41] <lalaa> if this is a scenario worth exploring i'd like to help by testing
[18:42] <greglap> lalaa: either you'll find bugs, or you'll demonstrate that the system is pretty stable at small scales -- both valuable! ;)
[18:43] <lalaa> heh true
[18:43] <lalaa> would be sad to see the data go byebye though
[18:44] <lalaa> are we talking about corruption of single files or whole volumes (osds)?
[18:45] <lalaa> or the whole cluster going boom
[18:45] <greglap> depends on the bug
[18:46] <greglap> I actually don't recall full-on data loss bugs recently, just a lot of things where some of the daemons die or some manual repair is required
[18:47] <greglap> but maybe sage is thinking of some I've forgotten
[19:01] * DeHackEd (~dehacked@dhe.execulink.com) Quit (Ping timeout: 480 seconds)
[19:02] <sage> i'm not thinking of actual data loss, but the possibility of it.
[19:15] <lalaa> should tests be run with the current release or a git checkout?
[19:16] <greglap> the latest release is usually a good choice -- we try to push out a new release every 2-4 weeks so they're pretty up-to-date :)
[19:50] <lalaa> 1
[19:51] <lalaa> hm arch doesn't have the current release yet, going for git
[19:51] <lalaa> unless the compilation options haven't changed
[19:53] <lalaa> it lists a dependency on fuse, is that cruft that can be removed?
[19:57] <greglap> lalaa: there's a FUSE client module
[19:57] <greglap> you can opt to avoid building it if you like
[19:57] <greglap> --without-fuse, I think
[19:57] <greglap> when building it yourself
[19:58] <lalaa> yeah will do, just looked at the configure options
[19:58] <lalaa> installing dependencies
[20:00] <lalaa> is using libatomic-ops recommended?
[20:04] <greglap> yep
[20:05] <greglap> if you don't use that then all the atomic counters will be implemented with spinlocks
[20:08] <lalaa> i probably won't be pushing the system to its limits anyway, but i'll use it then. the default build instructions for the package in the arch repository don't include it
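The build discussion above might look like the following from-source sketch. This is a hedged illustration only: the exact flag names (`--without-fuse` is greglap's recollection; the libatomic-ops switch is an assumption) should be confirmed with `./configure --help` in the checkout.

```shell
# Hypothetical build sketch for a Ceph git checkout of this era.
# Flag names are assumptions -- run `./configure --help` to confirm them.
set -e
if [ -x ./autogen.sh ] && [ -f configure.ac ]; then
    ./autogen.sh
    # Skip the FUSE client module; leave libatomic-ops enabled so the
    # atomic counters are not implemented with spinlocks.
    ./configure --without-fuse
    make
else
    echo "not in a ceph source tree; nothing to build"
fi
```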
[20:15] * Dantman (~dantman@S0106001eec4a8147.vs.shawcable.net) Quit (Ping timeout: 480 seconds)
[20:27] * Dantman (~dantman@S0106001eec4a8147.vs.shawcable.net) has joined #ceph
[22:08] * Dantman (~dantman@S0106001eec4a8147.vs.shawcable.net) Quit (Ping timeout: 480 seconds)
[22:19] * Dantman (~dantman@S0106001eec4a8147.vs.shawcable.net) has joined #ceph
[22:29] * verwilst_ (~verwilst@dD576FAAE.access.telenet.be) has joined #ceph
[23:06] <lalaa> the recommended way to set up an OSD with multiple disks is to use btrfs for device spanning (as opposed to lvm) and then run one cosd instance on top of it instead of running multiple cosd instances?
[23:06] <sage> both are valid options
[23:08] <lalaa> will btrfs handle failing disks as gracefully?
[23:08] <lalaa> -as
[23:08] <sage> with a single osd, losing one disk loses all disks' data. raid solves that but costs you some space. with multiple cosds, you don't pay the raid overhead, but recovery uses network bandwidth instead of local disk io
[23:09] <sage> this is a faq and should get a wiki page at some point
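The two layouts sage contrasts can be sketched in ceph.conf terms. This is a hedged sketch: the host name and mount points are made up, and the section syntax follows the ceph.conf conventions of the time.

```
; layout A: one cosd per disk -- a failed disk costs one osd,
; and recovery happens over the network from the other replicas
[osd.0]
    host = node1
    osd data = /mnt/disk0    ; e.g. one btrfs filesystem per disk
[osd.1]
    host = node1
    osd data = /mnt/disk1

; layout B: a single cosd on one btrfs spanning both disks
; (losing a disk then loses the whole osd's data unless btrfs raid covers it)
; [osd.0]
;     host = node1
;     osd data = /mnt/span
```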
[23:09] <lalaa> do you want me to add it?
[23:10] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[23:11] <lalaa> hmm so multiple daemons it is for me
[23:20] <lalaa> how many independent cosd instances are needed for redundancy in case one fails completely? (assuming all OSDs are the same size -- or is that not necessary?) <- probably another one for the faq
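lalaa's redundancy question comes down to the pool replication level: with N-way replication the cluster needs at least N osds to place all copies (equal sizes are not required, since CRUSH weights can differ). A hedged command sketch, assuming the `ceph` CLI of this era accepted a pool `size` setting:

```shell
# Hypothetical sketch: raise replication on the default 'data' pool.
# Subcommand names are assumptions for the ceph CLI of this era.
if command -v ceph >/dev/null 2>&1; then
    ceph osd pool set data size 2   # two replicas -> survives one osd failure
    ceph osd dump                   # inspect pool settings to confirm
else
    echo "ceph CLI not installed"
fi
```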
[23:37] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.