#ceph IRC Log


IRC Log for 2012-06-01

Timestamps are in GMT/BST.

[0:09] * Oliver1 (~oliver1@ip-78-94-238-50.unitymediagroup.de) Quit (Quit: Leaving.)
[0:12] <elder> sagewk, if a connection goes into STANDBY, what is the next expected bit of communication that will occur on the connection?
[0:12] <elder> Is the socket expected to be held open, therefore a subsequent write can just proceed? Or is it expected that a reconnect will need to occur? And if a reconnect is done, will a banner be exchanged?
[0:13] <elder> gregaf and yehudasa (or anyone), you guys too if you know the answer.
[0:13] <sagewk> we go into standby when there is a fault (currently). the socket should be closed
[0:13] <sagewk> and then reopened the next time we try to send a message
[0:13] <elder> Oh yeah.
[0:14] <elder> That's good.
[0:24] * BManojlovic (~steki@ Quit (Remote host closed the connection)
[0:24] <elder> And a connection in STANDBY has a valid socket address. Is it true that a failed connection attempt (feature or version mismatch) requires a full connection re-open, meaning the address has to be supplied again?
[0:30] <elder> sagewk, ^
[0:32] <sagewk> the socket is reopened, but no new ceph_con_open... is that what you mean?
[0:32] <sagewk> i'm not sure how the socket api *should* be used, if that's what you're asking...
[0:32] <sagewk> elder: ^
[0:33] <elder> Let me put it another way. When is it necessary to supply a (new) socket address for a connection to use?
[0:35] * Tv_ (~tv@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:36] <sagewk> elder: never i think, except for a fresh ceph_connection
[0:36] <sagewk> if we reconnect, it'll be to the same peer address
[0:36] <elder> Looks like an MDS session creation, or an MDS reconnect. And an initial OSD open and after an OSD reset.
[0:37] <sagewk> yeah, and in that case, mds_client calls ceph_con_close and ceph_con_open to create a fresh session
[0:38] <elder> The reason I'm pursuing this is that I think there is essentially a state "socket closed but address is set," and that fits in after calling ceph_con_open().
[0:39] <elder> That state is I think essentially no different from STANDBY.
[0:39] <sagewk> hmm, yeah
[0:39] <elder> Next thing that has to happen is a banner exchange.
[0:40] <sagewk> yeah, in fact, those can probably be collapsed.
[0:40] <sagewk> well, it means that if you ceph_con_open but don't send a message, it wouldn't open the tcp connection
[0:40] <elder> That's right.
[0:40] * adjohn (~adjohn@ has joined #ceph
[0:40] <sagewk> that's probably ok, because all callers *do* queue something after ceph_con_open
[0:40] <elder> I have another state CONNECTING for that purpose. That state transitions whenever there is something to write on a STANDBY socket.
[0:41] <elder> (Kind of working through this "as we speak")
[0:41] <sagewk> sounds reasonable to me. so ceph_con_open would basically put you in STANDBY, or maybe bump you into CONNECTING just for kicks
[0:42] <elder> Queueing something for writing would bump you to CONNECTING.
[0:42] <elder> I have to work through it but I think you've given me the sanity check I was looking for.
[0:45] <sagewk> i see. yeah, sounds good
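The state model elder and sagewk converge on above can be sketched as a toy state machine. This is a minimal illustration only, not the libceph messenger code: the class `ToyConnection`, the `State` enum, the method names, and the `CLOSED` and `OPEN` states are all hypothetical names chosen for this sketch. Only `STANDBY`, `CONNECTING`, the banner exchange, and the roles of `ceph_con_open` and `ceph_con_close` come from the discussion. It also ignores any retry or backoff behaviour.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()      # hypothetical: no peer address has been supplied yet
    STANDBY = auto()     # address is set, socket is closed
    CONNECTING = auto()  # something is queued; socket is being (re)opened
    OPEN = auto()        # hypothetical: banner exchanged, messages can flow


class ToyConnection:
    """Toy model of the states discussed in the log (not kernel code)."""

    def __init__(self):
        self.state = State.CLOSED
        self.peer_addr = None
        self.out_queue = []

    def con_open(self, addr):
        # Mirrors the idea that ceph_con_open supplies the address and
        # leaves the connection in STANDBY without opening the socket.
        self.peer_addr = addr
        self.state = State.STANDBY

    def queue_message(self, msg):
        # Queueing something to write bumps STANDBY to CONNECTING.
        if self.state is State.CLOSED:
            raise RuntimeError("no peer address; call con_open first")
        self.out_queue.append(msg)
        if self.state is State.STANDBY:
            self.state = State.CONNECTING

    def banner_exchanged(self):
        # The next thing that has to happen after connecting.
        if self.state is not State.CONNECTING:
            raise RuntimeError("banner exchange only follows CONNECTING")
        self.state = State.OPEN

    def fault(self):
        # On a fault the socket is closed but the peer address is kept,
        # so a reconnect goes to the same peer.
        if self.state is not State.CLOSED:
            self.state = State.STANDBY

    def con_close(self):
        # A fresh session needs con_close followed by con_open.
        self.peer_addr = None
        self.out_queue.clear()
        self.state = State.CLOSED
```

In this sketch a connection that has been opened but never had a message queued stays in `STANDBY`, which matches sagewk's remark that the TCP connection would not be opened until a caller queues something.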
[0:49] * sbohrer (~sbohrer@ Quit (Quit: Leaving)
[1:48] * lofejndif (~lsqavnbok@82VAAD7AD.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[2:00] * joshd (~joshd@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[2:13] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[2:17] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Ping timeout: 480 seconds)
[2:19] * adjohn (~adjohn@ Quit (Quit: adjohn)
[2:21] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[2:24] * stass (stas@ssh.deglitch.com) Quit (Ping timeout: 480 seconds)
[2:28] * yoshi (~yoshi@p3167-ipngn3601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:34] * adjohn (~adjohn@mb80536d0.tmodns.net) has joined #ceph
[2:37] * adjohn (~adjohn@mb80536d0.tmodns.net) Quit ()
[3:09] * adjohn (~adjohn@mb80536d0.tmodns.net) has joined #ceph
[3:10] * joao (~JL@aon.hq.newdream.net) Quit (Quit: Leaving)
[3:17] * adjohn (~adjohn@mb80536d0.tmodns.net) Quit (Quit: adjohn)
[3:20] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[3:20] * renzhi (~renzhi@ has joined #ceph
[3:22] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[3:33] * renzhi (~renzhi@ Quit (Ping timeout: 480 seconds)
[3:38] * yoshi_ (~yoshi@p37158-ipngn3901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:44] * yoshi (~yoshi@p3167-ipngn3601marunouchi.tokyo.ocn.ne.jp) Quit (Ping timeout: 480 seconds)
[3:45] * renzhi (~renzhi@ has joined #ceph
[4:00] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:04] * sjusthm (~sam@24-205-39-1.dhcp.gldl.ca.charter.com) has joined #ceph
[4:16] * MK_FG (~MK_FG@ Quit (Ping timeout: 480 seconds)
[4:20] * renzhi (~renzhi@ Quit (Ping timeout: 480 seconds)
[4:20] * renzhi (~renzhi@ has joined #ceph
[4:26] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[4:44] * MK_FG (~MK_FG@ has joined #ceph
[4:45] * renzhi (~renzhi@ Quit (Quit: Leaving)
[6:22] * The_Bishop (~bishop@cable-86-56-102-91.cust.telecolumbus.net) Quit (Quit: Who the hell is this Peer? If I catch him, I'll reset his connection!)
[6:48] * aliguori (~anthony@ has joined #ceph
[6:48] * diggalabs (~jrod@cpe-72-177-238-137.satx.res.rr.com) has joined #ceph
[7:21] * stass (stas@ssh.deglitch.com) has joined #ceph
[8:06] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[8:22] * aliguori (~anthony@ has joined #ceph
[8:31] * sjusthm (~sam@24-205-39-1.dhcp.gldl.ca.charter.com) Quit (Remote host closed the connection)
[9:19] * BManojlovic (~steki@ has joined #ceph
[9:21] * hijacker (~hijacker@ Quit (Quit: Leaving)
[9:26] * hijacker (~hijacker@ has joined #ceph
[9:34] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:42] * s[X]_ (~sX]@ppp59-167-154-113.static.internode.on.net) Quit (Remote host closed the connection)
[9:56] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[10:05] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[10:06] * aliguori (~anthony@ has joined #ceph
[10:20] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[10:24] * stass (stas@ssh.deglitch.com) Quit (Remote host closed the connection)
[10:27] * bchrisman1 (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[10:30] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:35] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[10:37] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[10:39] * bchrisman1 (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:51] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[10:55] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[11:01] * Ryan_Lane (~Adium@ has joined #ceph
[11:16] * yoshi_ (~yoshi@p37158-ipngn3901marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:17] * zykes_ (~zykes@184.79-161-107.customer.lyse.net) Quit (Ping timeout: 480 seconds)
[12:06] * aliguori (~anthony@ Quit (Remote host closed the connection)
[12:14] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[12:28] * lofejndif (~lsqavnbok@82VAAD7QK.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:33] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[12:34] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[13:22] * Ryan_Lane (~Adium@ Quit (Quit: Leaving.)
[13:25] * julienhuang (~julienhua@ has joined #ceph
[13:36] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[13:39] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[13:42] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[13:50] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[14:09] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[14:29] * s[X]__ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[14:29] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Read error: Connection reset by peer)
[14:43] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Quit: Leaving)
[14:56] * s[X]__ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[14:57] * yanzheng (~zhyan@ has joined #ceph
[14:58] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[15:02] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[15:10] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Ping timeout: 480 seconds)
[15:23] * lofejndif (~lsqavnbok@82VAAD7QK.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[15:27] * Ryan_Lane (~Adium@ has joined #ceph
[15:45] * julienhuang_ (~julienhua@ has joined #ceph
[15:46] * julienhuang_ (~julienhua@ Quit ()
[15:50] * renzhi (~renzhi@ has joined #ceph
[15:53] * julienhuang (~julienhua@ Quit (Ping timeout: 480 seconds)
[16:02] * jochen (~jochen@laevar.de) has joined #ceph
[16:02] * jochen is now known as laevar
[16:02] <laevar> hi
[16:03] <laevar> we are trying to run (full) lxc containers on a ceph-mounted directory with an ubuntu precise (12.04)
[16:04] <laevar> sadly, we get a kernel panic when we start the container
[16:04] <laevar> before we investigate this further: is it in principle possible to run lxc-containers on a ceph-mounted dir? has someone done this already?
[16:10] * CristianDM (~CristianD@201-213-234-191.net.prima.net.ar) has joined #ceph
[16:10] * sdx23 (~sdx23@with-eyes.net) has joined #ceph
[16:11] <CristianDM> Hi. When I run "rbd export imagename", is the size of the exported file the space used or the full image size?
[16:15] <CristianDM> I am trying to export an rbd image
[16:15] <CristianDM> rbd export 1cd58310-8d5a-4e7e-97e7-a130ebc8200d /root/test.img
[16:15] <CristianDM> And it returns
[16:15] <CristianDM> error opening image 1cd58310-8d5a-4e7e-97e7-a130ebc8200d: (2) No such file or directory
[16:16] <CristianDM> But the image exists
[16:16] <CristianDM> rbd -p images ls
[16:16] <CristianDM> 1cd58310-8d5a-4e7e-97e7-a130ebc8200d
[16:18] <CristianDM> any?
[16:20] <CristianDM> Sorry, I added the pool name and it works
[16:24] * renzhi (~renzhi@ Quit (Read error: Connection reset by peer)
[16:27] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[16:31] * Ryan_Lane (~Adium@ Quit (Quit: Leaving.)
[16:32] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[16:46] * yanzheng (~zhyan@ has joined #ceph
[16:50] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[16:58] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Ping timeout: 480 seconds)
[17:02] * BManojlovic (~steki@ Quit (Quit: I'm off, and you do whatever you want...)
[17:21] * lofejndif (~lsqavnbok@83TAAGFUZ.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:27] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[17:36] * joao (~JL@aon.hq.newdream.net) has joined #ceph
[17:41] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:49] * lofejndif (~lsqavnbok@83TAAGFUZ.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[17:51] * Tv_ (~tv@aon.hq.newdream.net) has joined #ceph
[17:56] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[18:01] * yehudasa (~yehudasa@aon.hq.newdream.net) Quit (Remote host closed the connection)
[18:07] * lofejndif (~lsqavnbok@04ZAADKH7.tor-irc.dnsbl.oftc.net) has joined #ceph
[18:18] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[18:35] * yehudasa (~yehudasa@aon.hq.newdream.net) has joined #ceph
[18:35] <gregaf> laevar: I'm not certain what all is required for lxc containers, but (while not fully-tested) cephfs is a posix-compliant filesystem that fulfills the VFS contracts; I don't believe it should be a problem
[18:36] <Tv_> or use rbd.ko if that's what you want, but it's just a filesystem mounted on the host
[18:37] * CristianDM (~CristianD@201-213-234-191.net.prima.net.ar) Quit ()
[18:38] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[18:46] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Ping timeout: 480 seconds)
[18:48] * Ryan_Lane (~Adium@gateway01.m3-connect.de) has joined #ceph
[18:49] * bchrisman (~Adium@ has joined #ceph
[18:52] * Ryan_Lane1 (~Adium@gateway01.m3-connect.de) has joined #ceph
[18:52] * Ryan_Lane (~Adium@gateway01.m3-connect.de) Quit (Read error: Connection reset by peer)
[18:57] * athy (~athy@83TAAGFXA.tor-irc.dnsbl.oftc.net) has joined #ceph
[18:57] <athy> brazil
[18:57] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[18:57] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[18:57] * athy (~athy@83TAAGFXA.tor-irc.dnsbl.oftc.net) has left #ceph
[19:04] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[19:07] * lofejndif (~lsqavnbok@04ZAADKH7.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[19:10] <sagewk> elder: ok, pushed wip-messenger-2 for real now :)
[19:10] <elder> OK.
[19:11] <elder> I'll look it over.
[19:11] <elder> Probably would have been nicer to call it something else so I would have my original for comparison.
[19:11] <sagewk> yeah, sorry
[19:12] <elder> I'm wondering if this refcount bug in the OSD could be causing some of the problems...
[19:12] <sagewk> yeah, me too
[19:13] <elder> Sweet, now that I'm looking at your patches it's just about exactly what I was thinking of doing.
[19:14] <elder> I'll incorporate them and do some testing.
[19:19] * Ryan_Lane1 (~Adium@gateway01.m3-connect.de) Quit (Quit: Leaving.)
[19:22] * chutzpah (~chutz@ has joined #ceph
[19:30] <laevar> gregaf: we made a bad mistake: we tried the kernel client on an OSD system. Somehow it slipped through, and it was working fine with normal file operations until we started the lxc. Using the fuse client seems to work fine now
[19:32] <Tv_> laevar: loopback mounts work until you get memory pressure, and as long as it's a sunny day
[19:32] <laevar> Tv_: yes, rbd would be the "fallback". As I understand it, running full-fledged os-level virtual machines should perform better using the fs directly instead of using the block device
[19:32] <Tv_> laevar: uhh, the ceph distributed filesystem probably won't perform amazingly yet
[19:34] <laevar> Tv_: so it would be better to rely on rbd for the time being, regarding performance *and* stabilitz?
[19:34] <laevar> s/stabilitz/stability/
[19:34] <Tv_> laevar: we explicitly say ceph dfs is not ready for production yet
[19:34] <laevar> yes, i know
[19:34] <Tv_> laevar: that means it has too many bugs left, it doesn't perform well enough, etc
[19:36] <laevar> we seriously are thinking about using it in production, because it is exactly the thing we dreamed of
[19:37] <laevar> but rbd would be very fine also
[19:39] <Tv_> laevar: our current plan is to stabilize ceph dfs by the end of the year
[19:50] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[19:54] <laevar> Tv_: is there something we could help with in particular? at the moment we are anyway only playing around with a testcluster running lxc-vm's (as already stated)
[19:55] <Tv_> laevar: right now, we're just focusing our efforts on the rados level; the dfs will benefit from that too
[19:56] <Tv_> laevar: so it's more.. please explore, but understand that we can't necessarily help with all the bugs
[19:56] <Tv_> laevar: naturally, Inktank professional services has slightly different priorities, especially if you're building something big ;)
[19:56] <Tv_> laevar: playing around is very very much appreciated
[19:58] <laevar> Tv_: no problem. As I said, we are looking forward to using ceph, and if we can help, that's great. Our setup is, and will be, really rather small (only up to 10 nodes)
[20:01] <laevar> Tv_: but that's why we benefit from the design of ceph: every node can contribute to redundancy and performance, be it small or big. any other setup will be either expensive (SAN) or more complicated and less flexible (several pairs of DRBD for example)
[20:11] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[20:17] * mtk (~mtk@ool-44c35967.dyn.optonline.net) Quit (Remote host closed the connection)
[20:19] * s[X] (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Ping timeout: 480 seconds)
[20:28] * mtk (~mtk@ool-44c35967.dyn.optonline.net) has joined #ceph
[20:41] <elder> sagewk
[20:44] * lofejndif (~lsqavnbok@659AAAC91.tor-irc.dnsbl.oftc.net) has joined #ceph
[20:54] * elder1 (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[21:00] <sagewk> elder
[21:00] <elder> Now I have to remember my question...
[21:02] <elder> I'm holding a few patches because they hadn't completed review. One of them is 07/13, because you brought up those refcount issues.
[21:02] <elder> I have looked at the four patches you proposed and all look fine, and I think may eliminate your original concerns about the patch needing an update.
[21:03] <elder> So basically I want to know how you'd prefer I proceed. I can re-submit the patches, including yours. That might be the most clear.
[21:03] <elder> (I have already committed to the testing branch some that were reviewed.)
[21:05] <sagewk> i think if you squash the first with your patch and then add the others it'll be good
[21:06] <sagewk> i only compile-tested, so you probably want to do a bit more than that first :)
[21:06] <sagewk> but it gets my reviewed-by
[21:06] <elder> I will of course.
[21:06] <sagewk> i think you can stick it straight in the tree
[21:06] <elder> Specifically which do you mean to "squash"?
[21:06] <elder> drop connection refcounting?
[21:06] <elder> for mon_client
[21:09] <elder> Let me just post my next dozen or so patches to the list.
[21:09] <elder> 3-4 of those will be from you.
[21:09] <elder> I'll make note for those that have already been posted, so it doesn't require a very serious review.
[21:09] <sagewk> yeah
[21:09] <elder> That process will leave no questions.
[21:10] <sagewk> my libceph: drop connection refcounting for mon_client and your embed con into mon_client
[21:10] <sagewk> or put mine right before yours, that pbly makes more sense
[21:10] <elder> OK.
[21:10] <elder> It does.
[21:10] <sagewk> cool
[21:28] <darkfader> i think some of you know a little about lsi controllers. can they get stuck in the learn cycle?
[21:28] <elder> nhm?
[21:34] <nhm> elder: yo
[21:34] <elder> see darkfader's question
[21:35] <nhm> darkfader: hrm, I don't think I've seen that.
[21:35] <darkfader> hi nhm hehe
[21:35] <nhm> darkfader: but it wouldn't surprise me. ;P
[21:35] <darkfader> i see it says
[21:35] <darkfader> Learn Cycle Requested : Yes
[21:35] <darkfader> Learn Cycle Active : No
[21:35] <darkfader> and if i try to change LD properties
[21:36] <darkfader> then it says "we're in learn cycle" or so
[21:36] <darkfader> but i'll just try to be patient and let it keep charging some more
[21:36] <darkfader> thanks both of you :)
[21:42] <dmick> I had a Learn Cycle when I was younger. I took the wheels off when I was about 7
[21:43] <darkfader> dmick: humans are different, we only lose cache contents if we drink too much
[21:43] <darkfader> so no battery
[21:56] * The_Bishop (~bishop@cable-86-56-102-91.cust.telecolumbus.net) has joined #ceph
[22:15] * jerker (jerker@Psilocybe.Update.UU.SE) has joined #ceph
[22:24] * stass (stas@ssh.deglitch.com) has joined #ceph
[22:25] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:28] * lofejndif (~lsqavnbok@659AAAC91.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[22:59] * BManojlovic (~steki@ has joined #ceph
[23:01] * brambles (brambles@ Quit (Quit: leaving)
[23:02] * brambles (brambles@ has joined #ceph
[23:07] * cattelan is now known as cattelan_away
[23:16] * elder1 (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has left #ceph
[23:24] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:26] <loicd> Hi, I would like to know if storing 50 million objects, each 200KB in size, with ceph a) is easy assuming you have enough hardware, b) requires some work, or c) is quite difficult. I realize I'm just asking for a hint, not a proper authoritative answer to a very fuzzy question ;-)
[23:28] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[23:29] * izdubar (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[23:30] <Tv_> loicd: 10TB? i have that much space on my home desk, though not replicated ;)
[23:33] <loicd> Tv_: True ;-) it's not so much about the total size, I guess. Rather the number of objects. For instance when storing over 10 million files on an ext4 file system, walking all of them takes much time and some operations become impractical. Do you have more than 10 million files somewhere ?
[23:35] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[23:36] <Tv_> loicd: my current understanding is that the osds store all objects in a single pg into a single directory.. so it's a question of how well your backend filesystem handles total_num_of_objects / num_of_pgs files in one directory
[23:36] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[23:36] <Tv_> sjust: correct me here please
[23:37] <sjust> Tv_: that's not quite true, a pg will split into sub collections as the number of files grows
[23:37] <elder> Tv_, should I be able to build my kernel 3x faster on my own machine than on an autobuilder machine?
[23:37] <Tv_> oh and leveldb guys measured 1000 files/dir = 9microsec to open, 10k=10, 100k=16
[23:38] <Tv_> sjust: ahh right
[23:38] <sjust> loicd: do you intend to use the dfs or is an object store adequate?
[23:39] <Tv_> elder: the current autobuilders suck and are memory starved; in other news, i've observed ceph.git needing 6GB RAM to not swap at the level of concurrency those boxes could otherwise easily handle
[23:39] <loicd> sjust: the more efficient solution would be adequate. There is no requirement to go thru a filesystem, it can be object store.
[23:39] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[23:39] <dmick> figure 1G per parallel job on ceph
[23:39] <dmick> dunno how bad the kernel is
[23:39] <Tv_> dmick: mostly better, because it's C
[23:39] <dmick> (and that's a little overestimate, but the biggest compiles take about a gig)
[23:40] <dmick> (yeah, I'd expect better)
[23:41] <dmick> GRR! Stupid iDRAC emulation devices
[23:42] <sjust> loicd: in that case, there is a key-value mapping associated with each librados object implemented efficiently using leveldb behind the scenes
[23:42] <sjust> so even if raw librados objects are too expensive, you could partition the objects across these mappings
[23:43] <loicd> sjust: thanks for the hint. that gives me enough to try it out :-)
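The back-of-the-envelope numbers in this exchange can be reproduced with a short calculation. This is a rough sketch under stated assumptions: the object count and size come from loicd's question and the `total_num_of_objects / num_of_pgs` ratio from Tv_'s remark, but the placement group count of 4096 is a hypothetical value picked for illustration, and the function names are invented here. As sjust notes, a PG splits into sub-collections as it grows, so the per-PG figure is an upper bound on directory size rather than a prediction.

```python
def total_bytes(num_objects, object_size_bytes):
    """Raw (unreplicated) capacity needed for the objects."""
    return num_objects * object_size_bytes


def objects_per_pg(num_objects, num_pgs):
    """Average number of objects landing in each placement group."""
    return num_objects / num_pgs


NUM_OBJECTS = 50_000_000   # from the question in the log
OBJECT_SIZE = 200 * 1000   # 200 KB, taken as decimal kilobytes
NUM_PGS = 4096             # hypothetical PG count, not from the log

if __name__ == "__main__":
    size_tb = total_bytes(NUM_OBJECTS, OBJECT_SIZE) / 1000**4
    per_pg = objects_per_pg(NUM_OBJECTS, NUM_PGS)
    print(f"raw capacity: {size_tb:.0f} TB")
    print(f"average objects per PG: {per_pg:.0f}")
```

With these inputs the raw capacity works out to the 10 TB Tv_ mentions, and each PG holds on the order of ten thousand objects before any sub-collection splitting.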
[23:44] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[23:45] * ecawthon (~eleanor@aon.hq.newdream.net) has joined #ceph
[23:46] <dmick> hi ecawthon
[23:46] <ecawthon> hi dmick
[23:48] <elder> I presume these gitbuilder machines are virtual. How many are there? The difference between 10 minutes and 40 minutes for a build is pretty significant. It makes for a characteristically different work flow.
[23:50] <gregaf> elder: http://ceph.com/gitbuilder.cgi many
[23:50] <gregaf> (actually I think a bit more than that, but not sure)
[23:50] <elder> OK, but what's the hardware that underlies that?
[23:50] <dmick> just about 10 vms. four servers
[23:50] <gregaf> something much wimpier than our vercoi machines
[23:52] <dmick> not certain of that
[23:52] <Tv_> elder: the gitbuilders are a mess; well known, on the list, not quite there yet
[23:52] <dmick> 24 core (Xeon L5640@2.27G)
[23:52] <dmick> 34G
[23:52] <dmick> er, 24
[23:52] <Tv_> ceph-kvm have plenty of cores, they run out of RAM & disk
[23:52] <elder> Do you know what other VM's the kernel-amd64 image shares a machine with?
[23:53] <elder> Or rather, how can I tell?
[23:54] <dmick> well you could log in and look at the kvm info, but that's probably not worth learning
[23:54] <elder> Just wondering if I'm sharing with maverick-deb-amd64, which seems to have two builds going on right now as well.
[23:54] <dmick> precise-db
[23:55] <dmick> *deb
[23:55] <dmick> precise-gcov
[23:55] <elder> OK, well those don't appear to be busy.
[23:55] <Tv_> WAAH rich test to vger
[23:55] <Tv_> hate
[23:55] <Tv_> *Text
[23:56] * dmick balls hand into fist to add to Tv's hate
[23:56] <dmick> elder: doesn't seem like it's being driven particularly hard, no
[23:57] <dmick> packaging now
[23:59] <elder> Well my cores run at a 50% faster clock rate than those, and I have four of them, hyperthreaded. Do the VM's get access to all 24 cores?

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.