#ceph IRC Log


IRC Log for 2011-11-09

Timestamps are in GMT/BST.

[0:04] * fronlius (~fronlius@f054114193.adsl.alicedsl.de) Quit (Quit: fronlius)
[0:15] * darkfader (~floh@ Quit (Ping timeout: 480 seconds)
[0:39] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[2:17] * Tv (~Tv|work@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[3:00] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[3:14] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:38] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) Quit (Quit: cp)
[3:54] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[6:14] * aa (~aa@r190-133-125-45.dialup.adsl.anteldata.net.uy) has joined #ceph
[6:40] * wido (~wido@rockbox.widodh.nl) Quit (Remote host closed the connection)
[11:12] * CephLogBot (~PircBot@rockbox.widodh.nl) has joined #ceph
[11:35] * aa (~aa@r190-133-125-45.dialup.adsl.anteldata.net.uy) Quit (Quit: Konversation terminated!)
[11:41] * gregorg_taf (~Greg@ has joined #ceph
[11:41] * RupS| (~rups@panoramix.m0z.net) has joined #ceph
[11:42] * monrad (~mmk@domitian.tdx.dk) has joined #ceph
[11:45] * NaioN_ (~stefan@andor.naion.nl) has joined #ceph
[11:46] * DLange_ (~DLange@sixtina.faster-it.de) has joined #ceph
[11:46] * _Shiva__ (shiva@whatcha.looking.at) has joined #ceph
[11:46] * Ormod_ (~valtha@ohmu.fi) has joined #ceph
[11:46] * nms_ (martin@sexyba.be) has joined #ceph
[11:46] * johnl_ (~johnl@johnl.ipq.co) has joined #ceph
[11:46] * `gregorg` (~Greg@ Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * NaioN (~stefan@andor.naion.nl) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * andret (~andre@pcandre.nine.ch) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * nms (martin@sexyba.be) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * johnl (~johnl@ Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * damoxc (~damien@94-23-154-182.kimsufi.com) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * Ormod (~valtha@ohmu.fi) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * RupS (~rups@panoramix.m0z.net) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * monrad-51468 (~mmk@domitian.tdx.dk) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * DLange (~DLange@dlange.user.oftc.net) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * _Shiva_ (shiva@whatcha.looking.at) Quit (reticulum.oftc.net magnet.oftc.net)
[11:46] * raso (~raso@debian-multimedia.org) Quit (reticulum.oftc.net magnet.oftc.net)
[11:56] * andret (~andre@pcandre.nine.ch) has joined #ceph
[11:56] * damoxc (~damien@94-23-154-182.kimsufi.com) has joined #ceph
[11:57] * raso (~raso@debian-multimedia.org) has joined #ceph
[12:01] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:16] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[12:31] <psomas> Is there any specific reason for the limits imposed by the rbd kernel driver for the rbd image name?
[13:30] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) Quit (Read error: Connection reset by peer)
[13:30] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) has joined #ceph
[14:14] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:53] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[16:53] * pserik (~Serge@eduroam-60-133.uni-paderborn.de) has joined #ceph
[16:54] <pserik> hello @all
[16:56] <pserik> i'm trying to compile the ceph client from sources (ceph-client-standalone) and actually using the wiki entry from http://ceph.newdream.net/wiki/Building_kernel_client. But the tutorial is not working for me now
[16:57] <pserik> are there some new tutorials or entries for this task?
[17:02] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[17:03] <pserik> "make -C libceph" always returns the error: fatal error: keys/ceph-type.h: No such file or directory
[17:09] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[17:13] <psomas> pserik: i'm not sure the wiki instructions are valid any more, i just cloned the github ceph-client repo
[17:16] <pserik> psomas: ceph-client implies all the kernel stuff. is it possible to compile only the ceph-client stuff?
[17:20] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[17:25] <psomas> pserik: you can build it as an external module
[17:26] <psomas> but i'm not sure if it's going to work, if the kernel sources/headers you're building against are not very recent
[17:26] <pserik> can you give me a short tutorial please?
[17:26] <pserik> i'm going to try it
[17:27] <psomas> cd ~/ceph-client/net/ceph/
[17:27] <psomas> make -C /usr/src/linux-headers-3.0.0-2-amd64/ M=$(pwd) libceph.ko
[17:27] <psomas> that's on a debian running 3.0.0-2-amd64
[17:27] <psomas> same for rbd.ko
[17:28] <psomas> but if the ceph headers have changed (ie include/linux/ceph) i think you must copy them to the linux-headers dir
[17:28] <psomas> cp -r ceph-client/include/linux/ceph/* /usr/src/linux-$(uname -r)/include/linux/ceph/
[17:28] <psomas> something like that
[17:28] <psomas> probably there's a better way to do it
[17:28] <pserik> psomas: ok, thank you. going to try it now
[17:35] <pserik> psomas: what is the M option?
[17:37] <pserik> by executing the "make -C /usr/src/linux-headers-2.6.38-8-server/ libceph.ko" i get the "/usr/src/linux-headers-2.6.38-8-server/ : No such file or directory. Stop."
[17:45] <pserik> ok, the libceph did is missing
[17:45] <pserik> *dir*
[17:46] <pserik> make -C /usr/src/linux-headers-3.0.0-2-amd64/  M=$(pwd) libceph.ko
[17:46] <pserik> püs
[17:50] <pserik> psomas: ok, i tried to execute the command "make -C /usr/src/linux-headers-2.6.38-8-server/ M=$(pwd) libceph.ko" and i got the same error message "fatal error: keys/ceph-type.h: No such file or directory"
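pserik's question about the M option goes unanswered in the channel. In kbuild, `-C` switches make into the kernel build tree and `M=` names the directory of the external module being built. The sketch below only assembles and prints the command psomas quoted; the function names are hypothetical, and the `/usr/src/linux-headers-<release>` layout is assumed from the Debian-style paths used in the log. It does not address the `keys/ceph-type.h` error.

```python
import os
import shlex


def kbuild_external_module_cmd(kernel_build_dir, module_src_dir, target):
    """Return the argv for an out-of-tree kbuild invocation (hypothetical helper).

    -C makes `make` change into the kernel build tree first;
    M= tells kbuild that an external module in module_src_dir is being built.
    """
    return ["make", "-C", kernel_build_dir, "M=" + module_src_dir, target]


def debian_headers_dir(kernel_release):
    # Assumed Debian/Ubuntu layout, matching the paths quoted in the log.
    return "/usr/src/linux-headers-" + kernel_release


if __name__ == "__main__":
    kdir = debian_headers_dir(os.uname().release)
    if not os.path.isdir(kdir):
        # Corresponds to the "No such file or directory" case: -C needs an existing tree.
        print("kernel build tree not found:", kdir)
    else:
        cmd = kbuild_external_module_cmd(kdir, os.getcwd(), "libceph.ko")
        # Print rather than run, so the command can be inspected first.
        print(shlex.join(cmd))
```

Run from the module source directory (for example `ceph-client/net/ceph`), this prints a command of the same shape as the one psomas gave, with `M=` set to the current directory.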
[17:57] * Tv (~Tv|work@aon.hq.newdream.net) has joined #ceph
[18:13] * fronlius (~fronlius@e182094169.adsl.alicedsl.de) has joined #ceph
[18:18] * pserik (~Serge@eduroam-60-133.uni-paderborn.de) has left #ceph
[18:18] * The_Bishop (~bishop@port-92-206-45-45.dynamic.qsc.de) has joined #ceph
[18:41] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:24] * pserik (~Serge@ip-109-90-70-227.unitymediagroup.de) has joined #ceph
[19:33] <NaioN_> I have a strange logging behavior of the mds
[19:33] <NaioN_> the log gets really huge (fills the disk, about 300G)
[19:33] <NaioN_> I didn't turn any debug level on!
[19:36] <NaioN_> I get a lot of these messages: 2011-11-09 19:34:47.461203 7f6be319a700 mds.0.cache.dir(1000024d02a) [dentry #1<PATH TO FILE> [2,head] auth (dversion lock) v=24065 inode=0x3f43980 0x1ad8d28] n(v0 b6268 1=1+0)
[19:36] <NaioN_> where <PATH TO FILE> is the file that got transferred to the fs
[19:38] * fronlius (~fronlius@e182094169.adsl.alicedsl.de) Quit (Quit: fronlius)
[19:39] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[19:39] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[19:39] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Remote host closed the connection)
[19:40] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[19:45] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[20:03] * grape (~grape@c-76-17-80-143.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[20:05] <sjust> wip_osdmap has the OSDMap const cleanup and pg locking changes, if anyone wants to review it
[20:08] <gregaf1> all: just pushed a patch that should make perfcounters happy again; sorry I forgot to audit the rest of the codebase :(
[20:17] * three18ti (~jon@ has joined #ceph
[20:19] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[20:25] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:28] * fronlius (~fronlius@g231173223.adsl.alicedsl.de) has joined #ceph
[20:39] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[20:40] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:40] * fronlius (~fronlius@g231173223.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[20:44] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:50] * fronlius (~fronlius@f054021246.adsl.alicedsl.de) has joined #ceph
[20:58] * nwatkins` (~user@kyoto.soe.ucsc.edu) has joined #ceph
[21:01] * Goku (50951210@ircip1.mibbit.com) has joined #ceph
[21:03] <Goku> hey guys, i am trying to configure ceph, but stuck with some errors...is somebody there? thanks
[21:04] <NaioN_> what errors?
[21:04] * NaioN_ is now known as NaioN
[21:06] <Goku> ceph is up and running...but MDS crashes after some time...say 30 mins or so...
[21:06] <Goku> some clients are not able to mount...
[21:06] <Tv> Goku: what does the mds log say?
[21:06] <Goku> dmesg on clients gives messages like "had xxx fsid, got yyy"
[21:07] <NaioN> Goku: Tv means the mds log
[21:08] <Goku> no...this was only dmesg from the client...wait a min...i will check the log...
[21:08] <NaioN> and those messages in dmesg are btrfs related if I'm correct
[21:09] <Goku> well im using ext4 right now...
[21:10] <Goku> well i cannot decode the log myself...can i paste it here or something?
[21:10] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[21:10] <NaioN> or pastebin if it's large
[21:10] <Tv> pastebin please
[21:11] <Goku> just a moment...pastebin...
[21:13] <Goku> sorry for the delay...here it is...http://pastebin.com/4bMmG5R6
[21:14] <Goku> i have tried to paste a lot...hope someone is able to find some flaw...
[21:15] <Tv> now that assert looks pretty darn bad
[21:16] <Goku> ohh..
[21:16] * verwilst (~verwilst@d51A5B077.access.telenet.be) has joined #ceph
[21:17] <joshd> usually that assert happens when a lock is used after it's been freed
[21:18] <Tv> yup
[21:19] <Tv> Goku: you should file a bug at http://tracker.newdream.net/projects/ceph
[21:19] <psomas> Is there a specific reason for the limits on the rbd image names (96 chars), snapshots, etc. in the kernel rbd driver?
[21:22] <Goku> @Tv: are u sure that this is worth mentioning in the bug tracker?
[21:22] <Tv> Goku: yes
[21:23] <Goku> Tv: shud i also paste the log?
[21:23] <Tv> Goku: yes please
[21:23] <Goku> Tv: ok i will do it!
[21:24] <Tv> Goku: thank you
[21:24] <Goku> @all: thanks everyone
[21:24] <joshd> psomas: I don't see that limit in the on-disk structure, unless I'm missing something
[21:24] <joshd> psomas: yehudasa_ would know for sure
[21:25] <Tv> char snap_name[RBD_MAX_SNAP_NAME_LEN];
[21:25] <Tv> #define RBD_MAX_SNAP_NAME_LEN 32
[21:25] <Tv> etc
[21:25] <Tv> in the kernel driver
[21:25] <Tv> that might just be the kernel driver
[21:26] <Tv> a quick browse doesn't reveal a server-side limit
[21:27] <joshd> there's a define for it in librbd, but it's not used, so it may be a remnant from an older version of the format
[21:28] <Tv> i see it has a length limit for image name and the local blockdevice name
[21:28] <Tv> that is, RBD_MAX_IMAGE_NAME_SIZE and RBD_MAX_BLOCK_NAME_SIZE exist outside the kernel
[21:29] <Tv> git says ceph.git never had the string RBD_MAX_SNAP_NAME_LEN in it
[21:30] <joshd> I was looking at rbd_types.h: #define RBD_MAX_OBJ_NAME_SIZE 96
[21:30] <Tv> yeah it has object name size, but not snap name size
[21:30] <Tv> kernel side has both
[21:33] <Tv> oooh RBD_MAX_SNAP_NAME_LEN is almost not used..
[21:36] <Goku> hi guys, one more question, my conf file is like this http://pastebin.com/sQb8WZbx ... is this well defined?
[21:37] <joshd> it looks like a bunch of these checks are inconsistent between kernel and userspace - max object order is 30 in the kernel and 25 in the rbd command line tool
[21:41] <gregaf1> Goku: yeah, that config looks fine
[21:44] <Goku> gregaf1: thanks...
[21:47] <joshd> psomas: created http://tracker.newdream.net/issues/1701 to track this
[21:54] * hijacker (~hijacker@ Quit (Read error: Connection reset by peer)
[21:55] * hijacker (~hijacker@ has joined #ceph
[22:10] <psomas> joshd: ok, thanks
[22:10] <psomas> i was trying to track down a bug, with an image name > 40 chars and i saw that too
[22:11] <psomas> the other thing is that for some reason the object name (oid) in libceph (messenger.c i think) is hardcoded to 40 chars
[22:11] <psomas> so even though the rbd driver says the max name is 96, it actually truncates the object name to 40 chars (including the .rbd suffix) and so the mapping fails
[22:11] <gregaf1> ah, that used to be a limit to oid length, but we got rid of it!
[22:12] <gregaf1> apparently it's still alive in some places, though :x
[22:12] * Goku (50951210@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[22:12] <gregaf1> can you file a bug? it should be fixed sooner rather than later
[22:12] <psomas> i was going to fix that, but i was not sure if it's something needed by other users of libceph (in fs/ceph or sth)
[22:13] <gregaf1> I'm pretty sure it's just a relic left over from when oids had a fixed length
[22:13] <psomas> gregaf1: ok, i'll file a bug, and if i have time i can post a patch too
[22:13] <gregaf1> thanks!
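The truncation psomas describes above can be modelled in a few lines. This is a simplified sketch, not the kernel code: the constants are taken from the discussion (40 and 96 characters, the `.rbd` suffix), the helper names are hypothetical, and the exact boundary is an assumption, since the log only reports failures for image names longer than 40 characters.

```python
# Simplified model of the behaviour reported in the log; the constants come from
# the discussion, not from a verified reading of the kernel source.
RBD_SUFFIX = ".rbd"        # header object suffix mentioned by psomas
MAX_IMAGE_NAME_LEN = 96    # limit the rbd driver advertises, per the log
HARDCODED_OID_LEN = 40     # oid length psomas reports in libceph (assumed exact)


def header_object_name(image_name):
    """Object name the driver wants to look up (hypothetical helper)."""
    return image_name + RBD_SUFFIX


def truncated_oid(image_name):
    """Object name after being cut to the hardcoded oid length."""
    return header_object_name(image_name)[:HARDCODED_OID_LEN]


def mapping_would_fail(image_name):
    """True when truncation changes the name, so the lookup misses the object."""
    return truncated_oid(image_name) != header_object_name(image_name)


def longest_safe_image_name():
    """Longest image name that survives truncation with its suffix intact."""
    return HARDCODED_OID_LEN - len(RBD_SUFFIX)
```

Under this model an image name can pass the advertised 96-character check and still fail to map, because the suffix is cut off before the object is looked up.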
[22:14] <psomas> and one more thing (i'll send a patch for this tomorrow probably)
[22:14] <psomas> in rbd.cc the showmapped cmd will fail (with assert(0)) if anything else is given as an argument/option; the fix is pretty much trivial
[22:14] <psomas> i can file a bug for it too, if you want
[22:15] <gregaf1> yeah – don't know much about that stuff but everybody loves tracked bugs ;)
[22:17] <psomas> kk :)
[22:19] * fronlius (~fronlius@f054021246.adsl.alicedsl.de) Quit (Quit: fronlius)
[22:21] * pserik (~Serge@ip-109-90-70-227.unitymediagroup.de) has left #ceph
[22:26] <sagewk> nwatkins`: i don't see a patch 3/4 in that series..
[22:26] <gregaf1> sagewk: 1: return stripe address replicas
[22:27] <gregaf1> 2: handle new ceph_get_file_stripe_address
[22:27] <gregaf1> 3: return all replica hostnames
[22:27] <gregaf1> 4: make listStatus quiet
[22:27] <sagewk> weird, i didn't get 3
[22:27] <gregaf1> you don't have all of those?
[22:28] <gregaf1> I was going to merge them in and check but then I got stuck on not being able to make git-am or patch work and switched gears to a code review *sigh*
[22:28] <sagewk> ah spam filter caught it
[22:28] <gregaf1> bizarre
[22:33] <sagewk> srsly
[22:33] <nwatkins`> sagewk: I see it being sent by vger. the subject is [PATCH 3/4] hadoop: return all replica hostnames
[22:34] <sagewk> yeah i found it :)
[22:34] <sagewk> pushed
[22:34] <nwatkins`> cool
[22:34] <nwatkins`> thanks!
[22:35] <nwatkins`> that's pretty much the end of the hadoop fixes for the time being--at least for the students this semester. lots more cleanup stuff to do :)
[22:42] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[22:43] <Tv> nwatkins`: hehe, nothing like slave labour ;)
[22:45] <nwatkins`> Tv: hah...
[22:45] <nwatkins`> Tv: it's really quite nice here in the cubicle
[22:46] <nwatkins`> yikes--the wiki is crashing a lot today
[22:48] <gregaf1> it's just been bad lately, there's no way around it
[23:17] * verwilst (~verwilst@d51A5B077.access.telenet.be) Quit (Quit: Ex-Chat)
[23:26] * grape (~grape@ has joined #ceph
[23:31] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[23:36] * stass (stas@ssh.deglitch.com) Quit (Read error: Connection reset by peer)
[23:41] * stass (stas@ssh.deglitch.com) has joined #ceph
[23:53] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.