#ceph IRC Log


IRC Log for 2011-11-15

Timestamps are in GMT/BST.

[0:00] <grape> Tv: are there any mounting/unmounting concerns? how will it handle a reboot?
[0:01] <Tv> grape: init scripts should do it for you, but once again, just don't use btrfs_devs, manage it like any data partition, and you don't need to ask any of these questions ;)
[0:02] <grape> Tv: fair enough
[0:05] * jvarghese (~jvarghese@ has joined #ceph
[0:06] * jvarghese is now known as jojy
[0:08] <mgalkiewicz> gregaf: Yes I know. I did it. However no matter what caps I configure it still hangs during ceph -s command.
[0:09] <mgalkiewicz> gregaf: It seems that only with client.admin key other nodes are able to see ceph's stats etc.
[0:09] <gregaf> mgalkiewicz: yes, I'm referring to the keyring, not the caps :)
[0:10] <gregaf> if you want you can add other clients, but you need to do more than just create one on the local node with the right caps; you need to actually feed that key (with associated caps) into the monitor cluster
[0:10] <mgalkiewicz> I did it
[0:11] <gregaf> how?
[0:11] <mgalkiewicz> ceph auth add ...
[0:11] <mgalkiewicz> on monitor
[0:11] <gregaf> hmm
[0:11] <gregaf> I think this is actually documented on the wiki, although it seems to be having trouble
[0:12] <mgalkiewicz> I havent found it on wiki
[0:12] <mgalkiewicz> ceph documentation is really poor:(
[0:12] <gregaf> yeah :(
[0:13] <grape> A doc sprint couldn't hurt ;-)
[0:14] <grape> You can count me in. A couple of days working on docs would probably save me five to ten figuring things out.
[0:14] <gregaf> http://ceph.newdream.net/wiki/Cephx has some limited info about manipulating keys
[0:15] <mgalkiewicz> I know this page
[0:15] <mgalkiewicz> keys from my second node are shown in ceph auth list
[0:16] <mgalkiewicz> they also have caps allow rwx
[0:16] <mgalkiewicz> osd and mds is properly added to the cluster
[0:16] <gregaf> when you dump the keyring, are there any differences between the client.admin caps and the caps you've given your new keys?
[0:17] <mgalkiewicz> none
[0:17] <gregaf> they need monitor caps too, not just osd/mds ones
[0:17] <mgalkiewicz> mds allow
[0:17] <mgalkiewicz> mon allow *
[0:17] <mgalkiewicz> osd allow *
[0:18] <mgalkiewicz> these are caps for osd.1
[0:18] <gregaf> oh, I see
[0:18] <gregaf> (stupid bit of user unfriendliness follows) so when you run ceph tool it defaults to using client.admin
[0:18] <gregaf> and if that key isn't in the local keyring it'll probably fail
[0:19] <gregaf> you need to specify the right key with the -n option
[0:19] <mgalkiewicz> hmm give me a sec
[0:19] <gregaf> and actually I'm not sure that the ceph tool will let you use OSD caps, since those are for OSDs ;)
[0:20] <gregaf> I'd really just get the client.admin key wherever you need it
[0:20] <mgalkiewicz> it is not easy to transfer it in my cluster
[0:21] <mgalkiewicz> option -n is for setting name
[0:21] <gregaf> yes
[0:21] <mgalkiewicz> it works
[0:21] <gregaf> cool
[0:21] <mgalkiewicz> ceph -s -n osd.1
[0:21] <mgalkiewicz> great
[0:22] <mgalkiewicz> so the next question is what is the difference between rwx and *?
[0:22] <mgalkiewicz> and how to properly set caps
[0:22] <gregaf> IIRC, * gives you caps on everything, period, the end
[0:23] <gregaf> rwx is your default caps, but if you don't own the data in question (eg, RADOS pools) you can't touch it
[0:23] <mgalkiewicz> I mean what caps should have osd, what mds
[0:23] <gregaf> check out what's default-generated, those are correct
[0:24] <mgalkiewicz> ok
[0:24] <gregaf> it's not a very fine-grained permission system right now
[0:24] <mgalkiewicz> and if after word "allow" there is nothing more it means no rights?
[0:26] <mgalkiewicz> I also discovered some weird behaviour. If caps are defined in keyring, all caps specified as parameters of ceph auth command are ignored. I am not sure if it is desired.
[0:27] <gregaf> let me check on the allow thing
[0:27] <mgalkiewicz> ok
[0:27] <gregaf> I think it means nothing, although for the mds it might just be all there is to say...
[0:34] <mgalkiewicz> if it means nothing it is ignored
[0:34] <mgalkiewicz> and what about caps in keyring?
[0:35] <gregaf> those are the ones that matter; the caps in the keyring
[0:36] <mgalkiewicz> I think that most unix commands prefer parameters over settings from a configuration file.
[0:37] <mgalkiewicz> I am not sure if ceph has a good approach.
[0:37] <gregaf> the keyring that you get off the monitor is what the cluster can validate as being correct; you can feed those into the monitors however you like
[0:37] <gregaf> but if you could just specify new caps when you started up a new command...eww
[0:38] * gregorg_taf (~Greg@ has joined #ceph
[0:38] * gregorg (~Greg@ Quit (Read error: Connection reset by peer)
[0:40] <mgalkiewicz> well I just suggest that this behaviour should be consistent with other unix commands. Anyway thx for help! You saved me a lot of work:)
[0:40] <gregaf> yeah, sorry for the confusion!
[0:41] <gregaf> if you think there's a problem with the tools you should create a bug in the tracker, btw — conflicting desires between a local keyring file and the command-line options is definitely a bit weird but we should try and do something to resolve it rather than silently letting the file win
[0:44] <mgalkiewicz> Ok I will create a bug. You can consider then how to solve this. Maybe more users will agree that it doesn't work how they expect.
[0:45] <mgalkiewicz> thx once again
[0:45] <mgalkiewicz> bye
[0:46] * sagelap (~sage@wireless-wscc-users-2930.sc11.org) has joined #ceph
[0:46] * mgalkiewicz (~maciej.ga@ Quit (Quit: Ex-Chat)
[0:47] <sagelap> tv: ping?
[0:49] <gregaf> he stepped out a bit ago, I imagine he's meeting about something
[0:54] <Tv> sagelap: pong
[0:54] <Tv> just picked RK's brain about the new sepia networking
[1:02] * diegows (~diegows@50-57-106-86.static.cloud-ips.com) has joined #ceph
[1:02] * adjohn (~adjohn@m870536d0.tmodns.net) Quit (Read error: Connection reset by peer)
[1:02] <diegows> hi
[1:02] * adjohn (~adjohn@ has joined #ceph
[1:03] <diegows> there is a stable branch in the git subtree but FAQ says that Ceph is not ready for production
[1:03] * adjohn (~adjohn@ Quit ()
[1:03] * adjohn (~adjohn@ has joined #ceph
[1:03] <diegows> is the FAQ outdated or stable doesn't mean what I think? :)
[1:05] <ajm> the latter
[1:06] <sagelap> tv: der, need to fix my pidgin alert settings
[1:06] <sagelap> tv: any red flags in the mon bootstrap braindump?
[1:06] <diegows> ajm: ok, thanks :)
[1:06] <ajm> diegows: someone from the ceph team could probably comment better, but I think the term "overabundance of caution" comes to mind
[1:07] <ajm> well s/over//
[1:07] <Tv> sagelap: haven't put in enough brain to really know
[1:07] <diegows> the project looks great, I'm a little anxious but I have no time for an adventure right now :)
[1:08] <Tv> sagelap: i'm worried about crushmaps etc but not in any specific way
[1:08] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[1:08] <ajm> diegows: its a pretty cool adventure if you do decide to take it
[1:09] <sagelap> tv: k. the main thing is it currently requires specifying the monitor ips for initial cluster, and for new monitors. once we have the subnets thing we take as much advantage of that as we can..
[1:09] <Tv> sagelap: yeah that will have to change but obvsly needs the subnet stuff first
[1:09] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[1:09] <Tv> sagelap: i spent most of today coming up with a plan of how we'll use the new sepia hardware, it's looking good as in i think we didn't purchase the wrong things etc
[1:10] <sagelap> tv: phew!
[1:10] <Tv> sagelap: i went through and verified remote management capabilities of every component etc
[1:11] <Tv> sagelap: there's a network diagram that's still messy but it finally exists, etc
[1:12] <sagelap> tv: great.
[1:13] <sagelap> tv: further progress on the mon stuff is blocked until the subnets piece is there.. let's defer whatever planning you can until that's ready?
[1:14] <Tv> sagelap: yeah i needed to do some just to get over my email from this morning, i'll get back to the subnet thing; it looks easy
[1:19] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[1:23] <grape> grrr who the heck is this ubuntu user #$%$#$#%!!
[1:26] <grape> I fought with this remote access with no user business this morning, but it looks like he snuck back up on me. Does anyone have any config info that is better than the docs?
[1:33] * sagelap (~sage@wireless-wscc-users-2930.sc11.org) Quit (Ping timeout: 480 seconds)
[1:43] * adjohn is now known as Guest17070
[1:43] * adjohn (~adjohn@m870536d0.tmodns.net) has joined #ceph
[1:44] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[1:44] * adjohn (~adjohn@m870536d0.tmodns.net) Quit (Read error: Connection reset by peer)
[1:44] * adjohn (~adjohn@m870536d0.tmodns.net) has joined #ceph
[1:45] * adjohn is now known as Guest17071
[1:45] * Guest17071 (~adjohn@m870536d0.tmodns.net) Quit (Read error: Connection reset by peer)
[1:45] * adjohn (~adjohn@m870536d0.tmodns.net) has joined #ceph
[1:46] * adjohn is now known as Guest17072
[1:46] * Guest17072 (~adjohn@m870536d0.tmodns.net) Quit (Read error: Connection reset by peer)
[1:46] * adjohn (~adjohn@m870536d0.tmodns.net) has joined #ceph
[1:47] * Guest17070 (~adjohn@ Quit (Ping timeout: 480 seconds)
[1:50] * adjohn is now known as Guest17073
[1:50] * Guest17073 (~adjohn@m870536d0.tmodns.net) Quit (Read error: Connection reset by peer)
[1:50] * adjohn (~adjohn@m870536d0.tmodns.net) has joined #ceph
[1:51] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:52] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[1:52] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:53] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[1:54] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:58] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:07] * aa (~aa@r190-135-201-202.dialup.adsl.anteldata.net.uy) has joined #ceph
[3:16] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[3:30] * yoshi_ (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:30] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Read error: Connection reset by peer)
[3:34] * aa (~aa@r190-135-201-202.dialup.adsl.anteldata.net.uy) Quit (Remote host closed the connection)
[3:40] * adjohn (~adjohn@m870536d0.tmodns.net) Quit (Quit: adjohn)
[3:56] * jojy (~jvarghese@ Quit (Quit: jojy)
[3:57] * gohko (~gohko@natter.interq.or.jp) Quit (Quit: Leaving...)
[3:58] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[4:02] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:03] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) has joined #ceph
[4:06] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[4:25] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[4:59] * gohko (~gohko@natter.interq.or.jp) has joined #ceph
[5:13] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[5:41] * bchrisman (~Adium@c-67-161-45-211.hsd1.ca.comcast.net) has joined #ceph
[5:49] * bchrisman (~Adium@c-67-161-45-211.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[6:12] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[6:35] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[6:49] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:07] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[9:18] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:26] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (synthon.oftc.net oxygen.oftc.net)
[9:26] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (synthon.oftc.net oxygen.oftc.net)
[9:26] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (synthon.oftc.net oxygen.oftc.net)
[9:26] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) Quit (synthon.oftc.net oxygen.oftc.net)
[9:26] * darkfader (~floh@ Quit (synthon.oftc.net oxygen.oftc.net)
[9:27] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[9:27] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[9:27] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[9:27] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) has joined #ceph
[9:27] * darkfader (~floh@ has joined #ceph
[9:33] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[9:33] <chaos__> what is the difference between op_w_rlat and op_w_lat?
[9:46] * votz_ (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) has joined #ceph
[9:47] * darkfader (~floh@ Quit (synthon.oftc.net oxygen.oftc.net)
[9:47] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) Quit (synthon.oftc.net oxygen.oftc.net)
[9:47] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (synthon.oftc.net oxygen.oftc.net)
[9:47] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (synthon.oftc.net oxygen.oftc.net)
[9:47] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (synthon.oftc.net oxygen.oftc.net)
[9:49] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[9:49] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[9:49] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[9:51] * darkfader (~floh@ has joined #ceph
[10:01] <chaos__> why is opq always 0? is it rados specific? (I don't use rados) or is my cluster always on time?
[10:19] <chaos__> and the last one.. read latency is quite ok - always 180-300us, but write latency is 498ms (today) and 639ms (last week) on average, and it can go sky high up to 2 seconds sometimes
[10:21] <chaos__> i'm not happy with such write delay, is there any way to improve this? my cluster consists of 2 osd on two separate machines, both are in the same rack and they are connected with a 100mbit link
[10:40] * Olivier_bzh (~langella@xunil.moulon.inra.fr) has joined #ceph
[10:42] <Olivier_bzh> hello everyone !
[10:42] <chaos__> hi
[10:43] <Olivier_bzh> I am a new user of ceph : I am trying to compile it on Ubuntu server 11.10
[10:44] <chaos__> there is no packages for 11.10?
[10:46] <Olivier_bzh> I have some trouble : the configure script does not find libedit
[10:46] <Olivier_bzh> I have installed required packages and libgoogle-perftools-dev
[10:46] <Olivier_bzh> I haven't seen packages for 11.10
[10:46] <Olivier_bzh> only natty
[10:46] <Olivier_bzh> but this is not working because it depends on libcrypto++8
[10:46] <Olivier_bzh> and oneiric has libcrypto++9
[10:48] <chaos__> i'm using <11.10 sorry, and core developers are here something around 16-17 CET
[10:48] <Olivier_bzh> ah ok ;-)
[10:49] <chaos__> but.. leave session here;) someone will help you
[10:49] <Olivier_bzh> So I'm investigating, I will report if I find a way to compile
[10:49] <Olivier_bzh> thank you
[10:49] <Olivier_bzh> good bye
[10:55] * Olivier_bzh (~langella@xunil.moulon.inra.fr) Quit (Read error: Operation timed out)
[11:00] * Olivier_bzh (~langella@xunil.moulon.inra.fr) has joined #ceph
[11:22] * yoshi_ (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:27] <Olivier_bzh> hello again,
[11:28] <Olivier_bzh> I've found out how to compile from ceph-0.38.tar.gz on Ubuntu 11.10 :
[11:28] <Olivier_bzh> I had to install the package :
[11:29] <Olivier_bzh> libgoogle-perftools-dev, which provides tcmalloc
[11:30] <Olivier_bzh> and set the environment variable :
[11:31] <Olivier_bzh> LIBEDIT_CFLAGS='-ledit -lcurses '
[12:10] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (Remote host closed the connection)
[13:30] * MKFG (~MK_FG@ has joined #ceph
[13:33] * MK_FG (~MK_FG@219.91-157-90.telenet.ru) Quit (Ping timeout: 480 seconds)
[13:33] * MKFG is now known as MK_FG
[13:34] * gregorg_taf (~Greg@ Quit (Ping timeout: 480 seconds)
[13:35] <psomas> gregaf: to make sure i get this right, i add public_addr and cluster_addr under the osd.$id section of the config, where pub_addr is the one the clients will use, and clust_addr the one used for replication/peering etc?
[13:36] <psomas> and what about the monitors
[13:37] * gregorg (~Greg@ has joined #ceph
[15:07] * royh (~royh@mail.vgnett.no) has left #ceph
[15:10] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[16:59] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[17:26] <damoxc> psomas: i think they just need public addresses
[18:14] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:16] <grape> ceph has started without errors on all nodes :-D
[18:16] <grape> not sure if it is good or bad that the health command isn't reporting anything back to me
[18:18] <grape> ok, now for some janitorial work: are there any mount parameters other than noatime that need to go in fstab?
[18:21] * adjohn is now known as Guest17178
[18:21] * Guest17178 (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Read error: Connection reset by peer)
[18:21] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[18:23] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[18:37] * sagelap (~sage@wireless-wscc-users-2076.sc11.org) has joined #ceph
[18:37] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Operation timed out)
[18:51] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:00] <damoxc> does anyone know what would be causing #1725
[19:01] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[19:02] * sagelap (~sage@wireless-wscc-users-2076.sc11.org) Quit (Read error: Operation timed out)
[19:03] <sjust> damoxc: looking
[19:03] <damoxc> sjust: thanks
[19:03] <grape> is it reasonable to configure logging in the [global] section of ceph.conf
[19:04] <sjust> damoxc: ext4, right?
[19:04] <damoxc> sjust: btrfs
[19:04] <sjust> damoxc: ah...
[19:04] <damoxc> sjust: running 3.1.0
[19:04] <sjust> damoxc: ah, it's likely a bug in btrfs
[19:05] <sjust> try going onto that machine and making a file with an xattr of size 3k
[19:05] <sjust> (that should succeed)
[19:05] <sjust> then overwrite the same xattr with the same value
[19:05] <sjust> if it's the bug I'm thinking of, it'll fail
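The check sjust describes can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `reset_large_xattr` and the attribute name `user.test` are made up here, `os.setxattr` is Linux-only, and the target directory must be on the filesystem under test (btrfs in this conversation) with user xattrs enabled.

```python
import os
import tempfile

def reset_large_xattr(directory, size=3 * 1024, name="user.test"):
    """Set an xattr of `size` bytes on a new file, then set it again to the same value.

    Returns None if both calls succeed, otherwise the errno of the call that failed.
    """
    value = b"x" * size
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.setxattr(path, name, value)  # first set: expected to succeed
        os.setxattr(path, name, value)  # overwrite the same xattr with the same value
        return None
    except OSError as e:
        return e.errno
    finally:
        os.remove(path)
```

On a kernel with the regression, the second call would presumably be the one that fails; the message damoxc reports a few lines later, "Value too large for defined data type", is the Linux error string for EOVERFLOW.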
[19:05] <joshd> grape: that will configure logging for command line tools too, which you might not want
[19:05] <grape> joshd: lol good point
[19:07] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) has joined #ceph
[19:08] <damoxc> sjust: yes you're correct
[19:08] <damoxc> attr_set: Value too large for defined data type
[19:08] <damoxc> sjust: do you know if that's been fixed?
[19:08] <sjust> damoxc: let me see if I can dig up the patch we got from josef
[19:09] <damoxc> sjust: awesome thanks
[19:11] <sjust> commit f7beac469a8fd8a77823a97f6ed36a41a3845425
[19:11] <sjust> Author: Josef Bacik <josef at>
[19:11] <sjust> Date: Thu Oct 13 17:11:34 2011 +0000
[19:11] <sjust> Btrfs: fix regression in re-setting a large xattr
[19:11] <sjust>
[19:11] <sjust> Recently I changed the xattr stuff to unconditionally set the xattr first in
[19:11] <sjust> case the xattr didn't exist yet. This has introduced a regression when setting
[19:11] <sjust> an xattr that already exists with a large value. If we find the key we are
[19:11] <sjust> looking for split_leaf will assume that we're extending that item. The problem
[19:11] <sjust> is the size we pass down to btrfs_search_slot includes the size of the item
[19:11] <sjust> already, so if we have the largest xattr we can possibly have plus the size of
[19:11] <sjust> the xattr item plus the xattr item that btrfs_search_slot we'd overflow the
[19:11] <sjust> leaf. Thankfully this is not what we're doing, but split_leaf doesn't know this
[19:11] <sjust> so it just returns EOVERFLOW. So in the xattr code we need to check and see if
[19:11] <sjust> we got back EOVERFLOW and treat it like EEXIST since that's really what
[19:11] <sjust> happened. Thanks,
[19:11] <sjust>
[19:11] <sjust> Signed-off-by: Josef Bacik <josef <at> redhat.com>
[19:11] <sjust> diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
[19:11] <sjust> index 69565e5..5bd7877 100644
[19:11] <sjust> --- a/fs/btrfs/xattr.c
[19:11] <sjust> +++ b/fs/btrfs/xattr.c
[19:11] <sjust> @@ -127,7 +127,18 @@ static int do_setxattr(struct btrfs_trans_handle *trans,
[19:11] <sjust> again:
[19:11] <sjust> ret = btrfs_insert_xattr_item(trans, root, path, btrfs_ino(inode),
[19:11] <sjust> name, name_len, value, size);
[19:11] <sjust> - if (ret == -EEXIST) {
[19:11] <sjust> + /*
[19:11] <sjust> + * If we're setting an xattr to a new value but the new value is say
[19:12] <sjust> + * exactly BTRFS_MAX_XATTR_SIZE, we could end up with EOVERFLOW getting
[19:12] <sjust> + * back from split_leaf. This is because it thinks we'll be extending
[19:12] <sjust> + * the existing item size, but we're asking for enough space to add the
[19:12] <sjust> + * item itself. So if we get EOVERFLOW just set ret to EEXIST and let
[19:12] <sjust> + * the rest of the function figure it out.
[19:12] <sjust> + */
[19:12] <sjust> + if (ret == -EOVERFLOW)
[19:12] <sjust> + ret = -EEXIST;
[19:12] <sjust> +
[19:12] <sjust> + if (ret == -EEXIST || ret == -EOVERFLOW) {
[19:12] <sjust> if (flags & XATTR_CREATE)
[19:12] <sjust> goto out;
[19:12] <sjust> /*
[19:13] <sjust> actually: http://article.gmane.org/gmane.comp.file-systems.btrfs/13630/match=large+xattr
[19:14] <damoxc> cool thanks, I'll roll a new kernel and see how things go
[19:20] <damoxc> sjust: do you know what kernel that patch is based off, applying to 3.1.0 works with offset + fuzz, but build fails
[19:21] <damoxc> sjust: ignore me, I'm being dozy
[19:27] <gregaf> psomas: correct; and the monitors just use one address for everything
[19:28] <gregaf> chaos_: these are values you're pulling out with collectd?
[19:33] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[19:42] <grape> is the debugging page on the wiki up to date? http://ceph.newdream.net/wiki/Debugging
[19:44] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[19:45] <gregaf> grape: it looks right to me
[19:45] <grape> gregaf: awesome
[20:01] * adjohn (~adjohn@ has joined #ceph
[20:06] <grape> gregaf: a little logging makes all the difference :-)
[20:06] <gregaf> "little"
[20:08] * jojy (~jvarghese@ has joined #ceph
[20:08] <grape> gregaf: I'll dig into what it all means after lunch. It appears that everything is set up properly and not throwing errors.
[20:13] <damoxc> sjust: that works, thanks for all your help :-)
[20:13] <sjust> damoxc: glad it worked!
[20:15] <todin> hi, is this the right way to get a bt from an osd core? gdb /usr/lib/debug/usr/bin/ceph-osd /root/core
[20:17] <yehudasa_> Tv: there are a couple of tests I'm trying to disable in the swift suite which I'm having trouble with
[20:17] <yehudasa_> Tv: the thing is that those tests succeed in the regular TestFile, but fail in TestFileUTF8
[20:17] <gregaf> todin: I believe you're going to want to reference the actual ceph-osd binary and then pass in an option with gdb to include the debug symbols
[20:17] <yehudasa_> Tv: the reason it fails is because they create strange header fields that we don't want to support
[20:19] <yehudasa_> Tv: TestFileUTF8 just extends TestFile, so I tried to overload those specific tests and add the attr decorator to it, but it didn't work
[20:19] <yehudasa_> Tv: any idea?
[20:20] <Tv> yehudasa_: let me look
[20:21] <yehudasa_> Tv: thanks. specifically one of the tests is testCopy
[20:22] <yehudasa_> Tv: this didn't work: http://pastebin.com/WqQd8UqP
[20:22] <Tv> yehudasa_: test/functional/tests.py?
[20:22] <yehudasa_> Tv: yes
[20:28] <Tv> yehudasa_: i'm trying to understand all this.. what is so horrible about the characters the utf8 test puts in headers?
[20:29] <Tv> yehudasa_: it seems like it's switching create_name utility from ascii uuids to random but valid utf8 strings
[20:29] <yehudasa_> Tv: I can verify that, but apache eats it
[20:30] <Tv> i don't see any reason why @attr wouldn't work there, still poking it
[20:30] * aliguori (~anthony@ has joined #ceph
[20:32] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Remote host closed the connection)
[20:33] <Tv> those tests are fundamentally flawed though, bleh
[20:33] <Tv> there's no non-ASCII in http headers, ever
[20:34] <yehudasa_> Tv: is that written somewhere officially? had trouble finding that in the RFC
[20:35] <Tv> yehudasa_: too many browser tabs open, hold on a bit and i'll hunt it down..
[20:36] * jojy (~jvarghese@ Quit (Quit: jojy)
[20:37] <Tv> http://tools.ietf.org/html/rfc2616.html#section-4.2
[20:37] <Tv> message-header = field-name ":" [ field-value ]
[20:37] <Tv> field-name = token
[20:37] <Tv> http://tools.ietf.org/html/rfc2616.html#section-2.2
[20:37] * jojy (~jvarghese@ has joined #ceph
[20:37] <Tv> token = 1*<any CHAR except CTLs or separators>
[20:37] <Tv> CHAR = <any US-ASCII character (octets 0 - 127)>
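The grammar Tv quotes can be turned into a small validity check for header field names. A minimal sketch, assuming the separator list from RFC 2616 section 2.2; the function name `is_token` is made up for illustration, and the check covers field names only, not field values.

```python
# RFC 2616 section 2.2: separators may not appear in a token.
# ( ) < > @ , ; : \ " / [ ] ? = { } plus space and horizontal tab.
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')

def is_token(name):
    """Return True if `name` is a valid RFC 2616 token, e.g. a header field-name."""
    if not name:
        return False  # token = 1*<...>, so at least one character is required
    for ch in name:
        code = ord(ch)
        if code > 127:
            return False  # not a US-ASCII CHAR
        if code < 32 or code == 127:
            return False  # CTL
        if ch in SEPARATORS:
            return False
    return True
```

With this check, a plain ASCII name such as `X-Object-Meta-Color` passes, while any name containing a non-ASCII character, a space, or a colon is rejected.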
[20:37] <yehudasa_> ok, cool
[20:37] <yehudasa_> now, how do I disable it?
[20:37] <Tv> looking..
[20:46] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:46] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[20:54] <Tv> yehudasa_: i think what's going on is that nose's unittest.py compatibility (which swift is using) doesn't work with negated attr's, due to some implementation intricacy
[20:55] <yehudasa_> Tv: but I am able to disable tests.. if I'd put attr in TestFile, it will get disabled
[20:55] <yehudasa_> but that's a bigger hammer
[20:55] <Tv> yehudasa_: i'm filing a bug report on nose soon, but you'll need some other workaround
[20:56] <Tv> yehudasa_: it's the inheritance that breaks it
[20:56] <Tv> i think
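One possible workaround, sketched here with the standard library's `unittest.skip` rather than nose's `attr` decorator, is to override the inherited test in the subclass and mark the override as skipped. This is an assumption, not what was done in the swift suite; the class and method names are taken from the conversation, but the test bodies are placeholders.

```python
import unittest

class TestFile(unittest.TestCase):
    def testCopy(self):
        # Placeholder body standing in for the real test.
        self.assertTrue(True)

class TestFileUTF8(TestFile):
    # Override the inherited test so that only the UTF-8 variant is skipped.
    @unittest.skip("non-ASCII header fields are not supported")
    def testCopy(self):
        pass

def run_suite():
    """Run both classes and return the unittest.TestResult."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(TestFile))
    suite.addTests(loader.loadTestsFromTestCase(TestFileUTF8))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The base-class `testCopy` still runs, and the skip is recorded only against `TestFileUTF8`. Whether nose's attribute selection behaves the same way with such an override is exactly what was in doubt above.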
[20:56] <yehudasa_> yeah, was my feeling too
[20:56] <yehudasa_> I think we can just disable this test
[20:57] <yehudasa_> (I mean, for both)
[20:57] <yehudasa_> or better yet, fix it so that it doesn't use utf8 header fields
[20:58] <Tv> yehudasa_: yeah, fix upstream instead of just disable a test
[20:58] <Tv> ideally
[21:05] <Tv> http://code.google.com/p/python-nose/issues/detail?id=471
[21:13] * yehudasa_ (~yehudasa@aon.hq.newdream.net) has left #ceph
[21:13] * yehudasa_ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[21:23] * slang (~slang@chml01.drwholdings.com) Quit (Quit: Leaving.)
[21:32] * ircleuser (~ivsipi@216-239-45-4.google.com) has joined #ceph
[21:36] * slang (~slang@chml01.drwholdings.com) has joined #ceph
[21:55] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) Quit (Quit: cp)
[22:15] * ircleuser (~ivsipi@216-239-45-4.google.com) Quit (Ping timeout: 480 seconds)
[22:18] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) has joined #ceph
[22:19] * adjohn is now known as Guest17208
[22:19] * adjohn (~adjohn@m830536d0.tmodns.net) has joined #ceph
[22:20] * adjohn (~adjohn@m830536d0.tmodns.net) Quit ()
[22:25] * Guest17208 (~adjohn@ Quit (Ping timeout: 480 seconds)
[22:27] * verwilst (~verwilst@d51A5B534.access.telenet.be) has joined #ceph
[22:27] <grape> I'm trying to determine if I have a healthy cluster, so I checked health using the correct keyring and config file. Nothing is returned. Can anyone point me in a direction?
[22:28] <grape> I have effectively completed the install procedure through the end of this page -> http://ceph.newdream.net/docs/latest/ops/install/mkcephfs/
[22:29] <grape> Except for the health check that is :-)
[22:35] <gregaf> grape: what do you mean, nothing is returned?
[22:35] <gregaf> it hangs?
[22:35] <gregaf> when I run it on one of our clusters I get:
[22:35] <gregaf> root@peon1540:~# ceph health
[22:35] <gregaf> 2011-11-15 13:35:34.318946 mon <- [health]
[22:35] <gregaf> 2011-11-15 13:35:34.319396 mon.0 -> 'HEALTH_WARN 1/4479249 degraded (0.000%)' (0)
[22:35] <gregaf> on stdout
[22:36] <grape> Yeah it doesn't do anything. Returns nothing.
[22:36] <gregaf> oh, hey
[22:37] <grape> oh wait
[22:37] <gregaf> when I try this on a vstart it's failing somehow and returning 1
[22:37] <grape> nothing is running
[22:38] <Tv> grape: i bet it's exiting with status 1
[22:39] <grape> would I find that in the osd logs
[22:39] <Tv> grape: just standard unix, echo $?
[22:39] <Tv> that it does not output an error message is a bug
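The same check as `echo $?` can be done from a script, which helps when a tool fails without printing anything. A minimal sketch: `run_and_report` and `SILENT_FAILURE` are names invented here, and the stand-in command simply exits with status 1 and no output, which is what Tv suspects the ceph tool is doing.

```python
import subprocess
import sys

# Stand-in for a command that fails silently: exits with status 1, prints nothing.
SILENT_FAILURE = [sys.executable, "-c", "import sys; sys.exit(1)"]

def run_and_report(cmd):
    """Run `cmd` (a list of arguments) and return (exit status, stdout, stderr)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Passing a real command such as `["ceph", "health"]` instead of the stand-in would show whether it produced empty output together with a non-zero status.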
[22:39] <gregaf> looks like the tool got busted at some point in master :/
[22:41] <grape> the health tool?
[22:41] <gregaf> the ceph tool
[22:41] <grape> gotcha
[22:41] <gregaf> health is just one command of many
[22:41] <grape> yeah
[22:42] <gregaf> unfortunately I don't see anything obvious in the changelog
[22:43] <grape> have you seen this before?
[22:45] <gregaf> nope
[22:45] <gregaf> I'm looking into it
[22:45] <gregaf> are you building from git or did you install from a package?
[22:48] <grape> installed from package, ubuntu 11.10
[22:48] <gregaf> ugh, okay
[22:48] <grape> lol
[22:48] <gregaf> wait, which packages?
[22:48] <gregaf> just the ones in the ubuntu repo?
[22:48] <grape> I believe it is the dailys
[22:48] <grape> let me verify
[22:48] <gregaf> ah, okay
[22:49] <gregaf> that's better then
[22:49] <grape> yeah deb http://ceph.newdream.net/debian/ oneiric main
[22:50] <gregaf> okay, as long as it's not something we pushed out as a proper release
[22:50] <grape> :-D
[22:50] <grape> I can pastebin whatever you need
[22:50] <gregaf> actually it might be
[22:51] <gregaf> no, I've reproduced it locally, thanks
[22:51] <grape> whew
[22:52] <sjust> sage, sagewk: are you around?
[22:59] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) Quit (Quit: Leaving)
[23:05] <gregaf> grape: hmm, I can't reproduce it anymore and I'm not sure what commit I was on before *blush*
[23:05] <gregaf> can you run ceph-osd -v and give me the output?
[23:05] <Tv> gregaf: git reflog
[23:06] <grape> gregaf: sure. one minute.
[23:06] <grape> gregaf: ceph version 0.38 (commit:b600ec2ac7c0f2e508720f8e8bb87c3db15509b9)
[23:06] <gregaf> Tv: I always think of git reflog as a black hole for some reason
[23:06] <gregaf> but it's really pretty simple, ty!
[23:06] <gregaf> thanks grape
[23:07] <Tv> it's confusing until you realize how git branches & checking out a different state actually works
[23:07] <grape> gregaf: absolutely my pleasure
[23:11] <gregaf> okay, I can't reproduce at all any more
[23:11] <gregaf> that's just bizarre
[23:11] <gregaf> grape: what's ceph -v give you?
[23:13] * cp (~cp@108-203-50-207.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[23:15] <grape> the same thing
[23:15] <grape> exactly
[23:16] <gregaf> what's ceph health give you now?
[23:16] <grape> gregaf: root@bet:~#
[23:17] <grape> gregaf: nothing
[23:19] <gregaf> grape: are you using cephx?
[23:21] <joshd> grape: does 'ceph health --log-to-stderr 2' give you any output?
[23:21] <grape> gregaf: in the config I have auth supported = cephx
[23:21] <gregaf> looks to me like something's gone wrong with cephx and the ceph tool
[23:21] <grape> joshd: yes it does give me output
[23:22] <gregaf> grape: "unable to authenticate…"
[23:22] <grape> gregaf: joshd: 2011-11-15 17:21:28.326541 7f18bea4e740 monclient(hunting): MonClient::init(): Failed to create keyring
[23:23] <gregaf> okay, so it looks like the keyring isn't accessible to the ceph tool
[23:23] * verwilst (~verwilst@d51A5B534.access.telenet.be) Quit (Quit: Ex-Chat)
[23:23] <grape> gregaf: and it ended with 2011-11-15 17:21:28.326581 7f18bea4e740 ceph_tool_common_init failed.
[23:26] <gregaf> grape: do you know where your keyring file is?
[23:26] <grape> gregaf: I have it in /etc/ceph/
[23:30] <joshd> grape: you should specify the keyring in your ceph.conf in the [client.admin] section, i.e. keyring=/etc/ceph/client.admin.keyring
[23:30] <gregaf> that doesn't seem to be the problem
[23:31] <gregaf> I can do a vstart and then try to do it: " ./ceph health --log-to-stderr 2 --debug_monc 10 -k .ceph_keyring"
[23:31] <gregaf> and it fails to authenticate as client.admin
[23:31] <grape> joshd: does it matter what it is called?
[23:32] <joshd> grape: the filename doesn't matter
[23:32] <gregaf> ...and I can't look at what the mon thinks the right keys are because that requires an authenticated admin
[23:32] <gregaf> *sigh*
[23:33] <grape> joshd: gregaf: It seems I was literal with the docs and have keyring = /etc/ceph/$name.keyring
[23:33] <gregaf> but yeah, that seems to be the problem
[23:33] <gregaf> "2011-11-15 14:26:19.203512 7fc2b2f0a710 cephx server client.admin: handle_request get_auth_session_key for client.admin
[23:33] <gregaf> 2011-11-15 14:26:19.203517 7fc2b2f0a710 cephx server client.admin: couldn't find entity name: client.admin
[23:33] <gregaf> "
[23:34] <joshd> grape: that should work if your files are named correctly - name should expand to be client.admin I think
[23:34] <gregaf> so the keys aren't getting added for cephx, at least in vstart, probably with mkcephfs either
[23:34] * chaos__ (~chaos@hybris.inf.ug.edu.pl) Quit (Ping timeout: 480 seconds)
[23:35] <grape> in the mkcephfs command I specified the filename
[23:35] <grape> joshd: let me clean it up and see if it works
[23:37] <joshd> gregaf: it's working for me at 2e195500b5d3a8ab8512bcf2a219a6b7ff922c97
[23:38] <gregaf> I wonder if it's an issue of the scripts not being upgraded properly with the executables
[23:39] <joshd> gregaf: you're using an installed version of vstart?
[23:39] <gregaf> just the one in the repo
[23:39] <joshd> gregaf: you are passing -x, right?
[23:39] <gregaf> but there might have been some confusion with the packaging since it was done late
[23:39] <gregaf> and yes
[23:40] <joshd> gregaf: make sure you don't have a /etc/ceph/ceph.conf overriding the one vstart generates
[23:41] <gregaf> I think the way I'm managing to get it is with the .38 binaries and the master vstart script
[23:41] <gregaf> which is why I could initially reproduce it and then couldn't; because I had old binaries lying around
[23:42] <grape> I cleared the logs and ran everything from mkcephfs -a... with the updated ceph.conf. health is taking a long time to think
[23:43] <grape> there were no noticeable errors
[23:50] <grape> gregaf: ceph health --log-to-stderr 2 gives me a bunch of the following
[23:50] <grape> --> -- auth(proto 0 30 bytes) v1 -- ?+0 0x1b7d660 con 0x1b7d3f0
[23:50] <grape> >> pipe(0x1b7d180 sd=4 pgs=0 cs=0 l=0).fault with nothing to send, goiny
[23:50] <grape> >> pipe(0x1b7d180 sd=3 pgs=0 cs=1 l=0).fault first fault
[23:50] <gregaf> are the monitor nodes running?
[23:51] <todin> any gdb expert here? how can I print the value of pending_ops? this doesn't work: p 'OSD::dequeue_op' pending_ops
[23:52] <grape> gregaf: lol nothing is running
[23:52] <gregaf> todin: if you're in the right stack frame you should be able to just say "print pending_ops"
[23:52] <gregaf> grape: yeah, it's not real well-behaved in that case but it's basically complaining because it couldn't connect
[23:53] <grape> still the auth issue?
[23:53] <gregaf> no, if nothing's running and you ask the ceph tool to connect to a monitor it's going to try to
[23:53] <gregaf> and that output is what you get when it times out
[23:54] <todin> gregaf: that doesn't work. how do I change into the stack frame? I've never debugged threaded programs.
[23:54] <gregaf> ah
[23:54] <grape> gregaf: checking for health on something that isn't running never gives good results
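Editorial note: the fault lines above are what the ceph tool prints when no monitor answers. A quick way to tell "nothing is running" apart from an auth problem is to check that a monitor port accepts TCP connections at all. The following is a minimal sketch only, not part of Ceph; the helper name `monitor_reachable`, the default port 6789, and the timeout are assumptions for illustration.

```python
import socket


def monitor_reachable(host: str, port: int = 6789, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: the port default is an assumption; use whatever
    address your own ceph.conf lists for the monitor.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is listening or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `monitor_reachable("127.0.0.1")` returning False would suggest the monitor daemon is simply not up, in which case cephx and keyring settings are not yet the thing to debug.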
[23:55] <gregaf> todin: "thread 1"
[23:55] <gregaf> "frame 11"
[23:55] <gregaf> what's that give you?
[23:56] <grape> gregaf: just for the record, what should the [global] "keyring = " line read?
[23:57] <gregaf> ah! config and setup questions! *ducks*
[23:57] <grape> gregaf: and should there be multiple *.keyrings?
[23:58] <gregaf> so if you've got one keyring that holds all the keys then it should be the location of that file
[23:58] <todin> (gdb) thread 1
[23:58] <todin> [Switching to thread 1 (Thread 1646)]#11 0x000000000055a4db in OSD::dequeue_op (this=0x2353000, pg=0x26e1000) at osd/OSD.cc:5534
[23:58] <todin> 5534 in osd/OSD.cc
[23:58] <todin> (gdb) frame 11
[23:58] <todin> #11 0x000000000055a4db in OSD::dequeue_op (this=0x2353000, pg=0x26e1000) at osd/OSD.cc:5534
[23:58] <todin> 5534 in osd/OSD.cc
[23:58] <gregaf> todin: now "print pending_ops"
[23:59] <gregaf> grape: I think generally you're more likely to have a bunch of keyring files, one for each of your daemons (so that you don't need all your keys on all your nodes)
[23:59] <gregaf> in which case you probably don't want to specify a single keyring file, and you'd have to consult the docs/Tv for how best to set that up
[23:59] * MK_FG (~MK_FG@ Quit (Quit: o//)
[23:59] <grape> gregaf: so as the docs say $name.keyring does make sense
[23:59] <gregaf> yeah, that's a good strategy
[23:59] <todin> gregaf: ahh, now I understand it: you go to frame 11 on the stack, and then you can print it
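Editorial note: the `keyring = /etc/ceph/$name.keyring` setting discussed between 23:33 and 23:59 relies on `$name` expanding to the entity name, such as client.admin, so each daemon or client looks for its own file. The sketch below only illustrates that substitution under the assumption that the entity name is `<type>.<id>`; it is not Ceph's own config parser, and the function name `expand_keyring_path` is made up for this example.

```python
def expand_keyring_path(template: str, entity_type: str, entity_id: str) -> str:
    """Substitute $name in a keyring template with '<type>.<id>'.

    Illustration only: this mimics the expansion described in the log,
    it does not reproduce Ceph's actual metavariable handling.
    """
    name = f"{entity_type}.{entity_id}"
    return template.replace("$name", name)


if __name__ == "__main__":
    template = "/etc/ceph/$name.keyring"
    # Each entity resolves to its own keyring file under this scheme.
    for entity_type, entity_id in [("client", "admin"), ("osd", "0"), ("mon", "a")]:
        print(expand_keyring_path(template, entity_type, entity_id))
```

Under this scheme the ceph tool running as client.admin would look for /etc/ceph/client.admin.keyring, which is why the file on disk has to be named to match when the template is used literally.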

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.