#ceph IRC Log


IRC Log for 2011-08-11

Timestamps are in GMT/BST.

[0:13] * jim (~chatzilla@astound-69-42-16-6.ca.astound.net) has joined #ceph
[0:42] * hutchins (~hutchins@c-75-71-83-44.hsd1.co.comcast.net) has joined #ceph
[0:43] <hutchins> I have a question regarding the OSDs. Are they OSDs as per the T10 specification?
[0:44] <Tv> hutchins: no
[0:44] <Tv> but they fill a similar role
[0:45] <hutchins> Ah ok
[0:45] <hutchins> thanks
[0:47] <hutchins> Is there documentation on the Ceph OSD?
[0:47] <Tv> hutchins: what kind of documentation are you looking for?
[0:48] <hutchins> Explanation of the design of the OSD.
[0:49] <Tv> the academic papers are best for that.. http://ceph.newdream.net/publications/
[0:49] <Tv> first one is probably the best one
[0:52] <hutchins> ok
[1:21] * hutchins (~hutchins@c-75-71-83-44.hsd1.co.comcast.net) Quit (Remote host closed the connection)
[1:27] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[2:01] * Tv (~Tv|work@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[2:08] * huangjun (~root@ has joined #ceph
[2:21] * greglap (~Adium@ has joined #ceph
[2:45] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[2:56] * cmccabe (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[3:17] * greglap (~Adium@ Quit (Quit: Leaving.)
[3:21] * jim (~chatzilla@astound-69-42-16-6.ca.astound.net) Quit (Ping timeout: 480 seconds)
[3:27] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Operation timed out)
[3:28] * yoshi (~yoshi@p10166-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:38] * lxo (~aoliva@09GAAF054.tor-irc.dnsbl.oftc.net) Quit (Quit: later)
[3:39] * lxo (~aoliva@9YYAAARTQ.tor-irc.dnsbl.oftc.net) has joined #ceph
[3:48] * jim (~chatzilla@astound-69-42-16-6.ca.astound.net) has joined #ceph
[3:49] * jojy (~jojyvargh@70-35-37-146.static.wiline.com) Quit (Quit: jojy)
[4:19] * jim (~chatzilla@astound-69-42-16-6.ca.astound.net) Quit (Ping timeout: 480 seconds)
[4:59] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[4:59] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) Quit ()
[5:49] * yoshi (~yoshi@p10166-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[6:44] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[6:46] * hutchins (~hutchins@c-75-71-83-44.hsd1.co.comcast.net) has joined #ceph
[6:48] * hutchins (~hutchins@c-75-71-83-44.hsd1.co.comcast.net) Quit (Remote host closed the connection)
[7:13] * yoshi (~yoshi@u610174.xgsfmg1.imtp.tachikawa.mopera.net) has joined #ceph
[8:04] * yoshi (~yoshi@u610174.xgsfmg1.imtp.tachikawa.mopera.net) Quit (Remote host closed the connection)
[8:11] * NeonLicht (~NeonLicht@darwin.ugr.es) has joined #ceph
[8:11] <NeonLicht> Hello.
[8:42] * jim (~chatzilla@astound-69-42-16-6.ca.astound.net) has joined #ceph
[10:49] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[11:21] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[11:24] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[11:26] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[11:36] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[11:44] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[13:24] * huangjun (~root@ Quit (Quit: Lost terminal)
[13:34] * Dantman (~dantman@S0106001731dfdb56.vs.shawcable.net) Quit (Read error: Operation timed out)
[14:35] * huangjun (~root@ has joined #ceph
[14:47] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[14:54] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:07] * Juul (~Juul@slim.visitor.camp.ccc.de) has joined #ceph
[15:12] * huangjun (~root@ Quit (Ping timeout: 480 seconds)
[15:13] * Dantman (~dantman@S0106001731dfdb56.vs.shawcable.net) has joined #ceph
[15:14] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[15:15] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[16:00] * lxo (~aoliva@9YYAAARTQ.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[16:13] * Juul (~Juul@slim.visitor.camp.ccc.de) Quit (Ping timeout: 480 seconds)
[16:18] * lxo (~aoliva@9YYAAASG6.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:23] * Juul (~Juul@ has joined #ceph
[16:51] * lxo (~aoliva@9YYAAASG6.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[16:51] * lxo (~aoliva@659AADKHT.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:18] * Juul (~Juul@ Quit (Ping timeout: 480 seconds)
[17:24] * Tv (~Tv|work@aon.hq.newdream.net) has joined #ceph
[17:29] * Juul (~Juul@slim.visitor.camp.ccc.de) has joined #ceph
[17:36] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:52] * greglap (~Adium@ has joined #ceph
[18:00] * Juul (~Juul@slim.visitor.camp.ccc.de) Quit (Ping timeout: 480 seconds)
[18:09] * Juul (~Juul@ has joined #ceph
[18:35] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[18:41] * greglap (~Adium@ Quit (Quit: Leaving.)
[18:41] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[18:41] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[18:47] <joshd> NeonLicht: hi
[18:51] * cmccabe (~cmccabe@ has joined #ceph
[19:04] * aliguori (~anthony@ has joined #ceph
[19:14] * jojy (~jojyvargh@70-35-37-146.static.wiline.com) has joined #ceph
[19:22] * Juul (~Juul@ Quit (Ping timeout: 480 seconds)
[19:32] * hutchins (~hutchins@ltc-vpn.dothill.com) has joined #ceph
[19:52] <NeonLicht> Hi, joshd, I almost miss your greeting among so many messages! :-)
[19:54] <joshd> heh
[19:54] <gregaf> we're a lot more active during the US' West coast business hours :)
[19:55] <NeonLicht> Ah! I thought I was going to need to wait another 10 hours for someone to laugh at my joke!
[19:56] <NeonLicht> It must be around 1000 there, right?
[19:56] <gregaf> 10:56 am
[19:56] <NeonLicht> 1956 here :)
[19:56] <gregaf> for all the developers, right now
[19:57] <NeonLicht> Good morning, then!
[19:57] <gregaf> occasionally two users in the same time zone will bump into each other but all the idlers have figured out nothing interesting is going to happen outside of our work time ;)
[19:57] <gregaf> and good evening to you!
[19:59] <NeonLicht> I'm learning about distributed FSs and ceph, and thought I might hang around the channel for a while to learn some more.
[20:01] <cmccabe> so about this rados_exec question
[20:01] <gregaf> well right now you're more likely to see configuration issues and coding stuff than anything about designing distributed FSes, but enjoy!
[20:02] <cmccabe> the way I see it, we need to provide an API in librados to load these modules to make rados_exec usable
[20:02] <cmccabe> otherwise we should take it out
[20:02] <cmccabe> because it can't actually be used
[20:03] <gregaf> cmccabe: we had such an API and ripped it out because it was too much trouble; it's now just on the sysadmins to handle class installation
[20:03] <NeonLicht> Thanks, gregaf.
[20:03] <cmccabe> gregaf: I remember that discussion, I just didn't remember the resolution
[20:03] <gregaf> and we've now exhausted the limit of my knowledge about classes, Sage is out doing an interview or something and I'm not sure if yehudasa is still here or not
[20:04] <gregaf> cmccabe: yeah, the resolution was that anybody wanting classes can handle the install themselves because there are already good tools for such things
[20:04] <gregaf> and the normal Ceph install has the ability to take care of RBD, which is our only home-grown user
[20:04] <cmccabe> gregaf: perhaps the test can test against an RBD module then
[20:05] <cmccabe> gregaf: if that is installed by default
[20:05] <gregaf> probably the best bet, but I really don't know what you'll need to go through to make it happen, sorry
[20:05] <gregaf> though joshd might?
[20:06] <joshd> there's no extra setup needed for rbd
[20:07] <joshd> for the test, you could probably call rados_exec(ioctx, "rbd", "assign_bid", ...)
[20:08] <joshd> the other rbd class methods deal with snapshots, which would require an rbd image
[20:08] <cmccabe> joshd: ok, great
[20:08] <cmccabe> joshd: what does assign_bid do?
[20:09] <joshd> it's used for a unique id for rbd images - it's the first step of rbd image creation
[20:09] <cmccabe> so really, nothing visible
[20:10] <cmccabe> well ok
[20:10] <cmccabe> calling that is better than not testing rados_exec
[20:11] <joshd> you could call it more than once and make sure it gave you different ids, but there's nothing that would test just rados_exec and not the underlying class
[20:13] <gregaf> given that rados_exec's whole purpose is calling external code, there's not a lot you can do to test it in terms of consequences anyway - in this case, if it returns a valid ID you've basically tested the successful path of the librados part of it all the way through
[20:13] <joshd> oh, looking at it more closely you have to call it on the RBD_INFO object (from include/rbd_types.h)
[20:14] <gregaf> so depending on how thorough this test suite is it's basically making one successful call and then making a nonsense call and making sure you get back the right response
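The test being discussed could look roughly like the following librados C sketch. This is only an illustration, not code from the log: it assumes a running cluster, the default "rbd" pool, and that the rbd class is available on the OSDs; the object name "rbd_info" is taken from include/rbd_types.h as joshd noted, and error handling is trimmed.

```c
/* Sketch of a rados_exec test against the rbd class, as discussed above.
 * Assumes a reachable cluster and /etc/ceph/ceph.conf; link with -lrados. */
#include <stdio.h>
#include <rados/librados.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t ioctx;
    char outbuf[128];

    rados_create(&cluster, NULL);        /* default client id */
    rados_conf_read_file(cluster, NULL); /* read the default ceph.conf */
    rados_connect(cluster);
    rados_ioctx_create(cluster, "rbd", &ioctx);

    /* Invoke the rbd class's assign_bid method on the rbd info object.
     * "rbd_info" is the RBD_INFO object name from include/rbd_types.h. */
    int r = rados_exec(ioctx, "rbd_info", "rbd", "assign_bid",
                       NULL, 0, outbuf, sizeof(outbuf));
    if (r < 0)
        fprintf(stderr, "rados_exec failed: %d\n", r);
    else
        printf("assign_bid returned %d bytes\n", r);

    rados_ioctx_destroy(ioctx);
    rados_shutdown(cluster);
    return r < 0;
}
```

Calling it twice and checking for distinct ids, as joshd suggests, would exercise the class as well as the librados plumbing.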
[20:53] * cp (~cp@ has joined #ceph
[20:55] <cp> Hi everyone
[20:56] <joshd> hi cp
[21:05] <cp> I have some questions about setting up Ceph. Who/where should I ask?
[21:06] <cp> Hi Josh
[21:06] <joshd> ask away
[21:06] <cp> Thanks :)
[21:06] <cp> I've set up a simple 1-node ceph.conf file but get an error like this:
[21:06] <cp> ** ERROR: error creating empty object store in ceph_osd/dev/osd0: Operation not supported
[21:07] <cp> when I run: mkcephfs --allhosts -c /etc/ceph/ceph2.conf -k /etc/ceph/keyring.bin
[21:08] <cp> Also:
[21:08] <cp> failed: '/usr/bin/cosd -c /etc/ceph/ceph2.conf --monmap /tmp/monmap.9434 -i 0 --mkfs --osd-data ceph_osd/dev/osd0'
[21:08] <joshd> is ceph_osd/dev/osd0 a disk, or is it a mounted fs? if so, is it btrfs?
[21:09] <joshd> can you pastebin your ceph.conf?
[21:09] <cp> It's a directory. ext3 I think, not btrfs
[21:09] <joshd> is it mounted with the user_xattr option?
[21:09] <cp> hmmm... ?
[21:10] <joshd> I mean ceph_osd/dev/osd0 - it needs to have xattrs enabled, since ceph uses them extensively
[21:11] <joshd> you'll also have more luck with ext4 or btrfs than ext3
[21:11] <cp> Ah, how do I go about that (I'm using ubuntu and very new to a lot of this stuff)
[21:12] <cp> I'll try to limit the questions :)
[21:13] <joshd> in /etc/fstab, change the line for ceph_osd/dev/osd0 (or whatever partition it's on) where it says ext3 (options) to include user_xattr
[21:14] <joshd> like this: /dev/sdb1 /more ext4 defaults,user_xattr,barrier=1 0 2
[21:14] <joshd> then unmount and remount the partition
[21:16] <cp> [global]
[21:16] <cp> log dir = ceph_conf/out
[21:16] <cp> logger dir = ceph_conf/log
[21:16] <cp> chdir = ""
[21:16] <cp> pid file = ceph_conf/out/$type$id.pid
[21:16] <cp> [mds]
[21:16] <cp> pid file = ceph_conf/out/$name.pid
[21:16] <cp> lockdep = 1
[21:16] <cp> debug ms = 1
[21:16] <cp> debug mds = 20
[21:16] <cp> mds log max segments = 2
[21:16] <cp> [mds0]
[21:16] <cp> #host=hdp2
[21:16] <cp> mon addr =
[21:16] <cp> [mon]
[21:16] <cp> lockdep = 1
[21:16] <cp> debug mon = 20
[21:16] <cp> debug paxos = 20
[21:16] <cp> debug ms = 1
[21:16] <cp> mon data = ceph_osd/dev/mon$id
[21:16] <cp> [mon0]
[21:16] <cp> #host = hdp2
[21:16] <cp> mon addr =
[21:16] <cp> [osd]
[21:16] <cp> lockdep = 1
[21:16] <cp> debug ms = 1
[21:16] <cp> debug osd = 25
[21:16] <cp> debug journal = 20
[21:16] <cp> debug filestore = 10
[21:17] <cp> [osd0]
[21:17] <cp> #host = hdp4
[21:17] <cp> user $USER
[21:17] <cp> osd data = ceph_osd/dev/osd0
[21:17] <cp> osd journal = ceph_osd/dev/osd0/journal
[21:17] <cp> osd journal size = 100
[21:17] <cp> [mds.a]
[21:17] <Tv> gregaf, cmccabe: we could add a "ping" class that does something like rot13 on its argument and returns it; then you can test it..
[21:18] <cmccabe> tv: yeah, that could be worthwhile
[21:18] <cmccabe> although presumably there will be some coverage by the rbd tests as well
[21:19] <Tv> yeah
[21:21] <cp> joshd: Hmmm... I haven't set up ceph_osd/dev/osd0 as any kind of disk, it's just a directory at the moment. Should I be creating/mounting something special for it?
[21:22] <joshd> cp: it's not necessary, especially if you're just testing it, but you'll get better performance by putting the osd journal and the osd data on separate disks from each other
[21:23] <joshd> cp: it'll run just fine with a directory
[21:25] <cp> joshd: Right, I read that and it made sense. At the moment I'm at the testing stage. (and apologies that I'm not that skilled in linux guts) So, since I'm running it in a directory what do I do about the xattrs?
[21:26] <Tv> cp: the above for the filesystem containing that directory
[21:28] <cp> So this is my new line: /dev/mapper/node2-root / ext4 defaults,user_xattr,barrier=1 errors=remount-ro 0 1
[21:28] <cp> with a comma :)
[21:30] <Tv> looks good; since it's the root filesystem, you can't unmount it; either "mount -o remount,user_xattr /" or reboot
[21:45] <slang> wido: trying to setup/use radosgw with lighttpd, and your libs3 fork as the client, I see errors while trying to create a bucket:
[21:45] <slang> ERROR: ErrorMissingContentLength
[21:45] <slang> wido: list works ok though
[21:46] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[21:46] <slang> and it looks like the Content-Length is in the http header:
[21:47] <slang> (well, I can't seem to paste from tcpdump)
[21:47] <slang> but its there!
[21:49] <cp> joshd,tv: Thanks. The xattrs definitely weren't accessible before but are now.
[21:50] <sagewk> tv: g++: Internal error: Killed (program cc1plus)
[21:50] <sagewk> do you remember what caused that before?
[21:50] <sagewk> (on the gitbuilder vms)
[21:51] <sagewk> bad ccache or something?
[21:52] <gregaf> we saw it once from an utterly bizarre bug in gcc with one of our stupid binaries that we fixed by changing the binary in question
[21:52] <gregaf> testlibrados, maybe?
[21:52] <gregaf> I don't remember anything else doing it though
[21:52] <yehudasa> sagewk: usually bad ccache
[21:57] <gregaf> sagewk: is that what happened to one of the gitbuilders? and did you notice the other one is sad too?
[22:00] <sagewk> gregaf: different one
[22:12] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[22:15] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit ()
[22:16] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[22:43] <Tv> sagewk: hrmm.. can't find anything in my irc/jabber logs..
[22:44] <Tv> sagewk: the ccache theory is easy to test, though
[22:44] <Tv> what gitbuilder is this?
[22:45] <sagewk> figured it out, just oom
[22:45] <Tv> ahh
[22:45] <sagewk> the dho one. blah
[22:45] <Tv> i have a dim memory of either oom or disk filling doing that, earlier
[22:45] <Tv> crappy g++
[22:45] <Tv> (that's not an internal error..)
[22:46] <jojy> is it possible to create a pg such that we can assign a preferred OSD to it, and then assign that pg to an rbd device?
[22:48] <joshd> jojy: I think you can create a special pool for such rbd devices and adjust your crushmap to use certain osds for that pool
[22:48] <joshd> what's the goal of doing this?
[22:50] * MK_FG (~MK_FG@ Quit (Ping timeout: 480 seconds)
[22:50] <jojy> we are trying to measure performance of reads for an rbd device if it can get data from local OSDs
[22:50] <jojy> for us the client and OSD are on the same host
[22:51] <jojy> correction: client and OSD COULD be on the same host
[22:51] <jojy> where client = rbd block device
[22:52] <joshd> ah, ok
[22:53] <joshd> I think sjust also added an option to localize reads? that may apply here?
[22:54] <jojy> also the next step might be to try reading blocks directly from the disks if its locally available instead of going thru the tcp request process
[22:54] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[22:54] <jojy> i was chatting with yehuda yesterday and he mentioned that it might be a problem to do so
[22:55] <jojy> but i think it might be worth trying
[22:57] <jojy> joshd: did u say there is already an option for doing local reads?
[22:57] <joshd> apparently it only works when there are no writes
[22:58] <jojy> writes could still go thro the normal request process
[22:58] <Tv> jojy: going straight to the objects, skipping the osd, sounds like a can of worms
[22:58] <jojy> Tv: thats exactly what yehuda told me :)
[22:58] * Tv . o O ( ... what does a can of worms sound like? )
[22:59] * hutchins (~hutchins@ltc-vpn.dothill.com) Quit (Remote host closed the connection)
[23:01] * The_Bishop (~bishop@port-92-206-21-65.dynamic.qsc.de) has joined #ceph
[23:03] <jojy> joshd: is there a quick doc somewhere for creating special pools ?
[23:03] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[23:06] <joshd> jojy: there's http://ceph.newdream.net/wiki/Custom_data_placement_with_CRUSH
[23:18] <jojy> joshd: thanks! let me see if i can come up with a special pool map for what we are trying to do ..
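The pool joshd describes would be built by editing the CRUSH map to add a rule that draws only from the local OSDs. The fragment below is a sketch of the general shape of such a rule in the decompiled CRUSH map text format; all names, numbers, and the bucket layout are hypothetical, and the wiki page linked above covers the actual procedure for compiling the map and attaching a pool to the rule.

```
# Hypothetical CRUSH rule keeping a pool's data on a chosen set of OSDs.
rule local_rbd {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        step take local_host          # a bucket containing only the local OSDs
        step choose firstn 0 type osd
        step emit
}
```

A pool created for the rbd devices in question would then be pointed at that ruleset, so reads and writes for those images land on the local OSDs.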
[23:53] <cp> New error:
[23:53] <cp> service ceph -a start
[23:53] <cp> === mon.0 ===
[23:53] <cp> Starting Ceph mon0 on node2...
[23:53] <cp> ** WARNING: Ceph is still under heavy development, and is only suitable for **
[23:53] <cp> ** testing and review. Do not trust it with important data. **
[23:53] <cp> problem opening monitor store in ceph_osd/dev/mon0: No such file or directory
[23:53] <cp> failed: ' /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf '
[23:54] <cp> ... but the directory and file are there and were created
[23:55] <joshd> cp: what does 'ceph -s' say?
[23:55] <cp> 2011-08-11 22:55:28.491224 7fb8c964f700 -- :/11485 >> pipe(0x2432830 sd=3 pgs=0 cs=0 l=0).fault first fault
[23:56] <cp> hmmm... but this machine is .1.100
[23:56] <cp> strange
[23:56] <joshd> that means the monitor definitely didn't start
[23:56] <cp> ah, ok
[23:59] <joshd> you might need to use an absolute path for most of the things in your ceph.conf
[23:59] <cp> joshd: OK, I'll try that
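joshd's closing suggestion is the likely fix for the "No such file or directory" error: relative paths like ceph_osd/dev/mon0 in cp's pasted ceph.conf resolve against whatever directory each daemon starts in (and the config sets chdir = ""). A sketch of the same entries with absolute paths, where the /data prefix is made up for illustration:

```
[mon]
        mon data = /data/ceph_osd/dev/mon$id
[osd]
        osd data = /data/ceph_osd/dev/osd$id
        osd journal = /data/ceph_osd/dev/osd$id/journal
```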

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.