#ceph IRC Log


IRC Log for 2012-02-16

Timestamps are in GMT/BST.

[0:08] * BManojlovic (~steki@ has joined #ceph
[0:11] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:12] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[0:18] * joao (~joao@ Quit (Quit: joao)
[0:24] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[0:28] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) Quit (Quit: Leaving)
[0:40] * Ludo (~Ludo@88-191-129-65.rev.dedibox.fr) Quit (Quit: Irssi: the Cadillac of all clients)
[0:52] * jmlowe (~Adium@c-98-223-195-84.hsd1.in.comcast.net) has joined #ceph
[1:35] * yoshi (~yoshi@p8031-ipngn2701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:47] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Quit: Leaving)
[2:22] * Ludo (~Ludo@88-191-129-65.rev.dedibox.fr) has joined #ceph
[3:10] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Quit: adjohn)
[3:19] * BManojlovic (~steki@ Quit (Ping timeout: 480 seconds)
[4:11] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) has joined #ceph
[4:15] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) Quit ()
[4:27] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) has joined #ceph
[4:35] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) Quit (Ping timeout: 480 seconds)
[4:40] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[4:45] * chutzpah (~chutz@ Quit (Quit: Leaving)
[5:21] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[6:09] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Quit: Leaving)
[6:59] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) Quit (Read error: Connection reset by peer)
[7:05] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[7:05] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) has joined #ceph
[7:27] * gohko (~gohko@natter.interq.or.jp) Quit (Quit: Leaving...)
[7:30] * gohko (~gohko@natter.interq.or.jp) has joined #ceph
[7:32] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[7:39] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[8:24] * spikebike (~bill@ Quit (Read error: Operation timed out)
[9:06] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[9:29] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[9:32] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[9:34] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[10:06] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[10:23] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) Quit (Quit: Who the hell is this Peer? If I catch him, I'll reset his connection!)
[10:26] * BManojlovic (~steki@ has joined #ceph
[11:36] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[12:21] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[12:25] * joao (~joao@89-181-154-123.net.novis.pt) has joined #ceph
[12:25] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[12:26] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[12:26] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit ()
[12:37] * yoshi (~yoshi@p8031-ipngn2701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[13:16] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[13:19] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) has joined #ceph
[13:27] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) Quit (Ping timeout: 480 seconds)
[14:22] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) has joined #ceph
[14:41] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[14:46] * pulsar (6a5be70dba@ has joined #ceph
[14:49] <pulsar> is it possible to run multiple osd daemons per machine? i would like to use multiple harddisks / mount points per node. did not find anything on that topic in the wiki area yet
[15:02] <pulsar> hmmm... but i've found articles about some gall bladder diets - maybe this will help :)
[15:11] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[15:19] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) has joined #ceph
[15:24] <iggy> pulsar: yes
[15:27] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) Quit (Ping timeout: 480 seconds)
[15:31] * BManojlovic (~steki@ Quit (Ping timeout: 480 seconds)
[15:59] <pulsar> iggy: any hints how to achieve this? as far as i see, the ceph-osd service binds to <whatever>, is there an easy way or do i have to work with eth aliases and beat the crap out of the config files?
[16:02] <jmlowe> I run 6 per machine
[16:03] <pulsar> just like that?
[16:04] <pulsar> well... perhaps i worry too much.
[16:05] <jmlowe> [osd.1]
[16:05] <jmlowe> host=myhost
[16:05] <jmlowe> [osd.2]
[16:06] <jmlowe> host=myhost
[16:06] <jmlowe> ...
[16:06] <pulsar> i take it the ports are allocated dynamically and reported back to the master then.
[16:06] <pulsar> thanks
[16:07] <jmlowe> yep, you are most assuredly overthinking it
[16:10] <pulsar> just being careful, after working with hbase/hdfs for a longer period of time i do always expect the worst results from any scenario :)
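
A minimal ceph.conf sketch of the layout jmlowe describes above, assuming one disk (mount point) per OSD. The section names and `host=myhost` come from the log; the `/srv/osd.$id` paths and the journal size are hypothetical placeholders, not values anyone in the channel gave.

```ini
; Hypothetical sketch: two OSD daemons on one machine, one disk each.
[osd]
    ; $id expands to the numeric id of each OSD (1, 2, ...),
    ; so every daemon gets its own data directory and journal.
    ; The paths below are assumptions; use your own mount points.
    osd data = /srv/osd.$id
    osd journal = /srv/osd.$id/journal
    ; journal size in MB (illustrative value)
    osd journal size = 1000

[osd.1]
    host = myhost

[osd.2]
    host = myhost
```

Each disk would be mounted at the matching `osd data` path before the cluster is created. No per-daemon port settings appear here, which fits the exchange above: the OSDs are not given fixed ports in the config.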
[16:33] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[16:41] * joao is now known as Guest2821
[16:41] * joao (~joao@ has joined #ceph
[16:41] * gregorg (~Greg@ has joined #ceph
[16:47] * Guest2821 (~joao@89-181-154-123.net.novis.pt) Quit (Ping timeout: 480 seconds)
[16:47] <iggy> pulsar: all the state is conveyed via cmon... that's the only thing I think that has static ports
[16:47] <nhm> pulsar: expecting the worst is a good habit. ;)
[16:48] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Quit: Ex-Chat)
[16:48] <nhm> jmlowe: any luck getting your storage arrays back in working order?
[16:49] <jmlowe> I'm going to shuffle around the disks and see if I can't take some out of service and keep the thermals under control
[16:52] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[16:54] <pulsar> should i expect some major pain from running ceph on top of xfs?
[16:55] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) Quit (Quit: Leaving)
[16:55] <pulsar> yay! up and running! osd e42: 76 osds: 76 up, 76 in / 69170 GB used, 138 TB / 206 TB avail
[16:57] * jmlowe (~Adium@c-98-223-195-84.hsd1.in.comcast.net) Quit (Quit: Leaving.)
[17:05] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[17:05] <nhm> pulsar: congrats, what kind of OSDs?
[17:06] <pulsar> what kind?
[17:06] <nhm> sorry, what kind of hosts for the OSDs?
[17:06] <pulsar> rented root servers
[17:06] <pulsar> over at hetzner.
[17:07] <pulsar> i use that cluster for map/reduce mostly
[17:07] <nhm> pulsar: how many OSDs per host on that setup?
[17:07] <pulsar> 2
[17:07] <pulsar> 2 racks, 20 servers each
[17:08] <nhm> nice
[17:08] <pulsar> one node per rack is reserved
[17:08] <pulsar> for monitoring or running the namenode / zookeeper etc.
[17:08] <nhm> I'll be curious to hear how your testing goes
[17:08] <pulsar> me too
[17:08] <pulsar> need to run at least files/directories in that fs
[17:09] <nhm> sounds like a good stress test.
[17:09] <pulsar> yeah, i think i'
[17:09] <nhm> I'm not sure if anyone has done a billion files on ceph yet.
[17:09] <pulsar> ll create 100mio directories with 10 files in each, 2megs of zero-data
[17:10] <pulsar> we'll see. either the fs can handle it, or i'll need to start coding a file-to-node partitioning mechanism for my workers
[17:11] <pulsar> don't really feel like reinventing the dht
[17:12] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[17:16] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[17:25] <pulsar> hmm... how long does ceph-fuse take to start the ceph client after fresh format & cluster boot?
[17:26] <pulsar> i see a lot of "scrub ok" messages in ceph -w but that's about it.
[17:33] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) has joined #ceph
[17:34] * linuxuser1357 (~linuxuser@50-82-41-66.client.mchsi.com) has joined #ceph
[17:38] <linuxuser1357> Hello, I have mounted /dev/sda1 to /ceph/osd-data. It is formatted as btrfs. When I run mkcephfs -a -c /etc/ceph/ceph.conf, I get an error back stating that it could not create an empty object store. The additional error message given is 'Is a directory'. Can anyone help?
[17:40] <pulsar> linuxuser1357: you might need to create the "osd data" directories, or configure them in the first place. had similar issues today.
[17:41] <linuxuser1357> I have 'osd data = /ceph/osd-data' in my osd.0 section...
[17:41] <pulsar> i have osd data defined in the parent/osd section for all of my nodes. should not make any difference though
[17:42] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[17:44] <pulsar> {0=1=up:creating}, lost one node already... hmmm
[17:45] <pulsar> did not experience that kind of issue when running only one osd instance per node. still waiting for the fs to become mountable
[17:46] <iggy> pulsar: did you try the normal kernel installer (instead of fuse)?
[17:46] <pulsar> i'm on 2.6.32-5-amd64
[17:46] <iggy> the last time I built a ceph cluster, I could mkcephfs and instantly mount
[17:46] <pulsar> debian stable
[17:47] <iggy> s/installer/filesystem/
[17:47] <pulsar> yeah, cephfs is not available for me/debian yet
[17:48] <pulsar> ceph -s gives me "mds e3: 1/1/1 up {0=1=up:creating}"
[17:48] <pulsar> up:creating sounds like the fs is not yet ready
[17:48] <pulsar> still looking for some explanation of those status messages
[17:59] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[18:03] <pulsar> no luck with ceph-fuse or booting up with ~80 osd nodes. i'll disable half of the osd nodes and try again.
[18:03] * gregorg (~Greg@ Quit (Quit: Quitting)
[18:04] <pulsar> 2012-02-16 18:03:55.454056 7f2916048700 -- >> pipe(0x220a780 sd=19 pgs=0 cs=0 l=0).connect claims to be not - wrong node!
[18:24] * aliguori (~anthony@ has joined #ceph
[18:27] <linuxuser1357> Question: Can OSD JOURNAL be pointed to a partition mount point? Or must it be a normal file or partition device file?
[18:35] <linuxuser1357> Can the osd journal be specified as /journal (assuming /dev/sda1 is mounted at /journal btrfs), or does it have to be /journal/file or /dev/sda1 ?
[18:38] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[18:39] <iggy> I think all the ceph guys are still at a conference, you might have to wait around a while
[18:42] <linuxuser1357> ok thanks, after looking through the source and reading some more web pages, it appears to be the case, i'm re-testing with the raw partition /dev/sda1
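
A hedged ceph.conf sketch of the two journal forms linuxuser1357 asks about, using only the paths mentioned in the log. It assumes `osd journal` must name either a regular file or a block device rather than a mount-point directory, which is what linuxuser1357 appears to conclude above. Pointing the journal at a directory may also be what produced the earlier 'Is a directory' error, but the log does not confirm that.

```ini
[osd.0]
    osd data = /ceph/osd-data

    ; Form A: journal as a regular file inside a mounted filesystem
    ; (here the filesystem mounted at /journal, as in the question).
    ;osd journal = /journal/file

    ; Form B: journal directly on a raw partition. The partition
    ; must not be mounted and must not be the one holding osd data.
    osd journal = /dev/sda1
```

Only one of the two `osd journal` lines should be active at a time. Note that the log mentions /dev/sda1 both as the data disk and as the journal candidate, so in practice the device names would need to differ.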
[18:55] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:02] * chutzpah (~chutz@ has joined #ceph
[19:04] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[19:04] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:05] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[19:11] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[19:47] <pulsar> seems it just takes a long while for the fs to initialize 80 osd nodes on xfs. during this time some nodes time out and i got those "wrong node" errors above in the local osd logfiles. will keep an eye on it as i try to increase the number of nodes to the maximum tomorrow.
[19:48] <pulsar> doing some preliminary tests over night with a couple of TB
[20:03] * BManojlovic (~steki@ has joined #ceph
[20:30] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[20:42] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Quit: adjohn)
[20:44] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[21:02] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[21:04] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Quit: adjohn)
[22:04] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[22:04] * jmlowe (~Adium@140-182-216-197.dhcp-bl.indiana.edu) has joined #ceph
[22:09] * jmlowe (~Adium@140-182-216-197.dhcp-bl.indiana.edu) Quit (Read error: Operation timed out)
[22:23] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) has joined #ceph
[22:25] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:34] * fronlius (~fronlius@f054111204.adsl.alicedsl.de) has joined #ceph
[22:52] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[23:11] * grape_ is now known as grape
[23:41] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Ping timeout: 480 seconds)
[23:49] * aliguori (~anthony@ Quit (Quit: Ex-Chat)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.