#ceph IRC Log


IRC Log for 2013-05-20

Timestamps are in GMT/BST.

[0:07] * BillK (~BillK@124-169-186-145.dyn.iinet.net.au) has joined #ceph
[0:20] * KindTwo (KindOne@h32.209.89.75.dynamic.ip.windstream.net) has joined #ceph
[0:20] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:20] * KindTwo is now known as KindOne
[0:25] * lightspeed (~lightspee@lns-c10k-ld-01-m-62-35-37-66.dsl.sta.abo.bbox.fr) has joined #ceph
[0:30] * dcasier (~dcasier@ Quit (Ping timeout: 480 seconds)
[0:31] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[0:39] * dcasier (~dcasier@ has joined #ceph
[0:41] * diegows (~diegows@ has joined #ceph
[0:45] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:45] * KindOne (~KindOne@0001a7db.user.oftc.net) has joined #ceph
[0:50] * tnt (~tnt@ Quit (Read error: Operation timed out)
[0:50] * ScOut3R (~ScOut3R@dsl51B614D7.pool.t-online.hu) has joined #ceph
[0:54] * dcasier (~dcasier@ Quit (Ping timeout: 480 seconds)
[0:56] * ebo_ (~ebo@koln-5d813c6c.pool.mediaWays.net) has joined #ceph
[0:57] * ebo_ (~ebo@koln-5d813c6c.pool.mediaWays.net) Quit ()
[0:58] * ScOut3R (~ScOut3R@dsl51B614D7.pool.t-online.hu) Quit (Remote host closed the connection)
[0:59] * ScOut3R (~ScOut3R@dsl51B614D7.pool.t-online.hu) has joined #ceph
[1:03] * ebo^ (~ebo@koln-5d811f87.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[1:08] * ScOut3R (~ScOut3R@dsl51B614D7.pool.t-online.hu) Quit (Ping timeout: 480 seconds)
[1:08] * The_Bishop (~bishop@e179006001.adsl.alicedsl.de) has joined #ceph
[1:12] * MarkN (~nathan@ has joined #ceph
[1:16] * MarkN (~nathan@ Quit ()
[1:27] * esammy (~esamuels@host-2-102-69-49.as13285.net) Quit (Quit: esammy)
[1:31] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[2:02] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[2:08] * loicd (~loic@host-78-149-113-87.as13285.net) Quit (Quit: Leaving.)
[2:20] * BManojlovic (~steki@fo-d- Quit (Quit: I'm off, and you do whatever you want...)
[2:45] * vhasi_ (vhasi@vha.si) has joined #ceph
[2:47] * MarkN (~nathan@ has joined #ceph
[2:47] * MarkN (~nathan@ has left #ceph
[2:47] * vhasi (vhasi@vha.si) Quit (Ping timeout: 480 seconds)
[2:48] * Neptu (~Hej@mail.avtech.aero) Quit (Ping timeout: 480 seconds)
[2:48] * Neptu (~Hej@mail.avtech.aero) has joined #ceph
[3:21] * matt_ (~matt@220-245-1-152.static.tpgi.com.au) has joined #ceph
[4:52] * Kioob (~kioob@luuna.daevel.fr) Quit (Ping timeout: 480 seconds)
[5:18] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[6:17] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:26] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[6:28] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) has joined #ceph
[7:25] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[8:08] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Remote host closed the connection)
[8:14] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:22] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[8:24] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) has joined #ceph
[8:33] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[8:43] * esammy (~esamuels@host-2-102-69-49.as13285.net) has joined #ceph
[8:43] * coyo (~unf@00017955.user.oftc.net) Quit (Remote host closed the connection)
[8:44] * Vjarjadian (~IceChat77@ Quit (Quit: Why is the alphabet in that order? Is it because of that song?)
[9:17] * Volture (~Volture@office.meganet.ru) Quit (Quit: You're all mean!! I'm leaving you)
[9:21] * BManojlovic (~steki@ has joined #ceph
[9:31] * vhasi_ is now known as vhasi
[9:37] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:39] * humbolt1 (~elias@80-121-52-87.adsl.highway.telekom.at) has joined #ceph
[9:43] * humbolt (~elias@80-121-50-39.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[9:49] * syed_ (~chatzilla@ has joined #ceph
[9:51] * leseb (~Adium@ has joined #ceph
[9:54] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[9:54] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[10:00] * loicd (~loic@host-78-149-113-87.as13285.net) has joined #ceph
[10:28] * LeaChim (~LeaChim@ has joined #ceph
[10:38] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[10:39] * loicd (~loic@host-78-149-113-87.as13285.net) Quit (Quit: Leaving.)
[10:57] * Morg (~oftc-webi@host- has joined #ceph
[11:08] * loicd (~loic@host-78-149-113-87.as13285.net) has joined #ceph
[11:09] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) has joined #ceph
[11:10] * eternaleye (~eternaley@2607:f878:fe00:802a::1) Quit (Ping timeout: 480 seconds)
[11:13] * tnt (~tnt@ has joined #ceph
[11:14] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[11:14] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[11:16] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[11:21] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[11:24] * danieagle (~Daniel@ has joined #ceph
[11:25] * syed_ (~chatzilla@ Quit (Ping timeout: 480 seconds)
[11:28] * dcasier (~dcasier@ has joined #ceph
[11:40] * valentin (~Cherry@ has joined #ceph
[11:41] * valentin (~Cherry@ Quit (Quit: Leaving)
[11:44] * syed_ (~chatzilla@ has joined #ceph
[11:46] * eternaleye (~eternaley@2607:f878:fe00:802a::1) has joined #ceph
[11:50] * Volture (~Volture@office.meganet.ru) has joined #ceph
[11:55] * syed_ (~chatzilla@ Quit (Read error: Connection reset by peer)
[12:00] * eternaleye (~eternaley@2607:f878:fe00:802a::1) Quit (Ping timeout: 480 seconds)
[12:03] * eternaleye (~eternaley@2607:f878:fe00:802a::1) has joined #ceph
[12:08] * loicd (~loic@host-78-149-113-87.as13285.net) Quit (Quit: Leaving.)
[12:19] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[12:28] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[12:28] <mrjack> joao: around?
[12:29] <joao> heading out to lunch in 20 minutes
[12:29] <joao> what's up?
[12:29] <mrjack> joao: i noticed something interesting.. i have set mon compact on start = true
[12:30] <mrjack> joao: i have 5 mons; if i restart mon.4, the store.db is not compacted... i tried restarting the mons from mon.4 down to mon.0 - the store.db on _ALL_ mons only compacts if i restart mon.0 (which is the current leader) - is that intentional?
[12:32] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[12:32] <joao> no
[12:34] <mrjack> ok
[12:34] * eschnou (~eschnou@249.73-201-80.adsl-dyn.isp.belgacom.be) Quit (Read error: Connection reset by peer)
[12:34] <mrjack> i sent you mail
[12:34] <mrjack> i also saw ceph-mon getting 8gb big yesterday
[12:35] <joao> mrjack, was quorum already formed when you checked the mon store's sizes again and got the 1GB size?
[12:36] <mrjack> yes
[12:36] <joao> kay
[12:36] <joao> weird :\
[12:36] <mrjack> yup
[12:36] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[12:37] <mrjack> let me know if i can do something to debug this
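For reference, the option mrjack is describing lives in the monitor section of ceph.conf; this fragment is illustrative:

```ini
# ceph.conf (illustrative fragment): compact the monitor's leveldb
# store on daemon startup, as discussed above
[mon]
    mon compact on start = true
```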
[12:43] * pixel (~pixel@ has joined #ceph
[12:46] <pixel> Hi everyone, I have xen server and rbd mapped on it. How can I start to use this block device with XEN?
[12:47] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[12:48] * Neptu (~Hej@mail.avtech.aero) Quit (Remote host closed the connection)
[12:48] * DarkAceZ (~BillyMays@ has joined #ceph
[12:56] <tnt> pixel: you can just use phy:/dev/rbd/rbdX (X being the mapped number).
[12:57] <tnt> sorry, /dev/rbdX or /dev/rbd/pool_name/image_name (if you have udev scripts in place for naming)
[12:57] <pixel> so, if I have /dev/rbd0 then we need to put it in conf file
[12:58] <tnt> yes
[12:58] * Morg (~oftc-webi@host- Quit (Quit: Page closed)
[12:58] <pixel> ok, thanks
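A minimal Xen guest-config disk stanza for the mapped device tnt describes might look like this (the guest config filename and the in-guest target device are examples, not from the session):

```
# /etc/xen/myguest.cfg (illustrative fragment)
# phy: passes the already-mapped kernel block device through to the domU
disk = ['phy:/dev/rbd0,xvda,w']
```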
[13:14] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) has joined #ceph
[13:22] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[13:41] * LeaChim (~LeaChim@ has joined #ceph
[13:42] * danieagle (~Daniel@ Quit (Quit: Later :-) and Thank You Very Much For Everything!!! ^^)
[14:03] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[14:11] * diegows (~diegows@ has joined #ceph
[14:15] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[14:25] * KindTwo (KindOne@h183.187.130.174.dynamic.ip.windstream.net) has joined #ceph
[14:27] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:28] * imjustmatthew (~imjustmat@c-24-127-107-51.hsd1.va.comcast.net) has joined #ceph
[14:32] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[14:32] * KindTwo is now known as KindOne
[14:51] <nooky> hello, i have a problem growing my cluster: today i added an osd to a node that already had an osd, did a reweight and added a replica; the crushmap is up to date, but now i have some pgs stuck unclean. i tried playing with the tunables options but it didn't solve the problem
[14:51] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[14:55] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[15:04] * pixel (~pixel@ Quit (Quit: I'm leaving you (xchat 2.4.5 or older))
[15:08] * allsystemsarego (~allsystem@ has joined #ceph
[15:11] * portante|ltp (~user@c-24-63-226-65.hsd1.ma.comcast.net) Quit (Read error: Operation timed out)
[15:17] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:22] <tchmnkyz> hey guys, will try again today. I am having problems starting up MDS services on 2 of the nodes that previously worked fine. The log file gives the following error: mds.-1.0 handle_mds_map mdsmap compatset compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding} not writeable with daemon features compat={},rocompat={},incompat={1=base v0.20,2=cl
[15:22] <mikedawson> joao: ping
[15:22] <tchmnkyz> Can someone please help me get them back up as i am running on 1 out of 3 MDS servers
[15:28] * DarkAce-Z (~BillyMays@ has joined #ceph
[15:31] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[15:46] * portante|afk (~user@ Quit (Quit: upgrading)
[15:47] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) has joined #ceph
[15:47] * markl_ (~mark@tpsit.com) has joined #ceph
[15:48] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) has joined #ceph
[15:53] * PerlStalker (~PerlStalk@ has joined #ceph
[16:08] * portante|ltp (~user@ has joined #ceph
[16:10] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[16:23] <Fetch> did a qemu+rbd build for centos ever get pushed to the Ceph EL repo?
[16:27] * brother (foobaz@vps1.hacking.dk) Quit (Ping timeout: 480 seconds)
[16:32] * Cube (~Cube@173-8-221-113-Oregon.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[16:32] * brother (~brother@vps1.hacking.dk) has joined #ceph
[16:34] * ekarlso (~ekarlso@cloudistic.me) has joined #ceph
[16:46] * tkensiski (~tkensiski@45.sub-70-197-6.myvzw.com) has joined #ceph
[16:46] * tkensiski (~tkensiski@45.sub-70-197-6.myvzw.com) has left #ceph
[16:47] * tkensiski1 (~tkensiski@45.sub-70-197-6.myvzw.com) has joined #ceph
[16:47] * tkensiski1 (~tkensiski@45.sub-70-197-6.myvzw.com) has left #ceph
[16:47] <joao> mikedawson, just saw your email and bug update
[16:47] <joao> many thanks!
[16:48] <mikedawson> joao: Glad to help. Let me know what you guys find.
[16:50] * tkensiski2 (~tkensiski@45.sub-70-197-6.myvzw.com) has joined #ceph
[16:50] * tkensiski2 (~tkensiski@45.sub-70-197-6.myvzw.com) has left #ceph
[17:00] <jefferai> hi guys -- I'm suddenly finding myself nearing a good point to upgrade Ceph
[17:00] <jefferai> I've been running a pre-bobtail dev build
[17:00] <jefferai> should I expect any issues upgrading to the current stable?
[17:00] * yehuda_hm (~yehuda@2602:306:330b:1410:6d5b:cd48:2d40:7a01) has joined #ceph
[17:01] <tnt> jefferai: I think you have to go through 0.56.6 first.
[17:02] <jefferai> tnt: ah, yeah, I see that
[17:03] <jefferai> also, I seem to remember that it's a good idea to mark your OSDs as noout if you are going to reboot
[17:03] <jefferai> does that make sense?
[17:03] <tnt> yes
[17:07] <jefferai> also mark them nodown?
[17:07] <tnt> no
[17:07] <jefferai> I guess you want them to go down
[17:07] <jefferai> but not be taken out
[17:07] <jefferai> okay
[17:07] <tnt> precisely
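The flag handling jefferai and tnt agree on, as commands (a sketch to run against the cluster before and after the reboot):

```shell
ceph osd set noout     # OSDs may be marked down, but won't be marked out
# ...reboot / upgrade the node...
ceph osd unset noout   # restore normal out-marking afterwards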
[17:08] * portante|ltp (~user@ Quit (Read error: Connection reset by peer)
[17:09] <jefferai> And I assume 12.04 is still recommended...
[17:09] <jks> how do I read the min_size setting for a pool?
[17:12] * MarkN (~nathan@ has joined #ceph
[17:17] * treaki (48d29226a7@p4FF4ABBD.dip0.t-ipconnect.de) has joined #ceph
[17:17] <jks> ceph osd dump doesn't show it... does that mean that it has the default value?
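jks: min_size normally appears on each pool line of `ceph osd dump`; if it is absent, the running version may simply predate the field. A runnable sketch extracting it from a sample pool line (the line itself is illustrative, not from a real cluster; `ceph osd pool get <pool> min_size` as an alternative is an assumption about this release):

```shell
# Extract min_size from a `ceph osd dump` pool line.
# The sample line is illustrative, not real cluster output.
dump="pool 0 'rbd' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64"
echo "$dump" | sed -n 's/.*min_size \([0-9]*\).*/\1/p'   # → 1
```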
[17:28] * BManojlovic (~steki@ Quit (Quit: I'm off, and you do whatever you want...)
[17:29] * scuttlemonkey_ is now known as scuttlemonkey
[17:29] * ChanServ sets mode +o scuttlemonkey
[17:33] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[17:37] * lightspeed (~lightspee@lns-c10k-ld-01-m-62-35-37-66.dsl.sta.abo.bbox.fr) Quit (Ping timeout: 480 seconds)
[17:39] * The_Bishop_ (~bishop@f052098107.adsl.alicedsl.de) has joined #ceph
[17:40] * lightspeed (~lightspee@i01m-62-35-37-66.d4.club-internet.fr) has joined #ceph
[17:44] <jtang> just wondering
[17:45] <jtang> any storage/devops/sysadmins want a job in ireland?
[17:45] <jtang> http://www.tchpc.tcd.ie/node/1098 -- knowning ceph might help ;)
[17:46] * The_Bishop (~bishop@e179006001.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[17:48] * DarkAce-Z is now known as DarkAceZ
[17:54] * gregaf (~Adium@2607:f298:a:607:6c73:5681:87:beda) Quit (Quit: Leaving.)
[17:55] * gregaf (~Adium@2607:f298:a:607:94dc:a315:cc72:795f) has joined #ceph
[18:02] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[18:11] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[18:25] * arye (~arye@nat-oitwireless-outside-vapornet3-a-80.Princeton.EDU) has joined #ceph
[18:27] <arye> are there any new numbers after http://ceph.com/community/ceph-performance-part-1-disk-controller-write-throughput/ for small (4k) object writes?
[18:27] * leseb (~Adium@ Quit (Quit: Leaving.)
[18:36] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[18:41] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[18:42] * loicd (~loic@host-78-149-113-87.as13285.net) has joined #ceph
[18:46] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[18:47] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[18:55] * Tamil (~tamil@ has joined #ceph
[18:57] <PerlStalker> Is there a way of seeing how much space is available to each storage pool?
[19:02] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[19:02] * BManojlovic (~steki@fo-d- has joined #ceph
[19:07] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[19:15] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[19:18] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[19:20] <scuttlemonkey> perlstalker: a la 'rados df' or are you talking max storage allocated?
[19:21] <PerlStalker> max
[19:22] <PerlStalker> I've got a couple of pools with distinct osds.
[19:23] <scuttlemonkey> I'm not sure there is an explicit way of doing that since, by default, it expects to have the whole cluster to play with
[19:23] <PerlStalker> I've seen various ways of seeing how much is used per pool but I've found no way of seeing how much is available.
[19:23] <scuttlemonkey> which is shared by all pools, and dynamic
[19:24] <scuttlemonkey> yeah, it's a hard question to answer since each pool can have different replication levels...and ultimately the total cluster space is shared
[19:24] <tnt> I guess you could collect available OSDs from the pg dump for each pool, then add the free space on those OSDs then divide by replication level.
[19:24] <tnt> But I don't know any pre-made command that will do that in 1 operation ...
[19:25] <scuttlemonkey> yeah
[19:25] <scuttlemonkey> and that would only be correct in a napkin sketch kinda way
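tnt's napkin math above can be sketched directly: sum the free space of the OSDs backing the pool, then divide by the pool's replication size. The per-OSD free-space figures below are made-up sample values (in KB), not real cluster output:

```shell
# Napkin estimate of space available to one pool, per tnt's suggestion.
osd_free_kb="524288000 524288000 524288000"   # sample free KB per backing OSD
rep_size=2                                    # pool replication level
echo "$osd_free_kb" | tr ' ' '\n' \
  | awk -v r="$rep_size" '{s+=$1} END {printf "%d\n", s/r}'   # → 786432000
```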
[19:26] <scuttlemonkey> arye: that's a question for nhm, I know he has been poking a bit, and is planning another blog series soon-ish
[19:26] <kyle__> i'm trying to use ceph-deploy to set up a new cluster. after bootstrapping with "ceph-deploy new", installing the packages on all the hosts, and running "ceph-deploy mon create", the next step is to gather the keys, at which point i receive: unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph-mon0']. am i missing a step here?
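For context, the ceph-deploy flow kyle__ is following looks roughly like this (host names taken from the session; exact subcommand syntax may differ by ceph-deploy version):

```shell
ceph-deploy new ceph-mon0 ceph-mon1 ceph-mon2
ceph-deploy install ceph-mon0 ceph-mon1 ceph-mon2 ceph-data0 ceph-data1 ceph-data2
ceph-deploy mon create ceph-mon0 ceph-mon1 ceph-mon2
ceph-deploy gatherkeys ceph-mon0    # the step failing with the missing-keyring error
```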
[19:26] <scuttlemonkey> jtang: love it!
[19:26] <sagewk> paravoid: does http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=681809#34 mean that wheezy will get an updated udev? or that it will be available via the backports site?
[19:28] <scuttlemonkey> kyle__: how long did you wait after mon create? Sometimes it can take a bit before everything finishes and keys are created
[19:29] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[19:31] <kyle__> scuttlemonkey, initially not much time, but it's still not there now. What step creates the "ceph.mon.keyring" and "ceph.client.admin.keyring" on the monitors? because that's not happening for me. I had to manually copy the ceph.mon.keyring to the monitors. I was going to do the same for the ceph.client.admin.keyring but i don't even see it created anywhere, even on my workstation where i'm using the ceph-deploy tool
[19:37] <PerlStalker> Next question. I've added 36 new osds in a new pool but they keep dropping out of the cluster.
[19:38] <scuttlemonkey> kyle__: I think this is where you should be able to see it: https://github.com/ceph/ceph-deploy/blob/master/ceph_deploy/new.py
[19:38] <Tamil> kyle_: on which distro are you trying?
[19:38] <scuttlemonkey> kyle__: fwiw, it should be logging any issues in /var/log/ceph....
[19:39] <kyle__> ubuntu 12.04 with ugraded kernel
[19:40] <Tamil> kyle_: is ceph-create-keys process still running?
[19:41] * tkensiski (~tkensiski@ has joined #ceph
[19:41] * tkensiski (~tkensiski@ has left #ceph
[19:41] * loicd (~loic@host-78-149-113-87.as13285.net) Quit (Quit: Leaving.)
[19:43] <kyle__> Tamil, no it's not. I checked the monitors and my local
[19:44] <Tamil> kyle_: i hope your mon is running?
[19:45] * portante|ltp (~user@ has joined #ceph
[19:45] <Tamil> kyle_: ceph-create-keys is the one that creates the keys
[19:45] <kyle__> Tamil, yes, on my monitors "ps ax | grep ceph" outputs "/usr/bin/ceph-mon --cluster=ceph -i ceph-monX -f"
[19:46] <kyle__> but not seeing the ceph-create-keys in that output
[19:47] <Tamil> kyle_: ls /etc/ceph?
[19:48] <kyle__> on the mons there is only the ceph.conf
[19:48] <kyle__> local there is:
[19:48] <kyle__> ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.conf ceph-deploy ceph.log ceph.mon.keyring
[19:52] <kyle__> i can try purging and starting over, if you think it will help.
[19:52] <Tamil> kyle_: which means ceph-create-keys did create the keys in /var/lib/ceph when you did a 'mon create' and gatherkeys did copy all the keys from mon to local except client.admin
[19:53] <Tamil> kyle_: just a sec
[19:54] * efdev (~chatzilla@ has joined #ceph
[19:54] * efdev is now known as developer
[19:59] <developer> hello
[19:59] <developer> !
[20:01] <Tamil> kyle_: i never hit this issue
[20:01] <Tamil> kyle_: could you please see if there is something on /var/log/ceph/ceph-mon*.log
[20:02] <Tamil> kyle_: i suggest you file a bug with all logs, if you were going to purge and start over
[20:03] <nooky> hello, i have a problem growing my cluster: today i added an osd to a node that already had an osd, did a reweight and added a replica; the crushmap is up to date, but now i have some pgs stuck unclean. i tried playing with the tunables options but it didn't solve the problem
[20:03] * rturk-away is now known as rturk
[20:04] * al (quassel@niel.cx) Quit (Remote host closed the connection)
[20:04] <paravoid> sagewk: he's probably referring to squeeze-backports -- nowadays they're not a separate site but integrated in the debian.org archive, albeit with a separate sources.list entry
[20:04] <paravoid> er, wheezy-backports :)
[20:05] <developer> I have tried install ceph on centos and after yum install ceph i get:
[20:05] <developer> --> Finished Dependency Resolution
[20:05] <developer> Error: Package: librbd1-0.61.2-0.el6.x86_64 (ceph)
[20:05] <developer> Requires: libleveldb.so.1()(64bit)
[20:05] <developer> Error: Package: librbd1-0.61.2-0.el6.x86_64 (ceph)
[20:05] <developer> Requires: libsnappy.so.1()(64bit)
[20:05] <developer> Error: Package: ceph-0.61.2-0.el6.x86_64 (ceph)
[20:05] <developer> Requires: libtcmalloc.so.4()(64bit)
[20:05] <developer> Error: Package: ceph-0.61.2-0.el6.x86_64 (ceph)
[20:05] <developer> Requires: libsnappy.so.1()(64bit)
[20:05] <developer> Error: Package: libcephfs1-0.61.2-0.el6.x86_64 (ceph)
[20:05] <developer> Requires: libleveldb.so.1()(64bit)
[20:05] <developer> Error: Package: ceph-0.61.2-0.el6.x86_64 (ceph)
[20:05] <developer> Requires: libleveldb.so.1()(64bit)
[20:05] <developer> Error: Package: libcephfs1-0.61.2-0.el6.x86_64 (ceph)
[20:06] <developer> Requires: libsnappy.so.1()(64bit)
[20:06] <developer> Error: Package: ceph-0.61.2-0.el6.x86_64 (ceph)
[20:06] <developer> Requires: python-argparse
[20:06] <developer> Error: Package: ceph-0.61.2-0.el6.x86_64 (ceph)
[20:06] <developer> Requires: python-lockfile
[20:06] <developer> Error: Package: librados2-0.61.2-0.el6.x86_64 (ceph)
[20:06] <developer> Requires: libsnappy.so.1()(64bit)
[20:06] <developer> Error: Package: ceph-0.61.2-0.el6.x86_64 (ceph)
[20:06] <developer> Requires: gdisk
[20:06] <developer> Error: Package: librados2-0.61.2-0.el6.x86_64 (ceph)
[20:06] <developer> Requires: libleveldb.so.1()(64bit)
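All of the dependencies developer is missing (leveldb, snappy, gperftools' libtcmalloc, python-argparse, python-lockfile, gdisk) ship in EPEL rather than the base CentOS 6 repositories, so enabling EPEL before installing usually resolves this. The release-RPM URL below is the customary EPEL 6 location at the time; treat it as an assumption:

```shell
# Enable EPEL on EL6, then retry the install
rpm -Uvh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install ceph
```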
[20:06] <paravoid> sagewk: and I just realized we both replied to the bug :)
[20:06] * al (quassel@niel.cx) has joined #ceph
[20:07] <sagewk> paravoid: ok, i think that means we'll go ahead with shipping the duplicated rule. :)
[20:08] <ekarlso> will Ceph do Dedup soon ?
[20:08] <sagewk> ekarlso: someday
[20:08] <sagewk> not soon
[20:08] <ekarlso> sagewk: why not ? :p
[20:09] * developer (~chatzilla@ Quit (Remote host closed the connection)
[20:12] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[20:13] <paravoid> sagewk: so, I had an outage again this weekend...
[20:13] <paravoid> combination of a) failed disk, b) possibly a controller issue, c) the slow peering issue
[20:14] <paravoid> (b) was that 7 neighbouring OSDs to the failed disk also started to be marked as down
[20:14] * yehuda_hm (~yehuda@2602:306:330b:1410:6d5b:cd48:2d40:7a01) Quit (Read error: Connection timed out)
[20:15] <paravoid> min reporters is 14, and I saw the "14 reports from 14 peers" for three of them, but the other four were marked as down without that
[20:15] <paravoid> added that to #4967
[20:15] <paravoid> I think it's just mon losing some of the messages(?)
[20:17] <paravoid> (c) is the fact that eight osds failing results in an several minutes long outage
[20:19] * yehuda_hm (~yehuda@2602:306:330b:1410:6d5b:cd48:2d40:7a01) has joined #ceph
[20:22] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[20:22] * dmick (~dmick@2607:f298:a:607:48ba:6269:d2b1:a0d0) has joined #ceph
[20:30] <kyle__> Tamil, i purged and started over and was able to properly create the keys. Everything looks good so far. Thanks for your help!
[20:31] <kyle__> Should "ceph-deploy osd create..." bring the osds up and in? Log shows they were activated but I'm seeing: osdmap e4: 3 osds: 0 up, 0 in
[20:32] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[20:36] <Tamil> kyle_: ceph health?
[20:37] * alram (~alram@cpe-75-83-127-87.socal.res.rr.com) has joined #ceph
[20:38] <Tamil> kyle_: yes, osd create will bring the osds up and in
[20:40] * ghartz (~ghartz@33.ip-5-135-148.eu) has joined #ceph
[20:42] <kyle__> yeah looks like the osds are having an issue starting the daemon
[20:42] <kyle__> service ceph start
[20:42] <kyle__> === osd.0 ===
[20:42] <kyle__> 2013-05-20 11:33:15.978820 7f8f65ad3780 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
[20:42] <kyle__> 2013-05-20 11:33:15.978847 7f8f65ad3780 -1 ceph_tool_common_init failed.
[20:42] <kyle__> Starting Ceph osd.0 on ceph-data0...
[20:42] <kyle__> 2013-05-20 11:33:16.008772 7fa2a9e39780 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-0: (2) No such file or directory
[20:43] <kyle__> failed: 'ulimit -n 8192; /usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf '
[20:43] <Tamil> kyle_: which ceph branch have you installed?
[20:43] <kyle__> i didn't use any flags so shoud be default which is stable i believe
[20:44] <Tamil> kyle_: are the disks mounted?
[20:45] <Tamil> kyle_: df -h?
[20:45] <kyle__> oh crap that must be it.
[20:45] <kyle__> i assumed that would happen with the ceph-deploy stuff but i guess that's silly
[20:45] <Tamil> kyle_: but that will all be taken care of by 'osd create'
[20:45] <kyle__> hmm thought so
[20:45] <kyle__> but yeah they were not mounted
[20:46] <Tamil> kyle_: which means, osd create command did not succeed
[20:47] <Tamil> kyle_: if you dont mind zapping the disk, try osd create with --zap-disk
[20:47] <kyle__> hmm the log stuff seemed to indicate it did. maybe it just didn't finish
[20:47] <kyle__> well i cannot zap because it's a raid10 so only one disk with the OS partitioned seperately
[20:48] <kyle__> i had to supply /dev/sda6 when i ran osd create
[20:48] * mtanski (~mtanski@ has joined #ceph
[20:48] <kyle__> not the actual /dev/sda device
[20:48] <Tamil> kyle_: whats the size of your cluster?
[20:48] <kyle__> 3 osds with about 800GB each
[20:48] <Tamil> kyle_: how many nodes?
[20:49] <kyle__> 3 mons 2meta 3osd
[20:49] <kyle__> meta not set up yet though
[20:49] <kyle__> well not deployed i should say
[20:49] <Tamil> kyle_: thats ok, how many nodes are you using?
[20:49] <kyle__> each has their own physical server
[20:50] <kyle__> so eight daemons with eight servers
[20:50] <Tamil> kyle_: oh ok
[20:51] <kyle__> had a bunch of dell 2950s i ended up using for this
[20:51] <Tamil> kyle_: kool
[20:51] <Tamil> kyle_: it would be nice if you could paste your osd create command output
[20:53] <kyle__> okay. command was this: ceph-deploy osd create ceph-data0:/dev/sda6:/dev/sda6
[20:53] <kyle__> 2013-05-20 11:12:54,643 ceph_deploy.osd DEBUG Preparing cluster ceph disks ceph-data0:/dev/sda6:/dev/sda6
[20:53] <kyle__> 2013-05-20 11:12:55,273 ceph_deploy.osd DEBUG Deploying osd to ceph-data0
[20:53] <kyle__> 2013-05-20 11:12:55,345 ceph_deploy.osd DEBUG Host ceph-data0 is now ready for osd use.
[20:53] <kyle__> 2013-05-20 11:12:55,345 ceph_deploy.osd DEBUG Preparing host ceph-data0 disk /dev/sda6 journal /dev/sda6 activate True
[20:53] <kyle__> 2013-05-20 11:14:13,187 ceph_deploy.osd DEBUG Preparing cluster ceph disks ceph-data1:/dev/sda6:/dev/sda6
[20:53] <kyle__> 2013-05-20 11:14:13,736 ceph_deploy.osd DEBUG Deploying osd to ceph-data1
[20:53] <kyle__> 2013-05-20 11:14:13,801 ceph_deploy.osd DEBUG Host ceph-data1 is now ready for osd use.
[20:53] <kyle__> 2013-05-20 11:14:13,801 ceph_deploy.osd DEBUG Preparing host ceph-data1 disk /dev/sda6 journal /dev/sda6 activate True
[20:53] <kyle__> 2013-05-20 11:14:19,554 ceph_deploy.osd DEBUG Preparing cluster ceph disks ceph-data2:/dev/sda6:/dev/sda6
[20:54] <kyle__> 2013-05-20 11:14:20,087 ceph_deploy.osd DEBUG Deploying osd to ceph-data2
[20:54] <kyle__> 2013-05-20 11:14:20,215 ceph_deploy.osd DEBUG Host ceph-data2 is now ready for osd use.
[20:54] <kyle__> 2013-05-20 11:14:20,215 ceph_deploy.osd DEBUG Preparing host ceph-data2 disk /dev/sda6 journal /dev/sda6 activate True
[20:54] <Tamil> kyle_: ceph-deploy disk list ceph-data0?
[20:54] <kyle__> /dev/sda :
[20:54] <kyle__> /dev/sda1 other, xfs, mounted on /
[20:54] <kyle__> /dev/sda2 other
[20:54] <kyle__> /dev/sda3 other, Linux filesystem
[20:54] <kyle__> /dev/sda5 swap, swap
[20:54] <kyle__> /dev/sda6 other, xfs
[20:54] <kyle__> /dev/sr0 other, unknown
[20:56] <kyle__> mount only shows the /dev/sda1 mounted
[20:56] <Tamil> kyle_: looks like osd create did not prepare or activate your disk
[20:56] <Tamil> kyle_: it would be shown here if it was successful
[20:57] <kyle__> hmm okay. should i try to run those seperately instead of "create"
[21:03] * andrei (~andrei@host86-155-31-94.range86-155.btcentralplus.com) has joined #ceph
[21:06] <kyle__> looks like it tries to mount during prepare:
[21:06] <kyle__> May 20 12:01:23 ceph-data0 kernel: [332617.198783] XFS (sda6): Mounting Filesystem
[21:06] <kyle__> May 20 12:01:23 ceph-data0 kernel: [332617.254646] XFS (sda6): Ending clean mount
[21:06] <kyle__> that's from the osd syslog
[21:06] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:11] <Tamil> kyle_: yeah maybe try that
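Splitting `osd create` into its two phases, as discussed, would look roughly like this (host and partition names from the session above; the exact argument form `activate` accepts is an assumption):

```shell
ceph-deploy osd prepare ceph-data0:/dev/sda6:/dev/sda6
ceph-deploy osd activate ceph-data0:/dev/sda6
```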
[21:12] <kyle__> yeah it never seems to finish the prepare
[21:13] <kyle__> ceph_deploy.osd DEBUG Preparing host ceph-data1 disk /dev/sda6 journal /dev/sda6 activate False
[21:13] <kyle__> never gets past this
[21:14] <Tamil> kyle_: if prepare is successful, that specific disk/partition will be tagged in "ceph-deploy disk list" command
[21:15] <kyle__> okay. is it typical for prepare to take some time?
[21:15] <Tamil> kyle_: not long though
[21:15] <Tamil> kyle_: how long have you been waiting on the disk prepare?
[21:15] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[21:19] <kyle__> 18 minutes so far
[21:21] <kyle__> just to clarify, this is indicating that it is still preparing correct?
[21:21] <kyle__> 2013-05-20 12:18:05,115 ceph_deploy.osd DEBUG Preparing cluster ceph disks ceph-data2:/dev/sda6:/dev/sda6
[21:21] <kyle__> 2013-05-20 12:18:05,650 ceph_deploy.osd DEBUG Deploying osd to ceph-data2
[21:21] <kyle__> 2013-05-20 12:18:05,740 ceph_deploy.osd DEBUG Host ceph-data2 is now ready for osd use.
[21:21] <kyle__> 2013-05-20 12:18:05,740 ceph_deploy.osd DEBUG Preparing host ceph-data2 disk /dev/sda6 journal /dev/sda6 activate False
[21:22] <kyle__> a tad confused because of the first line there
[21:22] <kyle__> oh i see now preparing cluster then preparing host
[21:23] <jtang> scuttlemonkey: you still want some of the ansible stuff for ceph?
[21:23] <jtang> not sure how much use it is for others -- https://github.com/jcftang/ansible-ceph
[21:24] <jtang> its good enough for using for development work against the radosgw
[21:24] <jtang> its far from complete for production usage
[21:24] <scuttlemonkey> jtang: awesome
[21:24] <scuttlemonkey> yeah, I'd love to get an ansible writeup akin to the juju one I did
[21:24] <scuttlemonkey> if you are willing
[21:25] <jtang> its mostly geared towards a single ubuntu machine to get radosgw going - we use vagrant + ansible in work for the project im working on
[21:25] <scuttlemonkey> nice
[21:25] <scuttlemonkey> yeah, writing that up and saying "if you want to do more, or expand to N machines you could do X" or something
[21:25] <scuttlemonkey> would be awesome
[21:25] <jtang> yea i was planning on doing that at somepoint
[21:26] <jtang> tbh ceph-deploy does mostly what i was planning with the ceph_facts plugin for ansible and some "scripting"
[21:26] <scuttlemonkey> hehe
[21:26] <jtang> the sooner the rest api for the ceph management interface is out and stable, then thats a different kettle of fish
[21:26] <scuttlemonkey> yeah, ceph-deploy has come a long way very quickly
[21:26] <scuttlemonkey> for certain
[21:27] <kyle__> could my issue have something to do with one of my mons continuously running "/usr/bin/python /usr/sbin/ceph-create-keys -i 0"
[21:27] <scuttlemonkey> so yeah, if you wanna write up something that is similar to: http://ceph.com/dev-notes/deploying-ceph-with-juju/ or Loic's http://ceph.com/howto/deploying-ceph-with-ceph-deploy/ that would be awesome
[21:27] <scuttlemonkey> send to community@inktank.com
[21:27] <jtang> i dont think it would take too much work to get an ansible module to configure/build a ceph cluster
[21:27] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[21:27] <scuttlemonkey> and I'll push it to the blog
[21:27] <jtang> its just time
[21:27] <scuttlemonkey> yeah
[21:28] <scuttlemonkey> now, if you could build me an ansible module to get more time in my day...that I'd pay real money for :)
[21:28] * diegows (~diegows@ has joined #ceph
[21:28] <jtang> heh
[21:29] <jtang> well like i said once the rest api is out it will be easy to do so ;)
[21:30] <scuttlemonkey> haha
[21:30] <jtang> you could cheat and wrap ceph-deploy up in a set of ansible tasks
[21:30] <jtang> then replace it with a bunch of calls to the rest api later on
[21:30] <scuttlemonkey> I suppose
[21:30] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[21:30] <jtang> but then its a bit of a waste of time to abstract it away
[21:31] <scuttlemonkey> yeah
[21:31] <scuttlemonkey> at that point the only gain is for people using exclusively ansible
[21:31] <jtang> btw ceph-deploy is kinda borked on scientificlinux
[21:31] <jtang> cause it checks for centos or rhel
[21:31] <jtang> i've been meaning to file a bug report or send a patch
[21:32] <jtang> its just a case of adding in a bunch of checks for the redhat family of distros rather than just a centos or rhel check
[21:32] <jtang> and it'd be nice to avoid depending on lsb to get that information
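[Editor's note] The family-of-distros check jtang sketches could look something like the following. This is a hypothetical illustration, not ceph-deploy's actual code; names and the release strings are assumptions:

```python
# Hypothetical sketch of the check jtang describes: treat any member of the
# Red Hat family (CentOS, RHEL, Scientific Linux, Fedora, ...) the same,
# without shelling out to lsb_release. Illustrative only.
REDHAT_FAMILY = {"rhel", "centos", "scientific", "fedora", "red hat"}

def is_redhat_family(release_line):
    """Return True if an /etc/redhat-release style string names a
    Red Hat derived distro."""
    lowered = release_line.lower()
    return any(name in lowered for name in REDHAT_FAMILY)

print(is_redhat_family("Scientific Linux release 6.4 (Carbon)"))  # True
print(is_redhat_family("CentOS release 6.3 (Final)"))             # True
print(is_redhat_family("Ubuntu 12.04 LTS"))                       # False
```

Matching on substrings of the release string avoids the lsb dependency jtang complains about below.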
[21:32] <scuttlemonkey> yeah, makes sense
[21:32] <scuttlemonkey> nod
[21:32] <scuttlemonkey> lsb can be a bit imprecise
[21:33] <jtang> on rhel lsb pulls in far too much
[21:33] <jtang> it installs qt-x11 which descends into a big mess
[21:33] <jtang> it took me a few hours to figure that one out last week when i realised that ceph-deploy needs redhat lsb to run
[21:33] <scuttlemonkey> hehe
[21:33] <scuttlemonkey> ew
[21:35] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:35] <jtang> btw will radosgw configuration be added to the ceph-deploy tool?
[21:35] <jtang> or is it planned?
[21:35] <scuttlemonkey> jtang: that's the plan eventually
[21:35] <jtang> *cool*
[21:36] <scuttlemonkey> the thinking (as I understand it) is that ceph-deploy will be the default work that we do for everything...and a good basis for all the orchestration tools like chef, puppet, juju, etc
[21:36] <jtang> btw the radosgw init scripts and few other helper scripts are boned in rhel based distros
[21:36] <jtang> they dont work :P
[21:36] <scuttlemonkey> hah
[21:36] * jtang shakes his head
[21:37] <scuttlemonkey> well, I'm also hoping RGW gets to be a little bit more friendly/extensible in Dumpling
[21:37] <scuttlemonkey> assuming Wido finishes his work
[21:37] <jtang> i ended up creating an ubuntu vm to run the gateway -- hence the ansible playbook for the radosgw
[21:37] <scuttlemonkey> not forcing apache/FCGI will be ++ imo
[21:38] <jtang> heh, what, nginx and lighttpd support?
[21:38] <jtang> i dont understand why people get so hung up on apache, its really good and not all that bad
[21:38] <jtang> i guess more options are good
[21:38] <scuttlemonkey> well, nothing specific
[21:38] <scuttlemonkey> http://wiki.ceph.com/01Planning/02Blueprints/Dumpling/RADOS_Gateway_refactor_into_library%2C_internal_APIs
[21:38] <jtang> btw i caught up on a few of them ceph summit videos
[21:38] <scuttlemonkey> just giving people options
[21:39] <jtang> the erasure coding one looks good, i wish i had more time to poke
[21:39] * jtang is looking forward to the erasure coding work
[21:40] <jtang> and the disaster recovery bits
[21:40] * dcasier (~dcasier@ Quit (Ping timeout: 480 seconds)
[21:40] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[21:40] <scuttlemonkey> yeah, the summit had a ton of great stuff
[21:41] <scuttlemonkey> I'm really stoked to see how all of it shapes up
[21:41] * Tamil (~tamil@ Quit (Quit: Leaving.)
[21:42] <jtang> between the erasure coding and async replication i think we will be a happy user as soon as i can hire a person to work on finishing of an evaluation and deploy of distributed/parallel filesystems
[21:42] <jtang> so far based on the roadmap of ceph it looks pretty likely we'll go with it
[21:42] <scuttlemonkey> nice
[21:42] <jtang> btw we're a gpfs house in work
[21:42] <andrei> hello guys
[21:43] <andrei> i was wondering if it's safe to use ceph with btrfs for production?
[21:43] <jtang> andrei: probably not
[21:43] <sjust> andrei: not without extensive testing
[21:43] <jtang> i tried it out a few months ago with bad consequences
[21:43] <andrei> so, I should use xfs instead, right?
[21:43] <jtang> then again i was running it on rhel6 which isnt exactly up to date
[21:43] <sjust> correct
[21:44] <andrei> okay, thanks
[21:44] <andrei> by the way, has anyone tested zfs with ceph?
[21:44] <jtang> wido had i think
[21:44] <sjust> there have been preliminary efforts in that direction
[21:44] <andrei> okay, but i guess it's not production level at all
[21:44] <sjust> it is not
[21:45] <andrei> thanks guys
[21:45] <andrei> i am currently in the process of setting up a small PoC with Ceph
[21:45] <scuttlemonkey> andrei: there were a couple of polish devs who had ceph in production w/ zfs
[21:45] <scuttlemonkey> but yeah, for the most part there hasn't been enough rigor around it
[21:45] <andrei> yeah, i understand
[21:45] <andrei> thanks
[21:45] <scuttlemonkey> there was also a doc request: http://tracker.ceph.com/issues/3409
[21:46] <scuttlemonkey> np
[21:46] <jtang> you could try running ext4/xfs on top of a zvol
[21:46] <jtang> not sure how well that works
[21:46] <jtang> i guess that ticket explains it
[21:47] <jtang> http://www.inktank.com/career/europe-sales-representative/
[21:48] <jtang> heh
[21:48] <jtang> now if only professional services or devops/engineering were to come to europe that would be nice
[21:48] * jtang eyes the careers section
[21:49] <andrei> i am setting up a new poc cluster with ceph and two servers with different number of disks
[21:49] * Vjarjadian (~IceChat77@ has joined #ceph
[21:49] <andrei> i was looking at docs and I don't really know if I should use mkcephfs or ceph-deploy?
[21:49] <jtang> apparently ceph-deploy is the way to go
[21:50] * Tamil (~tamil@ has joined #ceph
[21:50] <dspano> andrei: From what I read ceph-deploy is replacing mkcephfs
[21:50] <Tamil> kyle_: thats odd
[21:51] <jtang> having ceph-deploy accept a directory on a host for an osd would be nice (it just occured to me) right now it just accepts a block device
[21:51] <jtang> which isn't too convenient
[21:52] <dmick> jtang: it does
[21:52] <jtang> does it?
[21:53] <jtang> the docs on the site didnt seem to indicate it too clearly
[21:53] <dmick> Use The Source, Luke :)
[21:53] <dmick> yes, it does
[21:53] <jtang> ah excellent
[21:53] <dmick> DISK can be a whole-disk partition, a partition, or a directory
[21:53] * jtang has a bunch of machines with only one disk
[21:54] * loicd (~loic@ has joined #ceph
[21:55] <andrei> nice
[21:56] <andrei> ))
[21:56] <andrei> do you recommend connecting each individual physical hard disk as a single osd, or a whole soft raid partition as an osd?
[21:58] <nhm> andrei: usually it's best to have 1 disk per OSD.
[21:58] <jtang> i guess i miss things since i dont work on ceph much in work
[21:58] <dmick> it's not very obvious
[21:59] <dmick> difficult to come up with a short word that means "disk, partition, or dir"
[21:59] <jtang> nhm: is there still a requirement for syncfs in libc ?
[21:59] <nhm> jtang: so long as it's in the kernel, it should get picked up with newer versions of Ceph
[21:59] <jtang> hrm...
[21:59] <jtang> i guess that means i need to check the change logs in the elrepo kernel repo
[22:00] <jtang> or the kernel change logs for when syncfs got put into the kernel?
[22:00] <jtang> nhm: thats good news for people who want to have 45 disks in a machine
[22:00] <dmick> it's been in there for a long time
[22:00] <nhm> jtang: was some time back around 2.6.38 afaik
[22:01] <jtang> nhm: the el6 kernels are .32
[22:01] <jtang> :)
[22:01] <jtang> 2.6.32
[22:01] <nhm> jtang: rhel may have backported it
[22:01] <jtang> yea i need to check either way
[22:03] <jtang> hrm, it might be in and newer
[22:04] <andrei> thanks
[22:05] <andrei> also, how do I add a journal ssd disk to ceph?
[22:07] <nhm> andrei: you can point the journal at a partition. In the "osd.X" section of your ceph.conf file, something like "osd journal = /dev/sdb1" or something.
[22:08] <andrei> nhm: could i have a single ssd disk journaling multiple osds?
[22:08] <jtang> dmick: i was following this documentation when i was trying out ceph-deploy -- http://ceph.com/docs/master/rados/deployment/ceph-deploy-osd/
[22:09] <amb> If I want to remove an OSD, the first thing to do is to mark it out (i.e. ceph osd out). How can I *programatically* tell when that OSD is safe to take down & remove?
[22:09] <jtang> thats where i meant that it wasnt obvious to me that a directory could be specified
[22:09] <nhm> andrei: yes, that's what I do. Currently I'm using 3 journals per SSD. The trade off though is that if you overload the SSDs your sequential write throughput suffers, and the more OSD journals you put on 1 SSD, the more OSDs you lose in the event of an SSD failure.
[22:10] <andrei> ah, i see
[22:10] <andrei> so, if the ssd fails would I also lose the osd?
[22:11] <andrei> i thought that ssd would be more like a cache for journaling
[22:11] <andrei> so it would use it as a temp storage
[22:11] <andrei> and flush to spinning disks once in a while
[22:11] <nhm> andrei: ceph does direct IO writes to the journals to try and ensure that data hits the disk before it sends acks to a client that writes were performed. It then lazily copies that data over to the OSD.
[22:12] * imjustmatthew (~imjustmat@c-24-127-107-51.hsd1.va.comcast.net) Quit (Remote host closed the connection)
[22:12] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[22:13] <andrei> so, if the ssd fails the osd would still contain data, right
[22:13] <nhm> andrei: A benefit is that because writes to the journal are just appended, there is no seek overhead to find unused blocks, or dentry/metadata lookups. It can happen very fast even for random writes. The writes from the journal to the OSD storage on the other hand do have to do all of those things. In this way, it does sort of give you some performance benefits.
[22:13] <andrei> and would i be able to continue if the ssd fails?
[22:13] <andrei> let's assume that i have 8 disk storage server
[22:13] <andrei> and i want to use 1 ssd for the journal
[22:14] <andrei> if that ssd fails, would I lose all 8 osds?
[22:14] <andrei> and thus the server?
[22:15] <andrei> nhm: i think by default ceph uses 1gb journal size
[22:15] <nhm> andrei: afaik it still depends on whether or not you are using btrfs or xfs/ext4. With btrfs, I think you can just recover any missing data from replicas so recovery is faster. With XFS/Ext4 I think the OSD has to basically be recreated from replicas.
[22:16] <nhm> andrei: you'd definitely lose it in the temporary sense.
[22:16] <andrei> okay, got it
[22:16] <andrei> thanks
[22:16] <andrei> so, it is better to have at least 2 ssds per server
[22:16] <andrei> of 8 disks
[22:17] <andrei> coming back to the journal size - if there is a 1gb of journal per osd by default, would you not waste space with larger ssds if you only use it for like 3 drives?
[22:17] <nhm> andrei: yes. Depending on what your needs are, it may be safer to stick with just spinning disks and putting the journal on the same disk as the data. You can get some benefit in this scenario by using controllers with WB cache.
[22:18] <andrei> or can you actually increase the journal size to fill your ssd?
[22:18] <andrei> would it make sense to do that?
[22:18] <nhm> andrei: the question with SSDs is more about write cycles than about storage space. The more cells there are, the more the wear levelling can spread the writes out and hopefully make the drives last longer.
[22:19] <andrei> makes sense
[22:19] <andrei> so, there is no practical reason to have more than 1gb of journal per osd?
[22:19] <nhm> you can make the journal as large as you wish, and you can change how often data gets flushed from the journal, but there are tradeoffs. Do you want lots of small flushes or few really large ones? Likely the best scenario is somewhere inbetween.
[22:20] <andrei> i mean if you give it 10gb, would it not increase the speed on busy servers?
[22:20] <nhm> andrei: What's ultimately important is to make sure that you can absorb enough data to keep data flowing constantly to the spinning disks.
[22:22] <andrei> nhm: okay, so if your ssd can do 500mb/s write and your spinners can only do 100, you should have 1 ssd per 5 disks or so
[22:22] <andrei> to make sure there is enough speed
[22:22] <nhm> 1GB is reasonable given the speed of spinning disks, but you could do more. If you do significantly less, (so 100MB) you can only absorb 1s of writes and that could cause the OSD to have to wait on new requests while data is being flushed.
[22:22] <nhm> s/so/say
[22:22] <jtang> is it possible to adjust the journal size?
[22:23] <jtang> or tweak the rate at it which flushes? or does it flush when its full?
[22:23] <nhm> andrei: something like that, though at some point network will becoming the limiting factor.
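[Editor's note] The arithmetic in the exchange above works out as follows; a back-of-the-envelope sketch using the numbers mentioned in the conversation (500 MB/s SSD, 100 MB/s spinners, 100 MB vs 1 GB journals), not a sizing rule:

```python
# Illustrative numbers from the discussion above
ssd_write_mb_s = 500      # sequential write speed of one journal SSD
spinner_write_mb_s = 100  # sustained write speed of one data disk

# How many spinners one SSD can feed before the SSD becomes the bottleneck
spinners_per_ssd = ssd_write_mb_s // spinner_write_mb_s
print(spinners_per_ssd)  # 5

# How many seconds of incoming writes a journal can absorb before the
# OSD has to wait on the flush to the data disk
for journal_mb in (100, 1024):
    print("%d MB journal absorbs ~%.1f s of writes"
          % (journal_mb, journal_mb / spinner_write_mb_s))
```

As nhm notes, at some point the network, not the SSD:spinner ratio, becomes the limiting factor.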
[22:23] <PerlStalker> Important safety tip: Make sure your firewall is allowing the OSDs to talk to each other. :-)
[22:23] <nhm> jtang: yes, you can tweak all of that
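[Editor's note] The knobs nhm refers to live in ceph.conf; a minimal sketch with illustrative values (the option names are the journal-size and filestore sync-interval settings of this era, the numbers are assumptions, not recommendations):

```ini
[osd]
    ; journal size in MB (roughly 1 GB by default at this time)
    osd journal size = 2048
    ; bounds on how often the filestore syncs journal data to the data disk
    filestore min sync interval = 0.01
    filestore max sync interval = 5
```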
[22:23] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[22:23] <nhm> PerlStalker: Yes! I've been bitten by that. :)
[22:24] <jtang> heh
[22:24] <andrei> nhm: ah, i've got infiniband, i should be able to do 1gb/s on ipoib ))
[22:24] <jtang> heh ipoib!
[22:25] <jtang> you could probably push 4-5gb/s over it these days
[22:25] <nhm> andrei: yeah, I've gotten about 2GB/s with QDR IPoIB.
[22:25] <nhm> andrei: we're looking into rsockets a bit.
[22:25] <andrei> nhm: really?
[22:25] <PerlStalker> nhm: I just brought up 3 new nodes with 12 OSDs each but my default firewall rule set only opened up ports for the first 3.
[22:25] <andrei> i was not getting much over 1gb/s
[22:25] <nhm> PerlStalker: I bet they were marking each other out all the time.
[22:25] <nhm> :)
[22:25] <andrei> what drivers did you use?
[22:25] <PerlStalker> nhm: Yup.
[22:25] <andrei> any tips would be great!
[22:26] <nhm> andrei: this was on a testbed at ORNL. Not sure off the top of my head what drivers they were using, but they posted on the mailing list a while back.
[22:26] <jtang> probably ofed
[22:26] <andrei> i thought that ipoib is around 10gbit/s limit
[22:26] <andrei> not sure where i've got this from
[22:27] <andrei> i've tried so many different ofed versions
[22:27] <nhm> andrei: I think the big thing they mentioned was interrupt affinity tuning.
[22:27] <andrei> still wasn't getting more than 10gbit/s
[22:27] <andrei> thanks, i will look into this
[22:28] <jtang> you could try running ceph with sdp over infiniband
[22:28] <jtang> you might get lucky and its better than ipoib
[22:28] <nhm> jtang: Isn't sdp deprecated?
[22:29] <nhm> http://comments.gmane.org/gmane.network.openfabrics.enterprise/5371
[22:29] <jtang> nhm: i havent been running newer ofed stacks in a while
[22:29] <andrei> nhm: in terms of configuration of ssd, should I make one partition and assign journal as file on that partition or should i create multiple partitions, one for each osd's journal file?
[22:29] <jtang> so for me it isnt deprecated :)
[22:30] <nhm> andrei: theoretically it's better to do multiple partitions, one for each journal. In reality I don't know how much it actually matters.
[22:30] <jtang> *sigh* still why did they drop sdp from ofed
[22:31] <nhm> Though it may help to only partition as much space as you need on the SSD. I think some SSDs do better wear levelling if you leave a bunch of unpartitioned space.
[22:31] <andrei> so, when configuring, do I just point journal to like /dev/sda1 for multiple osds or should I mount /dev/sda1 and point journals to the mounted path?
[22:32] <nhm> shouldn't need to mount anything. Just partition the spaces, and then tell each OSD which (unique) partition it should use.
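[Editor's note] The layout nhm describes — one unique, unmounted SSD partition per OSD journal — would look something like this in ceph.conf. Device names and OSD ids are hypothetical:

```ini
; /dev/sda is the journal SSD in this sketch; each OSD gets its own partition
[osd.0]
    osd journal = /dev/sda1
[osd.1]
    osd journal = /dev/sda2
[osd.2]
    osd journal = /dev/sda3
```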
[22:32] <andrei> nhm: sorry to ask too many questions, i am new to ceph
[22:32] <nhm> andrei: no worries, that's what we are here for. :)
[22:33] <andrei> now, the fun part, let's say i've configured journaling with ssd
[22:33] <andrei> for 4 osds
[22:33] <andrei> and the ssd fails
[22:33] <andrei> how do i recover?
[22:33] <andrei> do i simply point journal to those ssds to another ssd?
[22:33] <andrei> or is there another way?
[22:35] <nhm> The OSD should get marked out, and ceph will (after a configurable amount of time) re-replicate all of the data that was on those OSDs to other OSDs in the cluster. Eventually you'll replace the SSD, make the OSDs available again, and data will get re-replicated back.
[22:35] * coyo (~unf@ has joined #ceph
[22:35] <tnt> nhm: I thought that if you lost the journal, the OSD were as good as lost.
[22:36] <nhm> tnt: I think it may not be totally dire with BTRFS, but yes for XFS/EXT4.
[22:37] <nhm> tnt: so basically the "replicated back" step in that case would be like adding new OSDs back into the cluster.
[22:37] <nhm> afaik
[22:37] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[22:38] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[22:38] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:38] * ghartz|2 (~ghartz@33.ip-5-135-148.eu) has joined #ceph
[22:38] * ghartz (~ghartz@33.ip-5-135-148.eu) Quit (Remote host closed the connection)
[22:40] * PerlStalker (~PerlStalk@ Quit (Ping timeout: 480 seconds)
[22:43] * PerlStalker (~PerlStalk@ has joined #ceph
[22:43] * ghartz|2 (~ghartz@33.ip-5-135-148.eu) Quit (Remote host closed the connection)
[22:44] <jamespage> sagewk, hey - I enabled the arm builds for google-perftools in Ubuntu saucy - seems to work OK
[22:45] <sagewk> awesome! glowell, ^
[22:45] <nhm> jamespage: excellent news
[22:45] <glowell> Good News. Thanks.
[22:45] <sagewk> we are currently testing our arm stuff on quantal. presumably backporting the packaging change is trivial?
[22:45] <tnt> saucy ... seriously, they named it saucy ...
[22:46] <jamespage> hrm - lemme check
[22:46] <jtang> huh? ceph on arm?
[22:47] <nhm> jtang: :)
[22:47] <jamespage> sagewk, yeah - should be - its the same major version
[22:47] <jtang> armv6 i hope
[22:47] <jamespage> jtang, indeed
[22:47] * jtang rubs his hands
[22:47] <jtang> i think i need to order me a few more raspberry pi's
[22:48] <jamespage> jtang, actually no - its v7+
[22:48] * jtang cries
[22:48] <jtang> does ceph really need hf ?
[22:48] <jtang> well ubuntu doesnt support armv6
[22:48] <jtang> debian does
[22:48] <jamespage> jtang, it might not but Ubuntu does not have anything other than hf now
[22:49] <andrei> is there a howto from zero to ceph using ceph-deploy?
[22:49] <andrei> i think the 5 minutes quick start guide is an old one and not designed for ceph-deploy
[22:49] <andrei> right?
[22:50] <jtang> jamespage: are you cross compiling or compiling natively on an arm device?
[22:50] <jamespage> jtang, native compile
[22:50] <tnt> on what board ?
[22:51] <jamespage> jtang, well I tested on a pandaboard
[22:51] <jamespage> tnt, ^^
[22:51] <nhm> andrei: there's this: http://ceph.com/howto/deploying-ceph-with-ceph-deploy/
[22:51] <jamespage> the official builders for Ubuntu use those as well
[22:51] <nhm> andrei: I confess I'm still using the old mkcephfs stuff. ;)
[22:52] <jtang> ah okay
[22:52] <andrei> ))
[22:52] <andrei> the old and trusted way
[22:52] <jtang> with ceph working on an arm board, its just begging to be commercialised into a product where one can buy an "osd packaged up"
[22:53] <jamespage> sagewk, why quantal?
[22:53] <nhm> jtang: seems like a great idea
[22:53] <jamespage> jtang, the obvious restriction being that it only has a 100Mbps network and usb < 3 for storage
[22:53] <jtang> jamespage: yea i suppose
[22:53] <jamespage> i.e. performance really does suck
[22:54] <Vjarjadian> 4tb HDD is painful with gigabit... without it would be unusable
[22:54] <jtang> i can see a scenario for the preservation/archiving crowd
[22:54] <andrei> does anyone know if ceph-deploy can work with CentOS?
[22:54] <nhm> not all arm gear is limited to those options though.
[22:54] <jamespage> nhm, indeed not
[22:54] <jtang> where you can plonk stuff onto a "cluster of osd's" then pull them out for cold storage
[22:54] <jtang> using the planned async replication feature that is planned
[22:54] <scuttlemonkey> andrei: I reposted a decent blog entry from Loic the other day on ceph-deploy
[22:54] <scuttlemonkey> http://ceph.com/howto/deploying-ceph-with-ceph-deploy/
[22:55] <scuttlemonkey> dunno if that's helpful or not
[22:55] <mikedawson> sagewk: Have you had a chance to look at the transaction dump I uploaded? If so, was this one more useful than the last?
[22:55] <nhm> scuttlemonkey: jinx! ;)
[22:55] <jtang> jamespage: you can get arm boards with gigabit ;)
[22:55] <sagewk> mikedawson: looking at it now. so big! :(
[22:55] <scuttlemonkey> nhm: hah!
[22:56] <mikedawson> sagewk: yeah, I grabbed it as soon as I noticed it growing...
[22:56] <scuttlemonkey> missed yer link :P
[22:56] <jks> jtang, small arm board with gigabit and sata interface... make that PoE and stick it on a small enough PCB to fit in one of those generic boxes for a single drive (used for USB/firewire external drives)... that would be a killer!
[22:56] <jtang> jks: thats what i was thinking
[22:56] <jamespage> jks: agreed
[22:56] <jtang> you could probably build a POC with a raspberry pi or a pandaboard
[22:57] <jtang> the only missing component is avahi integration
[22:57] <jks> does Renesas produce a bridge perhaps?
[22:57] * nhm giggles
[22:57] <jtang> so an osd can auto announce its presence to a ceph cluster
[22:57] <jtang> im kinda surprised no one has considered that yet
[22:57] <jks> jtang, avahi would be relatively easy to put in I guess
[22:57] <jtang> given how much funky stuff ceph has been churning out
[22:58] <jtang> a well defined OSD with avahi would be interesting
[22:58] <jks> I have done zeroconf (avahi is a zeroconf implementation) on embedded hardware before...
[22:58] <jks> makes everything very easy for the user, and the amount of code needed to support it can be relatively small
[22:58] <jtang> just plug an play, and have somewhat a sensible set of heuristics for adjusting the crush map
[22:59] <jtang> you could hit the low end market quite easily with the tech
[22:59] <andrei> scuttlemonkey: cheers
[22:59] <andrei> i will go through it now
[23:00] <jtang> jks: yea, i reckon so
[23:00] <jks> hmm, think I just found the ideal hardware
[23:00] <jks> jtang, take a look at this: http://www.newit.co.uk/shop/proddetail.php?prod=TonidoPlug2
[23:00] <jks> basically a 2.5" sata enclosure with a 800 mhz cpu and gigabit ethernet
[23:01] <jtang> jks: nice
[23:01] <jtang> it sounds like a sheeva/guru plug
[23:01] <jks> I haven't tried that ARM CPU before, so I don't know how fast it is though (It's the Marvell Armada310)
[23:01] <jks> yep, exactly like a sheeva/guru plug... just a bit different ;-)
[23:02] <jks> the price is right as well I think
[23:02] <jtang> i've a pair of pogo plugs here
[23:02] <jtang> if ceph builds and runs on armv6 there is no shortage of cheap hardware
[23:02] <jks> about 113$ for that TonidoPlug2... and you can get discounts for buying bulk
[23:02] <jks> the tonido is armv7 ;-)
[23:02] <nhm> jtang: just keep in mind you'll be super CPU limited.
[23:02] <jtang> you'd have to roll up a distro/release for it to be plug and play
[23:03] <jtang> nhm: yea i know :P
[23:03] <jks> jtang, that would be quite easy
[23:03] <nhm> jtang: The A9s and A15s are a better bet right now.
[23:03] <jtang> but the capability of having self healing storage on cheap lower power consuming hardware at home/lab sounds appealing
[23:04] <jks> kontron makes a small board with a 900 Mhz quad-core ARM A9 with gigabit ethernet and SATA
[23:04] <jks> it's mITX size, so it's not comparable to the plugs ofcourse
[23:04] <jtang> this is beginning to sound like it could be an idea
[23:05] <jks> actually the new processors from AMD would be ideal for this
[23:05] <jks> if someone would just put out a board for them that fit this ;-)
[23:05] <jtang> i've been mulling over building small "bricks" for a storage system for preservation of data with erasure coding for a while now
[23:05] <jtang> this idea doesnt quite map back to that
[23:05] <jtang> but it still sounds interesting to try
[23:05] <jks> amd has just released a low power chip without graphics, but with the sata controller and RAM controllers on chip
[23:06] <andrei> guys, if i have two servers with 9x 3tb disks on one and 8x1.5tb on another one with replication ration of 2, I will end up with a 8x1.5TB file system, right?
[23:06] <jks> quad-core 1.6 ghz with great performance (i.e. significantly better than Intel Atom)
[23:06] <nhm> andrei: unless you are willing to allow replication inside the same host.
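[Editor's note] The arithmetic behind nhm's point: with replication across hosts (one replica per server), the smaller server caps the usable capacity. A sketch using andrei's disk counts:

```python
# Capacity estimate for andrei's two-host PoC (sizes in TB, illustrative)
host_a = 9 * 3.0   # nine 3 TB disks  -> 27 TB raw
host_b = 8 * 1.5   # eight 1.5 TB disks -> 12 TB raw

# With replication factor 2 and one replica per host, every object needs
# space on *both* hosts, so usable capacity is bounded by the smaller one.
usable = min(host_a, host_b)
print(usable)  # 12.0
```

So roughly 12 TB usable, matching andrei's "8x1.5TB" expectation.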
[23:06] <andrei> i would like to make sure that my data is replicated across the two servers
[23:06] <jtang> jks, speaking of this idea of OSD's that are packaged up as a 'brick' checkout lockss
[23:06] <andrei> nope, i do not want this at all
[23:07] * jtang ponders
[23:07] <andrei> my aim is to have a storage server which could be brought down for maintenance and not to shutdown my vm infrastructure
[23:07] <nhm> andrei: Ceph can run on heterogeneous hardware, but it's better if it's homogeneous.
[23:07] <andrei> nhm: yeah, I know
[23:08] <andrei> i just can't buy anything at the moment
[23:08] <andrei> this is a PoC with limited budget
[23:08] <jks> jtang, hmm... looking at the Pogoplug Series 4
[23:08] <jks> jtang, 99$, gigabit ethernet, SATA, small pleasantly looking device with an 800 Mhz ARM CPU
[23:08] <jtang> jks: the pogoplug classics are ~20usd a pop
[23:08] <jks> yes, but way slower and no SATA port
[23:09] <jtang> but dog slow as nhm points out
[23:10] <nhm> jks: this thing? http://www.adorama.com/COCPOGOV4A3.html?gclid=CNGHpP3LpbcCFYFhMgodOUUAWg
[23:10] <jks> exactly
[23:10] <jtang> jks: it would be odd to get a board thats 100usd and two 3tb disks that are 200usd in cost to make an osd
[23:10] <jks> that's a way better price on that website :-)
[23:10] <jks> nhm, it has this cpu:
[23:10] <jks> nhm: http://www.marvell.com/embedded-processors/kirkwood/assets/88F6192-003_ver1.pdf
[23:10] <jks> so still relatively slow
[23:11] <andrei> can I add a journal disk to an osd at a later date?
[23:12] <andrei> or does it have to be done at creation?
[23:12] <jks> andrei, yes
[23:12] <jtang> still ~1000 usd for a bunch of arm boards and some hdd's for some self healing storage is appealing
[23:12] <andrei> yes to done at creation?
[23:12] <nhm> jks: yeah, you wouldn't probably be able to saturate even a single disk with it.
[23:12] <jtang> even if it just offers 3-5tb of usable storage, it would meet the needs of most small businesses/labs and home use
[23:13] <jks> andrei, you can add later
[23:13] <andrei> cool
[23:13] <jks> nhm: but still... given the costs, you could have plenty more disks
[23:13] <nhm> jks: true
[23:13] <nhm> recovery may be very painful.
[23:13] <jks> it would be possible to create a sub 100$ osd ;-)
[23:14] <jtang> nhm: yea it could, if you only had 3 osd's it would be
[23:14] <jtang> but if its possible to have 100usd osd's, i'd get 10 of them
[23:15] <jtang> so a single failure is less painful
[23:15] <jks> bundled with a cheap PoE switch... it would be a killer
[23:15] <jtang> it'd kill off companies like panasas and isilon :P
[23:15] <nhm> how about a meshnetwork of wireless routers with hard drives? ;)
[23:15] <jtang> and give equalogic a run for its money
[23:15] <jks> nhm, haha :)
[23:16] <jtang> nhm: throw in the babel router for a self forming network as well?
[23:16] <jks> or pigeon carriers with usb thumb drives!
[23:17] <nhm> jtang: add new storage by plugging another wireless router into the wall? ;)
[23:17] <jtang> nhm: heh
[23:19] * arye (~arye@nat-oitwireless-outside-vapornet3-a-80.Princeton.EDU) Quit (Quit: arye)
[23:20] * eschnou (~eschnou@203.39-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[23:20] <jtang> still, this idea probably wont be trivial until the rest api's come along
[23:21] <jtang> i wouldnt see myself even begin to attempt to test the idea out without the rest api
[23:24] <jtang> jamespage: are you using the stock packaging files for ubuntu from the git repo of ceph to build?
[23:24] <jtang> and did you make any changes to ceph or the packaging?
[23:25] <jtang> if no changes have been made, i might try building it from the git repo on my pi
[23:26] <jtang> assuming ceph builds on wheezy
[23:27] <jtang> i'd have to get a new sd card though, i only have 4gb sd cards for my pi
[23:28] * senner1 (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) has joined #ceph
[23:28] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) Quit (Ping timeout: 480 seconds)
[23:29] <andrei> i am having an issue with ceph-deploy on centos 6.3
[23:29] <andrei> has anyone made it work?
[23:29] <andrei> i have a python Traceback
[23:31] <andrei> [remote] sudo: sorry, you must have a tty to run sudo
[23:31] <jtang> andrei: yea
[23:31] <nhm> andrei: sounds like your remote user can't use sudo without a tty, you can probably configure that.
[23:31] <jtang> you need to disable the require tty option
[23:32] <jtang> visudo to edit the config file
[23:32] <jtang> and comment out the require tty line
[23:33] <jtang> defaults requiretty if i remember right
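[Editor's note] The sudoers change jtang describes amounts to commenting out one line via visudo (sketch; the exact wording of the line varies by distro):

```
# In /etc/sudoers (always edit with visudo), disable the tty requirement
# so ceph-deploy can run sudo over a non-interactive ssh session:
#
#   Defaults    requiretty      <- comment out this line
```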
[23:34] <sagewk> jamespage: still there? is this the right approach here? https://github.com/ceph/ceph/commit/bc4805bac98df5f45f7ae3e49599001cc4b94bf0
[23:36] <andrei> thanks, fixed it
[23:36] <andrei> you are right
[23:36] <andrei> require tty
[23:37] <jtang> i dont think thats documented in the documentation pages on ceph.com
[23:39] * senner1 (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) Quit (Ping timeout: 480 seconds)
[23:41] * themgt (~themgt@24-177-232-33.dhcp.gnvl.sc.charter.com) has joined #ceph
[23:44] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:51] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) has joined #ceph
[23:52] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[23:52] <tnt> Damn, got bit by that growing mon thing again. Well at least the OSDs are not leaking memory any more
[23:54] <andrei> I am having some issues with starting a cluster with ceph-deploy
[23:54] <andrei> i am trying ceph-deploy gatherkeys
[23:54] <andrei> and it can't find the keys
[23:54] <andrei> Unable to find /etc/ceph/ceph.client.admin.keyring on ['arh-ibstorage1-ib']
[23:54] <andrei> any idea what's wrong?
[23:55] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Read error: Connection reset by peer)
[23:55] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[23:55] <andrei> i am following the guide: http://ceph.com/howto/deploying-ceph-with-ceph-deploy/
[23:56] <nhm> andrei: hrm, not really my area of expertise. dmick might be able to help.
[23:56] <andrei> dmick: are you around?
[23:56] <andrei> i am having a bit of an issue with ceph-deploy
[23:56] <andrei> gatherkeys is not working
[23:57] * portante|ltp (~user@ Quit (Quit: heading out)
[23:57] <andrei> it found the mon keys
[23:57] <andrei> but can't find any other keys
[23:57] <dmick> mons running?
[23:58] <andrei> yeah
[23:58] <andrei> /usr/bin/ceph-mon --cluster=ceph -i arh-ibstorage2 -f
[23:58] <dmick> do you have a ceph-create-keys job stuck on that box?
[23:58] <andrei> yeah
[23:59] <andrei> that's right
[23:59] <dmick> it needs to complete; something's stopping it
[23:59] <andrei> /usr/bin/ceph-mon --cluster=ceph -i arh-ibstorage1 -f
[23:59] <andrei> you mean this one?
[23:59] <andrei> i've got it running on 2 ubuntu servers and 1 centos

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.