#ceph IRC Log


IRC Log for 2013-02-12

Timestamps are in GMT/BST.

[0:00] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[0:02] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[0:07] * BMDan (~BMDan@ Quit (Quit: Leaving.)
[0:08] * BillK (~BillK@58-7-243-105.dyn.iinet.net.au) Quit (Quit: Leaving)
[0:08] * scalability-junk (~stp@188-193-201-35-dynip.superkabel.de) Quit (Quit: Leaving)
[0:15] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Remote host closed the connection)
[0:19] * ScOut3R (~scout3r@5400CAE0.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[0:20] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[0:21] * miroslav1 (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[0:21] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Read error: Connection reset by peer)
[0:26] * nick5 (~nick@ has joined #ceph
[0:27] <mauilion> so another cephx problem.
[0:27] <mauilion> I am trying to configure /etc/cinder/cinder.conf
[0:28] <mauilion> and I have configured it to both to use libvirt
[0:28] <mauilion> and I have added the ceph.client.volumes.keyring in /etc/ceph/
[0:28] <mauilion> chowned it to cinder:cinder
[0:29] <mauilion> all I see are auth errors.
[0:29] <mauilion> I am using this page for reference
[0:29] <mauilion> http://ceph.com/docs/master/rbd/rbd-openstack/
[0:30] <joshd> is cinder-volume running with CEPH_ARGS="--id volumes" set?
[0:30] <mauilion> it's not starting
[0:30] <mauilion> but I have added that to the init file
[0:31] <joshd> can cinder access /etc/ceph/ceph.conf as well?
[0:31] <mauilion> yes
[0:32] <mauilion> sudo -u cinder cat /etc/ceph/ceph.conf
[0:32] <mauilion> works
[0:32] <joshd> and 'sudo -u cinder rados lspools' works?
[0:32] <joshd> err, with --id volumes in there too
[0:33] <mauilion> root@jayz:/etc/ceph# sudo -u cinder rados lspools --id volumes
[0:33] <mauilion> 2013-02-11 15:32:01.163401 7f06c50ae780 0 librados: client.volumes authentication error (95) Operation not supported
[0:33] <mauilion> couldn't connect to cluster! error -95
[0:33] <mauilion> :(
[0:33] <joshd> do you have different versions of ceph, one of which is defaulting to cephx and the other not?
[0:34] <mauilion> shouldn't it pick up the cephx thing from etc/ceph/ceph.conf?
[0:34] <mauilion> let me check
[0:34] <joshd> yeah, if it's set there
[0:34] <mauilion> it's set there
[0:34] <mauilion> ah
[0:35] <mauilion> the ceph-common package is 0.48.2
[0:35] <mauilion> the server has 0.56.2
[0:36] <mauilion> 1 sec
[0:36] <mauilion> let me get them on the same page.
[0:37] <joshd> that would do it. I'd double check librbd1 and librados2 as well
[0:37] <joshd> plus python-ceph if you're using glance
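The version mismatch joshd is pointing at (0.48.2 client packages talking to a 0.56.2 cluster, with different cephx defaults between the releases) can be checked by listing the client-side package versions. A minimal sketch for Debian/Ubuntu; the package list is joshd's, and `python-ceph` only matters if glance uses rbd:

```shell
# List installed versions of the client-side ceph packages joshd mentions.
# Debian/Ubuntu package names; adjust for other distros.
versions=""
for pkg in ceph ceph-common librbd1 librados2 python-ceph; do
    v=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) \
        && versions="$versions$pkg $v\n" \
        || versions="$versions$pkg not-installed\n"
done
printf '%b' "$versions"
# Compare against what the cluster daemons report (needs a cluster):
#   ceph --version
```

Once every package reports the same release as the cluster nodes, the authentication error above should go away, as it did here after `apt-get upgrade`.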
[0:37] <mauilion> looks like apt-get upgrade got them
[0:37] * loicd (~loic@magenta.dachary.org) Quit (Read error: Connection reset by peer)
[0:37] * loicd (~loic@2a01:e35:2eba:db10:142c:984e:a283:b04c) has joined #ceph
[0:38] <mauilion> profit!
[0:38] <mauilion> thanks man
[0:38] <joshd> np
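Pulling the thread together, three pieces have to line up on the cinder node: the `client.volumes` keyring readable by cinder, `CEPH_ARGS="--id volumes"` in the cinder-volume environment, and the rbd driver settings in cinder.conf. A hedged sketch, written to a scratch directory rather than /etc; the option names follow the rbd-openstack guide linked above, and the secret UUID is a placeholder:

```shell
# Illustrative only: write the fragments to /tmp instead of /etc.
mkdir -p /tmp/cinder-example

# Environment for the cinder-volume service so librados authenticates
# as client.volumes (mauilion added this to the init script).
cat > /tmp/cinder-example/cinder-volume.env <<'EOF'
CEPH_ARGS="--id volumes"
EOF

# rbd driver settings for /etc/cinder/cinder.conf (names from the
# rbd-openstack guide of this era; rbd_secret_uuid is a placeholder).
cat > /tmp/cinder-example/cinder.conf.snippet <<'EOF'
volume_driver=cinder.volume.driver.RBDDriver
rbd_pool=volumes
rbd_user=volumes
rbd_secret_uuid=00000000-0000-0000-0000-000000000000
EOF

# The keyring itself lives at /etc/ceph/ceph.client.volumes.keyring,
# owned by cinder:cinder, as described above.
```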
[0:39] <paravoid> yehudasa: ping?
[0:40] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[0:40] <yehudasa> paravoid: pong
[0:41] <paravoid> hey
[0:41] <paravoid> so, radosgw is very noisy with the default log levels
[0:41] <paravoid> it keeps getting my disks full even with aggressive logrotate :)
[0:42] <yehudasa> paravoid: debug rgw = 1 might help
[0:43] <paravoid> I'll do that, but maybe you could consider increasing the debug level of some of those messages?
[0:43] <paravoid> it's too verbose by default I think
[0:44] <paravoid> I can open a bug report for you if you prefer
[0:46] <paravoid> it's writing logs at around 2MB/s for me :)
[0:46] <Kdecherf> not bad :)
[0:46] <yehudasa> paravoid: the reason the log level is high is that it's easier to debug users' issues this way. You can always turn it off.
[0:47] <paravoid> even with rgw = 1 it prints two lines per request
[0:47] <yehudasa> paravoid: you can do debug rgw = 0 if you want it completely off
[0:47] <yehudasa> or you can set log file = /dev/null
[0:48] <paravoid> I can sustain two lines per request for now
[0:48] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[0:48] <paravoid> under regular circumstances I'm going to have ~500 reqs/s, so it's not too bad
[0:48] <paravoid> per box that is
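The settings yehudasa mentions belong in the radosgw client section of ceph.conf (the section name below is the conventional `client.radosgw.gateway`; adjust it to your instance name). Written to a scratch file for illustration:

```shell
# ceph.conf fragment with the logging knobs from the discussion above.
cat > /tmp/rgw-logging.conf <<'EOF'
[client.radosgw.gateway]
    ; the default rgw log level is verbose by design (easier to debug
    ; user issues); dial it down per yehudasa's suggestion:
    debug rgw = 1
    ; or silence the gateway log entirely:
    ; debug rgw = 0
    ; or discard the whole log stream:
    ; log file = /dev/null
EOF
grep 'debug rgw' /tmp/rgw-logging.conf
```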
[0:55] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[0:57] <nick5> hi. does anyone know if there are good docs for setting up radosgw on centos6.3 (with mod_fcgi?)
[0:58] <nick5> i've worked my way through the wiki and ubuntu parts and had to hack together pieces, and now am trying to figure out the last little bugs.
[1:08] * noob2 (~noob2@ext.cscinfo.com) has joined #ceph
[1:16] * loicd (~loic@2a01:e35:2eba:db10:142c:984e:a283:b04c) Quit (Quit: Leaving.)
[1:16] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[1:16] * loicd (~loic@magenta.dachary.org) has joined #ceph
[1:17] <noob2> is there a way to change all the OSD's weights at the same time without ceph reordering data?
[1:17] * loicd (~loic@magenta.dachary.org) Quit ()
[1:17] <noob2> i want to change all my osd's from 1 -> 3
[1:18] <noob2> when i reweight any osd's it seems to kick off a huge backfill process
[1:26] <gregaf> noob2: you can dump the crush map to a local file, edit that, and then inject it to reweight them all at once
[1:27] <noob2> would that stop the backfilling?
[1:27] <noob2> i'm guessing it would
[1:28] * jlogan1 (~Thunderbi@2600:c00:3010:1:c1d0:fd9:21b5:8830) Quit (Ping timeout: 480 seconds)
[1:28] <gregaf> well, it would put all the data back where it was
[1:28] <gregaf> whether that stops the backfill depends on whether any of it has been permanently moved already or not
[1:30] <gregaf> or if it's a small number you can just change them all on the command line real quick; it's backfilling because when you change one at a time, the way PGs map to OSDs is changing
[1:30] <noob2> i gotcha
[1:30] <gregaf> but if you set them all to 3 then the final mapping will be the same as the original one, and it won't take too long for everything to get happy again
[1:30] <noob2> awesome :)
[1:30] <noob2> thanks
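gregaf's dump/edit/inject workflow as a command sequence. It only makes sense against a live cluster, so the sketch wraps it in a function and runs it only where one is configured; the filenames and the 1 → 3 weight edit are illustrative:

```shell
crush_reweight_all() {
    # 1. grab the current crush map (binary) and decompile it to text
    ceph osd getcrushmap -o /tmp/crush.bin
    crushtool -d /tmp/crush.bin -o /tmp/crush.txt
    # 2. edit the text map, e.g. changing every osd weight 1.000 -> 3.000
    sed -i 's/weight 1\.000/weight 3.000/' /tmp/crush.txt
    # 3. recompile and inject, so every weight changes in one map epoch
    #    and the final PG->OSD mapping matches the original
    crushtool -c /tmp/crush.txt -o /tmp/crush.new
    ceph osd setcrushmap -i /tmp/crush.new
}
# run only where a cluster is actually configured
if [ -e /etc/ceph/ceph.conf ] && command -v crushtool >/dev/null 2>&1; then
    crush_reweight_all
else
    echo "no local cluster; crush_reweight_all shown for reference"
fi
```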
[1:31] * jlogan1 (~Thunderbi@ has joined #ceph
[1:31] <gregaf> np
[1:32] <noob2> looks like that called for a new monitor election.
[1:33] <noob2> i think that did it
[1:33] <noob2> yeah it's back happy again
[1:33] <noob2> i love this freaking software man :D
[1:42] * LeaChim (~LeaChim@b0faa140.bb.sky.com) Quit (Ping timeout: 480 seconds)
[2:02] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[2:17] * junglebells (~bloat@CPE-72-135-215-158.wi.res.rr.com) Quit (Read error: Connection reset by peer)
[2:18] * jjgalvez (~jjgalvez@ has left #ceph
[2:23] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[2:27] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[2:29] <ShaunR> does ceph come with any good tools for monitoring osd performance? (seeing throughput and IOPS)
[2:29] <ShaunR> i guess i could just use the standard methods
[2:31] <dmick> there are a lot of stats kept ShaunR, and I know there are monitoring-framework plugins that consume them
[2:33] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) has joined #ceph
[2:37] <ShaunR> dmick: were can i see all the stats available?
[2:37] <themgt> I have a 4 OSD, 3 mon, 3 mds cluster w/ 0.48 (spread across 4 physical nodes). just upgraded to 0.52, osd/mon upgrade went smooth, ran mds upgrade on first node and it won't start back up. logs say:
[2:37] <themgt> http://pastie.org/6121921
[2:38] <themgt> "not writeable with daemon features" seems to be the key… not sure if I should upgrade the other 2 mds or not?
[2:38] <dmick> ShaunR: ceph --admin-daemon <admin-socket-path> perf dump, perhaps with | python -mjson.tool
[2:38] <dmick> <admin-socket-path>: there's one per daemon, by default at /var/run/ceph/$cluster-$name.asok
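dmick's two hints combined into one runnable check (osd.0 and the default cluster name `ceph` are assumptions; substitute your daemon's name):

```shell
# Per-daemon counters come out of the admin socket; the default path is
# /var/run/ceph/$cluster-$name.asok as dmick says.
sock=/var/run/ceph/ceph-osd.0.asok
if [ -S "$sock" ]; then
    # pretty-print the counters with python's stdlib json tool
    out=$(ceph --admin-daemon "$sock" perf dump | python -mjson.tool)
else
    out="no admin socket at $sock (no daemon running locally)"
fi
echo "$out"
```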
[2:46] <gregaf> themgt: why did you upgrade to 0.52 instead of 0.56.2?
[2:46] <gregaf> I think .52 might have suffered from an issue with the way the CompatSet comparisons worked that was masked in earlier code and got fixed later on
[2:46] <gregaf> or actually the fix got backported to argonaut if you had one of the newer point releases
[2:47] <themgt> gregaf: sorry, should be 0.56.2 - I just switched to http://ceph.com/debian-bobtail/ and apt-get update/apt-get install ceph
[2:48] <Kdecherf> gregaf: I'm resolving ceph with debug symbols on another machine
[2:52] <gregaf> Kdecherf: sweet! I'm about to head out for the evening, but there's usually one or two people around later if you need help with gdb :)
[2:52] <gregaf> the segfault doesn't have an obvious source so figuring out where it's coming from would be key in identifying the source of that crash
[2:53] <Kdecherf> "Reading symbols from /usr/bin/ceph-mds...(no debugging symbols found)" Hm, I failed somewhere.
[2:53] <Kdecherf> gregaf: np, I'm going to sleep (it's 3AM here) anyway
[2:54] <gregaf> *blink* I encourage that plan
[2:55] <gregaf> Kdecherf: if you haven't already, "apt-get install ceph-debug-mds" (or ceph-mds-dbg or something) is probably the issue
[2:55] <Kdecherf> I'm on an exotic distribution ;-)
[2:56] <gregaf> themgt: hrm, I wouldn't expect that error to come out then — which version of argonaut were you on previously?
[2:56] <gregaf> Kdecherf: well, adapt accordingly ;)
[2:56] <themgt> gregaf: probably the first one
[2:56] <gregaf> the important takeaway was that the debug symbols are a separate package
[2:56] <themgt> I can try to upgrade to the latest argonaut then bobtail?
[2:57] <gregaf> no, I don't think that should make a difference
[2:58] <gregaf> umm, the MDSes aren't going to write anything to disk given that, so downgrading them would be easy
[2:58] <gregaf> you could try upgrading them all and downgrading if it continues to not work, but I can't diagnose it any more than that tonight, sorry!
[3:01] <Kdecherf> gregaf: I will check that tomorrow :)
[3:01] <themgt> gregaf: ok, thx. gonna back stuff up a bit then give that a shot
[3:04] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[3:09] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:17] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[3:22] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[3:29] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:39] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[3:43] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[3:44] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[3:46] * jlogan1 (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[3:48] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[3:51] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:57] * rturk is now known as rturk-away
[3:58] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[4:01] <infernix> common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7f95411a9700 time 2013-02-11 21:54:47.780875
[4:01] <infernix> common/Mutex.cc: 94: FAILED assert(r == 0)
[4:01] <infernix> http://pastebin.ca/2313016
[4:01] <infernix> this happens when I try to write 2048MB to an rbd device with python
[4:02] <infernix> fine with 2047MB
[4:02] <infernix> i have reproducible code
[4:02] <infernix> how do I figure out what's wrong here?
[4:03] <sjustlaptop> that usually indicates that the thread is locking a mutex it already has locked
[4:03] <sjustlaptop> if you follow the call chain, you should find the first locker
[4:04] <infernix> but why does it only happen when i'm writing objects of 2048MB or larger?
[4:05] <infernix> code: http://pastebin.ca/2313018
[4:05] <infernix> it's a rbd benchmarker
[4:06] <joshd1> how much ram do you have?
[4:06] <infernix> 32GB
[4:07] <infernix> it looks like http://tracker.ceph.com/issues/3836 but that claims to be fixed in bobtail
[4:08] <joshd1> it's a generic assert which is used by lots of unrelated code, it's different from that issue
[4:09] <infernix> well, basically I am planning to retool this benchmark into a multiprocessing one
[4:10] <infernix> where i read large (25GB to 4TB) block devices and write them with 1-4 threads to an rbd volume
[4:10] <infernix> *rbd pool
[4:11] <infernix> i don't think it's a real big issue because I have to split that up in 256MB threads anyway
[4:11] <infernix> but i'm still curious as to why this fails
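The chunked-write plan infernix describes is easy to demonstrate with plain files: copy a source in fixed-size pieces rather than one giant write. The demo uses 1 MiB chunks and temp files so it runs anywhere; against rbd you would aim the writes at the mapped image instead, and 256 MiB is infernix's planned chunk size, not a requirement:

```shell
# Demo: copy a source "device" to a destination in fixed-size chunks,
# mirroring the chunked-write approach discussed above (on temp files).
CHUNK=$((1024 * 1024))              # 1 MiB chunks for the demo
src=$(mktemp); dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=$CHUNK count=3 2>/dev/null
size=$(wc -c < "$src")
off=0
while [ "$off" -lt "$size" ]; do
    # copy one chunk at the current offset without truncating the target
    dd if="$src" of="$dst" bs=$CHUNK count=1 \
       skip=$((off / CHUNK)) seek=$((off / CHUNK)) conv=notrunc 2>/dev/null
    off=$((off + CHUNK))
done
cmp -s "$src" "$dst" && echo "chunked copy matches source"
```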
[4:11] <ShaunR> when using rbd does anybody know if i should still be using virtio for the interface (if=) on the -drive?
[4:12] <infernix> ShaunR: you would want to if that works, ye
[4:13] <infernix> it'll give you less overhead in the guest
[4:15] <ShaunR> So far i have not seen ceph-mds really doing much, it is needed when doing rbd right? i'm not using cephfs
[4:18] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[4:20] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[4:29] <infernix> mds is only used for cephfs
[4:31] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[4:32] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:33] * ircolle (~ircolle@c-67-172-132-164.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[4:33] * The_Bishop (~bishop@e179013183.adsl.alicedsl.de) has joined #ceph
[4:41] <iggy> ShaunR: virtio is the preferred disk interface
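For a libvirt-managed guest, iggy's advice comes down to `bus='virtio'` on the disk target. A hedged sketch of an rbd disk element, written to a scratch file; the pool/image name and monitor host are placeholders:

```shell
# libvirt domain XML fragment for an rbd-backed virtio disk.
cat > /tmp/rbd-disk.xml <<'EOF'
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='rbd/myimage'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <!-- virtio gives the least overhead in the guest, per the discussion -->
  <target dev='vda' bus='virtio'/>
</disk>
EOF
grep "bus='virtio'" /tmp/rbd-disk.xml
```

With bare qemu the equivalent is roughly `-drive file=rbd:rbd/myimage,if=virtio`.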
[4:44] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[4:49] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:50] * themgt (~themgt@97-95-235-55.dhcp.sffl.va.charter.com) has joined #ceph
[4:53] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[4:58] * jlogan1 (~Thunderbi@2600:c00:3010:1:f10b:fe00:c3e7:1d31) has joined #ceph
[4:59] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[5:13] * Cube1 (~Cube@ Quit (Ping timeout: 480 seconds)
[5:39] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[5:42] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[5:45] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[5:47] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[5:48] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit ()
[5:57] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:33] * miroslav1 (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[6:41] * yehuda_hm (~yehuda@2602:306:330b:a40:1cf6:5f3:81cd:6df6) Quit (Ping timeout: 480 seconds)
[6:50] * yehuda_hm (~yehuda@2602:306:330b:a40:5046:9efc:4382:29bf) has joined #ceph
[7:05] * jlogan1 (~Thunderbi@2600:c00:3010:1:f10b:fe00:c3e7:1d31) Quit (Ping timeout: 480 seconds)
[7:06] * themgt (~themgt@97-95-235-55.dhcp.sffl.va.charter.com) Quit (Quit: themgt)
[7:09] * fghaas (~florian@212095007040.public.telering.at) has joined #ceph
[7:18] * bulent (~bulent@adsl-75-22-70-35.dsl.irvnca.sbcglobal.net) has joined #ceph
[7:21] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has joined #ceph
[7:21] * bulent (~bulent@adsl-75-22-70-35.dsl.irvnca.sbcglobal.net) has left #ceph
[7:32] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[7:36] * fghaas (~florian@212095007040.public.telering.at) Quit (Ping timeout: 480 seconds)
[7:39] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[7:48] * fghaas (~florian@213162068075.public.t-mobile.at) has joined #ceph
[8:00] * fghaas (~florian@213162068075.public.t-mobile.at) Quit (Quit: Leaving.)
[8:02] * fghaas (~florian@213162068075.public.t-mobile.at) has joined #ceph
[8:11] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[8:12] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[8:15] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[8:22] * topro (~topro@host-62-245-142-50.customer.m-online.net) Quit (Ping timeout: 480 seconds)
[8:24] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[8:26] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[8:38] * topro (~topro@host-62-245-142-50.customer.m-online.net) has joined #ceph
[8:40] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[8:47] * sleinen (~Adium@2001:620:0:25:ad5e:1a89:708f:f229) has joined #ceph
[8:50] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:52] * jtang1 (~jtang@ has joined #ceph
[8:57] * jlogan (~Thunderbi@2600:c00:3010:1:f10b:fe00:c3e7:1d31) has joined #ceph
[8:59] * sleinen (~Adium@2001:620:0:25:ad5e:1a89:708f:f229) Quit (Quit: Leaving.)
[8:59] * fghaas (~florian@213162068075.public.t-mobile.at) Quit (Ping timeout: 480 seconds)
[9:02] * sleinen (~Adium@ has joined #ceph
[9:03] * sleinen1 (~Adium@2001:620:0:26:a8e6:f6b0:e222:c994) has joined #ceph
[9:04] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[9:09] * mebbet (987694e2@ircip1.mibbit.com) has joined #ceph
[9:09] * hybrid512 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:10] * mebbet (987694e2@ircip1.mibbit.com) Quit ()
[9:10] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[9:17] * BManojlovic (~steki@ has joined #ceph
[9:21] * leseb (~leseb@mx00.stone-it.com) has joined #ceph
[9:23] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[9:24] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:24] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:27] * ScOut3R (~ScOut3R@ has joined #ceph
[9:29] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:31] * LeaChim (~LeaChim@b0faa140.bb.sky.com) has joined #ceph
[9:40] * fghaas (~florian@212095007003.public.telering.at) has joined #ceph
[9:41] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: Always try to be modest, and be proud about it!)
[9:45] * Morg (b2f95a11@ircip4.mibbit.com) has joined #ceph
[9:48] * Morg (b2f95a11@ircip4.mibbit.com) Quit ()
[9:49] * Morg (b2f95a11@ircip3.mibbit.com) has joined #ceph
[9:50] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[9:55] * fghaas (~florian@212095007003.public.telering.at) Quit (Quit: Leaving.)
[9:58] * low (~low@ has joined #ceph
[10:05] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[10:07] * nz_monkey (~nz_monkey@ Quit (Ping timeout: 480 seconds)
[10:08] * nz_monkey (~nz_monkey@ has joined #ceph
[10:15] * loicd (~loic@lvs-gateway1.teclib.net) has joined #ceph
[10:16] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has left #ceph
[10:23] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[10:24] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) has joined #ceph
[10:26] * sleinen1 (~Adium@2001:620:0:26:a8e6:f6b0:e222:c994) Quit (Quit: Leaving.)
[10:26] * sleinen (~Adium@ has joined #ceph
[10:27] * tziOm (~bjornar@ has joined #ceph
[10:34] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[10:41] * jtang1 (~jtang@2001:770:10:500:c41:7d6:19e8:d8dc) has joined #ceph
[10:45] * sleinen (~Adium@ has joined #ceph
[10:47] * sleinen1 (~Adium@2001:620:0:26:2d77:e0e1:7569:ee09) has joined #ceph
[10:52] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[10:54] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[10:54] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[11:00] * xdeller (~xdeller@broadband-77-37-224-84.nationalcablenetworks.ru) has joined #ceph
[11:00] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[11:04] * jlogan (~Thunderbi@2600:c00:3010:1:f10b:fe00:c3e7:1d31) Quit (Ping timeout: 480 seconds)
[11:24] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) Quit (Ping timeout: 480 seconds)
[11:26] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) has joined #ceph
[11:35] * leseb (~leseb@mx00.stone-it.com) Quit (Remote host closed the connection)
[11:38] * leseb (~leseb@mx00.stone-it.com) has joined #ceph
[11:44] * BillK (~BillK@58-7-243-105.dyn.iinet.net.au) has joined #ceph
[11:58] * houkouonchi-work (~linux@ Quit (Ping timeout: 480 seconds)
[12:03] * houkouonchi-work (~linux@ has joined #ceph
[12:33] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[12:38] <mattch> Just wondering what the progress is on ceph deploy... or is everyone just using chef (or hacking it with mkcephfs) these days?
[12:41] <ninkotech> mattch: some people (who love python) prefer #salt or #fabric over chef :)
[12:42] * fghaas (~florian@91-119-222-199.dynamic.xdsl-line.inode.at) has joined #ceph
[12:45] <mattch> ninkotech: Great - more non-official hacks to get it deployed :-p
[12:55] <ninkotech> i used salt last time and it worked
[12:57] * fghaas (~florian@91-119-222-199.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[12:58] <jtang> ninkotech: im using ansible for my test deployments
[12:59] <ninkotech> interesting. never noticed that one
[12:59] * tziOm (~bjornar@ Quit (Ping timeout: 480 seconds)
[13:00] * hybrid512 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[13:02] <mattch> jtang: and another one... :)
[13:03] <jtang> i was using puppet
[13:03] <jtang> but ansible is just much nicer to get going for the team that im on
[13:08] * tziOm (~bjornar@ has joined #ceph
[13:11] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[13:11] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[13:12] * fghaas (~florian@91-119-222-199.dynamic.xdsl-line.inode.at) has joined #ceph
[13:12] * fghaas (~florian@91-119-222-199.dynamic.xdsl-line.inode.at) has left #ceph
[13:25] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[13:30] <noob2> oh man, i managed to get my cluster into stuck unclean state
[13:30] * fghaas1 (~florian@91-119-222-199.dynamic.xdsl-line.inode.at) has joined #ceph
[13:30] * fghaas1 (~florian@91-119-222-199.dynamic.xdsl-line.inode.at) has left #ceph
[13:37] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[13:41] <noob2> if ceph gets into stuck unclean territory for a long time do i have to revert the lost pgs?
[13:46] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[13:59] * noob2 (~noob2@ext.cscinfo.com) Quit (Quit: Leaving.)
[14:01] * hybrid512 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[14:13] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[14:27] <Teduardo> ninkotech: also crowbar does "kind of" work too
[14:28] <Teduardo> Does anyone know if the kernel modules for block devices can recover from their target node going offline?
[14:32] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[14:43] * sleinen1 (~Adium@2001:620:0:26:2d77:e0e1:7569:ee09) Quit (Quit: Leaving.)
[14:43] * sleinen (~Adium@ has joined #ceph
[14:49] * sleinen1 (~Adium@2001:620:0:25:58e9:30be:4499:9a86) has joined #ceph
[14:51] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[14:52] <ninkotech> i didnt like crowbar
[14:54] * itamar (~itamar@ has joined #ceph
[14:56] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[15:04] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[15:04] <nhorman> ose
[15:10] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[15:10] * itamar (~itamar@ Quit (Quit: Ex-Chat)
[15:14] <ivoks> hi
[15:15] <ivoks> how do i resolve 'pg <pg id> is stuck unclean since forever'?
[15:15] <ivoks> i don't care about data
[15:22] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[15:25] <scuttlemonkey> ivoks: can you pastebin you ceph -s and ceph pg dump_stuck unclean?
[15:25] <scuttlemonkey> your*
[15:27] * aliguori (~anthony@cpe-70-112-157-151.austin.res.rr.com) Quit (Quit: Ex-Chat)
[15:27] <scuttlemonkey> noob2: as far as I know 'revert' is the only option available on mark_unfound_lost...but that may have changed recently
[15:28] <scuttlemonkey> best option is obviously to hunt down the location of the missing pg...but barring that yeah, revert is probably the only option left to you
[15:29] <ivoks> scuttlemonkey: http://pastebin.com/7iRRYXg1
[15:29] <ivoks> scuttlemonkey: thanks
[15:29] <ivoks> http://pastebin.com/raw.php?i=7iRRYXg1
[15:29] <ivoks> raw is probably nicer
[15:36] <scuttlemonkey> lets query one of those pgs 'ceph pg 0.3c query'
[15:37] <ivoks> http://pastebin.com/ZJrKybjW
[15:39] * sleinen1 (~Adium@2001:620:0:25:58e9:30be:4499:9a86) Quit (Read error: No route to host)
[15:39] * sleinen (~Adium@2001:620:0:25:58e9:30be:4499:9a86) has joined #ceph
[15:46] <scuttlemonkey> ivoks: how many osds do you have?
[15:46] <ivoks> scuttlemonkey: 3
[15:47] <ivoks> each on separate machine
[15:47] <scuttlemonkey> k
[15:48] * noob2 (~noob2@ext.cscinfo.com) has joined #ceph
[15:50] <scuttlemonkey> can we grab a 'ceph osd dump' and 'ceph osd tree'
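The diagnostic sequence scuttlemonkey walks through, collected in one place. Each command needs a running cluster, so the sketch defines them as a function and only runs it where one is configured; `0.3c` is the example pg queried above:

```shell
diagnose_stuck_unclean() {
    ceph -s                       # overall cluster state
    ceph pg dump_stuck unclean    # which pgs are stuck, and where
    ceph pg 0.3c query            # per-pg detail (0.3c from the pastebin)
    ceph osd dump                 # pool -> crush ruleset mapping
    ceph osd tree                 # osd/host/rack hierarchy and weights
}
if [ -e /etc/ceph/ceph.conf ]; then
    diagnose_stuck_unclean
else
    echo "no local cluster; diagnose_stuck_unclean shown for reference"
fi
```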
[15:51] <noob2> can anyone help with a ceph stuck unclean? i'm not sure how to repair this
[15:52] <scuttlemonkey> noob2: seems to be going around this morning
[15:52] <noob2> uh oh
[15:52] <noob2> yeah i did a ceph reweight and a few pgs are stuck
[15:53] <scuttlemonkey> was just taking a look at ivoks
[15:53] <noob2> ok
[15:53] <ivoks> scuttlemonkey: http://pastebin.com/raw.php?i=XdB9D11H
[15:55] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[15:55] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[15:56] <slang1> noob2: ceph pg dump?
[15:57] <noob2> ok lemme pastebin
[15:57] <noob2> it's huge
[16:00] <noob2> i'm over the max size for pastebin or fpaste. maybe i could just paste the degraded stuff?
[16:00] <slang1> noob2: that's ok, I didn't see the pastebins from earlier
[16:00] <noob2> http://pastebin.com/rJQrUvwa
[16:01] <scuttlemonkey> slang1: keep in mind this is 2 different folks
[16:01] <scuttlemonkey> ivoks is the one who I have been getting pastebins from, noob2 sounds like a slightly different issue having gotten to unclean after a reweight
[16:01] <slang1> ah yes
[16:01] * slang1 goes for more coffee
[16:01] <ivoks> :)
[16:01] <noob2> hehe
[16:02] <noob2> i have 27 pgs that say they're active+remapped and 18 that are active+degraded
[16:02] <noob2> i'm guessing that means they have 1 replica running around but missing the other 2?
[16:02] <scuttlemonkey> to me ivoks' issue looks strange b/c it only has one osd mapped
[16:03] <scuttlemonkey> trying to remember the incantation for crushtool
[16:03] <noob2> haha
[16:03] <scuttlemonkey> osdmaptool <filename> --export-crush <filename2>
[16:03] <scuttlemonkey> then crushtool -d <filename2>
[16:03] <scuttlemonkey> right?
[16:06] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[16:06] <noob2> yeah that'll decompile it
[16:06] <noob2> ceph osd getcrushmap -o {compiled-crushmap-filename}
[16:06] * aliguori (~anthony@ has joined #ceph
[16:06] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[16:07] <scuttlemonkey> there we go
[16:07] <scuttlemonkey> knew there was an easier way to do it...but couldn't remember
[16:07] <scuttlemonkey> time to order some more of that caffeinated soap from thinkgeek :P
[16:07] <ivoks> haha
[16:09] <noob2> scuttlemonkey: want to see my crushmap to double check it?
[16:09] <noob2> http://fpaste.org/22w3/
[16:10] <noob2> i set each osd to weight of 3 and i want it to choose 1 rack so i get a replica in each rack
[16:11] <ivoks> scuttlemonkey: http://pastebin.com/uuVgjyCq
[16:13] * jtangwk1 (~Adium@2001:770:10:500:2869:8c8d:3b8d:16c6) has joined #ceph
[16:14] * eschnou (~eschnou@ has joined #ceph
[16:18] * gerard_dethier (~Thunderbi@ has joined #ceph
[16:18] * vata (~vata@2607:fad8:4:6:65a5:37a6:d751:49f7) has joined #ceph
[16:21] <scuttlemonkey> ivoks: your crush rulesets don't match your available pools
[16:21] * jtangwk (~Adium@2001:770:10:500:58a8:8af:cd35:badf) Quit (Ping timeout: 480 seconds)
[16:21] <scuttlemonkey> you have 5 pools, but only the first 3 have rulesets...looks like yer missing 'images' and 'cinder'
[16:21] <ivoks> right, i see that too
[16:21] <noob2> yeah i think you should have host instead of osd
[16:22] <noob2> oh..
[16:22] <noob2> i have that also
[16:22] <noob2> i have 11 pools and 4 rulesets
[16:22] <ivoks> scuttlemonkey: still, i'm not sure how to solve that
[16:22] <ivoks> ...pretty new in ceph...
[16:22] <scuttlemonkey> although images and cinder appear to be using data
[16:22] <noob2> i thought you could reuse rulesets for different pools
[16:23] <scuttlemonkey> yeah, that should be fine
[16:23] <ivoks> maybe i could delete those pools and recreate them?
[16:24] <scuttlemonkey> hang on one sec here
[16:24] <ivoks> k
[16:24] * phillipp1 (~phil@p5B3AFA06.dip.t-dialin.net) has joined #ceph
[16:26] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[16:26] * Morg (b2f95a11@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[16:27] * phillipp (~phil@p5B3AF189.dip.t-dialin.net) Quit (Ping timeout: 480 seconds)
[16:29] * themgt (~themgt@97-95-235-55.dhcp.sffl.va.charter.com) has joined #ceph
[16:29] <loicd> joshd joshd1 if you are around, is there a way for me to edit the description of http://tracker.ceph.com/issues/4101 ? I made a typo and I would like to fix it. It shows "iterator" instead of " iterator(list *l, unsigned o, std::list<ptr>::iterator ip, unsigned po)" which makes for a slightly confusing ticket
[16:31] <loicd> or anyone with knowledge of the redmine permissions ;-)
[16:36] <joao> loicd, assuming you don't need special permissions for that, once you go to 'update', you should be able to see a 'Change Properties (More)', with 'More' being a link that should let you edit the original post
[16:37] <ivoks> scuttlemonkey: i'll just scratch it
[16:38] <scuttlemonkey> sry, got interrupted by fedex dood
[16:38] <ivoks> scuttlemonkey: :)
[16:38] <scuttlemonkey> ivoks: looks like yer crushmap is the issue though
[16:38] <scuttlemonkey> slang was just pointing out how you both have similar issue
[16:38] <scuttlemonkey> although subtlely different
[16:39] <scuttlemonkey> the choose lines are most likely the issue
[16:40] <loicd> joao: when I click update I don't see more. I'm using another redmine where I see (more), it's slightly confusing but in this case it just does not show. Does it show for you ?
[16:40] <ivoks> scuttlemonkey: k
[16:40] <ivoks> scuttlemonkey: thank your for your time and wisdom!
[16:40] <loicd> my user is http://tracker.ceph.com/users/789
[16:41] <joao> loicd, I can see it, but I'm sure I have extra permissions, so I'm not sure if that's due to that or what :\
[16:41] <scuttlemonkey> so ivoks:
[16:41] <scuttlemonkey> step take root
[16:41] <scuttlemonkey> step choose firstn 0 type host
[16:41] <scuttlemonkey> step choose firstn 1 type osd
[16:41] <scuttlemonkey> step emit
[16:41] <scuttlemonkey> is probably what you want
[16:41] <scuttlemonkey> http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg00414.html
[16:41] <scuttlemonkey> ^^ sage does such a great job of saying things succinctly
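scuttlemonkey's suggested steps, filled out into a complete rule in decompiled-crushmap form (the rule name, ruleset number, and size bounds are illustrative; only the `step` lines come from the suggestion above):

```shell
# crush rule text as it would appear in a decompiled map
cat > /tmp/host-rule.txt <<'EOF'
rule per-host {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take root
    step choose firstn 0 type host
    step choose firstn 1 type osd
    step emit
}
EOF
grep 'step' /tmp/host-rule.txt
```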
[16:41] <loicd> joao: :-) who should I ask to get the extra permission required to edit the ticket description ?
[16:42] <joao> my guess is sage
[16:42] <joao> maybe scuttlemonkey or rturk-away?
[16:42] <noob2> scuttlemonkey: my crush for choosing a rack is this:
[16:42] <noob2> rule vmware {
[16:42] <noob2> ruleset 3
[16:42] <noob2> type replicated
[16:42] <noob2> min_size 1
[16:42] <noob2> max_size 10
[16:42] <noob2> step take WILM-DC1
[16:42] <noob2> step chooseleaf firstn 0 type rack
[16:42] <noob2> step emit
[16:42] <noob2> }
[16:42] <scuttlemonkey> and boom
[16:42] <noob2> heh
[16:43] <scuttlemonkey> although I don't have yer map in front of me anymore... is WLM-DC1 root?
[16:43] <noob2> yeah
[16:43] * lightspeed (~lightspee@ Quit (Ping timeout: 480 seconds)
[16:43] <loicd> scuttlemonkey: hi, would it be possible to get enough permissions to edit the description of http://tracker.ceph.com/issues/4101 for the user http://tracker.ceph.com/users/789 ?
[16:43] <scuttlemonkey> loicd: was just looking to see if I had rights to do that
[16:43] <loicd> joao: thanks for the suggestions, I'll try ;-)
[16:43] <scuttlemonkey> I don't jockey in redmine much
[16:44] <loicd> scuttlemonkey: thanks ;-)
[16:45] <loicd> I'm embarrassed to ask. It's really difficult to get it right the first time. Hard core ticket creation, no mistakes ;-)
[16:45] <scuttlemonkey> noob2: looks like you still want 'root' vs WLM-DC1
[16:45] <scuttlemonkey> since step commands take the bucket-type
[16:46] <scuttlemonkey> loicd: it appears I do not have sufficient privs, sage is probably yer guy
[16:46] <joao> loicd, need me to change something in the ticket for the time being?
[16:47] <slang1> loicd: are you able to update the ticket, just not the description?
[16:49] <loicd> slang1: I'm able to add to it (I see the Update box).
[16:49] <slang1> loicd: when you click on Update, is there a (More) link next to Change Properties?
[16:49] <loicd> let me screenshot what I see
[16:50] * diegows (~diegows@ has joined #ceph
[16:50] <noob2> for the ceph osd crush set command what does pool={pool-name} mean?
[16:50] <noob2> i usually wouldn't think to set an osd into a pool
[16:51] <loicd> slang1: http://dachary.org/loic/a.png
[16:51] <slang1> huh
[16:52] <slang1> loicd: yeah that's much different than what I have
[16:52] <loicd> joao: I was hoping to quote the <> in the iterator link at the beginning of the description
[16:52] <slang1> loicd: sage or ross should be able to give you the right permissions there
[16:53] * jskinner (~jskinner@ has joined #ceph
[16:53] <loicd> slang1: ok, I'll ask. It's nothing urgent, just a recurring annoyance ;-) Thanks for trying to help me out.
[16:54] * low (~low@ Quit (Quit: bbl)
[16:54] <slang1> loicd: I didn't realize submitters couldn't delete/modify their own tickets
[16:56] <joao> loicd, updated; hope it's what you meant
[16:56] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[16:56] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[17:00] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[17:01] * aliguori (~anthony@ has joined #ceph
[17:02] <loicd> joao: exactly ! thank you :-D
[17:03] * jlogan1 (~Thunderbi@2600:c00:3010:1:f10b:fe00:c3e7:1d31) has joined #ceph
[17:05] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[17:05] * capri (~capri@ Quit (Ping timeout: 480 seconds)
[17:06] * gerard_dethier (~Thunderbi@ Quit (Quit: gerard_dethier)
[17:07] <noob2> scuttlemonkey: should I wait on the ink tankers to come in to look at the stuck state?
[17:08] * jskinner (~jskinner@ has joined #ceph
[17:08] <slang1> noob2: I think pool={pool-name} in that doc is wrong
[17:08] <noob2> ok
[17:08] * BillK (~BillK@58-7-243-105.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:08] <noob2> i thought so also
[17:08] <noob2> osd's don't go into pools per se
[17:08] <noob2> pools just define rules
[17:08] <slang1> noob2: did you upload a new crushmap and pgs are still stuck?
[17:09] <noob2> well what i did was i started reweighting my osd's and then before that was finished uploaded a new crush map
[17:09] <slang1> noob2: a pool is a collection of pgs
[17:09] <slang1> noob2: and a pool has a ruleset that points to the rule to use for choosing osds
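The pool-to-rule pointer slang1 describes is the pool's crush_ruleset field, which can be read and set from the CLI (a sketch; 'vmware' is the pool name from this discussion):

```shell
# which CRUSH ruleset does the pool currently use?
ceph osd pool get vmware crush_ruleset

# point the pool at ruleset 3
ceph osd pool set vmware crush_ruleset 3
```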
[17:09] <noob2> so i think by not letting the reweight complete i borked it
[17:09] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[17:10] <noob2> slang1: i see
[17:11] * jskinner (~jskinner@ has joined #ceph
[17:11] <slang1> noob2: it looked like there were problems with the crushmap you uploaded, did you upload another that fixes those?
[17:11] <noob2> could you check my crushmap?
[17:12] <noob2> here's the current one: http://fpaste.org/gYDE/
[17:12] <noob2> i have 2 hosts in a rack and 3 racks
[17:12] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[17:13] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[17:13] <slang1> noob2: the step take line is wrong
[17:13] <noob2> step take WILM-DC1 ?
[17:13] * slang1 nods
[17:13] * eschnou (~eschnou@ Quit (Remote host closed the connection)
[17:13] <noob2> that's my root though
[17:13] <slang1> noob2: should be: step take root
[17:13] * jtang2 (~jtang@2001:770:10:500:6c5c:ace3:6c57:7c91) has joined #ceph
[17:13] <slang1> noob2: it's the type, not the name
[17:13] * jskinner (~jskinner@ has joined #ceph
[17:13] <noob2> oh..
[17:14] <slang1> noob2: also, you're not actually using the vmware rule at present
[17:14] <noob2> i should be?
[17:14] <noob2> pool 3 'vmware' rep size 3 crush_ruleset 3 object_hash rjenkins pg_num 4200 pgp_num 4200 last_change 17 owner 0
[17:14] <slang1> ah you changed that
[17:15] <slang1> noob2: ok
[17:15] <noob2> yeah some of the pools are using it
[17:15] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[17:15] <noob2> so when i change this to root is it going to cause my cluster to go nuts remapping?
[17:15] <noob2> the root name WILM-DC1 is just for me?
[17:15] <slang1> noob2: it depends on what you had before you uploaded the broken one
[17:16] <janos> names help with osd tree and sanity ;)
[17:16] <noob2> ok so it's just for my sanity
[17:16] <noob2> yeah i had it uploaded just like this for weeks
[17:16] <noob2> somehow it worked?
[17:16] <slang1> noob2: maybe just new objects aren't getting mapped properly
[17:17] <noob2> does the step chooseleaf firstn 0 type rack look right?
[17:17] <noob2> i'd like 1 replica per rack
[17:17] * jtang1 (~jtang@2001:770:10:500:c41:7d6:19e8:d8dc) Quit (Ping timeout: 480 seconds)
[17:17] * jskinner (~jskinner@ has joined #ceph
[17:17] <slang1> noob2: yes
[17:17] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[17:17] <noob2> phew :)
[17:18] <slang1> noob2: you should change the other rules to have that as well
[17:18] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:18] <noob2> yeah lemme change them all
[17:18] <noob2> ok i have this now: http://fpaste.org/KOB7/
[17:19] <noob2> so that should emit a rack, which then chooses one host and one osd ?
[17:19] <janos> i don't consider myself terribly knowledgeable, but is that min_size right?
[17:20] <noob2> that was the default that i just copied
[17:20] * sleinen (~Adium@2001:620:0:25:58e9:30be:4499:9a86) Quit (Quit: Leaving.)
[17:20] * sleinen (~Adium@ has joined #ceph
[17:20] <janos> i'd have to read up on that
[17:20] <janos> i would if i were you ;)
[17:21] <janos> http://ceph.com/docs/master/rados/operations/crush-map/
[17:21] <janos> "If a pool makes fewer replicas than this number, CRUSH will NOT select this rule."
[17:21] * BillK (~BillK@ has joined #ceph
[17:21] <janos> i think you should be safe
[17:21] <slang1> noob2: yeah that looks good to me now
[17:21] <noob2> ok lemme upload it and pray haha
[17:22] <noob2> root@plcephd01:/tmp# crushtool -c decompiled-crush-map -o newcrush
[17:22] <noob2> in rule 'data' item 'root' not defined
[17:22] <noob2> in rule 'metadata' item 'root' not defined
[17:22] <noob2> in rule 'rbd' item 'root' not defined
[17:22] <noob2> in rule 'vmware' item 'root' not defined
[17:22] <noob2> yeah this is what i got before
[17:22] <noob2> WILM-DC1 works but root doesn't
[17:23] <noob2> when i replace it with wilm-dc1 it compiles again
[17:24] <janos> i'm looking at mine just now
[17:24] <janos> it's using name
[17:24] <janos> granted, that name is "default"
[17:24] <janos> but a name nonetheless
[17:24] <noob2> right
[17:24] <noob2> i changed default to a different name
[17:25] <janos> yeah
[17:25] <janos> i have min_size 2, but according to the docs you should be fine
[17:26] <noob2> yeah
[17:27] <slang1> hmm maybe its not the type
[17:27] <slang1> noob2: yeah it looks like the doc is wrong there
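In other words, what ended up compiling for noob2 is `step take <bucket-name>`. The edit round-trip with crushtool looks roughly like this (a sketch; file names are illustrative):

```shell
# pull the current map out of the cluster and decompile it
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# in crushmap.txt, 'step take' wants the bucket's *name*
# (e.g. WILM-DC1 or default), not the type 'root':
#   step take WILM-DC1

# recompile and push it back
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
```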
[17:28] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[17:28] <noob2> gotcha
[17:28] <noob2> that's ok :)
[17:28] <noob2> now we know
[17:28] <noob2> i set everyone up to use rack and it's moving about 1.5% of the pgs
[17:29] <slang1> noob2: ok cool
[17:29] <noob2> hopefully this fixes my stuck pgs
[17:29] <slang1> noob2: hopefully that solves the..nods
[17:29] <noob2> i tried restarting all the osd's but that didn't change it
[17:29] <noob2> i saw on someones email about setting all tunables to zero
[17:30] <noob2> i'm not sure that would work though because i'm on ubuntu 12.04 and kernel 3.2.x
[17:32] <noob2> does uploading a new crushmap kick off a restart of all the backfill processes?
[17:32] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[17:33] * portante (~user@ has joined #ceph
[17:33] * ScOut3R (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[17:38] * mauilion (~dcooley@2605:6400:2:fed5:4::351) Quit (Remote host closed the connection)
[17:38] * mauilion (~dcooley@2605:6400:2:fed5:4::351) has joined #ceph
[17:40] <jms_> noob2: Using the tunables on kernel 3.5.7 didn't work for me ... I think 3.7.x is needed for those :/
[17:42] * rturk-away is now known as rturk
[17:42] <noob2> woot!
[17:42] <noob2> slang1: that fixed it
[17:42] <noob2> health is ok again.
[17:43] <janos> nice
[17:43] <janos> jms_: dosc say 3.6.x +
[17:43] <noob2> so note to self. changing weights on all osds should happen with the crush tool
[17:43] <janos> *docs
[17:43] <noob2> reweighting 1 by 1 is bad :)
[17:43] <janos> haha
[17:43] <slang1> noob2: cool
[17:43] <noob2> yeah i got impatient and interrupted the reweight by uploading a new map
[17:43] <noob2> bad idea
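For the record, the one-by-one variant noob2 interrupted is the `ceph osd crush reweight` command; the lesson above is to either let each step finish or batch all the weight changes into a single map upload (a sketch, osd name and weight illustrative):

```shell
# one-by-one: each call triggers its own round of data movement,
# so wait for recovery to settle between steps
ceph osd crush reweight osd.0 1.0
ceph -s   # watch until health is OK again before the next change

# batched alternative: edit all the weights in a decompiled
# crushmap and upload it once, so the cluster remaps a single time
```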
[17:44] <janos> tunables2 the dosc say 3.9 + )_o
[17:44] <noob2> wow
[17:44] <janos> gah i can't type. sorry
[17:44] <janos> *docs
[17:44] <noob2> well kernel 3.8-rc7 is out :)
[17:45] * capri (~capri@ has joined #ceph
[17:47] * noob21 (~noob2@ext.cscinfo.com) has joined #ceph
[17:48] <noob21> is the blog borked? the css isn't rendering or something on firefox
[17:48] * janos looks
[17:49] <janos> looks ok here
[17:49] <janos> not a definitive test or anything, but WOMM!
[17:49] <janos> ;)
[17:49] <janos> aww man i haven't had a chance to say that in eons
[17:49] <noob21> lol
[17:50] <noob21> seems to just be my firefox. chrome works
[17:53] * capri (~capri@ Quit (Ping timeout: 480 seconds)
[17:53] <noob21> i'm thinking i should go through a round of upgrades to 0.56.2
[17:54] * aliguori_ (~anthony@ has joined #ceph
[17:54] <sstan> Has there been any improvement in the documentation lately ?
[17:54] * noob2 (~noob2@ext.cscinfo.com) Quit (Ping timeout: 480 seconds)
[17:57] <noob21> in what part of the docs?
[17:57] <noob21> i think they're auto built from source
[17:58] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[18:00] * ircolle (~ircolle@c-67-172-132-164.hsd1.co.comcast.net) has joined #ceph
[18:01] <sstan> ah I didn't know that
[18:01] <noob21> yeah that's what i heard
[18:01] <sstan> docs could be better I think ... I'd write some, but I'm not a pro yet
[18:01] <noob21> go for it :). fork the github repo and write away i'd think
[18:02] <noob21> i made a small change recently with command usage for the radosgw
[18:03] <sstan> you mean anyone can push stuff in the git ?
[18:03] <noob21> yeah you can fork the repo
[18:03] * rturk is now known as rturk-away
[18:03] <noob21> then submit a pull request with your changes
[18:03] <noob21> it's great :D
[18:03] * rturk-away is now known as rturk
[18:11] * rturk is now known as rturk-away
[18:11] * BillK (~BillK@ Quit (Ping timeout: 480 seconds)
[18:22] * capri (~capri@ has joined #ceph
[18:24] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Quit: Ex-Chat)
[18:24] * sleinen (~Adium@2001:620:0:25:ec33:f38b:ce4a:f461) has joined #ceph
[18:24] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[18:24] * loicd (~loic@lvs-gateway1.teclib.net) Quit (Ping timeout: 480 seconds)
[18:25] * The_Bishop_ (~bishop@e177090133.adsl.alicedsl.de) has joined #ceph
[18:27] * BillK (~BillK@58-7-168-223.dyn.iinet.net.au) has joined #ceph
[18:28] <sstan> I'm sure a good book about Ceph would sell ;)
[18:29] <noob21> i'm sure
[18:33] * The_Bishop (~bishop@e179013183.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[18:34] * via (~via@smtp2.matthewvia.info) Quit (Ping timeout: 480 seconds)
[18:42] * via (~via@smtp2.matthewvia.info) has joined #ceph
[18:42] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[18:44] * sleinen (~Adium@2001:620:0:25:ec33:f38b:ce4a:f461) Quit (Quit: Leaving.)
[18:44] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[18:49] <noob21> anyone else joining the inktank talk at 1pm?
[18:49] <noob21> i think it's mountain time
[18:49] <dignus> what's the topic this week? sorry, browser is b0rked atm :)
[18:50] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has joined #ceph
[18:50] <joao> 'advanced topics on ceph' I think
[18:50] <joao> btw, is that 1pm PST?
[18:51] <noob21> yeah i think so
[18:51] <noob21> it says it's starting in an hour
[18:51] <dignus> aha
[18:51] <joao> then it can't be PST
[18:51] <dignus> 8PM for me ;)
[18:51] <noob21> lol
[18:51] * jtang2 (~jtang@2001:770:10:500:6c5c:ace3:6c57:7c91) Quit (Ping timeout: 480 seconds)
[18:51] <joao> it's 9h50am PST afaik
[18:51] <noob21> beats me but it says 56 min remaining
[18:52] <gregaf> Sage is doing a webinar in 8 minutes
[18:52] <gregaf> I believe
[18:52] <dignus> last week it was at 7PM IIRC (10 minutes from now)
[18:52] <joao> oh
[18:52] <noob21> weird.. i wonder why its saying to wait
[18:52] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[18:54] * capri (~capri@ Quit (Ping timeout: 480 seconds)
[18:55] * chutzpah (~chutz@ has joined #ceph
[18:58] * loicd (~loic@2a01:e35:2eba:db10:710b:b6a1:1908:4a44) has joined #ceph
[19:00] * leseb (~leseb@mx00.stone-it.com) Quit (Remote host closed the connection)
[19:00] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[19:09] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (Quit: Leaving.)
[19:10] <ShaunR> I've been reading and it looks like since i'm not using cephfs and only using rados/rbd to connect to ceph that i don't need an MDS, can anybody confirm this?
[19:10] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has joined #ceph
[19:10] <dignus> ShaunR: true
[19:11] <joao> MDS is only required for cephfs
[19:13] * BillK (~BillK@58-7-168-223.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[19:13] * sjustlaptop (~sam@2607:f298:a:697:2423:5e17:cc4e:80ed) has joined #ceph
[19:16] * jluis (~JL@ has joined #ceph
[19:16] <ShaunR> where are the docs for all the commands that can be issued to ceph? i found a doc for removing an MDS but ceph --help doesn't even show them
[19:17] <ShaunR> I'm also seeing HEALTH_WARN mds b is laggy
[19:17] <ShaunR> but i just removed them.
[19:17] <ShaunR> ceph mds rm 0 && ceph mds rm 1
[19:20] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[19:20] * Cube (~Cube@ has joined #ceph
[19:21] * joao (~JL@ Quit (Ping timeout: 480 seconds)
[19:21] * capri (~capri@ has joined #ceph
[19:23] * Cube1 (~Cube@ has joined #ceph
[19:25] <Kdecherf> Does anyone know the average memory size used by an object in the mds cache? (I want to increase the global cache size)
[19:25] * themgt (~themgt@97-95-235-55.dhcp.sffl.va.charter.com) Quit (Quit: themgt)
[19:26] <jms_> Anyone seen a fix for when using btrfs it makes the osd current directory read-only? :/
[19:26] <jms_> This is with mkcephfs
[19:26] * BillK (~BillK@58-7-139-218.dyn.iinet.net.au) has joined #ceph
[19:28] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[19:30] <jms_> http://pastie.org/6136779
[19:32] * sjustlaptop (~sam@2607:f298:a:697:2423:5e17:cc4e:80ed) Quit (Ping timeout: 480 seconds)
[19:33] * capri_on (~capri@ has joined #ceph
[19:36] * capri_on (~capri@ Quit ()
[19:37] <ShaunR> [root@storage1 munin]# ceph mds newfs metadata data
[19:37] <ShaunR> strict_strtoll: expected integer, got: 'metadata'
[19:37] * joao (~JL@ has joined #ceph
[19:37] * ChanServ sets mode +o joao
[19:38] * jluis (~JL@ Quit (Ping timeout: 480 seconds)
[19:40] <noob21> is the ceph fs coming close to being production ready?
[19:40] <noob21> i'm watching sage's talk
[19:40] * capri (~capri@ Quit (Ping timeout: 480 seconds)
[19:42] * sleinen (~Adium@user-23-12.vpn.switch.ch) has joined #ceph
[19:43] * loicd (~loic@2a01:e35:2eba:db10:710b:b6a1:1908:4a44) Quit (Quit: Leaving.)
[19:43] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:50] * dosaboy (~user1@host86-164-229-186.range86-164.btcentralplus.com) Quit (Remote host closed the connection)
[19:51] * dosaboy (~gizmo@host86-164-229-186.range86-164.btcentralplus.com) has joined #ceph
[19:57] <gregaf> jms_: you should be okay with just renaming the "current" subvolume out of the way and letting the OSD recover from an older snapshot
[19:58] <gregaf> ShaunR: that command expects integer pool IDs, not pool names
[19:58] <ShaunR> gregaf: i was reading this... http://www.sebastien-han.fr/blog/2012/07/04/remove-a-mds-server-from-a-ceph-cluster/
[19:58] <gregaf> however, right now you can't fully remove the last MDS from a cluster once you turn one on; there's a bug for it in the tracker but it's not super high-priority right now since it's just a bit of an annoyance
[19:59] <gregaf> noob21: we do have people working hard on the MDS again, but I hate to make guesses about when ;)
[19:59] <noob21> yeah
[19:59] <noob21> no worries i was just wondering
[20:00] <gregaf> ShaunR: hmm, looks like that's off a bit, although I like the workaround for the error message
[20:00] <ShaunR> gregaf: where are all the docs for this stuff, i cant find anything
[20:00] * rturk-away is now known as rturk
[20:01] <gregaf> the MDS is less doc'ed than the other things since it's less ready for use
[20:02] <jms_> gregaf: This is during the initial mkcephfs run on an empty partition ... I wouldn't _think_ it would snapshot by then :/
[20:02] <ShaunR> gregaf: so then i would want the command to look somthing like ceph mds newfs 0 1 2? i only have the default 3 pools
[20:02] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[20:03] <gregaf> ShaunR: you're telling it which pools to use for file data and for MDS metadata; in this case you don't much care since you won't be turning on an MDS or writing anything
[20:03] <gregaf> but it would be "ceph mds newfs 0 1" in order to use pool 0 ("data") and 1 ("metadata") as you'd expect
[20:04] <gregaf> or perhaps the other way around; I forget and it doesn't much matter for you ;)
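A sketch of what that lookup looks like in practice (the pool listing shown is the stock layout of a default cluster of that era, not ShaunR's actual output, and gregaf's caveat about the argument order stands):

```shell
# list the numeric pool IDs that newfs expects
ceph osd lspools
# a default cluster prints something like: 0 data,1 metadata,2 rbd,

# then pass the integer IDs, not the names
# (per gregaf, the data/metadata order may be the other way around)
ceph mds newfs 0 1
```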
[20:04] <ShaunR> why not pool 3 (rbd)
[20:04] * ShaunR notes he's very new to ceph
[20:04] <gregaf> ShaunR: "newfs" is a command for CephFS, not for the rest of the system, and you're specifying where to store the CephFS data with that command
[20:04] <gregaf> the RBD pool is by default only used for RBD images ;)
[20:05] <gregaf> and you presumably don't want any CephFS data going in there
[20:05] <ShaunR> ah, i've been storing my rbd images in data
[20:05] <gregaf> jms_: odd, not sure then
[20:05] <gregaf> kernel version?
[20:05] <ShaunR> guess thats not the right place
[20:06] <gregaf> ShaunR: well, it doesn't much matter as long as you can handle mixing the data objects and filtering appropriately (which you seemingly aren't worried about since you're only using RBD)
[20:06] <gregaf> jms_: and ask sjust or somebody who pays more attention to the btrfs/OSD interactions :)
[20:07] <ShaunR> I'd really rather use the system in the way it was intended to be, usually that causes less problems :)
[20:07] <sjust> jms_: haven't seen anything like that
[20:07] <gregaf> ShaunR: the use of those pools is just convention — all the tools can handle using any pool, the default pools with various commands just won't be set right so you'll need to specify
[20:07] <ShaunR> i though about creating my own pool called vmimages
[20:08] <janos> i create separate pools to avoid any issues
[20:08] <janos> though i initially did it because i kinda feared treading on default pools out of ignorance
[20:08] <janos> ;)
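Creating a dedicated pool like the one ShaunR mentions is only a couple of commands (a sketch; the pool name vmimages comes from the discussion, the pg count 128 and the image name vm01 are illustrative guesses that should really be sized for the cluster):

```shell
# create a dedicated pool for VM images
ceph osd pool create vmimages 128

# then point rbd at it explicitly, since 'rbd' is the default pool
rbd create --pool vmimages --size 10240 vm01
```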
[20:09] <ShaunR> janos: what are you using ceph for specifically?
[20:09] <janos> i'm just dogfooding it at home - so for backups, will be using for vm images, my music, etc
[20:09] <janos> using RBD
[20:10] <janos> i will use at home likely for a year before i push to work
[20:10] <ShaunR> ah, ok, was hoping you had a larger scale deployment :)
[20:10] <janos> i work at home, so i am in both places right now
[20:10] <janos> naw i wish
[20:10] <janos> if anyone wishes to find the JanosSupermassiveHomeCluster project i iwll accept funds
[20:10] <janos> find/fund
[20:11] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) has joined #ceph
[20:14] <ShaunR> gregaf: since you're here, and if you have time i got a question about storage... mainly raid. Currently we build out our VM hosts with expensive raid controllers (LSI 9266-4i) using 4 1TB sata RE disks in a raid 10 array. Our bottleneck is always IO and we're left with a lot of diskspace, ram and cpu left over on each host. This is one of the reasons i'm testing ceph right now, we're looking to
[20:14] <ShaunR> stop building these hosts out with these cards/disks and to instead use a storage cluster...
[20:14] * jtang1 (~jtang@ has joined #ceph
[20:15] <gregaf> okay...
[20:15] <ShaunR> What i'm trying to figure out exactly is what to put in the storage servers, it sounds like most say to run the disk as is with one OSD on each, but i'm curious why the storage hosts would perform better with a good raid card in a raid10 array (one osd per raid10 array).
[20:15] <ShaunR> i've done some reading and i keep seeing raid0 setups with single disks
[20:16] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[20:16] * BillK (~BillK@58-7-139-218.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[20:16] <loicd> rturk: hi
[20:16] <ShaunR> just wondering what you think would perform better, a 24 disk server setup in raid 10 by a good card vs using the on board controller with 24 disks separately mounted, each with its own OSD
[20:17] <gregaf> well, 24 OSD daemons is probably going to overload the CPU; we generally recommend that each daemon gets 1GHz of CPU and 1GB of RAM as a rough guide
[20:17] * rturk is now known as rturk-away
[20:17] <gregaf> so if you have way more disks than that amount of compute power, you'd want to do some disk aggregation
[20:18] <gregaf> but otherwise, you're already paying for replication with Ceph so why would you want to lose even more to RAID arrays?
[20:18] * rturk-away is now known as rturk
[20:18] <gregaf> and we've seen lots of issues with even good cards just having abysmal throughput in a lot of our use cases; you can check out nhm_'s series of blog posts on ceph.com for more info about that
[20:19] <gregaf> (he's on vacation right now I think though, so hopefully he won't actually see that ping!)
[20:19] <ShaunR> my current hosts have 24 cpu's (hex core w/ hyperthreading)... they are the E5-2620 2GHz
[20:19] <ShaunR> sorry, dual hex core w/ ht
[20:21] <ninkotech> ShaunR: just for storage?
[20:21] <ShaunR> thats just whats in these test machines i have... but if i need to build them out like that i will
[20:21] <ShaunR> gregaf: i've read a few of his blog posts, specifically the one were he's testing like 6 different cards.
[20:22] <gregaf> anyway, those CPU limits would in general be the only reason I'd bother to aggregate
[20:22] <gregaf> at least at this time
[20:22] <dignus> hm
[20:22] <dignus> question related to that
[20:23] <dignus> would you pick a jbod card over an Intel ICHR based mainboard?
[20:23] <gregaf> using RAID arrays underneath gets you a whole different set of performance and reliability tradeoffs that we don't have much experience with, but in general the reliability ones seem to not be worth it according to Mark's modelling
[20:23] <dignus> I'm very interested to see LSI + cacheCade in action with CPH
[20:23] <dignus> CEPH
[20:23] <gregaf> yeah, I have no idea about that, dignus; data center hardware esoterica isn't my thing
[20:23] <ShaunR> ok, how about the cards, do you think it would be worth spending the money on an LSI card with lots of cache and do JBOD, or do you think the onboard disk controller would be ok? We typically buy mostly supermicro stuff
[20:23] <gregaf> ^ ;)
[20:23] <loicd> leseb: good evening sir ;-)
[20:23] <dignus> ShaunR: all great minds think alike ;)
[20:24] <ShaunR> haha
[20:25] <ninkotech> dignus: works also for small minds :)
[20:25] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[20:25] <dignus> haha
[20:25] <ninkotech> they are even more unified
[20:25] <dignus> all minds thinks alike when it comes to CEPH? :)
[20:25] * rturk is now known as rturk-away
[20:26] <phillipp1> I try to run radosgw with lighttpd AND nginx (via fastcgi) and it crashes each time
[20:26] <ShaunR> if cpu is important for the OSD i'm curious how using the onboard controller's would perform, they typically share/use the cpu right?
[20:26] * LeaChim (~LeaChim@b0faa140.bb.sky.com) Quit (Ping timeout: 480 seconds)
[20:27] <dignus> ShaunR: if you're using the raid functionality.. yes
[20:27] <dignus> ShaunR: but what if you're not?
[20:27] * mauilion (~dcooley@2605:6400:2:fed5:4::351) Quit (Ping timeout: 480 seconds)
[20:27] <dignus> that's what bothers me
[20:27] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[20:29] <ShaunR> gregaf: one last thing, you said this.... [11:18] <gregaf> but otherwise, you're already paying for replication with Ceph so why would you want to lose even more to RAID arrays?
[20:29] * nwat (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[20:29] <ShaunR> are you saying lose in performance or just meaning lose in diskspace
[20:30] <dignus> both :)
[20:30] <dignus> you add network latency anyway
[20:30] <ShaunR> thats what i was thinking but wanted to make sure
[20:35] * LeaChim (~LeaChim@5e0d73fe.bb.sky.com) has joined #ceph
[20:43] * eschnou (~eschnou@154.180-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:43] <leseb> loicd: bonsoir :)
[20:44] <ShaunR> anybody ever tried to pxe boot osd's
[20:44] <ShaunR> so that the osd didnt need a OS disk
[20:46] <janos> hrmmm. i have not, though i like that idea
[20:46] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[20:46] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[20:48] * gaveen (~gaveen@ has joined #ceph
[20:55] <ShaunR> OnApp's new storage system can do that.
[21:09] * hybrid5121 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[21:10] * leseb_ (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[21:10] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[21:11] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[21:12] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[21:14] * hybrid512 (~w.moghrab@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[21:16] * jskinner (~jskinner@ has joined #ceph
[21:23] * vata (~vata@2607:fad8:4:6:65a5:37a6:d751:49f7) Quit (Ping timeout: 480 seconds)
[21:27] * nyeates (~nyeates@pool-173-59-239-231.bltmmd.fios.verizon.net) has joined #ceph
[21:27] * gaveen (~gaveen@ Quit (Remote host closed the connection)
[21:31] * jjgalvez (~jjgalvez@ has joined #ceph
[21:33] * wschulze1 (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[21:40] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Ping timeout: 480 seconds)
[21:40] <sstan> I thought about that. It can certainly be done. I was thinking about booting the OS via iSCSI
[21:41] <sstan> iSCSI would be supported by an RBD
[21:41] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[21:42] * nyeates (~nyeates@pool-173-59-239-231.bltmmd.fios.verizon.net) Quit (Quit: nyeates)
[21:42] * aliguori_ (~anthony@ Quit (Ping timeout: 480 seconds)
[21:43] * vata (~vata@2607:fad8:4:6:345e:43e0:3ced:a3d2) has joined #ceph
[21:48] * KindOne (KindOne@h72.34.28.71.dynamic.ip.windstream.net) Quit (Ping timeout: 480 seconds)
[21:49] <noob21> anyone else seen errors with the rados gw and huge uploads? my uploads seem to fail at 2GB
[21:49] <noob21> it's using the rados apache setup
[21:52] * aliguori_ (~anthony@ has joined #ceph
[21:56] <lightspeed> infernix was recently describing something failing for him at 2GB I think, not sure if it involved rgw though
[21:56] * jskinner (~jskinner@ has joined #ceph
[21:57] <ShaunR> noob21: what version of apache?
[21:58] <ShaunR> sstan: i was thinking more of a livecd type setup
[22:00] <sstan> that's a good idea, but you would have to re-write your image for every change you want to make (updates, etc.)
[22:00] * ScOut3R (~scout3r@5400CAE0.dsl.pool.telekom.hu) has joined #ceph
[22:01] * danieagle (~Daniel@ has joined #ceph
[22:01] <sstan> and for your livecd situation you would need some writable space I think
[22:07] * rturk-away is now known as rturk
[22:07] * rturk is now known as rturk-away
[22:07] * rturk-away is now known as rturk
[22:08] * KindOne (~KindOne@h215.211.89.75.dynamic.ip.windstream.net) has joined #ceph
[22:11] * loicd (~loic@magenta.dachary.org) Quit (Read error: Connection reset by peer)
[22:11] * loicd1 (~loic@2a01:e35:2eba:db10:710b:b6a1:1908:4a44) has joined #ceph
[22:15] <ShaunR> sstan: livecd's i'm pretty sure use RAM normally.
[22:15] <noob21> ShaunR: lemme check
[22:15] <sstan> hmm true
[22:16] <sstan> they copy themselves in a ramdisk
[22:16] <noob21> ShaunR: 2.2.22-1ubuntu1-inktank1
[22:17] <ShaunR> sstan: I don't know much about ceph yet but i assume the OSD's would probably need to log to a remote syslog server or something
[22:17] <sstan> so yeah ... you could probably make some custom liveCD that contains, perhaps a puppet agent or something
[22:17] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[22:17] <ShaunR> noob21: where is that?
[22:17] <noob21> it's inktank's custom version i think
[22:18] <ShaunR> from?
[22:18] <sstan> ShaunR : every system needs a different hostname ... how would you address that problem ?
[22:20] <ShaunR> sstan: could pull from RDNS... you would still need to pull some files from a master server.
[22:20] * leseb_ (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[22:20] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[22:20] <ShaunR> dunno, i think it could be done, would be kinda cool that way you wouldn't have to waste a disk or two on the OS
[22:21] <sstan> ShaunR : I was thinking perhaps some DHCP with an entry for every computer's mac address
[22:21] <sstan> true .. I agree with that
[22:21] * gucki (~smuxi@HSI-KBW-095-208-162-072.hsi5.kabel-badenwuerttemberg.de) Quit (Remote host closed the connection)
[22:21] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[22:21] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[22:22] <ShaunR> sstan: yep! and with pxe it will load configs based on the mac
[22:23] * aliguori_ (~anthony@ Quit (Ping timeout: 480 seconds)
[22:23] <sstan> yeah it could download specific config from some server before starting the daemons
[22:23] * The_Bishop_ (~bishop@e177090133.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[22:24] <sstan> idk why you say "with pxe" ... I thought pxe was serving only ONE image
[22:25] <sstan> 1) n computers boot from identical images 2) acquire distinct hostnames and IPs from a DHCP server 3) download right configuration 4 ) done!
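sstan's four steps map onto a stock PXE stack; here is a minimal sketch with dnsmasq, assuming made-up MACs, addresses, and paths (step 2's per-machine identity is exactly the `dhcp-host` lines):

```conf
# /etc/dnsmasq.conf (sketch) -- every node boots the same PXE image,
# but gets a fixed IP and hostname keyed on its MAC address.
enable-tftp
tftp-root=/srv/tftp
dhcp-boot=pxelinux.0
dhcp-range=192.168.10.100,192.168.10.200,12h
# one line per OSD node: MAC, hostname, static lease
dhcp-host=00:11:22:33:44:01,osd-node-01,192.168.10.101
dhcp-host=00:11:22:33:44:02,osd-node-02,192.168.10.102
```

pxelinux can additionally serve per-machine boot configs from `pxelinux.cfg/01-<mac>`, which is what ShaunR means by loading configs based on the mac.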
[22:26] <lightspeed> they could also be pre-populated with a script that runs on boot to "uniquify" themselves without necessarily pulling any extra config from elsewhere
[22:26] <sstan> I was thinking about puppet for that
[22:26] <lightspeed> yeah I guess that's more flexible in terms of subsequent modifications
[22:27] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[22:27] <sstan> the only problem I guess is that the DHCP / puppet master servers must not fail
[22:30] <sstan> the 'master node' could simply be some USB key ... that can be transferred to any machine of the cluster (any machine could host the "master node" OS)
[22:32] * aliguori (~anthony@ has joined #ceph
[22:32] * mdawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[22:32] <sstan> assuming that all the machines will never fail at the same time .. one could even design a system where the 'master node' is a virtual machine backed by some RBD
[22:33] <lightspeed> yeah that's a bit risky though :)
[22:33] * mdawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit ()
[22:33] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[22:34] <sstan> I like the concept of RADOS hosting all what is needed to run ... itself
[22:35] <lightspeed> I don't know about puppet, but it should be straightforward to set up HA for DHCP / TFTP / HTTP for the PXE server
[22:36] <sstan> puppet is straightforward too
[22:38] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:38] <sstan> I should really do that... I said here two weeks ago that I would XD
[22:38] <phantomcircuit> sstan, i actually did that for fun
[22:38] <phantomcircuit> while it works it's obviously a bad idea
[22:38] <sstan> phantom: PXE ?
[22:38] <sstan> what happened ? tell us !
[22:39] <phantomcircuit> start normal monitor
[22:39] <phantomcircuit> start monitor on mapped rbd
[22:39] <phantomcircuit> kill original monitor
[22:39] <phantomcircuit> look no hands
[22:39] <phantomcircuit> of course if it dies you cant restart it
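phantomcircuit's "look no hands" experiment, spelled out as a hedged sketch: a monitor whose store lives on an RBD image served by the very cluster it monitors. The mon id "b", image name, size, and address below are made up, and as he says, if this mon dies it can't be restarted, so it's a toy, not a deployment pattern:

```shell
# Sketch only: assumes an existing healthy cluster and a spare mon.b
# entry in ceph.conf.
rbd create monstore --size 1024
rbd map monstore                              # shows up as e.g. /dev/rbd0
mkfs.xfs /dev/rbd0
mkdir -p /var/lib/ceph/mon/ceph-b
mount /dev/rbd0 /var/lib/ceph/mon/ceph-b
ceph mon getmap -o /tmp/monmap                # current monmap
ceph auth get mon. -o /tmp/mon.keyring        # mon keyring
ceph-mon -i b --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
ceph mon add b 192.0.2.10:6789                # join it to the quorum
service ceph start mon.b
ceph mon remove a                             # retire the original monitor
```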
[22:40] <ShaunR> sstan: because pxe is what you need to diskless boot.
[22:40] <dmick> ShaunR: I think sstan was asking if phantomcircuit had been doing PXE
[22:41] <sstan> phantomcircuit : true I didn't even think about that aspect for PXE. There is no way of succeeding without doing the transition you described
[22:41] <phantomcircuit> sstan, also yeah you could do pxe and i suspect it would work fine
[22:41] <sstan> yeah :) thanks dmick; I wanted to know what phantomcircuit's setup was
[22:42] <phantomcircuit> but putting the monitors on rbd is a bad idea
[22:42] <phantomcircuit> i just did it to see if it was possible
[22:42] <sstan> thanks for sharing that! :)
[22:42] <ShaunR> I'm not sure why you would want to put the monitors on pxe other than for the initial installation... the OSD's really only make sense to do what i was saying
[22:43] <wer> I can't seem to get above 200MB on writes with my cluster. I don't know where my bottleneck is.
[22:43] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[22:44] <ShaunR> the way onapps works is that the storage server boots from pxe originally, shows up in a mangement interface where you configure that server. Once you do that it reboots the server and it comes back up and loads it's config.
[22:46] <ShaunR> I may play with this idea more once i get more familiar with OSD's
[22:47] <ShaunR> the idea of wasting a disk on the OS pains me :)
[22:48] <sstan> ShaunR : you always need monitors around to use ceph ..
[22:48] <lightspeed> other options for avoiding such disk wastage could include booting from SD / USB media or similar
[22:48] <ShaunR> sstan: right, i know that... i dont care about those machines because they are separate physical machines anyway
[22:48] <ShaunR> they can have a disk or two
[22:49] <darkfader> ideally, your systems try pxe and fallback to usb if they get an invalid info from pxe (and only get valid info if they're in for an update)
[22:49] <darkfader> that way you don't see your whole infra go nuts if the dhcp is broken after a power failure
[22:50] <lightspeed> darkfader: that sounds vaguely like vmware's latest iteration of autodeploy (with "stateless caching")
[22:50] <sstan> how would booting something from PXE allow updating the system that's on the disk ?
[22:50] <ShaunR> we use pxe here like crazy, all our deploys are done over it for initial system deployments, i even use syslinux to boot cdrom images (ultimate boot cd, live cd's, etc)
[22:50] * LeaChim (~LeaChim@5e0d73fe.bb.sky.com) Quit (Ping timeout: 480 seconds)
[22:51] <ShaunR> sstan: you would reboot the server and it would load a newer OS/kernel
[22:51] <ShaunR> or even software version
[22:51] <ShaunR> it would basically pull a updated image
[22:51] * LeaChim (~LeaChim@5e0d73fe.bb.sky.com) has joined #ceph
[22:51] <darkfader> yup, as long as you have proper separation of data and OS disks (usb or not) this is all quite clean and safe
[22:52] <ShaunR> assuming your going the live route...
[22:52] <sstan> ah the pxe image would boot and put an updated image on the usb disk
[22:52] <ShaunR> sstan: well in my case the systems have no local storage, it would work like a live cd works...
[22:54] * KindOne (~KindOne@h215.211.89.75.dynamic.ip.windstream.net) Quit (Ping timeout: 480 seconds)
[22:56] <sstan> if it has no local storage .. what would happen if, like darkfader said, there was to be a power failure
[22:58] <noob21> scuttlemonkey: nice article :D
[22:59] <sstan> ceph blog ? I 'll read that :)
[22:59] <noob21> yep
[22:59] <scuttlemonkey> noob2: thanks, I only wrote the first paragraph though...the rest was all the Synnefo guys
[22:59] <noob21> it's cool how people are starting to integrate ceph into their infrastructure
[22:59] <scuttlemonkey> yeah, I love seeing that stuff
[23:00] <noob21> i'm going to head out for the evening. thanks for the help today guys
[23:00] <scuttlemonkey> glad you got it ironed
[23:00] <scuttlemonkey> enjoy your unplugg-ed-ness
[23:00] <noob21> :)
[23:01] * noob21 (~noob2@ext.cscinfo.com) Quit (Quit: Leaving.)
[23:01] <sagewk> sjust,gregaf: wip-osd-msgr2... but the first patch would go in master, i think (not needed here)
[23:01] <sagewk> and no real need to backport
[23:02] <sjust> looking
[23:02] <sagewk> (no need to backport to argonaut i mean)
[23:03] <ShaunR> sstan: the pxe server would have to come up first
[23:03] <ShaunR> and whatever other servers that the OSD's depend on as well
[23:05] <gregaf> sagewk: I wonder if maybe we want that behavior to be configurable, actually....
[23:05] <gregaf> I could see some nasty bugs slipping through if we near-silently drop unknown messages
[23:05] <gregaf> if nothing else we definitely need to make sure that output gets caught by teuthology runs before merging
[23:06] <sagewk> gregaf: good idea
[23:07] <sjust> other than greg's thing, looks good
[23:08] <sagewk> k
[23:08] <gregaf> sagewk: can you glance at my one-liner wip-bobtail-dumper?
[23:09] * ScOut3R (~scout3r@5400CAE0.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[23:09] <gregaf> I guess strictly speaking it should go into next and get cherry-picked onto bobtail
[23:09] <gregaf> but the MDS Dumper isn't working :(
[23:09] <gregaf> (actually testing now)
[23:12] * dalgaaf (~dalgaaf@nrbg-4dbfc6f7.pool.mediaWays.net) has joined #ceph
[23:12] * jtang1 (~jtang@ has joined #ceph
[23:13] * ScOut3R (~ScOut3R@5400CAE0.dsl.pool.telekom.hu) has joined #ceph
[23:13] * eschnou (~eschnou@154.180-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[23:13] * KindOne (KindOne@h3.41.28.71.dynamic.ip.windstream.net) has joined #ceph
[23:19] * dalgaaf (~dalgaaf@nrbg-4dbfc6f7.pool.mediaWays.net) Quit (Quit: Konversation terminated!)
[23:19] <sagewk> gregaf: looks right
[23:19] * KindOne- (KindOne@h149.20.131.174.dynamic.ip.windstream.net) has joined #ceph
[23:19] <gregaf> thanks
[23:21] * ScOut3R (~ScOut3R@5400CAE0.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[23:22] * KindOne (KindOne@h3.41.28.71.dynamic.ip.windstream.net) Quit (Ping timeout: 480 seconds)
[23:22] * KindOne- is now known as KindOne
[23:24] * dosaboy (~gizmo@host86-164-229-186.range86-164.btcentralplus.com) Quit (Quit: Leaving.)
[23:26] * wschulze1 (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[23:26] <madkiss> gregaf: on a cluster where CephFS is not in use, can I create the mds metadata without the danger of dataloss?
[23:27] <gregaf> you mean can you add an MDS?
[23:27] <gregaf> yes
[23:27] <wer> Where my bottleneck at?
[23:27] <gregaf> it writes to segregated pools and if you want you can set it so it can't write to other pools at all (though it won't use them anyway)
[23:28] <madkiss> gregaf: the problem is that for some strange reasons, the MDSes just stopped working (and it's not possible to recover them, I can see an exception when the MDS starts up and that's it)
[23:28] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[23:28] <madkiss> so what i was thinking is
[23:31] <madkiss> I would just do the steps as described in http://www.sebastien-han.fr/blog/2012/07/04/remove-a-mds-server-from-a-ceph-cluster/ and have it recreate the MDS
[23:32] <madkiss> problem is that first of all, the commands as explained in there don't work, and even if I replace the pool names by the corresponding IDs, it still tells me that I would have to use --i-know-what-I-do
[23:32] <madkiss> which sort of scared me off.
[23:32] <ShaunR> madkiss: i just used that page today
[23:33] <madkiss> with bobtail?
[23:33] <ShaunR> with the latest
[23:33] <ShaunR> 56.2
[23:33] <ShaunR> the commands are wrong
[23:33] <madkiss> that's what I think.
[23:33] <ShaunR> `ceph mds rm mds.0` really should be `ceph mds rm 0` for example
[23:34] <ShaunR> and as gregaf pointed out to me earlier this `ceph mds newfs metadata data` should really be the pool ids... so `ceph mds newfs 0 1`
[23:34] <ShaunR> which all worked, but i used this to remove a MDS since i dont need MDS
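ShaunR's pool-ID point can be scripted instead of eyeballed. A hedged sketch: the `pools` string below is a stand-in for real `ceph osd lspools` output (bobtail-era output is `<id> <name>,` pairs), and the argument order assumes metadata-pool-ID first, matching the blog post's `newfs metadata data` form:

```shell
# Derive 'ceph mds newfs <metadata-id> <data-id>' from lspools-style
# output instead of hard-coding the IDs.
pools="0 data,1 metadata,2 rbd,"   # stand-in for: pools=$(ceph osd lspools)
meta_id=$(echo "$pools" | tr ',' '\n' | awk '$2 == "metadata" {print $1}')
data_id=$(echo "$pools" | tr ',' '\n' | awk '$2 == "data" {print $1}')
echo "ceph mds newfs $meta_id $data_id"
```

With the default pool layout above this prints `ceph mds newfs 1 0`; on a cluster where the pools were recreated, the IDs will differ, which is the whole reason to look them up.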
[23:35] <gregaf> madkiss: right, so doing that does create a new *filesystem*; so you'd want to remove and recreate the filesystem pools
[23:35] <gregaf> but it won't hurt your RBD data for instance
[23:35] <madkiss> I don't think this cluster currently *has* any CephFS data.
[23:35] <madkiss> my real problem, anyway, is that I see
[23:36] <madkiss> 2013-02-12 23:34:37.044179 mon.0 [INF] mdsmap e27885: 1/1/1 up {0=be-ceph03=up:replay(laggy or crashed)}
[23:36] <madkiss> and this won't go away.
[23:36] <ShaunR> madkiss: yep
[23:36] <ShaunR> thats what i had
[23:36] <ShaunR> the last command fixed mine, but i dont use cephfs only rbd
[23:36] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 18.0.2/20130201065344])
[23:37] <madkiss> Gives me
[23:37] <madkiss> this is DANGEROUS and will wipe out the mdsmap's fs, and may clobber data in the new pools you specify. add --yes-i-really-mean-it if you do.
[23:37] <ShaunR> yep.
[23:37] <ShaunR> again, i was only using rbd with my vm's though...
[23:38] <madkiss> and you did execute this command on all your RBD pools?
[23:38] <ShaunR> i dont know enough about what it's doing to say for sure it's not going to F your setup.
[23:38] <gregaf> madkiss: right, so it's bad for your CephFS data
[23:38] <gregaf> not for your other data
[23:38] <gregaf> you shouldn't execute it on your RBD pool, but even if you did it wouldn't hurt anything, just be silly :)
[23:38] <ShaunR> madkiss: i only ran it on pools 0 and 1 if i remember correctly
[23:38] <gregaf> somebody though had data in CephFS they wanted to keep and decided that running newfs would be a good idea
[23:39] <ShaunR> and in my case my rbd data was actually in 1
[23:39] <gregaf> because you can totally do that with ext4, right? ;)
[23:39] <ShaunR> gregaf: what FS do you recommend for OSD's?
[23:39] <gregaf> depends, but generally xfs
[23:39] <ShaunR> i was testing ext4 but i'm thinking XFS might be better
[23:39] <ShaunR> i'm killing my existing setup now to split my raid10 array into separate disks
[23:40] <madkiss> gregaf: to be honest, all I would like to do is get the MDSes out of this setup so that I can properly restore them
[23:40] * Kolobok (426d18f4@ircip1.mibbit.com) has joined #ceph
[23:42] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Remote host closed the connection)
[23:42] <madkiss> WTF
[23:42] <Kolobok> Hello guys, wonder if anybody could help me out, was trying to set up a simple ceph conf with no luck getting proper ceph health
[23:42] <gregaf> madkiss: what do you mean properly restore them?
[23:42] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[23:42] <gregaf> if you aren't using them, apparently creating a newfs and not turning on any MDS will clear them out of your hair and ceph health warnings
[23:42] <madkiss> what I have now is " health HEALTH_WARN 3768 pgs stale; 3768 pgs stuck stale"
[23:44] <Kolobok> http://pastebin.com/0D3CMk93
[23:45] <Kolobok> I get HEALTH_ERR
[23:46] <Kolobok> If i remove devs = /dev/sdb from config then it checks HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42 degraded (50.000%)
[23:47] <ShaunR> Kolobok: this a single server with only 1 osd?
[23:47] <lightspeed> Kolobok: it looks like maybe you have just a single OSD, but a replication level of 2
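A single OSD with the default replication size of 2 can never go active+clean, which matches Kolobok's "21/42 degraded (50.000%)". For a one-OSD test cluster, one hedged workaround (bobtail-era syntax, default pool names assumed) is to drop the replication size rather than add an OSD:

```shell
# Reduce replication to 1 on the default pools so a lone OSD can
# satisfy placement; only sensible for throwaway test setups.
ceph osd pool set data size 1
ceph osd pool set metadata size 1
ceph osd pool set rbd size 1
```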
[23:48] <madkiss> gregaf: apparently, "ceph mds newfs 13 --yes-i-really-mean-it" leads to the cluster going haywire
[23:48] <Kolobok> :) Ok got it, I got like 6 disks in there but i have included only one in the config
[23:49] <madkiss> So I restarted Ceph on two nodes and now I am back to "2013-02-12 23:48:50.737711 mon.0 [INF] mdsmap e27913: 1/1/1 up {0=be-ceph03=up:replay(laggy or crashed)}"
[23:49] <gregaf> the stale means that your OSDs haven't reported in on those PGs in a while
[23:49] <gregaf> I thought you were trying to get rid of your MDSes?
[23:50] <gregaf> in which case you won't want them turning on
[23:50] <gregaf> if you wanted to clean up completely, you would also remove the CephFS pools (data and metadata) so you didn't have any objects sitting around left over from them
[23:50] <gregaf> then you could re-create them as very small pools and run newfs and point to them
[23:50] <gregaf> is how I'd do it
[23:51] <madkiss> gregaf: I didn't make me clear enough apparently.
[23:51] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[23:51] <madkiss> I executed the --yes-i-really … command.
[23:52] <madkiss> and only seconds later, what I was seeing in "ceph -w" was "   health HEALTH_WARN 3768 pgs stale; 3768 pgs stuck stale"
[23:52] <gregaf> that may be, but you didn't cause that by running newfs; that takes time to happen and is because your OSDs aren't reporting in on the PGs very often
[23:53] <madkiss> then why the cluster healthy before doing the command, and why did it turn back into healthy mode after restarting the OSDs on all nodes?
[23:53] <madkiss> s/why/why was/?
[23:53] <gregaf> with the detail you've provided, I have no idea
[23:53] <madkiss> I can run this again if you want more detail.
[23:55] <gregaf> not too fussed about a command that's hidden behind a yes-i-really-mean-it flag ;)
[23:55] <madkiss> wtf.
[23:55] <madkiss> okay, then again. how do I make these "{0=be-ceph03=up:replay(laggy or crashed)}" disappear?
[23:55] * jtang1 (~jtang@ has joined #ceph
[23:57] <madkiss> I tried "ceph mds rm 0" to "ceph mds rm 2", which apparently didn't help
[23:57] <gregaf> turn them off and prevent them from turning back on
[23:57] <gregaf> delete and then recreate the metadata and data pools
[23:57] <gregaf> run "ceph mds newfs 0 1"
[23:58] <gregaf> ta-da, the system doesn't remember you ever had an MDS running and now you don't have any running anyway
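gregaf's cleanup recipe, collected into one hedged sketch. Pool names, pg counts, and the `<...>` IDs are illustrative, flag requirements vary by release, and this wipes the CephFS pools, so it only applies when, as here, nothing in them matters:

```shell
# 1. stop all MDS daemons and keep them from starting again
service ceph -a stop mds
# 2. delete the old CephFS pools and recreate them tiny
ceph osd pool delete data
ceph osd pool delete metadata
ceph osd pool create data 8
ceph osd pool create metadata 8
# 3. note the new pool IDs, then point a fresh (unused) fs at them
ceph osd lspools
ceph mds newfs <metadata-id> <data-id> --yes-i-really-mean-it
```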
[23:58] <Kolobok> === osd.0 === No filesystem type defined! root@ceph~# service ceph start osd1 === osd.1 === No filesystem type defined!
[23:58] <madkiss> the data pool appears to have been deleted by the customer, there is a "metadata" and a "cephfs" pool.

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.