#ceph IRC Log


IRC Log for 2013-03-07

Timestamps are in GMT/BST.

[0:02] <dmick> devs= settings are for mkcephfs or ceph-disk-prepare to initialize the filesystems
[0:03] * drokita (~drokita@ Quit (Quit: Leaving.)
[0:03] * drokita (~drokita@ has joined #ceph
[0:06] * Tiger (~kvirc@ Quit (Ping timeout: 480 seconds)
[0:10] * leseb_ (~leseb@ Quit (Remote host closed the connection)
[0:10] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[0:10] * loicd (~loic@magenta.dachary.org) has joined #ceph
[0:11] * drokita (~drokita@ Quit (Ping timeout: 480 seconds)
[0:12] * rinkusk (~Thunderbi@CPE00259c467789-CM00222d6c26a5.cpe.net.cable.rogers.com) Quit (Ping timeout: 480 seconds)
[0:21] * markbby1 (~Adium@ Quit (Quit: Leaving.)
[0:29] <Kioob> how can I find the space really used by an RBD image + snapshots ?
[0:29] <phantomcircuit> that's a good question
[0:31] <Kioob> :)
[0:38] * rinkusk (~Thunderbi@ has joined #ceph
[0:49] * PerlStalker (~PerlStalk@ Quit (Quit: ...)
[0:52] <dmick> Kioob: you can look up its data prefix with rbd info, then find all those objects with rados ls, then rados stat them, which will give you the actually-used size in each object (and then sum them)
[0:53] <dmick> I don't know of anything that automates that but it's an easy script
[0:53] <Kioob> good !
[0:54] <Kioob> I try that
[0:54] <Kioob> thanks
[0:55] <dmick> and I should say I'm not 100% certain about snapshots and their overlap in usage with parent
[0:55] <dmick> I'm pretty sure the object storage is shared unless/until the writable image changes
[0:56] <dmick> (modulo some bookkeeping)
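The procedure dmick describes (rbd info for the data prefix, rados ls to list the objects, rados stat on each, then sum) can be sketched as below. The pool and image names are hypothetical, and since the live commands need a running cluster, the summing step is demonstrated on sample `rados stat` output instead.

```shell
# On a live cluster the pipeline would look roughly like (names hypothetical):
#   prefix=$(rbd -p rbd_pool info myimage | awk '/block_name_prefix/ {print $2}')
#   rados -p rbd_pool ls | grep "^$prefix" | \
#     while read obj; do rados -p rbd_pool stat "$obj"; done | prefix_used_demo
# Each `rados stat` line ends in ", size <bytes>"; summing the last field
# gives the actually-used bytes across the image's objects.
prefix_used_demo() {
  awk '{ total += $NF } END { print total }'
}
printf '%s\n' \
  'rbd_pool/rb.0.1234.000000000000 mtime 1362600000, size 4194304' \
  'rbd_pool/rb.0.1234.000000000001 mtime 1362600001, size 1048576' | prefix_used_demo
```

As Kioob notes later in the log, stat-ing every object is slow on a large image; exact per-object accounting is inherently O(number of objects).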
[0:58] * noob22 (~cjh@ has joined #ceph
[0:59] <Kioob> I didn't find a "rados stat" command
[0:59] <Kioob> oh, I found
[1:03] * noob21 (~cjh@ Quit (Ping timeout: 480 seconds)
[1:04] <Kioob> it works, thanks dmick
[1:04] <Kioob> (but it's really slow)
[1:05] <dmick> exact answers always are, on the cluster :)
[1:05] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[1:09] * ScOut3R (~scout3r@1F2EAE22.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[1:17] <Kioob> so, last problem : if I use "chooseleaf host" instead of "choose osd", I have stuck PGs in "active+remapped" status
[1:18] <Kioob> with 5 OSD, and a pool with only 2 copies (= 1 replica)
[1:19] <Kioob> erf, 5 hosts
[1:19] <Kioob> (39 OSD)
[1:20] <Kioob> (and I upgrade from 0.56.2 to 0.56.3, same problem)
[1:23] * rinkusk (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[1:24] * rinkusk (~Thunderbi@ has joined #ceph
[1:26] * xiaoxi (~xiaoxiche@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[1:28] * jlogan1 (~Thunderbi@2600:c00:3010:1:d431:8b06:8e11:1828) Quit (Ping timeout: 480 seconds)
[1:30] * ScOut3R (~ScOut3R@1F2EAE22.dsl.pool.telekom.hu) has joined #ceph
[1:37] * jlogan1 (~Thunderbi@ has joined #ceph
[1:38] <iggy> Kioob: if I understand what you said there, the master counts as a replica, so you want 2 replicas total
[1:38] * ScOut3R (~ScOut3R@1F2EAE22.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[1:39] <Kioob> yes iggy
[1:40] <Kioob> so with 5 hosts, I shouldn't have any "mapping" problem
[1:49] * rinkusk (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[1:54] <dmick> Kioob: you might try crushtool --test --output-csv on your map to see what's wrong
[1:54] <dmick> something must be
[1:55] <Kioob> oh, great tool
[2:02] * vata (~vata@2607:fad8:4:6:7c6f:a43a:3c1f:85ab) Quit (Quit: Leaving.)
[2:04] <Kioob> the man page of the tool looks wrong :D
[2:04] <Kioob> :S
[2:05] * tryggvil_ (~tryggvil@95-91-243-238-dynip.superkabel.de) has joined #ceph
[2:07] * yanzheng (~zhyan@ has joined #ceph
[2:08] * tryggvil (~tryggvil@95-91-243-238-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[2:08] * tryggvil_ is now known as tryggvil
[2:08] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[2:12] <ShaunR> hmm, should a stopped osd still be showing up?
[2:16] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[2:20] <Kioob> dmick: I use that command, without any result : crushtool -i crush.compiled --test --rule 4 --output-csv
[2:20] <Kioob> but I can't find documentation about that command
[2:20] <dmick> ls -ltr
[2:21] * alram (~alram@ Quit (Quit: leaving)
[2:21] * Cube1 (~Cube@ Quit (Ping timeout: 480 seconds)
[2:21] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[2:23] <dmick> see the output?
[2:23] <Kioob> lol... thanks :)
[2:24] <dmick> it's not exactly obvious with no docs
[2:24] <dmick> so don't feel bad
[2:27] * LeaChim (~LeaChim@b0faa0c8.bb.sky.com) Quit (Ping timeout: 480 seconds)
[2:29] <Kioob> 469,7,11,27,22
[2:29] <Kioob> 470,8,3,10
[2:30] <Kioob> how should I read that (from placement_information.csv)
[2:32] <Kioob> well
[2:32] <Kioob> I added the "--show-bad-mappings" option
[2:32] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[2:32] <Kioob> and I see a lot of lines like that :
[2:32] <Kioob> bad mapping rule 4 x 19 num_rep 2 result [30]
[2:32] <Kioob> bad mapping rule 4 x 23 num_rep 2 result [28]
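Each of those lines means CRUSH returned fewer OSDs than requested for that input: `num_rep 2` was asked for, but the `result` list holds only one OSD, which is exactly why PGs end up stuck `active+remapped`. A small helper (an editorial sketch, not a ceph tool) can count how many inputs came up short, demonstrated here on the two lines from the log:

```shell
# Count `crushtool --show-bad-mappings` lines where the bracketed result
# list has fewer entries than the requested num_rep.
count_short_mappings() {
  awk '$1 == "bad" {
    want = $8                    # the value following num_rep
    got  = split($NF, ids, ",")  # entries inside the [..] result list
    if (got < want) short++
  } END { print short + 0 }'
}
printf '%s\n' \
  'bad mapping rule 4 x 19 num_rep 2 result [30]' \
  'bad mapping rule 4 x 23 num_rep 2 result [28]' | count_short_mappings
```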
[2:36] * b1tbkt (~Peekaboo@68-184-193-142.dhcp.stls.mo.charter.com) has joined #ceph
[2:40] * noob22 (~cjh@ Quit (Read error: Connection reset by peer)
[2:41] * noob21 (~cjh@ has joined #ceph
[2:43] <Kioob> sorry, but I don't understand where the mistake is.
[2:46] <Kioob> → my OSD tree is http://pastebin.fr/26576 . I have 4 active OSD (brontes, alim, noburo, and keron).
[2:46] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[2:47] <Kioob> and the rule : http://pastebin.fr/26577
[2:51] * diegows (~diegows@ has joined #ceph
[2:57] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[2:59] * dpippenger (~riven@ Quit (Remote host closed the connection)
[3:00] * KindTwo (~KindOne@h79.24.131.174.dynamic.ip.windstream.net) has joined #ceph
[3:00] * noob21 (~cjh@ Quit (Quit: Leaving.)
[3:05] * KindOne (KindOne@h27.17.131.174.dynamic.ip.windstream.net) Quit (Ping timeout: 480 seconds)
[3:05] * KindTwo is now known as KindOne
[3:08] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[3:08] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[3:08] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[3:12] * sjustlaptop (~sam@m9a0436d0.tmodns.net) has joined #ceph
[3:19] * rinkusk (~Thunderbi@CPEbc14015a7093-CMbc14015a7090.cpe.net.cable.rogers.com) has joined #ceph
[3:20] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[3:29] * rinkusk1 (~Thunderbi@cmr-208-97-77-198.cr.net.cable.rogers.com) has joined #ceph
[3:30] <themgt> I have a bunch of pgs which are only on one osd (osdmap e921 pg 2.71 (2.71) -> up [3] acting [3]) my other osds aren't full. is there some way to force them to replicate over?
[3:31] * rinkusk (~Thunderbi@CPEbc14015a7093-CMbc14015a7090.cpe.net.cable.rogers.com) Quit (Ping timeout: 480 seconds)
[3:38] * sjustlaptop (~sam@m9a0436d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[3:43] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[3:45] * rturk is now known as rturk-away
[3:55] * jlogan1 (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[4:05] * xiaoxi (~xiaoxiche@jfdmzpr05-ext.jf.intel.com) Quit (Ping timeout: 480 seconds)
[4:16] * noob21 (~cjh@pool-96-249-204-90.snfcca.dsl-w.verizon.net) has joined #ceph
[4:17] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[4:19] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[4:19] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[4:21] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[4:21] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[4:23] * Psi-jack (~psi-jack@yggdrasil.hostdruids.com) has joined #ceph
[4:28] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[4:28] * loicd (~loic@magenta.dachary.org) has joined #ceph
[4:31] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:31] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[4:31] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[4:34] * rinkusk1 (~Thunderbi@cmr-208-97-77-198.cr.net.cable.rogers.com) Quit (Ping timeout: 480 seconds)
[5:01] * noob21 (~cjh@pool-96-249-204-90.snfcca.dsl-w.verizon.net) Quit (Quit: Leaving.)
[5:12] * jefferai (~quassel@quassel.jefferai.org) Quit (Ping timeout: 480 seconds)
[5:22] * jefferai (~quassel@quassel.jefferai.org) has joined #ceph
[5:22] * noahmehl (~noahmehl@cpe-75-186-45-161.cinci.res.rr.com) Quit (Read error: No route to host)
[5:23] * noob21 (~cjh@pool-96-249-204-90.snfcca.dsl-w.verizon.net) has joined #ceph
[5:25] * noob21 (~cjh@pool-96-249-204-90.snfcca.dsl-w.verizon.net) Quit ()
[5:30] * ivoks (~ivoks@jupiter.init.hr) Quit (Ping timeout: 480 seconds)
[5:40] * ivoks (~ivoks@jupiter.init.hr) has joined #ceph
[5:40] * ivoks (~ivoks@jupiter.init.hr) Quit ()
[5:40] * ivoks (~ivoks@jupiter.init.hr) has joined #ceph
[5:41] * ivoks (~ivoks@jupiter.init.hr) Quit ()
[5:41] * ivoks (~ivoks@jupiter.init.hr) has joined #ceph
[5:42] * ivoks (~ivoks@jupiter.init.hr) Quit ()
[5:46] * xiaoxi (~xiaoxiche@ has joined #ceph
[5:48] * ivoks (~ivoks@jupiter.init.hr) has joined #ceph
[5:50] <infernix> here's a question, can i run multiple instances of 'rbd rm rbdimage' on the same image to speed things up?
[5:52] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[5:53] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[5:53] <elder> On a kernel rbd image it won't speed anything up.
[5:53] <elder> Wait.
[5:54] <elder> I guess I don't know the answer... It doesn't involve the kernel.
[5:54] <elder> But I predict the answer is "no."
[5:54] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[5:56] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) Quit (Quit: themgt)
[5:57] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) has joined #ceph
[5:57] <infernix> time rbd rm 4gbtestfile335: 0m25.456s
[5:58] <infernix> time (4x rbd rm): 0m6.471s
[5:58] <infernix> answer is yes
[5:58] <elder> Interesting.
[5:58] <infernix> but only one will go to 100%
[5:58] <infernix> the others fail
[5:58] <infernix> endresult is equal
[5:59] <infernix> so i might forkbomb rbd rm
[5:59] <elder> They are probably fighting with each other. If the rbd command parallelized the removal itself I suspect it would be more efficient.
[5:59] <infernix> instead of having backgroundprocesses run for ages on 4TB volumes
[5:59] <infernix> yeah i will look at that at some point
[6:00] <elder> Each osd backing the rbd image could have a stream of remove operations going, concurrent with the other streams.
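The experiment infernix ran boils down to a fork-and-wait pattern: launch several copies of the same command and let the survivors race (all but one `rbd rm` fail, but the end result is the same). On a live cluster the command would be `rbd rm <pool>/<image>` (names hypothetical); a stand-in command keeps this sketch runnable anywhere:

```shell
# Launch N copies of a command in parallel, then wait for all of them.
run_n_parallel() {
  n="$1"; shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" &                 # on a cluster: rbd rm mypool/bigimage
    i=$((i + 1))
  done
  wait
}
run_n_parallel 4 echo "rbd rm would run here"
```

As elder points out, this is the crude version; parallelizing the removal inside the rbd tool itself (one stream of delete ops per backing OSD) would avoid the failed racers.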
[6:03] <iggy> themgt: I think that means your OSD weights are wrong or you have a pool with repl=1 (depending on what you mean)
[6:05] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) has joined #ceph
[6:06] <themgt> iggy: yeah I'm thinking it's maybe because I have custom osd weights and a large (relative to disk size) .rgw.buckets pool with only (oops the default of 8) pgs … not sure what the way out is
[6:11] * janeUbuntu (~jane@2001:3c8:c103:a001:f940:6bf6:60ac:ff19) Quit (Ping timeout: 480 seconds)
[6:13] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) Quit (Ping timeout: 480 seconds)
[6:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[6:27] <iggy> not sure either... I'm not sure you can change #PGs... might have to recreate things
[6:27] <iggy> pool or *... not sure
[6:30] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:40] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[6:54] * xiaoxi (~xiaoxiche@ Quit ()
[7:08] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[7:18] * yanzheng (~zhyan@ has joined #ceph
[7:37] * mcclurmc_laptop (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[8:18] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[8:24] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[8:24] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[8:45] * tryggvil (~tryggvil@95-91-243-238-dynip.superkabel.de) Quit (Quit: tryggvil)
[8:46] * l0nk (~alex@ has joined #ceph
[8:52] * Morg (b2f95a11@ircip1.mibbit.com) has joined #ceph
[9:03] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:04] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: Life without danger is a waste of oxygen)
[9:09] * sleinen (~Adium@2001:620:0:26:c4e9:edc1:2141:432c) has joined #ceph
[9:10] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[9:10] * dosaboy (~user1@host86-164-227-220.range86-164.btcentralplus.com) Quit (Quit: Leaving.)
[9:18] * leseb (~leseb@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[9:25] * BillK (~BillK@58-7-59-122.dyn.iinet.net.au) has joined #ceph
[9:27] * ivoks (~ivoks@jupiter.init.hr) Quit (Quit: leaving)
[9:28] * ivoks (~ivoks@jupiter.init.hr) has joined #ceph
[9:30] * eschnou (~eschnou@ has joined #ceph
[9:30] * loicd (~loic@lvs-gateway1.teclib.net) has joined #ceph
[9:35] * gerard_dethier (~Thunderbi@ has joined #ceph
[9:37] * LeaChim (~LeaChim@b0faa0c8.bb.sky.com) has joined #ceph
[9:37] * ScOut3R (~ScOut3R@ has joined #ceph
[9:40] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[9:55] * Philip__ (~Philip@hnvr-4d079fe4.pool.mediaWays.net) has joined #ceph
[9:55] * mcclurmc_laptop (~mcclurmc@firewall.ctxuk.citrix.com) has joined #ceph
[10:11] * themgt (~themgt@24-177-232-181.dhcp.gnvl.sc.charter.com) Quit (Quit: Pogoapp - http://www.pogoapp.com)
[10:20] * dosaboy (~gizmo@faun.canonical.com) has joined #ceph
[10:23] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[10:56] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[10:57] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[11:08] * Philip_ (~Philip@hnvr-4d07ac83.pool.mediaWays.net) has joined #ceph
[11:16] * Philip__ (~Philip@hnvr-4d079fe4.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[11:24] * gregorg_taf (~Greg@ has joined #ceph
[11:24] * gregorg (~Greg@ Quit (Read error: Connection reset by peer)
[11:41] * leseb_ (~leseb@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[11:43] * leseb (~leseb@3.46-14-84.ripe.coltfrance.com) Quit (Read error: Connection reset by peer)
[11:58] * diegows (~diegows@ has joined #ceph
[12:08] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[12:09] * leseb_ (~leseb@3.46-14-84.ripe.coltfrance.com) Quit (Remote host closed the connection)
[12:15] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[12:15] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[12:16] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[12:19] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[12:25] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[12:25] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[12:30] <joelio> Any recommendations for SATA spinners? Guys at $WORK generally buy WD Black 2TB 7200 RPM (seem to be quite reasonable cost per GB wise) - 7200RPM is fine, looking for density and number of OSDs more than straight line speed
[12:30] <joelio> interested to hear what others use
[12:31] <joelio> 3 1/2"
[12:33] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[12:36] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[12:36] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[12:38] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[12:40] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[12:40] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[12:50] * dosaboy (~gizmo@faun.canonical.com) Quit (Ping timeout: 480 seconds)
[12:54] * leseb (~leseb@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[12:58] * dosaboy (~gizmo@faun.canonical.com) has joined #ceph
[13:02] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[13:02] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[13:05] <jtangwk> do the ceph osd/mds/mon components all work/compile under rhel5?
[13:06] <jtangwk> anyone testing/using rhel5 for osd's?
[13:09] <mattch> jtangwk: Tried and failed a year or 2 ago with sl5, now using sl6 without issue. Don't suspect it would be working now, and it's not on the recommended OS list: http://ceph.com/docs/master/install/os-recommendations/
[13:11] <jtangwk> just curious
[13:16] * dpippenger (~riven@cpe-76-166-221-185.socal.res.rr.com) has joined #ceph
[13:26] * The_Bishop (~bishop@2001:470:50b6:0:381a:b0c7:7e3e:43c3) has joined #ceph
[13:28] * rinkusk (~Thunderbi@CPEbc14015a7093-CMbc14015a7090.cpe.net.cable.rogers.com) has joined #ceph
[13:32] * gregorg_taf (~Greg@ Quit (Read error: Connection reset by peer)
[13:32] * gregorg_taf (~Greg@ has joined #ceph
[13:34] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Operation timed out)
[13:39] <jluis> jtangwk, I think we had it compiling under sles, or at least there were some efforts into doing that
[13:39] <jluis> don't really know the status on that, but glowell might have a better understanding than I do
[13:43] * l0nk (~alex@ Quit (Quit: Leaving.)
[13:46] <jtangwk> it's more out of curiosity than a need to build ceph on a rhel5 based system
[13:46] * yanzheng (~zhyan@ has joined #ceph
[13:46] <jtangwk> btw i think http://ceph.com/docs/master/install/os-recommendations/ might be wrong for centos6.3
[13:46] <jtangwk> i didn't think that centos6 had syncfs
[13:46] <jtangwk> we use SL6 and it doesn't have syncfs in glibc
[13:48] * rinkusk (~Thunderbi@CPEbc14015a7093-CMbc14015a7090.cpe.net.cable.rogers.com) Quit (Ping timeout: 480 seconds)
[13:49] <jtangwk> btw, has anyone taught "ceph status" to output json?
[13:49] <jtangwk> would be nice if it did
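jtangwk's wish was eventually granted: later ceph releases accept `--format json` on most commands (e.g. `ceph status --format json`, or `ceph -s -f json`), after which field extraction is trivial. The JSON below is a hypothetical fragment standing in for real cluster output:

```shell
# Pull one field out of JSON status output; on a cluster with a json-capable
# ceph CLI this string would come from `ceph status --format json`.
status_json='{"health":{"overall_status":"HEALTH_OK"},"osdmap":{"num_osds":39}}'
echo "$status_json" | python3 -c \
  'import json, sys; print(json.load(sys.stdin)["osdmap"]["num_osds"])'
```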
[13:51] * MKS (~cyberman@dslb-088-067-120-190.pools.arcor-ip.net) has joined #ceph
[13:52] <MKS> Hi there. Got a problem getting a fresh install of Ceph (version 0.49) going, it seems I'm not the first but solutions I've managed to find on-line do not work...
[13:53] <MKS> I've run mkcephfs without errors. However, when I try to start the daemons ceph-osd dies with "ERROR: osd init failed: (1) Operation not permitted."
[13:54] <absynth> any specific reason you are using an ancient ceph version?
[13:54] <MKS> There is also a message in mon logs which seems to be related: "cephx server osd.0: unexpected key: req.key=...... expected_key=.......".
[13:55] <absynth> you can try disabling cephx completely
[13:55] <MKS> Yes, that works.
[13:55] <absynth> and then you'll want to upgrade to bobtail
[13:55] <MKS> Only that it is the version my package manager gave me by default. Reckon I ought to update? The latest one I've got available is 0.56.1.
[13:55] <MKS> Okay, let's try that.
[13:56] <absynth> current is 0.56.4
[13:56] <absynth> 0.57 is somewhere around the corner, i think
[13:56] <Kioob`Taff> .4 ???
[13:56] <Kioob`Taff> I deployed 0.56.3 yesterday :(
[13:56] <absynth> at least i think so
[13:56] * MKS looks again at the channel topic and chuckles
[13:57] <absynth> well actually
[13:57] <absynth> we are at 0.58
[13:57] <absynth> released on march 5
[13:57] <MKS> Should I run mkcephfs again after updating?
[13:57] <absynth> http://ceph.com/docs/master/install/debian/
[13:57] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[13:57] <absynth> that shouldn't be necessary, no
[13:57] * loicd (~loic@lvs-gateway1.teclib.net) Quit (Ping timeout: 480 seconds)
[13:57] <scuttlemonkey> it's a question of the "LTS" releases
[13:57] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[13:57] <scuttlemonkey> 56.x is bobtail which is
[13:58] <scuttlemonkey> 57, 58 are not
[13:58] <absynth> scuttlemonkey: but 0.49 never was, right?
[13:58] <absynth> because 0.48 is
[13:58] <scuttlemonkey> until cuttlefish
[13:58] <absynth> (argonaut)
[13:58] <scuttlemonkey> right, named releases will get backports and the like
[13:58] <scuttlemonkey> the intermediary releases (mostly) do not
[13:58] <absynth> will there ever be a Scuttlemonkey release?
[13:58] <scuttlemonkey> hehe
[13:59] <scuttlemonkey> if only I were a type of squid
[13:59] <absynth> be careful what you wish for
[14:00] <absynth> oh, yay, the daily java update
[14:00] <absynth> Just Another Vulnerability Announced
[14:01] <scuttlemonkey> haha
[14:01] <Morg> heheh
[14:01] <Morg> ver. 1.00 should be called octopussy ;]
[14:01] <absynth> nah, that was version 0.07
[14:01] * MKS (~cyberman@dslb-088-067-120-190.pools.arcor-ip.net) has left #ceph
[14:01] <Morg> oh right, true that ;]
[14:02] <scuttlemonkey> actually...we have a new release cadence
[14:02] <Morg> i guess its too late for that now ;]
[14:02] <scuttlemonkey> so if I sat down and thought about it we should be able to extrapolate what 1.0 will be named
[14:03] <absynth> yay, we handled this support request by starting a naming meta-discussion
[14:03] <absynth> :D
[14:03] <MrNPP> anyone running qemu with virtio, i can't get lilo to install i get this http://i.imgur.com/YgoVUKG.png, it appears the device id changes from fc00 to fe00, is there a way to map it?
[14:03] * MKS (~cyberman@dslb-088-067-120-190.pools.arcor-ip.net) has joined #ceph
[14:03] <MrNPP> its on a virtio, which is a ceph rbd
[14:03] <scuttlemonkey> absynth: there was a support discussion? (I just got in)
[14:04] <absynth> mks wanted to know something
[14:04] <absynth> cephx issue, i reckon
[14:05] <absynth> MrNPP: we are running qemu with virtio, but _lilo_?
[14:05] <jluis> scuttlemonkey, but are we aiming at moving to 1.00 after some version? As far as I can tell, our current versioning scheme allows us to keep up on the decimal side of the dot forever :p
[14:05] <MrNPP> i tried grub too
[14:05] <MrNPP> i still couldn't get it to boot
[14:05] <MrNPP> said could not init
[14:06] <MKS> Okay, updated to 0.56.1...
[14:06] <MrNPP> i'm not using an initrfs
[14:06] <MKS> ...and now none of the daemons start at all.
[14:06] <jluis> were you in a version < 0.55 and not using cephx?
[14:06] <absynth> what's the error(s)
[14:06] <absynth> did you disable cephx, as i said earlier?
[14:06] <jluis> oh, they don't start?
[14:07] <MKS> No messages at all.
[14:07] <absynth> 13:55:09 < absynth> you can try disabling cephx completely
[14:07] <absynth> 13:55:16 < MKS> Yes, that works.
[14:07] <jluis> well, even without cephx I *think* they ought to start
[14:07] <absynth> that's indeed strange
[14:07] <absynth> libc issue?
[14:07] <absynth> can you try starting one manually and see if it borks?
[14:07] <MrNPP> absynth: any specific grub changes i should use? and a grub version? 1 or 2?
[14:08] <jluis> MKS, try running them manually, as in 'ceph-foo -i <id> -d'
[14:08] <jluis> see if something pops up
[14:08] <absynth> MrNPP: as far as i know we are running plain vanilla grub1 and 2 successfully with our VMs
[14:08] <MKS> Unlikely. I run Gentoo, everything's compiled from source - and I only just have done that so it's very unlikely Ceph binaries have got linked to something out of date.
[14:08] <MKS> Okay, let's try that.
[14:09] <MKS> Any particular order I should launch daemons in?
[14:09] <jluis> MKS, no, but that command won't detach the daemon from the terminal
[14:09] <MKS> Fair enough.
[14:09] <jluis> so just start a couple of them
[14:09] <jluis> see if they keep on running or if they just exit
[14:09] <jluis> and if they do, let us know what happened
[14:10] <jluis> we can get more aggressive with debugging after that
[14:11] <MKS> Mon and mds are back up, like they were before. Now to see what osd does, as that's where the problems were earlier...
[14:11] <MKS> For now I've still got cephx disabled, though.
[14:13] <MKS> Okay, now at least I've got all three daemons running - that's better than before. Trying to mount the Ceph file system fails with "mount error 22 = Invalid argument", though.
[14:14] <jluis> what's the output of ceph -s ?
[14:14] <MKS> health HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42 degraded (50.000%)
[14:15] <MKS> Plus some map details.
[14:16] <MKS> (let me know if you need them)
[14:17] <jluis> the osds are recovering; while in that state, you won't be able to mount the fs
[14:17] <jluis> ah
[14:17] <MKS> Also, I have just re-enabled cephx and it still works. Weird...
[14:17] <MKS> How long can I expect that to take? It's been stuck at 50 percent for several minutes now.
[14:18] <jluis> have you started only half of your osds by any chance?
[14:18] * loicd (~loic@lvs-gateway1.teclib.net) has joined #ceph
[14:18] <MKS> No, I've only got one.
[14:18] <absynth> you have one osd?
[14:18] <jluis> oh, well, that's why you won't get past the 50%
[14:19] <jluis> but I think the cluster should work anyway in that case
[14:19] <MKS> Well, I *will* have more but I have to start testing somewhere...
[14:19] <jluis> MKS, default replication level is 2; if you only have one osd, the cluster is unable to fully replicate
[14:19] <jluis> hence the 50% degraded
[14:19] <jluis> you either change the default replication level, or add another osd
[14:20] * itamar_ (~itamar@ has joined #ceph
[14:20] <MrNPP> MKS: i use gentoo with ceph also, we should exchange tips some day
[14:20] <itamar_> Hi all
[14:20] <jluis> but I believe the default crush replication placement changed from the legacy 'osd' to 'host', just not sure when, so you might have to change the crush map accordingly if you add a new osd on the same host
[14:20] <absynth> jluis: sometime in 0.56
[14:20] <MKS> Huh, the sample config file said nothing about that. All the comments say is "you need at least one, two if you want data to be replicated."
[14:21] <absynth> MKS: yeah, but that's exactly jluis's point
[14:21] <itamar_> got an issue, I have a cluster running 0.56.3 where I have rewritten the crushmap for some separation
[14:21] <absynth> if you have a replica count of 2 in the config, you are replicating
[14:21] <MKS> Yes - but it doesn't say anything about "if you want only one there is something else to change in the config".
[14:21] <MKS> Let's just try setting up another OSD, then.
[14:21] <absynth> you cannot replicate if there's no osd that you can replicate to
[14:21] <itamar_> I create new rados pools and tie them to the ruleset of the new crush pools
[14:22] <itamar_> but I keep on getting active+remapped
[14:22] <itamar_> on the PGs of the new rados pools
[14:23] * The_Bishop (~bishop@2001:470:50b6:0:381a:b0c7:7e3e:43c3) Quit (Ping timeout: 480 seconds)
[14:23] <jluis> MKS, if you only want one you only have to run a 'ceph osd pool size 1' (iirc the command correctly)
[14:24] <jluis> maybe we should add that to the docs if it's missing
[14:24] <scuttlemonkey> itamar_: active+remapped is telling you that it's shuffling PGs around in response to your change
[14:24] <scuttlemonkey> once things get to where they're going it should settle in
[14:24] * jluis is now known as joao
[14:25] <itamar_> Julis: it doesn't change
[14:25] <joao> itamar_, ?
[14:25] <joao> what doesn't?
[14:26] <scuttlemonkey> itamar_: do you have any that are stale or stuck?
[14:26] <itamar_> 2304 pgs: 2176 active+clean, 128 active+remapped; 14526 MB data, 65474 MB used, 16550 GB / 16614 GB avail; 1399B/s wr, 0op/s
[14:26] <joao> well, I really have to heat something for lunch; be back in a jiffy
[14:26] <scuttlemonkey> can you pastebin your new crushmap?
[14:26] <MKS> joao: Running "ceph osd pool size 1" gives me "(22) Invalid argument".
[14:27] <itamar_> sure
[14:27] <itamar_> in a few seconds
[14:27] <itamar_> go have lunch :)
[14:27] <MKS> I've also tried setting "osd pool default size = 1" in the [mon] section of ceph.conf, doesn't seem to do anything (recovery's still stuck at 50%)
[14:27] <scuttlemonkey> itamar_: I'm east coast US, not lunchtime for me
[14:28] <scuttlemonkey> :)
[14:28] <fghaas> MKS: incorrect syntax
[14:28] <fghaas> should be "ceph osd pool set {pool-name} size 1"
[14:28] <scuttlemonkey> ^^
[14:28] <scuttlemonkey> http://ceph.com/docs/master/rados/operations/pools/
[14:29] <scuttlemonkey> can see current replication levels with:
[14:29] <scuttlemonkey> 'ceph osd dump | grep 'rep size''
[14:29] <MKS> Yeah, they're all still at 2.
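The commands from this exchange, collected (pool names are the defaults MKS lists); since `ceph osd dump` needs a live cluster, the grep step is demonstrated on a sample dump line:

```shell
# On a cluster:
#   ceph osd pool set data size 1
#   ceph osd pool set metadata size 1
#   ceph osd pool set rbd size 1
#   ceph osd dump | grep 'rep size'
# Extract the replication level from a (sample) dump line:
sample="pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 64"
echo "$sample" | grep -o 'rep size [0-9]*'
```

Setting `osd pool default size = 1` in ceph.conf, as MKS tries below, only affects pools created afterwards; existing pools must be changed with `ceph osd pool set`.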
[14:30] * jabadia (~jabadia@IGLD-84-228-60-131.inter.net.il) has joined #ceph
[14:30] <scuttlemonkey> the syntax fghaas gave should fix that
[14:30] <MKS> Okay, set all three (data, metadata, rbd) to 1. Any way I can put it in config file?
[14:30] <MKS> Or is this persistent?
[14:30] <joao> back
[14:31] <MKS> Anyway, now the status is "HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean".
[14:31] <MKS> And mounting still doesn't work.
[14:32] <scuttlemonkey> what does 'ceph health detail' give you?
[14:32] <fghaas> MKS: silly question, are your MDSs up?
[14:33] <MKS> I've got one mon, one mds and one osd up and running.
[14:33] <itamar_> here's the crushmap, abit complex..
[14:33] <itamar_> https://gist.github.com/itamarla/ff8f03901e14d99e84f7
[14:34] <fghaas> MKS, did you check? (see scuttlemonkey's comment for the command you should be using)
[14:34] <MKS> pg 2.e is stuck unclean since forever, current state active+degraded, last acting [0]
[14:34] <MKS> pg 2.e is active+degraded, acting [0]
[14:34] <MKS> For every single page, it seems.
[14:34] <fghaas> that's a placement group, not a page
[14:34] <scuttlemonkey> pg = placement group
[14:34] <MKS> Ah, okay.
[14:36] <itamar_> I added the the osd tree to the gist
[14:36] <Kioob`Taff> "crushtool --test" report me a "bad mapping" that I don't understand. Any idea how can I find what is the error ?
[14:37] <fghaas> um, jluis, silly question, how can a PG be degraded if it has one active replica and the pool size is 1?
[14:37] <fghaas> is that thing just checking whether the osd list length is >1, and if it's not, it's considered degraded?
[14:38] <joao> my guess is that the osd doesn't have the pg, didn't create it or it got lost
[14:38] <scuttlemonkey> fghaas: perhaps this is a function of it thinking there should have been two replicas and having that osd come up and down
[14:39] <fghaas> "pg 2.e is active+degraded, acting [0]" doesn't make sense per se
[14:39] <itamar_> joao: did you have a look?
[14:39] <scuttlemonkey> if it tried to come up first as 0 and came back up as 1 it would think there is a better version
[14:39] <scuttlemonkey> itamar_: I'm not joao :P
[14:39] <scuttlemonkey> looking now though
[14:39] <scuttlemonkey> definitely a different kind of setup here
[14:40] <itamar_> :) Thanks
[14:41] <joao> MKS, when all else fails, try restarting the osd; when that fails, maybe inquiring sjust when he's around would be the way to go
[14:41] * gregorg_taf (~Greg@ Quit (Read error: Connection reset by peer)
[14:42] * xmltok (~xmltok@pool101.bizrate.com) Quit (Read error: Connection timed out)
[14:43] * gregorg (~Greg@ has joined #ceph
[14:44] <scuttlemonkey> joao: can you look at itamar_'s crushmap?
[14:44] * BillK (~BillK@58-7-59-122.dyn.iinet.net.au) Quit (Quit: Leaving)
[14:44] <scuttlemonkey> to me it looks like his chooseleaf step isn't quite what it needs to be
[14:44] <itamar_> scuttlemonkey: Yes - it's different
[14:44] <joao> will do in a sec; let me just fix something else for lunch -- looks like one lunch is not enough today
[14:44] <joao> be back in a couple of minutes
[14:45] <scuttlemonkey> hehe
[14:45] <scuttlemonkey> itamar_: ok, joao is much better at this than I am
[14:45] <itamar_> scuttlemonkey: Thanks..
[14:45] <scuttlemonkey> but to me it looks like (for instance) you are selecting a root (sec-zone02 for example) and iterating over hosts
[14:45] <scuttlemonkey> but there is only one host
[14:46] <scuttlemonkey> sec-zone02-compute-0-0
[14:47] <joelio> fac/wi3
[14:47] <itamar_> scuttlemonkey: I add the same osds to many roots if I do not have enough osds to go around.
[14:47] <joelio> balls
[14:47] <joelio> damn tab complete
[14:47] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has left #ceph
[14:48] <scuttlemonkey> root -> host -> osd
[14:49] <itamar_> scuttlemonkey: I'll add the rados pool dump to the gist so you will see the pools that take the rulesets
[14:49] <joao> why do you have the same osds on so many different hosts?
[14:49] <MKS> Still, it seems I've got mounting working - forgot that cephx mounting requires specifying name and secret (d'oh).
[14:49] <joao> I'm not sure, but I would think that could cause all sorts of problems
[14:49] <joao> itamar_, ^
[14:50] <itamar_> joao: in the early days I got a configuration example from Inktank to do separate pools and the hostname was changed in the crushmap to match the crush rule name
[14:50] <itamar_> joao: I followed suit..
[14:51] <scuttlemonkey> itamar_: this might help a bit:
[14:51] <itamar_> joao: is it OK crush-wise to keep on using the same hostname in different locations in the crushmap?
[14:51] <scuttlemonkey> http://ceph.com/docs/master/rados/operations/crush-map/
[14:51] <MKS> There is that "error writing mtab" issue I've already seen mentioned, which is to be expected since my mtab is a symlink to /proc/self/mounts (would be nice to prevent mount.ceph from trying, though) - but Ceph mounts anyway.
[14:52] <itamar_> oh yes, but I need an osd to be at several crush roots at one time
[14:52] <itamar_> this command line moves it around.. does not duplicate
[14:53] <MKS> Thanks for your help!
[14:53] <scuttlemonkey> MKS: hooray!
[14:54] <MKS> Cheers.
[14:54] <itamar_> scuttlemonkey: I have opened a ticket with Tyler asking for 'ceph osd crush set' to support several locations at once. that would have made my life easier.
[14:55] <joao> itamar_, I can see how you'd want different osds under different crush roots (although I'm far from an expert on that and have no idea if it works this way), but I'm having a hard time understanding why you can't simply do that by adding the same host multiple times to different roots instead of adding the same osds multiple times to different hosts
[14:55] <joao> the latter just seems weird
[14:55] <itamar_> the issue is with the active+remapped which I see only now and can't figure out
[14:55] <itamar_> joao: I was not aware it was possible
[14:56] <joao> although I can understand if you have different rules, with step chooseleaf host, and want different rules for different osds, but even so, it feels as if it is probably going to mess something up
[14:56] <joao> scuttlemonkey, who do you think would be more aware of how this works? sage? gregaf?
[14:57] <scuttlemonkey> sage for sure
[14:57] <scuttlemonkey> slang is quite good at diagnosing crush magic too
[14:58] * MKS (~cyberman@dslb-088-067-120-190.pools.arcor-ip.net) has left #ceph
[15:02] * markbby (~Adium@ has joined #ceph
[15:02] <scuttlemonkey> I still think it's a question of chooseleaf mapping
[15:03] <scuttlemonkey> can we look at a specific pg and see what it's reporting?
[15:03] <jtang> on the note of crushmaps, are there plans for a restful api for modifying a crush map?
[15:03] <jtang> and a json representation of a crushmap?
[15:03] <scuttlemonkey> jtang: a management API is one of the big current tasks
[15:03] <jtang> +1 on that ;)
[15:03] <scuttlemonkey> I (think) it should include crushmap manipulation
[15:04] <jtang> more json output from various commands would be nice too
[15:04] <jtang> that'd play well with building things on top
[15:04] <scuttlemonkey> yeah the goal is to be able to build a management interface into things like the openstack dashboard
[15:05] <scuttlemonkey> (as one small example)
[15:06] <scuttlemonkey> itamar_: if you would like to keep going lets grab 'ceph health detail' and pick up one of the remapped pgs and really look at it under a microscope
[15:06] <jtang> well its +1 from me on that ;)
[15:06] <jtang> i'd use it if it was there
[15:06] <slang> itamar_: that crushmap looks like it could use a good bit of cleanup
[15:06] <scuttlemonkey> cool
[15:06] <scuttlemonkey> that and geo replication are the two main focus points atm
[15:06] <jtang> yea i had a chat with nigel thomas i think
[15:07] <slang> itamar_: not sure if joao or scuttlemonkey mentioned this already, but the metadata rule depends on the default bucket
[15:07] <slang> which isn't defined anywhere
[15:07] <slang> oh wait
[15:07] <slang> there it is
[15:07] <jtang> just before christmas on the things we're wanting for one of our projects and a possible set of features that could be contributed from us
[15:07] <scuttlemonkey> jtang: cool, where did you meet up with nigel?
[15:07] <jtang> sadly our storage person isnt with us anymore, so its up in the air right now
[15:08] <jtang> scuttlemonkey: i had a conf call with him either nigel thomas or levine
[15:08] <scuttlemonkey> slang: mornin :)
[15:08] <jtang> one of them was nigel anyway
[15:08] <itamar_> slang: Hi Sage, long time no see..
[15:08] <scuttlemonkey> jtang: yeah Nigel Thomas is our VP of both sales and channels
[15:09] <scuttlemonkey> Neil Levine is our Product guy driving the roadmap
[15:09] * slang is slang (not Sage) :-)
[15:09] <scuttlemonkey> itamar_: slang != sage
[15:09] <jtang> ah neil
[15:09] <jtang> yea, he's an ex TCD person ;)
[15:09] <itamar_> slang: oh.. sorry
[15:09] <jtang> or was that nigel
[15:09] <jtang> *doh* im getting confused with names now
[15:10] <jtang> to cut it short, we had a follow up discussion post SC12
[15:10] <scuttlemonkey> cool
[15:11] <jtang> we've a few hundred tb's of data stored in GPFS right now, and its prime data to be migrated to a ceph system, if i can convince the guys here and build up the expertise to move to ceph
[15:11] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[15:11] <jtang> having something that actively moves data around is far more appealing than what GPFS is doing on our system
[15:11] <scuttlemonkey> hehe
[15:11] <jtang> we never quite figured out how to write a policy to scrub data and to move things around
[15:12] <jtang> also, you guys should really publish one or two light weight examples on extending rados objects
[15:12] <jmlowe> jtang, what institution are you located at, if you don't mind me asking?
[15:12] <jtang> or writing services on top of it
[15:12] <jtang> jmlowe: trinity college dublin
[15:12] <jmlowe> <- Indiana University
[15:12] <jtang> im part of the digital repository of ireland project
[15:13] <jtang> cool ;)
[15:13] <jmlowe> xsede
[15:13] <scuttlemonkey> jtang: absolutely!
[15:13] <jtang> know anyone in the library (who works in digital preservation/archiving) who wants to come to ireland for hydracamp?
[15:13] <jtang> :)
[15:13] <scuttlemonkey> we have a customer who is building out some rados methods
[15:13] <scuttlemonkey> they are the first I have seen to do so
[15:14] <jmlowe> hmm, there are some weird dotted lines on our org chart to the digital libraries
[15:14] <scuttlemonkey> hoping they'll open source them, or at least share the approach with the community
[15:14] <jtang> scuttlemonkey: there's a certain appeal at extending it to do checksumming of data, resizing images (zoomifying) and streaming
[15:14] <scuttlemonkey> definitely
[15:14] <joao> jmlowe, it was you that came up with that mob accountant comparison wrt the monitors, right?
[15:14] <scuttlemonkey> I think these folks are using it to do log post-processing
[15:14] <jmlowe> Yeah
[15:15] <jtang> its something that we are keeping an eye on as the messaging system that rados has might solve some problems of ours
[15:15] <scuttlemonkey> yeah jmlowe: loved that
[15:15] <joao> jmlowe, how would you feel if we included that on the docs, and on a blog post, due credit being given? :)
[15:15] <jtang> assuming we pick ceph for the project, life would be much easier
[15:15] <jtang> heh
[15:15] <jmlowe> I watched "the dark knight" on tv a lot recently, struck me how knocking out the accountant was crippling because he had all the information
[15:15] <scuttlemonkey> jtang: we hope so too! :)
[15:15] <jtang> the libraries that are digitising stuff usually want it stored somewhere
[15:15] <jmlowe> I'd be honored to have that included
[15:16] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[15:16] <jtang> but they usually dont know how or where
[15:16] <joao> awesome
[15:16] <jtang> which is a headache
[15:16] <itamar_> scuttlemonkey: Any idea?
[15:16] <joao> that comparison will make my life so much easier on the intro to this blog post
[15:16] <scuttlemonkey> [09:05] <scuttlemonkey> itamar_: if you would like to keep going lets grab 'ceph health detail' and pick up one of the remapped pgs and really look at it under a microscope
[15:17] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[15:17] <jtang> jmlowe: you part of the IT services in indiana? or HPC?
[15:17] <joao> btw, scuttlemonkey, still working on the article; yesterday was a wash, ended up spending most of my day in bed and, well, things got slightly delayed
[15:17] <jmlowe> it's a little more up to date and well known than references to byzantine generals
[15:19] <jmlowe> jtang: I'm part of research technologies, I'm in the group that owns the hpc systems, I dole out vm's that are intended to run front ends for our hpc systems
[15:19] <itamar_> scuttlemonkey: https://gist.github.com/itamarla/ff8f03901e14d99e84f7 added on top
[15:19] <scuttlemonkey> joao: no worries, would still love to post it today at some point if you can
[15:19] <scuttlemonkey> if not we can either put it up tomorrow (although that usually results in lower traffic) or mon/tues-ish
[15:19] <jtang> jmlowe: nice
[15:19] <jmlowe> jmlowe: still part of the university it structure, they never spun off hpc here
[15:19] <jtang> yea, im based in the hpc group here too in tcd
[15:20] <joao> scuttlemonkey, yeah, today should work still :)
[15:20] <scuttlemonkey> sweet
[15:20] <joao> and I just misplaced my lighter
[15:20] * joao goes on a hunt throughout the house
[15:20] <joao> brb
[15:20] <jtang> we never spun out, in fact due to the economy we ended up getting sucked into the core computing/IT services of the university
[15:20] <jmlowe> we are getting ready to take delivery of this http://kb.iu.edu/data/bcqt.html
[15:21] <scuttlemonkey> itamar_: ok, no snag 'ceph pg 14.65 query'
[15:21] <scuttlemonkey> s/no/now
[15:21] <itamar_> scuttlemonkey: gist updated
[15:22] <jtang> jmlowe: nice
[15:22] <itamar_> scuttlemonkey: https://gist.github.com/itamarla/ff8f03901e14d99e84f7#file-gistfile1-txt
[15:22] <jtang> heh lustre!
[15:22] <jtang> eewwww
[15:23] <scuttlemonkey> itamar_: looking
[15:23] <itamar_> scuttlemonkey: cheers..
[15:23] * loicd1 (~loic@lvs-gateway1.teclib.net) has joined #ceph
[15:24] <joelio> Is it possible for 'ceph -w' or another command to show estimated time to completion for stuff like rebuilds etc.
[15:25] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[15:26] <jmlowe> Lustre has only died once this week!
[15:26] * loicd (~loic@lvs-gateway1.teclib.net) Quit (Ping timeout: 480 seconds)
[15:27] <jtang> how many sysadmins went crazy is the question?
[15:27] <absynth> our ceph survived a disk change!
[15:27] <absynth> let's all hold hands and dance around a bonfire
[15:29] <jtang> heh
[15:29] <jtang> ceph didnt survive a backblaze pod though
[15:30] <absynth> i think those two are not an ideal match, anyway
[15:30] <joao> joelio, latest versions provide some insight on how many op/s and b/s are being performed, but I don't think there's any ETA for completion
[15:30] <jmlowe> It wasn't too bad, just a flakey network connection on the mds, makes me long for the day when we will all use cephfs and have automatic failover
[15:30] <absynth> you can build your own ETA with a rather simple shell script
[15:30] <jtang> tbh, gpfs isnt bad if you need a good stable parallel filesystem that does posix
[15:31] * loicd (~loic@lvs-gateway1.teclib.net) has joined #ceph
[15:31] <jtang> we kinda like it here at our side
[15:31] <jtang> site
[15:31] <absynth> you can average the recovery ops per second over 2 minutes or so, then multiply by remaining unclean objects
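absynth's back-of-the-envelope estimate can be sketched as shell arithmetic. All the figures below are made up for illustration; in practice the two recovery counters would be scraped from successive `ceph -s` outputs taken two minutes apart.

```shell
# Sketch of the ETA estimate absynth describes: average recovery
# throughput over a 2-minute window, then divide the remaining unclean
# objects by it. Numbers are illustrative stand-ins, not real output.
remaining=120000      # degraded/unclean objects still to recover (assumed)
recovered_t0=500000   # objects recovered at the start of the window (assumed)
recovered_t1=512000   # objects recovered 120 seconds later (assumed)
rate=$(( (recovered_t1 - recovered_t0) / 120 ))   # objects per second
eta=$(( remaining / rate ))                        # seconds to completion
echo "ETA: ${eta}s"                                # prints: ETA: 1200s
```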
[15:31] <jtang> we probably have about 400tb sitting on gpfs
[15:31] * loicd1 (~loic@lvs-gateway1.teclib.net) Quit (Read error: No route to host)
[15:31] <jmlowe> jtang: we liked it here, but politics, price, and old ddn gear killed it
[15:32] <jtang> heh, the new academic pricing isnt bad
[15:32] <jmlowe> jtang: I've personally been responsible for 4 different gpfs filesystems
[15:32] <jtang> its quite good and competitive compared to lustre offerings
[15:32] <jtang> ive only been responsible for 2-3 systems
[15:33] <jmlowe> jtang: yeah, used to be something like a couple of thousand a node, or free with an ibm cluster
[15:33] <jtang> we were the first to run it over ipoib on more than 100nodes when we first got it
[15:33] <scuttlemonkey> itamar_: ok, this is getting near the limits of my expertise...however, this pg says the acting osds are 5,8
[15:33] <jtang> as far as i could tell
[15:33] <scuttlemonkey> but none of your rules appear to use osd.8
[15:33] <jtang> back when ib was new
[15:34] <jtang> was kinda odd, we bought 2x256 port switches for our ib cluster
[15:34] <jmlowe> ready to have a conniption fit? gpfs-wan
[15:34] <jtang> i think we were voltaires biggest customer for about 6months before others realised it was cheaper than myrinet
[15:34] <jtang> cool!
[15:35] <itamar_> scuttlemonkey: I see.. maybe it's because I first create the pool and a line of code later tie it to the crush pool?
[15:35] <jtang> ive only seen gpfs over a wan over at DEISA (pre PRACE project)
[15:35] <scuttlemonkey> itamar_: host vm-manager-0-0 appears to be the only one that uses osd.8
[15:35] <jtang> what was the distance?
[15:35] <itamar_> scuttlemonkey: and it fails for some reason..
[15:35] <jtang> the new AFM stuff in GPFS fixes alot of issues with running over a WAN
[15:35] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Quit: Ex-Chat)
[15:35] * jtang hopes ceph will have a similar feature with the async replication
[15:36] <jmlowe> San Diego to Indiana, about 2k miles
[15:36] <scuttlemonkey> itamar_: you'll probably have to ask sage or someone who knows crush guts much better than I for specifics
[15:36] <jtang> thats far more than whats been done in europe
[15:36] <jtang> :P
[15:36] <scuttlemonkey> but my guess is it doesn't like something about your setup
[15:36] <jtang> then again europe is smaller
[15:36] <jmlowe> circa 2006
[15:36] <itamar_> scuttlemonkey: Thanks a lot, I will open a ticket and continue to play with it..
[15:37] <scuttlemonkey> whether it's a function of the way things were mapped and then changed...or if it just doesn't like the way you share osds
[15:37] <scuttlemonkey> sure, sorry I couldn't get you the last mile
[15:37] <jtang> jmlowe: that sounds like that was when DEISA was getting IBM to do the cluster to cluster stuff
[15:37] <scuttlemonkey> itamar_: that would be a good one for the mailing list if you're willing
[15:37] <jmlowe> jtang: it was part of teragrid
[15:37] <scuttlemonkey> I know I would like to see the answer
[15:37] <jtang> i thought teragrid was more gridftp across a wan
[15:38] <jtang> with high speed endpoints
[15:38] <jtang> for storage that is
[15:38] <scuttlemonkey> afk a few
[15:38] * Morg (b2f95a11@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[15:39] * janeUbuntu (~jane@ has joined #ceph
[15:39] * jtang still shudders at globus tool kit
[15:39] <absynth> jtang: WS-GRAM!
[15:40] <jmlowe> jtang: that too, the people in charge keep telling us to violate the laws of physics and make one national filesystem that is fully consistent available and distributed, when we can't do it they settle for hauling around files
[15:40] <janos> ouch
[15:40] <jmlowe> absynth: now I'm going to have a conniption fit
[15:40] <jtang> heh
[15:40] <absynth> jmlowe: wait until you see my GSI proxy certificate chain!
[15:41] <jtang> ive strategically avoided GSI when i was working with iRODS
[15:41] <absynth> coward.
[15:41] <jtang> heh
[15:41] * jtang just wants to keep his sanity
[15:41] <absynth> i used alcohol for that
[15:41] <jtang> heh
[15:41] <jtang> right time to go an write more tests
[15:41] <absynth> since my phd thesis was about gsi, i used *lots* of alcohol.
[15:41] <jtang> or test more tests
[15:42] <absynth> or do what nhm does: test more writes.
[15:42] <jtang> heh
[15:42] <jtang> well im doing some testing with ruby
[15:42] <jtang> and rails
[15:42] <jtang> *sigh*
[15:42] <jtang> web apps in archiving and preservation
[15:42] <jmlowe> I remember when irods was called srb
[15:42] <jtang> srb was good
[15:42] <jtang> so much stuff doesn't work in irods
[15:42] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[15:42] <jtang> well didnt
[15:42] <jtang> its better now
[15:43] <jtang> though the new irods consortium might be interseting to look at
[15:43] <jtang> it might die a horrible death
[15:43] <jtang> or splinter like lustre with out a leader
[15:43] <jtang> then get bought up by the likes of DDN or IBM
[15:43] <absynth> i was astonished to read that sagewk invented the Webring concept
[15:43] <jmlowe> srb had a bad reputation, I had the feeling they changed the name to get a fresh start
[15:44] * jtang goes to write tests
[15:45] * Philip_ (~Philip@hnvr-4d07ac83.pool.mediaWays.net) Quit (Read error: Connection reset by peer)
[15:48] * Philip (~Philip@hnvr-4d07ac83.pool.mediaWays.net) has joined #ceph
[15:48] * Philip is now known as Guest1186
[15:49] <jmlowe> absynth: he did that and started dreamhost as an undergrad? wow
[15:49] <absynth> in case someone needs IEEE publications under their belt: http://www.ischool.drexel.edu/bigdata/bigdata2013/index.htm
[15:50] <absynth> cfp is here: http://www.ischool.drexel.edu/bigdata/bigdata2013/callforpaper.htm
[15:55] * sleinen (~Adium@2001:620:0:26:c4e9:edc1:2141:432c) Quit (Quit: Leaving.)
[15:58] * PerlStalker (~PerlStalk@ has joined #ceph
[16:03] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[16:09] * aliguori (~anthony@ has joined #ceph
[16:09] * loicd (~loic@lvs-gateway1.teclib.net) Quit (Ping timeout: 480 seconds)
[16:12] * vata (~vata@2607:fad8:4:6:4988:7146:954e:8567) has joined #ceph
[16:13] * NuxRo (~nux@ has joined #ceph
[16:15] <NuxRo> Hello, why must I define a "devs" since I will have to mount that "dev" anyway?
[16:16] <NuxRo> the reason I'm asking is I only have 1 dev (one big raid /dev/sda); i could split it with lvm, but not if there are alternatives
[16:16] * drokita (~drokita@ has joined #ceph
[16:20] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[16:21] <Gugge-47527> NuxRo: why do you think you have to define a "devs" ?
[16:21] * gregorg_taf (~Greg@ has joined #ceph
[16:22] * gregorg (~Greg@ Quit (Read error: Connection reset by peer)
[16:26] <NuxRo> Gugge-47527: ops, right, was reading some random stuff
[16:26] <NuxRo> btw where's the docs for the configuration of ceph?
[16:28] <scuttlemonkey> NuxRo: http://ceph.com/docs/master/
[16:30] <NuxRo> thanks
[16:34] * flakrat (~flakrat@eng-bec264la.eng.uab.edu) has joined #ceph
[16:35] <NuxRo> at some point i knew there was a graphical tool for ceph, is it still used? i can't see it in http://ceph.com/rpm/el6/x86_64/
[16:35] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[16:36] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[16:36] <darkfader> NuxRo: no, it had some bugs and was deprecated i think
[16:36] <darkfader> last time i used it it only ran for 2-3 seconds and then crashed
[16:36] <darkfader> it was still great
[16:36] <NuxRo> ouch, alright, thanks
[16:36] <darkfader> but i think the idea is to make better hooks on ceph side and wait till a new tool comes around
[16:36] <darkfader> name was gceph
[16:37] <NuxRo> thanks
[16:38] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[16:51] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[16:53] * aliguori_ (~anthony@ has joined #ceph
[16:53] * rinkusk (~Thunderbi@CPE00259c467789-CM00222d6c26a5.cpe.net.cable.rogers.com) has joined #ceph
[16:55] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[16:56] * gerard_dethier (~Thunderbi@ Quit (Quit: gerard_dethier)
[16:56] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[16:59] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[16:59] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[17:00] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[17:03] * ntranger_ (~ntranger@proxy2.wolfram.com) has joined #ceph
[17:03] * ntranger (~ntranger@proxy2.wolfram.com) Quit (Remote host closed the connection)
[17:04] * eschnou (~eschnou@ Quit (Remote host closed the connection)
[17:11] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Operation timed out)
[17:12] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[17:12] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Quit: http://www.psi-jack.info/)
[17:13] * topro (~topro@host-62-245-142-50.customer.m-online.net) Quit (Quit: Konversation terminated!)
[17:14] * jlogan1 (~Thunderbi@2600:c00:3010:1:3500:efc8:eaed:66fd) has joined #ceph
[17:16] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[17:17] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) Quit (Remote host closed the connection)
[17:24] * l0nk (~alex@87-231-111-125.rev.numericable.fr) has joined #ceph
[17:24] <l0nk> hello all :)
[17:26] <scuttlemonkey> hey l0nk
[17:26] <l0nk> what do you think about using a PERC S110 for 3x2 TB hard drive? any experience?
[17:27] <l0nk> and the journal on an intel SSD
[17:27] <scuttlemonkey> sounds like a dandy setup to me :)
[17:28] <scuttlemonkey> but I'm not a hardware guy
[17:28] <scuttlemonkey> person you'd probably wanna hear from is nhm
[17:29] * esammy (~esamuels@host-2-99-4-21.as13285.net) has joined #ceph
[17:29] <l0nk> i don't know if the H310 will really help
[17:29] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Read error: Connection reset by peer)
[17:29] * tryggvil (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[17:31] <l0nk> ok thanks I'll ask him when he'll be free :)
[17:31] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[17:32] <l0nk> maybe I can plug the SSD to the integrated SATA ports to let the S110 free for OSD
[17:32] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[17:32] <scuttlemonkey> I know that splitting out the journal to ssd is a great setup
[17:33] <absynth> works fine for us
[17:33] <absynth> plus an extra ssd for controller cache
[17:33] <scuttlemonkey> but as for specific hardware thumbs up/down you'll want the guy who plays with stuff like this all day: http://ceph.com/community/ceph-bobtail-jbod-performance-tuning/
[17:34] <l0nk> so the perc s110 can cope with both the ssd and the osds?
[17:35] <l0nk> did you plug the ssd directly to the mainboard absynth? i'm really curious about that since I don't know dell hardware
[17:36] * andrew (~andrew@ip68-231-33-29.ph.ph.cox.net) has joined #ceph
[17:36] <absynth> l0nk: we have supermicro, not dell
[17:36] <absynth> i was only referring to the "journal on ssd" design
[17:36] <andrew> how do i git the .58 source?
[17:36] <l0nk> oh ok sorry i misunderstood
[17:36] * tryggvil_ (~tryggvil@Router086.inet1.messe.de) has joined #ceph
[17:39] * ScOut3R (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[17:40] <scuttlemonkey> andrew: all downloads here -- http://ceph.com/download/
[17:41] * tryggvil (~tryggvil@Router086.inet1.messe.de) Quit (Ping timeout: 480 seconds)
[17:41] <scuttlemonkey> or you can git clone tagged branches from https://github.com/ceph/ceph.git
[17:41] <andrew> thats not a "git" answer, but its good enough. thx
[17:42] <scuttlemonkey> andrew: yeah, the second one was the git answer...I misread as "get .58 source"
[17:42] <scuttlemonkey> #damnyoubrainautocorrect
[17:43] <andrew> ack. thx.
[17:43] <andrew> oops. which branch is .58 ?
[17:44] * tryggvil_ (~tryggvil@Router086.inet1.messe.de) Quit (Ping timeout: 480 seconds)
[17:45] * mcclurmc_laptop (~mcclurmc@firewall.ctxuk.citrix.com) Quit (Read error: Connection reset by peer)
[17:45] <scuttlemonkey> andrew: the releases are done as tags
[17:46] <scuttlemonkey> which you can see with 'git tag -l'
[17:46] <scuttlemonkey> and checkout a specific tag using 'git checkout <tag_name>'
[17:46] <scuttlemonkey> if you are doing it for dev reasons you'll probably want to make a new branch a la:
[17:46] * mcclurmc_laptop (~mcclurmc@firewall.ctxuk.citrix.com) has joined #ceph
[17:47] <scuttlemonkey> 'git checkout -b new_branch tag_name'
[17:47] <scuttlemonkey> the second will allow you to avoid a detached head
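The tag-checkout workflow scuttlemonkey walks through can be demonstrated end to end. The snippet below uses a throwaway scratch repo so it runs anywhere; against the real tree you would clone https://github.com/ceph/ceph.git instead, and `v0.58`/`my-v0.58` are just example names.

```shell
# Check out a release tag onto a new branch (avoiding a detached HEAD),
# demonstrated on a scratch repo standing in for the ceph tree.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m "release"
git tag v0.58                       # releases are tags, not branches
git tag -l                          # prints: v0.58
git checkout -q -b my-v0.58 v0.58   # new branch starting at the tag
git symbolic-ref --short HEAD       # prints: my-v0.58
```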
[17:52] * aliguori_ (~anthony@ Quit (Ping timeout: 480 seconds)
[17:54] * aliguori (~anthony@ has joined #ceph
[17:57] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[17:58] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[18:01] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[18:02] <infernix> if I have an 8MB rbd image that consists of 2 4MB blocks, and I write 8MB of '1's to it
[18:02] <infernix> then make a snapshot (new format)
[18:02] <infernix> and then write 4MB of 1s and 4MB of 2s to it
[18:03] <infernix> do I end up with 2 (snapshot) +2 (new) objects, or 2 (snapshot) + 1 (new) objects?
[18:04] <infernix> in other words, if I write the exact same data to an object for which there is a snapshot, does it create a new object or not?
[18:04] <infernix> *to an rbd image for which..
[18:05] * itamar_ (~itamar@ Quit (Quit: Leaving)
[18:07] <infernix> I could imagine that if an object has a crc32 value in its metadata, librbd could actually test whether the newly written data is different
[18:09] * topro (~topro@host-62-245-142-50.customer.m-online.net) has joined #ceph
[18:12] * infernix digs in librbd sources
[18:12] <iggy> I seriously doubt it tries to microoptimize like that
[18:13] <iggy> it would be a lot of overhead for what would presumably be little gain
[18:13] <iggy> most things that do snapshots, do a cow and go on about their lives
[18:16] <infernix> yeah, looking at the code there's no such thing
[18:16] <infernix> and it'll definitely hinder performance
[18:17] * noob21 (~cjh@ has joined #ceph
[18:17] <infernix> it would be great for my use case but that's too specific
[18:18] <infernix> the alternative is that i read the data from the rbd image before i write it, and only write when it's different, but that's even slower than storing a crc or hash in metadata per object
[18:19] <infernix> and there's no metadata for rbd devices so
[18:19] <infernix> back to the drawing board
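The read-compare-write fallback infernix describes boils down to diffing the image object by object and only rewriting chunks that changed. A rough sketch, with plain 4 MiB files standing in for rados objects (with a real image the chunks would be read through librbd or the rbd CLI):

```shell
# Compare each 4 MiB chunk of the new data against the stored copy and
# count how many objects would actually need rewriting. Plain files
# stand in for rados objects here.
set -e
cd "$(mktemp -d)"
obj=4194304                                              # rbd default object size
dd if=/dev/zero of=stored bs=$obj count=2 status=none    # current 8 MiB "image"
cp stored new
printf '2' | dd of=new bs=1 seek=$obj conv=notrunc status=none  # dirty 2nd object
dirty=0
for i in 0 1; do
  dd if=stored of=s.$i bs=$obj skip=$i count=1 status=none
  dd if=new    of=n.$i bs=$obj skip=$i count=1 status=none
  cmp -s s.$i n.$i || dirty=$((dirty+1))                 # differing chunk => rewrite
done
echo "objects to rewrite: $dirty"                        # prints: objects to rewrite: 1
```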
[18:20] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[18:21] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) has joined #ceph
[18:21] * xdeller (~xdeller@ has joined #ceph
[18:23] * leseb (~leseb@3.46-14-84.ripe.coltfrance.com) Quit (Remote host closed the connection)
[18:26] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[18:32] <andrew> i don't want to seem a doofus, but i want to contribute some source changes back. i don't have git write permissions. i have seen patch diffs (i think) on the mailing list. how should i contribute such changes?
[18:34] * markbby (~Adium@ Quit (Quit: Leaving.)
[18:34] * markbby (~Adium@ has joined #ceph
[18:35] * esammy_ (~esamuels@host-2-99-5-226.as13285.net) has joined #ceph
[18:36] <iggy> when you're just getting started, it's often easier to just have a pristine source tree and whatever you are working on and using diff -ruN between them
[18:36] <iggy> as you get more used to things git diff will become your friend
[18:36] <iggy> then you'll move on to things like git-format-patch, etc.
[18:37] <iggy> but you'll generally just be emailing diffs to the mailing list
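The pristine-tree approach iggy suggests is just `diff -ruN` between two copies of the source. A self-contained toy run (the directory and file names are invented stand-ins for a real tree):

```shell
# Produce a mailable patch by diffing a working copy against a pristine
# copy; the tiny trees here stand in for a real source checkout.
set -e
cd "$(mktemp -d)"
mkdir -p pristine/src
echo 'int x = 1;' > pristine/src/a.c
cp -r pristine working
echo 'int x = 2;' > working/src/a.c                    # the local change
diff -ruN pristine working > my-changes.patch || true  # diff exits 1 on changes
grep -c '^+int x = 2;' my-changes.patch                # prints: 1
```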
[18:37] <iggy> I'm not sure anyone outside of inktank has write permissions
[18:38] <scuttlemonkey> iggy: a couple maybe...but most stuff is done via pull request
[18:38] <iggy> also when you get good with git/github/etc, you can just have your own fork of the code with branches for your changes and send github merge requests
[18:38] <scuttlemonkey> sage is quite good about snagging pending PRs if they make sense
[18:39] <iggy> so yeah, a github crash course might be a good idea (they have decent docs)
[18:39] <scuttlemonkey> andrew: the ceph devs are extremely helpful and friendly compared to many FOSS projects I have worked with
[18:40] <gregaf> says the community manager *eyeroll*
[18:40] <gregaf> ;)
[18:40] <scuttlemonkey> if you send a msg to ceph-devel with a little bit about what you're changing and one of the following you'll be in good shape: a) pasted diff b) location to git pull changes from c) github pull request
[18:40] <scuttlemonkey> hah
[18:41] * esammy (~esamuels@host-2-99-4-21.as13285.net) Quit (Ping timeout: 480 seconds)
[18:41] * esammy_ is now known as esammy
[18:41] <scuttlemonkey> well, my statement has no bearing on the community team
[18:41] <scuttlemonkey> ever try to submit a patch to the kernel out of the blue? openBSD back in the day?
[18:41] <scuttlemonkey> those were two of the worst for folks I watched :P
[18:41] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[18:41] <scuttlemonkey> you guys are friggin mana from heaven by comparison
[18:42] * scuttlemonkey waits for these logs to show up in google and the car-bombs to commence.
[18:45] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[18:49] * esammy (~esamuels@host-2-99-5-226.as13285.net) Quit (Ping timeout: 480 seconds)
[18:49] * l0nk (~alex@87-231-111-125.rev.numericable.fr) Quit (Quit: Leaving.)
[18:53] * esammy (~esamuels@host-2-103-101-90.as13285.net) has joined #ceph
[18:54] <andrew> thanks. i will send a missive to teh devel list.
[18:55] <yehudasa> gregad: wip-4363 will thank you if you could take a look
[18:56] <yehudasa> gregaf: ^^
[18:56] <gregaf> in my queue
[19:02] <madkiss> hello gregaf :
[19:02] <madkiss> :)
[19:03] * chutzpah (~chutz@ has joined #ceph
[19:07] <gregaf> hi
[19:09] * Guest1186 (~Philip@hnvr-4d07ac83.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[19:11] * dosaboy (~gizmo@faun.canonical.com) Quit (Remote host closed the connection)
[19:11] * dosaboy (~gizmo@faun.canonical.com) has joined #ceph
[19:11] <sstan> if the cluster network fails, will OSDs fallback to the public network?
[19:13] * dpippenger (~riven@cpe-76-166-221-185.socal.res.rr.com) Quit (Remote host closed the connection)
[19:15] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[19:18] <gregaf> sstan: not on their own, no
[19:20] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[19:21] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:23] * xdeller (~xdeller@ Quit (Quit: Leaving)
[19:27] * alram (~alram@ has joined #ceph
[19:38] * mcclurmc_laptop (~mcclurmc@firewall.ctxuk.citrix.com) Quit (Ping timeout: 480 seconds)
[19:44] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[19:45] * l0nk (~alex@87-231-111-125.rev.numericable.fr) has joined #ceph
[19:45] <l0nk> hello
[19:46] <l0nk> maybe it's a stupid question, but is the max size of an rbd limited by something?
[19:48] * dmner (~tra26@tux64-13.cs.drexel.edu) Quit (Quit: leaving)
[19:52] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[19:58] * noob22 (~cjh@ has joined #ceph
[20:04] * noob21 (~cjh@ Quit (Ping timeout: 480 seconds)
[20:05] <iggy> I'm sure it has some limit, hopefully it's 64bit which would be exabytes+
[20:06] <jmlowe> I accidentally made one 500TB once
[20:06] <jmlowe> took a day or two to delete
[20:06] <iggy> ouch...
[20:06] <jmlowe> got a little carried away with my right shift 3
[20:10] <janos> yeah i screwed up and did a petabyte
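iggy's 64-bit guess above can be sanity-checked with a bit of shell arithmetic. This is only a sketch: the real ceiling depends on the rbd image format, and the `--size`-in-megabytes convention is an assumption based on how rbd took sizes at the time.

```shell
# A 64-bit byte count tops out at 2^64 bytes = 2^(64-60) EiB = 16 EiB.
echo "$(( 1 << (64 - 60) )) EiB"

# jmlowe's accidental 500TB image: rbd create --size takes megabytes,
# so 500TB would have been specified as 500 * 1024 * 1024 MB.
echo "$(( 500 * 1024 * 1024 )) MB"
```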
[20:10] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:10] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:14] <infernix> jmlowe: could've sped that up with multiple concurrent instances of rbd rm
[20:14] <infernix> which for some reason works
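The concurrent-delete trick infernix mentions can be sketched with `xargs -P`. The image names here are hypothetical, and `echo` stands in for the real `rbd rm` so the pattern can be shown without a cluster:

```shell
# Delete several images in parallel (-P 4 = up to four concurrent processes).
# Drop the 'echo' to actually run the removals against a cluster.
printf '%s\n' vol1 vol2 vol3 | xargs -P 4 -n 1 echo rbd rm
```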
[20:17] * dosaboy (~gizmo@faun.canonical.com) Quit (Quit: Leaving.)
[20:17] * ScOut3R (~scout3r@1F2EAE22.dsl.pool.telekom.hu) has joined #ceph
[20:18] <l0nk> 500TB wow, did you at least have a cluster of 500TB when you did that iggy?
[20:18] <iggy> that was someone else
[20:18] <l0nk> sorry jmlowe!
[20:18] <iggy> but rbd images are sparse, so it wouldn't use the space up front
[20:19] <l0nk> ok didn't know that, thanks
[20:19] <l0nk> (your nick and jmlowe's are pretty much the same color in my client, that explains my mistake)
[20:22] * houkouonchi-work (~linux@ Quit (Read error: Connection reset by peer)
[20:22] * houkouonchi-work (~linux@ has joined #ceph
[20:23] * markbby (~Adium@ Quit (Ping timeout: 480 seconds)
[20:24] <jmlowe> no I did not have that much space, sparse files saved me
[20:26] <l0nk> nice to hear :)
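The sparseness jmlowe relied on can be illustrated with an ordinary sparse file (an analogy only: RBD gets the same effect by simply never creating RADOS objects for regions that were never written):

```shell
# Create a 1 GiB file without writing any data: the logical size is 1 GiB,
# but almost no disk blocks are actually allocated (GNU stat/truncate assumed).
f=$(mktemp)
truncate -s 1G "$f"
apparent=$(stat -c %s "$f")             # logical size in bytes
actual=$(( $(stat -c %b "$f") * 512 )) # bytes actually allocated on disk
echo "apparent=$apparent actual=$actual"
rm -f "$f"
```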
[20:33] * dpippenger (~riven@ has joined #ceph
[20:53] * eschnou (~eschnou@157.68-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:05] * noob22 (~cjh@ Quit (Quit: Leaving.)
[21:24] * BillK (~BillK@58-7-124-91.dyn.iinet.net.au) has joined #ceph
[21:25] * Tiger (~kvirc@ has joined #ceph
[21:27] * l0nk (~alex@87-231-111-125.rev.numericable.fr) Quit (Quit: Leaving.)
[21:33] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[21:53] * Cube (~Cube@ has joined #ceph
[21:56] * ScOut3R_ (~scout3r@1F2EAE22.dsl.pool.telekom.hu) has joined #ceph
[21:56] * ScOut3R (~scout3r@1F2EAE22.dsl.pool.telekom.hu) Quit (Read error: Connection reset by peer)
[21:57] * esammy (~esamuels@host-2-103-101-90.as13285.net) has left #ceph
[21:59] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[21:59] <Kioob> I still have a problem with CRUSH rules. If I simplify my map by setting only [root]→[host] it works, but with [root]→[datacenter]→[host] it doesn't work
[22:00] * terje (~joey@97-118-178-216.hlrn.qwest.net) Quit (Ping timeout: 480 seconds)
[22:00] <Kioob> in "step chooseleaf firstn 0 type XXX", does the XXX refer to the direct child of the item in "step take ZZZ"?
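For what it's worth, the type in a chooseleaf step does not have to be the direct child of the `step take` item; it names whichever bucket type in the hierarchy CRUSH should choose distinct instances of before descending to a leaf. A minimal sketch of a rule for the [root]→[datacenter]→[host] map described above (rule name and ruleset number are hypothetical):

```
rule replicated_dc {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take root
    step chooseleaf firstn 0 type datacenter
    step emit
}
```

This places each replica on an OSD under a distinct datacenter bucket; `firstn 0` means "as many as the pool's replica count".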
[22:09] * eschnou (~eschnou@157.68-201-80.adsl-dyn.isp.belgacom.be) Quit (Quit: Leaving)
[22:10] * mcclurmc_laptop (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[22:16] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[22:17] * jabadia (~jabadia@IGLD-84-228-60-131.inter.net.il) Quit (Remote host closed the connection)
[22:18] * The_Bishop (~bishop@ has joined #ceph
[22:20] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[22:20] <ntranger_> hey dmick! Thanks for your help yesterday. I got it to run finally. :)
[22:20] <dmick> yw, and good news
[22:23] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[22:29] * jjgalvez (~jjgalvez@ has joined #ceph
[22:35] * jjgalvez1 (~jjgalvez@ has joined #ceph
[22:37] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[22:41] * jjgalvez (~jjgalvez@ Quit (Ping timeout: 480 seconds)
[22:45] <infernix> with rbd, if object order=4M (default) and stripe_unit=object order, there's no point in having stripe_count>1, right?
[22:47] <infernix> i'm trying to understand use cases of stripe_unit<object_order and stripe_count>1
[22:47] <infernix> but I don't quite see why one would want to
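infernix's intuition looks right: when stripe_unit equals the object size, each stripe unit fills a whole object, so stripe_count > 1 only changes the order in which objects are touched. A smaller stripe_unit with stripe_count > 1 spreads one sequential write across several objects at once. A sketch of the round-robin layout arithmetic as I understand rbd's "fancy striping" (all values hypothetical):

```shell
obj_size=$(( 4 * 1024 * 1024 ))  # 4M objects (order 22)
su=$(( 1 * 1024 * 1024 ))        # stripe_unit = 1M
sc=4                             # stripe_count
off=$(( 5 * 1024 * 1024 ))       # byte offset 5M into the image

stripe=$(( off / su ))                 # stripe unit index: 5
per_obj=$(( obj_size / su ))           # stripe units per object: 4
objset=$(( stripe / (sc * per_obj) ))  # which object set: 0
obj=$(( stripe % sc ))                 # object within the set: 1

echo "unit=$stripe set=$objset object=$obj"
```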
[22:52] <gregaf> yehudasa: comments on wip-4363 and wip-4247; let me know if any of the backports require separate review
[22:54] <yehudasa> gregaf: thanks
[23:01] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[23:02] <yehudasa> sjustlaptop: is having a system leveldb a requirement now?
[23:03] <sjustlaptop> yehudasa: I think that's where it ended up
[23:03] <sjustlaptop> gary was working on that
[23:03] <sjustlaptop> I think we are going to try to provide our own leveldb package where necessary?
[23:03] <yehudasa> can't compile current next
[23:04] <yehudasa> I'm on debian wheezy
[23:04] <glowell> I think the leveldb removal that was in master has been merged into next.
[23:04] <yehudasa> so do I need to compile leveldb for my system?
[23:04] <glowell> On Precise, you just need to install libleveldb-dev and libsnappy-dev.
[23:05] * Philip_ (~Philip@hnvr-4d07ac83.pool.mediaWays.net) has joined #ceph
[23:05] <yehudasa> oh, found it
[23:05] <yehudasa> yay
[23:05] <glowell> If you are running squeeze, natty, or oneiric there is a ppa at ceph.com/debian-leveldb that has them.
[23:06] <yehudasa> running wheezy, that's not ubuntu
[23:06] <glowell> Wheezy should have them upstream, as does quantal.
[23:06] <yehudasa> yeah, it does
[23:06] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[23:28] <Kioob> rahhh !!!
[23:29] <Kioob> My crush problem was because of "tunables"
[23:29] <Kioob> by setting the "non-legacy" settings, the same crush map works
[23:32] <dmick> woohoo
[23:33] <Kioob> now the question: can I apply that? (for now I've just played with "crushtool --test" :D)
[23:33] <Kioob> I verified the "Which client versions support CRUSH_TUNABLES" section, and it's ok
[23:34] <Kioob> (but I can't use crush_tunables2)
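The test-then-apply loop Kioob describes looks roughly like this (a sketch from memory, not a verified transcript; `crushtool --test` runs entirely offline, so no data moves until the final setcrushmap):

```
ceph osd getcrushmap -o crush.bin          # fetch the live map
crushtool -d crush.bin -o crush.txt        # decompile, then edit crush.txt
crushtool -c crush.txt -o crush.new        # recompile
crushtool --test -i crush.new --show-bad-mappings
ceph osd setcrushmap -i crush.new          # apply: rebalancing starts here
```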
[23:35] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[23:37] <Kioob> so, 3.6% of the data to move
[23:40] <Kioob> so, thanks again for your time dmick. That test feature really saved me :)
[23:40] <dmick> good deal
[23:41] * tryggvil (~tryggvil@95-91-243-251-dynip.superkabel.de) has joined #ceph
[23:43] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[23:45] * vata (~vata@2607:fad8:4:6:4988:7146:954e:8567) Quit (Quit: Leaving.)
[23:46] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[23:47] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[23:47] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:51] * drokita (~drokita@ Quit (Quit: Leaving.)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.