#ceph IRC Log

Index

IRC Log for 2013-04-26

Timestamps are in GMT/BST.

[0:00] <nhm> barryo1: that's what happens when you ask questions on IRC. ;)
[0:03] <nhm> barryo1: oh, if you stick the mons on the OSD nodes, maybe do a pair of E5-2403s or an E5-2630. Give it a bit extra for the mon.
[0:04] <athrift> The CSE-826BA-R920WB chassis with a X9SRW-F motherboard looks like a good combo for a Ceph node
[0:05] <mikedawson> gregaf: if the status of the tcmalloc bug changes, could you let me know (especially if there is a new build) because I can't watch the bug?
[0:05] <gregaf> yeah, will do
[0:05] * rustam (~rustam@94.15.91.30) has joined #ceph
[0:05] <mikedawson> thx
[0:05] <gregaf> speaking of which, any progress glowell1? :)
[0:06] <glowell1> working on getting qemu backport to compile, but I'll do that now.
[0:07] * vata (~vata@2607:fad8:4:6:6d6f:921c:d389:bb97) Quit (Quit: Leaving.)
[0:10] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[0:14] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[0:16] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[0:18] * barryo1 (~barry@host86-146-83-151.range86-146.btcentralplus.com) has left #ceph
[0:23] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Ping timeout: 480 seconds)
[0:27] * aliguori_ (~anthony@32.97.110.51) Quit (Remote host closed the connection)
[0:32] * stass (stas@ssh.deglitch.com) Quit (Read error: Connection reset by peer)
[0:32] * stass (stas@ssh.deglitch.com) has joined #ceph
[0:48] * shardul_man (~shardul@174-17-80-182.phnx.qwest.net) has joined #ceph
[0:52] * BillK (~BillK@124-149-73-192.dyn.iinet.net.au) has joined #ceph
[1:01] * LeaChim (~LeaChim@90.197.3.92) Quit (Read error: Connection reset by peer)
[1:04] * shardul_man (~shardul@174-17-80-182.phnx.qwest.net) Quit (Read error: Operation timed out)
[1:13] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[1:17] <dmick> anyone know which libcurl-dev I ought to install for building on Ubuntu?
[1:17] <dmick> I'd look at the README, but...
[1:18] <athrift> nhm: What are your feelings on the LSI9201-16i ?
[1:23] <sagewk> slang,gregaf: so i looked at the kernel client, and it explicitly clears out the cap/dentry releases on replay
[1:24] <sagewk> which was not what i remembered :) but is compatible with this fix
[1:25] <sagewk> the alternative fix is to skip that part of make_request on replay in Client so that it behaves the same
[1:25] <sagewk> slang1: ^
[1:26] <slang1> sagewk: k
[1:27] <slang1> sagewk: did I miss the answer to greg's question about how the caps don't get dropped in the first setattr send_request?
[1:27] <sagewk> on github?
[1:27] * slang1 nods
[1:28] <sagewk> not sure why it didn't drop it the first time.
[1:28] <sagewk> my gut tells me that the more conservative fix is to not call encode_inode_release when resending
[1:28] * jlk (~jlk@173-13-149-45-sfba.hfc.comcastbusiness.net) has joined #ceph
[1:28] <sagewk> because then we don't worry about things like this. and the release in the request is purely an optimization
[1:29] <slang1> sagewk: if we do that, then the client will send the release once the mds sends the revoke?
[1:29] <sagewk> yeah
[1:29] <jlk> hey folks - I removed 2 osds from a 6 osd cluster, but a few PGs are still looking for the osd...how do I straighten that out?
[1:30] <gregaf> you took them out at the same time? put one of them back
[1:30] <gregaf> (unless you had 3-copy, in which case something else is going on)
[1:31] <jlk> they were removed separately a month or two ago, following the removal process in the docs...drives are gone
[1:31] <gregaf> ah, k — what's ceph -s output?
[1:32] <jlk> HEALTH_WARN. http://pastebin.com/Yf6EevH2
[1:33] <sjusthm> jlk: describe the process you used to remove the osds?
[1:33] <jlk> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
[1:34] <gregaf> yeah, looks like you didn't let the cluster get clean between removals so you've got 96 missing objects
[1:34] <sjusthm> so you marked them out and allowed the cluster to rebalance?
[1:34] <jlk> yeah. I'm guessing somehow I didn't wait long enough - gave them a day or two each
[1:34] <gregaf> but I'll let sjusthm do this; he's done it more and will be faster at checking that's what happened and dealing with it
[1:35] <sjusthm> were all pgs active+clean?
[1:36] <jlk> i believe so...sorry, I'm just looking at this after a few months, details are foggy :(
[1:36] * tnt (~tnt@109.130.96.140) Quit (Ping timeout: 480 seconds)
[1:37] <sjusthm> well, the current state most likely means that you have lost the data stored in those pgs
[1:37] <jlk> understood. if it hasn't been missed by now...
[1:37] <jlk> so is it time for mark_unfound_lost ?
[1:39] <sjusthm> has the cluster been in use?
[1:39] <jlk> yes
[1:39] <sjusthm> are all of the pgs possibly in the same pool?
[1:40] <jlk> no, different pools. I think these PGs are in an unused pool...
[1:40] <sjusthm> I mean, all of the stale pgs
[1:40] <sjusthm> are they in the same pool?
[1:40] <jlk> is there an easy way to check or do I have to map each one?
[1:40] <sjusthm> ceph pg dump | grep stale
[1:41] <sjusthm> like
[1:42] <sjusthm> pg 1.0
[1:42] <sjusthm> is in pool 1
[1:42] <sjusthm> etc.
[1:44] <jlk> ah sorry I thought that was the case but wasn't positive. looks like they're across 4 pools...but doesn't look like they're in the 2 pools with data I care about
[1:44] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[1:44] <sjusthm> ok, that would explain why you haven't had trouble
[1:45] <sjusthm> you could just remove the pools in question
[1:45] <sjusthm> or you can use force create pg
[1:45] <sjusthm> to cause the pgs to be created with fresh histories
[1:45] <sjusthm> ceph pg force_create_pg <pgid>
[1:45] <sjusthm> I think
[1:46] <jlk> yeah I see it in the ml. lemme try that on a few
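A rough sketch of the sequence discussed above, assuming the lost PGs really are expendable (the pgid below is a placeholder; force_create_pg wipes any remaining chance of recovering that PG's data):

    # find the stale PGs; the number before the dot in each pgid is the pool id
    ceph pg dump | grep stale
    # recreate an expendable PG with a fresh, empty history
    ceph pg force_create_pg 1.0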
[1:46] <nigwil> has anyone tried diskless nova-compute nodes with Ceph RBD?
[1:47] <nhm> athrift: never tested one before, but it seems like it could work.
[1:47] * shardul_man (~shardul@209-147-143-21.nat.asu.edu) has joined #ceph
[1:49] <nhm> athrift: might be worth seeing what the ZFS crowd thinks of it.
[1:49] <sagewk> gregaf: look at wip-mon-fwd when you are bored :)
[1:49] <nhm> athrift: that's probably going to be the best source of info on that card.
[1:49] <gregaf> soooo bored ;)
[1:53] <gregaf> sagewk: I don't think that works; get_source() on the forwarded request is the original source, not the forwarding mon, right?
[1:53] <sagewk> mm
[1:54] <gregaf> you can check proxy_con
[1:54] <sagewk> yeah
[1:54] <gregaf> on the session
[1:54] <sagewk> so if the original source is a mon, then we shouldn't forward. unless it is us.
[1:55] <joshd> nigwil: some people have, using e.g. https://github.com/jdurgin/nova/commits/folsom-volumes
[1:55] <sagewk> could also compare session->proxy_con to messenger->get_loopback_con() or whatever it is
[1:55] * dwt (~dwt@128-107-239-233.cisco.com) Quit (Quit: Leaving)
[1:55] <gregaf> mmm, yeah, think so
[1:56] <gregaf> but maybe this does work and I just had it backwards, let's see
[1:56] <sagewk> holding off on committing, i want to figure out why this led to a crash.
[1:56] * dwt (~dwt@wsip-70-166-104-226.ph.ph.cox.net) has joined #ceph
[1:57] <mikedawson> gregaf: on that log for 4815, I have debug monitor = 20, debug paxos = 20, and debug ms = 20. When I turn ms down, nothing seems to log at all. It's like messenger is the only thing doing anything. Do you want me to retry with different settings?
[1:58] <gregaf> hrm, there is a little bit of monitor output but I thought it must have been higher level than that
[1:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[1:59] <mikedawson> gregaf: I can try with ms turned down if its just noise, plus I'll give you logs from the other two mons. What debug levels would you like?
[1:59] <gregaf> just a sec, checking what I expect to output a little more closely
[2:00] * shardul_man (~shardul@209-147-143-21.nat.asu.edu) has left #ceph
[2:00] * smangade (~shardul@209-147-143-21.nat.asu.edu) has joined #ceph
[2:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:01] <gregaf> yeah, that's weird — leave the messenger debugging on, and logs from the others with the same settings, ms, mon, paxos 20
[2:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:01] <mikedawson> will do
[2:02] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:02] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:03] <gregaf> thanks for all the logs
[2:03] <athrift> nhm: thanks Mark
[2:04] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:04] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:04] <gregaf> sagewk: a forwarded message gets a lot of special setup, see handle_forward
[2:05] <gregaf> you aren't allowed to reply to it using that Con; you *have* to send the reply back along the proxy_con (using an MRoute wrapper)
[2:05] * dwt (~dwt@wsip-70-166-104-226.ph.ph.cox.net) Quit (Quit: Leaving)
[2:05] <gregaf> ^ in reference to #4824
[2:07] <mikedawson> gregaf: ceph mon tell * injectargs '--debug-mon 20 --debug-paxos 20 --debug-ms 20' returns mon.22 does not exist, huh? ceph.conf doesn't have anything like mon.22
[2:07] <gregaf> try \* instead of *?
[2:07] <gregaf> (hurray shell expansion!)
[2:08] <mikedawson> much better
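For reference, the problem was the shell expanding the bare * into a filename before ceph ever saw it; escaping or quoting the glob fixes the command:

    # either escape the glob...
    ceph mon tell \* injectargs '--debug-mon 20 --debug-paxos 20 --debug-ms 20'
    # ...or quote it
    ceph mon tell '*' injectargs '--debug-mon 20 --debug-paxos 20 --debug-ms 20'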
[2:08] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:08] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:09] <jlk> health HEALTH_OK
[2:09] <jlk> yeay! thanks guys. :)
[2:09] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:09] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:12] <sagewk> gregaf: yeah, this crash is in the reply path, in handle_route, after that part.
[2:13] <gregaf> ah, sorry, I was cherry-picking some of the things that jumped out at me
[2:13] <gregaf> but I have a hunch that double-forwarding, not being designed for, doesn't follow those rules properly
[2:16] <sagewk> yeah
[2:19] * jmlowe (~Adium@149.160.195.38) has joined #ceph
[2:19] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:19] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:20] <sagewk> gregaf: you're right, i see the problem.
[2:20] <sagewk> the RoutedRequest con* should be proxy_con instead of m->con for double-forward to work.
[2:20] <sagewk> that fix will avoid the situation.
[2:20] <gregaf> yay
[2:21] * alram (~alram@38.122.20.226) Quit (Read error: Operation timed out)
[2:25] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[2:26] <sagewk> yay indeed. :) thanks!
[2:27] <mikedawson> gregaf: in the process of getting you logs I restarted the two good mons... Now I have logs of mon.a hosed, mon.b probing, and mon.c as a peon
[2:27] <gregaf> erm, are you sure?
[2:28] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:28] <gregaf> peon without a leader would take some doing, is why I ask
[2:28] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:30] <mikedawson> greg: it's stuck here http://pastebin.com/raw.php?i=N9AmvStX
[2:30] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[2:30] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:30] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:31] <gregaf> hmm, I hope that just means that node49 is busy and hasn't noticed it's gone yet or something
[2:32] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:33] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[2:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:34] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:34] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:35] <mikedawson> gregaf: I've seen enough of these the past week to be skeptical it will converge to goodness. How many minutes/hours/days of waiting will seem conclusive to you :-)
[2:35] <dgbaley27> Is there any value to putting OSDs on top of RAID?
[2:36] <jmlowe> sure, if you have more disks than you have resources to run osd's
[2:36] <gregaf> can you send me fuller logs, mikedawson, and what you did to them with approximate timestamps?
[2:36] * jmlowe (~Adium@149.160.195.38) Quit (Quit: Leaving.)
[2:36] <gregaf> I'd like to figure out what got them into this state
[2:37] <mikedawson> gregaf: will do
[2:38] <gregaf> ty
[2:39] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) has joined #ceph
[2:40] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:40] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:40] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:44] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:44] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:44] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:46] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:48] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:50] * capri_on (~capri@pd95c3284.dip0.t-ipconnect.de) has joined #ceph
[2:50] * capri (~capri@pd95c3283.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[2:50] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:50] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:50] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[2:51] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:52] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:52] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:53] * smangade (~shardul@209-147-143-21.nat.asu.edu) Quit (Read error: Operation timed out)
[2:53] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:53] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:55] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[2:55] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:55] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:57] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[2:57] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:59] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[2:59] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:00] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:00] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:00] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[3:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:01] * jlk (~jlk@173-13-149-45-sfba.hfc.comcastbusiness.net) Quit (Quit: thx guys)
[3:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:07] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:08] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:08] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[3:08] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:10] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:12] <mikedawson> gregaf: I've noticed that ceph-create-keys still hangs on the monitor that's out to lunch even with the right caps in place and ceph version 0.60-642-gcce1c91 (cce1c91ae82ca81fa8349822a7f67aabb15eaa55)
[3:12] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:12] <dmick> mikedawson: is it logging anything?
[3:12] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:13] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[3:13] <mikedawson> dmick: I've got lots of logs, about to cephdrop them
[3:14] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:14] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:14] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:17] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:17] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:18] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:18] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:19] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:19] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:21] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:21] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:22] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:22] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:22] <mikedawson> gregaf, dmick: mikedawson.tar.gz has been cephdropped. The majority of the day's logs are called ceph-mon.*.log-2013-04-25 and the logs from the past hour or so are just ceph-mon.*.log; they are the most interesting
[3:22] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:22] <gregaf> k, thanks
[3:23] <mikedawson> around 2013-04-26 00:28:27 you'll find the mon.a (out to lunch), mon.b (probing), and mon.c (peon)
[3:24] <gregaf> did you write down somewhere when you restarted the daemons and such?
[3:24] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:25] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:25] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:26] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:26] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:27] <mikedawson> gregaf: daemons start at line 1 of the ceph-mon.*.log files
[3:27] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:27] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:27] <gregaf> heh, simple enough, then
[3:28] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:28] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:28] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:28] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:29] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:29] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:30] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:30] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:30] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[3:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:33] * noahmehl_ (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[3:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:34] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:34] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:36] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:36] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:36] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:36] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:36] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:36] * noahmehl_ is now known as noahmehl
[3:38] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:38] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:38] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[3:38] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:39] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[3:39] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) Quit (Remote host closed the connection)
[3:40] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:40] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:41] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:41] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:42] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:42] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:43] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:43] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:44] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:44] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:45] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:45] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:45] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:45] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:46] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:46] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:47] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:47] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:48] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:48] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:51] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:51] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:52] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:52] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:53] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:53] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:54] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:54] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:54] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:54] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:55] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:55] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:58] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[3:59] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:59] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:00] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[4:00] * Cube1 (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[4:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:02] * treaki_ (89125f261d@p4FDF7D2A.dip0.t-ipconnect.de) has joined #ceph
[4:02] * noob2 (~cjh@173.252.71.4) Quit (Quit: Leaving.)
[4:02] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:02] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:02] * Cube1 (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit ()
[4:03] * Cube1 (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[4:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:04] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:05] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:05] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:06] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:06] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:06] * treaki (330a8b1619@p4FDF603C.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[4:06] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:07] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[4:07] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:08] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:08] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:08] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:08] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:09] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:09] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:10] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:10] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:11] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:11] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:12] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:12] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:12] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:12] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:14] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:14] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:15] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:15] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:16] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:16] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:17] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:17] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:17] <mikedawson> gregaf: I can get mon.b and mon.c in quorum with no osds started (mon.a remains out to lunch). b and c stay in quorum for over 30 mins. Start 6 OSDs and mon.b starts probing, mon.c stays a peon.
[4:17] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:17] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:18] <mikedawson> and I'm starting the osds relatively slowly (ssh nodeX service start osd .... then ... ssh nodeY service start osd)
[4:18] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:18] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:19] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:19] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:19] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:19] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:20] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:21] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:21] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:21] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:21] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:22] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[4:24] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:24] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:24] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:25] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:25] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:26] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:26] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:27] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:27] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:28] * BillK (~BillK@124-149-73-192.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[4:28] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:29] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:30] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:30] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:30] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:30] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:31] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:31] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:33] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:33] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Write error: connection closed)
[4:34] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:34] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:35] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:35] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:36] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:36] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:36] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:36] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:37] * BillK (~BillK@124-148-98-15.dyn.iinet.net.au) has joined #ceph
[4:37] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:38] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:38] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:38] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:39] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:39] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:40] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:40] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:41] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:41] * leseb2 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:42] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:42] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:43] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[4:43] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:44] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:44] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:45] * smangade (~shardul@174-17-80-182.phnx.qwest.net) has joined #ceph
[4:45] * coyo (~unf@00017955.user.oftc.net) Quit (Ping timeout: 480 seconds)
[4:45] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[4:46] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:46] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[4:47] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:47] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:48] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:48] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:49] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:49] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:50] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:50] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:51] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:51] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:51] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:51] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:52] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:52] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:53] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:53] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:54] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:54] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:55] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:55] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:56] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has joined #ceph
[4:56] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:56] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has left #ceph
[4:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:58] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:59] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:59] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[4:59] * DarkAce-Z (~BillyMays@50.107.54.92) has joined #ceph
[5:00] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[5:00] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[5:01] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[5:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[5:01] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[5:01] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[5:02] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[5:02] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[5:02] * coyo (~unf@pool-71-164-242-68.dllstx.fios.verizon.net) has joined #ceph
[5:03] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[5:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[5:04] * DarkAceZ (~BillyMays@50.107.54.92) Quit (Ping timeout: 480 seconds)
[5:16] * matt_ (~matt@220-245-1-152.static.tpgi.com.au) Quit (Remote host closed the connection)
[5:28] * noob2 (~cjh@pool-96-249-205-19.snfcca.dsl-w.verizon.net) has joined #ceph
[5:29] <noob2> does snapshots in ceph wait for writes to complete or will it snap in the middle of a write?
[5:29] <noob2> do*
[5:29] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[5:34] * matt_ (~matt@220-245-1-152.static.tpgi.com.au) has joined #ceph
[5:35] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[5:52] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[5:52] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) Quit (Ping timeout: 480 seconds)
[5:53] <iggy> noob2: there's no concept of finishing a write in a filesystem... there's an order to things... a snapshot is just an operation that goes into the queue... at least that's how other fs'es work...
[5:53] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[5:53] <noob2> ok i figured that was the case but just wondered if ceph was doing something 'smarter' :)
[5:58] * rustam (~rustam@94.15.91.30) has joined #ceph
[6:05] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[6:10] <iggy> it would be difficult to be too smart about it... the fs has no knowledge really about what's related and what isn't
[6:10] <iggy> xfs handles stuff like that with xfs_freeze/thaw
[6:11] <iggy> i'm unsure if ceph has (or could have) anything similar
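For a filesystem sitting on an RBD image, the usual workaround is to quiesce at the filesystem layer before snapshotting at the RBD layer. A minimal sketch, assuming an XFS filesystem mounted at /mnt/data on the image rbd/myimage (names are placeholders):

    xfs_freeze -f /mnt/data                    # flush and block new writes
    rbd snap create rbd/myimage@consistent     # take the RBD snapshot
    xfs_freeze -u /mnt/data                    # thaw the filesystem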
[6:15] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) Quit (Ping timeout: 480 seconds)
[6:16] * mega_au_ (~chatzilla@94.137.213.1) has joined #ceph
[6:19] * mega_au (~chatzilla@94.137.213.1) Quit (Ping timeout: 480 seconds)
[6:19] * mega_au_ is now known as mega_au
[6:32] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[6:52] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[7:40] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[7:40] * noob2 (~cjh@pool-96-249-205-19.snfcca.dsl-w.verizon.net) Quit (Read error: Connection reset by peer)
[7:46] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) Quit (Ping timeout: 480 seconds)
[8:00] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[8:04] <Qten> Hi, i've been trying to get RadosGW working, i can list/create Folders, however can't upload files, in the error log i'm getting "chunked Transfer-Encoding forbidden: /swift/v1/folder/file1" any ideas?
[8:05] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:07] * tnt (~tnt@109.130.96.140) has joined #ceph
[8:11] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:12] * martyn1 (~aurora@dynamic-adsl-84-222-102-127.clienti.tiscali.it) has joined #ceph
[8:15] <dmick> is that error from apache?
[8:22] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[8:22] <Qten> dmick: yep
[8:23] <Qten> dmick: /var/log/apache2/error.log to be exact,
[8:27] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[8:30] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Remote host closed the connection)
[8:32] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: Pull the pin and count to what?)
[8:33] * cephalobot` (~ceph@ds2390.dreamservers.com) has joined #ceph
[8:35] * cephalobot (~ceph@ds2390.dreamservers.com) Quit (Read error: Operation timed out)
[8:40] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[8:41] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:41] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:41] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:42] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[8:42] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:42] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) Quit (Ping timeout: 480 seconds)
[8:42] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) has joined #ceph
[8:43] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:44] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:44] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:45] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:45] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:45] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:45] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:46] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[8:47] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[8:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:48] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:49] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:49] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:51] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:51] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:51] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:51] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:52] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:52] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:53] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:53] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:54] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:54] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:54] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:54] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:55] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:55] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:56] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:57] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:57] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:57] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:57] * leseb1 (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[8:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[9:01] * martyn1 (~aurora@dynamic-adsl-84-222-102-127.clienti.tiscali.it) Quit (Quit: Sto andando via)
[9:06] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:09] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[9:17] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:18] * den (~oftc-webi@95.130.39.10) has joined #ceph
[9:20] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[9:21] * dragonfly (~Li@220.168.51.169) has joined #ceph
[9:27] * den (~oftc-webi@95.130.39.10) has left #ceph
[9:28] * den48382 (~oftc-webi@95.130.39.10) has joined #ceph
[9:32] * tnt (~tnt@109.130.96.140) Quit (Ping timeout: 480 seconds)
[9:37] * leseb (~Adium@83.167.43.235) has joined #ceph
[9:40] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:41] * ShaunR (~ShaunR@staff.ndchost.com) Quit (Read error: Connection reset by peer)
[9:41] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[9:41] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[9:54] * den48382 (~oftc-webi@95.130.39.10) Quit (Quit: Page closed)
[9:55] * coyo (~unf@00017955.user.oftc.net) Quit (Quit: F*ck you, I'm a daemon.)
[9:57] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) has joined #ceph
[9:59] * l0nk (~alex@83.167.43.235) has joined #ceph
[10:00] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[10:14] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[10:19] * LeaChim (~LeaChim@90.197.3.92) has joined #ceph
[10:23] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[10:33] * l0nk1 (~alex@83.167.43.235) has joined #ceph
[10:33] * l0nk (~alex@83.167.43.235) Quit (Read error: Connection reset by peer)
[10:40] * dikkjo (~dikkjo@46-126-128-50.dynamic.hispeed.ch) has joined #ceph
[10:46] * onix (d4af59a2@ircip4.mibbit.com) has joined #ceph
[10:46] * onix (d4af59a2@ircip4.mibbit.com) has left #ceph
[10:48] <dikkjo> hi everyone, i was wondering if there is a simple formula to determine the amount of disk space offered by a ceph cluster?
[10:50] * v0id (~v0@193-83-49-6.adsl.highway.telekom.at) has joined #ceph
[10:51] <absynth> well, it's about the sum of all OSD disks divided by the replication factor, i think
[10:54] <leseb> absynth: you also need some space for the journals :)
[10:57] * vo1d (~v0@193-83-55-200.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[10:58] <fghaas> ... and replication factor is per pool, not constant across the cluster
[10:58] <fghaas> in other words, dikkjo, no :)
[10:58] <fghaas> because you asked for "simple"
[10:58] <dikkjo> thanks, i had this impression ;)
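As a very rough rule of thumb only (per-pool replication sizes, journal placement and rebalance headroom all change the real number):

    usable ≈ (total raw OSD capacity − journal space) / replication size
    e.g. 12 × 2 TB OSDs = 24 TB raw; with 3 replicas ≈ 8 TB usable, before journals and headroom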
[11:02] * bergerx_ (~bekir@78.188.101.175) Quit (Ping timeout: 480 seconds)
[11:07] * nwl (~levine@atticus.yoyo.org) Quit (Remote host closed the connection)
[11:07] <stacker666> hi all! Is there some limitation on file size in Ceph? I tried to create a 2T file and the process stops at 1.1T with the message "file too large"
[11:08] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[11:08] <stacker666> the osds are formatted ext4 with 4K blocksize
[11:08] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[11:11] <stacker666> i see that it is a limitation of ceph. I can change it in the options :)
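The 1.1T figure matches the CephFS per-file limit of the time, which defaulted to 1 TiB and is controlled by the 'mds max file size' option, read when the filesystem is created. A hedged sketch with a hypothetical value:

    [global]
            ; hypothetical: raise the CephFS maximum file size to 16 TiB
            ; only takes effect for a newly created filesystem
            mds max file size = 17592186044416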
[11:13] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[11:17] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Remote host closed the connection)
[11:17] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Remote host closed the connection)
[11:18] * KindOne (~KindOne@0001a7db.user.oftc.net) has joined #ceph
[11:25] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[11:38] * TMM (~hp@535240C7.cm-6-3b.dynamic.ziggo.nl) has joined #ceph
[11:40] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[11:58] * smangade (~shardul@174-17-80-182.phnx.qwest.net) Quit (Remote host closed the connection)
[12:14] <rtek> how many OSDs are you guys running per node? I'm concerned 12 might be a bit too much. Although I haven't seen either CPU or RAM fully utilized on my test machine with 12 OSDs.
[12:17] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) Quit (Quit: Leaving)
[12:21] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) has joined #ceph
[12:22] * mrjack (mrjack@office.smart-weblications.net) has joined #ceph
[12:24] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) Quit ()
[12:24] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) has joined #ceph
[12:39] * dragonfly (~Li@220.168.51.169) Quit (Read error: Connection timed out)
[12:49] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Quit: Wooo... More Windows Updates..)
[12:49] <jerker> rtek: for my test cluster the OSDs get two cores each, but I was planning 3 OSD/core (modern intel Xeons). ... I believe I read about 1 GHz/OSD (or something like that) on the mailing list
[12:52] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Read error: Operation timed out)
[12:52] <jerker> http://ceph.com/docs/master/install/hardware-recommendations/
[12:52] <jerker> http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/12634
[12:52] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[12:52] <rtek> I did some (limited) testing with a quad core Xeon and 12 OSDs. CPU didn't get fully utilized while rebalancing for example.
[12:53] <rtek> ah, interesting thread
[12:53] <jerker> 4 OSD per core sounds ok too, to me.. I wonder how this affects nodes with SSD cache, maybe need more CPU?
[12:53] <rtek> documentation indeed even specs 13 OSDs
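A rough sizing check based on the ~1 GHz per OSD rule of thumb mentioned above (a guideline from the list, not a hard requirement):

    12 OSDs × ~1 GHz ≈ 12 GHz of aggregate CPU
    e.g. a quad-core 3 GHz Xeon ≈ 12 GHz, so 12 OSDs on such a node is plausible, with little headroom for recovery spikes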
[12:54] <rtek> SSD cache in what way?
[12:54] <jerker> Like bcache or flashcache. I want the root file system on an SSD.
[12:55] <jerker> And when I have an SSD I can as well put some cache there to take all small IO and let the drives take the large IO
[12:56] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (Quit: Changing server)
[12:56] <jerker> But the old damn IBM execution nodes I plan to test on now don't really like to replace the old 160 GB SSD HDD with two 4 TB HDD and a 60 GB SSD. It worked before with a 120 GB SSD and two 3 TB HDD. Goddamn. :(
[12:58] <rtek> is this bcache stuff a bit mature?
[13:23] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:23] <jerker> rtek: I hope it is. Otherwise I will try to just run something, like the journal, on the SSD.
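Putting just the journal on the SSD needs no caching layer at all; it is a single ceph.conf option per OSD. A sketch with a hypothetical device path:

    [osd.12]
            ; hypothetical: data stays on the spinning disk, journal goes to an SSD partition
            osd journal = /dev/disk/by-partlabel/osd12-journal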
[13:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[13:26] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[13:27] * diegows (~diegows@190.190.2.126) has joined #ceph
[13:35] * dikkjo (~dikkjo@46-126-128-50.dynamic.hispeed.ch) Quit (Quit: Leaving)
[13:36] * BillK (~BillK@124-148-98-15.dyn.iinet.net.au) Quit (Read error: Connection reset by peer)
[13:36] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[13:39] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[13:42] <Qten> hey guys any radosgw users around? having some issues when i try to upload files, i can create folders etc ok, but i'm getting a error in /var/log/apache2/error.log "chunked Transfer-Encoding forbidden: /swift/v1/folder/file1"
[13:42] <Qten> I should add using also using grizzly/horizon w/ keystone auth
[13:44] <Qten> seems the actual error is a bit of a dead end; a lot of stuff on the interwebs about "chunked Transfer-Encoding forbidden" but little about how to fix it, so i'm thinking it may not be the problem?
[13:53] * BillK (~BillK@58-7-127-45.dyn.iinet.net.au) has joined #ceph
[13:57] <tnt> Wow, just implemented IO request merging between xen layer and rbd layer and I get 2-4 times better write speed.
[13:58] * hflai (~hflai@alumni.cs.nctu.edu.tw) Quit (Quit: leaving)
[13:58] * masterpe (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[14:01] <nhm> tnt: sounds about right
[14:04] <tnt> There is unfortunately a limit of at most ~1.4MB of requests 'in-flight', so it still doesn't reach the max I can get with a librbd bench, but at least there isn't a 4-5x perf degradation anymore.
[14:14] <mrjack_> health HEALTH_ERR 2 pgs inconsistent; 51 scrub errors
[14:14] <mrjack_> what can i do?
[14:14] <mrjack_> HEALTH_ERR 2 pgs inconsistent; 51 scrub errors
[14:14] <mrjack_> pg 1.11 is active+clean+inconsistent, acting [6,1,0]
[14:14] <mrjack_> pg 1.40 is active+clean+inconsistent, acting [6,5,3]
[14:14] <mrjack_> 51 scrub errors
[14:19] <jtang> hi all
[14:21] * yanzheng (~zhyan@101.82.66.36) has joined #ceph
[14:23] <wido> mrjack_: Did you suffer a disk error somewhere?
[14:23] <wido> The scrubbing found broken PGs
[14:24] <jtang> out of interest, has anyone been extensively testing ceph with zfs's zvol?
[14:24] <wido> mrjack_: $ ceph pg repair <pgid>
[14:25] <wido> jtang: zvols for Ceph OSDs?
[14:25] <wido> or as journals?
[14:25] <wido> mrjack_: http://ceph.com/docs/master/rados/operations/placement-groups/
[14:25] <mrjack_> wido: thanks
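The repair sequence being suggested, using the two pgids from the health output above (though, as it turns out below, the OSD here crashed when asked to repair):

    ceph health detail | grep inconsistent   # lists the inconsistent PGs
    ceph pg repair 1.11
    ceph pg repair 1.40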
[14:25] <jtang> wido: yeah, i'm interested in zfs's features for block-level checksumming
[14:25] <mrjack_> wido: osd crashes when trying ceph pg repair
[14:25] <jtang> the space i'm in makes it worthwhile to look at
[14:25] <jtang> or considering it as an option
[14:25] <wido> jtang: No extensive testing yet, since a recent xattr bugfix allows running Ceph on ZOL
[14:25] <wido> ZFS on Linux
[14:26] <wido> mrjack_: btrfs?
[14:26] <jtang> wido: was this bug fix for ceph or zol ?
[14:26] <mrjack_> no
[14:26] <mrjack_> xfs
[14:26] <wido> jtang: ZOL: https://github.com/zfsonlinux/zfs/pull/1409
[14:26] <jtang> ah okay, thanks for that
[14:27] <wido> jtang: I'm running a setup on ZOL 0.6.1 with that fix with Ceph 0.60 and works pretty good
[14:27] <wido> no really extensive testing yet
[14:27] <wido> mrjack_: Do you see filesystem errors?
[14:27] <mrjack_> wido: no
[14:27] <wido> mrjack_: You can also repair by hand by copying the objects from a good OSD to the other
[14:27] * wido is afk for a moment
[14:27] <jtang> wido: okay, i saw an issue in the ceph bug tracker that someone has been trying it out and performance is not great right now
[14:28] <mrjack_> wido: i had oom condition on the node, osd crashed, i restarted, and since theni have 2 active+clean+inconsistent pgs
[14:28] <mrjack_> wido: now the osd crashes everytime i try to pg repair
[14:29] <jtang> hmmm, so ceph osd's work on ZFS with the said patch
[14:29] <jtang> and we've just taken apart our backblaze pods too :P
[14:30] <jtang> too bad we can't test with some pods anymore
[14:40] * yanzheng (~zhyan@101.82.66.36) Quit (Ping timeout: 480 seconds)
[14:41] <mrjack_> wido: ah
[14:42] <mrjack_> wido: it seems i found the solution
[14:42] <jmlowe> that's good to know, I tried a while ago with zfs backed osd's and didn't have any luck
[14:43] * Hau_MI is now known as HauM1
[14:43] <mrjack_> wido: i use xfs, but had filestore_xattr_use_omap set to true when creating the xfs osds (i have some osds with ext4, and the newer ones with xfs..); i then disabled filestore_xattr_use_omap for the xfs osds...
[14:44] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (Quit: leaving)
[14:44] * masterpe (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[14:44] <mrjack_> wido: then the osd seemed to be unable to get the xattrs for the files in the filestore..
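For context, the option mrjack_ is talking about is a ceph.conf setting; a hypothetical mixed-filesystem layout (the section names and values here are illustrative, not mrjack_'s actual config) might look like the following. The catch mrjack_ hit is that flipping the value after an OSD has been created changes where the OSD expects to find its xattrs:

    [osd.0]
    ; ext4-backed OSD: ext4's small xattr limit makes omap storage necessary
    filestore xattr use omap = true

    [osd.5]
    ; xfs-backed OSD: xfs can hold the xattrs itself
    filestore xattr use omap = false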
[14:48] <dgbaley27> Hey, I'm trying to follow the quick start guide. I setup ceph just fine, but I don't have the "rdb" command, nor do I know what package it's in
[14:51] <dgbaley27> oh, nvm =p
[14:54] <mrjack_> yeah, because it is typed rbd not rdb
[14:56] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:59] <dgbaley27> That aside, I don't have the rbd kernel module. Where was it supposed to come from?
[14:59] <mrjack_> dgbaley27: which kernel version do you use?
[15:00] <dgbaley27> 3.2. It's Ubuntu 12.04 with the ceph repo
[15:00] <mrjack_> does modprobe rbd work?
[15:00] <dgbaley27> no, the module is nowhere under /lib/modules
[15:01] <mrjack_> then you should install a kernel which has this module built in
[15:03] <dgbaley27> ic. I thought the ceph repo had out-of-tree modules
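A quick way to check for the in-kernel rbd module (a sketch; as mrjack_ implies, the module has to come with the kernel itself, not from the ceph packages):

    modprobe rbd && echo "rbd module loaded"
    find /lib/modules/$(uname -r) -name 'rbd.ko*'   # empty output means this kernel lacks it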
[15:08] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[15:12] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) Quit (Ping timeout: 480 seconds)
[15:18] <tnt> Damnit ... I just resized an RBD image to a size much smaller than it was before because I typed 150*104 rather than 150*1024 ...
[15:18] <tnt> good bye filesystem ...
[15:23] <matt_> Anyone know what package has libsnappy in it for compiling ceph on SL 6.3?
[15:24] <elder> tnt no snapshot before resize?
[15:25] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: Don't push the red button!)
[15:25] <tnt> elder: nope ... but it's ok. It was the data disk of a redundant setup. I'll just reformat it and restart the service, it will resync data automatically.
[15:26] <tnt> I think newer rbd versions have a warning if you're downsizing, right? I seem to have seen that in the log.
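A sketch of the mistake tnt describes (pool/image names are made up; rbd resize --size is in megabytes). Later rbd releases refuse to shrink an image unless an explicit override flag such as --allow-shrink is passed, which is likely the safeguard tnt remembers, though it is worth checking against the version in use:

    rbd resize --size $((150*104))  mypool/vm-disk    # typo: 15600 MB, shrinks the image
    rbd resize --size $((150*1024)) mypool/vm-disk    # intended: 153600 MB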
[15:30] * hflai (~hflai@alumni.cs.nctu.edu.tw) has joined #ceph
[15:44] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (Quit: Changing server)
[15:46] * masterpe (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[15:52] <dgbaley27> When using btrfs instead of xfs, are there additional features at the ceph level, or are the same features performed more efficiently? (Like snapshotting block devices)
[15:52] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[15:53] <jmlowe> dgbaley27: do not use btrfs with kernel versions < 3.8, you will lose data
[15:54] <jmlowe> dgbaley27: friendly advice from somebody who lost data
[15:54] <dgbaley27> jmlowe: sure, thanks =) still want to know how ceph benefits directly from btrfs
[15:55] <jmlowe> dgbaley27: I believe the snapshot operations and the cow operations are more efficient
[15:55] <fghaas> dgbaley27: if your filestore is btrfs, ceph can do parallel journaling; for any other fs it's write-ahead
[15:56] <fghaas> if you're on btrfs, the filestore can also do clones (reflinks) in places where other filesystems use a straight copy
[15:56] <dgbaley27> Cool, thanks.
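A sketch of the knobs behind fghaas's point: ceph normally picks the journal mode automatically (parallel on btrfs, write-ahead elsewhere), and these ceph.conf options exist only to override that choice (values shown purely as illustration):

    [osd]
    ; force parallel journaling; only meaningful when the filestore is btrfs
    filestore journal parallel = true
    ; or force classic write-ahead journaling on any filesystem
    ;filestore journal writeahead = true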
[15:57] <dgbaley27> Next question: What are the semantics of modifying the base image of a COW snapshot?
[15:58] * gmason (~gmason@c-68-61-135-223.hsd1.mi.comcast.net) has joined #ceph
[16:01] <jmlowe> dgbaley27: insufficient coffee intake, I don't understand your question
[16:02] <wido> fghaas: dgbaley27: Not to forget, btrfs has checksumming :)
[16:02] <fghaas> wido: true, but only for metadata
[16:03] <wido> fghaas: Every bit helps
[16:03] <fghaas> dgbaley27: to add to jmlowe's question, *what* snapshots are you referring to... cephfs .snap snapshots, RBD snapshots, or the SNAP_V2 ioctl that ceph OSDs use internally?
[16:03] <dgbaley27> jmlowe: If I have an image <base>, and do an rbd clone <base> <clone>. What happens to <clone> if I make a modification to <base>
[16:03] <fghaas> dgbaley27: http://ceph.com/docs/master/rbd/rbd-snapshot/
[16:03] <dgbaley27> thanks =)
[16:04] <fghaas> first google hit for "rbd snapshots" :)
[16:05] * gmason (~gmason@c-68-61-135-223.hsd1.mi.comcast.net) Quit (Quit: Computer has gone to sleep.)
[16:06] <dgbaley27> fghaas: actually, I think reading the rbd manpage more carefully would have answered the question: you don't clone an image, you clone a snapshot
[16:07] <fghaas> dgbaley27: ah well, http://ceph.com/docs/master/man/8/rbd/ then :)
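In short, the clone workflow from that man page looks roughly like this (made-up pool/image names; layering needs format 2 images): you snapshot the base, protect the snapshot, and clone the snapshot, so later writes to the base image never show up in the clone:

    rbd snap create mypool/base@gold          # point-in-time snapshot of the base image
    rbd snap protect mypool/base@gold         # a snapshot must be protected before cloning
    rbd clone mypool/base@gold mypool/clone1  # copy-on-write clone backed by the snapshot
    rbd children mypool/base@gold             # list clones that depend on this snapshot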
[16:08] <jmlowe> fghaas: data also https://btrfs.wiki.kernel.org/index.php/FAQ#What_checksum_function_does_Btrfs_use.3F
[16:09] <fghaas> jmlowe: hah, thanks for pointing that out. any idea in which kernel version that was added?
[16:10] <jmlowe> fghaas: no idea, I did just now have to fight the urge to disparage the btrfs devs for no good reason
[16:11] <nhm> wido: Did you ever have a chance to test the wip-debug-xattr with ZOL?
[16:11] <mikedawson> fghaas, wido: have either of you tested 0.59, 0.60 or next much? I'm seeing unending monitor stability issues
[16:11] <nhm> wido: last time I tried, I was still hitting asserts even with the ZOL patch.
[16:11] <wido> nhm: My idea was to do that right now :)
[16:12] <wido> ZOL setup runs on my desktop at the office
[16:12] <wido> mikedawson: I've just upgraded a 0.56.4 cluster to 'next', can't confirm yet
[16:12] <nhm> wido: yeah, I really wish the patches were fixing it. Having a difficult time tracking down why it's failing.
[16:13] <wido> nhm: With SA turned on on ZOL?
[16:13] * nhm pokes gitbuilder. Hurry up!
[16:13] <mikedawson> wido: unless gitbuilder was fixed in the last 10 hours, check to see if tcmalloc is available. It was broken in the build
[16:13] <nhm> wido: both SA and non-SA codepaths.
[16:13] <nhm> mikedawson: theoretically it's been fixed and we are waiting on new builds.
[16:15] <mikedawson> nhm: good to know, Greg entered that bug, but it isn't publicly available. If you become aware that a new build exists, could someone post to ceph-devel or irc?
[16:16] <nhm> mikedawson: here's gitbuilder status: http://ceph.com/gitbuilder.cgi
[16:16] <nhm> mikedawson: master is coming up, next has a ways to go.
[16:17] <wido> nhm: How did you trigger the asserts? I just ran rados bench for 120 seconds with 16 threads
[16:17] <wido> no problem
[16:18] <nhm> wido: huh!
[16:18] <nhm> wido: I even sat there with Brian Behlendorf next to me to make sure I didn't screw something up on the ZFS side.
[16:19] <wido> nhm: ZOL 0.6.1 with the patch from Bryan and Ceph 0.60 (not even the xattr branch)
[16:19] <wido> Two OSDs with their backing store on ZOL and a ZVOL as journal
[16:19] <wido> in ZOL xattr = on
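Roughly what that test setup amounts to (a sketch; the dataset and pool names are made up, and xattr=sa was the ZOL code path still being fixed at the time):

    zfs set xattr=on tank/osd0             # directory-based xattrs rather than the SA path
    rados -p data bench 120 write -t 16    # 120-second write benchmark with 16 concurrent ops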
[16:20] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[16:21] <nhm> wido: yeah, xattr is on for all of my pools
[16:21] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 21.0/20130416200523])
[16:22] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[16:22] <nhm> wido: I've got 1 pool per OSD with journals on a separate partition on the disk.
[16:22] <wido> nhm: I'm building wip-xattr-debug right now and I'll try again and also with xattr = sa
[16:23] <wido> rados bench just reached 650MB/sec reading; everything came from the ZFS ARC, I assume
[16:23] <nhm> wido: there was a problem with xattr = sa but we fixed it last thursday.
[16:23] <wido> nhm: Is that fix in wip-xattr already?
[16:23] <nhm> wido: in ZFS
[16:24] <nhm> wido: sorry, we meaning Brian and I at the lustre conference.
[16:24] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit ()
[16:24] <wido> nhm: ack, got it. But then it's weird that it all works for me?
[16:24] <wido> "osd data = /desktop/ceph/$type/$id"
[16:24] <wido> "osd journal = /dev/zvol/desktop/journal-$id"
[16:24] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[16:25] <nhm> wido: yeah, that's why I'm wondering if the other branch will assert, because we do a lot more checking. If it doesn't, then it probably means that somehow the zfs debs I built are broken.
[16:26] <nhm> Did you build debs from the git repo?
[16:26] <wido> nhm: Yes, from the repo. Built them locally. wip-xattr is building now
[16:26] <nhm> ok
[16:27] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit ()
[16:28] <nhm> wido: If you built debs, I could try installing them on this node too.
[16:28] <wido> nhm: I don't have the currently installed debs anymore. I build Ceph all day on my desktop
[16:28] <wido> ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)
[16:29] <nhm> wido: ok. I compiled zfs straight from the issue-1408 branch so afaik it should all work fine.
[16:29] <wido> nhm: Ah, i patched the ZFS on my desktop locally and simply built again with dkms
[16:31] <nhm> ok, I'm not using dkms, I built the generic debs so it's using kmod.
[16:32] <wido> nhm: I have the ZFS PPA on my desktop. So just patched in /usr/src/zfs*
[16:32] * portante|ltp (~user@66.187.233.207) has joined #ceph
[16:32] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[16:33] <nhm> wido: maybe I should do that instead. Not sure why it would matter though.
[16:34] <wido> nhm: Apart from whether it works or not, using ZFS with an L2ARC could really boost performance
[16:34] <wido> One SSD + disk per OSD
[16:34] <wido> 4TB disk + SSD for L2ARC :)
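The layout wido is sketching would just be a cache vdev per OSD pool (device names made up):

    zpool create osd0 /dev/sdb       # 4TB spinner backing the OSD
    zpool add osd0 cache /dev/sdc    # SSD (or SSD partition) as L2ARC read cache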
[16:34] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[16:35] <nhm> wido: Yes, I think if we can get ZFS working well as a backend it would be very interesting. :)
[16:38] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[16:41] * cashmont (~cashmont@c-68-35-107-191.hsd1.nm.comcast.net) has joined #ceph
[16:44] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[16:45] <wido> nhm: So, running with wip-xattr killed the OSDs
[16:46] <nhm> wido: ok, that makes me concerned that it might have just happened to work for you earlier.
[16:47] <nhm> I had periods where it worked fine too.
[16:48] <wido> nhm: Yes, so I don't know which branch I was running before this
[16:48] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has left #ceph
[16:53] * cashmont (~cashmont@c-68-35-107-191.hsd1.nm.comcast.net) Quit (Quit: Leaving.)
[16:54] * cashmont (~cashmont@c-68-35-107-191.hsd1.nm.comcast.net) has joined #ceph
[16:54] * cashmont (~cashmont@c-68-35-107-191.hsd1.nm.comcast.net) has left #ceph
[17:02] <mattch> Are there any downsides to not partitioning a journal disk down to a smaller size, and just letting the journal be the whole disk?
[17:06] <nhm> mattch: what kind of disk?
[17:06] <mattch> nhm: In my tests, just sata
[17:07] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[17:07] <mattch> (so no weardown issues as with ssd)
[17:07] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) Quit (Remote host closed the connection)
[17:08] <mattch> I'm guessing that the 'filestore max sync interval' is actually the important number (assuming your journal is 'big enough')
[17:08] <nhm> mattch: Probably not much advantage or disadvantage then, other than that you are now using the whole disk.
[17:08] <nhm> mattch: that's one of them, there are a couple of other journal settings too.
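A sketch of those settings in ceph.conf (values purely illustrative; an external journal device is set per OSD):

    [osd]
    filestore max sync interval = 30      ; seconds between filestore syncs to disk
    journal max write bytes = 104857600   ; cap on bytes per journal write
    journal max write entries = 1000      ; cap on entries per journal write

    [osd.3]
    osd journal = /dev/sdc                ; whole second spinner used as this OSD's journal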
[17:09] <mattch> nhm: as expected, in speed tests I get double the 'journal on disk' speed - which is above what we actually need for our production cluster it seems :)
[17:11] <mattch> There don't seem to be many examples of people using anything other than journal-on-same-disk or ssd for journalling though... don't know if that's a sign to worry or not
[17:12] <nhm> mattch: it's just not super space efficient. With SSDs you can get like 500MB/s in 2.5", so you can put multiple journals on 1 SSD. Journals on disk work if you want to maximize density and have fewer parts that can fail.
[17:13] <mattch> nhm: I guess that makes sense for larger clusters - I'm barely scraping the surface of ceph capacity here, and space is much less an issue than cost - for which a sata journal actually seems fairly effective
[17:14] <nhm> If that works well for you, there's no reason you can't use a 2nd spinning disk for an external journal.
[17:15] <mattch> nhm: good to hear someone say that - I'd not found any reasons why not to, but until you see it written down somewhere (preferably by someone knowledgeable like yourself) you tend to worry :)
[17:15] <mattch> Now to just ask the dev folks whether they want to lose some capacity to journals to gain some bandwidth/latency improvements... thanks for the advice!
[17:17] <nhm> Np, good luck. :)
[17:18] <nhm> do be aware that if the journal fails your OSD will go down, so there is a bit of a tradeoff: now you lose an OSD if 1 of 2 disks fails instead of 1 of 1.
[17:19] <mattch> nhm: Yep - that's on my list for consideration
[17:24] * DarkAce-Z is now known as DarkAceZ
[17:28] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Quit: Leaving.)
[17:30] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[17:31] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has left #ceph
[17:40] <nhm> wido: so using ceph master with ZOL, I can't retrigger the problem I saw before. I'm starting to wonder if the assert code in wip-debug-xattr is testing what we think it's testing.
[17:40] <nhm> need to talk to Sam.
[17:41] * tkensiski (~tkensiski@2600:1010:b01b:f635:aca7:9b89:2080:b886) has joined #ceph
[17:41] * tkensiski (~tkensiski@2600:1010:b01b:f635:aca7:9b89:2080:b886) has left #ceph
[17:41] * dxd828 (~dxd828@195.191.107.205) Quit (Remote host closed the connection)
[17:46] * ScOut3R (~ScOut3R@gprsc2b0e208.pool.t-umts.hu) has joined #ceph
[17:54] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:55] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[17:56] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[17:56] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:57] * ScOut3R (~ScOut3R@gprsc2b0e208.pool.t-umts.hu) Quit (Remote host closed the connection)
[17:58] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Quit: Textual IRC Client: www.textualapp.com)
[18:02] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[18:06] * BillK (~BillK@58-7-127-45.dyn.iinet.net.au) Quit (Read error: Operation timed out)
[18:13] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) Quit (Quit: Leaving.)
[18:15] * tnt (~tnt@109.130.96.140) has joined #ceph
[18:26] * mib_ajly2m (58f004c3@ircip2.mibbit.com) has joined #ceph
[18:29] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:30] * alram (~alram@38.122.20.226) has joined #ceph
[18:32] * mib_ajly2m (58f004c3@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[18:37] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[18:39] * jmlowe (~Adium@173-15-112-198-Illinois.hfc.comcastbusiness.net) has joined #ceph
[18:43] * l0nk1 (~alex@83.167.43.235) Quit (Quit: Leaving.)
[18:46] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[18:48] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:48] * ShaunR- (~ShaunR@staff.ndchost.com) has joined #ceph
[18:49] * ShaunR (~ShaunR@staff.ndchost.com) Quit (Read error: Connection reset by peer)
[18:56] * portante|ltp (~user@66.187.233.207) Quit (Ping timeout: 480 seconds)
[18:58] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: If you think nobody cares, try missing a few payments)
[18:59] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Quit: Textual IRC Client: www.textualapp.com)
[19:02] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[19:06] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[19:06] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has left #ceph
[19:07] * dwt (~dwt@128-107-239-233.cisco.com) has joined #ceph
[19:07] * tkensiski (~tkensiski@209.66.64.134) has joined #ceph
[19:07] * tkensiski (~tkensiski@209.66.64.134) has left #ceph
[19:11] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[19:25] * jmlowe (~Adium@173-15-112-198-Illinois.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[19:27] * imjustmatthew (~imjustmat@c-24-127-107-51.hsd1.va.comcast.net) Quit (Remote host closed the connection)
[19:33] * Havre (~Havre@2a01:e35:8a2c:b230:3579:bf14:69ef:822b) Quit (Remote host closed the connection)
[19:33] * kermi (5252b312@ircip2.mibbit.com) has joined #ceph
[19:35] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Quit: noahmehl)
[19:42] * kermi (5252b312@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[19:44] * bergerx_ (~bekir@78.188.101.175) Quit (Quit: Leaving.)
[19:46] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[19:47] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[19:48] * Havre (~Havre@2a01:e35:8a2c:b230:2cd5:a92f:87c0:a2d1) has joined #ceph
[19:55] * mermi (5252afd3@ircip2.mibbit.com) has joined #ceph
[19:59] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[19:59] <wido> nhm: So I tried with ZFS
[19:59] <wido> Turns out I was running the "next" branch and that worked
[19:59] <wido> wip-xattr breaks
[20:00] <nhm> wido: yeah, I think I've verified that too.
[20:00] <wido> nhm: I haven't checked the backtraces or logs yet
[20:00] <nhm> wido: wondering if our asserts in wip-debug-xattr are not doing what we think they are doing.
[20:02] * jksM (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[20:02] <wido> nhm: I haven't checked on that. Just logged in to the ZFS system and saw no crashes
[20:02] * jtangwk1 (~Adium@2001:770:10:500:a459:af2a:a87b:3264) has joined #ceph
[20:02] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[20:02] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[20:02] <nhm> wido: so far performance without a ZIL is lower than ext4/btrfs/etc, but that's probably to be expected.
[20:03] <wido> nhm: Yes, since ZFS is sync by default.
[20:03] * darkfaded (~floh@88.79.251.60) has joined #ceph
[20:04] * rustam_ (~rustam@94.15.91.30) has joined #ceph
[20:04] * vata (~vata@2607:fad8:4:6:f1de:78ee:7da5:f541) has joined #ceph
[20:05] * iggy (~iggy@theiggy.com) Quit (Quit: No Ping reply in 180 seconds.)
[20:06] * iggy (~iggy@theiggy.com) has joined #ceph
[20:07] * nigwil_ (~idontknow@174.143.209.84) has joined #ceph
[20:07] * jackhill_ (~jackhill@71.20.247.147) has joined #ceph
[20:07] <dmick> nhm: is it triggering the bl == bl2 assert?
[20:07] * terje__ (~joey@184-96-148-241.hlrn.qwest.net) has joined #ceph
[20:08] * Havre (~Havre@2a01:e35:8a2c:b230:2cd5:a92f:87c0:a2d1) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * rustam (~rustam@94.15.91.30) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * DarkAceZ (~BillyMays@50.107.54.92) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * Cube1 (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * chutz (~chutz@li567-214.members.linode.com) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * sagewk (~sage@2607:f298:a:607:7c26:18f5:2afb:3067) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * darkfader (~floh@88.79.251.60) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * jtangwk (~Adium@2001:770:10:500:8418:aa9e:6507:57cd) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * terje- (~terje@184-96-148-241.hlrn.qwest.net) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * jackhill (~jackhill@71.20.247.147) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * nigwil (~idontknow@174.143.209.84) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * joshd1 (~jdurgin@2602:306:c5db:310:4516:1eaf:44de:415) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * piti (~piti@82.246.190.142) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * sig_wall (~adjkru@185.14.185.91) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * jks (~jks@3e6b5724.rev.stofanet.dk) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * phantomcircuit (~phantomci@covertinferno.org) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * terje_ (~joey@184-96-148-241.hlrn.qwest.net) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * musca (musca@tyrael.eu) Quit (resistance.oftc.net charm.oftc.net)
[20:08] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[20:09] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[20:09] * mermi (5252afd3@ircip2.mibbit.com) Quit (Ping timeout: 480 seconds)
[20:09] * DarkAceZ (~BillyMays@50.107.54.92) has joined #ceph
[20:10] * musca (musca@tyrael.eu) has joined #ceph
[20:12] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[20:12] * sagewk (~sage@2607:f298:a:607:b0f4:5462:45db:ca49) has joined #ceph
[20:12] <nhm> dmick: don't remember to be honest
[20:12] <pioto> wido: so, you're running OSDs on top of ZFS?
[20:13] <wido> pioto: I'm testing. It's just a two OSD setup on my desktop at the office
[20:13] <pioto> hrm
[20:13] <pioto> and on linux, not bsd?
[20:13] <nhm> pioto: We got Brian Behlendorf to fix a couple of xattr bugs in ZOL and it seems to be working now.
[20:13] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: my troubles seem so far away, now yours are too...)
[20:13] <nhm> pioto: not much testing yet
[20:14] <wido> pioto: ZFS on Linux indeed
[20:14] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[20:14] * ChanServ sets mode +o scuttlemonkey
[20:16] * terje- (~terje@184-96-148-241.hlrn.qwest.net) has joined #ceph
[20:17] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[20:17] <wido> should monitors from the next branch be able to talk with 0.56.4 monitors? I assume they would?
[20:17] * scuttlemonkey changes topic to 'v0.60 has been released -- http://goo.gl/Fr9PO || argonaut v0.48.3 released -- http://goo.gl/80aGP || http://wiki.ceph.com live! || "Geek on Duty" program -- http://goo.gl/f02Dt'
[20:17] * phantomcircuit (~phantomci@covertinferno.org) has joined #ceph
[20:18] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:19] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[20:19] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has left #ceph
[20:19] * mib_vypwzm (5252afd3@ircip4.mibbit.com) has joined #ceph
[20:19] * joshd1 (~jdurgin@2602:306:c5db:310:881a:87fc:52ea:35ce) has joined #ceph
[20:20] * piti (~piti@82.246.190.142) has joined #ceph
[20:20] <mib_vypwzm> 0.56.4 bobtail gone from topic?
[20:20] <mib_vypwzm> first i read 0.61 ;-) as released
[20:20] <mib_vypwzm> but it's 0.60?
[20:21] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[20:21] <dmick> wido: http://ceph.com/docs/master/release-notes/#v0-59
[20:21] <wido> dmick: Ack, missed that one! I indeed noticed it was all working nicely after they were all upgraded
[20:23] * sig_wall (~adjkru@185.14.185.91) has joined #ceph
[20:24] <dmick> no worries, that's why we write these things down :)
[20:24] <scuttlemonkey> mib_vypwzm: suppose I could leave 56.4 in there...just getting crowded :)
[20:24] <scuttlemonkey> ceph.com/get
[20:24] <mib_vypwzm> scuttlemonkey: was wondering why bobtail was dropped instead of argonaut
[20:25] * scuttlemonkey changes topic to 'Latest stable -- http://ceph.com/get || v0.60 available -- http://goo.gl/Fr9PO || argonaut v0.48.3 available -- http://goo.gl/80aGP || http://wiki.ceph.com Live! || "Geek on Duty" program -- http://goo.gl/f02Dt'
[20:25] <dmick> heh. there had better be a more-authoritative reference for releases than the IRC topic :)
[20:25] <scuttlemonkey> indeed :)
[20:26] <scuttlemonkey> but I agree it's helpful...and the /get link has a much higher probability of being current
[20:27] <mib_vypwzm> scuttlemonkey: ah ok nice - maybe dropping argonaut too? So there is latest stable and latest release
[20:27] <scuttlemonkey> /shrug I think there are still some production deployments that haven't stumbled away from the argonaut yet
[20:27] * chutzpah (~chutz@199.21.234.7) has joined #ceph
[20:27] <scuttlemonkey> I'll let sage decide when to cut them off
[20:29] <mib_vypwzm> scuttlemonkey: sure, but what if v0.61 is released? ;-) the line gets longer and longer
[20:29] <scuttlemonkey> nah,
[20:29] * sig_wal1 (~adjkru@185.14.185.91) has joined #ceph
[20:29] <scuttlemonkey> stable, dev, backward compat
[20:29] * sig_wall (~adjkru@185.14.185.91) Quit (resistance.oftc.net charm.oftc.net)
[20:30] <scuttlemonkey> so it would s/.61/.60
[20:30] <mib_vypwzm> scuttlemonkey: heh, so argonaut gets dropped next week ;-)
[20:30] <scuttlemonkey> or depending on what cuttlefish is it would hit stable
[20:30] <scuttlemonkey> that's it! I'm putting the entirety of http://ceph.com/download/ in the topic! :P
[20:31] <tnt> bobtail isn't even listed ?
[20:31] <scuttlemonkey> come get some!
[20:31] <mib_vypwzm> tnt: sure it's called stable
[20:31] <scuttlemonkey> tnt: it's stable @ ceph.com/get
[20:31] <mib_vypwzm> scuttlemonkey: faster
[20:32] * Cube (~Cube@12.248.40.138) has joined #ceph
[20:32] <tnt> right, but no version number so you might "miss" the release of the latest 'dot' release of stable :p
[20:33] <scuttlemonkey> yeah, but that has happened a lot
[20:33] <scuttlemonkey> irc topic being out of date for weeks on end
[20:33] <scuttlemonkey> I can drop a .++ in as a parenthetical when a new point release hits
[20:34] <tnt> On a different subject, Is anyone working on a method to map RGW buckets to specific rados pools during creation ? (or using a mapping table with regexp or something)
[20:35] <scuttlemonkey> tnt: right now the only thing I know of is setting a "all new buckets go <here>" and updating that while you create buckets
[20:35] <scuttlemonkey> as far as dynamic or remapping...don't think there is any work currently
[20:35] * iggy_ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[20:36] * iggy_ (~iggy@theiggy.com) has joined #ceph
[20:36] <tnt> ok. Currently I hacked radosgw to map XXXX.YYYY bucket names to rados pool .rgw.XXXXX but I was wondering if there was an 'official' way in the pipeline.
[20:36] <scuttlemonkey> nope
[20:36] <scuttlemonkey> would be a cool hack to put in the wiki as a guide maybe?
[20:38] * iggy (~iggy@theiggy.com) Quit (Quit: No Ping reply in 180 seconds.)
[20:38] <mib_vypwzm> what is a typical usage of radosgw / rados? I'm only using rbd.
[20:38] <tnt> Sure, it's there https://github.com/smunaut/ceph/commit/6ba7d6786fb73dd516c623075ae8974952645da6
[20:38] * iggy (~iggy@theiggy.com) has joined #ceph
[20:38] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[20:39] <scuttlemonkey> tnt: I'd love to have some prose from the author :)
[20:39] <scuttlemonkey> but if you don't have time I'll add it to my backlog
[20:40] <tnt> Sure, I'm looking at the wiki now, but not sure where to add it.
[20:42] * iggy_ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[20:42] <scuttlemonkey> can just drop it in http://wiki.ceph.com/02Guides
[20:42] <tnt> mib_vypwzm: radosgw is essentially the same thing as amazon S3. It has a REST interface to objects. We for example use it to store documents uploaded by users as well as derivatives of those (like thumbnails, converted videos, extracted text versions, ...)
[20:42] * iggy (~iggy@theiggy.com) Quit ()
[20:42] * iggy_ (~iggy@theiggy.com) has joined #ceph
[20:42] <scuttlemonkey> tnt: we went with mindtouch deki (the free one) which allows you to move chunks of your wiki around w/ auto redirects
[20:43] <scuttlemonkey> so we'll shuffle it as we see trends emerging
[20:44] <mib_vypwzm> tnt: so it's used instead of a cluster fs like ocfs2?
[20:44] * iggy (~iggy@theiggy.com) has joined #ceph
[20:45] <tnt> mib_vypwzm: yes
[20:46] <mib_vypwzm> tnt: ok thanks but it's a lot slower isn't it?
[20:46] * Wolff_John (~jwolff@vpn.monarch-beverage.com) has joined #ceph
[20:47] * iggy_ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[20:48] <tnt> why would it be slower ?
[20:49] <mib_vypwzm> tnt: direct fs access vs. rest api / http?
[20:50] * iggy (~iggy@theiggy.com) Quit (Quit: No Ping reply in 180 seconds.)
[20:50] <tnt> the REST overhead is no worse than the overhead of having an actual FS rather than doing raw RADOS access ...
[20:50] * iggy (~iggy@theiggy.com) has joined #ceph
[20:50] <mib_vypwzm> tnt: crazy, i have no use case for this but thanks
[20:51] <tnt> and also radosgw doesn't have the overhead of having to maintain a posix compliant interface.
[20:54] <terje-> tnt: you think that REST and cephfs access is the same performance wise?
[20:54] * iggy (~iggy@theiggy.com) Quit ()
[20:55] <scuttlemonkey> terje-: that would be hard to say...cephfs performance can be wildly different depending on use case
[20:55] <mib_vypwzm> terje-: yes and no; as cephfs is not stable, i was talking to tnt about a classical cluster fs like ocfs2
[20:55] <scuttlemonkey> once cephfs hits stable we can take a look :)
[20:56] <terje-> interesting, i figured S3 would be way slower.
[20:56] <tnt> yeah, I was talking about using ocfs2 over rbd ...
[20:57] <mib_vypwzm> tnt: me too
[20:57] <terje-> alrighty
[20:57] <tnt> In my test the S3 gw pretty much is limited by network speed ...
[20:57] <tnt> but I'm using only 2x1G links.
[20:57] <mib_vypwzm> tnt: amazon is generally pretty slow; their virtualized xen machines are pretty slow too
[20:58] <tnt> well yeah, but that's amazon ... not really comparable perf to a local ceph cluster with rgw :p
[21:00] <tnt> And RBD on Xen doesn't perform too well ATM, so I would suspect RGW to be way faster than OCFS2+RBD on my particular setup. As for the general case ... someone needs to do benchmarks :p
[21:02] <tnt> scuttlemonkey: there you go http://wiki.ceph.com/02Guides/Custom_RGW_Bucket_to_RADOS_Pool_mapping :p
[21:02] <scuttlemonkey> cool, thanks
[21:03] <paravoid> ooh!
[21:04] <paravoid> that's also #2169
[21:04] <paravoid> and I'm also dying for this :)
[21:04] <paravoid> the patch doesn't cut it for me though
[21:05] <tnt> yeah, I had the chance to have full control over the naming scheme
[21:05] <paravoid> I'd like to have a handful of pools and thousands of buckets mapped by hand to those
[21:05] * iggy_ (~iggy@theiggy.com) has joined #ceph
[21:06] <tnt> When I wrote the patch, I had a look to see if passing some argument from the REST API down to that level was doable, but unfortunately there are many layers ...
[21:07] <paravoid> there's a branch where yehuda did that
[21:08] <paravoid> yeah, wip-2169
[21:08] <paravoid> I think it was buggy though
[21:12] <tnt> Oh, interesting, I hadn't seen that one.
[21:29] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) Quit (Remote host closed the connection)
[21:48] <yehudasa> paravoid: some fixes went into the next branch for the swift bucket listing and account stat slowness. It'll stream things more, and should be faster (although maybe not as fast as you'd like)
[21:48] <paravoid> oh, great
[21:48] <paravoid> is that cuttlefish material?
[21:48] <yehudasa> yeah
[21:49] <paravoid> thanks!
[22:00] <tnt> yehudasa: do you remember http://www.spinics.net/lists/ceph-devel/msg13480.html ? I can't reproduce it on test, but it happened on prod and I'd like to understand why. All the affected files have somehow been truncated to 512k.
[22:03] * slang (~slang@72.28.162.16) has joined #ceph
[22:07] <paravoid> oh while you're here
[22:07] <paravoid> any tips on how to do rewrites to radosgw?
[22:07] <paravoid> radosgw seems to read REQUEST_URI and apache mod_rewrite doesn't change that when rewriting internally
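For context, the stock radosgw vhost config of that era hands everything to the FastCGI script with a rule along these lines; since radosgw works from REQUEST_URI, as paravoid notes, an internal mod_rewrite rewrite is not seen by radosgw, so URL rewriting generally has to happen before the request reaches this vhost (e.g. at a front-end proxy). Shown only as a baseline sketch:

    RewriteEngine On
    RewriteRule ^/(.*) /s3gw.fcgi?%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]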
[22:08] <yehudasa> tnt: I pushed a fix for that to the bobtail branch
[22:09] <tnt> yehudasa: do you mean this http://tracker.ceph.com/issues/4150 ?
[22:11] <tnt> I had that patch applied already when the issue occured.
[22:15] * TiCPU (~jeromepou@190-130.cgocable.ca) has joined #ceph
[22:16] <tnt> yehudasa: I'm trying to follow the code to see how it could happen but I'm not that familiar with how rgw stores data internally and the terminology (shadow/head/tail/manifest/...)
[22:18] <yehudasa> tnt: issue #4776
[22:19] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[22:19] * loicd (~loic@2a01:e35:2eba:db10:f85d:deb2:da97:ec06) has joined #ceph
[22:19] <TiCPU> I just set up 6 Debian Wheezy servers (kernel 3.8) with Ceph (0.56.4 | 3 mon, 1 mds, 6 OSD on Btrfs). I'm using RBD with libvirt/qemu 0.14. I ran many benchmarks and found that VirtIO devices using writeback/rbd_cache perform at half the write speed of an equivalent VirtIO device using cache=none and rbd_cache=false, exactly 32MB/s vs 65MB/s of throughput. Any idea what I could check? iostat/top show nothing is maxed out on the OSDs, and the VM shows 100% I/O usage.
[22:20] <TiCPU> on a sidenote, using rbd map crashes the kernel.
[22:20] * Wolff_John (~jwolff@vpn.monarch-beverage.com) Quit (Quit: ChatZilla 0.9.90 [Firefox 20.0.1/20130409194949])
[22:21] <TiCPU> Virtio read speed is 200MB/s using RAID0 and 100MB/s on a single device; RADOS bench shows the same
[22:22] <TiCPU> RADOS write is roughly 70MB/s
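For context, the writeback path being benchmarked is configured roughly like this in the libvirt domain XML (a sketch; pool/image name, monitor address and secret uuid are made-up placeholders):

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source protocol='rbd' name='rbd/vm-disk-1'>
        <host name='192.168.0.1' port='6789'/>
      </source>
      <auth username='admin'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <target dev='vda' bus='virtio'/>
    </disk>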
[22:24] <tnt> yehudasa: ok great. Any idea why it only happenned to a few files in prod and I couldn't make it happen during testing ?
[22:25] <joshd> TiCPU: you're probably running into http://tracker.ceph.com/issues/3737, which a newer librbd and qemu fix
[22:26] * chutzpah (~chutz@199.21.234.7) Quit (Quit: Leaving)
[22:27] <yehudasa> tnt: the problem is that it doesn't happen immediately; you copy an object to itself, then have to wait a few hours
[22:28] <tnt> yehudasa: well, when we imported some data, there were like 3000 new files added, then all of them had their metadata updated a few minutes later, and only 3 or 4 ended up damaged (and we noticed it like a few days later).
[22:29] <yehudasa> tnt: it only happens on smaller objects
[22:29] <yehudasa> < 512k
[22:29] <yehudasa> I mean .. larger objects, > 512k
[22:30] <tnt> most of them are larger than that :)
[22:30] <gregaf> wido, you around?
[22:30] <yehudasa> well, not completely sure, there might be some other requirement for it to happen
[22:31] <tnt> I'm mostly just worried that there is some "latent" issue and that it will sometime decide to erase/clean up the tails of those 2996 other files ...
[22:31] * iggy_ (~iggy@theiggy.com) Quit (Quit: leaving)
[22:32] <yehudasa> tnt: the problem was with the garbage collector removing them; it should probably have run since
[22:33] * mib_vypwzm (5252afd3@ircip4.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[22:33] <tnt> yehudasa: ok. If there was, radosgw-admin gc list would show them right ?
[22:36] <mikedawson> joshd: do you guys build a qemu package with the patches described in http://tracker.ceph.com/issues/3737 ? Alternately, do you know if there are debs anywhere with that patch for Raring?
[22:37] <joshd> mikedawson: not yet, but hopefully soon http://tracker.ceph.com/issues/4834
[22:37] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[22:37] <yehudasa> tnt: depends on how much time has passed since they were removed .. it takes a cycle for them to appear there
[22:37] <mikedawson> joshd: thanks
[22:38] <yehudasa> a cycle is an hour by default
[22:38] <yehudasa> I think
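A sketch of inspecting the garbage collector state being discussed (check radosgw-admin --help for the exact subcommands in the version in use):

    radosgw-admin gc list --include-all   # queued gc entries, including not-yet-expired ones
    radosgw-admin gc process              # run a gc cycle by hand instead of waiting for the timer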
[22:38] <paravoid> hm, that sounds pretty serious
[22:38] <paravoid> I wonder if I've been bitten byt it
[22:39] <tnt> yehudasa: well, a month ago ... so :p
[22:40] <tnt> paravoid: We noticed it when running the backup because rgw was then returning a 404 for an object that was in the bucket listing so the backup tool complained about inconsistency.
[22:53] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:55] * vata (~vata@2607:fad8:4:6:f1de:78ee:7da5:f541) Quit (Quit: Leaving.)
[23:04] * iggy_ (~iggy@theiggy.com) has joined #ceph
[23:04] <TiCPU> joshd, I tried to apply the patch real quick and recompile my qemu 1.4 but it does not seem to help, I couldn't test much as I've got to go, but thanks for the tip!!
[23:09] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[23:09] <paravoid> sagewk: nice on ceph-disk-*
[23:09] <paravoid> and bobtail
[23:09] <paravoid> I think I'm one of the few people using them
[23:09] <sagewk> paravoid except that http://gitbuilder.sepia.ceph.com/gitbuilder-precise-deb-amd64/log.cgi?log=0b42b1edb306a9763bcd02bd962bd284f6b7b3a3 :)
[23:09] <sagewk> but yeah
[23:10] <paravoid> still bitten by http://tracker.newdream.net/issues/3255 though
[23:10] <paravoid> I have some really nasty hacks to work around it
[23:11] <paravoid> esp. in combination with #4031 (journal path, should be fixed in bobtail-deploy) & #4032 (reusing IDs)
[23:11] <paravoid> so I want to basically prepare, then mkjournal and set whoami, then activate
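Roughly the split workflow being asked for, with the bobtail-era tool names (device paths made up); the catch discussed below is that udev may auto-activate a prepared disk as soon as it appears:

    ceph-disk-prepare /dev/sdb /dev/sdc1   # format /dev/sdb as an OSD with its journal on /dev/sdc1
    # ... the drive can now sit on a shelf; nothing has joined the cluster yet ...
    ceph-disk-activate /dev/sdb1           # later: allocate the OSD id, mount and start the daemon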
[23:19] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[23:19] * LeaChim (~LeaChim@90.197.3.92) Quit (Ping timeout: 480 seconds)
[23:20] <mikedawson> sagewk: two more files are ready for you on cephdrop
[23:21] * DarkAce-Z (~BillyMays@50.107.54.92) has joined #ceph
[23:24] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[23:26] * DarkAceZ (~BillyMays@50.107.54.92) Quit (Ping timeout: 480 seconds)
[23:26] * Romeo_ (~romeo@198.144.195.85) Quit (Ping timeout: 480 seconds)
[23:29] * jefferai (~quassel@corkblock.jefferai.org) Quit (Read error: Connection reset by peer)
[23:30] * jefferai (~quassel@corkblock.jefferai.org) has joined #ceph
[23:35] * cephalobot` (~ceph@ds2390.dreamservers.com) Quit (Ping timeout: 480 seconds)
[23:36] * Romeo (~romeo@198.144.195.85) has joined #ceph
[23:37] * rturk-away (~rturk@ds2390.dreamservers.com) Quit (Remote host closed the connection)
[23:37] * rturk-away (~rturk@ds2390.dreamservers.com) has joined #ceph
[23:38] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[23:38] * cephalobot (~ceph@ds2390.dreamservers.com) has joined #ceph
[23:45] * gregaf1 (~Adium@2607:f298:a:607:b536:9b5f:7474:a0d1) has joined #ceph
[23:49] * sjustlaptop (~sam@2607:f298:a:697:d426:5432:b06c:6147) has joined #ceph
[23:54] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[23:55] * DarkAce-Z is now known as DarkAceZ
[23:58] * noob2 (~cjh@2620:0:1cfe:28:9cf8:21a5:b78d:b5ed) has joined #ceph
[23:58] <sagewk> paravoid: oh, in latest next ceph-disk-prepare and activate are taking a lock...
[23:58] <sagewk> oh, but that's not letting you wait in between.
[23:58] <paravoid> right
[23:59] <paravoid> how do you actually prepare spare drives?
[23:59] <sagewk> could probably have a file in /var/lib/ceph/tmp/* that lets you suppress any udev-triggered -activate activity
[23:59] <paravoid> to put them on a shelf for the field tech?
[23:59] <sagewk> paravoid: exactly :)
[23:59] <paravoid> how does that work now?
[23:59] <paravoid> it doesn't at all?
[23:59] <sagewk> chmod -x /usr/sbin/ceph-disk-activate ? :P

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.