#ceph IRC Log


IRC Log for 2012-05-01

Timestamps are in GMT/BST.

[0:02] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[0:07] * verwilst_ (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[0:15] <gregaf> sagewk: fixed the parsing warning we talked about, and also changed a use of DEFAULT_CONFIG_FILE that should have been a different flag
[0:17] <gregaf> it passes the cli tests and works nicely without output:
[0:17] <gregaf> *without a config file
[0:17] <gregaf> gregf@kai:~/logs$ ./ceph -m -k ~/ceph/src/keyring --auth-supported cephx -s
[0:17] <gregaf> 2012-04-30 15:10:06.548150 7f0042b2c760 -1 did not load config file, using default settings.
[0:17] <gregaf> 2012-04-30 15:10:06.552093 pg v7: 24 pgs: 24 active+degraded; 8730 bytes data, 77174 MB used, 744 GB / 863 GB avail; 21/42 degraded (50.000%)
[0:17] <gregaf> 2012-04-30 15:10:06.552255 mds e5: 1/1/1 up {0=a=up:active}
[0:17] <gregaf> 2012-04-30 15:10:06.552284 osd e5: 1 osds: 1 up, 1 in
[0:17] <gregaf> 2012-04-30 15:10:06.552347 log 2012-04-30 15:09:03.513542 mon.0 5 : [INF] mds.0 up:active
[0:17] <gregaf> 2012-04-30 15:10:06.552381 mon e1: 1 mons at {a=}
[0:22] <joshd> I don't think we need the warning
[0:22] * verwilst_ (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[0:23] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[0:34] <gregaf> joshd: we need to say something if they don't have a config specified, because you can get some truly bizarre warnings out without one
[0:34] <gregaf> ENOTSUPP is my favorite
[0:35] <gregaf> that's what you get if the cluster is using cephx and you don't specify it
[0:38] <joshd> how will knowing you're not using a config file help though? it seems like better error messages, like ('did you forget to specify --auth-supported?') would help more
[0:38] <gregaf> sure, they'd be way better, but those aren't a neat little packaged fix we can do
[0:39] <gregaf> this way if you have a ceph.conf somewhere you can at least go "huh, maybe if I add -c /my/normal/working/dir ... yep, it worked!"
[0:40] <joshd> I guess I'm thinking more about the case where you're trying *not* to use a ceph.conf - it'd be annoying to get that message on every command
[0:41] <gregaf> it's just one line...
[0:41] <gregaf> given that we used to error out, a warning seems appropriate
[0:41] <gregaf> least surprise and all that
[0:42] <joshd> I guess for now, but in the future, when not using a conf file is more common, we can remove it
[0:43] <gregaf> yeah, that's fine with me!
[0:45] * pablohof (~prh@r190-135-40-232.dialup.adsl.anteldata.net.uy) has joined #ceph
[0:45] * s[X]_ (~sX]@eth589.qld.adsl.internode.on.net) has joined #ceph
[1:10] <sagewk> sjust: can you rebase wip-past-interval on master? Interval was renamed to pg_interval_t by the wip-pi branch (now merged)
[1:11] * pablohof (~prh@r190-135-40-232.dialup.adsl.anteldata.net.uy) has left #ceph
[1:21] * The_Bishop (~bishop@p4FCDF50E.dip.t-dialin.net) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[1:28] <sjust> ok
[1:32] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[1:34] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:40] * Ryan_Lane (~Adium@ has joined #ceph
[1:48] * Ryan_Lane1 (~Adium@ has joined #ceph
[1:48] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[2:08] * Tv_ (~tv@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[2:08] * The_Bishop (~bishop@cable-86-56-102-91.cust.telecolumbus.net) has joined #ceph
[2:13] * lxo (~aoliva@lxo.user.oftc.net) Quit (Read error: Operation timed out)
[2:26] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[2:26] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[2:26] <sagewk> nhm: wip-throttle should give better visibility into the various throttlers now...
[2:27] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[2:31] * Ryan_Lane (~Adium@ has joined #ceph
[2:31] * Ryan_Lane1 (~Adium@ Quit (Read error: Connection reset by peer)
[2:31] <nhm> sagewk: excellent, thanks
[2:31] <nhm> sagewk: anything I need to do to use them?
[2:34] <joao> oh yeah, back in business
[2:35] <joao> I'm never upgrading any computer ever again
[2:36] * yoshi (~yoshi@p3167-ipngn3601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:37] <nhm> joao: that's why I'm still on natty. :)
[2:37] <joao> yeah...
[2:38] <joao> truth be told, making changes on both computers at the same time wasn't the smartest idea ever
[2:39] <joao> upgrading the desktop and partitioning/formatting the laptop in the same weekend was asking for trouble
[2:40] <joao> I still don't have the mac's webcam working on ubuntu, and one speaker is louder than the other, but at least it's booting and has internet access :p
[2:41] <nhm> That's really all you need. ;)
[2:41] <joao> indeed :)
[2:41] <iggy> one speaker louder... that's an odd one
[2:41] <joao> yeah
[2:42] <joao> the right speaker is louder than the left speaker, but hey... I just don't care anymore
[2:58] <elder> Just switch them around every so often.
[2:59] <elder> My laptop is stuck doing something at bootup. I can ssh in, but the graphic part is not happy. It worked great for a while, but I updated a few packages after the initial install and, well, I haven't taken the time to figure out what went wrong.
[2:59] <elder> I have another machine, a desktop server, which I basically run headless, and it's doing just fine.
[3:46] * adjohn (~adjohn@ Quit (Quit: adjohn)
[4:12] * Ryan_Lane1 (~Adium@ has joined #ceph
[4:12] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[4:32] * adjohn (~adjohn@70-36-139-109.dsl.dynamic.sonic.net) has joined #ceph
[4:41] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:50] * adjohn (~adjohn@70-36-139-109.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[4:54] <elder> OK, screw it, I must do something else before my mind turns to gel. I'm going to take a break from reading raw memory contents and will return to good old fashioned debug prints to gather information.
[4:56] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:58] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:59] * Ryan_Lane1 (~Adium@ Quit (Quit: Leaving.)
[5:47] * s[X]_ (~sX]@eth589.qld.adsl.internode.on.net) Quit (Remote host closed the connection)
[6:00] * adjohn (~adjohn@70-36-139-109.dsl.dynamic.sonic.net) has joined #ceph
[6:04] * adjohn is now known as Guest103
[6:04] * Guest103 (~adjohn@70-36-139-109.dsl.dynamic.sonic.net) Quit (Read error: Connection reset by peer)
[6:04] * adjohn (~adjohn@70-36-139-109.dsl.dynamic.sonic.net) has joined #ceph
[6:21] * Ryan_Lane (~Adium@c-98-210-205-93.hsd1.ca.comcast.net) has joined #ceph
[7:14] * s[X]_ (~sX]@eth589.qld.adsl.internode.on.net) has joined #ceph
[7:18] * LarsFronius (~LarsFroni@95-91-243-252-dynip.superkabel.de) has joined #ceph
[7:40] * cattelan is now known as cattelan_away
[8:00] * detaos (~quassel@c-50-131-106-101.hsd1.ca.comcast.net) has joined #ceph
[8:01] <detaos> hi everybody!
[8:01] * LarsFronius (~LarsFroni@95-91-243-252-dynip.superkabel.de) Quit (Quit: LarsFronius)
[8:01] <detaos> anyone awake at this fine hour?
[8:06] <detaos> guess not ...
[8:06] <detaos> anyways, I just set up a ceph cluster on four machines for a lab class I'm taking this quarter.
[8:07] <detaos> I'd like to mount the cluster on another set of machines using /etc/fstab, but the documentation for doing so is currently a bit lacking. [ http://ceph.newdream.net/docs/master/ops/manage/cephfs/?highlight=fstab ]
[8:08] <detaos> I was wondering if there was any internal documentation floating around, or if someone would be kind enough to explain how to do it.
[8:08] <detaos> additionally, if you're interested in accepting contributions, I'd be happy to help write some of the documentation.
[8:09] <detaos> I
[8:10] <detaos> I'll remain connected to this channel in case someone wants to get back to me after I've gone for the evening.
[8:10] <detaos> Many thanks to all the devs for making ceph so easy to install :)
[8:28] * Theuni (~Theuni@ has joined #ceph
[8:29] <imjustmatthew> detaos: FWIW I use "ceph.example.com:/ /media/ceph ceph name=hostname,secretfile=/etc/ceph/client.hostname.secret"
[8:30] * Theuni (~Theuni@ Quit ()
[8:30] <imjustmatthew> which mounts the filesystem with whatever capabilities the client.hostname.secret has assigned
[8:31] <imjustmatthew> ceph.example.com is a DNS pointer to the monitors' IPs using round robin since the client can connect to any of them
[8:32] <imjustmatthew> The ceph devs are mostly US-Pacific time if you want to try and catch them later today
[8:33] <detaos> hi imjustmatthew ... the DNS solution is interesting. I tried to simply list the ip addresses of the three monitors I have, but when I issued the `mount -a` command, it complained of invalid formatting.
[8:34] <detaos> do you have redundant DNS servers, or is that a single point of failure in your setup?
[8:36] <imjustmatthew> I do, but I'm not really worried about them failing. I just use a record for convenience
[8:37] <imjustmatthew> are you using IPv4 or v6?
[8:39] <detaos> v4
[8:40] <detaos> the computers in the lab are relatively dated ... despite having four osd's, i have a whopping 240GB of total cluster space :D
[8:40] <imjustmatthew> I'm afraid I can't help you with that one, I used v6, though I thought that page was on the wiki still...
[8:40] <imjustmatthew> lol, don't worry, mine's a bunch of old 120GB and 250GB drives I pulled from our old web servers
[8:41] <detaos> i was actually wondering how relevant the old wiki information was, given that some of it is contradicted in the new docs.
[8:42] <imjustmatthew> some of it still works, it is pretty messed up
[8:42] <imjustmatthew> try using the one from the wiki: "mount -t ceph /mnt/osd -vv -o name=admin,secret=AQATSKdNGBnwLhAAnNDKnH65FmVKpXZJVasUeQ=="
[8:43] <imjustmatthew> I feel like that was the format I used, though with [] notation instead
[8:43] <imjustmatthew> though at least one other page has it without the port number too
[8:44] <detaos> I'll have to give that a go tomorrow ... I myself am US Pacific and would be terribly less content with life, were I there now (given that it's very nearly midnight) ;)
[8:45] <imjustmatthew> ha, good luck :)
[8:45] <detaos> thanks :)
[9:34] * s[X]_ (~sX]@eth589.qld.adsl.internode.on.net) Quit (Remote host closed the connection)
[9:57] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[10:12] * adjohn (~adjohn@70-36-139-109.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[10:26] * ivan\ (~ivan@108-213-76-179.lightspeed.frokca.sbcglobal.net) Quit (Quit: ERC Version 5.3 (IRC client for Emacs))
[10:28] * ivan\ (~ivan@108-213-76-179.lightspeed.frokca.sbcglobal.net) has joined #ceph
[10:51] * Oliver1 (~oliver1@p54839C4A.dip.t-dialin.net) has joined #ceph
[10:54] * loicd (~loic@magenta.dachary.org) has joined #ceph
[10:57] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[11:09] * Oliver1 (~oliver1@p54839C4A.dip.t-dialin.net) Quit (Quit: Leaving.)
[11:13] * stxShadow (~Jens@ip-78-94-238-69.unitymediagroup.de) has joined #ceph
[11:25] * pmjdebruijn (~pascal@overlord.pcode.nl) has joined #ceph
[11:25] <pmjdebruijn> hi
[11:25] <pmjdebruijn> cp: cannot stat `/build/buildd/ceph-0.46/debian/tmp/usr/bin/ceph-kdump-copy': No such file or directory
[11:26] <pmjdebruijn> the debian packaging seems to be off
[11:26] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[11:27] <pmjdebruijn> oh wait
[11:28] <pmjdebruijn> I probably forgot to check out the release, and stuff got added after release
[11:43] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[11:43] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[11:45] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[11:45] * yoshi (~yoshi@p3167-ipngn3601marunouchi.tokyo.ocn.ne.jp) Quit (Read error: Connection reset by peer)
[11:45] * yoshi (~yoshi@p3167-ipngn3601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[11:49] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[11:53] * yoshi (~yoshi@p3167-ipngn3601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[12:01] * wido (~wido@rockbox.widodh.nl) Quit (Remote host closed the connection)
[12:01] * wido (~wido@rockbox.widodh.nl) has joined #ceph
[12:01] * LarsFronius (~LarsFroni@95-91-243-252-dynip.superkabel.de) has joined #ceph
[12:07] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[12:25] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[12:49] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[12:52] * loicd (~loic@magenta.dachary.org) has joined #ceph
[12:53] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[13:18] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[13:18] * loicd (~loic@magenta.dachary.org) has joined #ceph
[13:30] * MK_FG (~MK_FG@ Quit (Quit: o//)
[13:32] * s[X]_ (~sX]@60-241-151-10.tpgi.com.au) has joined #ceph
[13:34] * MK_FG (~MK_FG@ has joined #ceph
[13:54] * s[X]_ (~sX]@60-241-151-10.tpgi.com.au) Quit (Remote host closed the connection)
[14:36] * stxShadow (~Jens@ip-78-94-238-69.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[14:42] <nhm> good morning #ceph
[14:50] <joao> hey Mark
[14:50] <joao> :)
[15:13] <lxo> wheee! upgrade to 0.46 uneventful
[15:15] * danieagle (~Daniel@ has joined #ceph
[15:21] <nhm> lxo: ya!
[15:53] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[15:58] * cattelan_away is now known as cattelan
[16:00] * aliguori (~anthony@ has joined #ceph
[16:16] * aliguori (~anthony@ Quit (Remote host closed the connection)
[16:46] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[16:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:06] * joao (~JL@ Quit (Remote host closed the connection)
[17:16] * joao (~JL@ has joined #ceph
[17:21] <nhm> elder: ping
[17:21] <elder> Here.
[17:23] <nhm> elder: I don't suppose you know how messages get to the dispatch and policy throttlers do you?
[17:23] <elder> No, sorry.
[17:23] <nhm> ok, np
[17:23] <elder> I could look a bit if your question actually triggered any recognition of the context, but it doesn't...
[17:24] <nhm> elder: yeah, I'll just keep snooping around until one of the other guys shows up.
[17:25] <nhm> elder: I'm using Sage's wip-throttle branch now and noticed that the throttlers apparently have no data when these stalls happen, so it seems like the problem is farther upstream.
[17:29] * renzhi (~renzhi@ Quit (Quit: Leaving)
[17:33] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:38] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[17:39] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:46] <elder> I haven't been following what you guys have been doing lately nhm. I've been pretty focused on this stupid bug that I wish would simply go away.
[17:59] <nhm> elder: yeah, I have no idea what you are working on other than it sounds terrible. ;) My problem is basically related to something causing stalls on the fast burnupi nodes. It seems to be related to a problem Jim Schutt had on the mailing list 2 months ago, but his fix didn't seem to work for me.
[18:00] <elder> It doesn't have to be terrible, I've just been at it for a few days straight without much progress and I need a little change or it will burn me out.
[18:00] <elder> I don't know what role the throttlers play, nor where in the entire ceph stack the stalls you mention are occurring.
[18:00] <joao> nhm, what seems to be the problem?
[18:00] <elder> So I'm not a lot of help, without better context, and probably wouldn't be very much even then.
[18:01] <nhm> elder: Oh, that's ok. I'm content to keep picking at it until I can figure it out.
[18:02] <nhm> joao: I seem to be hitting something very similar to what's going on in this very long thread: http://www.spinics.net/lists/ceph-devel/msg04787.html
[18:02] <sagewk> nhm: here now
[18:03] <sagewk> nhm: are the wait counters zero after the run?
[18:04] <sagewk> note that there are several dispatch throttlers.. 4 for ceph-osd, iirc
[18:05] <nhm> sagewk: how do I check that? One thing I noticed on new runs is that during the slow periods, I basically no longer see the client reader wanting anything from the dispatch throttler, but do see stuff like "reader wants 47 from dispatch throttler 0/104857600" from the other OSD.
[18:06] <nhm> sagewk: oh, is that just the perf counter stuff from the asok?
[18:07] <sagewk> yeah
[18:07] <sagewk> on the client side, that probably just means there are no incoming messages
[18:07] <sagewk> oh which reminds me, we should probably record the current throttle value and max in there too.
[18:08] <nhm> sagewk: hrm, did the asok move? nothing in /var/run/ceph anymore.
[18:08] <sagewk> /var/run/ceph/ceph.osd.$id.asok?
[18:09] <nhm> sagewk: duh, I stopped ceph
[18:10] <nhm> sagewk: btw, I was starting some work on some scripts to parse the ops_in_flight and perf counters once a second. Useful?
[18:11] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[18:11] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit ()
[18:11] <sagewk> probably. i suspect what we want more is something that displays the perfcounter stats once per second, similar to vmstat etc
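A vmstat-style display of the kind sagewk describes comes down to diffing successive perf counter snapshots. The following is only a minimal sketch of that diffing step: it assumes each snapshot has already been read from the admin socket and parsed as JSON, and the section and counter names (`throttle-dispatch`, `get`, `wait`) are hypothetical placeholders, not confirmed Ceph counter names.

```python
import json


def counter_deltas(prev, curr):
    """Return per-counter differences between two perf counter snapshots.

    Both arguments are nested dicts of the form {section: {counter: value}}.
    """
    deltas = {}
    for section, counters in curr.items():
        for name, value in counters.items():
            old = prev.get(section, {}).get(name)
            # Only integer counters present in both snapshots are diffed.
            if isinstance(value, int) and isinstance(old, int):
                deltas[f"{section}.{name}"] = value - old
    return deltas


def format_row(deltas, columns):
    """Format one output row, right-aligned in fixed-width columns."""
    return " ".join(f"{deltas.get(col, 0):>10}" for col in columns)


# Hypothetical snapshots taken one second apart (names are assumptions).
snap_t0 = json.loads('{"throttle-dispatch": {"get": 100, "wait": 2}}')
snap_t1 = json.loads('{"throttle-dispatch": {"get": 164, "wait": 2}}')

deltas = counter_deltas(snap_t0, snap_t1)
print(format_row(deltas, ["throttle-dispatch.get", "throttle-dispatch.wait"]))
```

A real tool would wrap this in a loop that re-reads the socket once per second and prints one row per interval.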
[18:11] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[18:11] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit ()
[18:12] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[18:12] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit ()
[18:13] <sjust> sagewk: wip-past-interval also has a fix for the probe set bug
[18:14] <sagewk> sjust: had a few comments on github
[18:14] <sjust> ok
[18:15] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[18:16] * Ryan_Lane (~Adium@c-98-210-205-93.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:16] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit ()
[18:17] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[18:21] <gregaf> nhm: sagewk: that probably means the delays are happening in replication rather than when they come in from clients (since the rados bench client limits itself to the given number of messages at a time)
[18:21] <sagewk> yeah
[18:22] <nhm> gregaf: That makes sense to me.
[18:22] <nhm> or at least it doesn't contradict anything I've seen.
[18:24] * gregaf (~Adium@aon.hq.newdream.net) Quit (Quit: Leaving.)
[18:27] * brambles (brambles@ Quit (Quit: leaving)
[18:27] * Tv_ (~tv@aon.hq.newdream.net) has joined #ceph
[18:27] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[18:27] * brambles (brambles@ has joined #ceph
[18:28] * lofejndif (~lsqavnbok@04ZAACZAT.tor-irc.dnsbl.oftc.net) has joined #ceph
[18:37] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[18:42] <jefferai> Hi Ceph guys
[18:44] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[18:44] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:48] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) has joined #ceph
[18:50] * bchrisman (~Adium@ has joined #ceph
[19:03] * Ryan_Lane (~Adium@ has joined #ceph
[19:03] <joao> how long does usually an osd stays "stale" after being brought back up?
[19:05] <sagewk> 1s-10s of seconds before ceph-osd reports updated status to the monitor
[19:06] <joao> can you see any reason why it would stay in this state for 5 minutes now?
[19:06] <sagewk> the osd(s) for those pgs are still down
[19:06] <sagewk> i think 'ceph health detail' will tell you the likely culprits
[19:07] <joao> HEALTH_WARN 24 pgs degraded; 24 pgs stale; 24 pgs stuck stale; 24 pgs stuck unclean; recovery 4418/8836 degraded (50.000%); 1 near full osd(s); mds a is laggy
[19:07] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:07] <joao> does any of this rings a bell?
[19:08] <joao> also, the osd is up and running
[19:08] <sagewk> there's only one?
[19:08] <joao> yes
[19:09] <sagewk> ceph pg dump | grep stale, then look for one of those pgs on the osd
[19:09] <sagewk> or 'ceph pg <pgid> query'
[19:10] <joao> blocks
[19:10] <joao> as in, forever waiting for something
[19:10] <joao> all the pgs are stale+active+degraded
[19:12] <joao> can this be related with using --filestore-blackhole?
[19:13] <sagewk> ceph pg map <pgid>
[19:13] <joao> also, this shows up in osd log
[19:13] <joao> osd.0 69 mon hasn't acked PGStats in 30.006978 seconds, reconnecting elsewher
[19:13] <sagewk> on the current ceph-osd, or a previous one?
[19:14] <joao> current version, if that's what you mean; yesterday's I think
[19:14] <joao> jecluis@Magrathea:~/Code/dreamhost/ceph/src$ ./ceph pg map 0.0
[19:14] <joao> osdmap e72 pg 0.0 (0.0) -> up [] acting []
[19:20] * chutzpah (~chutz@ has joined #ceph
[19:25] <Tv_> sagewk: i really wish bringing up an osd had fewer operations in it
[19:26] * eightyeight (~atoponce@pthree.org) Quit (Read error: Operation timed out)
[19:27] <Tv_> sagewk: 1) "ceph osd create --concise" to allocate id 2) "ceph mon getmap -o ..." 3) "ceph-osd --mkfs --mkkey ..." 4) "ceph auth add ..." the osd key to mons 5) "ceph osd crush add ..." (only if id!=0) 6) touch .../done to have some sort of atomic completion marker
[19:28] <sagewk> create is the only mon operation that is non-idempotent
[19:29] <Tv_> sagewk: yeah but i need to know when to re-run, when not to
[19:29] <Tv_> sagewk: oh is ceph osd crush add actually safe to re-run? that's news to me
[19:29] <Tv_> i have a dim memory of it crapping out
[19:29] <sagewk> yeah, it's adding it to a specific location in the tree
[19:29] <Tv_> then that's moving to ceph-osd startup!
[19:29] <Tv_> because it might be different from creation time
[19:29] <sagewk> oh, it may error out the second time. i can fix that
[19:29] <Tv_> add-or-replace ;)
[19:29] <sagewk> the result is the same
[19:30] <sagewk> yeah
[19:30] * eightyeight (~atoponce@pthree.org) has joined #ceph
[19:30] <sagewk> well, add-or-noop.. it won't replace anything (except itself i guess)
[19:30] <sagewk> it could update the weight.
[19:30] <Tv_> sagewk: it should update the location.
[19:30] <Tv_> sagewk: think moving hard drive to different chassis
[19:31] <sagewk> yeah, ok. opening an issue for that.
[19:31] <Tv_> thanks, and sorry for being a scatterbrain about this
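The completion-marker idea from step 6 of Tv_'s list can be sketched as follows. This is an illustration under stated assumptions, not how any Ceph tooling implements it: `provision_once` and the stand-in steps are hypothetical names, and the steps are plain Python callables rather than the real `ceph` commands. The sketch is only safe when every step can be repeated harmlessly, which is the property discussed above.

```python
import os
import tempfile


def provision_once(state_dir, steps):
    """Run all steps unless the 'done' marker exists; create the marker last.

    Returns True if the steps were run, False if they were skipped.
    """
    marker = os.path.join(state_dir, "done")
    if os.path.exists(marker):
        return False
    for step in steps:
        step()
    # The marker is created only after every step has returned.
    fd = os.open(marker, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # Flush the marker file itself; a fully durable version would
        # also fsync the containing directory.
        os.fsync(fd)
    finally:
        os.close(fd)
    return True


# Stand-in steps that just record that they ran (hypothetical).
calls = []
steps = [lambda: calls.append("mkfs"), lambda: calls.append("auth add")]

with tempfile.TemporaryDirectory() as state_dir:
    first = provision_once(state_dir, steps)   # runs the steps
    second = provision_once(state_dir, steps)  # marker present, skipped
```

If the process dies before the marker is written, the next run repeats every step, so a non-idempotent operation such as allocating a new id would need to be handled separately.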
[19:32] <nhm> sagewk: So Jim Schutt sent me a script he wrote to scan through tcpdump output and look for unACKed retransmits. Didn't find any.
[19:32] <sagewk> nhm: ok. this is trivial to reproduce, right?
[19:32] <nhm> sagewk: yep
[19:32] <sagewk> with logging turned up?
[19:33] <nhm> sagewk: yep. Let me crank everything back up.
[19:33] <nhm> sagewk: anything you want higher/lower than 20?
[19:34] <sagewk> naw
[19:36] <joao> is there any way to force the stale out of the osd?
[19:37] <nhm> sagewk: ok, restarted ceph. OSD nodes are burnupi28 and burnupi29. Currently using 1GE, 1 OSD per node, XFS.
[19:37] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[19:37] <nhm> client is plana05
[19:37] * Ryan_Lane (~Adium@ has joined #ceph
[19:37] <nhm> now watch as you start looking at it everything magically works. ;)
[19:37] <yehudasa> sagewk: deb package name for rest-bench? rest-bench? ceph-rest-bench? radosgw-rest-bench?
[19:38] <yehudasa> note that it depends on ceph-common
[19:38] <joao> sagewk, looks like --filestore-blackhole is the culprit here
[19:39] <joao> I'm just gonna inject it and see what happens
[19:40] <sagewk> yeah, --filestore-blackhole will make things not work.. it's only there for failure injection
[19:40] <sagewk> if that was ever set, you need to restart ceph-osd
[19:40] <sagewk> yehudasa: is the binary rest-bench?
[19:41] <sagewk> sjust: can you look at wip-2355?
[19:41] <sjust> ok
[19:41] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[19:42] <sjust> sagewk: looks good
[19:43] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:45] <sagewk> gregaf:
[19:45] <sagewk> b.add_u64_counter(l_throttle_get_or_fail_fail, "get_or_fail_fail");
[19:45] <sagewk> b.add_u64_counter(l_throttle_get_or_fail_success, "get_or_fail_success");
[19:45] <sagewk> ?
[19:45] <gregaf> sure
[19:45] <sagewk> gregaf: so get_or_fail success will also count as a get (and be included in that sum)
[19:46] <gregaf> that was my assumption, although I didn't really think about it much
[19:47] <gregaf> *ponders*
[19:47] <gregaf> well, we can calculate the same information either way, so whatever
[19:47] <sagewk> yeah.
[19:48] <sagewk> ok otherwise?
[19:49] <gregaf> yep, all looks good!
[19:49] <sagewk> great thanks!
[19:52] * LarsFronius (~LarsFroni@95-91-243-252-dynip.superkabel.de) Quit (Quit: LarsFronius)
[19:56] <sagewk> nhm: 2012-05-01 13:51:48.554993 7f78f89fa700 2 filestore(/srv/osd.0) op_queue_reserve_throttle waiting: 50 > 1000 ops || 209804505 > 209715200
[19:56] <sagewk> is what i'm seeing
[19:56] <nhm> sagewk: ok, not sure if this is new with wip-throttle or I somehow didn't have the right debugging before, but now I'm seeing some very interesting things in the logs correlated with the stalls. Tons of "reading nonblocking into 0x85fc16c len 4075156" followed by "read 17376 of 4075156"
[19:56] <sagewk> which is basically xfs not keeping up with the journal.
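To see which limit that throttle line is tripping, the numbers can be pulled out and compared. The sketch below assumes the first pair in the message is queued ops versus the op limit and the second pair is queued bytes versus the byte limit; that reading is inferred from the message text, not confirmed from the filestore source.

```python
import re

# Message portion of the log line quoted above.
LINE = "op_queue_reserve_throttle waiting: 50 > 1000 ops || 209804505 > 209715200"

PATTERN = re.compile(r"waiting: (\d+) > (\d+) ops \|\| (\d+) > (\d+)")


def exceeded_limits(line):
    """Report which of the two limits is exceeded, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    ops, max_ops, nbytes, max_bytes = map(int, match.groups())
    return {
        "ops": ops > max_ops,
        "bytes": nbytes > max_bytes,
        "bytes_over": nbytes - max_bytes,
        "max_bytes_mib": max_bytes / 2**20,
    }


result = exceeded_limits(LINE)
print(result)
```

Under that reading, the op count (50 of 1000) is nowhere near its limit, while the queued bytes sit just above a 200 MiB cap, which fits the explanation that the backing filesystem is not draining the queue fast enough.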
[19:56] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[19:58] <nhm> sagewk: Ok, but when I look at the writes going to the OSD drives, I see ~10-40MB/s.
[19:59] <nhm> Not sure what exactly "osd tell 0 bench" does, but that reports ~130MB/s which is in-line with what dd to the xfs fs reports.
[20:00] <nhm> well, I know it benchmarks osd 0 in some way. ;)
[20:00] <gregaf> bench is 1GB of 4MB writes; I think they're synchronous and sequential?
[20:00] <sagewk> it writes directly to filestore in a loop
[20:00] <sagewk> async
[20:01] <sagewk> nhm: there are extra args to make it write more than 1gb.. should do that (maybe 10gb?).. i think it's too short of a test
[20:01] <sagewk> the stalls i see are more like ~15-30 seconds apart?
[20:01] <nhm> sagewk: it can be, but not always.
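A quick back-of-the-envelope check shows why a 1GB bench may be too short to catch these stalls. The sketch assumes 1GB means 1024MB and uses the ~130MB/s figure nhm reported; both are approximations.

```python
def bench_seconds(total_mb, throughput_mb_s):
    """Estimate how long a bench run takes at a steady throughput."""
    return total_mb / throughput_mb_s


# 1GB of 4MB writes, taking 1GB as 1024MB (an assumption).
write_count = 1024 // 4

default_run = bench_seconds(1024, 130)      # roughly 8 seconds
longer_run = bench_seconds(10 * 1024, 130)  # roughly 79 seconds

print(write_count, round(default_run, 1), round(longer_run, 1))
```

With stalls spaced roughly 15 to 30 seconds apart, a run of about 8 seconds can easily finish between two of them, whereas a 10GB run should span several.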
[20:02] <gregaf> sagewk: where did 81f51d28d67c2a58ab621405c3da65aac726d719 come from? (osd: pg creation calc_priors_during()...)
[20:05] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Remote host closed the connection)
[20:05] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[20:06] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[20:07] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) has joined #ceph
[20:10] * aliguori (~anthony@ has joined #ceph
[20:11] <sagewk> gregaf: #2355
[20:12] <gregaf> I didn't see it in wip-throttle when I was looking through it, I mean
[20:13] <gregaf> I'm satisfied that mechanically it does what the commit message says, but I don't know if there are (eg) other paths that need that same fix...
[20:14] <sagewk> don't think so. it's analogous to build priors, but at pg creation time.
[20:14] <gregaf> okay, just saying, might want to run it by sjust
[20:15] <sagewk> did.. just forgot to put it in the commit msg
[20:16] <gregaf> ah, cool
[20:16] <gregaf> the way github emails are formatted I thought it got merged in via wip-throttle and was freaked out
[20:16] <gregaf> stupid github emails
[20:26] * Ryan_Lane1 (~Adium@ has joined #ceph
[20:26] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[20:31] * johnl (~johnl@2a02:1348:14c:1720:29c6:1136:ca1a:b083) Quit (Remote host closed the connection)
[20:31] * johnl (~johnl@2a02:1348:14c:1720:64c1:2a48:15d0:c8db) has joined #ceph
[20:51] <yehudasa> sagewk: the binary is rest-bench
[20:55] * lofejndif (~lsqavnbok@04ZAACZAT.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[20:58] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[20:59] * aliguori (~anthony@ Quit (Remote host closed the connection)
[21:01] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:03] <joao> sagewk, gregaf, whenever you can, check wip-2323-b to see if that works for you
[21:04] <yehudasa> sagewk: Tv: need to clone http://git.ischo.com/libs3.git into github.com/ceph
[21:08] * Ryan_Lane (~Adium@ has joined #ceph
[21:08] * Ryan_Lane1 (~Adium@ Quit (Read error: Connection reset by peer)
[21:15] <Tv_> yehudasa: doin' it
[21:18] <sagewk> yehudasa: yeah i'd name the package rest-bench
[21:19] <sagewk> joao: looking now
[21:20] * stxShadow (~Jens@ip-78-94-238-69.unitymediagroup.de) has joined #ceph
[21:20] <Tv_> yehudasa: pushed master
[21:22] <gregaf> joao: looks good to me, although you should be able to do that with just one vector (eg, stick an empty string in first and overwrite it at the end)
[21:22] <sagewk> joao: https://github.com/ceph/ceph/commit/21e546e946ec945dbc00efab9b1550366d7a69d5
[21:23] <sagewk> oh, actually, what gregaf said :)
[21:26] * detaos|cloud (~cd9b41e9@webuser.thegrebs.com) has joined #ceph
[21:26] <detaos|cloud> hi everybody
[21:27] <gregaf> hello
[21:33] <yehudasa> Tv_: super, thanks
[21:33] <yehudasa> sagewk: ok, rest-bench it is
[21:34] <joao> gregaf, sagewk, kay
[21:38] * aliguori (~anthony@ has joined #ceph
[21:39] <yehudasa> Tv_: that's a fork of wido's github, not the upstream libs3
[21:40] <yehudasa> there are some changes that went in a bit differently upstream, so I'd rather take that one
[21:41] <Tv_> yehudasa: oh right i had wido's checked out
[21:42] <Tv_> yehudasa: force pushed ischo's master
[21:42] <yehudasa> Tv_: thanks
[21:42] * aliguori (~anthony@ Quit ()
[21:43] <detaos|cloud> I got my ceph cluster up and running ok, it reports HEALTH_OK, but when I go to mount using `mount -t ceph ceph -vv` i get: "mount error 16 = Device or resource busy"
[21:47] <detaos|cloud> if I cat /proc/mounts, the ceph folder doesn't appear in the list of mounts.
[21:47] * aliguori (~anthony@ has joined #ceph
[21:53] <detaos|cloud> running `umount ceph` and `mount -t ceph -vv` now produces: "mount:error writing /etc/mtab: Invalid argument"
[21:54] <Tv_> detaos|cloud: uhh you're not specifying where to mount it..?
[21:57] <detaos|cloud> Tv_: sorry, that was a retype error ... i actually ran `mount -t ceph ceph -vv`
[21:57] <detaos|cloud> note: pwd := /media
[21:57] <joao> sagewk, gregaf, rebased & pushed
[21:57] <Tv_> detaos|cloud: anything interesting in dmesg? mds log?
[21:58] <detaos|cloud> dmesg | tail shows "libceph: mon0 session established"
[21:59] <detaos|cloud> not sure how to get at the mds log ...
[21:59] <elder> So when I turn on dynamic_debug messages, where are they supposed to show up? I thought I enabled a bunch of them, but they didn't appear.
[21:59] <Tv_> elder: the kernel thing? they're printk's
[21:59] <elder> So on the console?
[22:00] <elder> Because I didn't see them.
[22:00] <Tv_> elder: dmesg?
[22:00] <elder> Not there either.
[22:00] <Tv_> whether printk shows on the actual console is separately configurable
[22:00] <elder> I'll make sure I'm setting it up right.
[22:00] <elder> Just wanted a sanity check that I'm expecting the right thing.
[22:08] <elder> Looks like I may be getting some output this time. Maybe I missed something last time.
[22:08] <elder> Well, maybe.
[22:10] <Tv_> the filter syntax is fickle
[22:10] <elder> No, it's not that.
[22:11] <elder> Actually I may still not be getting exactly what I expect.
[22:11] <yehudasa> Tv_: I need permissions to push to libs3.git
[22:12] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[22:12] <detaos|cloud> Tv_: it looks like the mount worked. I created a file on one client, ran the same mount on the other three clients, and the file was there ... interestingly though, I can only mount to two of my three monitors. the one that fails dies with "mount error 5 = Input/output error" dmesg | tail shows "libceph: loaded mon/osd proto 15/24, osdmap 5/6 5/6 and ceph: loaded mds proto 32"
[22:12] <Tv_> yehudasa: oh huh the defaults are weird? checking
[22:12] * Ryan_Lane (~Adium@ has joined #ceph
[22:13] <Tv_> yehudasa: try now
[22:13] <yehudasa> Tv_: works, thanks
[22:13] <gregaf> detaos|cloud: what does ceph -s show? are all the monitors working?
[22:14] <sagewk> elder: they're at level debug, so they'll be in dmesg but probably not console.
[22:14] <elder> I don't see them in dmesg either, at least not via kdb.
[22:15] <detaos|cloud> gregaf: yes, the monitor line shows all three.
[22:15] <gregaf> are all the processes operating?
[22:16] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[22:16] <detaos|cloud> how do i check that?
[22:16] <gregaf> go to each host and look for a ceph-mon process (top, ps, whatever)
[22:18] <detaos|cloud> ceph-mon is running on all three.
[22:18] <Tv_> you could point "ceph -s" at each mon, using "ceph -m IP:PORT -s"
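Tv_'s suggestion of querying each monitor in turn with "ceph -m IP:PORT -s" could be scripted roughly as below. This is only a sketch: the helper names, the `ceph_bin` parameter, and the timeout value are assumptions made for illustration, and the only CLI form used is the one quoted in the log. The timeout is there so that a monitor that never answers does not block the loop.

```python
import subprocess


def build_status_command(mon_addr, ceph_bin="ceph"):
    """Build the 'ceph -m IP:PORT -s' command line quoted in the log."""
    return [ceph_bin, "-m", mon_addr, "-s"]


def probe_monitor(mon_addr, timeout=10, ceph_bin="ceph"):
    """Ask a single monitor for cluster status.

    Returns (ok, detail). Helper name and return shape are hypothetical.
    """
    cmd = build_status_command(mon_addr, ceph_bin)
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # The command never returned within the time limit.
        return False, "no reply within %d seconds" % timeout
    except FileNotFoundError:
        # The ceph binary is not installed or not on PATH.
        return False, "command not found: %s" % ceph_bin
    if result.returncode != 0:
        return False, result.stderr.strip()
    return True, result.stdout.strip()
```

Calling `probe_monitor` once per monitor address and printing the results would show which monitor fails to answer; the addresses themselves have to come from the cluster's own configuration.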
[22:20] <Tv_> sagewk: allow me to play devil's advocate.. how do i know ceph-osd mkfs actually completed? can we make it have some specific file creation as the final operation, with a sync before it?
[22:20] <sagewk> tv_: sure, that's easy to do.
[22:20] <sagewk> may already, let me check.
[22:20] <Tv_> sagewk: actually, since i'm using --mkkey i need to authorize the key afterwards anyway, so perhaps i need that external to ceph-osd
[22:21] <Tv_> sagewk: hmm..
[22:22] <sagewk> yeah, i would have an external completion file, and if it's missing redo the whole thing. the only thing that isn't harmless to repeat is osd create, and that just wastes an id, no big deal
[22:22] <sagewk> (modulo the crush map thing)
[22:22] <Tv_> sagewk: and the crush map belongs in osd startup, not mkfs
[22:22] <sagewk> oh, even better.
[22:23] <Tv_> sagewk: so i don't want to put in an rm -rf.. and that means if i ran mkfs, i need to re-use the results, and not allocate a new id
[22:23] <Tv_> sagewk: which means i need to read whoami
[22:23] <detaos|cloud> Tv_ running `ceph -m -s` produces no output and doesn't return.
[22:24] <Tv_> sagewk: which means i'd still like to know that the mkfs actually succeeded, and isn't a crash state
[22:24] <Tv_> sagewk: so how about this; make whoami the final file, that gets the whole careful write to whoami.tmp sync rename dance
[22:25] <Tv_> well
[22:25] <Tv_> actually
[22:25] <Tv_> dammit ;)
[22:25] <Tv_> i'd like to just store the result of me allocating an id
[22:25] <Tv_> and start from there
[22:25] <Tv_> so i guess i'll put in "my own" whoami file
[22:25] <Tv_> this is not ideal
[22:27] <sagewk> it doesn't matter if the mkfs completed or not.. the whoami file still tells you the result of the osd create..
[22:27] <Tv_> sagewk: only if it gets far enough to write it down, but yeah
[22:27] <Tv_> sagewk: but as it is right now, i can't trust any single file to be a marker for "mkfs succeeded"
[22:28] <sagewk> yeah, just a sec
[22:28] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:28] <sagewk> any preference on a name?
[22:29] <elder> "mkfs_succeeded"?
[22:29] <Tv_> sagewk: why not
[22:29] <Tv_> heh
[22:29] <elder> Or mkfs_status, and inside tells you what it was.
[22:29] <Tv_> elder: there's no point in storing failures
[22:29] <elder> Sometimes a NAK is better than nothing though.
[22:30] <Tv_> sagewk: how's this: magic, ceph_fsid and whoami should be written at the beginning, do stuff, syncfs, touch "ready" at the end?
[22:31] <sagewk> i like ready.
[22:31] <elder> Or a not_ready until it's ready?
[22:31] <Tv_> sagewk: when i pick up, i verify magic, ceph_fsid and whoami and reuse those if i can
[22:31] <sagewk> the only danger i see is that if you run this cookbook with old code it'll repeatedly mkfs
[22:31] <Tv_> elder: presence is a stronger signal than lack of it
[22:31] <Tv_> sagewk: yeah don't care
[22:31] <sagewk> k
[22:32] <Tv_> sagewk: i need to put files in the debs anyway
[22:32] <Tv_> sagewk: it'll break miserably without them
[22:32] <sagewk> right
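The pick-up logic Tv_ describes (verify magic, ceph_fsid and whoami, reuse them if possible, and treat "ready" as the positive signal that mkfs finished) might look something like the sketch below. The file names come from the discussion; the function names, the returned state strings, and the assumption that whoami holds a plain decimal id are illustrative guesses, not the actual ceph-osd behaviour.

```python
import os


def read_text(path):
    """Return the stripped contents of a file, or None if it is missing."""
    try:
        with open(path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def inspect_osd_dir(osd_dir, expected_fsid):
    """Classify an OSD data directory (hypothetical helper).

    Returns (state, osd_id) where state is one of:
      "fresh"     - identity files missing or mismatched; allocate a new id
      "redo_mkfs" - identity files valid but no "ready" marker; reuse the id
      "ready"     - identity files valid and mkfs marked complete
    """
    magic = read_text(os.path.join(osd_dir, "magic"))
    fsid = read_text(os.path.join(osd_dir, "ceph_fsid"))
    whoami = read_text(os.path.join(osd_dir, "whoami"))

    # Assumption: whoami contains the OSD id as a decimal number.
    identity_ok = (
        magic is not None
        and fsid == expected_fsid
        and whoami is not None
        and whoami.isdigit()
    )
    if not identity_ok:
        return "fresh", None

    osd_id = int(whoami)
    # Presence of "ready" is the positive signal that mkfs completed.
    if os.path.exists(os.path.join(osd_dir, "ready")):
        return "ready", osd_id
    return "redo_mkfs", osd_id
```

Keying the decision on the presence of "ready" rather than on the absence of a failure marker matches the point made in the log that a positive signal is stronger than a missing negative one.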
[22:32] <elder> Tv_, that is restating my "NAK" comment in a different way
[22:32] <Tv_> elder: i want a positive signal, not a lack of negative signal
[22:32] <elder> I know.
[22:32] <Tv_> elder: if nothing checks for the negative signal, then it's of no value
[22:32] <The_Bishop> ceph-osd -v
[22:33] <elder> You're getting dangerously close to overload with all the negatives in that last sentence.
[22:33] <Tv_> sagewk: in fact, i think i want to merge canonical's "ceph-mds" etc changes sooner rather than later
[22:33] <Tv_> elder: no i wouldn't never!
[22:33] <elder> I'm not so sure.
[22:34] <The_Bishop> will there be an error message when something fails in the process?
[22:37] <Tv_> sagewk: oh one more thing.. i'd really like to see these files be guaranteed valid-if-exist
[22:37] <Tv_> sagewk: that is, write to temp file, sync, rename
[22:37] <sagewk> sure
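The "write to temp file, sync, rename" dance that makes a file valid-if-exists can be sketched as follows. This is a generic POSIX-style illustration, not the code that went into wip-osdmkfs; the function name and the ".tmp" suffix are assumptions.

```python
import os


def write_file_atomically(path, data):
    """Write bytes so that 'path' is either absent or fully written."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        # Force the file contents to disk before the rename makes it visible.
        os.fsync(f.fileno())
    # Atomically swap the complete temp file into place.
    os.replace(tmp_path, path)
    # Sync the containing directory so the rename itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A reader that finds the final file can then trust its contents, because a crash before the rename leaves only the temp file behind.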
[22:42] <gregaf> joao: yep, looks good! :)
[22:45] <joao> great, marking it resolved
[22:48] <gregaf> joao: btw, if you find yourself idle you may want to read "Paxos made simple" (and the original, if you're brave) if you haven't done so before
[22:50] <joao> haven't read it before, no :)
[22:50] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[22:50] <joao> will look for it and get to it
[22:50] * Ryan_Lane (~Adium@ has joined #ceph
[22:50] <elder> http://research.microsoft.com/en-us/um/people/lamport/pubs/lamport-paxos.pdf
[22:51] <joao> but first, finish up some more pending stuff for the trip :)
[22:51] <joao> thanks elder
[22:51] <elder> http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf
[22:51] <gregaf> don't you love people who publish their works freely? :D
[22:51] <elder> Should I plan on VidYO!!! for our meeting in 9 minutes?
[22:51] <elder> I will be in the car again, so will turn off my own video if so.
[22:52] <gregaf> yeah, I think it's vidyo
[22:52] <elder> Which "room"?
[22:52] <joao> gregaf, taking into account I'm still using my university's proxy to obtain all articles, I had never given that a thought :x
[22:53] <gregaf> elder: presumably we'll use ceph
[22:53] <gregaf> if not I'll get into irc on my phone and say so
[22:54] <joao> gregaf, if you guys don't use ceph or danger room, try to get me the room's url please :)
[22:54] <gregaf> yeah, it'll be one of those
[22:54] <joao> great!
[22:57] <sagewk> tv_: wip-osdmkfs
[22:58] * aliguori (~anthony@ Quit (Remote host closed the connection)
[23:02] * BManojlovic (~steki@ has joined #ceph
[23:08] * stxShadow (~Jens@ip-78-94-238-69.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[23:21] * danieagle (~Daniel@ Quit (Quit: See you later :-) and Thank You Very Much For Everything!!! ^^)
[23:32] * Ryan_Lane1 (~Adium@ has joined #ceph
[23:32] * Ryan_Lane (~Adium@ Quit (Read error: Connection reset by peer)
[23:45] * LarsFronius (~LarsFroni@95-91-243-252-dynip.superkabel.de) has joined #ceph
[23:46] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[23:57] * Ryan_Lane (~Adium@ has joined #ceph
[23:57] * Ryan_Lane1 (~Adium@ Quit (Read error: Connection reset by peer)
[23:58] <Tv_> sagewk: sweet
[23:59] <sagewk> passes my sanity check, i'll merge it

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.