#ceph IRC Log


IRC Log for 2011-09-26

Timestamps are in GMT/BST.

[2:41] * jmlowe (~Adium@c-98-223-185-0.hsd1.in.comcast.net) has joined #ceph
[3:04] * jmlowe (~Adium@c-98-223-185-0.hsd1.in.comcast.net) Quit (Quit: Leaving.)
[3:16] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[4:23] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Ping timeout: 480 seconds)
[4:28] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[4:33] * yoshi_ (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[4:33] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Read error: Connection reset by peer)
[4:41] * yoshi_ (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[4:42] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[5:45] <lxo> gregaf1, the patch does indeed cut down the start-up memory use of an osd, but it still stabilizes at about twice as much as in 0.34
[6:16] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[6:32] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[6:46] * greglap (~Adium@mobile-166-205-141-225.mycingular.net) has joined #ceph
[6:58] <stass> sagewk: hi!
[7:28] * greglap (~Adium@mobile-166-205-141-225.mycingular.net) Quit (Quit: Leaving.)
[7:43] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[9:58] * ghaskins_ (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[10:05] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Ping timeout: 480 seconds)
[10:13] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[10:18] * ghaskins_ (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Ping timeout: 480 seconds)
[11:00] * DanielFriesen (~dantman@S0106001731dfdb56.vs.shawcable.net) Quit (Quit: http://daniel.friesen.name or ELSE!)
[11:47] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[12:46] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:29] * gregorg (~Greg@ has joined #ceph
[14:31] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Quit: WeeChat 0.3.2)
[14:32] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[16:54] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[17:18] <ajm> hrm, for some reason my mds'es are getting stuck in up:replay
[17:29] <gregaf1> ajm: are your OSDs up, and all your PGs active?
[17:35] <ajm> gregaf1: yes, the cluster is/was rebuilding due to a failed machine so there is some % degraded but all OSDs are up at the moment and "3406 pgs: 271 active, 3135 active+clean"
[17:35] <gregaf1> and your MDSes are still in replay?
[17:36] <ajm> yep, "mds e252: 1/1/1 up {0=13=up:replay}"
[17:36] <gregaf1> hmm, I wouldn't expect that to take too long then
[17:37] <gregaf1> you've just got the one MDS, right? (and the cluster never had more?)
[17:37] <ajm> 3 mds but only 1 active
[17:37] <gregaf1> 2 standbys?
[17:37] <ajm> yes
[17:37] <gregaf1> they aren't showing up there....
[17:37] <ajm> i think its unhappier since i added lots of debugging for them
[17:37] <gregaf1> have they maybe been running through replay and then crashing?
[17:41] <ajm> i was running them in screen with -d with the debugging
[17:41] <ajm> they definitely weren't crashing
[17:42] <ajm> "mds e269: 1/1/1 up {0=15=up:replay}, 2 up:standby"
[17:42] <ajm> now
[17:42] <gregaf1> oh….
[17:42] <gregaf1> what debug level?
[17:42] <ajm> very high :)
[17:42] <gregaf1> if you're forcing it to render a lot of debug to a screen session you're going to make it go very slowly
[17:42] <ajm> yeah, probably why only 1/3 was up
[17:42] <gregaf1> it'll be faster writing to a file :)
[17:42] <ajm> but without that it doesn't seem to come up anyway
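The file-logging setup being suggested above can be sketched as a ceph.conf fragment. This is a sketch, not the exact config from the conversation: the log path and the debug level are assumptions, though `debug mds` and `log file` are real ceph.conf options, and `$name` expands to the daemon's type.id.

```ini
[mds]
    ; send debug output to a per-daemon file instead of a screen session
    debug mds = 20
    log file = /var/log/ceph/$name.log
```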
[17:54] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[17:57] <ajm> hrm i wonder if i'll just have to wait till the whole thing is healthy again
[18:00] <gregaf1> ajm: what happens if you start them up without logging to screen?
[18:01] * Tv|work (~Tv|work@aon.hq.newdream.net) has joined #ceph
[18:03] <ajm> gregaf1: i have them logging to files at the moment
[18:03] <gregaf1> and it's still just in replay?
[18:03] <ajm> mds e275: 1/1/1 up {0=15=up:replay}, 2 up:standby
[18:03] <gregaf1> no crashing or anything?
[18:04] <ajm> no crashing, running in screen with -f
[18:04] <gregaf1> how much cpu is it using?
[18:04] <gregaf1> does the log look like it's doing anything or is it spitting out error messages?
[18:05] <ajm> one sec, lemme put a log snippet up
[18:06] <ajm> http://pastebin.com/xMtFs6eY
[18:07] <ajm> it runs through a ton of the "journal EMetaBlob.replay added" lines
[18:07] <ajm> (millions of them)
[18:08] <gregaf1> millions?
[18:08] <ajm> i didn't count :)
[18:08] <ajm> but its a lot
[18:08] <gregaf1> it should be running through a lot, yes
[18:08] <gregaf1> those are the grouping of journal events
[18:08] <ajm> i'm not sure if that phase never finishes, or if that phase finishes and it doesn't move on to the next
[18:11] <gregaf1> ajm: this wasn't the end of the log, was it?
[18:12] <gregaf1> it looks like it finished a bunch of replays and then got some more data and didn't start on it for some reason
[18:12] <gregaf1> can you see if 200.0009335c is in the log?
[18:13] <ajm> following that is just a bunch of mon messages similar to what I put, just more of them
[18:13] <ajm> 2011-09-26 11:47:39.094956 7f503718c700 -- --> -- osd_op(mds0.54:40 200.0009335c [read 0~4194304] 1.2363) v1 -- ?+0 0x3230e90 con 0x292c690
[18:14] <ajm> only line with 200.0009335c
[18:14] <gregaf1> okay, so that's why the MDS isn't moving out of replay, it's not getting a reply from the OSD and it's waiting for it
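One way to confirm that diagnosis from the log is to count matching request and reply lines for the object. A sketch against a sample snippet; the line format is abbreviated from the `debug ms = 1` output quoted above, and `osd_op_reply` is the reply message name assumed for such a read:

```shell
# sample of the kind of lines 'debug ms = 1' produces (abbreviated)
cat > mds.log <<'EOF'
2011-09-26 11:47:39.094956 -- --> -- osd_op(mds0.54:40 200.0009335c [read 0~4194304] 1.2363) v1
2011-09-26 11:47:39.100000 -- <== -- osd_op_reply(39 200.0009335b [read 0~4194304] = 0) v1
EOF

# a request with no matching reply means the MDS is still waiting
sent=$(grep -c 'osd_op(.*200\.0009335c' mds.log)
replied=$(grep -c 'osd_op_reply(.*200\.0009335c' mds.log)
echo "sent=$sent replied=$replied"
```

Here the read of 200.0009335c was sent but never answered, which matches the stuck-in-replay symptom.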
[18:16] <ajm> hrm
[18:16] <gregaf1> sagewk, any ideas why an OSD wouldn't respond to a read request if the PG is active?
[18:16] <ajm> btrfs sucks, thats my answer :)
[18:16] <gregaf1> ajm: I assume you haven't done any manual repairs to your cluster state?
[18:16] <gregaf1> no forcing PGs active or marking things lost or anything?
[18:17] <sagewk> not offhand.. would need to see the log
[18:17] <ajm> no
[18:17] <sagewk> check dmesg for btrfs badness?
[18:17] <gregaf1> (though I would assume those things would manifest in the PG state not going active)
[18:17] <ajm> oh, this is the node that failed
[18:17] <ajm> its rereplicating data into it though
[18:17] <ajm> shouldn't it refetch that data from somewhere else if it wasn't present on that node?
[18:17] <gregaf1> the node that failed is getting asked to service a read?
[18:18] <ajm> its back up and data is copying back onto it
[18:18] <ajm> but its not done yet
[18:18] <gregaf1> okay
[18:18] <ajm> I made this one xfs just to test things out
[18:18] <ajm> I wonder if that is causing breakage
[18:18] <gregaf1> I doubt that's the trouble
[18:19] <gregaf1> the replaced node is probably the primary for that PG, which I didn't think was supposed to happen until it had the data
[18:20] <ajm> hrm, is there a way to see that
[18:22] <gregaf1> ceph pg dump -o | grep 1.2363
[18:22] <gregaf1> I think that 1.2363 is the pg, anyway
[18:22] <gregaf1> but yes, that's the problem
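The suggested check can be sketched against a stand-in for the dump output. A real `ceph pg dump` has many more columns; the trimmed sample below keeps only pgid, state, and acting set, which is the part that answers "which OSD is primary for this PG" (the first entry in the acting set):

```shell
# trimmed stand-in for 'ceph pg dump -o -' output: pgid, state, acting set
cat > pgdump.txt <<'EOF'
1.2362  active+clean  [3,7]
1.2363  active        [5,2]
1.2364  active+clean  [2,5]
EOF

# is the recently replaced OSD (osd5 in this log) primary for the stuck PG?
grep '^1\.2363' pgdump.txt
```

In this sample the acting set is [5,2], so osd5 is primary even though it is still backfilling.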
[18:22] * noahdesu (~nwatkins@kyoto.soe.ucsc.edu) has joined #ceph
[18:22] <ajm> hrm
[18:22] <gregaf1> you'll have to wait until the OSD finishes replicating data
[18:23] <gregaf1> when all the PGs are active+clean
[18:23] <ajm> yeah :/
[18:23] <gregaf1> and we'll need to fix this so it doesn't flip so early
[18:27] <gregaf1> ajm: do you have any logging enabled on the OSD?
[18:27] <gregaf1> it's supposed to be prioritizing data that it gets asked about and we'd like to see why it isn't
[18:28] <ajm> no but I can
[18:28] <ajm> debug osd = 20 ?
[18:28] <ajm> or anything else
[18:29] <gregaf1> debug ms = 1
[18:29] <gregaf1> as well
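Collected as a ceph.conf fragment; the log path is an assumption, but `debug osd` and `debug ms` are the options named above:

```ini
[osd]
    debug osd = 20
    debug ms = 1
    log file = /var/log/ceph/$name.log
```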
[18:30] <gregaf1> you should be able to attach it to http://tracker.newdream.net/issues/1563 once you have it :)
[18:31] <ajm> will do
[18:31] <gregaf1> thanks!
[18:31] <ajm> 2011-09-26 12:31:00.522281 7f66bc421720 osd5 55513 pg[0.22f( v 48421'29099 lc 0'0 (48421'29097,48421'29099]+backlog n=23053 ec=2 les/c 55463/55436 55444/55444/55444) [] r=0 (info mismatch, log(48421'29097,0'0]+backlog) (log bound mismatch, actual=[37'118,38261'17347]) mlcod 0'0 inactive] read_log 1366299 38261'17348 (0'0) b 10000c42fb7.0000018e/head by client7412.1:1134702 2011-09-11 12:52:35.724580
[18:31] <ajm> massive # of those on startup
[18:32] <gregaf1> yeah, we'll need to dig through it, but this isn't something we can do over irc (or at least, I can't)
[18:32] <ajm> this is a huge log
[18:33] <gregaf1> it should compress pretty well
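Debug logs like this are highly repetitive, so they do compress well; a quick illustration with an artificial log (file names are arbitrary):

```shell
# build a log-like file from one repeated line, then compress a copy of it
yes '2011-09-26 12:31:00.522281 osd5 pg[0.22f] read_log entry' \
  | head -n 100000 > osd.sample.log
bzip2 -k osd.sample.log

# the .bz2 should be a small fraction of the original size
ls -l osd.sample.log osd.sample.log.bz2
```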
[18:34] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[18:35] <ajm> ok waited for the osd to be up/in
[18:35] <ajm> should i restart the mds'es so that it rerequests that data from the osd ?
[18:36] * jojy (~jojyvargh@ has joined #ceph
[18:42] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[19:05] <gregaf1> ajm: yeah
[19:09] <ajm> all done, final log is like 3.5GB :D
[19:09] * sjust (~sam@aon.hq.newdream.net) Quit (Read error: Connection reset by peer)
[19:09] <ajm> its bzipping
[19:10] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[19:12] <Tv|work> sagewk: so the apache & fastcgi gitbuilder don't have dns set up, i take it...
[19:12] <Tv|work> adding..
[19:14] * adjohn (~adjohn@ has joined #ceph
[19:15] <sagewk> yeah i didn't bother
[19:16] <Tv|work> that's a good indication of how horrible the dns management is :-/
[19:26] <Tv|work> sagewk: what's the "common" and merge_repos.sh thing? what's the semantics, what should sources.list use?
[19:28] * adjohn (~adjohn@ Quit (Quit: adjohn)
[19:29] * jojy (~jojyvargh@ Quit (Quit: jojy)
[19:30] * adjohn (~adjohn@ has joined #ceph
[19:30] <sagewk> common? what are you looking at?
[19:31] <sagewk> merge_repos builds the combined/ repo that combines all the results into a single debian repo
[19:32] <Tv|work> sagewk: http://gitbuilder-apache-deb-ndn.ceph.newdream.net/output/combined/ vs http://gitbuilder-apache-deb-ndn.ceph.newdream.net/output/ref/master/
[19:32] <Tv|work> sagewk: dumb question but why?
[19:32] <Tv|work> sorry "combined" not "common"
[19:32] <sagewk> that's what the ops guys wanted
[19:33] <sagewk> a single apt source that has all the versions.. then they install the one they want.
[19:33] <sagewk> instead of switching the apt source every time.
[19:33] <Tv|work> oh yeah if you explicitly control your desired version, then i can see that making sense
[19:33] <Tv|work> for sepia "autoinstall whatever is current", that's dangerous (wip-* branches win over master)
[19:34] <sagewk> right
[19:34] <Tv|work> so using ref/master for sources.list
[19:34] <sagewk> both are there so you can pick
[19:34] <sagewk> yeah
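The two choices as sources.list entries, using the gitbuilder hostnames from the URLs above; the distribution and component fields are assumptions about the repo layout:

```
# track master only -- safe for auto-install:
deb http://gitbuilder-apache-deb-ndn.ceph.newdream.net/output/ref/master/ squeeze main

# every branch in one repo -- pick versions explicitly, since a
# wip-* branch can carry a higher version than master:
deb http://gitbuilder-apache-deb-ndn.ceph.newdream.net/output/combined/ squeeze main
```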
[19:41] <Tv|work> new sepia installs get custom apache2 etc: https://github.com/NewDreamNetwork/ceph-qa-chef/commit/fe7ccc3bab468b5cdc48a6c2a448894204e69ef3
[19:42] <Tv|work> (you can also do that incrementally, see https://github.com/NewDreamNetwork/ceph-qa-chef/blob/master/solo/solo-from-scratch )
[19:43] <Tv|work> oh except the stupid chef recipe won't *upgrade* the apache2 automatically
[19:43] <Tv|work> reading docs
[19:45] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[19:46] <Tv|work> there we go, fixed
[19:48] <sagewk> cool
[19:50] <joshd> should the naming change have made cauthtool ceph-authtool?
[19:50] <Tv|work> joshd: yes
[19:51] <joshd> the makefile wasn't updated for it, but teuthology was
[19:51] <joshd> or maybe that's an old version
[19:51] <Tv|work> joshd: i did a lot of those fixes on friday
[19:52] <Tv|work> i don't think any of the old names got any hits on the source tree anymore
[19:52] <Tv|work> i have src/ceph-authtool{,.o,.cc}, no src/cauthtool* anymore
[19:54] <joshd> yeah, the version teuthology grabbed was just old
[19:57] <joshd> the coverage gitbuilder seems to have gotten stuck at an old version
[19:59] <ajm> // wondering if this 200mb upload to the bug tracker will work
[19:59] * alexxy (~alexxy@ Quit (Remote host closed the connection)
[20:00] <ajm> Error 101 (net::ERR_CONNECTION_RESET): The connection was reset.
[20:00] <ajm> nope
[20:00] <gregaf1> joshd; I was looking at them this morning and all except for the standard 64-bit builder seem to be running way behind
[20:01] <gregaf1> did they actually get moved over to look at github?
[20:01] <gregaf1> ajm: do you have somewhere you can put it for us to grab?
[20:01] * jojy (~jojyvargh@ has joined #ceph
[20:01] <joshd> yeah, the process that fetches them is just waiting on a read in git fetch-pack
[20:01] <gregaf1> otherwise I think we have a drop zone somewhere
[20:01] * jojy (~jojyvargh@ Quit ()
[20:03] <ajm> gregaf1: yeah i do
[20:07] <ajm> gregaf1: added
[20:07] <joshd> gitbuilders are going again
[20:08] * jojy (~jojyvargh@ has joined #ceph
[20:08] <joshd> i386 and gcov were both at the same commit (d64237a6a555944d6d35676490bc4fb7c7db965d)
[20:09] <gregaf1> ajm: where can we grab it from?
[20:09] <ajm> i put in ticket notes: http://adam.gs/osd.5.log.1317056213.bz2
[20:10] <gregaf1> awesome, thanks
[20:10] <ajm> no, thank you :)
[20:12] * Nightdog (~karl@190.84-48-62.nextgentel.com) has joined #ceph
[20:13] <stass> sagewk: here?
[20:22] * alexxy (~alexxy@ has joined #ceph
[20:24] * jojy (~jojyvargh@ Quit (Quit: jojy)
[20:39] * noahdesu (~nwatkins@kyoto.soe.ucsc.edu) has left #ceph
[20:40] * noahdesu (~nwatkins@kyoto.soe.ucsc.edu) has joined #ceph
[21:18] * adjohn (~adjohn@ Quit (Quit: adjohn)
[21:41] * Dantman (~dantman@S0106001731dfdb56.vs.shawcable.net) has joined #ceph
[21:47] * adjohn (~adjohn@ has joined #ceph
[21:58] <sagewk> stass: here!
[22:12] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) has joined #ceph
[22:20] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[22:21] <stass> sagewk: hi! So how do you want me to send these FreeBSD patches? Should I create a new repo with proper commit messages, or just send them as a set of patches, or as a single patch?
[22:21] <sagewk> i can pull from wherever
[22:21] <sagewk> probably the easiest is to just rebase your patches off current master and push that branch to your github tree
[22:22] <sagewk> and i'll pull/merge from there
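The suggested workflow, demonstrated entirely with local repositories so it is self-contained; the branch name `freebsd-port` is hypothetical, and the bare `fork.git` repo stands in for the contributor's github tree:

```shell
set -e
work=$(mktemp -d) && cd "$work"

# stand-in for the upstream ceph repo
git init -q upstream && cd upstream
git config user.email dev@example.com && git config user.name dev
echo base > file.c && git add file.c && git commit -qm 'initial'
git branch -m master          # pin the branch name regardless of git defaults

# the patch series lives on its own branch
git checkout -qb freebsd-port
echo '#ifdef __FreeBSD__' > port.h && git add port.h && git commit -qm 'FreeBSD build fixes'

# meanwhile master moves on
git checkout -q master
echo more >> file.c && git commit -qam 'master advances'

# rebase the patches onto current master, then publish the branch
git checkout -q freebsd-port
git rebase -q master
git init -q --bare ../fork.git   # stand-in for the github fork
git push -q ../fork.git freebsd-port
```

After the rebase, the published branch contains master plus the patches on top, which is the shape that makes a pull/merge trivial on the other end.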
[22:22] <stass> sagewk: ok, thanks
[22:22] <stass> sagewk: will do that
[22:24] <sagewk> cool
[22:31] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[22:42] <darkfaded> sagewk: freebsd patches?
[22:42] <darkfaded> z-o-m-g?
[22:43] <gregaf1> I think stass would say they're still very much a work in progress — working on just building right now, right?
[22:44] <darkfaded> but still
[22:46] <darkfaded> having zfs and ceph on the same freebsd system in the distant future would be nice. (except the memory hunger)
[22:46] <darkfaded> one could do all kinds of weird stuff
[22:53] <stass> gregaf1: it builds successfully now
[22:54] <stass> gregaf1: I didn't test how well it works yet
[22:54] <stass> darkfaded: I was planning to look how we can integrate with ZFS
[22:54] <stass> darkfaded: and maybe doing a GEOM layer for RBD which will be really cool to have
[22:54] <darkfaded> yesh.
[22:55] <darkfaded> maybe geom first and then zfs
[22:55] <darkfaded> since it gives more opportunities
[22:55] <sagewk> stass: geom looks interesting
[22:55] <darkfaded> sagewk: freebsd clone of VxVM :)
[22:55] <sagewk> :)
[22:56] <darkfaded> stass: let me just say this is awesome.
[22:56] <darkfaded> keep doing it and if you need any testing etc. let me know
[22:57] <stass> darkfaded: thanks!
[23:35] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[23:46] * noahdesu (~nwatkins@kyoto.soe.ucsc.edu) Quit (Quit: Leaving)
[23:59] * adjohn (~adjohn@ Quit (Remote host closed the connection)
[23:59] * adjohn (~adjohn@ has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.