[1:00] <yehudasa> cmccabe: did you push -f?
[1:01] <gregaf1> by which he means: don't push -f if your config is set up to push all branches
[1:01] <gregaf1> <??? learned this the hard way in a secondary repo without his normal git settings
[1:04] <cmccabe> hmm...
[1:05] <cmccabe> I think maybe I did that from flab?
[1:05] <cmccabe> sigh
[1:05] <cmccabe> yehudasa: yeah, I can confirm that my repo on flab, which I had to use because metropolis was down, did not have push = HEAD in the .git/config
[1:06] <cmccabe> I think maybe I just need to start putting the branch name in the git push command...
[1:06] <gregaf1> only lost one commit, not a big deal :)
[1:06] <cmccabe> relying on that thing being set in the config seems like a slender thread to hang on
[1:07] <cmccabe> sorry about that anyway
[1:07] <gregaf1> I think I erased a few merges last time
[1:07] <yehudasa> cmccabe: yes.. or at least when you push -f look at the command output
[1:07] <yehudasa> you erased a couple of commits I pushed today
[1:07] <gregaf1> if you're cleverer than I am and use git to version your homedir you can put it in your global prefs
[1:08] <cmccabe> well, you can put it in your global prefs anyway I think
[1:10] <cmccabe> hmm
[1:10] <cmccabe> I don't think you can set this in the .gitconfig
[1:10] <cmccabe> it has to be in the project config
[1:16] <Tv> gregaf1: don't ever -f, say +branchname instead
[1:17] <gregaf1> cmccabe: you can, I have it in mine and I'm pretty sure it works
[1:20] <cmccabe> tv: I like the + idea, I wasn't aware of that
[1:21] <Tv> chopping off a toe is more pleasant than losing the whole foot
[1:21] <Tv> (new mnemonic: -f as in foot)
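
Tv's suggestion can be sketched as a small helper that always names the remote and the branch, and forces with a `+` refspec rather than `-f`. With `+branchname` only that one ref is overwritten, whereas a bare `git push -f` forces every ref the push would update. The helper name and the `origin`/`wip` values are hypothetical, used only for illustration.

```python
def build_push_command(remote, branch, force=False):
    """Return an argv list that pushes exactly one named branch.

    A leading '+' on the refspec forces only that ref, unlike
    'git push -f', which forces every ref included in the push.
    """
    if not remote or not branch:
        raise ValueError("remote and branch must both be given explicitly")
    refspec = ("+" if force else "") + branch
    return ["git", "push", remote, refspec]


if __name__ == "__main__":
    # Hypothetical remote and branch names.
    print(" ".join(build_push_command("origin", "wip", force=True)))
```

Naming the branch on every push also removes the dependence on a per-repository config setting, which is the "slender thread" cmccabe mentions.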
[2:43] * greglap (~Adium@ has joined #ceph
[14:01] <sugoruyo> hey folks, can someone help me figure out why my MDSs are crashing?
[14:06] * jim (~chatzilla@astound-69-42-16-6.ca.astound.net) has joined #ceph
[14:11] <sugoruyo> they seem to be in recovery but, when they reach rejoin, it says laggy or crashed next to them
[14:13] <sugoruyo> and the processes die, i'm also noticing that after a week of simply sitting there mounted by a client that wouldn't write anything to it the mds cluster is at e6403 and increasing about once every two minutes
[14:13] <sugoruyo> ceph -w just keeps churning out these:
[14:13] <sugoruyo> 2011-06-21 15:13:14.737460 mds e6407: 3/3/1 up {0=2=up:rejoin(laggy or crashed),1=1=up:rejoin(laggy or crashed),2=0=up:rejoin(laggy or crashed)}
[17:50] * greglap (~Adium@ has joined #ceph
[18:12] <sagewk> http://www.storagebod.com/wordpress/?p=699
[18:13] <yehudasa> sugoruyo: what does your mds log say?
[18:14] <yehudasa> do you have a core dump?
[18:45] <sugoruyo> yehudasa: i was afk, since this is a test system i reran mkcephfs, deleted the old logs and everything
[18:46] <sugoruyo> i don't know how to obtain a core dump, i might have some output from the mds logs though, the last few lines looked like a stack trace i think
[18:46] <yehudasa> sugoruyo: yeah.. the stack trace might be interesting
[18:47] <yehudasa> also, what version are you running?
[18:47] <sugoruyo> well before re-mkcephfs'ing i ran latest in the ubuntu repos - 1 update
[18:47] <sugoruyo> currently i'm running the latest in the repos
[18:48] <sugoruyo> trace is about 22 lines, pastie?
[18:48] <sugoruyo> http://pastie.org/2102249
[19:07] <Tv> sagewk, *: new repo ceph-qa-suite.git, new cli tool teuthology-suite, use like this: mkdir z; teuthology-suite --archive-dir=z --suite=.../ceph-qa-suite.git/ my-sepia-machines.yaml
[19:08] <sagewk> yay!
[19:09] <yehudasa> Tv: I assume these instructions are in some README?
[19:10] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:10] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit ()
[19:10] <Tv> yehudasa: not.. really..
[19:10] <Tv> the code *just* started working
[19:11] <yehudasa> Tv: oh, ok
[20:04] <wido> I removed my "latest" file on my monitor, my 'find' deleted too much stuff....
[20:04] <wido> It's not a monitor map, but what is it?
[20:09] <yehudasa> cmccabe: commit f6c7343f7581b3fcfe1d773ca3b3c997fd883a4d broke AuthNone
[20:10] <cmccabe> yehudasa: what seems to be the problem
[20:10] <cmccabe> yehudasa: vstart without auth worked fine for me, for what that's worth
[20:10] <yehudasa> vstart works for me, but other stuff doesn't
[20:11] <yehudasa> e.g., rados -p foo ls
[20:11] <yehudasa> specifically you removed the encoding of the entity name
[20:11] <cmccabe> perhaps the change to AuthNoneAuthorizer::build_authorizer is to blame
[20:12] <cmccabe> where is the decode that corresponds to this encoding?
[20:12] <yehudasa> you can't just change that protocol
[20:14] <cmccabe> we can restore the name there if you like. The cephx build_authorizer doesn't encode the name, so I was hoping I didn't need it there
[20:15] <yehudasa> well.. we need to get the entity name there
[20:15] <yehudasa> with cephx we don't need it because the tickets hold it
[20:16] <yehudasa> but with auth=none we do need it because there are no tickets
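
yehudasa's point is that with auth=none no ticket carries the client's identity, so the authorizer itself has to carry it. A minimal sketch of that idea follows, using a made-up length-prefixed layout. This is not Ceph's actual wire encoding, and the function names are hypothetical.

```python
import struct


def encode_none_authorizer(entity_name):
    """Encode the entity name as a 4-byte little-endian length plus UTF-8 bytes.

    Hypothetical layout for illustration only.
    """
    name = entity_name.encode("utf-8")
    return struct.pack("<I", len(name)) + name


def decode_none_authorizer(payload):
    """Recover the entity name; fail loudly if the sender left it out."""
    if len(payload) < 4:
        raise ValueError("authorizer too short to carry an entity name")
    (length,) = struct.unpack_from("<I", payload, 0)
    name = payload[4:4 + length]
    if len(name) != length:
        raise ValueError("truncated entity name")
    return name.decode("utf-8")


if __name__ == "__main__":
    blob = encode_none_authorizer("client.admin")
    print(decode_none_authorizer(blob))
```

If the encoder stops writing the name while the decoder still expects it, the receiving side has nothing to read, which is the kind of mismatch that can pass one test path and fail on others.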
[20:16] <cmccabe> yeah, I'm looking at CephXTicketHandler::build_authorizer now
[20:23] <cmccabe> yehudasa: 9e9cec694ac89c1c2ed162bad68ce5da362cc3b3 should fix it
[20:23] <cmccabe> yehudasa: I don't know what I was thinking there. I think I put an entry on my TODO list to track down whether we needed that entity name, but then I forgot to do it
[20:23] <yehudasa> ok, thanks
[20:34] <stingray> yehudasa: have you seen my bug about rbd offsets mismatch?
[20:34] <yehudasa> stingray: I pushed a fix yesterday, do you still see it?
[20:34] <stingray> yehudasa: I haven't recompiled anything
[20:34] <stingray> did you push it to stable?
[20:34] <yehudasa> stingray: it requires updating both osds and librados.. it was pushed to master
[20:35] <yehudasa> we can cherry-pick it to stable
[20:35] <stingray> I'm tracking stable
[20:35] <stingray> I guess I can cherrypick it myself
[20:35] <stingray> maybe
[20:36] <yehudasa> stingray: we'll send it to stable
[20:36] <stingray> shall I or will you?
[20:36] <stingray> aha
[20:36] <stingray> great
[20:37] <stingray> yehudasa: thanks!
[20:37] <yehudasa> stingray: ok, it's pushed now, let me know if it works for you.. it requires updating both sides
[20:38] <stingray> that mismatch stuff, it only affected rbd, right?
[20:38] <yehudasa> yeah
[20:38] <yehudasa> only rbd was using it.. it would have affected other stuff though because there was also an osd bug
[20:39] <stingray> okay
[20:39] <stingray> I am kicking off my rebuilds
[20:39] <stingray> will see if it helped in 1 hour
[20:39] <yehudasa> great
[20:44] <cmccabe> ah, looks like that dirfrag stuff was the last obstacle to building libcommon without globals
[21:19] <cmccabe> never mind, it's responding again
[21:19] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[21:43] <stingray> yehudasa: didn't seem to help
[21:43] <stingray> but there's a chance I messed up the update
[21:43] <stingray> [root@mpi-m2 t]# rbd --version
[21:44] <stingray> ceph version 0.29.1-8-gc48540a (commit:c48540aec6107199cc6585ec968682f43ed8c050)
[21:44] <stingray> osds are at the same ver
[21:45] <yehudasa> stingray: that's the version
[21:46] <stingray> yep
[21:46] <stingray> still export doesn't match import
[21:47] <yehudasa> stingray: can you send the log?
[21:47] <stingray> I didn't look at offsets yet, just contents
[21:47] <stingray> I am grabbing the offsets now
[21:48] <yehudasa> also, if you could compile and run http://pastebin.com/52SdBjUt on the source object it could help
[21:51] <stingray> sure I can but I doubt it's the source copy
[21:52] <yehudasa> stingray: the source object is a sparse file, and apparently its specific structure triggers a bug
[21:52] <stingray> http://pastebin.com/qAsQkW8n
[21:53] <stingray> first few lines in import log and export log
[21:53] <stingray> second extent offset matches
[21:53] <stingray> third doesn't. reading 4096 bytes at offset 1331200 vs writing 4096 bytes at ofs 2367488
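
The comparison stingray does by eye can be sketched as a walk over the two extent lists in order, stopping at the first pair that differs. Extents here are `(offset, length)` tuples. Only the third pair below comes from the log lines quoted above; the first two are placeholders, and the function name is hypothetical.

```python
def first_extent_mismatch(read_extents, write_extents):
    """Return (index, read_extent, write_extent) for the first differing pair.

    Returns None when both lists are identical. If one list is a strict
    prefix of the other, the missing side is reported as None.
    """
    for i, (r, w) in enumerate(zip(read_extents, write_extents)):
        if r != w:
            return i, r, w
    if len(read_extents) != len(write_extents):
        i = min(len(read_extents), len(write_extents))
        r = read_extents[i] if i < len(read_extents) else None
        w = write_extents[i] if i < len(write_extents) else None
        return i, r, w
    return None


if __name__ == "__main__":
    # First two extents are placeholders; the third pair is from the chat.
    export_reads = [(0, 4096), (8192, 4096), (1331200, 4096)]
    import_writes = [(0, 4096), (8192, 4096), (2367488, 4096)]
    print(first_extent_mismatch(export_reads, import_writes))
```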
[21:55] <yehudasa> hmm.. had a similar issue yesterday, I thought I fixed it
[21:57] <stingray> doesn't look fixed :)
[21:57] <stingray> so, do you still need fiemap?
[21:57] <yehudasa> hmm.. at the moment I can do without it I think
[21:58] <stingray> http://pastebin.com/82yxFKye anyway
[21:59] <yehudasa> cool, thanks
[21:59] <stingray> will go home, will be great if you ping me when you update stable again :)
[21:59] <stingray> so far this one prevents me from running qemu-kvm
[22:00] <stingray> I ported your async changes to my kvm, and it seems to be working fast and stable except for the mangled data
[22:01] <yehudasa> great
[22:08] <sagelap> let's meet at 2 for the planning mtg
[23:32] <jmlowe> quick question, is there going to be a 0.30 release?
