#ceph IRC Log


IRC Log for 2012-03-26

Timestamps are in GMT/BST.

[5:45] <iggy> Rankin: i think there has also been some work for reading from the closest replica done recently... but i may have completely misread that in here
[11:35] <rosco> Does anybody here know how Amazon ec2 works? Is it Cepht RBD like?
[11:35] <rosco> -t
[12:37] <Dieter_be> rosco: i think you mean s3, ec2 is the computing platform
[12:37] <Dieter_be> i haven't used it, but AFAIK it's a blob storage system, not a block device
[12:37] <Dieter_be> i.e. comparable to rados
[12:41] <joao> howdy all
[13:10] <rosco> dk
[13:10] <rosco> k
[13:11] <rosco> Dieter_be: I meant the elastic block storage.
[13:15] <Dieter_be> rosco: i'm not familiar with that
[13:41] <wonko_be> rosco: the radosgw normally creates a s3 compatible storage
[13:41] <wonko_be> so, it should be "the same"
[13:56] <dwm__> S3 is the RADOS-alike. EBS is the RBD-alike.
[14:02] <rosco> Ah ok
[14:03] <rosco> I am trying to figure out if i can use/test ceph rbd the same way they use EBS.
[14:25] <elder> joao, which machine showed you the bnx2 error?
[14:26] <joao> let me check
[14:26] <joao> but either plana12 or plana19
[14:29] <joao> elder, have no idea which one specifically, but for what it's worth, they are working fine and dmesg doesn't print any kind of error regarding bnx or its firmware
[14:29] <elder> OK, can I log into those machines? I won't disturb anything.
[14:30] <joao> sure
[14:31] <elder> Just a minute, grabbing some coffee first...
[14:38] <elder> When did you get those messages?
[14:57] <elder> joao, is it possible those messages showed up sometime before I pushed firmware updates out to the plana machines?
[15:00] <joao> elder, they happened this morning
[15:00] <joao> right before I sent the email to the list
[15:00] <joao> this morning as in while you were asleep :p
[15:01] <elder> I was awake.
[15:01] <elder> It showed up in your teuthology log, right?
[15:02] <joao> in teuthology output
[15:02] <joao> to the terminal
[15:02] <joao> I guess that is a log
[15:02] <elder> Ohhh, now I see, something's different.
[15:03] <elder> The firmware I installed was bnx2/bnx2-mips-09-6.2.1b.fw, but what yours is reporting is about bnx2/bnx2-mips-06-6.2.3.fw
[15:04] <elder> To be conservative, I updated only the single firmware file. Looks like maybe I should update that file as well.
[15:05] <joao> in any case, the kernel reinstall should have taken care of that, no?
[15:06] <joao> well, I gotta go grab lunch
[15:06] <joao> I'll brb
[15:06] <elder> It *should* but I think it won't (yet)
[15:41] <joao> elder, do you know if the btrfs on the testing branch usually follows btrfs releases?
[15:42] <joao> or was it just now updated to the latest btrfs version when you updated testing to 3.3?
[15:45] <elder> Since I updated testing to 3.3, btrfs will include the btrfs that was present in the 3.3 final release.
[15:46] <elder> If you want something different you will need to build your own kernel. If that's necessary I can help you with that. I.e., if you want I can set up a separate branch with what you need in it.
[15:46] <joao> ok, so last testing (3.2, I think) had the latest btrfs from that same version, right?
[15:46] <joao> if so, it might explain why I can no longer trigger 1975
[15:47] <elder> I don't know what was in it.
[15:47] <joao> Oh, okay
[15:47] <elder> If you want I can compare them.
[15:48] <joao> no need, thanks
[15:48] <elder> What were you working with before?
[15:48] <elder> OK.
[15:48] <joao> I'll let teuthology run for a couple more hours, and if it triggers nothing I'll talk with Sage about it :)
[17:19] <wido> pg v513128: 7920 pgs: 7229 active+clean, 11 down+peering, 618 down+peering, 59 down+peering, 3 down+replay+peering; 223 GB data, 679 GB used, 66318 GB / 67068 GB avail
[17:20] <wido> Just to refresh my mind: I right now have "rbd ls" blocking. Probably because some objects can't be accessed
[17:21] <wido> I lost 4 OSD's due to a btrfs crash again, replication is set to 3, but it could be I lost the wrong OSD's :(
[17:22] <wido> But, shouldn't this work? I don't have any unfound objects in PG's?
[17:23] <wido> the cluster is 'stuck' in this state, that's where it is staying now
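The summary line pasted at [17:19] can be tallied mechanically to see how many PGs are not serving I/O. The following is a minimal sketch, assuming the line keeps the "N pgs: count state, ...;" layout shown above; parse_pg_summary is a hypothetical helper written for this log, not part of any ceph tool.

```python
import re
from collections import Counter

# Summary line as pasted at [17:19].
PG_SUMMARY = (
    "pg v513128: 7920 pgs: 7229 active+clean, 11 down+peering, "
    "618 down+peering, 59 down+peering, 3 down+replay+peering; "
    "223 GB data, 679 GB used, 66318 GB / 67068 GB avail"
)


def parse_pg_summary(line):
    """Return (total_pgs, Counter mapping state -> count) for a pg summary line."""
    match = re.search(r"(\d+) pgs: ([^;]+);", line)
    if match is None:
        raise ValueError("unrecognised pg summary line")
    total = int(match.group(1))
    states = Counter()
    for entry in match.group(2).split(","):
        count, state = entry.strip().split(" ", 1)
        # The same state can be listed more than once, so accumulate.
        states[state] += int(count)
    return total, states


total, states = parse_pg_summary(PG_SUMMARY)
# A PG counts as down if "down" is one of its "+"-separated state flags.
down = sum(n for state, n in states.items() if "down" in state.split("+"))
print(total, dict(states), down)
```

With the pasted line, the per-state counts add up to the reported total of 7920, and 691 PGs carry the down flag, which fits "rbd ls" blocking on objects that live in those PGs.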
[17:31] <joao> wido, what happened with btrfs?
[18:19] <joao> does teuthology support coredumps now?
[18:20] <joao> I see something about a failed 'enable-coredump' command
[18:22] <sagewk> they should be in the coredump subdir of the archive results
[18:24] <joao> seen it
[18:24] <joao> thanks
[18:28] <joao> sagewk, how was your flight? still jetlagged? :)
[18:29] <sagewk> not too bad :) slept well this weekend
[18:41] <wido> joao: I'm not sure what happened.
[18:41] <wido> I saw a slowdown (3.2 kernel) and I stopped the OSD's, unmounted btrfs, mounted again and then a couple of FS'es wouldn't mount
[18:41] <wido> open_ctree failed
[18:42] <wido> joao: http://pastebin.com/gVHDgBn4
[18:42] <wido> that is what I got when I executed the rmmod
[18:45] <joao> wido, nothing else before that?
[18:45] <joao> maybe during umount, or even prior to that?
[18:46] <joao> this looks just like fallout from a previous problem
[18:47] <sagewk> wido: there?
[18:49] <wido> sagewk: yes
[18:49] <sagewk> are you able to reproduce the heartbeat crash(es) with logs?
[18:49] <wido> joao: No, no other messages. The FS had been mounted for over a month or so
[18:50] <sagewk> going over the code and the problem isn't jumping out at me..
[18:50] <wido> sagewk: That is what I tried. Problem is, they seem to cause problems with another. So when I (re)start one with higher logging, another one dies
[18:50] <wido> having 40 OSD's run on debug osd = 20 is kind of heavy
[18:51] <sagewk> ah. maybe just a few with high logging?
[18:51] <wido> yes, but I'm never sure which one dies
[18:51] <wido> I'll start them again with that code and see what happens
[18:51] <sagewk> hmm
[18:51] <sagewk> ok thanks
[18:51] <sagewk> could also bump up just the hb related dout's ...
[18:52] <joao> sagewk, do you usually update btrfs on the testing branch to match its current state?
[18:53] <sagewk> wido: pushed new wip-osd-hb that elevates all the hb debugs only
[18:53] <sagewk> do you mind trying that with normal debug levels (but debug ms = 1)...?
[18:55] <wido> sagewk: sure! I'll do
[19:08] <wido> joao: I checked, these FS'es were mounted since March 5th
[19:09] <wido> I had 4 who wouldn't mount after the unmount
[19:09] <joao> no stack traces on the logs since then?
[19:10] <wido> nope, nothing. All I got is on pastebin
[19:10] <wido> sagewk: fyi, #2212 is the one I was seeing the most.
[19:10] <sagewk> ok, let me push that fix
[19:10] <wido> the other one I only saw twice I think
[19:10] <joao> wido, ok, which kernel version?
[19:10] <Tv|work> +*Activing set*
[19:10] <wido> joao: 3.2.0+
[19:10] * Tv|work looks at sagewk, gregaf
[19:11] <joao> wido, k thanks
[19:11] <gregaf> Tv|work: hey, I didn't review the amended version!
[19:11] <wido> joao: I saw my OSD's run into trouble. They started committing suicide
[19:12] <wido> due to the 180 sec I/O timeout
[19:12] <wido> no hardware issues, disks are running fine
[19:12] <joao> oh
[19:12] <sagewk> tv: fixing :)
[19:12] <joao> sagewk, can wido's issue possibly be related with the latencies we've seen?
[19:13] <joao> would the OSD run into trouble if an operation took way too long?
[19:13] <nhm> joao: which latencies are you thinking of specifically?
[19:13] <wido> joao: When I saw the slowdown I stopped the OSD's, unmounted, rmmod and then tried to mount, which went fine 36/40 times
[19:14] <joao> nhm, slowdowns when (meta)data operations are run concurrently with a snapshot
[19:14] <nhm> joao: interesting.
[19:15] <joao> nhm, indeed, but a headache to pinpoint
[19:15] <sagewk> nhm: skype!
[19:16] <sagewk> nhm: actually, let's use phones..
[19:17] <joao> sagewk, just logged on to skype
[19:17] <elder> sagewk, try again if you want to call.
[19:17] <nhm> skype today?
[19:19] <elder> Sound is pretty bad on telephone.
[19:20] <elder> Correction. TERRIBLE.
[19:20] <nhm> IRC meetings++ ;)
[19:20] <elder> joao, Sage doesn't have your number.
[19:20] <elder> nhm, is it better for your connection?
[19:20] <joao> elder, just told him the number on privmsg
[19:20] <nhm> nope
[19:21] <nhm> I can hear some brief sentence fragments. :)
[19:21] <elder> I ca oo. Bu lly ot ver seful.
[19:22] <joao> fill the blanks?
[19:22] <elder> ex.
[19:22] <elder> es.
[19:23] <elder> Fr d an sl p.
[19:29] <elder> I guess it's over.
[19:30] <nhm> elder: could you hear me?
[19:30] <nhm> you came through relatively well.
[19:30] <elder> I don't know what they are using for a microphone--maybe Sage's phone or something. Whatever it is, it doesn't work.
[19:30] <elder> Way too much echo cancellation or something. You came through fairly well.
[19:32] <sagewk> sorry guys, conf rooms are taken up and the floater external mic has disappeared.. will track that down for a viable backup
[19:32] <elder> OK.
[19:32] <nhm> sagewk: np, we may just need to follow up on IRC. :)
[19:33] <elder> It would be good to figure out something that does work though. As in the past, any of these meetings that I call into are just a waste of time.
[19:33] <sagewk> yeah
[19:33] <nhm> sagewk: btw, vidyo seems to be randomly crashing for me. Not sure how much of a problem it is yet...
[19:34] <nhm> It's happened twice so far, but I've only connected with it a handful of times.
[19:35] <joao> sagewk, btw, I would say that the vid-yo echo is on your side, probably with the speakers and the microphone
[19:36] <joao> the four of us were on vid-yo and there was no echo that I could notice
[19:38] <elder> dmick, Tv|work, nhm, can we have a brief discussion about how we'll complete the task of ensuring firmware is loaded on plana systems?
[19:38] <elder> It can be a call in a different forum if you like.
[19:38] <sagewk> tv and dmick are talking about it in the hallway right now
[19:38] <elder> OK.
[19:39] <elder> joshd, would you mind giving me about five minutes to walk through how best to get some rbd devices going using teuthology?
[19:39] <elder> Is someone else a better candidate?
[19:40] <joshd> elder: I can help (I wrote the task to do it)
[19:40] <elder> Expert!
[19:48] <Tv|work> elder: teuthology/task/kernel.py will ensure an up-to-date linux-firmware.git checkout in /lib/firmware/updates before installing a kernel; nhm is working on that code
[19:49] <nhm> yeah, I'm finishing up some code now. Need to do some testing locally and then test on plana.
[19:50] <dmick> btw, I doubt it's an issue, but we only know for sure that works on Oneiric
[19:50] <dmick> (it was a local package mod to add that search path)
[19:50] <dmick> "local" to Ubuntu
[19:51] <elder> dmick, et al, I don't know why we don't just replace what was there with the latest version from linux-firmware.git rather than messing with the updates directory.
[19:51] <elder> It ought to be harmless for older kernels, and helpful for newer ones.
[19:51] <dmick> well, it breaks the packaging, but I'm also not sure why we care
[19:52] <Tv|work> elder: i don't want to step on the toes of linux-firmware*.deb
[19:52] <mgalkiewicz> hello guys, one of my osds is not transitioning from booting to running state
[19:52] <Tv|work> all about plana is currently ubuntu-specific anyway
[19:53] <elder> So are you talking about... Once all is dpkg dependency controlled then we should be fine?
[19:53] <mgalkiewicz> http://pastie.org/3673111
[19:53] <elder> I want to be able to work with bleeding-edge kernels.
[19:53] <nhm> does ubuntu follow debian's convention for using /usr/local/lib/firmware for manually installed firmware?
[19:54] <Tv|work> elder: i'm talking about making nightly run kernel upgrades safe
[19:54] <Tv|work> elder: you either rely on the nightlies to have firmware from last night, or you update it manually
[19:54] <dmick> nhm: no
[19:55] <elder> Tv|work, if the testing kernel gets updated (as it just was), to a version for which the currently-installed linux-firmware package does not have the required firmware, then we hit this problem again.
[19:56] <dmick> nhm: FIRMWARE_PATH = \"/lib/firmware/updates/\", \"/lib/firmware/\"
[19:56] <Tv|work> elder: the nightly runs will grab latest firmware before installing a kernel
[19:56] <dmick> that's why we're using ..updates
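The point of the updates directory is search order: the first directory in FIRMWARE_PATH that holds the requested file wins, so a file dropped into /lib/firmware/updates/ shadows the packaged copy without touching it. A minimal sketch of that lookup, assuming only the two-entry path dmick quotes; find_firmware is a hypothetical helper for illustration and is not the kernel's or udev's actual loader code. The demonstration runs against throwaway directories rather than the real /lib/firmware.

```python
import os
import tempfile

# Search order quoted at [19:56]; earlier entries take precedence.
FIRMWARE_PATH = ["/lib/firmware/updates/", "/lib/firmware/"]


def find_firmware(name, search_path=FIRMWARE_PATH):
    """Return the first existing file for `name` along search_path, else None."""
    for directory in search_path:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Build a fake tree with the same firmware file in both locations.
with tempfile.TemporaryDirectory() as root:
    updates = os.path.join(root, "updates")
    fw = "bnx2/bnx2-mips-09-6.2.1b.fw"
    for directory in (updates, root):
        os.makedirs(os.path.join(directory, "bnx2"), exist_ok=True)
        with open(os.path.join(directory, fw), "w") as handle:
            handle.write("placeholder")
    found = find_firmware(fw, [updates, root])
    found_in_updates = found == os.path.join(updates, fw)
    print(found_in_updates)
```

Because the packaged directory is only consulted when the updates directory has no match, removing the override file restores the packaged firmware with no dpkg involvement.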
[19:56] <elder> Latest firmware from git, or from a package?
[19:56] <nhm> dmick: is updates written to by packages?
[19:56] <elder> Oh
[19:56] <Tv|work> elder: there is no package of linux-firmware.git
[19:56] <elder> Is updates then ignored by the packages?
[19:56] <dmick> nhm: not as far as I know. elder: as far as I know.
[19:56] <elder> OK.
[19:57] <nhm> dmick: :)
[19:57] * nhm keeps forgetting that python doesn't get executed remotely by tasks. bah.
[19:57] <Tv|work> nhm: that's why i want to use pushy in the next grand refactor ;)
[19:58] <elder> OK, so: nightly runs will update the firmware on the target node using content of linux-firmware.git, before installing an updated kernel. And nhm is working on the code to make this the case. Correct?
[19:58] <joshd> mgalkiewicz: could you restart that osd with --debug-osd=20 --debug-filestore=20 --debug-ms=1? that log doesn't tell us much
[19:58] <dmick> elder: yes. manual installers will still have to exercise caution
[19:58] <Tv|work> elder: yes
[19:59] <mgalkiewicz> joshd: yep w8 a sec
[19:59] <elder> And...
[20:00] <elder> if I want to run a private test using teuthology (*not* under the nightly/daily run or whatever) then I need to make sure that any such dependencies on firmware are satisfied on my own--manually--before testing.
[20:00] <elder> Correct?
[20:00] <dmick> if you use the kernel task, it should take care of it
[20:01] <elder> Is the kernel task the thing that installs a kernel I specify (by sha1, branch, or tag)?
[20:01] <Tv|work> and if you don't, at least the window of getting hurt is smaller
[20:01] <Tv|work> elder: "tasks:\n- kernel:\n ..."
[20:01] <Tv|work> hence, the kernel task
[20:02] <elder> Excellent.
[20:02] <elder> I think all is well then. Just waiting for Mark/nhm to finish. Thank you.
[20:05] <mgalkiewicz> joshd: http://pastie.org/3673183
[20:08] <sjust> mgalkiewicz: did you recently upgrade?
[20:09] <mgalkiewicz> sjust: no I still have version 0.43
[20:09] <sjust> was it installed at 0.43?
[20:09] <mgalkiewicz> no my first installation was 0.39 If I am not mistaken
[20:09] <sjust> ok
[20:10] <sjust> did this begin shortly after your upgrade to 0.43?
[20:10] <mgalkiewicz> no
[20:11] <mgalkiewicz> almost 3 weeks after upgrade
[20:12] <sjust> ok
[20:16] <mgalkiewicz> any ideas?
[20:19] <joshd> mgalkiewicz: the pg info ending at 0.0 is a bug
[20:19] <joshd> I think someone else ran into it a week or two ago, let me check
[20:24] <mgalkiewicz> joshd: I have just figured out what caused the problem
[20:25] <mgalkiewicz> joshd: my ipsec tunnel among the ceph machines was not able to transfer packets after a network failure
[20:26] <mgalkiewicz> joshd: one osd is still down but now the stats show that the cluster is degraded; it should fix itself, right?
[20:27] <joshd> mgalkiewicz: yeah, if it doesn't let us know - also if those log bound mismatches don't disappear it could be a problem
[20:31] <joshd> mgalkiewicz: you're welcome
[20:48] <joao> just wondering, are the teuthology kernels compiled with debug info?
[21:14] <sagewk> tv|work: why not just make the kernel task do the git update on the firmware? or is that what it's doing?
[21:14] <Tv|work> sagewk: that's the plan
[21:15] <sagewk> cool
[21:17] <sagewk> joao: still around?
[21:21] <joao> sagewk, yes
[21:21] <joao> am about to leave for dinner though
[21:22] <sagewk> if you rebase your branch on top of master you'll see the new global_init() arg where you feed the default config settings/args
[21:22] <sagewk> will you be around later?
[21:22] <joao> sagewk, yes
[21:22] <sagewk> k ping me later and we can skype about it
[21:22] <joao> in about an hour
[21:22] <joao> ok
[21:23] <joao> brb then
[21:50] <krisk> /clear/clear
[21:50] <krisk> hi there.
[21:51] <krisk> I'm just trying to set up ceph on 3 ubuntu oneiric nodes. but it seems like most documentation i find is either out of date, or i'm doing stuff horribly wrong. is there any basic setup information you guys can recommend?
[21:51] <krisk> i had a look at https://github.com/nugoat/ceph/blob/master/doc/ops/install/ceph-ubuntu-howto/ceph-installation.rst
[21:51] <krisk> doesn't help me understanding and getting it to work much though
[21:53] <dmick> krisk: official archives are at github.com/ceph/ceph; not sure how current that fork is, but..
[21:54] <krisk> i understand that ceph is under active development, does it actually make sense to use the packaged versions of ubuntu? or should i rather use the current HEAD?
[21:55] <dmick> you'll have a much better time with the current HEAD I believe
[21:55] <nhm> krisk: The versions that come with the distributions are pretty old...
[21:55] <dmick> and documentation can be a bit spotty, it's true
[21:56] <gregaf> I thought our ubuntu packages were pretty up-to-date
[21:57] <gregaf> oh, n/m, looks like it's got .41 right now
[21:57] <gregaf> but don't use HEAD, use our packages
[21:57] <dmick> hm. my Oneiric actually shows 0.34 as latest
[21:57] <nhm> gregaf: looks like oneiric is still on 0.34-1 according to this: http://packages.ubuntu.com/oneiric/ceph
[21:58] <gregaf> yeah, I was looking at 12.04 :)
[21:59] <krisk> 0.34-1 here
[21:59] <darkfader> yeah thats just a little too old to use ;)
[21:59] <krisk> good to know :) are there .deb repositories you guys maintain for quicker releases? or do i have to manually download deb packages or compile from source?
[22:00] <sagewk> ceph.newdream.net/debian
[22:01] <sagewk> there's a link to info in the release notes on the site
[22:01] <sagewk> http://ceph.newdream.net/news/
[22:04] <krisk> the link i found to the .asc key was broken, where can I get it for this repo?
[22:04] <krisk> ah, this probably: https://raw.github.com/ceph/ceph/master/keys/autobuild.asc
[22:05] <sagewk> krisk: that is used for autobuilt packages from git, not releases
[22:05] <sagewk> http://newdream.net/~sage/pubkey.asc
[22:06] <Tv|work> krisk: http://ceph.newdream.net/docs/master/ops/install/mkcephfs/
[22:06] <Tv|work> (versus http://ceph.newdream.net/docs/master/ops/autobuilt/ for autobuild.asc)
[22:07] <krisk> mmh, apt didn't complain when I used the other key with http://ceph.newdream.net/debian/
[22:08] <krisk> Tv|work: haha now in this page there's even another key listed: https://raw.github.com/ceph/ceph/master/keys/release.asc
[22:09] <krisk> the install docs seem pretty useful, thx
[22:11] <joao> sagewk, ping
[22:24] * imjustmatthew (~imjustmat@pool-96-228-59-130.rcmdva.fios.verizon.net) has joined #ceph
[22:26] <imjustmatthew> I'm getting a bunch of MDS messages about " mismatch between child accounted_rstats and my rstats!" is this just noise from the logging level or an actual problem with the directory?
[22:29] <gregaf> imjustmatthew: it's a minor problem with the directory
[22:30] <gregaf> recursive statistics (size, time, etc) are kept on each inode, and that message means that the directory inode statistics don't match up with the statistics on all its children
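The consistency condition gregaf describes can be illustrated with a toy model: each inode records a recursive total, and that total should equal its own contribution plus the totals of everything beneath it. This is only a sketch under stated assumptions; the Inode class, its field names, and find_mismatches are hypothetical and do not reflect the MDS's real data structures or its accounted_rstats bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    name: str
    own_bytes: int = 0  # bytes held directly by this inode
    rbytes: int = 0     # recursive byte total recorded on the inode
    children: list = field(default_factory=list)


def compute_rbytes(inode):
    """Recompute the recursive byte total from the tree itself."""
    return inode.own_bytes + sum(compute_rbytes(c) for c in inode.children)


def find_mismatches(inode):
    """Return names of inodes whose recorded rbytes differs from the recomputed total."""
    bad = []
    if inode.rbytes != compute_rbytes(inode):
        bad.append(inode.name)
    for child in inode.children:
        bad.extend(find_mismatches(child))
    return bad


file_a = Inode("a", own_bytes=10, rbytes=10)
file_b = Inode("b", own_bytes=20, rbytes=20)
# "home" carries a stale total (25) although its children sum to 30.
home = Inode("home", rbytes=25, children=[file_a, file_b])
root = Inode("/", rbytes=30, children=[home])

mismatches = find_mismatches(root)
print(mismatches)
```

In this toy tree only "home" is flagged: its recorded total disagrees with its children, while the root and the files are internally consistent.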
[22:32] <imjustmatthew> gregaf: can I manually update the rstats or do I need to remove the directory?
[22:33] <gregaf> hrm, no, you can't manually update them
[22:33] <gregaf> if you have logs we could look through them at some point and see what happened
[22:34] <imjustmatthew> gregaf: would the problem have come from the client or the MDS?
[22:34] <gregaf> it's all MDS internal stuff
[22:36] <imjustmatthew> how high do you need the MDS logs to see what's going wrong?
[22:37] <gregaf> if you had logs turned on at 5 or higher they'll have some clues
[22:38] <gregaf> but the problem was in the past and is unlikely to be revealed by the current state, so don't bother trying to generate any
[22:40] <imjustmatthew> It's happened on every test cluster so far, I'll just nuke it again and see what happens
[22:40] <gregaf> oh, interesting
[22:40] <imjustmatthew> I'll let you know how it goes, and thanks again for your help
[22:41] <gregaf> in that case, yeah, generate some logs and we'll take a look at them when we can grab some time
[22:41] <gregaf> let me check what level we'll want
[22:41] <imjustmatthew> yeah, home directories are surprisingly tough on filesystems
[22:41] <imjustmatthew> k
[22:41] <gregaf> if you've got the disk space "debug mds = 20" would be nice
[22:42] <imjustmatthew> any ballpark on how much space I need for that?
[22:43] <gregaf> I don't know, unfortunately
[22:43] <gregaf> how long does it take you to reproduce?
[22:44] * Theuni (~Theuni@ has joined #ceph
[22:44] <imjustmatthew> no more than a day or two, sometimes in the first copy, sometimes after the client's had a little bit of time to work on it
[22:44] <imjustmatthew> I've had client stability issues too, so the crashes could be a trigger
[22:45] <gregaf> I wouldn't expect so; these structures all live on the MDS and are only indirectly changed by clients
[22:45] <gregaf> try it out with "debug mds = 20"; if that's too much then "debug mds = 10" would help too and produce a lot less logging
[22:46] <imjustmatthew> k, I'll do that
[22:47] <gregaf> thanks!
[22:52] <Tv|work> krisk: that's the same key, just saner place to host it.. e.g. https
[22:56] <krisk> ok good to know. installation worked. i'll have a closer look tomorrow
[23:28] <dmick> elder, Tv|work: fwiw, check out README.AddingFirmware in the firmware/ directory
[23:29] <Tv|work> no mention of managing versioned dependencies
[23:30] <dmick> right. but apparently an old decision to abandon that tree
[23:30] <Tv|work> yeah, makes sense
[23:30] <Tv|work> as in, this is what we're able to *bundle with GPLed files*
[23:44] * nhm gets ready to break stuff
