#ceph IRC Log


IRC Log for 2011-01-13

Timestamps are in GMT/BST.

[0:03] * bchrisman (~Adium@ has joined #ceph
[0:05] * bchrisman (~Adium@ Quit ()
[0:08] <gregaf> stingray: I'm looking at your log file and I can tell which assert it's failing and stuff, but I can't find the commit it says it is anywhere in our tree
[0:08] <gregaf> where'd you get your source and did you make any changes to it?
[0:10] <stingray> I did make changes
[0:10] <stingray> the changes are unrelated
[0:10] <stingray> the base commit is the one I told you the other day
[0:10] <gregaf> okay
[0:10] <gregaf> what've you added?
[0:11] <stingray> http://stingr.net/d/stuff/what-i-added.diff
[0:13] <sagewk> stingray: btw that should be fixed by commit:15dcc65199fc825ca8c51a31de3be01410aca9c1
[0:18] <stingray> sagewk: yes, I know
[0:18] <stingray> I haven't redeployed anything since you fixed it
[0:18] <stingray> I guess I'm waiting for full moon or something.
[0:18] <sagewk> ok just checking :)
[0:18] <stingray> :)
[0:19] <stingray> np, I just want to get down to this journal problem I discovered
[0:19] <stingray> I don't want to repopulate the cluster with the data
[0:19] <stingray> and it's pretty dumb failure mode in general - if mds journal is corrupted, the cluster is fubar
[0:20] <stingray> just append 0xdeadbeef to it and laugh
[0:20] <stingray> hopefully gregaf will get something useful out of it
[0:21] <gregaf> stingray: we're working on recovery from that for the next release
[0:21] <stingray> I can't do a proper debugging now, too much stuff at work :(
[0:21] <stingray> gregaf: great :)
[0:22] <Tv|work> i don't see anything in autotest so far that would make it understand different kinds of machines
[0:22] <Tv|work> say "i want 1 storage box and 2 client machines"
[0:22] <Tv|work> hmm "mappings"
[0:22] <sagewk> i think this is near the top of the list of questions we'll be asking the qa candidates
[0:24] <cmccabe> tv: I'm having trouble finding the main docs
[0:24] <cmccabe> tv: I found a whitepaper and slides, which were pretty encouraging
[0:25] <Tv|work> cmccabe: their wiki is decent
[0:25] <Tv|work> i already know how to write single-machine tests, or tests with homogenous machine pools
[0:26] <Tv|work> though i haven't yet seen how to actually launch them ;)
[0:26] <cmccabe> sagewk: I hope we don't get too specific with questions for the QA candidates... I hate it when interviewers ask "how many years of experience do you have with framework X?"
[0:28] <cmccabe> sagewk: on the other hand, it's important that they have a general idea of process
[0:28] <cmccabe> sagewk: I have a friend who would be a great candidate, too bad he's happily employed at the moment :P
[0:29] <cmccabe> tv: one of the cool things for autotest is that you can write single-machine tests that run from the command line, and multi-machine tests that are queued, using the same framework/code
[0:30] <sagewk> cmccabe: i want experience with setting up automated test frameworks. that should imply familiarity with some of them.
[0:30] <Tv|work> cmccabe: yup, that was sweet
[0:30] <cmccabe> sagewk: yeah, I guess that's very important for our specific needs
[0:31] <cmccabe> sagewk: but if someone has experience with some proprietary automated test framework, or something that we can't use for another reason, that experience may transfer
[0:31] <cmccabe> sagewk: that was my main idea
[0:31] <sagewk> btw let's skip a second mtg this afternoon and discuss tomorrow morning
[0:31] <sagewk> yeah definitely
[0:31] <cmccabe> sagewk: k
[0:39] * gregorg (~Greg@ has joined #ceph
[0:39] * gregorg_taf (~Greg@ Quit (Read error: Connection reset by peer)
[0:43] <Tv|work> it seems autotest is heavily biased towards "all machines i manage are part of the same test, one test at a time"
[0:43] <Tv|work> which means we'd run one autotest server per cluster
[0:43] <Tv|work> and all the jobs it runs always use the whole cluster
[0:44] <Tv|work> digging source...
[0:45] <Tv|work> yet other parts talk about "when a machine comes available"
[0:45] <Tv|work> ok so the tests specify how many machines they need, that seems to be it
[0:45] <Tv|work> it's not "what you got", it's "gimme this many"
[0:45] <Tv|work> and, assumption of homogenous pool seems very strong
[0:46] <Tv|work> ooh metahosts
[0:47] <cmccabe> tv: looking at the whitepaper, there are 3 levels: autotest_client (single-machine, runs on test machine), autotest_server (controls autotest_clients, can be used to do client-server setup, etc), and autotest_frontend (web frontend, queue manager, database with results, etc.)
[0:48] <Tv|work> cmccabe: this is pretty much all about server
[0:48] <Tv|work> the wiki is way better than the whitepaper
[0:48] <cmccabe> tv: yeah, we'd want to be at the server level most of the time.
[0:48] <cmccabe> tv: but I think that even at that level, you can still run locally without queue management. Which is a plus.
[0:49] <Tv|work> at this point i'm not yet convinced autotest can do what we need
[0:49] <Tv|work> the metahosts thing is the first promising find
[0:49] <cmccabe> tv: I'm trying to figure out if you can make different machines have different properties... that is important because we have some machines we want to be used with kernel clients, and others with servers
[0:49] <Tv|work> and it's barely documented :(
[0:49] <cmccabe> tv: the nice thing about autotest is that it's written expecting kernel crashes, machines locking up, and all that.
[0:50] <cmccabe> tv: we will experience that when testing the ceph kernel client
[0:50] <cmccabe> tv: but I agree, we need some way to request resources by type, rather than just number of nodes
[0:51] <sagewk> autotest may also be useful for client side testing only
[0:52] <cmccabe> sagewk: yeah, it's very kernel-focused, which is a good thing for client-side
[0:52] <cmccabe> sagewk: still I think using multiple frameworks is something to avoid if possible
[0:54] <cmccabe> but I'm pretty far from being an expert on this stuff-- heck, I was dumb enough to not know about autotest prior to this...
[0:54] <cmccabe> tv: re: your question about machine type
[0:54] <cmccabe> tv: maybe this provides some insight? http://autotest.kernel.org/attachment/wiki/WebFrontendHowTo/hostlist.png
[0:55] <cmccabe> tv: seems like you can specify a platform for the hosts
[0:55] <Tv|work> that's one way kernel.org uses the metahosts
[0:56] <Tv|work> i'm still looking for the nuts&bolts of that
[0:56] <cmccabe> tv: we could set "kclient" as one kind of platform, and "server_machine" as another
[0:56] <cmccabe> tv: ah, so metahosts is extensible..
[0:56] <Tv|work> best i've seen so far, outside of the source: http://autotest.kernel.org/wiki/AdvancedJobScheduling
[0:57] <Tv|work> seems clumsy and limited :
[0:57] <Tv|work> :(
[0:57] <Tv|work> and i haven't seen an example of one server job grabbing two different kinds of metahosts at once, etc
[1:01] <Tv|work> well i know how to do it by allocating a physical machine and running kvms in there
[1:02] <cmccabe> tv: yeah, but that is not useful for integration testing.
[1:02] <cmccabe> tv: or at least performance testing
[1:03] <cmccabe> tv: the atomic groups concept could be useful. I think we are going to want to do some performance tests by grabbing a whole rack, to make sure we have locality.
[1:03] <cmccabe> tv: or rather, machines that reside on a single rack. Hopefully the framework doesn't force us to grab every rack machine at once, all the time
[1:03] <Tv|work> frankly, that bit of the resource allocation sounded very inflexible
[1:04] <Tv|work> i'm all for being able to say "i need 5 machines in the same rack", but atomic groups seems to mean those 5 must *always* go together
[1:04] <Tv|work> still need more concrete examples
[1:05] <cmccabe> tv: "Some tests can run against a variable number of machines, and you may wish to run such a test against all the ready machines within an atomic group, within some bounds. The scheduler can do this for you -- at job run time, it will verify all machines in the group and use all the ones that are ready"
[1:05] <cmccabe> tv: maybe I'm misleading myself, but that seems to imply that you can run against a subset of the rack machines?
[1:06] <cmccabe> tv: you know, we should find out if these guys have a mailing list or something.
[1:06] <Tv|work> cmccabe: but you still need to ask for that atomic group etc
[1:06] <Tv|work> it's not like the machines could participate in anything else, while idle
[1:07] <cmccabe> tv: yeah, it's unclear whether machines can be in two atomic groups, or how the scheduler interacts with them after they're in a group, etc.
[1:09] <Tv|work> well, if nothing else, there's a way to plug in another resource allocator
[1:16] <jantje> hi !
[1:16] <sagewk> jantje: hi!
[1:16] <jantje> cool, you're alive :)
[1:17] <jantje> did you get my emails?
[1:17] <jantje> (some bug i hit was already in the tracker .. sorry)
[1:18] <jantje> (563 and 666)
[1:19] <sagewk> which emails?
[1:19] <sagewk> 666 i'm working on now
[1:20] <sagewk> 563 ball is somewhat in the btrfs court, tho i haven't had time to push it
[1:23] <jantje> i just resent the email
[1:23] <jantje> some traces you wanted to see with my value too large issue
[1:23] <sagewk> oh i see it now.
[1:23] <jantje> dirsize, norbytes stuff
[1:24] <sagewk> now #706. will look at that next.
[1:24] <jantje> great
[1:25] <jantje> it's 1:24, so i'm going to get some sleep
[1:25] <jantje> let me know what you want me to try
[1:25] * bchrisman (~Adium@ has joined #ceph
[1:26] <sagewk> k
[1:26] <jantje> too bad you just sent patches to linus :-)
[1:27] <jantje> thanks sagewk, i really appreciate the effort
[1:28] <sagewk> it's a bug, doesn't much matter if its in the window
[1:29] <jantje> ow i see
[1:29] <jantje> nite!
[1:31] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[1:37] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[1:50] * gnp421 (~hutchint@c-75-71-83-44.hsd1.co.comcast.net) has joined #ceph
[1:53] <Tv|work> it seems very much that a job can only do one "metahost" trick
[1:53] <Tv|work> which means we can't say "client and a server, please"
[1:54] <cmccabe> tv: I think we should ask about this on the mailing list
[1:54] <Tv|work> yeah
[1:54] <Tv|work> i've been through projects that use autotest and don't see anyone doing anything like this
[1:54] <cmccabe> tv: I've run through the comments on the autotest mailing list but didn't see any really major discussion of metahost
[1:56] <cmccabe> tv: well, I guess there was a discussion of "pluggable metahost handlers"
[2:04] <yehudasa> jantje: just pushed a fix for #706, you can pick it up (commit db5f8e2)
[2:05] <DJL> eheh; guys me about to post some hefty questions on the mailinglist, @.@;
[2:09] * earth (~summer@75-137-144-177.dhcp.gwnt.ga.charter.com) Quit (Quit: gnitiuQ)
[2:10] * Tv|work (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[2:20] <DJL> hm,, ceph-devel@vger.kernel.org can be busy sometimes? sent message 10min ago, but havent received;;
[2:21] <DJL> argh, rejected cuz it contained html link;
[2:45] * DJL (82d8d198@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[3:14] * DJL (82d8d198@ircip2.mibbit.com) has joined #ceph
[3:15] <DJL> is mailinglist working if anyhow to check?; ive sent messages to it, but still havent received..
[3:17] <cmccabe> is your post "Ceph 0.23.2 Consolidated questions"?
[3:17] <cmccabe> if so, your post is up there.
[3:17] <DJL> yes
[3:17] <DJL> ok, it went through;
[3:18] <DJL> thought i'd receive the message back from it first;
[3:20] * gnp421 (~hutchint@c-75-71-83-44.hsd1.co.comcast.net) Quit (Ping timeout: 480 seconds)
[4:29] * bchrisman (~Adium@ has joined #ceph
[4:33] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[7:00] * ijuz__ (~ijuz@p57999889.dip.t-dialin.net) Quit (Ping timeout: 480 seconds)
[7:09] * ijuz__ (~ijuz@p4FFF6612.dip.t-dialin.net) has joined #ceph
[7:54] * shdb (~shdb@217-162-231-62.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[8:05] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[9:15] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[9:49] <MichielM> morning all
[10:18] * Yoric (~David@ has joined #ceph
[10:33] <jantje> hi
[10:33] <jantje> yehudasa: great, i'm going to test it right away!
[10:50] * verwilst (~verwilst@router.begen1.office.netnoc.eu) has joined #ceph
[11:15] * SimonB (~Simon@mail.openminds.co.uk) has joined #ceph
[11:15] <SimonB> Morning all.
[11:19] <SimonB> silly question of the day. After rebooting both nodes the osd wont start due to: "failed to open journal /var/run/ceph/journal.osd.0: No such file or directory" Which is true, it no longer exists. 'journal dio' is set to false in the ceph.config. I'm getting there slowly, honest.
[11:31] <jantje> so your journals are deleted when you rebooted
[11:32] <jantje> journals potentially contain data
[11:32] <SimonB> yes, certainly seems to be.
[11:32] <SimonB> which explains why I also got a snapshotting error.
[11:32] <jantje> that can be replayed when something goes wrong
[11:32] <jantje> i think ceph won't even start with a missing journal ?
[11:32] <jantje> i have my journal on a /dev/shm
[11:33] <SimonB> mon and mds are running fine, but surprisingly the osd wont
[11:33] <SimonB> unsurprisingly sorry
[11:33] <jantje> and I have to do mkcephfs every time
[11:33] <jantje> yes, because the osd's need the journal
[11:33] <jantje> I don't know if you can just touch the fale and try again
[11:33] <jantje> *file
[11:34] * jantje not an ceph expert :-)
[11:34] <SimonB> otherwise its mkcephfs again? but if this happens whenever I reboot the main node then this will be an issue. So where should I keep my journal? heh
[11:35] <jantje> yehudasa: #706 verified, it works!
[11:35] <jantje> on a place where files don't get deleted
[12:17] * allsystemsarego (~allsystem@ has joined #ceph
[12:26] * votz (~votz@dhcp0020.grt.resnet.group.upenn.edu) has joined #ceph
[12:35] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[13:00] * bchrisman (~Adium@ has joined #ceph
[13:08] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[14:08] * SimonB (~Simon@mail.openminds.co.uk) Quit (Quit: gone)
[15:09] <jantje> I had something really weird:
[15:10] <jantje> i tried to do a 'cd make' -> no such file or directory
[15:10] <jantje> then i did a 'ls'
[15:10] <jantje> (and verified there indeed was a directory 'make' !)
[15:10] <jantje> and then 'cd make' worked again
[15:11] <jantje> i recently had some osd's crashed (the famous eval_repop), but my mds's look just fine
[15:29] <jantje> and lots of shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
[15:29] <jantje> it's all random
[15:29] <jantje> time to restart I guess :-)
[15:32] <jantje> [14668.832046] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds. Have a nice day...
[15:32] <jantje> hehe
[16:04] <MichielM> oops :)
[16:40] * sakib (~sakib@ Quit (Quit: leaving)
[16:55] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[17:00] * billy (~quassel@ has joined #ceph
[17:01] <billy> need some help setting up a two physical machine test environment
[17:01] <billy> can someone walk me through it?
[17:02] <billy> I have two machines with ubuntu 10.10 server installed, and I've already downloaded rc from git, compiled and installed on both
[17:12] <jantje> make a config
[17:13] <jantje> src/ceph.conf.twoosds is a good place to start
[17:13] <jantje> install ssh keys
[17:13] <jantje> and then use src/init-ceph -a start
[17:21] * billy (~quassel@ Quit (Remote host closed the connection)
[17:41] * bchrisman (~Adium@ has joined #ceph
[17:42] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) has joined #ceph
[17:42] * bchrisman (~Adium@ Quit ()
[17:42] * bchrisman (~Adium@ has joined #ceph
[17:43] * bchrisman (~Adium@ Quit ()
[17:51] * bchrisman (~Adium@ has joined #ceph
[18:19] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[18:21] * Meths_ (rift@ has joined #ceph
[18:28] * Meths (rift@ Quit (Ping timeout: 480 seconds)
[18:28] * Meths_ is now known as Meths
[18:38] * Tv|work (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:41] <sagewk> jantje: when you saw the eval_repop error, was that unstable or testing branch?
[18:41] <sagewk> jantje: and if you can reproduce the VFS busy inodes problem please share! :)
[18:56] <greglap> sagewk: you available?
[18:57] <sagewk> yeah
[18:58] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:59] * Yoric (~David@ Quit (Quit: Yoric)
[19:19] <MichielM> hi sagewk, cool project and nice thesis :-)
[19:21] <sagewk> michielm: hi, and thanks!
[19:23] * bchrisman (~Adium@ has joined #ceph
[19:24] * bchrisman (~Adium@ Quit ()
[19:24] * bchrisman (~Adium@ has joined #ceph
[19:34] <greglap> stingray: can you apply the patch at http://pastebin.com/7DV4vVds and try to dump the mds log again?
[19:34] <greglap> that ought to let me figure out what's going wrong in the objecter
[19:35] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[20:03] <MichielM> sagewk: what is the largest cluster you've tested ceph on ?
[20:04] <MichielM> in regarding to storage and # osd's ?
[20:04] <sagewk> recently, ~40 nodes. in the past, ~400.
[20:04] <MichielM> nodes or osd?
[20:04] <MichielM> assuming you assign an osd to every single disk
[20:04] * fzylogic (~fzylogic@ has joined #ceph
[20:06] <wido> sagewk: about the eval_repop, saw it yesterday again
[20:06] <wido> didn't update the issue since I thought you were still working on it
[20:08] <sagewk> yeah, i think i have a fix, jim is testing
[20:10] <wido> great
[20:18] * jantje (~jan@paranoid.nl) Quit (Read error: Connection reset by peer)
[20:23] <Tv|work> FYI autotest evaluation: automatic scheduling might not understand the heterogeneous machine pools we'd want (server and client machines are often different in hardware etc), but it contains everything needed to e.g. lock machines manually and run tests on them
[20:23] * jantje (~jan@paranoid.nl) has joined #ceph
[20:23] <Tv|work> so even if it takes a few weeks to get the automatic scheduling in place, i think we should go ahead with rest of autotest
[20:24] * ElectricBill (~bill@smtpv2.cosi.net) Quit (Ping timeout: 480 seconds)
[20:24] <Tv|work> (if nothing else, tests that work with autotest will be easier to port to $SOMETHING than tests that don't exist or don't work right ;)
[20:25] * allsystemsarego (~allsystem@ has joined #ceph
[20:37] <sjust> cool
[20:37] <cmccabe> tv: sounds good
[20:37] <cmccabe> tv: should we post a question to the autotest mailing list about scheduling?
[20:37] <Tv|work> cmccabe: did already
[20:37] <cmccabe> tv: even if we look like newbs it would still be nice to get those questions answered quickly by the people who know
[20:38] <Tv|work> i also asked it on their irc, but there's no activity
[20:38] <cmccabe> tv: great!
[20:40] * ElectricBill (~bill@smtpv2.cosi.net) has joined #ceph
[20:48] * Jiaju (~jjzhang@ Quit (Ping timeout: 480 seconds)
[20:50] * Jiaju (~jjzhang@ has joined #ceph
[21:03] <sagewk> ok to meet at 1pm? we have a narrow window on the conf room
[21:03] <joshd> sure
[21:05] <cmccabe> sagewk: is 12:45 possible?
[21:06] <Tv|work> i'm fine with either, need to head out for lunch asap then
[21:13] * bchrisman (~Adium@ has joined #ceph
[21:18] * bchrisman (~Adium@ Quit ()
[21:32] * ajnelson (~Adium@dhcp-63-189.cse.ucsc.edu) has joined #ceph
[22:00] <Tv|work> 1pm?
[22:00] <joshd> looks like sage is still at lunch
[22:05] <sagewk> back
[22:05] <cmccabe> can we meet at 1:45?
[22:05] <cmccabe> or is that not possible
[22:05] <greglap> I thought we had a narrow window in the conference room?
[22:06] <sagewk> let's make it 2:30
[22:06] <sagewk> they're still in there anyway
[22:06] <cmccabe> sagewk: great
[22:13] <stingray> greglap: on it
[22:18] <stingray> http://stackoverflow.com/questions/2349378/new-programming-jargon-you-coined
[22:31] <Tv|work> wwaaah worst automake/make syntax ugliness ever
[22:31] <Tv|work> i have
[22:31] <Tv|work> VARIABLE+= something \
[22:31] <Tv|work> more
[22:31] <Tv|work> if you forget the backslash, it's 100% silent about the next line being complete garbage
[22:33] <Tv|work> note to self: just ./autogen.sh doesn't regenerate Makefile :(
[22:43] <jantje> sagewk: I currently run testing
[22:44] <jantje> (git branch reports testing) : ceph version 0.24.1 (commit:1cdb01b47b0656f5e61715e0ec35329356c651a1)
[22:44] <jantje> sagewk: and i got it when unmounting the client after the eval_repop
[22:45] <jantje> sagewk: I recently switched to testing, it's possible that I still was on stable when I sent you the email
[22:51] <Tv|work> ahh it seems to reserve N hosts of label X (e.g. beefy-server) in autotest, we say --label=X,X,X,X... N times
[22:51] <Tv|work> so then we can do --label=server,server,client,client,client etc
[22:52] <cmccabe> back
[22:52] <Tv|work> that clears up a *whole* lot
[22:52] <cmccabe> tv: so labels for us might be like kclient, ceph_srv
[22:52] <Tv|work> yeah
[22:52] <Tv|work> though maybe also about the hardware sizing
[22:52] <Tv|work> a host can have many labels
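The label trick Tv|work describes (repeat a label once per host wanted) can be sketched as a small helper. This is only an illustration: the `--label=` spelling is taken from the messages above and has not been checked against autotest itself, and `build_label_arg` is a hypothetical name, not part of autotest.

```python
def build_label_arg(counts):
    """Build a label argument that repeats each label once per requested host.

    `counts` maps a label name to the number of hosts wanted with that label.
    The "--label=" prefix is an assumption copied from the chat, not a
    verified autotest option.
    """
    parts = []
    for label, wanted in counts.items():
        if wanted < 0:
            raise ValueError("host count for %r must not be negative" % label)
        # One repetition of the label per host requested.
        parts.extend([label] * wanted)
    return "--label=" + ",".join(parts)


# Two hosts labelled "server" and three labelled "client".
print(build_label_arg({"server": 2, "client": 3}))
```

Since a host can carry several labels, the role labels here could presumably sit alongside hardware-sizing labels on the same machine.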
[22:52] <cmccabe> oh hey, do you guys know how to get a new kernel on to cosd
[22:52] <cmccabe> trying to show dallas how to set up the cluster, but it looks to have changed :\
[22:53] <Tv|work> i'm still not clear on how that interacts with the SYNC thingie, that lets you reserve many (identical) hosts
[22:57] <cmccabe> tv: hmm, can't find sync on the autotest wiki
[22:57] <Tv|work> this stuff is way beyond the wiki, i'm reading the source all the time
[22:57] <jantje> cmccabe: thanks for fixing the rbytes issue!
[22:57] <Tv|work> wiki gives some hints to intentions, code is truth
[22:58] <cmccabe> tv: very true
[22:58] <cmccabe> jantje: thanks, but I don't think it was I who fixed the rbytes issue....
[22:58] <jantje> oh
[22:59] <jantje> sorry, it was yehudasa ! :-)
[22:59] <cmccabe> :)
[23:14] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[23:19] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[23:23] <jantje> sagewk: i just updated my local testing branch, and I get the feeling it's slower, it took 1m13sec to write 18600 files or 726MB
[23:24] <jantje> of course, I could be wrong, because I have nothing to compare :)
[23:31] * sakib (~sakib@ has joined #ceph
[23:35] <jantje> nite!
[23:38] <jantje> (how can i set up a OSD without any journal? last time I tried the OSD didn't come up)
[23:42] <sagewk> just omit osd journal =. it'll be super slow though.
[23:43] <sakib> hi all
[23:43] <greglap> hi sakib
[23:43] <sakib> it seems i found a bug in ceph_getxattr routine
[23:44] <jantje> sagewk: [ 1061.060236] libceph: get_reply unknown tid 34146 from osd2
[23:44] <greglap> sakib: that's possible, what happened?
[23:44] <jantje> anyway, i'm off to bed
[23:44] <sakib> this is how to reproduce:
[23:45] <sakib> touch file
[23:45] <sakib> setfattr -n user.bare -v bare file
[23:45] <sakib> setfattr -n user.bar -v bar file
[23:45] <sakib> then both user.bare and user.bar attrs have "bar" value
[23:47] <sakib> it happens because xattr.c:__get_xattr() searches xattrs by name slightly incorrectly
[23:47] <sakib> .. or I guess so..
[23:48] <greglap> sakib: I'll put it in the tracker
[23:49] <sakib> i've written little patch to fix it, actually
[23:49] <greglap> oh, cool
[23:49] <sakib> is that ok if I post it to mail list?
[23:49] <greglap> of course!
[23:50] <sakib> 4 lines :)
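The xattr report above is consistent with a lookup that compares only a prefix of the stored name. The sketch below reproduces the symptom under that assumption. It is not the kernel code or sakib's patch; every function name here is hypothetical and the list-based store only stands in for whatever structure the client really uses.

```python
def prefix_match(stored_name, wanted_name):
    # Assumed bug: compare only the first len(wanted_name) characters,
    # so "user.bar" matches the stored name "user.bare".
    return stored_name[:len(wanted_name)] == wanted_name


def exact_match(stored_name, wanted_name):
    # Assumed fix: the lengths must agree as well as the compared characters.
    return (len(stored_name) == len(wanted_name)
            and stored_name[:len(wanted_name)] == wanted_name)


def set_xattr(store, name, value, match):
    # Overwrite the first entry the matcher accepts, else append a new one.
    for entry in store:
        if match(entry[0], name):
            entry[1] = value
            return
    store.append([name, value])


def get_xattr(store, name, match):
    # Return the value of the first entry the matcher accepts.
    for stored_name, value in store:
        if match(stored_name, name):
            return value
    return None


def replay(match):
    # Same order as the reproduction steps: user.bare first, then user.bar.
    store = []
    set_xattr(store, "user.bare", "bare", match)
    set_xattr(store, "user.bar", "bar", match)
    return (get_xattr(store, "user.bare", match),
            get_xattr(store, "user.bar", match))


print(replay(prefix_match))  # ('bar', 'bar')
print(replay(exact_match))   # ('bare', 'bar')
```

With the prefix-only matcher the second setfattr lands on the user.bare entry, which is why both names read back as "bar".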
[23:52] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.