#ceph IRC Log

IRC Log for 2011-10-14

Timestamps are in GMT/BST.

[0:05] * aliguori (~anthony@32.97.110.59) Quit (Quit: Ex-Chat)
[0:24] * jojy (~jojyvargh@108.60.121.114) Quit (Quit: jojy)
[1:35] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[1:37] <sagewk> ajm-: ok, pushed.
[1:38] <sagewk> ajm-: let me know when you have a second and i'll walk you through it. want to test it carefully on the data pool first, where any badness will have little/no impact.
[1:38] <ajm-> sagewk: i'm around
[1:38] <ajm-> the thing is, we need to backport this to 0.36, i never did the upgrade to 0.37 and now I can't :/
[1:38] <ajm-> unless you think I can
[1:38] <sagewk> ajm-: oh, right...
[1:39] <sagewk> ajm-: let me see how painful that is
[1:40] <ajm-> i have to actually step out for about an hour, we can just do it tommorow if you won't still be around
[1:42] <sagewk> i'll be here
[1:43] <sagewk> sounds good
[1:55] <sagewk> ajm-: pushed a backport. ran my basic tests and it behaves
[1:56] <sagewk> ajm-: still need to write a thorough teuthology test for it, so be careful.
[2:00] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) Quit (Quit: jojy)
[2:16] * Tv|work (~Tv|work@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[2:17] <ajm-> sagewk: where am I looking for this :)
[2:19] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Remote host closed the connection)
[2:20] <gregaf> ajm-: if you fetch the latest git repo there's a wip-unfound-backport branch that is on top of 0.36
[2:21] <ajm-> ok
[2:21] <gregaf> (there's also a wip-unfound branch on top of a more recent master, you probably don't want to try and run that one by mistake)
[2:23] <ajm-> ok
[2:26] <ajm-> is this osd-only or should/must do mon/mds as well?
[2:30] * ajm- is now known as ajm
[2:32] <sagewk> osd only
[2:32] <sagewk> and you need the updated ceph tool to issue commands
[2:32] <ajm> ok
[2:33] <ajm> i can update all OSDs to new without affecting things then?
[2:34] <sagewk> yeah
[2:34] <ajm> k, give me a bit
[2:35] <sagewk> k
[2:35] <sagewk> will be offline for the next 4 hours or so
[2:36] <ajm> ok i'm just going to upgrade these and probably wait till tomorrow then
[2:36] <sagewk> yeah sounds good
[2:36] <sagewk> ttyl
[2:36] <ajm> 4 hours is 1am here :)
[2:36] <ajm> and i should be asleep :P
[2:37] * eternaleye_ (~eternaley@195.215.30.181) has joined #ceph
[2:37] * eternaleye (~eternaley@195.215.30.181) Quit (Remote host closed the connection)
[2:43] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) has joined #ceph
[2:53] <ajm> my git is failing, git checkout v0.34 gets me 0.34 without any changes, git checkout wip-unfound-backport gets me 0.36
[3:39] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[3:50] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[5:53] * bchrisman1 (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[5:55] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:56] * Meths_ (rift@2.25.193.0) has joined #ceph
[5:57] * chaos__ (~chaos@hybris.inf.ug.edu.pl) has joined #ceph
[5:57] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * Meths (rift@2.25.193.0) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * chaos_ (~chaos@hybris.inf.ug.edu.pl) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * SpamapS (clint@xenclint.srihosting.com) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit (synthon.oftc.net larich.oftc.net)
[5:57] * SpamapS (clint@xenclint.srihosting.com) has joined #ceph
[6:00] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) has joined #ceph
[6:08] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[6:08] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[6:26] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[6:26] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) Quit ()
[7:13] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) Quit (Quit: This computer has gone to sleep)
[8:55] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[9:08] * gregorg_taf (~Greg@78.155.152.6) has joined #ceph
[9:08] * gregorg (~Greg@78.155.152.6) Quit (Read error: Connection reset by peer)
[9:31] * Dantman (~dantman@S010600259c4d54ff.vs.shawcable.net) Quit (Remote host closed the connection)
[10:54] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) Quit (Ping timeout: 480 seconds)
[10:59] * failbaitr (~innerheig@62.212.76.29) has joined #ceph
[10:59] <failbaitr> mornin
[11:00] <failbaitr> i'm trying to install ceph using the new packages for debian
[11:00] <failbaitr> but it's pulling in gceph, and thus a graphical env
[11:00] <failbaitr> is there a package without gceph?
[11:03] <df__> --no-install-recommends ?
[11:09] <failbaitr> hmm, ok
[11:09] <failbaitr> I was hoping there was a more sane package somewhere
[11:10] <df__> looking at the 0.36-1 package: "Recommends: ceph-client-tools, ceph-fuse, libcephfs1, librados2, librbd1, btrfs-tools, gceph"
[11:10] <df__> (apt-cache show ceph)
[11:10] <failbaitr> hmm, it came in as a required package, i'll see where that came from
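
For reference, a minimal sketch of the --no-install-recommends approach df__ describes, assuming apt-get and the 0.36-1 Debian packaging mentioned above:

    apt-get install --no-install-recommends ceph

    # or make it the default for all installs via /etc/apt/apt.conf:
    # APT::Install-Recommends "false";

This skips gceph and the rest of the Recommends list; packages such as ceph-client-tools or ceph-fuse can still be installed explicitly if wanted.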
[12:54] * verwilst (~verwilst@dD576F744.access.telenet.be) has joined #ceph
[13:51] * gregorg (~Greg@78.155.152.6) has joined #ceph
[13:51] * gregorg_taf (~Greg@78.155.152.6) Quit (Read error: Connection reset by peer)
[13:52] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:03] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[14:24] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:34] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[14:44] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[14:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:47] * gregorg (~Greg@78.155.152.6) Quit (Read error: No route to host)
[14:47] * gregorg (~Greg@78.155.152.6) has joined #ceph
[15:25] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[15:37] * RupS (~rups@panoramix.m0z.net) Quit (Read error: Connection reset by peer)
[15:37] * RupS (~rups@panoramix.m0z.net) has joined #ceph
[16:19] * verwilst (~verwilst@dD576F744.access.telenet.be) Quit (Quit: Ex-Chat)
[16:51] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[16:56] * gohko (~gohko@natter.interq.or.jp) Quit (Quit: Leaving...)
[17:09] * gohko (~gohko@natter.interq.or.jp) has joined #ceph
[17:21] <sagewk> failbaitr: fwiw the next release has that recommends removed.
[17:26] <failbaitr> sagewk: great :)
[17:49] * bchrisman1 (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:03] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[18:06] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[18:13] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[18:27] * Tv|work (~Tv|work@aon.hq.newdream.net) has joined #ceph
[18:34] * jojy (~jojyvargh@108.60.121.114) has joined #ceph
[18:43] <ajm> sagewk: not sure if you saw my msg yesterday, but if i checkout wip-unfound-backport it still looks like 0.36
[18:43] <sagewk> should be 798389cdef6996cbfb0656de754281852df53bfa
[18:43] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:45] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:26] * Dantman (~dantman@S010600259c4d54ff.vs.shawcable.net) has joined #ceph
[19:27] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[19:32] <Tv|work> joshd: i keep thinking qemu monitor command "set secret foo thesekrit" and then the -drive line says "...,secret=foo,..."
[19:33] <Tv|work> joshd: as long as the "set secret" is sent once before first "cont", that'll work
[19:33] <Tv|work> joshd: and if not, the driver can issue an error message "secret not found" and fail
[19:34] <Tv|work> joshd: ask qemu & libvirt upstreams what they're willing to accept..
[19:34] <bugoff> :28
[19:34] <Tv|work> joshd: but that "set secret" approach avoids the whole "oh there might be a passphrase prompt later" thing
[19:34] <Tv|work> bugoff: 29, i win.
[19:34] <bugoff> :)
[19:35] <joshd> Tv|work: yeah, that seems like the best approach for now
[19:37] <df__> hmm, managed to get a kernel client to hang doing an "ls". dmesg not reporting anything, cluster is idle
[19:38] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:42] <gregaf> df__: which kernel version, which cluster version?
[19:46] <df__> cluster: 0.36, kernel 3.0.0 with ceph-client:dec00a0c
[19:46] <Tv|work> df__: how many mdses?-)
[19:46] * Tv|work looks at gregaf
[19:46] <gregaf> heh
[19:47] <df__> mds e147: 1/1/1 up {0=vc-fs1=up:active}, 2 up:standby
[19:47] <gregaf> I'm trying to remember how you lookup the in-flight ops in the kernel client
[19:47] <gregaf> /proc/sys/fs/ceph maybe? are there files there?
[19:48] <df__> debugfs?
[19:50] <gregaf> ah, /sys/kernel/debug/ceph/*/
[19:50] <gregaf> will have osdc and mdsc files, what do they contain?
[19:50] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:51] <df__> vc-r210-10:~$ sudo cat /sys/kernel/debug/ceph/5cb32403-ea53-ae8b-6ed7-27b17eb6679a.client6105/osdc
[19:51] <df__> vc-r210-10:~$ sudo cat /sys/kernel/debug/ceph/5cb32403-ea53-ae8b-6ed7-27b17eb6679a.client6105/mdsc
[19:51] <df__> 4 mds0 getattr #1000010f120
[19:52] <gregaf> huh, you haven't done anything with this mount yet then?
[19:52] <gregaf> and do you have any logging on for the MDS?
[19:52] <df__> i unmounted it and remounted
[19:53] <gregaf> and what's the full output of ceph -s?
[19:53] <df__> 2011-10-14 17:53:13.767420 pg v282995: 594 pgs: 593 active+clean, 1 active+clean+scrubbing; 13462 GB data, 26963 GB used, 25241 GB / 54998 GB avail
[19:53] <df__> 2011-10-14 17:53:13.770980 mds e147: 1/1/1 up {0=vc-fs1=up:active}, 2 up:standby
[19:53] <df__> 2011-10-14 17:53:13.771061 osd e3082: 3 osds: 3 up, 3 in
[19:53] <df__> 2011-10-14 17:53:13.771208 log 2011-10-14 17:48:59.840083 osd0 172.29.190.28:6801/31577 116 : [INF] 0.c scrub ok
[19:53] <df__> 2011-10-14 17:53:13.771375 mon e1: 3 mons at {0=172.29.190.28:6789/0,1=172.29.190.29:6789/0,2=172.29.190.30:6789/0}
[19:53] <Tv|work> gregaf: i would love it if this ended up in doc/dev/kernel-client-troubleshooting.rst
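
For reference, a minimal sketch of the debugfs check described above (the kind of thing Tv|work suggests documenting), assuming debugfs is mounted at /sys/kernel/debug:

    mount -t debugfs none /sys/kernel/debug   # only needed if not already mounted
    cat /sys/kernel/debug/ceph/*/mdsc         # in-flight MDS (metadata) requests
    cat /sys/kernel/debug/ceph/*/osdc         # in-flight OSD (data) requests

An empty file means no requests of that type are outstanding; a stuck entry such as the getattr above shows where the client is waiting.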
[19:54] <df__> i've only got the default logging enabled
[19:54] <df__> any particular log level you would like?
[19:54] <gregaf> df__: well for some reason the MDS had to wait on servicing the request, and isn't proceeding, so if you didn't have logging enabled before there's not much chance of working out what went wrong :/
[19:55] <gregaf> restarting the MDS would probably make it go away, though if you reproduce it with MDS logging on (level 10 or 20) we could see what had happened and fix it :)
[19:56] <df__> what level of logging would you want -- i'll up it and restart. this happened before a complete restart and afterwards too
[19:56] <gregaf> well "debug mds = 20" will give us all we could use
[19:56] <df__> btw, (Can't contact the database server: Lost connection to MySQL server at 'reading authorization packet', system error: 0 (mysql.ceph.newdream.net))
[19:57] <df__> gregaf, ta that's just what i was looking for
[19:57] <gregaf> I think 10 probably would too but I'm not certain, and if it's consistent then you should do 20 since you're not worried about running out of logging space before it hits :)
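
For reference, a minimal sketch of the logging setting gregaf asks for, added to the [mds] section of ceph.conf on the MDS hosts and applied by restarting the daemons:

    [mds]
        debug mds = 20

Level 20 is very verbose; as gregaf notes, that is fine here because the hang reproduces quickly, so log space should not run out first.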
[19:57] <sagewk> gregaf: you can get some clue if you dumpcache on the mds
[19:58] <gregaf> sagewk: users can trigger a dumpcache?
[19:58] <sagewk> also, sometimes you can turn up logging, and then stat that inode from another node to wake things up
[19:58] <gregaf> df__: do that first!
[19:58] <ajm> sagewk: ceph # git log|head -n1
[19:58] <ajm> commit 798389cdef6996cbfb0656de754281852df53bfa
[19:58] <ajm> ceph # ./src/ceph-mds -v
[19:58] <ajm> ceph version 0.36-16-g798389c (commit:798389cdef6996cbfb0656de754281852df53bfa)
[19:58] <sagewk> ceph mds tell 0 dumpcache /tmp/dump.txt
[19:59] <df__> gregaf, ack
[19:59] <sagewk> ajm: cool. ok, so lets pick a pg that has a few unfound objects and is in the data pool
[19:59] <sagewk> (i.e. a 0. pg)
[19:59] <gregaf> df__: all well, my bad *shrug*
[20:00] <ajm> sagewk: its showing up as 0.36 though
[20:00] <ajm> that was supposed to be a 0.34 backport I thought ?
[20:01] <sagewk> 0.36 + 16 patches
[20:01] <sagewk> oh!
[20:01] <sagewk> nevermind then
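
For reference, a minimal sketch of fetching and verifying the backport branch, using the branch name and commit hash given above:

    git fetch origin
    git checkout wip-unfound-backport   # or origin/wip-unfound-backport if a stale local branch exists
    git log -1                          # expect commit 798389cdef6996cbfb0656de754281852df53bfa
    ./src/ceph-mds -v                   # expect ceph version 0.36-16-g798389c

As sagewk says, the branch is 0.36 plus 16 patches, so a version string beginning with 0.36 is expected.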
[20:02] <df__> gregaf, "ack" as in, i've done it.
[20:02] <gregaf> oh, good then!
[20:04] <df__> url in privmsg due to it containing filenames
[20:04] <gregaf> thanks
[20:10] <gregaf> df__: do you have other clients currently mounting the system?
[20:10] <df__> yes
[20:10] <gregaf> and was your last unmount clean or did you have to force reboot or something?
[20:13] <df__> i seem to recall this morning's set of events was: (1) noticing that streaming writes were not as fast as i think they should be and that they were dropping with time, (2) noticing that an "ls" was taking a very long time but completed, (3) stopping the write test and restarting the ceph daemons (init.d/ceph -a restart), (4) doing "ls" with a still mounted client hung, (5) unmount hung, but eventually happened, (6) remounted, ls still failed
[20:14] <gregaf> it looks like the other mounted client has caps that it needs to drop before the ls can proceed (normally this is fast)
[20:14] <df__> i think there may have been an extra "init.d/ceph -a stop" <wait> "init.d/ceph -a start" in there too at around (4)
[20:15] <gregaf> which might mean that the other client is stuck and has file data in memory that it can't write out to the OSDs or something
[20:15] <df__> i'll have a look at the other client i've been doing things on (there are other clients too, but they've been idle for ages)
[20:15] <gregaf> if it's kernel as well, can you check on its current requests? (same as before, /sys/kernel/debug/ceph/*/)
[20:16] <df__> i only have kernel clients
[20:16] <gregaf> yeah, there are only two that have caps on this inode so probably it's the other active one — the debug output will confirm
[20:17] <df__> vc-r210-11:~$ sudo cat /sys/kernel/debug/ceph/5cb32403-ea53-ae8b-6ed7-27b17eb6679a.client4155/osdc
[20:17] <df__> vc-r210-11:~$ sudo cat /sys/kernel/debug/ceph/5cb32403-ea53-ae8b-6ed7-27b17eb6679a.client4155/mdsc
[20:17] <df__> 1832680 mds0 setattr #1000010f120
[20:18] <df__> btw, "caps" == ?, i'm assuming it isn't "capabilities"
[20:18] <gregaf> …huh
[20:18] <gregaf> it is
[20:18] <gregaf> not *nix file capabilities, though — ceph internal ones
[20:18] <df__> ok
[20:19] <gregaf> to do things like read/write or buffer and cache those writes, update metadata, stuff like that
[20:20] <df__> ah ok
[20:20] <df__> is there a way of looking up that inode as to what it is? (#1000010f120)
[20:20] <Tv|work> honestly, i end up thinking of the mds caps as leases
[20:20] <Tv|work> to keep them separate from the access control style caps, in my mind
[20:20] <gregaf> sagewk: so it's trying to do a setattr on that inode, all that leaves me with is trying to see if the locking is messed up on the get/setattr code when you have to wait
[20:21] <df__> tv work, that's what i just had in my mind, grants/leases, after greg described the operations it related to
[20:21] <gregaf> Tv|work: that might be a better term
[20:21] <Tv|work> maybe i've done too much E.. ( http://en.wikipedia.org/wiki/E_(programming_language) )
[20:22] * Tv|work is now known as Tv
[20:22] <gregaf> df__: I don't think we're going to be able to resolve this just by poking, but an MDS restart ought to clear it up (*sigh*)
[20:23] <df__> ok. the only other thing that comes to mind, is just before all this happened, i'd been doing a streaming write test (dd if=/dev/zero bs=1M of=/mnt/ceph/$file), where two nodes simultaneously did that to the same $file
[20:24] <df__> thats why i was wondering if #1000010f120 happens to be that file
[20:24] <gregaf> oh, right, forgot you asked about that
[20:24] <gregaf> the dumpcache file contains all the filenames next to the inode if you search for it :)
[20:25] <gregaf> I'd guess…probably?
[20:25] <df__> [inode 1000010f120 [2,head] /lf... auth v852437 ap=3+0 s=0 n(v0 1=1+0) (ifile mix->sync) (iversion lock w=1 last_client=4155) cr={4128=0-549755813888@1,4155=0-549755813888@1} caps={4155=pAsLsXs/pAsLsXsFcwb/pFw@6,6105=pAsLsXsFr/-@1},l=4155 | ptrwaiter request lock caps dirty waiter authpin 0x3525360]
[20:25] <df__> thats the one
[20:25] <gregaf> yep
[20:26] <gregaf> so you're exercising code that doesn't get tested as much right now and I'm guessing we have a locking problem in that code
[20:26] <gregaf> if it recurs even after a restart we should at least have the logs needed to solve it, otherwise we'll have to do it by code inspection
[20:26] <df__> ok, i'll restart with the extra debugging
[20:28] <df__> given that two clients will have started writing at the same time, one will have clobbered the other, but they carried on to write ~200GB to it each
[20:30] <gregaf> df: no, they shouldn't have clobbered, that's the whole point of the leases :)
[20:30] <gregaf> and it's hanging because for some reason it's failing to get and propagate the lease status change
[20:31] <gregaf> our lunch just arrived, though — I'll be back in a bit!
[20:31] <df__> O_TRUNC?
[20:32] <gregaf> df__: oh, I dunno what's supposed to happen in that situation then
[20:32] <gregaf> whatever POSIX says ;)
[20:32] <gregaf> I thought you were worried about Ceph clobbering data inappropriately
[20:35] <df__> ah no
[20:36] <df__> (well, not in this instance)
[20:36] <df__> i'll sort this log out and go and find some dinner
[20:55] <df__> gregaf, see privmsg for log of complete restart
[21:08] <gregaf> df__: is it still hanging, or is it working now?
[21:08] <df__> both clients are still in the same state -- i didn't restart them
[21:09] <df__> (i only restarted the daemons)
[21:09] <df__> shall i restart both clients?
[21:09] <gregaf> df__: once the MDS has gone through replay and rejoin I'd expect the clients to become happy
[21:10] <gregaf> if they aren't the logs should contain everything we need
[21:10] <df__> i think everything is up
[21:13] <df__> it isn't in the log file i sent you (it got filtered), but the two clients both reported:
[21:13] <df__> Oct 14 18:30:54 vc-r210-10 kernel: [1392940.782891] ceph: mds0 recovery completed
[21:13] <df__> Oct 14 18:30:54 vc-fs3 mds.vc-fs3[6661]: 7ff8e1e3b700 mds0.server not active yet, waiting
[21:13] <df__> Oct 14 18:30:54 vc-r210-11 kernel: [1392912.994489] ceph: mds0 recovery completed
[21:14] <gregaf> well, it does look like the inode is in nearly the same state as before, so the problem just got more interesting
[21:14] <ajm> sagewk: is it possible to backport to 0.34 or ?
[21:14] <gregaf> ajm: he's in a meeting right now, should be back soonish though
[21:19] <gregaf> df__: I'll take a look at this bug later today (or Sage will), but if it's not a problem for you to restart your clients you probably just want to do that
[21:19] <gregaf> I'm afraid I've got some other things I need to deal with right now
[21:20] <df__> ok, i'll restart those, and see if i can double check why i only seem to get 200Mb/sec/client writes at the moment
[21:21] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[21:34] * adjohn (~adjohn@50.0.103.34) Quit (Quit: adjohn)
[21:45] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[21:52] <sagewk> ajm: will look at it shortly
[21:55] <ajm> thanks sage
[22:58] * Meths_ is now known as Meths
[23:06] * verwilst (~verwilst@dD576F744.access.telenet.be) has joined #ceph
[23:18] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[23:20] <jmlowe> I'm probably missing something obvious but does anybody have any hints for this error: "kvm: -drive file=rbd:data/centos56,if=none,id=drive-virtio-disk0,format=raw: error reading config file
[23:20] <jmlowe> kvm: -drive file=rbd:data/centos56,if=none,id=drive-virtio-disk0,format=raw: could not open disk image rbd:data/centos56: Permission denied"
[23:23] <Tv> jmlowe: "auth supported = cephx" in ceph.conf perhaps?
[23:24] <gregaf> jmlowe: I'm not sure but I'd guess it can't read the ceph.conf file, is that accessible?
[23:24] <Tv> jmlowe: i recall kvm doesn't yet support authentication; joshd is adding that right now as far as i know
[23:24] <jmlowe> I'm not using auth and ceph.conf is readable
[23:25] <Tv> jmlowe: i wonder what would the error be if you didn't have rbd support compiled in your kvm at all....
[23:25] <Tv> jmlowe: perhaps it's trying to open "rbd:.." as a literal file
[23:25] <jmlowe> qemu-img info -f rbd rbd:data/centos56
[23:25] <jmlowe> image: rbd:data/centos56
[23:25] <jmlowe> file format: rbd
[23:25] <jmlowe> virtual size: 80G (85899345920 bytes)
[23:25] <jmlowe> disk size: unavailable
[23:26] <jmlowe> cluster_size: 4194304
[23:26] <Tv> so qemu-img at least has rbd support ;)
[23:26] <Tv> jmlowe: any log output from kvm? perhaps something in /var/log/ceph/client*.log? or on the mons?
[23:27] <jmlowe> slight adjustment to the xml yields format=rbd in the qemu args
[23:27] <Tv> oh right, earlier said format=raw, that probably implies "open as a file"
[23:28] <Tv> there's no format inside the rbd container, it's always raw there
[23:28] <jmlowe> same error though
[23:28] <Tv> the mon should log something
[23:29] <Tv> sorry off to a meeting
[23:30] <joshd> jmlowe: it'll read the ceph.conf file if it's in the default location - i.e. if 'ceph-conf -L' as the qemu user works, you should be good
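
For reference, a quick sketch of joshd's check; the user name qemu runs as is an assumption here (libvirt-qemu on Ubuntu/libvirt setups, often qemu or kvm elsewhere):

    sudo -u libvirt-qemu ceph-conf -L

If that fails while it works as root, the problem is file access (or, as it turns out below, apparmor) rather than the ceph configuration itself.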
[23:31] <jmlowe> everything in /var/log/ceph from today is zero length, is that a problem?
[23:36] <df__> ooh, i've just noticed that you report subtree sizes as directory sizes
[23:39] <df__> however, stat(2)ing a file that is being written to by another node is really slow:
[23:39] <df__> vc-r210-10:~$ time stat /mnt/ceph/lf.7453.12618.28857 > /dev/null
[23:39] <df__> real 0m25.561s
[23:49] <jmlowe> found it, apparmor was blocking access to /etc/ceph/ceph.conf
[23:49] * adjohn (~adjohn@50.0.103.34) Quit (Quit: adjohn)
[23:53] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[23:57] <jmlowe> edited wiki with a note about apparmor and ubuntu
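
For reference, a hedged sketch of one way to allow that access, assuming Ubuntu's libvirt apparmor integration; exact profile and file names vary by release, and this is only the general approach jmlowe describes, not the exact change made:

    # check which profile denied the read:
    grep DENIED /var/log/kern.log

    # add a read rule to /etc/apparmor.d/abstractions/libvirt-qemu
    # (or to the per-VM profile under /etc/apparmor.d/libvirt/):
    /etc/ceph/ceph.conf r,

    # reload the profiles:
    service apparmor reload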

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.