#ceph IRC Log


IRC Log for 2011-10-10

Timestamps are in GMT/BST.

[0:46] * NaioN (~stefan@andor.naion.nl) Quit (Server closed connection)
[0:46] * NaioN (~stefan@andor.naion.nl) has joined #ceph
[0:53] * wido (~wido@rockbox.widodh.nl) Quit (Server closed connection)
[0:53] * wido (~wido@rockbox.widodh.nl) has joined #ceph
[1:02] * bugoff (bram@november.openminds.be) Quit (Server closed connection)
[1:02] * bugoff (bram@november.openminds.be) has joined #ceph
[1:36] * stass (stas@ssh.deglitch.com) Quit (Server closed connection)
[1:36] * stass (stas@ssh.deglitch.com) has joined #ceph
[3:42] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Quit: Ex-Chat)
[4:14] * Dantman (~dantman@S010600259c4d54ff.vs.shawcable.net) Quit (Remote host closed the connection)
[8:42] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[8:49] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[8:49] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[9:10] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:12] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[9:18] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:25] * gregorg_taf (~Greg@ Quit (Ping timeout: 480 seconds)
[11:11] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) has joined #ceph
[14:13] <psomas> Are async writes in librbd enabled by default in 0.36? And is qemu 0.15 ok, or should I clone the repo and build it manually?
[14:17] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[14:20] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) Quit ()
[14:32] * Dantman (~dantman@S010600259c4d54ff.vs.shawcable.net) has joined #ceph
[17:01] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[17:01] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) Quit (Quit: julienhuang)
[17:09] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[17:25] <gregaf> chaos_: try using the IP instead of the hostname — I don't think the kernel resolves them right now (there are some patches but they aren't upstream yet)
[17:55] <sagewk> psomas: the async writes are off by default. use rbd_writeback_window = 8000000 or similar to enable.
[17:57] <sagewk> but you need qemu with commit 7a3f5fe9afbef3c55c1527f61fcfd0b9d4783c0d for flush to be handled properly.
[17:58] <psomas> y, i cloned the qemu repo
[17:59] <psomas> i saw that about rbd_writeback_window in the ml, but where am i supposed to add it?
[18:00] <psomas> i'm using the qemu/rbd driver, with libvirt
[18:01] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[18:04] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[18:08] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:19] <chaos_> gregaf, ip didn't work, but fuse is fine and doesn't cause kernel panics;)
[18:19] <chaos_> btw. when did cauthtool change its name to ceph-authtool?
[18:21] <gregaf> chaos_: hrm, what kernel version
[18:21] <gregaf> ?
[18:21] <gregaf> most of the names changed for v0.36
[18:22] <chaos_> gregaf, 2.6.38-11
[18:22] <chaos_> i think it's standard for ubuntu natty server
[18:23] <chaos_> gregaf, some of my chef recipes failed;p and i've been a little confused...
[18:23] <gregaf> hum
[18:23] <gregaf> I hate EIO; it's such a useless error message
[18:24] <gregaf> chaos_: did you try including the port for kernel mounting?
[18:24] <chaos_> nope
[18:24] <gregaf> it might not add that automatically like the userspace stuff does
[18:24] <gregaf> try that and see if it works
[18:24] <chaos_> i'll try it in a sec
[18:25] <chaos_> first i have to convert recipes to these fancy new names;p
[18:26] <chaos_> cosd too.. damn;p
[18:30] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) Quit (Quit: jojy)
[18:32] <sagewk> psomas: use a qemu drive string like rbd:rbd/myimage:rbd_writeback_window=8000000
[18:33] <psomas> sagewk: thanks :)
[18:34] <psomas> btw, i haven't searched if this is a known issue, but i tried "rbd cp image1 image2", and i get an error in the end, although the copied image is fine
[18:35] <chaos_> gregaf, it works with port
[18:36] <sagewk> psomas: thanks, will take a look
[18:38] <chaos_> where can i find the changelog for ceph?
[18:38] <chaos_> ok nvm;p
[18:45] <sagewk> psomas: what is the error you see?
[18:45] <psomas> copy failed: Unknown error 18446744071562067968
[18:45] <psomas> i was looking at the code now
[18:45] <sagewk> great, i'm not seeing a problem locally
[18:46] <sagewk> maybe you can generate a log? --debug-rbd 20 --debug-ms 1 --log-file foo.log
[18:55] <psomas> sagewk: image size you used for the cp?
[18:56] <psomas> hm
[18:56] <psomas> in librbd::copy
[18:57] <psomas> there's a call to read_iterate, which returns int64_t, while r (used throughout the 'call chain') is int, so maybe it's some kind of int overflow?
[18:58] * bchrisman (~Adium@ has joined #ceph
[19:04] * sjust (~sam@aon.hq.newdream.net) Quit (Read error: Connection reset by peer)
[19:05] <sagewk> hmm, probably
[19:05] <sagewk> can you try the patch i just pushed to master?
[19:05] <sagewk> 7060efa9bae6d27bf44fbdc0a89698a31fd8c6c4
[19:07] <psomas> i just pulled, but i'm not sure i can try it right now
[19:08] <sagewk> np
[19:10] <psomas> i'm not familiar with the code, but i thought it had something to do with "r = read_iterate(...)" line 1175
[19:10] <psomas> in src/librbd.cc
[19:13] <sagewk> i suspect the log with those debug levels will have the answer
[19:14] <psomas> ok, i'll try it out, it's just a bit suspicious, because r is 32bit, read_iterate returns 64bit total_read, and then there's a check if r>=0, which if there was an overflow, would fail
[19:15] <psomas> anyway, thanks for the help, i'll respond when i've tried both the patch, and the log level increase :)
[19:18] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) has joined #ceph
[19:20] <psomas> sagewk: i have the log with the unpatched librbd. Do you want me to give you a pastebin link?
[19:25] <chaos_> http://wklej.org/id/605810/ is something wrong with ceph-fuse parser or with my line?
[19:34] <chaos_> well it should be ceph-fuse not ceph, but it fails anyway - http://wklej.org/id/605820/
[19:38] * adjohn (~adjohn@ has joined #ceph
[19:39] * jojy (~jojyvargh@ has joined #ceph
[19:39] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[19:39] <df__> sjust, were there any further thoughts on friday's issue? the follow up at my end was that two OSDs failed during the night, and inspecting them now both are sitting at 100% CPU
[19:39] <psomas> sagewk: ok, your patch fixed the progress/status of the cp, but i still get the error copy failed: error -2147483648: Unknown error 18446744071562067968
[19:41] <sagewk> psomas: yeah pastebin works
[19:44] <sjust> df__: the log you posted does suggest that the filestore thread hung on possibly a stat
[19:44] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[19:49] <psomas> sagewk: i think i fixed it
[19:50] <df__> btw, is this of any concern: "[ERR] 1.81 log bound mismatch, empty but (789'3049,789'3971]"
[19:50] <psomas> http://pastebin.ca/2088813
[19:51] <psomas> i'm not sure if it's the right way to do it
[19:59] <sagewk> oh, the problem was just that read_iterate was returning something > 32 bits?
[20:00] <psomas> probably (ie i didn't verify that it was indeed >32bits, but fixing that worked)
[20:01] <sagewk> yeah i see
[20:01] <sagewk> will push something shortly (fixing other callsites too)
[20:01] <cp> Is there a specification somewhere of the language used by the crush map? I'm trying to write a new crush map and it's not always working as expected.
[20:05] <sagewk> psomas: ok pushed. thanks!
[20:06] <psomas> np
[20:06] <sagewk> cp: the grammar is in crush/grammar.h
[20:07] <cp> sagewk: thanks Sage
[20:17] <psomas> sagewk: i'm not sure it's fixed, problem is that read_iterate will return >32bit, so even though copy() will return 64bit, the functions in the 'call chain' will treat the return value as 32bit, and so they'll think it's negative/error code
[20:18] <psomas> so either you just return 0 on success and not the value of total_read (so that no int overflow happens), or you have to change every 'r' variable up to do_copy to int64_t
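
The overflow psomas describes can be reproduced in isolation. The sketch below is not the librbd code: read_all, copy_buggy, copy_fixed and self_check are hypothetical names, and the 2 GiB size is an assumption, chosen because 2147483648 bytes narrows to the -2147483648 reported in the error (18446744071562067968 is that same value sign-extended to 64 bits and printed as unsigned, i.e. 2^64 - 2^31). It also assumes two's-complement wraparound when narrowing int64_t to int, which C++20 guarantees and earlier standards leave implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for read_iterate(): returns the total number of
// bytes read as a 64-bit count, or a negative errno-style code on failure.
int64_t read_all(int64_t image_size) {
    return image_size;
}

// Buggy pattern: the 64-bit byte count is squeezed into a 32-bit int,
// so any caller that tests "r < 0" mistakes a large count for an error.
int copy_buggy(int64_t image_size) {
    int r = static_cast<int>(read_all(image_size));  // narrowing happens here
    return r;
}

// One possible fix: keep the count in 64 bits locally and hand the
// 32-bit callers only 0 on success or the (small) negative error code.
int copy_fixed(int64_t image_size) {
    int64_t total = read_all(image_size);
    if (total < 0)
        return static_cast<int>(total);  // error codes fit in an int
    return 0;                            // success: the byte count is dropped
}

// Exercises both variants with an assumed 2 GiB image.
void self_check() {
    const int64_t two_gib = int64_t{1} << 31;  // 2147483648 bytes
    assert(copy_buggy(two_gib) < 0);           // looks like a failure
    assert(copy_fixed(two_gib) == 0);          // reported as success
    assert(copy_fixed(-2) == -2);              // real errors still propagate
}
```

Calling self_check() from a main exercises both paths. Returning 0 on success avoids widening every 'r' up the call chain, at the cost of no longer reporting how many bytes were copied.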
[20:20] * jojy_ (~jojyvargh@ has joined #ceph
[20:20] * jojy (~jojyvargh@ Quit (Read error: Connection reset by peer)
[20:20] * jojy_ is now known as jojy
[20:24] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[20:44] * aliguori (~anthony@ has joined #ceph
[20:50] <sagewk> psomas: oh right. should just return 0, i think.
[20:50] <sagewk> (copy() i mean)
[20:52] <gregaf> sagewk: the amount read isn't required for QEMU or something?
[20:53] <sagewk> the only copy() caller is rbd.cc
[20:53] <sagewk> (i.e. copy an entire image)
[20:53] <psomas> sagewk: y
[20:53] <psomas> i think so
[20:57] <psomas> like the patch i sent, but probably it could be written more cleanly
[20:58] * adjohn (~adjohn@ Quit (Quit: adjohn)
[21:03] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[21:04] * votz (~votz@pool-108-52-121-23.phlapa.fios.verizon.net) Quit (Read error: Connection reset by peer)
[21:07] * votz (~votz@pool-108-52-121-23.phlapa.fios.verizon.net) has joined #ceph
[21:09] <nak> hello, if mkcephfs is called with a config with multiple osd per host, does it by default create a crushmap that avoids replicas on the host?
[21:09] * adjohn (~adjohn@ has joined #ceph
[21:10] <gregaf> nak: unfortunately, no
[21:10] <gregaf> you have to do that on your own
[21:10] <nak> hmmm
[21:10] <psomas> sagewk: just pulled, your new patch fixed the issue, thanks :)
[21:11] <nak> but when I pull the default crushmap from a new fs, it seems to match one I'd create with crushtool
[21:12] <nak> unless I misread the crushtool documentation, and that also avoids replicas per host?
[21:12] <nak> unless edited?
[21:12] <nak> ^typo: that also allows replicas per host
[21:13] <gregaf> hmm, I suspect your handwritten one is off then, unless somebody slipped in host handling without my noticing
[21:13] <gregaf> (which is possible)
[21:13] <nak> 0.36, and I'm just double checking...
[21:14] <gregaf> I actually haven't played with crush maps much myself, I'm trying to think who besides sagewk would know for sure
[21:14] <gregaf> wido might
[21:16] <df__> sjust, is there anything further to investigate? i've now got two OSDs out of three that keep failing with OSD::disk_tp and OSD::op_tp timeouts.
[21:16] <df__> nb, those two OSDs are running on ext4
[21:18] <psomas> has there been any progress with rbd image layering?
[21:18] <nak> looks like newbie confusion, running through again I get different maps.
[21:20] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[21:23] <sjust> df__: if we could nail down exactly what the spinning cpus are doing, that would help
[21:24] <sjust> the log unfortunately doesn't print the operation before it's attempted
[21:25] <df__> 3826.00 94.1% PG::build_inc_scrub_map
[21:25] <df__> is where it is spinning
[21:27] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[21:33] <sjust> can you use gdb to get the line number?
[21:37] <df__> doh, you dont build the debian packages with debug symbols
[21:37] <sjust> :(, sorry about that
[21:38] <yehudasa> df__: there are debian debug packages
[21:38] <sjust> try turning up OSD and filestore debugging to 30 and restarting the OSDs, if it's getting stuck in scrub, we should see where it's getting stuck
[21:39] <yehudasa> iirc, ceph-dbg
[21:39] <df__> yehudasa, ah, /me goes to check
[21:48] <df__> sjust: http://pastebin.com/YknfHgYY
[21:48] <df__> thread #10
[21:59] <nak> is there a tool to show osd coverage on a per file basis, including replications?
[22:00] <nak> I don't see anything via http://ceph.newdream.net/wiki/Monitor_commands#stat
[22:01] * Dantman (~dantman@S010600259c4d54ff.vs.shawcable.net) Quit (Ping timeout: 480 seconds)
[22:08] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) Quit (Quit: cp)
[22:14] * bchrisman (~Adium@ has joined #ceph
[22:22] <sjust> nak: osd coverage?
[22:22] <sjust> df__: hmm, looking
[22:32] <sjust> df__, nak: I need to head out, I may be back on later
[22:33] <df__> ok
[22:40] * cp (~cp@ has joined #ceph
[23:31] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[23:32] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:38] * aliguori (~anthony@ Quit (Quit: Ex-Chat)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.