#ceph IRC Log

IRC Log for 2013-01-10

Timestamps are in GMT/BST.

[0:01] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[0:09] * madkiss (~madkiss@213.221.125.229) has joined #ceph
[0:16] * korgon (~Peto@isp-korex-15.164.61.37.korex.sk) has joined #ceph
[0:24] * The_Bishop (~bishop@2001:470:50b6:0:6077:d570:bf06:22e4) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[0:25] * andreask1 (~andreas@h081217135060.dyn.cm.kabsi.at) has joined #ceph
[0:31] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[0:54] * madkiss (~madkiss@213.221.125.229) Quit (Quit: Leaving.)
[0:57] * allsystemsarego (~allsystem@5-12-241-245.residential.rdsnet.ro) Quit (Quit: Leaving)
[1:03] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[1:13] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[1:15] * korgon (~Peto@isp-korex-15.164.61.37.korex.sk) Quit (Quit: Leaving.)
[1:22] * The_Bishop (~bishop@e179004212.adsl.alicedsl.de) has joined #ceph
[1:27] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[1:27] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[1:29] * tnt (~tnt@86.188-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[1:31] * andreask1 (~andreas@h081217135060.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[1:35] * yoshi (~yoshi@p2100-ipngn4002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[1:37] * yoshi (~yoshi@p2100-ipngn4002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:37] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[1:42] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[1:46] * korgon (~Peto@isp-korex-15.164.61.37.korex.sk) has joined #ceph
[1:55] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:56] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[2:01] * korgon (~Peto@isp-korex-15.164.61.37.korex.sk) Quit (Quit: Leaving.)
[2:15] * jlogan1 (~Thunderbi@2600:c00:3010:1:52d:be18:aa69:de7) Quit (Quit: jlogan1)
[2:19] * ircolle (~ircolle@c-67-172-132-164.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[2:19] * yoshi (~yoshi@p2100-ipngn4002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[2:24] * yoshi (~yoshi@p2100-ipngn4002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:29] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[2:32] * aliguori_ (~anthony@cpe-70-112-157-151.austin.res.rr.com) has joined #ceph
[2:32] * LeaChim (~LeaChim@b0faeeb0.bb.sky.com) Quit (Ping timeout: 480 seconds)
[2:38] * yoshi (~yoshi@p2100-ipngn4002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[2:38] * chftosf (u7988@irccloud.com) has joined #ceph
[2:39] * Cube (~Cube@12.248.40.138) Quit (Read error: Operation timed out)
[2:39] * aliguori (~anthony@cpe-70-113-5-4.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[2:40] * yoshi (~yoshi@p2100-ipngn4002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:42] * BManojlovic (~steki@85.222.223.220) Quit (Quit: Ja odoh a vi sta 'ocete...)
[2:50] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[2:57] * nwat (~Adium@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[2:57] * nwat (~Adium@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[3:00] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[3:01] * agh (~agh@www.nowhere-else.org) has joined #ceph
[3:18] * Cube1 (~Cube@173.155.44.188) has joined #ceph
[3:22] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:24] * JohansGlock (~quassel@kantoor.transip.nl) Quit (Read error: Connection reset by peer)
[3:26] * mattbenjamin (~matt@65.160.16.60) Quit (Quit: Leaving.)
[3:26] * mattbenjamin (~matt@65.160.16.60) has joined #ceph
[3:31] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[3:34] * mattbenjamin (~matt@65.160.16.60) Quit (Ping timeout: 480 seconds)
[3:45] * Cube1 (~Cube@173.155.44.188) Quit (Ping timeout: 480 seconds)
[3:45] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[4:02] * aliguori_ (~anthony@cpe-70-112-157-151.austin.res.rr.com) Quit (Remote host closed the connection)
[4:25] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[4:35] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:41] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[4:46] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit ()
[4:46] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[5:13] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[5:22] * Cube (~Cube@173.155.44.188) has joined #ceph
[5:32] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[5:33] * Cube (~Cube@173.155.44.188) Quit (Ping timeout: 480 seconds)
[5:33] * chutzpah (~chutz@199.21.234.7) Quit (Quit: Leaving)
[5:37] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[5:48] * Cube (~Cube@173.155.44.188) has joined #ceph
[5:57] * Cube (~Cube@173.155.44.188) Quit (Ping timeout: 480 seconds)
[6:00] * agh (~agh@www.nowhere-else.org) has joined #ceph
[6:04] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[6:04] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[6:17] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:30] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:37] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[6:50] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:59] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[7:00] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[7:10] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[7:11] * Cube (~Cube@173.155.44.188) has joined #ceph
[7:15] * Cube (~Cube@173.155.44.188) Quit ()
[7:37] * gaveen (~gaveen@112.135.134.98) has joined #ceph
[7:46] * schlitzer_work is now known as schlitzer|work
[8:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:14] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:15] * low (~low@188.165.111.2) has joined #ceph
[8:27] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[8:27] * tnt (~tnt@86.188-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:28] * madkiss (~madkiss@213.221.125.229) has joined #ceph
[8:37] * Cube (~Cube@173.155.44.188) has joined #ceph
[8:38] * madkiss (~madkiss@213.221.125.229) Quit (Quit: Leaving.)
[8:49] * gaveen (~gaveen@112.135.134.98) Quit (Remote host closed the connection)
[8:51] * gaveen (~gaveen@112.135.134.98) has joined #ceph
[8:56] * madkiss (~madkiss@62.96.31.190) has joined #ceph
[8:56] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: I used to think I was indecisive, but now I'm not too sure.)
[9:09] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[9:14] * Cube (~Cube@173.155.44.188) Quit (Quit: Leaving.)
[9:37] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[9:41] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[9:41] * agh (~agh@www.nowhere-else.org) has joined #ceph
[9:42] * tnt (~tnt@86.188-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:42] <absynth_47215> morning
[9:46] <schlitzer|work> \o/
[9:47] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[9:49] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[9:51] * Leseb (~Leseb@193.172.124.196) has joined #ceph
[9:52] * dpippenger (~riven@cpe-76-166-221-185.socal.res.rr.com) has joined #ceph
[9:56] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[10:10] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) has joined #ceph
[10:10] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[10:29] * sagewk (~sage@2607:f298:a:607:e5f3:fe46:26e2:2fe) Quit (Read error: Operation timed out)
[10:33] * LeaChim (~LeaChim@b0faeeb0.bb.sky.com) has joined #ceph
[10:37] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[10:45] * sagewk (~sage@2607:f298:a:607:a911:b1:1097:bd96) has joined #ceph
[10:45] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[10:45] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[10:48] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[10:53] * jtangwk (~Adium@2001:770:10:500:ac68:810:d319:bdd4) Quit (Quit: Leaving.)
[10:54] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:58] * jtangwk (~Adium@2001:770:10:500:fc72:f03a:59a8:3b3f) has joined #ceph
[11:02] <Kioob`Taff> hi
[11:02] <Kioob`Taff> I have a big latency problem on RBD devices
[11:03] <Kioob`Taff> iostat showing 3000 to 15000ms of «await» time
[11:03] <Kioob`Taff> but on the OSDs, I don't see that
[11:03] <Kioob`Taff> is it a network problem ?
[11:10] * JohansGlock (~quassel@kantoor.transip.nl) has joined #ceph
[11:12] <Kioob`Taff> «ping» of each OSD (and the MON) during that problem doesn't show any latency
[11:13] <Kioob`Taff> and ceph status is OK
[11:17] * match (~mrichar1@pcw3047.see.ed.ac.uk) has joined #ceph
[11:18] * Morg (d4438402@ircip2.mibbit.com) has joined #ceph
[11:21] <Kioob`Taff> mmm I have a huge amount of network transfer between OSDs...
[11:21] <Kioob`Taff> «2 active+clean+scrubbing»
[11:22] <Kioob`Taff> is there a way to disable that ?
[11:22] <Kioob`Taff> (for test)
[11:23] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[11:29] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[11:30] * jano (c358ba02@ircip4.mibbit.com) has joined #ceph
[11:31] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[11:35] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Remote host closed the connection)
[11:35] <Kioob`Taff> I confirm the problem comes from the background scrubbing
[11:36] <Kioob`Taff> for now the OSDs are on the same network as the clients... which is of course a bad idea
[11:44] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) has joined #ceph
[11:44] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[11:58] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[12:09] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[12:09] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[12:09] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[12:10] * The_Bishop (~bishop@e179004212.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[12:10] * agh (~agh@www.nowhere-else.org) has joined #ceph
[12:10] * yehudasa (~yehudasa@2607:f298:a:607:44f5:f47d:52c5:c82d) Quit (Ping timeout: 480 seconds)
[12:10] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Remote host closed the connection)
[12:10] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[12:12] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[12:18] * yehudasa (~yehudasa@2607:f298:a:607:10dc:fb8:97ed:714f) has joined #ceph
[12:21] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[12:21] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[12:26] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) has joined #ceph
[12:33] * jluis (~JL@89-181-159-29.net.novis.pt) has joined #ceph
[12:33] * jluis (~JL@89-181-159-29.net.novis.pt) Quit (Remote host closed the connection)
[12:33] * joao (~JL@89.181.159.29) Quit (Remote host closed the connection)
[12:33] * joao (~JL@89-181-159-29.net.novis.pt) has joined #ceph
[12:33] * ChanServ sets mode +o joao
[12:34] * The_Bishop (~bishop@f052099163.adsl.alicedsl.de) has joined #ceph
[12:48] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[12:56] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[13:06] * allsystemsarego (~allsystem@5-12-241-245.residential.rdsnet.ro) has joined #ceph
[13:12] * Morg (d4438402@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[13:18] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[13:18] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) Quit (Remote host closed the connection)
[13:27] * asanka (~asankanis@103.247.48.167) has joined #ceph
[13:28] * jano (c358ba02@ircip4.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[13:28] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[13:31] <nhm> good morning #ceph
[13:35] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:36] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[13:38] <Kioob`Taff> 'morning nhm ;)
[13:43] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[13:44] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) has joined #ceph
[13:54] * aliguori (~anthony@cpe-70-112-157-151.austin.res.rr.com) has joined #ceph
[14:02] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[14:11] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[14:11] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[14:12] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[14:18] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) Quit (Remote host closed the connection)
[14:18] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[14:25] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[14:30] * Tribaal (uid3081@id-3081.hampstead.irccloud.com) has joined #ceph
[14:31] <Tribaal> Hi all! I have a patch to the python wrapper that we use internally I'd like to submit - should I open a github pull request? Against what branch should I throw that?
[14:31] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) has joined #ceph
[14:34] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[14:35] * agh (~agh@www.nowhere-else.org) has joined #ceph
[14:35] <Tribaal> I mean a patch to src/pybind/rados.py to be precise
[14:38] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[14:39] <janos> Tribaal: i think the inktank guys are mostly west-coast US (i could be wrong) so they could answer that in a few hours
[14:39] <Tribaal> janos: ah ok :) Thanks
[14:42] * XSBen (~XSBen@195.220.156.20) Quit (Quit: Quitte)
[14:46] <Tribaal> ok, I'll open a pull request in the meanwhile :)
[14:51] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[14:52] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[14:59] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:00] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit ()
[15:05] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[15:05] <jmlowe> good morning
[15:06] <jmlowe> any suggestions for what to do with a active+clean+inconsistent pg?
[15:07] <fghaas> http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/1697
[15:07] <fghaas> ceph pg repair would be your first option
[15:08] <jmlowe> perfect, thanks!
[15:08] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Read error: Operation timed out)
[15:09] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Remote host closed the connection)
[15:13] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[15:14] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[15:17] <jmlowe> hmm, I don't think it worked http://pastebin.com/VycrBEhK
[15:23] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[15:24] <tnt> Does ceph do some periodic task every 10 min ?
[15:25] <jmlowe> it scrubs when idle, there is a blog post about it
[15:26] <jmlowe> right so I guess I need to know how to fix this 2013-01-10 09:15:10.530393 osd.2 [ERR] scrub 2.10d 972ffd0d/rb.0.1e4a.2ae8944a.000000000676/head//2 on disk size (4173824) does not match object info size (4194304)
[15:26] <jmlowe> and ideally why it happened
[15:27] <fghaas> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/5186 <- ML post from sage about a similar issue
[15:27] <fghaas> bit old, and there's been quite a few osd code changes in the interim, so YMMV
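A minimal sketch of that repair workflow, assuming a bobtail-era cluster; the PG id below is the one from jmlowe's scrub error and stands in for whatever the health output actually reports:
    ceph health detail                  # lists PGs in active+clean+inconsistent
    ceph pg dump | grep inconsistent    # alternative way to spot them
    ceph pg repair 2.10d                # ask the primary OSD to re-scrub and repair that PG
    ceph -w                             # watch for the scrub/repair result messages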
[15:34] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[15:34] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Read error: Connection reset by peer)
[15:34] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[15:35] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[15:39] * Joel (~chatzilla@2001:620:0:46:39cf:6b36:5753:359b) has joined #ceph
[15:40] <jmlowe> is there a way for me to find out what rbd image an object belongs to?
[15:52] <jmlowe> also should I file a bug for this?
[15:52] <tnt> with rbd info you can get the 'prefix' for each image
[15:53] * loicd trying to figure out how to run teuthology
[15:55] <fghaas> jmlowe: there is
[15:55] <fghaas> ... just a sec ...
[15:55] <jmlowe> right so this is a problem: rbd info centosi386convert
[15:55] <jmlowe> rbd: error opening image centosi386convert: (2) No such file or directory
[15:55] <jmlowe> 2013-01-10 09:53:55.963753 7fdf67529780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
[15:57] <fghaas> rbd info <name> should give you block_name_prefix... if that's not working perhaps the busted object is either the rbd_directory object, or the index object for the centosi386convert image?
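A sketch of the prefix matching described above, assuming a format-1 image in the default 'rbd' pool; <image> is a placeholder and the prefix shown is the one pasted later in this log:
    rbd info <image> | grep block_name_prefix            # e.g. rb.0.1e4a.2ae8944a
    rados -p rbd ls | grep '^rb\.0\.1e4a\.2ae8944a\.'     # data objects carrying that prefix
    rados -p rbd ls | grep '\.rbd$'                       # format-1 header objects, one per image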
[16:00] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[16:01] <tnt> jmlowe: what version is your cluster and the client ?
[16:06] * gaveen (~gaveen@112.135.134.98) Quit (Ping timeout: 480 seconds)
[16:07] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Ping timeout: 480 seconds)
[16:11] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[16:12] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Quit: Leaving.)
[16:12] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[16:14] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit ()
[16:14] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[16:15] * gaveen (~gaveen@112.135.154.201) has joined #ceph
[16:19] * allsystemsarego (~allsystem@5-12-241-245.residential.rdsnet.ro) Quit (Quit: Leaving)
[16:24] * aliguori (~anthony@cpe-70-112-157-151.austin.res.rr.com) Quit (Quit: Ex-Chat)
[16:25] <jmlowe> 0.56.1-1quantal from debian-testing and I used this kernel client as well as the qemu client Linux version 3.5.0-21-generic (buildd@allspice) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) ) #32-Ubuntu SMP Tue Dec 11 18:51:59 UTC 2012
[16:27] * ScOut3R_ (~ScOut3R@dslC3E4E249.fixip.t-online.hu) has joined #ceph
[16:27] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[16:29] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[16:31] <jmlowe> fghaas: I'm not sure as far as I can tell the broken object belongs to a different rbd image
[16:32] <fghaas> but you're getting the "error finding header" object for any "rbd info" command?
[16:32] <jmlowe> 972ffd0d/rb.0.1e4a.2ae8944a.000000000676/head//2 is the bad object but rbd leads me to believe it belongs to a different image
[16:32] <jmlowe> root@gwbvm1:~# rbd info centosi386convert
[16:32] <jmlowe> rbd: error opening image centosi386convert: (2) No such file or directory
[16:32] <jmlowe> 2013-01-10 09:53:55.963753 7fdf67529780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
[16:32] <jmlowe> root@gwbvm1:~# rbd info centos5x86_64convert
[16:32] <jmlowe> 2013-01-10 09:54:18.172090 7f46a0b12700 0 -- :/1019784 >> 149.165.228.9:6789/0 pipe(0x2755190 sd=4 :0 pgs=0 cs=0 l=1).fault
[16:32] <jmlowe> rbd image 'centos5x86_64convert':
[16:32] <jmlowe> size 10240 MB in 2560 objects
[16:32] <jmlowe> order 22 (4096 KB objects)
[16:32] <jmlowe> block_name_prefix: rb.0.1e4a.2ae8944a
[16:32] <jmlowe> format: 1
[16:33] <jmlowe> I would think it belonged to centos5x86_64convert, but clearly centosi386convert is seriously broken
[16:33] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[16:34] <fghaas> may not be related then
[16:35] <fghaas> are you sure you even have a header object for centosi386convert?
[16:35] <fghaas> (you should be able to check with "rados -p rbd ls" and some clever grepping)
[16:35] <fghaas> assuming you used the default rbd pool for your images
[16:35] * ircolle (~ircolle@c-67-172-132-164.hsd1.co.comcast.net) has joined #ceph
[16:36] <jmlowe> 24 hours ago I was running a vm from it no problems, created 2 new rbd devices and rsync'ed about 700GB of data to the two rbd devices overnight, one with the kernel client and one with the qemu client
[16:38] <jmlowe> then inconsistent object this morning, I did do the repair before discovering the rbd: error opening image centosi386convert: (2) No such file or directory
[16:38] <jmlowe> underlying filesystems for osd's are btrfs with raid 10 for both data and metadata, they scrub clean with no errors
[16:41] <jmlowe> wait a minute, the output of 'rados -p rbd ls|grep rb.0.1e4a.2ae8944a |wc -l' should match the number of objects from 'rbd info' right?
[16:44] <fghaas> erm. well I wouldn't be surprised if they differed by 1 (the header object), but not 100% sure tbh
[16:45] <jmlowe> maybe it's lazy about object creation?
[16:45] <fghaas> that it is, yes
[16:46] <fghaas> so the "size 10240 MB" does mean the max size, not the actually allocated data
[16:47] <fghaas> but I've never bothered to check about the object count, and just right now don't have access to a box where I can
[16:47] <fghaas> (sorry about that :) )
[16:47] <jmlowe> it would make sense not to create objects until they are actually written to
[16:47] <tnt> yes, the image is sparse by default.
[16:48] <tnt> which is why creation is very fast, but deletion takes time (because need to try and delete each possible object to check if it exists or not)
[16:51] <nhm> yay, I've sort of mastered gnuplot.
[16:51] * calebmiles (~caleb@65-183-137-95-dhcp.burlingtontelecom.net) Quit (Ping timeout: 480 seconds)
[16:52] * Joel (~chatzilla@2001:620:0:46:39cf:6b36:5753:359b) Quit (Ping timeout: 480 seconds)
[16:53] * aliguori (~anthony@32.97.110.59) has joined #ceph
[16:53] <tnt> nhm: now you can move on to rrdtool :p
[16:55] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[16:55] * low (~low@188.165.111.2) Quit (Quit: Leaving)
[16:55] * sander (~chatzilla@c-174-62-162-253.hsd1.ct.comcast.net) has joined #ceph
[16:56] <jmlowe> right, so I think 0.56 has a serious problem
[16:56] <tnt> 0.56 does have a serious problem. 0.56.1 fixed it ...
[16:57] <jmlowe> right make that 0.56.1
[16:58] <jmlowe> I have objects that have the prefix rb.0.1a4e.238e1f29 and by process of elimination must belong to my broken image
[17:00] * yanzheng (~zhyan@101.83.28.247) has joined #ceph
[17:01] <jmlowe> another image with the prefix rb.0.1e4a.2ae8944a had a corrupt object of both the wrong digest and size according to ceph pg repair, following repair the object is stuck at the wrong size
[17:02] <jmlowe> I think that pretty much sums it up?
[17:03] * rlr219 (43c87e04@ircip1.mibbit.com) has joined #ceph
[17:04] <jmlowe> so where should I go from here?
[17:04] <tnt> is the image important ? or can you just afford to lose it ?
[17:05] <rlr219> Just set up a new cluster running 0.56.1 and tried to set up 3 MDSs. First MDS started fine, but other two won't run. Log has this in the line "4=dir inode in separate object} not writeable with daemon features" ....killing myself
[17:05] <jmlowe> it's a throw away, all my images are except one
[17:05] <rlr219> I am guessing this is a write permission issue, but to where?
[17:09] <Kioob`Taff> is there a way to cancel a running scrub, or disable scrubbing ? Maybe reduce the priority or any tuning parameter ?
[17:11] <jmlowe> Kioob`Taff: check this http://ceph.com/dev-notes/whats-new-in-the-land-of-osd/ I think it has some stuff about scrub and priority and how it's changed over time
[17:12] <Tribaal> hey all, since the US is awake now I'll go ahead and throw my question in again - I just opened a pull request on github with some small additions to the python API. Is that the proper contribution format?
[17:12] <Tribaal> (I added a sign-off line)
[17:12] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[17:13] <ircolle> Tribaal: people have been sending patches to the ceph-dev mailing list for review and inclusion
[17:14] <Kioob`Taff> In Bobtail, we scrub a PG in chunks, only pausing writes on the set of objects we are currently scrubbing. This way, no object has writes blocked for long. <=== rahhhhh !!!
[17:14] <Kioob`Taff> thanks jmlowe !
[17:14] <Tribaal> ircolle: is that a way of saying "you should send your patch to the mailing list instead"?
[17:15] <Kioob`Taff> that's why I have those crazy latencies on writes !
[17:15] <jmlowe> Kioob`Taff: np, glad I could be useful
[17:17] <ircolle> Tribaal - I'm saying that's one way to ensure someone looks at your patches :-)
[17:17] <Tribaal> ircolle: hehe ok, I'll send a message to the list pointing to the github pull request and brace for the flame :)
[17:18] * deckid (~deckid@clientvpn.sentryds.com) has joined #ceph
[17:20] <ircolle> Tribaal: might want to consider git send-email
[17:21] <deckid> Im looking for information on RAID stripe/chunk size to use. I read through the performance part 1 post, but there was no mention of the size used for RAID0. Are there any good documents that relate to stripe sizes and how they effect performance?
[17:21] <Tribaal> ircolle: hum. ok then... seems quite old fashioned, especially since the code is on github, but who am I to judge? :)
[17:22] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[17:22] * yanzheng1 (~zhyan@101.82.113.100) has joined #ceph
[17:22] * agh (~agh@www.nowhere-else.org) has joined #ceph
[17:23] <absynth_47215> i wonder, were you able to resolve that issue with the "halted" cluster yesterday?
[17:23] <absynth_47215> don't remember the nick, unfortunately
[17:23] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:24] <fghaas> Tribaal: I've submitted patches via GItHub pull requests, and I can assure you that those are being looked at, too :)
[17:25] <jmlowe> deckid: nhm is the resident performance guy
[17:25] <Tribaal> fghaas: ahhh nice :) should I just wait then, or is a little email to the dev mailing list still a requirement?
[17:26] <fghaas> like I said, I've had patches merged that I only submitted pull reqs for, and at the time I asked the same question. sage told me that either was fine
[17:26] <Tribaal> fghaas: awesome, thanks for your input!
[17:26] * yanzheng (~zhyan@101.83.28.247) Quit (Ping timeout: 480 seconds)
[17:26] <fghaas> the signed-off-by line is indeed required, but you've got that covered
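For the mailing-list route ircolle mentioned, a sketch of the usual kernel-style flow; the output directory is a placeholder and the list address is ceph-devel@vger.kernel.org:
    git commit -s                                   # -s adds the Signed-off-by line
    git format-patch origin/master -o outgoing/     # one patch file per commit
    git send-email --to ceph-devel@vger.kernel.org outgoing/*.patch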
[17:27] * gregaf (~Adium@2607:f298:a:607:8d9:37d1:d5f7:46d3) Quit (Read error: Operation timed out)
[17:28] * gregaf (~Adium@2607:f298:a:607:8006:2bb6:9c15:a221) has joined #ceph
[17:28] <rlr219> can anyone help me with MDSs not starting in bobtail. configured 3, but only 1 started
[17:29] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[17:29] <Kioob`Taff> jmlowe: it seems that this «ChunkyScrub» is already in the 0.55 code
[17:29] <Kioob`Taff> is there a way to find the chunk size that is used ?
[17:32] * kbad_ (~kbad@malicious.dreamhost.com) Quit (Remote host closed the connection)
[17:38] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[17:39] <jmlowe> Kioob`Taff: that article is the full extent of my knowledge
[17:39] * kbad (~kbad@malicious.dreamhost.com) has joined #ceph
[17:41] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[17:42] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:44] * kbad (~kbad@malicious.dreamhost.com) Quit (Remote host closed the connection)
[17:44] * kbad (~kbad@malicious.dreamhost.com) has joined #ceph
[17:49] * kbad (~kbad@malicious.dreamhost.com) Quit (Remote host closed the connection)
[17:51] * tnt (~tnt@86.188-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[17:54] * chutzpah (~chutz@199.21.234.7) has joined #ceph
[17:58] * xmltok (~xmltok@pool101.bizrate.com) has joined #ceph
[18:00] * yanzheng1 (~zhyan@101.82.113.100) Quit (Ping timeout: 480 seconds)
[18:05] * rlr219 (43c87e04@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[18:07] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[18:10] * Leseb (~Leseb@193.172.124.196) Quit (Quit: Leseb)
[18:11] * ScOut3R_ (~ScOut3R@dslC3E4E249.fixip.t-online.hu) Quit (Ping timeout: 480 seconds)
[18:20] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[18:28] * kbad (~kbad@malicious.dreamhost.com) has joined #ceph
[18:32] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[18:37] * jlogan1 (~Thunderbi@2600:c00:3010:1:9cc3:821f:978c:5b0b) has joined #ceph
[18:38] <jmlowe> so what should I do with this broken cluster, is there anything useful for the dev's?
[18:39] <gregaf> let me catch up on the conversation
[18:40] <madkiss> i have a very strange problem here. 3 nodes, 14 OSDs each. bobtail. If I do a 100gig dd onto a CephFS mounted on a host
[18:40] <madkiss> the cluster becomes almost unresponsive
[18:40] <madkiss> if I take out a node of the cluster, the cluster becomes unresponsive for 50 minutes (and counting)
[18:41] <madkiss> (one node, i.e. 14 OSDs)
[18:42] <madkiss> neither the disk i/o nor the network i/o on any of the servers is properly saturated
[18:44] <madkiss> gregaf: any ideas? :)
[18:45] <jmlowe> gregaf: 0.56.1 from debian-testing, created two rbd's of around 400GB put filesystems on them rsynced data one with qemu client and one with ubuntu 3.5.0-21-generic kernel client, overnight one object went inconsistent had a digest and size error, ran ceph pg repair and it fixed the hash missmatch but size error remains
[18:47] <gregaf> few minutes; morning standup
[18:48] <madkiss> okay, brb
[18:48] * madkiss (~madkiss@62.96.31.190) Quit (Quit: Leaving.)
[18:48] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[18:49] <paravoid> sage: re: #3747, I didn't change the pool size
[18:49] * fghaas (~florian@91-119-215-212.dynamic.xdsl-line.inode.at) has joined #ceph
[18:49] <paravoid> that's what the commit message says is fixing, I'm not entirely sure what it actually does though
[18:49] <paravoid> or is this because CRUSH and hence the active set was changed?
[18:57] <sagewk> paravoid: ah, throught that was the other bug, sorry
[18:58] <paravoid> okay
[18:58] <paravoid> thanks
[18:59] * paravoid is filing bugs like crazy :)
[19:01] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:01] <gregaf> jmlowe: I give you to sjust, who will ask questions about how your data got inconsistent :)
[19:03] <gregaf> my suspicion (though I don't know for sure) is that you also lost some objects whenever your PG got corrupted
[19:03] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: Few women admit their age. Few men act theirs.)
[19:04] <wer> does ceph.conf need to know about all the radosgw's? Or does it really just need to know if one is running locally?
[19:04] <gregaf> wer: depends on how you're managing configs, but in general you only need it to know about the local ones
[19:06] <gregaf> Tribaal: did you say you'd been using that patch internally?
[19:07] <wer> sweet. I kind of thought so. I am on the fence about whether I should put each entry in there or not... I am leaning towards running radosgw on each of my osd nodes and am deciding how to do it.
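A sketch of what a per-node radosgw section looked like around bobtail, assuming the FastCGI/Apache setup from the docs of that era; the hostname and paths are placeholders:
    [client.radosgw.osd-node-1]
        host = osd-node-1
        keyring = /etc/ceph/keyring.radosgw.osd-node-1
        rgw socket path = /var/run/ceph/radosgw.osd-node-1.sock
        log file = /var/log/ceph/radosgw.osd-node-1.log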
[19:07] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:07] <Tribaal> gregaf: yes
[19:08] <gregaf> all right
[19:08] <Tribaal> gregaf: not in production though
[19:08] <gregaf> normally I'd throw it at Josh to make sure it followed conventions, but he's on vacation
[19:09] <gregaf> it doesn't look broken to me and you say it runs, so good enough
[19:09] <Tribaal> gregaf: well this one is really quite trivial :)
[19:09] <gregaf> *pushes magic button*
[19:09] <Tribaal> gregaf: I have another one in the pipe that will take a little longer to cleanup
[19:09] <gregaf> thanks!
[19:09] <Tribaal> gregaf: yay :)
[19:10] <gregaf> Kioob`Taff: did you turn off scrub and see your rbd latency get better? because I wouldn't expect it to manifest like that
[19:11] <gregaf> (although it could)
[19:12] * madkiss (~madkiss@213.221.125.229) has joined #ceph
[19:12] <madkiss> back I am.
[19:15] <wer> all the magic for radosgw is actually kept in the various pools in ceph right? So a new instance needs no "local" storage other than its keyring?
[19:16] <wer> err key rather.
[19:16] <madkiss> gregaf: do you have a minute now? :)
[19:17] * asanka (~asankanis@103.247.48.167) Quit (Remote host closed the connection)
[19:17] * xmltok (~xmltok@pool101.bizrate.com) Quit (Quit: Leaving...)
[19:18] <jmlowe> sjust: you around?
[19:23] * mattbenjamin (~matt@65.160.16.60) has joined #ceph
[19:24] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[19:26] <gregaf> wer: correct
[19:26] <wer> ty gregaf :)
[19:26] <gregaf> jmlowe: I thought he was about to start talking, but he's across the room from me in a standup and I'll poke him when done
[19:26] <gregaf> madkiss: yeah
[19:27] <gregaf> madkiss: how are you measuring unresponsive? what does ceph -s look like while this is happening?
[19:27] <gregaf> and what is your replication level?
[19:28] <madkiss> 2 replicas.
[19:28] * The_Bishop_ (~bishop@e179008006.adsl.alicedsl.de) has joined #ceph
[19:29] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[19:29] <gregaf> jmlowe: he says 15 minutes, or maybe less
[19:29] <jmlowe> gregaf: ok, great, now that I've had lunch and sufficient caffeine it doesn't seem nearly as bad
[19:29] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[19:30] <madkiss> gregaf: Unfortunately I didn't take a particular look at "ceph -s", but i have logfiles available from the time
[19:30] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:31] <jmlowe> gregaf: earlier I thought I had lost a rbd, but it was just an uncaffeinated typo
[19:31] <madkiss> gregaf: What I was doing to come to the conclusion that the thing was unresponsive: I was running a "dd" within a VM running on top of an RBD image on the same cluster, and the vm had 99% IO-wait
[19:32] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[19:33] * agh (~agh@www.nowhere-else.org) has joined #ceph
[19:33] <sjust> jmlowe: repair's current strategy is to pick the primary copy
[19:33] <sjust> further, it doesn't currently repair that size mismatch
[19:33] <sjust> the easy way to fix it is to do a rados truncate of the object to its actual on-disk size
[19:33] * ScOut3R (~ScOut3R@dsl5401A397.pool.t-online.hu) has joined #ceph
[19:33] * The_Bishop (~bishop@f052099163.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[19:34] <gregaf> madkiss: okay, I don't think that's the cluster…we've seen some reports recently of VMs getting very slow when doing heavy-duty disk access which we haven't seen before, but it's a local problem not a cluster one
[19:34] <sjust> the rados tool does not at this time support truncate; I have a patch that adds support, which I am going to put into the bobtail branch
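A sketch for inspecting that object with the tooling available at the time, assuming the default pool layout where pool 2 is 'rbd'; it only reports what the OSD believes the object's size to be:
    rados -p rbd stat rb.0.1e4a.2ae8944a.000000000676          # prints mtime and size
    rados -p rbd get rb.0.1e4a.2ae8944a.000000000676 /tmp/obj
    ls -l /tmp/obj                                             # compare against the 4194304 in the object info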
[19:35] <wer> ok I am being lazy, I don't think sharing keyrings between radosgw is a good thing.
[19:35] <jmlowe> sjust: any idea why it happened?
[19:35] <sjust> jmlowe: was there an osd failure?
[19:35] <sjust> or a disk failure?
[19:35] <jmlowe> sjust: no
[19:35] <sjust> odd
[19:35] <madkiss> aherm.
[19:36] <sjust> the actual error might have happened a while ago
[19:36] <jmlowe> sjust: rbd that contained the inconsistent object was create 24 hours ago
[19:37] <sjust> hmm
[19:37] <sjust> and were there power failures in the meantime (were all osds running the whole time)?
[19:38] <jmlowe> sjust: I know, you would think I would have had problems with the two 400GB rbd's I was copying lots of data to but it happened to the 10GB rbd that was backing the vm doing the rsync
[19:39] <madkiss> gregaf: I have trouble believing that. What I can see is that as soon as the OSDs on the failed node were marked down, I had a lot of slow requests in the logs; but clearly the network wasn't saturated
[19:39] <madkiss> at least not according to tools like iftop
[19:39] <jmlowe> sjust: 1/2 ton flywheel + battery ups+ fast start 500KW gen + multiple utility power circuits - we don't have many power problems
[19:40] <jmlowe> sjust: dropped packets possibly?
[19:40] <sjust> not likely to cause that fault
[19:40] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[19:40] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:41] * danieagle (~Daniel@177.97.248.250) has joined #ceph
[19:41] <jmlowe> sjust: so when is repair slated to pick the object with the valid hash instead of the primary?
[19:42] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[19:43] <jmlowe> sjust: hmm, I have this but the pg is on osds 2 and 8 libceph: osd3 xxx.xxx.xxx.xxx:6800 socket closed
[19:44] <jmlowe> sjust: also 'socket error on read' in addition to 'socket closed'
[19:44] <sjust> jmlowe: unfortunately, there is no "valid" hash at this time
[19:44] <gregaf> madkiss: slow requests aren't too surprising when you kill a whole node; the recovery ops in particular are deliberately low in importance (there's a lot of configurables in this area that need a bunch of tuning)
[19:45] <sjust> we don't track the checksums, we generate them during scrub and compare across replicas
[19:45] <sjust> but we don't know necessarily which is correct
[19:45] <sjust> we are discussing end-to-end checksumming for the next release
[19:45] <gregaf> the RBD thing is something we need to talk about internally when some people get back from vacation, but it's a new problem and probably something in QEMU; certainly not evidence of the cluster causing trouble
[19:46] <jmlowe> sjust: ah, so primary is always "known digest"
[19:47] <madkiss> gregaf: any specific osd option i am supposed to look at?
[19:48] <gregaf> osd_recovery_threads, osd_recovery_delay_start, osd_recovery_max_active, osd_recovery_max_chunk, osd_recovery_op_priority
[19:48] <gregaf> in particular the recovery threads, max active, and op priority
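Written out as a hedged ceph.conf sketch; the values are only examples to show the shape of the options gregaf lists, not tuned recommendations:
    [osd]
        osd recovery threads = 1
        osd recovery delay start = 0
        osd recovery max active = 5
        osd recovery max chunk = 8388608   ; 8 << 20 bytes
        osd recovery op priority = 10      ; 0-63, higher means recovery takes more bandwidth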
[19:49] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[19:49] <jmlowe> gregaf: has anybody else run across this kind of problem with rbd and qemu?
[19:50] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[19:50] * xmltok (~xmltok@pool101.bizrate.com) has joined #ceph
[19:51] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[19:52] <Kioob> (19:10:58) gregaf: I didn't find how to turn off scrub. But when it was ended, everything came fast again
[19:53] <Kioob> I had nearly 2Gbps of network traffic, just to scrub 2 PGs. Shouldn't the OSDs exchange only a hash of each chunk ?
[19:54] <madkiss> gregaf: thanks a lot!
[19:54] <gregaf> jmlowe: I don't think we've seen it internally, but I've seen several variants on the theme in the last couple of weeks
[19:54] <madkiss> gregaf: osd recovery max active looks work being looked at, too
[19:54] <Kioob> I'll add: there was no read activity, all data was coming from the cache I suppose (I have 48GB of memory on each host)
[19:55] <jmlowe> sjust: if I wanted to get clean again should I delete the rbd image?
[19:55] <madkiss> gregaf: oh, are all these settings documented somewhere? ;-)
[19:55] <gregaf> Kioob: everything came fast again in the VM, or you stopped getting slow request warnings?
[19:56] <gregaf> madkiss: probably not; just a moment though because sjust is panicking about one of the defaults
[19:56] <gregaf> and/or experiencing joy
[19:56] <madkiss> must be great working in you guys' office ;)
[19:57] <Kioob> I stopped having IOWait on the host machine (the one mounting all RBD devices), and everything came fast again in multiple VMs
[19:57] <Kioob> are there logs for «slow request» ?
[20:00] <gregaf> Kioob: so you've got a machine which is mounting RBD devices that is completely separate from the OSDs, and that machine is getting IOWAIT while the OSDs scrub which goes away when they stop?
[20:01] <Kioob> yes
[20:01] <gregaf> what version are you running and have you seen this before?
[20:01] <Kioob> 0.55
[20:01] <gregaf> can you pastebin ceph -s somewhere?
[20:02] <Kioob> I often have latency, but I didn't find why
[20:02] <Kioob> http://pastebin.com/22PRCFg9
[20:03] <Kioob> one thing : I made a mistake when I create pools, so I have few PG
[20:04] * The_Bishop__ (~bishop@e179010209.adsl.alicedsl.de) has joined #ceph
[20:06] <gregaf> hmm, this story doesn't make any sense so I'm not sure where to go next
[20:07] <gregaf> Kioob: how did you measure your network traffic between OSDs? 2gbps for a scrub is ridonkulous
[20:07] <Kioob> iftop
[20:07] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Quit: Leaving)
[20:08] <gregaf> and ceph -s didn't report anything else happening, like maybe a backfill?
[20:09] * The_Bishop_ (~bishop@e179008006.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[20:09] <Kioob> gregaf: if it can help, on each host running OSD (4 hosts, each one with 8 OSD), there was less than 0.1% of IOWait... and iostat was showing an average of less than 1 cmd in queue per second. But on the client server using some RBD devices, iostat was showing 3000 to 15000 ms of latency (the "await" col)
[20:10] <Kioob> gregaf: it's the first thing I was checking. There was no backfilling at all, just 2 PG scrubbing
[20:10] <gregaf> were you doing a bunch of writes on the RBD devices? and how were they mounted?
[20:10] <Kioob> small writes yes, but it's mainly databases, so small IO writes
[20:11] <gregaf> scrubbing two PGs really can't generate that much data; enough writes going through could
[20:11] <Kioob> The client host is a Xen host, with VM
[20:11] <gregaf> using kernel rbd?
[20:11] <Kioob> yes
[20:11] <madkiss> gregaf: btw. a backfill was what I was seeing in my logs, too
[20:11] <Kioob> but there was very little network activity between that Xen host and the Ceph cluster
[20:12] <gregaf> madkiss: yes, that's what happens when you kill storage nodes; backfill is data transfer to a new replica
[20:12] <Kioob> (and the network traffic really was between OSDs)
[20:12] * match (~mrichar1@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[20:13] <madkiss> gregaf: and I can influence that process by setting stuff like osd max recovery active, I presume. Let's see.
[20:13] <gregaf> if you can generate more environment info Kioob I'll take a look but there's not really enough there for me to even make wild-ass guesses :/
[20:13] <gregaf> madkiss: yeah, max active is how many PGs an OSD will recover at one time
[20:13] <janos> when looking to bump up network speed on a 1gb network - is there a preferred bonding mode? i've only ever done active/passive, which is definitely not the way to get more speed
[20:13] <Kioob> ok gregaf, no problem. What should I look at if that happens again ?
[20:15] <gregaf> Kioob: log ceph -w somewhere (or if you know how to grab it out of the monitor logs that's fine too); try and track down the daemon source of the network traffic; and have more info about which RBD disks are being written to in what ways
[20:15] <gregaf> the latency is high enough that I doubt it's disk fragmentation but that's my initial thought
[20:16] <Kioob> but iostat was not showing any latency on OSD hosts
[20:16] <gregaf> and apparently everybody running .55 or later should put "osd recovery delay start = 0" in their OSD configs
[20:16] <gregaf> madkiss: you in particular ^
[20:17] <janos> gregaf: what does that do?
[20:17] <madkiss> why so?
[20:17] <Kioob> oki, thanks for the tip
[20:17] <gregaf> Kioob: yeah, I'm confused too, so basically anything you can think of ;)
[20:17] <iggy> janos: if you can 802.3ad (or whatever it's called) other forms use some sort of mac hashing so you won't really get 2 gbps between hosts (at least in my experience)
[20:18] <Kioob> at first I was thinking of a network problem... because there was IOWait on client, but not on OSD
[20:18] <gregaf> janos: madkiss: if it's set to non-zero then it basically prevents recovery from happening until everything that needs to recover is queued up (ie, the rest of peering completes)
[20:18] <janos> iggy, thanks. i'm pretty sre my switches support that
[20:18] <janos> sre/sure
[20:18] <gregaf> but that can take a long time and the problem it was meant to address is better dealt with in other ways now
[20:18] <janos> ah, will do
[20:18] * The_Bishop_ (~bishop@e177090223.adsl.alicedsl.de) has joined #ceph
[20:18] <janos> just add to config and do a "-a restart" ?
[20:19] <iggy> if they are "managed" they should, but they often have a limit on how many bonds/trunks you can have
[20:19] <madkiss> so let's see
[20:19] <gregaf> janos: it's not urgent, just improves recovery behavior, but that's how you'd do it, yes
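A sketch of the two ways to apply it, assuming the old 'ceph osd tell ... injectargs' syntax of this era; the init-script form is what janos' "-a restart" refers to:
    ceph osd tell \* injectargs '--osd-recovery-delay-start 0'   # push to all running OSDs
    /etc/init.d/ceph -a restart osd                              # or restart so ceph.conf takes effect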
[20:19] <janos> iggy: managed. some dell powerconnects. this is my home network i'm experimenting on, so i shouldn't hit any limits ;)
[20:19] <janos> gregaf: thank you
[20:20] <iggy> janos: test with a net tool to make sure it actually works
[20:20] <janos> will do
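A sketch of the bonding setup being discussed, assuming Debian/Ubuntu-style /etc/network/interfaces with ifenslave and an LACP-capable switch; interface names and addresses are placeholders, and note that a single TCP stream still tops out at one link:
    auto bond0
    iface bond0 inet static
        address 192.168.42.10
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4
    # then test aggregate throughput with several parallel streams, e.g.:
    #   iperf -s                        (on one node)
    #   iperf -c 192.168.42.10 -P 4     (on the other)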
[20:21] <madkiss> gregaf: for the max size of push chunks, how is 1<<20 to be interpreted
[20:22] * deckid (~deckid@clientvpn.sentryds.com) Quit (Quit: Gone)
[20:24] <gregaf> bytes; that setting is probably fine
[20:24] * The_Bishop__ (~bishop@e179010209.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[20:24] <madkiss> osd recovery max active = 5 <= is the default. So in a box with 14 OSDs, I would have 4MB*5*14 of recovery stuff?
[20:24] <madkiss> which means 280mbyte?
[20:24] <gregaf> madkiss: actually if it's set at 1 << 20 you should bump it to 4 << 20; current default is 8MB
[20:25] <gregaf> or rather 8 << 20
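For reference, the shift notation is just bytes expressed as powers of two:
    1 << 20 = 1048576 bytes (1 MB)
    4 << 20 = 4194304 bytes (4 MB, the object size in jmlowe's scrub errors)
    8 << 20 = 8388608 bytes (8 MB, the default gregaf mentions for osd recovery max chunk)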
[20:25] <madkiss> gregaf: I am referring to http://ceph.com/docs/master/rados/configuration/osd-config-ref/
[20:25] <gregaf> what version are you running? that's changed at some point :/
[20:25] <madkiss> latest bobtail from the repo
[20:26] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[20:26] <madkiss> osd recovery op priority <= seems to be undocumented
[20:27] <gregaf> oh, that's fine then
[20:27] <gregaf> already set appropriately
[20:27] <gregaf> op priority influences how much it's dispatched
[20:27] <gregaf> higher means more important so it'll take up more of the bandwidth, roughly speaking
[20:27] <gregaf> values between 0 and 63
[20:27] <madkiss> so what would be a good value for that in a 3-node ceph cluster with 14 OSDs per node, all interconnected via gbit?
[20:28] <gregaf> no idea what the best way to tune is
[20:28] <gregaf> lunchtime!
[20:30] <absynth_47215> heh... madkiss, long time no see
[20:30] <absynth_47215> wschulze: around?
[20:31] <madkiss> hello absynth_47215
[20:31] <wschulze> absynth_47215: sorry I am on a conference call
[20:31] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[20:32] <absynth_47215> wschulze: no worries, i will just query you and you can read it whenever
[20:35] <absynth_47215> madkiss: still on #linux.de?
[20:35] <madkiss> sometimes
[20:44] * Meths (~meths@2.27.95.119) Quit (Quit: leaving)
[20:47] * Meths (~meths@2.27.95.119) has joined #ceph
[21:00] * sjustlaptop (~sam@2607:f298:a:607:6c0a:99b7:71c0:ecec) has joined #ceph
[21:14] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[21:20] <madkiss> gregaf: would "osd max backfills =1" work?
[21:20] <madkiss> for a setup like the described one?
[21:20] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[21:25] * sjustlaptop (~sam@2607:f298:a:607:6c0a:99b7:71c0:ecec) Quit (Quit: Leaving.)
[21:25] * sjustlaptop (~sam@2607:f298:a:607:3d8e:2b22:10bd:8b17) has joined #ceph
[21:34] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[21:35] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[21:36] * agh (~agh@www.nowhere-else.org) has joined #ceph
[21:41] * gaveen (~gaveen@112.135.154.201) Quit (Ping timeout: 480 seconds)
[21:42] * nwat (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[21:43] * sjustlaptop (~sam@2607:f298:a:607:3d8e:2b22:10bd:8b17) Quit (Quit: Leaving.)
[21:44] * sjustlaptop (~sam@2607:f298:a:607:f41c:e825:95e2:855b) has joined #ceph
[21:54] * The_Bishop_ (~bishop@e177090223.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[21:58] <gregaf> madkiss: max backfills = 1 means the OSD will only participate in one backfill at a time; it will reduce how much disk IO is devoted to recovery but will work just fine
[21:59] <madkiss> gregaf: well, with 14 OSDs in place, that would still mean I have a whopping 14 OSDs participating in backfills at a time, right?
[22:00] * sjustlaptop (~sam@2607:f298:a:607:f41c:e825:95e2:855b) Quit (Read error: Operation timed out)
[22:00] <jmlowe> I've had another pg go inconsistent on me in the past hour
[22:03] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[22:10] <jmlowe> sjust: I was able to reproduce loosing pg consistency
[22:10] <jmlowe> make that losing
[22:10] <gregaf> madkiss: well they'd only participate in one each, and so half would be serving and half would be consuming, approximately
[22:11] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[22:11] <madkiss> gregaf: thanks
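A sketch of setting it both persistently and at runtime, assuming the bobtail option name gregaf uses; the injectargs form only affects OSDs that are already running:
    [osd]
        osd max backfills = 1
    ; or at runtime:
    ;   ceph osd tell \* injectargs '--osd-max-backfills 1'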
[22:15] <sjust> jmlowe: any luck?
[22:15] <jmlowe> so far seems to be the same vm, with nothing really going on
[22:16] <sjust> you did reproduce it?
[22:16] <jmlowe> only difference is 64bit centos5.8 has 2 inconsistent pg's and a 32bit centos5.8 vm on the same machine is fine both vm's idle
[22:17] <jmlowe> 2013-01-10 16:08:41.450954 osd.1 [ERR] 2.9b osd.6: soid ff32e9b/rb.0.1e4a.2ae8944a.000000000502/head//2 size 4194304 != known size 4042752, digest 2382907887 != known digest 353635768
[22:17] <jmlowe> 2013-01-10 16:08:41.451010 osd.1 [ERR] deep-scrub 2.9b ff32e9b/rb.0.1e4a.2ae8944a.000000000502/head//2 on disk size (4042752) does not match object info size (4194304)
[22:17] <jmlowe> 2013-01-10 16:08:53.919566 osd.1 [ERR] 2.9b deep-scrub stat mismatch, got 410/410 objects, 0/0 clones, 1693528064/1693679616 bytes.
[22:17] <jmlowe> 2013-01-10 16:08:53.919582 osd.1 [ERR] 2.9b deep-scrub 0 missing, 1 inconsistent objects
[22:17] <jmlowe> 2013-01-10 16:08:53.919585 osd.1 [ERR] 2.9b deep-scrub 3 errors
[22:17] <jmlowe> no osd problems during this time
[22:17] <sjust> the flaw would have happened on the machines with the ceph-osd instances corresponding to that pg
[22:18] <sjust> do you mean the osds are running in vms?
[22:18] <jmlowe> sorry, no, osd's are running on bare metal on different nodes and haven't had any crashes or other errors
[22:18] <sjust> ok, the vms are almost certainly unrelated except as a source of untimely io
[22:20] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:20] <jmlowe> three vm's running against ceph: two identical stock centos5.8 installs isolated from the world via nat and idle (only difference is 32 vs 64 bit), 3rd vm is 32bit centos 5.8 hosting some static web pages and it hasn't had any problems with its rbd image yet
[22:21] <jmlowe> not enough data points to determine if it is coincidence that the pg's have gone bad on the same vm
[22:22] <jmlowe> any thoughts on snapshotting before trying to invoke another inconsistent pg?
[22:27] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[22:31] * gaveen (~gaveen@112.135.142.223) has joined #ceph
[22:42] * Cube (~Cube@12.248.40.138) has joined #ceph
[22:42] * Cube1 (~Cube@12.248.40.138) has joined #ceph
[22:42] * Cube (~Cube@12.248.40.138) Quit (Read error: Connection reset by peer)
[22:48] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[22:49] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has left #ceph
[22:50] * The_Bishop (~bishop@i59F6BB7E.versanet.de) has joined #ceph
[22:57] * aliguori (~anthony@32.97.110.59) Quit (Remote host closed the connection)
[23:14] * korgon (~Peto@isp-korex-15.164.61.37.korex.sk) has joined #ceph
[23:18] * agh (~agh@www.nowhere-else.org) Quit (Remote host closed the connection)
[23:18] * agh (~agh@www.nowhere-else.org) has joined #ceph
[23:21] <loicd> I have teuthology apparently working ( http://paste.debian.net/223566/ ) however, I'm not sure I understand how coverage can be enabled. ( https://github.com/ceph/teuthology/blob/master/coverage/cov-init.sh ).
[23:22] * loicd digging
[23:22] <Kioob> gregaf: by looking logs, I see that my latency problem start and stop exactly when the scrub of one unique PG start and stop.
[23:23] <Kioob> the PG 3.64 (20GB)
[23:23] <Kioob> 2013-01-10 10:52:42.042494 7f358139b700 15 mon.a@0(leader).pg v2212915 got 3.64 reported at 4438'3498898 state active+clean -> active+clean+scrubbing
[23:23] <Kioob> 2013-01-10 11:32:22.604891 7f358139b700 15 mon.a@0(leader).pg v2215279 got 3.64 reported at 4438'3503122 state active+clean+scrubbing -> active+clean
[23:24] <Kioob> and between that start and end, I have 424 lines of «active+clean+scrubbing -> active+clean+scrubbing»
[23:24] <Kioob> so I suppose there were 425 «chunks» for that scrub
[23:25] <Kioob> in any case, it can't be throwing that amount of bandwidth.... to keep 1Gbps of network traffic during 40 minutes, we need 300GB of data. That PG has a size of 20GB...
[23:29] <loicd> using http://gitbuilder.ceph.com/ceph-tarball-precise-x86_64-gcov/ instead of http://gitbuilder.ceph.com/ceph-tarball-precise-x86_64-basic/ is a good start ;-)
[23:29] * gaveen (~gaveen@112.135.142.223) Quit (Remote host closed the connection)
[23:31] <gregaf> Kioob: those logs are just the monitor updating its PG statistics
[23:31] <Kioob> yes
[23:31] <gregaf> it's not based on how many chunks the PG got split into, I don't think, though
[23:32] <Kioob> oh ok
[23:32] * The_Bishop_ (~bishop@i59F6DFD5.versanet.de) has joined #ceph
[23:32] <Kioob> I was thinking there was update at each chunk
[23:34] <gregaf> no, it's just based on a timer
[23:34] <Kioob> ok
[23:35] * AaronSchulz (~chatzilla@216.38.130.166) has joined #ceph
[23:38] * The_Bishop (~bishop@i59F6BB7E.versanet.de) Quit (Ping timeout: 480 seconds)
[23:40] <Kioob> gregaf: now there is 1 active scrub, and 2 Gbps of traffic between my OSDs. What do you want me to check ?
[23:40] <Kioob> 192.168.42.1:32797 <= 192.168.42.2:6827 = 1Gbps
[23:40] <Kioob> 192.168.42.1:43750 <= 192.168.42.2:6830 = 1Gbps too
[23:41] <Kioob> (I'm looking on the host 192.168.42.1)
[23:41] <gregaf> hmm, do you have messenger debug logging on by any chance?
[23:41] <Kioob> no...
[23:41] <Kioob> it's commented :
[23:41] <Kioob> ;debug ms = 1
[23:43] <gregaf> well if you've got anything installed that will let you monitor who's generating traffic that would be nice
[23:44] <gregaf> otherwise restart an osd with ms debugging on and gather up the logs from during that time, because I thought scrubbing was throttled enough that it couldn't generate that much traffic (though I could be wrong)
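A sketch of raising messenger debugging on a single OSD without a restart, assuming the old injectargs syntax; <id> is a placeholder and the output goes to the usual /var/log/ceph/ceph-osd.<id>.log:
    ceph osd tell <id> injectargs '--debug-ms 1'    # capture messenger logging while the scrub runs
    ceph osd tell <id> injectargs '--debug-ms 0'    # turn it back down afterwards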
[23:44] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Read error: Connection reset by peer)
[23:44] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[23:48] <Kioob> # netstat -plan | grep 192.168.42.1:56184
[23:48] <Kioob> tcp 0 0 192.168.42.1:56184 192.168.42.1:6821 ESTABLISHED 16933/ceph-osd
[23:48] <Kioob> tcp 0 0 192.168.42.1:6821 192.168.42.1:56184 ESTABLISHED 15864/ceph-osd
[23:48] <Kioob> tcp 0 0 192.168.42.1:56184 192.168.42.6:6805 ESTABLISHED 16117/ceph-osd
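One way to map those addresses back to OSD ids, assuming the usual osd dump output that lists each OSD's public and cluster addresses:
    ceph osd dump | grep 192.168.42.6:6805    # the matching line names the osd.N behind that address
    ps -fp 16117                              # or read the '-i <id>' argument off the local ceph-osd process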
[23:48] <loicd> In https://github.com/ceph/teuthology/blob/master/coverage/cov-init.sh what is the CEPH_BUILD_TARBALL supposed to be ?
[23:49] <Kioob> oups
[23:49] <gregaf> Kioob: bizarre; how are you monitoring that that's the port getting all the traffic?
[23:49] <Kioob> with : iftop -i eth0 -P
[23:50] <Kioob> there is between 800Mbps and 1Gbps on this socket
[23:50] <Kioob> this one : 192.168.42.1:56184 192.168.42.6:6805
[23:50] <Kioob> and 16117 is OSD 14
[23:50] <gregaf> hmm, that's not the port you'd said previously?
[23:50] <Kioob> it's not the same scrub
[23:50] <Kioob> the previous one ended
[23:51] <Kioob> (yes, I'm too slow...)
[23:51] <Kioob> so... I have a lot of logs with «debug ms» enabled
[23:52] <Kioob> what should I do with that amount of data ? what should I filter ?
[23:53] <gregaf> can you post one of the OSD logs somewhere?
[23:54] <Kioob> https://daevel.fr/osd.12.log.gz
[23:56] * BManojlovic (~steki@85.222.223.220) has joined #ceph
[23:57] <gregaf> all right; I or Sam will check through that and see how much traffic it thinks it's sending
[23:57] <Kioob> thanks a lot !
[23:58] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:58] * sander (~chatzilla@c-174-62-162-253.hsd1.ct.comcast.net) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.