#ceph IRC Log

IRC Log for 2013-01-27

Timestamps are in GMT/BST.

[0:02] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[0:08] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Remote host closed the connection)
[0:09] * sleinen1 (~Adium@2001:620:0:26:5932:4b81:9eb0:53a6) Quit (Quit: Leaving.)
[0:20] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[0:22] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[0:27] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[0:30] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[0:30] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[0:43] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[0:52] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) Quit (Quit: Leaving.)
[0:52] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[0:52] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) Quit ()
[0:58] * davidz1 (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[0:58] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Read error: Connection reset by peer)
[1:12] * Qten (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) Quit (Read error: Connection reset by peer)
[1:12] * Qten (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[1:14] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[1:30] <Kioob> is there a way to track where the latency comes from?
[1:32] <Kioob> the stack is : write access from OS over RBD (kernel module) => network => OSD (software) => drive
[1:32] <Kioob> at drive level, I don't see a latency problem
[1:33] <Kioob> but when the OS writes to the rados block device, I see huge latencies
[1:33] <Kioob> (like 5000ms)
[1:46] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:46] * loicd (~loic@magenta.dachary.org) has joined #ceph
[1:48] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[1:49] * BillK (~BillK@124-169-52-96.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[2:11] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[2:12] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[2:15] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[2:16] * loicd1 (~loic@2a01:e35:2eba:db10:2551:9d46:bacc:9512) has joined #ceph
[2:22] * loicd (~loic@magenta.dachary.org) Quit (Ping timeout: 480 seconds)
[2:52] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[2:52] * BillK (~BillK@124-169-177-116.dyn.iinet.net.au) has joined #ceph
[2:55] * Meths (~meths@2.27.72.227) Quit (Ping timeout: 480 seconds)
[2:59] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) Quit (Quit: Leaving.)
[3:51] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[4:23] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: Say What?)
[4:42] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[4:59] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[4:59] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[5:00] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:01] * LeaChim (~LeaChim@b0faf18a.bb.sky.com) Quit (Ping timeout: 480 seconds)
[5:24] * Pagefaulted (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) has joined #ceph
[5:38] * gohko (~gohko@natter.interq.or.jp) Quit (Ping timeout: 480 seconds)
[6:00] * loicd1 (~loic@2a01:e35:2eba:db10:2551:9d46:bacc:9512) Quit (Quit: Leaving.)
[6:01] * loicd (~loic@magenta.dachary.org) has joined #ceph
[6:07] * sagelap (~sage@diaman3.lnk.telstra.net) has joined #ceph
[6:35] * sagelap (~sage@diaman3.lnk.telstra.net) Quit (Ping timeout: 480 seconds)
[6:47] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[7:17] * sagelap (~sage@diaman3.lnk.telstra.net) has joined #ceph
[7:22] * ajnelson_ (~ajnelson@c-67-160-231-128.hsd1.ca.comcast.net) has joined #ceph
[7:23] * ajnelson_ (~ajnelson@c-67-160-231-128.hsd1.ca.comcast.net) Quit ()
[7:23] * ajnelson_ (~ajnelson@c-67-160-231-128.hsd1.ca.comcast.net) has joined #ceph
[7:31] * ajnelson_ (~ajnelson@c-67-160-231-128.hsd1.ca.comcast.net) Quit (Quit: ajnelson_)
[7:43] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[8:01] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:01] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:36] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[8:38] * sleinen1 (~Adium@2001:620:0:26:f534:e0f8:9bee:2dcc) has joined #ceph
[8:44] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[8:45] * sagelap (~sage@diaman3.lnk.telstra.net) Quit (Ping timeout: 480 seconds)
[8:45] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:45] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:51] * mmgaggle (~kyle@alc-nat.dreamhost.com) Quit (Quit: leaving)
[9:10] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[9:14] * sagelap (~sage@diaman3.lnk.telstra.net) has joined #ceph
[9:21] <Pagefaulted> Can someone explain slit pg numbers?
[9:22] <Pagefaulted> Slit = split
[9:22] <Pagefaulted> Like 'ceph osd pool create 2000 4000'
[9:33] * sagelap (~sage@diaman3.lnk.telstra.net) Quit (Quit: Leaving.)
[9:33] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[9:34] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[9:34] * AndChat|73761 (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) has joined #ceph
[9:34] * Pagefaulted (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) Quit (Read error: Connection reset by peer)
[9:38] * sagelap1 (~sage@diaman3.lnk.telstra.net) has joined #ceph
[9:39] * sagelap1 (~sage@diaman3.lnk.telstra.net) Quit ()
[9:39] * sagelap (~sage@diaman3.lnk.telstra.net) has joined #ceph
[9:42] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[9:46] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[10:00] * gucki (~smuxi@84-73-202-7.dclient.hispeed.ch) has joined #ceph
[10:00] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has joined #ceph
[10:14] * sleinen1 (~Adium@2001:620:0:26:f534:e0f8:9bee:2dcc) Quit (Quit: Leaving.)
[10:28] * sagelap (~sage@diaman3.lnk.telstra.net) Quit (Quit: Leaving.)
[10:30] * sagelap (~sage@diaman3.lnk.telstra.net) has joined #ceph
[10:40] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[10:58] * sagelap (~sage@diaman3.lnk.telstra.net) has left #ceph
[11:12] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[11:16] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[11:16] * loicd (~loic@magenta.dachary.org) has joined #ceph
[11:17] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[11:53] * ScOut3R (~ScOut3R@dsl51B61EED.pool.t-online.hu) has joined #ceph
[12:09] * gaveen (~gaveen@112.135.6.187) has joined #ceph
[12:12] * ScOut3R (~ScOut3R@dsl51B61EED.pool.t-online.hu) Quit (Remote host closed the connection)
[12:21] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[12:39] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[12:42] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Ping timeout: 480 seconds)
[12:42] * LeaChim (~LeaChim@b0faf18a.bb.sky.com) has joined #ceph
[12:45] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[12:48] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[12:48] * loicd (~loic@magenta.dachary.org) has joined #ceph
[13:15] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[13:16] * sleinen1 (~Adium@2001:620:0:26:16:9425:2048:ee92) has joined #ceph
[13:18] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[13:21] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Remote host closed the connection)
[13:23] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[13:23] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[13:37] * Meths (rift@2.25.214.102) has joined #ceph
[13:53] * The_Bishop_ (~bishop@e179020216.adsl.alicedsl.de) has joined #ceph
[14:01] * The_Bishop__ (~bishop@e179014137.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[14:13] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[14:14] * sleinen2 (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[14:15] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Remote host closed the connection)
[14:15] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Read error: Connection reset by peer)
[14:15] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[14:16] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Read error: No route to host)
[14:16] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[14:16] * sleinen3 (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[14:16] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Read error: Connection reset by peer)
[14:17] * sleinen (~Adium@2001:620:0:25:45b8:d417:6d7:254e) has joined #ceph
[14:20] * sleinen1 (~Adium@2001:620:0:26:16:9425:2048:ee92) Quit (Ping timeout: 480 seconds)
[14:22] * sleinen2 (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[14:24] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[14:24] * sleinen3 (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[14:46] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[14:54] * ebo^ (~ebo@233.195.116.85.in-addr.arpa.manitu.net) has joined #ceph
[14:55] <ebo^> my cephfs just kind of stopped responding :-(
[14:55] <ebo^> i can no longer access a specific directory
[14:55] <ebo^> any idea how i can get back functionality?
[14:59] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[15:03] <ebo^> restarting helped \o/
[15:21] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[15:35] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[15:58] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[16:02] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[16:13] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[16:22] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[16:24] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[16:25] * leseb_ (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[16:32] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[16:36] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[16:53] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[17:03] * loicd (~loic@magenta.dachary.org) Quit (Read error: Connection reset by peer)
[17:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:08] * BillK (~BillK@124-169-177-116.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:15] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[17:27] * BillK (~BillK@124-148-85-37.dyn.iinet.net.au) has joined #ceph
[17:31] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[17:44] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[17:50] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[17:51] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:54] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[17:55] <Kdecherf> hm
[17:56] <Kdecherf> I have "ls: cannot access /ceph/myfolder: No such file or directory" sometimes when I list all folders
[17:56] <Kdecherf> but the folder exists and is shown after a hang
[18:00] <Kdecherf> (with ceph-0.56.1)
[18:03] * lotia (~lotia@l.monkey.org) Quit (Ping timeout: 480 seconds)
[18:10] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[18:12] * lotia (~lotia@l.monkey.org) has joined #ceph
[18:15] * BillK (~BillK@124-148-85-37.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[18:25] * lotia (~lotia@l.monkey.org) Quit (Ping timeout: 480 seconds)
[18:33] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[18:46] * lotia (~lotia@l.monkey.org) has joined #ceph
[18:50] * ScOut3R (~ScOut3R@dsl51B61EED.pool.t-online.hu) has joined #ceph
[18:50] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[18:53] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[18:55] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[18:56] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:58] * lotia (~lotia@l.monkey.org) Quit (Ping timeout: 480 seconds)
[19:06] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[19:06] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:13] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[19:21] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[19:22] <gucki> sage, are you there? :)
[19:29] * noob2 (~noob2@pool-71-244-111-36.phlapa.fios.verizon.net) has joined #ceph
[19:31] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[19:33] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Ping timeout: 480 seconds)
[19:36] * noob2 (~noob2@pool-71-244-111-36.phlapa.fios.verizon.net) Quit (Quit: Leaving.)
[19:41] <absynth_47215> sjust: around?
[19:42] * sagelap (~sage@diaman3.lnk.telstra.net) has joined #ceph
[19:42] <sagelap> gucki: hey
[19:42] <gucki> sagelap: hey :)
[19:43] <gucki> sagelap: in case you need anything else to track down the bug, just let me know :)
[19:43] <sagelap> just downloaded the log.. this is corin i assume?
[19:44] <gucki> yes it's me
[19:46] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[19:47] <sagelap> gucki: hmm, the logs show just normal io. there must be another thread doing something else. the cpu is still 100%?
[19:48] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[19:48] <gucki> sagelap: yes, cpu is around 100% of one core.
[19:48] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[19:49] <gucki> sagelap: iotop also show 11mb/s for ceph...
[19:49] <sagelap> gdb /usr/bin/ceph-osd $pid, then 'thread apply all bt' and capture the output?
[19:51] <gucki> can i safely quit gdb (q) and ceph osd continues running?
[19:52] <gucki> A debugging session is active.
[19:52] <gucki> Inferior 1 [process 27045] will be detached.
[19:53] <gucki> ok, seems it worked. the monitor just showed "map e7003 wrongly marked me down" but the cluster is healthy again now
[19:54] <sagelap> yeah
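
The backtrace capture requested above can also be run non-interactively. This is only a minimal sketch, assuming gdb is installed and the caller is allowed to attach to the process; the helper names `gdb_backtrace_argv` and `capture_backtraces` are hypothetical and not part of Ceph. As the exchange shows, the OSD is stopped while gdb is attached, so it may briefly be marked down.

```python
import subprocess


def gdb_backtrace_argv(pid, binary="/usr/bin/ceph-osd"):
    """Build a gdb command line that attaches to pid and dumps all thread backtraces."""
    return [
        "gdb",
        "-batch",                      # run the -ex commands, then exit
        "-ex", "thread apply all bt",  # same command sagelap asked for
        binary,                        # executable, as in 'gdb /usr/bin/ceph-osd $pid'
        str(pid),                      # process to attach to
    ]


def capture_backtraces(pid, binary="/usr/bin/ceph-osd"):
    """Run gdb and return its output as text (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_argv(pid, binary),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the caller can inspect the text
    )
    return result.stdout

# Example (not executed here): text = capture_backtraces(27045)
```

Keeping the attach short in this way limits how long the daemon stays paused, which is presumably why the monitor briefly reported the OSD as down in the session above.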
[19:56] <gucki> just added the output to the ticket, too big for pastie
[19:56] <sagelap> k
[19:57] <sagelap> looks like leveldb is doing a background compaction
[19:58] <sagelap> what is stored in the cluster?
[19:58] <gucki> kvm images
[19:59] <gucki> but this osd is small, only 35G data on the disk
[20:00] <gucki> is this leveldb compaction code new to .3? when i downgrade to .2 it never takes that much cpu and doesn't do that much io either..
[20:00] <sagelap> the only changes between the two versions are Makefile changes for rpm builds. very strange.
[20:01] <sagelap> how big (bytes and files) is the $osd_data/current/omap directory
[20:01] <sagelap> ?
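
One way to answer the "bytes and files" question in a single pass is sketched below. It is an illustration only: `dir_usage` is a hypothetical helper, the example path in the comment is an assumption about where `$osd_data` lives, and the function sums apparent file sizes, which can differ from the disk blocks that `du` reports.

```python
import os


def dir_usage(path):
    """Return (total_bytes, file_count) for regular files under path."""
    total_bytes = 0
    file_count = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # do not follow or count symlinks
            total_bytes += os.path.getsize(full)
            file_count += 1
    return total_bytes, file_count

# Example (path is an assumption, substitute your own $osd_data):
# print(dir_usage("/var/lib/ceph/osd/ceph-0/current/omap"))
```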
[20:01] <gucki> sage: are the official ubuntu binaries of .2 different to your .2 version? the .2 version i installed is from the quantal repos
[20:01] <sagelap> oooh.. yeah, they are.
[20:02] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[20:02] <sagelap> the ubuntu ones use the libleveldb that is in ubuntu (no idea what version)
[20:02] <sagelap> ours links it in statically.
[20:02] <gucki> sage: ok, there we have the difference in leveldb :)
[20:02] <gucki> sage: let me check which version is installed
[20:02] <sagelap> what version is libleveldb?
[20:02] <sagelap> :)
[20:02] <gucki> du current/omap
[20:02] <gucki> 1053184 current/omap
[20:03] <sagelap> is it getting smaller over time?
[20:03] <gucki> ii libleveldb1:amd64 0+20120530.gitdd0d562-2
[20:03] <sagelap> i wonder if your version wasn't doing compaction for some reason.
[20:03] <gucki> no, it grew a few bytes 1053216 current/omap
[20:03] <gucki> but well, that's only 1 mb..?
[20:03] <gucki> ah no...1 gb ;)
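
The unit mix-up above is easy to check with a little arithmetic. A minimal sketch, assuming `du` was run with its default 1 KiB block size (no `-b`, `-k` override, or block-size environment variable); `du_blocks_to_gib` is a hypothetical helper name.

```python
KIB = 1024  # assumed du block size in bytes


def du_blocks_to_gib(blocks, block_size=KIB):
    """Convert a du block count to GiB."""
    return blocks * block_size / 1024**3


first = 1053184   # first du reading from the log
second = 1053216  # second du reading from the log

print(f"{du_blocks_to_gib(first):.2f} GiB")  # about 1 GiB, not 1 MB
print(f"growth: {second - first} KiB")       # 32 KiB rather than a few bytes
```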
[20:04] <gucki> but 6 hours to compact such a small file?
[20:05] <sagelap> the quantal version is newer.
[20:05] <sagelap> let me prepare a build that is using the newer version and we'll see if its better
[20:05] <sagelap> maybe the downgrade makes it unhappy
[20:05] <gucki> ok, great :)
[20:13] <sagelap> wip-argonaut-leveldb will appear http://gitbuilder.ceph.com/ceph-deb-quantal-x86_64-basic/ref/ shortly
[20:13] <sagelap> 5-10 min
[20:16] <gucki> ok. i'll let you know how it works then :)
[20:17] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Remote host closed the connection)
[20:22] <paravoid> sagelap: hey
[20:22] <paravoid> sagelap: so, did what you said; most of the osds booted up immediately, three osds took 1-2h to boot
[20:22] <paravoid> and 12 osds (all the osds in 1 particular box) took 5-6h to boot up
[20:23] <paravoid> now everything looks normal I guess
[20:23] <paravoid> no peering/incomplete/stale pgs and resyncing
[20:26] <sagelap> gucki: built
[20:27] <paravoid> still not particularly fast though
[20:28] * Pagefaulted (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) has joined #ceph
[20:28] * AndChat|73761 (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) Quit (Read error: Connection reset by peer)
[20:31] <Pagefaulted> Anyone explain split placement groups?
[20:31] <sagelap> the recovery you mean?
[20:32] <sagelap> paravoid: ^
[20:39] <paravoid> yes
[20:40] <sagelap> it's tuned pretty conservatively atm. you might try ceph osd tell \* injectargs '--osd-recovery-max-active 20'
[20:41] <sagelap> either way, keep an eye out for any strangeness like you saw before.
[20:42] <sagelap> and we'll have a 0.56.2 in the next day or two to upgrade to, too
[20:47] <gucki> sage: sry, one moment
[20:48] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[20:50] <ebo^> i had some problems with cephfs lately. under load, the writes terminate correctly, but the files are missing some part of it. any takers?
[20:52] * noob2 (~noob2@pool-71-244-111-36.phlapa.fios.verizon.net) has joined #ceph
[20:52] <gucki> sage: much better, no cpu and io now! :-)
[20:52] <gucki> sage: so i'd say it looks normal...but let's wait for a few minutes
[20:53] <gucki> sagelap: sry, wrong nick ;)
[20:54] <sagelap> gucki: cool. i wonder if this means we should update the argonaut to the new leveldb. the plan was to update in our master but leave argonaut and bobtail on the old version.
[20:54] <sagelap> or just make our quantal debs link dynamically
[20:55] <gucki> sagelap: well, i don't know what's the best. but at least a version check would probably be useful, so others don't run into the same problem...if that's possible.
[20:55] <gucki> sagelap: i just read a new bobtail will come out the next few days? a new argonaut too?
[20:57] <sagelap> there may be one more argonaut release; there was a snap trimming bug that is probably worth getting out. it'll probably be the last.
[20:57] <gucki> sagelap: probably i'll upgrade to the next bobtail the next few days then. when disabling cephx it should be trivial, right? should it be as stable as argonaut is? :)
[20:57] <sagelap> 0.56.2 will be out in a couple days, yeah
[20:58] <sagelap> that is the hope! it doesn't have as much run time as argonaut does, but lots of things are improved.
[21:00] <gucki> sagelap: yeah, i'm always reading the news *g*. i just saw many commits on the bobtail branch the last few days, so i thought there'll be a new release soon. i really hope bobtail fixes many of the memory leak problems..those are the only ones which annoy me with argonaut
[21:01] <sagelap> lots of those to go around in all subsystems
[21:03] <gucki> sagelap: yeah, the last few days i had some osds taking around 5 gb ram..normally 300mb. sometimes also the monitors go up from 100mb to 1gb ram... and worst: kvm guests also take more ram over time, so looks like leaks in librados....but yeah, i'm looking forward to the next bobtail :)
[21:03] <gucki> sagelap: do you think it's ok to upgrade the whole cluster to your custom wip-leveldb build now?
[21:05] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[21:07] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[21:07] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[21:12] * gaveen (~gaveen@112.135.6.187) Quit (Ping timeout: 480 seconds)
[21:21] * gaveen (~gaveen@112.135.2.249) has joined #ceph
[21:24] * scalability-junk (~stp@dslb-084-056-037-141.pools.arcor-ip.net) has joined #ceph
[21:28] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[21:35] <phantomcircuit> i have a bunch of slow request warnings despite there being nearly zero disk io
[21:44] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[21:48] * gaveen (~gaveen@112.135.2.249) Quit (Remote host closed the connection)
[21:48] * sagelap (~sage@diaman3.lnk.telstra.net) Quit (Ping timeout: 480 seconds)
[21:52] <phantomcircuit> http://pastebin.com/raw.php?i=mL2RDD6i
[21:52] <phantomcircuit> trying to go from 0.49 to 0.56.1
[21:53] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:53] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:53] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[21:53] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[21:57] * gaveen (~gaveen@112.135.2.249) has joined #ceph
[22:06] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[22:07] <gucki> phantomcircuit: i'm no dev, but to me it looks like a bug: osd/PG.h: 359: FAILED assert(is_locked())
[22:07] <phantomcircuit> yeah but why
[22:07] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[22:08] <phantomcircuit> possibly it's having trouble communicating with the monitors
[22:09] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[22:09] <gucki> phantomcircuit: imo, even if it's having some troubles, an assert should never fail. can you post a bug report for it?
[22:10] <gucki> http://tracker.newdream.net/projects/ceph/issues
[22:11] <gucki> phantomcircuit: probably it's also a problem with your ondisk data: -7> 2013-01-27 21:50:02.995238 3832145f780 2 journal read_entry 5014179840 : bad header magic, end of journal
[22:11] <gucki> -6> 2013-01-27 21:50:02.995245 3832145f780 2 journal read_entry 5014179840 : bad header magic, end of journal
[22:11] <gucki> probably because 0.49 never was a stable version,.....?
[22:12] * ninkotech (~duplo@89.177.137.236) has joined #ceph
[22:16] * gaveen (~gaveen@112.135.2.249) Quit (Remote host closed the connection)
[22:23] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[22:25] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[22:26] * LeaChim (~LeaChim@b0faf18a.bb.sky.com) Quit (Ping timeout: 480 seconds)
[22:48] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[22:50] * ebo^ (~ebo@233.195.116.85.in-addr.arpa.manitu.net) Quit (Quit: Verlassend)
[23:03] * sleinen (~Adium@2001:620:0:25:45b8:d417:6d7:254e) Quit (Quit: Leaving.)
[23:03] * BillK (~BillK@124-148-195-207.dyn.iinet.net.au) has joined #ceph
[23:04] * LeaChim (~LeaChim@b01bdc44.bb.sky.com) has joined #ceph
[23:06] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[23:18] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[23:24] <Kdecherf> hm, I have a strange freeze on a folder, the cluster seems to be ok
[23:26] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[23:29] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) has joined #ceph
[23:30] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[23:32] <Kdecherf> [17657.695762] EXT4-fs (sda3): ext4_da_update_reserve_space: ino 54284321, allocated 1 with only 0 reserved metadata blocks
[23:32] <Kdecherf> hm
[23:32] <Kdecherf> it's a cool message :D
[23:43] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: This computer has gone to sleep)
[23:47] * jskinner (~jskinner@50-80-32-52.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[23:51] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.