#ceph IRC Log


IRC Log for 2012-08-21

Timestamps are in GMT/BST.

[0:04] <Fruit> yeah same
[0:05] <prometheanfire> checking on how many xattrs per file we can do
[0:05] <prometheanfire> that is what I need to check, right? going by http://ceph.com/wiki/Backend_filesystem_requirements ?
[0:08] <dmick> elder: that's usually because .deps still exist, which distclean is supposed to fix
[0:08] <dmick> (not sure why clean doesn't, but it doesn't)
[0:08] <dmick> if distclean didn't, I'm confused
[0:08] <sagewk> joao: still around?
[0:09] <sagewk> elder: blow away src/.deps and then rerun autogen
[0:10] <sagewk> prometheanfire: ceph-osd can now compensate when xattrs have a size limit, as long as we can still store some small ones (like ext3/4). you should be fine there with zfs, as long as user xattrs work at all
[0:12] <Fruit> hrm, can the osd run on bsd?
[0:15] <prometheanfire> ya, gonna check size and count limits
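A hypothetical probe for the kind of check prometheanfire describes: try writing progressively larger user xattrs and see where the backing filesystem rejects them. The temp-file path and the `user.probe` attribute name are made up for the example, and `setfattr` (from the attr package) is assumed to be available; this is a sketch, not anything ceph itself runs.

```shell
# Probe user-xattr size limits on the filesystem backing $TMPDIR.
f=$(mktemp)
for size in 1024 4096 65536; do
    if command -v setfattr >/dev/null 2>&1; then
        # Build a $size-byte value and try to store it as a user xattr.
        setfattr -n user.probe -v "$(head -c "$size" /dev/zero | tr '\0' x)" "$f" \
            && echo "user xattr of $size bytes: accepted" \
            || echo "user xattr of $size bytes: rejected"
    else
        echo "setfattr not installed; skipping $size-byte check"
    fi
done
rm -f "$f"
```

Whether a given size is accepted depends on the filesystem (and, for tmpfs-backed /tmp, on the kernel), so results here only describe the filesystem you ran it on.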
[0:17] * mgalkiewicz (~mgalkiewi@toya.hederanetworks.net) has joined #ceph
[0:18] <mgalkiewicz> hi is it possible that simple osd restart causes such situation: health HEALTH_WARN 204 pgs degraded; 252 pgs peering; 219 pgs stuck inactive; 221 pgs stuck unclean; recovery 1397/7034 degraded (19.861%)
[0:18] <mgalkiewicz> that osd was restarted a few minutes ago and it was also recovering
[0:19] <elder> sagewk, trying that now. The make without "-j 4" didn't help (and was SLOW)
[0:20] <sagewk> elder: srsly
[0:20] * loicd (~loic@brln-4d0ce39f.pool.mediaWays.net) Quit (Quit: Leaving.)
[0:22] <elder> Better.
[0:22] <elder> Thanks.
[0:28] <elder> OK, now that I've got that resolved I'm hitting a problem with my UML build sagewk
[0:28] <sagewk> add a NULL arg
[0:28] <sagewk> should be fixed upstream shortly
[0:29] <elder> OK.
[0:36] <nhmhome> sagewk: still looking at wallclock profiling?
[0:36] <sagewk> nhmhome: if by "still" you mean "was ever", no :)
[0:37] <nhmhome> sagewk: lol, ok. :)
[0:37] <sagewk> just mentioning options in the reply to andreas
[0:38] <nhmhome> sagewk: ah, I missed that thread
[0:38] <nhmhome> oh wait, now I remember.
[0:44] * EmilienM (~EmilienM@vau75-1-81-57-77-50.fbx.proxad.net) Quit (Remote host closed the connection)
[1:02] * yoshi (~yoshi@p22043-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:14] * rturk (~rturk@ps94005.dreamhost.com) Quit (Quit: Coyote finally caught me)
[1:14] * rturk (~rturk@ps94005.dreamhost.com) has joined #ceph
[1:22] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:29] * maelfius (~mdrnstm@ has left #ceph
[1:30] * mgalkiewicz (~mgalkiewi@toya.hederanetworks.net) Quit (Quit: Ex-Chat)
[1:45] <jluis> someone has been busy deleting a bunch of foo* branches on ceph-client :p
[1:45] <elder> Yay!
[1:45] <elder> Sounds like the foos have served their purpose and it's time for them to go.
[1:47] <womble> Time for them to foo off
[1:49] <dmick> I pity the foo that tries to hang around in ceph-client
[1:52] <womble> That's definitely foo for thought
[1:58] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[1:58] <Fruit> *twitch*
[2:00] * tnt (~tnt@89.40-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[2:38] * Tv_ (~tv@2607:f298:a:607:38b3:897f:20fd:72b9) Quit (Quit: Tv_)
[3:16] * pentabular (~sean@adsl-71-141-229-185.dsl.snfc21.pacbell.net) has left #ceph
[3:29] * renzhi (~renzhi@ has joined #ceph
[3:31] * tightwork (~didders@ has joined #ceph
[3:32] * Ryan_Lane (~Adium@ Quit (Quit: Leaving.)
[4:09] * renzhi (~renzhi@ Quit (Ping timeout: 480 seconds)
[4:18] * renzhi (~renzhi@ has joined #ceph
[4:23] <Tobarja> are osd.X entries optional in ceph.conf?
[4:43] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[4:49] * yoshi (~yoshi@p22043-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[5:02] <tightwork> Would I be adding an osd with mkcephfs?
[5:07] <ajm> no
[5:07] <ajm> http://ceph.com/wiki/OSD_cluster_expansion/contraction
[5:40] * chutzpah (~chutz@ Quit (Quit: Leaving)
[5:42] <elder> sage, I'm having rbd trouble running current ceph-client/testing with current ceph/stable
[5:43] <elder> I can mount and unmount a ceph file system but when I try my rbd test it hangs.
[5:43] <elder> This is enough for it to get stuck. (Running uml)
[5:43] <elder> rbd create image1 --size=1024
[5:45] <elder> I'm going to bed, just mentioning this in case you happen to see it. I've gone through a bunch of scenarios but unfortunately not very methodically. Being offline for more than a week made me unsure of whether I just forgot to do something when setting back up again.
[5:45] <elder> I'll go through the whole exercise again in the morning, very methodically...
[5:49] * deepsa_ (~deepsa@ has joined #ceph
[5:50] * nhm (~nhm@184-97-251-210.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[5:50] * joao (~JL@ has joined #ceph
[5:53] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[5:53] * deepsa_ is now known as deepsa
[5:53] * cattelan (~cattelan@2001:4978:267:0:21c:c0ff:febf:814b) Quit (Ping timeout: 480 seconds)
[5:56] * jluis (~JL@ Quit (Ping timeout: 480 seconds)
[6:13] * yoshi (~yoshi@p22043-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[6:13] * tightwork (~didders@ Quit (Ping timeout: 480 seconds)
[6:37] * loicd (~loic@brln-4d0ce39f.pool.mediaWays.net) has joined #ceph
[6:37] <prometheanfire> so, no known limit on the number of xattrs, but size is limited to 64k. sound good?
[6:57] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[7:06] * tnt (~tnt@89.40-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:21] * tnt (~tnt@89.40-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[8:47] * tnt (~tnt@138.127-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[9:00] * tnt (~tnt@138.127-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:07] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[9:09] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:13] * BManojlovic (~steki@ has joined #ceph
[9:26] * fghaas (~florian@91-119-204-193.dynamic.xdsl-line.inode.at) has joined #ceph
[9:27] * fghaas (~florian@91-119-204-193.dynamic.xdsl-line.inode.at) Quit ()
[9:34] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:39] * EmilienM (~EmilienM@vau75-1-81-57-77-50.fbx.proxad.net) has joined #ceph
[9:56] * Meths (rift@ has joined #ceph
[10:01] * Meths_ (rift@ Quit (Ping timeout: 480 seconds)
[10:22] * denken (~denken@dione.pixelchaos.net) Quit (Ping timeout: 480 seconds)
[10:46] * renzhi (~renzhi@ Quit (Ping timeout: 480 seconds)
[11:01] * renzhi (~renzhi@ has joined #ceph
[11:09] * yoshi (~yoshi@p22043-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:45] * deepsa (~deepsa@ Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[11:49] * deepsa (~deepsa@ has joined #ceph
[11:52] * Qten (Q@qten.qnet.net.au) Quit ()
[12:09] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (Ping timeout: 480 seconds)
[12:12] * ao (~ao@ has joined #ceph
[12:39] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[12:40] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit ()
[12:41] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:40] * nhm (~nhm@184-97-251-210.mpls.qwest.net) has joined #ceph
[13:42] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[13:53] * tightwork (~didders@ has joined #ceph
[14:03] * tightwork (~didders@ Quit (Ping timeout: 480 seconds)
[14:06] * benner (~benner@ Quit (Remote host closed the connection)
[14:19] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[14:22] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit ()
[14:23] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[14:40] * benner (~benner@ has joined #ceph
[14:52] * denken (~denken@dione.pixelchaos.net) has joined #ceph
[15:31] * senner (~Wildcard@68-113-228-89.dhcp.stpt.wi.charter.com) has joined #ceph
[15:32] <elder> sage, sagewk my problems appear to be messenger-related, or perhaps it's just OSD code. I'm getting lots of retries.
[15:34] <elder> "failed lossy con, dropping message 0x1172d80" and so on, repeatedly when calling rbd.snap_list. Command is just "rbd create image1 --size=1024"
[15:34] <elder> joshd, dmick you guys might have insights as well.
[15:58] * cattelan (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[15:58] * cattelan- (~cattelan@2001:4978:267:0:21c:c0ff:febf:814b) has joined #ceph
[15:58] * cattelan (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Remote host closed the connection)
[15:59] * cattelan- is now known as cattelan
[16:08] * The_Bishop (~bishop@2a01:198:2ee:0:c57c:a03b:8941:5585) has joined #ceph
[16:26] * ao (~ao@ Quit (Quit: Leaving)
[16:40] * deepsa (~deepsa@ Quit (Ping timeout: 481 seconds)
[16:42] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[16:50] * pentabular (~sean@adsl-71-141-232-146.dsl.snfc21.pacbell.net) has joined #ceph
[17:02] * pentabular (~sean@adsl-71-141-232-146.dsl.snfc21.pacbell.net) Quit (Remote host closed the connection)
[17:10] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:19] * denken (~denken@dione.pixelchaos.net) Quit (Ping timeout: 480 seconds)
[17:21] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Remote host closed the connection)
[17:21] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:22] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[17:25] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[17:26] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:28] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[17:28] * loicd1 (~loic@brln-4d0ce674.pool.mediaWays.net) has joined #ceph
[17:32] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:32] * loicd (~loic@brln-4d0ce39f.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[17:39] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:44] <elder> sagewk, I'm having no end of troubles getting ceph working again in my little UML environment.
[17:45] <sagewk> from testing branch?
[17:46] <elder> Well, I'm now running the ceph-client/testing, but I think the problem may be on the ceph side.
[17:46] <elder> Hard to really tell though.
[17:47] <elder> I was running ceph/master and ceph-client/testing, and I could mount a ceph file system and could create an rbd image, but when I attempted to remove the image it hung.
[17:47] <elder> I rolled back to v3.6-rc1 for ceph-client and had the same result.
[17:47] <elder> So I rolled back ceph to commit cd5d7241, just before the wip-rbd-protect merge.
[17:48] <elder> That resulted in an out-of-date submodule, so I updated it. Perhaps that led to some trouble. In any case, now it's hanging somehow just while adding the osd when starting up ceph.
[17:48] <elder> Not even touching the client yet.
[17:48] <elder> I'm just trying to get back to a working baseline...
[17:49] <elder> I'm prepared to rebuild my uml file system image to see if that helps.
[17:53] * Tv_ (~tv@ has joined #ceph
[17:53] <sagewk> elder: submodule?
[17:54] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Ping timeout: 480 seconds)
[17:54] <sagewk> elder: are you saying that vstart doesn't work?
[17:54] <elder> In the ceph tree there are two git submodules: ceph-object-corpus and src/leveldb
[17:55] <elder> Yes I'm saying that, using whatever my current state of the ceph tree is. I did "git reset --hard cd5d724; git submodule update"
[17:55] <sagewk> oh. i don't think that's the problem. the next branch should be fine (master too, but definitely next)
[17:55] <elder> And then cleaned it and then built it.
[17:56] <elder> OK, I'll try ceph/next.
[17:56] <sagewk> and vstart.sh failed?
[17:56] <elder> And I'm building with your config. I'll let you know how it goes. Yes, vstart.sh never finished. It hung at this point:
[17:56] <elder> ERROR: error accessing 'dev/osd0/*'
[17:56] <elder> add osd0 aaf80f14-90b0-421f-947e-d34771dca4fc
[17:56] <elder> (then nothing)
[17:57] <sagewk> what arguments did you pass?
[17:57] <sagewk> oh, sometimes it appears to hang there while it's doing rm -rf on the old data dir.. that can be slow if your last cluster instance had lots of data written
[17:57] <sagewk> find dev/osd0
[17:57] <elder> It was more than slow.
[17:58] <elder> Doesn't exist.
[17:58] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[17:58] <sagewk> what arguments did you pass?
[18:04] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) Quit (Quit: Ex-Chat)
[18:14] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) has joined #ceph
[18:17] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[18:21] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[18:31] * didders (~didders@rrcs-71-43-128-65.se.biz.rr.com) has joined #ceph
[18:36] * EmilienM (~EmilienM@vau75-1-81-57-77-50.fbx.proxad.net) has left #ceph
[18:37] * EmilienM (~EmilienM@vau75-1-81-57-77-50.fbx.proxad.net) has joined #ceph
[18:45] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[18:46] * dpemmons (~dpemmons@ Quit (Ping timeout: 480 seconds)
[18:46] * dpemmons (~dpemmons@ has joined #ceph
[18:46] * nhm (~nhm@184-97-251-210.mpls.qwest.net) Quit (Read error: Operation timed out)
[18:47] * nhm (~nhm@184-97-251-210.mpls.qwest.net) has joined #ceph
[18:48] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[18:50] * Cube (~Adium@ has joined #ceph
[18:52] * bchrisman (~Adium@ has joined #ceph
[18:56] <Tv_> elder: vstart assumes some directories exist etc.. i use this wrapper: https://gist.github.com/3417272
[18:57] <elder> I have a script based on that, which you gave me a long time ago. I generalized it to install the osd's based on CEPH_NUM_OSD, but otherwise it's about the same.
[18:57] <Tv_> hah
[18:57] <Tv_> yeah
[18:57] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:57] <Tv_> i've never gone above three, and for <3 the extra dirs don't matter
[18:57] <elder> install -d -m0755 out
[18:57] <elder> install -d -m0755 dev
[18:57] <elder> for i in $(seq 0 $((CEPH_NUM_OSD - 1))); do
[18:57] <elder> install -d -m0755 dev/osd$i
[18:57] <elder> done
[18:57] * chutzpah (~chutz@ has joined #ceph
[18:58] * Cube (~Adium@ Quit (Ping timeout: 480 seconds)
[18:58] * Cube (~Adium@ has joined #ceph
[18:59] * johnmwilliams_ (u4972@irccloud.com) has left #ceph
[19:07] * Tv_ (~tv@ Quit (Remote host closed the connection)
[19:07] * Tv_ (~tv@2607:f298:a:607:38b3:897f:20fd:72b9) has joined #ceph
[19:08] * Tv_ (~tv@2607:f298:a:607:38b3:897f:20fd:72b9) Quit (Remote host closed the connection)
[19:10] * Tv_ (~tv@2607:f298:a:607:38b3:897f:20fd:72b9) has joined #ceph
[19:16] * tnt (~tnt@89.40-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[19:28] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Quit: leaving)
[19:31] <nhmhome> Sam: Couldn't hear you during the meeting, what results were you seeing?
[19:34] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:39] * Ryan_Lane (~Adium@ has joined #ceph
[19:46] * BManojlovic (~steki@ has joined #ceph
[20:00] * maelfius (~mdrnstm@ has joined #ceph
[20:02] * mib_khqbyk (43c87e04@ircip1.mibbit.com) has joined #ceph
[20:06] * mib_khqbyk (43c87e04@ircip1.mibbit.com) has left #ceph
[20:09] * rlr219 (43c87e04@ircip2.mibbit.com) has joined #ceph
[20:22] <Tv_> sagewk: fyi created http://tracker.newdream.net/issues/3013 for the ceph-osd --get-journal-uuid etc tickets; would appreciate your input for whether we should be using "fsid" or "uuid" going forward
[20:23] <sagewk> tv_: oof... i think we're stuck with fsid, but i wish we'd used uuid from the start.
[20:23] <sagewk> easiest fix is probably to make --osd-uuid be --osd-fsid
[20:23] <sagewk> since it's undocumented (until my email to mandell)
[20:23] <Tv_> sagewk: well a lot of the stuff exists in both variants, in the options
[20:24] <sagewk> i guess the --get-* ones aren't documented either.
[20:24] <Tv_> sagewk: so choose which one to document prominently.. sounds like uuid is the winner, fsid is legacy
[20:24] <sagewk> but the files are _fsid
[20:24] <Tv_> yeah that's fine
[20:24] <Tv_> with the --get stuff, nobody should touch the files directly ;)
[20:24] <dmick> start making symlinks?
[20:24] <Tv_> (including me; need to cleanup later)
[20:24] <Tv_> dmick: naah
[20:24] <sagewk> and you are the only --get-*-fsid so far?
[20:24] <elder> Hard links
[20:24] <Tv_> sagewk: i'm currently reading the files direct
[20:24] <sagewk> k
[20:24] <Tv_> sagewk: that code predates --get-*
[20:25] <sagewk> uuid then
[20:25] <Tv_> i'm also writing some of them well before --mkfs ;)
[20:25] <Tv_> ok
[20:25] <Tv_> sagewk: i'll update the ticket
[20:25] <sagewk> well... :/ "cluster fsid" is all over the place, not just in these options
[20:25] <Tv_> hah
[20:25] <sagewk> the fact that the fsid is type uuid is a detail
[20:26] <sagewk> is cluster fsid and osd uuid too weird?
[20:26] <Tv_> "Documentation should prefer the term "uuid" everywhere, the "fsid" spelling is legacy."
[20:26] <Tv_> sagewk: we can make the config parser start taking cluster uuid too at some point, if we want, and even transition the data files etc.. but it's not very critical
[20:27] <Tv_> sagewk: but i don't see anything really difficult about the transition
[20:27] <sagewk> k
[20:31] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[20:49] * wijet (~wijet@pc-57071.zdnet.com.pl) has joined #ceph
[20:51] <wijet> I have a problem with ceph performance, I'm using Postgresql on ceph block device, with xfs
[20:52] <nhmhome> wijet: ah, what kind of write patterns from postgresql?
[20:52] <wijet> all writes are very slow, a simple update takes around 5000ms to execute, and when I tried local file storage it was around 40ms
[20:53] <nhmhome> wijet: 5000ms is pretty awful!
[20:53] <wijet> I think, it wasn't that slow from the beginning
[20:53] <wijet> and it could happen after one of the upgrades
[20:53] <nhmhome> wijet: how old is the filesystem?
[20:53] <wijet> pretty new, I've recently created it
[20:54] <wijet> I'm using ceph 0.48argonaut (commit:c2b20ca74249892c8e5e40c12aa14446a2bf2030)
[20:54] <wijet> on both client and server, with kernel 3.2.0-2-amd64
[20:54] <nhmhome> Ok. Are you using rbd caching?
[20:55] <wijet> I'm not sure, how can I check it?
[20:57] <nhmhome> wijet: http://ceph.com/wiki/QEMU-RBD
[21:01] * mgalkiewicz (~mgalkiewi@toya.hederanetworks.net) has joined #ceph
[21:03] <nhmhome> also, you can check your ceph.conf file for "rbd_cache", "rbd_cache_size", and "rbd_cache_max_age" settings.
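The settings nhmhome names would live in the client's ceph.conf; a sketch of such a fragment follows. The values are illustrative only, not recommendations, and (as discussed further down in the log) only librbd honors them, not the kernel rbd module.

```ini
; Illustrative client-side ceph.conf fragment -- example values only
[client]
    rbd cache = true
    rbd cache size = 33554432      ; cache size in bytes (32 MB here)
    rbd cache max age = 2.0        ; seconds before dirty data is flushed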
[21:08] <mgalkiewicz> we don't have such settings, and the wiki you pointed out describes an rbd volume connected to a VM, not the way we do this
[21:08] <mgalkiewicz> we are creating an rbd device on the client machine with a simple rbd map
[21:10] <nhmhome> mgalkiewicz: yeah, it doesn't look like we have rbd caching documented outside of using it in qemu as far as I can tell.
[21:10] <mgalkiewicz> rbd cache should be configured on the server, client side or both?
[21:11] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[21:11] * grk (~grk@staticline56607.toya.net.pl) has joined #ceph
[21:11] <nhmhome> mgalkiewicz: afaik it's all client side.
[21:12] <nhmhome> mgalkiewicz: did you see my message about rbd_cache, rbd_cache_size, and rbd_cache_max_age values in ceph.conf?
[21:14] <mgalkiewicz> yes, we don't have such options in ceph.conf
[21:14] <nhmhome> hrm
[21:16] <nhmhome> do you mean you don't have them available, or you just weren't using them?
[21:16] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Quit: leaving)
[21:17] <nhmhome> mgalkiewicz: more background info here: http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg06098.html
[21:18] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[21:18] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[21:18] * Leseb_ is now known as Leseb
[21:19] <nhmhome> not sure what postgresql needs though.
[21:22] <mgalkiewicz> we just don't have them set in ceph.conf
[21:22] <mgalkiewicz> nhmhome: so we should set those options in ceph.conf on server and client side?
[21:26] <wijet> nhmhome: thx, I'm trying these settings
[21:27] * mgalkiewicz (~mgalkiewi@toya.hederanetworks.net) Quit (Quit: Ex-Chat)
[21:32] * nhm (~nhm@184-97-251-210.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[21:35] <nhmhome> wijet: ok, good luck. I'm sporadically afk just so you know
[21:35] <nhmhome> bbiab
[21:38] <dmick> rbd caching is definitely all client
[21:46] <elder> sagewk, is there a guarantee that debugfs initialization *will* occur (re: patch1/4)
[21:47] <wijet> is there a way to display current ceph config?
[21:51] <wijet> I've added rbd_cache = true to global in config, unmounted it, unmapped
[21:52] <wijet> and it doesn't improve postgresql writes, so either it didn't take effect or it wasn't the cause of the issue
[21:56] * John (~john@2607:f298:a:697:bc77:5d6a:fa95:960) has joined #ceph
[21:57] <iggy> I didn't think rbd caching worked with the kernel rbd module (just userspace/librbd)
[21:58] <John> I need an explanation for "osd balance reads", "osd shed reads", "osd shed reads min latency", "osd shed reads min latency diff", and "osd shed reads min latency ratio" for OSD configuration.
[21:59] <dmick> iggy: yes, AFAIK there's no kernel rbd caching
[21:59] <dmick> other than the normal block caching
[22:01] <wijet> ok, so I need to use it via librbd to get rbd caching
[22:02] <iggy> I'm not sure rbd caching will help your situation... at least I wouldn't expect rbd caching to have any impact on writes (it would kind of defeat the point of using it in case of a crash)
[22:03] <wijet> do you have any idea, how can I debug those slow writes?
[22:03] <dmick> well, there are writeback and writethrough modes, so it could potentially have a small effect; I don't have a lot of measurement experience with it though
[22:04] <dmick> but yes, crashing is not good with active caches
[22:04] <dmick> wijet: there is a way to dump the config, searching
[22:05] <iggy> I'd do the normal OSD benchmark stuffs to see if one of your OSDs is impacting all of them
[22:05] <wijet> dmick: but if rbd_cache works only with librbd it won't work for me now
[22:07] <dmick> yes
[22:08] <wijet> iggy: it may be the case, one of the server has disk IO around 80%
[22:10] * nhm (~nhm@184-97-251-210.mpls.qwest.net) has joined #ceph
[22:14] * nhm (~nhm@184-97-251-210.mpls.qwest.net) Quit (Remote host closed the connection)
[22:14] * nhm (~nh@184-97-251-210.mpls.qwest.net) has joined #ceph
[22:15] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[22:16] <wijet> thx for help guys, I will benchmark OSDs and let you know
[22:22] * wijet (~wijet@pc-57071.zdnet.com.pl) has left #ceph
[22:24] * grk (~grk@staticline56607.toya.net.pl) Quit (Quit: grk)
[22:26] * lofejndif (~lsqavnbok@19NAAB03P.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:44] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[22:57] * jeffp (~jplaisanc@net66-219-41-161.static-customer.corenap.com) has left #ceph
[23:04] * lofejndif (~lsqavnbok@19NAAB03P.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[23:05] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[23:05] * didders (~didders@rrcs-71-43-128-65.se.biz.rr.com) Quit (Ping timeout: 480 seconds)
[23:22] * Tobarja1 (~athompson@cpe-071-075-064-255.carolina.res.rr.com) has joined #ceph
[23:26] * Tobarja (~athompson@cpe-071-075-064-255.carolina.res.rr.com) Quit (Ping timeout: 480 seconds)
[23:26] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[23:43] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[23:46] * pentabular (~sean@adsl-71-141-232-146.dsl.snfc21.pacbell.net) has joined #ceph
[23:48] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[23:50] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.