#ceph IRC Log


IRC Log for 2013-09-11

Timestamps are in GMT/BST.

[0:04] * jpieper (~josh@209-6-205-161.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Remote host closed the connection)
[0:04] <sagewk> joshd: merged that; and loicd should have some basic tests tomorrow
[0:05] <joshd> sagewk: you mean unit tests for the monitors? that'd be awesome
[0:07] <sagewk> for the new osd pool create command variations
[0:08] <sagewk> functional, not unit tests :/
[0:08] * sage (~sage@ has joined #ceph
[0:08] <joshd> functionally this was really easy to detect - ceph -s wouldn't work
[0:13] * a2_ (~avati@ip-86-181-132-209.redhat.com) has left #ceph
[0:15] * torment (~torment@pool-96-228-149-152.tampfl.fios.verizon.net) has joined #ceph
[0:17] * doxavore (~doug@99-89-22-187.lightspeed.rcsntx.sbcglobal.net) Quit (Quit: :qa!)
[0:19] * jpieper (~josh@209-6-205-161.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) has joined #ceph
[0:19] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[0:22] * torment4 (~torment@pool-72-64-182-78.tampfl.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[0:24] * sleinen (~Adium@2001:620:0:25:5de5:2743:46c3:2320) Quit (Quit: Leaving.)
[0:24] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[0:26] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[0:27] * KindOne (~KindOne@0001a7db.user.oftc.net) has joined #ceph
[0:28] * madkiss (~madkiss@ has joined #ceph
[0:28] * mozg (~andrei@host86-185-78-26.range86-185.btcentralplus.com) has joined #ceph
[0:29] <mozg> hello guys
[0:29] <mozg> time to change the topic?
[0:29] <mozg> has anyone upgraded from 0.67.2 to 0.67.3 yet?
[0:29] <mozg> i can't find any upgrade instructions
[0:30] <mozg> should I just run apt-get upgrade and restart the ceph services after?
[0:30] <mozg> or should I upgrade packages in a particular order?
[0:30] <MACscr> xarses: with fuel, it's pretty easy to just add a new controller/mon down the line, right? I am missing two systems right now and I'd like to use one of them in the near future as part of my cluster. Wondering if i can just start with 2 mon/controllers and add the 3rd one in about a week or two
[0:32] <mozg> MACscr, i would start with a single monitor
[0:32] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[0:32] <mozg> and add second and third when you are ready
[0:32] <mozg> as you need minimum 1 mon
[0:33] <mozg> the next step up is 3 mons
[0:33] <xarses> MACscr, you need to change to the cluster_ha mode; there is a button at the upper left that looks like settings (when you are in the cluster node list). You can try starting with only the number of controllers you have; I'm not sure of the result
[0:33] <mozg> as there has to be a quorum
[0:34] <MACscr> xarses: ok. Im about a day or two from testing it. Still working on my hardware.
[0:34] <xarses> mozg, 2 is valid, it just won't have a quorum if one is down
[0:34] <xarses> and therefore not recommended
[0:34] <xarses> like, at all
[0:34] <madkiss> quorum is overrated.
[0:34] <madkiss> *gdr*
[0:34] <xarses> madkiss =)
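The quorum point above comes down to majority arithmetic. A minimal sketch, assuming the monitors need a strict majority to form a quorum (the function names are made up for illustration and are not part of any Ceph tool):

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors` (assumed quorum rule)."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be down while a majority still exists."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "mons: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

Under that assumption two monitors tolerate no failures, the same as one, which is why the next useful step up from one monitor is three.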
[0:43] * mcatudal (~mcatudal@142-217-209-54.telebecinternet.net) has joined #ceph
[0:44] * grepory (~Adium@155.sub-70-192-203.myvzw.com) Quit (Quit: Leaving.)
[0:45] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) Quit (Quit: Leaving.)
[0:46] * madkiss (~madkiss@ Quit (Quit: Leaving.)
[0:51] <Tamil> mozg: how did you install ceph v0.67.2?
[0:51] <mozg> Tamil, yeah
[0:51] <mozg> that is what i had before
[0:51] <mozg> i've just upgraded to 0.67.3
[0:51] <mozg> about 10 mins ago
[0:52] <dmick> mozg: the question was "how"
[0:53] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) Quit (Quit: ...)
[0:54] <mozg> Tamil, i've actually upgraded from 0.61.7
[0:54] <mozg> but initially i've installed my cluster using ceph-deploy
[0:55] <Tamil> mozg: http://ceph.com/docs/wip-doc-radosgw/install/upgrading-ceph/#upgrade-procedures - if you already got it going, then fine
[0:55] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[0:57] <mozg> Tamil, does the ceph-deploy install method automatically restart the ceph services?
[1:00] <Tamil> mozg: no, we need to login to each node and restart the daemons
[1:02] * malcolm (~malcolm@silico24.lnk.telstra.net) has joined #ceph
[1:07] <sagewk> gregaf1: opened up https://github.com/ceph/ceph/pull/586
[1:07] <gregaf1> awesome
[1:08] * lightspeed (~lightspee@2001:8b0:16e:1:216:eaff:fe59:4a3c) Quit (Quit: Leaving)
[1:08] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[1:18] <dontalton> is it possible to pass the "missing_host_key_policy" args (from pushy) to ceph-deploy?
[1:25] * lightspeed (~lightspee@2001:8b0:16e:1:216:eaff:fe59:4a3c) has joined #ceph
[1:26] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[1:30] <sjust> bstillwell: any chance you could reproduce?
[1:33] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Remote host closed the connection)
[1:35] * mozg (~andrei@host86-185-78-26.range86-185.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[1:39] <dmick> sjust: kinda personal question
[1:39] * jefferai (~quassel@corkblock.jefferai.org) Quit (Remote host closed the connection)
[1:40] * jefferai (~quassel@corkblock.jefferai.org) has joined #ceph
[1:40] * ircolle (~Adium@c-67-165-237-235.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[1:41] * Steki (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:42] <xarses> alfredodeza: https://github.com/ceph/ceph-deploy/pull/71 still hangs on mon_stats after ceph-deploy mon create
[1:48] * dmick (~dmick@2607:f298:a:607:4d3d:fe55:b729:c3ee) Quit (Quit: Leaving.)
[1:50] * dmick (~dmick@2607:f298:a:607:d8d5:ca8e:728f:e4c9) has joined #ceph
[1:53] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) Quit (Ping timeout: 480 seconds)
[2:00] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[2:08] * dty (~derek@pool-71-114-104-38.washdc.fios.verizon.net) has joined #ceph
[2:08] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[2:21] * LeaChim (~LeaChim@054073b1.skybroadband.com) Quit (Ping timeout: 480 seconds)
[2:22] * antoinerg (~antoine@dsl.static-187-116-74-220.electronicbox.net) Quit (Ping timeout: 480 seconds)
[2:23] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) Quit (Ping timeout: 480 seconds)
[2:27] * sagelap (~sage@2600:1012:b005:5a17:f19a:3ea9:afe1:ad87) has joined #ceph
[2:28] * mistery (~ghena1986@ip98-163-242-143.no.no.cox.net) has joined #ceph
[2:28] <mistery> http://bit.ly/19FkLVd
[2:28] * mistery (~ghena1986@ip98-163-242-143.no.no.cox.net) has left #ceph
[2:36] * xarses (~andreww@ Quit (Ping timeout: 480 seconds)
[2:39] * yy-nm (~Thunderbi@ has joined #ceph
[2:39] * Cube (~Cube@ Quit (Quit: Leaving.)
[2:39] * Cube (~Cube@ has joined #ceph
[2:40] * bandrus (~Adium@ Quit (Quit: Leaving.)
[2:41] * Karcaw_ (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Ping timeout: 480 seconds)
[2:41] * xmltok (~xmltok@pool101.bizrate.com) Quit (Quit: Bye!)
[2:41] * angdraug (~angdraug@ Quit (Quit: Leaving)
[2:45] * Cube (~Cube@ Quit (Read error: Operation timed out)
[2:47] * yanzheng (~zhyan@ has joined #ceph
[3:00] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[3:07] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[3:08] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[3:10] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:10] * sagelap (~sage@2600:1012:b005:5a17:f19a:3ea9:afe1:ad87) Quit (Read error: No route to host)
[3:11] * xarses (~andreww@c-71-202-167-197.hsd1.ca.comcast.net) has joined #ceph
[3:11] * nhm seriously considers learning dtrace.
[3:11] * smiley_ (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[3:16] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[3:16] <dmick> nhm: it's a*maz*ing when it works
[3:16] <nhm> dmick: I've heard the linux port is getting better.
[3:17] <dmick> someone pointed me at some blogging I'd done at Sun the other day, and I was rereading a particular case study...omg it was nice
[3:17] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[3:17] <nhm> dmick: ssd testing so far is hinting at lots of things we are going to need to check out.
[3:17] <dmick> https://blogs.oracle.com/dmick/entry/udp_process_finder is a tiny example of an aspect I love
[3:18] <dmick> the "I don't have a clue, but I can do a general query that points me in the right direction"
[3:18] <nhm> dmick: I need to find a good way to profile lock contention and/or a better way to do wallclock profiling than grabbing gdb stacktraces.
[3:19] <yanzheng> why not use linux perf
[3:21] <nhm> yanzheng: I can't remember the invocation (or if it was even ever implemented) for doing wallclock based profiling.
[3:21] * shimo (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) has joined #ceph
[3:22] <nhm> that, and I need to get on a 3.9 kernel with libunwind support so it actually resolve symbols properly.
[3:22] <dmick> wallclock profiling == "real time spent in this routine, even if blocked/unscheduled"?
[3:23] <nhm> dmick: yeah. the poorman's way to do that is to basically just grab gdb stacktraces over and over.
[3:24] <yanzheng> sounds like wallclock profiling == cycles based profiling.
[3:24] <dmick> and you mean kernel, not userland? because in userland event-based is surely the answer, no? istr it's called gperf?
[3:26] <nhm> dmick: Basically I want to know everything
[3:26] <dmick> heh
[3:26] <dmick> yeah
[3:26] <dmick> dtrace, if properly done, is amazing for that. I mean, to the point where it'd be worth porting to Solaris just to use it
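The "grab gdb stacktraces over and over" approach mentioned above only becomes useful once the samples are counted. A rough sketch of that counting step, assuming each sample is the text of one gdb backtrace with frames on lines starting with `#0`, `#1`, and so on (the helper names and the regular expression are illustrative, not taken from any existing tool):

```python
import re
from collections import Counter

# Matches the start of a gdb frame line, e.g.
#   "#0  0x00007f12 in pthread_cond_wait () from /lib/libpthread.so.0"
#   "#2  main () at m.cc:3"
# and captures the function name.
FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)")


def collapse(sample: str) -> str:
    """Turn one backtrace into 'outermost;...;innermost'."""
    frames = []
    for line in sample.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(match.group(1))
    return ";".join(reversed(frames))


def histogram(samples):
    """Count how often each collapsed stack appears across samples."""
    return Counter(collapse(sample) for sample in samples)
```

Because the samples are taken at wall-clock intervals regardless of whether the thread is running or blocked, a stack that shows up in most samples is where real time is going, including time spent waiting on locks.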
[3:33] * dty (~derek@pool-71-114-104-38.washdc.fios.verizon.net) Quit (Quit: dty)
[3:34] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) has joined #ceph
[3:42] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[3:48] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[3:50] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:58] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[4:01] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[4:04] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Quit: Leaving.)
[4:06] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[4:08] * PITon (~pavel@ Quit (Ping timeout: 480 seconds)
[4:09] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[4:11] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit ()
[4:22] * marrusl_ (~mark@ has joined #ceph
[4:25] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[4:27] * carif (~mcarifio@pool-96-233-32-122.bstnma.fios.verizon.net) Quit (Quit: Ex-Chat)
[4:28] * sprachgenerator (~sprachgen@va-71-48-143-23.dhcp.embarqhsd.net) has joined #ceph
[4:31] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[4:33] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[4:33] * nerdtron (~kenneth@ has joined #ceph
[4:39] * grepory (~Adium@2600:1003:b004:9cb9:f9e7:a16e:6a9d:b835) has joined #ceph
[4:40] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Quit: Leaving...)
[4:41] * kraken (~kraken@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[4:48] * DarkAceZ (~BillyMays@ has joined #ceph
[4:52] * mech422 (~steve@ip68-2-159-8.ph.ph.cox.net) has joined #ceph
[4:52] <mech422> hi all - just wondered if anyone is using a 'layered' FS approach for VMs in production? Using an RBD snapshot/clone for the base OS with a R/W layer on top. If so, what are you using for layering ? UnionFS, aufs, etc etc
[4:56] * wenjianhn (~wenjianhn@ has joined #ceph
[4:59] * glzhao (~glzhao@ has joined #ceph
[5:01] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[5:02] * madkiss (~madkiss@ has joined #ceph
[5:05] * fireD_ (~fireD@93-142-237-144.adsl.net.t-com.hr) has joined #ceph
[5:07] * fireD (~fireD@93-139-191-152.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:09] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[5:10] * madkiss (~madkiss@ Quit (Ping timeout: 480 seconds)
[5:13] * KindTwo (~KindOne@h236.38.186.173.dynamic.ip.windstream.net) has joined #ceph
[5:14] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:14] * KindTwo is now known as KindOne
[5:16] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[5:19] * aliguori (~anthony@ Quit (Remote host closed the connection)
[5:21] * sprachgenerator (~sprachgen@va-71-48-143-23.dhcp.embarqhsd.net) Quit (Quit: sprachgenerator)
[5:28] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[5:28] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[5:31] * carif (~mcarifio@146-115-183-141.c3-0.wtr-ubr1.sbo-wtr.ma.cable.rcn.com) has joined #ceph
[5:36] * marrusl_ (~mark@ Quit (Ping timeout: 480 seconds)
[5:36] * carif (~mcarifio@146-115-183-141.c3-0.wtr-ubr1.sbo-wtr.ma.cable.rcn.com) Quit ()
[5:38] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[5:38] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[5:41] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[5:42] * shang (~ShangWu@ has joined #ceph
[5:44] * mcatudal (~mcatudal@142-217-209-54.telebecinternet.net) Quit (Read error: Operation timed out)
[5:54] <malcolm> Doesn't RBD allow for copy-on-write clones?
[5:54] <malcolm> I'm pretty sure it does
[5:54] * sjm (~sjm@ has joined #ceph
[5:54] <malcolm> http://ceph.com/docs/next/dev/rbd-layering/
[5:54] <malcolm> yep
[5:55] <malcolm> is that what you were getting at mech422?
[6:01] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[6:01] * PITon (~pavel@178-136-128-118.static.vega-ua.net) has joined #ceph
[6:01] * DarkAceZ (~BillyMays@ has joined #ceph
[6:02] <mech422> malcolm: no...
[6:02] <mech422> malcolm: the clones don't (and can't really...) allow rebasing
[6:03] <mech422> so if I have a 'gold' debian image, and I make a COW snapshot on it
[6:03] <mech422> I'm then hosed when I apply security updates to the gold image
[6:04] <mech422> using a 'layered' approach, I can just update the 'gold' image, create a new snapshot, and run a script to adjust all the VMs to use a clone of the updated gold image as the 'read only root'
[6:04] <malcolm> mech422: Fair enough. I get what you are chasing now. That way you can unify your 'core' and install updates on it and have the rest benefit. Cool beans.
[6:05] <mech422> yeah - just wondering what people use for that now ?
[6:08] * sjm (~sjm@ Quit (Remote host closed the connection)
[6:09] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[6:09] * BillK (~BillK-OFT@58-7-172-nwork.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[6:11] * lx0 is now known as lxo
[6:13] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Quit: Leaving...)
[6:18] * penguinLord (~penguinLo@ has joined #ceph
[6:20] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla [Firefox 23.0.1/20130814063812])
[6:21] <penguinLord> I am trying to follow the official guide to install ceph on my machine. It's all working fine, but when I try to install on the ceph-server it does not seem to be detecting the internet connection. The user and the ceph user are on the same machine. Please can someone help? http://pastebin.com/G7HwCx2y
[6:25] * smiley_ (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley_)
[6:25] <nerdtron> penguinLord, are you behind a proxy server or something? or do you have direct access to internet?
[6:26] <penguinLord> nerdtron: I am behind a proxy server. Does ceph-deploy work over ssh?
[6:26] * yy-nm (~Thunderbi@ Quit (Quit: yy-nm)
[6:26] <penguinLord> nerdtron: I guess my problem is that ssh is not picking up the environment variables.
[6:27] <nerdtron> i had problems before installing ceph packages behind a proxy...if you can directly connect to the internet, that would be good
[6:30] <penguinLord> nerdtron: I don't think that is possible for now. I am behind a university proxy server. Is there a way out?
[6:38] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[6:40] * shimo (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[6:43] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[6:50] * ScOut3R (~scout3r@91EC1DC5.catv.pool.telekom.hu) has joined #ceph
[6:53] * shimo (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) has joined #ceph
[7:02] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[7:02] * mech422 (~steve@ip68-2-159-8.ph.ph.cox.net) has left #ceph
[7:04] * ScOut3R (~scout3r@91EC1DC5.catv.pool.telekom.hu) Quit (Remote host closed the connection)
[7:04] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) has joined #ceph
[7:05] * grepory (~Adium@2600:1003:b004:9cb9:f9e7:a16e:6a9d:b835) Quit (Quit: Leaving.)
[7:10] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[7:10] * dontalton (~don@128-107-239-234.cisco.com) Quit (Quit: Leaving)
[7:12] * sjm (~sjm@ has joined #ceph
[7:18] * lightspeed (~lightspee@2001:8b0:16e:1:216:eaff:fe59:4a3c) Quit (Ping timeout: 480 seconds)
[7:19] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[7:26] * sjm (~sjm@ Quit (Ping timeout: 480 seconds)
[7:32] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[7:40] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[7:41] * PITon (~pavel@178-136-128-118.static.vega-ua.net) Quit (Ping timeout: 480 seconds)
[7:41] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[7:42] * PITon (~pavel@178-136-128-118.static.vega-ua.net) has joined #ceph
[7:45] * haomaiwang (~haomaiwan@ has joined #ceph
[7:59] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[8:01] * sleinen1 (~Adium@2001:620:0:25:90ef:184b:775b:8d53) has joined #ceph
[8:04] * penguinLord (~penguinLo@ has joined #ceph
[8:06] <penguinLord> I am having a problem installing ceph following the official document. It gets stuck at apt-get -q update. I am behind a proxy server. Initially I suspected it was a problem with the proxy variables, but ssh ceph-server "env" gives all the proxy variables. Can someone please look at it, or else is there a way to get more verbose output? http://pastebin.com/HxzP5Npi
[8:07] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[8:07] <loicd> joshd
[8:07] <loicd> around ?
[8:09] * foosinn (~stefan@office.unitedcolo.de) has joined #ceph
[8:19] * BillK (~BillK-OFT@58-7-172-nwork.dyn.iinet.net.au) has joined #ceph
[8:22] * Vjarjadian (~IceChat77@05453253.skybroadband.com) Quit (Quit: We be chillin - IceChat style)
[8:22] * sleinen1 (~Adium@2001:620:0:25:90ef:184b:775b:8d53) Quit (Quit: Leaving.)
[8:22] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[8:23] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:27] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[8:30] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[8:34] <xarses> penguinLord, I would suspect it's with your proxy as well. I think I saw some documentation regarding using a proxy somewhere
[8:35] <xarses> worst case, ensure that http_proxy and https_proxy are in the .bashrc | .profile for the remote user and are not squished by sudo
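One way to check the advice above is to capture the output of `env` as seen on the remote side (both as the plain user and under sudo) and look for the two variables. A small sketch, assuming the captured text is in `NAME=value` form; the function name is hypothetical:

```python
REQUIRED = ("http_proxy", "https_proxy")


def missing_proxy_vars(env_output: str, required=REQUIRED):
    """Return the required variables that are absent or empty in `env` output."""
    present = set()
    for line in env_output.splitlines():
        name, sep, value = line.partition("=")
        # Only count variables that are set to a non-empty value.
        if sep and value.strip():
            present.add(name.strip())
    return [var for var in required if var not in present]
```

If the list is empty for the plain ssh session but not for the sudo one, the variables are being dropped by sudo rather than by ssh.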
[8:36] <malcolm> I've got an odd one.. when I do "service ceph start" i get "monclient(hunting): authenticate timed out after 300" when starting my osd's
[8:36] <malcolm> I've rolled out a cluster before and not seen this.. what obvious thing am I missing
[8:38] <xarses> malcolm, I'd guess that you might have a firewall problem reaching the monitor; the osd doesn't have the right authkey if using cephx; or the monitor listed in ceph.conf on the osd isn't available/responding
[8:40] <malcolm> hmmm ok. well mkcephfs worked correctly, so the keys were generated. I don't have a firewall.. so it could be a name resolution issue
[8:40] <malcolm> tho my host file is correct
[8:41] <xarses> otherwise, I'd recommend asking between around 9 am and 6 pm PST, when the inktank guys are around
[8:41] <xarses> and they might be able to help better
[8:41] <malcolm> Actually how do the osd's find their keys?
[8:42] <xarses> when the osd was created, it would have needed access to the ceph.bootstrap-osd.keyring
[8:42] <xarses> ceph could be replaced by the cluster name or ceph would be default
[8:43] <xarses> but ceph-deploy would have taken care of that if you used it
[8:44] <xarses> if you didn't use ceph-deploy I'd recommend it
[8:44] <malcolm> ceph-deploy doesn't work on OpenSuse Tumbleweed
[8:45] * sleinen (~Adium@2001:620:0:2d:244b:fdc4:dd83:59a5) has joined #ceph
[8:45] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has left #ceph
[8:45] <xarses> because of the packages? or does the remote not work?
[8:46] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[8:46] <malcolm> it refuses to run. I'd hack it but I'm really not all that fussed to do so.
[8:46] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[8:46] * sleinen1 (~Adium@2001:620:0:25:51b8:110e:de4b:93b9) has joined #ceph
[8:46] <malcolm> This is a 100% non-production install
[8:47] <wogri_risc> malcolm: why not use a distribution that is well supported by ceph?
[8:50] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) has joined #ceph
[8:50] * foosinn (~stefan@office.unitedcolo.de) Quit (Remote host closed the connection)
[8:52] <malcolm> because I'm trying to use ceph to store my mythtv recordings and I've got an insane idea and an overpowered media box.
[8:53] <malcolm> its really a long story.
[8:53] * sleinen (~Adium@2001:620:0:2d:244b:fdc4:dd83:59a5) Quit (Ping timeout: 480 seconds)
[8:54] <xarses> penguinLord: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-August/003385.html
[8:56] <xarses> malcolm did you follow http://eu.ceph.com/docs/wip-msgauth/config-cluster/mkcephfs/?
[8:57] * foosinn (~stefan@office.unitedcolo.de) has joined #ceph
[8:57] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) Quit (Ping timeout: 480 seconds)
[8:58] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:58] <xarses> and there is no iptables firewall policy blocking 6789 or 6800-7100?
[9:00] * mcatudal (~mcatudal@142-217-209-54.telebecinternet.net) has joined #ceph
[9:00] <malcolm> i can telnet the ports and get an answer
[9:00] <xarses> ok
[9:00] <malcolm> and I checked the iptables -L and its empty
[9:01] <xarses> sounds good
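The telnet check described above can be scripted when there are many ports to probe. A minimal sketch using only the standard library; `port_open` is an illustrative name, and 6789 is simply the monitor port mentioned in the conversation:

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


if __name__ == "__main__":
    print("mon port reachable:", port_open("127.0.0.1", 6789))
```

A successful connect only shows that something is listening and no firewall is in the way; it says nothing about whether the daemon behind the port is healthy.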
[9:01] <xarses> can you post your ceph.conf?
[9:01] <xarses> also which version?
[9:01] <xarses> of ceph
[9:01] <malcolm> Sure where?
[9:03] <xarses> pastebin, paste.openstack.org. wherever
[9:04] <malcolm> http://pastebin.com/XHG4i2Zb
[9:05] <malcolm> and its ceph version 0.68-224-g16b24f1
[9:05] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) Quit (Remote host closed the connection)
[9:08] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[9:08] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[9:09] * thelan_ (~thelan@paris.servme.fr) Quit (Ping timeout: 480 seconds)
[9:11] <xarses> mythMedia matches hostname -s ?
[9:13] <malcolm> yep
[9:13] <xarses> one thing i see is that ceph.conf pushed by ceph-deploy has _ underscores between the words in a variable, you have spaces, which matches the format that ceph-deploy uses as the staging ceph.conf
[9:14] <xarses> although the docs do show the spaces
[9:14] <malcolm> I'd used spaces in the past. I can change it if you think it might help
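On the spaces versus underscores question: as far as I know Ceph treats both spellings as the same option name, but take that as an assumption to verify against the docs for your version. When comparing two ceph.conf files by hand, a tiny normalizer (hypothetical helper, not part of Ceph) makes the two styles comparable:

```python
def normalize_option(name: str) -> str:
    """Collapse runs of whitespace in an option name into single underscores."""
    return "_".join(name.strip().split())


if __name__ == "__main__":
    print(normalize_option("mon initial members"))
```

With both files normalized this way, any remaining difference in option names is a real difference rather than a formatting one.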
[9:14] <malcolm> basically I get the Mounting xfs on mythMedia:/var/lib/ceph/osd/ceph-0
[9:14] <malcolm> then a really long wait
[9:15] <malcolm> and then monclient(hunting): authenticate timed out after 300
[9:15] <malcolm> then librados: osd.0 authentication error (110) Connection timed out
[9:16] <xarses> ceph -s returns a response?
[9:16] <malcolm> nope
[9:16] <xarses> cephx error?
[9:16] <malcolm> Error connecting to cluster: Error
[9:16] <xarses> your mon isn't running then
[9:17] <malcolm> root 14741 0.2 0.0 150632 7148 pts/4 Sl 17:12 0:00 /usr/bin/ceph-mon -i a --pid-file /var/run/ceph/mon.a.pid -c /etc/ceph/ceph.conf
[9:17] <xarses> odd
[9:17] <malcolm> is there a way it could be listening on the wrong ip?
[9:18] <malcolm> can I force it?
[9:18] <wogri_risc> malcolm: netstat -planet | grep mon
[9:18] <malcolm> oh wait.. I am..
[9:18] <xarses> your mon is bound to the one address
[9:18] <malcolm> yeah I just saw that..
[9:18] <xarses> is that the same address that gethostbyname resolves to?
[9:19] <malcolm> and I've got established connections to mon.
[9:19] <malcolm> gethostbyname?
[9:19] <xarses> host `hostname -s`
[9:19] <xarses> possibly nslookup `hostname -s`
[9:19] <malcolm> Host mythMedia not found: 3(NXDOMAIN)
[9:20] <malcolm> so no dns but ive got the ip and name in hosts file
[9:20] <xarses> can you add that to /etc/hosts?
[9:20] <xarses> oh
[9:20] <xarses> odd
[9:21] <malcolm> mythMedia.local mythMedia
[9:21] <malcolm> is in my hosts file
[9:23] <wogri_risc> there's a chance that nslookup doesn't consider /etc/hosts
[9:23] <wogri_risc> ping does, however.
[9:23] <xarses> ping mythMedia would be a better test
[9:23] * mcatudal (~mcatudal@142-217-209-54.telebecinternet.net) Quit (Remote host closed the connection)
[9:23] <xarses> wogri_risc, ya i just tested that
[9:23] <malcolm> PING mythMedia.local ( 56(84) bytes of data.
[9:23] * LeaChim (~LeaChim@054073b1.skybroadband.com) has joined #ceph
[9:23] <xarses> ok
[9:24] <xarses> should be good there
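The difference seen above is that `host` and `nslookup` ask DNS directly, while `ping` goes through the system resolver, which on a typical Linux setup also reads /etc/hosts. A short sketch that resolves a name the same way most programs do, via the standard library (`resolve` is an illustrative name):

```python
import socket


def resolve(name: str):
    """Resolve `name` through the system resolver; return None on failure."""
    try:
        return socket.gethostbyname(name)
    except (OSError, UnicodeError):
        return None


if __name__ == "__main__":
    # socket.gethostname() is the local host name, usually the FQDN or short name.
    print(resolve(socket.gethostname()))
```

If this returns the address the monitor is bound to, name resolution is unlikely to be the problem, even when `host` reports NXDOMAIN.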
[9:24] <malcolm> the netstat -planet |grep mon returned some established connections
[9:24] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[9:24] <xarses> i'd take a stab at restarting the monitor since ceph -s doesn't work
[9:24] <xarses> which should at least return a cephx failure even if it isn't in quorum
[9:25] <wogri_risc> I'm only reading on one eye, but did you look at the logs yet?
[9:25] <malcolm> yep
[9:25] <malcolm> it says it starts.
[9:25] <xarses> the monitor logs?
[9:25] <wogri_risc> yes
[9:25] <malcolm> then the only odd message in the mon logs is >> pipe(0x3254280 sd=21 :0 s=1 pgs=0 cs=0 l=0 c=0x31fef20).fault
[9:26] <malcolm> there are the usual update_stats messages...
[9:26] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) Quit (Remote host closed the connection)
[9:26] <wogri_risc> I can never read those, and have no idea what that means....
[9:26] <wogri_risc> does it say sth about quorum status?
[9:27] <malcolm> no mention of quorum
[9:27] <malcolm> mon.a@-1(probing) e0 initial_members mythMedia, filtering seed monmap
[9:27] * mrprud_ (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) has joined #ceph
[9:27] <malcolm> but no quorum messages
[9:27] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Read error: Connection reset by peer)
[9:28] <xarses> ceph-create-keys -i mythMedia
[9:28] * penguinLord (~penguinLo@ has joined #ceph
[9:28] <xarses> it will bark if the monitor isn't in quorum or otherwise dead
[9:28] <malcolm> create keys is running 6 times?
[9:29] <malcolm> INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
[9:29] <xarses> hehe you can run it again to check the stuck message
[9:29] <wogri_risc> i just restarted a mon on my testcluster
[9:29] <wogri_risc> 2013-09-11 09:28:27.348354 7f60da6ca700 10 mon.a@0(leader) e18 win_election, epoch 8 quorum is 0,1 features are 17179869183
[9:29] * mrprud_ (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Remote host closed the connection)
[9:29] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) has joined #ceph
[9:29] <xarses> ya, so the mon didn't start right
[9:30] <malcolm> I've restarted it a few times
[9:30] <wogri_risc> maybe a keyring issue?
[9:30] <malcolm> made sure there are no ceph processes running
[9:30] <malcolm> it still doesnt come back happy
[9:30] <wogri_risc> I've seen ceph-create-keys hang before
[9:30] <xarses> sounds like what happens when i try to re-create the monitor several times and the keys are all borked up
[9:30] <malcolm> Nice
[9:31] <malcolm> Ok well I'll scrap the whole thing and recreate
[9:31] <malcolm> :D
[9:31] <malcolm> Thanks!
[9:31] <malcolm> key permissions possibly?
[9:31] <xarses> not usually
[9:31] <malcolm> ok cool
[9:32] <xarses> usually an issue with the fsid
[9:32] <malcolm> I'll just fry it and start again. Thanks again everybody!
[9:32] <wogri_risc> I've seen it not creating the keyfile in /var/lib/ceph/mon.sth/keyring
[9:32] <xarses> mon initial members
[9:32] <xarses> and the initial mon.keyring
[9:32] <xarses> not being in line
[9:33] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[9:33] * yy-nm (~Thunderbi@ has joined #ceph
[9:33] <malcolm> Cool :D I'll let you know how I go. But I must fly.. I have cars to drive (its 5:30pm in AU :D)
[9:33] <xarses> after you drop the mon
[9:34] <xarses> if ceph-create-keys is still running you should work on it there
[9:34] <xarses> if that doesn't create all its keys, the whole thing is foobar
[9:34] * wogri_risc agrees with xarses
[9:35] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) Quit (Quit: mancdaz)
[9:36] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) has joined #ceph
[9:36] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) Quit ()
[9:37] * malcolm (~malcolm@silico24.lnk.telstra.net) Quit (Read error: Operation timed out)
[9:41] * mrprud_ (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) has joined #ceph
[9:41] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Read error: Connection reset by peer)
[9:42] * BManojlovic (~steki@ has joined #ceph
[9:47] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[9:47] * ChanServ sets mode +v andreask
[9:53] * mschiff (~mschiff@pD95108ED.dip0.t-ipconnect.de) has joined #ceph
[10:03] * madkiss (~madkiss@2001:6f8:12c3:f00f:d000:ffb3:448f:3155) has joined #ceph
[10:04] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Ping timeout: 480 seconds)
[10:05] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:bc93:8b74:49cd:c096) has joined #ceph
[10:11] * madkiss (~madkiss@2001:6f8:12c3:f00f:d000:ffb3:448f:3155) Quit (Ping timeout: 480 seconds)
[10:12] * gurubert (~r.sander@p4FF5A149.dip0.t-ipconnect.de) has joined #ceph
[10:14] * matt_ (~matt@mail.base3.com.au) has joined #ceph
[10:15] * glzhao (~glzhao@ Quit (Read error: Connection reset by peer)
[10:15] * gurubert (~r.sander@p4FF5A149.dip0.t-ipconnect.de) Quit ()
[10:15] * gurubert (~gurubert@p4FF5A149.dip0.t-ipconnect.de) has joined #ceph
[10:16] <matt_> Anyone seen this error before with any 'ceph' command when going to Dumpling? TypeError: __init__() got an unexpected keyword argument 'clustername'
[10:19] * glzhao (~glzhao@ has joined #ceph
[10:22] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[10:23] * lightspeed (~lightspee@2001:8b0:16e:1:216:eaff:fe59:4a3c) has joined #ceph
[10:24] * ScOut3R (~ScOut3R@catv-89-133-21-203.catv.broadband.hu) has joined #ceph
[10:28] * mancdaz (~darren.bi@ has joined #ceph
[10:29] * penguinLord (~penguinLo@ Quit (Remote host closed the connection)
[10:33] * jbd_ (~jbd_@2001:41d0:52:a00::77) has joined #ceph
[10:36] <wogri_risc> matt_ - when do you see this error? when starting ceph?
[10:36] <wogri_risc> does your ceph.conf contain anything unusual?
[10:37] <matt_> Anything involving the ceph command
[10:37] <wogri_risc> sounds like ceph.conf
[10:37] <wogri_risc> grep clustername /etc/ceph/ceph.conf
[10:37] <matt_> very vanilla ceph.conf... it just has a single monitor and osd
[10:38] <yanzheng> matt_, export PYTHONPATH="/usr/local/lib/python2.7/site-packages/"
[10:38] <yanzheng> try again
[10:40] <matt_> Still the same I'm afraid
[10:41] <yanzheng> did you cleanup old version ceph
[10:41] <matt_> yep, completely purged it and started fresh
[10:41] <matt_> ceph-deploy is doing weird things also. ceph-deploy new is coming up with can't resolve host even though the host resolves just fine
[10:42] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[10:43] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) Quit (Ping timeout: 480 seconds)
[10:44] <andreask> matt_: your distribution?
[10:44] <matt_> 13.04 desktop
[10:44] <matt_> 0.67.3
[10:45] <yanzheng> check if your distribution has old version ceph installed
[10:46] <wogri_risc> dpkg -l | grep ceph
[10:46] <matt_> ceph is 0.67.3raring and ceph-deploy is 1.2.3raring
[10:47] <matt_> no old packages at all
[10:47] <wogri_risc> raring? you ought to use the official packages.
[10:48] <nerdtron> raring??
[10:48] <nerdtron> why not the LTS? ceph is tested on it
[10:48] <matt_> It's a dev machine, it's not production
[10:49] <wogri_risc> even better. use 12.04
[10:50] <matt_> I actually have a 13.04 server running in production with Dumpling and it runs just fine
[10:50] <matt_> It's just this one machine that is causing headaches
[10:55] * ShaunR (~ShaunR@staff.ndchost.com) Quit (Ping timeout: 480 seconds)
[10:58] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[10:58] * PITon (~pavel@178-136-128-118.static.vega-ua.net) Quit (Ping timeout: 480 seconds)
[10:59] * PITon (~pavel@ has joined #ceph
[11:07] <gurubert> has anyone setup samba with ceph as storage backend for high performance fileserving?
[11:11] <absynth> high performance is not ceph's primary use case
[11:14] <nerdtron> ceph is more like high availability and not high performance
[11:15] <andreask> .. and highly scalable
[11:19] <matt_> gurubert, you might be better off looking at gluster for SMB
[11:20] * shimo_ (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) has joined #ceph
[11:23] * shimo__ (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) has joined #ceph
[11:25] * shimo (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[11:25] * shimo__ is now known as shimo
[11:28] * shimo_ (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[11:30] * ScOut3R (~ScOut3R@catv-89-133-21-203.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[11:30] <cfreak201> I remember reading about an issue of running osd/mon on the same node where rbd kernel devices are being mapped. Is this still an issue? I'm planning on moving mon's to nodes that are going to map volumes
[11:31] <wogri_risc> cfreak201: use a virtual machine for mapping. this will work without problems, I guess.
[11:31] <cfreak201> i map it on the host to use it inside qemu machines.. (stuck with an ancient qemu version)
[11:32] <cfreak201> so using another VM isn't really an option
[11:32] <wogri_risc> oh :)
[11:32] <wogri_risc> to be honest I don't know if this is still an issue.
[11:32] <wogri_risc> no, wait
[11:33] <wogri_risc> I believe this is supposed to be working
[11:33] <wogri_risc> mounting cephFS on the same host is a problem, AFAIR
[11:39] * tziOm (~bjornar@ has joined #ceph
[11:43] <Gugge-47527> running osd on the same host as kernel rbd and kernel cephfs can cause deadlocks
[11:43] <Gugge-47527> monitors is no problem
[11:44] <decede> what about osd and rbd without cephfs?
[11:44] <Gugge-47527> running osd on the same host as kernel rbd, and running osd on the same host as kernel ceph can cause deadlocks
[11:44] <gurubert> so, video editing on ceph is not a good idea? ;)
[11:44] <Gugge-47527> monitors is no problem
[11:47] * wenjianhn (~wenjianhn@ Quit (Ping timeout: 480 seconds)
[11:47] <cfreak201> Gugge-47527: ok thanks, I'll give it a try on a test machine
[11:49] <wogri_risc> Gugge-47527 - using a VM for mapping on the same host is OK I assume?
[11:50] * matt_ (~matt@mail.base3.com.au) Quit (Quit: Leaving)
[11:50] <Gugge-47527> sure
[11:50] <wogri_risc> good stuff.
[11:50] * yy-nm (~Thunderbi@ Quit (Quit: yy-nm)
[11:55] * ScOut3R (~scout3r@91EC1DC5.catv.pool.telekom.hu) has joined #ceph
[12:02] <Kioob`Taff> (and OSD + RBD kernel under a Xen Dom0 doesn't work)
[12:02] * iii8 (~Miranda@ Quit (Read error: Connection reset by peer)
[12:04] <nerdtron> is it possible to lower the number of Placement groups?
[12:06] <decede> so the idea of running an OSD on a openstack nova compute node is a bad one?
[12:06] <decede> or is there an rbd userspace client
[12:08] * claenjoy (~leggenda@ has joined #ceph
[12:09] <claenjoy> hello, I need some help to understand how to integrate ceph to openstack
[12:09] <joelio> just used ceph-deploy for the first time in a few months to build a cluster - so much better. Great work
[12:10] <nerdtron> decede, openstack compute nodes can consume a lot of resources, and ceph (during replication) can be resource hungry too
[12:10] * nerdtron (~kenneth@ Quit (Remote host closed the connection)
[12:12] <claenjoy> I have in each node a single DISK how can I set OSD ?
[12:17] <claenjoy> @joelio : did you text to me ?
[12:17] <cephalobot> claenjoy: Error: "joelio" is not a valid command.
[12:17] <claenjoy> >joelio : did you text to me ?
[12:17] <Gugge-47527> decede: qemu accesses rbds in userspace, and runs fine on the same host as an osd
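The co-location rule discussed above (kernel rbd plus an OSD on one host can deadlock, while monitors and userspace rbd via qemu are fine) can be checked with a small script. This is only a sketch: the function name `check_colocation` is made up for illustration, and it assumes mapped kernel rbd devices show up under `/sys/bus/rbd/devices` and that the OSD daemon's process name is `ceph-osd`. Both inputs can be overridden so the logic can be tried on a machine without ceph.

```shell
#!/bin/sh
# Warn when a host both runs ceph-osd and has kernel rbd devices mapped.
#   $1 = directory listing mapped kernel rbd devices
#        (ASSUMPTION: /sys/bus/rbd/devices on the host being checked)
#   $2 = command that succeeds when an OSD daemon is running
#        (ASSUMPTION: the process is named ceph-osd)
check_colocation() {
    rbd_dir="${1:-/sys/bus/rbd/devices}"
    osd_check="${2:-pgrep -x ceph-osd}"

    # Any entry in the directory means at least one image is mapped.
    mapped=0
    if [ -d "$rbd_dir" ] && [ -n "$(ls -A "$rbd_dir" 2>/dev/null)" ]; then
        mapped=1
    fi

    osd=0
    if $osd_check >/dev/null 2>&1; then
        osd=1
    fi

    if [ "$mapped" -eq 1 ] && [ "$osd" -eq 1 ]; then
        echo "WARNING: kernel rbd and ceph-osd on the same host"
        return 1
    fi
    echo "OK"
    return 0
}
```

Running `check_colocation` with no arguments uses the defaults. A host that only runs monitors, or that reaches rbd through qemu in userspace, should report `OK`.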
[12:21] <joelio> claenjoy: nope
[12:22] <claenjoy> #ceph
[12:35] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[12:37] <decede> Gugge-47527: ah good
[12:41] * xdeller (~xdeller@ has joined #ceph
[12:41] * xdeller_ (~xdeller@ has joined #ceph
[12:46] * xdeller (~xdeller@ Quit (Quit: Leaving)
[12:48] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[12:55] * mrprud_ (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Remote host closed the connection)
[13:01] <xdeller_> is there any possible case for 50-sec repeatable performance spikes also causing higher cpu consumption in same moments? inflight io and iowait does not change during spikes significantly
[13:01] <xdeller_> *cause
[13:06] * yanzheng (~zhyan@ has joined #ceph
[13:09] * DarkAce-Z (~BillyMays@ has joined #ceph
[13:09] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[13:09] * ChanServ sets mode +v andreask
[13:10] * malcolm (~malcolm@ has joined #ceph
[13:10] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[13:15] * sprachgenerator (~sprachgen@va-71-48-143-23.dhcp.embarqhsd.net) has joined #ceph
[13:20] <claenjoy> I have multinode grizzly on ubuntu 12.04, all services are working fine. I have 5 nodes -> 1 controller 250gb / 1 network 150gb / 3 compute 500gb, all of them with a single disk. how many osds and mons do I have to install? also, do I need to install mds?
[13:25] * smiley_ (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[13:27] <andreask> you should start with 3 mons and 1 OSD per data disk, no need for mds
[13:27] * gurubert (~gurubert@p4FF5A149.dip0.t-ipconnect.de) has left #ceph
[13:30] * sleinen1 (~Adium@2001:620:0:25:51b8:110e:de4b:93b9) Quit (Quit: Leaving.)
[13:30] * sleinen (~Adium@2001:620:0:2d:746e:85b2:160d:8d9d) has joined #ceph
[13:33] * carif (~mcarifio@146-115-183-141.c3-0.wtr-ubr1.sbo-wtr.ma.cable.rcn.com) has joined #ceph
[13:36] * erwan_taf (~erwan@lns-bzn-48f-62-147-157-222.adsl.proxad.net) Quit (Remote host closed the connection)
[13:36] * erwan_taf (~erwan@lns-bzn-48f-62-147-157-222.adsl.proxad.net) has joined #ceph
[13:38] * sleinen (~Adium@2001:620:0:2d:746e:85b2:160d:8d9d) Quit (Ping timeout: 480 seconds)
[13:39] <cfreak201> can I increase the journal size after the osd has already been added to the cluster with a lower journal size? e.g I upgraded the network connection and now I'm expecting alot more throughput
[13:40] <andreask> claenjoy: ... unless you want to test live-migration and want cephfs mounts, then you also need redundant mds
[13:41] * glzhao (~glzhao@ Quit (Quit: leaving)
[13:42] <Kioob`Taff> cfreak201: shutdown the OSD, then call ceph-osd with the good osdnum and the "--flush-journal". After that, you can resize or move the journal
[13:43] <cfreak201> Kioob`Taff: will that affect any existing data on the osd (given it has shut down cleanly)
[13:43] <cfreak201> ?
[13:44] <Kioob`Taff> It should not. The man page of "ceph-osd" indicates that this is the right procedure to resize the journal, but I didn't look at the source code
[13:44] <cfreak201> Kioob`Taff: just read that as well. Thanks anyway.. I'll try... I've multiple copies of the data so..
[13:44] <Kioob`Taff> (I'm not from Inktank, just a ceph user)
[13:45] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) has joined #ceph
[13:45] <andreask> ... not to forget to run a ceph-osd --mkjournal on the resized journal
[13:46] * sleinen (~Adium@2001:620:0:26:d5a3:d57b:5313:9ea8) has joined #ceph
[13:50] <cfreak201> am I supposed to remove the old journal? Just completed the shutdown, flush, change setting to new size, mkjournal, start again.. and file is still the old size..
[13:54] <andreask> cfreak201: you can remove the old file, yes
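The journal resize sequence described above (stop the OSD, flush the journal, change the size, recreate the journal, start again) can be sketched as a script. It prints the commands by default instead of running them. The `--flush-journal` and `--mkjournal` flags come from the discussion. The `service ceph stop/start osd.N` form is an assumption about how the daemons are managed on the host, and the names `run` and `resize_journal` are hypothetical helpers.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = numeric OSD id, e.g. 3 for osd.3
resize_journal() {
    id="$1"
    # ASSUMPTION: the sysvinit-style service wrapper manages the daemons here.
    run service ceph stop "osd.$id"
    run ceph-osd -i "$id" --flush-journal
    # Manual step: change "osd journal size" in ceph.conf and remove the
    # old journal file, otherwise the file keeps its old size.
    run ceph-osd -i "$id" --mkjournal
    run service ceph start "osd.$id"
}
```

Calling `resize_journal 3` prints the four commands in order, which makes it easy to review them before rerunning with `DRY_RUN=0`.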
[13:56] * carif (~mcarifio@146-115-183-141.c3-0.wtr-ubr1.sbo-wtr.ma.cable.rcn.com) Quit (Quit: Ex-Chat)
[14:00] * diegows (~diegows@ has joined #ceph
[14:03] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:05] <alfredodeza> xarses: ping
[14:05] * kraken (~kraken@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:16] <claenjoy> andreask , thanks a lot ! because I have 1 disk for each node I need to create an new partition , right ? do you suggest xfs or btrfs ?
[14:17] <andreask> clayg: at least a new partition for the OSD yes, and at this time XFS is a good choice
[14:17] <andreask> sorry ... claenjoy ^^^^
[14:18] <claenjoy> andreask : thanks !
[14:20] <claenjoy> andreask : All my nodes (controller, network, compute) have raid 1 or raid 10 with a hardware raid controller, so do I need to make 3 mons or will one be fine ?
[14:21] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) Quit (Ping timeout: 480 seconds)
[14:21] <darkfader> you'd never make just one mon
[14:21] * shang (~ShangWu@ Quit (Ping timeout: 480 seconds)
[14:21] <andreask> no mon no data access!
[14:21] <claenjoy> aaaa perfect now is more clear !
[14:22] <claenjoy> darkfader and andreask , thanks
[14:27] <claenjoy> from the beginning, just so I don't make everything more complicated: when I deployed openstack I set up my user and my id_rsa.pub to log in to the server machines. do I need to create a separate user only for ceph (ceph daemons)? or can I use my default user and just add it in etc/sudoers... ?
[14:28] * markbby (~Adium@ has joined #ceph
[14:30] * berant (~blemmenes@gw01.ussignalcom.com) has joined #ceph
[14:31] <andreask> claenjoy: that should work, yes
[14:31] <claenjoy> andreask thanks !
[14:34] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[14:35] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:36] * agh (~oftc-webi@gw-to-666.outscale.net) has joined #ceph
[14:36] <agh> Hello,
[14:36] <agh> I want to create scripts to automate the deployment of rados gateway instances.
[14:36] * mattt (~mattt@ has joined #ceph
[14:36] * smiley_ (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley_)
[14:37] <agh> I don't understand one thing about keyrings
[14:38] <agh> Is it possible to use the same key for all the gateways, a sort of "radosgw-bootstrap" key, in order to have no manual action to do ?
[14:42] <mattt> running ceph 0.67.2, can't seem to upload files > 1 GB
[14:42] <mattt> using radosgw w/ swift client
[14:44] <mattt> anyone familiar w/ issue and how i fix? :)
[14:45] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[14:53] * shang (~ShangWu@ has joined #ceph
[14:54] * KevinPerks (~Adium@ has joined #ceph
[14:55] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Quit: Page closed)
[14:57] * Yen (~Yen@2a00:f10:103:201:ba27:ebff:fefb:350a) has joined #ceph
[14:59] <claenjoy> on my ubuntu 12.04 64-bit, running openstack grizzly, do you suggest installing the latest ceph release (dumpling)?
[15:00] * penguinLord (~penguinLo@ has joined #ceph
[15:08] * markbby (~Adium@ Quit (Quit: Leaving.)
[15:08] * markbby (~Adium@ has joined #ceph
[15:10] * markbby (~Adium@ Quit ()
[15:10] * sjm (~sjm@ has joined #ceph
[15:10] * markbby (~Adium@ has joined #ceph
[15:11] * andreask1 (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[15:11] * ChanServ sets mode +v andreask1
[15:11] * andreask is now known as Guest6330
[15:11] * andreask1 is now known as andreask
[15:12] * Guest6330 (~andreask@h081217135028.dyn.cm.kabsi.at) Quit (Read error: Connection reset by peer)
[15:12] <penguinLord> I am trying to install ceph on my machine. Initially I faced some problems due to proxy issues. Based on someone's recommendation I configured the proxies for apt-get. Now that seems to be working, but it is still getting stuck at apt-get: http://codetidy.com/6672/ . I guess the issue is that some of the repos are outdated or return 404, but I am not sure. Can someone help?
[15:14] * benner (~benner@ has joined #ceph
[15:14] <wogri_risc> that someone with apt-get proxy was me :)
[15:15] <wogri_risc> and it seems you have stuff in your /etc/apt/sources.list that is outdated.
[15:15] <benner> hi
[15:15] <penguinLord> wogri_risc thanks :)
[15:15] <wogri_risc> penguinLord - go to this box and run apt-get update
[15:15] <wogri_risc> I'm sure you will get the same error
[15:15] <wogri_risc> apt-get update ; echo $? will return non-zero
[15:15] <wogri_risc> fix that
[15:16] <wogri_risc> then ceph-deploy will be able to work
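The check suggested above (run `apt-get update` and look at its exit status) can be wrapped in a small helper. This is a sketch only: the function name `check_exit` is made up, and it works for any command, which is what makes it easy to try without touching apt.

```shell
#!/bin/sh
# Run a command quietly and report whether it exited with status zero.
# Intended use on the target box: check_exit apt-get update
check_exit() {
    if "$@" >/dev/null 2>&1; then
        status=0
    else
        status=$?
    fi

    if [ "$status" -eq 0 ]; then
        echo "ok: $*"
    else
        echo "failed ($status): $*"
    fi
    return "$status"
}
```

If `check_exit apt-get update` reports a failure, the stale entries in `/etc/apt/sources.list` need fixing before ceph-deploy can install packages on that box.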
[15:17] * claenjoy (~leggenda@ Quit (Remote host closed the connection)
[15:18] * clayb (~kvirc@ has joined #ceph
[15:18] * claenjoy (~leggenda@ has joined #ceph
[15:19] <penguinLord> wogri_risc I also thought this was the issue, but it never really occurred to me to check the exit status of the command. Thanks, I will do that :)
[15:19] * thomnico (~thomnico@ has joined #ceph
[15:19] <wogri_risc> you're welcome. I'm off and on my way home. good bye #ceph.
[15:19] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has left #ceph
[15:21] * markbby (~Adium@ Quit (Quit: Leaving.)
[15:21] * markbby (~Adium@ has joined #ceph
[15:27] <tziOm> What is the motivation behind going lgpl with rbd ?
[15:32] * PITon (~pavel@ Quit (Quit: Leaving)
[15:33] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[15:34] * freedomhui (~freedomhu@ has joined #ceph
[15:37] <leseb> scuttlemonkey: ping
[15:38] <alfredodeza> xarses: ping
[15:39] * markbby (~Adium@ Quit (Quit: Leaving.)
[15:39] * markbby (~Adium@ has joined #ceph
[15:40] * markbby (~Adium@ Quit ()
[15:40] * markbby (~Adium@ has joined #ceph
[15:41] * markbby (~Adium@ Quit ()
[15:41] * sileht (~sileht@gizmo.sileht.net) has joined #ceph
[15:41] * markbby (~Adium@ has joined #ceph
[15:43] * markbby (~Adium@ Quit ()
[15:43] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[15:43] * markbby (~Adium@ has joined #ceph
[15:44] <joelio> dumpling upgrade all good.. at least on our test rig
[15:45] * penguinLord (~penguinLo@ has joined #ceph
[15:46] <tziOm> Anyone care to comment on the rbd move to LGPL2? Cant find any discussion around it.
[15:47] <jerker> tziOm: be able to link virtual machines that are not gpl into lgpl? (guessing)
[15:48] <mattt> possible to install 0.67.2 from ceph repos still ?
[15:48] <tziOm> smelling vmware?
[15:48] <mattt> looks like 0.67.3 is now avail
[15:50] <alfredodeza> tziOm: I thought everything ceph was LGPL2
[15:51] <tziOm> at least the license on rbd has changed now, according to the changelog
[15:51] <joelio> alfredodeza: I rolled a new cluster today with ceph-deploy. Much smoother than last time, thanks for the hard work
[15:52] * ebo^ (~ebo@koln-5d812048.pool.mediaWays.net) has joined #ceph
[15:52] <alfredodeza> \o/
[15:52] <alfredodeza> high five
[15:52] <kraken> \o
[15:53] <alfredodeza> o/
[15:53] <kraken> \o
[15:53] <alfredodeza> :D
[15:53] <alfredodeza> joelio: we have a release coming up with improved monitor support
[15:53] <alfredodeza> so much nicer
[15:53] <alfredodeza> it would actually infer from the status of the monitor if it is running correctly (or not)
[15:54] <alfredodeza> joelio: see here for an example: https://github.com/ceph/ceph-deploy/pull/71
[15:54] <joelio> ahh cool, happy to test - I've rebuilt our initial test harness today (we used that to PoC ceph before we put some £££££ in) - I'm using that as a test bed now for our production
[15:55] <joelio> all very smooth.. ceph-deploy worked great.. data fill went good, upgrade seamless
[15:55] * jcsp (~john@ has joined #ceph
[15:55] * dmsimard (~Adium@ has joined #ceph
[15:55] <joelio> alfredodeza: DEBUG output looks like a nice addition!
[15:56] <alfredodeza> :)
[15:56] <mattt> so back to my previous question, anyone seen issues w/ radosgw + swift uploading images > 1 GB ?
[15:56] * BillK (~BillK-OFT@58-7-172-nwork.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:57] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[15:57] <ebo^> i have recently added 2 new osds to my cluster, now i have a lot of pgs backfilling and some in recovery_wait (for several hours). is the recovery_wait part normal or harmful?
[15:58] <joelio> ebo^: ceph osd tree - are they in the CRUSH map?
[15:58] <scuttlemonkey> leseb: hey, was working upstairs...wussup?
[15:58] <joelio> ebo^: or are they orphaned without a node?
[15:58] <ebo^> they are in the tree and are being filled
[15:59] * sjm (~sjm@ Quit (Remote host closed the connection)
[15:59] <joelio> ebo^: and you see cluster activity? it may just be taking time to rebalance?
[15:59] <decede> mattt: any error messages?
[15:59] <Gugge-47527> ebo^: recovery_wait and backfill_wait is normal, its just waiting for other pg's to recover/backfill
[16:00] <leseb> scuttlemonkey: np, I'm doing well and you?
[16:00] <Gugge-47527> it does not take them all at once
[16:00] <ebo^> ty
[16:01] <mattt> decede: the log is so verbose it's hard to see, but what i can tell is that it seems to get several 500s (perhaps retries several times), and then just gives up
[16:01] <scuttlemonkey> not bad, just getting ready to head to new york tomorrow
[16:02] <mattt> decede: same issue as this: http://www.mail-archive.com/ceph-users@lists.ceph.com/msg03579.html
[16:02] <leseb> scuttlemonkey: nice :), I was wondering would it be possible to somehow get a symlink from the last stable version repo to something static like debian-last-stable?
[16:03] <leseb> scuttlemonkey: or perhaps it already exist, I saw that it's available on eu.ceph.com since we can browse it
[16:03] <leseb> s/exist/exists
[16:04] <scuttlemonkey> hmm, I'm not sure
[16:04] <scuttlemonkey> lemme see
[16:04] <leseb> scuttlemonkey: for now I can live without modification from ceph.com since eu.ceph.com provides it, but just in case you know
[16:05] <leseb> scuttlemonkey: it would be quite handy, so we don't need to change the list address every time :)
[16:05] <leseb> scuttlemonkey: it's mainly for CI purpose :)
[16:05] <decede> where are you seeing the 500's in ceph logs or apache?
[16:05] * smiley_ (~smiley@ has joined #ceph
[16:05] * sjm (~sjm@ has joined #ceph
[16:05] <scuttlemonkey> leseb: so something like http://eu.ceph.com/debian-last/ ?
[16:06] <leseb> scuttlemonkey: actually this seems to be symlinked to debian-testing but http://eu.ceph.com/debian/ is ok :)
[16:07] <scuttlemonkey> k
[16:07] * sjm (~sjm@ Quit (Remote host closed the connection)
[16:07] * sjm (~sjm@ has joined #ceph
[16:09] * fretb (~fretb@frederik.pw) Quit (Quit: leaving)
[16:09] * fretb (~fretb@frederik.pw) has joined #ceph
[16:10] <leseb> scuttlemonkey: thanks :)
[16:10] <mattt> decede: the 500s are in the radosgw.log
[16:11] * danieagle (~Daniel@ has joined #ceph
[16:12] <mattt> decede: seeing this: 7f92637f6700 1 ====== req done req=0x1b87060 http_status=500 ======
[16:12] <mattt> decede: 7f92637f6700 0 WARNING: set_req_state_err err_no=27 resorting to 500
[16:15] * jcsp (~john@ Quit (Ping timeout: 480 seconds)
[16:17] * carif (~mcarifio@pool-96-233-32-122.bstnma.fios.verizon.net) has joined #ceph
[16:19] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[16:25] * thomnico (~thomnico@ Quit (Quit: Ex-Chat)
[16:27] * markbby (~Adium@ Quit (Quit: Leaving.)
[16:28] * markbby (~Adium@ has joined #ceph
[16:28] * markbby (~Adium@ Quit ()
[16:29] <penguinLord> I am trying to install ceph using the official guide. Everything works up to the installation on ceph-server, and ceph-deploy mon create ceph-server does not seem to give any error either. But when I try to check the installation using ceph-deploy gatherkeys ceph-server it finds no keys: http://pastebin.com/my0PpACx . Can someone please help?
[16:29] * markbby (~Adium@ has joined #ceph
[16:29] <alfredodeza> penguinLord: it seems your hostname does not match
[16:30] <alfredodeza> e.g. this log line: remote hostname: thePenguinAllianceBenevolentSupremeDictatorForLifeAndBeyond
[16:30] <alfredodeza> that does not match `ceph-server`
[16:30] <alfredodeza> it needs to
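The mismatch pointed out above (the name passed to ceph-deploy differs from the hostname the node reports) can be checked before deploying. A minimal sketch, assuming the short hostname from `hostname -s` is the value that has to match; the function name `hostname_matches` is hypothetical.

```shell
#!/bin/sh
# Compare the name given to ceph-deploy with the hostname the node reports.
#   $1 = name used on the ceph-deploy command line
#   $2 = hostname to compare against (default: output of "hostname -s")
hostname_matches() {
    expected="$1"
    actual="${2:-$(hostname -s)}"
    if [ "$expected" = "$actual" ]; then
        echo "match: $actual"
    else
        echo "mismatch: ceph-deploy was given '$expected' but the node reports '$actual'"
        return 1
    fi
}
```

Run on the node itself, `hostname_matches ceph-server` shows whether an `/etc/hosts` alias is standing in for the machine's real hostname.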
[16:34] <penguinLord> alfredodeza: I have these lines in my /etc/hosts: http://pastebin.com/cX97C8fh being my ip and location for ceph installation. I mean I am installing as a different user on my own machine. Is this creating a problem?
[16:35] <cmdrk> anyone ever try putting 60 OSDs on one box?
[16:37] * vata (~vata@2607:fad8:4:6:1446:1d9e:1a85:3b9) has joined #ceph
[16:37] <nhm> cmdrk: yes
[16:37] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[16:39] * marrusl (~mark@ Quit (Remote host closed the connection)
[16:40] <mattt> decede: think i may have resolved that w/ osd_max_attr_size = 655360 (as suggested in http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/3859)
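For reference, the setting quoted above would look roughly like this as a ceph.conf fragment. The value is the one from the message. Placing it in the `[osd]` section is an assumption, as is the need to restart the OSDs for it to take effect, and the helper name `print_attr_fragment` is made up.

```shell
#!/bin/sh
# Print a ceph.conf fragment with the xattr size limit quoted in the log.
# ASSUMPTION: the option goes in the [osd] section.
print_attr_fragment() {
    cat <<'EOF'
[osd]
    osd_max_attr_size = 655360
EOF
}
```

The output can be reviewed and then merged by hand into `/etc/ceph/ceph.conf` on the OSD hosts.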
[16:40] <joao> nhm, I'm sure he meant to add 'how did it go?' :p
[16:41] * marrusl (~mark@ has joined #ceph
[16:42] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[16:42] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[16:42] <nhm> joao: Hey, just knowing someone has done it is good enough for some people. ;)
[16:43] * penguinLord (~penguinLo@ has joined #ceph
[16:44] <joao> lol
[16:44] <Kioob`Taff> ;)
[16:45] * fretb (~fretb@frederik.pw) Quit (Quit: leaving)
[16:45] * fretb (~fretb@frederik.pw) has joined #ceph
[16:46] * fretb (~fretb@frederik.pw) Quit ()
[16:46] * erice (~erice@ has joined #ceph
[16:47] * fretb (~fretb@frederik.pw) has joined #ceph
[16:47] * fretb (~fretb@frederik.pw) Quit ()
[16:48] * fretb (~fretb@frederik.pw) has joined #ceph
[16:48] * fretb (~fretb@frederik.pw) Quit ()
[16:49] * fretb (~fretb@frederik.pw) has joined #ceph
[16:51] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[16:52] * penguinLord (~penguinLo@ has joined #ceph
[16:53] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[16:54] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[16:55] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) Quit (Remote host closed the connection)
[16:56] <cmdrk> haha, well, i plan to try it myself soon.
[16:56] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[16:56] * freedomhui (~freedomhu@ has joined #ceph
[16:56] <cmdrk> just got a new dell machine with four MD shelfs and 10gb
[16:56] <cmdrk> and lots of ram.
[16:56] <nhm> cmdrk: daisy chained?
[16:57] <cmdrk> two MDs per PERC card
[16:58] * thomnico (~thomnico@ has joined #ceph
[16:59] <cmdrk> ill publish some benchmarks when i get a chance. just got them racked, need to cable :(
[17:02] * Guest3115 (~coyo@thinks.outside.theb0x.org) Quit (Ping timeout: 480 seconds)
[17:03] * Coyo (~coyo@thinks.outside.theb0x.org) has joined #ceph
[17:03] * Coyo is now known as Guest6343
[17:03] <nhm> cmdrk: Might be a bit oversubscribed on the expanders.
[17:04] <nhm> cmdrk: We had a similar Dell HPC system setup with Lustre at my old job and there was some kind of mysterious "high performance" firmware they gave us. Something you might want to ask about. :D
[17:07] * Vjarjadian (~IceChat77@05453253.skybroadband.com) has joined #ceph
[17:07] * markbby (~Adium@ Quit (Quit: Leaving.)
[17:08] * allsystemsarego (~allsystem@ has joined #ceph
[17:10] * markbby (~Adium@ has joined #ceph
[17:10] * bernieke (~bernieke@176-9-206-129.cinfuserver.com) Quit (Ping timeout: 480 seconds)
[17:11] * malcolm (~malcolm@ Quit (Ping timeout: 480 seconds)
[17:11] * sjm (~sjm@ Quit (Remote host closed the connection)
[17:14] * sjm (~sjm@ has joined #ceph
[17:15] * sjm (~sjm@ Quit (Remote host closed the connection)
[17:16] * sjm (~sjm@ has joined #ceph
[17:16] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:17] * danieagle (~Daniel@ Quit (Ping timeout: 480 seconds)
[17:21] * grepory (~Adium@2.sub-70-192-200.myvzw.com) has joined #ceph
[17:22] * sjm (~sjm@ Quit (Remote host closed the connection)
[17:23] * foosinn (~stefan@office.unitedcolo.de) Quit (Quit: Leaving)
[17:24] * sagelap (~sage@2600:1012:b00e:5c39:f19a:3ea9:afe1:ad87) has joined #ceph
[17:27] * danieagle (~Daniel@ has joined #ceph
[17:27] * grepory (~Adium@2.sub-70-192-200.myvzw.com) Quit (Quit: Leaving.)
[17:29] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:29] * jmlowe (~Adium@2601:d:a800:511:2961:4fdf:36a8:9d06) has joined #ceph
[17:29] * carif (~mcarifio@pool-96-233-32-122.bstnma.fios.verizon.net) Quit (Quit: Ex-Chat)
[17:30] * gucki (~smuxi@77-56-39-154.dclient.hispeed.ch) has joined #ceph
[17:31] <jmlowe> so I went from 0.67.2 to 0.67.3 this morning and got about a 4x improvement in sysbench mysql tests for rbd backed vms
[17:31] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[17:32] <janos> uh
[17:32] <janos> wow
[17:32] <jmlowe> I wish I had kept a couple of yesterday's runs, I wasn't expecting much of a difference
[17:33] <janos> i thought ceph was aiming for more enterprise-y stuff. why not testing postgresql instead? ;) */me ducks and puts on flame suit*
[17:33] * BManojlovic (~steki@fo-d- has joined #ceph
[17:33] <jmlowe> well it's actually mariadb with galera
[17:33] <jmlowe> postgres-xc is in the works
[17:34] <janos> i haven't messed with the -xc variant, though i would expect it to still be fairly vanilla
[17:34] * ScOut3R (~scout3r@91EC1DC5.catv.pool.telekom.hu) Quit (Remote host closed the connection)
[17:34] <xdeller_> http://i.imgur.com/8BBWM7o.png - cuttlefish with debug_ms=0 at edges, with debug_ms=1 in the middle
[17:34] <nhm> jmlowe: yikes
[17:34] <xdeller_> one of my osds went somehow crazy to produce such latency peaks
[17:35] <nhm> jmlowe: I just did some rados bench tests on SSDs comparing 0.67.2 and 0.67.3 and didn't see any difference. :D
[17:35] <xdeller_> anyone interested in further debug?
[17:35] <nhm> jmlowe: how do you have rbd configured?
[17:36] <jmlowe> nhm: I didn't think I could detect a performance difference between 0.67.2 vs 0.61.8
[17:37] <jmlowe> nhm: I go from 18 to 24 osd's immediately after upgrading so maybe that masked any slowdowns
[17:37] <jmlowe> nhm: writeback cache, virtio-scsi w/ discard/TRIM
[17:38] <nhm> xdeller_: interesting, that was just debug_ms=1?
[17:38] <nhm> xdeller_: one of the things I'm looking at right now is how much our debugging is hurting IOPS performance.
[17:39] <xdeller_> yup, and more interesting detail - peaks was coupled with cpu peaks too but enabled debug just helped for latency and preserved cpu peaks
[17:39] <nhm> jmlowe: I don't suppose you did any rados bench comparisons? :)
[17:40] * danieagle (~Daniel@ Quit (Quit: inte+ e Obrigado Por tudo mesmo! :-D)
[17:40] <xdeller_> quite funny if it was something other than production
[17:40] <jmlowe> nhm: nope, I fully expected it to be exactly the same, surprised me enough I had to drop what I was doing and go tell the channel
[17:41] <jmlowe> for those playing at home "read/write requests: 20043 (1278.10 per sec.)" and it was 300 and change yesterday
[17:42] <jmlowe> generated from "sysbench --num-threads=128 --max-requests=1000 --test=oltp --oltp-table-size=1000000 --oltp-read-only=off --mysql-host= run"
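As a quick sanity check on the "about 4x" figure: taking "300 and change" as roughly 300 requests per second (an assumption, since the earlier runs were not kept), the ratio works out as below. The helper name `speedup` is made up for illustration.

```shell
#!/bin/sh
# Ratio of two request rates, rounded to two decimals.
#   $1 = new rate (requests per second)
#   $2 = old rate (requests per second)
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.2f\n", new / old }'
}
```

`speedup 1278.10 300` prints `4.26`, which lines up with the roughly 4x improvement reported for 0.67.3.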
[17:42] <nhm> jmlowe: that reminds me, was it across the board for lots of IO patterns?
[17:43] <jmlowe> nhm: did the sysbench arguments answer your io pattern question?
[17:44] <nhm> jmlowe: I don't know sysbench. Is it doing reads and writes? How big?
[17:45] <jmlowe> nhm: a mix of selects, updates, deletes, inserts
[17:45] <nhm> Ah, ok. And how big are the records?
[17:45] <jmlowe> nhm: probably pretty small, I don't know the guts of innodb well enough to say exactly what it's doing
[17:45] <nhm> jmlowe: I've been meaning to do some database on rbd testing.
[17:46] <jmlowe> nhm: very small, 10's of bytes
[17:46] <nhm> jmlowe: ok, good to know.
[17:47] <nhm> no real obvious rbd work in 0.67.3, but some OSD performance improvements.
[17:47] <jmlowe> schema http://pastebin.com/39SzNLHA
[17:48] * grepory (~Adium@115.sub-70-192-204.myvzw.com) has joined #ceph
[17:48] <nhm> jmlowe: I don't know enough about how mysql does things behind the scenes either.
[17:48] <nhm> jmlowe: probably lots of areas things could be tweaked.
[17:49] <jmlowe> well 0.67.3 has my seal of approval, always nice to find some unexpected improvements
[17:50] <nhm> indeed!
[17:50] <alfredodeza> xarses: ping
[17:51] * xarses (~andreww@c-71-202-167-197.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[17:52] * sagelap (~sage@2600:1012:b00e:5c39:f19a:3ea9:afe1:ad87) Quit (Ping timeout: 480 seconds)
[18:01] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:02] * markbby (~Adium@ Quit (Quit: Leaving.)
[18:04] * markbby (~Adium@ has joined #ceph
[18:04] * markbby (~Adium@ Quit (Remote host closed the connection)
[18:04] * markbby (~Adium@ has joined #ceph
[18:05] * sagelap (~sage@2600:1012:b01e:315f:c685:8ff:fe59:d486) has joined #ceph
[18:07] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[18:07] <dmsimard> jmlowe: That's always good news :)
[18:07] * mattch (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[18:11] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[18:12] * mancdaz (~darren.bi@ Quit (Ping timeout: 480 seconds)
[18:12] <sagewk> jmlowe: sounds like that's the osd performance fixes
[18:12] <sagewk> jmlowe: great to hear :)
[18:13] * angdraug (~angdraug@ has joined #ceph
[18:15] * sagelap (~sage@2600:1012:b01e:315f:c685:8ff:fe59:d486) Quit (Ping timeout: 480 seconds)
[18:15] * DarkAce-Z (~BillyMays@ Quit (Ping timeout: 480 seconds)
[18:17] * grepory (~Adium@115.sub-70-192-204.myvzw.com) Quit (Quit: Leaving.)
[18:19] * xarses (~andreww@ has joined #ceph
[18:23] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[18:24] <dmsimard> sagewk: You probably caught me asking this before but does inktank/ceph have any plans for puppet ? There's an initiative now but it's not managed by the community in more of an official way like Chef is
[18:25] <sagewk> no planned work internally; we have our hands full with ceph-deploy and chef. there are puppet recipes/scripts/whatever from enovance, iirc, but i'm not sure we've reviewed them.
[18:25] <sagewk> if someone wants to step up and maintain them, happy to host them under github.com/ceph/ceph-puppet (or whatever)
[18:26] * mattt (~mattt@ Quit (Read error: Connection reset by peer)
[18:26] <sagewk> inktank just doesn't have people to do it. (ceph != inktank, etc.)
[18:27] <xarses> dmsimard: we (mirantis) are using puppet on top of ceph-deploy for deploy methods and puppet for orchestration / config with a module in our fuel product
[18:27] * gucki (~smuxi@77-56-39-154.dclient.hispeed.ch) Quit (Read error: No route to host)
[18:28] <cmdrk> sagewk: i had an issue with disappearing directories in cephfs w/ kernel 3.10 a few weeks ago and you asked me to try the latest ceph-client kernel from github (-rc6 kernel at the time?). is it safe to assume those patches were merged into 3.11 ?
[18:28] <cmdrk> been busy with conferences and havent had time until now to test :(
[18:28] <dmsimard> Yeah, makes sense. The work required in maintaining the initiative and centralizing the effort is certainly not negligible.
[18:28] <xarses> github.com/Mirantis/fuel/deployment/puppet/ceph for stable, development is currently in github/Xarses/fuel/deployment/puppet/ceph
[18:29] <dmsimard> xarses: I've thought about that too ! With ceph-deploy features and improvements coming in rather quickly, I've thought about using puppet with ceph-deploy
[18:29] * gucki (~smuxi@77-56-39-154.dclient.hispeed.ch) has joined #ceph
[18:29] * sjm (~sjm@ has joined #ceph
[18:29] * gucki (~smuxi@77-56-39-154.dclient.hispeed.ch) Quit (Remote host closed the connection)
[18:29] <dmsimard> It seems like development for ceph-deploy will only go so far, though; alfredodeza said it's meant as a tool to get started quickly - probably not as a means to do everything
[18:30] <xarses> you should be able to pull the ceph module out as long as you have a recent(ish) version of stdlib
[18:30] <sagewk> yehudasa: yehuda_hm: https://github.com/ceph/ceph/pull/587
[18:30] <cmdrk> i'd love to have a puppet solution as well :) i have a deployment of about 200-250 OSDs coming up soon (tm)
[18:30] <sagewk> nm, already merged. :)
[18:30] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[18:31] <xarses> https://github.com/xarses/fuel/tree/ceph-fuel-1/deployment/puppet/ceph
[18:31] <xarses> it's kinda tied to openstack, although those classes should be easy to remove
[18:32] <dmsimard> cmdrk: I'm in the same shoes right now, I was happy with ceph-deploy for my proof of concept. It does some things and it does them well. That's why I was wondering if I should wrap something with puppet on top of ceph-deploy
[18:32] <dmsimard> xarses: I'll definitely look into it.
[18:32] * smiley_ (~smiley@ Quit (Quit: smiley_)
[18:32] * DarkAce-Z (~BillyMays@ has joined #ceph
[18:32] * ircolle (~Adium@c-67-165-237-235.hsd1.co.comcast.net) has joined #ceph
[18:33] <dmsimard> cmdrk: but then there's the puppet-ceph initiative that essentially does what ceph-deploy does without ceph-deploy but it's very fragmented - https://github.com/enovance/puppet-ceph/network
[18:33] <cmdrk> interesting
[18:33] <alfredodeza> dmsimard: I updated the documentation for ceph-deploy a while ago to point out you probably should use chef/puppet/other as *opposed* to ceph-deploy
[18:33] <alfredodeza> not wrap it
[18:34] <dmsimard> Yeah, we've had this discussion before :D
[18:34] <alfredodeza> oh right
[18:34] <xarses> =)
[18:34] <alfredodeza> I thought I had that with xarses
[18:34] <alfredodeza> which is doing the same
[18:34] <dmsimard> Maybe both!
[18:34] <alfredodeza> you guys!
[18:34] <yehuda_hm> sagewk: I already pulled that one
[18:34] * xarses grins sheepishly
[18:35] <nhm> any of the wikimedia guys around?
[18:36] <dmsimard> cmdrk: But see, I'm conflicted about puppet-ceph - it does some things that I would do differently.. For instance, activating an OSD going through mkpart, mkfs "manually" instead of using ceph-provided binaries such as ceph-osd prepare
[18:36] <sagewk> yehuda_hm: hmm, on that upgrade failure.. the problem was running next s3tests against dumpling and the test_cors stuff was failing. does that mean the cors fixes aren't backported to dumpling yet?
[18:36] <nhm> paravoid: ping when you have a sec
[18:36] <gregmark> Ceph folk: If you choose to NOT spin an OpenStack VM by booting from a volume or snapshot, your root disk is effectively ephemeral, yes?
[18:36] <dmsimard> There's a lot of development going in many directions, I don't see many pull requests being merged back - that's why I asked sage if there would be some sort of effort in that direction
[18:37] <yehuda_hm> sagewk: correct. I actually prepared the branch yesterday but didn't push because I wasn't sure how to incorporate the Reviewed-by tag (lame, I know ... )
[18:37] <sagewk> cmdrk: the d_prune patch has been merged post 3.11 (for 3.12-rc1).
[18:38] <cmdrk> sagewk: curses, ok. I'll build from the ceph git repo then.
[18:39] <jmlowe> sagewk: hopefully it didn't get lost when Linus's desktop ssd died
[18:39] <sagewk> the patch is dropped from the ceph git tree now that it is upstream, btw;
[18:40] <sagewk> you want to cherry-pick ... letme find the sha1
[18:40] <cmdrk> ok
[18:40] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) has joined #ceph
[18:40] <sagelap> 590fb51f1cf99c4a48a3b1bd65885192e877b561
[18:41] <cmdrk> thanks
[18:41] <jmlowe> https://plus.google.com/+LinusTorvalds/posts/V81f6d7QK9j
[18:42] * DarkAce-Z (~BillyMays@ Quit (Ping timeout: 480 seconds)
[18:45] * sleinen1 (~Adium@ has joined #ceph
[18:45] * nwat (~nwat@eduroam-237-79.ucsc.edu) has joined #ceph
[18:47] * topro (~topro@host-62-245-142-50.customer.m-online.net) Quit (Quit: Konversation terminated!)
[18:50] <joelio> cool, Ceph Day London tickets sorted #
[18:50] <joelio> see you there :)
[18:50] <sagewk> joelio: yay!
[18:50] <joelio> couple of colleagues coming too from other offices
[18:50] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:52] * sleinen (~Adium@2001:620:0:26:d5a3:d57b:5313:9ea8) Quit (Ping timeout: 480 seconds)
[18:53] <scuttlemonkey> joelio: nice!
[18:53] * sleinen1 (~Adium@ Quit (Ping timeout: 480 seconds)
[18:53] * nwat (~nwat@eduroam-237-79.ucsc.edu) Quit (Ping timeout: 480 seconds)
[18:54] * aliguori (~anthony@ has joined #ceph
[18:57] * nwat (~nwat@eduroam-237-79.ucsc.edu) has joined #ceph
[18:59] * DarkAce-Z (~BillyMays@ has joined #ceph
[18:59] * ebo^ (~ebo@koln-5d812048.pool.mediaWays.net) Quit (Quit: Verlassend)
[19:00] * sleinen (~Adium@eduroam-hg-dock-1-48.ethz.ch) has joined #ceph
[19:01] * sjm (~sjm@ Quit (Remote host closed the connection)
[19:01] * sleinen1 (~Adium@2001:620:0:26:5142:9b60:2d82:7335) has joined #ceph
[19:04] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:05] * houkouonchi-work (~linux@ has joined #ceph
[19:06] * Vjarjadian (~IceChat77@05453253.skybroadband.com) Quit (Quit: If you think nobody cares, try missing a few payments)
[19:08] * claenjoy (~leggenda@ Quit (Quit: Leaving.)
[19:08] * sleinen (~Adium@eduroam-hg-dock-1-48.ethz.ch) Quit (Ping timeout: 480 seconds)
[19:10] * DarkAce-Z (~BillyMays@ Quit (Ping timeout: 480 seconds)
[19:11] * grepory (~Adium@170.sub-70-192-192.myvzw.com) has joined #ceph
[19:12] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) has joined #ceph
[19:12] * nwat (~nwat@eduroam-237-79.ucsc.edu) Quit (Ping timeout: 480 seconds)
[19:12] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[19:14] * gaveen (~gaveen@ has joined #ceph
[19:15] * penguinLord (~penguinLo@ has joined #ceph
[19:16] <xarses> alfredodeza, https://github.com/ceph/ceph-deploy/pull/71 works now; kudos
[19:16] <alfredodeza> xarses: great
[19:16] <alfredodeza> I need to make some fixes to it though
[19:16] <xarses> i noticed
[19:16] <paravoid> nhm: pong
[19:16] <xarses> but it doesn't break my install script
[19:17] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) Quit (Quit: mancdaz)
[19:17] * sjusthm (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[19:20] * jbd_ (~jbd_@2001:41d0:52:a00::77) has left #ceph
[19:24] * DarkAceZ (~BillyMays@ has joined #ceph
[19:28] <sjusthm> bstillwell: are you around?
[19:30] * nwat (~nwat@eduroam-237-79.ucsc.edu) has joined #ceph
[19:34] * sjm (~sjm@ has joined #ceph
[19:38] * nwat (~nwat@eduroam-237-79.ucsc.edu) Quit (Read error: Operation timed out)
[19:41] * sjm (~sjm@ Quit (Remote host closed the connection)
[19:42] * penguinLord (~penguinLo@ Quit (Quit: irc2go)
[19:43] * nwat (~nwat@eduroam-237-79.ucsc.edu) has joined #ceph
[19:44] <elmo> hey, we've managed to run our ceph cluster out of space
[19:44] <elmo> but we can't even delete volumes to free up space
[19:44] <elmo> can anyone suggest how we get ourselves out of this self-created hell?
[19:46] <scuttlemonkey> elmo: best/easiest way to recover from cluster full is to add osd(s) and let it rebalance some data
[19:47] <elmo> scuttlemonkey: hmm, I was hoping to avoid travelling to the DC
[19:48] <scuttlemonkey> are all OSDs full?
[19:48] <scuttlemonkey> or are some disks larger than other and might have some free space?
[19:48] <elmo> all disks are the same size unfortunately
[19:49] * penguinLord (~penguinLo@ has joined #ceph
[19:49] <penguinLord> 565
[19:49] <elmo> osd.1 is full at 95%
[19:49] <elmo> osd.7 is near full at 85%
[19:49] <elmo> is it worth trying to rebalance that?
[19:50] <scuttlemonkey> hmm
[19:50] <janos> i'd check how many are near full or not near full and see about maybe increasing the warning threshold
[19:50] <janos> and rebalancing
[19:50] <scuttlemonkey> elmo: ^
[19:50] <janos> but that example is so borderline it likely won't work in my experience
[19:50] <elmo> yeah, we're definitely going to adjust our monitoring thresholds
[19:50] <elmo> I thought we had Tb's free :(
[19:50] <janos> which version of ceph?
[19:51] <elmo> 0.48.3-0ubuntu1~cloud0
[19:51] <elmo> (folsom from the Ubuntu Cloud Archive)
[19:51] * nwat (~nwat@eduroam-237-79.ucsc.edu) Quit (Ping timeout: 480 seconds)
[19:51] <janos> not familiar with that
[19:51] <janos> bobtail generation, cuttlefish, etc?
[19:51] <janos> oh that's argonaut isn't it
[19:52] <scuttlemonkey> wow
[19:52] <elmo> yeah, Argonaut
[19:52] <elmo> :(
[19:52] <janos> i'm not sure what options you have there
[19:52] <scuttlemonkey> yeah, was gonna link this:
[19:52] <xarses> ya probably want a newer version there
[19:52] <scuttlemonkey> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
[19:52] <janos> i jumped in post-argonaut, pre-release-bobtail
[19:52] <scuttlemonkey> but I'm not even sure how much of that is true for argo
[19:53] <janos> you may need to do as scuttlemonkey suggested and add osd's
[19:53] <janos> then seriously think about upgrading
[19:53] * sleinen1 (~Adium@2001:620:0:26:5142:9b60:2d82:7335) Quit (Quit: Leaving.)
[19:53] * sleinen (~Adium@eduroam-hg-dock-1-48.ethz.ch) has joined #ceph
[19:57] * sjm (~sjm@ has joined #ceph
[19:57] <scuttlemonkey> elmo: looks like argo had the ability to shuffle thresholds as well:
[19:57] <scuttlemonkey> http://ceph.com/docs/argonaut/ops/manage/failures/osd/#full-cluster
[19:57] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[19:58] <elmo> aha, the disks are not all the same size
[19:58] <elmo> I'm a moron
[19:58] <elmo> fixing the weighting should fix this
[19:58] <scuttlemonkey> my preference would still be to add OSDs...but maybe with a combination of weighting and threshold manipulation you could fix it enough to manip
[19:58] <scuttlemonkey> ahh
[19:58] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Remote host closed the connection)
[19:58] <elmo> assuming ceph can recover itself
[19:58] <scuttlemonkey> nice
[19:58] * xmltok (~xmltok@pool101.bizrate.com) has joined #ceph
[19:59] * ScOut3R (~scout3r@91EC1DC5.catv.pool.telekom.hu) has joined #ceph
[20:02] * sleinen (~Adium@eduroam-hg-dock-1-48.ethz.ch) Quit (Ping timeout: 480 seconds)
[20:05] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[20:05] * grepory (~Adium@170.sub-70-192-192.myvzw.com) Quit (Quit: Leaving.)
[20:07] <elmo> ok, it seems to be backfilling
[20:07] <elmo> is it safe to reweight multiple drives at once?
[20:08] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[20:08] <scuttlemonkey> elmo: should be ok, yeah
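The weighting fix elmo describes amounts to giving each OSD a CRUSH weight proportional to its disk size. A minimal Python sketch of that idea, assuming the common convention of weight 1.0 per TB; the disk sizes are made-up values, and whether `ceph osd crush reweight` behaves this way on an argonaut (0.48) cluster is an assumption, not something confirmed in the channel:

```python
# Hypothetical inventory: OSD id -> disk size in TB (illustrative values only).
DISK_SIZES_TB = {1: 1.0, 7: 2.0}


def crush_weights(sizes_tb):
    """Return a CRUSH weight per OSD, proportional to disk size.

    Assumes the convention of 1.0 weight per TB of capacity.
    """
    return {osd: round(size, 2) for osd, size in sizes_tb.items()}


def reweight_commands(sizes_tb):
    """Build one reweight command string per OSD (strings only, nothing is run)."""
    weights = crush_weights(sizes_tb)
    return [
        f"ceph osd crush reweight osd.{osd} {weight:.2f}"
        for osd, weight in sorted(weights.items())
    ]


if __name__ == "__main__":
    for line in reweight_commands(DISK_SIZES_TB):
        print(line)
```

The sketch only prints the commands, so they can be reviewed and applied one at a time while watching the backfill.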
[20:09] * grepory (~Adium@2600:1003:b00e:8bd1:b538:ef1c:d1ca:4de7) has joined #ceph
[20:13] <joshd> loicd: what's up?
[20:15] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[20:15] * sjm (~sjm@ Quit (Remote host closed the connection)
[20:16] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[20:17] * sjm (~sjm@ has joined #ceph
[20:19] * alphe (~alphe@0001ac6f.user.oftc.net) has joined #ceph
[20:19] <alphe> hello all nice article on monitors and paxos !
[20:19] <alphe> congratulations to the participants!
[20:20] <mikedawson> Starting to monitor perf dumps from my osd admin sockets. Can anyone give me some pointers on what metrics to focus on to track down spindle contention / slow requests?
[20:20] <scuttlemonkey> alphe: thanks for reading :)
[20:20] <alphe> is there a way to parallelise ceph-deploy commands instead of having them execute on hosts one after the other?
[20:21] <alfredodeza> alphe: not currently
[20:21] <alphe> scuttlemonkey very interesting article I hope one day the same approach of article free speech would be applied to "What do I need to make a windows client comparable to ceph-fuse?" :)
[20:22] <alphe> alfredodeza ok that is what I thought thank you for confirming. I use dsh to upgrade the cluster in parallel
[20:22] * mschiff (~mschiff@pD95108ED.dip0.t-ipconnect.de) Quit (Remote host closed the connection)
[20:22] <alfredodeza> dsh ?
[20:22] <alphe> dsh -F 10 :)
[20:23] <alphe> that tells dsh to fork 10 times and execute the command on each server listed if they are 10 or by groups of 10 :)
[20:24] <alphe> makes my terminal looks like a soup of words but that is quite neat to gain some time :)
[20:24] <alphe> alfredodeza if I remember well what I read, dsh is a python something script to parallelize commands on remote servers ...
[20:25] * nwat (~nwat@eduroam-237-79.ucsc.edu) has joined #ceph
[20:26] <alphe> the command I use is fun dsh -aM -F 10 "apt-get update && apt-get -y upgrade ceph"
[20:27] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) has joined #ceph
[20:27] * mancdaz (~darren.bi@94-195-16-87.zone9.bethere.co.uk) Quit ()
[20:27] <alfredodeza> alphe: that is one thing that worries me about adding parallelization support to ceph-deploy
[20:27] <alfredodeza> it would make it hard to read what is going on
[20:27] <alfredodeza> as the log output would be intertwined
[20:27] <alfredodeza> right now, the log output is what helps a lot of people solve issues that before they couldn't because ceph-deploy was rather silent :)
[20:28] <alphe> alfredodeza yep you are totally right ... apart from if you drop the logs in X files, each fork getting its own log ...
[20:28] * sleinen (~Adium@2001:620:0:25:bd0c:4a06:7a00:4c5f) has joined #ceph
[20:29] <alphe> alfredodeza in general when something goes wrong you have [Err] CALL MOMY OR 911 10 times ... so you can t miss it !
[20:29] <alfredodeza> lol
[20:30] <alphe> alfredodeza you still can put in the help text options --help or -h that the -F option is for people in a hurry who know exactly what they are doing because they tested the command like a zillion times before ...
[20:31] <alfredodeza> I could, but new features for ceph-deploy need to be able to answer this question: https://github.com/ceph/ceph-deploy#why-is-feature-x-not-implemented
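The trade-off discussed above (parallel runs versus readable logs) can be sketched in Python. This is not how ceph-deploy or dsh work; it is a hedged illustration of alphe's idea of one log per fork. The function names `ssh_runner`, `run_parallel`, and `write_logs` are hypothetical, and the sketch assumes an `ssh` client on the path with key-based access to each host:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def ssh_runner(host, command):
    """Run `command` on `host` via the ssh client; return its combined output."""
    result = subprocess.run(
        ["ssh", host, command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


def run_parallel(hosts, command, runner=ssh_runner, forks=10):
    """Run `command` on every host, at most `forks` at a time.

    Returns {host: output}, so each host's output stays in one piece
    instead of being interleaved on the terminal.
    """
    with ThreadPoolExecutor(max_workers=forks) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}


def write_logs(outputs, directory):
    """Write one <host>.log file per host under `directory`."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    for host, text in outputs.items():
        (target / f"{host}.log").write_text(text)
```

Passing a different `runner` makes it possible to try the fan-out logic without touching any real host, and `forks` plays the same role as the `-F 10` in the dsh command quoted above.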
[20:31] <alphe> alfredodeza you saw my post in ceph-users list ?
[20:32] <alphe> it was about a weird problem with last ceph-deploy 1.2.3
[20:32] <alfredodeza> hrmnnn what is the subject
[20:32] <alfredodeza> there are a few known issues I am working on
[20:33] <alphe> problem creating a mds after full wipe
[20:33] <alphe> the ceph-deploy doesn t create the dirs that it needs to write files in ...
[20:33] <alphe> so obviously that clashs ...
[20:33] <alfredodeza> ah yes, I did reply
[20:34] <alphe> ok I missed your reply T___T
[20:34] <alphe> I don t see the reply was it direct to me ?
[20:35] <alfredodeza> it was to you, sage and the list
[20:35] <alfredodeza> alphe: "This looks very unexpected, aside from getting us your distro and ceph version, could you paste the exact way you got here? Like, what commands you ran, in order, with output if possible."
[20:36] <dmsimard> alphe: I'm late to the party, you're talking about the fabric python framework ? (to parallelize commands)
[20:36] <alphe> oh distro 13.04 up to date as much it can be on a normal basis
[20:36] <alphe> ceph 0.67.2-1 raring
[20:36] <alphe> ceph-deploy 1.2.3 raring
[20:37] * sjm (~sjm@ Quit (Remote host closed the connection)
[20:38] * lightspeed (~lightspee@2001:8b0:16e:1:216:eaff:fe59:4a3c) Quit (Ping timeout: 480 seconds)
[20:38] <alphe> dmsimard hum not really it was just a general ask toward parallelism in ceph-deploy
[20:39] <alphe> the command I ran was provided in the first mail alfredodeza
[20:40] <alphe> it is a weird issue and trust me when I say I never experienced it before and I installed like 10 time that cluster :)
[20:40] <alphe> (playing with ceph nodes is my hobby !)
[20:43] <alphe> ok replied
[20:43] <sjusthm> sagewk: left a few comments on the copyfrom thing
[20:44] <alphe> after the upgrade is a reboot recommended or just a restart ceph-all ?
[20:47] * sjm (~sjm@ has joined #ceph
[20:49] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[20:50] * mrprud (~mrprud@ANantes-554-1-275-192.w2-9.abo.wanadoo.fr) Quit (Remote host closed the connection)
[20:50] * alfredodeza is now known as alfredo|noms
[20:53] * doubleg (~doubleg@ has joined #ceph
[20:55] <xarses> chicken alfredo?
[20:55] * xarses snickers
[20:55] <elmo> scuttlemonkey, janos: we're now out of the woods; thanks again for your help
[20:56] <janos> elmo: cool!
[20:56] <Tamil> alphe: just a restart would do
[20:56] <scuttlemonkey> elmo: rawk
[20:57] <alphe> ok rebooted :...(
[20:57] <alphe> and the ceph cluster isn t super happy
[20:59] * thomnico (~thomnico@ Quit (Quit: Ex-Chat)
[20:59] <Tamil> alphe: what are you upgrading from and are you upgrading the whole cluster?
[21:00] <alphe> one mon has an ip of ...
[21:00] <alphe> that is odd ...
[21:00] * thomnico (~thomnico@ has joined #ceph
[21:00] <Tamil> alphe: is the mon upgraded?
[21:01] <alphe> yes
[21:01] * mech422 (~steve@ip68-2-159-8.ph.ph.cox.net) has joined #ceph
[21:02] <alphe> monitor is up to a freaking election loop ...
[21:02] <alphe> endlessly calling for election ...
[21:02] * alphe slaps mon03 be a quiet and happy peon !
[21:02] <Tamil> alphe: http://ceph.com/docs/master/install/upgrading-ceph/ - for future use
[21:02] <mech422> Hi! I shutdown a mon/osd server and changed its hostname - it appears to be annoying the hell out of my cluster trying to get back in now :-)
[21:03] <mech422> I assume there is a key somewhere I need to regenerate ?
[21:03] * erjkvbeurvboerichjbv8fb892r7tg (~hvbeuveig@ has joined #ceph
[21:04] * erjkvbeurvboerichjbv8fb892r7tg (~hvbeuveig@ Quit (Excess Flood)
[21:04] <ntranger_> when running ceph-deploy gatherkeys, I'm getting "WARNING", that it can't find the keyrings in /bootstrap-mds, /bootstrap-osd, and /etc/ceph. I was having the same issues when running scientific linux, and just switched to centos 6.4. Any ideas of what I might be doing wrong?
[21:04] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[21:04] <xarses> mech422 ya, follow the directions for changing the IP address
[21:04] <mech422> xarses: Oh - I missed that doc - let me google
[21:05] <Tamil> ntranger_: do you have the monitors running?
[21:05] <mech422> btw - since I'm screwing around - would it be a good idea to upgrade from .56 ?
[21:05] <xarses> bobtail?
[21:05] <xarses> probably
[21:05] <mech422> can I upgrade one machine at a time to keep the cluster service ?
[21:06] <alphe> damn I think there is a huge problem with my ceph cluster after updating from ceph 0.67.2-1 raring to 0.67.3-1 raring; my osds don't want to launch
[21:06] * Cube (~Cube@ has joined #ceph
[21:06] <alphe> 2013-09-11 16:06:15.397604 7ffd68756700 0 -- :/1002285 >> pipe(0x7ffd64021390 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7ffd640215f0).fault
[21:06] <alphe> Invalid command: saw 0 of pool(<poolname>), expected 1
[21:06] <xarses> mech422 http://ceph.com/docs/master/rados/operations/add-or-rm-mons/
[21:06] <mech422> ahh - thanks much!
[21:07] <xarses> mech422 it sounds like you can, but i dont have any exp there
[21:07] <Tamil> mech422: you can
[21:07] <loicd> does anyone know how gitbuilder pulls dependencies ?
[21:07] <ntranger_> Tamil I ran ceph-deploy mon create, and didn't get any errors. how would I check if they are running?
[21:07] <mech422> cool - if its possible, I'll dig into the docs
[21:07] <Tamil> ntranger: do you see ceph-mon process running?
[21:07] <loicd> it fails on python-nose which I added in http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-precise-amd64-basic/log.cgi?log=03f99ba024cc452ac8ebaa360b2eed72ec028457 probably because I did not add it where I should have
[21:08] <xarses> mech422, the TOC on that doc has a link to "Adding / Removing OSDs"
[21:08] <xarses> which should help with the osd side
[21:08] <mech422> btw - funny thing about my problem - I didn't change the _I.P._ - just the hostname - but my ceph.conf lists mon_host =,,,,
[21:08] <mech422> so I'm not sure why it cares about the name
[21:08] <xarses> mech422 ceph is particular about the hostname
[21:08] <xarses> especially in the monitors
[21:09] <mech422> heh - so it seems :-) Luckily, its just a play cluster atm
[21:09] <mech422> I was renaming stuff to get it ready for production
[21:09] <xarses> I assume you need to follow the same logic as if you re-ip'd it
[21:09] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[21:09] <mech422> anyway - off to RTFM - thanks for the help!
[21:11] <loicd> zackc: ping ?
[21:11] <alphe> ceph -s wasn t showing that the osd were down ...
[21:11] <alphe> that is freaky ...
[21:12] <ntranger_> Tamil yeah, it looks to be running
[21:16] <ntranger_> Tamil when I run top, ceph-mon and ceph-create-key keep popping up
[21:17] <sjusthm> yehuda_hm: what does the bucket-prepare step currently do?
[21:17] <sjusthm> (or are there docs?)
[21:17] * ntranger_ is now known as ntranger
[21:18] <yehuda_hm> sjusthm: it initiates the 2-phase commit
[21:18] <sjusthm> k
[21:18] <sjusthm> and as a side effect we use the entries for gc?
[21:19] <yehuda_hm> we send them asynchronously to the gc; they may get lost though
[21:19] <alphe> arg ...
[21:19] <sjusthm> ok, so the gc log is separate
[21:19] <alphe> dhcpcd or the replication is messing up
[21:19] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[21:19] <yehuda_hm> but yeah, once we're done with the over-write we send the info to the gc
[21:20] <Tamil> ntranger: looks like your ceph-create-keys is still running then
[21:20] * markbby1 (~Adium@ has joined #ceph
[21:21] <Tamil> ntranger: do you see the keyring files generated in /var/lib/ceph/osd, /var/lib/ceph/mon, etc...?
[21:21] <zackc> loicd: pong, sorry i missed you yesterday, was at a partner site
[21:21] <sjusthm> yehuda_hm: dumb question: what happens if the radosgw instance dies after the prepare before the commit?
[21:21] <alphe> what happends if the ceph nodes changes ip on their private network ?
[21:22] <alphe> can that make them go amok, unable to get in sync ?
[21:22] <yehuda_hm> sjusthm: the next time (after a specific amount of time) we access this entry we see that there's some incomplete request there and we go to the actual data and update the index accordingly
[21:22] <alphe> 2013-09-11 16:22:30.647483 7f09937fe700 0 -- >> pipe(0x7f098800c430 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f098800dfb0).fault
[21:22] <loicd> zackc: short question : do you know how gitbuilder finds build dependencies ? http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-precise-amd64-basic/log.cgi?log=03f99ba024cc452ac8ebaa360b2eed72ec028457 fails and I suspect setting it in debian/control ( python-nose ) is not enough
[21:22] <alphe> ceph -s gives me that
[21:22] <sjusthm> ok
[21:22] <yehuda_hm> hmm.. that is, we go to the data anyway, but after a specific amount of time we also update the index
[21:23] <ntranger> Tamil under /var/lib/ceph/mon/ceph-ceph01 there is a keyring there, but under osd and mon proper, there are not.
[21:23] <sjusthm> it blocks updates on the object in the mean time?
[21:23] <zackc> loicd: i do not
[21:23] <loicd> zackc: to be honest I don't know who's more knowledgeable about gitbuilder ;-)
[21:23] * markbby (~Adium@ Quit (Remote host closed the connection)
[21:23] <zackc> loicd: glowell ought to know
[21:23] <loicd> zackc: ok, no problem, I'll figure it out
[21:23] <loicd> zackc: thanks
[21:23] <zackc> loicd: np!
[21:24] <Tamil> ntranger: could you try manually killing the ceph-create-keys process and retry mon create command
[21:24] <yehuda_hm> sjusthm: the bucket index prepare you mean? it serves as a contention pont
[21:24] * alfredo|noms is now known as alfredodeza
[21:24] <yehuda_hm> point
[21:24] <Tamil> ntranger: what distro are you using?
[21:24] <yehuda_hm> damn wireless keyboard
[21:24] <sjusthm> yehuda_hm: ok, so subsequent prepares fail until the timeout? (just clarifying)
[21:24] <sjusthm> yehuda_hm: also, what information does the bucket index have? object existence + version? or just object existence?
[21:24] <ntranger> Centos 6.4
[21:24] <xarses> alfredodeza: having problems with ceph-deploy again
[21:25] <alfredodeza> xarses: what is going on
[21:25] <Tamil> ntranger: i hope you turned off the iptables?
[21:25] <xarses> in 1.0.0, ceph-deploy --overwrite-conf didn't appear to replace parts of /etc/ceph/ceph.conf that weren't in the ceph.conf that ceph-deploy was using
[21:25] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[21:25] <yehuda_hm> sjusthm: no, subsequent prepare does not fail
[21:26] <xarses> in 1.2.3 its truncating the file to what is in ceph-deploy's ceph.conf
[21:26] <alfredodeza> xarses: and now it does?
[21:26] <yehuda_hm> sjusthm: the bucket index can hold multiple entries for in-flight updates
[21:26] <xarses> it might just be that --overwrite-conf didn't work in 1.0.0
[21:27] <alfredodeza> probably
[21:27] <sjusthm> yehuda_hm: on the same object? does the bucket index record the object version?
[21:27] <alfredodeza> I don't recall changing that behavior though xarses
[21:28] <xarses> well, my config is being lost, and it was being put in /etc/ceph/ceph.conf between ceph-deploy new and ceph-deploy --overwrite-conf mon create
[21:28] <yehuda_hm> sjusthm: it's in the same entry within the bucket index omap. it uses the reported pg version about the object
[21:28] <xarses> it was working in 1.0.0
[21:28] * ross_ (~ross@ Quit (Quit: Leaving)
[21:29] <yehuda_hm> it keeps that version, that's how it identifies the last writer
[21:29] <sjusthm> ok, so the commit changes the stored version if it has a larger one and doesn't otherwise?
[21:29] <sjusthm> k
[21:29] <yehuda_hm> yeah
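The prepare/commit behaviour yehuda_hm describes for a bucket index entry can be modelled with a toy Python class. This is a sketch of the rules stated in the conversation (prepare never fails, several updates can be in flight, commit only takes effect when it carries a larger version), not radosgw's actual code; the class name, the integer versions, and the `"write"`/`"delete"` operation labels are all assumptions made for illustration:

```python
class BucketIndexEntry:
    """Toy model of one object's entry in a bucket index."""

    def __init__(self):
        self.pending = {}    # tag -> operation, for in-flight updates
        self.version = None  # version reported by the last writer to win
        self.exists = False  # whether the index currently lists the object

    def prepare(self, tag, operation):
        """First phase: record an in-flight update. Never fails."""
        self.pending[tag] = operation

    def commit(self, tag, version):
        """Second phase: apply the update only if its version is larger.

        Returns True when the stored version changed, False otherwise.
        """
        operation = self.pending.pop(tag)
        if self.version is None or version > self.version:
            self.version = version
            self.exists = operation == "write"
            return True
        return False


if __name__ == "__main__":
    entry = BucketIndexEntry()
    entry.prepare("a", "write")
    entry.prepare("b", "delete")   # a second in-flight update is allowed
    print(entry.commit("b", 7))    # newer version wins
    print(entry.commit("a", 5))    # older version is ignored
    print(entry.version, entry.exists)
```

The model leaves out the timeout-based repair of entries whose commit never arrives and the "originating pool has changed" case mentioned in the channel.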
[21:29] <alfredodeza> xarses: that sounds like a bug to me, although, like I said, I have no recollection of any changes to overwrite conf
[21:29] <yehuda_hm> sjusthm: and/or if the originating pool has changed
[21:29] <alfredodeza> but maybe I fixed something that now triggers that behavior
[21:30] <xarses> alfredodeza: i'll probably just link the two files since I'm running ceph-deploy and need the cluster network and public network settings before mon create, and i'd like to keep my manifest DRY
[21:31] <xarses> should i create a tracker for it?
[21:31] <yehuda_hm> sjusthm: for the bucket index performance issue, there's the big hammer of "blind" buckets, that is -- buckets without an index
[21:31] <ntranger> Tamil its now off.
[21:31] <Tamil> ntranger: now, kill ceph-create-keys , retry mon create
[21:32] <sjusthm> yehuda_hm: what do the garbage objects look like?
[21:32] <sjusthm> could the overwrite stash the information needed by the gc in the head object omap?
[21:32] <sjusthm> (i guess this is what you meant?)
[21:33] <yehuda_hm> sjusthm: the gc objects keep in omap the list of rados objects that need to be removed
[21:33] <sjusthm> yeah, but instead it could be a list of head objects which need to be looked at
[21:34] <yehuda_hm> so, yeah, that's an option, however, not an easy one
[21:34] <sjusthm> why?
[21:34] <yehuda_hm> because certain operations like remove object would need to instead keep the object around
[21:34] <sjusthm> yeah, they would
[21:34] <sjusthm> it would avoid the read though
[21:34] <yehuda_hm> also, we reset objects through removing them
[21:34] <sjusthm> reset?
[21:35] <yehuda_hm> e.g., remove all their xattrs
[21:35] <sjusthm> their librados level xattrs?
[21:35] <yehuda_hm> yes
[21:35] <sjusthm> that could be just another class operations
[21:35] <sjusthm> *operation
[21:36] <yehuda_hm> wouldn't that read the object?
[21:36] <alfredodeza> xarses: please do
[21:36] <sjusthm> I suppose so, could add a clear_xattrs librados primitive
[21:36] <alfredodeza> as detailed as possible so I can go spelunking in the right places :)
[21:36] <sjusthm> do you know which xattrs would need to be removed?
[21:36] <yehuda_hm> no
[21:36] <sjusthm> what are they used for in this case?
[21:36] <ntranger> Tamil ok, I ran it again, with no errors
[21:37] <yehuda_hm> general object metadata
[21:37] <ntranger> Tamil should I run gatherkeys again?
[21:37] <yehuda_hm> user generated stuff, acls, etc.
[21:37] <Tamil> ntranger: yes
[21:37] <sjusthm> oh, user/S3 level?
[21:37] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Read error: Connection reset by peer)
[21:37] <yehuda_hm> not only, but yes
[21:37] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[21:37] <yehuda_hm> it can really be anything
[21:37] <sjusthm> could add a clear_xattrs/clear_omap primitive
[21:37] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[21:37] <ntranger> Tamil worked that time. iptables musta been hosing it up
[21:38] <Tamil> ntranger: yes
[21:38] <ntranger> Tamil Thanks. :)
[21:38] <Tamil> ntranger: np
[21:39] <yehuda_hm> sjusthm: it'd be easier if there was a second 'stream' of data that we could write into
[21:39] <sjusthm> yehuda_hm: blargh
[21:39] <yehuda_hm> heh
[21:39] * sjm (~sjm@ Quit (Remote host closed the connection)
[21:39] <sjusthm> what sort of data?
[21:40] <sjusthm> you could use the omap prefixes
[21:40] <xarses> alfredodeza: angdraug found where the issue is
[21:40] <sjusthm> that is, prefixed omap key spaces
[21:40] <alphe> weird I had to activate the osd again
[21:40] <alphe> then shutdown the third monitor ...
[21:40] <yehuda_hm> what I really need here is a second object that could be accessed through the first object
[21:40] <alphe> and ceph -s from the mds server is all messed up ...
[21:40] <xarses> conf.py write_conf
[21:40] <alphe> but from other points it is ok
[21:40] <alphe> bye
[21:40] * alphe (~alphe@0001ac6f.user.oftc.net) Quit (Quit: Leaving)
[21:40] <sjusthm> again, if it's not a lot of data, it seems like prefixed omap keys would be the way to go
[21:41] <sjusthm> (it's not impossible to allow atomic writes to other objects with the same locator, but it would be annoying)
[21:41] <alfredodeza> xarses: oh I guess that is new from 2013
[21:41] <alfredodeza> but before I took a look :)
[21:41] * mrprud (~mrprud@ANantes-554-1-246-148.w2-1.abo.wanadoo.fr) has joined #ceph
[21:41] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[21:41] <xarses> so i'm not sure which to fix
[21:41] <alfredodeza> xarses: one sec
[21:42] <yehuda_hm> sjusthm: yeah, but all the hackery around getting it to work correctly will make it unusable
[21:42] <sjusthm> prefixed omap keys?
[21:42] <xarses> it complains that the context changed, if /etc/ceph/ceph.conf and ceph.conf from ceph-deploy dont match
[21:42] <mech422> Hmm - it appears if you ceph mon remove FOO and then ceph mon add FOO - you get an 'it already exists' error.
[21:42] <sjusthm> that seems pretty straightforward
[21:42] <sjusthm> heck, we could provide librados helpers based on the existing omap primitives
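The "prefixed omap keys" idea sjusthm raises can be illustrated with a plain Python dict standing in for an object's omap. This is only a sketch of the key-namespace technique; it does not use librados, and the helper names and the `.` separator are hypothetical choices:

```python
SEPARATOR = "."  # assumed separator between the prefix and the real key


def omap_put(omap, prefix, key, value):
    """Store `value` under `key` inside the key space named `prefix`."""
    omap[f"{prefix}{SEPARATOR}{key}"] = value


def omap_list(omap, prefix):
    """Return {key: value} for every entry in the `prefix` key space."""
    lead = f"{prefix}{SEPARATOR}"
    return {
        full_key[len(lead):]: value
        for full_key, value in sorted(omap.items())
        if full_key.startswith(lead)
    }


if __name__ == "__main__":
    omap = {}
    omap_put(omap, "gc", "shadow_obj_1", "pending removal")
    omap_put(omap, "user", "acl", "owner=alice")
    print(omap_list(omap, "gc"))
```

Because the separator is part of the match, a key space named `gc` does not pick up entries from one named `gc2`.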
[21:43] <yehuda_hm> sjusthm: it won't allow really removing an object
[21:43] <sjusthm> is the problem that you already have arbitrary non-prefixed omap keys around?
[21:43] <yehuda_hm> so everything around it will have to change
[21:43] <nhm> yehuda_hm: I was thinking about the real-time detection for writes vs rewrites. I'd go with optimizing for the write case, keeping track of failures due to existing data, and if it turns out to matter then think about switching to read-before-write.
[21:43] <sjusthm> yehuda_hm: the gc would have to do it
[21:43] * S0d0 (joku@a88-113-108-239.elisa-laajakaista.fi) Quit (Ping timeout: 480 seconds)
[21:43] <sjusthm> yehuda_hm: ah, yes
[21:43] <dmick> mech422: http://ceph.com/docs/master/rados/operations/add-or-rm-mons/
[21:43] <nhm> yehuda_hm: but not implement it unless it actually matters.
[21:44] <mech422> dmick: yeah - thats what I'm following :-)
[21:44] <yehuda_hm> instead of trying to read object and get ENOENT, will have to read the object metadata see if it's marked as removed
[21:44] <dmick> remove isn't just "ceph mon remove"
[21:44] <yehuda_hm> nhm: we can do that, we're trying to come up with a solution that won't require reading the original object
[21:44] <mech422> dmick: Oh - since the mon was down, I didn't think I needed to stop it first...
[21:44] <lxo> would it be too inconvenient if ceph-osd started using the available space, rather than the used space, to compute disk utilization for purposes of disk full/nearfull/etc determination?
[21:45] <xarses> --overwrite-conf just throws away /etc/ceph/ceph.conf
[21:45] <mech422> since it was sorta already dead
[21:45] <alfredodeza> xarses: let me know when the ticket is done, and if you can, assign it to me
[21:45] <yehuda_hm> because reading means that if it was just written it needs to go to disk first
[21:45] <alfredodeza> xarses: that doesn't sound like what we want
[21:45] <lxo> ceph-mon gets things right, but ceph-osd will often undercompute the utilization because at least on btrfs the difference can be quite significant when there's a lot of space already reserved for metadata
[21:46] <lxo> and then the disk may fill up, and that can be a bit hard to recover from
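To illustrate lxo's point, here is a small sketch that computes a fill ratio both ways from `os.statvfs`. It assumes the gap lxo describes shows up as a difference between free and available blocks; how btrfs actually reports reserved metadata space is not confirmed by this log. The 0.85/0.95 thresholds and the function names are assumptions for illustration.

```python
import os

NEARFULL_RATIO = 0.85  # assumed threshold, for illustration only
FULL_RATIO = 0.95      # assumed threshold, for illustration only


def used_based(total_blocks, free_blocks):
    """Fill ratio derived from used space: (total - free) / total."""
    return (total_blocks - free_blocks) / total_blocks


def avail_based(total_blocks, avail_blocks):
    """Fill ratio derived from what can still be written: 1 - avail / total."""
    return 1.0 - avail_blocks / total_blocks


def state(ratio):
    """Classify a fill ratio against the assumed thresholds."""
    if ratio >= FULL_RATIO:
        return "full"
    if ratio >= NEARFULL_RATIO:
        return "nearfull"
    return "ok"


def report(path):
    """Compute both ratios for the filesystem holding `path` (POSIX only)."""
    st = os.statvfs(path)
    return {
        "used_based": used_based(st.f_blocks, st.f_bfree),
        "avail_based": avail_based(st.f_blocks, st.f_bavail),
    }
```

When the available count is lower than total minus used, the used-based ratio can still read "ok" while the available-based ratio is already "nearfull", which is the undercounting being described.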
[21:46] <nhm> yehuda_hm: wouldn't the method you proposed yesterday work?
[21:46] <sjusthm> yehuda_hm: mmm, actually I'm not sure deletion avoids that problem
[21:47] <yehuda_hm> nhm: it will, however, it forces a read before the write
[21:47] <dmick> mech422: there are three steps to remove and many more to add. You showed one for each. Chances are good something wasn't done or not done in order, is my point
[21:47] <mech422> dmick: probably - I shall RTFM some more
[21:47] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[21:47] <yehuda_hm> sjusthm: using a second object will work, except for the ordering issue
[21:48] <mech422> dmick: I did monmap/mon keyring/ --mkfs stuff for setting up the 'new' mon
[21:48] <yehuda_hm> which we can probably work around
[21:48] <sjusthm> yehuda_hm: which ordering issue?
[21:48] <mech422> dmick: I just gotta convince the existing mons to play nice with it now
[21:48] <sjusthm> yehuda_hm: isn't the problem that you won't know what to write to the second object without reading the head object?
[21:49] * Vjarjadian (~IceChat77@05453253.skybroadband.com) has joined #ceph
[21:49] <yehuda_hm> sjusthm: when you write an object you also update the second object with the version of that object
[21:49] <sjusthm> oh, I see
[21:49] <yehuda_hm> I mean, with all the relevant data for that object
[21:49] <sjusthm> yeah, makes sense
[21:50] <yehuda_hm> not backward compatible though, but we can have that only on new buckets
[21:50] <sjusthm> the ordering problem doesn't seem to be a problem, the writer just includes the pg version returned by the successful write when it updates the gc log object
[21:50] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[21:50] <yehuda_hm> but you don't have that info when you update the second object, also pg version can change if you copy the pool
[21:51] <yehuda_hm> there are ways around it, definitely
[21:51] <sjusthm> yeah
[21:52] <xarses> alfredodeza: slightly similar, but separate topic. If ceph-deploy mon create is performed to add an additional monitor should mon_host and or mon_initial_members be updated? should ceph-deploy perform this update?
[21:52] <yehuda_hm> the gc can first read the second object, then read the first object's meta
[21:52] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[21:52] <alfredodeza> xarses: I would say yes
[21:52] <alfredodeza> but that is not the current behavior
[21:52] <xarses> that's what i've seen
[21:52] <xarses> I'll create a tracker for that as well
[21:53] <yehuda_hm> if the object tag in the meta is within the list of objects it read then it means that all the other versions are old ones
[21:53] <sjusthm> right
[21:53] <xarses> alfredodeza, are you going to remove the openID support from tracker.ceph.com since they are shutting down?
[21:54] <alfredodeza> good question for houkouonchi-home maybe? ^ ^
[21:54] <yehuda_hm> hmm.. is it right?
[21:55] <sjusthm> I think any version at all which doesn't match the head object's metadata is necessarily garbage
[21:55] <sjusthm> wait
[21:55] <yehuda_hm> no, it may be a new write
[21:55] <sjusthm> yeah, just realized that
[21:56] <sjusthm> any reason to write to the second object before we know the pg version of the overwrite?
[21:56] <sjusthm> trying to avoid losing the garbage?
[21:56] <yehuda_hm> hmm.. we can just fire and forget
[21:56] <sjusthm> ah
[21:57] <yehuda_hm> what if both objects use the same locator?
[21:57] <yehuda_hm> will ordering be preserved?
[21:57] <sjusthm> not in general
[21:57] * sjm (~sjm@ has joined #ceph
[21:58] <sjusthm> or are you trying to exploit the relationship between the pg versions?
[21:58] <yehuda_hm> ah, no, had something else in mind, but it won't work
[21:58] <yehuda_hm> if we could bundle two requests for the same pg together it would work
[21:59] <sjusthm> not without great effor
[21:59] <sjusthm> *effort
[21:59] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[21:59] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[21:59] <sjusthm> the main reason to not put this on the head object itself is to preserve the fast -ENOENT after delete behavior?
[21:59] <sjusthm> or compatibility?
[22:00] <yehuda_hm> it makes life much easier for the gateway
[22:00] <yehuda_hm> all the semantics there is already too convoluted, no need to make it more complicated
[22:00] <sjusthm> it seems like the problem of knowing when you can get rid of the head object is pretty much the same as the problem of knowing when you can get rid of this second object
[22:00] <yehuda_hm> you might want to keep the second object around
[22:01] <sjusthm> only until you've cleaned up the required garbage (unless you have many objects mapping to the same second object which would be interesting)
[22:01] <sjusthm> actually at that point it's just another kind of gc log
[22:01] <yehuda_hm> well, you can have the gc do it for you
[22:01] <sjusthm> that's what I meant
[22:01] <sjusthm> in either case the gc would have to take care of it
[22:02] <nhm> yehuda_hm: I thought your proposal yesterday did away with the read before the write by assuming the data didn't exist, but then caused an extra read on top of the initial write attempt for overwrites?
[22:02] <yehuda_hm> nhm, right
[22:03] <nhm> I think you should do that, keep track of write failures due to existing data, and if we find out that it's a big deal, then implement an optimization to dynamically detect lots of rewrites and revert back to the read-before-write codd.
[22:03] <nhm> code
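A rough sketch of what nhm describes: write optimistically, count how often the write fails because data already existed, and switch to read-before-write only if such collisions dominate. The class name, window size, and threshold are hypothetical; nothing here reflects actual radosgw code.

```python
class WriteStrategy:
    """Tracks create-first write collisions and decides when to read before writing."""

    def __init__(self, window=100, threshold=0.5):
        self.window = window        # number of writes per evaluation period
        self.threshold = threshold  # collision fraction that triggers read-first
        self.attempts = 0
        self.collisions = 0
        self.read_first = False     # start by assuming objects do not exist yet

    def record(self, already_existed):
        """Record one write outcome; re-evaluate the mode at the end of each window."""
        self.attempts += 1
        if already_existed:
            self.collisions += 1
        if self.attempts >= self.window:
            ratio = self.collisions / self.attempts
            self.read_first = ratio >= self.threshold
            self.attempts = 0
            self.collisions = 0
```

The counters alone cover the first step nhm suggests (measure whether rewrites matter); the `read_first` flag is the later optimization, to be added only if the numbers justify it.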
[22:04] <yehuda_hm> nhm: I want to explore other possibilities before selecting one approach
[22:04] <yehuda_hm> sjusthm: this solution will help with having versioning
[22:05] <sjustlaptop> the second object?
[22:05] <yehuda_hm> yes
[22:05] <sjustlaptop> yeah
[22:05] <sjustlaptop> would also work to have the versioning metadata stashed in the head object
[22:07] <yehuda_hm> sjustlaptop: I really think this is a too complicated and fragile approach
[22:07] <sjustlaptop> yeah, the second object variant would work as well
[22:07] <sjustlaptop> they seem to me to be almost equivalent
[22:07] <nhm> yehuda_hm: sure
[22:08] <yehuda_hm> sjustlaptop, except for the ordering thing
[22:08] <sjustlaptop> right, with the info on the head, you can do atomic updates
[22:08] <sjustlaptop> (atomic with the head overwrite, that is)
[22:09] <yehuda_hm> not sure how it compares with the first read + write in parallel scheme
[22:10] <yehuda_hm> it only works better for overwrites
[22:10] <sjustlaptop> yeah
[22:10] <sjustlaptop> and it has the massive benefit of not changing your on-rados format
[22:10] <mikedawson> nhm: do you have any advice on which metrics to analyze from osd admin socket perf dumps in my search for RBD performance issues that are associated with spindle contention from scrub or deep-scrub?
[22:10] <yehuda_hm> for first write not so much (you do two writes)
[22:10] <sjustlaptop> yehuda_hm: hmm?
[22:11] <sjustlaptop> yehuda_hm: with the head variant, the second write is to an omap/xattr entry on the same object
[22:11] <yehuda_hm> if it's not overwrite you end up writing to the first and the second object
[22:11] <yehuda_hm> right
[22:11] <sjustlaptop> with the two objects variant you can do the two write in parallel
[22:11] <sjustlaptop> *two writes
[22:11] <yehuda_hm> but you still do two writes
[22:11] <sjustlaptop> true
[22:11] <nhm> mikedawson: great question. I suppose it kind of depends on how you want to go about it. In general dump_historic_ops is really useful as a prelude to scouring over debug20 logs.
[22:12] <sjustlaptop> you don't need to wait for the second object write to complete, though, so it doesn't seem to hurt latency
[22:12] <yehuda_hm> no, just more load on the osds
[22:12] <sjustlaptop> yeah
[22:12] <nhm> mikedawson: but that will probably just tell you what you already know if you are already suspecting that you are maxing seeks on the disks.
[22:13] <yehuda_hm> which can hurt performance
[22:13] <sjustlaptop> the extra omap update on the on-head variant is probably no worse than the object_info update it has to do anyway, so that probably isn't much of an issue
[22:14] <yehuda_hm> well, the data there can be really large
[22:14] <sjustlaptop> you need to journal the data?
[22:14] <sjustlaptop> I thought you just needed enough info to make the gc work
[22:14] * sjm (~sjm@ Quit (Remote host closed the connection)
[22:14] <mikedawson> nhm: yeah, I monitor %util from iostat -x on all drives. When the scrubs start, spindles go from between 10% and 20% up to about 100%, then client i/o goes to hell
[22:14] <sjustlaptop> the on-head thing wouldn't work if you have to store actual data
[22:14] <sjustlaptop> or is it the list of objects?
[22:14] <yehuda_hm> sjustlaptop: depends, for having versioning work you probably need to keep the entire manifest
[22:15] <sjustlaptop> oh, because the number of objects in a multipart-upload can be big?
[22:15] <yehuda_hm> yeah
[22:15] <nhm> mikedawson: ever played with blktrace?
[22:15] <sjustlaptop> hmm, that's a problem with the second object approach as well
[22:15] <yehuda_hm> in any upload, not just multipart
[22:15] * BillK (~BillK-OFT@58-7-172-nwork.dyn.iinet.net.au) has joined #ceph
[22:15] <sjustlaptop> yeah
[22:15] * markbby1 (~Adium@ Quit (Quit: Leaving.)
[22:16] * sjm (~sjm@ has joined #ceph
[22:16] <sjustlaptop> sounds like you'd need an indirection block
[22:16] <yehuda_hm> sjustlaptop: with the second object approach you're going to do it in parallel
[22:16] <mikedawson> nhm: no. What do you recommend?
[22:17] <sjustlaptop> yehuda_hm: yeah, but you might have to store many of those manifests, and you wouldn't necessarily want to do that on 1 object
[22:17] <nhm> mikedawson: blktrace lets you attach to a block device and record *every* io going to the disk. You can either inspect that data manually, or use it to create graphs showing the seek behavior using seekwatcher
[22:17] <yehuda_hm> yeah
[22:18] <sjustlaptop> sounds like when you write out the head object, you also need to write out an indirection object containing the same manifest and refer to that from the head/second object after an overwrite
[22:18] <sjustlaptop> and deal gracefully with the case when you can't find it (since you don't want to wait for it to get written)
[22:19] <sjustlaptop> where graceful might mean leaking?
[22:19] <yehuda_hm> well, I had in mind doing that only for buckets that have versioning turned on
[22:19] <sjustlaptop> ah
[22:19] <nhm> mikedawson: super old example: http://nhm.ceph.com/movies/sprint/test5/xfs_4194304Bytes_OSD1.mpg
[22:19] <yehuda_hm> because it has performance implications
[22:19] <sjustlaptop> yeah
[22:20] <xarses> issue 6281
[22:20] <kraken> xarses might be talking about: http://tracker.ceph.com/issues/6281 [ceph-deploy config.py write_conf throws away old config just because they are out of sync]
[22:20] <nhm> mikedawson: if you were really really motivated you might be able to figure out a way to color code scrub activity vs normal activity.
[22:20] <alfredodeza> norris xarses
[22:20] <kraken> xarses won the World Series of Poker using Pokemon cards
[22:20] <yehuda_hm> but for the general case, you can still have that on one object, and let the gc get the entries from the second object and trim them
[22:21] <sjustlaptop> yeah, but nevertheless, the second object might have many of those manifests waiting for the gc to trim
[22:21] <ntranger> Tamil we have 3 nodes with 12 drives in each node. How would you recommend setting up the OSD's?
[22:21] * bandrus (~Adium@ has joined #ceph
[22:22] <yehuda_hm> sjustlaptop: it doesn't need to reside in omap, it can just be a big unordered list of blobs
[22:22] <mikedawson> nhm: interesting, I'll give it a try. Can you use blktrace to gain any insight into the execution path of the ceph binaries (i.e. why is ceph hammering my disks)?
[22:22] <sjustlaptop> yehuda_hm: which you trim by reading and then re-writing without the trimmed entries?
[22:23] <sjustlaptop> the omap part isn't really a problem
[22:23] <nhm> mikedawson: nope, it just tells you how the disk is being hammered.
[22:23] <yehuda_hm> sjustlaptop: yeah, something like that
[22:23] <sjustlaptop> it seems that if the second object can reasonably store N manifests
[22:23] <nhm> mikedawson: exact IO sizes, if it's metadata traffic, where the write actually went, etc.
[22:23] <sjustlaptop> then the head object can reasonably store N+1 manifests
[22:23] <mikedawson> nhm: cool. thanks!
[22:24] <sjustlaptop> omap would actually handle it better since you wouldn't need to read and then write all of the un-trimmed entries
[22:24] <nhm> mikedawson: with that and high ceph debugging levels, you can start to paint a picture of what happened when and what effect it had, but it's a lot of data and a lot of work.
[22:24] <yehuda_hm> yeah, omap will also do
[22:25] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[22:25] <nhm> mikedawson: you can also look at perf for profiling, but it's not as good as I want it to be.
[22:25] <josef> sagewk: not sure if you heard the swearing
[22:25] <josef> but ceph is failing to build on ppc
[22:25] <josef> http://kojipkgs.fedoraproject.org//work/tasks/4881/5924881/build.log
[22:26] <sagewk> heh
[22:26] <josef> down towards the bottom
[22:26] <josef> common/crc32c-intel.c:80: error: unknown register name 'edx' in 'asm'
[22:26] <yehuda_hm> hmm.. ok so if going to the single head solution, it'd be great if we could create some multi-instances object
[22:26] <xarses> issue 6282
[22:26] <kraken> xarses might be talking about: http://tracker.ceph.com/issues/6282 [ceph-deploy mon create should update mon_host and/or mon_initial_members]
[22:26] <nhm> I really really want a tool that can do wall-clock profiling and show how the distribution of wait times changes over time.
[22:26] <yehuda_hm> what I called 'streams' (sorry) earlier
[22:26] <sagewk> josef: ah, crap. looks like i screwed up the arch detection..
[22:26] <sjustlaptop> yehuda_hm: I say again: blargh
[22:26] <yehuda_hm> s/streams/instances
[22:27] <nhm> sagewk: must work on ARM since that would break there otherwise?
[22:27] <sjustlaptop> yehuda_hm: also, I'm not necessarily arguing for the head-only version, I continue to think that they are almost equivalent
[22:27] <josef> i think ppc just got there first
[22:27] <sjustlaptop> yehuda_hm: omap isn't sufficient?
[22:27] <josef> once one build fails it kills the other ones
[22:27] <xarses> alfredodeza, if you give us some guidance angdraug or myself will work both of those
[22:27] <sagewk> it builds on arm... :)
[22:27] <sagewk> thanks, opened up http://tracker.ceph.com/issues/6283. should be able to look at it this afternoon
[22:28] <yehuda_hm> sjustlaptop: I don't want to need to explicitly track the state of the object through extra flags in its header
[22:28] <josef> sagewk: ok
[22:28] <sjustlaptop> as in present/not present
[22:28] * josef goes to break other things
[22:28] <sjustlaptop> ?
[22:28] <yehuda_hm> that's one example
[22:28] <sjustlaptop> yeah, that would be a reason to go with the second object variant
[22:28] <sjustlaptop> oh, I see
[22:28] <sjustlaptop> you want the object to be versioned
[22:29] <sjustlaptop> :(
[22:29] <sjustlaptop> uh
[22:29] <alfredodeza> xarses: you basically need to use the ConfigParser writer --> http://docs.python.org/2/library/configparser.html#ConfigParser.RawConfigParser.write
[22:30] <alfredodeza> but not only that but reconcile the values
[22:30] <angdraug> how come it worked in 1.0?
[22:30] <sjustlaptop> that would be complicated
[22:30] <xarses> magic?
[22:30] <alfredodeza> e.g. these are the values from local ceph-deploy and I am overwriting these other values from the remote one --> result gets written back with the ConfigParser writer
[22:31] <alfredodeza> angdraug: it wasn't implemented correctly probably
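A minimal sketch of the reconcile-then-write approach alfredodeza outlines, using Python 3's `configparser` (the linked docs are for the Python 2 `ConfigParser` module). The function names are hypothetical, and the choice that local values win on conflict is an assumption, not ceph-deploy's actual behavior.

```python
import configparser
import io


def _normalize(key):
    # Ceph treats "mon host" and "mon_host" as the same option name,
    # so fold spaces to underscores before comparing keys.
    return key.strip().lower().replace(" ", "_")


def reconcile(local_text, remote_text):
    """Merge two ceph.conf-style texts and return the merged text.

    Assumption for this sketch: values from the local copy override
    values from the remote copy; keys present on only one side are kept.
    """
    merged = configparser.RawConfigParser(strict=False)
    merged.optionxform = _normalize
    # Later reads override earlier ones, so read remote first, local second.
    merged.read_string(remote_text)
    merged.read_string(local_text)
    out = io.StringIO()
    merged.write(out)
    return out.getvalue()
```

Instead of discarding one file when the two differ, this keeps the union of both, which is the reconciliation step missing from a plain overwrite.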
[22:31] <yehuda_hm> sjustlaptop: would it? it's basically pushing the extra object logic down to rados
[22:32] <sjustlaptop> yehuda_hm: what interface do you have in mind?
[22:32] <yehuda_hm> sjustlaptop: add an extra field for osd ops beside the locator that states 'instance'
[22:33] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[22:34] * jmlowe1 (~Adium@c-50-172-105-141.hsd1.in.comcast.net) has joined #ceph
[22:34] <yehuda_hm> hmm.. or instead of a global instance per op it'd be something you could set per sub-op
[22:34] <ntranger> alfredodeza running 3 nodes, with 12 disks per node, how would you recommend setting up the OSD's?
[22:35] <alfredodeza> ntranger: I am not sure how to answer that :/
[22:35] <sjusthm> bstillwell: are you there?
[22:35] <alfredodeza> maybe someone else can chime in?
[22:35] <yehuda_hm> sjustlaptop: I know I'm probably violating some basic rados assumptions here, don't bang your head on the keyboard
[22:35] <xarses> ntranger: what do you mean?
[22:36] <xarses> ntranger: typically you want one osd per physical disk and dont share the disk with any other process, like say the OS
[22:36] <xarses> no raid
[22:36] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[22:36] <ntranger> xarses ok. thats what i thought. They are their own drives, no raid, and no OS
[22:37] <yehuda_hm> sjusthm: but basically I'd like to have a multi-object access through a single object. So that you could read / write data, set omap attrs, xattrs, etc. on each of the instances separately
[22:37] <yehuda_hm> when you remove one instance it doesn't remove other instances
[22:37] <yehuda_hm> but they all share the same locking assumptions and versioning
[22:38] <gregaf1> you don't want to do that though, because you want to be able to read old versions while writing the new one....
[22:38] * berant (~blemmenes@gw01.ussignalcom.com) Quit (Ping timeout: 480 seconds)
[22:38] <yehuda_hm> well, I don't want to do that, you're correct.
[22:38] <xarses> ntranger then ceph-deploy osd zap node-1:sd{b,c,d,e,f,g,h,i,j,k,l,m}
[22:38] * jmlowe (~Adium@2601:d:a800:511:2961:4fdf:36a8:9d06) Quit (Ping timeout: 480 seconds)
[22:38] <xarses> then repeat replace zap with create
[22:39] <xarses> and repeat node-1 with node-2...
[22:39] <ntranger> xarses Thanks! :)
[22:39] <xarses> also make sure your MONs are up first
[22:39] <xarses> erm that might be zap, not osd zap
[22:39] <yehuda_hm> but I do want to be able to bundle a few osd subops to the same meta-object, each can point at a different instance
[22:39] <xarses> cant remember
[22:41] <ntranger> xarses Yeah, I just checked, the mons are up on all 3 nodes
[22:41] <ntranger> and I already went through and zapped each drive
[22:41] <ntranger> so now I'm ready to create each osd
[22:41] <Tamil> xarses: ntranger: it is disk zap followed by osd create or osd create with --zap-disk option
[22:41] <Tamil> ntranger: kool
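A small sketch that spells out the per-disk commands for the layout discussed here (3 nodes, 12 data disks each, one OSD per disk), following Tamil's correction that it is `disk zap` followed by `osd create`. The node names `node-2` and `node-3` and the `sdb`..`sdm` device names are assumptions extrapolated from xarses' example; the script only builds command strings and runs nothing.

```python
def osd_commands(nodes, disks):
    """Build 'disk zap' then 'osd create' command strings for every node:disk pair."""
    cmds = []
    for node in nodes:
        for disk in disks:
            cmds.append(f"ceph-deploy disk zap {node}:{disk}")
            cmds.append(f"ceph-deploy osd create {node}:{disk}")
    return cmds


# Assumed names, for illustration only.
NODES = ["node-1", "node-2", "node-3"]
DISKS = [f"sd{letter}" for letter in "bcdefghijklm"]  # 12 data disks per node

if __name__ == "__main__":
    for cmd in osd_commands(NODES, DISKS):
        print(cmd)
```

Zapping destroys the partition table on the named disk, so the generated list is worth reviewing before any of it is run.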
[22:47] * wrencsok (~wrencsok@wsip-174-79-34-244.ph.ph.cox.net) Quit (Remote host closed the connection)
[22:49] * wrencsok (~wrencsok@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[22:50] <xarses> Tamil: thanks
[22:51] <sjusthm> yehuda_hm: probably that means allowing atomic writes to multiple objects with the same locator
[22:51] <sjusthm> it means at the least some changes to the protocol
[22:51] <sjusthm> and it interacts inconveniently with recovery
[22:52] <yehuda_hm> sjusthm: why do it through locator?
[22:52] <sjusthm> because it exists already
[22:53] <sjusthm> besides, that's what you really want, atomic update of two different objects
[22:53] <sjusthm> and the osd side really wants them to be in the same pg
[22:53] <sjusthm> so locator
[22:54] * sleinen (~Adium@2001:620:0:25:bd0c:4a06:7a00:4c5f) Quit (Quit: Leaving.)
[22:54] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[22:55] <yehuda_hm> sjusthm: technically it'd work, however, I think that as a general tool a meta-object would be really useful
[22:55] <yehuda_hm> basically current objects are just a subset of that
[22:56] <sjustlaptop> yehuda_hm: either variant seems like a very heavy modification to avoid having to use the object metadata to determine object existence
[22:56] <yehuda_hm> I can think of other use cases
[22:56] <yehuda_hm> how about object versioning
[22:57] <yehuda_hm> you get it for free
[22:57] <sjustlaptop> well, sort of, if you add a clone operation
[22:57] <yehuda_hm> you don't need it for immutable objects
[22:57] <sjustlaptop> you get that with self-managed snaps anyway
[22:58] <yehuda_hm> that's really a different thing, with self-managed snaps you need to actually manage it
[22:58] <yehuda_hm> e.g., keep track of what snaps exists, etc.
[22:59] <sjustlaptop> yeah, also, snap removal goes through the mons
[22:59] <yehuda_hm> right
[23:00] <sjustlaptop> it's still a pretty huge change
[23:00] <sjustlaptop> the filestore ondisk format changes
[23:00] <sjustlaptop> at least
[23:01] <sjustlaptop> I suppose librados doesn't need to change much, there would just be write_to_fork sort of write() variants
[23:01] <sjustlaptop> still, very complicated
[23:01] <yehuda_hm> sjustlaptop: in a meeting, bbl
[23:01] <sjustlaptop> k
[23:01] <joshd> we already have plenty of cruft added for approaches that were later abandoned, we should be very careful before adding more to rados
[23:01] * Administrator (~chatzilla@ has joined #ceph
[23:01] * mschiff (~mschiff@ has joined #ceph
[23:02] <sjustlaptop> joshd: that is true
[23:02] <Administrator> .
[23:02] <sjustlaptop> mmm, I think atomic update of multiple objects with the same locator would be a better interface
[23:02] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[23:02] * Administrator is now known as JayBox
[23:02] <sjustlaptop> well, not really
[23:05] * mrprud (~mrprud@ANantes-554-1-246-148.w2-1.abo.wanadoo.fr) Quit (Remote host closed the connection)
[23:07] <gregaf1> I was only skimming, but I didn't see why we need any of these approaches
[23:07] <gregaf1> joshd: can you look at http://tracker.ceph.com/issues/6284 and tell me if it makes any sense to you?
[23:09] * lightspeed (~lightspee@2001:8b0:16e:1:216:eaff:fe59:4a3c) has joined #ceph
[23:11] * sjm (~sjm@ Quit (Remote host closed the connection)
[23:13] <Tamil> xarses: np
[23:14] * grepory (~Adium@2600:1003:b00e:8bd1:b538:ef1c:d1ca:4de7) Quit (Quit: Leaving.)
[23:15] <joshd> gregaf1: yes, that sounds like the problem. release_set() and the assert that everything was released suggests a bug in release_set() or a race that added another reference to the object between release_set() and ~Inode
[23:15] <gregaf1> release_set…are you looking at the raw log? :)
[23:16] <gregaf1> I already closed it up and I didn't copy that into the ticket
[23:16] <joshd> no, just at put_inode()
[23:16] <gregaf1> ah, right
[23:17] * terje-_ is now known as terje-
[23:17] * sjm (~sjm@ has joined #ceph
[23:18] <terje-> hi, I have a VM that's running from an rbd copy-on-write volume. In other environments, it's totally fine but in this environment I'm getting 1MB/s of write i/o.
[23:18] <terje-> I'm not sure the best way to troubleshoot it..
[23:18] <dmsimard> Is rbd write cache enabled ?
[23:19] <terje-> How can I check that?
[23:19] <terje-> I'm not sure
[23:22] <mikedawson> terje: for RBD writeback cache, you need rbd cache = true in ceph.conf and cache=writeback set when you instantiate the instance
[23:23] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[23:23] * ChanServ sets mode +v andreask
[23:23] <terje-> checking, thanks.
[23:24] <terje-> I don't see 'cache = true' in /etc/ceph/ceph.conf
[23:24] * alfredodeza is now known as alfredo|afk
[23:24] <terje-> possible to enable that on the fly?
[23:25] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[23:25] <mikedawson> terje: you can also bypass qemu and benchmark rbd directly like 'rbd bench-write --pool volumes --io-size 8192 --io-threads 16 --no-rbd-cache --io-pattern rand test-volume'. Toggling from --no-rbd-cache to --rbd-cache should show the impact
[23:26] <mikedawson> terje-: you can't do it on the fly, you will need to restart the volume (presumably restart qemu)
[23:26] <terje-> hmm
[23:26] <mikedawson> terje-: the setting is 'rbd cache = true' not 'cache = true'
[23:27] <terje-> ok, so if I wanted to enable this..
[23:27] <terje-> I would shutdown my VM's, edit ceph.conf with 'rbd cache = true'
[23:27] <terje-> in the [client] section and fire up my VM's again?
[23:28] * ScOut3R (~scout3r@91EC1DC5.catv.pool.telekom.hu) Quit (Remote host closed the connection)
[23:28] <mikedawson> terje-: sounds partially right http://ceph.com/docs/next/rbd/qemu-rbd/#qemu-cache-options
[23:29] <terje-> thanks I was just googling that
[23:30] <mikedawson> terje-: you also have to have cache=writeback instead of cache=none in the qemu command line when you start the vm
[23:32] <terje-> I have that in my libvirt xml already but obviously, it isn't working.
[23:32] <terje-> assuming that it will once I get rbd_cache setup properly.
[23:32] <mikedawson> terje-: see the "Important" note http://ceph.com/docs/next/rbd/qemu-rbd/#running-qemu-with-rbd
[23:32] <terje-> great, thanks.
[23:32] <terje-> I'll let you know how this goes. :)
[23:37] <ntranger> xarses sweet, OSD's all created fine. :)
[23:37] <xarses> ntranger: rock on!
[23:37] * gregmark (~Adium@ Quit (Quit: Leaving.)
[23:37] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[23:38] <ntranger> next is to create my MDS I guess.
[23:38] <terje-> mikedawson: so, do I want writethrough or writeback for my cache?
[23:39] <ntranger> xarses would I make just one MDS for the 3 nodeS?
[23:40] <xarses> as far as im aware, multiple active MDSs are not supported yet
[23:40] <mikedawson> terje-: if you have an IOPS/spindle contention issue of too many small writes that you would like to coalesce into fewer larger writes, you want writeback
[23:41] <cmdrk> sagewk: so far so good with the d_prune patch applied to the kernel. will let you guys know / open a ticket if things start disappearing in cephfs again
[23:42] <terje-> great, thanks. I have a couple of DB's running so that's probably what I want for this VM
[23:43] * vata (~vata@2607:fad8:4:6:1446:1d9e:1a85:3b9) Quit (Quit: Leaving.)
[23:43] <terje-> mikedawson: one last question, I have many hosts in this cluster, each with its own ceph.conf file
[23:44] <terje-> do all those ceph.conf files need that rbd_cache=true for this to work?
[23:45] <mikedawson> terje-: I believe you only need to tell the nodes that will host qemu/rbd clients, not others with only mon or osds
[23:45] <terje-> ok. Unfortunately, it didn't have an affect.
[23:45] * grepory (~Adium@83.sub-70-192-197.myvzw.com) has joined #ceph
[23:45] <terje-> effect?
[23:45] <terje-> I added it to the [global] section
[23:45] <mikedawson> terje-: mine is in global
[23:46] <terje-> so after adding it ceph.conf on that node, I don't need to bounce any services?
[23:46] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[23:46] <mikedawson> terje-: restart qemu
[23:46] <terje-> ah
[23:47] <mikedawson> for each guest
[23:47] <mikedawson> terje-: the best thing to do is get the rbd admin daemons working, then ask them for their config.
[23:47] <terje-> ok
[23:48] <mikedawson> terje-: under [client.volumes] (or whatever key you use other than volumes) add admin socket = /var/run/ceph/rbd-$pid.asok, then ensure apparmor doesn't get in the way
[23:49] <mikedawson> terje-: then you can do something like 'ceph --admin-daemon /var/run/ceph/rbd-15985.asok config show | grep rbd_cache'
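As an alternative to piping through grep, the output of `config show` can be filtered in a few lines of Python. This sketch assumes the admin socket prints a flat JSON object of option names to string values; the sample text is made up for illustration and is not real output from any cluster.

```python
import json


def rbd_cache_settings(config_show_json):
    """Return only the options whose names start with 'rbd_cache'."""
    cfg = json.loads(config_show_json)
    return {k: v for k, v in cfg.items() if k.startswith("rbd_cache")}


# Made-up sample standing in for the text printed by 'config show'.
SAMPLE = """
{
  "admin_socket": "/var/run/ceph/rbd-15985.asok",
  "rbd_cache": "true",
  "rbd_cache_size": "33554432"
}
"""

if __name__ == "__main__":
    for name, value in sorted(rbd_cache_settings(SAMPLE).items()):
        print(name, "=", value)
```

If `rbd_cache` still reports false after the ceph.conf change, the qemu process was most likely started before the edit and has not been restarted.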
[23:50] * KevinPerks (~Adium@ has left #ceph
[23:52] <sagewk> cmdrk: great
[23:52] <terje-> mikedawson: again, I've added those, do I need to bounce any services?
[23:53] <mikedawson> terje-: restart the client (i.e. qemu)
[23:53] <terje-> when you say restart qemu, do you mean libvirtd?
[23:53] <terje-> I don't have a qemu process
[23:54] <terje-> or you just mean restart my vm?
[23:54] <terje-> probably just restart my VM.. gotcha
[23:54] * sjm (~sjm@ Quit (Remote host closed the connection)
[23:54] <mikedawson> terje-: no, if you use openstack, soft or hard reboot the instance. If you use libvirt, do whatever is needed to restart the qemu process for each vm
[23:55] <terje-> understood, thanks.
[23:55] * thomnico (~thomnico@ Quit (Ping timeout: 480 seconds)
[23:55] <mikedawson> terje-: guest restarts don't work, need to restart qemu
[23:55] * angdraug (~angdraug@ Quit (Quit: Leaving)
[23:56] <mikedawson> terje-: and the user that runs qemu needs to be able to write in /var/run/ceph. Plus typical apparmor stuff will prevent creating the sockets
[23:56] <terje-> mikedawson: can I have a look at your ceph.conf?
[23:56] * angdraug (~angdraug@ has joined #ceph
[23:57] * grepory (~Adium@83.sub-70-192-197.myvzw.com) Quit (Ping timeout: 480 seconds)
[23:58] <terje-> ok, so I went from 1MB/s to 16MB/s
[23:58] <terje-> which is great :)
[23:58] * shang (~ShangWu@ Quit (Ping timeout: 480 seconds)
[23:59] <mikedawson> terje-: http://pastebin.com/raw.php?i=7iYDb6fS

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.