#ceph IRC Log

IRC Log for 2012-11-07

Timestamps are in GMT/BST.

[0:03] <dmick> so, Oliver2, are you running precise?
[0:04] <Oliver2> dmick: yes.
[0:13] * ninkotech_ (~duplo@89.177.137.231) has joined #ceph
[0:19] <dmick> ok. so I've posted a followup to the list
[0:23] <gucki> anybody here who can help me with tuning my ceph cluster used for qemu-rbd only?
[0:24] <Q310> well i've almost worked out a patch/workaround for horizon until grizzly is released to provision volumes, just waiting for a little help from the openstack devs on how to phrase the output of the cinderclient api in django :)
[0:24] <gucki> for example right now...my cluster is rebalancing because i added 2 osds. so all disks are around 50 - 98% busy. inside one vm i try to start ruby, which takes over 10 minutes now. in iotop i can see it gets only around 3kb/sec and is hanging in 99% io wait :(
[0:25] <gucki> the vm is not making a single write, only reads...and those get only 3kb/sec?! :-(
[0:28] <joshd> Q310: awesome!
[0:28] <gucki> do i need to add more io threads to the osds? or any other hints? my whole cluster is barely useable with this performance :(
[0:28] <joshd> gucki: how many osds do you have?
[0:28] <gucki> joao: 8
[0:28] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[0:29] * nwatkins (~Adium@soenat3.cse.ucsc.edu) Quit (Quit: Leaving.)
[0:29] <gucki> joao: but i mean...the osds are not showing 100% io in iostat...wait_r/wait_w is always around 20...so the clients should get much more read throughput?
[0:30] <gucki> ceph -w is running too and not showing any slow requests..
[0:30] * Oliver2 (~oliver1@ip-178-201-146-106.unitymediagroup.de) Quit (Quit: Leaving.)
[0:30] * jjgalvez (~jjgalvez@12.248.40.138) Quit (Quit: Leaving.)
[0:31] <joshd> gucki: you can try increasing osd_op_threads and filestore_op_threads from the default of 2
[0:32] <gucki> joshd: how many would you suggest? 8?
[0:32] <dmick> joshd: wip-rbd-ls when you get a sec?
[0:32] <joshd> gucki: also decreasing the number of recovery ops at once (osd_recovery_max_active) down from the default of 5 to slow recovery (better control over this is coming in the future)
[0:33] * isomorphic (~isomorphi@659AABSIX.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[0:33] <joshd> gucki: and when you're adding a significant fraction of storage (compared to your total cluster) it helps to weight the new osds lower and gradually increase their weight
[0:33] <gucki> joshd: do i have to restart the osds or can i inject online with "ceph osd tell osd recovery max active = 1"?
[0:33] <gucki> joshd: thanks, that's good to know
[0:35] <joshd> gucki: I think they need to be restarted, but I'm not sure (and it'd be "ceph osd tell \* '--osd-recovery-max-active 1'")
[0:36] <gucki> joshd: ok, i'll restart them to make sure. it's no problem while the cluster is recovering, right?
[0:36] <joshd> nope
[0:38] <gucki> osd op threads = 8 and filestore op threads = 16 ok?
[0:38] <gucki> joshd: nope means here no problem or do not restart? ;-)
[0:38] <joshd> gucki: no problem :)
[0:38] <gucki> ok, and the values are good?
[0:39] <gucki> or should they be equal?
[0:39] <joshd> they're fine
[0:39] <gucki> i made filestore bigger because i assume it's using blocking kernel calls?
[0:39] <gucki> ok, i'll give it a try :)
[0:41] <gucki> joshd: thanks so far and i'll let you know how it works in a couple of minutes.. :)
[0:42] <joshd> ok. and yeah, it's using pread/writev
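
The knobs discussed above, as a rough sketch; option names follow joshd's suggestions, but defaults and the exact injectargs syntax vary between Ceph releases, so treat this as illustrative rather than definitive:

    # ceph.conf
    [osd]
        osd op threads = 8
        filestore op threads = 16
        osd recovery max active = 1

    # inject at runtime instead of restarting (argonaut-era syntax)
    ceph osd tell \* injectargs '--osd-recovery-max-active 1'

    # weight a freshly added osd low and raise it in steps
    ceph osd crush reweight osd.8 0.2
    ceph osd crush reweight osd.8 0.5    # later
    ceph osd crush reweight osd.8 1.0    # once recovery has settled
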
[0:44] * s_parlane (~scott@gate1.alliedtelesyn.co.nz) has joined #ceph
[0:44] <joshd> dmick: wip-rbd-ls looks good (commit to next, push, then merge next into master)
[0:49] <dmick> ok
[0:49] <dmick> I know someday someone's going to write down why and how to do those branches
[0:49] <dmick> I just know it
[0:53] * PerlStalker (~PerlStalk@perlstalker-1-pt.tunnel.tserv8.dal1.ipv6.he.net) Quit (Quit: rcirc on GNU Emacs 24.2.1)
[0:56] <gucki> joshd: seems much better, but i think i can really tell once the rebalance is over. should i increase it even more, like 32?
[0:56] <gucki> joshd: how much overhead do i have per thread, 10mb?
[0:56] <gucki> joshd: or much less? :)
[0:57] * tnt (~tnt@34.23-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[0:57] <joshd> gucki: I'm not sure how much actual overhead there is, but I'm glad to hear it's better. the max that still gives benefit is going to depend on your specific setup
[0:59] <gucki> joshd: right now i have around 60 clients (vms), so with 8 osds and 16 threads i can have 128 requests in parallel...around 2 per client...i think it's ok
[1:01] <joshd> gucki: requests will be queued up at multiple levels as well, it's more a question of what your underlying fs/disks can do
[1:01] <gucki> joshd: are there any debug options which can tell me how many threads have been max busy at the same time? i think apache has something like this for its workers for example. really helpful for tuning :)
[1:02] <gucki> joshd: i think it feels much smoother now...before everything seemed serialized. i think with more threads in parallel the os might also be better at merging requests and so speeding it up :)
[1:03] <gucki> at least when connecting with ssh now it doesn't take 1-2 minutes for the console to appear :-)
[1:03] <joshd> gucki: I don't think so. perf dump through the admin socket can tell you about avg req latencies though
[1:03] <gucki> now only 1-2 seconds :)
[1:03] <gucki> joshd: would that be something for an issue/ feature request?
[1:04] <joshd> gucki: yeah
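
A note on the admin socket query joshd mentions: it looks roughly like this (the socket path below is the usual default, but it depends on your ceph.conf):

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump
    # dumps the osd's internal counters, including average op/request latencies
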
[1:04] <gucki> joshd: btw, are there any plans to use the github issue tracker or will you stick to your own?
[1:04] <joshd> gucki: how are your osds setup? osd/spinning disk, with journal on the same disk?
[1:05] <gucki> 4 pcs with 2 disks (7200) each. one has an ssd where i put the journal on. the 3 others have the journal on the same disk (i'm going to improve this in a few days, but 3kb/sec seemed too slow anyway)
[1:05] <joshd> gucki: I don't think anyone's thought about moving to github issues too much, redmine works pretty well for us
[1:06] * rweeks (~rweeks@c-24-4-66-108.hsd1.ca.comcast.net) has joined #ceph
[1:06] <gucki> joshd: the good thing about github is that externals dont have to register a new account everywhere. at least for me it greatly boosts contribution :). same for the wiki... :)
[1:06] <gucki> joshd: some months ago the issue tracker was real crap imo, but now it's pretty good... :)
[1:07] <gucki> joshd: the github one i mean
[1:07] * johnl (~johnl@2a02:1348:14c:1720:f499:57b9:54fe:1992) has left #ceph
[1:07] * Oliver2 (~oliver1@ip-178-201-146-106.unitymediagroup.de) has joined #ceph
[1:08] <joshd> gucki: yeah, if we got our openid plugin working it'd be better
[1:08] * Oliver2 (~oliver1@ip-178-201-146-106.unitymediagroup.de) has left #ceph
[1:12] <gucki> joshd: are you going to create a ticket for the performance debugging idea?
[1:16] * yoshi (~yoshi@p20198-ipngn3002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:16] <joshd> gucki: I thought you were :)
[1:29] * Cube1 (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[1:31] * jlogan1 (~Thunderbi@2600:c00:3010:1:9b2:ed42:a1f6:a6ec) Quit (Ping timeout: 480 seconds)
[1:40] * isomorphic (~isomorphi@1RDAAEUQD.tor-irc.dnsbl.oftc.net) has joined #ceph
[1:46] <gucki> joshd: ok, i can do so tomorrow. i'll have to register first etc.
[1:47] <gucki> joshd: last question for today: why no aio for the filestore stuff instead of threads? :)
[1:48] <gucki> joshd: mh ok, probably one needs o_direct?
[1:48] <gucki> joshd: i mean that's why it's probably not there..?
[1:50] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[2:00] <gucki> joshd: ok gotta go now...thanks again and cya
[2:00] * gucki (~smuxi@84-72-8-40.dclient.hispeed.ch) Quit (Remote host closed the connection)
[2:14] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[2:20] * dweazle (~dweazle@tilaa.krul.nu) Quit (Ping timeout: 480 seconds)
[2:24] * dweazle (~dweazle@tilaa.krul.nu) has joined #ceph
[2:27] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Ping timeout: 480 seconds)
[2:32] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[2:32] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[2:34] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[2:39] * Cube (~Cube@174-154-203-252.pools.spcsdns.net) has joined #ceph
[2:42] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[2:44] * maxiz (~pfliu@202.108.130.138) has joined #ceph
[2:44] * rweeks (~rweeks@c-24-4-66-108.hsd1.ca.comcast.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[2:46] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Ping timeout: 480 seconds)
[2:47] * chutzpah (~chutz@199.21.234.7) Quit (Quit: Leaving)
[2:49] * glowell1 (~glowell@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[2:58] * yoshi (~yoshi@p20198-ipngn3002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[3:06] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[3:14] * yoshi (~yoshi@p20198-ipngn3002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:27] * adjohn (~adjohn@69.170.166.146) Quit (Quit: adjohn)
[3:34] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[3:46] * dmick (~dmick@2607:f298:a:607:dd74:bc35:af3f:43c) Quit (Quit: Leaving.)
[3:49] * Cube (~Cube@174-154-203-252.pools.spcsdns.net) Quit (Quit: Leaving.)
[4:07] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[4:24] * Cube (~Cube@174-154-203-252.pools.spcsdns.net) has joined #ceph
[4:31] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[4:31] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[4:35] * Cube (~Cube@174-154-203-252.pools.spcsdns.net) Quit (Quit: Leaving.)
[4:39] * Cube (~Cube@174-154-203-252.pools.spcsdns.net) has joined #ceph
[4:53] * Cube (~Cube@174-154-203-252.pools.spcsdns.net) Quit (Quit: Leaving.)
[5:00] * iggy (~iggy@theiggy.com) Quit (Remote host closed the connection)
[5:00] * iggy_ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[5:15] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[5:25] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[5:37] * yoshi (~yoshi@p20198-ipngn3002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[5:48] * Tv (~tv@cpe-76-170-224-21.socal.res.rr.com) Quit (Quit: Tv)
[5:49] * iggy (~iggy@theiggy.com) has joined #ceph
[6:00] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[6:02] * glowell1 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[6:52] * glowell2 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[6:52] * glowell1 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[7:05] * sagelap (~sage@bzq-218-183-205.red.bezeqint.net) Quit (Read error: No route to host)
[7:15] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[7:52] * tnt (~tnt@50.90-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[7:56] * yoshi (~yoshi@p30106-ipngn4002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[8:02] * sagelap (~sage@bzq-13-168-31-158.red.bezeqint.net) has joined #ceph
[8:05] <iltisanni> Hey! Can someone please tell me how to do the following example? I have 3 VMs, each running one OSD and one MON daemon. Additionally, one of those VMs runs an mds daemon. I also have one VM acting as a client (ceph is installed, but the VM is not in the cluster. ceph -s gives me good output)
[8:07] <iltisanni> Now I created a new pool (testauf2) and a new directory and I want the client to mount this one
[8:08] * stxShadow (~Jens@ip-178-203-169-190.unitymediagroup.de) has joined #ceph
[8:08] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[8:08] <iltisanni> How to do this?... I tried mount -t ceph "ip:port:/" /mnt/cephmounts on the client side... but it always hangs
[8:08] <iltisanni> and I have to hard shutdown the client
[8:18] * gregorg (~Greg@78.155.152.6) Quit (Quit: Quitte)
[8:25] * stxShadow (~Jens@ip-178-203-169-190.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[8:27] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[8:33] * s_parlane (~scott@gate1.alliedtelesyn.co.nz) Quit (Ping timeout: 480 seconds)
[8:34] * Q310 (~qgrasso@ip-121-0-1-110.static.dsl.onqcomms.net) Quit ()
[9:02] * gregorg (~Greg@78.155.152.6) has joined #ceph
[9:06] <wido> iltisanni: You probably have cephx turned on
[9:06] <wido> when mounting you need to pass these options
[9:06] <iltisanni> no is disabled
[9:06] <wido> iltisanni: Which IP do you use when mounting?
[9:07] <iltisanni> its that line in ceph.conf right? auth supported = none
[9:07] <wido> Which version of Ceph?
[9:07] <iltisanni> the ip of one of the monitors
[9:07] <iltisanni> and I also tried to use all
[9:07] <iltisanni> all 3 monitor ips
[9:07] <wido> Ok, what does ceph -s show?
[9:08] <wido> pastebin?
[9:08] <iltisanni> version 0.48
[9:08] <iltisanni> health HEALTH_OK
[9:08] <iltisanni> monmap e1: 3 mons at {a=10.61.11.68:6789/0,b=10.61.11.69:6789/0,c=10.61.11.70:6789/0}, election epoch 4, quorum 0,1,2 a,b,c
[9:08] <iltisanni> osdmap e90: 3 osds: 3 up, 3 in
[9:08] <iltisanni> pgmap v10595: 636 pgs: 636 active+clean; 1000 MB data, 11241 MB used, 28189 MB / 41541 MB avail
[9:08] <iltisanni> mdsmap e63: 1/1/1 up {0=a=up:active}
[9:09] <wido> iltisanni: Does "dmesg" show anything on the client when mounting?
[9:09] <iltisanni> I cant do anything when trying to mount
[9:09] <iltisanni> it just hangs
[9:09] <iltisanni> and I have to hard shutdown
[9:09] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:09] <wido> second terminal and show dmesg?
[9:09] <iltisanni> sec
[9:10] <tnt> Which kernel does your client have ? >= 3.6 is recommended IIRC
[9:11] <iltisanni> can't open a second shell... the VM is dead
[9:11] <wido> iltisanni: Like tnt mentioned, which kernel on that VM?
[9:11] <wido> I'd indeed recommend at least 3.6
[9:14] <iltisanni> 3.5.0-17
[9:14] <iltisanni> hm ok
[9:14] <iltisanni> so update first
[9:14] <iltisanni> and try again
[9:14] <iltisanni> but the command I used was OK?
[9:15] <iltisanni> mount -t ceph 10.61.11.68:6789:/ /mnt/cephmount
[9:15] <tnt> yes, afaik
[9:16] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[9:16] <tnt> and in anycase it shouldn't crash the vm ...
[9:16] <iltisanni> OK. Thx for your reply. I'll update and try again
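
For reference, the kernel cephfs mount being attempted above, with and without cephx (addresses and keys are placeholders; with "auth supported = none" the name/secret options are not needed):

    # cephx disabled
    mount -t ceph 10.61.11.68:6789:/ /mnt/cephmount

    # cephx enabled: pass the client name and key...
    mount -t ceph 10.61.11.68:6789:/ /mnt/cephmount -o name=admin,secret=AQD...

    # ...or keep the key out of the command line
    mount -t ceph 10.61.11.68:6789:/ /mnt/cephmount -o name=admin,secretfile=/etc/ceph/admin.secret
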
[9:24] * tnt (~tnt@50.90-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:29] * s_parlane (~scott@121-74-235-205.telstraclear.net) has joined #ceph
[9:37] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[10:07] * Leseb (~Leseb@193.172.124.196) has joined #ceph
[10:08] * maxiz (~pfliu@202.108.130.138) Quit (Read error: Connection reset by peer)
[10:17] <iltisanni> I updated the kernel version and rebooted the VMs. Now I get HEALTH_WARN -> pg x.xx is stuck stale+active+clean, last acting [y,y] .... what's that now ?
[10:18] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[10:34] <iltisanni> I already tried restarting all osd daemons and also restarting all Nodes...Problem with stuck stale pgs is still there
[10:38] <tnt> mmm ... there is still a "small" bug in the rbd kernel client when the cluster disappears for a second, it segfaults. Just crashed 20 VMs or so during a small network issue :(
[10:41] <iltisanni> what do you mean with "crash"? lost their mounts or really crashed?
[10:43] * tryggvil (~tryggvil@16-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[10:44] * sagelap1 (~sage@bzq-19-168-31-70.red.bezeqint.net) has joined #ceph
[10:49] * sagelap (~sage@bzq-13-168-31-158.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[10:52] <s_parlane> iltisanni: what is ceph osd tree and ceph -w saying ? (are any osd's missing or being slow ?)
[10:55] <iltisanni> dumped osdmap tree epoch 118
[10:55] <iltisanni> # id    weight  type name       up/down reweight
[10:55] <iltisanni> 0       0       osd.0   up      1
[10:55] <iltisanni> 1       0       osd.1   up      1
[10:55] <iltisanni> 2       0       osd.2   up      1
[10:55] <iltisanni> seems to be ok
[10:55] <iltisanni> all three osds are up
[10:57] <iltisanni> 3 up, 3 in says ceph -w
[10:58] <s_parlane> which pg is broken ?
[10:58] * match (~mrichar1@pcw3047.see.ed.ac.uk) has joined #ceph
[10:59] <s_parlane> what does "ceph pg dump" have for that pg's line ?
[11:01] <iltisanni> http://pastebin.com/wh9h4JXt
[11:02] <s_parlane> hrm, it thinks all your pg's are stale
[11:02] <iltisanni> y :-)
[11:02] <iltisanni> but why
[11:02] <iltisanni> Just rebooted the system
[11:03] <iltisanni> and then.. boom
[11:03] <s_parlane> if i understand correctly, that would be enough to cause it
[11:03] <iltisanni> -.-
[11:03] <iltisanni> so.. how to fix it
[11:03] <s_parlane> i think you can recover by "ceph osd repair 0" or "ceph osd scrub 0" (and same for 1 and 2)
[11:03] <s_parlane> do one at a time
[11:04] <iltisanni> k I'll try
[11:04] <iltisanni> osd.0 instructed to repair
[11:04] <s_parlane> ok
[11:04] <iltisanni> but changed nothing
[11:04] <s_parlane> watch ceph -w
[11:05] <s_parlane> from the docs, Stale
[11:05] <s_parlane> The placement group is in an unknown state - the monitors have not received an update for it since the placement group mapping changed.
[11:05] <s_parlane> how many hosts are you using ?
[11:06] <iltisanni> I have 3 VMs having each one osd and one mon daemon running
[11:06] <s_parlane> are they all on the same real machine ?
[11:06] <iltisanni> no
[11:07] <s_parlane> if possible, i think it is best to restart them one at a time
[11:07] <s_parlane> (for future)
[11:07] <iltisanni> y :-) I think so too
[11:08] <s_parlane> if they don't take too long, it shouldn't start rebalancing before the node comes back
[11:08] <iltisanni> they are restarting quite fast
[11:08] <s_parlane> ubuntu ?
[11:08] <iltisanni> y
[11:09] <s_parlane> ubuntu VM's do reboot fast, which is good
[11:09] * ninkotech_ (~duplo@89.177.137.231) Quit (Quit: Konversation terminated!)
[11:10] <iltisanni> y but i just recognized that one of those VMs isn't in the Vcenter anymore.... huh? I have to check this
[11:10] <s_parlane> I have a mixed ubuntu/gentoo setup (mon on ubuntu VMs, mds active/standby ubuntu i3's, osd's gentoo real hardware)
[11:10] <iltisanni> and you are using cephfs?
[11:11] <s_parlane> yes, just for testing
[11:11] <iltisanni> kk
[11:11] <s_parlane> i tried multiple active mds, then one got restarted, later i found the "do not do this" messages
[11:12] <s_parlane> (after i locked up 2 machines)
[11:12] <iltisanni> argh.. thats bad...
[11:12] <iltisanni> so.. having multiple active mds and restarting one of them
[11:12] <iltisanni> is bad?
[11:12] <s_parlane> having multiple mds is bad
[11:12] <iltisanni> active
[11:13] <s_parlane> i believe they wrote: "does not work, do not try it"
[11:13] <s_parlane> yes, multiple active mds
[11:13] <s_parlane> i have active/standby
[11:13] <s_parlane> (now)
[11:13] <s_parlane> which works fine
[11:13] <iltisanni> ok how to make an mds standby? Never tried this cause I only have one mds right now
[11:14] <s_parlane> configure another mds, but do not use the command to make it active
[11:14] <iltisanni> OK and ceph recognizes when the active one is going down
[11:14] <tnt> s_parlane: is there a special way to configure standby/active or just have one ready to start in case the other is down ?
[11:15] <s_parlane> tnt: you configure the mds, and start it
[11:15] <iltisanni> same question as mine :-)
[11:15] <s_parlane> but don't tell ceph to use the second mds
[11:16] <s_parlane> you should get ceph -w like "mdsmap e10: 1/1/1 up {0=b=up:active}, 1 up:standby"
[11:16] <iltisanni> have you tried what happens when the active one is going down ?
[11:16] <s_parlane> (note i've had a fail-over)
[11:16] <s_parlane> yes, it continues to work, or seems to
[11:16] <iltisanni> ok. fine
[11:17] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[11:18] <s_parlane> http://ceph.com/wiki/MDS_cluster_expansion <- DO NOT DO THIS
[11:19] <s_parlane> (the last step specifically, the other steps are required for having a standby mds)
[11:19] <iltisanni> ceph mds set_max_mds 1 should be ok
[11:19] <iltisanni> and the other steps before are ok
[11:19] <s_parlane> yeah i set mine to 2
[11:19] <iltisanni> ok and then it crashed
[11:20] <s_parlane> then i restarted one of the mds's, and that locked up the machines that had cephfs mounts
[11:20] <iltisanni> really bad.... and thx for the info :-)
[11:21] <s_parlane> np
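
A minimal sketch of the active/standby mds setup s_parlane describes, assuming the mkcephfs/ceph.conf style of this era (host names are placeholders):

    ; ceph.conf: define a second mds, but leave the number of active mds at 1
    [mds.a]
        host = nodea
    [mds.b]
        host = nodeb

    # start the extra daemon; it should join as a standby
    service ceph start mds.b

    # ceph mds stat / ceph -s should then show something like
    #   mdsmap e10: 1/1/1 up {0=a=up:active}, 1 up:standby

    # do NOT raise the active count, i.e. avoid: ceph mds set_max_mds 2
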
[11:21] <iltisanni> so I restarted all of the VMs... but same problem...
[11:21] <iltisanni> dont know what to do now.
[11:22] <s_parlane> i think having multiple active mds is only useful when you have lots of metadata (i.e. many many folders of many many files),
[11:22] <s_parlane> are you still seeing the pg stale ?
[11:22] <iltisanni> y
[11:23] <s_parlane> what happens if you turn off only 1 of the VMs ?
[11:23] <s_parlane> (or just sudo service ceph stop)
[11:24] <iltisanni> Then ceph -s tells me that one mon is down (that's correct)... but the pgs stale problem is still there
[11:24] <iltisanni> but
[11:24] <iltisanni> it also tells me that 3 osds are up and in
[11:24] <iltisanni> which cant be correct
[11:24] <s_parlane> no
[11:24] <s_parlane> check with ceph osd tree
[11:25] <s_parlane> you can force down an osd, ceph osd down 0
[11:25] <iltisanni> ceph osd tree also shows all osds are up
[11:25] <iltisanni> k. i force it down
[11:25] <s_parlane> and maybe mark it out too, ceph osd out 0
[11:26] <iltisanni> marked down osd.0
[11:26] <s_parlane> (replace 0 with whichever you shutdown)
[11:26] <iltisanni> ok now its down
[11:26] <iltisanni> y it was 0
[11:26] <s_parlane> and watch ceph -w
[11:27] <iltisanni> http://pastebin.com/26kP8BZv
[11:27] <s_parlane> for one of the active osd's, what does tail -f /var/log/ceph/ceph-osd.1.log show ?
[11:27] <iltisanni> thats ok... because one mds, one mon and one osd was on that vm
[11:27] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[11:28] <iltisanni> there is no such file
[11:28] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[11:28] <iltisanni> ceph.log.1.gz
[11:28] <iltisanni> thats what i have
[11:29] <s_parlane> btw, i think you have to wait for it to stop reporting it as laggy before it will kick it out, which is probably why osd.0 was still up/in
[11:30] <iltisanni> kk maybe
[11:30] <s_parlane> sorry, i've only been playing with it for a week or so
[11:30] <iltisanni> but its still reported as laggy
[11:31] <s_parlane> ceph mon stat ?
[11:32] <iltisanni> shows me all of the 3 mons
[11:32] <s_parlane> did you remove power from the VM ?
[11:33] <iltisanni> no.. i just rebooted them
[11:33] <s_parlane> what about the one you stopped ?
[11:33] <iltisanni> i stopped the ceph service
[11:34] <s_parlane> ok, that should make it timeout faster
[11:34] <iltisanni> hm... any way to rescan pgs or something like that ? :-)
[11:35] * yoshi (~yoshi@p30106-ipngn4002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:36] <s_parlane> not for all
[11:36] <s_parlane> but specific ones, seems ceph pg scrub x.xx works
[11:36] <iltisanni> I could try with one
[11:37] <s_parlane> oh bugger, i just had an osd thrown out
[11:38] <s_parlane> anyways, when i did service ceph stop, it took about 10s before one of the monitors decided there should be an election
[11:39] <s_parlane> what does "ps aux | grep ceph" show on the box you stopped ceph on ?
[11:39] <iltisanni> root 2715 0.0 0.0 10896 932 pts/0 S+ 11:39 0:00 grep --color=auto ceph
[11:40] <s_parlane> so ceph isn't still running, no idea why you are still having your down mon there
[11:40] <s_parlane> thats strange
[11:41] <s_parlane> which ubuntu are you running, and which version of ceph ?
[11:41] <iltisanni> y...
[11:41] <iltisanni> and pg scrub doesn't work..
[11:41] <iltisanni> i tried with one of them
[11:41] <s_parlane> i see why my osd got thrown out, i have a fault hdd
[11:42] <iltisanni> 2.14 for example and it tells me pg: scrub: no such file or directory
[11:42] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[11:42] <iltisanni> but thats not the fault of ceph :-) the faulty hdd
[11:43] <s_parlane> ceph did what it was meant to do, it failed the faulty node out
[11:44] <iltisanni> I have a meeting right now and then I'm going to eat something... so I'd like to thank you so far. and maybe we'll talk to each other later (about 1 hour)
[11:44] <iltisanni> bye
[11:45] <s_parlane> i must say i am very impressed with ceph's handling of everything, even if i have been very mean to it (4 crappy hdd's per node, on P4's w/ 2GB)
[11:45] <s_parlane> (and different versions of ceph, across different distros and arch)
[11:45] <s_parlane> bye
[11:52] <s_parlane> i guess i should try running it on some under-powered powerpc and mips and arm equipment
[11:52] <s_parlane> anyways, i too must go, goodnight all
[11:53] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[11:59] * kees_ (~kees@devvers.tweaknet.net) has joined #ceph
[12:13] <wido> hi kees_!
[12:13] <wido> Regular Tweakers user here ;)
[12:13] <kees_> hey :)
[12:13] <kees_> nice :)
[12:14] <wido> Saw you were playing with the MDS?
[12:14] <kees_> yep
[12:14] <wido> How is that working out?
[12:14] <kees_> got some strange issues, so im going to simplify my setup first
[12:15] <wido> Be aware for now that the MDS is still not ready
[12:15] <wido> RBD and the RADOS Gateway work much better, the RADOS part is pretty stable
[12:15] <kees_> ye, i noticed :P
[12:15] <kees_> strange things like a a file that is 1GB in size on client1 and 81M on client2
[12:16] <wido> Ouch, that is pretty odd
[12:16] <wido> do file issues in the tracker though if you identify them
[12:16] <wido> the sprint for more MDS development should start shortly
[12:18] <wido> If you get creative you could actually store a lot of data in just plain RADOS objects
[12:18] <wido> PHP bindings are available ;)
[12:18] <kees_> yeah, so i saw :)
[12:19] <kees_> which is also one of the reasons we're looking at it
[12:19] <wido> That will be more robust then using the MDS
[12:19] <wido> if you can take the MDS out of the chain, one less thing to fail
[12:19] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[12:21] <kees_> i've noticed it works well as long as you don't have multiple clients, so ill prolly just have one client export it by nfs while we rewrite our php code :)
[12:27] * Leseb (~Leseb@193.172.124.196) Quit (Quit: Leseb)
[12:36] * mib_dh80to (ca037809@ircip1.mibbit.com) has joined #ceph
[12:36] <mib_dh80to> Hi Community ...
[12:36] <mib_dh80to> anyone is online?
[12:39] <tnt> yes
[12:39] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[12:42] <mib_dh80to> hello Sir
[12:43] <mib_dh80to> Can I ask you a doubt regarding the operations w.r.t ceph .v .0.48.2argonaut a stable release?
[12:44] <tnt> ask ... we'll see if anyone knows :p
[12:45] <mib_dh80to> I have cluster with 3 nodes
[12:46] <mib_dh80to> vm1 (mon.0+osd.0+mds.0) VM2(osd.1+mds.1+mon.1) VM3(mon.2)
[12:46] <mib_dh80to> i mounted cluster data to VM1
[12:46] <mib_dh80to> under ~/cephfs
[12:46] <mib_dh80to> and created 1 file 'test.txt'
[12:47] <mib_dh80to> with replication level 2
[12:47] <mib_dh80to> my file went to VM1 and VM2
[12:47] <mib_dh80to> then I failed VM2 forcefully
[12:48] <mib_dh80to> resulting configuration
[12:48] <mib_dh80to> after failing VM2 is = > VM1 (mon.0+mds.0+osd.0) VM3(mon.2)
[12:49] <tnt> Well, first off, multi-mds config aren't recommended right now
[12:49] <iltisanni> multi active
[12:49] <iltisanni> active /passive is ok
[12:49] <iltisanni> i thought
[12:52] <mib_dh80to> alright
[12:52] <mib_dh80to> so i shall tryi with keeping only 1 mds
[12:52] <mib_dh80to> and get back to you
[12:52] <mib_dh80to> any other issue you see with my exp. setup?
[12:53] <mib_dh80to> ideally, i should expect to get my data from 1 osd even while the other osd is down (since my replication factor is 2)
[12:53] <mib_dh80to> right?
[12:54] <iltisanni> y. the monitor is telling the client where to get data from
[12:54] <iltisanni> and the monitor knows both mds
[12:54] <iltisanni> osd
[12:54] <iltisanni> sry
[12:54] <mib_dh80to> exactly
[12:54] <mib_dh80to> so why am i seeing this problem
[12:54] <mib_dh80to> that i cant access the data
[12:54] <mib_dh80to> when one of the osd is down?
[12:55] * Leseb (~Leseb@193.172.124.196) has joined #ceph
[12:58] <iltisanni> sorry can't help you out with that... I even can't mount the data on the client... don't know why..(it just hangs). I only know how it should theoretically work
[13:01] <mib_dh80to> use mount.ceph cmd na
[13:01] <mib_dh80to> mount.ceph X.X.X.X:port,X.X.X.X.:port:/ ~/dirname
[13:01] <tnt> yes you should have been able to read your data
[13:02] <mib_dh80to> oops
[13:02] <mib_dh80to> how to del these smilies?
[13:03] <kees_> hm, what is the best way to make (offsite) backups of the data in ceph?
[13:04] <iltisanni> i used the mount.ceph and the mount -t ceph command... both didn't work.. anyway: You should have had access to your data, also when one osd is down.
[13:05] <tnt> kees_: good question ... I'm looking for a way too :p
[13:05] <iltisanni> btw you can have more than one mds. You just have to set this one: ceph mds set_max_mds 1
[13:11] <iltisanni> all my pgs are in stuck stale mode after I updated Kernel and rebooted all VMs... How to fix it ?
[13:12] * gucki (~smuxi@84-72-8-40.dclient.hispeed.ch) has joined #ceph
[13:14] <iltisanni> http://pastebin.com/J4RScHZP <- maybe this helps for my question
[13:14] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[13:15] <tnt> iltisanni: pastebin a ceph pg dump
[13:17] <iltisanni> its a really long text.. sec :-)
[13:19] <iltisanni> http://pastebin.com/z6FU5GKH
[13:26] <iltisanni> any idea?
[13:26] <tnt> try 'ceph pg 4.16 query' and pastebin that
[13:29] <iltisanni> pgid currently maps to no osd
[13:33] * pixel (~pixel@81.195.203.34) has joined #ceph
[13:34] <pixel> Hi everybody
[13:41] * mib_dh80to (ca037809@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[13:49] <gucki> hi :)
[13:49] <gucki> anybody know if rbd with virtio-scsi should work? :)
[13:51] <pixel> does anybody have a guide how to move existing journal to tmpfs ?
[13:51] <wido> pixel: Just shut down the OSD
[13:51] <wido> and copy the journal
[13:52] <wido> and change the path in the ceph.conf
[13:52] <pixel> osd journal = /mnt/tmpfs/journal ?
[13:52] <wido> kees_: Geo-replication is something that should be coming up, but isn't there now
[13:52] <gucki> wido: when the osd is gracefully shutdown, cant the journal just be recreated on next stratup? (all data in the journal should have been committed?)
[13:52] <wido> pixel: If that's the path to your tmpfs journal
[13:53] <wido> gucki: I do think so, but then you'd have to format the journal again
[13:53] <wido> might be easy to just copy it
[13:54] <gucki> wido: ok, justed wanted to know. thanks :-)
[13:54] <kees_> wido, well, with backup i meant something like: a dump of all objects so i can put it on another filesystem
[13:54] <pixel> I've done as you suggested but osd haven't started :(
[13:54] <wido> kees_: rados -p <pool> ls
[13:54] <wido> and then do stuff with it :)
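
A crude way to dump a pool's objects to local files along the lines wido suggests (pool name and target directory are placeholders; this ignores snapshots, xattrs and object names containing '/'):

    pool=mypool
    mkdir -p /backup/$pool
    rados -p "$pool" ls | while read obj; do
        rados -p "$pool" get "$obj" "/backup/$pool/$obj"
    done
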
[13:56] <wido> pixel: Check the logs? What does that say?
[13:59] <pixel> which log file we need to check ? ceph-osd.X.log ?
[13:59] <wido> pixel: Yes, that log
[13:59] <wido> If the OSD doesn't start, that should tell you why
[14:01] * MikeMcClurg (~mike@62.200.22.2) has joined #ceph
[14:02] <pixel> journal read_entry 3891200 : bad header magic, end of journal
[14:07] <wido> pixel: Seems like the copy failed
[14:08] <wido> something went wrong
[14:08] <wido> you could try to start with a clean journal or run the OSD manually with --mkjournal
[14:08] <pixel> sure, let me check
[14:11] <pixel> ** ERROR: error converting store /var/lib/ceph/osd/ceph-0: (2) No such file or directory
[14:12] <pixel> service ceph -a start osd.0 --mkjournal
[14:18] <tnt> Any dev familiar with rgw internals here ?
[14:26] <kees_> wido, hm, rados -p <pool> ls (and export) works, but it seems like objects arent deleted from the data pool when you delete them on a cephfs mount :/
[14:26] * mtk (NGSByVW6BK@panix2.panix.com) has joined #ceph
[14:27] <wido> kees_: Ah, for the filesystem I wouldn't do so anyway
[14:27] <wido> there is a delay though when you remove a file before the RADOS object goes
[14:27] <wido> A message came up on the ml about this today
[14:28] <wido> that those objects weren't removed properly. But again, the MDS is kind of in dev state
[14:28] <kees_> true, for the filesystem i'll prolly just tar it all :)
[14:28] <wido> kees_: Better do so :)
[14:28] <wido> But for rados objects the ls and export works, so if you use phprados to store objects
[14:28] <wido> pixel: ceph-osd -c /etc/ceph/ceph.conf -i 0 --mkjournal
[14:29] <kees_> ye, sometimes i just want a dump of all my images for example
[14:32] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[14:35] <pixel> root@srv15:/var/lib/ceph/osd/ceph-0# ceph-osd -c /etc/ceph/ceph.conf -i 0 --mkjournal
[14:35] <pixel> 2012-11-07 17:35:09.551157 7ffe0a6d7780 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 5a82c7b2-5641-4bbc-b2ea-f49e732efd84, invalid (someone else's?) journal
[14:35] <pixel> 2012-11-07 17:35:09.551272 7ffe0a6d7780 -1 ** ERROR: error creating fresh journal /mnt/tmpfs/journal for object store /var/lib/ceph/osd/ceph-0: (22) Invalid argument
[14:37] <tnt> does the /mnt/tmpfs/journal already exist ?
[14:37] <pixel> yep
[14:38] <tnt> try deleting it before doing a mkjournal
[14:38] <tnt> (I mean assuming you indeed want to drop the journal and create a new one)
[14:39] <pixel> now I've received only this error: ** ERROR: error creating fresh journal /mnt/tmpfs/journal for object store /var/lib/ceph/osd/ceph-0: (22) Invalid argument
[14:39] * kees_ (~kees@devvers.tweaknet.net) Quit (Read error: Connection reset by peer)
[14:46] * kees_ (~kees@devvers.tweaknet.net) has joined #ceph
[14:47] <kees_> ok, mental note: don't use laptop as a test client anymore.. kernel panics suck
[14:47] <wido> pixel: Ah, wait, there was/is a setting for journal on tmpfs
[14:48] <wido> I'm not sure which one
[14:49] <tnt> you might need the xattr stuff like on ext4
[14:49] <tnt> ?
[14:50] <pixel> it's my setting: mount -t tmpfs -o size=3072M tmpfs /mnt/tmpfs/
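
The ceph.conf side of a tmpfs journal probably looks something like the following; in particular, tmpfs has no O_DIRECT support, so direct I/O on the journal needs to be switched off, which may well be the cause of the "Invalid argument" error above (option names from memory, so double-check against your release):

    [osd.0]
        osd journal = /mnt/tmpfs/journal
        osd journal size = 2048       ; MB, must fit inside the tmpfs
        journal dio = false           ; tmpfs cannot do O_DIRECT
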
[14:54] * mtk (NGSByVW6BK@panix2.panix.com) Quit (Remote host closed the connection)
[14:54] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[14:58] * nhm stumbles into the channel
[14:58] <nhm> morning #ceph
[14:58] <jmlowe> did you watch the concession speech or did you make it all the way to the acceptance speech?
[15:00] <nhm> jmlowe: acceptance speech and then a bit more of the MN ammendments.
[15:01] <iltisanni> all my pgs are in stuck stale mode after I updated Kernel and rebooted all VMs... How to fix it ?
[15:01] <iltisanni> http://pastebin.com/J4RScHZP <- maybe this helps for my question
[15:01] <iltisanni> and ceph pg 4.16 query for example outputs -> pgid currently maps to no osd
[15:02] <jmlowe> A bit lucky here in Indiana when it comes to election night bed times, no ballot initiatives and we call 'em early and red
[15:02] <nhm> iltisanni: sounds like a problem with the crush map
[15:02] <iltisanni> nhm: what can I do then?
[15:03] <nhm> iltisanni: can you do: ceph osd dump -o -|grep size
[15:04] <iltisanni> y -> output: http://mibpaste.com/d4GiH2
[15:05] <nhm> ok, now ceph osd getcrushmap, and look and see if all of the pools map to crush rulesets
[15:07] <iltisanni> haha.. here is my crushmap....doesn't look that good?? http://mibpaste.com/ZGEcq9
[15:08] <nhm> oh my
[15:08] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[15:10] <iltisanni> y.... so. what happened?
[15:10] <nhm> iltisanni: good question. Trying to see if I can find any occurrences of something similar happening to someone else.
[15:11] <iltisanni> thats really strange... everything was OK.. then I updated kernel version and rebooted.. after that the crushmap was empty
[15:12] <nhm> what kernel did you update from/to?
[15:12] <iltisanni> from 3.5 to 3.6.3
[15:13] <iltisanni> because I couldn't mount anything I was told to update the kernel version first
[15:13] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[15:13] <nhm> iltisanni: what/who told you you couldn't mount anything?
[15:15] <iltisanni> I tried to mount with mount -t ceph 10.61.11.68:6789:/ /mnt/mycephfs and then the client just stopped working
[15:15] <iltisanni> couldnt do anything
[15:15] <iltisanni> only hard shutdown was possible
[15:16] <iltisanni> no output.. nothing
[15:18] <nhm> iltisanni: sounds like maybe there was a problem before the kernel upgrade...
[15:19] * noob2 (a5a00214@ircip1.mibbit.com) has joined #ceph
[15:19] <iltisanni> any way to re generate the crusmap ?
[15:19] <iltisanni> or something like that
[15:20] <nhm> iltisanni: I don't actually know. This is probably a question for one of the devs at this point. If not, you might have to try recreating it by hand.
[15:20] <iltisanni> ok .. thx anyway
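
The usual round trip for inspecting and hand-repairing a crush map, as hinted at above (file names are arbitrary):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt    # decompile to editable text
    # edit crushmap.txt: re-add the missing devices, buckets and rules
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
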
[15:23] <nhm> iltisanni: it's still like 6:30am in california, so they should be around in a couple of hours.
[15:23] * pixel (~pixel@81.195.203.34) Quit (Quit: Ухожу я от вас (xchat 2.4.5 или старше))
[15:24] <iltisanni> :-) in a couple of hours it's 6:30 pm here and I'm at home doing other things.
[15:24] <iltisanni> fu timezones
[15:24] <iltisanni> ;-)
[15:25] <noob2> nhm: quick question on adding an osd to ceph
[15:26] <noob2> i added my osd to all the config files for the servers, did the osd create, added it into the osd tree. when i do the ceph-osd -i 5 --mkfs it fails saying it can't find the directory
[15:29] * iltisanni (d4d3c928@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[15:39] <noob2> nvm i got it :)
[15:43] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[15:48] <kees_> great, i somehow wrecked my cluster.. can't mount it anymore with cephfs.. getting 'can't read superblock' and in dmesg: libceph: osdc handle_map corrupt msg
[15:55] <kees_> and now the kernel panics again
[16:02] * PerlStalker (~PerlStalk@perlstalker-1-pt.tunnel.tserv8.dal1.ipv6.he.net) has joined #ceph
[16:22] <nhm> noob2: doh, sorry, I was eating breakfast. Glad you got it worked out.
[16:25] <jtang> heh cool the ceph day slides are up
[16:25] <jtang> i can at least read some of it
[16:25] <jtang> gah no texts!
[16:26] <tnt> was it filmed ?
[16:27] <jtang> dunno
[16:27] <jtang> it'd be nice if it was
[16:27] <jtang> :P
[16:27] * noob2 (a5a00214@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[16:29] <jtang> we had a chat in work yesterday about them storage pods
[16:29] <jtang> we think we might just take a hit on the capacity and mirror the disks internally with btrfs
[16:29] <jtang> so we'll only have 65tb's per pod
[16:29] * kees_ (~kees@devvers.tweaknet.net) Quit (Remote host closed the connection)
[16:33] <jtang> the cluster design slides look interesting
[16:39] <jtang> cool geo-replication is on the map
[16:41] * noob2 (a5a00214@ircip2.mibbit.com) has joined #ceph
[16:41] * loicd (~loic@31.36.8.41) Quit (Quit: Leaving.)
[16:43] <noob2> i have a question about ceph performance. i setup a 5 node ceph cluster, 1 is a monitor. all nodes have sas 10K 36GB disks which are older 3Gb/s. When i do a straight dd on the home partition without ceph i get about 750MB/s which is normal for these drives. When I mount a ceph rbd on a remote server over 1Gb/s ethernet I'm looking at 60MB/s write speed.
[16:43] <noob2> is that normal?
[16:43] <noob2> i would've expected around network speed
[16:44] <noob2> btrfs is under all the osd disks
[16:44] <noob2> i'm runnning ceph 0.53
[16:47] <noob2> seems my block size with dd makes a huge difference
[16:47] <noob2> at 1MB i'm at 90MB/s write speed
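
The dd numbers above are sensitive to caching and block size; something like the following keeps the page cache out of the measurement (device and file paths are placeholders):

    # write test against a mapped rbd device, bypassing the page cache
    dd if=/dev/zero of=/dev/rbd0 bs=4M count=1024 oflag=direct
    # or, on a filesystem on top of the rbd image, flush at the end
    dd if=/dev/zero of=/mnt/rbd/testfile bs=4M count=1024 conv=fdatasync
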
[16:50] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[16:50] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit ()
[16:53] <gucki> what is the difference between pg-num and pgp-num when creating a new pool? should both be set to osd count * 100?
[16:54] * stxShadow (~jens@jump.filoo.de) has joined #ceph
[17:05] * joshd1 (~jdurgin@2602:306:c5db:310:b8ba:dd34:f85:264b) has joined #ceph
[17:20] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:25] * The_Bishop (~bishop@2001:470:50b6:0:ec24:ee44:4264:26c) has joined #ceph
[18:02] * buck (~buck@c-24-7-14-51.hsd1.ca.comcast.net) has joined #ceph
[18:05] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:06] * sagelap1 (~sage@bzq-19-168-31-70.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[18:08] * oliver2 (~oliver@jump.filoo.de) has joined #ceph
[18:09] <nhm> gucki: set both to the same thing yeah
[18:12] <gucki> nhm: do you know what's the difference? can i show both values of existing pools to check them? "rados lspools" only shows the name...
[18:12] * scuttlemonkey_ is now known as scuttlemonkey
[18:12] * tnt (~tnt@50.90-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:14] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:16] <nhm> gucki: pg-num is the number of total placement groups, and pgp-num is the total number of placement groups for placement purposes. Perhaps in some cases it makes sense to set pg-num higher than pgp-num. I've never really dug into it.
[18:16] * danieagle (~Daniel@177.133.174.11) has joined #ceph
[18:17] <joshd1> nhm, gucki: iirc it was meant to be used separately when pg-splitting was available, but in practice they should always be the same right now
[18:17] <gucki> nhm: yeah i read that but to be honest i dont really get it. what are placement groups used for other than placement of data?
[18:17] <tnt> Any rgw dev around ? I create a few issue and I'd love to know if I need to collect more info to get them looked at :)
[18:18] <gucki> i'm asking because i created my cluster with only setting the first value, omitting the second one. so i now wonder what the second value was set to automatically and if it has a bad impact on performance... :)
[18:18] <nhm> tnt: yehuda is probably your man.
[18:19] <buck> I have a pull request for autobuilder-ceph. Is anyone available that could review it? The branch is wip-java.
[18:19] <gucki> there's no way to tune these values afterwards, right? so when i start with a pool with 8 osds and then expand my cluster at some point to 16 osds, i cannot double the number of placement groups..?
[18:20] <tnt> nhm: ah thanks.
[18:20] <NaioN> gucki: no not at the moment
[18:20] * loicd (~loic@31.36.8.41) has joined #ceph
[18:20] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[18:20] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[18:20] <NaioN> pg splitting will be built in the future
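
For reference, creating a pool with both values set explicitly and checking what an existing pool ended up with (pool name and counts are placeholders):

    # pg_num and pgp_num at creation time
    ceph osd pool create mypool 800 800

    # existing pools list both values in the osd dump
    ceph osd dump | grep pg_num
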
[18:20] * loicd (~loic@31.36.8.41) Quit ()
[18:21] * loicd (~loic@31.36.8.41) has joined #ceph
[18:25] <joao> any of you guys around?
[18:25] <joshd1> good morning joao
[18:25] <joao> morning josh :)
[18:26] <joao> teuthology is driving me nuts
[18:26] <joao> have you ever seen this?
[18:26] <joao> CommandFailedError: Command failed with status 1: 'cd -- /tmp/cephtest/mnt.0 && sudo install -d -m 0755 --owner=ubuntu -- client.0'
[18:26] <joao> apparently my workunit isn't run because of this (I think)
[18:27] <joao> no idea what I'm messing up
[18:27] <joshd1> joao: I'm guessing you need something (like the ceph-fuse task) before the workunit to create /tmp/cephtest/mnt.0
[18:27] <joao> oh
[18:27] <joao> ah right, the thrashosds yaml does indeed have ceph-fuse
[18:28] <joao> alright, will give that a shot
[18:28] <joao> thanks
[18:28] <joshd1> np
[18:32] * Leseb (~Leseb@193.172.124.196) Quit (Quit: Leseb)
[18:35] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[18:40] <elder> joshd, If I am going to fire off a request to a parent, I believe I should use the snapshot context that was provided with the original request, right? I.e., if it happened to have changed between the original request and the send-to-parent request I don't use the new one.
[18:41] <elder> joshd1^
[18:43] <joshd1> elder: only writes need the snapshot context, and you're not writing to the parent
[18:43] <elder> joshd1 related, I probably can abort early before I forward a request to a parent if at that point in time I find the original mapped (snapshot) image no longer exists.
[18:43] <elder> That makes sense.
[18:43] <elder> I'm trying to mimic the code in rbd_do_request() and hadn't thought I won't need any write path there...
[18:43] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[18:44] <joshd1> elder: yes, if the original image doesn't exist, it's fine to cancel any i/o you feel like
[18:44] <elder> Sounds good, thanks.
[18:45] <stxShadow> hmmm .... what can be wrong if i get an answer to "rados df" but "rbd ls *pool*" hangs
[18:45] <stxShadow> we can't start our vms after recovering the cluster
[18:46] <joshd1> stxShadow: the rbd_directory object may be inaccessible if 'rbd ls pool' hangs
[18:46] <stxShadow> and how may we recover from that state ?
[18:48] * glowell (~Adium@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:49] * filooabsynth (~Adium@p4FFFE3BB.dip.t-dialin.net) has joined #ceph
[18:50] <filooabsynth> so. detecting a night shift.
[18:50] <joshd1> what's the status of your cluster (ceph -s)?
[18:50] <stxShadow> root@fcmsnode6:~# ceph -s
[18:50] <stxShadow> health HEALTH_OK
[18:50] <stxShadow> monmap e1: 3 mons at {0=10.10.10.4:6789/0,1=10.10.10.5:6789/0,2=10.10.10.6:6789/0}, election epoch 70, quorum 0,1,2 0,1,2
[18:50] <stxShadow> osdmap e4589: 48 osds: 16 up, 16 in
[18:50] <stxShadow> pgmap v8139161: 10760 pgs: 10760 active+clean; 6956 GB data, 13350 GB used, 16429 GB / 29780 GB avail
[18:50] <stxShadow> mdsmap e55: 1/1/1 up {0=0=up:active}, 1 up:standby
[18:50] <jmlowe> joshd1: any idea why I'm getting this in my cinder-volume.log "Unable to read image rbd://a23f9c9e-9f19-4a1a-93d0-762cf5254d7b/images/cfd35715-6794-4727-8075-ee5d9c3263f8/snap _is_cloneable /usr/lib/python2.7/dist-packages/cinder/volume/driver.py:772"
[18:51] <joshd1> stxShadow: hmm, 'rbd ls' must be hanging for a different reason if it's all active+clean like that
[18:52] * glowell2 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:53] <joshd1> stxShadow: can you try 'rbd ls *pool* --debug-ms 1' and pastebin the output?
[18:53] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[18:53] * sagelap (~sage@bzq-218-183-205.red.bezeqint.net) has joined #ceph
[18:53] <joshd1> jmlowe: does the client cinder is using have read permission on the images pool?
[18:53] * MikeMcClurg (~mike@62.200.22.2) Quit (Ping timeout: 480 seconds)
[18:53] <jmlowe> I think so
[18:54] <jmlowe> wait let me check
[18:54] <joshd1> jmlowe: that's rbd://<fsid>/<pool>/<image>/<snapshot>
[18:55] <oliver2> joshd: there you go: http://pastebin.com/e1JmVgjK
[18:55] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:55] <gucki> oh, i just found one mon crashed: http://pastie.org/5341600
[18:55] <jmlowe> I can ls the pool with the cinder credentials
[18:55] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[18:56] <gucki> i wonder how the clocks can be too skewed...there's running chronyd on all machines?
[18:56] <joshd1> jmlowe: can you run 'rbd info' on that image?
[18:57] <gucki> btw, the "Issue Tracking" link on ceph.com seems broken.
[18:57] <gucki> http://tracker.newdream.net/projects/ceph <- url not found
[18:57] <joshd1> oliver2: hmm, it's not contacting the osds at all
[18:57] <joshd1> gucki: seems the entire bug tracker is down atm
[18:57] <gucki> joshd: woah, too bad...i just wanted to create a few :-)
[18:58] <jmlowe> ok so I do have a permissions problem "error opening image cfd35715-6794-4727-8075-ee5d9c3263f8: (1) Operation not permitted"
[18:58] <gucki> joshd: any idea what can be wrong with my clocks? :)
[18:59] <jmlowe> is this more parsing problems? "caps: [osd] allow rwx pool volumes, allow rx pool images"
[18:59] <oliver2> joshd: well... bad state, though...
[19:00] <joshd1> oliver2: can you check your monitor logs for any messages about clocks too far in the future?
[19:00] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[19:02] <joshd1> oliver2: also 'rbd ls *pool* --debug-ms 1 --debug-monc 20'
[19:02] * nwatkins (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[19:02] <oliver2> joshd: seems clean with clock
[19:02] <sagelap> bin/fab gitbuilder_ceph:host=ubuntu@onehost
[19:02] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[19:02] <joshd1> I've got to get on a train, back in an hour or so
[19:03] <joshd1> jmlowe: you can try the 'next' branch, which has the parsing fixed. that looks right
[19:03] * joshd1 (~jdurgin@2602:306:c5db:310:b8ba:dd34:f85:264b) Quit (Quit: Leaving.)
[19:04] * chutzpah (~chutz@199.21.234.7) has joined #ceph
[19:04] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[19:07] * jlogan (~Thunderbi@2600:c00:3010:1:9b2:ed42:a1f6:a6ec) has joined #ceph
[19:08] <oliver2> Hi Sage.. any idea... Josh left for his train?...
[19:10] <sagelap> just got online. traveling
[19:11] <sagelap> did you post the complete output from the rbd command? there was a blank line.
[19:13] <stxShadow> http://pastebin.com/e1JmVgjK -> the output
[19:13] <noob2> nhm: just got my pci express ssd cards in :)
[19:13] <noob2> they're crazy fast
[19:13] <jtang> fusion io?
[19:13] <noob2> intel 910's
[19:13] <nhm> noob2: whoa, jeleous! :)
[19:13] <noob2> :)
[19:13] <nhm> noob2: very nice
[19:14] <noob2> is it possible to create a separate tier of storage in ceph with just them?
[19:14] <noob2> i'm guessing yes. ceph can do anything pretty much
[19:14] <nhm> noob2: yeah, you should be able to do that.
[19:14] <nhm> noob2: you'll want to use master
[19:14] <jtang> can you put ltfs in the backend?
[19:14] <noob2> master?
[19:14] <jtang> i meant to ask that question a while ago, noob2 just prompted ot ask
[19:15] <noob2> haha
[19:15] <noob2> glad i could jog your memory
[19:15] <jtang> heh
[19:15] <jtang> ltfs --> http://en.wikipedia.org/wiki/Linear_Tape_File_System
[19:15] <nhm> noob2: Sam's been improving our locking and threadpool code in the osd/filestore. You'll want that.
[19:15] <nhm> Not sure if it's in 0.53
[19:15] <noob2> ok
[19:16] <noob2> i have .53 running right now
[19:16] <jtang> having tiered storage would be nice
[19:16] <noob2> definitely
[19:17] <noob2> i didn't know oracle open sourced their ltfs
[19:18] <jtang> ltfs is a standard
[19:18] <jtang> there are multiple implementations of it
[19:18] <noob2> i see
[19:18] <noob2> i have heard of this before. the backup guy is always talking about LTO's
[19:18] <jtang> heh
[19:18] <jtang> after seeing what the SUN/Oracle guys are selling, I'm not sure if LTO's is all that hot
[19:19] <noob2> nhm: is that new locking code going to get folded into the next stable release?
[19:23] <nhm> noob2: I think so, Sam would know for sure.
[19:23] <noob2> ok
[19:26] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:31] * MikeMcClurg (~mike@62.200.22.2) has joined #ceph
[19:33] * stxShadow (~jens@jump.filoo.de) Quit (Ping timeout: 480 seconds)
[19:34] * houkouonchi-work (~linux@12.248.40.138) Quit (Remote host closed the connection)
[19:35] <jefferai> nhm: I assume the tiering bit is done with a crushmap?
[19:36] <tnt> yehudasa: ping
[19:37] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[19:37] <nhm> jefferai: well, yes, but just so that you can have a seperate pool for the high performance storage.
[19:37] <jefferai> right
[19:37] <jefferai> which is my case
[19:37] <jefferai> I was going to use a crushmap
[19:37] <jefferai> I guess I could also use a separate cluster
[19:38] <nhm> yeah, either way.
[19:38] <yehudasa> tnt: pong
[19:40] <tnt> yehudasa: I created a few bug tickets about rgw and was wondering if you could have a quick look see if you need more info for them ? (I have the thing setup to reproduce the issues on the test cluster currently).
[19:40] <yehudasa> tnt: I started looking at these
[19:41] <yehudasa> does that include the multipart permissions issue?
[19:41] <tnt> yes
[19:41] <yehudasa> this one didn't make quite sense, as there's a mixup between the bucket owner and the object owner
[19:42] <yehudasa> the user who creates the upload is the object owner and should have write access to the object
[19:42] <AaronSchulz> yehudasa: was that temp url bug ever created?
[19:42] <yehudasa> unless you're using a different user to initiate the upload and to complete the upload
[19:42] <tnt> yehudasa: well, I'm using a subuser with write permission only ...
[19:42] <yehudasa> AaronSchulz: not that I'm aware of
[19:42] <yehudasa> tnt: a subuser with S3?
[19:42] <tnt> yehudasa: yes
[19:43] <yehudasa> not sure we're talking about the same thing
[19:43] <yehudasa> how did you create the subuser?
[19:44] <tnt> yehudasa: http://pastebin.com/3Bi6Rbac
[19:45] <yehudasa> AaronSchulz: can you open a ticket for that?
[19:45] <tnt> this gives me a S3 key that can write in buckets owned by 'svc' but can't list the content, and when it writes, the Owner is set to 'svc' which is what I need.
[19:48] <yehudasa> ah, I wasn't aware that it would work
[19:49] <yehudasa> the problem is that the change that you introduce modifies the behavior and breaks S3 compatibility
[19:51] <yehudasa> e.g., if you'd initiate a multipart object request, set permission as WRITE only for some other user (for that object)
[19:51] <yehudasa> and then try to list the parts using that user, you're going to fail
[19:52] <joshd> oliver2: back
[19:53] <oliver2> joshd: thnx... got a chat with sage already ;)
[19:53] <joshd> even better :)
[19:53] <oliver2> josh: things have recovered...
[19:53] <tnt> yehudasa: mmm, I don't see what you mean.
[19:54] <joshd> oliver2: what was the underlying problem?
[19:54] <oliver2> joshd: ceph osd was "paused"...
[19:55] <yehudasa> tnt: what you do is essentially the same as: user A initiates multipart upload, setting user B for write-only permission on the object. Then user B tries to list the upload parts.
[19:55] <yehudasa> The fact that you're using a subuser here doesn't matter.
[19:56] <joshd> gucki: bug tracker is back
[19:56] <joshd> joao: did you see gucki's mon crash earlier?
[19:56] <yehudasa> However, we can relax it for this specific request so that if it's a subuser, look at the parent user permissions.
[19:57] <yehudasa> that way we don't break compatibility with S3, as there's no subuser there
[19:57] <tnt> yehudasa: mmm, I was seeing the subusers more as the same kind of multiple users you can create under a single amazon account.
[19:57] <yehudasa> tnt: it is similar
[19:57] <tnt> yehudasa: but yes, if for that particular API you can check the parent, that'd be good for me.
[19:58] <joao> joshd, sorry, no; went afk for a bit
[19:58] <joao> when was that mon crash?
[19:59] <joshd> (09:55:24 AM) gucki: oh, i just found one mon crashed: http://pastie.org/5341600
[20:00] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[20:00] <yehudasa> tnt: maybe the correct fix would be to set the owner as the subuser, however, then it wouldn't work for your case where you want the user to only have write access to the objects
[20:01] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[20:01] <joao> joshd, looking
[20:02] <tnt> yehudasa: yeah here the whole point is that we wanted to 'delegate' upload permissions but we want ownership of the uploaded files.
[20:02] <tnt> or put the owner as the subuser until the multipart complete
[20:03] <yehudasa> tnt: I think that can be solved with POST object, however, that'll only be available in 0.55
[20:03] <tnt> what's that ?
[20:03] <yehudasa> http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectPOST.html
[20:04] <tnt> mmm, not sure this would fit us. We need multi-part because we upload multi-gb files ...
[20:07] <tnt> In S3 we could use policies, but that's a whole other thing to implement
[20:08] <joao> well, that looks a lot like a race on the monitor's state
[20:08] <joao> gucki, what version are you running?
[20:12] <filooabsynth> after days like this one, i want to hit somebody
[20:12] <filooabsynth> seriously
[20:14] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[20:14] * houkouonchi-work (~linux@12.248.40.138) Quit (Remote host closed the connection)
[20:16] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[20:16] <yehudasa> tnt: also, you asked about the 200 OK issue?
[20:16] <yehudasa> ah, it wasn't you maybe
[20:17] <tnt> yehudasa: nope not me. The two other issues I entered were related to download resume :p (If-xxx and Etag)
[20:17] <yehudasa> tnt, the ETag is a trivial fix
[20:18] <yehudasa> I do remember fixing a similar issue recently though
[20:18] <yehudasa> checking the If-Modified-Since now
[20:18] <tnt> well, do you support all the RFC unquoting rules or just strip "" ?
[20:19] <tnt> I wasn't actually sure if it was the job of the fastcgi frontend to deal with that or not.
[20:19] <yehudasa> fastcgi doesn't handle that
[20:19] <yehudasa> basically we strip the "" for the http header fields
[20:20] <yehudasa> not sure if the http url encoding applies to these
[20:21] <tnt> no, but \ escaping does
[20:22] <tnt> (for ETag at least, each field can actually have different rules ...)
[20:22] <tnt> gotta love http specs.
[20:22] <yehudasa> well, users cannot modify ETag
[20:22] <gucki> joao: latest argonaut from ubuntu quantal repos
[20:23] <gucki> joao: so 0.48.2
[20:23] <joao> gucki, kay, thanks
[20:23] <yehudasa> and since we generate them using a pretty constant format, there's no real use for any other escaping rules
[20:23] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[20:23] <joao> if you happen to have logs from all the monitors, it would be nice if you could just drop them somewhere I could grab them from
[20:24] <gucki> joshd: thanks, going to create some issues/ feature requests now :
[20:24] <tnt> yehudasa: well, a library could feel free to decode/re-encode the headers as they see fit.
[20:26] <yehudasa> tnt: while you are correct, I don't think that investing time in that is a top priority, until we find a library that actually does that
[20:26] <yehudasa> for these specific strings it doesn't make sense
[20:26] <yehudasa> but I'll keep it in mind
[20:26] <AaronSchulz> yehudasa: http://tracker.newdream.net/issues/3454
[20:26] <yehudasa> AaronSchulz: thanks
[20:28] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[20:31] <noob2> as far as the journals go can i move them after firing up my ceph cluster?
[20:31] <noob2> i'd like to see how ceph performs if i move them onto something else
[20:36] <joshd> noob2: you can, but you have to stop the osd, 'ceph-osd -i N --flush-journal', move the journal, then 'ceph-osd -i N --mkjournal' before starting the osd again
[20:36] <NaioN> noob2: yeah you could
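A minimal sketch of that procedure, assuming osd.2 and hypothetical paths (the journal lives wherever "osd journal" points in ceph.conf):

    service ceph stop osd.2                                  # stop the osd first
    ceph-osd -i 2 --flush-journal                            # replay pending journal entries into the store
    mv /var/lib/ceph/osd/ceph-2/journal /ssd/osd.2.journal   # move it, then update "osd journal" in ceph.conf
    ceph-osd -i 2 --mkjournal                                # create the journal at the new location
    service ceph start osd.2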
[20:47] <gucki> not sure if it's a bug: when i shutdown an osd (service ceph stop mon), the cluster immediately starts to rebalance. shouldn't it wait around 300 seconds before the rebalance happens? this is really bad, if you need to take down an osd for a couple of minutes for example...
[20:48] <gucki> mon osd down out interval
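That interval is a monitor option; a ceph.conf sketch (the value shown is the default):

    [mon]
        ; how long (seconds) an osd may stay "down" before the monitors mark it
        ; "out" and data starts rebalancing
        mon osd down out interval = 300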
[20:51] <NaioN> euhmmm do you use down or out?
[20:51] <gucki> i do service ceph stop mon
[20:51] <gucki> sorry
[20:52] <gucki> service ceph stop OSD
[20:52] <gucki> my fault
[20:52] <gucki> does the pause command help here? so should i do "ceph osd pause X" and then stop the osd daemon?
[20:53] <NaioN> I don't think that will help
[20:53] <joshd> no, just stopping the daemon like that shouldn't cause a rebalance, unless the upstart/init script is doing something it shouldn't
[20:53] <NaioN> but i don't know if stopping the service is the same as out'ing the osd
[20:54] <NaioN> joshd: yeah thought so, it's the same as downing the osd...
[20:54] <gucki> joshd: ok, so i should file a bug for it?
[20:55] * MikeMcClurg (~mike@62.200.22.2) Quit (Ping timeout: 480 seconds)
[20:55] * buck (~buck@c-24-7-14-51.hsd1.ca.comcast.net) has left #ceph
[20:55] <gucki> joshd: but wait...i think it also rebalances when i simply do "kill #osd-daemon-pid"
[20:55] <gucki> joshd: if you like i can test it ;)
[20:56] <NaioN> gucki: that's the same as stopping the service...
[20:56] <gucki> NaioN: if the upstart script does only that, yes. but then it's not a bug in the upstart script but ceph..?
[20:57] <joshd> gucki: what's your ceph.conf where the mons are running?
[20:58] <gucki> joshd: http://pastie.org/5342155
[20:58] <NaioN> gucki: as far as i know that's the only thing the scripts do
[20:58] <gucki> joshd: it's the same on all hosts
[20:58] <NaioN> so no i don't think it's a bug in the scripts but somewhere else
[21:00] <joshd> just to confirm, after you stop an osd (by killing the process directly or with service ceph stop osd), you immediately see ceph -s report the osd as down and out (ceph osd dump will show the individual osds)?
[21:02] <gucki> joshd: hold on, i'll just test it now
[21:07] <gucki> joshd: http://pastie.org/5342192
[21:08] <joshd> so that's correct, it's being marked down, but data isn't reshuffled until it's marked out
[21:09] <joshd> and it shouldn't be marked out (shows up as 0 in the last column of ceph osd tree) until that 300s interval you mentioned earlier
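One way to watch that transition (exact output formats vary between versions):

    ceph osd tree                    # last column drops to 0 once the osd is marked out
    ceph osd dump | grep '^osd\.3'   # should read "down in" right after the daemon stops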
[21:10] <gucki> joshd: but it does rebalance as you can see from the output of ceph -w ...?
[21:10] <jefferai> So I just got my ceph cluster up and running, I think...ran mkcephfs which ran without errors
[21:10] <jefferai> but when I run "ceph -k <keyring> -c <conf file> health"
[21:10] <jefferai> it hangs
[21:10] <jefferai> any ideas?
[21:10] <joshd> jefferai: are the monitors running?
[21:11] <jefferai> yes
[21:12] <joshd> gucki: that ceph -w output showing pgs marked degraded doesn't mean they've moved, just that there's less than the intended number of replicas currently. you'll see them go to peering+degraded and then active+degraded+recovering when osd.3 is finally marked out, and that's when data is moved
[21:13] <jefferai> hm
[21:13] <jefferai> joshd: 2012-11-07 15:13:25.155462 7f00ba621700 1 mon.1@0(probing) e0 discarding message auth(proto 0 30 bytes epoch 0) v1 and sending client elsewhere; we are not in quorum
[21:13] <gucki> joshd: ah ok...so when bringing the osd back up the recovery i see then is just for all the data that changed in the meanwhile?
[21:14] <NaioN> jefferai: you sure all mons are started?
[21:14] <jefferai> yep
[21:14] <jefferai> NaioN: I'm seeing that message on all three mons :-)
[21:14] <joshd> gucki: yes
[21:14] <NaioN> how many do you have?
[21:14] <NaioN> oh k
[21:14] <NaioN> can they reach each other?
[21:14] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[21:15] <jefferai> good question, let me ping around
[21:15] <jefferai> they can ping, at least
[21:15] <jefferai> I did put in the requested iptables rules to allow traffic
[21:15] <NaioN> and you didn't split the networks?
[21:16] <NaioN> so cluster network and public network?
[21:16] <jefferai> I did
[21:16] <jefferai> ah, hah
[21:16] <jefferai> firewall, after all
[21:16] <NaioN> ok and in what network reside the mons and can they reach each other in that network?
[21:16] <NaioN> hehe
[21:17] <NaioN> damn firewalls :)
[21:17] <jefferai> hm, that doesn't seem nice:
[21:17] <jefferai> ==> /var/lib/ceph/log/osd.100.log <==
[21:17] <jefferai> 2012-11-07 15:17:31.330700 7ff43b3a6780 1 journal close /var/lib/ceph/osd/journal/osd.100
[21:17] <jefferai> 2012-11-07 15:17:31.331198 7ff43b3a6780 -1 ** ERROR: osd init failed: (1) Operation not permitted
[21:18] <joshd> jefferai: usually that happens due to missing xattrs - the user_xattr option for ext4
[21:18] <jefferai> not using ext4
[21:18] <jefferai> using xfs
[21:18] <NaioN> does the file exist?
[21:18] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[21:18] <jefferai> which doesn't have a user_xattr option (read around the net, on by default)
[21:19] <NaioN> yeah for xfs you don't have to do anything for xattrs
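For the ext4 case mentioned above, a hypothetical mount setup (device and mount point are made up; xfs needs nothing extra):

    # one-off remount
    mount -o remount,user_xattr /dev/sdb1 /var/lib/ceph/osd/ceph-0
    # or persistently in /etc/fstab
    /dev/sdb1  /var/lib/ceph/osd/ceph-0  ext4  rw,noatime,user_xattr  0 0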
[21:19] <jefferai> they do
[21:19] <jefferai> exist, that is
[21:19] <jefferai> the journals
[21:19] <jefferai> I'll let things settle a bit
[21:19] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[21:19] <NaioN> and they have the right size?
[21:19] <jefferai> might just be because everything is coming up
[21:19] <jefferai> yep, 5GB each
[21:20] <jefferai> seeing a lot of osd X reported failed by osd Y
[21:20] <jefferai> I'm guessing because they're still syncing?
[21:20] <jefferai> health shows HEALTH_OK
[21:20] <NaioN> what's the output of ceph -s?
[21:20] <jefferai> health HEALTH_OK
[21:20] <jefferai> monmap e1: 3 mons at {1=192.168.66.201:6789/0,2=192.168.66.202:6789/0,3=192.168.66.204:6789/0}, election epoch 6, quorum 0,1,2 1,2,3
[21:20] <jefferai> osdmap e21: 12 osds: 12 up, 12 in
[21:20] <jefferai> pgmap v65: 3648 pgs: 3648 active+clean; 0 bytes data, 504 MB used, 5719 GB / 5720 GB avail
[21:20] <jefferai> mdsmap e1: 0/0/1 up
[21:21] <jefferai> sorry, should have pastebinned I guess
[21:21] <NaioN> everything seems ok
[21:21] <jefferai> ah, missing heartbeats being reported for some nodes
[21:21] <gucki> joshd: btw, why's there only aio for the journal and not the data itself? :)
[21:22] <NaioN> jefferai: well if the load gets too high you could see them
[21:22] <NaioN> but I saw them when recovering
[21:22] <jefferai> load average is 0.28
[21:22] <joshd> gucki: aio requires directio, which can often end up being more expensive. plus with btrfs, we can use btrfs snapshots instead of calling syncfs or using directio
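The journal options joshd is referring to, as a ceph.conf sketch (defaults differ between releases, so check before relying on them):

    [osd]
        journal dio = true    ; direct I/O for journal writes
        journal aio = true    ; libaio for the journal; requires dio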
[21:23] <NaioN> jefferai: not only the load of the system :)
[21:23] <jefferai> node 4 keeps saying no reply from osd.205 ever
[21:23] <NaioN> also from the network
[21:23] <jefferai> and node 2 keeps saying no reply from osd.405 ever
[21:23] <NaioN> euhmm
[21:23] <jefferai> node 1 is happy
[21:23] <NaioN> yeah
[21:23] <NaioN> you didn't number the osds correctly
[21:23] <jefferai> how so?
[21:23] <NaioN> you can't have spaces between them
[21:24] <jefferai> spaces?
[21:24] <jefferai> where?
[21:24] <NaioN> so it's osd 1,2,3,4,5...
[21:24] <NaioN> not 1 205 405
[21:24] <jefferai> are you sure?
[21:24] <NaioN> yeah
[21:24] <jefferai> because most of the cluster seems totally fine
[21:24] <joshd> well, you can have spaces, it's just slightly less efficient
[21:24] <jefferai> and it's only those two osds
[21:24] * danieagle (~Daniel@177.133.174.11) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[21:24] <joshd> and some tools etc might assume sequential numbering
[21:24] <jefferai> hm
[21:25] <NaioN> yeah sequential i was looking for that word :)
[21:25] * buck (~buck@c-24-7-14-51.hsd1.ca.comcast.net) has joined #ceph
[21:25] <jefferai> I had wanted to do that because that way I could easily look at an OSD and know what and where it was
[21:25] <jefferai> which disk, which server
[21:26] <gucki> joshd: thanks for the info :)
[21:27] <NaioN> i would recommend sticking to sequential numbering for the osds; that's the generally recommended practice
[21:27] <NaioN> for mds's and mon's you can have names
[21:29] * filooabsynth (~Adium@p4FFFE3BB.dip.t-dialin.net) has left #ceph
[21:30] <joshd> jefferai: you can tell the server an osd is on with 'ceph osd tree' if you have the hostnames in your crushmap
[21:31] <jefferai> hrm, ok
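A crushmap host bucket of the kind joshd means looks roughly like this (names, ids and weights are placeholders):

    host storage-1 {
        id -2                      # buckets get negative ids
        alg straw
        hash 0                     # rjenkins1
        item osd.0 weight 1.00
        item osd.1 weight 1.00
    }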
[21:41] * holoca (~MoZaHeM@1RDAAEV1V.tor-irc.dnsbl.oftc.net) has joined #ceph
[21:48] * s_parlane (~scott@121-74-235-205.telstraclear.net) Quit (Ping timeout: 480 seconds)
[21:50] * dmick (~dmick@2607:f298:a:607:a19c:5287:dc35:bf55) has joined #ceph
[22:04] <jefferai> joshd: it seems to me like you don't really need crushmaps in most cases, you can use pools to separate out e.g. fast and slow storage, right?
[22:04] <holoca> joshd: it seems to me like you don't really need crushmaps in most cases, you can use pools to separate out e.g. fast and slow storage, right?
[22:04] <jefferai> or am I missing something?
[22:04] <holoca> or am I missing something?
[22:04] <elder> dmick, can you get operator privs for holoca?
[22:04] <holoca> dmick, can you get operator privs for holoca?
[22:05] * ChanServ sets mode +o dmick
[22:05] * holoca was kicked from #ceph by dmick
[22:05] <elder> My hero.
[22:05] <dmick> <bows>
[22:05] <nhm> yay
[22:05] <dmick> sorry
[22:05] <joshd> jefferai: the crushmap controls placement by matching pools to crush rules
[22:06] <dmick> (I started off confused, like, "who's holoca, and why would I give him ops")
[22:06] <joshd> jefferai: within a pool, your crush rule can separate replicas across failure domains (i.e. rows, racks, hosts, etc)
[22:06] <elder> Talk funny, sometimes I do.
[22:06] <jefferai> I see
[22:07] <jefferai> I assume the defaults are to spread replicas across hosts?
[22:07] * Tamil1 (~Adium@38.122.20.226) has joined #ceph
[22:07] <joshd> yup
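A replicated crush rule that spreads replicas across hosts looks roughly like this; the stock rules generated by mkcephfs are along these lines, though the exact defaults vary by version:

    rule rbd {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host   # one replica per host
        step emit
    }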
[22:08] <lurbs> For a small (three node, six osds per node) setup I have two SSDs per node. Currently they're split, and each is providing the journal for three osds. This made more sense to me than putting them in a RAID 1 and having less chance of failure, but also a larger impact and smaller possible journal size (8 Gb vs 16 GB) per osd.
[22:08] <lurbs> Any thoughts?
[22:08] <lurbs> s/Gb/GB/
[22:10] <joshd> I don't think raid1 for journal ssds is a good idea, since they'll get similar amounts of (and more than if they were separate) writes, so they might fail sooner and closer together
[22:10] <nhm> lurbs: I do 3 journals per SSD on my test node. 8GB journals per SSD should be fine.
[22:10] <nhm> s/per SSD/per OSD
[22:13] <lurbs> I should probably test performance with smaller journals. The less of the SSD I'm using the better they'll last, too.
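A sketch of splitting osd journals across two SSDs in ceph.conf (paths are hypothetical; osd.3 through osd.5 would point at the second SSD the same way):

    [osd]
        osd journal size = 8192              ; MB, used when the journal is a plain file
    [osd.0]
        osd journal = /srv/ssd0/osd.0.journal
    [osd.1]
        osd journal = /srv/ssd0/osd.1.journal
    [osd.2]
        osd journal = /srv/ssd0/osd.2.journal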
[22:14] <tnt> joshd: more writes than if they were separate? how so?
[22:15] * oliver2 (~oliver@jump.filoo.de) has left #ceph
[22:16] <joshd> tnt: with all the osds using the raid1, writes from every osd go to both ssds. using the 2 ssds separately only requires them to handle the writes of half the osds on that node
[22:17] <tnt> Ah yes, ok.
[22:17] <lurbs> I don't expect the workload to be all that write heavy, but it's still a consideration. SSDs are still crazy newfangled things that I don't quite trust.
[22:19] * warlot (~Chatzilla@1RDAAEV26.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:19] <yehudasa> tnt: I pushed a branch for 3452. It's against latest master but could probably be cherry-picked into whatever branch you're on
[22:19] <joshd> keep in mind that recovery counts as writes too
[22:19] <warlot> tnt: I pushed a branch for 3452. It's against latest master but could probably be cherry-picked into whatever branch you're on
[22:19] <warlot> keep in mind that recovery counts as writes too
[22:19] <yehudasa> fixes the not-modified-since issue
[22:19] <elder> dmick, warlot now
[22:19] <warlot> fixes the not-modified-since issue
[22:19] <warlot> dmick, warlot now
[22:19] * warlot was kicked from #ceph by dmick
[22:20] <elder> Dreamy!
[22:20] <yehudasa> whoo!
[22:20] <dmick> what a senseless hobby
[22:20] <elder> Killing echobots?
[22:20] <dmick> being
[22:21] <dmick> I mean, go blow up a model car or something mature
[22:22] <tnt> yehudasa: thanks, I'll try that tomorrow when I'm at work.
[22:23] * warlot (~Chatzilla@1RDAAEV26.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:25] <tnt> yehudasa: it's weird, I thought the other one was the right one. Guess I must have screwed up my testcase :p
[22:25] <warlot> yehudasa: it's weird, I thought the other one was the right one. Guess I must have screwed up my testcase :p
[22:25] <elder> Hi warlot!
[22:25] <warlot> Hi warlot!
[22:25] * warlot was kicked from #ceph by dmick
[22:25] <yehudasa> dmick, ban warlot
[22:25] <elder> I got chills.
[22:25] <elder> Yes.
[22:25] <dmick> trying to figure it out
[22:25] <elder> Bannination.
[22:27] * dmick sets mode +b warlot!*@*
[22:28] <elder> Now let's see if warlot_prime shows up.
[22:29] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:35] <rturk> wow, that's the first time I've seen misbehavior here
[22:35] <elder> It happens.
[22:37] <dmick> first day of that thousand years of darkness I guess
[22:37] * sagelap (~sage@bzq-218-183-205.red.bezeqint.net) Quit (Read error: Connection reset by peer)
[22:37] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[22:38] * sagelap (~sage@bzq-218-183-205.red.bezeqint.net) has joined #ceph
[22:44] <rturk> I blame the election
[22:46] <dmick> rturk: http://bit.ly/TawBkJ
[22:47] <elder> I wondered what that reference was.
[22:48] <rturk> aah!
[22:48] <rturk> nice
[22:48] <rturk> sounds vaguely racist
[22:48] <dmick> HOW DARE YOU
[22:49] <dmick> <sigh.
[22:49] * rturk wonders what ted nugent is saying today
[22:49] <lurbs> Only vague racism from America? That's a dramatic improvement.
[22:50] <rturk> lurbs: haha
[22:50] <dmick> rturk: it's something to behold. have antacids handy
[22:50] <lurbs> They should let the rest of the world vote for them, as it is around 55 million got it wrong.
[22:51] * lurbs shuts up and goes back to breaking his cluster.
[22:51] <elder> Chuck is a remarkably young looking 72 years old.
[22:52] <rturk> I guess being a ninja can do that
[22:53] <nhm> apparently Donald Trump was calling for marching on washington and having a revolution.
[22:55] <dmick> probably not the right forum for this conversation, but one last rejoinder: http://bit.ly/Z176CY
[22:55] * gucki (~smuxi@84-72-8-40.dclient.hispeed.ch) Quit (Remote host closed the connection)
[22:58] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[23:01] * ScottIam (~ScottIam@38.104.59.18) has joined #ceph
[23:07] <jskinner> anybody have experience with integrating with openstack?
[23:08] <joshd> jskinner: yeah, what's up?
[23:08] <jskinner> using cinder, I am having issues getting it to create volumes.
[23:09] <jskinner> but if I copy the commands that cinder is trying to run, and run them directly it works.
[23:09] <joshd> are you using cephx authentication?
[23:10] * nwatkins (~Adium@soenat3.cse.ucsc.edu) Quit (Quit: Leaving.)
[23:10] <jskinner> yes
[23:10] <joshd> does the user running cinder have read access to /etc/ceph/ceph.conf and whatever keyring that points to?
[23:11] <lurbs> Last time I tried that I ended up with apparmor permission issues. I think that the lbivirt-bin from Ubuntu's cloud repository fixes that issue, though.
[23:11] <lurbs> libvirt-bin, even
[23:11] <jskinner> we have temporarily disabled apparmor to rule that out
[23:11] <jskinner> cinder has access to the keyring
[23:11] <jskinner> but not the ceph.conf file
[23:11] * s_parlane (~scott@202.49.72.37) has joined #ceph
[23:13] <joshd> it needs access to ceph.conf to get the monitor addresses
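A hedged example of granting that access, assuming the service runs as a 'cinder' user and the keyring path matches the one referenced by ceph.conf:

    chgrp cinder /etc/ceph/ceph.conf /etc/ceph/ceph.client.volumes.keyring
    chmod 640    /etc/ceph/ceph.conf /etc/ceph/ceph.client.volumes.keyring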
[23:16] <lurbs> jskinner: I found this useful: http://www.sebastien-han.fr/blog/2012/06/10/introducing-ceph-to-openstack/
[23:16] <jefferai> dmick: so this is odd -- when starting a new cluster, one of my machines failed to create its OSDs
[23:16] <jefferai> only difference between it and the others was it was a slightly newer kernel patch level
[23:16] <jefferai> this is on precise
[23:16] <jskinner> thanks, lurbs. I have been reading that article and it is pretty useful
[23:16] <jefferai> I upgraded the other two machines, rebooting now, will see if all three fail
[23:16] <dmick> creating how, and any evidence of failure mode?
[23:16] <jskinner> joshd, we have made sure it has read access. still getting errors
[23:17] <jefferai> nope, no evidence of failure -- mkcephfs was fine, it was when the daemons first started
[23:17] <jefferai> they're rebooting, but when they come up I can get logs
[23:17] <jefferai> (if the problem is still there)
[23:17] <jskinner> "Stderr: 'create error: (17) File exists\n2012-11-07 16:16:03.011627 7f7eca93f780 -1 librbd: rbd image volume-7cbd54ba-75ab-4110-9069-dd9d5c624c41 already exists\n'"
[23:17] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[23:17] <jskinner> however, that file absolutely does not exist
[23:18] <joshd> jskinner: which ceph version are you running?
[23:18] <jskinner> ceph version 0.48.2argonaut (commit:3e02b2fad88c2a95d9c0c86878f10d1beb780bfe)
[23:19] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[23:20] <joshd> are you sure it's using the right pool?
[23:20] <jskinner> I've created the volumes pool in ceph, and in the cinder.conf told it to use volumes
[23:20] <joshd> I ask because there was a bug with earlier versions of 'rbd import' where the pool parameter was ignored
[23:21] <jskinner> "TRACE cinder.openstack.common.rpc.amqp Command: rbd create --pool volumes --size 1024 volume-3c295f4b-f527-4ab3-8429-6ff624fbe1f2"
[23:21] <jefferai> dmick: so here's the output from mkcephfs, all looks fine:
[23:21] <jefferai> http://paste.kde.org/599630/
[23:21] <jskinner> if I copy that command and run it as root it works just fine
[23:22] <jskinner> it's interesting, when trying to create volumes we either get the error I mention above; or this error:
[23:22] <dmick> jefferai: maybe they came up, but then died?
[23:22] <dmick> osd logs might show something
[23:22] <joshd> what if you run it as the cinder user? same error?
[23:22] <jskinner> "Stderr: 'create error: (1) Operation not permitted\n2012-11-07 16:20:49.116043 7f8042a6f780 -1 librbd: error adding img to directory: (1) Operation not permitted\n'"
[23:22] <jskinner> Will try that
[23:22] <dmick> (i.e. what you said; sorry, interleaved conversations)
[23:23] <jefferai> dmick: http://paste.kde.org/599636/
[23:23] <jefferai> that's the error
[23:23] <jefferai> it's only on this one box, not sure why
[23:23] <jefferai> they all got preseeded the same way, all using salt for config management
[23:23] <jefferai> and it's for every osd on this box
[23:24] <dmick> maybe try the osd startup command manually with strace to see what that EPERM is coming from
[23:24] <dmick> I suspect perms on the data or journal dirs/files, but no theories as to why they'd differ
[23:25] <dmick> (look at one of the other hosts' running osd and do the same with the appropriate mods)
[23:25] <joshd> jskinner: what are the permissions for the rados client cinder is using? (ceph auth list will show them)
[23:25] <jskinner> aha!
[23:25] <jskinner> that looks like it, running it from cinder user
[23:25] <jskinner> "error: couldn't connect to the cluster!2012-11-07 16:24:39.484045 7fe7e2973780 -1 auth: failed to open keyring from /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin"
[23:26] <jefferai> dmick: *excellent* idea
[23:26] <jskinner> so looks like bad read permission on the keyrings
[23:27] <joshd> yeah, although I'm not sure how it would get those other errors in that case...
[23:27] <jefferai> dmick: I see the error but do not yet understand it
[23:27] <jefferai> looking...
[23:28] <jskinner> ok, so editing the keyring for read access to cinder allows us to run the command from cli as cinder user. However, creating volume from horizon is still no go.
[23:29] <joshd> what's the cinder-volume log say?
[23:29] <jskinner> "TRACE cinder.openstack.common.rpc.amqp Stderr: 'create error: (1) Operation not permitted\n2012-11-07 16:27:46.730493 7f071a286780 -1 librbd: error adding img to directory: (1) Operation not permitted\n'"
[23:30] <joshd> what are the rados client's permissions?
[23:31] <jskinner> from ceph auth?
[23:31] <joshd> yeah, like in 'ceph auth list'
[23:32] <jskinner> yeah ok - well we don't have a rados user - we have client.radosgw.gateway, client.admin, client.images, and client.volumes
[23:32] <jskinner> client.volumes is the user we have in place for cinder
[23:33] <dmick> jefferai: can stare at the strace with you if you like
[23:33] <jefferai> dmick: issue is that the IP address is in use
[23:33] <jskinner> client.volumes
[23:33] <jskinner> key: AQBWL5lQgJ7vFBAAMAPUJxV/za66KOHdTx6bLg==
[23:33] <jskinner> caps: [mon] allow r
[23:33] <jefferai> trying to figure out WTF is using it
[23:33] <jskinner> caps: [osd] allow rwx pool=volumes, allow rx pool=images
[23:33] <dmick> the port?
[23:34] <jefferai> bind(6, {sa_family=AF_INET, sin_port=htons(6800), sin_addr=inet_addr("192.168.37.201")}, 16) = -1 EADDRINUSE (Address already in use)
[23:34] <jefferai> but there is *nothing* using it
[23:34] <jefferai> UDP or TCP
[23:34] <jefferai> err, except NTP on port 123
[23:34] <joshd> jskinner: that's correct. maybe cinder-volume needs to be restarted since you disabled apparmor?
[23:34] <dmick> lsof -i :6800, maybe?
[23:34] <jskinner> we have rebooted since then.
[23:35] <jefferai> dmick: nothing
[23:35] <jskinner> in the cinder.conf for rbd user, does it need to be volumes, or client.volumes
[23:35] <joshd> just volumes
[23:35] <jskinner> ok, thats good then.
[23:35] <jskinner> yep, really bashing my head against this one.
[23:35] <jefferai> dmick: I did run mkcephfs on that box
[23:36] <jefferai> maybe it started something that didn't shut down cleanly?
[23:36] <jefferai> IOW try rebooting? :-)
[23:36] <dmick> jskinner: you could try zsh'ing your head
[23:36] <dmick> jefferai: the only way I can imagine bind failing is if a proc has that port open
[23:36] <joshd> jskinner: did you set CEPH_ARGS="--id volumes" for cinder-volume?
[23:36] <jefferai> dmick: right, me too, but there is *nothing*
[23:36] <jefferai> lsof says nothing, netstat -tln says nothing
[23:36] <dmick> bouncing certainly can't hurt
[23:37] <jefferai> yeah, will bounce
[23:37] <jskinner> yes
[23:38] <joshd> so the create still works as the cinder user, but not when run from the cinder-volume service
[23:38] <jskinner> correct
[23:38] <joshd> does 'rbd create --id volumes --pool volumes --size 1024 volume-3c295f4b-f527-4ab3-8429-6ff624fbe1f2' work as the cinder user?
[23:39] <joshd> or with a different name, that's just from above
[23:39] <jskinner> no it does not
[23:39] <jskinner> I get
[23:39] <jskinner> 2012-11-07 16:39:12.773543 7faa7090e780 -1 librbd: rbd image volume-3c295f4b-f527-4ab3-8429-6ff624fbe1f2 already exists
[23:39] <joshd> right, what about when you change the name?
[23:40] <joshd> i.e. 'rbd create --id volumes --pool volumes --size 1024 test'
[23:40] <jskinner> cinder@dev-node1:/etc/ceph$ rbd create --id volumes --pool volumes --size 1024 volume-test-volume
[23:40] <jskinner> create error: (1) Operation not permitted
[23:40] <jskinner> 2012-11-07 16:40:22.381386 7ff115b26780 -1 librbd: error adding img to directory: (1) Operation not permitted
[23:42] <joshd> well that means it's a problem specific to client.volumes, which suggests that the caps may not be being interpreted properly
[23:42] <jskinner> its failing when specifying "--id volumes"
[23:42] <joshd> right, and that's what makes it use client.volumes instead of client.admin
[23:43] <jskinner> oh I see - the id is specifying the volumes user.
[23:43] <jefferai> dmick: bouncing didn't help
[23:43] <jskinner> still learning ceph lol
[23:43] <jefferai> this is awful, what on earth is going on
[23:43] <dmick> hmm
[23:43] <jefferai> full strace;
[23:43] <jefferai> http://paste.kde.org/599648/
[23:44] <dmick> wait wtf
[23:44] <joshd> jskinner: that's what the CEPH_ARGS env variable is doing for cinder-volume too - it's the same as adding $CEPH_ARGS to each rbd command it runs
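In other words, something like this in cinder-volume's environment (where exactly it goes depends on the init system in use):

    export CEPH_ARGS="--id volumes"    # makes rbd/librbd authenticate as client.volumes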
[23:44] <dmick> bind succeeds on 192.168.88.201:6800
[23:44] <jefferai> ceph.conf: http://paste.kde.org/599654/
[23:44] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[23:45] <dmick> succeeds on 192.168.37.201:6800
[23:45] <dmick> and then fails the second time
[23:45] <jskinner> ok
[23:45] <jefferai> yes
[23:45] <jefferai> dmick: see if my conf makes sense
[23:45] <jefferai> I think it should, but, maybe I'm doing something stupid
[23:45] <jskinner> possible the key is wrong for that user
[23:45] <jefferai> maybe, for instance, I need to be putting explicit ports in
[23:46] <jefferai> although it works on -storage-2 and -storage-4
[23:46] <joshd> jskinner: if the key were wrong, it wouldn't be able to connect at all. since it's failing when trying to actually create the image, it seems like a caps issue
[23:47] <joshd> i.e. 'caps: [osd] allow rwx pool=volumes, allow rx pool=images' isn't doing what it should, or there's some small error there
[23:47] <dmick> I would *think*, without checking the code, that the osds should be trying 6800, 6801, 6802 etc. as they fail
[23:48] <jefferai> dmick: does that config file make sense?
[23:48] <dmick> checking; not sure how * addr works in the osd section
[23:49] <jefferai> it's: http://ceph.com/docs/master/config-cluster/ceph-conf/#networks
[23:49] <dmick> I think it's unnecessary to do that once you've specified the network above; the daemons will search for a public/private address on the interface
[23:49] <jefferai> that's what I was saying to you yesterday I think is mistaken
[23:49] <jefferai> ah
[23:49] <dmick> (and yes I know, but I don't have experience with actually setting them up)
[23:49] <joshd> jskinner: I just reproduced the problem using those caps
[23:49] <dmick> but it may be that *if* you specify an address, you also need to specify a port
[23:49] <jefferai> I can try taking those bits out
[23:49] <jefferai> hm
[23:49] <jefferai> will try
[23:50] <dmick> Accepter::bind, if you're following along at home
[23:50] <jskinner> ok cool
[23:51] <jefferai> dmick: nope
[23:51] <jefferai> dmick: unless that's not the problem
[23:51] <jefferai> maybe the problem is the
[23:51] <jefferai> futex(0x7fb604bcf9d0, FUTEX_WAIT, 5402, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[23:51] <jefferai> that is actually killing it
[23:51] <jskinner> do I need to specify a keyring under the osd portion of the ceph.conf?
[23:52] <dmick> yeah, wait
[23:52] <joshd> jskinner: if you replace the , with a ; (so it's 'allow rwx pool=volumes; allow rx pool=images') it works
[23:52] <dmick> it fails bind, but then binds to 6801 and works, which is what I'd expect
[23:52] <dmick> so that's not a failure, yes
[23:52] <jefferai> right
[23:52] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[23:53] <dmick> in fact after that I don't see a failure. hm.
[23:53] <jefferai> dmick: the next time I ran it -- no EAGAIN on the futex
[23:53] <jefferai> any debugging you can suggest that I add?
[23:53] <dmick> http://paste.kde.org/599648/ shows no failure; I was just looking at the bind because you pointed at it, but .. I don't see any problem?...
[23:53] <dmick> that ran exited?
[23:53] <dmick> *that run
[23:53] <jefferai> yep
[23:53] <dmick> oh oh
[23:53] <dmick> clone()
[23:53] <dmick> you need -f
[23:53] <dmick> to see the daemon
[23:54] <dmick> strace -f to follow children. Sorry
[23:54] <jskinner> aha, which command do I use to edit that
[23:54] <joshd> jskinner: the osds don't need a keyring specified - they keep it in their data dir
[23:54] <jefferai> ah, heh
[23:54] <dmick> then look up from the bottom for EPERM
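Roughly what that looks like (osd id and output path are examples; strace -f follows forks, and the trailing -f keeps ceph-osd in the foreground):

    strace -f -o /tmp/osd.100.strace ceph-osd -i 100 -c /etc/ceph/ceph.conf -f
    grep -n EPERM /tmp/osd.100.strace   # then read upward from the last match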
[23:54] <joshd> jskinner: no editing, you can delete and re-add client.volumes
[23:54] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Remote host closed the connection)
[23:54] <jskinner> ok
[23:54] <jskinner> what about the key?
[23:55] <jefferai> dmick: http://paste.kde.org/599666/
[23:55] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[23:55] <jefferai> no EPERM
[23:56] <joshd> jskinner: it will change
[23:56] <joshd> jskinner: editing permissions without changing keys would make revoking much more complicated
[23:56] <jskinner> so just update the keyring with the new key
[23:56] <joshd> yeah
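A sketch of the delete-and-recreate step with the corrected cap string (this generates a new key, so the keyring and the libvirt secret both have to be updated afterwards):

    ceph auth del client.volumes
    ceph auth get-or-create client.volumes mon 'allow r' \
        osd 'allow rwx pool=volumes; allow rx pool=images' \
        -o /etc/ceph/ceph.client.volumes.keyring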
[23:58] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[23:59] <jskinner> ok so i will have to just regenerate a secret for virsh and define the new key
[23:59] <joshd> you can just re-run virsh secret-set-value
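For example (the secret uuid here is a placeholder for whatever was used when the libvirt secret was first defined):

    virsh secret-set-value --secret <uuid> --base64 "$(ceph auth get-key client.volumes)"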

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.