#ceph IRC Log


IRC Log for 2012-10-31

Timestamps are in GMT/BST.

[0:12] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Remote host closed the connection)
[0:14] * BManojlovic (~steki@212.200.240.142) Quit (Quit: Ja odoh a vi sta 'ocete...)
[0:20] * gucki (~smuxi@46-127-158-51.dynamic.hispeed.ch) has joined #ceph
[0:20] <gucki> hi again
[0:20] <gucki> now i got my first ceph cluster failure
[0:20] <gucki> one osd was too busy and thus got marked as down and out
[0:21] <gucki> it was never really down...
[0:21] <gucki> but now everything is really slow because rebalancing happens
[0:21] <gucki> how can i get that osd back online asap? :)
[0:21] <gucki> I did "ceph osd in 0" and now it's marked as in, but still down
[0:21] <sagewk> need to restart the daemon
[0:22] <gucki> cep osd up 0 does not seem to exist
[0:22] <gucki> can i just do "killall ceph" and then "ceph start"?
[0:22] <sagewk> right. the daemon has to be running to be 'up'
[0:22] <sagewk> kill that particular ceph-osd process, then start it (service ceph start osd.0)
[0:22] <sagewk> killall if it's the only ceph-osd on the host
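A minimal sketch of the restart sequence sagewk describes, assuming the stock sysvinit script and a host that only runs osd.0 (adjust the id and invocation to your setup):

    killall ceph-osd            # or kill just the stuck ceph-osd pid if several run on the host
    service ceph start osd.0    # start that one daemon again; it will report itself 'up'
    ceph osd in 0               # only needed if it was also marked 'out' in the meantime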
[0:24] * adjohn (~adjohn@69.170.166.146) has joined #ceph
[0:24] <gucki> ok thanks, i'll try :)
[0:24] <sagewk> elder: there?
[0:25] <gucki> killing takes some time, as the osd process is in D state...
[0:25] <joshd> gucki: check dmesg for fs errors (esp. if using btrfs)
[0:25] <gucki> joshd: everything clean, i'm using ext4 for now
[0:26] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:4908:4757:6376:c5a2) has joined #ceph
[0:27] <gucki> joshd: but in syslog i can see this http://pastie.org/5140124
[0:28] * jlogan (~Thunderbi@2600:c00:3010:1:3880:bbab:af7:6407) Quit (Ping timeout: 480 seconds)
[0:29] <joshd> that's expected when there's tons of dirty data to write out
[0:29] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[0:30] <gucki> joshd: ok, so i just have to wait i guess...? :(
[0:33] <joshd> yeah, that's the trouble with stress testing writes
[0:33] <gucki> joshd: mh, the strange thing is dstat shows no disk io at all...
[0:34] <gucki> http://pastie.org/5140147
[0:34] <joshd> what kernel are you running?
[0:34] <gucki> 2.6.32-pve (proxmox)
[0:35] <joshd> ah, that's pretty old
[0:35] <joshd> I wouldn't be too surprised if you hit an ext4 bug
[0:35] <gucki> that sounds really bad :(
[0:35] <gucki> i've important data on that ceph cluster... :(
[0:36] <gucki> so you think i need to reboot or what can i do now?
[0:36] <joshd> it's just that one osd though, right?
[0:36] <gucki> yes, one of four
[0:36] <gucki> http://pastie.org/5140166
[0:37] <gucki> the degraded percentage slowly goes down...
[0:38] <joshd> yeah, with a small cluster things go a lot slower
[0:38] <joshd> things still operate while degraded though, it just means you have less redundancy
[0:39] <gucki> joshd: so do you think it'll recover or a reboot is needed? because io is still 0 on that node but one core is 100% iowait...
[0:40] <joshd> sounds like it needs a reboot
[0:53] * tnt (~tnt@20.35-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[0:54] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[0:57] * chutzpah (~chutz@199.21.234.7) Quit (Quit: Leaving)
[0:59] * synapsr (~Adium@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[0:59] * miroslav (~miroslav@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[0:59] * synapsr (~Adium@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[0:59] * miroslav1 (~miroslav@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[1:00] * glowell2 (~glowell@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[1:00] * glowell1 (~glowell@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[1:06] * synapsr (~Adium@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[1:07] * deepsa_ (~deepsa@122.172.6.178) has joined #ceph
[1:08] * synapsr (~Adium@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[1:09] * deepsa (~deepsa@122.172.20.38) Quit (Ping timeout: 480 seconds)
[1:09] * deepsa_ is now known as deepsa
[1:13] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[1:13] * Yann__ (~Yann@did75-15-88-160-187-237.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[1:15] * Ryan_Lane (~Adium@207.sub-70-197-12.myvzw.com) has joined #ceph
[1:19] * miroslav (~miroslav@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[1:19] * synapsr (~Adium@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[1:24] * sagelap (~sage@2607:f298:a:607:74aa:5ec5:b8d:3870) Quit (Ping timeout: 480 seconds)
[1:29] * glowell1 (~glowell@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[1:30] * houkouonchi-work (~linux@12.248.40.138) Quit (Read error: Connection reset by peer)
[1:31] * sagelap (~sage@75.sub-70-197-142.myvzw.com) has joined #ceph
[1:32] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[1:44] <gucki> joshd: what's the best way to shutdown the whole cluster? shutdown the mons first and then all osds?
[1:45] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:45] <dmick> gucki: joshd is gone, but I've seen "killall" recommended a lot, so I doubt the order matters
[1:45] <gucki> dmick: but when i shutdown an osd, wont ceph start a rebalance?
[1:45] <dmick> if you wait the timeout period, which is tens of seconds, to kill the rest
[1:46] <dmick> if you do it fast enough there may not even be anyone around to notice
[1:46] <gucki> dmick: mh, but there's no "clean, real" way to do it? :)
[1:46] <dmick> well there's always init.d/upstart
[1:47] <gucki> mh, guess i'll do "ceph stop -a"
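For reference, a sketch of the init-script route dmick mentions; with the sysvinit script of that era the -a flag acts on every host listed in ceph.conf (the exact invocation may differ per distro):

    service ceph -a stop     # stop all mons, osds and mds defined in ceph.conf
    # or, per host:
    /etc/init.d/ceph stop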
[1:47] * Ryan_Lane (~Adium@207.sub-70-197-12.myvzw.com) Quit (Quit: Leaving.)
[2:01] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[2:06] * jjgalvez1 (~jjgalvez@12.248.40.138) Quit (Quit: Leaving.)
[2:07] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[2:07] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[2:16] * Cube1 (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[2:16] * dmick (~dmick@2607:f298:a:607:2c2d:7a5b:d40b:e703) has left #ceph
[2:20] * houkouonchi-work (~linux@12.248.40.138) Quit (Remote host closed the connection)
[2:24] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) Quit (Quit: Leaving.)
[2:38] * MikeMcClurg (~mike@91.224.174.71) Quit (Ping timeout: 480 seconds)
[2:45] * adjohn (~adjohn@69.170.166.146) Quit (Quit: adjohn)
[2:57] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:4908:4757:6376:c5a2) Quit (Quit: LarsFronius)
[3:04] * Cube1 (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[3:05] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[3:06] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[3:36] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[3:56] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[4:20] <gucki> after my osd failure i followed the instructions here http://ceph.com/wiki/Replacing_a_failed_disk/OSD
[4:20] <gucki> now the osd is marked again "up" but still "out". do i have to do anything to put it back in or will this happen automatically within a few minutes?
[4:23] <gucki> ok, looking here http://ceph.com/docs/master/cluster-ops/add-or-rm-osds/ it seems i have to manually add it...
[4:24] <mikeryan> gucki: it's after business hours, so dmick probably won't be around, but i can help you
[4:25] <mikeryan> i assume you've made it as far as this step: http://ceph.com/docs/master/cluster-ops/add-or-rm-osds/#starting-the-osd
[4:25] <mikeryan> "Starting the OSD"
[4:25] <mikeryan> so as you can see, the next step is to bring it into the cluster again, using ceph osd in $num
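Sketched as commands, using osd 0 from earlier in the log as the example id:

    ceph osd in 0     # mark the replaced osd back 'in' so data moves onto it
    ceph osd tree     # confirm it now shows up and in
    ceph -w           # watch recovery until the cluster reports HEALTH_OK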
[4:25] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[4:30] <gucki> mikeryan: yes, it's now back in and rebalancing. seems like the different docs (and wiki) are not fully in sync...
[4:31] <mikeryan> yes, that is a problem we've had
[4:31] <mikeryan> we're working on it, sorry you ran into it :(
[4:33] <gucki> mikeryan: no problem, better to have too much doc (even if not 100% consistent) than too few :(
[4:34] <gucki> mikeryan: is it normal that "rbd rm ..." is taking ages before anything happens? "rbd info ..." works instantly
[4:34] <gucki> mikeryan: the image is 100gb if that matters...
[4:34] <mikeryan> unfortunately i'm not too familiar with rbd
[4:34] <mikeryan> it sounds surprising that a delete would be expensive
[4:35] <gucki> can i somehow check an image for consistency? something like fsck..?
[4:35] <mikeryan> i don't think that we have that, but i'm not sure
[4:36] <mikeryan> the OSDs themselves are constantly being checked for accuracy, a process known as scrub
[4:36] <gucki> ah ok, yeah i see that in the logs
[4:37] <gucki> probably it's so slow because rebalancing is working now too
[4:39] <mikeryan> you can check your PGs using ceph pg stat
[4:39] <mikeryan> if there are any that are not active+clean they could be recovering
[4:43] <gucki> mikeryan: yes, "ceph health" shows recovery
[4:43] <mikeryan> until everyone's clean, performance will of course be degraded
[4:46] <mikeryan> until everyone's clean, performance will of course be degraded
[4:46] <mikeryan> whoops, sorry
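A short list of the status checks mentioned here, for watching recovery:

    ceph health      # reports recovery / degraded state
    ceph pg stat     # PG summary; anything not active+clean may still be recovering
    ceph -s          # overall cluster status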
[4:48] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[4:58] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[5:00] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[5:02] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[5:03] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[5:15] * sagelap (~sage@75.sub-70-197-142.myvzw.com) Quit (Ping timeout: 480 seconds)
[5:15] * glowell1 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[5:24] * deepsa_ (~deepsa@122.172.174.182) has joined #ceph
[5:25] * deepsa_ (~deepsa@122.172.174.182) Quit ()
[5:25] * deepsa (~deepsa@122.172.6.178) Quit (Remote host closed the connection)
[5:26] * deepsa (~deepsa@122.172.174.182) has joined #ceph
[5:31] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[5:31] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[5:31] * deepsa_ (~deepsa@122.172.37.153) has joined #ceph
[5:33] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[5:33] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[5:34] * deepsa (~deepsa@122.172.174.182) Quit (Ping timeout: 480 seconds)
[5:34] * deepsa_ is now known as deepsa
[5:59] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[6:00] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[6:01] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Remote host closed the connection)
[6:07] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[6:26] * synapsr (~Adium@c-69-181-244-219.hsd1.ca.comcast.net) has joined #ceph
[6:32] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[6:48] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[6:48] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[6:51] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[6:51] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[6:51] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[7:02] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[7:15] * spaceman139642 (l@89.184.139.88) Quit (Ping timeout: 480 seconds)
[7:21] * votz (~votz@c-76-21-6-55.hsd1.ca.comcast.net) has joined #ceph
[7:22] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[7:22] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[7:39] * iltisanni (d4d3c928@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[7:40] * mib_xpjvs2 (d4d3c928@ircip1.mibbit.com) has joined #ceph
[8:09] * deepsa_ (~deepsa@122.172.12.246) has joined #ceph
[8:09] * deepsa (~deepsa@122.172.37.153) Quit (Ping timeout: 480 seconds)
[8:09] * deepsa_ is now known as deepsa
[8:09] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[8:11] <mib_xpjvs2> hey guys.. what do I have to do, to let my client use rbd to write objects into the cluster?
[8:11] <mib_xpjvs2> ? why do I have this name? sec
[8:11] * mib_xpjvs2 (d4d3c928@ircip1.mibbit.com) has left #ceph
[8:12] * mib_xpjvs2 (d4d3c928@ircip1.mibbit.com) has joined #ceph
[8:12] * benpol (~benp@garage.reed.edu) Quit (Ping timeout: 480 seconds)
[8:13] * rosco (~r.nap@188.205.52.204) Quit (Quit: *Poof*)
[8:13] * mib_xpjvs2 (d4d3c928@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[8:13] * rosco (~r.nap@188.205.52.204) has joined #ceph
[8:14] * tnt (~tnt@20.35-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:14] * iltisanni (d4d3c928@ircip3.mibbit.com) has joined #ceph
[8:15] <iltisanni> OK. so again. What do I have to do, to let a client use rbd to write and get objects from the ceph cluster?
[8:16] * deepsa_ (~deepsa@101.63.197.82) has joined #ceph
[8:17] * deepsa (~deepsa@122.172.12.246) Quit (Ping timeout: 480 seconds)
[8:17] * deepsa_ is now known as deepsa
[8:17] <synapsr> iltisanni I am not technical but work at Ceph, will be interesting to see the answer
[8:17] <iltisanni> I first installed a cluster of 3 VMs (all have osd and mon daemons running) and set up a new pool. Then I installed ceph on a fourth VM (the client) and didn't put that into the cluster but wrote the same ceph.conf to it as it is in the cluster (so that the client finds the mons). Was that all correct? What to do now?
[8:18] <iltisanni> The client should save its data into the new pool then... that was my idea :-)
[8:18] <iltisanni> using rbd commands (if possible.. i hope so)
[8:19] * benpol (~benp@garage.reed.edu) has joined #ceph
[8:23] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) has joined #ceph
[8:32] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[8:33] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[8:34] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Remote host closed the connection)
[8:41] * votz (~votz@c-76-21-6-55.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[8:42] * votz (~votz@c-76-21-6-55.hsd1.ca.comcast.net) has joined #ceph
[8:44] * votz (~votz@c-76-21-6-55.hsd1.ca.comcast.net) Quit ()
[9:02] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[9:08] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[9:10] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit ()
[9:19] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[9:20] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[9:21] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[9:24] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:25] * deepsa_ (~deepsa@122.166.163.180) has joined #ceph
[9:26] * deepsa (~deepsa@101.63.197.82) Quit (Ping timeout: 480 seconds)
[9:26] * deepsa_ is now known as deepsa
[9:28] * aegislabs (7661b47c@ircip2.mibbit.com) has joined #ceph
[9:29] <aegislabs> hii guys
[9:30] <aegislabs> currently, I have issue with radosgw
[9:33] * deepsa (~deepsa@122.166.163.180) Quit (Remote host closed the connection)
[9:35] * deepsa (~deepsa@122.172.27.130) has joined #ceph
[9:35] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:38] * deepsa_ (~deepsa@101.62.45.161) has joined #ceph
[9:40] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[9:43] * deepsa (~deepsa@122.172.27.130) Quit (Ping timeout: 480 seconds)
[9:43] * deepsa_ is now known as deepsa
[9:45] * Leseb (~Leseb@193.172.124.196) has joined #ceph
[9:46] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[9:47] * tnt (~tnt@20.35-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:50] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[9:50] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[9:58] <aegislabs> i can't authenticate radosgw with s3 API
[10:00] <aegislabs> I use Amazon S3 Authentication Tool for Curl http://aws.amazon.com/code/128
[10:02] <aegislabs> this is verbose message from curl http://mibpaste.com/LylzfU
[10:14] <aegislabs> can someone help me..?
[10:15] <iltisanni> seems like I'm the only one who reads ur message and I can't help you. Sorry :-(
[10:15] <iltisanni> timezones...
[10:19] * deepsa_ (~deepsa@115.184.126.88) has joined #ceph
[10:20] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[10:20] * deepsa (~deepsa@101.62.45.161) Quit (Ping timeout: 480 seconds)
[10:20] * deepsa_ is now known as deepsa
[10:21] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[10:24] <iltisanni> I repeat my question from 8:15
[10:24] <iltisanni> OK. so again. What do I have to do, to let a client use rbd to write and get objects from the ceph cluster?
[10:24] <iltisanni> I first installed a cluster of 3 VMs (all have osd and mon daemons running) and set up a new pool. Then I installed ceph on a fourth VM (the client) and didn't put that into the cluster but wrote the same ceph.conf to it as it is in the cluster (so that the client finds the mons). Was that all correct? What to do now?
[10:24] <iltisanni> The client should save its data into the new pool then... that was my idea :-)
[10:24] <iltisanni> using rbd commands (if possible.. i hope so)
[10:26] <tnt> if you have cephx enabled, the client will need a key to access the cluster
[10:26] * loicd (~loic@178.20.50.225) has joined #ceph
[10:28] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[10:29] <iltisanni> that was the answer to aegislabs question ?
[10:31] <tnt> no, to yours
[10:32] <iltisanni> OK . cephx is not enabled I think
[10:33] <iltisanni> it's this entry in ceph.conf, isn't it? auth supported = none
[10:33] <tnt> yes
[10:33] <tnt> then rbd should just work ...
[10:34] <tnt> if you do ceph -s on the 4th vm, does it say something ?
[10:35] <iltisanni> it says HEALTH_OK and gives me some stats about the 3 mons and the osds
[10:35] <iltisanni> so i think its finding the other ones
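For illustration, the client-side pieces discussed above might look roughly like this with cephx disabled; the section names follow the usual old-style ceph.conf layout and the mon name/address are placeholders:

    # /etc/ceph/ceph.conf on the client (copied from the cluster nodes)
    [global]
        auth supported = none
    [mon.a]
        host = mon-host-1
        mon addr = 192.168.0.1:6789

    # sanity check from the client
    ceph -s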
[10:35] <iltisanni> how can i write an object now
[10:36] <iltisanni> I have a pool named testauf2... there it should be placed
[10:36] <tnt> rbd doesn't store objects
[10:36] <tnt> rbd stores disks image
[10:36] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[10:37] <iltisanni> ok so no chance to write any data with rbd?
[10:37] <iltisanni> write data from client to cluster
[10:38] <tnt> well you can create a disk image and mount it ... but you need to make sure only 1 client mounts it.
[10:40] <iltisanni> ok
[10:40] <iltisanni> so 1 image for each client?
[10:40] <iltisanni> one or more images for each client
[10:40] <tnt> ... depends on what you want to do ...
[10:41] <iltisanni> lets say I have 4 clients that want to save some data on a cluster (just to store it there and not locally)
[10:41] <iltisanni> then I would create 4 images
[10:42] <iltisanni> and mount those images. 1 at each client
[10:42] <tnt> yes ... but depending on what kind of 'some data' it is and how they're accessed, rbd might not be the best option
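A sketch of the one-image-per-client setup tnt describes, using the pool name from this conversation (testauf2); the image name and size are made-up examples:

    rbd create client1-disk --size 10240 --pool testauf2   # 10 GB image
    # on the one client that will own this image:
    modprobe rbd
    rbd map client1-disk --pool testauf2                   # shows up as e.g. /dev/rbd0
    mkfs.ext4 /dev/rbd0
    mount /dev/rbd0 /mnt/client1

Each image should only ever be mapped and mounted by a single client at a time, as tnt warns below.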
[10:43] <iltisanni> OK what other options are there? I heard cephfs is not ready for production right now
[10:43] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[10:43] <tnt> RADOS direct access or RADOS-GW (S3-like service)
[10:44] <tnt> There was a post about all those on the ceph blog not that long ago IIRC
[10:44] <iltisanni> I need this all for a mailsystem
[10:45] <iltisanni> mails are saved and have to be restored if needed
[10:45] <iltisanni> nothing more
[10:46] <iltisanni> whats the idea of RADOS direct access?. The Client has to be in the cluster?
[10:48] <iltisanni> and... for what reason is rbd used if you can not write and get data from the cluster? I dont get the function of rbd then...
[10:49] * Yann__ (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[10:51] <tnt> cephfs rbd rados rados-gw : they _all_ allow you to store and retrieve data from the cluster in one way or another. But they all have different methods and usage models and pros/cons depending on your application ...
[10:52] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[10:52] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[10:53] <iltisanni> ok... what method would you prefer to just write and get data to/from the cluster? What is the easiest way? :-)
[10:53] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[10:53] * votz (~votz@c-76-21-6-55.hsd1.ca.comcast.net) has joined #ceph
[10:54] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[10:54] <iltisanni> and with all of those methods ceph will manage the distribution of the data through pool -> crushmap -> pg -> osds
[10:54] <iltisanni> so that its fault-tolerant
[10:54] <tnt> rbd is meant for disk images, so essentially it appears as a block device on the client. Now the pro is that you can use it like any block device: format it, put a fs on it and use it. The con is that you can't really share data with that since it can only be mounted at one place at a time (and if you don't have external locking, you actually could mount it at several places at once but then you'd most likely hose the FS on it and lose your data). Also giv
[10:57] * Yann__ (~Yann@did75-15-88-160-187-237.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[10:59] * benner (~benner@193.200.124.63) Quit (Read error: Connection reset by peer)
[10:59] * benner (~benner@193.200.124.63) has joined #ceph
[11:00] <iltisanni> and another pro is that you can separate data from one client and the other easily because they use different block devices, right?
[11:01] <iltisanni> so data isn't all thrown together in the cluster
[11:06] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[11:06] <tnt> well, most allow some level of separation ...
[11:06] <tnt> rbd has buckets
[11:07] <tnt> rados has pools
[11:07] <tnt> cephfs ... well, you can mount subdirectories of the main fs rather than the root.
[11:07] <iltisanni> I like those pools because I understood how to manage that :-)
[11:08] <tnt> right but if they have root and access to the cluster they can just mount anyone's rbd image ...
[11:08] <iltisanni> y I understood this.....and that could cause data loss
[11:08] <iltisanni> if you are not careful
[11:09] <todin> morning
[11:09] <iltisanni> so... If I want to use RADOS direct access... what to do then... I already created a new pool for the 4th VM (the client) and want to write in that
[11:11] * gucki_ (~smuxi@46-127-158-51.dynamic.hispeed.ch) has joined #ceph
[11:13] <iltisanni> and another short question... I can stop the mds daemon when I'm not using cephfs but RADOS direct access or rbd, right?
[11:13] <tnt> yes. mds is only for cephfs
[11:13] <tnt> RADOS is an API, you actually need to code stuff ...
[11:14] <iltisanni> Oh... thats nasty
[11:14] <iltisanni> I dont want to code something
[11:14] <iltisanni> :-)
[11:14] <iltisanni> hm... ok so no direct acces and no cephfs
[11:14] <tnt> then you're pretty much stuck with cephfs or rbd
[11:15] <iltisanni> ok
[11:15] <iltisanni> and cephfs is not working properly
[11:15] <iltisanni> i heard
[11:15] <tnt> radosgw is a S3 like REST API ... there are command line tools but that's not really meant for heavy use.
[11:15] <iltisanni> so I will test rbd
[11:15] <tnt> cephfs is not production-ready AFAIK ... but people are still using it.
[11:15] <tnt> it's just not as tested
[11:15] <iltisanni> ok
[11:15] <tnt> you can live on the edge :p
[11:16] <iltisanni> not really :-)
[11:16] <iltisanni> what's the difference between rados gw access and direct access... just the way the client builds up the connection to the cluster
[11:16] <iltisanni> ?
[11:17] <tnt> rados-gw is S3 compatible HTTP access.
[11:18] <tnt> it's also more restricted than RADOS
[11:18] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:18] <tnt> RADOS is the really low level layer on which RBD / RADOSGW and CephFS are built
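Aside from writing code against librados, the rados command-line tool can put and get objects in a pool directly, which may already be enough for quick experiments (object and file names below are examples):

    rados -p testauf2 put mail-0001 ./mail-0001.eml     # store a file as an object
    rados -p testauf2 ls                                # list objects in the pool
    rados -p testauf2 get mail-0001 /tmp/mail-0001.eml  # fetch it back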
[11:19] <iltisanni> kk Thx for the explanation ;-)
[11:19] <iltisanni> so I'm stuck at RBD
[11:19] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit ()
[11:19] <iltisanni> now I just need to know how to use those buckets for fault tolerance and how to mount the images.. All right. I'll try to find something
[11:21] <aegislabs> Hi, I still have trouble while access radosgw with S3 API
[11:21] <iltisanni> Thank you very much tnt. You really helped me a lot
[11:22] <aegislabs> I use Amazon S3 Authentication Tool for Curl for communicating with rados
[11:23] <aegislabs> but still can't authenticate when I put an object or create a bucket
[11:24] <iltisanni> I'm pretty lucky I won't use RADOS-GW ;-).
[11:24] <iltisanni> sorry
[11:24] <aegislabs> huhu
[11:24] <tontsa> aegislabs, are you sure your radosgw is actually listening on port 80, or that it's properly configured in apache?
[11:24] <aegislabs> yup
[11:25] <tontsa> as that 403 forbidden comes straight from apache itself.. so check your htaccess / <Limit> statements
[11:26] <tontsa> oh actually scratch that. it's from proxy
[11:26] <aegislabs> can you explain more specific?
[11:28] <aegislabs> FYI, there is log from radosgw http://mibpaste.com/85tB6h
[11:32] <aegislabs> btw, I didn't use .htaccess files
[11:32] * votz (~votz@c-76-21-6-55.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[11:34] <tontsa> yeah seems your auth tokens don't match for some reason
[11:38] * gucki (~smuxi@46-127-158-51.dynamic.hispeed.ch) Quit (Remote host closed the connection)
[11:38] * gucki_ (~smuxi@46-127-158-51.dynamic.hispeed.ch) Quit (Remote host closed the connection)
[11:38] * gucki (~smuxi@46-127-158-51.dynamic.hispeed.ch) has joined #ceph
[11:42] <aegislabs> yup, I also thought it was the auth token didn't match...
[11:42] <aegislabs> is there any difference between the s3 auth and the radosgw auth?
[11:50] <gucki> hi there
[11:50] <gucki> is there a way to put an osd in maintenance mode? so that i can stop/restart the osd without having the cluster performing a rebalance...
[11:59] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:59] <joao> gucki, not that I know of
[12:01] <joao> some alternative that I saw going around in the channel is increasing the timeout value that will trigger a rebalance after an osd is down, and perform the said maintenance within that window
[12:01] <joao> but that never seemed like the greatest idea to me
[12:04] * MikeMcClurg (~mike@91.224.175.20) has joined #ceph
[12:07] * aegislabs (7661b47c@ircip2.mibbit.com) Quit (Ping timeout: 480 seconds)
[12:07] <gucki> joao: ok, i'll probably do that. i think it's much better than having to get a rebalacing triggered when you know you have to take an osd offline for a few minutes
[12:11] <joao> gucki, I'm not sure if 'mon_osd_down_out_interval' is the option that would trigger the rebalance, but if it is, by default it is 300s (5 minutes)
[12:12] <gucki> joao: then it's not
[12:12] <gucki> joao: i did a "ceph osd out 0" which immediately triggered a rebalance :(
[12:12] <gucki> joao: the osd was still "in"...
[12:12] <joao> oh
[12:12] <gucki> lets hope some ceph dev can help me here :)
[12:12] <joao> yeah, I guess that should work like that
[12:13] <joao> in the sense that, after all, you're letting the monitors know that that osd is out
[12:13] * mib_x2d7w9 (7661b47c@ircip2.mibbit.com) has joined #ceph
[12:13] <joao> for some reason I just assumed that you were going to kill the ceph-osd and go about with doing maintenance :p
[12:14] <gucki> joao: no, i'm nice and wanted to do it gracefully :)
[12:14] <gucki> joao: probably pause works?
[12:15] <gucki> lets wait for an answer from a dev :)
[12:15] <joao> well, I am a dev, I'm just not familiar with that kind of thing
[12:19] <joao> afaict, the "pause" command will pause reads and writes on that osd, but I'm not sure what that means, and I highly doubt it would avoid a rebalance
[12:21] * synapsr (~Adium@c-69-181-244-219.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[12:22] <gucki> joao: ah ok, sorry *g*
[12:23] <gucki> I found this link http://www.sebastien-han.fr/blog/2012/08/17/ceph-storage-node-maintenance/ but this is how i did it and it doesn't work.. :(
[12:23] <gucki> i'm using the latest stable version..
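For context on the option joao mentions: it only controls how long a *down* osd is tolerated before the cluster marks it out on its own; explicitly running "ceph osd out N", as done above, marks the osd out immediately and always triggers data movement. A sketch of raising the window via ceph.conf on the monitors (the value is just an example):

    [mon]
        mon osd down out interval = 600   # seconds before a down osd is automatically marked out (default 300)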
[12:24] <joao> sorry, gotta make a call
[12:24] <joao> trying to figure out how to smuggle an umbrella on the flight to amsterdam
[12:25] <tontsa> i had no trouble with umbrella in my cabin luggage
[12:30] <joao> tontsa, what kind of an umbrella was it?
[12:30] <joao> the flight company is telling me that I cannot bring any umbrella with pointy ends
[12:31] <tontsa> joao, it didn't have pointy end. it was like http://www.umbrellaheaven.com/images/city_compact_classic.jpg this one
[12:32] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[12:35] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (Remote host closed the connection)
[12:39] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[12:39] * LarsFronius_ is now known as LarsFronius
[12:40] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[12:41] <Robe> what's the usual strategy to increase performance for "hot" files in ceph?
[12:42] <joao> tontsa, thanks; the airline just told me I must be mad if I want to bring that weapon of mass destruction aboard
[12:43] <Robe> lol
[12:46] * masterpe (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[12:50] <tontsa> joao, going well :)
[12:51] <iltisanni> Hey. Is that Information i have right? Only cephfs has a filesystem structure in background.. RADOS hasn't
[12:52] <iltisanni> because the connection is just between S3 and Client for writing and getting data from/to cluster
[12:53] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Quit: LarsFronius)
[12:53] <joao> not sure what you mean, but cephfs does provide a posix-compliant fs on top of rados, while rados itself (through osds) sits on native fs where the osds keep their data
[12:54] <iltisanni> so if I want to use ceph as a fileserver as well, I had to use cephfs
[12:54] <iltisanni> y thats what i meant.
[12:54] <iltisanni> thx
[12:54] <iltisanni> rbd does also have a fs on top of rados right?
[12:55] <joao> I would assume you could use something that would keep files on rados through rgw, and provide the necessary abstractions to act as a fileserver, but this might also be only wishful thinking ;)
[12:56] <nhmlap> iltisanni: rbd is just the block device. You can put any filesystem on top of it that you want...
[12:56] <joao> oh, yeah, there's that too
[12:56] <iltisanni> y. so I should use rbd or cephfs
[12:56] <nhmlap> iltisanni: Some people are doing that and putting an NFS server in front of it.
[12:56] <iltisanni> ok nice.. thx
[12:56] <iltisanni> but rbd has some negative points as someone here told me some hours ago
[12:57] <iltisanni> he said it's not best practice to use one device image for more than one Client
[12:57] <iltisanni> but thats what i want
[12:58] <iltisanni> I want several clients to connect on one mountable device
[12:58] <iltisanni> so that they can write/get data to/from there at the same time
[12:58] <joao> wouldn't nfs in front of rbd, as nhmlap said, do the trick in that setup?
[12:59] <iltisanni> well yeah.. you mean I could create one big device image and put nfs on top of it
[12:59] <iltisanni> then mount it on several clients
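A rough sketch of the NFS-in-front-of-rbd idea nhmlap and joao suggest: a single gateway host maps the image, puts a filesystem on it and re-exports it over NFS; paths, names and the client network are placeholders:

    # on the one NFS gateway host (image created as in the earlier rbd example)
    rbd map shared-disk --pool testauf2
    mkfs.ext4 /dev/rbd0
    mkdir -p /export/shared && mount /dev/rbd0 /export/shared

    # /etc/exports
    /export/shared 192.168.0.0/24(rw,sync,no_subtree_check)

    exportfs -ra    # reload exports; the other clients then mount it via NFS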
[13:00] * MikeMcClurg (~mike@91.224.175.20) Quit (Ping timeout: 480 seconds)
[13:01] <iltisanni> and.. btw what bugs do you know from cephfs? I heard it's not usable for production... And what do you think how long it will take the developers to fix those issues
[13:03] * loicd (~loic@178.20.50.225) Quit (Ping timeout: 480 seconds)
[13:08] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[13:08] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[13:09] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[13:14] * loicd (~loic@178.20.50.225) has joined #ceph
[13:21] * mib_x2d7w9 (7661b47c@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[13:23] * MoroIta (~MoroIta@62.196.20.28) has joined #ceph
[13:34] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Quit: slang)
[13:35] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[13:35] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Remote host closed the connection)
[13:43] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[13:45] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:47] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[13:53] * bcampbell (~bcampbell@74.115.185.110) has joined #ceph
[13:53] * bcampbell (~bcampbell@74.115.185.110) Quit ()
[13:54] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[14:12] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[14:14] * elder (~elder@c-24-118-242-216.hsd1.mn.comcast.net) Quit (Remote host closed the connection)
[14:15] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[14:15] * long (~chatzilla@118.186.58.35) has joined #ceph
[14:16] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit ()
[14:17] * elder (~elder@c-24-118-242-216.hsd1.mn.comcast.net) has joined #ceph
[14:26] <long> hi,all
[14:27] <long> do you know about rados_cluster_stat and rados_ioctx_pool_stat?
[14:30] <elder> Building ceph/master I now get "/usr/bin/ld: cannot find -lboost_program_options"
[14:31] <elder> I'm switching to ceph/next. I hope that works.
[14:34] * MikeMcClurg (~mike@91.224.175.20) has joined #ceph
[14:34] <elder> The next branch still builds without the boost libraries.
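On Debian/Ubuntu that linker error is often just a missing Boost development package; something along these lines is a plausible fix (package name may vary by release):

    sudo apt-get install libboost-program-options-dev
    ./autogen.sh && ./configure && make   # then rebuild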
[14:35] * MoroIta (~MoroIta@62.196.20.28) Quit (Ping timeout: 480 seconds)
[14:37] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (Quit: ZNC - http://znc.sourceforge.net)
[14:38] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[14:43] * nolan (~nolan@phong.sigbus.net) has joined #ceph
[14:47] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[14:53] * PerlStalker (~PerlStalk@perlstalker-1-pt.tunnel.tserv8.dal1.ipv6.he.net) has joined #ceph
[14:53] * MikeMcClurg (~mike@91.224.175.20) Quit (Ping timeout: 480 seconds)
[14:56] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Quit: Leaving.)
[14:59] * rlr219 (43c87e04@ircip1.mibbit.com) has joined #ceph
[14:59] * synapsr (~Adium@c-69-181-244-219.hsd1.ca.comcast.net) has joined #ceph
[15:01] * MikeMcClurg (~mike@91.224.175.20) has joined #ceph
[15:09] * deepsa_ (~deepsa@122.172.1.71) has joined #ceph
[15:10] * deepsa (~deepsa@115.184.126.88) Quit (Ping timeout: 480 seconds)
[15:10] * deepsa_ is now known as deepsa
[15:10] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[15:37] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) has joined #ceph
[15:45] * vata (~vata@208.88.110.46) has joined #ceph
[15:46] * danieagle (~Daniel@186.214.79.105) has joined #ceph
[15:49] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[15:51] <long> hi
[15:52] <long> it is bad that normal users can not subscribe to the ceph-devel mailing list
[15:56] <elder> I have an ceph-osd process that seems to be spinning, consuming lots of CPU. Does anyone have a suggestion about how to determine what it's doing?
[15:56] <elder> I'm running ceph/next
[15:58] * MikeMcClurg (~mike@91.224.175.20) Quit (Ping timeout: 480 seconds)
[15:58] <nhmlap> elder: strace?
[15:58] * danieagle (~Daniel@186.214.79.105) Quit (Remote host closed the connection)
[15:59] <elder> I've only run strace with a command, from the beginning. Can I use it to capture a stack or something?
[15:59] <nhmlap> elder: maybe sysprof or perf, though annoyingly I haven't been able to get perf to show symbols for ceph-osd.
[15:59] <elder> The process has been running for some time.
[15:59] <elder> But only after a while did it start spinning like this.
[15:59] <nhmlap> elder: you can attach strace to an already running process, though some times when you detach it it can kill the running process.
[16:00] <elder> Cool. sudo cat /proc/17692/stack
[16:00] <nhmlap> nice
[16:03] <elder> Well it's in futex(), but nothing in the ceph tree (directly) calls that.
[16:04] <nhmlap> btrfs?
[16:05] <elder> I don't know, actually. Whatever "start_vstart.sh" uses.
[16:07] <nhmlap> maybe see if sysprof gives you anything useful.
[16:07] <elder> I don't have it. What does it do?
[16:07] <nhmlap> elder: it's a pretty basic profiler, but it at least more or less seems to work.
[16:08] <nhmlap> it's pretty easy to install/use if you have ubuntu.
[16:08] <nhmlap> not sure if it will tell you anything useful or not, but it might be worth trying. perf would give you more if it worked. ;(
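The inspection options discussed here, all attaching to the already-running daemon (the pid is the one from this log; substitute your own):

    sudo cat /proc/17692/stack   # kernel-side stack of the stuck thread
    strace -p 17692              # attach strace to the running process; detach with ctrl-c
    perf top -p 17692            # live profile, if debug symbols are available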
[16:10] * lofejndif (~lsqavnbok@28IAAISJU.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:10] <elder> Hey that's cool.
[16:13] <elder> I don't know what to do with this information, but I think it's told me the general area that must have a problem.
[16:14] <nhmlap> Where does it seem to be happening?
[16:15] <elder> Well, ThreadPool::worker(ThreadPool::WorkThread*) is 94.04%
[16:15] <elder> Under that I get OSD::SnapTrimWQ::_process(PG*) at 33.64%
[16:16] <elder> I had been executing a test that resizes images and takes snapshots of the result.
[16:17] <elder> And the problem occurred after I had removed all of the snapshots, and unmapped the image, and was in the process of removing it.
[16:17] <nhmlap> Sounds like something to tell Sam.
[16:22] * MikeMcClurg (~mike@91.224.174.75) has joined #ceph
[16:26] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[16:29] * jlogan1 (~Thunderbi@2600:c00:3010:1:3880:bbab:af7:6407) has joined #ceph
[16:32] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[16:33] * test (c3dc640b@ircip3.mibbit.com) has joined #ceph
[16:39] * test (c3dc640b@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[16:42] * MoroIta (~MoroIta@62.196.20.28) has joined #ceph
[16:42] * kyannbis (~yann.robi@tui75-3-88-168-236-26.fbx.proxad.net) has joined #ceph
[16:42] <kyannbis> hi
[16:43] * MoroIta (~MoroIta@62.196.20.28) Quit ()
[16:43] <kyannbis> quick question: I created a datacenter, rack, hypervisor, host bucket hierarchy, but osd tree doesn't show my tree
[16:44] <kyannbis> osds are shown without any hierarchy by osd tree
[16:44] <kyannbis> is it normal ?
[16:45] <joao> it happened to me when I built the crushmap the wrong way
[16:45] <joao> have you defined a custom crushmap?
[16:45] <kyannbis> yes
[16:45] <kyannbis> but the crushmap compiled fine
[16:46] <joao> when you compiled it, did you get any 'missing items' (or similar)?
[16:47] <kyannbis> no, no error and no warning
[16:48] <joao> actually, I meant something like this
[16:48] <joao> in rule 'testingdata' item 'testing' not defined
[16:48] <joao> in rule 'testingmetadata' item 'testing' not defined
[16:48] <joao> ok
[16:48] <joao> the thing about the crushtool is that it will return a compiled crushmap even if these things happen
[16:49] <kyannbis> it compiled fine
[16:50] <joao> could you please pastebin the output of ceph osd tree?
[16:50] <joao> so we could take a look?
[16:51] <kyannbis> dumped osdmap tree epoch 51
[16:51] <kyannbis> # id weight type name up/down reweight
[16:51] <kyannbis> -1 0 root default
[16:51] <kyannbis> 0 0 osd.0 up 1
[16:51] <kyannbis> 1 0 osd.1 up 1
[16:55] <joao> have you added those osds to crush?
[16:57] <kyannbis> my bad, we didn't put the datacenter in the root ^
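For reference, the shape of the fix: in the decompiled crushmap the datacenter bucket has to appear as an item of the root bucket, roughly like this (names and weights are placeholders):

    datacenter dc1 {
            id -2
            alg straw
            hash 0
            item rack1 weight 1.000
    }
    root default {
            id -1
            alg straw
            hash 0
            item dc1 weight 1.000   # without this line, ceph osd tree shows a flat list
    }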
[17:00] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[17:02] * drokita (~drokita@199.255.228.10) has joined #ceph
[17:05] <buck> I noticed that vstart.sh doesn't work out of the box because the dev/ directory does not exist. Is this intentional? That the user must create a dev/ directory within src/ to run vstart.sh?
[17:05] <mikeryan> buck: yes
[17:05] <buck> mikeryan: ok. Thanks
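So a fresh checkout needs something like this before vstart.sh will run (flags vary by version; -n creates a new local cluster):

    cd src
    mkdir -p dev      # the directory vstart.sh expects for local mon/osd data
    ./vstart.sh -n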
[17:06] * long (~chatzilla@118.186.58.35) Quit (Quit: ChatZilla 0.9.89 [Firefox 16.0.2/20121024073032])
[17:12] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:14] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[17:18] * sjusthm (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[17:18] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:20] * rlr219 (43c87e04@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[17:23] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[17:23] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[17:27] * MikeMcClurg (~mike@91.224.174.75) Quit (Ping timeout: 480 seconds)
[17:29] * gaveen (~gaveen@111.223.150.90) has joined #ceph
[17:30] * MikeMcClurg (~mike@91.224.174.75) has joined #ceph
[17:37] * scuttlemonkey_ is now known as scuttlemonkey
[17:37] * danieagle (~Daniel@186.214.95.140) has joined #ceph
[17:38] * rlr219 (43c87e04@ircip2.mibbit.com) has joined #ceph
[17:39] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:03] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[18:06] * tnt (~tnt@20.35-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:06] * danieagle (~Daniel@186.214.95.140) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[18:10] * adjohn (~adjohn@69.170.166.146) has joined #ceph
[18:11] * slang (~slang@ace.ops.newdream.net) has joined #ceph
[18:19] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[18:24] * BManojlovic (~steki@212.200.240.142) has joined #ceph
[18:30] * MikeMcClurg1 (~mike@91.224.174.75) has joined #ceph
[18:30] <sjusthm> rlr219: are you there?
[18:30] * MikeMcClurg (~mike@91.224.174.75) Quit (Read error: Connection reset by peer)
[18:31] * eccojoke (~ecco@h94n3-lk-d2.ias.bredband.telia.com) has joined #ceph
[18:31] <eccojoke> mapping the same rbi on two clients is a big no-no right?
[18:32] <eccojoke> rbd*
[18:32] <rweeks> just like mapping the same iSCSI LUN on two clients, yes.
[18:34] <eccojoke> I've been trying to read up on this, but think i got the whole thing wrong...
[18:34] <nhmlap> rweeks: I previously delt with a lustre system that did that when it was supposed to do an HA failover. :)
[18:35] <rweeks> oh lovely.
[18:36] <nhmlap> eccojoke: rbd is block storage that should only be accessed by a single device. CephFS is the filesystem that can be mounted on multiple clients at once.
[18:37] <eccojoke> i realize this is probably not the right forum, but if i can't map the same ISCSI LUN to two clients, how does vmware esx handle just that? or is that the thing, it doesn't handle it?
[18:37] <darkfader> eccojoke: it doesn't use the same block on two clients, at the same time, ever
[18:38] <darkfader> look for persistent group reservation on google, probably better pass wikipedia articles about that :)
[18:38] <eccojoke> ok, so then it should be fine mapping the same rbd to two clients, export them as iscsi lun 0 to two esxi's?
[18:38] <eccojoke> as long as the clients do nothing but export the iscsi?
[18:38] * joey__ (~joey@71-218-31-90.hlrn.qwest.net) has joined #ceph
[18:39] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[18:39] <eccojoke> or would it be better to not use rbd, and instead create two files in cephfs and export those?
[18:39] <darkfader> no, you'd need to make sure the lun is presented with the same scsi serial etc
[18:39] <joey__> hi, I have a situation .. where I have ceph mounted via the fuse client and after much I/O, the fuse mount becomes hung.
[18:39] <eccojoke> yes, that i can do
[18:39] <darkfader> i dont think esxi will be able to tell you're showing it the same thing
[18:40] <eccojoke> esxi does mpio, and that part seems to be working
[18:40] <joey__> All i/o to this file system is blocked and processes (libvirtd) are hung
[18:40] * MikeMcClurg1 (~mike@91.224.174.75) Quit (Ping timeout: 480 seconds)
[18:40] <darkfader> then maybe a very careful "maybe works"
[18:40] * synapsr (~Adium@c-69-181-244-219.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:41] <eccojoke> ok, so I'm better off creating a file in cephfs, mounting it on two clients, and exporting the files?
[18:41] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[18:41] <nhmlap> joey__: ceph health ok?
[18:41] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[18:42] <joey__> nhmlap: yes
[18:42] <joey__> HEALTH_OK
[18:42] <joey__> a 'df -h' will hang on that mount.
[18:42] <joey__> Not sure really how to troubleshoot it
[18:44] <nhmlap> joey__: anything in any of the logs?
[18:45] <nhmlap> might want to look in dmesg on any of the systems involved too.
[18:45] <joey__> k, nothing in the osd logs
[18:45] <joey__> but lemme check dmesg
[18:46] * yehudasa (~yehudasa@2607:f298:a:607:74bd:a80c:a3ba:88b4) Quit (Quit: Ex-Chat)
[18:47] <joey__> yea, so there are many messages about blocked processes
[18:47] <joey__> with a kernel stack indicating fuse
[18:47] <nhmlap> hrm
[18:47] * loicd (~loic@178.20.50.225) Quit (Ping timeout: 480 seconds)
[18:48] <nhmlap> I admit I'm not very familiar with the fuse stuff. Can you tell if it's fuse specific?
[18:49] <joey__> well it's mounted via ceph-fuse
[18:49] <slang> joey__: can you pastebin the fuse stacktrace you see in dmesg?
[18:50] <joey__> and all the hung procs are waiting on io to that mount
[18:50] <joey__> sure
[18:51] <joey__> http://pastie.org/5143994
[18:52] <slang> joey__: do you have a ceph-fuse process running? ps -e | grep ceph-fuse
[18:52] <slang> joey__: just wondering if it possibly crashed
[18:52] <joey__> good question.. sec.
[18:54] <joey__> yes, it's running
[18:54] <joey__> let's see if strace tells me anything intersting
[18:54] <joey__> interesting, even.
[18:56] * dmick (~dmick@2607:f298:a:607:746d:6b8c:2c54:3594) has joined #ceph
[18:56] * Leseb (~Leseb@193.172.124.196) Quit (Quit: Leseb)
[18:56] <joey__> well, it's doing stuff..
[18:57] * dmick (~dmick@2607:f298:a:607:746d:6b8c:2c54:3594) has left #ceph
[18:59] <mikeryan> slang: another fella with an MDS question on ceph-devel
[19:03] * yehudasa_ (~yehudasa@38.122.20.226) has joined #ceph
[19:04] <slang> joey__: can you pastebin the output of:
[19:04] <slang> echo > /tmp/gdbcmds
[19:04] <slang> gdb -batch -nx -x /tmp/gdbcmds -q -p $(ps h -C ceph-fuse -o pid)
[19:04] <slang> err
[19:04] <slang> joey__: can you pastebin the output of:
[19:04] <slang> echo "thread apply all bt" > /tmp/gdbcmds
[19:04] <slang> gdb -batch -nx -x /tmp/gdbcmds -q -p $(ps h -C ceph-fuse -o pid)
[19:04] <slang> that's better
[19:05] <joey__> that's some foo right there.. sure. one sec.
[19:05] <slang> mikeryan: k
[19:07] * elder (~elder@c-24-118-242-216.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[19:09] <joey__> http://pastie.org/5144092
[19:16] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[19:17] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[19:20] <slang> joey__: how many mds servers do you have?
[19:25] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:26] <joey__> slang: how can I check?
[19:26] <joey__> I think only 3
[19:27] <joey__> I'll post my ceph.conf
[19:27] <slang> joey__: pastebin your ceph.conf and the output of ceph -s?
[19:27] <slang> cool
[19:27] <elder> joshd, looking ahead a bit, if I send two ops in a single OSD request, will I get them both back in an array, in a single response?
[19:28] <elder> The current completion code assumes there's only one response, and one op in it.
[19:30] * chutzpah (~chutz@199.21.234.7) has joined #ceph
[19:31] <joey__> slang: http://pastie.org/5144185 ceph.conf
[19:37] <joshd> elder: yes, all the op return values and data will be part of one logical MOSDOpReply
[19:37] <elder> OK.
[19:38] <elder> Another little task I guess.
[19:38] <elder> But I was expecting it...
[19:38] <elder> It just might be better to do a little on that before I go changing how each op gets handled when complete.
[19:38] <elder> Or... at least now I know that's what I'll be looking at doing.
[19:40] <joshd> possibly, it can be done independently too
[19:40] <elder> I'm going to do that later, but I'll structure it so that the handling is per-op.
[19:41] * benpol (~benp@garage.reed.edu) Quit (Ping timeout: 480 seconds)
[19:41] <elder> Trying to figure out how a MOSDOpReply relates to a ceph_osd_reply_head....
[19:42] <slang> joey__: I think the only way to get it unstuck will be to restart/remount the client (which I can show you how to do), but before you do that it would be great if you could get some debugging info from the mds for us
[19:42] <joey__> yea, so 3 monitors and 1 mds
[19:42] <joey__> I probably need another mds
[19:43] <elder> joshd, I see, it's in encode_payload()
[19:43] <joey__> slang: the mds log looks happy
[19:43] <joey__> but whatever you need I'll provide.
[19:43] <joey__> I was going to give Juan access to this cluster to have a look
[19:44] <slang> joey__: oh sure let him at it
[19:44] <joey__> ok
[19:44] <joey__> jjgalvez: ping
[19:45] <slang> joey__: you can do: ceph mds tell mds.alpha injectargs --debug_mds 20
[19:45] <slang> joey__: that will start spitting out more verbosity to the mds log
[19:45] <slang> joey__: might be something useful in there
[19:46] <slang> joey__: you can also do the gdb trick on the mds process:
[19:46] <slang> echo "thread apply all bt" > /tmp/gdbcmd
[19:46] <slang> gdb -batch -nx -x /tmp/gdbcmd -q -p $(ps h -C ceph-mds -o pid)
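Consolidated, and with the verbosity turned back down afterwards (mds.alpha follows slang's example; substitute the actual mds name):

    ceph mds tell mds.alpha injectargs --debug_mds 20   # crank up mds logging
    echo "thread apply all bt" > /tmp/gdbcmd
    gdb -batch -nx -x /tmp/gdbcmd -q -p $(ps h -C ceph-mds -o pid)
    ceph mds tell mds.alpha injectargs --debug_mds 1    # restore normal log level when done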
[19:46] <joey__> sure
[19:49] <slang> joey__: restarting the mds might resolve your client hangs, because the connections will get reset the mds should be able to hand back the proper capabilities to the client
[19:49] * dmick (~dmick@2607:f298:a:607:746d:6b8c:2c54:3594) has joined #ceph
[19:49] * ChanServ sets mode +o dmick
[19:49] <slang> joey__: probably wait to do that till Juan has had a chance to take a look
[19:51] * benpol (~benp@garage.reed.edu) has joined #ceph
[19:51] <elder> joshd, we currently do not even look at most of the fields sent in an osd reply. Is there zero value in these: client_inc, flags, layout, osdmap_epoch, reassert_version ?
[19:51] <joey__> ok, he's jumping on now, thanks slang.
[19:52] <elder> Oh, and object_len..
[19:53] <joshd> elder: I'm not sure, but it wouldn't surprise me if the kernel client doesn't need to worry about several of those
[19:53] <joshd> elder: i.e. reassert_version is for lower-level rados ops the kernel doesn't currently use
[19:53] <elder> OK. Just asking. I'm not planning to do anything about it but if it's supplied it seems we might use it to validate things at least.
[19:56] * loicd (~loic@90.84.144.107) has joined #ceph
[19:57] * jluis (~JL@89.181.150.224) has joined #ceph
[20:02] * joao (~JL@89.181.150.224) Quit (Ping timeout: 480 seconds)
[20:03] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[20:03] * glowell2 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[20:04] * glowell1 (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[20:06] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:21] <eccojoke> i semi-regularly see "libceph: get_reply unknown tid xxxxx from osdN" and "libceph: osdN up/down"
[20:21] <eccojoke> that equals crappy network?
[20:28] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:37] * loicd (~loic@90.84.144.107) Quit (Quit: Leaving.)
[20:39] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[20:39] * kYann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[20:39] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: No route to host)
[20:40] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[20:40] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit ()
[20:42] * sjustlaptop (~sam@mc00436d0.tmodns.net) has joined #ceph
[20:42] * synapsr (~Adium@209.119.75.46) has joined #ceph
[20:48] <elder> joshd, another question on the response from a multi-op request. It looks like each op has a 32-bit rval field. Return value for the individual op? And then there's a request result--aggregate result of the request? How should those be interpreted?
[20:48] <elder> Right now the (combined) result is being used as the result for the (assumed single) operation.
[20:50] * sjustlaptop (~sam@mc00436d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[20:53] <joshd> elder: look at Objecter.cc's handle_osd_op_reply function in userspace
[20:53] <elder> OK.
[20:53] <elder> Thanks.
[20:54] <joshd> elder: also note that the kernel is getting back an older version of MOSDOpReply since it doesn't report support for CEPH_FEATURE_PGID64 yet (you can see the different encodings in MOSDOpReply encode_payload)
[20:55] <elder> I saw that and wondered which branch of the test I should be looking at.
[20:55] <elder> (I was looking at the right one...)
[20:56] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has left #ceph
[20:56] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[20:56] <elder> Whoops. ^W will close your window...
[20:58] <joshd> hmm, we may need to make the kernel support CEPH_FEATURE_PGID64 since the earlier version doesn't include all the return codes
[21:02] <elder> Another project...
[21:02] <elder> Right now I just want to understand why the existing code is doing what it does.
[21:03] * jluis is now known as joao
[21:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:05] * Ryan_Lane (~Adium@216.38.130.163) has joined #ceph
[21:05] <gucki> is it ok that a ceph osd crashes after a write timeout? http://pastie.org/5144222
[21:05] <gucki> it's latest stable release
[21:06] <joshd> yeah, sometimes disks (or fses) go bad by timing out
[21:08] <slang> jjgalvez: fyi, that doesn't look like an issue at the client
[21:09] <slang> jjgalvez: the mds isn't sending back the caps grant message for some reason
[21:09] <slang> jjgalvez: it looks like everything is backed up behind that write request waiting for caps
[21:09] <joshd> gucki: if you hit that with normal usage, you can turn up some osd throttling to prevent the disks from being overloaded, or increase the timeout, but it usually means your underlying disk or fs has problems
[21:10] <gucki> joshd: yeah, i found out why it hangs....i'm using debian which has only sync. and i have another disk inside the host with heavy io (not related to ceph). but as ceph does a sync, the whole machine basically hangs..
[21:11] <jjgalvez> slang: okay I'm going to look at the mds logs - it looks like you previously had joey_ make those logs more verbose earlier
[21:11] <gucki> joshd: why not simply use fsync on the needed files?
[21:11] <slang> getting the output of the mds with debug logging might be helpful
[21:11] <slang> jjgalvez: not sure if that happened
[21:11] <joey__> it did
[21:11] <slang> jjgalvez: I'm curious what the output of the gdb command is too
[21:11] <slang> joey__: cool
[21:12] <joey__> so there are 4 out of 12 that are in this state.
[21:12] <gucki> joshd: but so it was a timeout and "graceful" shutdown and no crash? :-)
[21:12] <joey__> would bouncing the mds make them happy again or .. reboots?
[21:12] <joey__> sorry, 4 out of 15
[21:12] <slang> joey__: 4 of 15 processes on the client?
[21:14] * sjustlaptop (~sam@mc00436d0.tmodns.net) has joined #ceph
[21:14] <joshd> gucki: ah, that's unfortunate. there are a bunch of files that all need to be synced at once, and most distros have syncfs these days
[21:14] <dmick> and fsync doesn't get all the upper-level metadata
[21:14] <jjgalvez> slang: 4 out of 15 machines I believe
[21:15] <joshd> gucki: there's no real graceful shutdown for an osd - the journal provides consistency during crashes anyway
[21:15] <slang> jjgalvez: which state?
[21:17] <jjgalvez> slang: with the ceph-fuse mount not responding and things like cd'ing to the directory hangs
[21:17] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[21:17] <slang> jjgalvez: ah
[21:19] <slang> jjgalvez: were you able to get the backtrace of the mds threads?
[21:19] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[21:19] <joey__> sorry, there are 15 nodes in that cluster and 4 of them have a fuse FS that's unhappy
[21:20] * gucki (~smuxi@46-127-158-51.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[21:20] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[21:21] <jjgalvez> slang: yeah it looks like joey had run that command let me get that data off the machine
[21:22] * gucki (~smuxi@46-127-158-51.dynamic.hispeed.ch) has joined #ceph
[21:25] * sjusthm (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[21:37] * sjusthm (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[21:38] * sjustlaptop (~sam@mc00436d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[21:38] * yehudasa (~yehudasa@2607:f298:a:607:5df2:7084:cf61:812) has joined #ceph
[21:39] <yehudasa> sjusthm: do you know if perfcounter operations take a lock, or use atomic operations?
[21:40] <joshd> yehudasa: they use a lock
[21:41] <yehudasa> hmm, so I see a few cases where we operate on them inside a critical section, which can add to contention, etc.
[21:43] * eccojoke (~ecco@h94n3-lk-d2.ias.bredband.telia.com) Quit (Quit: This computer has gone to sleep)
[21:43] <joshd> yeah, it may be worth enabling sam's mutex instrumentation on them/removing them to see what the effect is
[21:43] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Read error: Connection reset by peer)
[21:44] <joshd> which places are you thinking of?
[21:54] <yehudasa> joshd: there's in the rgw cache
[21:56] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) has joined #ceph
[21:57] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:57] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:57] * Q310 (~qgrasso@ip-121-0-1-110.static.dsl.onqcomms.net) Quit (Remote host closed the connection)
[21:57] * Q310 (~qgrasso@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[22:06] * yehudasa_ (~yehudasa@38.122.20.226) Quit (Ping timeout: 480 seconds)
[22:06] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[22:08] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) Quit ()
[22:08] * yehudasa_ (~yehudasa@38.122.20.226) has joined #ceph
[22:09] <sjusthm> yehudasa: are you actually observing cases where the perf counter overhead is significant
[22:10] <sjusthm> I actually wasn't able to demonstrate that
[22:10] <sjusthm> joshd: that instrumentation only tells you how much the lock is contended, it won't tell you about overhead from grabbing an uncontended lock
[22:13] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[22:15] <yehudasa> sjusthm: not really, I just looked at some code
[22:16] <sjusthm> yehudasa: yeah, I thought the same thing, but couldn't actually demonstrate a performance hit
[22:16] * drokita (~drokita@199.255.228.10) Quit (Quit: Leaving.)
[22:17] <sjusthm> yehudasa: from the perf counter code though, the locks should not be contended so it should be ok up to a point
[22:22] <yehudasa> sjusthm: yeah, I didn't see any real issue, just being careful
[22:35] * BManojlovic (~steki@212.200.240.142) Quit (Quit: Ja odoh a vi sta 'ocete...)
[22:38] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[22:39] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:46] <sagewk> elder: ping
[22:49] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[22:49] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:50] * synapsr (~Adium@209.119.75.46) Quit (Quit: Leaving.)
[22:52] * nwatkins (~nwatkins@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[22:56] <nwatkins> file system question: what's the intended behavior of rmdir("/")? Should this always be returning -EBUSY?
[22:56] <sagewk> sounds like it from the man page
[22:57] <nwatkins> sagewk: yeh, except for the part about "the process' root directory" .. is this for like chroot environments?
[23:14] * benner_ (~benner@193.200.124.63) has joined #ceph
[23:14] * benner_ (~benner@193.200.124.63) Quit (Read error: Connection reset by peer)
[23:15] * benner (~benner@193.200.124.63) Quit (Read error: Connection reset by peer)
[23:19] * benner (~benner@193.200.124.63) has joined #ceph
[23:23] * SkyEye (~gaveen@111.223.150.90) has joined #ceph
[23:23] * SkyEye (~gaveen@111.223.150.90) Quit ()
[23:23] * gaveen (~gaveen@111.223.150.90) Quit (Quit: Leaving)
[23:27] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Remote host closed the connection)
[23:29] * yehudasa_ (~yehudasa@38.122.20.226) Quit (Ping timeout: 480 seconds)
[23:33] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) has joined #ceph
[23:55] * bchrisman (~Adium@c-24-130-235-16.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[23:58] * PerlStalker (~PerlStalk@perlstalker-1-pt.tunnel.tserv8.dal1.ipv6.he.net) Quit (Quit: rcirc on GNU Emacs 24.2.1)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.