#ceph IRC Log


IRC Log for 2013-07-23

Timestamps are in GMT/BST.

[0:02] <sjust> sagewk: wip-5714, wip-5699
[0:02] <sagewk> k
[0:03] * bandrus (~Adium@ Quit (Quit: Leaving.)
[0:04] <sagewk> sjust: 5714 looks good
[0:04] <Tamil> jakes: http://ceph.com/docs/master/start/quick-ceph-deploy/#multiple-osds-on-the-os-disk-demo-only
[0:05] <sagewk> probably backport that one?
[0:05] <sjust> sagewk: yep
[0:05] <sjust> sagewk: speaking of which, ok to merge wip-cuttlefish-next?
[0:05] <jakes> Tamil : I was following the same page to create osd
[0:06] <sagewk> wip-5699 too
[0:06] <sagewk> sjust: i just added the msgr backports on top, want to run that through the test suite first
[0:07] <sagewk> may as well do it together
[0:07] <sjust> ok, I'll add 5714 on top of that as well
[0:07] <sagewk> k
[0:08] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[0:10] * Cube1 (~Cube@ Quit (Ping timeout: 480 seconds)
[0:14] <Tamil> jakes: oh ok
[0:16] <cmdrk> hey folks, quick question.. I normally run a custom kernel (3.7) on my CentOS 6 systems , will I need to do any recompilation of the Ceph RPMs against my custom kernel?
[0:17] * drokita (~drokita@ Quit (Ping timeout: 480 seconds)
[0:19] <Tamil> jakes: from what i can tell, something is messed up with your osd create, and i am not sure what it is
[0:20] <Tamil> jakes: please file a bug with all possible logs available so someone could take a look at it
[0:20] <jakes> Tamil: ok. I have collected logs. I will try again to see if issue is seen again
[0:21] <Tamil> jakes: sure. thanks. please use something else other than /var/lib/ceph when creating osds
[0:21] <jakes> Tamil: you mean the data directory?
[0:22] <Tamil> jakes: yes
[0:22] <jakes> Tamil: ok.I will
[0:22] <Tamil> jakes: thanks
[0:22] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[0:22] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:23] <jakes> Tamil: Also, which one would you suggest?. use ceph-deploy or mkcephfs?
[0:24] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[0:25] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[0:28] <Tamil> jakes: ceph-deploy is very simple and easy to use. if you are using ceph branch cuttlefish and above, only ceph-deploy is supported
[0:28] <jakes> ok
[0:34] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[0:47] <ntranger> I used mkcephfs with cuttlefish, should I go ahead and redo it using ceph-deploy instead? It looks like it worked, but it did tell me that ceph-deploy is what is used now.
[0:48] * lautriv (~lautriv@f050082011.adsl.alicedsl.de) Quit (Read error: Operation timed out)
[0:55] <Tamil> ntranger: not required if you already brought your cluster up and running without any issues, but we recommend ceph-deploy
[0:56] <jakes> Tamil: Now, it is giving error unable to authenticate as client.admin when I ceph -w . I have done ceph-deploy admin for all cluster nodes and I have ceph.client.admin.keyring in /etc/ceph. Is it some permission issue?
[0:57] <ntranger> Tamil: when I try to start Ceph, I get a "no filesystem type defined". I ran into this on a couple of other systems I set up a while ago, but I'm failing to remember what I did to correct the issue.
[0:58] * lautriv (~lautriv@f050081157.adsl.alicedsl.de) has joined #ceph
[1:03] <Tamil> jakes: try "sudo ceph -w"
[1:03] <jakes> Tamil: same error
[1:04] <Tamil> jakes: yes, it should be a permission issue, verify the keyring files on all nodes are the same as listed in "sudo ceph auth list"
[1:04] <Tamil> jakes: did you recreate the cluster?
[1:04] <Tamil> ntranger: which distro?
[1:04] <jakes> Tamil: yup . again using ceph-deploy. Now, I ran into this issue
[1:05] <Tamil> jakes: by any chance, did you delete all the keyring files on admin before recreating the cluster?
[1:06] <jakes> Tamil: No, i did purgedata and purge on all nodes. Then again ceph-deploy new as in http://ceph.com/docs/next/start/quick-ceph-deploy/
[1:06] <ntranger> Tamil: Scientific Linux with 3.10.1-1.el6.elrepo.x86_64 kernel
[1:06] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[1:07] <Tamil> jakes: oh ok, please verify the keyring files match with the output from "sudo ceph auth list"
[1:07] <dmick> of course you can't do ceph auth list unless you can authenticate
[1:08] <jakes> Tamil: all ceph commands give the same error
[1:08] <Tamil> jakes: on which node are you trying this?
[1:08] <Tamil> dmick: oh yeah, thanks ! :)
[1:08] <jakes> on all cluster nodes. I have made ceph-deploy admin all cluster nodes
[1:14] <jakes> Tamil: I did the same steps as given there.
[1:15] <Tamil> jakes: is your monitor running?
[1:16] * devoid (~devoid@ Quit (Quit: Leaving.)
[1:17] <jakes> yes
[1:18] * mozg (~andrei@ has joined #ceph
[1:18] <jakes> Tamil: mon node is created at first
[1:19] <ntranger> Tamil: if you mean ceph distro, its cuttlefish
[1:20] * zynzel (zynzel@spof.pl) Quit (Remote host closed the connection)
[1:21] * ccourtaut (~ccourtaut@2001:41d0:1:eed3::1) Quit (Ping timeout: 480 seconds)
[1:23] * LeaChim (~LeaChim@0540adc6.skybroadband.com) Quit (Remote host closed the connection)
[1:24] * loicd (~loicd@bouncer.dachary.org) Quit (Remote host closed the connection)
[1:24] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[1:26] * zynzel (zynzel@spof.pl) has joined #ceph
[1:27] * loicd (~loicd@bouncer.dachary.org) has joined #ceph
[1:27] * zackc (~zack@0001ba60.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:28] <Tamil> ntranger: i meant the OS
[1:29] <Tamil> jakes: yes, but is ceph-mon process running?
[1:32] <ntranger> Tamil: Yeah, Scientific Linux 6.4
[1:36] <jakes> Tamil: I guess, something related to fsid. I deleted the admin folder and I repeated the steps. it worked fine .
[1:41] * mozg (~andrei@ Quit (Quit: Ex-Chat)
[1:42] <Tamil> jakes: next time you retry, please make sure to run forgetkeys on the admin node first
[1:43] <jakes> ok
[1:43] * ccourtaut (~ccourtaut@2001:41d0:1:eed3::1) has joined #ceph
[1:43] * diegows (~diegows@ has joined #ceph
[1:44] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[1:45] <lautriv> haha ceph-deploy osd destroy foo:bar ..... subcommand destroy not implemented
[1:49] * jakes (~oftc-webi@128-107-239-233.cisco.com) Quit (Remote host closed the connection)
[1:58] * coredumb (~coredumb@xxx.coredumb.net) Quit (Ping timeout: 480 seconds)
[2:04] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[2:09] * mschiff_ (~mschiff@port-4148.pppoe.wtnet.de) has joined #ceph
[2:11] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (Ping timeout: 480 seconds)
[2:11] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[2:13] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[2:16] * mschiff (~mschiff@port-50293.pppoe.wtnet.de) Quit (Ping timeout: 480 seconds)
[2:27] * huangjun (~huangjun@ has joined #ceph
[2:28] * xmltok (~xmltok@pool101.bizrate.com) Quit (Quit: Bye!)
[2:37] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[2:42] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[2:42] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[2:45] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[2:45] * mschiff_ (~mschiff@port-4148.pppoe.wtnet.de) Quit (Remote host closed the connection)
[2:45] * yanzheng (~zhyan@jfdmzpr03-ext.jf.intel.com) has joined #ceph
[2:53] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[2:57] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[3:03] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[3:26] * sagelap (~sage@2600:1012:b014:38aa:7591:7c0a:29ef:1078) has joined #ceph
[3:26] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[3:27] * ntranger (~ntranger@c-98-228-58-167.hsd1.il.comcast.net) Quit ()
[3:36] * yy-nm (~chatzilla@ has joined #ceph
[3:47] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[3:50] * sagelap (~sage@2600:1012:b014:38aa:7591:7c0a:29ef:1078) Quit (Ping timeout: 480 seconds)
[3:55] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[3:57] * john_barbee_ (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[3:57] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Read error: Connection reset by peer)
[3:57] * john_barbee_ is now known as john_barbee
[4:01] * sagelap (~sage@ has joined #ceph
[4:02] * dosaboy_ (~dosaboy@host109-155-13-224.range109-155.btcentralplus.com) has joined #ceph
[4:06] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Remote host closed the connection)
[4:07] * dosaboy (~dosaboy@host217-44-62-172.range217-44.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[4:14] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Read error: No route to host)
[4:25] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[4:25] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[4:35] * jakes (~oftc-webi@128-107-239-233.cisco.com) has joined #ceph
[4:37] <jakes> In my cluster, certain nodes need root permission to run ceph commands, while others do not. I am seeing some issues due to this. What is the reason for it?
[4:37] * john_barbee_ (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[4:38] <cmdrk> womp womp. ceph-deploy doesn't detect Scientific Linux 6.4 as a RHEL variant :(
[4:38] <cmdrk> or it really doesn't support SL64. I think it's the former though.
[4:39] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[4:39] * john_barbee_ is now known as john_barbee
[4:39] <cmdrk> question though, if I'm already using a full-blown solution like Puppet, should I be using ceph-deploy anyway?
[4:45] <dmick> jakes: probably perms on the key files.
[4:46] <dmick> cmdrk: I know there were fixes about that, but, yes, if you're using Puppet, you probably want to manage all that yourself. We have Chef recipes we develop at Inktank, but I know there are folks in the community using Puppet as well.
[4:47] <dmick> 373092af462a991b8fe6ca26a856bc522742c685 in github.com/ceph/ceph-deploy
[4:48] <dmick> (or actually a8236423aa4289483f1e9e624f24f2b0bcbfb706)
[4:51] * huangjun (~huangjun@ Quit (Ping timeout: 480 seconds)
[4:51] <cmdrk> great, sounds good.
[4:57] * [1]huangjun (~huangjun@ has joined #ceph
[4:57] <[1]huangjun> cmdrk: it supports ubuntu debian suse opensuse and centos
[4:59] * AfC (~andrew@2001:44b8:31cb:d400:bc5e:b0f0:5bd2:f8b2) Quit (Ping timeout: 480 seconds)
[5:01] <dmick> [1]huangjun: actually, I think it's got changes to support Scientific too, as I said, and posted the SHA1s for
[5:01] <jakes> i am seeing something strange. I am trying to run openstack on top of ceph. There are three nodes in my cluster. I have rbd settings in all three cinder conf files. Two nodes give an error when connecting to the ceph cluster, while one passes through, when running ./rejoin.sh
[5:04] <jakes> client.connect in rados.py gives exception when running openstack
[5:06] * fireD_ (~fireD@93-142-200-49.adsl.net.t-com.hr) has joined #ceph
[5:08] * fireD (~fireD@93-139-163-75.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:08] <jakes> ObjectNotFound: error calling connect
[5:19] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[5:19] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[5:35] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[5:57] * DarkAceZ (~BillyMays@ has joined #ceph
[6:06] * jakes (~oftc-webi@128-107-239-233.cisco.com) Quit (Quit: Page closed)
[6:12] * [1]huangjun (~huangjun@ Quit (Read error: Connection reset by peer)
[6:13] * huangjun (~huangjun@ has joined #ceph
[6:13] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[6:18] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla [Firefox 22.0/20130618035212])
[6:26] <cmdrk> dmick, huangjun: indeed, SL is essentially CentOS with a couple of extra packages and different branding.
[6:27] <dmick> we need more distros
[6:34] * yehuda_hm (~yehuda@2602:306:330b:1410:e505:48e3:1c9b:4ad1) Quit (Quit: Leaving)
[6:34] * julian (~julianwa@ has joined #ceph
[6:38] * zackc (~zack@65-36-76-12.dyn.grandenetworks.net) has joined #ceph
[6:38] * zackc is now known as Guest747
[6:40] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[6:50] * sha|2 (~kvirc@ has joined #ceph
[6:50] * Almaty (~san@ has joined #ceph
[6:51] <sha|2> hi all. can any one say what`s wrong whis mon? http://pastebin.com/RtKtR75k
[6:54] <sha|2> with*
[6:55] <sha|2> cdbcnjr vjybnjh)
[7:03] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) has joined #ceph
[7:03] * ChanServ sets mode +o scuttlemonkey
[7:03] * Cerales (~danielbry@router-york.lninfra.net) has joined #ceph
[7:03] <Cerales> I'm having trouble finding out what the standard way to back up a Ceph Object Store cluster is. I don't really want incremental backups - just periodic full snapshots that I can script.
[7:16] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) has joined #ceph
[7:56] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[8:05] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) has joined #ceph
[8:08] * matt__ (~matt@220-245-1-152.static.tpgi.com.au) has joined #ceph
[8:16] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[8:18] * Machske (~Bram@d5152D87C.static.telenet.be) Quit (Ping timeout: 480 seconds)
[8:28] <Pauline> sha|2: most likely http://tracker.ceph.com/issues/5704
[8:30] <Pauline> Cerales: uhm.. rados export ?
[8:32] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:34] * huangjun (~huangjun@ Quit (Quit: HydraIRC -> http://www.hydrairc.com <- Nine out of ten l33t h4x0rz prefer it)
[8:34] * huangjun (~huangjun@ has joined #ceph
[8:34] <sha|2> Pauline: it is our bugtrack.
[8:34] * sha|2 (~kvirc@ Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[8:36] <Pauline> sha2 sorry I did not get that
[8:37] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[8:39] * AfC (~andrew@gateway.syd.operationaldynamics.com) has joined #ceph
[8:39] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[8:40] * yy-nm (~chatzilla@ Quit (Read error: Connection reset by peer)
[8:40] * yy-nm (~chatzilla@ has joined #ceph
[8:42] * bergerx_ (~bekir@ has joined #ceph
[8:48] <Gugge-47527> Cerales: a full backup would take weeks on big clusters :)
[8:53] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:d08c:2d46:f42:a958) has joined #ceph
[8:58] * madkiss (~madkiss@2001:6f8:12c3:f00f:2870:e33a:5d26:a4fc) Quit (Ping timeout: 480 seconds)
[9:05] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) has joined #ceph
[9:05] * ChanServ sets mode +o scuttlemonkey
[9:09] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[9:12] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:18] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[9:26] * sha (~kvirc@ Quit (Read error: Connection reset by peer)
[9:29] * SWAT_ (~swat@cyberdyneinc.xs4all.nl) has left #ceph
[9:32] * ScOut3R (~ScOut3R@catv-89-133-25-52.catv.broadband.hu) has joined #ceph
[9:34] * LeaChim (~LeaChim@0540adc6.skybroadband.com) has joined #ceph
[9:44] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Quit: Leaving.)
[9:49] * Guest747 (~zack@65-36-76-12.dyn.grandenetworks.net) Quit (Read error: Operation timed out)
[9:56] * leseb1 (~Adium@ has joined #ceph
[10:11] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[10:15] * agh (~oftc-webi@gw-to-666.outscale.net) has joined #ceph
[10:16] <agh> Hello, I need your help, I'm desperate!
[10:16] * jcfischer (~fischer@ has joined #ceph
[10:16] <agh> I'm not new to Ceph, but new to ceph-deploy. It does not work. So i try to create a new cluster "by hand".
[10:16] <agh> Here is what i do :
[10:16] <agh> ceph-authtool --create-keyring /etc/ceph/client.admin --gen-key -n client.admin
[10:16] <agh> ceph-authtool --create-keyring /etc/ceph/keyring --gen-key -n mon.
[10:16] <agh> mkdir /var/lib/ceph/mon/ceph-a
[10:17] <agh> ceph-mon --mkfs -i a --fsid 3c5d0a05-f76d-4e03-981c-059e712b9f95 --keyring /etc/ceph/keyring
[10:17] <agh> then,
[10:18] <agh> /etc/init.d/ceph start mon
[10:18] <agh> and here is the output
[10:18] <agh> failed: 'ulimit -n 8192; /usr/bin/ceph-mon -i a --pid-file /var/run/ceph/mon.a.pid -c /etc/ceph/ceph.conf '
[10:18] <agh> And in fact, i have the same issue via ceph-deploy :(
[10:23] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[10:23] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[10:27] <dmick> agh: so you probably need to investigate why ceph-mon isn't starting
[10:27] <dmick> did you try running that command by hand?
[10:28] <agh> dmick: mm in fact, there is a "ceph-create-keys" which runs for a very long time
[10:28] * leseb1 (~Adium@ Quit (Quit: Leaving.)
[10:28] <dmick> that's leftover from the ceph-deploy
[10:29] <dmick> well, maybe
[10:29] <dmick> but if ceph-mon is the one that's failing, you need to know why that is
[10:29] <agh> and... now it works. I've just made the same steps as before... but now it works. how
[10:29] <agh> i don't understand
[10:29] <huangjun> uhh, maybe you can delete files in the /var/lib/ceph/tmp/*, and then redo this
[10:29] <dmick> beats me. we don't know how it was failing so we don't know what changed to make it succeed
[10:30] <dmick> but "failed: <command>" is a pretty big clue to investigate that command
[10:30] <huangjun> i have met this before; via debugging, we found that it always runs create-keys
[10:30] <agh> I did a lot at the same time, so... i'm a bit confused.
[10:30] <dmick> create-keys will run forever if the mon isn't starting
[10:31] <dmick> and, well, glad it's working, but an organized approach is necessary when asking for help. good luck.
[10:31] <agh> dmick: it's working, yes and no, i don't know
[10:32] * mschiff (~mschiff@p4FD7FE9D.dip0.t-ipconnect.de) has joined #ceph
[10:32] <agh> [root@ix6-ceph-1 ~]# /etc/init.d/ceph start mon === mon.a === Starting Ceph mon.a on ix6-ceph-1... Starting ceph-create-keys on ix6-ceph-1.
[10:32] <agh> oops sorry
[10:32] <agh> [root@ix6-ceph-1 ~]# /etc/init.d/ceph start mon
[10:32] <agh> === mon.a ===
[10:32] <agh> Starting Ceph mon.a on ix6-ceph-1...
[10:32] <agh> Starting ceph-create-keys on ix6-ceph-1...
[10:32] <agh> so, mon.a is OK
[10:32] <agh> but what is this ceph-create-keys ?
[10:32] <dmick> a part of the startup process
[10:33] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[10:33] <agh> ok, so it's normal ? (i did not use cephx before, so i'm pretty new with it)
[10:33] <dmick> read /etc/init.d/ceph if you want more inside info
[10:34] <agh> thanks for your help.
[10:42] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[10:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[10:50] * dosaboy (~dosaboy@faun.canonical.com) has joined #ceph
[10:55] * waxzce (~waxzce@2a01:e34:ee97:c5c0:ccc3:70ce:e846:47fa) has joined #ceph
[10:58] * leseb (~Adium@ has joined #ceph
[10:59] * waxzce (~waxzce@2a01:e34:ee97:c5c0:ccc3:70ce:e846:47fa) Quit (Remote host closed the connection)
[10:59] * waxzce (~waxzce@2a01:e34:ee97:c5c0:159d:634a:2220:6934) has joined #ceph
[11:00] * sleinen (~Adium@2001:620:0:25:54b3:8fa9:8926:385e) has joined #ceph
[11:06] * leseb (~Adium@ Quit (Ping timeout: 480 seconds)
[11:14] * mxmln (~maximilia@ has joined #ceph
[11:22] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[11:25] * yanzheng (~zhyan@jfdmzpr03-ext.jf.intel.com) Quit (Remote host closed the connection)
[11:29] * X3NQ (~X3NQ@ Quit (Remote host closed the connection)
[11:30] * leseb (~Adium@ has joined #ceph
[11:39] * odyssey4me (~odyssey4m@ has joined #ceph
[11:41] * leseb (~Adium@ Quit (Ping timeout: 480 seconds)
[11:50] * BManojlovic (~steki@fo-d- Quit (Ping timeout: 480 seconds)
[11:53] * BManojlovic (~steki@fo-d- has joined #ceph
[11:53] * leseb (~Adium@ has joined #ceph
[12:12] * ScOut3R (~ScOut3R@catv-89-133-25-52.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[12:15] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[12:30] * Almaty (~san@ Quit (Quit: Ex-Chat)
[12:40] * yy-nm (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130618035212])
[12:42] * yanzheng (~zhyan@jfdmzpr04-ext.jf.intel.com) has joined #ceph
[12:49] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[12:58] * huangjun (~huangjun@ Quit (Quit: HydraIRC -> http://www.hydrairc.com <- Now with extra fish!)
[13:36] * julian (~julianwa@ Quit (Quit: afk)
[13:37] * oliver1 (~oliver@jump.filoo.de) has joined #ceph
[13:38] * Kacian (~kvirc@aster.cystersow9.netart.com.pl) has joined #ceph
[14:00] * yanzheng (~zhyan@jfdmzpr04-ext.jf.intel.com) Quit (Remote host closed the connection)
[14:04] * mschiff_ (~mschiff@p4FD7FE9D.dip0.t-ipconnect.de) has joined #ceph
[14:04] * mschiff (~mschiff@p4FD7FE9D.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[14:16] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[14:16] * BManojlovic (~steki@fo-d- has joined #ceph
[14:17] * diegows (~diegows@ has joined #ceph
[14:19] * mschiff (~mschiff@p4FD7FE9D.dip0.t-ipconnect.de) has joined #ceph
[14:19] * mschiff_ (~mschiff@p4FD7FE9D.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[14:19] * jcfischer_ (~fischer@user-28-17.vpn.switch.ch) has joined #ceph
[14:19] * yanzheng (~zhyan@ has joined #ceph
[14:24] * jcfischer (~fischer@ Quit (Ping timeout: 480 seconds)
[14:24] * jcfischer_ is now known as jcfischer
[14:26] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[14:33] * BManojlovic (~steki@fo-d- Quit (Remote host closed the connection)
[14:47] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[14:51] * markl (~mark@tpsit.com) Quit (Read error: Connection reset by peer)
[14:51] * markl (~mark@tpsit.com) has joined #ceph
[14:58] * BillK (~BillK-OFT@203-214-147-30.perm.iinet.net.au) has joined #ceph
[15:09] * bt (~trefzer@93-160-104-90-static.dk.customer.tdc.net) has joined #ceph
[15:10] <Azrael> joao: are you there?
[15:10] <bt> hi
[15:10] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[15:12] <Azrael> joao: we are encountering the http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/16080 issue
[15:12] <Azrael> joao: was wondering if you have a workaround or fix that we should test out
[15:13] <bt> does anyone know: I connect osd's to cluster and public network. does mon also need a connection to cluster or is public network enough ?
[15:13] * BManojlovic (~steki@fo-d- has joined #ceph
[15:13] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[15:13] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) has joined #ceph
[15:14] <lautriv> bt, mon will fail if it has a connection to cluster
[15:15] <bt> lautriv: I know that mon needs to be on public. but does it also need to be on cluster ?
[15:15] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:15] <Azrael> bt: in my experience, the mon needed an ip in the public network as well as the cluster network
[15:15] <lautriv> bt, it would confuse the whole, just because name-resolution.
[15:15] <bt> lautriv: I had a setup with mon being on an osd: that worked. but now I'm in trouble with separating mon onto a host with only public.
[15:16] <lautriv> bt, i found nothing works like suspected if you use ceph-deploy.
[15:17] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Read error: Operation timed out)
[15:18] <lautriv> unfortunately, cuttlefish is deploy only.
[15:19] * huangjun (~kvirc@ has joined #ceph
[15:19] <bt> Azrael: so the first picture on http://ceph.com/docs/master/rados/configuration/network-config-ref/ is probably wrong, as it does not show a connection from mon to cluster.
[15:21] <joao> mon only binds on public
[15:22] <Azrael> hi joao
[15:22] <joao> doesn't use the cluster addr or network
[15:22] <joao> hi
[15:22] <joao> grabbing coffee; brb
[15:22] <Azrael> joao: ok
[15:22] <lautriv> 2013-07-23 15:03:14.013948 b7204740 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication ? mon IS running, i gathered keys.
[15:26] <huangjun> lautriv: if you have deployed it before, and didn't use ceph-deploy forgetkeys, you will use the old keyring,
[15:27] <lautriv> huangjun, i would prefer to destroy some osd from the misleading howto.
[15:28] <huangjun> maybe you can use ceph-deploy purgedata your-osd-hostname
[15:29] * aliguori_ (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[15:32] <lautriv> rm -rf /var/lib/ceph/* is a saner one ;)
[15:33] * BillK (~BillK-OFT@203-214-147-30.perm.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:34] * markbby (~Adium@ has joined #ceph
[15:34] <huangjun> yes,
[15:37] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[15:37] * jcfischer_ (~fischer@peta-dhcp-3.switch.ch) has joined #ceph
[15:37] * markbby (~Adium@ Quit (Remote host closed the connection)
[15:39] <lautriv> basically ceph looks like a fork of lustre with all the good parts taken out ...
[15:39] * alfredodeza (~alfredode@c-24-99-84-83.hsd1.ga.comcast.net) has joined #ceph
[15:40] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit ()
[15:40] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[15:40] <huangjun> ceph just absorbs the good ideas of all distributed filesystems
[15:40] * jcfischer (~fischer@user-28-17.vpn.switch.ch) Quit (Ping timeout: 480 seconds)
[15:40] * jcfischer_ is now known as jcfischer
[15:40] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:41] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[15:41] <lautriv> huangjun, so it was a bad idea to respect Myricom, IB , Kerberos ?
[15:42] <lautriv> ... and md-raids
[15:42] <huangjun> no, i think each product has its own emphasis
[15:43] <lautriv> however, when i was on lustre before oracle came into the game it was more complex to setup but it just worked, where ceph has massive issues to find it's own back before anything is activated.
[15:44] * drokita (~drokita@ has joined #ceph
[15:44] <joelio> I found most of my issues were down to a misunderstanding of the tool, not the technology per se..
[15:46] <lautriv> joelio, the howto with the sample osd is totally misleading and should not even be on the same site. /tmp/osdX looks like a temporary mountpoint but they fake disks.
[15:47] * sagedroid (~yaaic@ has joined #ceph
[15:47] * allsystemsarego (~allsystem@ has joined #ceph
[15:50] <joelio> lautriv: I think this (to me) looks symptomatic of automatic documentation generation from the ceph-deploy tool, I assume it uses it. I won't argue that it can look a little misleading and as a how-to could be better. I've written up my steps, they're only slight deviations which fit into my particular setup (small bit of bash for OSD creation, 6 osds per host etc...)
[15:51] <joelio> there are definite grounds for improvement, but this is an open source project so I guess we are all the masters of destiny :)
[15:53] * X3NQ (~X3NQ@ has joined #ceph
[15:53] <lautriv> ok, 38 °C, i need to solve some thermal issues :(
[15:54] * aliguori (~anthony@ has joined #ceph
[15:55] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[15:56] <alfredodeza> lautriv: what howto are you referring to?
[16:01] * bt (~trefzer@93-160-104-90-static.dk.customer.tdc.net) Quit (Quit: Leaving)
[16:05] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[16:06] * markbby (~Adium@ has joined #ceph
[16:07] * jskinner (~jskinner@ has joined #ceph
[16:10] * yanzheng (~zhyan@ has joined #ceph
[16:10] * danieagle (~Daniel@ has joined #ceph
[16:17] * markbby (~Adium@ Quit (Ping timeout: 480 seconds)
[16:18] * markbby (~Adium@ has joined #ceph
[16:19] * zackc (~zack@65-36-76-12.dyn.grandenetworks.net) has joined #ceph
[16:20] * zackc is now known as Guest820
[16:23] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[16:28] * markbby (~Adium@ Quit (Remote host closed the connection)
[16:29] <lautriv> alfredodeza, this one -> http://ceph.com/docs/master/start/quick-ceph-deploy/
[16:30] * soren (~soren@hydrogen.linux2go.dk) Quit (Read error: Operation timed out)
[16:31] <alfredodeza> lautriv: what is misleading about that?
[16:32] * markbby (~Adium@ has joined #ceph
[16:32] <lautriv> alfredodeza, the mkdir /tmp/osdX is not a temporary mountpoint but a fake-disk.
[16:34] * soren (~soren@hydrogen.linux2go.dk) has joined #ceph
[16:34] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[16:35] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[16:39] * AfC (~andrew@gateway.syd.operationaldynamics.com) Quit (Ping timeout: 480 seconds)
[16:40] <alfredodeza> lautriv: I don't see any mention of a /tmp/osdX being a temporary mount point, but if the wording needs improvement, could you elaborate a bit more?
[16:40] <alfredodeza> also, it would be fairly easy to send a pull request with your proposed changes, it is just a plain text file in the docs/ dir in ceph
[16:42] <Gugge-47527> alfredodeza: something in /tmp kinda points towards it being something temporary :)
[16:42] <lautriv> alfredodeza, the creation of an empty dir suggests something would be mounted there, but the following prepare/activate will directly use it for ceph. since the real handling is mentioned later, new users end up with useless demo-osd's ( even if they already split those across 2 machines )
[16:44] <alfredodeza> Gugge-47527: sure, '/tmp/' could be anything though, it does say in the title it is for demonstration only
[16:44] <Gugge-47527> i know :)
[16:45] <lautriv> sure it says demo-only. but the logical conclusion in the first breath leads to mkdir /tmp/osd0 on node1 and mkdir /tmp/osd1 on node2.
[16:47] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[16:51] * markbby (~Adium@ Quit (Remote host closed the connection)
[16:51] * zhyan_ (~zhyan@ has joined #ceph
[16:52] * ntranger (~ntranger@proxy2.wolfram.com) has joined #ceph
[16:53] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[16:57] * markbby (~Adium@ has joined #ceph
[16:58] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[16:58] <lautriv> i wonder if the demo would even work since ceph should care about size == freespace/2 in that case
[16:58] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[16:59] * vipr (~vipr@78-21-228-224.access.telenet.be) has joined #ceph
[17:01] * markbby (~Adium@ Quit (Remote host closed the connection)
[17:01] * zhyan_ (~zhyan@ Quit (Quit: Leaving)
[17:08] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[17:13] * oliver1 (~oliver@jump.filoo.de) has left #ceph
[17:21] * alfredodeza (~alfredode@c-24-99-84-83.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[17:21] * devoid (~devoid@ has joined #ceph
[17:29] * jmlowe1 (~Adium@2601:d:a800:97:c5bd:db07:ec9a:3a90) has joined #ceph
[17:29] * bergerx_ (~bekir@ Quit (Quit: Leaving.)
[17:34] * dosaboy (~dosaboy@faun.canonical.com) Quit (Ping timeout: 480 seconds)
[17:36] * dosaboy (~dosaboy@faun.canonical.com) has joined #ceph
[17:38] * mschiff (~mschiff@p4FD7FE9D.dip0.t-ipconnect.de) Quit (Remote host closed the connection)
[17:38] * matt__ (~matt@220-245-1-152.static.tpgi.com.au) Quit (Ping timeout: 480 seconds)
[17:40] * grepory1 (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[17:41] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Ping timeout: 480 seconds)
[17:42] * huangjun (~kvirc@ Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[17:47] * sprachgenerator (~sprachgen@ has joined #ceph
[17:50] * odyssey4me (~odyssey4m@ Quit (Ping timeout: 480 seconds)
[17:51] * sleinen (~Adium@2001:620:0:25:54b3:8fa9:8926:385e) Quit (Quit: Leaving.)
[17:51] * sleinen (~Adium@ has joined #ceph
[17:52] * jcfischer (~fischer@peta-dhcp-3.switch.ch) Quit (Quit: jcfischer)
[17:53] * fireD_ is now known as fireD
[17:58] <lautriv> hm, each invocation of ceph-deploy invents its own errors o.O
[17:59] <lautriv> ceph-deploy mon add node004 --> pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory: '/var/lib/ceph/mon/ceph-node004' ........ should deploy not add it right NOW ?
[17:59] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[18:00] <gregaf1> believe you want "mon create"
[18:01] * sagedroid (~yaaic@ Quit (Ping timeout: 480 seconds)
[18:01] <lautriv> yes, it is mon create, just a typo here.
[18:02] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[18:03] <jmlowe1> sjust: you around? I had a osd die, most likely doing some trim operations, now i have an inconsistent pg
[18:03] <gregaf1> dunno then, did you install ceph on that node first?
[18:03] <gregaf1> he's not in yet, probably in 40 minutes or an hour
[18:03] <gregaf1> jmlowe1: ^
[18:04] <lautriv> gregaf1, installed on all necessary nodes, network and ssh is fine, user is proper
[18:04] <jmlowe1> trim operations should always complete on the primary and secondaries right?
[18:04] <gregaf1> dunno then lautriv, sorry
[18:04] <jmlowe1> looks like the objects were removed from the primary but not the secondary
[18:06] <lautriv> gregaf1, but i'm right that "/var/lib/ceph/mon/ceph-node004" should be created by "ceph-deploy mon create node004" ........maybe some perms. what's your ls -al /var/lib/ceph ?
[18:08] <gregaf1> believe so, but I really haven't used ceph-deploy much; I'm more focused on cheap scripts like vstart and frameworks like teuthology ;)
[18:09] <gregaf1> I was asking about if you'd run install on that node yet because I think it might create some of the parent directories that ceph-deploy won't on its own
[18:09] <gregaf1> jmlowe1: that sounds odd, did the underlying fs have some errors or something?
[18:10] <jmlowe1> no afaik
[18:10] <jmlowe1> 2.37d osd.14 missing 466fff7d/rb.0.1d87.238e1f29.00000001a1f9/head//2
[18:11] <jmlowe1> find /data/osd.14/current/2.37d_head/ -name 'rb.0.367a.3d1b58ba.0000000010d9*'
[18:11] <jmlowe1> /data/osd.14/current/2.37d_head/DIR_D/rb.0.367a.3d1b58ba.0000000010d9__head_3DEDFF7D__2
[18:11] <lautriv> gregaf1, i purged a former install and did a fresh one so it _should_ be proper but since it fails maybe something else broke.
[18:11] <jmlowe1> find /data/osd.6/current/2.37d_head/ -name 'rb.0.367a.3d1b58ba.0000000010d9*'
[18:11] <jmlowe1> /data/osd.6/current/2.37d_head/DIR_D/DIR_7/rb.0.367a.3d1b58ba.0000000010d9__head_3DEDFF7D__2
[18:11] <jmlowe1> does that make any sense?
[18:12] <gregaf1> are those the replicas, or primary&replica, or..?
[18:12] * andrei (~andrei@ has joined #ceph
[18:12] <jmlowe1> [14,6]
[18:12] <andrei> hello guys
[18:12] <andrei> i was wondering if someone could help me with a question
[18:12] <gregaf1> and no, not really making sense, which is why I was hoping the local FS had crapped itself
[18:12] <andrei> i am planning to upgrade one of the ceph servers to a faster machine
[18:12] <andrei> but with the same number of osd
[18:12] <gregaf1> but I'm not much into the nitty-gritty of these errors so you should probably wait for Sam, sorry
[18:12] <andrei> what is the best way to do that?
[18:13] <andrei> can i simply plug the osds into the new server and install ceph?
[18:13] <andrei> or is there a migration procedure that I should follow?
[18:13] <jmlowe1> yeah, it's really weird, hope Sam shows up soon
[18:13] <lautriv> andrei, will the new machine hold same IP/hostname and so on ?
[18:13] <andrei> i can give it the same ip/hostname
[18:14] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[18:16] * DarkAce-Z (~BillyMays@ has joined #ceph
[18:16] <gregaf1> andrei: your life will be a lot easier if you can plug in the new ones and then decommission the old ones
[18:17] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[18:17] <andrei> gregaf1: that would involve copying about 8TB of data as I currently have about 1TB of data on each osd on that server
[18:18] <gregaf1> you're going to need to copy it anyway, I meant letting Ceph do the data movement and then removing the old ones
[18:18] <gregaf1> is that infeasible?
[18:18] <gregaf1> you could also do one-at-a-time or something, but trying to pretend your new disks are your old ones would suck
[18:19] * dosaboy (~dosaboy@faun.canonical.com) Quit (Read error: No route to host)
[18:19] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[18:19] * toabctl (~toabctl@toabctl.de) has joined #ceph
[18:20] <andrei> that's feasible
[18:20] <andrei> so, how would this work?
[18:20] <andrei> i add a new server with osds
[18:20] <toabctl> I did a clean checkout of https://github.com/ceph/ceph but the build fails. the directory src/libs3 is empty (master branch). is this intended?
[18:20] * dosaboy (~dosaboy@faun.canonical.com) has joined #ceph
[18:20] <andrei> and once the sync has finished I remove the osds from the other one?
[18:20] * sagelap (~sage@2600:1012:b003:e23:7591:7c0a:29ef:1078) has joined #ceph
[18:20] * Kacian (~kvirc@aster.cystersow9.netart.com.pl) Quit (Quit: KVIrc 4.1.3 Equilibrium http://www.kvirc.net/)
[18:20] <andrei> or should I remove the old server first (as i currently have 2 servers with replica of 2)
[18:21] <gregaf1> andrei: yeah, what you probably want to do actually is add the new ones, then mark the old ones out (but leave them up), then wait for the cluster to settle and actually remove them
[18:21] <andrei> and add the new one after removing the old?
[18:22] <Gugge-47527> add the new one
[18:22] <Gugge-47527> mark the old one out
[18:22] <Gugge-47527> then wait
[18:22] <Gugge-47527> and then remove
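The add / mark out / wait / remove order described above can be sketched as a small Python helper that only builds the command list and runs nothing. The OSD id used in the demo is hypothetical, and the removal commands are the usual manual OSD removal steps (`ceph osd crush remove`, `ceph auth del`, `ceph osd rm`), which nobody in the channel spelled out, so treat this as an assumption rather than the exact procedure they had in mind.

```python
def osd_retirement_commands(osd_id):
    """Return, in order, the commands for retiring one old OSD.

    Nothing is executed; the caller decides when to run each step.
    """
    name = "osd.%d" % osd_id
    return [
        # 1. Mark the OSD out but leave the daemon up, so it can still
        #    act as a source while Ceph moves the data elsewhere.
        "ceph osd out %d" % osd_id,
        # 2. Poll this until the cluster has settled before going further.
        "ceph health",
        # 3. After stopping the ceph-osd daemon on its host, remove the
        #    OSD from the CRUSH map, the auth database, and the OSD map.
        "ceph osd crush remove %s" % name,
        "ceph auth del %s" % name,
        "ceph osd rm %d" % osd_id,
    ]


# Hypothetical OSD id, for illustration only.
for cmd in osd_retirement_commands(3):
    print(cmd)
```

Keeping the old OSD up while it is marked out means both replicas stay available as recovery sources during the migration.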
[18:22] <andrei> removing the old one would be preferable for me actually as I do not have much space left in the rack (((
[18:22] <andrei> could this work at all?
[18:22] <Gugge-47527> as long as nothing crashes
[18:22] <Gugge-47527> yes
[18:22] <gregaf1> it should but things will take longer since you only have the one replica left to draw from, and everything will go degraded
[18:25] <andrei> i see
[18:25] <andrei> so, the performance of cluster will be slower while the rebuilding is taking place, right?
[18:26] <andrei> the other question that i have is the slow performance of 4K reads.
[18:26] <andrei> what i mean by that is this:
[18:27] <andrei> i am running 4 concurrent dd with bs=4K iflag=direct from a virtual machine (kvm/rbd)
[18:27] <andrei> sequential reads
[18:27] <andrei> i am running this test several times in a row
[18:28] <andrei> so, server wise the data is coming from ram
[18:28] <andrei> no disk activity
[18:28] <andrei> and I am getting about 3mb/s per dd thread ((((
[18:29] <andrei> which is about 750 iops per thread
[18:29] * mschiff (~mschiff@ has joined #ceph
[18:30] <andrei> in terms of the storage server load - i get a bunch of ceph-osd processes consuming between 20 -80% of cpu
[18:30] <andrei> and overall load is around 2-3
[18:31] <andrei> so, what do I need to do to improve performance of 4Ks?
[18:34] <gregaf1> figure out what happens if you add extra clients generating load — are the OSDs processing as much as they can (unlikely) or is it latency bound one way or another?
[18:34] <gregaf1> the OSDs should be doing a lot more ops/second than that but I'm not sure what the lower bound on a single op's latency is
[18:35] <gregaf1> that's the first step, if it's latency within the OSD then you're stuck and about all you can do is document it and politely agitate for pipeline improvements; if it's latency across the network then you're stuck unless you can get better gear, if it's something else maybe you can do something better
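The arithmetic behind the figures andrei quotes, and why a flat per-thread rate points at latency, can be sketched in Python. It assumes that "3mb/s" means 3 MB/s in decimal units as dd reports it, and that each dd with `iflag=direct` keeps exactly one 4K read outstanding at a time; both are assumptions, not statements from the channel.

```python
def iops_from_throughput(bytes_per_second, block_size):
    """Operations per second implied by a throughput and a block size."""
    return bytes_per_second / float(block_size)


def per_op_latency_ms(iops, queue_depth=1):
    """Average time per operation when queue_depth ops are in flight."""
    return queue_depth / iops * 1000.0


block_size = 4 * 1024          # bs=4K
rate = 3 * 1000 * 1000         # assumed: 3 MB/s per dd thread

iops = iops_from_throughput(rate, block_size)
latency = per_op_latency_ms(iops)

print("IOPS per thread: %.0f" % iops)
print("implied latency per 4K read: %.2f ms" % latency)
```

If every thread sees roughly the same rate no matter how many run in parallel, each read is taking a bit over a millisecond end to end, which is consistent with a latency-bound path rather than saturated OSDs.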
[18:39] * alfredodeza (~alfredode@c-24-99-84-83.hsd1.ga.comcast.net) has joined #ceph
[18:41] * houkouonchi-work (~linux@ has joined #ceph
[18:42] * leseb (~Adium@ Quit (Quit: Leaving.)
[18:46] <andrei> gregaf1: thanks
[18:46] <andrei> well, i can check the latency from the networking side
[18:46] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:46] <andrei> i guess with ping and 4k size?
[18:46] <andrei> but how can I check if it's osd related?
[18:47] * vipr_ (~vipr@78-21-227-195.access.telenet.be) has joined #ceph
[18:47] <andrei> i am on infiniband ipoib
[18:47] <andrei> throughput is pretty good on large block sizes
[18:47] <andrei> and iperf and alike show me about 23-25gbit/s throughput
[18:47] <gregaf1> you can pull it out of the admin socket
[18:48] * markbby (~Adium@ has joined #ceph
[18:48] <andrei> cool
[18:48] <andrei> do you know the command or what should I look for?
[18:51] <andrei> gregaf1: well, i've just tried 16 concurrent dds with iflag=direct
[18:51] <andrei> and each one is giving me 3mb/s throughput
[18:52] <andrei> so, there seems to be a bottleneck
[18:52] <andrei> which doesn't allow over 3mb/s throughput per dd thread
[18:53] <gregaf1> if you look at the admin socket you can run a help command against it ("ceph --admin-daemon <path-to-socket> help") and it will tell you what you can do
[18:53] * vipr (~vipr@78-21-228-224.access.telenet.be) Quit (Ping timeout: 480 seconds)
[18:54] <gregaf1> you'll want to look for the op history ones
[18:54] <gregaf1> and that'll let you look at their progress through the OSD and how long it takes
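A minimal Python sketch of how the admin socket invocation might be assembled for one OSD. The socket location (`/var/run/ceph/<cluster>-osd.<id>.asok`) and the OSD id are assumptions for illustration, and `dump_historic_ops` is assumed to be one of the "op history" commands that `help` lists on this release; check the `help` output on your own daemon first.

```python
def admin_socket_argv(osd_id, command, run_dir="/var/run/ceph", cluster="ceph"):
    """Build the argv for querying one OSD's admin socket.

    The socket path layout is an assumption (default naming); nothing
    is executed here.
    """
    sock = "%s/%s-osd.%d.asok" % (run_dir, cluster, osd_id)
    return ["ceph", "--admin-daemon", sock, command]


# Hypothetical OSD id. Start with "help" to see what the daemon supports.
print(" ".join(admin_socket_argv(14, "help")))
print(" ".join(admin_socket_argv(14, "dump_historic_ops")))
```

The resulting list can be passed to `subprocess.check_output` on the OSD host, since the socket is local to the machine running the daemon.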
[18:55] <grepory1> interesting
[18:55] * grepory1 is now known as grepory
[18:55] <grepory> the admin socket is the .asok socket in /var/run/ceph, right?
[18:55] <gregaf1> yeah
[18:55] <grepory> word
[18:56] * dosaboy (~dosaboy@faun.canonical.com) Quit (Ping timeout: 480 seconds)
[18:56] <andrei> guys, i've got to run, but should be back in a few hours
[18:56] <andrei> would like to get to the bottom of this with your help
[18:59] * sleinen (~Adium@2001:620:0:25:4916:6369:9f76:f0f3) has joined #ceph
[19:02] <jmlowe1> sjust: whenever you get in, I created http://tracker.ceph.com/issues/5723
[19:06] * andrei (~andrei@ Quit (Ping timeout: 480 seconds)
[19:12] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[19:13] * DarkAce-Z is now known as DarkAceZ
[19:14] <sjust> jmlowe1: is there a DIR_7 in osd 14?
[19:16] <jmlowe1> ls /data/osd.14/current/2.37d_head/DIR_D/DIR_7/
[19:16] <jmlowe1> DIR_3 DIR_7 DIR_B
[19:16] <jmlowe1> and that's it
[19:17] <jmlowe1> I did turn up logging on restart when it came up without complaints, if that would be useful
[19:18] <jmlowe1> I turned it back off in the interim
[19:19] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[19:20] <jmlowe1> 2.37d scrub 220 missing, 0 inconsistent objects, and ls /data/osd.14/current/2.37d_head/DIR_D/ |wc -l shows 220 files
[19:20] <sjust> any errors on that fs/disk?
[19:20] <jmlowe1> nope
[19:21] <jmlowe1> all the missing objects I've spot checked are just one directory up in the hierarchy from where they are on the secondary
[19:21] <sjust> that's not normally a problem
[19:22] <sjust> oh, but that is
[19:22] <jmlowe1> !!
[19:22] <sjust> ok, so /data/osd.14/current/2.37d_head/DIR_D/DIR_7 exists?
[19:22] <jmlowe1> yes
[19:23] <sjust> ok, so it's looking for rb.0.105b.238e1f29.000000000ff4__head_F30B0F7D__2 in /data/osd.14/current/2.37d_head/DIR_D/DIR_7 rather than in /data/osd.14/current/2.37d_head/DIR_D
[19:23] <sjust> was there some form of power cycle or something/
[19:23] <sjust> ?
[19:23] <jmlowe1> no
[19:23] <jmlowe1> the osd did die and didn't log anything
[19:23] <jmlowe1> just the process, machine and all other osd's on it are fine
[19:24] <jmlowe1> dmesg is clean
[19:24] <jmlowe1> syslog is as clean as you would expect it
[19:24] <sjust> xfs/61.5?
[19:24] <jmlowe1> yes
[19:25] <jmlowe1> barriers are on
[19:25] <jmlowe1> hardware raid5 with bbu cache, disk cache off
[19:25] <jmlowe1> Linux version 3.5.0-26-generic (buildd@lamiak) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) ) #42-Ubuntu SMP Fri Mar 8 23:18:20 UTC 2013
[19:26] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[19:26] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[19:27] <jmlowe1> is it possible this happened a while ago and the trim operation isn't as forgiving when it comes to objects in unexpected places?
[19:29] <Pauline> should the scrub not have complained?
[19:29] <jmlowe1> should there ever be objects in a directory with sub directories?
[19:30] <sjust> yes
[19:30] <sjust> this is straightforward to fix, one sec
[19:32] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[19:33] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[19:33] <sjust> so, the trouble is that for object rb.0.105b.238e1f29.000000000ff4__head_F30B0F7D__2, it will look first for DIR_D, then DIR_D/DIR_7, then DIR_D/DIR_7/DIR_F, etc
[19:33] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:33] <sjust> and once it fails to find the directory, it will assume the object is in that directory, for example since there is no DIR_D/DIR_7/DIR_F, it assumes the object must be in DIR_D/DIR_7
[19:34] <sjust> so if you scan the objects in DIR_D, you will probably find some misplaced
[19:34] * indeed (~indeed@ has joined #ceph
[19:34] <sjust> you can just move them into the appropriate subdir
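The lookup rule sjust describes can be sketched in Python: take the hash from the file name, walk its hex digits from last to first as `DIR_<digit>` levels, and stop at the deepest directory that exists. This is a simplified model based only on his description, not Ceph's actual index code; it ignores any index state kept in directory xattrs, and the function names are hypothetical.

```python
import os
import tempfile


def hash_from_filename(name):
    """Extract the hash field, e.g. 'F30B0F7D' from '..._head_F30B0F7D__2'."""
    parts = name.rsplit("__", 2)
    if len(parts) != 3 or "_" not in parts[1]:
        return None  # not an object file in the expected naming scheme
    return parts[1].rsplit("_", 1)[1]


def expected_dir(pg_root, hash_hex):
    """Descend DIR_<digit> levels (last hex digit first) while they exist."""
    path = pg_root
    for digit in reversed(hash_hex.upper()):
        candidate = os.path.join(path, "DIR_" + digit)
        if not os.path.isdir(candidate):
            break  # deepest existing level reached; the object belongs here
        path = candidate
    return path


def find_misplaced(pg_root):
    """Return (current_path, expected_directory) for files in the wrong place."""
    misplaced = []
    for dirpath, _dirnames, filenames in os.walk(pg_root):
        for name in filenames:
            hash_hex = hash_from_filename(name)
            if hash_hex is None:
                continue
            want = expected_dir(pg_root, hash_hex)
            if os.path.abspath(dirpath) != os.path.abspath(want):
                misplaced.append((os.path.join(dirpath, name), want))
    return misplaced


# Demo on a throwaway tree that mirrors the layout discussed above.
demo_root = tempfile.mkdtemp()
for sub in ("DIR_3", "DIR_7", "DIR_B"):
    os.makedirs(os.path.join(demo_root, "DIR_D", "DIR_7", sub))
demo_name = "rb.0.105b.238e1f29.000000000ff4__head_F30B0F7D__2"
open(os.path.join(demo_root, "DIR_D", demo_name), "w").close()

for current, want in find_misplaced(demo_root):
    print("%s should be in %s" % (current, want))
```

The script only reports candidates; it deliberately does not move anything, since any move should happen with the OSD stopped and with xattrs preserved.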
[19:34] <jmlowe1> mv is safe?
[19:34] <sjust> yes
[19:35] <jmlowe1> no xattr complications?
[19:35] <sjust> you do have to preserve xattrs, but mv should do that just fine
[19:36] <jmlowe1> so mv then scrub again, or do I need to repair
[19:36] <sjust> mv then scrub should suffice
[19:36] * themgt (~themgt@pc-56-219-86-200.cm.vtr.net) has joined #ceph
[19:36] <jmlowe1> ok, here goes
[19:39] <jmlowe1> 2013-07-23 13:37:56.454890 7fb401751700 0 log [ERR] : 2.37d osd.6 missing f30b0f7d/rb.0.105b.238e1f29.000000000ff4/head//2
[19:39] <sjust> is that the only one?
[19:39] <jmlowe1> it's the only one I moved, 6 is the secondary
[19:39] <sjust> does that object exist in osd 6?
[19:40] <jmlowe1> it does
[19:40] <jmlowe1> find /data/osd.6/current/2.37d_head/ -name 'rb.0.105b.238e1f29.000000000ff4*'
[19:40] <jmlowe1> /data/osd.6/current/2.37d_head/DIR_D/DIR_7/rb.0.105b.238e1f29.000000000ff4__head_F30B0F7D__2
[19:40] <jmlowe1> ls /data/osd.6/current/2.37d_head/DIR_D/DIR_7/ |grep DIR
[19:40] <jmlowe1> DIR_3
[19:40] <jmlowe1> DIR_7
[19:40] <jmlowe1> DIR_B
[19:40] <sjust> post ls -lRah of the primary and secondary directories
[19:43] <jmlowe1> working on it
[19:44] * Cube (~Cube@ has joined #ceph
[19:46] * houkouonchi-work (~linux@ Quit (Remote host closed the connection)
[19:46] <jmlowe1> https://iu.box.com/s/1rbuuhcop11ssb3r581p
[19:46] <jmlowe1> https://iu.box.com/s/uyhtvuw71m2t55kov8hw
[19:46] * houkouonchi-work (~linux@ has joined #ceph
[19:48] <jmlowe1> I didn't post the full errors
[19:48] <jmlowe1> 2013-07-23 13:37:56.873331 7fb401751700 0 log [ERR] : 2.37d scrub stat mismatch, got 904/903 objects, 4/4 clones, 3568542208/3564347904 bytes.
[19:48] <jmlowe1> 2013-07-23 13:37:56.873356 7fb401751700 0 log [ERR] : 2.37d scrub 220 missing, 0 inconsistent objects
[19:48] <jmlowe1> 2013-07-23 13:37:56.873361 7fb401751700 0 log [ERR] : 2.37d scrub 221 errors
[19:57] <sjust> you can ignore the stat errors
[19:57] <jmlowe1> ok
[19:58] <jmlowe1> what's my next move?
[19:58] <sjust> reading your output
[19:59] <jmlowe1> ok, I'll go make myself some lunch, back in < 5
[20:01] * jcfischer (~fischer@peta-dhcp-13.switch.ch) has joined #ceph
[20:01] <sjust> jmlowe1: looks like every file in osd.14 DIR_D should be in DIR_D/DIR_7 (not directories, just files)
[20:02] <sjust> you will also need to restart the daemon, I think
[20:06] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Quit: Leaving.)
[20:06] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[20:09] <jmlowe1> restart osd.14?
[20:09] <sjust> yeah
[20:10] <jmlowe1> move while stopped is a better idea right?
[20:10] <sjust> oh, yes
[20:12] <jmlowe1> still missing objects
[20:12] <jmlowe1> Jul 23 14:12:07 gwioss1 ceph-osd: 2013-07-23 14:12:07.697629 7f1c9d81b700 0 log [ERR] : 2.37d scrub 220 missing, 0 inconsistent objects
[20:13] <sjust> can you post the ls's again?
[20:13] <sjust> or just the changed ones
[20:14] * houkouonchi-work (~linux@ Quit (Remote host closed the connection)
[20:15] * houkouonchi-work (~linux@ has joined #ceph
[20:15] <jmlowe1> https://iu.box.com/s/f7r8teof6e2i96d397nb
[20:17] <sjust> 2.37d_head/DIR_D/rb.0.1026.2ae8944a.000000002a36__head_5F051F7D__2
[20:18] <sjust> must also be moved
[20:18] <sjust> as well as the rest that are in that directory
[20:18] <sjust> oh
[20:18] <sjust> wait, did the osd crash?
[20:18] <jmlowe1> when?
[20:19] <sjust> just now
[20:19] * Esmil_ is now known as Esmil
[20:19] <jmlowe1> nope, it's still up after stop, mv, start
[20:20] <jmlowe1> it was in a recovering state for a short period, did it recover the objects to the wrong place?
[20:20] <sjust> I'm not sure
[20:21] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) Quit (Quit: Leaving.)
[20:21] <sjust> I'll be back in about an hour
[20:31] * sagelap1 (~sage@ has joined #ceph
[20:31] * sagelap (~sage@2600:1012:b003:e23:7591:7c0a:29ef:1078) Quit (Read error: No route to host)
[20:32] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[20:33] <jmlowe1> Well, I'm not crazy, I moved those files as you suggested, they were moved back on startup
[20:33] <jmlowe1> root@gwioss1:/data/osd.14/current/2.37d_head/DIR_D# mv *head* DIR_7/
[20:33] <jmlowe1> root@gwioss1:/data/osd.14/current/2.37d_head/DIR_D# ls
[20:33] <jmlowe1> DIR_7 rb.0.14c3.2ae8944a.000000000342__293_B8BB0F7D__2
[20:33] <jmlowe1> root@gwioss1:/data/osd.14/current/2.37d_head/DIR_D# mv rb.0.14c3.2ae8944a.000000000342__293_B8BB0F7D__2 DIR_7/
[20:37] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[20:41] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[20:41] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[20:41] * mikedawson_ is now known as mikedawson
[20:44] * indeed (~indeed@ Quit (Remote host closed the connection)
[20:51] * diegows (~diegows@host63.186-108-72.telecom.net.ar) has joined #ceph
[20:53] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[20:53] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[20:54] * indeed (~indeed@ has joined #ceph
[20:58] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) has joined #ceph
[21:01] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[21:04] * mikedawson (~chatzilla@50-195-193-105-static.hfc.comcastbusiness.net) has joined #ceph
[21:16] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[21:16] <loicd> ccourtaut: http://dachary.org/?p=2171 Ceph replication vs erasure coding . Do you find the drawing readable or just obscure ? I made this for the OSCON Ceph bof on thursday.
[21:19] * indeed (~indeed@ Quit (Remote host closed the connection)
[21:24] <alfredodeza> loicd: that is only going to be readable if someone is close to the projected screen
[21:25] <alfredodeza> loicd: maybe try to segment it a bit and go through each part *before* showing the whole picture?
[21:25] <alfredodeza> if someone knows all the details about each section of the graph, it helps understanding the picture at the end if you are far away :)
[21:39] <grepory> omg shaky is the coolest ever.
[21:49] * indeed (~indeed@ has joined #ceph
[21:52] <janos> indeed, do you work for Indeed, or is that just a name
[21:52] <janos> (just curious)
[21:53] <lxo> hey, I'm running Cuttlefish and I have an oldish cephfs that went through several upgrades. “rados -p data getxattr <inode>.00000000 parent” gives me useful information for files created after a certain date, but for older files, the attribute isn't there
[21:54] <lxo> is there any way to tell the mds to create this attribute for old files (other than re-creating them :-) renaming the tree to something else and then back doesn't seem to have done it, but maybe I didn't wait long enough or something... thoughts?
[21:57] * indeed (~indeed@ Quit (Ping timeout: 480 seconds)
[21:59] <jmlowe1> sjust: you back?
[22:13] <joshd> joao: there? ceph-mon on the next and cuttlefish branches crash after install for me
[22:14] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:19] <loicd> grepory: :-D
[22:19] * indeed (~indeed@ has joined #ceph
[22:23] * indeed (~indeed@ Quit (Read error: Connection reset by peer)
[22:23] * indeed (~indeed@ has joined #ceph
[22:24] * mikedawson__ (~chatzilla@50-195-193-105-static.hfc.comcastbusiness.net) has joined #ceph
[22:27] * mikedawson (~chatzilla@50-195-193-105-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[22:29] <joao> joshd, crash dump?
[22:30] <joshd> joao: I think sagewk got it in wip-mon-foo
[22:31] <jmlowe1> sjust: I have to go for a bit, if you think of anything let me know
[22:33] <joao> joshd, yeah, just saw the patch
[22:33] <joao> thanks
[22:35] * mikedawson__ (~chatzilla@50-195-193-105-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[22:37] * Meths_ (rift@ has joined #ceph
[22:42] * Meths (rift@ Quit (Ping timeout: 480 seconds)
[22:55] * sleinen (~Adium@2001:620:0:25:4916:6369:9f76:f0f3) Quit (Quit: Leaving.)
[22:56] <gregaf1> lxo: I think you'd have to rename the file and that should do it
[22:56] <gregaf1> or maybe you'd even have to move it to a different directory
[23:01] <wrencsok> 3 questions in this wall text: with respect to "slow requests" in the logs. On my test cluster, I can create a moderate load (~5-10 per node) and usually generate ones that last over 58,000 seconds, or over 16 hours if osd's fail in the cluster. They will continue to persist until I restart some osd's. IF the cluster behaves they still come in regularly but usually in the 30 to 60 second range. Are all io's blocked to that read/write or d
[23:04] * Meths_ is now known as Meths
[23:06] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[23:16] * portante is now known as portante|afk
[23:18] <jmlowe1> sjust: back
[23:20] <gregaf1> wrencsok: the requests that are stuck for a really long time (eg 58,000 seconds!) ran into a bug which I believe is fixed in the latest cuttlefish release, although you can relieve them by restarting
[23:20] <gregaf1> the 30-60 second ones are just when you manage to overload that OSD one way or another
[23:20] <gregaf1> it's not blocking IO, but it is slower, generally on the OSD in question but sometimes just on the object or PG that the request is hitting
[23:24] <sjust> jmlowe1: can you dump the attributes on the DIR_D/DIR_7 on osd.14 and post it?
[23:25] * indeed (~indeed@ Quit (Remote host closed the connection)
[23:31] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[23:32] * sleinen1 (~Adium@2001:620:0:25:c15a:6c42:f8ba:65a0) has joined #ceph
[23:33] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[23:33] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[23:34] <jmlowe1> cephos.phash.contents
[23:34] <jmlowe1> 0000000: 0109 0000 0000 0000 0000 0000 0002 0000 ................
[23:34] <jmlowe1> 0000010: 00 .
[23:35] <wrencsok> gregaf1: i agree on the long ones, i am ~90% certain the really long ones are one of the bobtail journal_aio bugs i've seen in the ceph bug tracker. Where should I look to debug the osd in question? ops history from the admin-daemon reporting the slow request? I am creating an alert on extremely long slow requests, to get someone to look, and to possibly automate the restart of that osd daemon if it's still up and running and the drive is r
[23:35] <wrencsok> When you say not blocking IO, does that depend on the error thrown? if it's, say, "no flag points reached" vs. something like "currently commit sent"? I am wondering how important it is to monitor these slow requests; in the past, others who came before me have developed a paranoia about any slow request and blocked io to clients.
[23:37] <gregaf1> wrencsok: the bug's fixed in our dev branches and I believe the latest cuttlefish, I don't know if it existed in Bobtail but if it does then we should figure out a backport for that too (sjust:)
[23:37] <gregaf1> slow requests aren't good, they mean that some request from a client has been sitting around that long without getting satisfied
[23:38] <gregaf1> but it doesn't mean all IO is stacked up behind that one request
[23:38] * fireD (~fireD@93-142-200-49.adsl.net.t-com.hr) Quit (Quit: Lost terminal)
[23:39] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[23:40] <gregaf1> sagewk: what stuff has gone into ceph-fuse lately? there was a for-real pjd failure over the weekend involving directory permissions
[23:42] * indeed (~indeed@ has joined #ceph
[23:43] * sleinen1 (~Adium@2001:620:0:25:c15a:6c42:f8ba:65a0) Quit (Quit: Leaving.)
[23:44] * themgt (~themgt@pc-56-219-86-200.cm.vtr.net) Quit (Ping timeout: 480 seconds)
[23:45] <wrencsok> ok, so it still sounds worthwhile to give monitoring and ops that capability and potentially to automate the restart of the reporting osd and to keep track of them vs drive/osd/node. well i guess i know what i am doing tomorrow. is the --admin-daemon of the affected osd the only/best place to dive once i determine from scraping logs whether the journal or the drive, or something else, is causing the slow request?
[23:46] <Psi-Jack_> Hmm. Interesting issue.. I have a HEALTH_WARN mds cluster is degraded.
[23:46] <Psi-Jack_> Ahh, there, it went away. :0
[23:47] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) has joined #ceph
[23:48] <gregaf1> admin-daemon output is certainly the best
[23:48] <jmlowe1> sjust: was that xattr dump of any use?
[23:49] <wrencsok> ok, one last thing. I can dig the source, but maybe you know. what is the difference between the "age" and "duration" in this output? { "description": "osd_op(client.2195880.0:305770 rb.0.618bc.238e1f29.00000000023c [write 4055040~8192] 21.bba9b8e3)",
[23:49] <wrencsok> "received_at": "2013-07-23 17:42:50.238483",
[23:49] <wrencsok> "age": "265.197902",
[23:49] <wrencsok> "duration": "1.164032",
[23:49] <wrencsok> "flag_point": "commit sent; apply or cleanup",
[23:49] <wrencsok> "client_info": { "client": "client.2195880",
[23:49] <wrencsok> "tid": 305770},
[23:49] <wrencsok> "events": [
[23:49] <wrencsok> { "time": "2013-07-23 17:42:50.238800",
[23:49] <wrencsok> "event": "waiting_for_osdmap"},
[23:49] <wrencsok> { "time": "2013-07-23 17:42:50.238875",
[23:49] <wrencsok> "event": "reached_pg"},
[23:49] <wrencsok> { "time": "2013-07-23 17:42:50.238984",
[23:49] <wrencsok> "event": "started"},
[23:49] <wrencsok> { "time": "2013-07-23 17:42:50.239042",
[23:50] <gregaf1> ah, that one's done; duration is how long it took to complete, age is how long ago it arrived
[23:50] <gregaf1> we don't stop the age counter once the op has been applied, so it can keep incrementing even after the op is complete; you'll need to watch out for that
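For anyone building alerts on this output, the age/duration distinction can be sketched in Python. The sample op below copies only the fields and the three event timestamps visible in the paste above (the rest of the dump was cut off in the channel). The 30-second threshold and the function names are assumptions for illustration.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def event_gaps(op):
    """Return (event, seconds since the previous timestamp) for one op."""
    previous = datetime.strptime(op["received_at"], TIME_FORMAT)
    gaps = []
    for event in op["events"]:
        current = datetime.strptime(event["time"], TIME_FORMAT)
        gaps.append((event["event"], (current - previous).total_seconds()))
        previous = current
    return gaps


def is_slow(op, threshold=30.0):
    """Judge an op by how long it took (duration), not by when it arrived (age)."""
    return float(op["duration"]) >= threshold


# Partial op record, limited to what was pasted in the channel.
sample_op = {
    "received_at": "2013-07-23 17:42:50.238483",
    "age": "265.197902",
    "duration": "1.164032",
    "events": [
        {"time": "2013-07-23 17:42:50.238800", "event": "waiting_for_osdmap"},
        {"time": "2013-07-23 17:42:50.238875", "event": "reached_pg"},
        {"time": "2013-07-23 17:42:50.238984", "event": "started"},
    ],
}

for name, seconds in event_gaps(sample_op):
    print("%-20s +%.6f s" % (name, seconds))
print("slow by duration:", is_slow(sample_op))
```

An alert keyed on `age` alone would flag this op, which arrived 265 seconds earlier but finished in about 1.2 seconds, so filtering on `duration` avoids that false positive for completed ops.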
[23:51] <wrencsok> ah ok
[23:52] <wrencsok> thanks
[23:52] * alfredodeza (~alfredode@c-24-99-84-83.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[23:56] * GREGORTH (~greglauvi@nor75-h01-31-33-56-112.dsl.sta.abo.bbox.fr) has joined #ceph
[23:56] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[23:56] * GREGORTH (~greglauvi@nor75-h01-31-33-56-112.dsl.sta.abo.bbox.fr) has left #ceph
[23:56] * GREGORTH (~greglauvi@nor75-h01-31-33-56-112.dsl.sta.abo.bbox.fr) has joined #ceph
[23:56] * GREGORTH (~greglauvi@nor75-h01-31-33-56-112.dsl.sta.abo.bbox.fr) has left #ceph
[23:59] * andrei (~andrei@host109-151-35-94.range109-151.btcentralplus.com) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.