#ceph IRC Log

Index

IRC Log for 2013-07-26

Timestamps are in GMT/BST.

[0:03] * aliguori (~anthony@32.97.110.51) Quit (Remote host closed the connection)
[0:04] <sagewk> sjusthm: do you need your 2 plana to be plana?
[0:07] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[0:07] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:08] * Tv (~tv@pool-108-13-115-132.lsanca.fios.verizon.net) has joined #ceph
[0:08] <Tv> hi guys
[0:08] <Tv> figured you might enjoy: http://lookatthatfuckingcuttlefish.wordpress.com/
[0:09] <sjustlaptop> sagewk: no, not really, but mira is also pretty full
[0:09] <dmick> rofl
[0:11] <sjustlaptop> sagewk: released them
[0:11] <sagewk> tv: nice
[0:12] <sjustlaptop> heh
[0:21] <nhm> Tv: nice!
[0:22] <nwat> sagewk: back
[0:23] <sagewk> nwat: hey, wondering if you'd like to do a CDS session on porting ceph to other platforms. osx/darwin in your case, but hopefully we can get people interested in *bsd or illumos
[0:25] <nwat> sagewk: phew, i thought it overlapped with outside lands.. sure--i started a blueprint. i'll try to structure it so people can add relevant info for other platforms they are familiar with.
[0:26] <sagewk> awesome
[0:26] * rudolfsteiner (~federicon@190.220.6.50) has joined #ceph
[0:27] <sagewk> a ceph on zfs blueprint/discussion would also be cool, but it's probably separate from the porting discussion
[0:27] <sagewk> and i bet alfredodeza would be interested in that one
[0:27] <nhm> sagewk: definitely would be nice to see what ceph could do with a zfs backend that didn't go through posix (I think that's what the ZFS guys are doing)
[0:27] <sagewk> yeah
[0:27] <nwat> i added a section on other file systems. there is already some porting weirdness where btrfs is woven into the file store. so maybe it's relevant, but i don't know much about zfs features..
[0:27] <nhm> sorry, Lustre I meant
[0:28] <sagewk> yeah
[0:29] <gregaf> using an object store for an object store, you mean? CRAZINESS!
[0:29] * erwan_taf (~erwan@lns-bzn-48f-62-147-157-222.adsl.proxad.net) has joined #ceph
[0:30] <nwat> obfs
[0:30] <sagewk> heh
[0:32] * dxd828 (~dxd828@host-2-97-79-23.as13285.net) Quit (Quit: Textual IRC Client: www.textualapp.com)
[0:41] <gregaf> yehuda_hm: pushed rebased wip-rgw-versionchecks; this time it adds the STATUS_APPLIED/STATUS_NO_APPLY to the error lookup and turns them into 204 NoContent there, but doesn't change the http_errno field anywhere else
[0:47] * sprachgenerator (~sprachgen@130.202.135.191) Quit (Quit: sprachgenerator)
[0:49] <yehudasa__> gregaf: I'm not sure that's the best place to handle it
[0:50] <yehudasa__> it'll do the job; however, we don't want to have an automatic translation from that status to 204
[0:50] * lautriv (~lautriv@f050081055.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[0:50] <gregaf> yehudasa__: I'm confused then, what were you asking for in your last comment?
[0:50] <gregaf> and why wouldn't we want automatic translation to the proper HTTP code?
[0:51] <yehudasa__> we want to return that status code, that's correct. However, I don't want it to be automatic.
[0:52] <gregaf> well either it's automatic there or we translate it somewhere else, like we were doing in the previous iteration
[0:52] <gregaf> and why not make it automatic? it's not like anybody else will use that status
[0:53] <yehudasa__> we might use it, it's not clear that it should always be a success
[0:53] * alphe (~alphe@0001ac6f.user.oftc.net) has joined #ceph
[0:53] <alphe> hello :)
[0:54] <alphe> I would like to exchange experience on cuttlefish latest stable installation
[0:54] <yehudasa__> gregaf: specifically we'd probably need to implement RGWOp_Metadata_Put::send_response(), and handle it there
[0:54] <alphe> so I try to install/deploy ceph using ceph-deploy and there is a ton of questions I have
[0:55] <gregaf> that is implemented
[0:55] <loicd> alphe: I may be able to help. What questions do you have ?
[0:55] <alphe> loicd a ton ... sorry ...
[0:55] <alphe> if there is an ongoing important discussion we can talk in private
[0:55] <gregaf> I guess we were handling the translation one level from there rather than in it directly
[0:56] <alphe> I don't want to spoil the chatroom
[0:56] <yehudasa__> gregaf: yeah.. I just don't want it to be automatic
[0:56] <loicd> alphe: :-) ask the first question and we'll see where it leads us.
[0:56] <alphe> ok it's a bit confusing how to use ceph-deploy
[0:57] <alphe> I read the how-to on ceph.com the documentation etc...
[0:57] <gregaf> k, I'll change it back and move the translation into there instead of ::execute()
[0:57] * dutchie (~josh@2001:ba8:1f1:f092::2) has left #ceph
[0:57] <alphe> so first I want a ceph cluster organised with one mds and 10 osds
[0:57] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[0:58] <alphe> so I run ceph-deploy -v install on all the machines in the cluster
[0:58] <alphe> I run ceph-deloy -v new on all the machines in the cluster no particular name
[0:58] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) has joined #ceph
[0:58] <alphe> then, obvious thing, I install the monitors on each machine and there the problems arise
[0:59] <dmick> alphe: "new" is only needed once, to start a cluster with a name
[0:59] <alphe> in fact problems arise since the ceph-deploy new ...
[0:59] <loicd> yes ?
[0:59] <alphe> dmick so I should do it on the mds only ?
[0:59] <alphe> the mds will be obviously my admin server
[0:59] * lautriv (~lautriv@f050081222.adsl.alicedsl.de) has joined #ceph
[1:00] <alphe> ok so first question
[1:00] <dmick> it can be anywhere, including a host that's not 'part of' the cluster (i.e. not running any particular daemons), but on the mds host is fine
[1:00] <alphe> why the ceph-deploy -v new claims for a /etc/ceph/ceph.conf file ?
[1:01] <alphe> I provided a full-featured /etc/ceph/ceph.conf then noticed that in my local dir I get a ceph.conf too ...
[1:01] <dmick> not sure what you mean; 'new' primarily creates ceph.conf
[1:02] <alphe> if I run "new" without /etc/ceph/ceph.conf it yells at me to create one
[1:02] <dmick> that's its job, mostly, is to create a ceph.conf for the new cluster (or a ${cluster}.conf, really)
[1:02] <alphe> then in /mydir/ after doing the "new" I get a /mydir/ceph.conf with only the global part
[1:03] <dmick> alphe: that doesn't make sense; what exact command invocation, and what's the error message (the 'yelling')?
[1:03] <alphe> ok but why create something that is already provided ...
[1:03] <dmick> ceph-deploy doesn't require or expect ceph.conf to already be there; it creates it
[1:04] <alphe> hum ... tried it again with only the mds in it and it seems it worked ...
[1:04] <erice> alphe: I am new to ceph and found Inktank video on ceph-deploy very helpful. http://www.brighttalk.com/channel/8847
[1:04] <alphe> can I copy paste many lines ?
[1:05] <dmick> I don't know what the 'it' that contained the mds was, but ceph-deploy new doesn't do anything with an existing ceph.conf (except maybe overwrite it)
[1:05] <dmick> alphe: not here, use some pastebin site (fpaste.org is one)
[1:05] <alphe> i'm not new to ceph but since 0.38 ... some water went down the hills :)
[1:05] <alphe> hehehe
[1:05] <alphe> dmick in private ?
[1:06] <dmick> what's the question?
[1:07] <loicd> alphe: you can http://pastebin.com/ the lines and post the link here
[1:07] <alphe> ok so once I did the "new" with my admin what is the next step
[1:07] <alphe> needs a user account ?
[1:07] <loicd> no
[1:07] <dmick> Have you read http://ceph.com/docs/master/rados/deployment/
[1:07] <dmick> it does walk you through the process quite well
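For reference, the deployment guide linked above boils down to roughly this sequence for a cuttlefish-era cluster; this is only a sketch, reusing the hostnames (mds01, osd01..osd10) that come up later in this log, and exact flags vary between ceph-deploy versions:

    # install the ceph packages on every node
    ceph-deploy install mds01 osd{01..10}
    # write ceph.conf and the initial mon keyring, naming the initial monitor hosts
    ceph-deploy new osd01 osd02 osd03
    # create and start the monitors, then collect the bootstrap keys
    ceph-deploy mon create osd01 osd02 osd03
    ceph-deploy gatherkeys osd01
    # turn each data disk into an OSD (repeat per host:disk)
    ceph-deploy osd create osd01:/dev/sdb
    # push the admin keyring so the ceph CLI works on a given node
    ceph-deploy admin mds01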
[1:08] <alphe> root@mds01:~# rm /etc/ceph/ceph.conf
[1:08] <alphe> root@mds01:~# ceph-deploy -v new mds01
[1:08] <alphe> Creating new cluster named ceph
[1:08] <alphe> Resolving host mds01
[1:08] <alphe> Monitor mds01 at 20.10.10.2
[1:08] <alphe> Monitor initial members are ['mds01']
[1:09] <alphe> Monitor addrs are ['20.10.10.2']
[1:09] <alphe> Creating a random mon key...
[1:09] <alphe> Writing initial config to ceph.conf...
[1:09] <alphe> Writing monitor keyring to ceph.conf...
[1:09] <alphe> damn ...
[1:09] <alphe> http://pastebin.com/dSG7RbcP
[1:09] <alphe> ok that was the pastebin ...
[1:09] <dmick> I would have sworn we just said not to cut and paste here....
[1:09] <alphe> dmick yeah sorry copy paste melted ...
[1:10] <dmick> what's your question?
[1:10] <alphe> third mouse button problems ...
[1:10] <alphe> so the question is what is the next step
[1:10] <dmick> (04:07:45 PM) dmick: Have you read http://ceph.com/docs/master/rados/deployment/
[1:10] <dmick> (04:07:59 PM) dmick: it does walk you through the process quite well
[1:11] <alphe> if I try to create the mon on all the other nodes after doing a new with all the other nodes, then the process gets stuck: no cluster.mon.keyring is generated on the nodes and the conf is dead
[1:11] <alphe> dmick yes I read it but it is confusing
[1:12] <dmick> (03:59:05 PM) dmick: alphe: "new" is only needed once, to start a cluster with a name
[1:13] <alphe> so now I should go to the monitors doing a ceph-deploy mon create osd0{1,2,3,4,5,6,7,8,9} osd10
[1:13] <dmick> so I'm not sure why you would be "doing a new with all other nodes"
[1:13] <dmick> I suggest you start again, following that process, and ask specific questions when something doesn't seem to work. Or else maybe someone else can help you.
[1:13] * AfC (~andrew@2001:44b8:31cb:d400:997e:78b7:e195:37cd) Quit (Ping timeout: 480 seconds)
[1:13] <alphe> dmick because I probably didn't rightly understand the meaning of the lone sentence
[1:13] <alphe> that explains the new use
[1:14] <loicd> alphe: is your admin on the IRC channel too ?
[1:14] <alphe> the "host ..." in the line is confusing ... it implies that many hosts can be provided
[1:14] <alphe> loicd I am the admin ...
[1:15] <loicd> alphe: oh sorry, I was confused by you saying "<alphe> ok so once I did the "new" with my admin what is the next step "
[1:15] <alphe> lol
[1:15] <loicd> what did you mean by "my admin" ?
[1:17] <alphe> ok that wasn't what I meant to say. now that the new is done on the mds alone and the initial ceph.conf file is created, do I need to create the mon on all the nodes (the future osds), or do I do a ceph-deploy admin mds to generate the admin key and give it administrative rights ?
[1:18] <alphe> because according to the doc i should prepare the monitors on my nodes now
[1:18] <alphe> according to that http://ceph.com/docs/master/rados/deployment/
[1:19] <alphe> guys ?
[1:19] <alphe> ceph-deploy -v mon osd0{1,2,3,4,5,6,7,8,9} osd10 something like that ?
[1:19] <alphe> ceph-deploy -v mon create osd0{1,2,3,4,5,6,7,8,9} osd10 something like that ?
[1:20] <loicd> alphe: I suggest you reset the operating system to get rid of any previous installation leftovers. And then you run the command ceph-deploy new as described in http://ceph.com/docs/master/rados/deployment/
[1:20] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) Quit (Remote host closed the connection)
[1:20] <loicd> can you do that ?
[1:20] <alphe> yeah it was done
[1:21] <loicd> I mean can you do that now ?
[1:21] <alphe> ok so I did the mon create and the keyrings are stuck
[1:21] <alphe> root 22381 22380 0 19:20 ? 00:00:00 /usr/bin/python /usr/sbin/ceph-create-keys --cluster=ceph -i osd01
[1:22] <loicd> resetting the machine will get you to a sound state, from which installation is going to go much better.
[1:22] <alphe> hum not sure ...
[1:22] <alphe> too much time lost ...
[1:23] <alphe> I will manually install and forget about ceph-deploy
[1:23] <alphe> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
[1:23] * rudolfsteiner (~federicon@190.220.6.50) Quit (Quit: rudolfsteiner)
[1:24] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[1:24] <loicd> it's difficult for me to understand the situation you're in because I'm not sure what you did before, that's why I suggest restarting from scratch.
[1:24] <alphe> I have that on the osd01 to osd10 with the ceph-deploy -v mon create osdXX
[1:24] <loicd> however, if you feel comfortable installing from source, it's certainly a reasonable approach :)
[1:24] <alphe> loicd I couldn't do much, in fact I got stuck propagating the monitors
[1:24] <alphe> monitors are prior to osd
[1:25] <alphe> and mds
[1:25] <alphe> and disk formatting etc ...
[1:26] <alphe> reinstalling 11 machines (real ones) is kind of a pain ...
[1:26] <sagewk> joao: still around?
[1:27] <loicd> alphe: you can use ceph-deploy purge ; ceph-deploy purge-data + make sure manually that there are no daemons or data in /etc/ceph or /var/lib/ceph
[1:27] <lautriv> if i create a mon on a multihomed machine where the IP matches forward and reverse DNS and the config exclusively mentions the right IP, for what reason may that mon work on the wrong net ?
[1:27] <alphe> ok
[1:27] <joao> sagewk, still here
[1:28] <alphe> need to kill all the stuck ceph-create-keys ...
[1:28] <alphe> what is a good distributed shell ?
[1:29] <sagewk> do you have a reference for the leveldb ghost entries reappearing bug?
[1:29] <sagewk> i think i'm hitting it. trimmed keys are reappearing even though my trace of stuff applied to the store very clearly removes them
[1:29] <loicd> alphe : i use either dsh or cssh
[1:29] <sagewk> a full compaction doesn't make them go away, though; not sure if it should
[1:30] <joao> sagewk, let me track that for you
[1:30] <joao> should be on their latest release notes
[1:30] <sagewk> hrm; this is pretty easy to reproduce, and the leveldb isn't even very big (only 6 sst's)
[1:31] <joao> sagewk, https://code.google.com/p/leveldb/issues/detail?id=178
[1:31] * rudolfsteiner (~federicon@host132.190-30-146.telecom.net.ar) has joined #ceph
[1:31] * rudolfsteiner_ (~federicon@190.247.73.163) has joined #ceph
[1:32] <alphe> ok
[1:32] <alphe> dsh was what i used before ...
[1:32] <alphe> now it is screen with ssh and it is quite a pain...
[1:33] <loicd> alphe: once you've wiped out and got back to a sane state, you could run the commands in a screen(1) or a script(1) and post the log to pastebin : that will greatly help me understand what went wrong.
[1:36] <alphe> the command new right ?
[1:36] <alphe> the command new with my mds right ?
[1:37] <loicd> all commands. The idea is that you log all you're doing to share it with me and I can see everything. Including mistakes typos whatever.
[1:37] <loicd> starting when you're finished cleaning up the machines that is ;-)
[1:38] * AfC (~andrew@CPE-124-184-2-35.lns10.cht.bigpond.net.au) has joined #ceph
[1:39] <loicd> I'll have to run in 45 minutes but I'm with you in the meantime ;-)
[1:39] <alphe> how do I tell ceph-deploy to process server from serv1 to serv9 ?
[1:39] <alphe> serv{1-9} ?
[1:39] * rudolfsteiner (~federicon@host132.190-30-146.telecom.net.ar) Quit (Ping timeout: 480 seconds)
[1:39] * rudolfsteiner_ is now known as rudolfsteiner
[1:39] <loicd> no
[1:39] <loicd> what command do you want to run ?
[1:40] * rudolfsteiner (~federicon@190.247.73.163) Quit (Quit: rudolfsteiner)
[1:40] <alphe> ceph-deploy purge serv1 mds1 serv2 serv3 serv4 serv5 serv6 serv7 serv8 serv9
[1:40] <alphe> etc...
[1:40] <alphe> but in shorter
[1:40] <gregaf> pushed again, yehudasa__
[1:40] <loicd> you can't
[1:40] <alphe> I have to clean every thing ...
[1:40] <loicd> ceph-deploy purge --help
[1:40] <loicd> usage: ceph-deploy purge [-h] HOST [HOST ...]
[1:41] * kyle_ (~kyle@216.183.64.10) has joined #ceph
[1:41] <alphe> i can short it with serv{1,2,3,4,5,6,7,8,9}
[1:41] <loicd> you can ?
[1:41] <alphe> yes ...
[1:42] <loicd> I did not know that :-)
[1:43] <alphe> I read the documentation far enough to know that you can somehow expression-match the server names :) it's the rest that is a total mess in my head ...
[1:43] <dmick> loicd: that's the shell
[1:43] <dmick> $ echo foo{1,2,3}
[1:43] <dmick> foo1 foo2 foo3
[1:43] <loicd> dmick: ho... right :-)
[1:44] <loicd> I forgot that shell expansion does it regardless of the file names in the current directory :-)
[1:44] <dmick> even foo{1..10} works
[1:44] <alphe> ok so it is ceph-deploy purge serv{1..9} to purge to serv1 to serv9
[1:44] <alphe> :)
[1:45] <dmick> ...and that is nothing to do with ceph-deploy
[1:45] <alphe> dmick ok
[1:45] <alphe> thought it was python doing it ...
[1:45] <alphe> ok 5 servs purged
[1:46] <alphe> so next is to do "install" or "new" on the mds only ?
[1:46] <loicd> ceph-deploy purgedata --help
[1:46] <loicd> usage: ceph-deploy purgedata [-h] HOST [HOST ...]
[1:47] <alphe> any ways I really appreciate some help :) so dmick loicd others thank you :)
[1:47] <loicd> alphe: you want to do this on all servers
[1:47] <loicd> alphe: :-)
[1:47] <loicd> not just 5 servers
[1:47] <alphe> i will since you told me
[1:47] <alphe> my cluster has 11 machines in it ...
[1:47] <alphe> and it is quite some pain
[1:48] <loicd> how do you mean ?
[1:48] <alphe> purge is long process
[1:48] <loicd> it's worth it
[1:48] <alphe> around 1 minute per server
[1:49] <alphe> ok so as you need to leave soon what will be the normal process
[1:49] <alphe> I create on the mds the new cluster
[1:49] <loicd> I still have plenty of time, don't worry, run the purge
[1:50] <loicd> 45
[1:50] <alphe> then I create from the mds each monitors on all my cluster using a ceph-deploy -v mon create osd{1..10}
[1:51] <alphe> purging the 11th serv
[1:51] <alphe> but then there is the purgedata to run
[1:51] <loicd> alphe: it's going to be difficult to discuss the process in theory
[1:51] <loicd> can you run purge on the other servers ?
[1:52] <alphe> ceph-deploy runs it on every host i gave as an argument
[1:53] <loicd> yes
[1:53] <alphe> ok data purged fully
[1:54] <loicd> great
[1:54] <loicd> purgedata now ?
[1:56] <alphe> done
[1:57] <loicd> excellent :-)
[1:57] <alphe> so now
[1:57] <loicd> do you have files in your current directory ?
[1:57] <loicd> if so can you move to an empty directory
[1:57] <alphe> yes
[1:58] <loicd> and start logging all you do with screen ( clear the screen log maybe )
[1:58] <alphe> ok in a new dir
[1:58] <alphe> done
[1:58] <loicd> can you type
[1:58] <loicd> pwd
[1:58] * AfC (~andrew@CPE-124-184-2-35.lns10.cht.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[1:58] <alphe> done
[1:58] <loicd> and post the log to pastebin.com so that we know that you actually log before going further ;-)
[1:59] <loicd> it would be a shame to realize in 15 minutes from now that the logs are not there
[1:59] <alphe> http://pastebin.com/02nijNeF
[1:59] <alphe> done
[1:59] <loicd> ok
[2:00] <loicd> http://ceph.com/docs/master/rados/deployment/ceph-deploy-install/#install
[2:00] <loicd> that's where we start
[2:00] * Tv (~tv@pool-108-13-115-132.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[2:00] <loicd> ceph-deploy install srv1 etc...
[2:01] <alphe> i can do a ceph-deploy install serv{1..10} mds01
[2:01] <loicd> yes please
[2:02] <alphe> ok running
[2:02] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Remote host closed the connection)
[2:03] <alphe> pastebin posted
[2:03] <loicd> ok
[2:03] <loicd> now to http://ceph.com/docs/master/rados/deployment/ceph-deploy-new/
[2:04] <loicd> I'm following the instructions as listed in the documentation index
[2:04] <loicd> let's say you install 3 mons
[2:04] <loicd> it's enough
[2:04] <loicd> ceph-deploy new srv1 srv2 srv3
[2:04] <alphe> read it and it implies that the new command can be run on many servers
[2:05] <alphe> I didn't understand that in fact it was for only 1 server
[2:05] <loicd> I'm not sure what you mean by that but just run the command and list the files in the directory you're in
[2:05] <loicd> that command will just create a few files in your current directory and do nothing else
[2:06] * mschiff_ (~mschiff@port-46347.pppoe.wtnet.de) has joined #ceph
[2:06] <alphe> what I mean is that "To create a cluster with ceph-deploy, use the new command and specify the host(s) that will be initial members of the monitor quorum."
[2:06] <loicd> I'll explain after you've done it, based on the files you have
[2:06] <alphe> I am not sure whether the initial quorum should have only my mds01 or all 11 servers in it ...
[2:07] <loicd> ceph-deploy new srv1 srv2 srv3
[2:07] <loicd> will be fine
[2:07] <loicd> ceph-deploy new serv1 serv2 serv3
[2:07] <loicd> rather
[2:07] <alphe> installing on the osd02
[2:08] <loicd> please post the logs when you're done running "ceph-deploy new serv1 serv2 serv3"
[2:08] <alphe> sure
[2:11] <alphe> long ...
[2:12] <loicd> what command takes a long time exactly ?
[2:12] <alphe> install
[2:13] <loicd> you mean the ceph-deploy install ?
[2:13] <alphe> but it is normal, deploying 10 servers one after another is long ...
[2:13] <alphe> yes
[2:13] * mschiff (~mschiff@port-16072.pppoe.wtnet.de) Quit (Ping timeout: 480 seconds)
[2:13] <loicd> yes it does
[2:14] <alphe> yesterday to save some time I parallelised the process with ceph-deploy install serv1 & ; ceph-deploy install serv2 & ; etc ...
[2:15] <alphe> that is a dirty way to make the servs work together ...
[2:16] <loicd> while this is working, could you make sure that all your hostnames ( serv1 etc. etc. ) resolve to the same IP ?
[2:17] <loicd> dsh -m serv1 -m serv2 getent host serv1 serv2
[2:17] <alphe> resolve to the same ip ?
[2:17] <loicd> dsh -m serv1 -m serv2 getent hosts serv1 serv2
[2:18] <loicd> you want this to give you the same answer on all hosts
[2:18] <alphe> it is dns-fed, not /etc/hosts
[2:18] <alphe> i have a local dns to serve the names
[2:18] <_robbat2|irssi> getent will do a DNS query if needed
[2:18] <_robbat2|irssi> eg:
[2:18] <_robbat2|irssi> $ getent hosts google.com
[2:18] <_robbat2|irssi> 2607:f8b0:400a:800::1004 google.com
[2:19] <loicd> _robbat2|irssi: right
[2:19] <loicd> alphe: it's just a sanity check ;-)
[2:19] <_robbat2|irssi> that was alphe
[2:20] <_robbat2|irssi> *was for
[2:20] <_robbat2|irssi> maybe I need to vanish for the day
[2:20] <loicd> :-D
[2:21] <alphe> pastebin http://pastebin.com/6Amvmx0u
[2:21] <alphe> done
[2:22] <alphe> since each of them has an /etc/hosts resolving itself to 127.0.0.1
[2:22] <loicd> hum
[2:23] <alphe> i can remove it
[2:23] <loicd> I'm not sure how that would interfere with the installation, but just to be on the safe side, having all names resolve to IPs that are reachable from all other machines is a good idea
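A minimal sanity check for this, assuming the Debian-style /etc/hosts entries alphe describes (each host resolving its own name to 127.0.0.1):

    # on every node, the short hostname should resolve to the routable address, not loopback
    getent hosts osd02        # want something like 20.10.10.102, not 127.0.0.1
    # a typical fix is one /etc/hosts line per node, e.g.
    # 20.10.10.102  osd02.keplerdata.com  osd02
    # or fixing the DNS records, so every node resolves every other node the same way

Monitors in particular derive their advertised address from name resolution when no mon addr is given, so a loopback entry can leave a mon announcing an unreachable address.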
[2:24] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[2:24] <alphe> ok
[2:31] <alphe> installing serv09
[2:31] * huangjun (~kvirc@111.175.164.32) has joined #ceph
[2:31] <alphe> and /etc/hosts files changed
[2:32] <loicd> ok
[2:33] <loicd> http://pastebin.com/6Amvmx0u shows osd02.keplerdata.com etc.
[2:33] <loicd> but no serv09
[2:33] <loicd> what are the actual names of your hosts ? osdXX or servXX ?
[2:33] <alphe> yes
[2:34] <alphe> osdXX
[2:34] <loicd> ok
[2:34] <alphe> this chat is logged :)
[2:38] <loicd> can you run
[2:38] <loicd> ceph-deploy new osd01 osd02 osd03
[2:38] <loicd> now alphe ?
[2:38] <alphe> pastebin updated
[2:38] <alphe> ok so new on every machine of the cluster
[2:38] <loicd> alphe: what's the URL of the updated pastebin ?
[2:39] <loicd> alphe: no
[2:39] <alphe> http://pastebin.com/gaKrBNy4
[2:39] <loicd> alphe: run
[2:39] <loicd> ceph-deploy new osd01 osd02 osd03
[2:39] <loicd> on the same machine your ran
[2:39] <loicd> ceph-deploy install ....
[2:39] <loicd> in the empty directory
[2:39] <loicd> and ls -l
[2:39] <loicd> after
[2:39] <loicd> and
[2:39] <alphe> ok from the mds01 i run ceph-deploy -v new mds01 osd{01..10}
[2:39] <loicd> cat ceph.conf
[2:40] <loicd> no
[2:40] <loicd> run
[2:40] <loicd> ceph-deploy new osd01 osd02 osd03
[2:40] <sagewk> dmick: wip-bootstrap
[2:40] <alphe> only the 3 first ?
[2:40] <loicd> you only need 3 monitors
[2:40] <alphe> really ?
[2:40] <loicd> you don't need a monitor on each machine, it's not necessary
[2:40] <loicd> alphe: yes
[2:40] <alphe> I thought it was 1 monitor per osd
[2:40] <alphe> ok
[2:41] <loicd> no
[2:41] <loicd> note that this command will just create files
[2:41] <loicd> it will not do anything else
[2:42] <alphe> done loicd
[2:42] <loicd> could you update the pastebin please ?
[2:42] <loicd> we'll check the ceph.conf content
[2:42] <alphe> http://pastebin.com/L94iNt0X
[2:43] <loicd> looks good :-)
[2:43] <loicd> moving on
[2:43] * jjgalvez1 (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Quit: Leaving.)
[2:43] <loicd> http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/
[2:43] <loicd> on the same machine you ran ceph-deploy new
[2:43] <loicd> run
[2:43] <loicd> ceph-deploy mon create osd01
[2:43] <alphe> ok
[2:43] <loicd> that will create *one* monitor on osd01
[2:43] <loicd> and we'll check it's going well
[2:45] <loicd> i.e. you should now see that there is a ceph process running
[2:45] <loicd> and that there are files in /etc/ceph
[2:45] <alphe> http://pastebin.com/NZi8WQ8j
[2:45] <alphe> ok done
[2:46] <loicd> you can ceph-deploy mon create osd02 osd03
[2:46] <alphe> stuck again
[2:46] <alphe> wait osd01 mon create is stuck
[2:46] <loicd> don't panic ;-)
[2:46] <loicd> ceph-deploy mon create osd02 osd03
[2:46] <alphe> http://pastebin.com/bx6McLDA
[2:46] <loicd> and see what happens on all three machines
[2:48] * rudolfsteiner (~federicon@190.247.73.163) has joined #ceph
[2:48] <alphe> ok pasted
[2:49] <alphe> http://pastebin.com/hau3HUtH
[2:49] <loicd> tail -f /var/log/ceph/ceph-mon.* will tell you more about why it's waiting
[2:50] <loicd> the first time around it takes some time before running the mons
[2:50] <loicd> can you check again now ?
[2:51] <alphe> http://pastebin.com/2PGvTQBP
[2:51] <loicd> we're basically back to where you were initially
[2:51] <alphe> yes
[2:51] <loicd> only now we know exactly what happened before getting in this state
[2:51] <alphe> but instead of having monitors on 11 machines I have them on osd01 osd02 osd03
[2:52] <alphe> and we know what happened before ..
[2:52] <loicd> 20.10.10.103:6789/0 not in my monmap e1: 3 mons at {osd01=20.10.10.101:6789/0,osd02=20.10.10.102:6789/0,osd03=0.0.0.0:0/2}
[2:52] <alphe> yes saw that ...
[2:52] <dmick> sage: sorry, I'm nuts, you have to get()
[2:52] * kenneth (~kenneth@202.60.8.252) has joined #ceph
[2:53] <loicd> that's weird
[2:53] <alphe> why is there an osd03 with 0.0.0.0:0
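A 0.0.0.0:0 entry in a monmap generally means that monitor was added before its real address was known, so the other mons cannot contact it and the election never completes. One way to inspect each mon's view, sketched with the default cuttlefish admin socket path:

    # on each mon host: show this monitor's rank, state and the monmap it holds
    ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok mon_status

If one mon keeps showing up as 0.0.0.0, re-creating that mon after fixing its name/address resolution (or setting mon addr explicitly in ceph.conf) is the usual way out.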
[2:53] * sagelap (~sage@2600:1012:b025:7e9a:25b0:cc61:dfe7:e868) has joined #ceph
[2:53] <dmick> sagelap: sorry, I'm nuts, you have to get()
[2:53] <loicd> can you check on osd01 see if you have the same thing ?
[2:54] <dmick> and you know it's safe because the fact that caps was defined with n means it's always a list/vector
[2:54] <alphe> http://pastebin.com/wMSiKdmV
[2:56] <alphe> pasted
[2:56] <alphe> osd01 is ok
[2:56] <sagelap> dmick: ok thought so
[2:56] <loicd> can you cat /etc/ceph/ceph.conf on osd02 ?
[2:56] <sagelap> reviewed-by?
[2:57] <sagelap> fwiw still doing
[2:57] <sagelap> - cmd_getval(g_ceph_context, cmd, "caps", cv);
[2:57] <sagelap> - if (cv.size() % 2 == 0) {
[2:57] <sagelap> + if (cmd_getval(g_ceph_context, cmd, "caps", cv) &&
[2:57] <sagelap> + cv.size() % 2 == 0) {
[2:57] <alphe> ok
[2:57] <dmick> sagelap: spaces around '=' and '<' in for, and
[2:58] <loicd> alphe: I have to run now. I wish I had time to know the answer to this puzzle :-)
[2:58] <dmick> the "caps_<entity>" form matches the profile_grants map keys?...someday I really have to understand MonCap.cc
[2:58] <alphe> reached the limit of 10 pastes per 24 hours ...
[2:58] <alphe> ceph.conf is ok
[2:58] <alphe> the osd03 has the expected ip ...
[2:58] <loicd> alphe: http://paste.openstack.org/
[2:58] <sagelap> yeah
[2:59] <dmick> otherwise yeah, and I'll file a bug to revisit this
[2:59] <loicd> if you getent hosts osd03 on osd02
[2:59] <alphe> http://paste.openstack.org/show/41881/
[2:59] <loicd> alphe: what does it show ?
[2:59] <alphe> pastebined
[3:00] <alphe> http://paste.openstack.org/show/41885/
[3:00] <loicd> alphe: and do you see the same issue on osd03 ?
[3:01] <alphe> no
[3:01] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Remote host closed the connection)
[3:01] <alphe> osd03 is saying he is ok ...
[3:02] <alphe> 2013-07-25 20:56:34.778035 7fdc3ac3a700 1 mon.osd03@1(electing).elector(1) init, last seen epoch 1
[3:02] <alphe> 2013-07-25 20:56:39.778327 7fdc3ac3a700 1 mon.osd03@1(electing).elector(1) init, last seen epoch 1
[3:02] <alphe> 2013-07-25 20:56:44.778619 7fdc3ac3a700 1 mon.osd03@1(electing).elector(1) init, last seen epoch 1
[3:02] <alphe> 2013-07-25 20:56:49.778919 7fdc3ac3a700 1 mon.osd03@1(electing).elector(1) init, last seen epoch 1
[3:02] <alphe> 2013-07-25 20:56:54.779228 7fdc3ac3a700 1 mon.osd03@1(electing).elector(1) init, last seen epoch 1
[3:02] <alphe> damn ...
[3:02] <alphe> sorry
[3:02] <alphe> I go too fast
[3:02] <alphe> http://paste.openstack.org/show/41886/
[3:03] <loicd> bbl good luck with this problem, I'm sure the answer is not too far away ;-)
[3:03] <alphe> ok thank you for the help
[3:03] <loicd> my pleasure
[3:03] <alphe> can i restart ceph mon ?
[3:04] <alphe> service ceph restart mon ?
[3:04] <alphe> on osd02
[3:04] <dmick> sagelap: meanwhile, if you get a chance, https://github.com/dmick/ceph/tree/wip-api-cleanup, one tiny commit
[3:05] <sagelap> lgtm
[3:07] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[3:07] <sagelap> dmick: is wip-tell-unified destined for next?
[3:07] <dmick> I was just about to bring it up
[3:09] <dmick> was considering including an exceptions cleanup but I'd probably rather wait on that. gonna do one last build and cephtool&rest test, but yes, if you could look
[3:09] <sagelap> exceptions?
[3:09] <sagelap> oh for the python?
[3:09] <dmick> yeah. if you get an exception on import ceph_rest_api, it's really hard to tell why
[3:09] <dmick> I can make that better
[3:12] <alphe> pfff
[3:12] <alphe> ok I ceph-deploy destroy osd02 and now it is all broken :P
[3:18] * jluis (~JL@89.181.148.68) has joined #ceph
[3:21] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[3:24] * kenneth (~kenneth@202.60.8.252) Quit (Ping timeout: 480 seconds)
[3:28] * julian (~julianwa@125.70.135.241) has joined #ceph
[3:30] <sagelap> dmick: ok went through the series. a few fixes in the formatter stuff, and some questions about the COMMAND stuff toward the end
[3:33] <sagelap> btw i can probably pull in the initial formatter patches and fix them up separately a bit later if you want to focus on the other parts
[3:35] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:39] * LeaChim (~LeaChim@0540adc6.skybroadband.com) Quit (Ping timeout: 480 seconds)
[3:39] <alphe> 1
[3:41] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[3:47] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[3:48] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[3:49] <alphe> do I need to put one monitor on each node of the cluster ?
[3:49] <alphe> do monitors have to go on par with osds ?
[3:54] <dmick> sagelap: eh, I can address the comments
[3:54] <alphe> strangely now the gatherkeys works ...
[3:55] <alphe> i installed the first 3 nodes of my cluster, put monitors on them, and now they all have all the keys.
[3:55] <alphe> so now the next step is to deploy the osds on each node, right ?
[3:55] <alphe> how do I prepare the disks on each osd ?
[3:56] * markbby (~Adium@168.94.245.1) has joined #ceph
[3:57] <alphe> i have 3 disks per node and ceph-deploy disk list only sees 2 (the system one and a data one), is that normal ?
[3:58] <alphe> sorry they are normally shown now
[3:58] <alphe> so I will continue along the deployment documentation
[3:58] <alphe> thank you for help see you
[3:58] <alphe> bye all
[3:58] * alphe (~alphe@0001ac6f.user.oftc.net) Quit (Quit: Leaving)
[4:07] * jaydee (~jeandanie@124x35x46x15.ap124.ftth.ucom.ne.jp) has joined #ceph
[4:10] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[4:14] * rudolfsteiner (~federicon@190.247.73.163) Quit (Quit: rudolfsteiner)
[4:19] * markbby (~Adium@168.94.245.1) Quit (Remote host closed the connection)
[4:21] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[4:24] * huangjun|2 (~kvirc@111.172.152.22) has joined #ceph
[4:26] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[4:27] * sagelap1 (~sage@76.89.177.113) has joined #ceph
[4:27] * sagelap (~sage@2600:1012:b025:7e9a:25b0:cc61:dfe7:e868) Quit (Read error: No route to host)
[4:31] * huangjun (~kvirc@111.175.164.32) Quit (Ping timeout: 480 seconds)
[4:43] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[4:43] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[4:44] * glzhao (~glzhao@203.192.156.9) has joined #ceph
[4:45] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[4:51] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) has joined #ceph
[4:54] * jaydee (~jeandanie@124x35x46x15.ap124.ftth.ucom.ne.jp) Quit (Read error: Operation timed out)
[4:55] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[4:57] * glzhao (~glzhao@203.192.156.9) Quit (Quit: leaving)
[5:06] * fireD (~fireD@93-139-139-105.adsl.net.t-com.hr) has joined #ceph
[5:07] * fireD_ (~fireD@93-136-12-230.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:15] * ScOut3R (~ScOut3R@5402421C.dsl.pool.telekom.hu) has joined #ceph
[5:17] * sleinen1 (~Adium@2001:620:0:26:f952:34f7:8c2e:3dc4) Quit (Quit: Leaving.)
[5:17] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[5:20] * axisys (~axisys@ip68-98-189-233.dc.dc.cox.net) has joined #ceph
[5:24] * ScOut3R (~ScOut3R@5402421C.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[5:33] * jamespage (~jamespage@culvain.gromper.net) Quit (Ping timeout: 480 seconds)
[5:33] * jamespage (~jamespage@culvain.gromper.net) has joined #ceph
[6:11] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:13] * houkouonchi-work (~linux@12.248.40.138) Quit (Quit: Client exiting)
[6:39] * kenneth (~kenneth@202.60.8.252) has joined #ceph
[6:39] <kenneth> hi all!
[7:05] <huangjun|2> if one osd in the cluster is full, should this have an effect on the ceph-fuse client?
[7:06] <huangjun|2> we cannot mount the fuse client anymore
[7:06] <phantomcircuit> huangjun|2, you need to change the osd's weight so that there's less on it
[7:06] <phantomcircuit> iirc a single full osd means that writes won't work
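To spell that out: once any OSD crosses the full ratio (0.95 by default, with nearfull at 0.85) the cluster blocks writes to avoid losing data, which is why the client mount misbehaves. A rough sequence for finding and relieving the full OSD, assuming for the example that it is osd.3 (a made-up id):

    # see which OSDs are near full / full and how space is distributed
    ceph health detail
    rados df
    # temporarily lower the weight of the full OSD so data migrates off it
    ceph osd reweight 3 0.8
    # watch the rebalance and the warning clear
    ceph -w

Raising the full ratio works as an emergency valve, but adding capacity or rebalancing is the real fix.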
[7:10] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[7:11] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[7:12] <lautriv> so i went through the pain of ceph-deploy and managed to set up a test case with 1 mon and 2 osds but ceph still complains about a keyring ?
[7:16] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[7:17] <dmick> lautriv: could you be slightly more specific than "complains about a keyring"?
[7:21] <lautriv> dmick, a simple ceph health gives me --> 2013-07-26 07:09:24.859777 b7206740 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
[7:21] <dmick> so that sounds like the client.admin keyring is not present on the host you're trying to run the ceph CLI from
[7:22] <dmick> did you ceph-deploy admin <that host>?
[7:22] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[7:23] <lautriv> dmick, i missed that part in the howto ... the host where i did the deploy-stuff from, i guess.
[7:25] <lautriv> ok, i guess also HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean is somewhat normal on a fresh one.
[7:26] <dmick> stuck isn't good. does ceph osd tree show both osds in place and up and stuff?
[7:26] <dmick> http://ceph.com/docs/master/rados/deployment/ceph-deploy-admin/ is the reference for what I meant, btw.
[7:30] <lautriv> dmick, one box is not up, service ceph -a start didn't change or output anything and the log was fine :(
[7:33] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[7:34] <dmick> one...osd?
[7:35] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[7:35] <lautriv> dmick, i have one mon and 2 osd hosts; the first has 2 disks, the second has 1 but with the same size as the other 2; the single-disk one is up.
[7:47] <lautriv> dmick, ok, i found something, the failing box doesn't have anything below /var/lib/ceph/osd while the running one has .../ceph-0 with the disk mounted there. repeated the osd activate with no error but still no mount
[7:49] * root_ (~chatzilla@218.94.22.130) has joined #ceph
[7:50] * lautriv -> bed, laters
[7:53] * sjusthm (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[8:22] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[8:26] <kenneth> hi all! in my ceph conf i have this line
[8:27] <kenneth> cluster network = 10.2.0.0/24
[8:27] <kenneth> public netowrk = 10.1.0.0/24
[8:27] <kenneth> how can i be sure that replication is done on the cluster network?
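One way to check without extra tools, sketched here: when cluster network is set, each OSD binds its replication and heartbeat sockets to an address in that subnet, and ceph osd dump prints those addresses per OSD.

    # each osd line lists its public address followed by its cluster (replication) address;
    # the cluster address should fall inside 10.2.0.0/24 in this setup
    ceph osd dump | grep '^osd\.'

Watching interface counters (ifconfig, /proc/net/dev) during a heavy write then confirms the bulk of inter-OSD traffic is on the cluster-network interface.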
[8:35] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) Quit (Remote host closed the connection)
[8:43] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[8:43] <Cube> kenneth: You could always verify by checking with iftop
[8:44] <kenneth> "iftop is not installed" my ceph node are seperated from the network with internet access
[8:45] <Cube> ifconfig shows bandwidth used, guess that would work as well
[8:59] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[9:01] * sleinen1 (~Adium@user-23-17.vpn.switch.ch) has joined #ceph
[9:05] * sleinen2 (~Adium@2001:620:0:25:d511:c4bb:ee78:2a7e) has joined #ceph
[9:07] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[9:07] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:09] * sleinen1 (~Adium@user-23-17.vpn.switch.ch) Quit (Ping timeout: 480 seconds)
[9:12] * sleinen2 (~Adium@2001:620:0:25:d511:c4bb:ee78:2a7e) Quit (Quit: Leaving.)
[9:13] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:26] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[9:33] * KindTwo (~KindOne@h195.36.186.173.dynamic.ip.windstream.net) has joined #ceph
[9:36] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:36] * KindTwo is now known as KindOne
[9:37] <silversurfer> Hi all, I am working on a CEPH cluster with a shared physical network with the client and
[9:37] <silversurfer> I was wondering if there is a way to limit the bandwidth of OSD traffic
[9:37] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Quit: jlogan)
[9:37] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[9:43] * jaydee (~jeandanie@124x35x46x15.ap124.ftth.ucom.ne.jp) has joined #ceph
[9:44] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:46] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[9:47] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) has joined #ceph
[9:47] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[9:49] * leseb (~Adium@83.167.43.235) has joined #ceph
[9:51] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) Quit (Remote host closed the connection)
[9:55] * kenneth (~kenneth@202.60.8.252) Quit (Ping timeout: 480 seconds)
[9:56] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) has joined #ceph
[9:59] * jaydee (~jeandanie@124x35x46x15.ap124.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[10:00] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) Quit (Quit: erice)
[10:04] * kenneth (~kenneth@202.60.8.252) has joined #ceph
[10:04] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) has joined #ceph
[10:14] * sleinen (~Adium@2001:620:0:2d:38a6:181:150f:163e) has joined #ceph
[10:18] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[10:22] * sleinen (~Adium@2001:620:0:2d:38a6:181:150f:163e) Quit (Ping timeout: 480 seconds)
[10:26] * sleinen (~Adium@2001:620:0:26:499a:aafa:8a08:1264) has joined #ceph
[10:30] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) Quit (Remote host closed the connection)
[10:41] * LeaChim (~LeaChim@0540adc6.skybroadband.com) has joined #ceph
[11:02] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[11:03] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[11:04] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) has joined #ceph
[11:09] <lautriv> if using ceph-deploy and having 2 disks on the same OSD node, do i call " {node-name}:{disk}[:{path/to/journal}] " once per disk or " {node-name}:{disk}[:{path/to/journal}] {node-name}:{disk}[:{path/to/journal}] " in one go ?
[11:11] * leseb (~Adium@83.167.43.235) has joined #ceph
[11:19] * waxzce (~waxzce@glo44-2-82-225-224-38.fbx.proxad.net) Quit (Remote host closed the connection)
[11:20] <joelio> lautriv: how are you joining the 2 disks, with RAID/LVM or something?
[11:20] * JM (~oftc-webi@193.252.138.241) has joined #ceph
[11:21] <lautriv> joelio, since ceph-deploy doesn't work on md-raid i would add each as a plain disk.
[11:21] <joelio> so use one disk per OSD then
[11:22] <joelio> I do something like this.. 6 hosts and 6 disks https://gist.github.com/anonymous/b716796a5f155c08456e
[11:22] <joelio> so 36 OSD in total
[11:23] <lautriv> joelio, that is partially why i ask, pre-deploy does [osd.X] but with deploy i can only call by hostname:disk:journal where i miss the unique identifier.
[11:24] * haomaiwa_ (~haomaiwan@117.79.232.196) Quit (Ping timeout: 480 seconds)
[11:24] <joelio> what do you mean pre-deploy?
[11:25] <joelio> a time earlier, i.e. mkcephfs days?
[11:26] <joelio> ceph-deploy just handles the numbering, you don't need to worry about it. It looks in the relevant default paths to enumerate the number of OSDs when running (which is why you don't see them in ceph.conf)
[11:26] <lautriv> joelio, ceph.conf before ceph-deploy came into the game but your example points to the second part of my question, calling one ceph osd {prepare|activate|create} with all host:disk in one line
[11:27] <joelio> yea, I do it that way, so much easier than all the other commands
[11:27] <joelio> I think generally you want to flatten the disk, partition it and get an FS on there.. may as well do it all in one step
[11:28] <lautriv> one call per disk seems not to work anyway, i get no errors but also no disk mounted.........lemme see what happens .
[11:30] <joelio> sure, that step worked for my colleagues in another office just yesterday, so should work fine.. obviously change the hostname/number of hosts/osd device path to suit
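To make the earlier question concrete: ceph-deploy accepts several host:disk specs in a single invocation, and each disk becomes its own OSD either way. A sketch with a made-up host name node1 and optional journal partitions on a shared SSD (device names here are invented):

    # one call per disk
    ceph-deploy osd create node1:/dev/sdb
    ceph-deploy osd create node1:/dev/sdc
    # or both disks in one call, each with its own journal partition
    ceph-deploy osd create node1:/dev/sdb:/dev/sda5 node1:/dev/sdc:/dev/sda6

ceph-deploy assigns the osd.X ids itself, which is why they no longer have to be written into ceph.conf by hand.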
[11:31] <joelio> we now have 2 ceph clusters <insert evil muahhahaaaa>
[11:35] <lautriv> would not be that bad if ceph-deploy could understand md-raid but it neither lists nor handles the ( unnecessary ) partitions right.
[11:36] <joelio> I'm not sure why you'd want to use an md device, personally
[11:36] <joelio> ceph does striping and redundancy - why add another layer of complexity
[11:37] * kenneth (~kenneth@202.60.8.252) Quit (Ping timeout: 480 seconds)
[11:39] * allsystemsarego (~allsystem@188.27.164.169) has joined #ceph
[11:40] <lautriv> joelio, because mdadm is fast and local and doesn't even have to touch a network or monitor, you are right about needless layers but that counts much more for LVM.
[11:43] * waxzce (~waxzce@2a01:e34:ee97:c5c0:b190:7c48:9c9:4b24) has joined #ceph
[11:44] <lautriv> another strange thing is : if i zap/prepare fdisk tells me GPT and parted tells me unrecognized disk label ?
[11:47] * Guest2192 (~quassel@coda-6.gbr.ln.cloud.data-mesh.net) Quit (Remote host closed the connection)
[11:49] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[11:55] <lautriv> ok, found something confusing : the disklabel gets corrupted on disks with 73G but not on disks with 146G
[11:57] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[12:03] * haomaiwang (~haomaiwan@117.79.232.246) has joined #ceph
[12:06] * haomaiwa_ (~haomaiwan@117.79.232.214) has joined #ceph
[12:09] * julian (~julianwa@125.70.135.241) Quit (Quit: afk)
[12:11] * haomaiwang (~haomaiwan@117.79.232.246) Quit (Ping timeout: 480 seconds)
[12:14] <joelio> lautriv: mdadm is not fast and local across the network though? You're conflating two things there..
[12:14] * waxzce (~waxzce@2a01:e34:ee97:c5c0:b190:7c48:9c9:4b24) Quit (Remote host closed the connection)
[12:15] <lautriv> joelio, a local bunch of disks handled apart from the ceph layer; i won't say mdadm does the whole job but it can reduce the work for ceph.
[12:15] <lautriv> ok, this are the last words in the log regarding the failing osd and i have no idea what essentially is wrong : http://pastebin.com/LNnFX7ng
[12:17] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[12:17] <joelio> there are no errors in there, perhaps you could describe your issue itself?
[12:18] <joelio> btw I'd be interested to know how many people use mdadm backed devs rather than letting Ceph manage it - to me it seems like it could cause a double blind and potential data loss
[12:20] * waxzce (~waxzce@office.clever-cloud.com) has joined #ceph
[12:20] <lautriv> that is my culprit ... disk zap -> fine, osd prepare -> garbled disklabel; if i create the partitions manually and do osd activate, the labels stay correct, but in both cases no osd starts, not even the /var/lib/ceph/osd/* mountpoints.
[12:21] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit ()
[12:21] <lautriv> joelio, if some people use mdadm behind it, that is from before cuttlefish b/c ceph-deploy doesn't handle it.
[12:22] <joelio> ?? Why not just use what I sent you in that gist?
[12:24] <lautriv> joelio, you count several hosts up but not drives, since this is just one host with 2 drives, that's equal, did i miss something ?
[12:25] <lautriv> "snip" vm-ds-0$n.ch:/dev/sdb vm-ds-0$n.ch:/dev/sdc "snap" same host per call, different drives like i did
[12:28] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[12:29] <lautriv> ah, you meant the create --zap instead prepare/activate thingie ? similar result, will paste
[12:31] <lautriv> https://gist.github.com/anonymous/6087880
[12:38] <joelio> Again, what's the issue.. that's just debug info and it all looks fine to me
[12:41] <lautriv> joelio, the OSD in question won't even start.
[12:42] * julian (~julianwa@125.70.135.241) has joined #ceph
[12:42] <joelio> it'll start automatically in that command, what are you trying to do to start it?
[12:44] <lautriv> i check locally with pgrep ceph and do service ceph -a start
[12:44] <lautriv> heh, to shed some more light on it : that box has ceph-disk.activate.lock and ceph-disk.prepare.lock in /var/lib/ceph/tmp
[12:44] <lautriv> i assume they're from some former failure and it may be ok to remove them ?
[12:46] <joelio> If you have nothing working now, then I guess so ;)
[12:51] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[12:53] <lautriv> no change and a new ceph-disk.prepare.lock, there is no related process running, neither some create-keys nor anything touching my drives or the network
[12:55] <lautriv> how do i kill all the config but keep the install ?
[13:05] <lautriv> or rather, if i rm all files below /var/lib/ceph and the conf in /etc, can i assume there is nothing backed up somewhere ( old keys and such ) ?
[13:09] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[13:12] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[13:20] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:25] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[13:36] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[13:37] <lautriv> ok, after all it turns out i probably found a bug ... seems ceph-deploy is messing with disklabels on small drives.
[13:39] <lautriv> very same commands, on a 146G i get this result :
[13:40] <lautriv> Disk /dev/sdb: 146GB
[13:40] <lautriv> Sector size (logical/physical): 512B/512B
[13:40] <lautriv> Partition Table: gpt
[13:40] <lautriv> Number Start End Size File system Name Flags
[13:40] <lautriv> 17.4kB 1049kB 1031kB Free Space
[13:40] <lautriv> 1 1049kB 146GB 146GB xfs ceph data
[13:40] <lautriv> while the same on a 73G drive results in : Error: /dev/sdc: unrecognised disk label
[13:43] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit (Quit: rudolfsteiner)
[13:43] <joelio> lautriv: don't know, we're all 2TB here..
[13:43] <lautriv> joelio, i dislike the large ones
[13:44] <joelio> 2tb gives best bang for buck
[13:44] <lautriv> but anything above 600G is slow
[13:45] <joelio> not sure I'd agree there, if you're using SATA 500G and SATA 2TB, going to get same speeds. If it's SAS it's going to be quicker of course
[13:46] <lautriv> joelio, i prefer SCSI and SAS; also the reason for the higher speed is the changed technology to achieve the higher density
[13:47] <joelio> the whole point of ceph is to just use commodity hardware.. you offset the speeds by using ssd backed journals etc.. so I'm not sure that really fits here :)
[13:48] <lautriv> joelio, was just a sidenote because ceph is not all i do/need but it obv. garbles disklabels on smaller ones.
[13:49] <darkfaded> joelio: the whole point of ceph is to scale out. i don't see reason to not scale out over fast components :))
[13:49] * lautriv dedicated a myricom network for the private cluster ;)
[13:49] <joelio> I'm not going to disagree on that front, but if you look at cost/benefit analysis... ;)
[13:50] <darkfaded> of course the osd journal eases up the whole latency issue a lot and so it might be irrelevant
[13:50] <joelio> if you have the kit and money, do it :)
[13:50] <joelio> we don't, public sector
[13:51] <lautriv> joelio, still opinions but it should be checked/agreed/fixed.
[13:53] <joelio> yea, I understand.. I'd personally love to have some 40G and SSD only behemoth cluster :D
[13:53] <joelio> lautriv: if you think it's a bug in deploy, raise it as a bug
[13:53] <joelio> I've got some 147g scsi knocking about here, if I get a chance, I'll test too
[13:54] <lautriv> joelio, 147 works but 73 not
[13:54] <lautriv> does this involve registration ?
[13:55] <joelio> ahh, ok, not sure I have any machines with less than 100g now tbh
[13:55] <joelio> lautriv: I'd imagine so :) only takes a minute..
[13:55] <joelio> otherwise pop a mail to the list
[13:56] <lautriv> if i need a TB, i would collect 14x73G instead of one disk but still......flavours and opinions
[13:58] <lautriv> upgrading distro-mirrors, will recheck with latest tools before.
[14:04] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[14:05] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[14:06] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[14:08] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[14:08] * markbby (~Adium@168.94.245.2) has joined #ceph
[14:11] * markbby (~Adium@168.94.245.2) Quit ()
[14:13] * markbby (~Adium@168.94.245.2) has joined #ceph
[14:17] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:18] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[14:20] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:22] * abh_ (~oftc-webi@gw.vpn.autistici.org) has joined #ceph
[14:22] <abh_> hi
[14:26] * huangjun|2 (~kvirc@111.172.152.22) Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[14:27] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[14:28] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:32] <abh_> is this a right chan for an internal (stupid) question about ceph?
[14:34] <liiwi> shoot
[14:34] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (Remote host closed the connection)
[14:34] * kaveh (~Own3R@91.99.142.149) has joined #ceph
[14:34] <kaveh> http://www.vipspeak.com/ new chat program , with voice , admin powers in your own room, pm , multi chat ...
[14:34] * kaveh (~Own3R@91.99.142.149) has left #ceph
[14:38] <joelio> wow, that's like... irc
[14:41] * yanzheng (~zhyan@134.134.139.76) has joined #ceph
[14:44] * jjgalvez1 (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[14:45] <abh_> well, is there a doc that shows how internally ceph interacts with real disks (sata, ssd, etc) ? ie: if i've got 2 vms, each with its own lvm disk, does ceph implement its network layer and export the 2 disks as a single disk to use in a third vm ?
[14:45] * jluis (~JL@89.181.148.68) Quit (Remote host closed the connection)
[14:46] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[14:47] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[14:47] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[14:50] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit (Quit: rudolfsteiner)
[14:50] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[14:55] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[14:59] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[14:59] <joelio> abh_: http://ceph.com/docs/next/architecture/
[15:00] * yanzheng (~zhyan@134.134.139.76) Quit (Remote host closed the connection)
[15:06] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[15:08] <lautriv> joelio, see THAT was needless layering ;)
[15:16] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[15:20] <joelio> lautriv: you seem to think I've not tested all the various combinations - you're free to make your own mind up though ;)
[15:21] <joelio> if it works for you, roll with it )
[15:21] <joelio> :)
[15:21] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit (Quit: rudolfsteiner)
[15:22] <joelio> I'm not convinced about there being no silent data loss too using mdadm backed devs, there is no way to introspect all the layers but hey ho.. Maybe ceph scrubbing a massive mdadm backed OSD is good enough
[15:23] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[15:23] <lautriv> joelio, i can't actually follow you, was this meant to me ?
[15:24] <joelio> considering you were the last person to speak in the channel and pointed your comment at me, I guess so ;)
[15:27] * mschiff_ (~mschiff@port-46347.pppoe.wtnet.de) Quit (Remote host closed the connection)
[15:30] <lautriv> joelio, ah, my comment about needless layers ;)
[15:30] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[15:32] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) has joined #ceph
[15:36] * elmo (~james@faun.canonical.com) has joined #ceph
[15:43] * abh_ (~oftc-webi@gw.vpn.autistici.org) Quit (Remote host closed the connection)
[15:45] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) Quit (Read error: Connection reset by peer)
[15:52] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) has joined #ceph
[15:53] * julian (~julianwa@125.70.135.241) Quit (Quit: afk)
[15:59] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[16:01] <elmo> hey, is there anyway to disable thin provisioning in rbd?
[16:01] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) Quit (Read error: Connection reset by peer)
[16:02] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 484 seconds)
[16:04] <joelio> elmo: out of interest, why do you want to do that? I guess you could create thin then dd in from /dev/zero
[16:05] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) has joined #ceph
[16:06] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (Ping timeout: 480 seconds)
[16:07] <elmo> joelio: mostly so that provisioning a new volume fails, rather than a write to a postgres FS
[16:07] <elmo> if that makes sense
[16:07] <elmo> i.e. I'd rather the failure happen at creation time than to a running volume
[16:09] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[16:10] <joelio> don't know what postgres has got to do with it, but the overall space defined by the image is removed from the available space.. it's just not allocating the full zero'd out block device
[16:10] <Azrael> joelio: he's likely running a postgresql db on rbd
[16:11] <joelio> but if you want fat provisioning, create the volume thin and then dd in from zero, that's effectively what fat provisioning does I guess
[16:11] <elmo> " but the overall space defined by the image is removed from the available space." <-- that's a) not what I understand thin provisioning to mean, b) is not what I'm seeing in practice with our ceph instance
[16:12] <elmo> but maybe I'm missing something/being a muppet
[16:13] <joelio> thin provisioning is just not allocating the full image, it still removes available data allowance.. it does for me at least :)
[16:13] <elmo> I have ~30TB of volumes created, and 'ceph -s' shows ~20TB of usage
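A minimal sketch of the create-thin-then-fill approach joelio suggests above; the image name, size and mapped device path are hypothetical and will differ per setup:

    rbd create pgvol --size 102400                      # 100 GB image; rbd images start out thin-provisioned
    rbd map pgvol                                       # typically shows up as /dev/rbd0 (or /dev/rbd/rbd/pgvol)
    dd if=/dev/zero of=/dev/rbd0 bs=4M oflag=direct     # writing every block forces the space to actually be allocated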
[16:13] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[16:15] <jeff-YF> I am running a 3 node ceph cluster with OSD's on each node mapped to raid 0 virtual disks… I am finding it a management nightmare when a drive fails and I replace the disk and have to create a new virtual disk and all the drive letters change… is there a better way of doing this?
[16:16] * lyncos (~chatzilla@208.71.184.41) has joined #ceph
[16:16] <lyncos> Hi .. I would like to know what this means: 2013-07-26 14:15:44.846460 osd.4 [WRN] slow request 69646.470872 seconds old, received at 2013-07-25 18:54:58.375519: osd_op(client.631398.0:1 volume-b6ec691b-2cf2-4250-a96b-b3de6d1b5b93.rbd [stat] 5.67924f2 RETRY=12 e1335) v4 currently reached pg
[16:17] <lyncos> It came all of sudden..
[16:17] <jeff-YF> Is it better to use RAID 5 arrays so that I can use the RAID controller to keep track of the physical drives?
[16:18] <joelio> lyncos: slow OSD, have you gone through the troubleshooting? http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
[16:18] <lyncos> no I will do
[16:18] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) Quit (Read error: Connection reset by peer)
[16:19] <joelio> jeff-YF: why do you have to recreate virtual devs? Surely the abstraction layer takes care of that. I certainly don't have to recreate devs when replacing an OSD?
[16:20] <lyncos> joelio .. I did check all of this.. the server had a problem, I rebooted it, but it seems the slow request is always on the same PG and the number of seconds keeps increasing
[16:20] <jeff-YF> the old virtual disk seems to disappear when i remove the failed drive.. using a perc5e card
[16:20] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) has joined #ceph
[16:21] <joelio> jeff-YF: what middleware?
[16:21] * _Tass4da1 (~tassadar@tassadar.xs4all.nl) Quit (Remote host closed the connection)
[16:22] * _Tassadar (~tassadar@tassadar.xs4all.nl) has joined #ceph
[16:22] <jeff-YF> middleware?… hmm.. you mean megaraid?
[16:22] <joelio> no, virtual machine software
[16:22] <joelio> I'd steer clear of megaraid if I were you
[16:22] <joelio> but that's just me
[16:23] <joelio> lyncos: try http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
[16:23] <joelio> if it's the same pg
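For reference, the commands that troubleshooting page leans on look roughly like this; the pg id below is a placeholder to be taken from 'ceph health detail', not from the warning above:

    ceph health detail            # lists the individual slow/stuck PGs instead of a one-line summary
    ceph pg dump_stuck inactive   # or: unclean, stale
    ceph pg 5.3f query            # replace 5.3f with the stuck pg id; shows its OSDs and what it is waiting on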
[16:23] <jeff-YF> joello: vmware
[16:24] <joelio> ah, well can't help you then. I use libvirt and definitely don't have to recreate devices! One of the points of ceph is abstracting everything away so you don't get the situations you're describing
[16:25] <joelio> maybe someone in here is a vmware user
[16:26] <jeff-YF> joello: i don't think this has anything to do with vmware.. its a ceph OSD that fails
[16:26] <jeff-YF> when i troubleshoot I find that its the physical disk assigned to the OSD which has failed
[16:27] <jeff-YF> joello: what kind of raid card are you using that you don't have to create a new virtual raid 0 when a drive fails?
[16:27] <joelio> yea, but if you replace the disk, that's in the ceph layer - not anything to do with the virtual device abstractions
[16:28] <joelio> jeff-YF: I don't use any RAID
[16:28] <joelio> Ceph has it built in!
[16:28] <joelio> striping, redundancy.. why add more layers ;)
[16:28] <joelio> I use a JBOD controller
[16:28] <joelio> works an absolute treat and very performant
[16:29] <jeff-YF> joello: I am not striping.. I am just creating a R0 VD for each drive
[16:29] <jeff-YF> which JBOD controller are you using? (curious)
[16:29] <joelio> well, if you're doing that, then it's a raid 0
[16:29] <joelio> same applies to physical disk
[16:29] <joelio> you lose one, gone
[16:29] <joelio> unless I'm missing something obvious
[16:30] <joelio> jeff-YF: cheap supermicro chassis, can't remember model number
[16:30] <jeff-YF> joello: I think my issue is using the perc5e… its just not working well for ceph
[16:30] <joelio> well, yea
[16:30] <joelio> I really have been burnt by megaraid several times over the years
[16:30] <nhm> from what I've seen so far, a controller (typically RAID) with WB cache is helpful if you have your journals on the same disks as the data as the WB cache can help aggregate writes and reduce contentious seeks. If you have journals on SSD, straight up SAS controllers seem to be as good if not better.
[16:31] <joelio> nhm: +1
[16:33] <jeff-YF> nhm: I am using SSD for my journals, sounds like it would be good to replace my perc5e with a sas controller
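A sketch of how a separate SSD journal is typically declared with ceph-deploy, in line with nhm's point; host and device names are made up and the SSD partition must already exist (or be created by ceph-disk):

    ceph-deploy osd create node1:/dev/sdb:/dev/sdg1   # data on spinner sdb, journal on SSD partition sdg1
    ceph-deploy osd create node1:/dev/sdc             # no third field: the journal lands on the data disk itself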
[16:34] <nhm> jeff-YF: perc5e is pretty old... I haven't tested one of those in a while.
[16:34] <nhm> Not sure how it would perform to be honest.
[16:34] <joelio> We've got an lsi 9207e (i think) - SAS2308 chip anyway
[16:35] <nhm> joelio: that's a nice card. I've got 4 of them in our supermicro box right now.
[16:35] <joelio> yea, not too expensive either
[16:36] <nhm> amazingly the highpoint rocket 2720SGL has nearly the same level of performance. Not sure if it will last as long.
[16:36] <joelio> never used those before tbh, these were the cards the other storage bods at $WORK used, so made sense to have parity :)
[16:37] <nhm> joelio: yeah, the 9207 is great
[16:37] <nhm> joelio: 1 card per node?
[16:37] <joelio> yep, quad lane
[16:38] <nhm> oh, interesting. I'm using the card with dual internal ports.
[16:38] <nhm> joelio: and directly connecting each disk without an expander backplane.
[16:38] <nhm> well, there is a backplane, but no expanders.
[16:38] * jluis (~JL@89.181.148.68) has joined #ceph
[16:41] <mattch> Playing on a test ceph cluster to try and break things, and tested a network outage by firewalling all the mons and osds off from each other :) Now I see 'HEALTH_WARN 1024 pgs peering; 1024 pgs stuck inactive; 1024 pgs stuck unclean'. I've read over the notes at 'Troubleshooting PGs' but I can't find any clear info on what to do next. Anyone got any pointers on what the recovery procedure for this is (if there is one?)
[16:44] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[16:45] * niklas (~niklas@2001:7c0:409:8001::32:115) Quit (Ping timeout: 480 seconds)
[16:46] * mgalkiewicz (~mgalkiewi@178-36-251-192.adsl.inetia.pl) has joined #ceph
[16:47] <mattch> scratch that - looks like I still had some stuff set in the firewall which was blocking stuff... turning off the firewall and it all resolved nicely!
[16:48] * niklas (~niklas@2001:7c0:409:8001::32:115) has joined #ceph
[16:49] <joelio> nhm: yea, these are the external port variety. A storage node is a dell r320, 8 cores, 32GB and an external supermicro chassis (currently half populated, so we have room to add more OSDs)
[16:49] <joelio> all JBOD, no raid malarky :D
[16:49] <nhm> joelio: very interesting! Performance is ok?
[16:50] <joelio> yea, great for us.. we're using it for rbd devices and using opennebula. I'm using the latest transfer manager which does the layering/snapshot stuff.. 200 40GB VMs in a minute
[16:50] <joelio> so rbd caching and local fscache in the images works really well
[16:51] * mikedawson (~chatzilla@rrcs-24-123-27-250.central.biz.rr.com) Quit (Ping timeout: 480 seconds)
[16:51] <joelio> we'll be looking into doing more librados/s3 style stuff soon - mainly for video capture storage
[16:51] <janos> joelio: for that storage node, are you setting your failure domain in crush to the osd level as opposed to host?
[16:51] <janos> or do you have multiple storage nodes
[16:51] <joelio> multiple
[16:52] <joelio> 6 to be precise
[16:52] <janos> nice
[16:52] * janos with his home-brewed cluster is drooling
[16:52] <janos> it's a good thing i'm married with kids. otherwise my money would lean far too much to hardware ;)
[16:52] <joelio> :D yea, we can double the OSD space, and use the 2nd 10Gbit port when we come to upgrade.. no need for any additional kit, bar disks
[16:53] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[16:53] <joelio> disks are pretty cheap now too, so all good :)
[16:53] <janos> yeah
[16:53] <janos> i need to get more here
[16:53] <ntranger> hey all! I'm running into a "no filesystem type defined" error when the ceph service starts. I'm having a hell of a time trying to pinpoint what I've done wrong. Anyone have any ideas as to what I should check? Thanks!
[16:54] * allsystemsarego (~allsystem@188.27.164.169) Quit (Quit: Leaving)
[16:54] * jeff-YF (~jeffyf@67.23.117.122) Quit (Quit: jeff-YF)
[16:56] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[17:01] * jeff-YF (~jeffyf@67.23.123.228) has joined #ceph
[17:01] <mattch> ntranger: Have you set 'osd mkfs type' in your ceph config ?
[17:06] <ntranger> yeah, it's like this: "osd mkfs options ext4 = user_xattr,rw,noatime"
[17:07] <ntranger> should I change options to type?
[17:15] * JM (~oftc-webi@193.252.138.241) Quit (Quit: Page closed)
[17:16] <ntranger> the OSD config looks like this
[17:16] <ntranger> [osd]
[17:16] <ntranger> osd data = /srv/ceph/osd$id
[17:16] <ntranger> osd journal = /srv/ceph/osd$id/journal
[17:16] <ntranger> osd journal size = 1000
[17:16] <ntranger> osd class dir = /usr/lib/rados-classes
[17:16] <ntranger> keyring = /etc/ceph/keyring.$name
[17:16] <ntranger> ; working with ext4
[17:16] <ntranger> filestore xattr use omap = true
[17:16] <ntranger> ; solve rbd data corruption
[17:16] <ntranger> filestore fiemap = false
[17:16] <ntranger> osd mkfs options ext4 = user_xattr,rw,noatime
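Going by mattch's question above, the piece that looks missing from this [osd] section is an 'osd mkfs type' entry; a hedged sketch of what could be added for ext4 (note that user_xattr,rw,noatime look like mount options, so they may belong under 'osd mount options ext4' rather than the mkfs line - ntranger would need to verify):

    osd mkfs type = ext4
    osd mount options ext4 = user_xattr,rw,noatime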
[17:18] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[17:21] * bergerx_ (~bekir@78.188.101.175) Quit (Remote host closed the connection)
[17:21] * markl (~mark@tpsit.com) has joined #ceph
[17:22] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[17:22] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[17:27] <joelio> ntranger: do you need manual creation vs. ceph-deploy - curious
[17:29] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[17:31] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Remote host closed the connection)
[17:32] * sleinen (~Adium@2001:620:0:26:499a:aafa:8a08:1264) Quit (Quit: Leaving.)
[17:32] * sleinen (~Adium@130.59.94.234) has joined #ceph
[17:33] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:33] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:36] * huangjun (~kvirc@119.147.167.193) has joined #ceph
[17:37] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[17:39] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[17:40] * sleinen (~Adium@130.59.94.234) Quit (Ping timeout: 480 seconds)
[17:42] * sagelap (~sage@2600:1012:b001:341f:1d02:b23b:b077:af09) has joined #ceph
[17:45] <huangjun> hello,all
[17:46] <huangjun> how to set auto weight of osd when create osds?
[17:46] * alram (~alram@38.122.20.226) has joined #ceph
[17:47] * sagelap1 (~sage@76.89.177.113) Quit (Ping timeout: 480 seconds)
[17:47] <joelio> huangjun: when you add them they are already balanced.. if you need something more esoteric - http://ceph.com/docs/master/rados/operations/crush-map/
[17:48] <josef> for some reason make install keeps wanting to install ceph-create-keys into /usr/usr/sbin instead of /usr/sbin
[17:48] <josef> glowell: ^^
[17:49] <joelio> huangjun: if you #ceph osd tree - you will see the weighting
[17:50] <huangjun> joelio: i used the default settings when deploying osds, and the default osd weight only takes disk capacity into consideration?
[17:51] * josef shakes his fist at ceph_sbindir
[17:52] <joelio> huangjun: not sure I follow. I have a cluster with a value of 65.52, each host (6 of them) has 10.92, each host has 6 OSDs with a weight of 1.82
[17:52] * mxmln3 (~maximilia@212.79.49.65) Quit ()
[17:52] <joelio> huangjun: are you looking for some other metric to weight by?
[17:53] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[17:53] <joelio> if it's speed etc, then it sounds like you need to rework the crushmap.. not aware of doing this at OSD creation time - may be wrong though
[17:54] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[17:54] <joelio> huangjun: ceph osd crush reweight {name} {weight}
[17:54] <joelio> is a simple way to do that
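For reference, a quick sketch of the two commands joelio mentions; osd.3 and the 1.82 weight are placeholders:

    ceph osd tree                        # shows the CRUSH hierarchy and current weight of every OSD
    ceph osd crush reweight osd.3 1.82   # sets the CRUSH weight (commonly sized to the disk capacity in TB)
    # note: 'ceph osd reweight osd.3 0.8' is a different knob - a temporary 0..1 override, not the CRUSH weight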
[17:55] <huangjun> joelio: and i found some weird things, i have 11 osds and 2 of them run on 1TB disks, and i wrote 4TB of data into the cluster; in the end one osd got full, using about 96% of the disk, and the other just used 450GB, it's interesting
[17:56] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[17:57] <joelio> huangjun: that doesn't sound right to me, not sure of the implications of OSD asymmetry though
[17:58] <huangjun> what about your cluster's osd data percentage?
[17:58] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:58] * sprachgenerator (~sprachgen@130.202.135.197) has joined #ceph
[17:59] <joelio> huangjun: they are all (pretty much) equal in terms of usage
[17:59] <joelio> maybe a percentage difference (to be expected)
[17:59] <joelio> certainly not what you are seeing with yours
[18:01] <joelio> huangjun: just thinking out loud.. how many placement groups did you set for a given pool?
[18:01] <huangjun> uhh, i'll test this more
[18:01] * markbby (~Adium@168.94.245.2) Quit (Quit: Leaving.)
[18:01] <huangjun> the default pg num
[18:02] <joelio> 8? that may be the issue then
[18:04] * joelio really doesn't think saying 8 is good in the docs
[18:04] <joelio> if it's not suitable for most systems.. why is it in there
[18:04] <huangjun> no, 64 PGs every pool, total 192 pgs
[18:04] <joelio> yea, that's a little better I guess
[18:05] <huangjun> the default data, metadata and rbd pools are 2^6 pgs each; if you create a new pool, you can specify the pg num
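For context, the usual rule of thumb in the Ceph docs is roughly (number of OSDs x 100) / replica count, rounded to a nearby power of two; a sketch with made-up numbers and a hypothetical pool name:

    # 11 OSDs, 2 replicas  ->  11 * 100 / 2 = 550  ->  use 512 (or 1024)
    ceph osd pool create mypool 512 512      # pg_num and pgp_num for a new pool
    ceph osd pool set rbd pg_num 512         # pg_num of an existing pool can be increased (not decreased)
    ceph osd pool set rbd pgp_num 512        # pgp_num should follow pg_num once the split completes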
[18:06] <ntranger> joelio: I actually used mkcephfs
[18:06] <joelio> ntranger: that's deprecated
[18:06] <joelio> ceph-deploy is the tool now - for all its quirks :)
[18:07] <ntranger> ok. I was seeing that. I'll get it downloaded and run that. :)
[18:10] * niklas (~niklas@2001:7c0:409:8001::32:115) Quit (Ping timeout: 480 seconds)
[18:12] <sprachgenerator> so I'm having some curious trouble trying to bring up 160 OSDs across 20 hosts. Whenever I try to start ceph, the first osd is stuck at "mounting xfs"; looking at the host for osd.0 I can clearly see that the osd.0 mount point is already mounted/there with files accessible. Running a ceph status shows that the PGs are stuck in "creating" - total PGs is around 30k - any ideas on what could be happening here?
[18:14] * sleinen (~Adium@2001:620:0:26:a118:e04a:44c9:8be7) has joined #ceph
[18:14] <huangjun> how many osds up and in when stuck in the creating status?
[18:14] <sprachgenerator> 0
[18:15] <sprachgenerator> [INF] : pgmap v2: 30912 pgs: 30912 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail -- osdmap e1: 0 osds: 0 up, 0 in
[18:15] <joelio> sprachgenerator: how are you deploying?
[18:15] <huangjun> log in to the "mounting xfs" osd and see what happened,
[18:15] <sprachgenerator> I'm using the mkcephfs command to build the cluster, then service ceph -a start
[18:16] <huangjun> what version of ceph did you use?
[18:17] <huangjun> before 0.56 you could use mkcephfs to deploy the cluster, but ceph-deploy is used on version 0.56 and later
[18:17] <sprachgenerator> on osd.0: http://pastebin.com/gNDsbvzr
[18:18] <sprachgenerator> running 0.61.7 - and ceph-deploy is broken for me
[18:19] <sprachgenerator> specifically with ceph-deploy the lack of support for http_proxy prevents the initial install - this probably should be filed as a bug report for it
[18:20] * sagelap (~sage@2600:1012:b001:341f:1d02:b23b:b077:af09) Quit (Read error: Connection reset by peer)
[18:20] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[18:22] <joelio> sprachgenerator: just install manually then?
[18:22] <joelio> the rest of the ceph-deploy doesn't need routable internets
[18:22] * joelio a proxy user
[18:22] <joelio> and a ceph-deploy user
[18:23] <joelio> mkcephfs isn't supported anymore and certainly not for 0.6x
[18:23] <joelio> .. also the wget will pull from env vars, so if you have http_proxy set, it'll wget the key no problem
[18:24] <joelio> the install task won't work, admittedly
[18:24] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) has joined #ceph
[18:24] <joelio> if you have the ssh keys enabled across the hosts, use parallel-ssh to run something like parallel-ssh {hosts list} 'HTTP_PROXY=http://host wget.. blah'
[18:25] <joelio> parallel-ssh ships with ceph-deploy iirc
[18:25] <joelio> .. or make a patch for ceph-deploy :D
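A rough sketch of that approach; hosts.txt, the proxy address and <ceph-release-key-url> are all placeholders, and this assumes Debian/Ubuntu-style packages with the ceph repo already configured on the nodes:

    parallel-ssh -h hosts.txt -i 'http_proxy=http://proxy.example:3128 wget -qO- <ceph-release-key-url> | sudo apt-key add -'
    parallel-ssh -h hosts.txt -i 'sudo http_proxy=http://proxy.example:3128 apt-get update'
    parallel-ssh -h hosts.txt -i 'sudo http_proxy=http://proxy.example:3128 apt-get install -y ceph'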
[18:26] <sprachgenerator> with regards to ceph-deploy and http_proxy, the specific problem is that the environment variable does not apply to the subprocess once it's launched
[18:26] <sprachgenerator> either set as part of .bashrc or /etc/environment
[18:26] * r0r_taga (~nick@greenback.pod4.org) has joined #ceph
[18:26] <joelio> ok, well install the keys and software via parallel-ssh
[18:26] <joelio> use ceph-deploy for the rest
[18:26] <joelio> that's what I did
[18:27] <sprachgenerator> thanks for the tip - i'll give that a go and see where it puts me
[18:27] <joelio> yea, or use some config management of your choice
[18:28] <joelio> either way, once the ceph binaries are installed, ceph-deploy new.. etc. will all work fine
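Once the binaries are in place, the remaining ceph-deploy flow joelio describes looks roughly like this (hostnames and devices hypothetical):

    ceph-deploy new mon1 mon2 mon3        # writes ceph.conf and the initial monitor keyring locally
    ceph-deploy mon create mon1 mon2 mon3
    ceph-deploy gatherkeys mon1
    ceph-deploy osd create --zap-disk osd1:/dev/sdb osd1:/dev/sdc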
[18:28] <sprachgenerator> I'm still testing at this point - wanted to flush a few things out before going further down the road, this is being tested against a gluster stack as a candidate for shared storage on an openstack cluster that is due for an upgrade
[18:29] * alphe (~alphe@0001ac6f.user.oftc.net) has joined #ceph
[18:29] <joelio> definitely go down the block device route then
[18:29] <joelio> rather than using cephfs for vm image storage
[18:30] <joelio> rbd == win
[18:30] <alphe> can i have 1 osd per disk on one node?
[18:31] <alphe> on my node I have the system disk and two data dedicated disks
[18:31] <joelio> alphe: do you have multiple nodes?
[18:31] <alphe> i read that you need to store the metadata files from btrfs on other disks ...
[18:32] <joelio> erm, not quite..
[18:32] <alphe> joelio yes I have 10 nodes and all of them have 1 system disk 2 data disks
[18:32] <joelio> OSDs can be created with btrfs or XFS/Ext4 with XATTRS
[18:32] <joelio> I'd go XFS personally, but depends on how much you trust btrfs
[18:33] <joelio> ymmv!
[18:33] <joelio> on the number of osds per node, one is fine
[18:34] <alphe> ok so how do I create it with 2 data disks?
[18:34] <joelio> I would just create 2 OSDs per node
[18:35] <joelio> if you have 2 physical disks
[18:35] <alphe> ceph-deploy osd create osd01:/dev/sda
[18:35] <joelio> ceph-deploy osd create osd01:/dev/sda osd01:/dev/sdb
[18:35] <joelio> there, 2 per node ;)
[18:35] <alphe> then ceph-deploy osd create osd01:/dev/sdb
[18:35] <joelio> no, you can do multiple OSD creation per command
[18:35] <joelio> ceph-deploy osd create osd01:/dev/sda osd01:/dev/sdb
[18:36] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[18:36] <joelio> add --zap-disk if you want the disks cleaned and partitioned too
[18:36] <alphe> ok cause in the example from the documentation they do ceph-deploy osd create osdserv1:/dev/sda:/dev/sdda1
[18:36] <joelio> that's for a journal too
[18:37] <alphe> i m confused because I read the ceph-deploy howto and the ceph-deploy documentation at same time
[18:37] <alphe> and both docs are telling different things
[18:37] <alphe> xfs seems fine but what are the perfs ?
[18:38] <joelio> hmm, you won't be the first person who's had issues grokking the how-to
[18:38] <joelio> this is how I do mine.. 6 hosts, 6 osds per hosts, journals on the disks themselves -https://gist.github.com/anonymous/b716796a5f155c08456e
[18:38] <alphe> the how-to is great but kinda simplistic when it comes to explanations
[18:38] <ntranger> hey joelio, I'm having issues getting ceph-deploy for centos. the instructions on how to download ceph-deploy is for ubuntu. (yeah, I'm still learning this stuff.) :)
[18:39] <alphe> so you try to compensate for that by reading the documentation on the wiki and bam, welcome to a difficult world
[18:39] <joelio> going out for tea and beer guys, may be back later :)
[18:39] <alphe> ntranger ceph-deploy is recommended for ubuntu
[18:39] <alfredodeza> alphe: what issues are you having
[18:39] <alfredodeza> err
[18:39] <alfredodeza> I meant ntranger
[18:40] <alfredodeza> sorry about that
[18:40] <alphe> ntranger I tried ceph on archlinux some months ago, it was a constant battle ...
[18:40] <alfredodeza> ntranger: what issues with ceph-deploy are you having?
[18:40] <alphe> so i went ubuntu and it is really great
[18:40] <alfredodeza> ntranger: I released ceph-deploy to the Python package index a few days ago, so you can get it from there
[18:41] <alphe> python pushy is needed not sure it will auto install with yum on centos
[18:41] <alphe> need python 2.6
[18:41] <alfredodeza> alphe: we have RPM/DEB packages for ceph-deploy
[18:41] <alphe> k
[18:41] <ntranger> Alfredodeza: I ran mkcephfs, and when the ceph service starts, I get "no filesystem type defined"
[18:41] <alfredodeza> but what I was saying is that you can install it with Python install tools
[18:42] <alfredodeza> ntranger: that doesn't sound like ceph-deploy :/
[18:43] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: Light travels faster then sound, which is why some people appear bright, until you hear them speak)
[18:43] <ntranger> yeah, I don't have ceph-deploy. I was told I should run it instead. We're running Centos, and the instructions for downloading ceph-deploy are for ubuntu.
[18:44] <alphe> journal non activate true is that normal ?
[18:45] * niklas (~niklas@2001:7c0:409:8001::32:115) has joined #ceph
[18:46] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[18:46] <alfredodeza> ntranger: we have instructions for RPM as well
[18:47] <alfredodeza> ntranger: however, like I said, I just released ceph-deploy on the Python package index, so you can install on your OS even if we have not packaged for it
[18:47] <alfredodeza> if you have either Python 2.6 or 2.7 it will work
[18:47] <alfredodeza> you will need an installer (preferably pip), so you can do `pip install ceph-deploy`
[18:48] * kyle_ (~kyle@216.183.64.10) Quit (Quit: Leaving)
[18:48] <ntranger> Awesome. Thanks alfred!
[18:48] * devoid (~devoid@130.202.135.194) has joined #ceph
[18:49] <alfredodeza> ntranger: no problem, we are really working hard to make this an easier process
[18:49] <alfredodeza> please, do ping me if you get into any issues
[18:50] <alphe> journal non activate true is that normal ?
[18:51] <alphe> what does "no journal" mean, does that mean i have no replication ?
[18:52] <alphe> how do i see the underlying filesystem ?
[18:52] <alphe> /dev/sdb1 ceph data, active, cluster ceph, osd.1, journal /dev/sdb2
[18:54] * mgalkiewicz (~mgalkiewi@178-36-251-192.adsl.inetia.pl) Quit (Quit: Ex-Chat)
[18:57] <alphe> how do i know the underlying filesystem ?
[19:00] <alphe> ok xfs_info /dev/sda1 and if it doesn't work then it is not xfs ...
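A couple of more direct ways to check an OSD's backing filesystem than trying xfs_info and seeing whether it fails; device and mount-point names here are illustrative:

    blkid /dev/sdb1                   # prints TYPE="xfs", "ext4" or "btrfs" for the partition
    df -T /var/lib/ceph/osd/ceph-1    # shows the fs type of the mounted OSD data directory
    mount | grep osd                  # lists mounted OSD data partitions with their fs types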
[19:04] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[19:07] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[19:12] * jeff-YF (~jeffyf@67.23.123.228) Quit (Quit: jeff-YF)
[19:12] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Read error: Connection reset by peer)
[19:14] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[19:16] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[19:16] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[19:17] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[19:21] * alphe (~alphe@0001ac6f.user.oftc.net) Quit (Quit: Leaving)
[19:25] * markbby (~Adium@168.94.245.1) has joined #ceph
[19:27] * iggy__ is now known as iggy_
[19:30] * `10 (~10@juke.fm) has joined #ceph
[19:31] * huangjun (~kvirc@119.147.167.193) Quit (Ping timeout: 480 seconds)
[19:35] * `10` (~10@juke.fm) has joined #ceph
[19:37] * `10__ (~10@juke.fm) Quit (Ping timeout: 480 seconds)
[19:38] <lyncos> I just noticed the OSD process going at 100% CPU (or even more) without any activity on my cluster... it's not the same on all nodes.. only on 50% of the nodes... which is strange
[19:39] <lyncos> and the osds get marked down from time to time
[19:39] <sagewk> lyncos: what version?
[19:39] <lyncos> 0.61.1
[19:39] <sagewk> i would upgrade to v0.61.7
[19:40] <sagewk> lots and lots and lots of stuff fixed since then
[19:40] <lyncos> Ok I will try to do that :-)
[19:40] <lyncos> ok cool thanks for the advice I will upgrade
[19:40] <lyncos> is that version in the Ubuntu repo ?
[19:40] <lyncos> Or I need to compile it
[19:42] * `10 (~10@juke.fm) Quit (Ping timeout: 480 seconds)
[19:42] <lyncos> nvm my mirror was wrongly configured :-)
[19:44] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[19:44] * ChanServ sets mode +o scuttlemonkey
[19:44] * waxzce (~waxzce@office.clever-cloud.com) Quit (Remote host closed the connection)
[19:45] * `10 (~10@juke.fm) has joined #ceph
[19:45] <lyncos> still the same problem.. I have 2/4 OSD using 100%+ CPU and 2 using 0% with an idle cluster
[19:47] * sjustlaptop (~sam@2607:f298:a:697:dd97:61f0:732f:9b8c) has joined #ceph
[19:48] <lyncos> nice now they crash after some time
[19:49] <lyncos> I guess I will have to rebuild them
[19:49] * sleinen1 (~Adium@2001:620:0:26:8cbf:b2b5:1d99:a4c0) has joined #ceph
[19:49] <lyncos> 2013-07-26 17:49:05.116818 7f2424813700 1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7f24157f5700' had timed out after 15
[19:49] <lyncos> having a lot of these messages
[19:52] * `10` (~10@juke.fm) Quit (Ping timeout: 480 seconds)
[19:52] <sagewk> which the .7 version?
[19:52] <sagewk> er, with the .7 version?
[19:54] * sleinen (~Adium@2001:620:0:26:a118:e04a:44c9:8be7) Quit (Ping timeout: 480 seconds)
[19:55] * sjustlaptop (~sam@2607:f298:a:697:dd97:61f0:732f:9b8c) Quit (Ping timeout: 480 seconds)
[19:57] <mikedawson> sagewk: Does the radosgw async replication in Dumpling have any use case for RBD? i.e. can we asynchronously replicate RBD volumes (or snapshots)?
[19:58] <sagewk> mikedawson: no. the incremental rbd snapshot feature can be used for async replication, though
[20:02] <mikedawson> sagewk: I see... Dumpling adds incremental rbd snapshots. Am I right to assume it is up to the operator to build the glue to make it work between Ceph clusters?
[20:02] * markbby (~Adium@168.94.245.1) Quit (Remote host closed the connection)
[20:03] <sagewk> actually cuttlefish added it :) but yeah, there isn't any bundled orchestration tools.
[20:03] <joshd> mikedawson: cinder can use it in havana though
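A minimal sketch of snapshot-based async replication with the incremental feature mentioned above; pool/image/snapshot names and the backup host are hypothetical, and the target image is assumed to already exist on the destination cluster:

    rbd snap create rbd/vol1@base
    rbd export-diff rbd/vol1@base - | ssh backuphost rbd import-diff - rbd/vol1                      # first, full pass
    rbd snap create rbd/vol1@t1
    rbd export-diff --from-snap base rbd/vol1@t1 - | ssh backuphost rbd import-diff - rbd/vol1       # ship only the delta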
[20:04] <janos> has the bobtail-->cuttlefish transisition been documented and ironed out? i saw enough scary stuff in here when that first released to make me afraid to this day
[20:05] <lyncos> sagewk yes
[20:05] <mikedawson> sagewk: Gotcha. Are there built-in tools to rehydrate the incremental on the "backup" side? Link?
[20:05] <sagewk> lyncos: check dmesg for fs errors?
[20:05] <sagewk> and/or post an osd log somewhere
[20:05] <mikedawson> joshd: Excellent! Anything published I could read?
[20:05] <lyncos> http://pastebin.com/skvqJyg9
[20:05] <lyncos> no dmesg errors
[20:06] <joshd> mikedawson: no docs yet, but the code is https://github.com/openstack/cinder/blob/master/cinder/backup/drivers/ceph.py
[20:08] * lyncos (~chatzilla@208.71.184.41) Quit (Remote host closed the connection)
[20:09] * lyncos (~chatzilla@208.71.184.41) has joined #ceph
[20:09] <lyncos> Sorry, I got disconnected... did I miss anything?
[20:11] * devoid (~devoid@130.202.135.194) Quit (Read error: Operation timed out)
[20:16] <sagewk> lyncos: can you reproduce the hang with osd logs 'debug osd = 20' 'debug filestore = 20' 'debug ms = 1' and post the log? (it'll be big but it will tell us why the op_tp thread is blocked up)
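Those settings can go under [osd] in ceph.conf before restarting the daemon, or, while the OSD is still responsive, be injected at runtime; osd.1 below is just an example id:

    ceph tell osd.1 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'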
[20:18] <lyncos> sure
[20:19] <lyncos> it will take some time before it crash
[20:19] <lyncos> it's using exactly 100% of one core
[20:21] * skm (~smiley@205.153.36.170) has joined #ceph
[20:21] <lyncos> this will create a freakin huge log :-)
[20:22] <lyncos> sagewk just a little background on the problem... I had an OSD running on a raid controller that failed in a really bad way... then I removed that OSD
[20:22] <lyncos> then I got that problem... I don't know If I had it before
[20:22] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[20:23] * lyncos (~chatzilla@208.71.184.41) Quit (Remote host closed the connection)
[20:24] * lyncos (~chatzilla@208.71.184.41) has joined #ceph
[20:25] <lyncos> it doesn't seem to crash now.. or it takes longer
[20:25] <lyncos> but still using 100% cpu
[20:25] <lyncos> log is already 33M
[20:30] * rudolfsteiner_ (~federicon@200.68.116.185) has joined #ceph
[20:30] <sagewk> that is probably enough log to tell what the problem is
[20:32] <lyncos> https://lyncos.dyndns.org/ceph-osd.1.log
[20:32] <lyncos> the other server crashed faster
[20:33] <lyncos> so I sent the other server log
[20:35] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) has joined #ceph
[20:36] * lyncos (~chatzilla@208.71.184.41) Quit (Remote host closed the connection)
[20:37] * rudolfsteiner (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[20:37] * rudolfsteiner_ is now known as rudolfsteiner
[20:39] * lyncos (~chatzilla@208.71.184.41) has joined #ceph
[20:39] <lyncos> my connection to IRC is soo unstable....
[20:40] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[20:48] * markbby (~Adium@168.94.245.2) has joined #ceph
[20:55] <lyncos> sagewk did you see something ?
[21:00] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[21:04] * yehudasa__ (~yehudasa@2602:306:330b:1410:ea03:9aff:fe98:e8ff) Quit (Ping timeout: 480 seconds)
[21:05] * jluis (~JL@89.181.148.68) Quit (Ping timeout: 480 seconds)
[21:11] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[21:11] * jeff-YF (~jeffyf@67.23.117.122) Quit (Quit: jeff-YF)
[21:12] * yehudasa__ (~yehudasa@2602:306:330b:1410:84d3:fab1:232b:b7b5) has joined #ceph
[21:20] * jeff-YF (~jeffyf@216.14.83.26) has joined #ceph
[21:23] * devoid (~devoid@130.202.135.194) has joined #ceph
[21:30] * markbby (~Adium@168.94.245.2) Quit (Quit: Leaving.)
[21:31] * markbby (~Adium@168.94.245.2) has joined #ceph
[21:31] * markbby (~Adium@168.94.245.2) Quit ()
[21:32] * markbby (~Adium@168.94.245.2) has joined #ceph
[21:32] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[21:36] <ShaunR> anybody know if xen has blktap support for rados/rbd yet/
[21:43] <ntranger> hey alfred, I think I may have figured out whats up. We're using Scientific Linux 6.4, and when we tried to run ceph-deploy install, we get "Platform is not supported: Scientific Carbon"
[21:43] <alfredodeza> ntranger: I see
[21:44] <ntranger> but the older version of ceph that we used to test and see if it was what we wanted to go with, worked with Scientific Linux 6.3. lol
[21:45] <alfredodeza> I've just got some changes into the repo that should detect Scientific and use the CentOS procedures to install
[21:45] <alfredodeza> procedures/api/callables
[21:47] <alfredodeza> this is not yet released, but if you want to test it out (super helpful for me) you could clone the repo and tell me if it works ?
[21:48] <alfredodeza> ntranger: the master branch from here: https://github.com/ceph/ceph-deploy
[21:48] <ntranger> absolutely
[21:48] <alfredodeza> actually, you could tell pip to install from there directly
[21:48] <alfredodeza> let me get you the actual command
[21:49] <ntranger> ok
[21:49] <alfredodeza> ntranger: pip install https://github.com/ceph/ceph-deploy/tarball/master
[21:56] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Quit: Leaving.)
[21:58] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[21:59] * saaby (~as@mail.saaby.com) Quit (Quit: leaving)
[22:02] * saaby (~as@mail.saaby.com) has joined #ceph
[22:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[22:05] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[22:14] <ntranger> when I installed ceph from the newer version, it worked, no errors. when I'm trying to create a monitor, it tells me the ceph.conf has stuff in it, and tells me to overwrite-conf, which I do, and it throws up errors.
[22:17] * xdeller (~xdeller@91.218.144.129) Quit (Quit: Leaving)
[22:18] <ntranger> I sent you the error in a pm, I didn't want to flood the channel with it. :)
[22:18] <alfredodeza> sure
[22:18] <alfredodeza> I am happy that you were able to install though :)
[22:18] <alfredodeza> that means my changes are working correctly :)
[22:20] <alfredodeza> ntranger: what happens if you run that command directly on ceph01? --> service ceph start mon.ceph01
[22:20] <alfredodeza> probably with sudo I guess
[22:23] <ntranger> [root@ceph01 ~]# service ceph start mon.ceph01
[22:23] <ntranger> === mon.ceph01 ===
[22:23] <ntranger> Starting Ceph mon.ceph01 on ceph01...
[22:23] <ntranger> failed: 'ulimit -n 8192; /usr/bin/ceph-mon -i ceph01 --pid-file /var/run/ceph/mon.ceph01.pid -c /etc/ceph/ceph.conf '
[22:23] <ntranger> Starting ceph-create-keys on ceph01...
[22:23] <alfredodeza> that error seems a bit out of my league :(
[22:24] <ntranger> :)
[22:25] <ntranger> ok
[22:26] * bandrus (~Adium@12.248.40.138) has joined #ceph
[22:28] <dmick> ntranger: check the log for mon.ceph01
[22:30] <sagewk> lyncos: doh
[22:30] <sagewk> just missed you
[22:30] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[22:30] <sagewk> lyncos: or not.. there?
[22:33] <ntranger> that was my fault. there was already a service running, so when I tried running it, it threw up on itself. we killed the other services, and started it......
[22:33] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:33] <dmick> ah, cool
[22:33] <ntranger> [root@ceph01 ~]# service ceph start
[22:33] <ntranger> === mon.ceph01 ===
[22:33] <ntranger> Starting Ceph mon.ceph01 on ceph01...
[22:33] <ntranger> Starting ceph-create-keys on ceph01...
[22:35] <ntranger> so, alfred, your workaround is heading in the right direction. :)
[22:36] <alfredodeza> ntranger: excellent!@
[22:36] <alfredodeza> s/@//g
[22:36] <alfredodeza> :D
[22:38] <ntranger> just added the keys with no errors as well. next is to add the OSD's, which I have 12. lol
[22:40] <alfredodeza> nice
[22:41] <lyncos> sagewk still there
[22:42] <sagewk> can you attach to a ceph-osd while it is 100% and do a 'thr app all bt'?
[22:42] <sagewk> it looks like it doesn't even get out of init() :/
[22:43] <lyncos> attach ? you mean make a tcp connection and send that command ?
[22:57] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[22:57] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[23:01] * jeff-YF (~jeffyf@216.14.83.26) Quit (Quit: jeff-YF)
[23:06] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[23:07] * sleinen1 (~Adium@2001:620:0:26:8cbf:b2b5:1d99:a4c0) Quit (Quit: Leaving.)
[23:08] <ntranger> should I ceph-deploy osd prepare ceph01:sd*:/dev/sd* for each drive, or is there a way to do it for each drive all at the same time? I did it for sda:/dev/sda and it worked fine, but when I did it for sdb:/dev/sdb, it seems to be hanging (or taking an extra long time)
[23:09] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[23:10] * jeff-YF (~jeffyf@67.23.117.122) Quit ()
[23:11] <lautriv> ntranger, the second :/dev/sd* is the journal; if you have no separate journal, just use host:drive
[23:13] <ntranger> awesome. Thanks!
[23:15] <sagewk> host:foo:foo and host:foo are equivalent; if you don't specify a separate journal disk it will put it on a partition on the same disk
[23:17] <lautriv> could anyone explain why ceph-deploy osd activate may refuse to work without errors in log, where it creates no /var/lib/ceph/osd/* but /var/lib/ceph/tmp/ceph-disk.activate.lock ?
[23:18] <sagewk> ceph-disk -v /dev/sdb1 (or whatever the xfs partition is) should give you more verbose output
[23:20] <Azrael> hiya sagewk
[23:20] <sagewk> hey
[23:20] <Azrael> so far so good with 0.61.7!
[23:20] <sagewk> great!
[23:20] <Azrael> very excited
[23:20] <Azrael> the osd peering delay bug issue thingy, i'm not even sure if that's around anymore either
[23:20] <sagewk> btw the msgr fixes (stalled peering you saw) are in the cuttlefish branch, but want to pound on those for a while longer before pushing them out in a release
[23:21] <Azrael> ok
[23:21] <Azrael> yeah
[23:21] <Azrael> like
[23:21] <sagewk> i bet the osd thing from before exacerbated it
[23:21] <Azrael> i see it wait on peering sometimes for a min but then it goes away and is fine
[23:21] <Azrael> yeah
[23:21] <Azrael> it's definitely not like before
[23:21] <sagewk> great to hear
[23:21] <dmick> lyncos: attach, like "gdb /usr/bin/ceph-osd -p <pid>"
[23:22] <Tamil1> lautriv: which ceph branch?
[23:22] <sagewk> yeah
[23:22] <sagewk> no -p needed i think
[23:22] <dmick> maybe, but it works :)
[23:22] <ntranger> Awesome. Thanks sage!
[23:22] <dmick> interesting. I wonder if it's just ignored
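A sketch of the attach-and-backtrace procedure dmick and sagewk describe; if several ceph-osd processes run on the host, take the spinning one's pid from top rather than pidof:

    gdb /usr/bin/ceph-osd -p <pid>     # or simply: gdb -p <pid>
    (gdb) thread apply all bt          # the long form of sagewk's 'thr app all bt'
    (gdb) detach
    (gdb) quit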
[23:23] <lautriv> Tamil 0.61.7
[23:24] <Azrael> sagewk: we're going to chill for a bit and then get back into osd/leveldb/boost on debian debugging
[23:24] <sagewk> k
[23:24] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[23:24] <Azrael> emil mentioned starting osd with valgrind
[23:25] <sagewk> yeah
[23:25] <Tamil1> lautriv: there was a fix for that yesterday
[23:26] <Tamil1> lautriv: but wait a min, you said 0.61.7. let me check
[23:28] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[23:28] <Tamil1> lautriv: do you see anything on ceph.log? what does "./ceph-deploy disk list <hostname>" say?
[23:32] <lautriv> Tamil1, those in question tell me "ceph data, unprepared"
[23:38] <Tamil1> lautriv: so, osd prepare failed in the first place.
[23:39] <lautriv> Tamil1, my issue with this node is, if i prepare the disks (73G) via ceph-deploy disk prepare, it produces "unrecognized disklabel" but the same on another node (146G) works. since this is a no-go i tried to parted them, which succeeds but isn't enough for ceph :(
[23:40] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:40] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[23:42] <Tamil1> lautriv: which means on the other node, osd is created?
[23:42] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has left #ceph
[23:42] * rudolfsteiner (~federicon@200.68.116.185) Quit ()
[23:47] <lautriv> Tamil1, yes, my test setup should be osd1 with 2x73G and osd2 with 1x146G; node 2 is up, ceph-heals shows 50% degraded
[23:48] <lautriv> *health
[23:49] <lautriv> Tamil1, since this is always reproducible, i guess the prepare fails on some size calculation below a certain disk size.
[23:49] * lautriv must change batt on kbd to write less garbage ...
[23:52] <Tamil1> lautriv: maybe, but I am looking at the error "unrecognized disk label"
[23:55] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) has joined #ceph
[23:56] <lautriv> Tamil1, i may have a small hint, somewhere at the beginning (I've been fighting this for a week) i got more verbose info about the GPT differing from its backup, and found in ceph_deploy "sgdisk just writes the primary label.....seems to be enough to zero the last blocks for the backup" where the last blocks may be early/late/less.
[23:57] * leseb- (~leseb@88-190-214-97.rev.dedibox.fr) has joined #ceph
[23:58] <Tamil1> lautriv: what is your disk size, on which osd failes to prepare?
[23:58] <Tamil1> lautriv: s/failes/fails
[23:58] <lautriv> Tamil 73.4G per disk

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.