#ceph IRC Log


IRC Log for 2013-07-31

Timestamps are in GMT/BST.

[0:00] * sleinen1 (~Adium@2001:620:0:25:401:b540:2ca5:4e24) Quit (Quit: Leaving.)
[0:01] * aliguori (~anthony@ Quit (Remote host closed the connection)
[0:01] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[0:01] * sagelap (~sage@2607:f298:a:607:b1dd:2879:bafb:4558) Quit (Ping timeout: 480 seconds)
[0:03] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[0:03] * jeff-YF (~jeffyf@ Quit (Ping timeout: 480 seconds)
[0:04] <gregmark> joshd: oh it's installed. I just upgraded it though to see if that would help. Same error.
[0:05] <cmdrk> sjust: that was my impression, but I find that my performance seems to be topping out at around 2Gbps (have one machine mounting CephFS via 10GbE) and /var/log/ceph/ceph.log is putting up a lot of complaints about slow requests. i'm not expecting a ton of performance but it would be nice to know why i keep seeing "slow request 69.68308 seconds old..." blahblah
[0:05] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Read error: No route to host)
[0:05] <sjust> cmdrk: that should not be happening, you might have a disk which is slower than the others
[0:07] <joshd> gregmark: it must not be in PYTHONPATH for some reason
[0:08] * BillK (~BillK-OFT@124-168-243-244.dyn.iinet.net.au) has joined #ceph
[0:08] <gregmark> joshd: that's interesting. But cinder has no problem using RBD. I used the instructions here: http://ceph.com/docs/next/rbd/rbd-openstack/ almost in complete detail.
[0:08] * madkiss (~madkiss@2001:6f8:12c3:f00f:3c07:a4d9:23e5:6db3) Quit (Remote host closed the connection)
[0:09] <gregmark> I'll investigate this angle nonetheless
[0:09] * cmdrk (~lincoln@c-24-12-206-91.hsd1.il.comcast.net) Quit (Quit: Lost terminal)
[0:09] <joshd> gregmark: cinder's not using the python bindings until havana
[0:10] * cmdrk (~lincoln@c-24-12-206-91.hsd1.il.comcast.net) has joined #ceph
[0:10] <cmdrk> blargh, stupid screen
[0:10] <gregmark> joshd: do you know offhand where I can set the path correctly?
[0:10] <cmdrk> anyway. yeah. I made a distribution of some OSDs that commonly report slowness, perhaps I ought to test them all individually.
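[Editor's note: a per-OSD tally of slow-request complaints like the one cmdrk describes can be pulled straight from the cluster log. A minimal sketch, assuming the default log path and that the log lines name the reporting OSD as `osd.N`:]

```shell
# Count how often each OSD appears in "slow request" complaints;
# a single OSD dominating the list suggests one slow disk.
grep 'slow request' /var/log/ceph/ceph.log \
  | grep -oE 'osd\.[0-9]+' \
  | sort | uniq -c | sort -rn
```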
[0:11] <joshd> gregmark: whatever's starting glance may be doing something funky, since python-ceph puts them in the standard locations for python modules
[0:12] <gregmark> ok: you gave me a new troubleshooting angle. thanks.
[0:12] <dmick> alfredo was just telling me this morning that OpenStack deploys production envs with virtualenv?...perhaps that's relevant?
[0:13] <gregmark> crud, I don't see anything in the glance-api startup file that might run a sys.path.append or similar
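[Editor's note: whether the python-ceph bindings joshd mentions are actually importable, and from where, can be checked directly. A small sketch, runnable anywhere; on a box without the bindings it reports what is missing and shows the search path:]

```python
# Probe for the Ceph python bindings (shipped by the python-ceph package)
# and report where they were found, or what sys.path was searched.
import importlib
import sys

results = {}
for mod in ("rados", "rbd"):
    try:
        m = importlib.import_module(mod)
        results[mod] = m.__file__
        print("%s importable from %s" % (mod, m.__file__))
    except ImportError:
        results[mod] = None
        print("%s NOT importable; check PYTHONPATH / sys.path:" % mod)
        print("  " + "\n  ".join(sys.path))
```

Running this with the same interpreter and environment that starts glance-api would show whether a virtualenv or PYTHONPATH override is hiding the system-wide modules.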
[0:24] <lautriv> cmdrk, still around ?
[0:25] <lautriv> anyone on my ceph-disk question ?
[0:25] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: my troubles seem so far away, now yours are too...)
[0:27] <dmick> lautriv: I don't understand what you're asking; if you're hacking the script and it's not having any effect, then, are you not running the script you think you're running or something?...
[0:28] <lautriv> dmick, i had the idea it could be precompiled or cached, but that isn't the case; another change (to the logging) worked immediately, and it can't be a limited number of args because part of the call is variable.
[0:29] * mschiff (~mschiff@ Quit (Read error: No route to host)
[0:29] * dosaboy (~dosaboy@host109-158-236-137.range109-158.btcentralplus.com) Quit (Quit: leaving)
[0:29] * mschiff (~mschiff@ has joined #ceph
[0:29] * dosaboy (~dosaboy@host109-158-236-137.range109-158.btcentralplus.com) has joined #ceph
[0:29] <dmick> I mean....this isn't anything to do with ceph or ceph-disk, it's just getting a Python code change to execute, right?...debug it?...add logging?...strace?...etc.
[0:30] <alfredodeza> lautriv: it looks like you are not using the files you think you are using
[0:30] * jluis (~JL@89-181-148-68.net.novis.pt) has joined #ceph
[0:31] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[0:31] <lautriv> i should see if i strace ceph-deploy itself, lemme give it a shot .........but unlikely since the change for logging worked.
[0:33] * BillK (~BillK-OFT@124-168-243-244.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[0:36] * mschiff_ (~mschiff@tmo-102-214.customers.d1-online.com) has joined #ceph
[0:41] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) has joined #ceph
[0:42] * mschiff (~mschiff@ Quit (Read error: No route to host)
[0:42] * mschiff__ (~mschiff@ has joined #ceph
[0:49] * mschiff_ (~mschiff@tmo-102-214.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[0:53] * lautriv (~lautriv@f050082216.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[0:58] * markbby (~Adium@ has joined #ceph
[1:02] * lautriv (~lautriv@f050085055.adsl.alicedsl.de) has joined #ceph
[1:02] <lautriv> bah disconnect, my last words ?
[1:05] <dmick> (03:31:35 PM) lautriv: i should see if i strace ceph-deploy itself, lemme give it a shot .........but unlikely since the change for logging worked.
[1:06] <lautriv> ok, found that the change is applied and necessary, at least for these drives, but another bug truncates my label after that point anyway
[1:06] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) Quit (Ping timeout: 480 seconds)
[1:08] * gregmark (~Adium@ Quit (Quit: Leaving.)
[1:09] <dscastro> hi
[1:10] <dscastro> i can't mount cephfs on a centos machine
[1:10] <dscastro> FATAL: Module ceph not found.
[1:10] <dscastro> mount.ceph: modprobe failed, exit status 1
[1:10] <dscastro> mount error: ceph filesystem not supported by the system
[1:11] <lautriv> dscastro, build the kernel-modules, try again
[1:11] <dscastro> lautriv: you mean i have build the modules?
[1:11] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[1:12] <lautriv> dscastro, you need the ceph and libceph modules, but they obviously aren't present. find /lib/modules/$(uname -r) -type f -name ceph.ko should tell you
[1:13] <dscastro> lautriv: yep, they aren't available
[1:13] <dscastro> i wondering where i get them
[1:13] <lautriv> dscastro, so build em'
[1:14] <dscastro> ok
[1:14] <dscastro> lautriv: so ubuntu has native ?
[1:15] <lautriv> dscastro, don't care about insane distros. need help to build the modules for your CentOS?
[1:15] <dscastro> lautriv: i'm running on aws
[1:16] <lautriv> meh, in time people will try to breathe in a cloud.
[1:16] <dscastro> looking for a fast way to get ceph running, and figure out what i have to do to run for production
[1:16] <lautriv> dscastro, get sources, configure, make
[1:17] <dmick> dscastro: you may not need kernel modules; are you sure that's the config you want?
[1:17] <lautriv> dscastro, what centos is that ?
[1:17] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:17] <dscastro> 6.4
[1:18] <dscastro> i wanna mount cephfs
[1:18] <lautriv> dscastro, 6.4 works, may have a look in your repo.
[1:18] <dscastro> i have already a ceph cluster running
[1:19] <dmick> if you have to mount it from the kernel, then, yes, you need a later kernel
[1:20] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[1:20] <dscastro> dmick: i'm looking for a distributed fs which supports mcs labels
[1:20] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[1:21] * sagelap (~sage@ has joined #ceph
[1:27] <lautriv> where do i change the global debugging level for deploy commands ?
[1:28] <dmick> dscastro: I don't know if cephfs will fill that role; maybe if mcs labels are built on xattrs
[1:28] <dmick> ?
[1:28] <dmick> but it might be worth trying ceph-fuse first
[1:28] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[1:28] <dmick> because you'll get later versions of that, even if it's not your final deployment answer
[1:28] <dmick> lautriv: as in ceph-deploy?
[1:28] <dscastro> lautriv: centos has kernel 2.6.32
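[Editor's note: the check lautriv suggested, plus the userspace fallback dmick recommends, can be sketched as below. The `-o libceph.ko` pattern and the ceph-fuse mount line (monitor host and mountpoint are placeholders) are editorial additions:]

```shell
# Look for the cephfs kernel modules for the running kernel; an empty
# result means the kernel was built without them (stock RHEL 6's 2.6.32
# kernel lacks cephfs support).
dir=/lib/modules/$(uname -r)
if [ -d "$dir" ]; then
  find "$dir" -type f \( -name 'ceph.ko*' -o -name 'libceph.ko*' \)
fi

# No modules? ceph-fuse mounts the filesystem from userspace instead,
# with no kernel dependency:
#   ceph-fuse -m MONHOST:6789 /mnt/cephfs
```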
[1:28] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[1:28] * ChanServ sets mode +o scuttlemonkey
[1:29] <dmick> each command takes a -v argument
[1:29] <dscastro> really need a new one
[1:29] <dmick> but it can be difficult to find logging for things like ceph-disk that are really spawned from udev
[1:29] <lautriv> dmick, yes and -v is not verbose enough, except i can -v 8
[1:30] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[1:32] * AfC (~andrew@2001:44b8:31cb:d400:b874:d094:ec64:aec3) has joined #ceph
[1:34] <_Tassadar> i'm trying to run a sequential read benchmark using rados bench 300 seq
[1:34] <_Tassadar> but it complains that it has to write data first
[1:34] <_Tassadar> so i did a "rados bench -p data 300 write --no-cleanup" (note the --no-cleanup)
[1:35] <_Tassadar> but the read benchmark still complains
[1:35] <dmick> you used -p data with read too?
[1:35] <_Tassadar> actually
[1:36] <_Tassadar> that is the error
[1:36] <lautriv> HOLY <CENSORED> probably found the issue and THAT is a pain ...
[1:36] <_Tassadar> i did make a special pool for it
[1:36] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit (Quit: rudolfsteiner)
[1:36] <_Tassadar> but i copied the command the second time so now the un-cleanupped benchmark is in my data pool |:(
[1:36] <_Tassadar> dammit
[1:36] * mschiff__ (~mschiff@ Quit (Remote host closed the connection)
[1:36] * mikedawson_ (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[1:37] <_Tassadar> i was planning on removing the "benchmark" pool that i had specially created for this
[1:37] <_Tassadar> now how am i going to get that data out of my data pool...
[1:37] <dmick> rados cleanup
[1:38] <_Tassadar> thanks
[1:38] <_Tassadar> that needs a <prefix>
[1:39] <_Tassadar> it doesn't really say what that prefix is
[1:39] <dmick> maybe that's the object prefix
[1:39] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[1:39] * mikedawson_ is now known as mikedawson
[1:39] <dmick> you can see the objects with rados -p data ls; perhaps their names will be obvious, I don't recall
[1:39] <_Tassadar> ah, yeah that is so
[1:40] <_Tassadar> well
[1:40] <dmick> would be nice if bench write would have told you
[1:40] <_Tassadar> that is, the names are recognizable, i don't know what the prefix is
[1:40] <dmick> oh look it does
[1:40] <dmick> Object prefix: benchmark_XX_XX_XX
[1:41] <dmick> so the prefix is everything up to but not including _objectNN
[1:41] <_Tassadar> oh it does indeed
[1:41] <_Tassadar> but cleanup doesn't recognize it or something
[1:42] <dmick> just did for me
[1:42] <dmick> again with the -p?
[1:42] <_Tassadar> it just gives me the usage help, instead of presenting an error of what i did wrong
[1:42] <_Tassadar> yeah
[1:42] <_Tassadar> with the -p option it works
[1:42] <_Tassadar> nice
[1:44] <_Tassadar> some better error reporting would be nice, but okay, at least it cleaned up all right
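[Editor's note: the full sequence worked out above, using a throwaway pool so bench objects stay out of the "data" pool. The pool name "benchtest" is an example, and `<object-prefix>` must be the "Object prefix: benchmark_..." value printed by the write run:]

```shell
# Create a scratch pool, write benchmark objects and keep them,
# run the sequential-read benchmark, then clean up and drop the pool.
rados mkpool benchtest
rados -p benchtest bench 300 write --no-cleanup   # note the printed "Object prefix: ..."
rados -p benchtest bench 300 seq                  # reads the objects written above
rados -p benchtest cleanup <object-prefix>        # prefix from the write run's output
rados rmpool benchtest benchtest --yes-i-really-really-mean-it
```

The key details from the discussion: `seq` needs data written by a prior `write --no-cleanup` in the *same* pool, and `cleanup` needs both `-p` and the prefix.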
[1:45] <dmick> _Tassadar: accepting patches all the time :)
[1:47] <_Tassadar> already submitting small ones ;)
[1:47] <_Tassadar> i don't think i'm going to improve this any time soon though ;)
[1:47] <_Tassadar> anyway thanks for helping out
[1:47] <_Tassadar> good night
[1:49] * alram (~alram@ Quit (Quit: leaving)
[1:50] <dmick> yw
[1:52] <lautriv> can anyone tell me offhand which kernel-option gives me /dev/disk/by-partuuid
[1:53] <dmick> lautriv: perhaps a later version of udev?...
[1:53] <lautriv> dmick, nope, 2 machines, all the same versions but kernel is configured for each.
[1:54] <lautriv> AH, one has devtmpfs mounted
[2:00] * humbolt (~elias@80-121-55-112.adsl.highway.telekom.at) Quit (Quit: humbolt)
[2:02] * BillK (~BillK-OFT@124-168-243-244.dyn.iinet.net.au) has joined #ceph
[2:03] <lautriv> i have the impression several things went slower since kernel became 3.x
[2:05] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[2:05] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[2:06] * ShaunR (~ShaunR@staff.ndchost.com) Quit ()
[2:06] * markbby (~Adium@ Quit (Remote host closed the connection)
[2:06] * dscastro (~dscastro@ Quit (Remote host closed the connection)
[2:12] * sagelap (~sage@ Quit (Remote host closed the connection)
[2:16] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) has joined #ceph
[2:19] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[2:21] * huangjun (~kvirc@ has joined #ceph
[2:22] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[2:24] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[2:29] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) Quit (Quit: Textual IRC Client: www.textualapp.com)
[2:30] * Cube (~Cube@ has joined #ceph
[2:30] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit (Quit: rudolfsteiner)
[2:35] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[2:41] <lautriv> dmick, it was neither a kernel-option nor a udev problem; they appear only on GPT partitions, which still fail to create :( will get some sleep, laters.
[2:42] <dmick> ah, yes, you must have gpt for that
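[Editor's note: as dmick confirms, udev only creates `/dev/disk/by-partuuid` links for GPT partitions, since an MBR/msdos label has no per-partition GUIDs to expose. A sketch of converting a disk to GPT with sgdisk; `/dev/sdX` is a placeholder and the commands destroy existing data:]

```shell
# DESTRUCTIVE: wipe any existing partition table, then create a single
# GPT partition spanning the disk.
sgdisk --zap-all /dev/sdX
sgdisk --new=1:0:0 /dev/sdX
partprobe /dev/sdX               # re-read the partition table

# With devtmpfs mounted and udev running, the new partition's GUID
# should now appear here:
ls -l /dev/disk/by-partuuid/
```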
[2:48] * cfreak201 (~cfreak200@p4FF3E75F.dip0.t-ipconnect.de) has joined #ceph
[2:57] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[2:59] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit ()
[3:00] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:03] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[3:03] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit ()
[3:05] <cfreak201> heya, short (and probably silly) question: I've read multiple times on the web (random places while searching for related stuff..) that ceph can only be implemented in a fault tolerant (machine failure) way with at least 3 (osd / data) nodes? Does that still hold true? (I'm currently planning on using 3x physical mon, 2x physical osd servers)
[3:09] <gregaf> as long as it's more than one you can set it up to be fault tolerant
[3:10] <gregaf> 3x replication being the standard production recommendation is probably why it says that
[3:11] <cfreak201> ya, i'm just in a situation where i have 2 "big" servers and one ~3 year old server which is also hosting some openstack daemons
[3:11] <dmick> gregaf: or for mon failures
[3:12] <gregaf> ah, yeah, but he mentioned 3 mon servers already
[3:12] <dmick> yeah. confused by the total of 3 servers in the last comment :)
[3:13] <cfreak201> how would the crushmap look for such a probably trivial setup? Just 2 servers in one rack with a rule like "... chooseleaf firstn 0 type node" (node meaning server)?
[3:16] * rturk is now known as rturk-away
[3:16] <gregaf> yeah
[3:17] <cfreak201> sorry for those stupid questions, I had that setup running earlier and I just couldn't get it to do any io while I failed one node (pulled the cable)
[3:19] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[3:21] * AfC (~andrew@2001:44b8:31cb:d400:b874:d094:ec64:aec3) Quit (Quit: Leaving.)
[3:23] * yy-nm (~chatzilla@ has joined #ceph
[3:23] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Remote host closed the connection)
[3:24] <gregaf> cfreak201: there's a "min size" setting on pools that prevents it from going active if there's not enough redundancy, you might have that set to 2 instead of 1 (most likely)
[3:24] <gregaf> if that happens again you should look at ceph -s and see what it's telling you
[3:25] <cfreak201> gregaf: i had put it to 1 manually, size was at 2 i think..
[3:27] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[3:28] <cfreak201> well it's 3.27am here... i'll have a look at it tomorrow again with some fresh energy... thanks for your help so far :)
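[Editor's note: gregaf's advice for the two-OSD-host setup boils down to the pool's `size` and `min_size` settings. A sketch using the "data" pool as the example; with size=2 across two hosts, min_size must be 1 for I/O to continue while one host is down:]

```shell
# Two replicas, but allow PGs to stay active with only one replica up.
ceph osd pool set data size 2
ceph osd pool set data min_size 1
ceph osd pool get data min_size    # verify the setting took effect

# While failing a node, watch the cluster/PG state as gregaf suggests:
ceph -s
```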
[3:28] * julian (~julianwa@ Quit (Quit: afk)
[3:32] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[3:37] * jluis (~JL@89-181-148-68.net.novis.pt) Quit (Ping timeout: 480 seconds)
[3:50] * jasoncn (~jasoncn@ has joined #ceph
[3:51] <jasoncn> hi
[3:51] <jasoncn> any one use radosgw?
[3:56] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:58] <phantomcircuit> 2013-07-31 03:58:46.422097 7fea87fff700 0 mon.d@2(synchronizing sync( requester state chunks )).data_health(0) update_stats avail 89% total 27781720 used 1495512 avail 24893520
[3:59] <phantomcircuit> is that saying that the monitor has 89% of the data
[3:59] <phantomcircuit> or that it still needs 89% of the data
[3:59] <phantomcircuit> im confuse
[4:00] <joao> neither; that's a data_health debug message, and all messages contain the current monitor state
[4:00] <joao> so from that message one perceives that the monitor is synchronizing
[4:01] <joao> and on an unrelated matter, it updated its data store stats, which has 89% total available space
[4:08] <phantomcircuit> oh
[4:08] <phantomcircuit> joao, i get it
[4:08] <phantomcircuit> yeah it is syncing
[4:08] <phantomcircuit> i copied to a new partition and i suspect lost xattr info
[4:10] * jeff-YF (~jeffyf@pool-173-66-21-43.washdc.fios.verizon.net) has joined #ceph
[4:13] <jasoncn> the time of next version ?
[4:14] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[4:14] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has left #ceph
[4:15] <phantomcircuit> huh
[4:15] <phantomcircuit> ceph-mon keeps crashing
[4:16] <phantomcircuit> lets see if i can get a useful error log
[4:19] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[4:26] <phantomcircuit> http://pastebin.com/raw.php?i=XZyTHZRn
[4:28] <phantomcircuit> dmick, any idea what's wrong in the above paste
[4:28] <phantomcircuit> ?
[4:36] * rongze (~quassel@ has joined #ceph
[4:36] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Remote host closed the connection)
[4:36] <dmick> looking
[4:37] <dmick> phantomcircuit: not offhand, no
[4:38] <phantomcircuit> i'll try turning logging up
[4:39] * rongze (~quassel@ Quit (Remote host closed the connection)
[4:40] * jeff-YF (~jeffyf@pool-173-66-21-43.washdc.fios.verizon.net) Quit (Quit: jeff-YF)
[4:40] * rongze (~quassel@ has joined #ceph
[4:42] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[4:43] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[4:44] * yy-nm (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130618035212])
[4:46] <phantomcircuit> dmick, any suggestions for which logger to turn up?
[4:47] * yanzheng (~zhyan@ has joined #ceph
[4:49] <dmick> I don't, no. It looks like it went off the rails in the midst of a leveldb update
[4:49] <dmick> perhaps in the heap management?...but
[4:50] <dmick> maybe the best thing would be to wrap up the log and the corefile and cephdrop 'em?
[4:54] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[5:00] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Remote host closed the connection)
[5:02] <phantomcircuit> dmick, huh
[5:02] <phantomcircuit> magic it works
[5:02] <phantomcircuit> i just restarted it about 20 times
[5:02] <phantomcircuit> whatever
[5:03] <dmick> um, awesome?...
[5:03] <phantomcircuit> dmick, yeah im confused but not complaining
[5:05] * fireD_ (~fireD@93-142-245-150.adsl.net.t-com.hr) has joined #ceph
[5:07] * fireD (~fireD@93-142-231-192.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:17] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[5:28] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[5:29] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[5:41] * AfC (~andrew@2001:44b8:31cb:d400:a914:70d2:fc7f:a687) has joined #ceph
[5:48] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[6:01] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) Quit (Quit: sprachgenerator)
[6:24] * haomaiwa_ (~haomaiwan@ has joined #ceph
[6:24] * rongze (~quassel@ Quit (Read error: Connection reset by peer)
[6:24] * haomaiwang (~haomaiwan@ Quit (Read error: Connection reset by peer)
[6:25] * rongze (~quassel@ has joined #ceph
[6:26] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[6:33] * huangjun (~kvirc@ Quit (Read error: No route to host)
[6:47] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[6:47] * markl (~mark@tpsit.com) has joined #ceph
[6:47] <sputnik13> hello
[6:49] <sputnik13> anyone around that's using ceph in a production context?
[6:50] <phantomcircuit> yes
[6:50] <phantomcircuit> a very small production context
[6:50] <sputnik13> small like how small
[6:50] <phantomcircuit> im sure there are others using ceph in much larger contexts
[6:50] <phantomcircuit> sputnik13, i have 2 hosts
[6:50] <sputnik13> oh
[6:50] <phantomcircuit> so
[6:50] <phantomcircuit> tiny
[6:50] <sputnik13> :)
[6:50] <sputnik13> 2 ceph servers or 2 hosts using ceph
[6:51] <phantomcircuit> 2 servers
[6:51] <phantomcircuit> running qemu with rbd for their disks
[6:51] <sputnik13> how many guests?
[6:51] <phantomcircuit> ~40
[6:52] <sputnik13> and 2 servers as in ceph servers, not vm hosts right?
[6:52] <phantomcircuit> right
[6:53] <sputnik13> do you care about IOPs? or is your setup more concerned about consolidation and redundancy
[6:53] <phantomcircuit> i do care about IOPS
[6:54] <phantomcircuit> but to maintain consistent performance i limit vms to 1000 read iops and 200 write iops
[6:54] <phantomcircuit> i have ceph on top of zfs with an slog and l2arc on ssds
[6:54] <sputnik13> what kind of IO numbers have you gotten?
[6:54] <phantomcircuit> so the underlying filesystem is insanely fast
[6:54] <sputnik13> ahhh… zfs
[6:54] <sputnik13> yes
[6:54] <sputnik13> I love zfs, unabashed geek love
[6:54] <phantomcircuit> to do that you just have to disable directio and use leveldb instead of xattrs
[6:55] <sputnik13> well, I also need freebsd or solaris
[6:55] <phantomcircuit> sputnik13, http://zfsonlinux.org/
[6:55] <sputnik13> I love solaris, but my company has an allergic reaction to anything not rhel or windows
[6:56] <phantomcircuit> im not sure RHEL has a recent enough kernel to run zfs on linux
[6:56] <sputnik13> phantomcircuit: do you feel that's production ready for a medium to large deployment?
[6:56] <sputnik13> phantomcircuit: I get you're using it in a production environment so I'm guessing you'd say it's production ready for at least a small setup
[6:57] <phantomcircuit> well
[6:57] <phantomcircuit> im using this stack in a production environment where a certain amount of downtime is accepted
[6:57] <phantomcircuit> people bitch and moan and get their SLA refund and that's the end of it
[6:57] <sputnik13> ok, that's different then
[6:58] <sputnik13> :-\
[6:59] <phantomcircuit> sputnik13, you should contact inktank
[6:59] * huangjun (~kvirc@ has joined #ceph
[6:59] <phantomcircuit> they can give you a better idea
[7:00] <phantomcircuit> they're the ones developing ceph and provide support contracts
[7:01] <sputnik13> I kind of need to prove it out first
[7:01] <sputnik13> whatever I decide to propose
[7:02] <sputnik13> nobody either cares about storage or cares to use something other than big iron SANs
[7:36] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[7:42] * yy-nm (~chatzilla@ has joined #ceph
[7:50] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: sputnik13)
[7:52] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[7:53] <markl> phantomcircuit: how do you set those limits?
[7:53] <markl> iops per vm
[7:54] <phantomcircuit> markl, libvirtd lets you set them using cgroups
[7:55] <phantomcircuit> markl, actually scratch that that's memory
[7:55] * yy-nm_ (~chatzilla@ has joined #ceph
[7:55] <phantomcircuit> markl, qemu has an internal rate limiter
[7:58] <markl> iops_wr
[7:58] <markl> cool
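[Editor's note: QEMU's internal rate limiter (the `iops_rd`/`iops_wr` throttles markl found) can be driven through libvirt without editing the domain XML. A sketch with phantomcircuit's 1000-read/200-write limits; "guest1" and "vda" are placeholder domain and device names:]

```shell
# Cap a guest's virtual disk at 1000 read / 200 write IOPS via QEMU's
# built-in throttling, applied live through libvirt.
virsh blkdeviotune guest1 vda --read-iops-sec 1000 --write-iops-sec 200

# With no tuning flags, the same command prints the current limits:
virsh blkdeviotune guest1 vda
```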
[8:01] * yy-nm (~chatzilla@ Quit (Ping timeout: 480 seconds)
[8:01] * yy-nm_ is now known as yy-nm
[8:04] * iggy (~iggy@theiggy.com) Quit (Quit: No Ping reply in 180 seconds.)
[8:08] * dpippenger (~riven@cpe-75-85-17-224.socal.res.rr.com) has joined #ceph
[8:08] * iggy_ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[8:14] * iggy_ (~iggy@theiggy.com) has joined #ceph
[8:17] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Quit: Page closed)
[8:17] * agh (~oftc-webi@gw-to-666.outscale.net) has joined #ceph
[8:24] * KindTwo (~KindOne@ has joined #ceph
[8:25] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Read error: Connection reset by peer)
[8:25] * KindTwo is now known as KindOne
[8:25] * Cube (~Cube@ Quit (Quit: Leaving.)
[8:28] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Read error: Connection reset by peer)
[8:31] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[8:33] * gregaf (~Adium@ Quit (Quit: Leaving.)
[8:34] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Quit: Changing server)
[8:35] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[8:39] * gregaf (~Adium@2607:f298:a:607:112c:1fa8:77e1:af2e) has joined #ceph
[8:45] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[8:49] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:49] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit ()
[8:56] * Cube (~Cube@ has joined #ceph
[9:00] * Cube (~Cube@ Quit (Read error: Connection reset by peer)
[9:03] * dpippenger (~riven@cpe-75-85-17-224.socal.res.rr.com) Quit (Remote host closed the connection)
[9:07] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[9:07] * mschiff (~mschiff@pD9511984.dip0.t-ipconnect.de) has joined #ceph
[9:10] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:22] * yy-nm (~chatzilla@ Quit (Read error: Connection reset by peer)
[9:22] * yy-nm (~chatzilla@ has joined #ceph
[9:31] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:31] * huangjun (~kvirc@ has joined #ceph
[9:32] * The_Bishop (~bishop@2001:470:50b6:0:d176:f45d:f651:852a) Quit (Ping timeout: 480 seconds)
[9:36] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[9:38] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[9:41] * yy-nm (~chatzilla@ Quit (Read error: Connection reset by peer)
[9:41] * yy-nm (~chatzilla@ has joined #ceph
[9:54] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[9:55] * BManojlovic (~steki@fo-d- has joined #ceph
[9:59] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) has joined #ceph
[10:00] * JM__ (~oftc-webi@ has joined #ceph
[10:05] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:05] * bergerx_ (~bekir@ has joined #ceph
[10:20] * LeaChim (~LeaChim@ has joined #ceph
[10:24] * acalvo (~acalvo@208.Red-83-61-6.staticIP.rima-tde.net) has joined #ceph
[10:24] <acalvo> Hello
[10:24] <acalvo> I've just updated ceph from 0.61.4 to 0.61.7 due to the "admin socket not ready" bug, and now 0.61.7 complains about "failed to create new leveldb store"
[10:25] <acalvo> I'm concerned about all data, and if there is some kind of migration necessary
[10:28] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[10:37] * LeaChim (~LeaChim@ has joined #ceph
[10:41] * mschiff_ (~mschiff@pD9511984.dip0.t-ipconnect.de) has joined #ceph
[10:41] * mschiff (~mschiff@pD9511984.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[10:44] <jtang> hi
[10:45] <jtang> im trying to figure out if its possible to get radosgw to deliver an index.html from a bucket as a directory index
[10:45] <jtang> is this supported in radosgw?
[10:46] <Kioob`Taff> acalvo: I see same problem, but in my case the partition allowed to the MON was "lost" at reboot :/
[10:49] <acalvo> Kioob`Taff, did you update from 0.61.4 to 0.61.7 too?
[10:52] <Kioob`Taff> yep
[10:57] <acalvo> haven't found any upgrade guide on the ceph documentation
[10:57] <acalvo> at least from 0.61.4 to 0.61.7
[10:59] <Kioob`Taff> acalvo: from 0.61.4 to 0.61.5, you need to upgrade all MON
[10:59] <Kioob`Taff> one by one doesn't work
[10:59] <Kioob`Taff> I check
[11:00] <joao> uh?
[11:00] <Kioob`Taff> http://ceph.com/docs/master/release-notes/#v0-61-5-cuttlefish
[11:00] <joao> acalvo, I'd love to see some logs with debug mon = 20
[11:00] <Kioob`Taff> « This release fixes a 32-bit vs 64-bit arithmetic bug with the feature bits. An unfortunate consequence of the fix is that 0.61.4 (or earlier) ceph-mon daemons can’t form a quorum with 0.61.5 (or later) monitors. To avoid the possibility of service disruption, we recommend you upgrade all monitors at once. »
[11:00] * Cube (~Cube@ has joined #ceph
[11:01] <Kioob`Taff> but, "failed to create new leveldb store" indicate a different problem
[11:02] <joao> my guess is that it is a *way*too*verbose* message in the store conversion code (bobtail -> cuttlefish), that doesn't really need to create a store, just test its creation to assess the necessity of a conversion
[11:02] <joao> but given I'm not looking at the code right now I'm mostly talking out of my ass
[11:03] <joao> then again, if the monitor isn't coming up, I guess something must be happening
[11:03] <acalvo> I'll try it with more verbosity
[11:03] <joao> i.e., I'm assuming the monitor isn't coming up; acalvo ?
[11:03] <acalvo> there are just 2 mon 1 mds
[11:03] <acalvo> all mon are on 0.61.7
[11:04] <joao> are both monitors running?
[11:04] <acalvo> and yes, it fails trying to start mon1
[11:04] <joao> okay
[11:04] <acalvo> nope, none of them
[11:06] <joao> acalvo, before that, would you mind running an ls /var/lib/ceph/mon/ceph-1 ? (or whatever is the path to your mon store)
[11:08] <acalvo> sure
[11:08] <joao> well, I just remembered that if you aren't running with the right permissions, leveldb will fail to obtain a LOCK and fail with such sort of error message
[11:08] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[11:08] <joao> also, that might happen if there's another monitor running (same issue with the LOCK, which is a good thing in that case)
[11:09] <acalvo> yes, I get that
[11:09] <joao> so, I'm going to grab some coffee now; be back in 10m
[11:09] <acalvo> thanks!
[11:10] <acalvo> I'm using the default path for the monitors and metadataservers
[11:10] <acalvo> osds are stored under another path (/media/storage/osd/ceph-$id)
[11:10] <acalvo> however, listing /var/lib/ceph gives an empty directory
[11:11] <acalvo> but it should have the mon and mds files, right?
[11:14] <acalvo> it seems that all contents for /var/lib/ceph on node 1 (ceph-0) are lost
[11:15] <loicd> ccourtaut: morning sir
[11:16] <acalvo> but node 2 (ceph-1) is ok
[11:16] <loicd> http://marc.info/?l=ceph-devel&m=135956069705397&w=4 WG: RadosGW S3 Api is probably of interest to you, in case you missed it
[11:16] <ccourtaut> loicd: morning!
[11:16] <loicd> although ... it seems more like a debug issue ... ;-)
[11:17] <joao> acalvo, do you happen to have those on other partitions? might you need to mount them?
[11:17] <acalvo> only OSDs are on separated partitions
[11:18] <acalvo> MON and MDS (and RADOSGW) are on local partitions
[11:18] <ccourtaut> loicd: yes seems to
[11:19] <acalvo> servers had a power failure, so they were hard turned off
[11:19] <joao> I can't come up with anything that would cause ceph to lose all its data on the stores by own ceph's fault
[11:19] <joao> maybe it's the disk?
[11:20] <acalvo> maybe
[11:20] <acalvo> could I recover that from the other node?
[11:20] <joao> anyway, if you still have node 2's you should be able to bring back the monitors
[11:20] <joao> acalvo, yes
[11:20] <joao> basically, just recreate the monitor on node 1
[11:20] <acalvo> a regular start does not bring back the monitors, as it tries to connect to node 1 (ceph-0) and start from there
[11:20] <joao> manually even
[11:21] <joao> mkdir -p /var/lib/ceph/mon/ceph-FOO
[11:21] <joao> ahem, you need the monmap
[11:21] <joao> alright
[11:21] <acalvo> not a mkcephfs I believe
[11:21] <joao> still have node-2, right? does that monitor start?
[11:21] <acalvo> (or ceph-deploy)
[11:22] <acalvo> no from the init script
[11:22] <joao> acalvo, I'm familiar with doing this all manually; don't really know how to recover stuff with mkcephfs or ceph-deploy
[11:22] <joao> might be possible, no idea
[11:23] <joao> acalvo, set debug mon = 20 on node 2's (the monitor that still has its data in place)
[11:23] <acalvo> should I run it manually?
[11:23] <joao> and 'ceph-mon -i BAR'
[11:24] <joao> and pastebin the log
[11:24] <joao> might be worth to 'mv /var/log/ceph/ceph-mon.whatever.log /var/log/ceph/ceph-mon.whatever.log.2' to get the old log out of the way before you run the monitor
[11:25] <joao> we just really want whatever is outputted in the new run
[11:26] <loicd> ccourtaut: FYI the deadline for the blueprints submission is today.
[11:26] <acalvo> http://pastebin.com/eNYJ4k0s
[11:26] <joao> loicd, ccourtaut, you guys submitting other blueprints besides the erasure coding one?
[11:27] <joao> acalvo, ah, so it comes up, just doesn't form a quorum because there's no one else to form a quorum with
[11:27] <joao> that's great
[11:27] <joao> cool, kill it
[11:27] <acalvo> done
[11:27] <joao> ceph-mon -i b --extract-monmap /tmp/ceph.monmap -d
[11:28] <joao> copy /tmp/ceph.monmap to the other node
[11:28] <joao> on the other node:
[11:28] <loicd> joao: I contemplated submitting blueprints for sub tasks of erasure code ( plugin library, refactor PGBackend ... ) but it would flood the submission and I'm not sure this level of detail is of interest to many people. What do you think ?
[11:28] <joao> mkdir -p /var/lib/ceph/mon/ceph-FOO
[11:28] <joao> ceph-mon -i FOO --mkfs --monmap /tmp/ceph.monmap --debug-mon 20 -d
[11:28] <joao> pastebin the result
[11:29] <joao> if nothing craps out, ceph-mon -i FOO
[11:29] <joao> and you should be good to go
[11:29] <joao> (after restarting node 2's mon)
[11:30] <acalvo> ceph-mon: created monfs at /var/lib/ceph/mon/ceph-a for mon.a
[11:30] <joao> loicd, I believe that specificity is always great, even if the discussion doesn't focus on that level of granularity
[11:31] <joao> although, if those blueprints can be dissociated from the erasure coding one, you could create new ones I guess; if they're not, then you should probably nest them on the erasure coding blueprint?
[11:31] <joao> acalvo, cool
[11:32] <joao> acalvo, before we go nuts, back up your other monitor store
[11:32] <joao> just in case
[11:32] <joao> then fire up both monitors
[11:33] <acalvo> http://pastebin.com/t5dWaqu3
[11:33] <acalvo> complains about keyrings
[11:34] <joao> ah, yeah, guess I forgot about that
[11:34] <joao> acalvo, pvt me the contents of /etc/ceph/*keyring* on both nodes?
[11:35] <loicd> joao: I think they can be dissociated from erasure coding. I'll give it a try today for backfilling. Good idea ;-)
[11:36] <joao> loicd, I woke up mighty inspired today, don't get used to it :p
[11:37] <joao> acalvo, this is weird; should you have a /etc/ceph on the node that lost its data?
[11:38] <joao> acalvo, do you still have /etc/ceph/ceph.conf on that node?
[11:38] <loicd> joao: :-D
[11:40] <ccourtaut> joao: i submitted a blueprint to maintain a S3 compliance page in the documentation
[11:40] <ccourtaut> i already started to create a draft version of it
[11:40] <loicd> ccourtaut: https://github.com/kri5/ceph/blob/a02912a945edbf70aeb48f637d6db1b45af72f2e/doc/radosgw/s3_compliance.rst looks good
[11:41] <ccourtaut> loicd: was going to post it here :D
[11:41] <ccourtaut> loicd: thanks btw
[11:41] <loicd> you may want to use this link instead of the commit that shows the diff
[11:41] <ccourtaut> loicd: indeed
[11:41] <maciek> hi, how to manually add client.admin.keyring to mds? I accidentally removed it from mds :|
[11:41] <loicd> ccourtaut: :-) sorry I spoiled it !
[11:41] <maciek> or maybe regenerate a new one?
[11:41] * jluis (~JL@89-181-148-68.net.novis.pt) has joined #ceph
[11:41] <maciek> its only devel env
[11:42] <ccourtaut> loicd: should be updated on the blueprint page
[11:42] <loicd> ccourtaut: this is a great example.
[11:42] <joao> okay, I'll be back in 20 minutes or so; will have to run shortly after that
[11:44] <loicd> It would be useful if http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking#Detailed_Description described the page structure so that it can be commented on by interested parties. How is the document divided ? Why did you choose these parts ? What is the semantic of each column ? How is it going to be updated ? etc.
[11:44] <loicd> ccourtaut: ^
[11:45] <ccourtaut> loicd: ok
[11:49] <loicd> ccourtaut: when updating the file, what if a URL is longer than the size of the cell ? You manually resize it ?
[11:50] <ccourtaut> loicd: for now, yes
[11:50] <ccourtaut> i need to find a way to avoid that
[11:50] <loicd> I was going to suggest that instead of URLs to master you should list URLs to a specific commit. Otherwise they will drift away as code changes ;-)
[11:50] <loicd> hence the question about updating for long urls ;-)
[11:51] <ccourtaut> :)
[11:54] <loicd> reading http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadListMPUpload.html I wonder : would it make sense to have one entry in the table for each parameter & each response element ?
[11:54] <loicd> I suspect that not *all* of them are implemented and having a list of those that are missing would make sense. what do you think ?
[11:54] <ccourtaut> loicd: don't know, i did that at first
[11:55] <acalvo> joao, yes, I do have the whole /etc/ceph directory
[11:55] <ccourtaut> but would be harder to maintain
[11:55] <ccourtaut> as there already is a link to the page that lists all the features above the table
[11:56] <loicd> ccourtaut: I think that's exactly the kind of discussion that is relevant to the summit. Would you like me to add this to the blueprint or will you do it ?
[11:56] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[11:56] <loicd> ccourtaut: a key use case of this list is:
[11:56] <ccourtaut> maybe you should add it
[11:56] <loicd> a) a casual developer sees the list
[11:57] <ccourtaut> loicd: i just updated the blueprint with more details
[11:57] <loicd> b) (s)he sees response element KeyMarker of List multipart bucket is missing. Clicks on the link describing it, in the table.
[11:57] <acalvo> joao, contents of /etc/ceph on both nodes should be the same
[11:58] * yy-nm (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130618035212])
[11:58] <loicd> c) it looks achievable and (s)he clicks on the link to the area where it should probably be implemented. Gives it a try and submit a pull request.
[11:58] <joao> acalvo, do you have a 'keyring' option on your ceph.conf?
[11:58] * loicd being over optimistic ;-)
[11:58] <ccourtaut> loicd: :)
[11:58] <acalvo> yes
[11:58] <ccourtaut> don't know if over optimistic but i completely see the point yes
[11:59] <joao> acalvo, does said keyring exist and have a '[mon.]' entity in it?
[11:59] <acalvo> correction: yes but only for the rados entry
[11:59] <joao> ah
[11:59] <ccourtaut> the main goal to achieve is to have a complete document, but still easy to read for newcomers
[11:59] <joao> acalvo, okay, check your keyrings for a [mon.] entry
[12:00] <joao> then you should be able to rerun 'ceph-mon -i FOO --mkfs --monmap /tmp/monmap -k /path/to/said/keyring --debug-mon 20 -d'
[12:00] <acalvo> mon entries only have the hostname and the ip address
[12:00] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) has joined #ceph
[12:00] <joao> acalvo, I meant a keyring file
[12:01] <joao> grep 'mon' /etc/ceph/*keyring*
[12:01] * rudolfsteiner (~federicon@141-77-235-201.fibertel.com.ar) Quit ()
[12:01] <acalvo> /etc/ceph/keyring.radosgw.gateway: caps mon = "allow r"
[12:02] <joao> acalvo, okay, as I don't have much time I'll have to hack around it, but it should work
[12:02] <acalvo> so, still the rados
[12:03] <joao> copy /var/lib/ceph/mon/ceph-FOO/keyring from your good node to the bad node's /tmp/ceph.keyring
[12:03] <acalvo> should I copy the keyring in /var/lib/ceph/mon/ceph-b/keyring?
[12:03] <acalvo> ok
[12:03] <loicd> ccourtaut: exactly
[12:03] <loicd> http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking#Detailed_Description updated
[12:04] <joao> then 'ceph-mon -i FOO --mkfs --monmap /tmp/monmap -k /tmp/keyring --debug-mon 20 -d'
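[Editor's note] Pulling joao's steps together, the whole recovery of the lost monitor from the surviving one looks roughly like this. This is a sketch assembled from the conversation above; the mon IDs a/b and the paths are the ones used in this session, and the flags are per the ceph-mon CLI of this release series:

```
# On the good node: stop mon.b, then extract the current monmap from its store.
ceph-mon -i b --extract-monmap /tmp/ceph.monmap -d

# Copy the monmap and mon.b's store keyring over to the node that lost its data:
#   /tmp/ceph.monmap                  -> bad node, /tmp/ceph.monmap
#   /var/lib/ceph/mon/ceph-b/keyring  -> bad node, /tmp/ceph.keyring

# On the bad node: recreate the mon store from the monmap and keyring.
mkdir -p /var/lib/ceph/mon/ceph-a
ceph-mon -i a --mkfs --monmap /tmp/ceph.monmap -k /tmp/ceph.keyring --debug-mon 20 -d

# If nothing craps out, start both monitors so they can form a quorum again.
ceph-mon -i a
ceph-mon -i b
```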
[12:05] <acalvo> ok, good to go
[12:05] <acalvo> I'm going to try to start mon on node 1
[12:06] <loicd> ccourtaut: the more I think about it the more it becomes clear that the "Supported" column will not be "Yes" in many cases but "Partially" and that it will require a level of detail that amounts to listing the supported behavior/parameters/response headers
[12:06] <joao> ceph-mon -i FOO --debug-mon 20
[12:06] <joao> well, you can probably get rid of debug mon
[12:06] <acalvo> 2013-07-31 10:06:03.505194 7f4901ef1780 0 mon.a@-1(probing) e0 my rank is now 0 (was -1)
[12:06] <acalvo> so, same output as node 2
[12:06] <joao> that's okay
[12:06] <acalvo> which is expected
[12:06] <acalvo> that leaves only the MDS
[12:06] <acalvo> it was on node 1
[12:06] <acalvo> but it's completely gone
[12:07] <joao> I don't think the mds has any disk state to begin with
[12:07] <loicd> the first entry in https://github.com/kri5/ceph/blob/a02912a945edbf70aeb48f637d6db1b45af72f2e/doc/radosgw/s3_compliance.rst#operations-on-buckets
[12:07] <joao> once you have a quorum you should be able to get the full keyring with 'ceph auth export -o /tmp/keyring'
[12:07] <acalvo> let's try to start then
[12:08] <loicd> shows you made your research in the code ccourtaut ^
[12:08] <joao> and that should be the only thing you need, maybe aside from a monmap
[12:08] <acalvo> is it necessary to do that (keyring and monmap) since node 2 was ok and we have recreated node 1?
[12:09] <loicd> ccourtaut: for the others maybe it's enough to have a single link to the code. It's a hint for the contributor : without this link he would not know where to look. It's no replacement for doing the research you did and figure out all the areas of the code that are impacted by the implementation.
[12:10] <acalvo> gets stuck in "Starting ceph-create-keys on dfs01..." ...
[12:10] <loicd> anyway, that's what I have regarding your blueprint ;-)
[12:10] <cmdrk> blargh
[12:10] <cmdrk> ran bonnie++ on all of my OSDs raw disks last night. conclusion: they're all terrible ;)
[12:10] <acalvo> same error as with 0.61.4 "INFO:ceph-create-keys:ceph-mon admin socket not ready yet."
[12:11] <cmdrk> would high latency or low throughput matter to ceph more in a pool of otherwise identical OSDs/disks?
[12:13] <cmdrk> im guessing i need to replace some of the disks with high latencies.. ive been seeing a lot of old requests complaints and waiting for subops
[12:21] <acalvo> a journal file (/var/lib/ceph/osd/ceph-FOO/journal) is the same for all nodes? can I copy it to a lost node to try to recover it?
[12:23] <jluis> acalvo, you shouldn't copy it
[12:24] <acalvo> can it be recreated somehow?
[12:25] <jluis> if you lost the journal, no; but if the other osds have their journal and you have the data replicated and they come up just fine, you should be okay
[12:25] <jluis> well, if your data is replicated between the two nodes that is
[12:25] <MACscr> Do that many of you use ceph for VM storage? Wondering if i should use it or something like zfs instead
[12:25] <acalvo> it is, but one OSD is not coming up
[12:26] <acalvo> HEALTH_WARN 624 pgs degraded; 624 pgs stuck unclean; recovery 1995/3990 degraded (50.000%)
[12:26] <jluis> acalvo, why?
[12:27] <acalvo> pastebin if it helps: http://pastebin.com/ikSTZPFX
[12:28] <jluis> acalvo, it says you don't have a /media/storage/osd/ceph-0 ?
[12:28] <jluis> maybe it got lost as well?
[12:28] <acalvo> but it exists and it's filled
[12:28] <acalvo> [root@fileserver1 ~]# ls /media/storage/osd/ceph-0/
[12:28] <acalvo> ceph_fsid current fsid keyring magic ready store_version whoami
[12:28] <jluis> did you recreate it?
[12:29] <acalvo> no, it was fine
[12:29] <acalvo> only /var/lib/ceph was lost on node 1
[12:30] <jluis> acalvo, is your journal on a partition?
[12:31] <jluis> or a different disk altogether?
[12:31] <acalvo> the journal file was on /var/lib/ceph/osd/ceph-0/journal
[12:31] <acalvo> the osd data is in a separate disk (/media/storage)
[12:31] <jluis> so it was a file?
[12:31] <acalvo> yes
[12:31] <jluis> and no such file exists?
[12:32] <acalvo> from node 2: -rw-r--r--. 1 root root 1000M jul 31 10:22 /var/lib/ceph/osd/ceph-1/journal
[12:32] <acalvo> no, it's gone on node 1
[12:33] <jluis> you could try running 'ceph-osd -i FOO --mkjournal -d'
[12:33] <acalvo> yes, I saw that just now
[12:33] <acalvo> 2013-07-31 10:33:45.935174 7fc85b018780 -1 created new journal /var/lib/ceph/osd/ceph-0/journal for object store /media/storage/osd/ceph-0
[12:33] <acalvo> that's promising
[12:34] <acalvo> HEALTH_OK
[12:34] <acalvo> thanks jluis
[12:34] <acalvo> and thanks joao for your time
[12:36] <jluis> jluis == joao; same guy different computer :p
[12:36] <jluis> np
[12:37] <acalvo> hahaha, didn't know!
[12:37] <acalvo> however the init script gets stuck when creating keys
[12:38] <acalvo> but everything (mon, mds and osd) seems to be working now
[12:42] * jasoncn (~jasoncn@ Quit (Ping timeout: 480 seconds)
[12:49] * madkiss1 (~madkiss@089144192103.atnat0001.highway.a1.net) has joined #ceph
[12:51] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Ping timeout: 480 seconds)
[12:58] * jluis (~JL@89-181-148-68.net.novis.pt) Quit (Ping timeout: 480 seconds)
[13:00] * Cube (~Cube@ has joined #ceph
[13:06] * huangjun (~kvirc@ Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[13:06] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:08] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[13:09] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[13:09] * syed_ (~chatzilla@ has joined #ceph
[13:12] * madkiss1 (~madkiss@089144192103.atnat0001.highway.a1.net) Quit (Ping timeout: 480 seconds)
[13:14] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:23] * yanzheng (~zhyan@ has joined #ceph
[13:24] <loicd> ccourtaut: Bucket restrictions and limitations and other transversal rules / behavior should be listed and matched against code such as the one stating that a bucket name can only start with a number, letter and underscore and highlight that underscore is not allowed by S3 to start a bucket name. ( added to http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking#Detailed_Description )
[13:25] <ccourtaut> loicd: great
[13:25] <ccourtaut> loicd: diving into each feature made me go through transversal rules indeed
[13:26] <ccourtaut> loicd: as an other example, http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketPUTacl.html leads to http://docs.aws.amazon.com/AmazonS3/latest/dev/S3_ACLs_UsingACLs.html
[13:27] <loicd> yes :-)
[13:28] <loicd> ccourtaut: what about versioning ? is S3 API versionned in any way ? or is it constantly evolving with no way to know what version you're at ?
[13:28] <ccourtaut> loicd: unfortunately i think it just evolves
[13:29] <loicd> maybe yehudasa knows if S3 is versioned when it changes
[13:29] <ccourtaut> loicd: http://docs.aws.amazon.com/AmazonS3/latest/API/APIRest.html
[13:29] <ccourtaut> Here it states this API Reference (API Version 2006-03-01)
[13:29] <ccourtaut> in the page header
[13:30] <ccourtaut> but i really think that the S3 API has evolved since
[13:30] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:30] <loicd> :-D
[13:30] <loicd> you bet
[13:32] <loicd> IIRC canonical started an independent effort to implement the S3 API as a library ( I may be completely mistaken here ... just a vague remembrance )
[13:32] <ccourtaut> loicd: never heard about it, will search about this though
[13:36] <Kioob`Taff> On my cluster I have a mix of OSD in versions 0.61.4, 0.61.5 and 0.61.7. I suppose it's not a good idea to keep it like that, but is there a way to properly restart OSD, reducing the downtime ?
[13:38] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[13:39] <loicd> ccourtaut: http://ceph.com/events/ceph-developer-summit-emperor/ submission completed yesterday actually. But we're in a timezone where we can pretend we're in the middle of the night still ;-)
[13:39] <ccourtaut> loicd: submission was done before yesterday, as the page has existed since the beginning of the week
[13:39] <ccourtaut> but it has been updated today :)
[13:40] * sleinen1 (~Adium@2001:620:0:25:42f:2382:de:2549) has joined #ceph
[13:40] <loicd> ccourtaut: my only concern is that the submission looks right when people start reviewing them. Updating the submission *after* they started reviewing would make it quite difficult for them ;-)
[13:41] <ccourtaut> loicd: indeed
[13:42] <loicd> I think it looks good as it is. And people will start waking up in los angeles. So it's probably better if we refrain from modifying it further.
[13:42] <ccourtaut> loicd: i think so
[13:42] <ccourtaut> in its current state, it is enough to engage discussion and describe what is to be done
[13:43] <loicd> 4:42 am in los angeles... we probably have two hours left ;-)
[13:43] <loicd> +1 ccourtaut
[13:46] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[13:46] <joao> loicd, claim Anywhere-on-Earth
[13:46] <joao> that would give you what, 14 more minutes? :p
[13:50] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[13:50] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Ping timeout: 480 seconds)
[13:51] * rudolfsteiner (~federicon@200-122-76-249.cab.prima.net.ar) has joined #ceph
[14:01] * tziOm (~bjornar@ has joined #ceph
[14:04] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:05] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:07] * rudolfsteiner (~federicon@200-122-76-249.cab.prima.net.ar) Quit (Quit: rudolfsteiner)
[14:18] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Ping timeout: 480 seconds)
[14:19] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[14:20] * AfC1 (~andrew@2001:44b8:31cb:d400:316d:5b36:6a0:4bb8) has joined #ceph
[14:20] * AfC (~andrew@2001:44b8:31cb:d400:a914:70d2:fc7f:a687) Quit (Quit: Leaving.)
[14:22] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[14:23] * julian (~julianwa@ has joined #ceph
[14:28] * yanzheng (~zhyan@ has joined #ceph
[14:29] * diegows (~diegows@ has joined #ceph
[14:30] * markbby (~Adium@ has joined #ceph
[14:31] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:33] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[14:35] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[14:36] * acalvo (~acalvo@208.Red-83-61-6.staticIP.rima-tde.net) Quit (Quit: Leaving)
[14:45] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:47] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:49] * yanzheng (~zhyan@ has joined #ceph
[14:50] * The_Bishop (~bishop@2001:470:50b6:0:d176:f45d:f651:852a) has joined #ceph
[14:51] * danieagle (~Daniel@ has joined #ceph
[14:54] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) has joined #ceph
[15:06] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:09] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[15:10] * saabylaptop (~saabylapt@2002:5ab8:a453:0:319f:ce8f:13ff:a65f) has joined #ceph
[15:10] * BillK (~BillK-OFT@124-168-243-244.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:11] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[15:16] * Vjarjadian (~IceChat77@ Quit (Quit: A day without sunshine is like .... night)
[15:18] * AfC1 (~andrew@2001:44b8:31cb:d400:316d:5b36:6a0:4bb8) Quit (Quit: Leaving.)
[15:19] * jeff-YF (~jeffyf@ has joined #ceph
[15:22] * jeff-YF (~jeffyf@ Quit ()
[15:28] * joao-laptop (~ubuntu@gw.sepia.ceph.com) has joined #ceph
[15:33] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[15:36] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:39] <loicd> :-)
[15:40] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[15:43] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[15:44] * AfC (~andrew@2001:44b8:31cb:d400:316d:5b36:6a0:4bb8) has joined #ceph
[15:46] * rudolfsteiner (~federicon@ has joined #ceph
[15:48] * huangjun (~kvirc@ has joined #ceph
[15:50] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Ping timeout: 480 seconds)
[15:50] * saabylaptop (~saabylapt@2002:5ab8:a453:0:319f:ce8f:13ff:a65f) Quit (Quit: Leaving.)
[15:52] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[15:55] * X3NQ (~X3NQ@ Quit (Quit: Leaving)
[15:57] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[15:59] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[16:03] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Ping timeout: 480 seconds)
[16:04] * huangjun (~kvirc@ Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[16:11] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[16:14] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[16:15] * syed_ (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130627172038])
[16:20] * AfC (~andrew@2001:44b8:31cb:d400:316d:5b36:6a0:4bb8) Quit (Quit: Leaving.)
[16:25] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[16:25] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[16:27] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Read error: Network is unreachable)
[16:29] * joao-laptop (~ubuntu@gw.sepia.ceph.com) Quit (Quit: Lost terminal)
[16:30] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) has joined #ceph
[16:31] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[16:32] * Cube (~Cube@ has joined #ceph
[16:33] <Kioob`Taff> Hi. In case of 2 hosts, 4 OSD per host, and replication level 3, is it possible to have a CRUSH rule which say that first 2 copies need to be on a different host, and the third just on a different OSD ?
[16:36] * madkiss1 (~madkiss@089144192103.atnat0001.highway.a1.net) has joined #ceph
[16:37] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[16:38] * guerby (~guerby@ip165-ipv6.tetaneutral.net) Quit (Quit: Leaving)
[16:39] * guerby (~guerby@ip165-ipv6.tetaneutral.net) has joined #ceph
[16:39] * rudolfsteiner (~federicon@ has joined #ceph
[16:40] * madkiss (~madkiss@2001:6f8:12c3:f00f:5d40:937b:7b63:14b7) Quit (Ping timeout: 480 seconds)
[16:40] <Kioob`Taff> just :
[16:40] <Kioob`Taff> step chooseleaf firstn 2 type host
[16:40] <Kioob`Taff> step chooseleaf firstn 0 type osd
[16:40] <Kioob`Taff> do the job ?
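[Editor's note] The two steps Kioob`Taff quotes need to sit inside a complete rule (take ... emit) to be valid. A sketch of the usual shape is below; the rule name and ruleset number are illustrative, and the caveat is that the second pass chooses independently, so it can collide with an OSD already picked by the first pass:

```
rule two_hosts_three_replicas {
        ruleset 1
        type replicated
        min_size 3
        max_size 3
        step take default
        step chooseleaf firstn 2 type host
        step emit
        step take default
        step chooseleaf firstn -2 type osd
        step emit
}
```

With 3 replicas, `firstn -2` selects 3-2 = 1 further OSD. Before deploying, something like `crushtool --test -i <compiled map> --rule 1 --num-rep 3 --show-mappings` shows whether any mapping repeats an OSD.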
[16:41] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[16:42] * amatter (~oftc-webi@ has joined #ceph
[16:44] <amatter> Hi guys. I have a cluster with one monitor and 15 osds. I tried to add an additional monitor via ceph-deploy and while it returned without error, no monitor joined (it did appear to complete the mon.mkfs on the mon host). so I manually added the mon using ceph mon add and
[16:44] * Cube (~Cube@ Quit (Quit: Leaving.)
[16:44] <amatter> started the mon manually on the host, but now the mon.0 is stuck in an "electing" mode
[16:45] <amatter> two out of two monitors should be a quorum, so I'm not sure what it's complaining about
[16:47] * iggy (~iggy@theiggy.com) has joined #ceph
[16:47] <janos> you want odd numbers
[16:47] <janos> 2 mons can tie. no quorum
[16:48] * madkiss1 (~madkiss@089144192103.atnat0001.highway.a1.net) Quit (Ping timeout: 480 seconds)
[16:50] <amatter> janos: okay, I've added a third monitor which seems to have booted okay and joined the cluster but the mon.0 is still refusing connections and reporting "electing" state in the log
[16:51] <Gugge-47527> as far as i know, its because the two new mons dont have the needed data yet, and they cant get it from mon0, because it is not in quorum anymore
[16:51] <Gugge-47527> how you go from 1 mon to more, i have no idea :)
[16:51] <Gugge-47527> when you find out, please do tell :)
[16:51] <janos> i have never had the displasure of finding out
[16:51] <amatter> probably have to shut the monitor down and manually edit the monmap, I guess
[16:51] <janos> *displeasure
[16:51] <janos> :O
[16:52] <janos> yeah, sadly i am ignorant on that aspectr
[16:52] <janos> -r
[16:56] * rongze (~quassel@ Quit (Ping timeout: 480 seconds)
[17:00] * gregmark (~Adium@ has joined #ceph
[17:01] <amatter> ok, used ceph-mon --extract-monmap and monmaptool to remove the offending monitor and get back to a quorum of one mon and it's back up
[17:02] <amatter> I first tried to add the third monitor manually with monmaptool but it still wouldn't get to quorum even though the other two monitors showed successful sync with the cluster.
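[Editor's note] amatter's fix, spelled out as commands. This is a sketch: the mon IDs a/b are illustrative, and the monitor being edited should be stopped while the map is round-tripped:

```
# Extract the monmap from the (stopped) surviving monitor's store.
ceph-mon -i a --extract-monmap /tmp/monmap

# Inspect the map, then drop the monitor that never managed to join.
monmaptool --print /tmp/monmap
monmaptool --rm b /tmp/monmap

# Inject the trimmed map back into the surviving monitor and restart it.
ceph-mon -i a --inject-monmap /tmp/monmap
```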
[17:03] * bergerx_ (~bekir@ Quit (Remote host closed the connection)
[17:05] * mschiff_ (~mschiff@pD9511984.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[17:08] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[17:09] * odyssey4me (~odyssey4m@ has joined #ceph
[17:11] * markbby (~Adium@ Quit (Quit: Leaving.)
[17:13] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: leaving)
[17:16] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[17:16] * rudolfsteiner (~federicon@ has joined #ceph
[17:17] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[17:18] <ntranger> Hey all! I'm still messing with ceph, still getting the health warnings, so I started from scratch, to see if I was overlooking something. I was told that since I only currently have it on one node, I need to change the "osd crush chooseleaf type" to 0 instead of the default 1, but when I'm creating the new nodes, this string isn't actually in the ceph.conf. I added it to the last one, and the
[17:18] <ntranger> results were the same. Should this automatically be in the conf file, or is it something that is added?
[17:20] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) Quit (Quit: sprachgenerator)
[17:21] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:21] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:26] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) Quit (Quit: Bye)
[17:26] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:30] <janos> ntranger: that's in the crushmap, not ceph.conf
[17:30] * yehudasa__ (~yehudasa@2602:306:330b:1410:84d3:fab1:232b:b7b5) Quit (Ping timeout: 480 seconds)
[17:30] * julian (~julianwa@ Quit (Quit: afk)
[17:31] <janos> http://ceph.com/docs/master/rados/operations/crush-map/
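[Editor's note] For completeness: `osd crush chooseleaf type = 0` in ceph.conf only takes effect when the initial map is generated; on an existing cluster the equivalent change is made by round-tripping the CRUSH map, as the linked docs describe. A sketch (file names arbitrary):

```
# Fetch and decompile the current CRUSH map.
ceph osd getcrushmap -o /tmp/crushmap
crushtool -d /tmp/crushmap -o /tmp/crushmap.txt

# In /tmp/crushmap.txt, in each replicated rule, change:
#   step chooseleaf firstn 0 type host
# to:
#   step chooseleaf firstn 0 type osd
# so replicas may land on different OSDs of the same (single) host.

# Recompile and inject.
crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
ceph osd setcrushmap -i /tmp/crushmap.new
```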
[17:32] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[17:36] * rudolfsteiner_ (~federicon@ has joined #ceph
[17:38] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[17:38] * rudolfsteiner (~federicon@ Quit (Ping timeout: 480 seconds)
[17:38] * rudolfsteiner_ is now known as rudolfsteiner
[17:39] <ntranger> janos: awesome. thanks. :)
[17:40] <janos> any time
[17:42] * mschiff (~mschiff@ has joined #ceph
[17:51] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[17:51] <joelio> is there a command that can show me the number of pg's per pool
[17:51] <off_rhoden> joelio: "ceph osd dump | grep pg_num"
[17:52] <joelio> off_rhoden: thanks
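[Editor's note] off_rhoden's one-liner prints the whole pool lines; a slightly tidier variant extracts just the pool name and its pg_num. It is illustrated here against a canned `ceph osd dump` excerpt (cuttlefish-era format) so the parsing is self-contained; on a live cluster, pipe `ceph osd dump` into the same awk:

```shell
# Sample 'ceph osd dump' pool lines, stored in a variable for illustration.
dump="pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0"

# Print "<pool name> <pg_num>" for every pool line.
echo "$dump" | awk '/^pool/ { for (i = 1; i <= NF; i++) if ($i == "pg_num") print $3, $(i+1) }'
```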
[17:52] * markbby (~Adium@ has joined #ceph
[17:56] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:00] <Kioob`Taff> Wiki should have different CRUSH examples :p I'm never sure I understand these rules
[18:07] * n1md4 (~nimda@anion.cinosure.com) has joined #ceph
[18:11] * sprachgenerator (~sprachgen@ has joined #ceph
[18:15] * Vjarjadian (~IceChat77@ has joined #ceph
[18:15] * mschiff (~mschiff@ Quit (Ping timeout: 480 seconds)
[18:16] <joelio> hrm, cephfs-fuse mounts fine.. kernel-based mount hangs, giving me I/O errors.. latest stable 0.61.7
[18:17] <joelio> mount error 5 = Input/output error
[18:17] <joelio> ahh - feature set mismatch, my 4008a < server's 204008a, missing 200000
[18:17] * joshd1 (~jdurgin@2602:306:c5db:310:a098:5eea:fe31:af17) has joined #ceph
[18:18] <joelio> damn tunables
[18:20] <joelio> it's still giving the df values as a fraction, rather than the real estate value
[18:20] <joelio> so my 67TB cluster is now 246G
[18:20] <joelio> not an issue for me, but will catch some people out
[18:23] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[18:29] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[18:36] <n1md4> hi. installing a new service, and new to ceph. following the online documentation, there is a bit about using ceph-deploy to install on a server-node, but the command fails http://pastie.org/pastes/8194023/text
[18:36] <n1md4> if any one can be of assistance, I'd appreciate it
[18:37] <alfredodeza> n1md4: I have just made some changes to ceph-deploy that are about to get released soon, could you install the latest version from the repo?
[18:37] <alfredodeza> the changes that were merged yesterday will give you better information as to what is going on
[18:38] <alfredodeza> n1md4: how are you installing ceph-deploy ?
[18:40] <n1md4> alfredodeza: thanks.
[18:40] <n1md4> using http://ceph.com/docs/master/start/quick-ceph-deploy/
[18:40] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:40] <alfredodeza> n1md4: but how are you actually installing ceph-deploy
[18:41] <alfredodeza> via pip? the bootstrap script? or via RPM/DEB ?
[18:43] <n1md4> ah, er, still using the guide, the INSTALL CEPH-DEPLOY section http://ceph.com/docs/master/start/quick-start-preflight/
[18:43] <n1md4> looks like a .deb
[18:43] <alfredodeza> right
[18:43] <alfredodeza> ok
[18:43] <n1md4> installed on a debian wheezy box
[18:43] <alfredodeza> so that will give you what is currently released
[18:44] <alfredodeza> are you familiar with virtualenv/pip ?
[18:44] <n1md4> i'm not
[18:44] <alfredodeza> if not that is OK, we have a bootstrap script
[18:44] <alfredodeza> ok
[18:45] <alfredodeza> can you clone the ceph-deploy repo? git clone https://github.com/ceph/ceph-deploy.git
[18:45] <alfredodeza> after you do, you need to run the bootstrap script: ./bootstrap
[18:45] <n1md4> 2 ticks
[18:45] * jbd_ (~jbd_@34322hpv162162.ikoula.com) Quit (Ping timeout: 480 seconds)
[18:45] <n1md4> alfredodeza: should I remove the current stable ceph install??
[18:46] <alfredodeza> sure
[18:48] * sjusthm (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[18:50] <n1md4> okay, i ran the ./bootstrap script
[18:50] <n1md4> next?
[18:51] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[18:53] <alfredodeza> ok, good, so that means you now have installed what we have on that repo
[18:53] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[18:53] * mschiff (~mschiff@ has joined #ceph
[18:53] <alfredodeza> can you try installing again?
[18:53] <alfredodeza> the error reporting should be much better
[18:54] <alfredodeza> it should be telling you exactly what is going on in the install process
[18:55] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[18:55] <n1md4> yeah, i've just tried that, still failing, but in a shiny/colourful way :)
[18:55] <alfredodeza> excellent, so it should be also telling you what failed
[18:55] <alfredodeza> can you paste that?
[18:55] <alfredodeza> maybe that output can point to the right place :)
[18:56] <n1md4> http://pastie.org/pastes/8194082/text
[18:57] * Vjarjadian (~IceChat77@ Quit (Quit: When the chips are down, well, the buffalo is empty)
[18:58] <alfredodeza> n1md4: aha! so you are missing the gpg keys
[18:58] <alfredodeza> n1md4: GPG error: http://ceph.com wheezy Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 7EBFDD5D17ED316D
[19:00] <alfredodeza> n1md4: it seems an error on our end, with that link not being able to download the gpg key
[19:00] <alfredodeza> this link: https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
[19:00] <alfredodeza> however, I can get to that file just fine from my machine
[19:00] <n1md4> yeah, i was just looking at that .
[19:00] <alfredodeza> can you try and get to that?
[19:01] <n1md4> looks good to me
[19:01] <n1md4> give me a minute
[19:02] <n1md4> okay, added that time ..
[19:02] <n1md4> :)
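For the record, the manual fix for the NO_PUBKEY error above is to import the release key from the same URL and refresh apt; this is only a sketch (it needs network access and sudo, and wget may need --no-check-certificate if the certificate isn't trusted):

```shell
# Import the Ceph release key into apt's trusted keyring, then refresh.
wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
sudo apt-get update
```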
[19:05] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:07] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:09] <n1md4> hmm there seems to be a fundamental problem on the host i want to install on, don't think it's a ceph issue .. i'll be back ;) thanks for the help, alfredodeza
[19:10] <alfredodeza> no problem
[19:10] <alfredodeza> at least now you have a better idea of what is going on :)
[19:12] * jluis (~JL@ has joined #ceph
[19:13] <n1md4> alfredodeza: --no-check-certificate was the way around it, it doesn't trust the ceph.com certificate
[19:15] * yehudasa__ (~yehudasa@2607:f298:a:607:ea03:9aff:fe98:e8ff) has joined #ceph
[19:16] <alfredodeza> oh wow
[19:16] <alfredodeza> ok
[19:16] <alfredodeza> that seems to me that it is something we could address
[19:18] <n1md4> it worked fine on the admin node though ... not sure to be honest
[19:18] * joshd1 (~jdurgin@2602:306:c5db:310:a098:5eea:fe31:af17) Quit (Quit: Leaving.)
[19:24] * rudolfsteiner (~federicon@ has joined #ceph
[19:25] <Psi-jack> Whoah... 0.61.7 already?
[19:26] <Psi-jack> I just upgraded to 0.61.5 like a week ago..
[19:26] * rudolfsteiner (~federicon@ Quit ()
[19:27] * lyncos (~chatzilla@ has joined #ceph
[19:27] <n1md4> alfredodeza: another problem, with creating a mon daemon http://pastie.org/pastes/8194163/text
[19:27] <alfredodeza> n1md4: what happens if you run that failing command on pp-ceph-2 ?
[19:28] <alfredodeza> n1md4: sudo service ceph start mon.pp-ceph-2
[19:28] <lyncos> Hi ! .. I need some advice to fine-tune our ceph installation .. It seems that even if I have 2G bonds on each ceph node, I cannot go over 800mbit/sec ... I think it's because the osd process is CPU bound... any advice ?
[19:28] <loicd> ccourtaut: here is an example of code drifting away http://dachary.org/?p=2182 points to https://github.com/ceph/ceph/blob/962b64a83037ff79855c5261325de0cd1541f582/src/osd/ReplicatedPG.cc#L6661 which has moved away in master, rendering the link useless if it was not tagged with the specific commit
[19:28] <lyncos> I'm using it with openstack
[19:33] <n1md4> alfredodeza: not sure, looked okay, but .. http://pastie.org/pastes/8194180/text
[19:33] * rudolfsteiner (~federicon@ has joined #ceph
[19:34] <n1md4> alfredodeza: exit status 1 .. fail. i'll check it out later.
[19:36] * amatter (~oftc-webi@ Quit (Remote host closed the connection)
[19:38] * dontalton (~don@128-107-239-234.cisco.com) has joined #ceph
[19:39] <dontalton> are there specific permissions needed on ceph.conf when using rbd on a nova compute node?
[19:39] <dontalton> I keep getting this error: http://paste.debian.net/plainh/797e720c
[19:39] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Remote host closed the connection)
[19:39] <loicd> ccourtaut: here is an idea : that would be interesting to discuss in the context of http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking
[19:40] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[19:40] <loicd> what if the code contained a URL to the related S3 spec ( a comment ) at a place that could be considered the "entry point" for someone willing to understand how it's implemented
[19:41] <loicd> it could be a semi arbitrary place in the code, as long as it is related to the feature
[19:41] <odyssey4me> lyncos - what bonding mode are you using?
[19:41] <loicd> the upside would be that git log -S 'URL of the spec' would track it fairly easily
[19:42] <loicd> so you could in theory use this to figure out where / if a given S3 feature is implemented
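loicd's `git log -S` idea can be demonstrated with a throwaway repo (the repo, file name, and spec URL below are all illustrative):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q .
# Embed the spec URL as a comment at the feature's "entry point":
echo '// spec: http://example.com/s3/PUT-object' > feature.cc
git add feature.cc
git -c user.name=t -c user.email=t@example.com commit -qm 'implement PUT object'
# -S lists commits that changed the number of occurrences of the string,
# so the feature's history is recoverable from the URL alone:
git log -S 'http://example.com/s3/PUT-object' --oneline
```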
[19:42] <odyssey4me> lyncos - it sounds to me like youre using a default bonding mode... ideally you should be using 802.3ad and mode l2+l3
[19:44] <lyncos> Transmit Hash Policy: layer2+3 (2)
[19:44] <lyncos> this is what I'm using
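For reference, a Debian-style 802.3ad bond with the layer2+3 hash policy looks roughly like this (interface names and the switch-side LACP setup are assumptions). One caveat relevant to the ~800mbit ceiling: LACP hashes per flow, so any single TCP stream is still capped at one member link; only aggregate traffic across multiple flows can exceed 1 Gbit/s.

```
# /etc/network/interfaces sketch (names are examples; the switch must be
# configured for LACP on the same ports):
auto bond0
iface bond0 inet manual
    bond-slaves eth0 eth1
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    bond-miimon 100
```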
[19:45] <lyncos> Also I have my journal on the same raid5 set as the data.. write cache is enabled (512M) on the raid controller...
[19:46] <lyncos> you think moving it to a single cheap SSD will be faster than using a cached raid set ?
[19:46] <odyssey4me> lyncos - odd, generally the folks here recommend using individual disks for each osd instead of raid sets... and write-focused ssd's for journaling
[19:47] * clayb (~kvirc@proxy-nj1.bloomberg.com) has joined #ceph
[19:47] <odyssey4me> but journaling can also be done on the same disk as the osd, on a separate partition ideally so that you can ensure that it's on the first sectors of each disk
[19:48] <lyncos> you know why they recommend using multiple OSDs ?
[19:48] <lyncos> I guess it's for better cpu utilisation ?
[19:48] <odyssey4me> lyncos - usually due to the fact that ceph handles the 'raid' and because using each disk individually allows ceph to spread the reads/writes across multiple spindles
[19:49] <lyncos> yeah but it doesn't make use of the raid cache ...
[19:49] <odyssey4me> so you can set your controller to enable caching for reads/writes per disk
[19:49] <lyncos> ok yeah I see
[19:49] <odyssey4me> but the best is often to test for your situation and adapt as needed
[19:49] <lyncos> I will have to test
[19:50] <lyncos> I will need bigger ssd then
[19:50] <odyssey4me> my own testing with IBM hardware and RAID controllers found that a native RAID5 setup was comparable to a setup with Ceph with individual disk OSD's.
[19:50] <lyncos> ok this is what I thought
[19:50] <lyncos> I guess going to 10G will help
[19:51] <lyncos> now my clients are 10G but my ceph nodes are bonded...
[19:51] <odyssey4me> If you use SSD's for journaling, bear in mind that the throughput to your SSD will hamper performance if you journal to it for too many drives
[19:52] <lyncos> let's say I'm using a ssd for journal for my 12 drive raid5 set, you think the SSD will keep up ?
[19:52] <lyncos> I mean a single SSD
[19:52] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Ping timeout: 480 seconds)
[19:52] <odyssey4me> all writes are done to your SSD first, then migrated to spinning disks - so it's often best to do journaling for no more than something like 3 spinning disks to 1 SSD
[19:52] <odyssey4me> it depends on the read/write speed of your SSD
[19:53] <lyncos> Ok I see
[19:53] <odyssey4me> if you have spinning disks capable of doing 150MB/s and an SSD capable of doing 150MB/s, then your journal should only be a 1:1 ratio
[19:53] <odyssey4me> but if the SSD can do 300MB/s, then you can journal to it for 2 spinning disks
[19:54] <lyncos> ok ssd speed must be equal to the backend speed
[19:54] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[19:54] <odyssey4me> ideally the SSD must be faster than the back-end
[19:54] <lyncos> ok
[19:54] <lyncos> so 2:1 should be good
[19:54] <odyssey4me> or more, if you have such an SSD
[19:54] <lyncos> what about raid on ssd ?
[19:55] <odyssey4me> if your spinning disks do 50MB/s and the SSD does 150MB/s, then you have a 1 SSD:3 HDD ratio... even better
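The rule of thumb above is just throughput arithmetic; a tiny helper (numbers taken from the examples above) makes it explicit:

```shell
# How many spinning disks one SSD journal can service, assuming sequential
# write throughput is the only limiting factor (integer division; real
# sizing should also account for IOPS and SSD write endurance).
ratio() {
  echo $(( $1 / $2 ))   # ssd_MBps / hdd_MBps
}
ratio 300 150   # prints 2 -> journal for 2 HDDs per SSD
ratio 150 50    # prints 3 -> the 1 SSD : 3 HDD case above
```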
[19:55] <lyncos> crate one big raid set with like 3-4 ssd and put all the journal on the same logical volume
[19:55] <lyncos> *create
[19:56] <odyssey4me> lyncos - dunno... but logically to me SSD's don't have seek times, etc... so it'd be better to split them up so that you can use the disk channels separately
[19:56] <lyncos> you're right .. the raid controller might induce latency
[19:56] <odyssey4me> *ports
[19:56] <lyncos> and I guess latency is what we are trying to avoid
[19:56] <lyncos> by using ssd
[19:56] <odyssey4me> the SSD's give you better latency, which is why you'd use an SSD, yes
[19:57] <odyssey4me> it's specifically to speed up writes
[19:57] <lyncos> Ok
[19:57] <lyncos> I'll do lot of writes on my cluster .. I guess I need to experiment a little
[19:57] * buzwor (~oftc-webi@2-227-187-110.ip187.fastwebnet.it) has joined #ceph
[19:57] <buzwor> hi there
[19:57] <lyncos> will switch to 10G and add some good SSD
[19:57] <lyncos> and split the journal accordingly
[19:58] <buzwor> i've a question: is better iscsi under ceph or ceph under iscsi ?
[19:58] <buzwor> maybe it is a wrong question?
[19:58] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has left #ceph
[19:58] <lyncos> probably what will happen is 2 SSDs for 12 disks, 6/6 .. I will need really good ssds
[19:59] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[19:59] <odyssey4me> lyncos - sure - be sure to read the docs and some blog entries about ssd's and ceph
[19:59] <odyssey4me> they'll help you with testing and determining the best fit
[20:00] <lyncos> Ok thanks a lot
[20:00] <odyssey4me> buzwor - can't say I've tried either, but I would imagine that it'd be best to use Ceph as close to the HDD as possible and put the interface for clients on top of it
[20:00] <clayb> Oh; so if you have a homogenous set of disks (e.g. 8x150MB/s) journaling doesn't really buy anything?
[20:01] <odyssey4me> ie iscsi to clients, ceph underneath to manage storage
[20:01] <buzwor> odyssey4me: so you say to install ceph on a single server and connect these with iscsi?
[20:02] <buzwor> where server is a single machine where disks are installed in
[20:02] <odyssey4me> buzwor - not necessarily - ceph distributes reads across devices, so if you're read-heavy then it may be better to spread the disk devices across storage machines
[20:02] <buzwor> http://ceph.com/w/index.php?title=ISCSI&redirect=no this one
[20:03] <odyssey4me> buzwor - yeah, that looks about right... the ideal is for a client not to be a cluster member, but it does depend on your use-case
[20:04] <buzwor> ok, so: disk <-> ceph <-> iscsi -> client app/client portion of storage
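One way to realize that "disk <-> ceph <-> iscsi -> client" layering is to map an RBD image with the kernel client and export the resulting block device through a standard Linux iSCSI target; this is only a hedged sketch (image name, IQN, and LUN number are made up, and tgt must be installed and running):

```shell
# Map an RBD image, then export the block device as LUN 1 of an iSCSI target.
sudo rbd map rbd/vol1
sudo tgtadm --lld iscsi --op new --mode target --tid 1 \
     -T iqn.2013-07.com.example:rbd.vol1
sudo tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
     -b /dev/rbd/rbd/vol1
sudo tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL
```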
[20:04] * xmltok (~xmltok@pool101.bizrate.com) Quit (Remote host closed the connection)
[20:04] * xmltok (~xmltok@relay.els4.ticketmaster.com) has joined #ceph
[20:04] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[20:04] <gregmark> folks: I'm having trouble restarting individual osds and mons
[20:04] <odyssey4me> clayb - my testing has indicated that your statement is correct... but YMMV
[20:04] <odyssey4me> I need to run - chat tomorrow if you're all around
[20:05] <clayb> Ah thanks Odyssey4Me
[20:05] <gregmark> e.g. on one of my mons, I have /var/lib/ceph/mon/ceph-kvm-cs-sn-10i
[20:05] <gregmark> then if I run service ceph status mon.kvm-cs-sn-10i, it says not found
[20:06] <gregmark> any ideas?
[20:06] * alphe (~alphe@0001ac6f.user.oftc.net) has joined #ceph
[20:06] <alphe> hello all
[20:06] <odyssey4me> clayb - one last thought though... I've seen that with homogenous disks people recommend putting a journal partition on the early sectors and the data on later sectors through two partitions... I guess that cuts the latency, but doesn't make much difference for throughput
[20:07] <clayb> Ah for spinning disks assuming the beginning to read faster?
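The two-partition layout described above (journal on the early sectors, data after it) can be sketched with parted; the device name and sizes here are examples only:

```shell
# Journal on the first (outer, fastest) sectors, data on the remainder.
sudo parted -s /dev/sdb mklabel gpt \
     mkpart journal 1MiB 10GiB \
     mkpart data 10GiB 100%
```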
[20:07] <odyssey4me> gregmark - are you running that from the /etc/ceph directory? try it
[20:07] <gregmark> odyssey4me: /etc/init.d/ceph: mon.kvm-cs-sn-10i not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
[20:07] <gregmark> did it from /etc/ceph
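A hedged guess at what the init script is complaining about: /etc/init.d/ceph only manages daemons that are either declared in ceph.conf or whose data directory contains a `sysvinit` marker file, so one of the following may help (host name taken from the log above):

```shell
# Option 1: mark the mon data dir as sysvinit-managed.
sudo touch /var/lib/ceph/mon/ceph-kvm-cs-sn-10i/sysvinit
sudo service ceph status mon.kvm-cs-sn-10i

# Option 2: declare the daemon in /etc/ceph/ceph.conf instead, e.g.
#   [mon.kvm-cs-sn-10i]
#       host = kvm-cs-sn-10i
```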
[20:07] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[20:09] <alphe> I have problems with ceph ... it is slower than before ... with ceph 0.38 with 5 sata 2 disks on btrfs i could reach 400Mb/s of write speed using giga eth0, now with 2 sata 2 disks on xfs i am around 88Mb/s
[20:10] <alphe> I'm trying to figure out if there is a bottleneck, and my switch tells me that the network is running at giga ethernet speed
[20:10] <alphe> so i try to understand what is the reason of that difference
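A quick way to separate raw cluster write throughput from the client/network path is rados bench (the pool name here is an example):

```shell
# 30-second write benchmark directly against the cluster, bypassing CephFS
# and the client mount entirely; compare the result with a pure network
# test (e.g. iperf) between the client and the OSD hosts.
rados bench -p data 30 write --no-cleanup
```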
[20:12] * xmltok_ (~xmltok@pool101.bizrate.com) has joined #ceph
[20:15] * odyssey4me (~odyssey4m@ Quit (Ping timeout: 480 seconds)
[20:15] * xmltok_ (~xmltok@pool101.bizrate.com) Quit (Remote host closed the connection)
[20:16] * xmltok_ (~xmltok@relay.els4.ticketmaster.com) has joined #ceph
[20:16] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[20:16] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:18] * buck1 (~Adium@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[20:18] * wido (~wido@2a00:f10:104:206:9afd:45af:ae52:80) Quit (Remote host closed the connection)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.