#ceph IRC Log


IRC Log for 2014-08-05

Timestamps are in GMT/BST.

[0:00] * xarses (~andreww@ has joined #ceph
[0:01] <Gorazd> is there already a solution within Ceph, as there is in Swift, to put the container/bucket database on a different SSD-backed OSD device than the one where the objects sit, to get more throughput when writing files? If I read this correctly, there is still an open issue: http://wiki.ceph.com/Planning/Blueprints/Submissions/rgw%3A_bucket_index_scalability
[0:04] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[0:05] * Sysadmin88 (~IceChat77@ has joined #ceph
[0:05] * vmx (~vmx@dslb-084-056-052-102.084.056.pools.vodafone-ip.de) Quit (Quit: Leaving)
[0:07] <sage> Gorazd: yes, you can put the bucket indexes in a different rados pool that is mapped to SSDs
[0:07] <sage> but that only helps so much.. hence, the bucket sharding
[0:07] <sage> which ought to arrive for giant
[0:08] * bkopilov (~bkopilov@ has joined #ceph
[0:08] * sjm (~sjm@ has left #ceph
[0:10] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Quit: Leaving.)
[0:12] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) has joined #ceph
[0:17] <Gorazd> ok thank you sage
[0:19] * stj (~stj@2001:470:8b2d:bb8:21d:9ff:fe29:8a6a) Quit (Ping timeout: 480 seconds)
[0:20] * stj (~stj@tully.csail.mit.edu) has joined #ceph
[0:23] * jeff-YF (~jeffyf@ Quit (Quit: jeff-YF)
[0:23] * ircolle is now known as ircolle-afk
[0:27] * tom2 (~jens@v1.jayr.de) has joined #ceph
[0:30] * Gorazd (~Venturi@93-103-91-169.dynamic.t-2.net) Quit (Ping timeout: 480 seconds)
[0:30] <steveeJ> why would crush choose the same OSD twice when steps are taken from different buckets which happen to have the same hosts?
[0:31] <steveeJ> better said which happen to have some hosts in common
[0:32] * vaminev (~vaminev@ has joined #ceph
[0:36] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[0:45] * pressureman_ (~daniel@g225006000.adsl.alicedsl.de) has joined #ceph
[0:45] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[0:48] * rendar (~I@host45-177-dynamic.20-87-r.retail.telecomitalia.it) Quit ()
[0:55] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[0:56] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Remote host closed the connection)
[0:56] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[0:59] <steveeJ> is there a faster way to copy an rbd image from one pool to another than "rbd cp ..."?
[1:00] <steveeJ> i'd actually want to move it, but that's not supported between pools
[1:01] <cookednoodles> qemu-img
[1:01] <cookednoodles> you can 'convert' from one pool to another
[1:01] <cookednoodles> but its slow
[1:02] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[1:02] <steveeJ> i suppose that uses the API for copy then
[1:02] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[1:04] <dmick> I think you can clone across pools, which would allow lazy copying, but require you to keep the parent around until you promote the child
[1:04] <steveeJ> i wish i could set crush rules on a per image base :/
[1:04] <steveeJ> i could create one pool per image but that sounds like overkill too
[1:04] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[1:06] <steveeJ> but that would really solve my problem the best i guess
[1:07] * andreask (~andreask@ has joined #ceph
[1:07] * ChanServ sets mode +v andreask
[1:07] * zack_dolby (~textual@p8505b4.tokynt01.ap.so-net.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:08] * vaminev (~vaminev@ has joined #ceph
[1:09] * andreask (~andreask@ has left #ceph
[1:11] * AfC (~andrew@jim1020952.lnk.telstra.net) has joined #ceph
[1:11] * AfC (~andrew@jim1020952.lnk.telstra.net) Quit ()
[1:11] * AfC (~andrew@jim1020952.lnk.telstra.net) has joined #ceph
[1:13] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[1:13] <steveeJ> dmick: thanks for that idea! i'll try that out
[1:13] * oms101 (~oms101@p20030057EA023800EEF4BBFFFE0F7062.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:14] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) Quit (Ping timeout: 480 seconds)
[1:15] <steveeJ> would the formula for the pg numbers still be the same if i know i'll only have one image per pool?
[1:15] <steveeJ> or is this completely unrelated?
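
The question above goes unanswered in the channel. For reference, the rule of thumb in the Ceph placement-group documentation of this era is roughly (number of OSDs x 100) / replica count, rounded up to a power of two. It depends on OSD and replica counts rather than on how many images a pool holds, and with many pools the total is a budget to share between them. A minimal sketch of that arithmetic, where the function name and the default of 100 PGs per OSD are assumptions for illustration:

```python
def suggested_total_pgs(num_osds, replicas, target_per_osd=100):
    """Rule-of-thumb PG total: (OSDs * target) / replicas, rounded up to a power of two."""
    if num_osds < 1 or replicas < 1:
        raise ValueError("num_osds and replicas must be positive")
    raw = (num_osds * target_per_osd) / replicas
    power = 1
    while power < raw:
        power *= 2
    return power


# Three OSDs with three replicas: 100 rounds up to 128.
print(suggested_total_pgs(num_osds=3, replicas=3))  # 128
```

With one pool per image, the per-pool pg_num would have to be chosen so that the sum over all pools stays near this total.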
[1:16] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[1:18] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[1:18] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[1:18] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[1:19] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[1:19] * aknapp (~aknapp@ has joined #ceph
[1:22] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Read error: No route to host)
[1:22] * oms101 (~oms101@p20030057EA2FF700EEF4BBFFFE0F7062.dip0.t-ipconnect.de) has joined #ceph
[1:22] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[1:22] * gregmark (~Adium@ Quit (Quit: Leaving.)
[1:24] <mongo> you can do a COW across pools then flatten
[1:24] <mongo> http://ceph.com/docs/firefly/dev/rbd-layering/
[1:24] <mongo> or clone across.
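
The clone-then-flatten approach suggested by dmick and mongo can be sketched as a command sequence. This is a hedged outline, assuming a format 2 image (layering needs it); the pool names, image name, and snapshot name are hypothetical, and the helper only builds the rbd command lines rather than running them:

```python
def cross_pool_copy_commands(src_pool, image, dst_pool, snap="migrate"):
    """Build the rbd commands for a lazy cross-pool copy via clone + flatten."""
    src_snap = f"{src_pool}/{image}@{snap}"
    dst = f"{dst_pool}/{image}"
    return [
        ["rbd", "snap", "create", src_snap],     # snapshot the source image
        ["rbd", "snap", "protect", src_snap],    # clones require a protected snapshot
        ["rbd", "clone", src_snap, dst],         # copy-on-write child in the other pool
        ["rbd", "flatten", dst],                 # copy parent data into the child
        ["rbd", "snap", "unprotect", src_snap],  # possible once no clones depend on it
        ["rbd", "snap", "rm", src_snap],         # remove the temporary snapshot
    ]


# Hypothetical names: pool "rbd", image "vm-disk", target pool "ssd-pool".
for argv in cross_pool_copy_commands("rbd", "vm-disk", "ssd-pool"):
    print(" ".join(argv))
```

Until the flatten completes, the child still depends on the parent snapshot, which is the caveat dmick mentions about keeping the parent around.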
[1:25] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[1:25] * sputnik13 (~sputnik13@ Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:29] * fsimonce (~simon@host225-92-dynamic.21-87-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[1:29] * AfC (~andrew@jim1020952.lnk.telstra.net) Quit (Ping timeout: 480 seconds)
[1:31] <steveeJ> thanks mongo. confirms dmick's idea working!
[1:34] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Remote host closed the connection)
[1:37] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[1:37] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[1:39] * PureNZ (~paul@122-62-45-132.jetstream.xtra.co.nz) has joined #ceph
[1:44] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[1:45] * alram (~alram@ Quit (Ping timeout: 480 seconds)
[1:56] * cookednoodles (~eoin@eoin.clanslots.com) Quit (Quit: Ex-Chat)
[1:57] * ralphte (ralphte@d.clients.kiwiirc.com) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[2:00] * rweeks (~rweeks@pat.hitachigst.com) Quit (Quit: Leaving)
[2:03] * baylight (~tbayly@ has joined #ceph
[2:06] * zack_dolby (~textual@e0109-114-22-0-42.uqwimax.jp) has joined #ceph
[2:08] * zerick (~eocrospom@ Quit (Ping timeout: 480 seconds)
[2:09] * vaminev (~vaminev@ has joined #ceph
[2:09] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[2:11] * joshd1 (~jdurgin@2602:306:c5db:310:6cc7:8f68:673b:92f8) Quit (Quit: Leaving.)
[2:15] * baylight1 (~tbayly@ has joined #ceph
[2:15] * baylight (~tbayly@ Quit (Read error: Connection reset by peer)
[2:16] * rmoe (~quassel@ Quit (Ping timeout: 480 seconds)
[2:17] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[2:19] * rmoe (~quassel@173-228-89-134.dsl.static.sonic.net) has joined #ceph
[2:20] * lofejndif (~lsqavnbok@freeciv.nmte.ch) Quit (Quit: gone)
[2:22] * danieagle_ (~Daniel@ Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[2:23] * DP (~oftc-webi@zccy01cs105.houston.hp.com) Quit (Remote host closed the connection)
[2:24] * vaminev (~vaminev@ has joined #ceph
[2:27] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[2:27] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[2:32] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[2:32] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[2:32] * baylight1 (~tbayly@ Quit (Read error: No route to host)
[2:32] * baylight (~tbayly@ has joined #ceph
[2:39] <steveeJ> mongo: pgs that haven't been copied-for-writing still have the primary osd according to the source pool, right?
[2:41] * rturk is now known as rturk|afk
[2:42] * xarses (~andreww@ Quit (Ping timeout: 480 seconds)
[2:43] * lucas1 (~Thunderbi@ has joined #ceph
[2:46] <muhanpong> help me. I can't visit ceph.com. (actually not only me, but whole city or country I guess, I am from South Korea)
[2:49] <steveeJ> use a proxy
[2:50] <muhanpong> steveeJ: i am using a tor browser. it works.
[2:51] <muhanpong> but this situation is very weird and uncomfortable.
[2:52] <steveeJ> i can't even imagine i guess
[2:55] <muhanpong> steveeJ: since last week, i remember, my ceph nodes also failed to download updates from ceph.
[2:56] <steveeJ> can you resolve the domain name?
[2:57] <muhanpong> yes. sure. i can reach ceph.com, it only answers "forbidden, 403..."
[2:57] <steveeJ> have you contacted your ISP?
[2:58] <steveeJ> or do you think the ceph webserver is blocking you out?
[2:58] <muhanpong> I think the webserver blocks..
[2:59] <muhanpong> wiki and tracker.ceph.com are normal
[2:59] <burley> ubuntu?
[3:00] <muhanpong> burley: yes for my nodes.
[3:00] <burley> got a proxy setup in your apt configs?
[3:00] <burley> like maas or what not
[3:01] <muhanpong> burley: in my env, proxy is not necessary..
[3:01] <burley> right, but if you used maas, it might be setup
[3:01] <burley> check /etc/apt/apt.conf
[3:02] <muhanpong> thank, but i'm not using maas.
[3:02] <steveeJ> flattening a cloned image while it is in use: is this a problem?
[3:02] <lurbs> steveeJ: Don't believe so.
[3:03] <steveeJ> locking mechanisms must go crazy though
[3:05] <steveeJ> i must say, i'm impressed by ceph. again
[3:06] <muhanpong> burley: it does not work from my iphone using its own carrier, my home laptop, also my friends using other cell phone carrier, from their PC.
[3:12] * pressureman_ (~daniel@g225006000.adsl.alicedsl.de) Quit (Quit: Ex-Chat)
[3:13] <burley> what's the Server: HTTP response header when you make the request?
[3:22] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Quit: Leaving.)
[3:24] * Elbandi (~ea333@elbandi.net) Quit (Read error: Connection reset by peer)
[3:24] * Elbandi (~ea333@elbandi.net) has joined #ceph
[3:25] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[3:25] * vaminev (~vaminev@ has joined #ceph
[3:28] * bandrus (~oddo@ Quit (Quit: Leaving.)
[3:33] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[3:37] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[3:41] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[3:42] * LeaChim (~LeaChim@host86-161-89-237.range86-161.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:43] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[3:44] <muhanpong> burley: http://pastebin.com/UE6tdB4z # that's all
[3:48] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[3:50] * KevinPerks (~Adium@2606:a000:80a1:1b00:5d91:1d6d:9895:f502) Quit (Quit: Leaving.)
[3:53] * baylight (~tbayly@ has left #ceph
[3:58] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[4:02] * haomaiwa_ (~haomaiwan@ Quit (Remote host closed the connection)
[4:02] * zhaochao (~zhaochao@ has joined #ceph
[4:02] * haomaiwang (~haomaiwan@ has joined #ceph
[4:08] * vz (~vz@ has joined #ceph
[4:14] * KevinPerks (~Adium@2606:a000:80a1:1b00:6534:7a73:78ac:6f90) has joined #ceph
[4:14] * haomaiwa_ (~haomaiwan@ has joined #ceph
[4:17] * haomaiwa_ (~haomaiwan@ Quit (Remote host closed the connection)
[4:18] * haomaiwa_ (~haomaiwan@ has joined #ceph
[4:18] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) Quit (Quit: Leaving.)
[4:18] * haomaiwang (~haomaiwan@ Quit (Read error: Connection reset by peer)
[4:20] * vz (~vz@ Quit (Read error: Connection reset by peer)
[4:21] * vz (~vz@ has joined #ceph
[4:27] * vaminev (~vaminev@ has joined #ceph
[4:33] * haomaiwang (~haomaiwan@ has joined #ceph
[4:35] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[4:39] * haomaiwa_ (~haomaiwan@ Quit (Ping timeout: 480 seconds)
[4:44] * angdraug (~angdraug@ Quit (Quit: Leaving)
[4:47] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) Quit (Quit: Leaving)
[4:51] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has joined #ceph
[4:56] * ultimape (~Ultimape@c-174-62-192-41.hsd1.vt.comcast.net) Quit (Ping timeout: 480 seconds)
[4:57] * ultimape (~Ultimape@c-174-62-192-41.hsd1.vt.comcast.net) has joined #ceph
[4:57] * Jakey (uid1475@id-1475.uxbridge.irccloud.com) has joined #ceph
[4:59] * MapspaM (~clint@xencbyrum2.srihosting.com) has joined #ceph
[4:59] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Read error: Operation timed out)
[5:02] * tdb_ (~tdb@myrtle.kent.ac.uk) has joined #ceph
[5:02] * tdb (~tdb@myrtle.kent.ac.uk) Quit (Remote host closed the connection)
[5:03] <Jakey> https://www.irccloud.com/pastebin/DKDgnYHa
[5:03] <Jakey> i'm getting this log
[5:03] <Jakey> what does it mean
[5:04] <lurbs> Which part? The scrub, or the peering?
[5:05] <lurbs> Scrub: http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
[5:05] <lurbs> Peering: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#failures-osd-peering
[5:07] <Jakey> oh
[5:07] <Jakey> yeah i don't know what it means
[5:07] <Jakey> is it okay
[5:07] <Jakey> lurbs: last time it is the iptables thats blocking communication
[5:07] <lurbs> Okay? The scrubbing? Yes. The peering? No.
[5:07] <Jakey> i switched it off
[5:07] <lurbs> All PGs should be active+clean.
[5:07] <Jakey> lurbs: whats wrong with the peering
[5:07] <Jakey> lurbs: oh
[5:07] <Jakey> so whats wrong
[5:08] <lurbs> Dunno, may pay to run through the troubleshooting link.
[5:08] <lurbs> Usually it's a network connectivity problem.
[5:09] <Jakey> what the hell
[5:14] * aknapp (~aknapp@ Quit (Remote host closed the connection)
[5:14] * aknapp (~aknapp@ has joined #ceph
[5:14] * aknapp (~aknapp@ Quit (Remote host closed the connection)
[5:14] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[5:14] * aknapp (~aknapp@ has joined #ceph
[5:18] * aknapp (~aknapp@ Quit (Remote host closed the connection)
[5:19] * RameshN (~rnachimu@ has joined #ceph
[5:19] * michalefty (~micha@188-195-129-145-dynip.superkabel.de) has joined #ceph
[5:24] <Jakey> lurbs: this looks okay right?
[5:24] <Jakey> # id weight type name up/down reweight
[5:24] <Jakey> -1 1.19 root default
[5:24] <Jakey> -2 0.21 host node1
[5:24] <Jakey> 0 0.21 osd.0 up 1
[5:24] <Jakey> -3 0.56 host node8
[5:24] <Jakey> 1 0.56 osd.1 up 1
[5:24] <Jakey> -4 0.42 host node4
[5:24] <Jakey> 2 0.42 osd.2 up 1
[5:24] <lurbs> Yep.
[5:25] * Vacum_ (~vovo@i59F79BE3.versanet.de) has joined #ceph
[5:27] <Jakey> lurbs: i'm stuck here
[5:27] <Jakey> 2014-08-05 18:21:10.347268 mon.0 [INF] pgmap v27799: 192 pgs: 159 peering, 33 active+clean; 0 bytes data, 24830 MB used, 1128 GB / 1214 GB avail
[5:28] <lurbs> Find a PG that's in the 'peering' state, using 'ceph pg dump' (PG id is the first column), and then run 'ceph pg $id query'.
[5:28] * vaminev (~vaminev@ has joined #ceph
[5:30] <lurbs> It should return a bunch of JSON, which you can pastebin.
[5:31] <lurbs> But I suspect it's still a network connectivity problem, you may have missed some of the required ports in the iptables fixes.
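
To keep an eye on how many PGs are still not active+clean while troubleshooting, the pgmap line Jakey pasted can be parsed. A small sketch, assuming the status line keeps the "N pgs: count state, count state; ..." shape seen in this log; the function name is made up for illustration:

```python
import re


def pg_state_counts(status_line):
    """Extract the PG total and per-state counts from a pgmap status line."""
    match = re.search(r"(\d+) pgs: ([^;]+);", status_line)
    if not match:
        raise ValueError("no pgmap summary found in line")
    total = int(match.group(1))
    counts = {}
    for part in match.group(2).split(","):
        number, state = part.strip().split(" ", 1)
        counts[state] = int(number)
    return total, counts


line = ("2014-08-05 18:21:10.347268 mon.0 [INF] pgmap v27799: 192 pgs: "
        "159 peering, 33 active+clean; 0 bytes data, 24830 MB used, "
        "1128 GB / 1214 GB avail")
total, counts = pg_state_counts(line)
print(total - counts.get("active+clean", 0))  # 159
```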
[5:32] * Vacum (~vovo@ Quit (Ping timeout: 480 seconds)
[5:34] <Jakey> lurbs: no i turn off the iptables
[5:36] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[5:39] <Jakey> lurbs: do you know how can i pasted the logs to a url within terminal
[5:40] <lurbs> However you did it with the original 'ceph -w' output.
[5:45] * shang (~ShangWu@ has joined #ceph
[5:45] <Jakey> lurbs: heres the pg dump
[5:45] <Jakey> http://sprunge.us/cYeR
[5:46] <lurbs> So what does 'ceph pg 2.2f query' give you?
[5:46] <Jakey> why 2.2f ? theres a bunch of it
[5:47] <lurbs> Just because it's the first on the list, and the problem's probably the same for all of them.
[5:47] <Jakey> and this is the output of that
[5:47] <Jakey> https://www.irccloud.com/pastebin/hCtpgmvK
[5:47] <Jakey> lurbs: lol okay :P
[5:50] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[5:51] <Jakey> lurbs: ??
[5:51] <lurbs> The machine you ran that on, node7, doesn't seem to be part of the cluster. I see node1, node8 and node4 in the 'ceph osd tree' output. Does the pg query command give the same output on those?
[5:52] * kanagaraj (~kanagaraj@ has joined #ceph
[5:53] <lurbs> Oh, node7's the monitor.
[5:53] <Jakey> lurbs: um yes and also the admin
[5:53] <Jakey> lurbs: the output
[5:53] <Jakey> https://www.irccloud.com/pastebin/rE70YHp8
[5:56] <Jakey> lurbs: ???
[5:58] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[5:58] <lurbs> This cluster was built using ceph-deploy, yes? From which host?
[5:58] <Jakey> node7
[5:58] <Jakey> node1 node4 node7 are osd
[5:59] <Jakey> i mean node8
[5:59] <lurbs> From node7, as root, what does 'ceph pg 2.2f query' give you?
[5:59] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[5:59] <Jakey> lurbs: the same thing
[5:59] <Jakey> as i pasted earlier
[5:59] <lurbs> Even as root?
[6:00] <Jakey> yes
[6:00] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[6:01] <lurbs> You may want to ensure that the time on all of the servers is in sync, and compare the output of 'ceph auth list' on them.
[6:01] * aknapp (~aknapp@ has joined #ceph
[6:01] <lurbs> That contains sensitive information, BTW, so you may not want to pastebin it.
[6:01] <lurbs> May not matter if it's just a test cluster.
[6:02] <Jakey> its a test
[6:02] <Jakey> my ceph.conf
[6:02] <Jakey> https://www.irccloud.com/pastebin/AXBhlKA8
[6:02] <Jakey> so i have to set the timer on all host
[6:02] <Jakey> correctly
[6:02] * michalefty (~micha@188-195-129-145-dynip.superkabel.de) Quit (Quit: Leaving.)
[6:03] <Jakey> ?
[6:03] <lurbs> Using NTP, yes.
[6:03] <Jakey> lurbs: can i just set the timer on the server locally?
[6:03] <Jakey> you know just to be fast
[6:03] <Jakey> i mean its quicker
[6:04] <lurbs> I'd check 'ceph auth list' first anyway.
[6:05] <Jakey> lurbs:
[6:05] <Jakey> https://www.irccloud.com/pastebin/P2XTZIhV
[6:05] <lurbs> And also compare what that gives you with the content of any keyring files in /etc/ceph on the various nodes.
[6:06] <Jakey> lurbs: [root@node4 ~]# ls /etc/ceph/
[6:06] <Jakey> ceph.conf rbdmap
[6:06] <Jakey> [root@node4 ~]#
[6:06] <Jakey> i only have that on the osd nodes
[6:06] <Jakey> i think its the same on all osd nodes
[6:07] <lurbs> What about the monitor?
[6:07] <Jakey> no no the directory
[6:07] * tracphil (~tracphil@ Quit (Ping timeout: 480 seconds)
[6:07] <lurbs> Yeah, the directory on the monitor host.
[6:07] <Jakey> [ceph@node7 m_cluster]$ ls /etc/ceph/
[6:07] <Jakey> ceph.client.admin.keyring ceph.conf rbdmap
[6:07] <Jakey> [ceph@node7 m_cluster]$
[6:08] <Jakey> ^up there
[6:08] <lurbs> Does ceph.client.admin.keyring match what you get in 'ceph auth list'?
[6:08] <lurbs> Specifically the key for client.admin.
[6:08] <Jakey> [ceph@node7 m_cluster]$ cat /etc/ceph/ceph.client.admin.keyring
[6:08] <Jakey> [client.admin]
[6:08] <Jakey> key = AQBFrsZTGMXDDhAAcBUm+NxD6hfrw7aVVfiX5w==
[6:08] <Jakey> [ceph@node7 m_cluster]$
[6:09] <Jakey> yes
[6:09] <Jakey> it matches
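
The check lurbs asks for, comparing the client.admin key in each node's keyring with what 'ceph auth list' reports, can be sketched as below. This assumes the simple "[entity]" / "key = ..." keyring layout shown in the paste above; the function names, node contents, and the key string are placeholders, not real credentials:

```python
import re


def keyring_keys(text):
    """Map each [entity] section of a keyring file to its key value."""
    keys = {}
    entity = None
    for raw in text.splitlines():
        line = raw.strip()
        section = re.fullmatch(r"\[(.+)\]", line)
        if section:
            entity = section.group(1)
            continue
        key = re.fullmatch(r"key\s*=\s*(\S+)", line)
        if key and entity:
            keys[entity] = key.group(1)
    return keys


def key_mismatches(keyrings_by_node, expected_key, entity="client.admin"):
    """Return nodes whose keyring lacks the entity or holds a different key."""
    return [node for node, text in sorted(keyrings_by_node.items())
            if keyring_keys(text).get(entity) != expected_key]


# Placeholder key; in practice it would come from 'ceph auth list'.
expected = "AQexamplePLACEHOLDERkey=="
keyrings = {
    "node1": "[client.admin]\n\tkey = AQexamplePLACEHOLDERkey==\n",
    "node4": "",  # no keyring present on this node
    "node7": "[client.admin]\n\tkey = AQexamplePLACEHOLDERkey==\n",
}
print(key_mismatches(keyrings, expected))  # ['node4']
```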
[6:10] * houkouonchi-home (~linux@2001:470:c:c69::2) Quit (Remote host closed the connection)
[6:10] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[6:10] <Jakey> hey
[6:10] <Jakey> [root@node1 ~]# ls /etc/ceph/
[6:10] <Jakey> ceph.client.admin.keyring ceph.conf rbdmap
[6:10] <Jakey> [root@node1 ~]#
[6:10] <Jakey> only node4 doesn't have the ceph.client.admin.keyring
[6:11] <lurbs> Make sure they all match.
[6:11] <lurbs> Also, try "ceph pg 2.2f query" from node1.
[6:12] <Jakey> lurbs: i try it it the same error
[6:12] <Jakey> https://www.irccloud.com/pastebin/0q2lZ9Sp
[6:12] <lurbs> Which host is
[6:13] <Jakey> lurbs: node4 is 50.4
[6:13] <Jakey> the ceph.client.admin.keyring all match for node7 node8 node1
[6:13] <lurbs> You able to put ceph.client.admin.keyring on there, and restart the OSD daemon?
[6:14] <Jakey> on host 50.4 ?
[6:14] <Jakey> manually?
[6:14] <lurbs> Yep.
[6:14] <Jakey> lurbs: okay what about the host that already has it
[6:15] <lurbs> So long as it's the same as the others, and matches 'ceph auth list' it should be fine.
[6:15] <Jakey> lurbs: yeah its the same but i run the query command
[6:15] <Jakey> and its gives the same error
[6:16] * vz (~vz@ Quit (Remote host closed the connection)
[6:16] <lurbs> 'ceph auth list' gives the exact same output on every host?
[6:17] <Jakey> lurbs: yes
[6:17] * vbellur (~vijay@ Quit (Ping timeout: 480 seconds)
[6:21] * pactuser (~pactuser@ has joined #ceph
[6:29] * vaminev (~vaminev@ has joined #ceph
[6:32] * rdas (~rdas@ has joined #ceph
[6:37] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[6:38] * pactuser (~pactuser@ Quit ()
[6:39] * pactuser (~pactuser@ has joined #ceph
[6:40] * aknapp (~aknapp@ Quit (Remote host closed the connection)
[6:41] * aknapp (~aknapp@ has joined #ceph
[6:42] * aknapp (~aknapp@ Quit (Remote host closed the connection)
[6:42] * lucas1 (~Thunderbi@ Quit (Quit: lucas1)
[6:42] * aknapp (~aknapp@ has joined #ceph
[6:48] * vbellur (~vijay@ has joined #ceph
[6:50] * aknapp (~aknapp@ Quit (Ping timeout: 480 seconds)
[7:05] * Cube (~Cube@ has joined #ceph
[7:08] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[7:08] * Cube (~Cube@ Quit (Read error: Connection reset by peer)
[7:08] <bens> if i was high all day and german, It would be a #hashtag
[7:12] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[7:13] * dmick groans
[7:13] * tdb_ (~tdb@myrtle.kent.ac.uk) Quit (Ping timeout: 480 seconds)
[7:14] * tdb (~tdb@myrtle.kent.ac.uk) has joined #ceph
[7:20] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[7:20] * KevinPerks (~Adium@2606:a000:80a1:1b00:6534:7a73:78ac:6f90) Quit (Quit: Leaving.)
[7:21] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[7:26] * lucas1 (~Thunderbi@ has joined #ceph
[7:30] * dmsimard_away is now known as dmsimard
[7:31] * reed (~reed@75-101-54-131.dsl.static.sonic.net) Quit (Quit: Ex-Chat)
[7:31] * michalefty (~micha@p20030071CE5107611CFCE9EA5480902E.dip0.t-ipconnect.de) has joined #ceph
[7:38] * Cube (~Cube@66-87-79-196.pools.spcsdns.net) has joined #ceph
[7:39] * Cube1 (~Cube@66-87-79-196.pools.spcsdns.net) has joined #ceph
[7:39] * Cube (~Cube@66-87-79-196.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[7:45] * lupu (~lupu@ Quit (Ping timeout: 480 seconds)
[7:46] * Cube (~Cube@66-87-79-196.pools.spcsdns.net) has joined #ceph
[7:46] * Cube1 (~Cube@66-87-79-196.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[7:46] * Cube (~Cube@66-87-79-196.pools.spcsdns.net) Quit ()
[7:50] * bkopilov (~bkopilov@ Quit (Ping timeout: 480 seconds)
[7:51] * lucas1 (~Thunderbi@ Quit (Remote host closed the connection)
[8:03] * angdraug (~angdraug@c-67-169-181-128.hsd1.ca.comcast.net) has joined #ceph
[8:04] * vaminev (~vaminev@ has joined #ceph
[8:08] * dmsimard is now known as dmsimard_away
[8:10] * vshankar (~vshankar@ has joined #ceph
[8:10] * vshankar is now known as vz
[8:13] * vaminev_ (~vaminev@ has joined #ceph
[8:13] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[8:13] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) has joined #ceph
[8:15] * lalatenduM (~lalatendu@ has joined #ceph
[8:16] * vaminev (~vaminev@ Quit (Ping timeout: 480 seconds)
[8:19] * ikrstic (~ikrstic@109-93-162-27.dynamic.isp.telekom.rs) has joined #ceph
[8:21] <nizedk> is rbd on top of replicated pool, with writeback tier overlay on top, supported?
[8:21] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[8:21] * mattch1 (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[8:23] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[8:26] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Ping timeout: 480 seconds)
[8:28] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) Quit (Quit: Leaving.)
[8:29] * topro (~prousa@host-62-245-142-50.customer.m-online.net) has joined #ceph
[8:32] * pactuser (~pactuser@ Quit ()
[8:32] * vaminev_ (~vaminev@ Quit (Read error: Connection reset by peer)
[8:34] * lcavassa (~lcavassa@ has joined #ceph
[8:34] * Cube (~Cube@ has joined #ceph
[8:37] * cok (~chk@2a02:2350:18:1012:d0d7:97f2:a0fa:d5b3) has joined #ceph
[8:39] * vaminev (~vaminev@ has joined #ceph
[8:44] * lupu (~lupu@ has joined #ceph
[8:45] * b0e (~aledermue@juniper1.netways.de) has joined #ceph
[8:46] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) Quit (Quit: ZNC - http://znc.in)
[8:46] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) has joined #ceph
[8:56] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[8:57] * rendar (~I@host228-179-dynamic.1-87-r.retail.telecomitalia.it) has joined #ceph
[8:58] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit ()
[8:59] * BManojlovic (~steki@ has joined #ceph
[9:00] * danieljh (~daniel@0001b4e9.user.oftc.net) Quit (Quit: leaving)
[9:04] * lupu (~lupu@ Quit (Ping timeout: 480 seconds)
[9:08] * thb (~me@2a02:2028:6d:e550:7137:b5ae:8f79:fb3d) has joined #ceph
[9:09] * Sysadmin88 (~IceChat77@ Quit (Quit: IceChat - Its what Cool People use)
[9:14] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[9:18] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) has joined #ceph
[9:22] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[9:22] * bazli (bazli@d.clients.kiwiirc.com) has joined #ceph
[9:23] <bazli> hi
[9:23] <bazli> anyone here knows how to list down pgs of selected pool only?
[9:24] <bazli> err.. rephrasing it.. any ceph tool to list down pgs of selected pool only...?
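
bazli's question gets no reply in the channel. One approach that relies only on the PG id format: a PG id such as 2.2f is the numeric pool id, a dot, then a hex suffix, so ids taken from 'ceph pg dump' can be filtered by prefix. A minimal sketch, where the function name and the sample ids (apart from 2.2f, seen earlier in this log) are hypothetical:

```python
def pgs_in_pool(pgids, pool_id):
    """Keep only PG ids that belong to the given numeric pool id."""
    prefix = f"{pool_id}."
    return [pgid for pgid in pgids if pgid.startswith(prefix)]


sample = ["0.1a", "2.2f", "2.30", "12.0", "1.7"]
print(pgs_in_pool(sample, 2))  # ['2.2f', '2.30']
```

The numeric id for a pool name can be looked up with 'ceph osd lspools'.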
[9:26] * ashishchandra (~ashish@ has joined #ceph
[9:28] * ashishchandra (~ashish@ Quit (Remote host closed the connection)
[9:31] * ashishchandra (~ashish@ has joined #ceph
[9:33] * swami (~swami@ has joined #ceph
[9:35] * angdraug (~angdraug@c-67-169-181-128.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[9:36] * fsimonce (~simon@host225-92-dynamic.21-87-r.retail.telecomitalia.it) has joined #ceph
[9:36] * abhi- (~abhi@ has joined #ceph
[9:36] * ksingh (~Adium@2001:708:10:10:a9c1:36dd:25d8:f450) has joined #ceph
[9:36] <ashishchandra> hi cephers... i have configured radosgw for swift... but radosgw does not return all the headers, among them 'x-timestamp' and 'content-type'
[9:37] <ashishchandra> absence of these headers is giving me issues on horizon front where I am not able to upload objects
[9:38] * zack_dolby (~textual@e0109-114-22-0-42.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[9:39] * ashishchandra1 (~ashish@ has joined #ceph
[9:40] * ashishchandra (~ashish@ Quit (Remote host closed the connection)
[9:41] * swami (~swami@ has left #ceph
[9:41] * swami (~swami@ has joined #ceph
[9:43] * zack_dolby (~textual@e0109-114-22-0-42.uqwimax.jp) has joined #ceph
[9:46] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[9:49] <Jakey> why dmick
[9:50] <Jakey> stop being lazy dmick lots of people need your help :P
[9:53] * stewiem2000 (~stewiem20@ has joined #ceph
[9:53] * vaminev (~vaminev@ has joined #ceph
[9:54] * lupu (~lupu@ has joined #ceph
[9:54] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[9:54] * stewiem2000 (~stewiem20@ Quit ()
[9:55] * jordanP (~jordan@ has joined #ceph
[9:56] * stewiem2000 (~stewiem20@ has joined #ceph
[9:57] * stewiem2000 (~stewiem20@ Quit ()
[9:57] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[9:57] * LeaChim (~LeaChim@host86-161-89-237.range86-161.btcentralplus.com) has joined #ceph
[9:59] * stewiem2000 (~stewiem20@ has joined #ceph
[10:00] * stewiem2000 (~stewiem20@ Quit ()
[10:00] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[10:02] * vaminev (~vaminev@ has joined #ceph
[10:03] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[10:03] * Cybertinus (~Cybertinu@cybertinus.customer.cloud.nl) Quit (Remote host closed the connection)
[10:04] * Cybertinus (~Cybertinu@cybertinus.customer.cloud.nl) has joined #ceph
[10:04] * mattch1 (~mattch@pcw3047.see.ed.ac.uk) Quit (Ping timeout: 480 seconds)
[10:05] * cookednoodles (~eoin@eoin.clanslots.com) has joined #ceph
[10:07] * mattch (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[10:07] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[10:07] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) Quit (Quit: ZNC - http://znc.in)
[10:08] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) has joined #ceph
[10:14] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[10:14] <ksingh> [root@storage0111-ib ~]# rados -p cache-pool ls
[10:14] <ksingh> hosts
[10:14] <ksingh> test
[10:14] <ksingh> [root@storage0111-ib ~]# rados -p cache-pool rm test
[10:14] <ksingh> error removing cache-pool/test: (2) No such file or directory
[10:14] <ksingh> [root@storage0111-ib ~]#
[10:15] <ksingh> Can anyone help? Why are objects not getting deleted using the rados rm command?
[10:15] * stewiem2000 (~stewiem20@ has joined #ceph
[10:15] <ksingh> FYI there is no CACHE tier set ?
[10:22] * swami (~swami@ Quit (Quit: Leaving.)
[10:22] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[10:25] * fdmanana (~fdmanana@bl5-173-238.dsl.telepac.pt) has joined #ceph
[10:25] <Jakey> hey fuckers i got this
[10:26] <Jakey> [ceph@node7 m_cluster]$ ceph health
[10:26] <Jakey> HEALTH_OK
[10:26] <Jakey> [ceph@node7 m_cluster]$
[10:26] <Jakey> what does it mean
[10:26] <Clabbe> Jakey: you wont get a heartattack
[10:26] <Clabbe> your health is ok
[10:29] * kapil (~ksharma@2620:113:80c0:5::2222) Quit (Read error: Connection reset by peer)
[10:31] * swami (~swami@ has joined #ceph
[10:34] * b0e1 (~aledermue@juniper1.netways.de) has joined #ceph
[10:35] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) Quit (Remote host closed the connection)
[10:38] * b0e (~aledermue@juniper1.netways.de) Quit (Ping timeout: 480 seconds)
[10:49] * MrBy2 (~MrBy@ has joined #ceph
[10:49] * kippi (~oftc-webi@host-4.dxi.eu) has joined #ceph
[10:50] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[10:50] <kippi> Hi, I'm following the quick install, however I am getting this error: librados: osd.0 authentication error (1) Operation not permitted Error connecting to cluster: PermissionError
[10:50] * lupu (~lupu@ Quit (Ping timeout: 480 seconds)
[10:54] <ksingh> kippi : check auth for osd.0
[10:54] <ksingh> ceph auth list osd.0
[10:54] <ksingh> if its there remove it ( i trust this is your test cluster )
[10:55] <ksingh> sometimes when you try installing ceph multiple times, it does not remove the auth keys from the last installation
[10:55] <ksingh> ceph auth rm osd.0
[11:01] <kippi> remove doesn't seem to be a valid option
[11:02] <ksingh> ceph auth del osd.0
[11:03] <kippi> yeah, just done that :)
[11:03] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[11:03] <kippi> still the same
[11:04] <ksingh> whats your cluster health now ?
[11:05] <ksingh> output of ceph osd tree pls
[11:05] * mgarcesMZ (~mgarces@ has joined #ceph
[11:05] <kippi> HEALTH_ERR 384 pgs stuck inactive; 384 pgs stuck unclean; no osds
[11:06] <ksingh> ceph osd tree
[11:06] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[11:07] <kippi> http://pastebin.com/VrVaFGSC
[11:08] * zack_dolby (~textual@e0109-114-22-0-42.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[11:10] <ksingh> try ceph osd in osd.0 osd.1
[11:11] <ksingh> then try restarting osd services for osd.0 osd.1
[11:11] <ksingh> paste the output on pastebin
[11:11] <mgarcesMZ> hi there
[11:12] <mgarcesMZ> im also giving my first ceph steps (like kippi I believe)
[11:12] <ksingh> whats the problem buddy ?
[11:12] <mgarcesMZ> kippi: do you have something like "public_network =" in ceph.conf (on the deploy node)
[11:12] <mgarcesMZ> ksingh: for now I have 3 osd, 3 mon
[11:13] <mgarcesMZ> running from admin-node everything. the nodes are centos7
[11:13] <mgarcesMZ> I see that ceph-deploy starts the services, but how can I make this persistent with centos7 startup (systemd)
[11:13] <kippi> mgarcesMZ: nope
[11:14] <mgarcesMZ> also, ceph-deploy is failing when I run "ceph --install <node_n>"
[11:14] <mgarcesMZ> I ended up making "yum install ceph" on all nodes
[11:15] <ksingh> mgarcesMZ: do you mean when you reboot your centos7 node, ceph is not getting started by default?
[11:15] <mgarcesMZ> kippi: I was having trouble starting the extra mon, before I setup that
[11:15] <ksingh> did you try it
[11:15] <mgarcesMZ> ksingh: well, to be honest, only yesterday... I was very confused, so today I started everything fresh
[11:15] <mgarcesMZ> let me try
[11:15] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[11:15] <ksingh> ok clean everything , all the stale entries from yesterday
[11:16] <ksingh> use ceph-deploy to do it
[11:16] <ksingh> ceph-deploy install <node> --release firefly
[11:16] <mgarcesMZ> ksingh: I just go back using vmware snapshot
[11:17] <mgarcesMZ> ksingh: everything is running
[11:17] <mgarcesMZ> after the reboot
[11:17] <mgarcesMZ> yay!
[11:17] <mgarcesMZ> amazing! :)
[11:18] * madkiss (~madkiss@tmo-097-45.customers.d1-online.com) has joined #ceph
[11:18] <ksingh> so you are happy now :-)
[11:18] <mgarcesMZ> ksingh: maybe that --release flag helps
[11:19] <mgarcesMZ> ksingh: oh ye! :)
[11:19] <mgarcesMZ> the problem I was having with ceph-deploy install
[11:19] <ksingh> kippi : whats going on
[11:19] <mgarcesMZ> was that he was trying to install el6 packages, in a centos7 (el7)
[11:21] <kippi> I think mgarcesMZ I might have played a bit too much
[11:21] <kippi> what is the recommended OS that ceph-deploy works with? We have tried Ubuntu and CentOS 7 but had issues with both
[11:22] <ksingh> both should work
[11:23] <kippi> so if I do 3 clean installs of Ubuntu 14.04
[11:23] <kippi> and start again?
[11:23] * swami (~swami@ Quit (Quit: Leaving.)
[11:23] <kippi> or am I better off with centos 6?
[11:23] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[11:23] <mgarcesMZ> ksingh: I just ran "ceph-deploy install <node-client> --release firefly" and he changed the ceph.repo to point to el6 :(
[11:24] * vaminev (~vaminev@ has joined #ceph
[11:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[11:26] <ksingh> try this flag --no-adjust-repos
[11:26] <mgarcesMZ> ok
[11:27] <Kioob`Taff> are there ceph APT mirrors with IPv6 support ?
[11:28] <mgarcesMZ> ksingh: should I put all the 3 yum repos in
[11:28] <mgarcesMZ> ".cephdeploy.conf"
[11:29] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Remote host closed the connection)
[11:29] <Kioob`Taff> Xen dom0 stor1:~# ping6 eu.ceph.com
[11:29] <Kioob`Taff> unknown host
[11:29] <Kioob`Taff> Xen dom0 stor1:~# ping6 ceph.com
[11:29] <Kioob`Taff> unknown host
[11:30] <Kioob`Taff> ceph.com is handled by nameservers of dreamhost, which are only available in IPv4
[11:31] <Vacum_> dig AAAA ceph.com
[11:31] <Vacum_> ceph.com. 300 IN AAAA 2607:f298:4:147::b05:fe2a
[11:31] <kippi> ksingh: just rebuilding the vm's
[11:31] * bazli (bazli@d.clients.kiwiirc.com) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[11:31] <Vacum_> ping6 ceph.com
[11:31] <Vacum_> PING ceph.com(ceph.com) 56 data bytes
[11:31] <Vacum_> 64 bytes from ceph.com: icmp_seq=1 ttl=57 time=172 ms
[11:31] <Kioob`Taff> Vacum_: yes, but you resolve from an IPv4 network.
[11:32] <kippi> going to try, ubuntu 14.04 with the installer
[11:32] <Kioob`Taff> Vacum_: you can't resolve ceph.com from an IPv6 DNS resolver
[11:32] <Vacum_> ah!
[11:32] <Kioob`Taff> Authoritative answers can be found from:
[11:32] <Kioob`Taff> ns1.dreamhost.com internet address =
[11:32] <Kioob`Taff> ns2.dreamhost.com internet address =
[11:32] <Kioob`Taff> ns3.dreamhost.com internet address =
[11:32] <Vacum_> :)
[11:33] <ksingh> mgarcesMZ : yep
[11:33] <ksingh> kippi : nice
[11:34] <kippi> ksingh: think that might be easier, start from fresh, then we know the base is good
[11:35] <mgarcesMZ> kippi: thats what I did today
[11:35] <ksingh> kippi : yep , and when you are deploying ceph for the firs time its esssentioal
[11:35] <ksingh> *essential
[11:35] <mgarcesMZ> good thing to try ceph is use vms
[11:35] * i_m (~ivan.miro@gbibp9ph1--blueice2n1.emea.ibm.com) has joined #ceph
[11:35] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[11:36] <kippi> how many machines do I need? 1 deploy node, 1 monitor, 2 osd?
[11:36] <ksingh> with a snapshot 8-)
[11:36] <ksingh> you should take 3 vms
[11:36] * vaminev (~vaminev@ has joined #ceph
[11:36] <ksingh> 1st node : ceph-deploy , monitor OSD
[11:37] <ksingh> 2nd node: MON , OSD
[11:37] <ksingh> 3rd node : mon , osd
[11:37] <ksingh> add 3 disks on each vm for OSDs , so total 9 OSDS
[11:37] <ksingh> hope you are not running out of resources
[11:38] * joao|lap (~JL@ has joined #ceph
[11:38] * ChanServ sets mode +o joao|lap
[11:38] * primechu_ (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[11:40] * primechuck (~primechuc@host-95-2-129.infobunker.com) Quit (Remote host closed the connection)
[11:40] <mgarcesMZ> im using 4 vm... node1 is admin-node/ceph-client, and node1,2,3 are MON and OSD
[11:40] <mgarcesMZ> but using filesystem for the OSD
[11:40] <mgarcesMZ> what are the options I can put in [ceph-deploy-global] inside .cephdeploy.conf ?
[11:41] <mgarcesMZ> also, ceph-deploy assumes hosts can connect to the internet or use repos directly... what if I use rhn/spacewalk ?
[11:41] * lucas1 (~Thunderbi@ has joined #ceph
[11:42] * lucas1 (~Thunderbi@ Quit ()
[11:42] <ksingh> can you put your .cephdeploy.conf file on pastebin
[11:43] <ksingh> i want to see it
[11:43] <ksingh> ok no need , forget it
[11:44] * swami (~swami@ has joined #ceph
[11:44] <ksingh> if you dont have internet connectivity for ceph nodes, then you need to create a local repo server and while installing ceph using ceph-deploy you need to specify local repo server
[11:47] <mgarcesMZ> ok, so no support for rhn/spacewalk
[11:48] <mgarcesMZ> well
[11:48] <mgarcesMZ> maybe it works, must see... first I need to understand why "ceph-deploy install" is trying to install el6 packages in a el7
[11:48] <mgarcesMZ> also, what kernel module I need to mount cephfs ?
[11:53] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[11:53] <mgarcesMZ> kmod-libceph.x86_64 :)
[12:00] <mgarcesMZ> ksingh: when I try the mount, using the instructions in the docs... I just get modprobe: FATAL: Module ceph not found.
[12:04] <ksingh> the machine on which you are trying to mount cephfs , should be a ceph client
[12:05] <ksingh> and it should connect to ceph cluster
[12:06] <mgarcesMZ> cant I use the admin-node (not osd, mon or mds) ?
[12:06] <mgarcesMZ> also, what do I need installed to have "ceph" filesystem available? I thought it came with the kernel (3.10.0-123.4.4.el7.x86_64)
[12:07] * b0e1 (~aledermue@juniper1.netways.de) Quit (Quit: Leaving.)
[12:15] <mgarcesMZ> I used ceph-fuse for testing...
[12:15] <mgarcesMZ> not very good results :(
[12:15] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) has joined #ceph
[12:16] <mgarcesMZ> on a local fs: (1.1 GB) copied, 20.8747 s, 51.4 MB/s / on ceph fs: (1.1 GB) copied, 309.294 s, 3.5 MB/s
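The two copy timings quoted above can be compared directly. This is only a rough sketch using the durations from the log; it assumes both runs copied the same amount of data, and the function name is made up for illustration.

```python
# Durations quoted in the log for copying roughly 1.1 GB in each run.
local_seconds = 20.8747      # copy onto the local filesystem
ceph_fuse_seconds = 309.294  # copy onto the ceph-fuse mount


def slowdown(baseline_s: float, other_s: float) -> float:
    """Return how many times longer other_s is than baseline_s for the same data."""
    return other_s / baseline_s


factor = slowdown(local_seconds, ceph_fuse_seconds)
print(f"the ceph-fuse copy took about {factor:.1f}x as long as the local copy")
```

A single sequential copy like this says little about concurrent workloads, so the ratio is best read as a first impression rather than a benchmark.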
[12:18] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[12:19] <joao|lap> mgarcesMZ, you may need the cephfs package
[12:19] <mgarcesMZ> joao|lap: in what repo is that?
[12:19] <joao|lap> you'll need mount.ceph at the very least, and I don't think that comes with the distro even if the kernel is ceph-ready
[12:20] <mgarcesMZ> but that package is in Ceph yum repos?
[12:20] <joao|lap> mgarcesMZ, you should find that somewhere in ceph.com
[12:20] <joao|lap> mgarcesMZ, some iteration of it
[12:20] <mgarcesMZ> ceph.mount I have installed
[12:20] <joao|lap> then you should need nothing
[12:20] <kippi> ok, trying again, machines built
[12:21] <joao|lap> well, you'll probably need an mds running
[12:21] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[12:21] <kippi> should I use the 5 min start?
[12:21] <mgarcesMZ> joao|lap: I have it
[12:21] <mgarcesMZ> joao|lap: e6: 1/1/1 up {0=node1=up:active}
[12:21] <joao|lap> check the kernel config on the client and grep for CEPH
[12:22] <kippi> or should I use install quick?
[12:22] <mgarcesMZ> joao|lap: how?
[12:23] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[12:23] <joao|lap> mgarcesMZ, http://unix.stackexchange.com/questions/83319/how-can-i-know-if-the-current-kernel-was-compiled-with-a-certain-option-enabled
[12:24] <joao|lap> that was just the first result on google, may not work out of the box for you
[12:25] <mgarcesMZ> CONFIG_CEPH_FS is not set
[12:25] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[12:25] <mgarcesMZ> joao|lap: i think I know what to do
[12:26] <kippi> ok, this is the issue I was having before with the installer
[12:26] <kippi> RuntimeError: remote connection got closed, ensure ``requiretty`` is disabled for osd0
[12:26] <ninkotech> hi... can someone please give me few big names using rbd in production?
[12:29] * fdmanana (~fdmanana@bl5-173-238.dsl.telepac.pt) Quit (Quit: Leaving)
[12:29] <ninkotech> few good use cases would help me to push rbd forward...
[12:30] <cookednoodles> ebay, cern
[12:31] <cookednoodles> do some googling, there are a ton of talks on this :P
[12:31] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[12:31] <ninkotech> trying to use startpage. but didnt find much yet
[12:31] <kippi> ksingh: The issue with ubuntu and the installer is that it thinks requiretty is enabled, however the sudoers has Defaults:userNAME !requiretty
[12:32] <mgarcesMZ> kippi: you need a rule in sudoers
[12:32] <kippi> already in there
[12:33] <mgarcesMZ> put this on your /etc/sudoers.d/ceph file: cephadmin ALL = (root) NOPASSWD:ALL
[12:33] <mgarcesMZ> and Defaults:cephadmin !requiretty
[12:33] <mgarcesMZ> (cephadmin is my user)
[12:33] <mgarcesMZ> I think the problem is... you have Default, not Defaults
[12:33] <mgarcesMZ> right?
[12:33] <kraken> http://i.imgur.com/RvquHs0.gif
[12:33] <mgarcesMZ> ehehe
[12:35] <kippi> mgarcesMZ: you are good :)
[12:35] <mgarcesMZ> ;)
[12:36] * bitserker1 (~toni@63.pool85-52-240.static.orange.es) Quit (Read error: No route to host)
[12:36] * bitserker (~toni@63.pool85-52-240.static.orange.es) has joined #ceph
[12:37] <mgarcesMZ> Im just fighting to have cephfs available :(
[12:37] <cookednoodles> cephfs is pretty iffy and not designed for production
[12:37] <mgarcesMZ> what is then?
[12:38] <cookednoodles> ceph its self
[12:38] <mgarcesMZ> lol
[12:38] <mgarcesMZ> so, only using rados to write files, not as a filesystem?
[12:39] <mgarcesMZ> I have a big question...
[12:39] <cookednoodles> well, its ideal as a backend for VMs
[12:39] <cookednoodles> using qemu/libvirt etc
[12:40] <mgarcesMZ> cookednoodles: in my case, I want to use ceph (or some other technology) to store files, distributed (datacenter lan machines + recovery datacenter wan)
[12:40] <mgarcesMZ> my question is: how is perfomance, regarding nodes in wan?
[12:41] <cookednoodles> its 'ok', proper support for this sort of stuff is coming very soon
[12:41] <mgarcesMZ> can I use the wan nodes with some option so they are treated as geo-located?
[12:41] * lucas1 (~Thunderbi@ has joined #ceph
[12:41] <mgarcesMZ> ok, I know in gluster I can set up nodes as special, using that geo something
[12:41] <mgarcesMZ> geo-replication
[12:42] <mgarcesMZ> so, ceph is going to support geo-replication?
[12:42] <cookednoodles> you can setup placement groups to make sure you have replicas in certain areas, but https://wiki.ceph.com/Planning/Blueprints/Dumpling/RGW_Geo-Replication_and_Disaster_Recovery
[12:42] <cookednoodles> is what you're after
[12:43] <mgarcesMZ> this is what I am after, yes :)
[12:43] * swami1 (~swami@ has joined #ceph
[12:43] <cookednoodles> "Its coming" :P
[12:43] <mgarcesMZ> now I just need to convince my devs to use the API to write files to my ceph
[12:44] <mgarcesMZ> I thought dumpling came before firefly
[12:44] <cookednoodles> it is, but some milestones get moved around, pushed back, delayed etc
[12:46] * lucas1 (~Thunderbi@ Quit (Remote host closed the connection)
[12:47] * swami (~swami@ Quit (Ping timeout: 480 seconds)
[12:48] * cok (~chk@2a02:2350:18:1012:d0d7:97f2:a0fa:d5b3) Quit (Quit: Leaving.)
[12:49] <mgarcesMZ> cookednoodles: this is bad for my use case :(
[12:50] <mgarcesMZ> also, I need to convince the devs to use object storage, not filesystem directly
[12:50] <mgarcesMZ> since ceph-fuse is showing me slow results
[12:51] <cookednoodles> why aren't you using swift or something ?
[12:51] * swami (~swami@ has joined #ceph
[12:51] <mgarcesMZ> cookednoodles: I started using ceph yesterday :)
[12:51] <mgarcesMZ> I really cant answer you that :)
[12:51] <cookednoodles> well openstack swift
[12:52] <mgarcesMZ> nope, I dont have openstack here :(
[12:52] <cookednoodles> gives you a clean interface for storing files across ceph, gluster, anything
[12:52] <cookednoodles> you dont need openstack, just swift
[12:52] <mgarcesMZ> ok
[12:52] <mgarcesMZ> how easy is to install that?
[12:52] * coreping (~xuser@hugin.coreping.org) has joined #ceph
[12:52] <cookednoodles> go check :P
[12:53] <mgarcesMZ> I was just asking on your experience
[12:53] <mgarcesMZ> :)
[12:53] <cookednoodles> the answer is not very, but its faster than writing apis for storing files
[12:53] <mgarcesMZ> ok
[12:53] <mgarcesMZ> so I just set up swift, to point to my ceph
[12:54] <mgarcesMZ> and the devs point their code to write files in swift ?
[12:54] <cookednoodles> yep
[12:54] <mgarcesMZ> cool
[12:54] * abhi- (~abhi@ Quit (Ping timeout: 480 seconds)
[12:56] * swami1 (~swami@ Quit (Ping timeout: 480 seconds)
[12:56] <mgarcesMZ> cookednoodles: now I just need to decide, and talk with my boss, regarding the geo-replication issue... I dont know if its the right way to go
[12:57] <mgarcesMZ> but Swift is a good pointer
[12:57] <mgarcesMZ> since I then can use whatever I want below (ceph, glusterfs)
[12:58] * zhaochao (~zhaochao@ has left #ceph
[12:58] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[12:58] <kippi> ksingh: Just finished the install, but getting HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean. Is this normal on start up?
[12:59] <cookednoodles> kippi, lemme guess 2 osds ?
[12:59] <kippi> yep
[13:00] <cookednoodles> the guide uses 3 :P
[13:00] <kippi> I added osd pool default size = 2
[13:00] <cookednoodles> because thats the default ceph replication
[13:00] <cookednoodles> did you restart everything ?
[13:00] <kippi> cookednoodles: the guide says 2
[13:01] <cookednoodles> I can't remember the syntax, but its normally due to the size
[13:01] <kippi> should it be "osd pool default size = 2", or shouldn't it be "osd_pool_default_size = 2"?
[13:01] <cookednoodles> this issue gets asked here every few days :P
[13:01] <lurbs> That will only affect new pools, not existing ones.
[13:03] <lurbs> 'ceph osd pool set $pool size 2', where $pool is the default pools, rbd, data and metadata.
[13:04] <lurbs> You can check them with 'ceph osd dump | grep size'
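lurbs's point is that "osd pool default size" only applies to pools created afterwards, so each existing pool needs its own "ceph osd pool set" call. A minimal sketch that only builds and prints those command lines for the three default pools named above (rbd, data, metadata); the helper name is hypothetical, and nothing is executed against a cluster.

```python
def pool_size_commands(pools, size):
    """Build one 'ceph osd pool set <pool> size <n>' argv list per existing pool."""
    return [["ceph", "osd", "pool", "set", pool, "size", str(size)] for pool in pools]


# Default pools as named in the log; adjust to whatever 'ceph osd dump' lists.
default_pools = ["rbd", "data", "metadata"]

for argv in pool_size_commands(default_pools, 2):
    # Printed for review rather than run; pass argv to subprocess.run to apply it.
    print(" ".join(argv))
```

Printing first keeps the change reviewable; the result can then be checked with the 'ceph osd dump | grep size' command quoted above.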
[13:09] * mozg111 (~oftc-webi@ has joined #ceph
[13:09] <mozg111> hello guys
[13:09] <mgarcesMZ> hi MACscr
[13:10] <mgarcesMZ> ups
[13:10] <mgarcesMZ> hi mozg111
[13:10] <mozg111> Could someone help me to understand how the cache pool works
[13:10] <mozg111> i have a few questions
[13:11] <mozg111> I am planning to install an ssd disk on every hypervisor server and use it as a part of the cache pool
[13:11] <mozg111> at the moment i've got 4 hypervisor server using rbd as the store for virtual machines
[13:11] <mozg111> my aim is to try to "promote" the cache pool layer to the hypervisors, away from the osd servers
[13:12] <mgarcesMZ> cookednoodles: do I need swift.. cant I just provide the devs with radosgw ?
[13:12] <mozg111> so, my question is how would read/write requests behave?
[13:12] <mozg111> let's say if a guest vm makes a read or write request
[13:12] <cookednoodles> mgarcesMZ, I said use swift as it makes your programmer's lives 100x easier
[13:12] <mozg111> how would the workflow go?
[13:13] <cookednoodles> they then have a common api that you can use for ceph, glusterfs, ftp whatever
[13:13] <cookednoodles> mozg111, I've actually been asking myself the same question
[13:13] <mgarcesMZ> cookednoodles: I am very very confused right now
[13:13] <mgarcesMZ> :)
[13:14] <mozg111> cookednoodles: are you also running kvm + ceph?
[13:14] <cookednoodles> mgarcesMZ, research each part
[13:14] <cookednoodles> mozg111, yep
[13:14] <mozg111> working well?
[13:14] <mozg111> are you using openstack or cloudstack or standalone kvms?
[13:14] <cookednoodles> well we're in final stages of testing the full platform, but apart from our 'okish' benchmarks, its good
[13:14] <cookednoodles> custom rolled with libvirt
[13:14] <mozg111> i c
[13:15] <cookednoodles> openstack didn't meet our needs, and cloudstack is ... cloudstack
[13:15] <mozg111> what are your performance figures?
[13:15] <mozg111> i was also a bit concerned with performance at first
[13:15] <cookednoodles> I'm at 30Mb with 2 osds, I'll say more as we roll out properly :P
[13:16] <cookednoodles> thats with 0 tuning btw
[13:16] <cookednoodles> .
[13:16] <mozg111> but ceph is designed to have good performance for many requests concurrently. single thread operations aren't that good performance wise
[13:16] <cookednoodles> but yes, ceph will be slower, but the advantages outweigh the issues
[13:17] <mozg111> i am using 2 osd servers with 17 osds shared between them
[13:17] <mozg111> and 3 mon servers
[13:17] <mozg111> seems to work okay when it comes to concurrency
[13:17] <ninkotech> cookednoodles: i did my custom cloud solution too :)
[13:17] <ninkotech> with libvirt and kvm
[13:17] * shimo (~A13032@122x212x216x66.ap122.ftth.ucom.ne.jp) Quit (Quit: shimo)
[13:17] <ninkotech> still using local images for safety and raw speed
[13:18] <cookednoodles> safety ? :P
[13:18] <ninkotech> hehe
[13:18] <ninkotech> usually only 1 drive will fail
[13:18] <cookednoodles> what happens if the motherboard fails ?
[13:18] <cookednoodles> what happens if your backplane fails, psus etc
[13:19] <ninkotech> but if central solution would fail somehow.... whole business would be dead
[13:19] <ninkotech> cookednoodles: backups...
[13:19] <cookednoodles> well thats why you use ceph :P
[13:19] <ninkotech> i would like to
[13:19] <ninkotech> but i have hard barriers here
[13:19] <ninkotech> people have fears of rbd still
[13:19] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[13:19] <cookednoodles> as long as its not cephfs, its pretty very stable
[13:20] <ninkotech> i know, but its hard to explain to some people
[13:20] <ninkotech> maybe i will have a chance...
[13:20] <cookednoodles> easy to demonstrate, unplug the machine :P
[13:21] <ninkotech> you know... but what if the cluster will stop responding?
[13:21] <ninkotech> -> doomed
[13:21] <cookednoodles> why would it ?
[13:21] <ninkotech> i dont know.. when something can fail, it will
[13:21] <ninkotech> thats the kind of things they say to me
[13:21] <cookednoodles> every part of ceph has redundancy
[13:21] <cookednoodles> its designed for this
[13:21] <ninkotech> local problem -> local failure
[13:21] <ninkotech> global problem -> global failure, losing customers
[13:22] <ninkotech> cookednoodles: i work against this thinking for years now...
[13:22] <ninkotech> but its hard to beat
[13:23] <ninkotech> as i myself am scared a bit -- what if it will fail?
[13:23] * hijacker (~hijacker@bgva.sonic.taxback.ess.ie) Quit (Quit: Leaving)
[13:23] <ninkotech> if i fail to manage the cluster running -> i am done
[13:23] <cookednoodles> nothing stops you doing backups of the cluster
[13:24] <ninkotech> well, backup is nice, but recovering from crash of 100 machines takes ages
[13:24] * diegows (~diegows@ has joined #ceph
[13:26] <mozg111> cookednoodles: well, i've seen issues with ceph as well
[13:27] <mozg111> the redundant design is very good, that's not a problem
[13:27] <mozg111> however, your cluster might become unresponsive when ceph is not optimised and doing recovery for example
[13:27] <mozg111> this is especially visible on small clusters
[13:27] <mozg111> i've seen this several times on my cluster
[13:27] <mozg111> when client IO becomes almost nonexistent
[13:28] <mozg111> and all your vms become 99% iowait
[13:28] <mozg111> and simply crash after a few minutes
[13:28] <cookednoodles> thats an io issue, you'll see the same thing with raid etc
[13:28] <narurien> mozg111: what optimizations do you use to avoid this?
[13:28] <mozg111> cookednoodles: not to the same extent
[13:29] <mozg111> narurien: you need to make sure that your recovery and scrubbing processes are set to the lowest priority and lowest number of tasks.
[13:29] <mozg111> even that doesn't help in extreme cases
[13:29] <mozg111> for instance, when I was upgrading to firefly from emperor
[13:29] <mozg111> to have the optimal crush settings it had to do data reshuffling
[13:29] <narurien> mhm, we have been playing around with those settings, but haven't found the magic bullet yet
[13:30] <mozg111> the process was poorly documented
[13:30] <mozg111> it had to move 30% of my cluster
[13:30] <narurien> recovery still hurts
[13:30] <mozg111> and even with lowest priority for recovery it still managed to kill my io in the cluster
[13:30] <mozg111> all of my vms just hung indefinitely
[13:30] * vaminev (~vaminev@ has joined #ceph
[13:30] <mozg111> until the recovery finished (some 12 hours later)
[13:31] <mozg111> and I had to manually restart every single vm
[13:31] <mozg111> thankfully, none of them were corrupted!
[13:31] <cookednoodles> 12 hours ? thats pretty huge
[13:31] <mozg111> narurien: you can't avoid this at the moment
[13:32] <mozg111> you can lessen the effect with optimisation, but for my case (and i've heard of 2 more stories like that) it wasn't enough
[13:32] <mozg111> still killed client IO
[13:32] <mozg111> so, in these cases, ceph is not immune to falling over
[13:33] <mozg111> as far as the client io is concerned
[13:33] <mozg111> the data safety seems to be good and failures of osds and servers are handled pretty well
[13:34] <mozg111> anyone has answers to the cache related questions i've posted earlier?
[13:34] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[13:35] <cookednoodles> actually, your plan was to have 1 ssd per client ?
[13:38] * jtang_ (~jtang@ Quit (Ping timeout: 480 seconds)
[13:39] * Cube (~Cube@ Quit (Quit: Leaving.)
[13:39] <s3an2> Are there any known issues for client side (trusty 3.13) kernel panics coming out of XFS when using RBD's?
[13:39] <kraken> http://i.imgur.com/rhNOy3I.gif
[13:40] <s3an2> I just had a case where 24 Servers connecting to the same ceph cluster all had a kernel panic
[13:40] <kraken> http://i.imgur.com/WS4S2.gif
[13:40] <cookednoodles> s3an2, cephfs ?
[13:41] <s3an2> This is using RBD's that are formatted with xfs
[13:41] <mozg111> cookednoodles: nope, one ssd per hypervisor host
[13:41] <mozg111> so, the more hypervisor hosts I add, the greater becomes the cache pool
[13:41] <cookednoodles> mozg111, what about the cache pool journal ? ;)
[13:41] <mozg111> something like 500gb ssd on every hypervisor
[13:41] * shang (~ShangWu@ Quit (Ping timeout: 480 seconds)
[13:41] <mozg111> cookednoodles: i don't think it needs that
[13:42] <cookednoodles> it does
[13:42] <cookednoodles> the cachepool is just a normal pool that 'overlays' your slower pool
[13:42] <mozg111> s3an2: i've not come across this behaviour
[13:42] <mozg111> however, I am using 3.15.6 kernel on osd servers
[13:43] <mozg111> in the past i've used 3.8
[13:43] <mozg111> 3.11 didn't work well for me as the osd servers were becoming unstable after a few days of uptime
[13:43] <mozg111> had to restart them to gain performance back to the normal level
[13:44] <mozg111> cookednoodles: well, perhaps I could try it on the same disk and see how it works
[13:44] <mozg111> or use btrfs for tests
[13:44] <cookednoodles> it depends if you're read or write heavy, but on the same disk, you're cutting your performance in 2
[13:44] <mozg111> however, I am not sure if the issue with btrfs becoming slow over time still exists with the latest kernels
[13:45] <Kioob`Taff> s3an2: some RBD bugs were fixed in 3.14.x kernels, which are "longterm" ones. You should probably use it
[13:45] <mozg111> s3an2: are you using ubuntu servers?
[13:45] <mozg111> if so, give the latest 3.15 a try from the ubuntu utopic branch
[13:46] <s3an2> I am using ubuntu 14.04 kernel 3.13
[13:46] <kippi> so my cluster is now healthy, can write etc, however when I try to mount from another machine I am getting Input/Output errors. I have installed ceph on the client and using mount -t ceph
[13:47] * jtang_ (~jtang@ has joined #ceph
[13:47] <s3an2> I have considered running a utopic kernel as 3.16 is available that looks to contain a lot of changes in rbd.c - Is there a preferred file system to use with RBD?
[13:50] <mozg111> s3an2: it is recommended to use xfs or ext4
[13:50] <mozg111> i've seen xfs performance to be slightly higher
[13:50] <mozg111> but btrfs kicks ass, but is not considered to be stable yet
[13:50] <mozg111> plus ppl report it slowing down overtime
[13:50] <mozg111> due to fragmentation issues
[13:51] <mozg111> however, i am not sure if this issue has been addressed in the latest kernels
[13:51] <mozg111> personally, i run xfs
[13:51] <mozg111> and I have a scheduled job every weekend to run defragmentation for one hour on every osd
[13:52] <mozg111> so, for my current writes, my fragmentation level on xfs is < 2%
[13:52] <mozg111> but without defrag it went up to 20% in about 3-4 months
[13:54] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[13:55] <kippi> also trying fuse I get pipe(0x7f9b180031b0 sd=9 :0 s=1 pgs=0 cs=0 l=1 c=0x7f9b18005430).fault
[13:57] * vbellur (~vijay@ Quit (Ping timeout: 480 seconds)
[14:03] * fdmanana (~fdmanana@bl5-173-238.dsl.telepac.pt) has joined #ceph
[14:03] <mozg111> anyone around with ceph cache pool experience?
[14:04] <mozg111> I need some help to understand how the IO workflow is done with cache pools
[14:04] <cookednoodles> spamming here + the mailing list won't help :P
[14:04] <mozg111> sorry, not trying to spam. occasionally ppl join the list and have some ideas
[14:05] <cookednoodles> most of the devs hang around on the mailing list, I think its your best hope
[14:05] <pressureman> mozg111, i never knew you could defrag xfs... is it online defrag?
[14:05] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[14:05] <mozg111> pressureman: yeah
[14:05] <mozg111> online
[14:06] <darkfader> https://bitbucket.org/darkfader/nagios/src/0ee677859ce06e9a189bd483a92810dec3132c2b/check_mk/local/xfsfrag.py?at=default for reporting fragmentation
[14:06] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[14:06] <pressureman> i see some fairly scary fragmentation levels on some of my OSDs... worst was around 42%
[14:06] <darkfader> just don't run it every 10 minutes
[14:06] <mozg111> you can run something like that in your cron job:
[14:06] <mozg111> disks="`mount -t xfs| awk {'print $1'}`"
[14:06] <mozg111> for i in $disks; do xfs_fsr -t 900 $i ; done
[14:07] <mozg111> this will run the xfs defrag for 900 seconds
[14:07] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[14:07] <mozg111> so, depending on your write work loads and how often you run the jobs for
[14:07] <pressureman> lol... i like the german comments in that script
[14:07] <mozg111> for me, it is enough to have a weekly run to keep fragmentation at <2%
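The percentages traded above (42%, <2%) are XFS fragmentation factors. As a rough sketch of how such a figure is derived, the helper below parses a line of the form assumed to be printed by "xfs_db -c frag -r <device>" and recomputes the factor as (actual - ideal) / actual. The sample line is made up for illustration and is not taken from anyone's OSD.

```python
import re


def parse_frag_line(line):
    """Extract (actual, ideal) extent counts from an xfs_db 'frag' style line."""
    match = re.search(r"actual (\d+), ideal (\d+)", line)
    if match is None:
        raise ValueError(f"unrecognised frag output: {line!r}")
    return int(match.group(1)), int(match.group(2))


def fragmentation_percent(actual_extents, ideal_extents):
    """Share of extents beyond the ideal count, as a percentage of actual extents."""
    if actual_extents == 0:
        return 0.0  # an empty filesystem has nothing to fragment
    return (actual_extents - ideal_extents) * 100.0 / actual_extents


# Hypothetical sample line, chosen to land on the 42% mentioned in the log.
sample = "actual 1000, ideal 580, fragmentation factor 42.00%"
print(fragmentation_percent(*parse_frag_line(sample)))
```

Comparing this number before and after a weekly xfs_fsr run is one way to decide whether the 900 second window is long enough.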
[14:08] * KevinPerks (~Adium@2606:a000:80a1:1b00:42:a31c:3b07:d5e1) has joined #ceph
[14:08] <mozg111> pressureman: yeah, you should definitely run it
[14:08] <mozg111> it will take a bit longer than 900 seconds to get rid of your 42% fragmentation level
[14:09] <mozg111> but once you have it at bay, you don't need to run it for long
[14:09] <mozg111> unless you have crazy write patterns
[14:09] <pressureman> cool, thanks for the tip
[14:09] <pressureman> i had never seriously used XFS before ceph
[14:10] <mozg111> yeah, I am an ext4 guy mainly
[14:10] <mozg111> i do like zfs though
[14:10] <mozg111> i think it is far superior to other fs
[14:11] <mozg111> but it's too bad ceph is not stable with zfs
[14:11] <pressureman> do you use logbsize=256k,logbufs=8 mount options?
[14:11] <darkfader> my experience is like vxfs > zfs > xfs > jfs > ext4 > ufs2 > ext3
[14:11] <pressureman> (yes, i like zfs too - i did a fair bit with it on opensolaris a few years ago)
[14:12] <pressureman> (and jfs, about 20 years ago on HP-UX 9.04)
[14:12] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[14:12] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[14:12] * aknapp (~aknapp@ has joined #ceph
[14:13] <mozg111> pressureman: no, I am not
[14:13] <mozg111> but it is on my todo list to use the following ceph.conf option:
[14:13] <mozg111> osd mount options xfs = "rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M"
[14:13] <pressureman> yes, i use the allocsize=4m already
[14:13] <mozg111> i've had this written down
[14:13] <mozg111> according to some tests done by some ppl
[14:13] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[14:13] <pressureman> seems to reduce fragmentation (on new filesystems)
[14:13] <mozg111> pressureman: has it improved your performance?
[14:14] <pressureman> i don't think the allocsize will make much difference to performance, but just reduce fragmentation
[14:14] <pressureman> (which arguably could indirectly improve performance)
[14:14] * vaminev (~vaminev@ has joined #ceph
[14:15] <pressureman> i found this forum post quite useful, about xfs mount options - http://forum.proxmox.com/threads/18552-ceph-performance-and-latency?s=ad49c327530a1939202990868dc0b0cd&p=95399#post95399
[14:16] <pressureman> and i've read a number of other people recommend logbsize=256k,logbufs=8
[14:16] <pressureman> but nobarrier is playing with fire, unless u have BBU cache ;-)
[14:17] * jtang_ (~jtang@ Quit (Ping timeout: 480 seconds)
[14:19] <mgarcesMZ> please dont use nobarrier... I lost lot of data, because I turned that on, for a db migration, and forgot to turn it back off again :(
[14:20] <darkfader> thanks for being honest, people need to be warned of that a lot
[14:21] <mgarcesMZ> darkfader: are you the same alfresco guy?
[14:21] * rwheeler (~rwheeler@ has joined #ceph
[14:21] <darkfader> no
[14:22] * Cube (~Cube@ has joined #ceph
[14:22] <mgarcesMZ> your nickname rings a bell :)
[14:23] <darkfader> i think i follow you on twitter, so we do know somewhere, somehow. gotta go back to work though :>
[14:23] <darkfader> nobarrier highlights in my brain
[14:25] * ganders (~root@200-127-158-54.net.prima.net.ar) has joined #ceph
[14:26] * jtang_ (~jtang@ has joined #ceph
[14:27] <mozg111> yeah, i've heard about nobarrier
[14:36] * kanagaraj (~kanagaraj@ Quit (Ping timeout: 480 seconds)
[14:42] * mgarcesMZ (~mgarces@ Quit (Quit: mgarcesMZ)
[14:46] <djh-work> When are deleted S3 objects actually deleted from rados? There's a .rgw.gc rados pool with a number of objects, I guess it's exactly for that purpose? radosgw-admin gc {list,process} returns an empty list / does not change anything.
[14:48] <ganders> hi guys, anyone saw this kind of err msg on the dmesg while trying to map a rbd "feature set mismatch, my 4a042a42 < server's 2004a042a42, missing 20000000000"
[14:48] * vaminev (~vaminev@ Quit (Read error: Connection reset by peer)
[14:49] * vaminev (~vaminev@ has joined #ceph
[14:50] * lalatenduM (~lalatendu@ Quit (Quit: Leaving)
[14:51] <mozg111> ganders: yeah, i've seen this
[14:51] <mozg111> it's because the rbd module in your kernel doesn't support the features that are set in crushmap
[14:52] <mozg111> you either need to remove the features
[14:52] <mozg111> or use newer kernel
[14:52] <ganders> mozg111: and where can i find what feature is causing the issue?
[14:52] <mozg111> i've found switching from 3.8 kernel to 3.15 fixed the problem for me
[14:52] <mozg111> ganders: there is a mailing list thread about this
[14:52] <mozg111> not so long ago
[14:52] <ganders> im using 3.13-0
[14:52] <ganders> oh ok i will check it out, thanks :)
[14:52] <mozg111> it is likely to be one or two flags
[14:53] <mozg111> search for feature set mismatch
[14:53] <mozg111> you should be able to find it
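The "missing" value in that kernel message is simply the set of bits present in the server's feature mask but absent from the client's. A minimal sketch of that arithmetic, using the two masks from the dmesg line ganders quoted; mapping the resulting bit number to a named feature still means checking the Ceph feature definitions for the release in use, and nothing here talks to a cluster:

```python
# Feature masks copied from the dmesg line quoted above (hexadecimal).
CLIENT_FEATURES = 0x4a042a42      # "my" side: the kernel rbd client
SERVER_FEATURES = 0x2004a042a42   # "server's" side: the cluster


def missing_feature_bits(client: int, server: int) -> list[int]:
    """Return the bit positions set on the server but not on the client."""
    missing = server & ~client
    return [bit for bit in range(missing.bit_length()) if (missing >> bit) & 1]


missing_mask = SERVER_FEATURES & ~CLIENT_FEATURES
print(hex(missing_mask))                                        # 0x20000000000
print(missing_feature_bits(CLIENT_FEATURES, SERVER_FEATURES))   # [41]
```

So only one flag (bit 41) separates this client from the cluster, which fits mozg111's guess of "one or two flags".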
[14:54] * nyerup (irc@jespernyerup.dk) has joined #ceph
[15:00] * manohar (~manohar@ has joined #ceph
[15:00] * abhi- (~abhi@ has joined #ceph
[15:01] <mozg111> ganders: 3.13 is still missing something
[15:01] * lalatenduM (~lalatendu@ has joined #ceph
[15:01] <mozg111> i've tried that one and it didn't work for me
[15:01] <mozg111> but i've now got 3.15.6 from the utopic ubuntu branch
[15:01] <mozg111> works okay so far
[15:01] <ganders> mozg111: yes i think im going with the 3.15.8
[15:01] * Cube (~Cube@ Quit (Quit: Leaving.)
[15:04] <pressureman> mozg111, nice, my xfs filesystems are now around 0.7% fragmented :D
[15:04] <mozg111> pressureman: )))
[15:04] <mozg111> how large are your osds?
[15:04] <mozg111> that was pretty quick
[15:04] <pressureman> most of the previous fragmentation would have been from before i started using allocsize=4m
[15:04] <pressureman> OSDs are 1TB
[15:05] <mozg111> i remember it took like 2 hours or so to go from 20% to <1% on my 3tb osds
[15:05] <pressureman> well now i will start on one of my other clusters... 24 OSDs
[15:07] <pressureman> actually i just tried xfs_fsr on a relatively new node, which would have used allocsize=4m right from the beginning, and it basically said "nothing to do here"
[15:07] <pressureman> so allocsize=<object_size> *definitely* helps avoid fragmentation
[15:09] <mozg111> ))
[15:09] <mozg111> good to know
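The percentages quoted here look like the "fragmentation factor" that `xfs_db -r -c frag <device>` prints, although the log does not say which tool was used. Assuming that is the source, and assuming the factor is derived from the actual and ideal extent counts as sketched below, the figure can be reproduced by hand. The extent counts in the example are hypothetical, chosen only to land near pressureman's 0.7%:

```python
def fragmentation_pct(actual_extents: int, ideal_extents: int) -> float:
    """Share of extents beyond the ideal count, as a percentage.

    Assumption: this mirrors the factor reported by xfs_db's frag command,
    i.e. (actual - ideal) / actual * 100.
    """
    if actual_extents <= 0:
        return 0.0
    return (actual_extents - ideal_extents) * 100.0 / actual_extents


# Hypothetical counts: 1000 extents on disk where 993 would be ideal.
print(round(fragmentation_pct(1000, 993), 2))  # 0.7
```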
[15:10] * ashishchandra1 (~ashish@ Quit (Ping timeout: 480 seconds)
[15:18] * michalefty (~micha@p20030071CE5107611CFCE9EA5480902E.dip0.t-ipconnect.de) Quit (Quit: Leaving.)
[15:18] * houkouonchi-home (~linux@pool-71-177-96-154.lsanca.fios.verizon.net) has joined #ceph
[15:22] * mgarcesMZ (~mgarces@ has joined #ceph
[15:23] * RameshN (~rnachimu@ Quit (Ping timeout: 480 seconds)
[15:25] * abhi- (~abhi@ Quit (Remote host closed the connection)
[15:25] * bloodice (~butchers@ has joined #ceph
[15:27] <bloodice> Does anyone know why there would be slow write but the read is fine?
[15:28] <bloodice> we are on ceph version 0.72.1 in a test environment and the write speed is 6.6MB/s but the read is 97MB/s... we expected the read to be around that, but the write is really slow
[15:31] <bloodice> We have WD red 4TB sata drives in a JBOD config. Servers have 6 1Gb ports, 4 bonded, but we are only allowing 2 rados gateways at 1Gb/s each
[15:31] <kippi> Ok, ceph was up and running, however now I am getting errors of, failed to create new leveldb store
[15:32] * hijacker (~hijacker@bgva.sonic.taxback.ess.ie) has joined #ceph
[15:35] * madkiss (~madkiss@tmo-097-45.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[15:36] * markbby (~Adium@ has joined #ceph
[15:37] * jeff-YF (~jeffyf@ has joined #ceph
[15:38] * brad_mssw (~brad@shop.monetra.com) has joined #ceph
[15:42] <iggy> bloodice: have you tried non-rgw benchmarks? (rados bench, local benchmarks, etc)
[15:43] * swami (~swami@ Quit (Quit: Leaving.)
[15:43] * TiCPU (~jeromepou@ has joined #ceph
[15:43] <bloodice> the benchmark we ran was the rbd one
[15:43] <bloodice> #dd if=/dev/zero of=/dev/rbd1 bs=1024k count=1000 oflag=direct
[15:44] <iggy> oh, you mentioned rgw, so i thought...
[15:44] <burley> what're you using for journaling?
[15:44] <bloodice> when i ran the rados bench it was slow too
[15:45] <bloodice> the journals are on the disks
[15:46] <bloodice> journal setting in ceph.conf is 1024
[15:46] <iggy> tried playing around with shifting the journals around?
[15:46] <mozg111> bloodice: where are your osd journals? on the same disk or on ssd?
[15:46] <bloodice> same disk
[15:46] <mozg111> that could be the bottleneck. you should consider moving it to the ssd
[15:47] <iggy> i.e. sda's journal on sdb, sdb's on sdc, etc
[15:47] <mozg111> as each write is done twice
[15:47] <mozg111> you might be getting io issues
[15:47] <bloodice> each server has 36 drives, so putting the journals on the OS ssds was a question mark
[15:48] <iggy> way too much contention
[15:48] <Anticimex> do the math well on that in terms of iops etc
[15:48] <Anticimex> get intel P3700
[15:48] <Anticimex> :]
[15:48] <iggy> maybe fusionio
[15:49] * lalatenduM (~lalatendu@ Quit (Quit: Leaving)
[15:49] <Anticimex> sucks, P3700 better
[15:49] * lalatenduM (~lalatendu@ has joined #ceph
[15:49] <bloodice> lol, trying to avoid putting more money into this...
[15:49] * vz (~vshankar@ Quit (Quit: Leaving)
[15:49] <iggy> then i'd play with shifting the journals
[15:50] <bloodice> to me, if the reads are 97 MB/s, then the writes should be at around 50MB/s with the double writing overhead
[15:50] * ksingh (~Adium@2001:708:10:10:a9c1:36dd:25d8:f450) has left #ceph
[15:50] <iggy> i would expect closer to 30-40 personally
[15:51] <iggy> but thats still well above where you are at
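bloodice's estimate follows from the point made earlier that, with the journal on the same disk, each write is done twice. A back-of-the-envelope sketch of that reasoning, treating the 97 MB/s read figure as the usable disk bandwidth (an assumption, since that number was measured through rbd rather than on the raw disk):

```python
def colocated_journal_write_mbps(raw_disk_mbps: float) -> float:
    """Rough ceiling for writes when the journal shares the data disk.

    Assumption: every client write hits the disk twice (journal, then data),
    so at best half of the sequential bandwidth remains. Seek contention
    between the two write streams is ignored, which is one reason practical
    figures tend to come out lower than this ceiling.
    """
    return raw_disk_mbps / 2


print(colocated_journal_write_mbps(97.0))  # 48.5
```

Either way the ceiling is far above the 6.6 MB/s bloodice measured, so the colocated journal alone does not explain the whole gap.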
[15:51] <bloodice> is that with the journals on SSDs?
[15:51] * lalatenduM (~lalatendu@ Quit ()
[15:51] <mozg111> iggy: agree. We get around 30-35mb/s across our osds. that is with journals on ssd drives
[15:51] <bloodice> wow
[15:51] <mozg111> that's raw disk performance
[15:51] <mozg111> not what the client gets
[15:51] <bloodice> what about read?
[15:51] <Anticimex> i have a wd purple at home. it read like 250MB/s and ~350 iops. i was impressed
[15:52] <Anticimex> (block level dd, tho)
[15:52] <mozg111> client side is more as you get journal + cache benefits
[15:52] * aknapp (~aknapp@ Quit (Remote host closed the connection)
[15:52] <mozg111> bloodice, I get around 50mb/s for reads using 1 dd process with iflag=direct and on cold data (which hasn't been read from osds for a very long time)
[15:53] * aknapp (~aknapp@ has joined #ceph
[15:53] <mozg111> repeat dd a few times and it goes to around 300-400mb/s
[15:53] * lalatenduM (~lalatendu@ has joined #ceph
[15:53] <mozg111> as the data is coming from the osd server's ram
[15:53] <iggy> are you really mb/s?
[15:54] <iggy> really meaning
[15:58] * rdas (~rdas@ Quit (Quit: Leaving)
[15:58] <bloodice> i ran the read test five more times, i am still only getting around 100MB/s via rbd
[15:59] * Gorazd (~Gorazd@89-212-99-37.dynamic.t-2.net) has joined #ceph
[15:59] * i_m (~ivan.miro@gbibp9ph1--blueice2n1.emea.ibm.com) Quit (Quit: Leaving.)
[15:59] * i_m (~ivan.miro@gbibp9ph1--blueice1n1.emea.ibm.com) has joined #ceph
[16:00] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) Quit (Remote host closed the connection)
[16:00] <bloodice> radosgw testing shows 80MB/s read, but the write is the same at 6MB/s which is why i think there is a setting i am missing.. besides the journal setup.
[16:00] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) has joined #ceph
[16:01] * aknapp (~aknapp@ Quit (Ping timeout: 480 seconds)
[16:01] <bloodice> off topic question, i have upgraded before, but now it seems we are .10 versions behind... can we just go to the newest or is there a stepping upgrade required?
[16:01] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) Quit (Read error: Connection reset by peer)
[16:01] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) has joined #ceph
[16:02] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[16:02] <mozg111> iggy: megabytes/s
[16:03] <mozg111> bloodice: you should be okay with getting the latest one in. just make sure you follow the upgrade guide
[16:05] <iggy> i would say that is probably the journals
[16:07] * manohar (~manohar@ Quit (Quit: manohar)
[16:07] * gregmark (~Adium@ has joined #ceph
[16:08] <iggy> if there is no money to fix the journals, some people have had luck at least putting the journals on different disks
[16:09] <iggy> but it's something you'll have to check with your expected workload
[16:14] <mozg111> bloodice: I agree with iggy
[16:14] <mozg111> writes are more io intensive than reads in ceph
[16:14] <mozg111> plus it also depends on your replication ratio
[16:15] <mozg111> the more it is the harder the osds have to work to push the write
[16:15] <mozg111> bloodice: if you don't have the budget to dedicate an ssd for every 4-6 osds, you might want to consider creating an ssd cache pool
[16:16] * ikrstic (~ikrstic@109-93-162-27.dynamic.isp.telekom.rs) Quit (Quit: Konversation terminated!)
[16:16] <mozg111> made of 4-6 ssds
[16:16] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) Quit (Ping timeout: 480 seconds)
[16:17] * morse_ (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[16:18] * vaminev (~vaminev@ Quit (Remote host closed the connection)
[16:19] <mozg111> I am yet to play with cache pools
[16:19] * jtang_ (~jtang@ Quit (Ping timeout: 480 seconds)
[16:19] <mozg111> hence waiting here for ppl to answer a couple of my questions
[16:19] <mozg111> but they all tend to be busy ))
[16:20] <Kioob`Taff> for me you should upgrade step by step
[16:20] <Kioob`Taff> from major version to major version
[16:20] * longguang_home (~chatzilla@ has joined #ceph
[16:21] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[16:21] <mozg111> bloodice: if you are upgrading from emperor, read about the optimal settings for the crush map and how it will affect performance while rebalancing is taking place
[16:21] <mozg111> it killed my IO when I ran it
[16:21] <Kioob`Taff> If it was me, I would start by upgrading to the latest 0.72.* version, then I would upgrade to the next "stable" version (which is 0.80.5)
[16:22] * baylight (~tbayly@ has joined #ceph
[16:23] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[16:24] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) has joined #ceph
[16:31] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[16:31] * baylight (~tbayly@ Quit (Ping timeout: 480 seconds)
[16:33] * vbellur (~vijay@ has joined #ceph
[16:33] <seapasulli> anyone know how to mark unfound objects as lost?
[16:34] <Vacum_> When I call the ceph command line tool, is there a way to define a timeout when it should stop trying to connect to a mon?
[16:34] <seapasulli> I did mark_unfound_lost revert but it is still trying to recover it
[16:34] <seapasulli> http://ceph.com/docs/master/rados/configuration/mon-config-ref/
[16:35] <seapasulli> Vacum_: I think it's 'mon sync timeout'
[16:35] <seapasulli> default is 30 seconds
[16:35] <seapasulli> There is also a weird bug with python 2.7.7 that I was having where ceph-client would just sit there 300+ seconds.
[16:35] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:36] <seapasulli> python 2.7.6 2.7.7 and 2.7.9 work
[16:36] <seapasulli> but 2.7.8 doesn't work *
[16:36] <seapasulli> so correction bug is in 2.7.8 I guess
[16:36] <seapasulli> hahaha
[16:36] <Vacum_> seapasulli: the "mon sync ..." settings are for "Monitor Store Synchronization"
[16:36] <Vacum_> seapasulli: so everything related between mons
[16:36] <seapasulli> yup?
[16:36] <seapasulli> oh client
[16:36] <seapasulli> ah sorry i missed that.
[16:37] <Vacum_> no worries :)
[16:40] <seapasulli> I thought I remember reading about a ceph client config now but I don't see any options mentioned for client timeouts.
[16:40] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[16:42] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) has joined #ceph
[16:44] * i_m (~ivan.miro@gbibp9ph1--blueice1n1.emea.ibm.com) Quit (Quit: Leaving.)
[16:45] * dmsimard_away is now known as dmsimard
[16:46] <Vacum_> client_mount_timeout seems to be the one :)
[16:52] * lalatenduM (~lalatendu@ Quit (Quit: Leaving)
[16:53] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[16:53] * lalatenduM (~lalatendu@ has joined #ceph
[16:54] * markbby (~Adium@ Quit (Quit: Leaving.)
[16:55] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) Quit (Ping timeout: 480 seconds)
[16:56] * danieagle (~Daniel@ has joined #ceph
[16:56] * kanagaraj (~kanagaraj@ has joined #ceph
[16:57] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[16:57] <seapasulli> I only see that mentioned in a bug ticket after a quick search.
[17:00] * Nacer_ (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[17:00] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[17:01] <mgarcesMZ> cookednoodles: do you recommend any public cloud with swift, so I can point my devs to it, so they can start testing, while I implement everything on my side?
[17:01] <cookednoodles> that depends on you
[17:02] <mgarcesMZ> how come?
[17:02] <Vacum_> seapasulli: I found it in config_opts.h, the treasurebox of yet-unknown config options :) and then in http://mail-archives.apache.org/mod_mbox/cloudstack-commits/201402.mbox/%3Cf49ca67aa8c2414f9bd38878b9e3ba75@git.apache.org%3E
[17:03] <cookednoodles> on what your operational requirements are
[17:03] * jtang_ (~jtang@ has joined #ceph
[17:03] * markbby (~Adium@ has joined #ceph
[17:04] <mgarcesMZ> I will use ceph, for storing documents
[17:05] <mgarcesMZ> I need the devs, to build their software, using swift API
[17:05] <mgarcesMZ> they dont have to know what I use below
[17:05] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[17:05] <mgarcesMZ> but since I will take a few days to build this
[17:05] <mgarcesMZ> I would like them to use some cloud service with swift, just for getting the dev rolling
[17:06] <mgarcesMZ> there is no need for performance
[17:06] <mgarcesMZ> no big data
[17:06] <mgarcesMZ> just a simple swift interface so they can start coding today
[17:06] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[17:06] <mgarcesMZ> those are my ops reqs
[17:06] <Vacum_> mgarcesMZ: http://docs.dreamobjects.net/
[17:06] <Vacum_> mgarcesMZ: dreamobjects is from the guys behind ceph. and they run dreamobjects on ceph :)
[17:07] <mgarcesMZ> dreamhost based
[17:07] <mgarcesMZ> nice
[17:07] <mgarcesMZ> I had a dreamhost box for several years
[17:07] <Vacum_> mgarcesMZ: http://docs.dreamobjects.net/swift-examples/index.html
[17:07] <mgarcesMZ> dumb question here
[17:07] <mgarcesMZ> swift and S3, have a different API
[17:07] <mgarcesMZ> right
[17:08] <cookednoodles> I have no idea
[17:08] <mgarcesMZ> :D
[17:10] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[17:16] <seapasulli> mgarcesMZ: https://wiki.openstack.org/wiki/Swift/APIFeatureComparison
[17:16] <seapasulli> ^_^
[17:17] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[17:17] * xarses (~andreww@ has joined #ceph
[17:22] * mgarcesMZ (~mgarces@ Quit (Quit: mgarcesMZ)
[17:23] <colonD> ceph cluster of 3 optiplex 9020s (each mon & osd) with dedicated 500g 7200rpm osd disk (colocated journal) vs a single 7200rpm 80gb disk: http://openbenchmarking.org/result/1408058-KH-LOCALXFSN88,1408022-KH-RBDXFSNOA38
[17:25] <darkfader> pretty good for shared journal
[17:25] <darkfader> s/shared/colocated/
[17:25] <kraken> darkfader meant to say: pretty good for colocated journal
[17:25] <darkfader> yes sorry
[17:26] * jordanP (~jordan@ Quit (Quit: Leaving)
[17:29] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[17:31] <burley> so when I create a RBD volume to use for mysql
[17:31] <burley> then try to load data into a new databases (~20+GB dump file)
[17:31] <burley> I consistently am able to get a soft lockup on the CPU
[17:32] <burley> only happens when writing to the RBD volume, if I do the same with a local volume, it completes no issue
[17:32] <burley> http://pastie.org/private/pn4yuxi1w9why7l6kxieg
[17:33] * sputnik13 (~sputnik13@ has joined #ceph
[17:33] <narurien> is the client where you mount this also running OSDs?
[17:34] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[17:34] <burley> no
[17:34] * erice_ (~erice@ has joined #ceph
[17:34] * erice (~erice@ Quit (Read error: No route to host)
[17:34] <burley> its only running mysql for this test
[17:34] <burley> not being used for any other workload
[17:35] <burley> I have reproduced it on 2 identical servers, running ubuntu 12.04 and 14.04 (ceph 0.80.5)
[17:35] * lalatenduM (~lalatendu@ Quit (Quit: Leaving)
[17:36] <runfromnowhere> Jumping in with a question - I saw a thread from a while back about LVM not being recommended for use on top of RBDs...Does anyone know if this is still the case?
[17:36] * zack_dolby (~textual@p8505b4.tokynt01.ap.so-net.ne.jp) has joined #ceph
[17:40] * cok (~chk@2a02:2350:18:1012:14d3:ae81:32a0:a625) has joined #ceph
[17:40] <burley> here's a bit more dmesg output: http://pastie.org/private/wfpiycj5bxfntz5itpyoa
[17:40] * longguang_home (~chatzilla@ Quit (Quit: ChatZilla [Firefox 29.0.1/20140506152807])
[17:41] * mgarcesMZ (~mgarces@ has joined #ceph
[17:41] * adamcrume (~quassel@ has joined #ceph
[17:41] * sputnik13 (~sputnik13@ Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[17:41] * ajiang38740 (~ajiang387@cpe-70-116-46-115.austin.res.rr.com) has joined #ceph
[17:42] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[17:43] <bloodice> thanks for the input guys... i read through the upgrade docs and it appears i can go directly to .80 if i follow the order they have in their document.
[17:43] * sputnik13 (~sputnik13@ has joined #ceph
[17:43] <narurien> burley: some report that they see this in ubuntu (23.04 and 14.0) but not in Debian f.ex.
[17:43] <narurien> that 23 is obviously a typo
[17:44] * reed (~reed@75-101-54-131.dsl.static.sonic.net) has joined #ceph
[17:45] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[17:46] <bloodice> anyone using the standalone radosgw?
[17:46] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[17:49] <bloodice> oh.. i noticed that the ceph.conf file format seems have changed... are the underscores still required or are they optional... IE osd_journal_size
[17:51] <PerlStalker> I'm running ceph 0.80.4. A little while ago I had an inconsistent scrub error that I thought was leftover from the XFS xattr problems from earlier. Unfortunately, running `ceph pg repair` has destroyed an rbd.
[17:51] * ceph (~ceph@ has joined #ceph
[17:52] * ircolle-afk is now known as ircolle
[17:52] * ceph is now known as dave_casey
[17:52] <dave_casey> hello
[17:53] <PerlStalker> I may be wrong about it being destroyed but I now have an "unfound" pg.
[17:53] * madkiss (~madkiss@tmo-106-237.customers.d1-online.com) has joined #ceph
[17:53] <PerlStalker> What can I do to repair this or is it even possible?
[17:53] * danieljh (~daniel@0001b4e9.user.oftc.net) has joined #ceph
[17:53] <burley> narurien: Have link to another report that I can look at?
[17:54] <PerlStalker> Do I simply delete the rbd?
[17:54] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[17:55] * lpabon_test (~quassel@66-189-8-115.dhcp.oxfr.ma.charter.com) has joined #ceph
[17:55] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:55] * lpabon_test (~quassel@66-189-8-115.dhcp.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[17:58] * manohar (~manohar@ has joined #ceph
[17:58] * rmoe (~quassel@173-228-89-134.dsl.static.sonic.net) Quit (Ping timeout: 480 seconds)
[17:59] <bloodice> ugh .80 is not on the ubuntu upgrade list
[18:00] * mgarcesMZ (~mgarces@ Quit (Quit: mgarcesMZ)
[18:00] * ibuclaw (~ibuclaw@rabbit.dbplc.com) has joined #ceph
[18:01] * lpabon_test (~quassel@66-189-8-115.dhcp.oxfr.ma.charter.com) has joined #ceph
[18:01] * thb (~me@0001bd58.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:02] * lpabon_test (~quassel@66-189-8-115.dhcp.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[18:02] * lpabon (~quassel@66-189-8-115.dhcp.oxfr.ma.charter.com) has joined #ceph
[18:02] * cok (~chk@2a02:2350:18:1012:14d3:ae81:32a0:a625) Quit (Quit: Leaving.)
[18:03] * kraken (~kraken@gw.sepia.ceph.com) Quit (Remote host closed the connection)
[18:03] * kraken (~kraken@gw.sepia.ceph.com) has joined #ceph
[18:04] <ibuclaw> Hi, just have a quick question about distribution behaviour of ceph. I've got a test server (single OSD node with 12 disks). And the crush rule for replication is set to: "chooseleaf firstn 0 type osd".
[18:04] * bandrus (~oddo@ has joined #ceph
[18:05] * markbby (~Adium@ Quit (Quit: Leaving.)
[18:05] * rmoe (~quassel@ has joined #ceph
[18:05] * kraken (~kraken@gw.sepia.ceph.com) Quit (Remote host closed the connection)
[18:05] * kraken (~kraken@gw.sepia.ceph.com) has joined #ceph
[18:06] * brad_mssw (~brad@shop.monetra.com) Quit (Quit: Leaving)
[18:06] <ibuclaw> Finally loaded it with about 100GB of data (via radosgw)
[18:06] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[18:06] * lupu (~lupu@ has joined #ceph
[18:07] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) has joined #ceph
[18:07] <ibuclaw> currently on the OS level, I can see that 7 of the disks have been filling up to 19-20GBs, but the other 5 are pretty much empty.
[18:08] * Sysadmin88 (~IceChat77@ has joined #ceph
[18:08] <ibuclaw> is this expected? Or should there be more a balance between them?
[18:08] * brad_mssw (~brad@shop.monetra.com) has joined #ceph
[18:10] <iggy> do they all have the same weight?
[18:10] <ibuclaw> yes
[18:10] * swami (~swami@ has joined #ceph
[18:10] <iggy> then yes, I would expect the data to be evenly spread
[18:10] * JC1 (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) has joined #ceph
[18:12] * kraken (~kraken@gw.sepia.ceph.com) Quit (Remote host closed the connection)
[18:12] * kraken (~kraken@gw.sepia.ceph.com) has joined #ceph
[18:12] * markbby (~Adium@ has joined #ceph
[18:12] <Vacum_> how many PGs in the pool that radosgw uses?
[18:13] <ibuclaw> 512 PGs
[18:13] <ibuclaw> got 64 PGs for every other pool (.rgw.users, etc)
[18:13] <Vacum_> mh. they should distribute fine over the OSDs then. are all 12 OSDs "up" and "in"?
[18:15] <ibuclaw> all are up and in
[18:15] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[18:16] <ibuclaw> my monitoring scripts all say that weights are good, PG distribution is within bounds (maybe I should debug this script)
[18:17] <Vacum_> ibuclaw: how many objects did you write to radosgw?
[18:18] <loicd> is there a way to get the primary OSD for a given object via the command line ?
[18:18] <ibuclaw> Vacum_, 842114 apparently
[18:18] <Vacum_> loicd: I thought ceph osd map poolname objectname ?
[18:18] <Vacum_> ibuclaw: mh, also enough for a good distribution
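Vacum_'s expectation of an even spread can be sanity-checked with simple arithmetic: with equal weights, each OSD should hold roughly the same number of placement group copies. A small sketch of that estimate; the pool's replica count is not stated in the log, so the value 3 below is an assumption for illustration only:

```python
def mean_pg_copies_per_osd(pg_num: int, replica_count: int, osd_count: int) -> float:
    """Average number of PG copies each OSD holds, assuming equal weights."""
    return pg_num * replica_count / osd_count


# 512 PGs and 12 OSDs are from the conversation; replica_count=3 is assumed.
print(mean_pg_copies_per_osd(512, 3, 12))  # 128.0
```

With on the order of a hundred PG copies per OSD, five nearly empty disks out of twelve would be well outside normal statistical variation, which is why the answers in the channel treat the imbalance as unexpected.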
[18:19] <loicd> Vacum_: looks like what I'm looking for, thanks :-)
[18:29] <bloodice> ugh.. the ceph documents say to use ceph-deploy to upgrade the systems, but it doesnt say what command... and the help doesnt list an upgrade option... so i assume, you just run ceph-deploy install <host>
[18:30] * shang (~ShangWu@220-135-203-169.HINET-IP.hinet.net) has joined #ceph
[18:30] * swami (~swami@ Quit (Read error: Connection reset by peer)
[18:30] * b0e (~aledermue@x2f2a49c.dyn.telefonica.de) has joined #ceph
[18:33] <alfredodeza> bloodice: I don't think I've done upgrades with ceph-deploy before. And of course, you need to take into account upgrade-specific details that are usually different from version to version
[18:33] <alfredodeza> e.g. some versions will ask you to do something before/after upgrading
[18:36] * swami (~swami@ has joined #ceph
[18:37] <bloodice> oh yea, i read the upgrade documentation, i am going from 0.72.1 to 0.80 ( or higher ), i just needed to add a line to the config, which i have done.
[18:37] <bloodice> Some of the documents are specific and some are vague... like the ceph-deploy... no mention of what command to run for upgrades
[18:39] * b0e (~aledermue@x2f2a49c.dyn.telefonica.de) Quit (Quit: Leaving.)
[18:42] <bloodice> alfredodeza: i deployed via ceph-deploy, so i am trying to stay in line with what i originally did
[18:42] * ibuclaw (~ibuclaw@rabbit.dbplc.com) Quit (Quit: Leaving)
[18:42] * zerick (~eocrospom@ has joined #ceph
[18:43] * tracphil (~tracphil@ has joined #ceph
[18:44] * madkiss (~madkiss@tmo-106-237.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[18:45] * fghaas (~florian@91-119-223-7.dynamic.xdsl-line.inode.at) has joined #ceph
[18:46] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) has joined #ceph
[18:46] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[18:47] * JC1 (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[18:49] * rweeks (~rweeks@pat.hitachigst.com) has joined #ceph
[18:49] * rturk|afk is now known as rturk
[18:49] * sarob (~sarob@2001:4998:effd:600:e8b0:bbcf:fa1c:e209) has joined #ceph
[18:53] * ajiang38740 (~ajiang387@cpe-70-116-46-115.austin.res.rr.com) Quit (Quit: Leaving...)
[18:55] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[18:57] * markbby (~Adium@ Quit (Quit: Leaving.)
[18:59] * markbby (~Adium@ has joined #ceph
[19:00] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[19:01] * lcavassa (~lcavassa@ Quit (Quit: Leaving)
[19:03] * erice (~erice@ has joined #ceph
[19:03] * jeff-YF (~jeffyf@ Quit (Quit: jeff-YF)
[19:04] * erice_ (~erice@ Quit (Ping timeout: 480 seconds)
[19:06] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[19:08] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[19:10] <loicd> what is Virtual Storage Manager (VSM) ? (in https://www.openstack.org/vote-paris/Presentation/improve-your-ceph-awareness-virtual-storage-manager-for-ceph )
[19:12] * shang (~ShangWu@220-135-203-169.HINET-IP.hinet.net) Quit (Quit: Ex-Chat)
[19:15] * reed (~reed@75-101-54-131.dsl.static.sonic.net) Quit (Ping timeout: 480 seconds)
[19:15] * sjustwork (~sam@2607:f298:a:607:d129:8cf9:5f9b:511e) has joined #ceph
[19:15] * adamcrume (~quassel@ Quit (Remote host closed the connection)
[19:17] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[19:19] * chutz (~chutz@rygel.linuxfreak.ca) Quit (Quit: Leaving)
[19:19] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[19:19] * Nacer_ (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[19:19] <cookednoodles> "Intel plans to release Virtual Storage Manager as an Open Source project in Q4 of 2014."
[19:19] <cookednoodles> heh
[19:19] * chutz (~chutz@rygel.linuxfreak.ca) Quit ()
[19:20] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[19:20] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[19:21] * ninkotech__ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[19:22] * TiCPU (~jeromepou@ Quit (Ping timeout: 480 seconds)
[19:25] * jeff-YF (~jeffyf@ has joined #ceph
[19:25] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[19:25] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[19:27] <bloodice> ugh there is nothing online about using ceph-deploy to upgrade cluster
[19:28] * bkopilov (~bkopilov@ has joined #ceph
[19:29] * ninkotech__ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[19:30] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[19:32] * swami (~swami@ Quit (Ping timeout: 480 seconds)
[19:33] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[19:33] <kitz> bloodice: I was under the impression that you updated with yum/apt which replaced the packages but didn't restart the services. Then you restarted the services manually starting with mons. I haven't done this and I could be wrong but that's the impression I've gathered so far.
[19:33] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[19:35] * ninkotech__ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[19:35] * qhartman (~qhartman@den.direwolfdigital.com) has joined #ceph
[19:36] * thb (~me@2a02:2028:6d:e550:6060:d2d3:ad02:67f2) has joined #ceph
[19:37] * cookednoodles (~eoin@eoin.clanslots.com) Quit (Quit: Ex-Chat)
[19:40] <bloodice> I just ran ceph-deploy install <monitorhostname> and it updated the monitor to .80.5 and then restarted the monitor
[19:40] <bloodice> so i guess thats the step....
[19:41] <bloodice> it does use apt-get, so you have to update the ceph repository
[19:45] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) has joined #ceph
[19:46] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[19:56] * angdraug (~angdraug@ has joined #ceph
[19:59] * mozg111 (~oftc-webi@ Quit (Remote host closed the connection)
[19:59] * madkiss (~madkiss@tmo-107-140.customers.d1-online.com) has joined #ceph
[20:02] <seapasulli> ceph-deploy will update the repo unless you tell it not to I believe
[20:02] <seapasulli> --no-adjust-repos
[20:03] <seapasulli> https://github.com/ceph/ceph-deploy
[20:03] * ircolle is now known as ircolle-afk
[20:03] * astellwag (~astellwag@ Quit (Read error: Connection reset by peer)
[20:03] * astellwag (~astellwag@ has joined #ceph
[20:07] * erice (~erice@ Quit (Read error: Connection reset by peer)
[20:08] * astellwag (~astellwag@ Quit (Remote host closed the connection)
[20:09] * astellwag (~astellwag@ has joined #ceph
[20:12] * kanagaraj (~kanagaraj@ Quit (Quit: Leaving)
[20:13] * rturk is now known as rturk|afk
[20:13] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[20:20] <jksM> is there a way to see which client or rbd is generating iops on ceph?
[20:21] <jksM> have a number of rbds used by qemu - the number of op/s as reported by ceph status suddenly grew, but I cannot tell which client is responsible for it
[20:24] <iggy> jksM: at the qemu level, the monitor command "info blockstats"
[20:26] <jksM> iggy, thanks! :-) isn't there something "central" like the io/s count you can get from ceph status?
[20:26] <jksM> otherwise I'll have to check in on each and every server to see if I can find the right qemu instance
[20:27] <iggy> I'm not sure... but you should be monitoring that!
[20:28] <jksM> I'll do that now :-)
[20:29] * rturk|afk is now known as rturk
[20:29] <iggy> qemu also has block limiters fwiw
[20:29] <iggy> to make sure one guest doesn't kill others
[20:30] <iggy> but that gets complicated when multiple hosts are talking to ceph
[20:30] <Gorazd> Is there a way for object files, which are stored through RadosGW to RADOS, to be accessed from a VM through RBD or CephFS?
[20:30] <jksM> well, it isn't blocking other stuff - so it's not that bad... I was wondering why the number grew
[20:30] * rturk is now known as rturk|afk
[20:30] * rturk|afk is now known as rturk
[20:32] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[20:33] * scuttlemonkey is now known as scuttle|afk
[20:33] * fghaas (~florian@91-119-223-7.dynamic.xdsl-line.inode.at) has left #ceph
[20:36] * manohar (~manohar@ Quit (Ping timeout: 480 seconds)
[20:39] <iggy> Gorazd: rbd and cephfs can only access rbd and cephfs (respectively) objects
[20:39] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[20:41] * bbutton (~bbutton@ has joined #ceph
[20:47] <ganders> is it possible to use ceph-deploy to deploy osds with btrfs instead of xfs?
[20:49] * rendar (~I@host228-179-dynamic.1-87-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[20:52] * rendar (~I@host228-179-dynamic.1-87-r.retail.telecomitalia.it) has joined #ceph
[20:54] * TiCPU (~jeromepou@ has joined #ceph
[20:57] <alfredodeza> ganders: `ceph-deploy osd --help`
[20:57] <alfredodeza> ganders: --> --fs-type FS_TYPE filesystem to use to format DISK (xfs, btrfs or ext4)
[20:58] <ganders> yeah i tried that but with no luck :D
[21:01] <bloodice> okay, so ceph-deploy sends the upgrades out to the servers and restarts monitors and rados, but it doesn't restart the osds.... the manual command to restart OSDs isn't working for me... so is this a HUP situation?
[21:01] <alfredodeza> ganders: what do you mean? like it didn't use the btrfs option?
[21:01] <alfredodeza> do you have some output you can share?
[21:03] <iggy> bloodice: -HUP won't restart execution (it generally means to reread config)
[21:04] <bloodice> yea, good point
[21:05] <ganders> alfredodeza: like trying "ceph-deploy --fs-type btrfs osd prepare cephosd01:sdd:/dev/sde1"
[21:05] <alfredodeza> ganders: no
[21:05] <ganders> and it gets an error
[21:05] <alfredodeza> that flag is for the `osd` sub command
[21:05] <alfredodeza> you are using it *before* the osd subcommand
[21:06] <ganders> oh you are right, i will try it
[21:06] <alfredodeza> e.g. (ceph-deploy executable) (global flags like --verbose) (subcommand) (subcommand specific flags)
[21:06] <ganders> alfredodeza: thx!
[21:06] <alfredodeza> np
[21:06] <alfredodeza> now, if *that* doesn't work, that is a bug, you should ping me :)
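
Editor's note: the ordering alfredodeza describes (executable, global flags, subcommand, subcommand-specific flags) is how CLIs built on nested argument parsers generally behave. The sketch below is a toy model only, assuming Python's standard argparse module; `toy-deploy`, `build_parser` and `parses` are hypothetical names for illustration, not ceph-deploy's actual code. It shows why a subcommand flag such as --fs-type is rejected when it is placed before the subcommand.

```python
import argparse


def build_parser():
    # Hypothetical CLI shaped like: prog [global flags] subcommand [subcommand flags]
    parser = argparse.ArgumentParser(prog="toy-deploy")
    parser.add_argument("--verbose", action="store_true")  # global flag
    sub = parser.add_subparsers(dest="command", required=True)

    # --fs-type is registered on the "osd" subparser only, so the
    # top-level parser does not know about it.
    osd = sub.add_parser("osd")
    osd.add_argument("action", choices=["prepare", "activate"])
    osd.add_argument("--fs-type", default="xfs",
                     choices=["xfs", "btrfs", "ext4"])
    osd.add_argument("disk", nargs="+")
    return parser


def parses(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:  # argparse reports usage errors by exiting
        return None


if __name__ == "__main__":
    ok = parses(["osd", "prepare", "--fs-type", "btrfs",
                 "cephosd01:sdd:/dev/sde1"])
    bad = parses(["--fs-type", "btrfs", "osd", "prepare",
                  "cephosd01:sdd:/dev/sde1"])
    print("flag after subcommand accepted:", ok is not None)
    print("flag before subcommand accepted:", bad is not None)
```

In this toy model the first invocation parses and the second is rejected, which mirrors the mistake pointed out in the exchange above.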
[21:07] <bloodice> seems ceph-deploy install method, when run in an environment where the monitor servers and osd hosts are separate servers, doesnt allow you to send restart commands to the osds
[21:10] <bloodice> i can get the version of the running osd using ceph tell osd.x version
[21:12] <bloodice> yea, the ceph-deploy method sets all the osds to run in the foreground
[21:12] <bloodice> 1-17:41:32 /usr/bin/ceph-osd --cluster=ceph -i 53 -f
[21:14] <Gorazd> iggy: thank you. Do you maybe know if there exists some NFS/CIFS Gateway SW for object data stored through RadosGW or librados, similar to what SwiftStack has offered https://www.swiftstack.com/uploads/factsheets_pdf/swiftstack_filesystem_gateway.pdf within their controller - it is a great Enterprise functionality for legacy applications ....
[21:15] * rturk is now known as rturk|afk
[21:15] <iggy> Gorazd: you're still talking about somehow accessing generic object data via a filesystem interface?
[21:19] * rturk|afk is now known as rturk
[21:21] <gleam> i mean, fuse?
[21:21] <gleam> cephfs?
[21:23] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) has joined #ceph
[21:33] <Gorazd> iggy: yes, but if I understand you, I can forget about this. Right?
[21:35] <ganders> alfredodeza: mmm having some issues:
[21:35] <alfredodeza> ganders: get me a paste of the whole ceph-deploy output
[21:35] <ganders> ceph-deploy disk prepare --fs-type btrfs cephosd01:sdd:/dev/sde
[21:36] <Gorazd> I would at least like to access objects that were saved to RADOS through RadosGW with the HTTP REST API. Also Maldivica has some solution available within their appliance, but this could be vendor lock-in -> http://maldivica.com/technology
[21:37] <ganders> [WARNIN] ceph-disk: Error: weird parted units:
[21:37] <ganders> [ERROR ] RuntimeError: command returned non-zero exit status: 1
[21:37] <ganders> [ERROR ] Failed to execute command: ceph-disk-prepare --fs-type btrfs --cluster ceph -- /dev/sdd /dev/sde
[21:37] <ganders> that's the error msg
[21:37] <iggy> Gorazd: correct
[21:38] * erice (~erice@ has joined #ceph
[21:41] <iggy> Gorazd: there is a modified samba server somewhere that talks directly to ceph
[21:41] <alfredodeza> ganders: can you paste the whole output somewhere so I can take a look at it?
[21:41] <ganders> yeah sure, give me one sec
[21:41] <iggy> Gorazd: but that just accesses the same file data as cephfs
[21:42] <Gorazd> iggy: aha thx. do you maybe have any relevant web link for this?
[21:43] <iggy> github.com/ceph -> Next * 5 or so
[21:43] <kitz> are dump_historic_ops entries always sorted by duration descending?
[21:43] <ganders> alfredodeza: http://pastebin.com/print.php?i=9W3Tj8B4
[21:43] * alfredodeza looks
[21:44] <bloodice> welp, found my answer... go to each osd host and run "restart ceph-osd-all"
[21:45] <alfredodeza> ganders: hrmnnn I wonder which one triggered the error
[21:45] <alfredodeza> was it `/sbin/parted --machine -- /dev/sde print`
[21:45] <alfredodeza> ?
[21:45] <alfredodeza> can you try that manually?
[21:45] <alfredodeza> at this point it looks like you will have to go into the remote node and tinker with the commands that ceph-disk called
[21:46] <alfredodeza> oh
[21:46] <alfredodeza> *unless*
[21:46] <alfredodeza> you haven't zapped the disks, which is always a good idea to do before
[21:46] <alfredodeza> ganders: have you done that?
[21:46] <ganders> alfredodeza: yes, i've already done that
[21:47] * markbby (~Adium@ Quit (Quit: Leaving.)
[21:47] * erice (~erice@ Quit (Ping timeout: 480 seconds)
[21:48] * madkiss (~madkiss@tmo-107-140.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[21:48] <ganders> let me run the sgdisk cmd manually and test again
[21:48] * erice (~erice@ has joined #ceph
[21:49] <ganders> alfredodeza: different error now: http://pastebin.com/print.php?i=KSSSe2EH
[21:50] <joshd> Gorazd: there are several projects for accessing s3 apis (like radosgw) via fuse
[21:50] <ganders> it's strange that it says /dev/sdd1 appears to have an xfs part, but i've already done a zap disk
[21:50] <kitz> ganders: what kind of SSDs are you using?
[21:51] <ganders> Intel DC 3500 120G
[21:51] <kitz> thx
[21:53] <iggy> is anybody using just a pure SSD setup? (no spinners)
[21:53] * bbutton (~bbutton@ Quit (Read error: Connection reset by peer)
[21:53] * bbutton (~bbutton@ has joined #ceph
[21:54] * fghaas (~florian@91-119-223-7.dynamic.xdsl-line.inode.at) has joined #ceph
[21:54] * rturk is now known as rturk|afk
[21:54] <gleam> ganders, you can push osd_mkfs_options_xfs = -f and it should force the mkfs
[21:55] * fghaas (~florian@91-119-223-7.dynamic.xdsl-line.inode.at) Quit ()
[21:57] * sage___ (~quassel@gw.sepia.ceph.com) Quit (Quit: No Ping reply in 180 seconds.)
[21:58] * sage___ (~quassel@gw.sepia.ceph.com) has joined #ceph
[21:59] * ircolle-afk is now known as ircolle
[22:01] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[22:02] <Gorazd> joshd: you have something like this in mind: https://code.google.com/p/s3ql/wiki/other_s3_filesystems
[22:02] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) Quit (Quit: Leaving.)
[22:03] <bloodice> would having the journal on the same disk as the OSD really cause the write speed to be 7MB/s when the read speed is 100MB/s?
[22:04] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[22:06] <burley> iggy: Am currently testing an all SATA SSD config
[22:07] <burley> with 36 OSDs across 3 nodes of crucial m500 960GB
[22:07] <iggy> bloodice: wouldn't surprise me
[22:08] * joef (~Adium@2620:79:0:131:d061:a542:af79:d3b9) has left #ceph
[22:08] <iggy> burley: you plan on sharing your experiences on the mailing list? (if you can)
[22:08] <burley> once we're in a more final config I can write something up
[22:08] * kfei (~root@114-27-93-71.dynamic.hinet.net) Quit (Ping timeout: 480 seconds)
[22:08] <ganders> gleam: do you mean to put that parameter in the ceph.conf file and then run the ceph-deploy prep cmd again?
[22:09] <burley> I suspect we'll be at ~114 OSDs in our final config across 6 nodes for SSD storage
[22:09] <burley> assuming we get through a few issues we're having atm
[22:09] <joshd> Gorazd: yeah, I haven't used them much myself, so can't recommend any one in particular
[22:10] <Sysadmin88> burley, you using 10gb networking?
[22:10] <burley> yes
[22:10] <Sysadmin88> you maxing it out?
[22:10] <burley> on reads
[22:10] <bloodice> time to upgrade to 1Tb
[22:10] <iggy> I've been eye balling some 1u 12SSD boxes with 2x 10GE
[22:10] <bloodice> hehe
[22:11] <burley> we're using Dell R720xd's
[22:11] <iggy> seems like a pretty good price point for the performance
[22:11] * markbby (~Adium@ has joined #ceph
[22:11] <bloodice> iggy: how much?
[22:12] <iggy> pack them full of 1T SSDs for pretty cheap
[22:12] <iggy> ~10k each
[22:12] <burley> with the drives I assume
[22:12] <iggy> right
[22:12] <nizedk> have you looked at techreport's ssd endurance writeup?
[22:12] <Sysadmin88> i have a couple of T710s, theyre nice but i wouldnt buy them now
[22:12] <nizedk> regarding consumer grade ssds and how they fail when you torture them?
[22:12] * cookednoodles (~eoin@eoin.clanslots.com) has joined #ceph
[22:13] <burley> http://techreport.com/review/26523/the-ssd-endurance-experiment-casualties-on-the-way-to-a-petabyte <-- that?
[22:13] <nizedk> yes
[22:13] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Quit: Leaving.)
[22:13] <bloodice> our current SAN uses SSDs for cache
[22:14] <nizedk> ours too, but they are typically enterprise SAS ssds that tend to back off fast if they fail and let raid controllers handle the situation
[22:14] <nizedk> backed up by a support contract where a new one is shipped. :-)
[22:14] <burley> yes, I have read that over a few times
[22:15] <nizedk> our new san is not using ssds, but 1.6 or 3.2 TB flash cards inserted into the storage processors themselves
[22:15] <burley> our plan is a mix of spinners and SSDs, so pick and choose as we need
[22:15] <iggy> those are older SSDs though
[22:15] <bloodice> Support contract actually included providing us with a spare onsite, so i unwrapped it.. its an intel ssd enterprise drive
[22:15] <nizedk> yes, they are really good apparently
[22:16] <iggy> I guess not that old
[22:16] <iggy> just small
[22:16] <iggy> (well, relative to what I'm shopping for)
[22:17] * nljmo (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[22:17] <bloodice> i wonder if i can buy the larger versions of these and replace the existing cache with a larger amount.. without them knowing
[22:17] <bloodice> they want 10k to upgrade four drives
[22:17] * nljmo (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) has joined #ceph
[22:17] <iggy> but honestly, I'd expect at least an 18 month turn over on these things anyway (probably closer to 9-12)
[22:18] <iggy> I don't see myself writing that much data in that amount of time (without trying)
[22:18] <Sysadmin88> probably some clause that if you do that the support disappears
[22:18] <bloodice> yea :(
[22:19] <bloodice> i was looking at ceph to replace this thing when the lease ends, but ceph doesnt seem to be vmware ready
[22:19] * JC1 (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) has joined #ceph
[22:20] <Sysadmin88> i'm hoping XenServer will integrate ceph
[22:21] <bloodice> i thought about making a KVM environment with ceph with some secondary servers to start, but even that seems to be "not ready" yet
[22:21] * kfei (~root@114-27-83-66.dynamic.hinet.net) has joined #ceph
[22:21] <iggy> a lot of people are using that kind of setup
[22:21] <iggy> companies as well
[22:22] <Sysadmin88> i've only seen proxmox integrate it so far... but last time i used proxmox it didnt work for me.
[22:24] <bloodice> well, my upgrade to firefly was a success, but the performance for writes has seen no improvements
[22:26] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[22:26] <bloodice> so for 36 osds, a single SSD will do right? :P
[22:27] <Sysadmin88> as a single journal?
[22:27] <bloodice> you can do that?
[22:27] <bloodice> i was thinking 36 journals
[22:28] <Sysadmin88> i meant all journals on one device
[22:28] <Sysadmin88> that ratio is quite high
[22:28] <bloodice> yea, all the journals on one drive
[22:28] <Sysadmin88> if the SSD dies, all those OSDs die
[22:28] <burley> I couldn't get 6 spinners to journal to one ssd
[22:28] <burley> in my case
[22:28] <bloodice> ouch
[22:28] <Sysadmin88> single point of failure, not in cephs spirit
[22:28] <burley> so the spinners are journaling to themselves
[22:29] <bloodice> i did that, but the write speed on my spinners is like 7MB/s
[22:29] <bloodice> trying to find a way to speed that up
[22:29] <bloodice> the read is 100MB/s.. which is fine
[22:29] <burley> did you try to stagger the journals as suggested earlier?
[22:30] <bloodice> there was some mention of that... stagger, as in place on the next drive?
[22:30] <burley> yes
[22:30] <iggy> Sysadmin88: libvirt (and thus the 1 bajillion frontends for it) has rbd support
[22:30] <bloodice> that seems like a lot of work to move them
[22:30] <bloodice> just to test it
[22:32] <iggy> then stick with the 7MB/s write speed
[22:32] * rturk|afk is now known as rturk
[22:32] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[22:33] <ganders> ok i got it to work with btrfs... before the ceph-deploy prepare cmd, i ran the following cmd on the osd node: "mkfs -t btrfs -m single -l 32768 -n 32768 -f -- /dev/sdX"
[22:34] <burley> have you tuned the OSDs
[22:34] <burley> for readahead, scheduler, etc
[22:34] <burley> and benchmarked them individually
[22:34] <burley> and then tested them all as a group
[22:35] <burley> and work your way through the layers to identify the bottleneck
[22:36] <iggy> burley: what fs are you using on your all-SSD setup?
[22:36] <burley> ext4
[22:37] * qhartman (~qhartman@den.direwolfdigital.com) Quit (Quit: Ex-Chat)
[22:37] <iggy> did you try btrfs (to test the journal optimizations btrfs has)
[22:38] <burley> no, since its labeled as not for production, no sense testing it
[22:38] <iggy> labeled on the btrfs side or the ceph side?
[22:38] <burley> both
[22:40] <nizedk> I'm currently looking at a 6-10 node 36 disk setup, no SSDs at all; intended usage is backup storage, so slow writes and "somewhat faster than write" reads are OK. However, are there any similar sized setups mentioned in here?
[22:41] * tracphil (~tracphil@ Quit (Quit: leaving)
[22:41] * cok (~chk@2a02:2350:18:1012:4139:38c6:ed0f:9997) has joined #ceph
[22:42] <nizedk> and eventually a SSD only cache tier in front as write-back to it.
[22:44] <nizedk> however, anyone mentioning a 200-300 OSD spinner-only setup's performance?
[22:49] * garphy`aw is now known as garphy
[22:50] * madkiss (~madkiss@tmo-106-25.customers.d1-online.com) has joined #ceph
[22:52] <ganders> has anyone seen this error before? http://pastebin.com/print.php?i=jEhVbHMA
[22:53] <jiffe> does the ceph kernel module do any kind of client side caching on reads or can it?
[22:54] <lurbs> It uses kernel page caching, I believe.
[22:58] * markbby (~Adium@ Quit (Quit: Leaving.)
[22:58] * markbby (~Adium@ has joined #ceph
[22:59] <burley> http://techreport.com/review/26523/the-ssd-endurance-experiment-casualties-on-the-way-to-a-petabyte
[22:59] <burley> wrong paste
[22:59] <burley> http://ceph.com/docs/next/rbd/rbd-config-ref/
[22:59] <burley> ^-- caching info
[23:03] * ganders (~root@200-127-158-54.net.prima.net.ar) Quit (Quit: WeeChat 0.4.1)
[23:04] * brad_mssw (~brad@shop.monetra.com) Quit (Quit: Leaving)
[23:05] * madkiss (~madkiss@tmo-106-25.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[23:07] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) Quit (Ping timeout: 480 seconds)
[23:08] * b0e (~aledermue@x2f2a49c.dyn.telefonica.de) has joined #ceph
[23:10] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[23:12] <jiffe> when I was using gluster I would use nfs to export a gluster mount and then reimport it to the same machine to take advantage of nfs caching, not sure if that would help at all or if its even recommended
[23:15] <iggy> I think that would be ill advised
[23:15] <chowmeined> jiffe, you probably dont want to do that
[23:15] <chowmeined> but rbd has its own caching so you may not need it
[23:15] <chowmeined> also, some people do flashcache on an rbd mapped block device
[23:17] <bloodice> i didnt do any tuning, i figured ceph was tuned for the most part and if you wanted superstar performance, you then did tuning... :)
[23:17] * lupu (~lupu@ Quit (Ping timeout: 480 seconds)
[23:18] <iggy> tuned for having SSD journals, yeah
[23:18] <chowmeined> bloodice, my experience was different. Ceph required a lot of careful hardware/layout choices
[23:19] <bloodice> i have what this guy has... http://ceph.com/community/ceph-bobtail-jbod-performance-tuning/
[23:19] * madkiss (~madkiss@tmo-096-40.customers.d1-online.com) has joined #ceph
[23:20] * madkiss (~madkiss@tmo-096-40.customers.d1-online.com) Quit ()
[23:20] * baylight (~tbayly@ has joined #ceph
[23:21] <bloodice> his range is 1-10 and he is in the 5-6 range... i am only slightly higher... so it seems right then
[23:22] * cok (~chk@2a02:2350:18:1012:4139:38c6:ed0f:9997) has left #ceph
[23:23] <bloodice> i thought i read that most of these settings were defaulted to optimal values anyway now...
[23:23] <bloodice> which is why i didnt think tuning would be necessary
[23:23] <bloodice> i mean, read is doing what i expect
[23:24] <bloodice> the developers testing this are complaining about the write speed when they run a test against it versus amazon
[23:26] <jiffe> bonnie++ was giving me some pretty good read/write numbers
[23:26] * bbutton (~bbutton@ Quit (Read error: Connection reset by peer)
[23:26] <jiffe> file creation/deletion was pretty low but I think thats common
[23:26] * bbutton (~bbutton@ has joined #ceph
[23:29] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[23:31] <bloodice> looks like object store will eliminate the journals :)
[23:36] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[23:39] * sgnut (~holoirc@147.Red-83-61-86.dynamicIP.rima-tde.net) has joined #ceph
[23:40] * colinm (~colinm@71-223-134-17.phnx.qwest.net) has joined #ceph
[23:40] <sgnut> Do you usually use dedicated servers for monitoring or do you reuse osd machines?
[23:41] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:42] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[23:46] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[23:46] * jeff-YF (~jeffyf@ Quit (Quit: jeff-YF)
[23:47] <rweeks> don't use OSD machines
[23:47] <rweeks> monitor should be a separate box
[23:48] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[23:48] <iggy> ideally
[23:48] <rweeks> for testing you might be ok
[23:48] <rweeks> but for anything production you need separate monitor(s)
[23:49] <iggy> at the very least separate SSDs
[23:49] <iggy> but yeah, better separate hosts
[23:49] <sgnut> Ok
[23:50] * sigsegv (~sigsegv@ Quit (Quit: sigsegv)
[23:51] <sgnut> And can it be used as a storage system for virtualization (e.g. vmware esx)? Does anyone have experience with such a scenario?
[23:51] <rweeks> works great for openstack/KVM virtualization
[23:51] * bbutton (~bbutton@ Quit (Quit: This computer has gone to sleep)
[23:51] <rweeks> also works well for CloudStack
[23:51] <sgnut> Ok
[23:51] <rweeks> VMware storage would require a hack or two
[23:52] * Gorazd (~Gorazd@89-212-99-37.dynamic.t-2.net) Quit ()
[23:52] <rweeks> VMware has no drivers to mount Ceph block devices
[23:52] * markbby (~Adium@ Quit (Quit: Leaving.)
[23:52] <rweeks> having said that, there are people out there exporting Ceph block devices via something like OpeniSCSI
[23:53] <rweeks> and then mounting those as VMware datastores
[23:53] <rweeks> but it's a hack
[23:53] <sgnut> Mmm i guess you must use nfs or iscsi (using block devices)
[23:53] <rweeks> yes
[23:53] <rweeks> you could mount the block device to a linux machine and then export it via nfs
[23:53] <rweeks> but again you've introduced another layer of abstraction and potential failure
[23:53] <sgnut> Hahaha sorry, I write slowly on the tablet :P
[23:53] * gregmark (~Adium@ Quit (Quit: Leaving.)
[23:54] <Serbitar> at least with rbd/iscsi bridges you can easily set up failover
[23:54] <Serbitar> since iscsi handles multipathing nicely
[23:54] <sgnut> I should test it before
[23:55] * bbutton (~bbutton@ has joined #ceph
[23:55] <sgnut> I guess that ceph distributes the reads across the different replicated osd in order to improve performance, right?
[23:56] <sgnut> Sorry I'm new :)
[23:56] <rweeks> yes, IO is striped across OSDs in a pool of block devices
[23:56] <Serbitar> the objects that make up the rbd will be evenly distributed over your osds
[23:57] * cok (~chk@ has joined #ceph
[23:57] <sgnut> Nice
[23:57] <iggy> someone should kickstart vmware/xen rbd drivers
[23:57] <iggy> xenserver that is... xen should be fine
[23:57] <rweeks> xen can use the kernel RBD
[23:57] <rweeks> IIRC
[23:58] <rweeks> I've been out of the xen loop for a while
[23:58] <iggy> well, hvm, can use the built-in librbd support in qemu
[23:58] <iggy> at least
[23:58] <rweeks> yes
[23:58] <rweeks> and that is the most well-tested and supported virtualization driver for RBD
[23:58] <iggy> but yeah, I meant xenserver
[23:59] * jordanP (~jordan@ has joined #ceph
[23:59] * jordanP (~jordan@ Quit ()
[23:59] <iggy> but realistically, if you're paying 30k per vmware host for licensing, a "supported" storage setup is a drop in the bucket
[23:59] * joao|lap (~JL@ Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.