#ceph IRC Log


IRC Log for 2015-06-19

Timestamps are in GMT/BST.

[0:02] * mdxi_ (~mdxi@50-199-109-154-static.hfc.comcastbusiness.net) has joined #ceph
[0:02] * mdxi (~mdxi@50-199-109-154-static.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[0:02] * madkiss (~madkiss@2001:6f8:12c3:f00f:adb6:92d5:ccee:4dc4) Quit (Read error: Network is unreachable)
[0:02] <TheSov> as many as you want
[0:02] <TheSov> though at some point sharing that data may result in diminished capability :)
[0:03] * madkiss (~madkiss@2001:6f8:12c3:f00f:ac2f:cfe8:bfb6:b12) has joined #ceph
[0:03] <ska> Oh.. good.. I just want to make all my nodes exactly the same.. (as much as possible, 3 nodes).. for testing.
[0:03] <TheSov> i have come to the realization that ceph is basically object-level bittorrent, the OSDs are the individual torrenters and the monitors are the trackers
[0:04] <TheSov> though there are no "seeds"
[0:04] <TheSov> or each object is a seed...
[0:07] <TheSov> so yeah thats very cool. ceph is an enterprise bittorrent.
[0:13] * evilrob00 (~evilrob00@cpe-72-179-3-209.austin.res.rr.com) Quit (Remote host closed the connection)
[0:15] * trey (~trey@trey.user.oftc.net) has joined #ceph
[0:20] * LeaChim (~LeaChim@host86-132-233-125.range86-132.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[0:22] <ska> Do I need to make the dirs manually for servers : /var/lib/ceph/[osd,mon,mds]/* ?
[0:22] * Sysadmin88 (~IceChat77@2.124.164.69) has joined #ceph
[0:27] <TheSov> i think ceph-deploy takes care of all of that
[0:27] <m0zes> so, is there a good way to quickly delete all the data in a pool? I guess I could delete/recreate the pool.
[0:29] * nsoffer (~nsoffer@bzq-79-180-80-9.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[0:29] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) Quit (Read error: Connection reset by peer)
[0:29] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) has joined #ceph
[0:30] <lurbs> Deleting a pool causes the objects to be cleaned up in the background, so you don't need to wait.
[0:30] * ngoswami (~ngoswami@1.39.96.96) Quit (Quit: Leaving)
[0:30] <lurbs> So yeah, delete/recreate is probably the fastest way.
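A minimal sketch of that delete/recreate approach, assuming a throwaway pool named 'testpool' and its original pg count of 128 (both placeholders; deletion is irreversible):

    # drop the pool - the name must be given twice plus the safety flag
    ceph osd pool delete testpool testpool --yes-i-really-really-mean-it
    # recreate it with the same pg_num/pgp_num it had before
    ceph osd pool create testpool 128 128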
[0:34] * jclm (~jclm@ip-64-134-187-212.public.wayport.net) has joined #ceph
[0:37] * xarses (~xarses@166.175.184.79) Quit (Ping timeout: 480 seconds)
[0:42] <kszarlej> Hey, how can I check the max backfills?
[0:42] <kszarlej> amount of max backfills
[0:48] * delatte (~cdelatte@vlandnat.mystrotv.com) Quit (Ping timeout: 480 seconds)
[0:49] * oblu (~o@62.109.134.112) Quit (Ping timeout: 480 seconds)
[0:50] * jiyer (~chatzilla@63.229.31.161) has joined #ceph
[0:51] * oblu (~o@62.109.134.112) has joined #ceph
[0:53] <lurbs> That's a per OSD setting. I'm not sure of any way of getting it other than 'ceph --admin-daemon /var/run/ceph/ceph-osd.$OSD.asok config show | grep backfills', where $OSD is the number of the OSD you want to check.
[0:53] <lurbs> Needs to be run on the machine on which the OSD's running.
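For reference, a sketch of inspecting and adjusting that setting on a Hammer-era cluster, assuming osd.3 lives on the local machine:

    # read the current value via the admin socket (run on the OSD's host)
    ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok config get osd_max_backfills
    # 'ceph daemon' is shorthand for the same thing
    ceph daemon osd.3 config get osd_max_backfills
    # change it at runtime on every OSD from a node with an admin keyring
    ceph tell osd.* injectargs '--osd-max-backfills 2'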
[0:54] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[0:57] * lcurtis (~lcurtis@47.19.105.250) Quit (Remote host closed the connection)
[1:04] * badone__ is now known as badone
[1:07] * xarses (~xarses@12.10.113.130) has joined #ceph
[1:09] * tmrz (~quassel@198-84-192-38.cpe.teksavvy.com) Quit (Remote host closed the connection)
[1:14] * vata (~vata@207.96.182.162) Quit (Ping timeout: 480 seconds)
[1:26] * wschulze1 (~wschulze@cpe-69-206-240-164.nyc.res.rr.com) Quit (Ping timeout: 480 seconds)
[1:28] * dalgaaf (uid15138@id-15138.charlton.irccloud.com) Quit (Quit: Connection closed for inactivity)
[1:35] * osds-fail (~oftc-webi@c-50-165-225-40.hsd1.il.comcast.net) has joined #ceph
[1:37] <osds-fail> I start my osds. they crash. they are part of an ssd cache. the osds not involved in caching always stay online. http://pastebin.com/Y8z6GVJv they were working fine until I upgraded from 94.1 to 94.2. ideas? all servers' times are in sync
[1:38] <osds-fail> http://pastebin.com/zhGRQWpJ is the crush map as well. the log file there is for osd6
[1:39] * johanni (~johanni@173.226.103.101) Quit (Remote host closed the connection)
[1:41] * johanni (~johanni@173.226.103.101) has joined #ceph
[1:50] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Ping timeout: 480 seconds)
[1:51] * wschulze (~wschulze@cpe-69-206-240-164.nyc.res.rr.com) has joined #ceph
[1:53] * oms101 (~oms101@p20030057EA079000C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[2:02] * oms101 (~oms101@p20030057EA738A00C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[2:05] * johanni (~johanni@173.226.103.101) Quit (Remote host closed the connection)
[2:06] * sage (~quassel@2607:f298:6050:709d:58e6:4e74:e4a3:853) has joined #ceph
[2:06] * ChanServ sets mode +o sage
[2:12] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[2:30] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Ping timeout: 480 seconds)
[2:31] * fam_away is now known as fam
[2:32] * haomaiwang (~haomaiwan@183.206.168.253) Quit (Remote host closed the connection)
[2:40] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) Quit (Quit: Leaving)
[2:40] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) has joined #ceph
[2:44] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) Quit ()
[2:45] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) has joined #ceph
[2:53] * KevinPerks (~Adium@2606:a000:80ad:1300:f9b4:4367:16bf:fbaf) Quit (Quit: Leaving.)
[2:57] * marrusl (~mark@rrcs-70-60-101-195.midsouth.biz.rr.com) has joined #ceph
[2:58] * marrusl (~mark@rrcs-70-60-101-195.midsouth.biz.rr.com) Quit (Remote host closed the connection)
[3:04] * yguang11 (~yguang11@2001:4998:effd:600:b09b:896a:1371:2e5a) Quit (Remote host closed the connection)
[3:07] * MentalRay (~MRay@107.171.161.165) has joined #ceph
[3:08] * haomaiwang (~haomaiwan@218.94.96.134) has joined #ceph
[3:10] * vbellur (~vijay@107-1-123-195-ip-static.hfc.comcastbusiness.net) has joined #ceph
[3:10] * MonkeyJamboree (~Revo84@tor-exit.squirrel.theremailer.net) has joined #ceph
[3:11] * johanni (~johanni@24.4.41.97) has joined #ceph
[3:11] * lightspeed (~lightspee@2001:8b0:16e:1:8326:6f70:89f:8f9c) Quit (Ping timeout: 480 seconds)
[3:14] * dalegaard-39554 (~dalegaard@vps.devrandom.dk) Quit (Ping timeout: 480 seconds)
[3:15] * dopesong_ (~dopesong@lb1.mailer.data.lt) Quit (Remote host closed the connection)
[3:16] * davidzlap (~Adium@206.169.83.146) Quit (Quit: Leaving.)
[3:19] * johanni (~johanni@24.4.41.97) Quit (Ping timeout: 480 seconds)
[3:19] * lightspeed (~lightspee@2001:8b0:16e:1:8326:6f70:89f:8f9c) has joined #ceph
[3:23] * kefu (~kefu@114.92.125.213) has joined #ceph
[3:24] * lucas1 (~Thunderbi@218.76.52.64) has joined #ceph
[3:25] * Debesis (~0x@5.254.46.84.mobile.mezon.lt) Quit (Ping timeout: 480 seconds)
[3:32] * haomaiwang (~haomaiwan@218.94.96.134) Quit (Ping timeout: 480 seconds)
[3:33] * yghannam (~yghannam@0001f8aa.user.oftc.net) has joined #ceph
[3:35] * haomaiwang (~haomaiwan@218.94.96.134) has joined #ceph
[3:38] * jclm (~jclm@ip-64-134-187-212.public.wayport.net) Quit (Ping timeout: 480 seconds)
[3:39] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[3:40] * MonkeyJamboree (~Revo84@5NZAAD17R.tor-irc.dnsbl.oftc.net) Quit ()
[3:41] * jclm (~jclm@ip-64-134-187-212.public.wayport.net) has joined #ceph
[3:43] * georgem (~Adium@23.91.150.96) has joined #ceph
[3:48] * shyu (~Shanzhi@119.254.120.66) has joined #ceph
[3:51] * zhaochao (~zhaochao@111.161.77.241) has joined #ceph
[3:57] * TheSov2 (~TheSov@c-50-158-169-178.hsd1.il.comcast.net) has joined #ceph
[4:00] * OutOfNoWhere (~rpb@199.68.195.101) has joined #ceph
[4:03] * TheSov (~TheSov@c-50-158-169-178.hsd1.il.comcast.net) Quit (Ping timeout: 480 seconds)
[4:06] * fsimonce (~simon@host253-71-dynamic.3-87-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[4:08] * KevinPerks (~Adium@2606:a000:80ad:1300:290:9eff:fe9a:a1b0) has joined #ceph
[4:16] <Nats_> anyone running erasure code in production - whats your opinion reliability-wise vs standard replicated pools? is it just as stable or are there still wrinkles to watch out for?
[4:19] * treenerd (~treenerd@cpe90-146-100-181.liwest.at) has joined #ceph
[4:21] * fam is now known as fam_away
[4:24] * aarcane (~aarcane@99-42-64-118.lightspeed.irvnca.sbcglobal.net) Quit (Quit: Leaving)
[4:25] * yanzheng (~zhyan@182.139.21.245) has joined #ceph
[4:25] * fam_away is now known as fam
[4:29] * flisky (~Thunderbi@106.39.60.34) has joined #ceph
[4:40] * Kupo1 (~tyler.wil@23.111.254.159) Quit (Read error: Connection reset by peer)
[4:40] * t0rn (~ssullivan@c-68-62-1-186.hsd1.mi.comcast.net) has joined #ceph
[4:41] * t0rn (~ssullivan@c-68-62-1-186.hsd1.mi.comcast.net) has left #ceph
[4:43] * ira (~ira@208.217.184.210) Quit (Ping timeout: 480 seconds)
[4:44] * georgem (~Adium@23.91.150.96) Quit (Quit: Leaving.)
[4:44] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[4:56] * alexbligh1 (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) Quit (Ping timeout: 480 seconds)
[4:59] * treenerd (~treenerd@cpe90-146-100-181.liwest.at) Quit (Ping timeout: 480 seconds)
[5:02] * jclm1 (~jclm@ip-64-134-187-212.public.wayport.net) has joined #ceph
[5:04] * xinze (~xinze@120.35.11.138) has joined #ceph
[5:04] * MentalRay (~MRay@107.171.161.165) Quit (Quit: This computer has gone to sleep)
[5:08] * jclm (~jclm@ip-64-134-187-212.public.wayport.net) Quit (Ping timeout: 480 seconds)
[5:09] * xinze (~xinze@120.35.11.138) has left #ceph
[5:10] * rlrevell (~leer@vbo1.inmotionhosting.com) Quit (Ping timeout: 480 seconds)
[5:14] <kefu> Nats_, yahoo is using ec pool in production, see http://yahooeng.tumblr.com/post/116391291701/yahoo-cloud-object-store-object-storage-at
[5:15] <Nats_> thanks i'll check that out
[5:16] <kefu> Nats_: FYI they are using 8+3
[5:17] * evilrob00 (~evilrob00@cpe-72-179-14-118.austin.res.rr.com) has joined #ceph
[5:20] * zack_dolby (~textual@pa3b3a1.tokynt01.ap.so-net.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz???)
[5:21] <Nats_> kefu, the mentioned optimizations in the article; do you happen to know are they proprietry to yahoo or have they submitted patches ?
[5:23] <kefu> Nats_, yahoo guys have been sending patches to ceph.
[5:23] <Nats_> nice
[5:24] <kefu> but Nats_, as you may know, ceph is not a turnkey solution.
[5:24] * overclk (~overclk@59.93.70.161) has joined #ceph
[5:25] <kefu> so your env might have different performance than that in yahoo.
[5:26] <Nats_> kefu, yeah for sure. the biggest difference would seem to be they're doing object storage rather than block (which i forgot to mention)
[5:26] <Nats_> but it would at least seem to demonstrate the erasure code implementation is robust
[5:28] <kefu> Nats_, true. ec backend is robust and is improving.
[5:28] * demonspork (~Uniju@tor-exit-node.7by7.de) has joined #ceph
[5:30] <Nats_> i see from reading the documentation there's the required caching layer for block storage too
[5:31] <Nats_> giving another potential problem point
[5:32] * fam is now known as fam_away
[5:34] <kefu> Nats_: i learned that some of the features are missing in ec backend, but not quite sure that a caching layer is *required* for a ec pool.
[5:35] <kefu> Nats_: oh, sorry, you meant rbd
[5:36] <kefu> Nats_: you are right =(
[5:37] <kefu> ec pool can hardly support random io due to its inherent limitations
[5:39] <Nats_> will stick it on my test cluster to see what perf is like at some point
[5:39] * treenerd (~treenerd@91.210.221.44) has joined #ceph
[5:39] <Nats_> was just wondering if there's any word from the trenches as to whether its stable enough to stick production rbd on yet
[5:41] <kefu> Nats_: okay. would suggest putting an SSD cache tier in front of the ec pool.
[5:41] <lurbs> I don't believe you can use an EC pool for RBD without a cache tier.
[5:42] <Nats_> we're full ssd in production so no dramas there
[5:42] <lurbs> So it's a necessity in that case, although not for pure object storage.
[5:42] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[5:42] <lurbs> http://ceph.com/docs/master/rados/operations/erasure-code/#erasure-coded-pool-and-cache-tiering <-- "It is not possible to create an RBD image on an erasure coded pool because it requires partial writes. It is however possible to create an RBD image on an erasure coded pools when a replicated pool tier set a cache tier"
[5:43] <Nats_> lurbs, yep. not concerned about it being required, just the robustness of the cache tier
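A sketch of the tiering setup the quoted docs describe, with hypothetical pool names 'ecpool' and 'hotpool' and placeholder pg counts; a real deployment would also tune hit_set and target size settings on the cache pool:

    # erasure coded base pool plus a replicated pool to act as the cache
    ceph osd pool create ecpool 128 128 erasure
    ceph osd pool create hotpool 128 128
    # attach the replicated pool as a writeback cache tier in front of the EC pool
    ceph osd tier add ecpool hotpool
    ceph osd tier cache-mode hotpool writeback
    ceph osd tier set-overlay ecpool hotpool
    # RBD images are then created against the base pool as usual
    rbd create --size 10240 --pool ecpool testimage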
[5:43] * kefu (~kefu@107.191.52.248) has joined #ceph
[5:44] * Vacuum__ (~Vacuum@i59F7AC52.versanet.de) has joined #ceph
[5:44] <Nats_> i haven't found much on ceph-users ML which would suggest its probably reliable
[5:44] * sankarshan (~sankarsha@183.87.39.242) Quit (Quit: Are you sure you want to quit this channel (Cancel/Ok) ?)
[5:44] * sankarshan (~sankarsha@183.87.39.242) has joined #ceph
[5:45] * kefu (~kefu@107.191.52.248) Quit (Max SendQ exceeded)
[5:46] * kefu (~kefu@107.191.52.248) has joined #ceph
[5:50] * tmrz (~quassel@198-84-192-38.cpe.teksavvy.com) has joined #ceph
[5:50] * kefu (~kefu@107.191.52.248) Quit (Max SendQ exceeded)
[5:51] * kefu (~kefu@107.191.52.248) has joined #ceph
[5:51] * Vacuum_ (~Vacuum@88.130.210.84) Quit (Ping timeout: 480 seconds)
[5:57] * overclk (~overclk@59.93.70.161) Quit (Ping timeout: 480 seconds)
[5:58] * demonspork (~Uniju@5NZAAD2CJ.tor-irc.dnsbl.oftc.net) Quit ()
[5:59] * squ (~Thunderbi@46.109.36.167) has joined #ceph
[6:00] * wschulze (~wschulze@cpe-69-206-240-164.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:01] * overclk (~overclk@59.93.227.177) has joined #ceph
[6:01] * treenerd (~treenerd@91.210.221.44) Quit (Ping timeout: 480 seconds)
[6:04] * debian112 (~bcolbert@24.126.201.64) Quit (Quit: Leaving.)
[6:04] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[6:10] * CheKoLyN (~saguilar@bender.parc.xerox.com) Quit (Quit: Leaving)
[6:11] * treenerd (~treenerd@91.210.221.44) has joined #ceph
[6:15] * overclk (~overclk@59.93.227.177) Quit (Read error: Connection reset by peer)
[6:18] * kefu (~kefu@107.191.52.248) Quit (Ping timeout: 480 seconds)
[6:24] * overclk (~overclk@59.93.226.23) has joined #ceph
[6:25] * treenerd (~treenerd@91.210.221.44) Quit (Ping timeout: 480 seconds)
[6:29] * vivek_v_c (7797220b@107.161.19.53) has joined #ceph
[6:29] * yguang11 (~yguang11@12.31.82.125) has joined #ceph
[6:30] * evilrob00 (~evilrob00@cpe-72-179-14-118.austin.res.rr.com) Quit (Remote host closed the connection)
[6:31] * vivek_v_c is now known as vivcheri
[6:36] * treenerd (~treenerd@91.210.222.66) has joined #ceph
[6:42] <kszarlej> Hey guys! I had 3 nodes under ceph and a pool data size of 3. One node failed and I removed all its OSDs and then set the data size to "2". So only 2 replicas.
[6:42] <kszarlej> Ceph is recovering now
[6:42] <kszarlej> but I'm afraid the recovery process will make my ceph osd's run out of space
[6:42] <kszarlej> what can I do??
[6:43] <kszarlej> can i stop this process?
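For reference, recovery and backfill can be paused cluster-wide with flags while capacity is sorted out, and unset again afterwards (a sketch; flag names as in Hammer-era releases):

    # pause data movement
    ceph osd set norecover
    ceph osd set nobackfill
    # ...and resume it later
    ceph osd unset norecover
    ceph osd unset nobackfill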
[6:47] * treenerd (~treenerd@91.210.222.66) Quit (Ping timeout: 480 seconds)
[6:48] * kefu (~kefu@114.92.125.213) has joined #ceph
[6:49] * alexbligh1 (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) has joined #ceph
[6:54] * cooldharma06 (~chatzilla@14.139.180.40) Quit (Quit: ChatZilla 0.9.91.1 [Iceweasel 21.0/20130515140136])
[6:57] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[7:01] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[7:01] * kefu (~kefu@114.92.125.213) has joined #ceph
[7:03] * davidzlap (~Adium@cpe-23-242-27-128.socal.res.rr.com) has joined #ceph
[7:03] * shohn (~shohn@dslb-088-074-069-051.088.074.pools.vodafone-ip.de) has joined #ceph
[7:06] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[7:06] * davidzlap1 (~Adium@2605:e000:1313:8003:d9c8:596:f75c:27b1) has joined #ceph
[7:06] * davidzlap1 (~Adium@2605:e000:1313:8003:d9c8:596:f75c:27b1) Quit ()
[7:07] * yguang11 (~yguang11@12.31.82.125) Quit (Ping timeout: 480 seconds)
[7:07] * kefu (~kefu@114.92.125.213) has joined #ceph
[7:08] * davidz (~davidz@cpe-23-242-27-128.socal.res.rr.com) has joined #ceph
[7:10] * m0zes (~mozes@beocat.cis.ksu.edu) Quit (Ping timeout: 480 seconds)
[7:11] * sleinen (~Adium@84-72-160-233.dclient.hispeed.ch) has joined #ceph
[7:12] * m0zes (~mozes@beocat.cis.ksu.edu) has joined #ceph
[7:13] * sleinen1 (~Adium@2001:620:0:82::102) has joined #ceph
[7:14] * davidzlap (~Adium@cpe-23-242-27-128.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[7:15] * sjm (~sjm@49.32.0.204) has joined #ceph
[7:16] * trociny (~mgolub@93.183.239.2) has joined #ceph
[7:17] * yguang11 (~yguang11@12.31.82.125) has joined #ceph
[7:19] * sleinen (~Adium@84-72-160-233.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[7:21] <osds-fail> I start my osds. they crash. they are part of an ssd cache. the osds not involved in caching always stay online. http://pastebin.com/Y8z6GVJv they were working fine until I upgraded from 94.1 to 94.2. ideas? all servers' times are in sync
[7:21] <osds-fail> http://pastebin.com/zhGRQWpJ is the crush map as well. the log file there is for osd6
[7:22] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[7:23] * kefu (~kefu@114.92.125.213) has joined #ceph
[7:25] * KevinPerks (~Adium@2606:a000:80ad:1300:290:9eff:fe9a:a1b0) Quit (Quit: Leaving.)
[7:26] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[7:28] * MACscr (~Adium@2601:247:4102:c3ac:2db2:df88:e19a:558f) Quit (Quit: Leaving.)
[7:29] * zack_dolby (~textual@e0109-114-22-11-74.uqwimax.jp) has joined #ceph
[7:29] * puffy1 (~puffy@216.207.42.140) has joined #ceph
[7:32] * puffy (~puffy@c-50-131-179-74.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[7:32] * zack_dolby (~textual@e0109-114-22-11-74.uqwimax.jp) Quit ()
[7:32] * MACscr (~Adium@2601:247:4102:c3ac:3d73:d1bc:1ffc:cb2a) has joined #ceph
[7:33] * puffy (~puffy@c-50-131-179-74.hsd1.ca.comcast.net) has joined #ceph
[7:33] * puffy (~puffy@c-50-131-179-74.hsd1.ca.comcast.net) Quit ()
[7:34] * MACscr (~Adium@2601:247:4102:c3ac:3d73:d1bc:1ffc:cb2a) Quit ()
[7:34] * haomaiwang (~haomaiwan@218.94.96.134) Quit (Read error: Connection reset by peer)
[7:35] * haomaiwang (~haomaiwan@218.94.96.134) has joined #ceph
[7:35] * rdas (~rdas@121.244.87.116) has joined #ceph
[7:36] * MACscr (~Adium@2601:247:4102:c3ac:3d73:d1bc:1ffc:cb2a) has joined #ceph
[7:38] * dalegaard-39554 (~dalegaard@vps.devrandom.dk) has joined #ceph
[7:39] * puffy1 (~puffy@216.207.42.140) Quit (Ping timeout: 480 seconds)
[7:46] * yguang11 (~yguang11@12.31.82.125) Quit (Remote host closed the connection)
[7:48] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[7:48] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[7:51] * amote (~amote@121.244.87.116) has joined #ceph
[7:54] * MACscr (~Adium@2601:247:4102:c3ac:3d73:d1bc:1ffc:cb2a) Quit (Quit: Leaving.)
[7:55] * osds-fail (~oftc-webi@c-50-165-225-40.hsd1.il.comcast.net) Quit (Remote host closed the connection)
[7:58] * zack_dolby (~textual@e0109-114-22-11-74.uqwimax.jp) has joined #ceph
[8:01] * rdas (~rdas@121.244.87.116) has joined #ceph
[8:01] * jclm1 (~jclm@ip-64-134-187-212.public.wayport.net) Quit (Read error: Connection reset by peer)
[8:03] * MACscr (~Adium@2601:247:4102:c3ac:76:c1dc:a06f:cde1) has joined #ceph
[8:04] * jclm (~jclm@ip-64-134-187-212.public.wayport.net) has joined #ceph
[8:04] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[8:06] * kefu (~kefu@114.92.125.213) has joined #ceph
[8:07] * Nacer (~Nacer@2001:41d0:fe82:7200:552b:9a95:ad1b:7236) has joined #ceph
[8:07] * kefu (~kefu@114.92.125.213) Quit (Read error: Connection reset by peer)
[8:07] * sleinen1 (~Adium@2001:620:0:82::102) Quit (Read error: Connection reset by peer)
[8:07] * kefu (~kefu@114.92.125.213) has joined #ceph
[8:09] * derjohn_mob (~aj@tmo-110-80.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[8:10] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[8:11] * kefu (~kefu@114.92.125.213) has joined #ceph
[8:13] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[8:14] * kefu (~kefu@114.92.125.213) has joined #ceph
[8:17] * zack_dolby (~textual@e0109-114-22-11-74.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz???)
[8:17] * Sysadmin88 (~IceChat77@2.124.164.69) Quit (Quit: He who laughs last, thinks slowest)
[8:25] * MACscr1 (~Adium@2601:247:4102:c3ac:76:c1dc:a06f:cde1) has joined #ceph
[8:25] * sleinen (~Adium@2001:620:0:2d:7ed1:c3ff:fedc:3223) has joined #ceph
[8:26] * trociny (~mgolub@93.183.239.2) Quit (Read error: No route to host)
[8:28] * trociny (~mgolub@93.183.239.2) has joined #ceph
[8:29] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[8:31] * kefu (~kefu@114.92.125.213) has joined #ceph
[8:31] * MACscr (~Adium@2601:247:4102:c3ac:76:c1dc:a06f:cde1) Quit (Ping timeout: 480 seconds)
[8:32] * Nacer (~Nacer@2001:41d0:fe82:7200:552b:9a95:ad1b:7236) Quit (Remote host closed the connection)
[8:34] * dugravot6 (~dugravot6@dn-infra-04.lionnois.univ-lorraine.fr) has joined #ceph
[8:46] * nsoffer (~nsoffer@bzq-79-183-39-203.red.bezeqint.net) has joined #ceph
[8:47] * alexbligh1 (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) Quit (Ping timeout: 480 seconds)
[8:48] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[8:49] * kefu (~kefu@114.92.125.213) has joined #ceph
[8:55] * madkiss (~madkiss@2001:6f8:12c3:f00f:ac2f:cfe8:bfb6:b12) Quit (Quit: Leaving.)
[8:56] * kszarlej (~kszarlej@5.196.174.189) Quit (Ping timeout: 480 seconds)
[8:59] * kszarlej (~kszarlej@5.196.174.189) has joined #ceph
[9:03] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) has joined #ceph
[9:07] * shang (~ShangWu@27.100.16.145) has joined #ceph
[9:09] * zack_dolby (~textual@e0109-106-188-39-177.uqwimax.jp) has joined #ceph
[9:10] * zack_dol_ (~textual@e0109-114-22-11-74.uqwimax.jp) has joined #ceph
[9:12] * analbeard (~shw@support.memset.com) has joined #ceph
[9:16] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[9:17] * zack_dolby (~textual@e0109-106-188-39-177.uqwimax.jp) Quit (Ping timeout: 480 seconds)
[9:20] * shang (~ShangWu@27.100.16.145) Quit (Ping timeout: 480 seconds)
[9:20] * dis (~dis@109.110.66.238) Quit (Ping timeout: 480 seconds)
[9:25] * fsimonce (~simon@host253-71-dynamic.3-87-r.retail.telecomitalia.it) has joined #ceph
[9:25] * wicope (~wicope@0001fd8a.user.oftc.net) has joined #ceph
[9:25] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) has joined #ceph
[9:28] * mohan_ (~oftc-webi@103.27.8.44) has joined #ceph
[9:28] * gaveen (~gaveen@175.157.177.210) has joined #ceph
[9:28] <mohan_> hi, can I save a live snapshot of a vm in ceph rbd?
[9:28] <mohan_> by running 'virsh snapshot-create ...', I am getting error 95
[9:29] * nardial (~ls@dslb-178-009-182-197.178.009.pools.vodafone-ip.de) has joined #ceph
[9:29] * shang (~ShangWu@111-83-251-135.EMOME-IP.hinet.net) has joined #ceph
[9:30] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[9:30] * evilrob00 (~evilrob00@cpe-72-179-14-118.austin.res.rr.com) has joined #ceph
[9:31] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[9:31] * kefu (~kefu@114.92.125.213) has joined #ceph
[9:37] * Concubidated (~Adium@199.119.131.10) Quit (Ping timeout: 480 seconds)
[9:39] * evilrob00 (~evilrob00@cpe-72-179-14-118.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[9:41] * Hemanth (~Hemanth@121.244.87.117) has joined #ceph
[9:43] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[9:44] * sjm (~sjm@49.32.0.204) has left #ceph
[9:44] * linjan (~linjan@80.179.241.26) has joined #ceph
[9:50] * bobrik___________ (~bobrik@83.243.64.45) Quit (Quit: (null))
[9:50] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) Quit (Ping timeout: 480 seconds)
[9:52] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[9:53] * kefu (~kefu@114.92.125.213) has joined #ceph
[9:58] * nsoffer (~nsoffer@bzq-79-183-39-203.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[9:58] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[10:00] * kefu (~kefu@114.92.125.213) has joined #ceph
[10:01] * calvinx (~calvin@101.100.172.246) has joined #ceph
[10:03] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Ping timeout: 480 seconds)
[10:03] * shang (~ShangWu@111-83-251-135.EMOME-IP.hinet.net) Quit (Remote host closed the connection)
[10:07] * zack_dol_ (~textual@e0109-114-22-11-74.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz???)
[10:08] * dis (~dis@109.110.66.238) has joined #ceph
[10:08] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[10:08] * branto (~branto@ip-213-220-214-203.net.upcbroadband.cz) has joined #ceph
[10:08] * jordanP (~jordan@213.215.2.194) has joined #ceph
[10:09] * kefu (~kefu@114.92.125.213) has joined #ceph
[10:16] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[10:24] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:25] * vitothe (~vitoxchat@94.161.190.221) has joined #ceph
[10:25] <vitothe> ciao
[10:25] <vitothe> !list
[10:25] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[10:26] * bobrik___________ (~bobrik@109.167.249.178) has joined #ceph
[10:28] * wicope (~wicope@0001fd8a.user.oftc.net) Quit (Remote host closed the connection)
[10:29] * trociny (~mgolub@93.183.239.2) Quit (Remote host closed the connection)
[10:29] * garphy`aw is now known as garphy
[10:30] * vitothe (~vitoxchat@94.161.190.221) Quit (Quit: Sto andando via)
[10:35] * reed (~reed@host110-251-static.62-79-b.business.telecomitalia.it) has joined #ceph
[10:36] * bobrik____________ (~bobrik@109.167.249.178) has joined #ceph
[10:36] * mattch (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[10:39] * alexbligh1 (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) has joined #ceph
[10:40] * harlequin (~loris@62-193-45-2.as16211.net) has joined #ceph
[10:41] * vivcheri (7797220b@107.161.19.53) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[10:42] <harlequin> Hi all! Is it possible to specify a different cluster from the rbd command line utility?
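The rbd tool accepts the common ceph options, so a second cluster can be selected roughly like this (a sketch; 'backup' is a hypothetical cluster name):

    # --cluster expands to /etc/ceph/<name>.conf and the matching keyring
    rbd --cluster backup ls
    # or point at a config file and keyring explicitly
    rbd -c /etc/ceph/backup.conf -k /etc/ceph/backup.client.admin.keyring ls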
[10:43] * bobrik___________ (~bobrik@109.167.249.178) Quit (Ping timeout: 480 seconds)
[10:45] <mohan_> is there any way to take a snapshot of vm virtual memory and store it in ceph rbd
[10:45] * jks (~jks@178.155.151.121) has joined #ceph
[10:46] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[10:46] * kefu (~kefu@li413-226.members.linode.com) has joined #ceph
[10:51] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[10:57] * kefu (~kefu@li413-226.members.linode.com) Quit (Max SendQ exceeded)
[10:57] * kefu (~kefu@li413-226.members.linode.com) has joined #ceph
[10:58] * shyu (~Shanzhi@119.254.120.66) Quit (Remote host closed the connection)
[11:03] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[11:03] * MACscr1 (~Adium@2601:247:4102:c3ac:76:c1dc:a06f:cde1) Quit (Ping timeout: 480 seconds)
[11:08] * MACscr (~Adium@2601:247:4102:c3ac:9866:7931:929b:10b4) has joined #ceph
[11:11] * trociny (~mgolub@93.183.239.2) has joined #ceph
[11:11] * haomaiwang (~haomaiwan@218.94.96.134) Quit (Remote host closed the connection)
[11:12] * kefu (~kefu@li413-226.members.linode.com) Quit (Max SendQ exceeded)
[11:12] * kefu (~kefu@li413-226.members.linode.com) has joined #ceph
[11:14] * shylesh (~shylesh@121.244.87.118) has joined #ceph
[11:18] * kefu (~kefu@li413-226.members.linode.com) Quit (Max SendQ exceeded)
[11:20] * kefu (~kefu@li413-226.members.linode.com) has joined #ceph
[11:20] * bobrik_____________ (~bobrik@109.167.249.178) has joined #ceph
[11:21] * bobrik____________ (~bobrik@109.167.249.178) Quit (Ping timeout: 480 seconds)
[11:23] * analbeard1 (~shw@support.memset.com) has joined #ceph
[11:28] <tuxcrafter> hi all
[11:28] <tuxcrafter> i am rebalancing my cluster
[11:28] <tuxcrafter> but i got one bad hdd with an osd
[11:28] <tuxcrafter> that keep going down
[11:28] <tuxcrafter> and when ceph takes it down my cluster goes down
[11:28] <tuxcrafter> can i force ceph to just keep the osd up for a while longer
[11:28] * shyu (~Shanzhi@119.254.120.66) has joined #ceph
[11:29] <tuxcrafter> the hdd is fine for reading data
[11:29] <tuxcrafter> i want it to rebalance first then i can take the osd out without data loss
[11:29] <tuxcrafter> 2015-06-19 11:29:36.121620 mon.0 [INF] pgmap v60334: 520 pgs: 1 active, 6 active+degraded+remapped, 1 active+degraded+remapped+inconsistent, 508 active+clean, 4 active+degraded+remapped+backfilling; 1422 GB data, 2821 GB used, 3695 GB / 6517 GB avail; 11196/728856 objects degraded (1.536%)
[11:29] * analbeard (~shw@support.memset.com) Quit (Ping timeout: 480 seconds)
[11:29] <tuxcrafter> see its almost there
[11:32] <hlkv6-59569> hey ppl, I have a problem with radosgw on Ubuntu 14.10 - trying to stop it using kill, service stop, /etc/init.d/radosgw stop - it keeps running/restarting
[11:33] <hlkv6-59569> also when looking at logs, I get failed to list objects pool_iterate returned r=-2 - can I somehow check something? noob here
[11:36] * shyu (~Shanzhi@119.254.120.66) Quit (Remote host closed the connection)
[11:36] * shylesh (~shylesh@121.244.87.118) Quit (Ping timeout: 480 seconds)
[11:38] * derjohn_mob (~aj@2001:6f8:1337:0:c1e6:700b:1390:ac35) has joined #ceph
[11:41] * dis (~dis@109.110.66.238) Quit (Ping timeout: 480 seconds)
[11:41] * kefu (~kefu@li413-226.members.linode.com) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[11:44] * kefu (~kefu@114.92.125.213) has joined #ceph
[11:46] * overclk (~overclk@59.93.226.23) Quit (Quit: Leaving)
[11:50] * derjohn_mob (~aj@2001:6f8:1337:0:c1e6:700b:1390:ac35) Quit (Ping timeout: 480 seconds)
[11:51] * thomnico (~thomnico@37.163.175.80) has joined #ceph
[11:52] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[11:52] * zack_dolby (~textual@pa3b3a1.tokynt01.ap.so-net.ne.jp) has joined #ceph
[11:53] * kefu (~kefu@114.92.125.213) has joined #ceph
[11:55] * Debesis (0x@5.254.46.84.mobile.mezon.lt) has joined #ceph
[11:55] * OutOfNoWhere (~rpb@199.68.195.101) Quit (Ping timeout: 480 seconds)
[11:58] * derjohn_mob (~aj@b2b-94-79-172-98.unitymedia.biz) has joined #ceph
[11:58] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[12:00] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[12:01] * kefu (~kefu@114.92.125.213) has joined #ceph
[12:06] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[12:07] * thomnico (~thomnico@37.163.175.80) Quit (Ping timeout: 480 seconds)
[12:07] * kefu (~kefu@114.92.125.213) has joined #ceph
[12:10] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[12:11] * kefu (~kefu@114.92.125.213) has joined #ceph
[12:17] * reed (~reed@host110-251-static.62-79-b.business.telecomitalia.it) Quit (Read error: Connection reset by peer)
[12:18] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[12:19] <Be-El> tuxcrafter: you can set the weight of the failing osd to 0.0. pgs stored on it should be tranferred to other osds
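A sketch of that reweight, assuming the failing disk is osd.2:

    # push all placement groups off the failing OSD without marking it out
    ceph osd crush reweight osd.2 0
    # then watch the data drain away
    ceph -w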
[12:19] * kefu (~kefu@114.92.125.213) has joined #ceph
[12:20] * lucas1 (~Thunderbi@218.76.52.64) Quit (Quit: lucas1)
[12:21] * harlequin (~loris@62-193-45-2.as16211.net) Quit (Ping timeout: 480 seconds)
[12:23] * essjayhch (sid79416@id-79416.ealing.irccloud.com) has joined #ceph
[12:24] <essjayhch> Just posted (a rather long and rambling) post in the users' mailing list. Thought I'd pop in here and say hi, in case anyone had any thoughts about my throughput woes with RadosGW.
[12:29] * harlequin (~loris@62-193-45-2.as16211.net) has joined #ceph
[12:31] <T1w> essjayhch: first of - drop Apache
[12:31] <T1w> Apache is a bottleneck and resource-hog
[12:31] <essjayhch> well it wasn't my first choice :)
[12:32] <T1w> use something else - nginx, lighttpd or similar
[12:32] <T1w> in our non-ceph related hosting of own products we went from apache to nginx some years ago
[12:32] <essjayhch> unfortunately I don't have much experience of the RadosGW bits of this. My previous work was primarily smaller scale, just using RBD
[12:33] <essjayhch> Do you think that apache would be slowing it down by that much though?
[12:33] <T1w> on a 24 core 128GB server we had a load of 3-4, 20GB+ ram and 'apachectl graceful' reload times up to 1 minute (where nothing was served in that time!) from apache alone
[12:34] <T1w> switching to nginx the load dropped to < 0.6, memory went down to a few GB and reload times are measured in 1-2 seconds with no detectable stalls
[12:35] <T1w> since then our traffic has increased by 5-10 times that of a few years ago
[12:36] <T1w> on average we handle 400 million+ hits over a 24 hour period and push some 800-1200GB of traffic each day through nginx - on that one server alone
[12:37] * calvinx (~calvin@101.100.172.246) Quit (Quit: calvinx)
[12:37] <T1w> so.. ditch apache and use something else.. :)
[12:37] * overclk (~overclk@122.178.199.50) has joined #ceph
[12:38] <T1w> oh yes, and nginx has 12 workers that cope with that load - apache has 100+ and could still not keep up
[12:41] * fdmanana__ (~fdmanana@bl13-144-168.dsl.telepac.pt) has joined #ceph
[12:43] * kefu (~kefu@114.92.125.213) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[12:44] * sjm (~sjm@49.32.0.204) has joined #ceph
[12:44] * kefu (~kefu@114.92.125.213) has joined #ceph
[12:45] * haomaiwang (~haomaiwan@61.132.52.84) has joined #ceph
[12:48] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[12:49] * kefu (~kefu@114.92.125.213) has joined #ceph
[12:50] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[12:50] * KevinPerks (~Adium@2606:a000:80ad:1300:2028:a7c1:5b3:abf9) has joined #ceph
[12:54] * vbellur (~vijay@107-1-123-195-ip-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[12:56] <lathiat> T1w: which apache worker pool you use matters a lot...
[12:57] <lathiat> T1w: by the sounds of it probably prefork :P
[12:57] <tuxcrafter> i removed an osd but want to add it back
[12:59] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[13:01] * zhaochao (~zhaochao@111.161.77.241) Quit (Quit: ChatZilla 0.9.91.1 [Iceweasel 38.0.1/20150526223604])
[13:04] <essjayhch> ugh habitual 502 errors now with nginx
[13:04] <essjayhch> there's nothing weird with the config.
[13:04] * vbellur (~vijay@107-1-123-195-ip-static.hfc.comcastbusiness.net) has joined #ceph
[13:04] * kefu (~kefu@114.92.125.213) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[13:05] <essjayhch> why do I get the feeling something is turned off by default with the aptitude version of nginx...
[13:05] * bobrik_____________ (~bobrik@109.167.249.178) Quit (Read error: Connection reset by peer)
[13:06] * bobrik_____________ (~bobrik@109.167.249.178) has joined #ceph
[13:06] <T1w> lathiat: prefork did nothing good
[13:07] <T1w> I remember it as the mpm we were using
[13:08] <T1w> oh, sorry.. prefork mpm
[13:08] <T1w> yeah, we tried worker mpm, but there were issues with overlap between requests and ssl termination
[13:10] * anorak (~anorak@62.27.88.230) Quit (Remote host closed the connection)
[13:13] <essjayhch> Any ideas why this would return an HTTP 502? https://www.irccloud.com/pastebin/dJgkER7M/nginx%20test_config
[13:18] <T1w> permissions of the socket?
[13:20] <essjayhch> nope, 777
[13:21] <essjayhch> and all the directories leading there are minimum of 755
[13:21] * amote (~amote@121.244.87.116) Quit (Quit: Leaving)
[13:21] <T1w> perhaps a missiong document root
[13:21] <T1w> missing even
[13:22] * kefu (~kefu@114.92.125.213) has joined #ceph
[13:22] <essjayhch> bit of a bug if you're not serving up static content really isn't it :)
[13:23] <essjayhch> nope, not that.
[13:30] <T1w> well.. it would be possible to pass every request to a specific location
[13:31] <T1w> but then again, I've actually never used the fastcgi options, so I'm not quite sure what's needed
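For what it's worth, a couple of generic checks for a 502 from nginx in front of a fastcgi socket (the socket path below is hypothetical, whatever the config actually points at):

    # the error log usually says why the upstream connection failed
    tail -f /var/log/nginx/error.log
    # confirm the socket exists and the nginx worker user can reach it
    ls -l /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock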
[13:32] * wicope (~wicope@0001fd8a.user.oftc.net) has joined #ceph
[13:32] * kefu (~kefu@114.92.125.213) Quit (Ping timeout: 480 seconds)
[13:32] * dis (~dis@109.110.66.238) has joined #ceph
[13:32] * bobrik_____________ (~bobrik@109.167.249.178) Quit (Ping timeout: 480 seconds)
[14:14] -reticulum.oftc.net- *** Looking up your hostname...
[14:14] -reticulum.oftc.net- *** Checking Ident
[14:14] -reticulum.oftc.net- *** Couldn't look up your hostname
[14:14] -reticulum.oftc.net- *** No Ident response
[14:14] * CephLogBot (~PircBot@92.63.168.213) has joined #ceph
[14:14] * Topic is 'CDS Schedule Posted: http://goo.gl/i72wN8 || http://ceph.com/get || dev channel #ceph-devel || test lab channel #sepia'
[14:14] * Set by scuttlemonkey!~scuttle@nat-pool-rdu-t.redhat.com on Mon Mar 02 21:13:33 CET 2015
[14:14] * peeejayz (~peeejayz@isis57186.sci.rl.ac.uk) Quit ()
[14:16] * rlrevell (~leer@184.52.129.221) has joined #ceph
[14:20] * rlrevell (~leer@184.52.129.221) Quit (Read error: Connection reset by peer)
[14:20] * dneary (~dneary@pool-96-252-45-212.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[14:29] <monsted> hmm, we need flash makers to sell us very cheap NAND dies that we can run OSDs on.
[14:29] <monsted> imagine a farm of thousands of unreliable NAND dies, but at a fraction of the price?
[14:31] * Concubidated (~Adium@129.192.176.66) has joined #ceph
[14:33] * hostranger (~rulrich@2a02:41a:3999::85) has left #ceph
[14:34] <tuxcrafter> how do i re-add an osd
[14:35] <tuxcrafter> i came as far as:
[14:35] <tuxcrafter> ceph auth add osd.5 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-5/keyring
[14:35] <tuxcrafter> Error ENOENT: osd.5 does not exist. create it before updating the crush map
[14:35] * rlrevell (~leer@vbo1.inmotionhosting.com) has joined #ceph
[14:37] * marrusl (~mark@nat-pool-rdu-u.redhat.com) has joined #ceph
[14:37] <ChrisNBlum> sounds like ceph doesn't know about any osd.5 - did you follow: http://ceph.com/docs/master/rados/operations/add-or-rm-osds/ ?
[14:38] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[14:40] * squ (~Thunderbi@46.109.36.167) Quit (Quit: squ)
[14:40] * kefu (~kefu@114.92.125.213) has joined #ceph
[14:43] * rdas (~rdas@121.244.87.116) Quit (Ping timeout: 480 seconds)
[14:47] * cdelatte (~cdelatte@2001:1998:860:1001:91ea:3f92:479c:51c7) has joined #ceph
[14:48] * vbellur (~vijay@nat-pool-bos-u.redhat.com) has joined #ceph
[14:50] <tuxcrafter> ChrisNBlum: i tried to follow it, but could not figure out the syntax for the crush map add
[14:50] <tuxcrafter> ceph osd crush add {id} {name} {weight} [{bucket-type}={bucket-name} ...]
[14:50] <ChrisNBlum> but that's the step after ceph auth add
[14:50] <tuxcrafter> ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
[14:50] <tuxcrafter> the auth add worked fine
[14:51] <tuxcrafter> and it seems to be working as well
[14:51] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[14:51] <ChrisNBlum> did you also see that the syntax of ceph osd crush add changed since version 0.56
[14:52] <tuxcrafter> ceph osd crush add osd.5 0.0
[14:53] <tuxcrafter> i tried that but it didn't work - invalid command
[14:53] <tuxcrafter> i want to add the osd with a 0 weight
[14:53] <tuxcrafter> (read only)
[14:54] <essjayhch> my other real question is how the pg_num should be calculated.
[14:55] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[14:55] <ChrisNBlum> there is a ... page for that ;) http://ceph.com/docs/master/rados/operations/placement-groups/
[14:55] <essjayhch> It's a bit of a black hole. When you have that number of OSDs on a machine, if you set it to anything approaching what the documentation would have you believe, it makes the whole cluster unstable.
[14:55] <essjayhch> indeed, I've read it top to bottom.
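For reference, the rule of thumb on that page works out roughly like this (a sketch with made-up numbers: 40 OSDs, 3 replicas, one main pool):

    # total PGs ~= (number of OSDs * 100) / replica count, rounded up to a power of two
    #   e.g. (40 * 100) / 3 = ~1333  ->  pg_num 2048
    ceph osd pool create mypool 2048 2048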
[14:56] * hostranger (~rulrich@2a02:41a:3999::85) has joined #ceph
[14:56] * xarses (~xarses@12.10.113.130) Quit (Ping timeout: 480 seconds)
[14:56] * hostranger (~rulrich@2a02:41a:3999::85) has left #ceph
[14:57] <ska> In the docs: http://ceph.com/docs/master/install/manual-deployment/ is "client.admin" a real username?
[14:57] <tuxcrafter> ska: try using admin as username
[14:57] <tuxcrafter> if your auth doesnt work
[14:57] <tuxcrafter> i had that issue a few times
[14:57] * sugoruyo (~georgev@paarthurnax.esc.rl.ac.uk) has joined #ceph
[14:58] * kanagaraj (~kanagaraj@nat-pool-bos-t.redhat.com) has joined #ceph
[14:59] <ska> I just don't understand the semantics of the commands.
[14:59] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[15:00] <sugoruyo> hey folks, has anyone ever tried to run a large-ish (5PB raw) Ceph cluster with a small number of PGs? Like 10PGs/OSD or sth...
[15:03] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[15:05] * overclk (~overclk@122.178.199.50) Quit (Quit: Leaving)
[15:06] * shaunm (~shaunm@74.215.76.114) Quit (Ping timeout: 480 seconds)
[15:06] * kefu (~kefu@114.92.125.213) has joined #ceph
[15:08] * Redcavalier (~Redcavali@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[15:08] <ska> I have a dedicated non-root user to issue commands manually (not ceph-deploy,) how does ceph use it?
[15:10] <ska> Also when that document says to do something on a server, does it need to be done on each server?
[15:11] * jrankin (~jrankin@d53-64-170-236.nap.wideopenwest.com) has joined #ceph
[15:12] <Redcavalier> Hi, I have an issue where openstack doesn't create a snapshot clone in ceph to provision VMs. Is there a way to log what the rbd client does so I can have an idea why exactly it fails?
[15:12] * mhack (~mhack@nat-pool-bos-u.redhat.com) has joined #ceph
[15:13] * mhack is now known as mhack|trng
[15:16] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[15:16] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[15:16] * gaveen (~gaveen@175.157.177.210) Quit (Remote host closed the connection)
[15:21] <ska> also, should I avoid hostnames with dashes in them like ceph-01?
[15:24] <burley> dashes are fine
[15:25] <ska> Thanks..
[15:26] * yanzheng (~zhyan@182.139.21.245) Quit (Quit: This computer has gone to sleep)
[15:26] <ska> I have [mon.a], [mon.b] setup in my ceph.conf file. Do I still need a mon initial members = line in [global] ?
[15:27] * yanzheng (~zhyan@182.139.21.245) has joined #ceph
[15:27] * t0rn (~ssullivan@c-68-62-1-186.hsd1.mi.comcast.net) has joined #ceph
[15:29] * tupper (~tcole@2001:420:2280:1272:8900:f9b8:3b49:567e) has joined #ceph
[15:31] * rwheeler (~rwheeler@nat-pool-bos-u.redhat.com) has joined #ceph
[15:32] * shylesh (~shylesh@121.244.87.124) Quit (Remote host closed the connection)
[15:33] <sugoruyo> ska: based on the docs I think "mon initial members" is not required
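For context, a minimal sketch of the monitor part of a hand-written ceph.conf in the style of that manual-deployment page (fsid, names and addresses are placeholders); whether 'mon initial members' is strictly needed beyond the initial bootstrap is the open question above:

    [global]
        fsid = <your cluster fsid>
        mon initial members = a, b
        mon host = 192.168.0.10, 192.168.0.11

    [mon.a]
        host = ceph-01
        mon addr = 192.168.0.10:6789

    [mon.b]
        host = ceph-02
        mon addr = 192.168.0.11:6789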
[15:35] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[15:37] <sugoruyo> I'm wondering has anyone ever tried to run a large-ish (5PB raw) Ceph cluster with a small number of PGs? Like 10PGs/OSD or sth...
[15:39] * arbrandes (~arbrandes@191.7.148.91) has joined #ceph
[15:40] * t0rn (~ssullivan@c-68-62-1-186.hsd1.mi.comcast.net) has left #ceph
[15:41] * ConSi (consi@jest.pro) Quit (Remote host closed the connection)
[15:41] * ConSi (consi@jest.pro) has joined #ceph
[15:44] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Quit: Leaving.)
[15:44] * championofcyrodi (~championo@50-205-35-98-static.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[15:46] * haomaiwang (~haomaiwan@61.132.52.84) Quit (Remote host closed the connection)
[15:47] * kanagaraj (~kanagaraj@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[15:50] * rwheeler (~rwheeler@nat-pool-bos-u.redhat.com) Quit (Quit: Leaving)
[15:53] * kanagaraj (~kanagaraj@nat-pool-bos-t.redhat.com) has joined #ceph
[15:53] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[15:56] * danieagle (~Daniel@187.35.201.117) has joined #ceph
[15:57] * shaunm (~shaunm@172.56.28.108) has joined #ceph
[15:58] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[15:59] * ska (~skatinolo@cpe-173-174-111-177.austin.res.rr.com) Quit (Quit: Leaving)
[16:00] * kanagaraj (~kanagaraj@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[16:01] <tuxcrafter> can somebody give me an example of this command ceph osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]
[16:05] * oblu (~o@62.109.134.112) Quit (Ping timeout: 480 seconds)
[16:06] * sjm (~sjm@49.32.0.204) Quit (Ping timeout: 480 seconds)
[16:07] <sugoruyo> tuxcrafter: have never run it, what are you looking for exactly?
[16:08] * mhack|trng (~mhack@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[16:08] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[16:09] * linuxkidd (~linuxkidd@209.163.164.50) has joined #ceph
[16:10] * shaunm (~shaunm@172.56.28.108) Quit (Ping timeout: 480 seconds)
[16:11] * oblu (~o@62.109.134.112) has joined #ceph
[16:11] * analbeard1 (~shw@support.memset.com) Quit (Ping timeout: 480 seconds)
[16:12] <sugoruyo> I'd think it's probably like `ceph osd crush add 78 4.5 rack=rack02 host=ceph09 osd=osd.78`
[16:15] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[16:15] * kefu (~kefu@114.92.125.213) has joined #ceph
[16:16] * ichavero_ (~ichavero@189.231.13.18) has joined #ceph
[16:17] * visbits (~textual@8.29.138.28) has joined #ceph
[16:20] <tuxcrafter> sugoruyo: i figured it out
[16:20] <sugoruyo> tuxcrafter: ok, cool
[16:20] <tuxcrafter> i had to run ceph osd create (without any arguments)!
[16:20] <tuxcrafter> then i could just run ceph osd crush add 5 0.0 host=ceph03 root=default
[16:21] <tuxcrafter> set the weight to zero
[16:21] <tuxcrafter> and ceph osd set nodown
[16:21] <tuxcrafter> it seems to rebalance nicely now
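Pulling the steps from this exchange together, the re-add sequence that worked here was roughly (same OSD id, host and crush root as above):

    # allocate the osd id again (prints the id it hands back, here 5)
    ceph osd create
    # register the osd's key
    ceph auth add osd.5 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-5/keyring
    # put it back in the crush map with zero weight so no new data lands on it
    ceph osd crush add 5 0.0 host=ceph03 root=default
    # stop flapping OSDs from being marked down while the data drains
    ceph osd set nodown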
[16:21] <sugoruyo> tuxcrafter: what were you trying to achieve?
[16:21] <tuxcrafter> sugoruyo: i was trying to kill my cluster
[16:21] <tuxcrafter> and fixing it later :)
[16:22] * ichavero (~ichavero@189.231.108.162) Quit (Ping timeout: 480 seconds)
[16:22] <tuxcrafter> i was replacing a node with an other ssd and disk
[16:22] <tuxcrafter> and while it was working an other node keep downing an osd
[16:22] <tuxcrafter> due to an bad disk
[16:22] <tuxcrafter> but it was killing my cluster
[16:23] <tuxcrafter> ceph osd set nodown
[16:23] <tuxcrafter> and weight is on zero now for the bad disk(osd)
[16:23] * oblu (~o@62.109.134.112) Quit (Ping timeout: 480 seconds)
[16:25] <sugoruyo> was the disk permanently down or was it flapping up/down?
[16:25] <tuxcrafter> sugoruyo: it went permanently down
[16:25] * garphy is now known as garphy`aw
[16:25] <tuxcrafter> but i kept bringing it up again manually as it needed the disk for its data
[16:26] <tuxcrafter> as i ended up with inconsistent pgs without the disk
[16:26] * analbeard (~shw@5.153.255.226) has joined #ceph
[16:27] <sugoruyo> I see, so you were running into rebalancing IO?
[16:27] * evilrob00 (~evilrob00@cpe-72-179-14-118.austin.res.rr.com) has joined #ceph
[16:27] * oblu (~o@62.109.134.112) has joined #ceph
[16:27] * xarses (~xarses@ma75036d0.tmodns.net) has joined #ceph
[16:27] <tuxcrafter> sugoruyo: running into r
[16:27] <tuxcrafter> rebalancing IO?
[16:28] <tuxcrafter> the disk was failing while doing an rebalancing yes
[16:28] <tuxcrafter> and it was causing inconsistent pgs
[16:32] * lkoranda (~lkoranda@213.175.37.10) Quit (Quit: Splunk> Be an IT superhero. Go home early.)
[16:32] * garphy`aw is now known as garphy
[16:34] <tuxcrafter> also i was wondering yesterday: if i turn write-caching off with hdparm -W 0 /dev/sdc on my ssd, my iops for dd if=/dev/zero of=/dev/sdc bs=4k count=10000 oflag=direct,dsync almost double
[16:34] <tuxcrafter> why is this?
[16:36] * vata (~vata@208.88.110.46) has joined #ceph
[16:39] <harlequin> Hi! Do you know what capabilities are needed to map an RBD with the kernel RBD client ?
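For reference, a client key along these lines is the usual starting point for the kernel client (a sketch; 'client.rbduser' and the pool name are placeholders):

    # read access to the monitors, read/write on the pool holding the image
    ceph auth get-or-create client.rbduser mon 'allow r' osd 'allow rwx pool=rbd'
    # the kernel client then maps with that id
    rbd map myimage --pool rbd --id rbduser --keyring /etc/ceph/ceph.client.rbduser.keyring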
[16:39] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[16:39] * vata (~vata@208.88.110.46) Quit ()
[16:40] * kefu (~kefu@114.92.125.213) has joined #ceph
[16:42] * sleinen (~Adium@2001:620:0:2d:7ed1:c3ff:fedc:3223) Quit (Quit: Leaving.)
[16:42] * arbrandes (~arbrandes@191.7.148.91) Quit (Ping timeout: 480 seconds)
[16:44] * shaunm (~shaunm@74.215.76.114) has joined #ceph
[16:46] * ska (~skatinolo@cpe-173-174-111-177.austin.res.rr.com) has joined #ceph
[16:46] <ska> In manual configuration do i need to create the mount points for the XFS volumes based on the osd.xyz names?
[16:48] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[16:49] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[16:51] <sugoruyo> tuxcrafter: you mentioned "but it was killing my cluster" and I was asking if rebalancing was giving you problems (ie too much rebalancing) or was it just the inconsistent PGs?
[16:51] * arbrandes (~arbrandes@191.30.34.253) has joined #ceph
[16:51] * vata (~vata@208.88.110.46) has joined #ceph
[16:53] <sugoruyo> ska: my mounts look like this: ceph-0 -> /var/lib/ceph/osd/sdb
[16:54] <sugoruyo> no mention of the osd.x naming
[16:54] <Be-El> ska: just use ceph-disk prepare and ceph-disk activate....
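A sketch of those two steps, assuming /dev/sdb is the data disk; ceph-disk creates, formats (xfs by default) and mounts the osd directory itself:

    ceph-disk prepare /dev/sdb
    ceph-disk activate /dev/sdb1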
[16:55] * brunoleon (~quassel@ns334299.ip-176-31-225.eu) has joined #ceph
[16:57] * calvinx (~calvin@101.100.172.246) has joined #ceph
[17:00] * mhack|trng (~mhack@nat-pool-bos-u.redhat.com) has joined #ceph
[17:01] * flisky (~Thunderbi@118.186.147.37) has joined #ceph
[17:01] * flisky (~Thunderbi@118.186.147.37) Quit ()
[17:02] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[17:02] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: This computer has gone to sleep)
[17:03] <ska> Be-El: Does that happen on each OSD node?
[17:04] <Be-El> ska: first of all read what that command is about
[17:06] * analbeard (~shw@5.153.255.226) Quit (Quit: Leaving.)
[17:06] <ska> sugoruyo: you must be using the entire disk then.
[17:09] <sugoruyo> ska: indeed we are using each disk for an OSD, but this doesn't affect the naming of your mount-points
[17:11] * arbrandes (~arbrandes@191.30.34.253) Quit (Ping timeout: 480 seconds)
[17:12] * jwilkins (~jwilkins@2601:9:4580:f4c:ea2a:eaff:fe08:3f1d) Quit (Quit: Leaving)
[17:13] * jwilkins (~jwilkins@2601:9:4580:f4c:ea2a:eaff:fe08:3f1d) has joined #ceph
[17:14] * TheSov (~TheSov@cip-248.trustwave.com) has joined #ceph
[17:15] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[17:16] * kefu (~kefu@114.92.125.213) has joined #ceph
[17:18] * ira (~ira@208.217.184.210) has joined #ceph
[17:18] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[17:19] * kefu (~kefu@114.92.125.213) has joined #ceph
[17:20] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[17:26] * jrocha (~jrocha@vagabond.cern.ch) Quit (Quit: Leaving)
[17:28] * vbellur (~vijay@nat-pool-bos-u.redhat.com) has joined #ceph
[17:28] * sleinen (~Adium@84-72-160-233.dclient.hispeed.ch) has joined #ceph
[17:28] * madkiss (~madkiss@2001:6f8:12c3:f00f:ac2f:cfe8:bfb6:b12) has joined #ceph
[17:29] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[17:29] * arbrandes (~arbrandes@191.7.148.91) has joined #ceph
[17:32] * sleinen1 (~Adium@2001:620:0:82::101) has joined #ceph
[17:33] <tuxcrafter> my ceph seems stuck at: 2015-06-19 17:32:19.489891 mon.0 [INF] pgmap v65492: 520 pgs: 510 active+clean, 8 active+degraded+remapped+backfilling, 1 active+degraded+remapped+inconsistent+backfilling, 1 active+recovering+degraded+remapped; 1422 GB data, 2882 GB used, 3635 GB / 6517 GB avail; 24702/757084 objects degraded (3.263%)
[17:33] <tuxcrafter> how can i figure what it is doing
[17:33] <tuxcrafter> its been in this state for 30min now and no disk io
[17:36] * jordanP (~jordan@213.215.2.194) Quit (Quit: Leaving)
[17:37] * xarses (~xarses@ma75036d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[17:38] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[17:39] * sleinen (~Adium@84-72-160-233.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[17:39] * kefu (~kefu@114.92.125.213) has joined #ceph
[17:39] <TheSov> u only have 1 monitor?
[17:39] * topro (~prousa@host-62-245-142-50.customer.m-online.net) Quit (Ping timeout: 480 seconds)
[17:40] <TheSov> it looks like you have an osd down, do a ceph osd tree
[17:40] <tuxcrafter> ceph osd crush tunables optimal seem to change some things
[17:40] <tuxcrafter> ceph osd tree shows all osds up
[17:41] <TheSov> what does "ceph osd tree" show you
[17:41] <TheSov> ahhh
[17:41] <tuxcrafter> TheSov: http://paste.debian.net/241460/
[17:41] <tuxcrafter> im trying to move data away from osd 2 and 5 though
[17:41] * championofcyrodi (~championo@50-205-35-98-static.hfc.comcastbusiness.net) has joined #ceph
[17:42] <TheSov> did you change the weights?
[17:42] <tuxcrafter> TheSov: ues to zero
[17:42] <tuxcrafter> /ues/yes
[17:42] * BManojlovic (~steki@cable-89-216-174-162.dynamic.sbb.rs) has joined #ceph
[17:43] <tuxcrafter> i should have three monitors as well
[17:43] <TheSov> yes at least 3
[17:43] <tuxcrafter> monmap e1: 3 mons at {ceph01=192.168.24.23:6789/0,ceph02=192.168.24.24:6789/0,ceph03=192.168.24.25:6789/0}, election epoch 248, quorum 0,1,2 ceph01,ceph02,ceph03
[17:43] <tuxcrafter> they are up
[17:43] <TheSov> you good
[17:44] <tuxcrafter> let me see what the ceph osd crush tunables optimal ends up with
[17:44] <tuxcrafter> it is rebalancing again
[17:46] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Remote host closed the connection)
[17:46] * championofcyrodi (~championo@50-205-35-98-static.hfc.comcastbusiness.net) Quit ()
[17:47] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[17:49] * sleinen1 (~Adium@2001:620:0:82::101) Quit (Ping timeout: 480 seconds)
[17:50] * championofcyrodi (~championo@50-205-35-98-static.hfc.comcastbusiness.net) has joined #ceph
[17:52] * bene (~ben@c-24-60-237-191.hsd1.nh.comcast.net) has joined #ceph
[17:52] * dugravot6 (~dugravot6@dn-infra-04.lionnois.univ-lorraine.fr) Quit (Quit: Leaving.)
[17:53] * moore (~moore@64.202.160.88) has joined #ceph
[17:55] * harlequin (~loris@62-193-45-2.as16211.net) Quit (Quit: leaving)
[17:57] * yanzheng (~zhyan@182.139.21.245) Quit (Quit: This computer has gone to sleep)
[18:01] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[18:02] * derjohn_mob (~aj@b2b-94-79-172-98.unitymedia.biz) Quit (Ping timeout: 480 seconds)
[18:05] * burley (~khemicals@cpe-98-28-239-78.cinci.res.rr.com) Quit (Quit: burley)
[18:06] * kefu (~kefu@114.92.125.213) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[18:06] * sugoruyo (~georgev@paarthurnax.esc.rl.ac.uk) Quit (Remote host closed the connection)
[18:08] <foxxx0> hey guys, I'm currently still running on the giant release on ubuntu14.04 with 3 nodes. I have the problem that the cluster becomes somewhat unusable when one node fails. I have set the min_replica_count setting to 2 so writing should still be possible with 2 nodes.
[18:09] <foxxx0> When trying to issue rbd commands while one node is down they just fail eventually with a timeout and claim that they can't connect to the cluster
[18:09] <foxxx0> also the PGs are backfilling like crazy when one node is offline, even when the noout flag is NOT set
[18:09] * sleinen (~Adium@84-72-160-233.dclient.hispeed.ch) has joined #ceph
[18:09] * sleinen (~Adium@84-72-160-233.dclient.hispeed.ch) Quit ()
[18:10] <foxxx0> any ideas on how to solve this issue? or am I mistaken that the cluster should still be usable with 2 nodes?
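For reference, the pool-level replica settings the question is about can be inspected like this (a sketch, assuming the default 'rbd' pool):

    ceph osd pool get rbd size
    ceph osd pool get rbd min_size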
[18:10] * Hemanth (~Hemanth@121.244.87.117) Quit (Ping timeout: 480 seconds)
[18:12] * Nacer_ (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[18:13] * rwheeler (~rwheeler@nat-pool-bos-u.redhat.com) has joined #ceph
[18:14] * smerz (~ircircirc@37.74.194.90) Quit (Remote host closed the connection)
[18:15] * mhack|trng (~mhack@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[18:15] * kefu (~kefu@114.92.125.213) has joined #ceph
[18:19] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Ping timeout: 480 seconds)
[18:20] * Nacer_ (~Nacer@252-87-190-213.intermediasud.com) Quit (Ping timeout: 480 seconds)
[18:21] * mhack (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[18:26] * yguang11 (~yguang11@nat-dip30-wl-d.cfw-a-gci.corp.yahoo.com) has joined #ceph
[18:26] * yguang11 (~yguang11@nat-dip30-wl-d.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[18:26] * yguang11 (~yguang11@nat-dip30-wl-d.cfw-a-gci.corp.yahoo.com) has joined #ceph
[18:33] * garphy is now known as garphy`aw
[18:39] * kefu (~kefu@114.92.125.213) Quit (Max SendQ exceeded)
[18:40] * nsoffer (~nsoffer@bzq-79-180-80-9.red.bezeqint.net) has joined #ceph
[18:40] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) has joined #ceph
[18:40] * kefu (~kefu@114.92.125.213) has joined #ceph
[18:41] * bitserker1 (~toni@88.87.194.130) Quit (Ping timeout: 480 seconds)
[18:41] * bobrik (~bobrik@83.243.64.45) has joined #ceph
[18:44] * shylesh (~shylesh@1.23.174.91) has joined #ceph
[18:45] * shylesh (~shylesh@1.23.174.91) Quit ()
[18:45] * shylesh (~shylesh@1.23.174.91) has joined #ceph
[18:46] * shylesh (~shylesh@1.23.174.91) Quit ()
[18:46] * shylesh__ (~shylesh@1.23.174.91) has joined #ceph
[18:55] * tallest_red (~Xylios@5.175.221.162) has joined #ceph
[18:56] * daviddcc (~dcasier@LAubervilliers-656-1-16-164.w217-128.abo.wanadoo.fr) has joined #ceph
[18:56] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[18:57] * treenerd (~treenerd@193.43.158.229) has joined #ceph
[18:58] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[18:58] * treenerd (~treenerd@193.43.158.229) Quit ()
[19:07] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) has joined #ceph
[19:08] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) Quit ()
[19:10] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) has joined #ceph
[19:10] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[19:13] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) Quit (Remote host closed the connection)
[19:14] * i_m (~ivan.miro@deibp9eh1--blueice3n2.emea.ibm.com) Quit (Ping timeout: 480 seconds)
[19:15] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[19:22] * Teduardo (~Teduardo@57.0.be.static.xlhost.com) Quit ()
[19:24] * kefu (~kefu@114.92.125.213) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[19:24] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Ping timeout: 480 seconds)
[19:24] * tallest_red (~Xylios@5NZAAD28S.tor-irc.dnsbl.oftc.net) Quit ()
[19:29] * Nacer (~Nacer@203-206-190-109.dsl.ovh.fr) has joined #ceph
[19:30] * diegows (~diegows@190.190.5.238) has joined #ceph
[19:30] * dgbaley27 (~matt@c-67-176-93-83.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[19:30] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: This computer has gone to sleep)
[19:31] * Hemanth (~Hemanth@117.192.254.40) has joined #ceph
[19:35] * branto (~branto@ip-213-220-214-203.net.upcbroadband.cz) has left #ceph
[19:38] * linuxkidd (~linuxkidd@209.163.164.50) Quit (Quit: Leaving)
[19:38] * daviddcc (~dcasier@LAubervilliers-656-1-16-164.w217-128.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[19:43] * Kupo1 (~tyler.wil@23.111.254.159) has joined #ceph
[19:48] * haomaiwang (~haomaiwan@114.111.166.250) has joined #ceph
[19:55] * brunoleon (~quassel@ns334299.ip-176-31-225.eu) Quit (Remote host closed the connection)
[20:03] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[20:07] * xarses (~xarses@172.56.7.255) has joined #ceph
[20:08] * mykola (~Mikolaj@91.225.203.220) has joined #ceph
[20:13] * yanzheng (~zhyan@182.139.21.245) has joined #ceph
[20:15] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[20:20] * shylesh__ (~shylesh@1.23.174.91) Quit (Ping timeout: 480 seconds)
[20:23] * portante (~portante@nat-pool-bos-t.redhat.com) Quit (Quit: ZNC - http://znc.in)
[20:23] * yanzheng (~zhyan@182.139.21.245) Quit (Quit: This computer has gone to sleep)
[20:25] * bene (~ben@c-24-60-237-191.hsd1.nh.comcast.net) Quit (Quit: Konversation terminated!)
[20:34] * portante (~portante@nat-pool-bos-t.redhat.com) has joined #ceph
[20:40] <ska> If I'm running my mon, osd, mds on same node, do I point them all to the same mount point for the files?
[20:42] * diegows (~diegows@190.190.5.238) Quit (Ping timeout: 480 seconds)
[20:44] * capri_oner (~capri@212.218.127.222) has joined #ceph
[20:47] <ska> Ok, looks like only osd data gets mounted.
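(Sketch of the default on-disk layout when mon, osd and mds run on the same node — the hostname "nodea" and OSD ids are hypothetical; only the OSD data directories are normally dedicated mount points, one per data disk.)
    /var/lib/ceph/osd/ceph-0      # OSD data dir, typically the mount point of that OSD's disk
    /var/lib/ceph/osd/ceph-1      # one directory (and usually one mount) per OSD
    /var/lib/ceph/mon/ceph-nodea  # monitor store, kept on the node's local filesystem
    /var/lib/ceph/mds/ceph-nodea  # MDS keyring only; MDS metadata itself lives in RADOS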
[20:48] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[20:50] * championofcyrodi (~championo@50-205-35-98-static.hfc.comcastbusiness.net) has left #ceph
[20:51] * capri_on (~capri@212.218.127.222) Quit (Ping timeout: 480 seconds)
[20:52] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: This computer has gone to sleep)
[20:53] * daviddcc (~dcasier@77.151.197.84) has joined #ceph
[20:58] * arbrandes (~arbrandes@191.7.148.91) Quit (Ping timeout: 480 seconds)
[21:01] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[21:03] * Concubidated (~Adium@129.192.176.66) Quit (Quit: Leaving.)
[21:08] * derjohn_mob (~aj@x590c61af.dyn.telefonica.de) has joined #ceph
[21:08] * calvinx (~calvin@101.100.172.246) Quit (Quit: calvinx)
[21:08] * arbrandes (~arbrandes@191.199.182.7) has joined #ceph
[21:09] * LeaChim (~LeaChim@host86-132-233-125.range86-132.btcentralplus.com) has joined #ceph
[21:14] * xarses (~xarses@172.56.7.255) Quit (Ping timeout: 480 seconds)
[21:16] <TheSov> im having some trouble getting ceph on the raspberry pi
[21:16] <TheSov> seems like ceph-deploy does a bunch of hardcoded stuff
[21:17] <alfredodeza> of course it does :) it does so because it is very opinionated about where you are installing and how you are configuring
[21:17] <alfredodeza> and it does that because installing+configuring a whole cluster, from scratch, for someone that wants to get started is hard
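(One way around the hardcoded repo handling on an unsupported distro like Raspbian — a sketch; the hostname is hypothetical and the flags are the repo-related options ceph-deploy carried around this time.)
    ceph-deploy install --no-adjust-repos rpi-node1                 # leave the node's existing apt sources alone
    ceph-deploy install --repo-url <url> --gpg-url <url> rpi-node1  # or point it at a specific repo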
[21:18] * derjohn_mob (~aj@x590c61af.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[21:20] * jpmethot (~Redcavali@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[21:23] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[21:23] * angdraug (~angdraug@12.164.168.117) has joined #ceph
[21:24] * maurosr (~maurosr@bi-03pt2.bluebird.ibm.com) has left #ceph
[21:26] * Redcavalier (~Redcavali@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Ping timeout: 480 seconds)
[21:27] * derjohn_mob (~aj@x590c61af.dyn.telefonica.de) has joined #ceph
[21:32] * mohan_ (~oftc-webi@103.27.8.44) Quit (Remote host closed the connection)
[21:33] * championofcyrodi (~championo@50-205-35-98-static.hfc.comcastbusiness.net) has joined #ceph
[21:36] * palmeida (~palmeida@gandalf.wire-consulting.com) Quit (Ping timeout: 480 seconds)
[21:40] * MentalRay (~MRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[21:50] * arbrandes (~arbrandes@191.199.182.7) Quit (Read error: Connection reset by peer)
[21:51] <tuxcrafter> my ceph got stuck again at:
[21:51] <tuxcrafter> TheSov: 2015-06-19 21:47:22.519551 mon.0 [INF] pgmap v68504: 520 pgs: 510 active+clean, 8 active+degraded+remapped+backfilling, 1 active+degraded+remapped+inconsistent+backfilling, 1 active+recovering+degraded+remapped; 1422 GB data, 2882 GB used, 3635 GB / 6517 GB avail; 24702/757084 objects degraded (3.263%)
[21:52] <TheSov> you may have a disk that is slowly failing
[21:52] <tuxcrafter> http://paste.debian.net/242050/ < health details
[21:52] <TheSov> can u pull the smart data?
[21:53] <tuxcrafter> TheSov: yes the the disk for osd.5 is slowly failing
[21:53] <TheSov> you have 1 scrub error
[21:53] <tuxcrafter> i want to remove it asap
[21:53] <TheSov> how many copies do you keep?
[21:53] <tuxcrafter> two
[21:53] <TheSov> ugh....
[21:53] <TheSov> 3 is the minimum safe one unless you use ec
[21:53] <tuxcrafter> osd pool default size = 2
[21:53] <TheSov> yeah you should have left it 3
[21:54] <TheSov> either way normally i would say, remove the osd entirely and replace it
[21:54] <TheSov> but since that now puts your data in danger
[21:55] <TheSov> backup first
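(A sketch of the checks being discussed before touching osd.5 — the device, PG id, pool and image names are all hypothetical.)
    smartctl -a /dev/sdX | grep -iE 'reallocat|pending|uncorrect'   # pull the SMART data for the failing disk
    ceph health detail | grep inconsistent                          # find the PG behind the scrub error
    ceph pg repair <pgid>                                           # ask Ceph to repair that PG
    rbd export rbd/myimage /backup/myimage.img                      # with only 2 copies, back up first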
[21:55] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[21:55] * codice_ (~toodles@97-94-175-73.static.mtpk.ca.charter.com) has joined #ceph
[21:57] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) Quit (Ping timeout: 480 seconds)
[22:01] * Debesis (0x@5.254.46.84.mobile.mezon.lt) Quit (Quit: Leaving)
[22:02] * cdelatte (~cdelatte@2001:1998:860:1001:91ea:3f92:479c:51c7) Quit (Quit: Leaving)
[22:02] * primechuck (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[22:04] <tuxcrafter> TheSov: this is a test system
[22:05] <TheSov> then kill the osd and deploy a new one
[22:05] <TheSov> or just kill it and leave it :)
[22:05] <tuxcrafter> TheSov: i'm just trying to test things and restore so I can see how ceph works
[22:05] <tuxcrafter> TheSov: if i remove the osd the pgs go into an inconsistent state
[22:05] <tuxcrafter> I thought i could fix it this way
[22:06] <TheSov> well in a normal situation when a disk fails, you remove the osd for that disk entirely put in a new disk and create a new osd
[22:06] <tuxcrafter> TheSov: yes i know
[22:06] <tuxcrafter> the disk failed when i was rebuilding an other node
[22:06] <tuxcrafter> so it was a bit of nasty timing
[22:06] <TheSov> heh wow
[22:06] <TheSov> well typically u keep at least 3 copies
[22:06] <TheSov> the fact you have 2 setup is dangerous
[22:07] <tuxcrafter> hmm okay
[22:07] <TheSov> for test its ok
[22:07] <TheSov> dont do that in production
[22:07] <tuxcrafter> i will change it to osd pool default size = 3
[22:07] <tuxcrafter> but that will cost me quite a bit of extra disk space
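(Note: "osd pool default size = 3" in ceph.conf only affects pools created afterwards; existing pools have to be changed per pool. A sketch, pool name hypothetical.)
    # [global] section of ceph.conf, applies to newly created pools only:
    #   osd pool default size = 3
    #   osd pool default min size = 2
    ceph osd lspools
    ceph osd pool set rbd size 3      # change an existing pool
    ceph osd pool set rbd min_size 2
    # raw usage grows from roughly 2x to 3x the stored data, as noted above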
[22:11] * jiyer (~chatzilla@63.229.31.161) Quit (Remote host closed the connection)
[22:12] * marrusl (~mark@nat-pool-rdu-u.redhat.com) Quit (Remote host closed the connection)
[22:15] * rwheeler (~rwheeler@nat-pool-bos-u.redhat.com) Quit (Quit: Leaving)
[22:16] * arbrandes (~arbrandes@191.7.148.91) has joined #ceph
[22:16] <tuxcrafter> TheSov: but how will my health state change when removing osd 5 and replacing it with a new disk
[22:16] <TheSov> when you remove the osd
[22:16] <TheSov> it will restripe
[22:16] <tuxcrafter> shouldnt i be able to get a health ok state right now
[22:16] <TheSov> so u will have 2 copies again
[22:17] <TheSov> when you add a new osd it will restripe again
[22:17] <TheSov> well u can purge pending cleans but you should never do that
[22:17] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[22:17] * jwilkins (~jwilkins@2601:9:4580:f4c:ea2a:eaff:fe08:3f1d) Quit (Ping timeout: 480 seconds)
[22:18] * DV_ (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[22:25] <tuxcrafter> TheSov: http://paste.debian.net/242113/
[22:25] * Hemanth (~Hemanth@117.192.254.40) Quit (Ping timeout: 480 seconds)
[22:25] <tuxcrafter> my health status after removing the osd.5
[22:25] * jpmethot (~Redcavali@office-mtl1-nat-146-218-70-69.gtcomm.net) Quit (Quit: Leaving)
[22:26] <tuxcrafter> can i repair that
[22:26] <TheSov> now purge
[22:26] <TheSov> yes
[22:26] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[22:26] <TheSov> u removed the osd from crush as well, correct?
[22:27] <TheSov> ceph osd crush remove osd.5
[22:27] <TheSov> u did that?
[22:27] * DV (~veillard@2001:41d0:1:d478::1) has joined #ceph
[22:27] <TheSov> ceph auth del osd.5 that too?
[22:27] <tuxcrafter> yes
[22:28] <TheSov> ok now you are ready to purge
[22:28] * jrankin (~jrankin@d53-64-170-236.nap.wideopenwest.com) Quit (Quit: Leaving)
[22:28] <tuxcrafter> where can i find info on how to purge
[22:28] <tuxcrafter> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
[22:28] <tuxcrafter> does not list it
[22:28] <tuxcrafter> ceph osd out 5; /etc/init.d/ceph stop osd.5; ceph osd crush remove osd.5; ceph auth del osd.5
[22:28] <tuxcrafter> ceph osd rm 5
[22:28] <tuxcrafter> that is what i did
[22:28] <tuxcrafter> i can undo it by readding the osd
[22:29] <TheSov> what does your status say now
[22:30] * jiyer (~chatzilla@63.229.31.161) has joined #ceph
[22:30] <TheSov> yeah thats the right link
[22:30] <TheSov> dump stale
[22:30] <TheSov> ceph pg dump_stuck stale
[22:30] <TheSov> ceph pg dump_stuck inactive
[22:30] <TheSov> ceph pg dump_stuck unclean
[22:31] <TheSov> then add a new osd in place of 5, using the same method u used to create it initially
[22:31] <TheSov> it should backfill
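(The remove-and-replace sequence described above, gathered into one sketch — OSD id 5 as in this discussion; the re-creation step assumes ceph-deploy with a hypothetical hostname and device.)
    ceph osd out 5
    /etc/init.d/ceph stop osd.5             # or: service ceph stop osd.5
    ceph osd crush remove osd.5
    ceph auth del osd.5
    ceph osd rm 5
    ceph pg dump_stuck stale                # nothing should be left stuck
    ceph pg dump_stuck inactive
    ceph pg dump_stuck unclean
    ceph-deploy osd create nodeb:/dev/sdX   # recreate the OSD the same way it was built originally
    ceph -w                                 # watch it backfill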
[22:35] <tuxcrafter> ceph pg dump_stuck stale
[22:35] <tuxcrafter> ceph pg dump_stuck inactive
[22:35] <tuxcrafter> ceph pg dump_stuck unclean
[22:35] <tuxcrafter> that did not change anything i could see
[22:36] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[22:44] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[22:44] * LeaChim (~LeaChim@host86-132-233-125.range86-132.btcentralplus.com) Quit (Remote host closed the connection)
[22:50] <TheSov> ...
[22:51] <tuxcrafter> i replaced the hdds and ssds on the node and created the new osds
[22:51] <tuxcrafter> it is rebalancing now
[22:51] <tuxcrafter> will see in what state it is in 10 hours or so
[22:52] <TheSov> it must still be somewhere
[22:52] <TheSov> either in your crush map or osd tree
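(A quick way to verify the old OSD really is gone from both places — the OSD map/tree and the CRUSH map.)
    ceph osd tree | grep osd.5          # should print nothing once it is removed
    ceph osd dump | grep osd.5          # still listed here means "ceph osd rm 5" was missed
    ceph osd crush dump | grep osd.5    # still listed here means "ceph osd crush remove osd.5" was missed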
[23:00] * arbrandes (~arbrandes@191.7.148.91) Quit (Remote host closed the connection)
[23:02] * wschulze (~wschulze@cpe-69-206-240-164.nyc.res.rr.com) Quit (Quit: Leaving.)
[23:05] * jwilkins (~jwilkins@2601:648:8500:e5d8:ea2a:eaff:fe08:3f1d) has joined #ceph
[23:05] * tupper (~tcole@2001:420:2280:1272:8900:f9b8:3b49:567e) Quit (Ping timeout: 480 seconds)
[23:16] * TheSov (~TheSov@cip-248.trustwave.com) Quit (Read error: Connection reset by peer)
[23:17] * jpr (~jpr@thing2.it.uab.edu) has joined #ceph
[23:20] * treenerd (~treenerd@194.204.13.125) has joined #ceph
[23:21] * treenerd (~treenerd@194.204.13.125) Quit ()
[23:40] * vata (~vata@208.88.110.46) Quit (Quit: Leaving.)
[23:50] * Sysadmin88 (~IceChat77@2.124.164.69) has joined #ceph
[23:51] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[23:58] * ira (~ira@208.217.184.210) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.