#ceph IRC Log

IRC Log for 2015-02-26

Timestamps are in GMT/BST.

[0:00] * puffy (~puffy@216.207.42.144) Quit (Quit: Leaving.)
[0:04] * yghannam (~yghannam@0001f8aa.user.oftc.net) has joined #ceph
[0:05] * al (quassel@niel.cx) Quit (Remote host closed the connection)
[0:06] * diegows (~diegows@190.190.5.238) has joined #ceph
[0:09] * al (d@niel.cx) has joined #ceph
[0:14] * oro (~oro@80-219-254-208.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[0:14] * Azru (~nicatronT@1CIAAGQ3Y.tor-irc.dnsbl.oftc.net) Quit ()
[0:14] * jwandborg (~Zombiekil@tor-exit2-readme.puckey.org) has joined #ceph
[0:15] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) has joined #ceph
[0:16] * nitti (~nitti@173-160-123-93-Minnesota.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[0:16] * reed (~reed@75-101-54-131.dsl.static.fusionbroadband.com) Quit (Ping timeout: 480 seconds)
[0:17] * jdillaman (~jdillaman@pool-108-56-67-212.washdc.fios.verizon.net) has joined #ceph
[0:29] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[0:35] * ccheng (~ccheng@128.211.165.1) Quit (Remote host closed the connection)
[0:41] * togdon (~togdon@74.121.28.6) has joined #ceph
[0:42] * macjack (~Thunderbi@123.51.160.200) has joined #ceph
[0:43] * nwat (~nwat@kyoto.soe.ucsc.edu) Quit (Quit: Leaving)
[0:44] * jwandborg (~Zombiekil@3N2AABCAC.tor-irc.dnsbl.oftc.net) Quit ()
[0:44] * Mattress (~dug@tor00.telenet.unc.edu) has joined #ceph
[0:47] * joef1 (~Adium@2620:79:0:207:5d76:2978:c961:3061) has joined #ceph
[0:47] * moore (~moore@64.202.160.88) Quit (Remote host closed the connection)
[0:48] * macjack (~Thunderbi@123.51.160.200) Quit (Quit: macjack)
[0:49] * wkennington (~william@76.77.180.204) Quit (Remote host closed the connection)
[0:49] * macjack (~Thunderbi@123.51.160.200) has joined #ceph
[0:51] * PerlStalker (~PerlStalk@162.220.127.20) Quit (Quit: ...)
[0:53] * al (d@niel.cx) Quit (Remote host closed the connection)
[0:53] * wkennington (~william@76.77.180.204) has joined #ceph
[0:54] * al (quassel@niel.cx) has joined #ceph
[0:56] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[1:02] <bdonnahue> anyone using dmcrypt with ceph?
[1:04] * Milena21 (~Milena21@95.141.20.198) has joined #ceph
[1:04] <Milena21> High Quality photos and videos
[1:05] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Ping timeout: 480 seconds)
[1:06] * Milena21 (~Milena21@95.141.20.198) Quit (autokilled: No spam. Contact support@oftc.net if you feel this is in error. (2015-02-26 00:06:46))
[1:13] * puffy (~puffy@50.185.218.255) has joined #ceph
[1:14] * Mattress (~dug@1CIAAGQ6J.tor-irc.dnsbl.oftc.net) Quit ()
[1:17] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) has joined #ceph
[1:18] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) Quit (Remote host closed the connection)
[1:22] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) has joined #ceph
[1:23] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) Quit (Remote host closed the connection)
[1:23] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[1:23] * Manshoon (~Manshoon@199.16.199.4) has joined #ceph
[1:24] * peeejayz (~peeejayz@vpn-2-236.rl.ac.uk) Quit (Ping timeout: 480 seconds)
[1:26] * Manshoon_ (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) has joined #ceph
[1:28] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) Quit (Ping timeout: 480 seconds)
[1:30] * zack_dolby (~textual@pa3b3a1.tokynt01.ap.so-net.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:31] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[1:33] * Manshoon (~Manshoon@199.16.199.4) Quit (Ping timeout: 480 seconds)
[1:34] * Manshoon_ (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) Quit (Ping timeout: 480 seconds)
[1:35] * xarses_ (~andreww@12.164.168.117) Quit (Ping timeout: 480 seconds)
[1:35] * togdon (~togdon@74.121.28.6) Quit (Quit: Textual IRC Client: www.textualapp.com)
[1:40] * sjm (~sjm@pool-98-109-11-113.nwrknj.fios.verizon.net) Quit (Quit: Leaving.)
[1:43] * joef1 (~Adium@2620:79:0:207:5d76:2978:c961:3061) Quit (Quit: Leaving.)
[1:45] * hellertime (~Adium@pool-173-48-56-84.bstnma.fios.verizon.net) has joined #ceph
[1:46] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) has joined #ceph
[1:52] * fam_away is now known as fam
[1:56] * EdGruberman (~Skyrider@tor-exit1.arbitrary.ch) has joined #ceph
[1:58] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[2:00] * yguang11 (~yguang11@vpn-nat.peking.corp.yahoo.com) has joined #ceph
[2:01] * OutOfNoWhere (~rpb@199.68.195.102) has joined #ceph
[2:06] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has joined #ceph
[2:06] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has left #ceph
[2:06] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[2:08] <JoeJulian> I have a new problem where when I create a new cluster, my pgs are stuck inactive - creating. osd dump http://ur1.ca/jsx4d , crushmap dump http://fpaste.org/190588/24912703/
[2:10] <JoeJulian> Afaict, everything looks right. There are 2 osds which should satisfy the size 2 min_size 1 just fine. The OSDs are on separate hosts which should satisfy the chooseleaf_firstn op (I think). Any ideas?
[2:19] * rmoe (~quassel@12.164.168.117) Quit (Ping timeout: 480 seconds)
[2:23] * hellertime (~Adium@pool-173-48-56-84.bstnma.fios.verizon.net) Quit (Quit: Leaving.)
[2:24] * hellertime (~Adium@pool-173-48-56-84.bstnma.fios.verizon.net) has joined #ceph
[2:24] * EdGruberman (~Skyrider@1CIAAGQ86.tor-irc.dnsbl.oftc.net) Quit ()
[2:24] * Hideous (~Kurimus@chomsky.torservers.net) has joined #ceph
[2:30] * jclm (~jclm@209.49.224.62) Quit (Quit: Leaving.)
[2:31] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) Quit (Ping timeout: 480 seconds)
[2:31] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[2:32] * rmoe (~quassel@173-228-89-134.dsl.static.fusionbroadband.com) has joined #ceph
[2:35] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) Quit (Quit: Leaving.)
[2:35] * jwilkins (~jwilkins@38.122.20.226) Quit (Quit: Leaving)
[2:39] * kefu (~kefu@114.92.100.153) has joined #ceph
[2:39] * thb (~me@0001bd58.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:43] * kefu (~kefu@114.92.100.153) Quit (Max SendQ exceeded)
[2:44] * kefu (~kefu@114.92.100.153) has joined #ceph
[2:46] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Quit: leaving)
[2:50] * debian112 (~bcolbert@24.126.201.64) Quit (Quit: Leaving.)
[2:52] * vasu (~vasu@38.122.20.226) Quit (Ping timeout: 480 seconds)
[2:53] * ShaunR (~ShaunR@staff.ndchost.com) Quit (Read error: Connection reset by peer)
[2:53] * ircolle (~ircolle@38.122.20.226) Quit (Ping timeout: 480 seconds)
[2:53] * zack_dolby (~textual@nfmv001076073.uqw.ppp.infoweb.ne.jp) has joined #ceph
[2:54] * Hideous (~Kurimus@1CIAAGQ94.tor-irc.dnsbl.oftc.net) Quit ()
[2:54] * _303 (~raindog@104.130.25.153) has joined #ceph
[2:54] * bandrus (~brian@57.sub-70-211-74.myvzw.com) Quit (Remote host closed the connection)
[2:54] * georgem (~Adium@69-165-159-72.dsl.teksavvy.com) has joined #ceph
[2:55] * ccheng (~ccheng@c-50-165-131-154.hsd1.in.comcast.net) has joined #ceph
[2:56] * ccheng (~ccheng@c-50-165-131-154.hsd1.in.comcast.net) Quit ()
[2:58] * joshd (~jdurgin@38.122.20.226) Quit (Ping timeout: 480 seconds)
[3:01] * cholcombe973 (~chris@7208-76ef-ff1f-ed2f-329a-f002-3420-2062.6rd.ip6.sonic.net) has left #ceph
[3:02] * ghost1 (~pablodelg@107-208-117-140.lightspeed.miamfl.sbcglobal.net) has joined #ceph
[3:07] * vilobhmm (~vilobhmm@nat-dip33-wl-g.cfw-a-gci.corp.yahoo.com) Quit (Quit: Away)
[3:09] * joshd (~jdurgin@38.122.20.226) has joined #ceph
[3:09] * vilobhmm (~vilobhmm@nat-dip33-wl-g.cfw-a-gci.corp.yahoo.com) has joined #ceph
[3:10] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) has joined #ceph
[3:12] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[3:12] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[3:12] * Rickus (~Rickus@office.protected.ca) Quit (Ping timeout: 480 seconds)
[3:17] * nitti (~nitti@c-66-41-30-224.hsd1.mn.comcast.net) has joined #ceph
[3:18] * diegows (~diegows@190.190.5.238) Quit (Ping timeout: 480 seconds)
[3:20] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[3:22] * jclm (~jclm@ip-64-134-224-99.public.wayport.net) has joined #ceph
[3:24] * vilobhmm (~vilobhmm@nat-dip33-wl-g.cfw-a-gci.corp.yahoo.com) Quit (Quit: Away)
[3:24] * _303 (~raindog@3N2AABCHJ.tor-irc.dnsbl.oftc.net) Quit ()
[3:25] * nitti (~nitti@c-66-41-30-224.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[3:25] * blip2 (~SurfMaths@tor-exit.server7.tvdw.eu) has joined #ceph
[3:29] * segutier (~segutier@c-24-6-218-139.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[3:30] * LeaChim (~LeaChim@host86-159-234-113.range86-159.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:36] * georgem (~Adium@69-165-159-72.dsl.teksavvy.com) Quit (Quit: Leaving.)
[3:37] * dalgaaf (uid15138@id-15138.charlton.irccloud.com) Quit (Quit: Connection closed for inactivity)
[3:37] * amote (~amote@1.39.97.119) has joined #ceph
[3:38] * georgem (~Adium@69-165-159-72.dsl.teksavvy.com) has joined #ceph
[3:44] * rljohnsn (~rljohnsn@ns25.8x8.com) Quit (Ping timeout: 480 seconds)
[3:50] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[3:50] * blip2 (~SurfMaths@2BLAAFXEH.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[3:54] * jdillaman is now known as jdillaman_afk
[3:55] * allenmelon1 (~mps@chomsky.torservers.net) has joined #ceph
[3:56] * georgem (~Adium@69-165-159-72.dsl.teksavvy.com) Quit (Quit: Leaving.)
[3:59] * nitti (~nitti@c-66-41-30-224.hsd1.mn.comcast.net) has joined #ceph
[3:59] * nitti (~nitti@c-66-41-30-224.hsd1.mn.comcast.net) Quit (Remote host closed the connection)
[4:02] * mookins (~mookins@induct3.lnk.telstra.net) has joined #ceph
[4:02] * georgem (~Adium@69-165-159-72.dsl.teksavvy.com) has joined #ceph
[4:09] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) Quit (Remote host closed the connection)
[4:12] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) has joined #ceph
[4:12] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[4:18] * saltlake (~saltlake@pool-71-244-62-208.dllstx.fios.verizon.net) has joined #ceph
[4:20] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) Quit (Ping timeout: 480 seconds)
[4:21] * saltlake (~saltlake@pool-71-244-62-208.dllstx.fios.verizon.net) Quit (Read error: Connection reset by peer)
[4:24] * saltlake (~saltlake@pool-71-244-62-208.dllstx.fios.verizon.net) has joined #ceph
[4:24] * allenmelon1 (~mps@1CIAAGRDQ.tor-irc.dnsbl.oftc.net) Quit ()
[4:25] * Kyso_ (~Fapiko@83.149.126.29) has joined #ceph
[4:28] <dmick> JoeJulian:         "type": "host"
[4:29] <dmick> there are no host nodes in the crushmap
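
  For reference, a minimal sketch of one way to add host buckets and move OSDs under them so a "type": "host" chooseleaf rule can be satisfied (the bucket name node-a and the weight 1.0 are assumptions, not taken from JoeJulian's map):

      ceph osd crush add-bucket node-a host           # create an empty host bucket
      ceph osd crush move node-a root=default         # hang it under the default root
      ceph osd crush create-or-move osd.0 1.0 host=node-a   # place the OSD under that host
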
[4:32] * hellertime (~Adium@pool-173-48-56-84.bstnma.fios.verizon.net) Quit (Quit: Leaving.)
[4:36] * ghost1 (~pablodelg@107-208-117-140.lightspeed.miamfl.sbcglobal.net) Quit (Quit: ghost1)
[4:39] * bkopilov (~bkopilov@bzq-79-182-164-80.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[4:54] * Kyso_ (~Fapiko@1CIAAGREP.tor-irc.dnsbl.oftc.net) Quit ()
[4:54] * Tarazed (~Borf@95.211.169.35) has joined #ceph
[4:56] * saltlake (~saltlake@pool-71-244-62-208.dllstx.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[5:00] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[5:01] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[5:02] * amote (~amote@1.39.97.119) Quit (Ping timeout: 480 seconds)
[5:04] * guppy (~quassel@guppy.xxx) Quit (Quit: No Ping reply in 180 seconds.)
[5:05] * guppy (~quassel@guppy.xxx) has joined #ceph
[5:07] * amote (~amote@1.39.97.119) has joined #ceph
[5:08] * saltlake (~saltlake@pool-71-244-62-208.dllstx.fios.verizon.net) has joined #ceph
[5:08] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Ping timeout: 480 seconds)
[5:09] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[5:13] * DV__ (~veillard@2001:41d0:1:d478::1) has joined #ceph
[5:15] * sjm (~sjm@pool-98-109-11-113.nwrknj.fios.verizon.net) has joined #ceph
[5:16] * amote (~amote@1.39.97.119) Quit (Quit: Leaving)
[5:19] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[5:24] * Tarazed (~Borf@3N2AABCLB.tor-irc.dnsbl.oftc.net) Quit ()
[5:25] * DougalJacobs (~mason@188.165.59.43) has joined #ceph
[5:27] * kanagaraj (~kanagaraj@121.244.87.117) has joined #ceph
[5:31] * georgem (~Adium@69-165-159-72.dsl.teksavvy.com) Quit (Quit: Leaving.)
[5:32] <JoeJulian> Thanks dmick. Makes me wonder why it ever worked. :D
[5:33] <dmick> can't imagine it did
[5:34] <JoeJulian> but it did.
[5:37] * KevinPerks (~Adium@cpe-071-071-026-213.triad.res.rr.com) Quit (Quit: Leaving.)
[5:40] <JoeJulian> ... which is probably a bug.
[5:40] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[5:54] * DougalJacobs (~mason@3N2AABCL4.tor-irc.dnsbl.oftc.net) Quit ()
[5:55] * Freddy (~mps@95.128.43.164) has joined #ceph
[5:56] * jkhunter (~oftc-webi@cpe-70-112-163-237.austin.res.rr.com) has joined #ceph
[5:57] <JoeJulian> Oh wait.. "by default, Ceph automatically sets a ceph-osd daemon's location to be root=default host=HOSTNAME (based on the output from hostname -s)" so that explains why it worked, but not why it sometimes fails.
[5:57] * Vacuum (~vovo@88.130.209.173) has joined #ceph
[5:57] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) has joined #ceph
[5:59] <jkhunter> Hi I am just starting out using ceph and I am having problems creating the osds. I am using Ubuntu 10.4 long term support.
[5:59] * xarses_ (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[6:00] <jkhunter> can anyone help me?
[6:01] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[6:04] * Vacuum_ (~vovo@i59F79B0C.versanet.de) Quit (Ping timeout: 480 seconds)
[6:04] <jkhunter> hello?
[6:05] * jamespage (~jamespage@culvain.gromper.net) Quit (Quit: Coyote finally caught me)
[6:06] * jamespage (~jamespage@culvain.gromper.net) has joined #ceph
[6:06] <jkhunter> hello?
[6:06] <janos> if someone is available to help, they will
[6:06] <janos> that's not a great way to encourage someone to help
[6:07] <jkhunter> sorry, new to irc, wasn't sure if anyone was really seeing my messages. I didn't see anyone else posting comments, so I wasn't sure why I wasn't seeing any other comments, just joins.
[6:08] <janos> much of the support tends to occur during usual daytime hours from east-coast US to west coast US
[6:08] <janos> though there are people around randomly
[6:09] <janos> i'd be more helpful but i'm billing hourly for a client in a hurry right now
[6:09] <janos> sorry
[6:09] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Ping timeout: 480 seconds)
[6:09] <jkhunter> okay sounds good.. just been stuck for a while now and finally decided to start trying to reach out for help.
[6:14] <jkhunter> Here is my problem... when I run ceph-deploy osd prepare compute-03:sdb it does provision the disk, but then it fails: [compute-03][INFO ] Running command: sudo ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/sdb. [compute-03][WARNIN] ceph-disk: Error: Device is mounted: /dev/sdb1 . [ceph_deploy.osd][ERROR ] Failed to execute command: ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/sdb
[6:15] * tupper (~tcole@rtp-isp-nat1.cisco.com) Quit (Read error: Connection reset by peer)
[6:15] <jkhunter> even if I ssh to the node and then umount the disk and run the command: ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/sdb I am not able to get past this error.
[6:20] <tacticus> jkhunter: is there a swap partition on the disk?
[6:20] <tacticus> perhaps some swap detection has found it and decided to mount it
[6:22] <jkhunter> no, the first thing I ran was ceph-deploy disk zap compute-03:sdb, so that deletes everything on this disk
[6:22] <mjevans> jkhunter: I suggest unmounting that disk, zeroing the partition headers (the beginnings) and maybe re-partitioning it... before rebooting. Unsure what zap does, I don't use that part of the deploy scripts.
[6:22] <JoeJulian> I wish that was true, but if there's something mounted, zap doesn't actually zap it.
[6:22] <mjevans> XD and THAT is why I do things manually
[6:23] <JoeJulian> If you zero the partition headers, you do need to create a disklabel (partition table) or ceph-disk won't create the partition.
[6:24] * badone_ (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[6:24] * Freddy (~mps@1CIAAGRIG.tor-irc.dnsbl.oftc.net) Quit ()
[6:25] <mjevans> JoeJulian: by that I mean dd if=/dev/zero of=/dev/XXXn bs=1024k count=128 or something similar
[6:25] <mjevans> However that takes a bit to type out.
[6:26] * badone is now known as Guest694
[6:26] * badone_ is now known as badone
[6:26] * AG_Clinton (~superdug@176.10.99.209) has joined #ceph
[6:26] <jkhunter> Thanks
[6:27] * Guest694 (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[6:27] * jkhunter (~oftc-webi@cpe-70-112-163-237.austin.res.rr.com) Quit (Quit: Page closed)
[6:28] <JoeJulian> No worries. I've actually been doing dd if=/dev/zero of=/dev/XXX bs=4M count=1 ; parted -s /dev/XXX mklabel gpt
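
  Put together, the wipe-and-retry sequence discussed above might look like this for jkhunter's case (the /dev/sdb name comes from the error output above; double-check the device before wiping anything):

      umount /dev/sdb1                           # unmount whatever ceph-deploy left mounted
      dd if=/dev/zero of=/dev/sdb bs=4M count=1  # zero the partition headers
      parted -s /dev/sdb mklabel gpt             # recreate an empty GPT disklabel
      ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/sdb
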
[6:29] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:30] <badone> Doesn't --zap-disk do that?
[6:30] <JoeJulian> not if it's mounted
[6:30] <badone> JoeJulian: ahh, okay
[6:31] <JoeJulian> And when I'm building integration environments all day, I need to start over a lot.
[6:31] * tupper (~tcole@108-83-203-37.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[6:32] <tacticus> JoeJulian: is there a reason for parted over sgdisk?
[6:32] * brutuscat (~brutuscat@174.34.133.37.dynamic.jazztel.es) has joined #ceph
[6:32] <JoeJulian> familiarity
[6:32] <tacticus> kk
[6:34] <JoeJulian> ... and now I'm going to change my salt state to use sgdisk... thanks tacticus
[6:34] <tacticus> JoeJulian: np it's much nicer in scripts
[6:35] * jackson (~jackson@dynamic-acs-24-154-29-150.zoominternet.net) has joined #ceph
[6:35] <tacticus> and if you already have a layout for your disks you can just pass it that.
[6:35] <tacticus> great for partitioning 30 disks all the same.
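
  A rough sgdisk sketch of that scripted approach (the device names, the single-partition layout and the 8300 typecode are placeholders, not anything from this conversation):

      sgdisk --zap-all /dev/sdb            # wipe GPT and MBR structures
      sgdisk -n 1:0:0 -t 1:8300 /dev/sdb   # one partition spanning the whole disk
      sgdisk -R=/dev/sdc /dev/sdb          # replicate sdb's layout onto sdc
      sgdisk -G /dev/sdc                   # give the copy fresh GUIDs
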
[6:35] <mjevans> I prefer gdisk myself
[6:36] <JoeJulian> And 30 is actually how many I do each time.
[6:36] <tacticus> isn't sgdisk just a different interface to gdisk?
[6:37] * runfromn1where (~runfromno@pool-70-104-139-21.nycmny.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[6:37] * jclm (~jclm@ip-64-134-224-99.public.wayport.net) Quit (Quit: Leaving.)
[6:41] * sjm (~sjm@pool-98-109-11-113.nwrknj.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[6:41] * kefu (~kefu@114.92.100.153) Quit (Remote host closed the connection)
[6:42] * kefu (~kefu@114.92.100.153) has joined #ceph
[6:42] * jackson is now known as osi
[6:43] * kevinkevin-work (6dbebb8f@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[6:43] * badone_ (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[6:43] * brutuscat (~brutuscat@174.34.133.37.dynamic.jazztel.es) Quit (Remote host closed the connection)
[6:44] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[6:44] * badone_ is now known as badone
[6:44] * brutuscat (~brutuscat@174.34.133.37.dynamic.jazztel.es) has joined #ceph
[6:45] * kevinkevin-work (6dbebb8f@107.161.19.109) has joined #ceph
[6:46] * karnan (~karnan@121.244.87.117) has joined #ceph
[6:47] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[6:48] * saltlake (~saltlake@pool-71-244-62-208.dllstx.fios.verizon.net) Quit (Quit: Nettalk6 - www.ntalk.de)
[6:49] * OutOfNoWhere (~rpb@199.68.195.102) Quit (Ping timeout: 480 seconds)
[6:53] * brutuscat (~brutuscat@174.34.133.37.dynamic.jazztel.es) Quit (Ping timeout: 480 seconds)
[6:54] * rdas (~rdas@121.244.87.116) has joined #ceph
[6:55] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[6:55] * AG_Clinton (~superdug@1CIAAGRJI.tor-irc.dnsbl.oftc.net) Quit ()
[6:56] * Inuyasha (~mr_flea@tor-exit3-readme.dfri.se) has joined #ceph
[7:00] * nitti (~nitti@c-66-41-30-224.hsd1.mn.comcast.net) has joined #ceph
[7:00] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[7:01] * jclm (~jclm@ip-64-134-224-99.public.wayport.net) has joined #ceph
[7:02] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[7:05] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[7:06] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[7:08] * kefu is now known as kefu|afk
[7:08] * nitti (~nitti@c-66-41-30-224.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[7:10] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Ping timeout: 480 seconds)
[7:13] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[7:15] * krypto (~oftc-webi@hpm01cs006-ext.asiapac.hp.net) has joined #ceph
[7:18] * davidz1 (~davidz@cpe-23-242-189-171.socal.res.rr.com) has joined #ceph
[7:18] * swami1 (~swami@116.75.99.14) has joined #ceph
[7:18] * tobiash (~quassel@mail.bmw-carit.de) Quit (Remote host closed the connection)
[7:19] * jclm (~jclm@ip-64-134-224-99.public.wayport.net) Quit (Quit: Leaving.)
[7:19] * tobiash (~quassel@mail.bmw-carit.de) has joined #ceph
[7:20] * tobiash (~quassel@mail.bmw-carit.de) Quit (Remote host closed the connection)
[7:21] * osi is now known as Jackson
[7:22] * tobiash (~quassel@mail.bmw-carit.de) has joined #ceph
[7:22] * Jackson is now known as Guest696
[7:22] * davidz (~davidz@2605:e000:1313:8003:213a:f11e:8cdc:2bad) Quit (Ping timeout: 480 seconds)
[7:23] * Guest696 is now known as Jackson_
[7:24] * Muhlemmer (~kvirc@cable-90-50.zeelandnet.nl) Quit (Quit: KVIrc 4.3.1 Aria http://www.kvirc.net/)
[7:25] * Inuyasha (~mr_flea@1CIAAGRKP.tor-irc.dnsbl.oftc.net) Quit ()
[7:26] * Quackie (~Enikma@40.ip-37-187-244.eu) has joined #ceph
[7:27] * overclk (~overclk@121.244.87.117) has joined #ceph
[7:35] <krypto> hello all, i am trying ceph with the memstore backend using firefly on ubuntu 14.04, 3.16.-3 kernel. The client is using an xfs partition and writing a 100Mb file using "dd"; there is not much improvement in performance. In ceph.log it's writing at 800kbps. With an hdd osd i was getting 750 Kbps and now even with memstore it's just a 50Kbps improvement. Where can the issue be?
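
  For reference, a minimal ceph.conf sketch for the memstore backend krypto describes (memstore is a testing-only backend; the option names are as I recall them and the 1 GiB size is an assumption):

      [osd]
      osd objectstore = memstore
      memstore device bytes = 1073741824   # RAM-backed store size per OSD, 1 GiB here
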
[7:36] * oro (~oro@80-219-254-208.dclient.hispeed.ch) has joined #ceph
[7:55] * Quackie (~Enikma@3N2AABCQJ.tor-irc.dnsbl.oftc.net) Quit ()
[7:55] * Jyron (~Shnaw@1CIAAGRMN.tor-irc.dnsbl.oftc.net) has joined #ceph
[7:57] * amote (~amote@121.244.87.116) has joined #ceph
[8:02] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[8:04] * Jackson_ (~jackson@dynamic-acs-24-154-29-150.zoominternet.net) Quit (Quit: Ex-Chat)
[8:04] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Quit: Leaving...)
[8:04] * vbellur (~vijay@122.172.31.221) has joined #ceph
[8:06] * cooldharma06 (~chatzilla@14.139.180.52) has joined #ceph
[8:07] * lalatenduM (~lalatendu@122.167.151.186) has joined #ceph
[8:10] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[8:13] * DV__ (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[8:13] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) has joined #ceph
[8:15] * Nacer (~Nacer@2001:41d0:fe82:7200:2c41:8a2b:cf95:6060) has joined #ceph
[8:16] * swami2 (~swami@116.75.99.14) has joined #ceph
[8:16] * DV__ (~veillard@2001:41d0:1:d478::1) has joined #ceph
[8:16] * Sysadmin88 (~IceChat77@2.125.213.8) Quit (Quit: OUCH!!!)
[8:18] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) has joined #ceph
[8:19] <Be-El> hi
[8:20] * swami1 (~swami@116.75.99.14) Quit (Ping timeout: 480 seconds)
[8:20] * Andreas-IPO (~andreas@2a01:2b0:2000:11::cafe) Quit (Quit: Andreas-IPO)
[8:21] * Andreas-IPO (~andreas@2a01:2b0:2000:11::cafe) has joined #ceph
[8:21] * Manshoon (~Manshoon@c-50-181-29-219.hsd1.wv.comcast.net) Quit (Ping timeout: 480 seconds)
[8:25] * Jyron (~Shnaw@1CIAAGRMN.tor-irc.dnsbl.oftc.net) Quit ()
[8:29] * joshd1 (~jdurgin@24-205-54-236.dhcp.gldl.ca.charter.com) has joined #ceph
[8:30] * jacoo (~drdanick@tor-exit2-readme.puckey.org) has joined #ceph
[8:32] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[8:35] * kefu (~kefu@114.92.100.153) has joined #ceph
[8:37] * Nacer (~Nacer@2001:41d0:fe82:7200:2c41:8a2b:cf95:6060) Quit (Remote host closed the connection)
[8:40] * kefu|afk (~kefu@114.92.100.153) Quit (Ping timeout: 480 seconds)
[8:43] * Miouge (~Miouge@94.136.92.20) has joined #ceph
[8:48] * davidz (~davidz@cpe-23-242-189-171.socal.res.rr.com) has joined #ceph
[8:49] * cok (~chk@2a02:2350:18:1010:458b:ddde:571c:7f15) has joined #ceph
[8:54] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[8:54] * avozza (~avozza@a83-160-116-36.adsl.xs4all.nl) has joined #ceph
[8:55] * davidz1 (~davidz@cpe-23-242-189-171.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[8:57] * branto (~branto@nat-pool-brq-t.redhat.com) has joined #ceph
[9:00] * jacoo (~drdanick@3N2AABCS8.tor-irc.dnsbl.oftc.net) Quit ()
[9:00] * BillyBobJohn (~CobraKhan@edwardsnowden0.torservers.net) has joined #ceph
[9:02] * thb (~me@port-33558.pppoe.wtnet.de) has joined #ceph
[9:06] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[9:11] * ohnomrbill (~ohnomrbil@c-67-174-241-112.hsd1.ca.comcast.net) Quit (Quit: ohnomrbill)
[9:15] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[9:15] * capri (~capri@212.218.127.222) has joined #ceph
[9:17] * thomnico (~thomnico@82.166.93.197) has joined #ceph
[9:18] * vbellur (~vijay@122.172.31.221) Quit (Ping timeout: 480 seconds)
[9:21] * Concubidated (~Adium@71.21.5.251) Quit (Quit: Leaving.)
[9:22] * oro (~oro@80-219-254-208.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[9:24] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[9:26] * thomnico_ (~thomnico@82.166.93.197) has joined #ceph
[9:29] * fghaas (~florian@91-119-130-192.dynamic.xdsl-line.inode.at) has joined #ceph
[9:29] * bitserker (~toni@77.231.177.241) has joined #ceph
[9:30] * BillyBobJohn (~CobraKhan@1CIAAGRPI.tor-irc.dnsbl.oftc.net) Quit ()
[9:30] * Wizeon (~Vidi@vps.reiner-h.de) has joined #ceph
[9:31] * dgurtner (~dgurtner@178.197.231.49) has joined #ceph
[9:31] * jtang (~jtang@109.255.42.21) has joined #ceph
[9:33] * kawa2014 (~kawa@90.216.134.197) has joined #ceph
[9:33] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit (Remote host closed the connection)
[9:35] * thomnico (~thomnico@82.166.93.197) Quit (Quit: Ex-Chat)
[9:35] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[9:35] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) has joined #ceph
[9:36] * squ (~Thunderbi@46.109.186.160) has joined #ceph
[9:37] * greatmane (~greatmane@CPE-124-188-114-5.wdcz1.cht.bigpond.net.au) Quit ()
[9:38] <baffle> Hi, can anyone help me with some stuck placement groups?
[9:38] * vbellur (~vijay@121.244.87.117) has joined #ceph
[9:41] * ChrisNBlum (~ChrisNBlu@dhcp-ip-230.dorf.rwth-aachen.de) has joined #ceph
[9:41] * ohnomrbill (~ohnomrbil@c-67-174-241-112.hsd1.ca.comcast.net) has joined #ceph
[9:43] * i_m (~ivan.miro@deibp9eh1--blueice4n2.emea.ibm.com) has joined #ceph
[9:44] * ChrisNBlum (~ChrisNBlu@dhcp-ip-230.dorf.rwth-aachen.de) Quit ()
[9:45] * ChrisNBlum (~ChrisNBlu@dhcp-ip-230.dorf.rwth-aachen.de) has joined #ceph
[9:54] * jordanP (~jordan@scality-jouf-2-194.fib.nerim.net) has joined #ceph
[9:55] * tupper (~tcole@108-83-203-37.lightspeed.rlghnc.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[9:55] * zack_dolby (~textual@nfmv001076073.uqw.ppp.infoweb.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[9:57] * Amto_res1 (~amto_res@ks312256.kimsufi.com) has left #ceph
[10:00] * Wizeon (~Vidi@2BLAAFXSI.tor-irc.dnsbl.oftc.net) Quit ()
[10:00] * karimb (~kboumedhe@publico1.tid.es) has joined #ceph
[10:00] * Altitudes (~elt@edwardsnowden0.torservers.net) has joined #ceph
[10:02] * swami2 (~swami@116.75.99.14) Quit (Quit: Leaving.)
[10:03] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Remote host closed the connection)
[10:06] * vbellur (~vijay@121.244.87.117) Quit (Ping timeout: 480 seconds)
[10:10] * oro (~oro@2001:620:20:16:210b:c579:3082:12fb) has joined #ceph
[10:10] * fridim_ (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[10:11] * PaulC (~paul@122-60-36-115.jetstream.xtra.co.nz) Quit (Ping timeout: 480 seconds)
[10:11] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:15] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[10:20] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Remote host closed the connection)
[10:21] * vbellur (~vijay@121.244.87.124) has joined #ceph
[10:25] * swami1 (~swami@116.75.99.14) has joined #ceph
[10:29] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[10:30] * Altitudes (~elt@2BLAAFXTT.tor-irc.dnsbl.oftc.net) Quit ()
[10:30] * sese_ (~Kakeru@chomsky.torservers.net) has joined #ceph
[10:32] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[10:32] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[10:33] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[10:35] * ChrisNBlum (~ChrisNBlu@dhcp-ip-230.dorf.rwth-aachen.de) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[10:36] <nils_> trying to work my way through the ansible scripts
[10:36] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[10:37] <nils_> now that I have added journal devices
[10:40] <Bosse> baffle: can you go to pastebin and put in the output of 'ceph -s; ceph health detail' and 'ceph pg <pg-id> query' on one of the stuck pgs? that may help us help you. you can find troubleshooting information at http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
[10:41] <baffle> Bosse: Great, I'll do that immediately. I've been reading lots of documentation and mailing lists already..
[10:42] <baffle> Bosse: http://pastebin.com/3ixNd9aG <- health
[10:46] <baffle> Bosse: http://pastebin.com/iWzgFNbH <- pg query .. Uhh.. But now I noticed that querying two of the PGs just hangs as well..
[10:47] <nils_> it seems that ceph-disk-prepare will try to create a journal partition, this will of course not work when the drive in question also hosts the operating system (since it can't re-read the partition table then)
[10:50] <baffle> Bosse: What happened (as far as I know) is that there was a network outage between ceph nodes. Followed by power outages, first on one node, then while that booted, the two other nodes lost power. Major power fluctuations (physical issues with multiple ATSes it seems) which killed two drives, osd.0 and osd.17. These were marked as lost, and min_size was set to 1.. This recovered and backfilled most PGs, except for these 5.
[10:51] <Bosse> baffle: check the logs for osd 9, 18 and 19 (for pg 6.399) and see what you find. you can try to restart those osds (one at a time) and see if the issue clears. can you also pastebin the output of 'ceph osd tree' so that we can see your spread?
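
  Collected in one place, the diagnostics Bosse asks for might look like this (pg 6.399 and osd.9 come from the message above; the restart line assumes Ubuntu's upstart init):

      ceph -s
      ceph health detail
      ceph pg 6.399 query
      ceph osd tree
      sudo restart ceph-osd id=9   # restart one suspect OSD at a time, on its own host
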
[10:51] <fghaas> nils_: ceph osd prepare generally expects that it has a full disk at its disposal, which is also a standard design pattern (1 disk, and hence one filesystem, per OSD instance)
[10:51] <nils_> fghaas: well I want to host the journal on SSDs instead
[10:52] <fghaas> nils_: ceph-deploy osd create <host>:<dev>:<journal dev>
[10:52] <fghaas> http://ceph.com/docs/master/rados/deployment/ceph-deploy-osd/
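
  A hedged sketch of that form, with the journal on a separate SSD (the host and device names are placeholders):

      ceph-deploy osd create node1:sdb:sdc        # whole SSD used as the journal device
      ceph-deploy osd create node1:sdb:/dev/sdc1  # or a pre-made journal partition on a shared SSD
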
[10:52] <baffle> Bosse: http://pastebin.com/XBV6HeJL
[10:53] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[10:54] * madkiss (~madkiss@2001:6f8:12c3:f00f:d863:71cb:876d:f4ee) Quit (Read error: Connection reset by peer)
[10:54] <baffle> Bosse: Oh, yeah, then we temporarily recreated osd.0 + osd.17 as blank OSDs, because the pg will wait for those to come back even if they are removed from crush map and marked as lost. :-/
[10:54] * thomnico (~thomnico@82.166.93.197) has joined #ceph
[10:55] * thomnico (~thomnico@82.166.93.197) Quit ()
[10:55] * thomnico_ (~thomnico@82.166.93.197) Quit (Quit: Ex-Chat)
[10:55] * thomnico (~thomnico@82.166.93.197) has joined #ceph
[10:56] <nils_> fghaas: isn't that basically calling ceph-disk prepare etc. so I'll run into that exact problem?
[10:57] <fghaas> nils_: afaik you can create your own partition beforehand and then run ceph-deploy osd create <host>:<partition>:<journal partition>
[10:57] <nils_> fghaas: yeah that should work
[10:57] <fghaas> but repeat after me: "I don't want to do that".
[10:58] * madkiss (~madkiss@ip5b418369.dynamic.kabel-deutschland.de) has joined #ceph
[10:59] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[11:00] * sese_ (~Kakeru@2BLAAFXU2.tor-irc.dnsbl.oftc.net) Quit ()
[11:00] * ulterior (~xolotl@aurora.enn.lu) has joined #ceph
[11:01] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[11:02] * LeaChim (~LeaChim@host86-159-234-113.range86-159.btcentralplus.com) has joined #ceph
[11:03] * davidz (~davidz@cpe-23-242-189-171.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[11:05] * cooldharma06 (~chatzilla@14.139.180.52) Quit (Quit: ChatZilla 0.9.91.1 [Iceweasel 21.0/20130515140136])
[11:06] <nils_> fghaas: so better to move the OS to yet another device and use the full SSD for journal partitions?
[11:09] <nils_> LVM doesn't work either
[11:09] * vbellur (~vijay@121.244.87.124) Quit (Ping timeout: 480 seconds)
[11:12] <nils_> still have some SATA ports and some crappy SSD left but it's a bit ridiculous
[11:14] * thomnico (~thomnico@82.166.93.197) Quit (Ping timeout: 480 seconds)
[11:20] * vbellur (~vijay@121.244.87.117) has joined #ceph
[11:21] * davidz (~davidz@2605:e000:1313:8003:9468:af06:b5d6:f557) has joined #ceph
[11:29] * kefu (~kefu@114.92.100.153) Quit (Max SendQ exceeded)
[11:30] * ulterior (~xolotl@1CIAAGRUD.tor-irc.dnsbl.oftc.net) Quit ()
[11:30] * lmg (~Deiz@2BLAAFXWY.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:33] * sh (~sh@2001:6f8:1337:0:6059:4d33:2454:d16d) Quit (Ping timeout: 480 seconds)
[11:34] * sh (~sh@2001:6f8:1337:0:7d2a:c1e7:ece9:3542) has joined #ceph
[11:36] * kefu (~kefu@114.92.100.153) has joined #ceph
[11:38] * cronix1 (~cronix@5.199.139.166) Quit (Ping timeout: 480 seconds)
[11:43] * garphy`aw is now known as garphy
[11:58] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[12:00] * lmg (~Deiz@2BLAAFXWY.tor-irc.dnsbl.oftc.net) Quit ()
[12:01] * Pieman (~Quatrokin@2BLAAFXX1.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:01] * grin (grin@h0000-0000-0000-1001.n0200.n0002.ipv6.tolna.net) has joined #ceph
[12:02] <grin> good localtime()
[12:03] <grin> what's up here?
[12:07] <baffle> Any Inktankers that happen to know where to call to get some professional emergency troubleshooting?
[12:09] * zack_dolby (~textual@e0109-49-132-41-178.uqwimax.jp) has joined #ceph
[12:11] * ghost1 (~pablodelg@107-208-117-140.lightspeed.miamfl.sbcglobal.net) has joined #ceph
[12:16] * lucas1 (~Thunderbi@218.76.52.64) Quit (Quit: lucas1)
[12:17] <kevinkevin-work> baffle: is your pg still down?
[12:18] <kevinkevin-work> had a similar one, last week. Re-created my cloud images from scratch, renaming the faulty pool and creating a new one (pgs being pool-dependent)
[12:18] * krypto (~oftc-webi@hpm01cs006-ext.asiapac.hp.net) Quit (Quit: Page closed)
[12:18] <baffle> kevinkevin-work: Yeah.
[12:19] <baffle> kevinkevin-work: A bit too much data in the pool to actually do that.. :)
[12:19] <kevinkevin-work> I still have my pg down, I've been connected here for 8 days, and still have no clue on how to solve it
[12:19] * ghost1 (~pablodelg@107-208-117-140.lightspeed.miamfl.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[12:20] <kevinkevin-work> Tried to re-create the down pg, it goes to creating, then eventually to 'incomplete', ... haven't seen it active since then
[12:20] <baffle> kevinkevin-work: Same here..
[12:20] <kevinkevin-work> well. Point being: I've asked about this a few times already, no one answered so far.
[12:21] <baffle> kevinkevin-work: Did you try to get hold of anyone in Inktank?
[12:21] <kevinkevin-work> nop. just reading docs, and irc, no mail, ...
[12:22] <kevinkevin-work> Well, if we had support agreements, I guess we wouldn't be here, ...
[12:23] * tupper (~tcole@108-83-203-37.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[12:28] <Kingrat> just curious, but what os/kernel version are both of you using? i had something similar happen on proxmox with the 3.10 kernel after one host locked up
[12:29] <Kingrat> had some osds on other hosts get screwed up while the one machine was in a kernel-panicked, kinda-running state
[12:29] <kraken> http://i.imgur.com/fH9e2.gif
[12:30] * Pieman (~Quatrokin@2BLAAFXX1.tor-irc.dnsbl.oftc.net) Quit ()
[12:30] * Pettis (~Vale@tor-exit.server9.tvdw.eu) has joined #ceph
[12:31] <grin> I am just writing up a detailed account of a first-time install; it was weird that everything osd related was stalled, pg's looking dumb, then I changed the default CRUSH map to a hierarchical one, using 1.00 instead of 0.00 weights etc, and when I injected it the whole frozen heap came alive and fixed itself in 5 seconds. It was weird. But I'm unfamiliar with the system.
[12:32] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[12:32] <kevinkevin-work> ubuntu 14.04, 3.13.0-43
[12:33] <grin> I did nothing special, just moved some stuff around the map and it changed all incomplete and stuck stuff to completed
[12:34] <grin> [and got lots of kernel panic below v3.16]
[12:34] <kraken> http://i.imgur.com/WS4S2.gif
[12:40] * overclk (~overclk@121.244.87.117) Quit (Quit: Leaving)
[12:46] * thomnico (~thomnico@82.166.93.197) has joined #ceph
[12:46] * swami1 (~swami@116.75.99.14) Quit (Ping timeout: 480 seconds)
[12:53] * marrusl (~mark@cpe-24-90-46-248.nyc.res.rr.com) Quit (Remote host closed the connection)
[12:54] * tserong (~tserong@203-173-33-52.dyn.iinet.net.au) has joined #ceph
[12:55] * amote (~amote@121.244.87.116) Quit (Quit: Leaving)
[12:59] * sjm (~sjm@pool-98-109-11-113.nwrknj.fios.verizon.net) has joined #ceph
[13:00] * Pettis (~Vale@3N2AABC1S.tor-irc.dnsbl.oftc.net) Quit ()
[13:05] * Hell_Fire_ (~hellfire@123-243-155-184.static.tpgi.com.au) Quit (Ping timeout: 480 seconds)
[13:21] * kanagaraj (~kanagaraj@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:21] * zack_dolby (~textual@e0109-49-132-41-178.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[13:24] * ChrisNBlum (~textual@150-004.eduroam.rwth-aachen.de) has joined #ceph
[13:25] * karnan (~karnan@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:28] * garphy is now known as garphy`aw
[13:31] * ggg1 (~slowriot@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[13:35] * garphy`aw is now known as garphy
[13:37] * thomnico (~thomnico@82.166.93.197) Quit (Quit: Ex-Chat)
[13:37] * thomnico (~thomnico@82.166.93.197) has joined #ceph
[13:37] * swami1 (~swami@116.75.99.14) has joined #ceph
[13:40] * dneary (~dneary@96.237.180.105) has joined #ceph
[13:42] * diegows (~diegows@190.190.5.238) has joined #ceph
[13:44] * jdillaman_afk (~jdillaman@pool-108-56-67-212.washdc.fios.verizon.net) Quit (Quit: jdillaman_afk)
[13:45] * treaki (~treaki@p5B031148.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[13:45] <anorak> Hi all. Question: If you deploy all your OSDs on a single physical server, would ceph health WARN you? Seems to be the case with me.
[13:46] <anorak> i first deployed the ceph cluster in my virtual environment following the same steps but had two hosts configured...there my health was reported as OK
[13:46] <anorak> my health = server's health :)
[13:47] * vbellur (~vijay@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:47] <grin> anorak, what did health say?
[13:47] * marrusl (~mark@cpe-24-90-46-248.nyc.res.rr.com) has joined #ceph
[13:47] <anorak> grin: health HEALTH_WARN 25 pgs degraded; 12 pgs stuck degraded; 300 pgs stuck unclean; 12 pgs stuck undersized; 25 pgs undersized
[13:48] <anorak> pg size =300 ...i know it is supposed to be a power of 2 (256) but had the same results as 300
[13:49] <grin> anorak, my guess would be it isn't because it's on one node, but I'm new to ceph and I just mentioned that I'd seen something similar which was resolved when I changed something (unrelated?) in the CRUSH map
[13:49] <Be-El> anorak: do you use one physical server hosting several osd, or one physical server hosting virtual machines with osds?
[13:49] <anorak> one physical server hosting different OSDs
[13:49] <grin> but I didn't get undersized, which looks like something completely different problem
[13:50] * swami1 (~swami@116.75.99.14) Quit (Quit: Leaving.)
[13:50] <grin> ceph osd stat
[13:50] <Be-El> anorak: the default crush ruleset tries to distribute placement group replicas across hosts
[13:50] <anorak> with 256 i had ---> health HEALTH_WARN 56 pgs degraded; 256 pgs stuck unclean; 56 pgs undersized; pool rbd pg_num 256 > pgp_num 64
[13:50] <Be-El> anorak: with the default replica size of 3 you need three hosts to have a valid setup
[13:50] * dneary (~dneary@96.237.180.105) Quit (Ping timeout: 480 seconds)
[13:51] <grin> quickstart mentions that you should reduce the numbers for two-host setup
[13:51] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[13:51] <anorak> Be-El/grin: so this implies that if the replica size is 1 (for testing), ceph should have no problem with it. correct?
[13:51] <Be-El> anorak: that should be the case, yes
[13:51] * garphy is now known as garphy`aw
[13:52] <anorak> alright.let me give it a try and will get back to you both. Thanks!
[13:52] <Be-El> anorak: you need to set 'size' and 'min_size' for the pool in question
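
  For a throwaway single-host test that usually means something like this (the pool name rbd is taken from the health output above; none of this is for production):

      ceph osd pool set rbd size 1
      ceph osd pool set rbd min_size 1
      # new pools can default to this via "osd pool default size = 1" in ceph.conf
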
[13:57] * kanagaraj (~kanagaraj@121.244.87.117) has joined #ceph
[13:58] * KevinPerks (~Adium@cpe-071-071-026-213.triad.res.rr.com) has joined #ceph
[13:58] * garphy`aw is now known as garphy
[14:00] * treaki (~treaki@p5B031148.dip0.t-ipconnect.de) has joined #ceph
[14:00] * ggg1 (~slowriot@3N2AABC31.tor-irc.dnsbl.oftc.net) Quit ()
[14:00] * n0x1d (~geegeegee@3N2AABC4Z.tor-irc.dnsbl.oftc.net) has joined #ceph
[14:02] * georgem (~Adium@184.151.178.33) has joined #ceph
[14:02] * cronix1 (~cronix@5.199.139.166) has joined #ceph
[14:09] <anorak> Be-El: Thanks! That did the trick :)
[14:09] <Be-El> anorak: you can also use a different crush ruleset that distributes replicas across osds instead of hosts
[14:10] <Be-El> but do not use either setup in production ;-)
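
  A sketch of that alternative, assuming the stock rbd pool and the firefly-era command forms (the rule name is made up; pick up the new rule's id from the dump output):

      ceph osd crush rule create-simple replicate-by-osd default osd   # failure domain = osd
      ceph osd crush rule dump                                         # note the new rule's id
      ceph osd pool set rbd crush_ruleset <rule-id>
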
[14:11] <anorak> Be-El: Thanks for the tip! ;) Someone mentioned here that the documentation is a bit out-of-date....can you recommend a good place to get yourself familiar with the world of set, e.g. manipulating CRUSH sets, understanding these confusing terminologies (degraded, unclean etc)
[14:12] <anorak> world of CRUSH* set
[14:12] <Be-El> anorak: i think the latest documentation on the ceph homepage should be fine
[14:12] <anorak> Be-El: alrighty. Guess there is no place like home :D
[14:13] <Be-El> anorak: there's also a book about ceph which might be helpful (didn't have the time to have a look at it though)
[14:13] <anorak> Be-El: I found "Learning CEPH by Karan Singh" on Amazon
[14:14] <Be-El> that one, yes
[14:14] <Be-El> and no, i'm neither the author nor affiliated with him in any way ;-)
[14:14] <anorak> cool. Thanks again and see you around here later. :)
[14:14] <Be-El> you're welcome
[16:17] * sputnik13 (~sputnik13@c-73-193-97-20.hsd1.wa.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[14:17] * primechuck (~primechuc@173-17-128-36.client.mchsi.com) Quit (Remote host closed the connection)
[14:17] * danieljh (~daniel@0001b4e9.user.oftc.net) Quit (Quit: Lost terminal)
[14:20] * bkopilov (~bkopilov@nat-pool-tlv-t.redhat.com) Quit (Ping timeout: 480 seconds)
[14:21] * garphy is now known as garphy`aw
[14:22] * hybrid512 (~walid@195.200.167.70) has joined #ceph
[14:28] * garphy`aw is now known as garphy
[14:28] * squ (~Thunderbi@46.109.186.160) Quit (Quit: squ)
[14:28] * dgurtner_ (~dgurtner@178.197.228.42) has joined #ceph
[14:29] * ceph_endusr (~Charles@50-205-35-98-static.hfc.comcastbusiness.net) has joined #ceph
[14:30] * n0x1d (~geegeegee@3N2AABC4Z.tor-irc.dnsbl.oftc.net) Quit ()
[14:30] <ceph_endusr> Hi. If I have several OSDs still up and running, but a single monitor that is no longer recoverable, is it possible to re-create a monitor using the information from the osds?
[14:31] * Jones (~chrisinaj@chulak.enn.lu) has joined #ceph
[14:31] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[14:31] * dgurtner (~dgurtner@178.197.231.49) Quit (Ping timeout: 480 seconds)
[14:31] * hybrid512 (~walid@195.200.167.70) Quit (Quit: Leaving.)
[14:31] * hybrid512 (~walid@195.200.167.70) has joined #ceph
[14:33] * dyasny (~dyasny@198.251.59.151) has joined #ceph
[14:35] <ceph_endusr> i'm reading through this post: http://docs.deis.io/en/latest/managing_deis/recovering-ceph-quorum/ which implies that recovery should only be attempted if data loss is not considered catastrophic. I took a week off work and another 'administrator' started making modifications to the 2 servers that were running as monitors. Somehow, both monitors were lost and I'm hoping this can be recovered
[14:36] <ceph_endusr> in theory the OSDs might have the last version of the pg mapping and crush map before the mons died?
[14:37] <ceph_endusr> hmm this seems like a better place to start: http://ceph.com/docs/master/rados/configuration/mon-config-ref/
[14:37] * jfunk (~jfunk@2001:470:b:44d:7e7a:91ff:fee8:e80b) Quit (Quit: Konversation terminated!)
[14:37] * jfunk (~jfunk@2001:470:b:44d:7e7a:91ff:fee8:e80b) has joined #ceph
[14:40] <Be-El> ceph_endusr: i cannot give you a definite answer, since i'm just another ceph user. you might want to ask the question on the ceph user mailing list, or wait until most ceph developers active on the channel are awake and/or at work
[14:40] <fghaas> ceph_endusr: get your backups out with the last good mon store; no way to recover that information from just the OSDs
[14:40] <Be-El> ceph_endusr: but as far as i know there's no pgmap on the osds
[14:41] <ceph_endusr> Be-El: so it is likely that unless I can recover an original monitor, e.g. get it up and running w/ the same IP so the OSDs can find it, there will not be any way to recover?
[14:41] <Be-El> fghaas: i'm not sure whether backups will help, since they only reflect a point in time for the mons
[14:42] <Be-El> ceph_endusr: first of all, make a backup of the mon data directory if it is still available
[14:42] * vivcheri (~oftc-webi@72.163.220.9) has joined #ceph
[14:42] <fghaas> Be-El: yup, but just the OSDs definitely *won't* help
[14:42] <Be-El> fghaas: agreed
[14:42] * baffle (baffle@jump.stenstad.net) Quit (Read error: Connection reset by peer)
[14:42] <vivcheri> Hi fghaas and Be-El
[14:43] <Be-El> hi vivcheri
[14:43] <vivcheri> Good evening from India :)
[14:43] <ceph_endusr> okay. I will see if I can recover the data dir for the ceph-mon... Thanks.
[14:43] * baffle (baffle@jump.stenstad.net) has joined #ceph
[14:46] * baffle (baffle@jump.stenstad.net) Quit (Read error: Connection reset by peer)
[14:47] * baffle (baffle@jump.stenstad.net) has joined #ceph
[14:48] * baffle (baffle@jump.stenstad.net) Quit ()
[14:52] * vbellur (~vijay@122.166.171.30) has joined #ceph
[14:53] * SirJoolz (~joolz@216.233.37.188.rev.vodafone.pt) has joined #ceph
[14:55] <vivcheri> I have a 4 node ceph cluster and one of the nodes is down. This node has 3 osds on it: osd0, osd1, osd2. The error I got on the console was http://paste.ubuntu.com/10429108/. The node is completely inaccessible and the technician used a live cd to copy the contents of /home/, /var, /etc. I was going through the osd logs in /var/log/ceph (ceph-osd.0.log) and I am getting the following: http://paste.ubuntu.com/10429252/
[14:55] <grin> i guess sitting here in this channel is a bit similar to watching a disaster channel on the TV ;-)
[14:56] <ceph_endusr> i'm getting the impression the other admin has completely overwritten the original two monitors' disks by attempting automated redeployment processes...
[14:56] <grin> vivcheri, well your xfs isn't happy about hard io errors in metadata
[14:57] <ceph_endusr> which means lots of work may have been lost... which means i'm going to need to quit my job and go sell sandwiches on the corner.
[14:58] * ceph_endusr (~Charles@50-205-35-98-static.hfc.comcastbusiness.net) has left #ceph
[14:59] <vivcheri> grin: What could be a possible way to resolve it.
[14:59] <vivcheri> ?
[15:00] * Jones (~chrisinaj@1CIAAGR4S.tor-irc.dnsbl.oftc.net) Quit ()
[15:00] * visualne (~oftc-webi@158-147-148-234.harris.com) has joined #ceph
[15:01] <visualne> hello. I was wondering if someone could give us some help on an issue we are currently having with ceph. As of right now we are using ceph v61.2. Whenever we add a new OSD, for some reason another osd that was already in the cluster will go down.
[15:01] <vivcheri> Because one of my nodes is completely inaccessible, I am trying to understand if I need to re-install the node with osd0, osd1 and osd2, and if all the data is safely replicated to the other 3 nodes of this 4 node cluster.
[15:02] <vivcheri> I am just trying to understand the impact of removing the failed node from the cluster.
[15:02] <grin> vivcheri, only after you have resolved all the problems with its backing device may you try to mount it, and if that succeeds more or less, then unmount it again and xfs_repair it. I haven't had unpleasant experiences with xfs_repair, but others have sometimes told sad stories, so back up what you can anyway.
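
  A cautious sketch of that sequence (the mount point is the usual default and /dev/sdX1 is a placeholder; back up first, as grin says):

      umount /var/lib/ceph/osd/ceph-0   # or wherever that OSD's filesystem is mounted
      xfs_repair -n /dev/sdX1           # dry run: report problems without changing anything
      xfs_repair /dev/sdX1              # actual repair, only after backing up what you can
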
[15:03] <vivcheri> I have backed up /etc/,/home,/var,/root
[15:04] <grin> ceph_endusr, in that case it'd be worth writing to the mailing list and maybe looking around for other ways of communication :)
[15:07] <Be-El> .oO ( and i was just about to tell ceph_endusr how to recover data from the osds... )
[15:08] <vivcheri> Would just bringing down the node which has a problem cause an impact on the other 3 nodes ?
[15:08] <vivcheri> Would there be data loss if i bring down the node ?
[15:08] <Be-El> vivcheri: if the underlying device has physical errors, the osd process usually aborts
[15:09] * kefu (~kefu@114.92.100.153) Quit (Max SendQ exceeded)
[15:09] <vivcheri> Ok, and does this look like an XFS issue rather than a ceph issue ?
[15:10] <Be-El> visualne: adding osds results in backfilling, which is a cpu-, io- and network-intensive process. does the osd host have enough cpu resources for the osd(s)?
[15:10] <Be-El> vivcheri: that's how i would interpret it, yes
[15:10] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[15:11] * visualne (~oftc-webi@158-147-148-234.harris.com) Quit (Remote host closed the connection)
[15:11] * jluis (~joao@249.38.136.95.rev.vodafone.pt) Quit (Ping timeout: 480 seconds)
[15:11] * baffle (baffle@jump.stenstad.net) has joined #ceph
[15:12] * tupper (~tcole@108-83-203-37.lightspeed.rlghnc.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[15:12] * kanagaraj (~kanagaraj@121.244.87.117) Quit (Quit: Leaving)
[15:12] * visualne (~oftc-webi@158-147-148-234.harris.com) has joined #ceph
[15:12] <vivcheri> I am using ubuntu 14.04 server edition; there should be some logs in /var/log that point to this xfs error, right? Which logs would point to that error? I went through the logs in the backed-up /var on the node that went down and I could not notice any xfs issue logs.
[15:13] <vivcheri> Maybe I am missing something; kindly guide me.
[15:13] * georgem (~Adium@184.151.178.33) Quit (Quit: Leaving.)
[15:13] <Be-El> vivcheri: have a look at kern.log and syslog
[15:13] <vivcheri> Be-El: Thanks.
[15:14] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) has joined #ceph
[15:14] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[15:17] * jluis (~joao@249.38.136.95.rev.vodafone.pt) has joined #ceph
[15:17] * ChanServ sets mode +o jluis
[15:17] * nitti (~nitti@162.222.47.218) has joined #ceph
[15:18] <vivcheri> Be-El: I went through the kern.log and syslog; they do not show any xfs issues.
[15:18] <Be-El> it should show an issue with sdc
[15:19] <vivcheri> ok
[15:19] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[15:21] * DV__ (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[15:21] * SirJoolz (~joolz@216.233.37.188.rev.vodafone.pt) Quit (Remote host closed the connection)
[15:21] * tupper (~tcole@rtp-isp-nat1.cisco.com) has joined #ceph
[15:23] <stannum> hi folks, getting stuck with a problem: can't map an rbd image, dmesg says http://pastebin.com/ms7N2H5V
[15:25] <stannum> forgot to mangle mon addresses in 'ceph -s' output :)
[15:26] * primechuck (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[15:27] <kevinkevin-work> Could that be related to some unsynced clock?
[15:27] <kevinkevin-work> Have you tried ntpdate on the host mapping the rbd?
[15:28] <kevinkevin-work> oh: http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client
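
  The usual workarounds for the feature-set mismatch that article describes are either a newer client kernel or dropping the cluster's CRUSH tunables to something the old kernel understands, roughly as below (a sketch only; check the rebalancing impact before changing tunables on a live cluster):

      ceph osd crush tunables legacy   # revert to tunables that old kernel clients support
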
[15:28] * bitserker (~toni@77.231.177.241) Quit (Ping timeout: 480 seconds)
[15:28] * avozza (~avozza@a83-160-116-36.adsl.xs4all.nl) Quit (Remote host closed the connection)
[15:30] <vivcheri> Be-El: These are the logs: syslog, syslog.1, kern.log and kern.log.1
[15:30] * DV__ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[15:30] <vivcheri> http://paste.ubuntu.com/10429787/
[15:31] <vivcheri> Do you want me to check the logs older than these ?
[15:31] * lcurtis (~lcurtis@47.19.105.250) has joined #ceph
[15:33] * kefu (~kefu@114.92.100.153) has joined #ceph
[15:35] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[15:36] * Salamander_ (~cmrn@193.107.85.61) has joined #ceph
[15:40] <stannum> kevinkevin-work: there seems to be the problem described at http://cephnotes.ksperis.com
[15:41] * rljohnsn (~rljohnsn@c-73-15-126-4.hsd1.ca.comcast.net) has joined #ceph
[15:43] * vbellur (~vijay@122.166.171.30) Quit (Ping timeout: 480 seconds)
[15:45] <kevinkevin-work> what kernel is your client using?
[15:46] <championofcyrodi> (09:07:01 AM) Be-El: .oO ( and i was just about to tell ceph_endusr how to recover data from the osds... )
[15:46] <championofcyrodi> really?
[15:48] <championofcyrodi> Be-El: how is that possible?
[15:49] <Be-El> championofcyrodi: the names of the files in the osd directory structure contain the object identifier and the relative position of the chunk
[15:49] <Be-El> championofcyrodi: with the help of a script, lots of free storage and enough time and coffee you should be able to recover the objects
[15:50] <visualne> Be-El: Absolutely these have 24 processors
[15:51] <vivcheri> Be-El: I am not able to find the issue in the logs.
[15:51] <Be-El> championofcyrodi: it won't give you names or anything else to identify the object, but you might be able to do it with intrinsic methods (e.g. loopback mounting for vm images)
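
  A very rough sketch of that kind of search, assuming default filestore paths under /var/lib/ceph; the exact object file naming varies by version, so treat the pattern as an assumption:

      find /var/lib/ceph/osd/ceph-*/current -type f -name '*rbd*data*' | sort | head
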
[15:51] <vivcheri> Be-El: Do you want me to check the older logs ?
[15:52] <Be-El> vivcheri: if the host is still alive log in and check the filesystem (as grin has already mentioned before)
[15:53] <championofcyrodi> Be-El: Sounds like a pretty harsh learning experience. Thanks for the info. Still looking to recover the data directory from one of the original monitors. I'll be writing one heck of a blog about this if I'm ever able to get things recovered.
[15:54] <championofcyrodi> i guess the good news is, we have a new coffee machine
[15:54] <championofcyrodi> so i already have a good start.
[15:54] <Be-El> championofcyrodi: no, i got this information during being idle here on the channel some time ago
[15:55] * vbellur (~vijay@122.167.104.82) has joined #ceph
[15:55] <Be-El> championofcyrodi: maybe someone around already has a script for this (or it might be part of an inktank service contract)
[15:55] <championofcyrodi> i meant, it sounds like a pretty harsh learning experience for my team
[15:55] <championofcyrodi> which is myself and the other sysadmin
[15:55] <Be-El> championofcyrodi: the one you should apply a l.a.r.t. to first?
[15:55] <championofcyrodi> i logged on as ceph_endusr from the house
[15:57] <championofcyrodi> l.a.r.t?
[15:58] <Be-El> http://www.catb.org/jargon/html/L/LART.html
[15:59] <championofcyrodi> lol
[16:01] * bitserker (~toni@77.231.177.241) has joined #ceph
[16:02] * bitserker (~toni@77.231.177.241) Quit ()
[16:02] * bitserker (~toni@77.231.177.241) has joined #ceph
[16:02] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[16:04] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[16:05] * Salamander_ (~cmrn@2BLAAFX7E.tor-irc.dnsbl.oftc.net) Quit ()
[16:07] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has joined #ceph
[16:09] * cok (~chk@2a02:2350:18:1010:458b:ddde:571c:7f15) has left #ceph
[16:10] <vivcheri> Thanks Be-El
[16:10] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[16:10] * dmsimard_away is now known as dmsimard
[16:12] * sputnik13 (~sputnik13@74.202.214.170) has joined #ceph
[16:13] * bkopilov (~bkopilov@bzq-79-182-164-80.red.bezeqint.net) has joined #ceph
[16:14] * Manshoon (~Manshoon@208.184.50.131) has joined #ceph
[16:14] * jtang (~jtang@109.255.42.21) Quit (Ping timeout: 480 seconds)
[16:14] * kefu (~kefu@114.92.100.153) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[16:16] * kefu (~kefu@114.92.100.153) has joined #ceph
[16:17] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[16:18] * PerlStalker (~PerlStalk@162.220.127.20) has joined #ceph
[16:19] * jdillaman (~jdillaman@pool-108-56-67-212.washdc.fios.verizon.net) has joined #ceph
[16:20] * ChrisNBlum (~textual@150-004.eduroam.rwth-aachen.de) Quit (Ping timeout: 480 seconds)
[16:22] * DV__ (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[16:22] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[16:25] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[16:25] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[16:27] <visualne> What is the max number of monitors you can have in quorum in ceph?
[16:28] <jcsp> visualne: when you have N mons, you must have (N/2)+1 mons to form a quorum. So for 3 mons, you need at least 2, and (of course ) at most 3.
[16:28] <jcsp> people usually use 3, 5, maybe 7 mons. Creating more isn't usually useful.
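(As a quick illustration of the arithmetic jcsp describes, worked out for the common monitor counts; this is just the formula, not from the log.)

    # quorum needed for N monitors is floor(N/2) + 1
    for n in 3 5 7; do echo "$n mons -> quorum of $(( n / 2 + 1 ))"; done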
[16:34] * topro (~prousa@host-62-245-142-50.customer.m-online.net) Quit (Read error: Connection reset by peer)
[16:35] * w0lfeh (~ZombieL@edwardsnowden2.torservers.net) has joined #ceph
[16:38] * topro (~prousa@host-62-245-142-50.customer.m-online.net) has joined #ceph
[16:38] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[16:45] * DV (~veillard@2001:41d0:1:d478::1) has joined #ceph
[16:45] * pi (~pi@host-81-190-2-156.gdynia.mm.pl) Quit (Read error: Connection reset by peer)
[16:47] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[16:48] * reed (~reed@75-101-54-131.dsl.static.fusionbroadband.com) has joined #ceph
[16:48] * nils_ (~nils@doomstreet.collins.kg) Quit (Quit: Leaving)
[16:52] * ChrisNBlum (~ChrisNBlu@dhcp-ip-230.dorf.rwth-aachen.de) has joined #ceph
[16:54] * rljohnsn (~rljohnsn@c-73-15-126-4.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[16:55] * xarses_ (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:56] * elder (~elder@207.66.184.146) has joined #ceph
[16:56] * vata (~vata@208.88.110.46) has joined #ceph
[16:57] * Dasher (~oftc-webi@46.218.69.130) Quit (Quit: Page closed)
[16:58] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[17:01] * sage (~quassel@2605:e000:854d:de00:230:48ff:fed3:6786) Quit (Remote host closed the connection)
[17:01] * ChrisNBl_ (~ChrisNBlu@178.255.153.117) has joined #ceph
[17:02] * Manshoon_ (~Manshoon@208.184.50.130) has joined #ceph
[17:02] * sage (~quassel@cpe-76-95-230-100.socal.res.rr.com) has joined #ceph
[17:02] * ChanServ sets mode +o sage
[17:04] * alram (~alram@38.122.20.226) has joined #ceph
[17:04] * loicd (~loicd@cmd179.fsffrance.org) Quit (Ping timeout: 480 seconds)
[17:05] * w0lfeh (~ZombieL@3N2AABDCY.tor-irc.dnsbl.oftc.net) Quit ()
[17:05] * angdraug (~angdraug@c-50-174-102-105.hsd1.ca.comcast.net) has joined #ceph
[17:05] * ChrisNBlum (~ChrisNBlu@dhcp-ip-230.dorf.rwth-aachen.de) Quit (Ping timeout: 480 seconds)
[17:06] * sigsegv (~sigsegv@188.26.161.163) has joined #ceph
[17:09] * Manshoon (~Manshoon@208.184.50.131) Quit (Ping timeout: 480 seconds)
[17:09] * tupper (~tcole@rtp-isp-nat1.cisco.com) Quit (Ping timeout: 480 seconds)
[17:16] * thomnico (~thomnico@82.166.93.197) Quit (Ping timeout: 480 seconds)
[17:17] * visualne (~oftc-webi@158-147-148-234.harris.com) Quit (Remote host closed the connection)
[17:17] * amote (~amote@1.39.15.19) has joined #ceph
[17:18] * tupper (~tcole@rtp-isp-nat1.cisco.com) has joined #ceph
[17:20] * amote (~amote@1.39.15.19) Quit ()
[17:20] * zack_dolby (~textual@pa3b3a1.tokynt01.ap.so-net.ne.jp) has joined #ceph
[17:21] * nljmo (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) Quit (Quit: Textual IRC Client: www.textualapp.com)
[17:22] * kefu (~kefu@114.92.100.153) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[17:23] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[17:24] * xarses_ (~andreww@12.164.168.117) has joined #ceph
[17:32] * capri (~capri@212.218.127.222) Quit (Read error: Connection reset by peer)
[17:34] * joshd1 (~jdurgin@24-205-54-236.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[17:35] * TehZomB (~anadrom@1CIAAGSDZ.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:36] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has left #ceph
[17:37] * loicd (~loicd@cmd179.fsffrance.org) has joined #ceph
[17:42] * swami1 (~swami@116.75.99.14) has joined #ceph
[17:43] * angdraug (~angdraug@c-50-174-102-105.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[17:44] * danieljh (~daniel@0001b4e9.user.oftc.net) has joined #ceph
[17:44] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[17:45] * markl_ (~mark@knm.org) has joined #ceph
[17:46] * branto (~branto@nat-pool-brq-t.redhat.com) has left #ceph
[17:47] * loicd (~loicd@cmd179.fsffrance.org) Quit (Quit: quit)
[17:48] * loicd (~loicd@cmd179.fsffrance.org) has joined #ceph
[17:50] * kashyap (~kashyap@121.244.87.116) has joined #ceph
[17:51] * kefu (~kefu@114.92.100.153) has joined #ceph
[17:53] * Rickus (~Rickus@office.protected.ca) has joined #ceph
[17:55] <vivcheri> Thanks Be-El, grin for your assistance.
[17:55] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[17:55] <kashyap> I've never setup Ceph before, where does Ceph store disk images?
[17:56] <kashyap> e.g. when setup correctly w/ OpenStack, where is it _supposed_ to store them
[17:57] * kefu (~kefu@114.92.100.153) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[17:57] * thb (~me@0001bd58.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:57] <vivcheri> kashyap: you can store glance images in ceph. It will get stored in a ceph pool.
[17:57] * bandrus (~brian@50.23.115.87-static.reverse.softlayer.com) has joined #ceph
[17:57] <vivcheri> kashyap: Need to get some sleep. Bye
[17:57] * jclm (~jclm@209.49.224.62) has joined #ceph
[17:57] * vivcheri (~oftc-webi@72.163.220.9) Quit (Quit: Page closed)
[17:57] <kashyap> Thanks, but I'm wondering about the default location of this "ceph pool"
[17:58] * ircolle (~ircolle@38.122.20.226) has joined #ceph
[17:58] * Manshoon_ (~Manshoon@208.184.50.130) Quit (Ping timeout: 480 seconds)
[17:58] <kashyap> Andreas-IPO, these are not Glance images. I'm talking about Nova instances.
[17:59] <kashyap> (Err, that was "And", IRC client badly did a tab complete. Sorry, Andreas.)
[18:00] <Andreas-IPO> nps =)
[18:03] * puffy (~puffy@50.185.218.255) Quit (Quit: Leaving.)
[18:04] * rljohnsn (~rljohnsn@ns25.8x8.com) has joined #ceph
[18:05] * TehZomB (~anadrom@1CIAAGSDZ.tor-irc.dnsbl.oftc.net) Quit ()
[18:05] * Gecko1986 (~Chrissi_@83.149.124.136) has joined #ceph
[18:07] * rljohnsn1 (~rljohnsn@ns25.8x8.com) has joined #ceph
[18:07] * rljohnsn (~rljohnsn@ns25.8x8.com) Quit (Read error: Connection reset by peer)
[18:07] * thomnico (~thomnico@bzq-218-90-50.red.bezeqint.net) has joined #ceph
[18:07] <jcsp> kashyap: Ceph OSD daemons each have a local store, usually on a dedicated block device. The images you store in a ceph pool are scattered across the OSDs. See: http://docs.ceph.com/docs/master/architecture/
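(A hedged illustration of what jcsp describes; the pool and object names here are made-up examples, and the data directory path is the usual default.)

    # ask the cluster which placement group and OSDs a given object maps to
    ceph osd map rbd some-object-name
    # each OSD keeps its share of objects under its own data directory, e.g.
    ls /var/lib/ceph/osd/ceph-0/current/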
[18:09] <kashyap> jcsp, Thank you, I'll investigate.
[18:09] * rljohnsn1 (~rljohnsn@ns25.8x8.com) Quit ()
[18:11] * swami1 (~swami@116.75.99.14) Quit (Quit: Leaving.)
[18:14] * oro (~oro@2001:620:20:16:210b:c579:3082:12fb) Quit (Ping timeout: 480 seconds)
[18:18] * Manshoon (~Manshoon@166.170.34.92) has joined #ceph
[18:20] * cholcombe973 (~chris@pool-108-42-144-175.snfcca.fios.verizon.net) has joined #ceph
[18:22] * vasu (~vasu@38.122.20.226) has joined #ceph
[18:27] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) has joined #ceph
[18:32] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Remote host closed the connection)
[18:33] * mykola (~Mikolaj@91.225.202.26) has joined #ceph
[18:34] <scuttlemonkey> joshd: around?
[18:35] * Gecko1986 (~Chrissi_@1CIAAGSFE.tor-irc.dnsbl.oftc.net) Quit ()
[18:36] * hb9xar__ (ident@easytux.ch) has joined #ceph
[18:36] * Manshoon (~Manshoon@166.170.34.92) Quit (Remote host closed the connection)
[18:38] * derjohn_mob (~aj@88.128.80.204) has joined #ceph
[18:39] * karimb (~kboumedhe@publico1.tid.es) Quit (Quit: Leaving)
[18:41] <championofcyrodi> Be-El: looks like the monitor data is not available. I'm now digging into the osd data directory and have discovered: /var/lib/ceph/osd/ceph-[\d]/current
[18:41] <championofcyrodi> which contains all of the 4/8 MB chunks, per node.
[18:41] <championofcyrodi> per osd* rather
[18:42] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[18:42] <championofcyrodi> so in this paste: http://pastebin.com/d8Kk38NM
[18:43] * fghaas (~florian@91-119-130-192.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[18:43] * Nacer_ (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[18:43] <championofcyrodi> for e.g. i see udata.228174b0dc51.xxxxxx and udata.229283a176da2.xxxxxx
[18:43] * fghaas (~florian@zid-vpnn072.uibk.ac.at) has joined #ceph
[18:44] <championofcyrodi> i guess the hex id between 'udata.' and the longer hex ids 0000000...__head is specific to an rbd image that i had
[18:44] * Nacer_ (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[18:44] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[18:45] <championofcyrodi> so if i traverse all osd nodes for something like udata.228174b0dc51.+ and copy those 4/8MB blocks to a central location... i may be able to dd them back together and create the original image?
[18:45] <championofcyrodi> (using a script of course)
[18:45] <Be-El> championofcyrodi: that's the idea, yes
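(A rough, untested sketch of that idea, under the assumptions noted in the comments: the prefix is just the example id from the paste above, 4 MB is the default rbd object size, the OSD data dirs or copies of them are reachable from one machine, and replica copies are skipped by remembering offsets already seen.)

    #!/bin/bash
    PREFIX=228174b0dc51                 # rbd image id taken from the object names
    CHUNK=$((4 * 1024 * 1024))          # default rbd object size (4 MB)
    OUT=recovered.img
    declare -A seen
    find /var/lib/ceph/osd/ceph-*/current -type f -name "*udata.${PREFIX}.*" |
    while read -r f; do
        # the hex field after the image id in the file name is the object index
        idx=$(basename "$f" | sed -n "s/.*udata\.${PREFIX}\.\([0-9a-f]*\).*/\1/p")
        [ -z "$idx" ] && continue
        [ -n "${seen[$idx]}" ] && continue      # replica of a chunk we already copied
        seen[$idx]=1
        # place the chunk at offset (index * 4 MB) in the output image
        dd if="$f" of="$OUT" bs=$CHUNK seek=$((16#$idx)) conv=notrunc 2>/dev/null
    done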
[18:46] * swami1 (~swami@116.75.99.14) has joined #ceph
[18:46] <Be-El> you'll probably find more than one copy for each chunk (-> replicas)
[18:46] * derjohn_mobi (~aj@88.128.80.175) has joined #ceph
[18:46] <championofcyrodi> hmmm... any way to determine replicas?
[18:47] <Be-El> the script has to maintain a list of already processed chunks
[18:47] <championofcyrodi> like, i could keep a running HashMap while traversing, and ignore ones that already exist?
[18:47] <joshd> scuttlemonkey: yes
[18:47] <championofcyrodi> ah, so md5 each chunk?
[18:48] <Be-El> championofcyrodi: i assume that the cluster was in a healthy state and the chunks are identical
[18:48] <scuttlemonkey> joshd: excellent, starting to make me sweat that you forgot about your thingamawhoosit in 10m :)
[18:48] <championofcyrodi> yes, i had 0 degraded obs
[18:48] <championofcyrodi> when the mons went away
[18:48] <joshd> scuttlemonkey: sorry to make you sweat. just a bit of an early morning for me
[18:48] <scuttlemonkey> hehe no worries
[18:49] * jwilkins (~jwilkins@38.122.20.226) has joined #ceph
[18:49] <scuttlemonkey> ** Ceph Tech Talk on RBD w/ Joshd in 10m ** -- http://ceph.com/ceph-tech-talks/
[18:49] * gregmark (~Adium@68.87.42.115) has joined #ceph
[18:50] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Ping timeout: 480 seconds)
[18:52] * derjohn_mob (~aj@88.128.80.204) Quit (Ping timeout: 480 seconds)
[18:52] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[18:53] * garphy is now known as garphy`aw
[18:56] * fghaas (~florian@zid-vpnn072.uibk.ac.at) Quit (Ping timeout: 480 seconds)
[18:56] * kanagaraj (~kanagaraj@27.7.37.237) has joined #ceph
[18:56] * nljmo (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) has joined #ceph
[18:59] * karnan (~karnan@106.51.234.138) has joined #ceph
[18:59] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[19:00] * alram (~alram@38.122.20.226) has joined #ceph
[19:00] * cookednoodles (~eoin@89-93-153-201.hfc.dyn.abo.bbox.fr) has joined #ceph
[19:01] * jordanP (~jordan@scality-jouf-2-194.fib.nerim.net) Quit (Quit: Leaving)
[19:02] * hellertime (~Adium@a23-79-238-10.deploy.static.akamaitechnologies.com) has joined #ceph
[19:04] * Sysadmin88 (~IceChat77@2.125.213.8) has joined #ceph
[19:06] * skrblr (~nartholli@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[19:08] * Manshoon (~Manshoon@208.184.50.131) has joined #ceph
[19:10] * ChrisNBl_ (~ChrisNBlu@178.255.153.117) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[19:11] <championofcyrodi> Be-El: can i ignore current/5.3b_TEMP type directories in the osd data?
[19:11] <Be-El> championofcyrodi: i've no clue
[19:11] <championofcyrodi> lol, alright. thanks
[19:11] * fghaas (~florian@91-119-130-192.dynamic.xdsl-line.inode.at) has joined #ceph
[19:12] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) Quit (Ping timeout: 480 seconds)
[19:12] <Be-El> well, time to call it a day
[19:13] <championofcyrodi> df -h | grep TEMP shows 0 bytes in all TEMP folders.
[19:13] <Be-El> championofcyrodi: good luck with the data restoration
[19:13] * thb (~me@2a02:2028:290:6dc1:b973:adb6:8ca6:702d) has joined #ceph
[19:13] <championofcyrodi> thanks Be-El!
[19:13] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) Quit (Remote host closed the connection)
[19:13] * vilobhmm (~vilobhmm@nat-dip33-wl-g.cfw-a-gci.corp.yahoo.com) has joined #ceph
[19:16] * Concubidated (~Adium@71.21.5.251) has joined #ceph
[19:19] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[19:22] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Remote host closed the connection)
[19:23] * dgurtner_ (~dgurtner@178.197.228.42) Quit (Read error: Connection reset by peer)
[19:24] * rmoe (~quassel@173-228-89-134.dsl.static.fusionbroadband.com) Quit (Ping timeout: 480 seconds)
[19:24] * swami1 (~swami@116.75.99.14) Quit (Quit: Leaving.)
[19:25] * Manshoon_ (~Manshoon@208.184.50.130) has joined #ceph
[19:28] * bitserker (~toni@77.231.177.241) Quit (Quit: Leaving.)
[19:30] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Remote host closed the connection)
[19:30] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[19:30] <cmdrk_> if I want to change the nearfull/full ratios, do I just add "mon osd full ratio" and "mon osd nearfull ratio" to [global] or [mon.N] ?
[19:31] <cmdrk_> also can i do it with an --injectargs into the mon?
[19:32] * Manshoon (~Manshoon@208.184.50.131) Quit (Ping timeout: 480 seconds)
[19:33] <cmdrk_> http://fpaste.org/191049/42497557/ looks like it works maybe? hrm
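(For reference, a hedged sketch of both approaches cmdrk_ is asking about; the values shown are simply the defaults.)

    # ceph.conf:
    [global]
        mon osd full ratio = .95
        mon osd nearfull ratio = .85

    # or injected into the running mons:
    ceph tell mon.* injectargs '--mon-osd-full-ratio 0.95 --mon-osd-nearfull-ratio 0.85'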
[19:33] * vbellur (~vijay@122.167.104.82) Quit (Ping timeout: 480 seconds)
[19:34] * rmoe (~quassel@12.164.168.117) has joined #ceph
[19:35] * skrblr (~nartholli@3N2AABDKJ.tor-irc.dnsbl.oftc.net) Quit ()
[19:36] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[19:37] * DV (~veillard@2001:41d0:1:d478::1) Quit (Remote host closed the connection)
[19:39] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[19:39] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Ping timeout: 480 seconds)
[19:40] * Jourei (~Gibri@3N2AABDMN.tor-irc.dnsbl.oftc.net) has joined #ceph
[19:41] * karnan (~karnan@106.51.234.138) Quit (Ping timeout: 480 seconds)
[19:41] * kanagaraj (~kanagaraj@27.7.37.237) Quit (Quit: Leaving)
[19:46] * Miouge (~Miouge@94.136.92.20) has joined #ceph
[19:50] * derjohn_mobi (~aj@88.128.80.175) Quit (Ping timeout: 480 seconds)
[19:51] * moore (~moore@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Remote host closed the connection)
[19:54] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[19:57] * hb9xar__ (ident@easytux.ch) Quit (Ping timeout: 480 seconds)
[20:06] * hb9xar__ (ident@easytux.ch) has joined #ceph
[20:07] <bdonnahue> is anyone using ceph with dmcrypt?
[20:10] * Jourei (~Gibri@3N2AABDMN.tor-irc.dnsbl.oftc.net) Quit ()
[20:10] * toast (~pepzi@5.196.105.229) has joined #ceph
[20:16] * i_m (~ivan.miro@deibp9eh1--blueice4n2.emea.ibm.com) Quit (Ping timeout: 480 seconds)
[20:17] * joshd (~jdurgin@38.122.20.226) Quit (Ping timeout: 480 seconds)
[20:17] * hb9xar__ (ident@easytux.ch) Quit (Ping timeout: 480 seconds)
[20:20] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[20:25] * lalatenduM (~lalatendu@122.167.151.186) Quit (Quit: Leaving)
[20:28] * hb9xar__ (ident@easytux.ch) has joined #ceph
[20:28] * joshd (~jdurgin@38.122.20.226) has joined #ceph
[20:29] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[20:30] * SirJoolz (~sirjoolz@216.233.37.188.rev.vodafone.pt) has joined #ceph
[20:30] * arbrandes (~arbrandes@179.111.2.252) has joined #ceph
[20:31] * SirJoolz (~sirjoolz@216.233.37.188.rev.vodafone.pt) Quit (Remote host closed the connection)
[20:31] * nitti (~nitti@162.222.47.218) Quit (Quit: Leaving...)
[20:31] * kevinkevin-work (6dbebb8f@107.161.19.109) Quit (Remote host closed the connection)
[20:32] * kevinkevin-work (6dbebb8f@107.161.19.109) has joined #ceph
[20:33] * arbrandes_ (~arbrandes@191.8.36.246) has joined #ceph
[20:36] * nitti (~nitti@162.222.47.218) has joined #ceph
[20:36] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[20:39] * arbrandes (~arbrandes@179.111.2.252) Quit (Ping timeout: 480 seconds)
[20:40] * toast (~pepzi@2BLAAFYMV.tor-irc.dnsbl.oftc.net) Quit ()
[20:40] * spidu_ (~Dinnerbon@108.61.210.123) has joined #ceph
[20:43] * arbrandes_ is now known as arbrandes
[20:46] * RayTracer (~RayTracer@host-81-190-2-156.gdynia.mm.pl) has joined #ceph
[20:49] <RayTracer> Hi all! I'm about to increase pg_num for our pool in the cluster. I want to adjust some variables for that time by injectargs to our osds using: `ceph tell osd.* injectargs --osd_recovery_op_priority 1 --osd_recovery_max_active 4 --osd_max_backfills 4`. What do you think? Is it wise to use injectargs with * for all osds at the same time?
[20:51] <RayTracer> I also used this injectargs command on our staging environment. And for some time (a couple of seconds) i got a pipe().fault from the mon, but there is only one mon (with 2 osds) so maybe that is the cause.
[20:51] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) has joined #ceph
[21:01] * kevinkevin-work (6dbebb8f@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[21:02] * vasu (~vasu@38.122.20.226) Quit (Ping timeout: 480 seconds)
[21:03] <RayTracer> Ok i see now what was wrong. My mon on staging has some error itself and injectargs should be in ticks '' :]
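(i.e. the quoted form RayTracer arrived at would look something like:)

    ceph tell osd.* injectargs '--osd_recovery_op_priority 1 --osd_recovery_max_active 4 --osd_max_backfills 4'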
[21:05] <soren> I have a system disk that accidentally got added to my ceph cluster, so now I want to remove it. I've taken it out, but now I have this: HEALTH_WARN 13 pgs degraded; 13 pgs stuck degraded; 13 pgs stuck unclean; 13 pgs stuck undersized; 13 pgs undersized; recovery 83/161874 objects degraded (0.051%)
[21:05] <soren> It's been stuck that way for many hours now.
[21:07] * ircolle (~ircolle@38.122.20.226) Quit (Ping timeout: 480 seconds)
[21:07] <soren> Am I misunderstanding this process? I'm following http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual and am expecting the cluster to reach a healthy state before I stop the osd.
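(For context, the manual removal procedure in that doc is roughly the following; the osd id 12 is a placeholder.)

    ceph osd out 12            # mark it out and wait for rebalancing / HEALTH_OK
    # stop the ceph-osd daemon on its host, then remove it from the cluster:
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12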
[21:08] * vilobhmm (~vilobhmm@nat-dip33-wl-g.cfw-a-gci.corp.yahoo.com) Quit (Read error: Connection reset by peer)
[21:10] * spidu_ (~Dinnerbon@2BLAAFYOC.tor-irc.dnsbl.oftc.net) Quit ()
[21:10] * jwilkins (~jwilkins@38.122.20.226) Quit (Ping timeout: 480 seconds)
[21:11] * Da_Pineapple (~Vidi@tor-relay.roldug.in) has joined #ceph
[21:12] * VisBits (~textual@8.29.138.28) has joined #ceph
[21:12] <VisBits> afternoon
[21:14] <skorgu> soren: what does "ceph pg dump_stuck" show?
[21:15] <soren> skorgu: It dumps the 13 pgs that are stuck.
[21:15] <soren> skorgu: Do you want the full output?
[21:16] <skorgu> let's start with one line
[21:16] <soren> https://gist.github.com/sorenh/447fd72da30a4cc42708
[21:20] <soren> https://gist.github.com/sorenh/f7529adf019c7e70d388
[21:21] * thomnico (~thomnico@bzq-218-90-50.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[21:23] <skorgu> well it's not obvious to me what's going on anyway
[21:23] <soren> Am I correct to assume that the cluster should have reached a healthy state when I took an OSD out?
[21:23] <skorgu> is this the first time you've ever removed an osd?
[21:23] <skorgu> also what's your min_size?
[21:24] <soren> It's the first time I've done it from this cluster at least.
[21:24] <soren> It's been years since I last had the pleasure of handling things like this.
[21:25] <soren> Not sure what my min_size is.
[21:25] * daniel2_ (~daniel2_@cpe-24-28-6-151.austin.res.rr.com) has joined #ceph
[21:25] * puffy (~puffy@50.185.218.255) has joined #ceph
[21:27] <soren> min_size is 2.
[21:27] <skorgu> right
[21:28] <skorgu> and based on that pg dump output your size is 2 as well
[21:28] <skorgu> the tl;dr is if you set your min_size to 1 it should recover
[21:28] <kraken> http://i.imgur.com/V2H9y.gif
[21:29] <skorgu> ha
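(The check/set skorgu is referring to, with 'rbd' standing in for the pool name:)

    ceph osd pool get rbd min_size     # confirm the current value
    ceph osd pool set rbd min_size 1   # temporary relaxation, if undersized PGs are the blocker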
[21:29] <soren> No, it's not. Size is 3.
[21:31] * thomnico (~thomnico@37.162.101.252) has joined #ceph
[21:31] <skorgu> according to your pg dump_stuck the acting and up sets are 2 osds big
[21:31] <skorgu> what does "ceph osd pool get <poolname> size" say?
[21:31] <soren> How can you tell? I only showed the degraded ones.
[21:31] <soren> Like you asked.
[21:31] * nhm (~nhm@65-128-142-103.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[21:31] <soren> It says 3.
[21:32] <skorgu> hm
[21:32] * joef1 (~Adium@2620:79:0:207:dcbf:49c:5cdf:e541) has joined #ceph
[21:32] <skorgu> I just took a cluster here and marked an osd out to compare the output
[21:32] <soren> If I add the osd back, it'll be listed in the active set of those pgs.
[21:32] <skorgu> but I guess yours is getting stuck in a different way
[21:32] <soren> I did that earlier.
[21:32] <soren> I'm running Giant. 0.87.1.
[21:34] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[21:34] * oro (~oro@80-219-254-208.dclient.hispeed.ch) has joined #ceph
[21:35] <skorgu> I'm out of ideas, sorry
[21:35] * joef1 (~Adium@2620:79:0:207:dcbf:49c:5cdf:e541) has left #ceph
[21:35] <skorgu> on my idle 0.92 cluster marking an osd out brings the cluster back to happy in a few minutes
[21:36] <skorgu> size=3,min_size=2 so I think what you're expecting is reasonable
[21:37] <soren> Which version are you running?
[21:37] <soren> What the...
[21:37] <skorgu> v0.92
[21:38] <skorgu> 42 osds
[21:39] <soren> I'm not sure what's going on now, but I just added it back in and took it out again.
[21:39] <soren> ...and now I get: HEALTH_WARN recovery 83/161874 objects misplaced (0.051%)
[21:40] <soren> I got that earlier today too, now that I think about it.... It stayed that way "forever".
[21:40] <soren> Left it alone for at least 10-15 mins.
[21:40] * Da_Pineapple (~Vidi@2BLAAFYPP.tor-irc.dnsbl.oftc.net) Quit ()
[21:40] <skorgu> are those pgs still showing as stuck?
[21:41] <soren> No.
[21:41] * Behedwin (~AGaW@2BLAAFYQ9.tor-irc.dnsbl.oftc.net) has joined #ceph
[21:41] <soren> active+remapped
[21:41] * agshew_ (~agshew@host-69-145-59-76.bln-mt.client.bresnan.net) Quit (Ping timeout: 480 seconds)
[21:41] <skorgu> I guess 'try turning it off and on again' works everywhere huh.
[21:41] * kawa2014 (~kawa@90.216.134.197) Quit (Quit: Leaving)
[21:42] <soren> But...
[21:42] * rwheeler (~rwheeler@173.48.208.246) has joined #ceph
[21:42] <soren> Is this remapped status expected?
[21:42] <soren> Forever?
[21:42] <soren> Until I delete the osd from the crush map?
[21:43] <skorgu> it sounds like they never start backfilling
[21:43] <soren> skorgu: When you took an osd out, did it rebalance and eventually become healthy?
[21:43] <skorgu> yes
[21:43] <soren> Ok.
[21:43] <skorgu> I got foo pgs active+remapped+backfilling and a few active+remapped
[21:44] <soren> And that constitutes a healthy cluster?
[21:44] <skorgu> no that's the "I'm still trying to fix it" stage
[21:44] <skorgu> 2015-02-26 15:44:12.259381 mon.0 [INF] pgmap v14649: 1024 pgs: 954 active+clean, 68 active+remapped, 2 active+remapped+backfilling; 256 GB data, 772 GB used, 22457 GB / 23230 GB avail; 14470/203951 objects misplaced (7.095%)
[21:44] <soren> 2015-02-26 20:39:51.565336 mon.0 [INF] pgmap v195893: 11328 pgs: 11315 active+clean, 13 active+remapped; 35128 MB data, 115 GB used, 274 TB / 274 TB avail; 83/161874 objects misplaced (0.051%)
[21:45] <skorgu> yeah yours never seem to successfully start backfilling
[21:45] <soren> skorgu: But "ceph health" says HEALTH_OK?
[21:45] <soren> Not HEALTH_WARN?
[21:45] <skorgu> HEALTH_WARN while it's backfilling
[21:45] <skorgu> silly question, can your OSDs reach each other?
[21:45] <skorgu> firewalls, etc?
[21:46] <soren> No firewalls.
[21:46] <soren> I only have three nodes right now.
[21:47] <soren> The only thing special about this OSD is that it was a system disk that was accidentally added.
[21:47] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[21:49] <soren> Isn't this channel usually filled with ceph devs?
[21:50] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) Quit (Remote host closed the connection)
[21:50] * avozza (~avozza@static-114-198-78-212.thenetworkfactory.nl) has joined #ceph
[21:51] * cookednoodles (~eoin@89-93-153-201.hfc.dyn.abo.bbox.fr) Quit (Read error: Connection reset by peer)
[21:52] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[21:52] <skorgu> I'm not sure how strictly crush tries to maintain invariants, but I can imagine a scenario where, because there's nowhere to put the data that isn't on the same node as another copy, it might cause problems
[21:52] <skorgu> but that's basically me making things up
[21:54] * thb (~me@0001bd58.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:56] * dmick (~dmick@2607:f298:a:607:2d70:32ce:ee23:a470) Quit (Ping timeout: 480 seconds)
[21:56] * thomnico (~thomnico@37.162.101.252) Quit (Ping timeout: 480 seconds)
[21:56] <soren> There are three nodes, so it should definitely be possible to find another osd for the third replica that isn't on the same node as another replica.
[21:58] <rotbeard> hey folks. where can I start to debug a very uneven placement of objects? in my test cluster I have about 74TB data (~150TB including replicas) on 4TB drives, but some of them are 15% full and some 90%+
[21:58] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[21:58] <burley_> rotbeard: How many OSDs and what is your pg_num set to?
[21:59] * cookednoodles (~eoin@89-93-153-201.hfc.dyn.abo.bbox.fr) has joined #ceph
[21:59] * cookednoodles (~eoin@89-93-153-201.hfc.dyn.abo.bbox.fr) Quit ()
[21:59] <rotbeard> currently there are 54 OSDs (down from 82, but 1 node is out for now) with 8192 pgs in total
[22:00] <burley_> is your data all in one pool (or mostly in one pool)?
[22:00] <fghaas> rotbeard: do ceph osd tree and check your weights
[22:02] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[22:02] <rotbeard> the weights were 1 for every osd; I updated that to 4 now for some better tuning and for cleaning up the full OSDs. burley_: the data is spread over 3 or 4 pools
[22:04] * disc (~gabrielp1@x590d68b7.dyn.telefonica.de) has joined #ceph
[22:05] <burley_> If your weights were the same, and your pg_num is set to ~2048 for each pool (of those in use) that's all the ideas I have for you.
[22:05] <disc> question: planning to go into production with ceph. I have seen conflicting reports on which stable release is actually stable....firefly and giant
[22:05] <disc> so i would appreciate your suggestions
[22:05] <disc> so far, would be using as block storage
[22:06] * nhm (~nhm@172.56.31.95) has joined #ceph
[22:06] * ChanServ sets mode +o nhm
[22:06] <rotbeard> burley_, so it could be a _problem_ with too many or too few pgs per pool? I am just learning for production ;)
[22:07] <burley_> too few
[22:07] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) Quit (Quit: Leaving.)
[22:07] <burley_> but I don't think that's likely unless one of the pools is improperly configured
[22:07] <lurbs> Are you able to pastebin 'ceph osd tree' and your CRUSH map?
[22:08] * thomnico (~thomnico@37.162.101.252) has joined #ceph
[22:08] <rotbeard> lurbs, yep. just give me 5 minutes
[22:08] * alfredodeza (~alfredode@198.206.133.89) has left #ceph
[22:08] * lcurtis (~lcurtis@47.19.105.250) Quit (Ping timeout: 480 seconds)
[22:09] * ircolle (~ircolle@166.170.45.178) has joined #ceph
[22:09] * jwilkins (~jwilkins@38.122.20.226) has joined #ceph
[22:09] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[22:10] <burley_> "ceph osd pool get POOLNAME pg_num" will let you see the pg_num for each of the pools to check that angle as well
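(A quick way to run that check over every pool; a sketch, where rados lspools just lists the pool names.)

    for p in $(rados lspools); do
        echo -n "$p: "
        ceph osd pool get "$p" pg_num
    done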
[22:10] * Behedwin (~AGaW@2BLAAFYQ9.tor-irc.dnsbl.oftc.net) Quit ()
[22:11] * DoDzy (~Pirate@3N2AABDT7.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:11] <bsanders_afk> Just realized I missed the Ceph Tech Talk today. Anyone know when it will be on Youtube?
[22:11] * rotbart (~redbeard@2a02:908:df10:d300:6267:20ff:feb7:c20) has joined #ceph
[22:12] * bsanders_afk is now known as bsanders
[22:14] * bilco105_ (~bilco105@irc.bilco105.com) Quit (Max SendQ exceeded)
[22:14] * andreask (~andreask@h081217069051.dyn.cm.kabsi.at) has joined #ceph
[22:14] * ChanServ sets mode +v andreask
[22:15] * bilco105 (~bilco105@irc.bilco105.com) has joined #ceph
[22:15] <disc> in case you guys missed my question. planning to go into production with ceph. I have seen conflicting reports on which stable release is actually stable....firefly and giant. I intend to use ceph as a block storage. so which one would you prefer judging by your experience?
[22:16] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[22:16] <fghaas> rotbeard: well are all your OSD actually the same size?
[22:17] <fghaas> their weights should be proportional to their space
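(The common convention is a CRUSH weight roughly equal to the drive size in TB/TiB, so a 4 TB OSD would be reweighted along these lines; the osd id is a placeholder, and any consistent ratio works.)

    ceph osd crush reweight osd.12 3.64    # ~4 TB expressed in TiB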
[22:17] <lurbs> disc: We have two clusters, one Firefly and one Giant, and are planning on upgrading Firefly -> Giant.
[22:17] <rotbart> lurbs, ceph osd tree -> http://pastebin.com/D6k4907H for the crush map: do you want me to ceph osd getcrushmap -o + some filehoster?
[22:18] <rotbart> fghaas, yes. all drives are 4TB of size. in my osd tree you can see that I gave one node another weight (because most of that OSDs were near full last week)
[22:18] <rotbart> but before, the weight was the same.
[22:19] <disc> lurbs: thanks. how is the recovery with giant?
[22:19] <lurbs> We have recovery/backfill tuned way down on both clusters, so as to not interfere with client ops.
[22:20] <disc> rotbart: correct me if i am wrong. shouldn't ceph be equally distributing load given all drives are of equal size?
[22:20] <fghaas> disc: my vote would be for firefly
[22:20] <rotbart> disc, I just _thought_ that. but I am pretty new to ceph, so I don't know that much yet ;)
[22:21] <lurbs> rotbeard: You should be able to decompile the map using 'crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}'
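(i.e. the round trip rotbart asked about:)

    ceph osd getcrushmap -o crushmap.bin       # grab the compiled map from the cluster
    crushtool -d crushmap.bin -o crushmap.txt  # decompile it to plain text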
[22:21] * lcurtis (~lcurtis@47.19.105.250) has joined #ceph
[22:22] <disc> rotbart: me too. don't worry! but as per my experience, ceph is able to recognize and assign weight appropriately given disks of different sizes. But I would still be curious about your case.
[22:22] <lurbs> fghaas: Why Firefly? Specific issue(s) with Giant, or just that Firefly's more mature?
[22:23] <fghaas> people seem to be experiencing performance regressions. Now afaik none of them have actually been confirmed, but the problem is that we likely won't see them fixed in giant, because the announcement is already out that yesterday's point release will likely be giant's last
[22:23] <fghaas> s/is that/is that, if they exist,/
[22:23] <kraken> fghaas meant to say: people seem to be experiencing performance regressions. Now afaik none of them have actually been confirmed, but the problem is that, if they exist, we likely won't see them fixed in giant, because the announcement is already out that yesterday's point release will likely be giant's last
[22:24] <rotbart> lurbs, here's my crushmap
[22:24] <rotbart> http://pastebin.com/i9YLXU68
[22:25] <lurbs> Prior to rolling out our Giant cluster we did a large amount of benchmarking. Firefly vs Giant, encrypted vs non, and for various different encryption algorithms. Giant was faster, for us, pretty much everywhere.
[22:25] <rotbart> disc, me too. my plan is to do some very deep testing with a new cluster (with new hardware specs) and being able to deploy a production cluster in november or december
[22:26] <disc> rotbart: perhaps you and i should collaborate more. I am also in the same boat as you but my deadline is pretty much nearer than yours. :)
[22:26] <fghaas> lurbs: that includes librbd performance?
[22:27] <lurbs> That was mostly rados bench, but also some in-VM numbers.
[22:27] * derjohn_mobi (~aj@tmo-113-25.customers.d1-online.com) has joined #ceph
[22:27] * ohnomrbill (~ohnomrbil@c-67-174-241-112.hsd1.ca.comcast.net) Quit (Quit: ohnomrbill)
[22:27] <lurbs> Was when I found how important tweaking read_ahead_kb is. :)
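(For anyone curious, read_ahead_kb is a per-block-device sysfs knob; a hedged example, with the device name and value chosen arbitrarily.)

    echo 4096 > /sys/block/vda/queue/read_ahead_kb   # e.g. inside the VM, for its rbd-backed disk
    cat /sys/block/vda/queue/read_ahead_kb           # verify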
[22:27] <rotbart> disc, my first _deadline_ was summer last year... I started with a PoC cluster but as I was on holiday, some colleagues fired up production virtual machines on that PoC cluster :(
[22:28] <disc> lurbs: i also did a simple benchmark with "fio" on our proprietary storage and ceph giant. ceph surpassed the proprietary one.
[22:28] <fghaas> yeah, when there have been complaints (and I don't want to frighten anyone here, jftr - can't say I've gone exhaustively to the bottom of this), they were around librbd or qemu/rbd
[22:28] * hellertime (~Adium@a23-79-238-10.deploy.static.akamaitechnologies.com) Quit (Ping timeout: 480 seconds)
[22:28] <lurbs> Our biggest problem was the overhead of the encryption, actually.
[22:28] <skorgu> fwiw firefly is 'long term support', giant is not
[22:29] <rotbart> disc, here too. just my small cluster is _faster_ that our netapp fas6040 with about 250 discs. dunno what they are doing.
[22:29] <lurbs> We ended up rolling a patched version that allows LUKS, and a choice of ciphers.
[22:29] <rotbart> s/that/than
[22:29] <kraken> rotbart meant to say: disc, here too. just my small cluster is _faster_ than our netapp fas6040 with about 250 discs. dunno what they are doing.
[22:29] <lurbs> Patches that are now upstream, but not in any released version, I believe.
[22:30] <mjevans> As a side note, giant actually does have a few tools (that I haven't needed to use) that seem useful in diagnosing issues. Having said that, the Firefly version Debian's shipping (.7) //does// have issues, and another user fixed them with .8 ; I fixed mine by updating to Giant prior to that date.
[22:31] <mjevans> Said issues didn't directly cause data-loss, but made 'nice' recovery from the incidents difficult.
[22:31] <disc> i would be using Ubuntu 14.04 LTS with ceph Giant. Don't know if that makes any difference but people have reported some problems with giant given different Linux distros/kernel versions
[22:31] * jwilkins (~jwilkins@38.122.20.226) Quit (Ping timeout: 480 seconds)
[22:31] <mjevans> disc: it's to be expected. Like with BTRFS, use some version 'near' the latest, but far enough back to have had major regressions ironed out.
[22:31] * ircolle (~ircolle@166.170.45.178) Quit (Read error: Connection reset by peer)
[22:31] <rotbart> disc, I plan to use 14.04 + firefly. just because of that _long term support_ thing.
[22:31] <disc> mjevans: exactly. Seen bad reports regarding the recovery process with ceph giant.
[22:32] * xarses_ (~andreww@12.164.168.117) Quit (Ping timeout: 480 seconds)
[22:32] <mjevans> disc: In my case, Giant //fixed// recovery issues.
[22:32] <disc> mjevans: If you dont mind telling, are you running ceph with RAID0 or RAID1?
[22:33] <fghaas> lurbs, got an issue tracker ID or GitHub PR for me so I can take a look?
[22:33] <lurbs> https://github.com/ceph/ceph/pull/3092
[22:33] <mjevans> disc: Raid-1 with 4 copies of data across two hosts. There are three monitors. I consider this to be the smallest possible fault tolerant cluster.
[22:34] <lurbs> rotbart: What's your replication level? And are all of the full disks on dc01-ceph-st04?
[22:34] <disc> mjevans: sounds reasonable in my opinion. I was torn between depending on ceph entirely for HD failures or using RAID1 to lessen my worries
[22:34] <mjevans> I think if I were re-designing the setup I'd try to put the drives across three OSD hosts (which also happened to be monitors) and still require 4 copies, but allow the duplicates to pool.
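(Not mjevans' actual map, just a hedged illustration of how a "copies spread across hosts, more than one per host allowed" constraint is commonly written in the CRUSH rule language; numbers and names are placeholders.)

    rule replicated_4copies {
            ruleset 1
            type replicated
            min_size 2
            max_size 4
            step take default
            step choose firstn 2 type host        # pick two hosts
            step chooseleaf firstn 2 type osd     # then two OSDs on each of them
            step emit
    }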
[22:35] <disc> rotbart: i will test ceph giant performance and will reconfigure my cluster with firely. I will then compare the performance of both before i move forward
[22:35] <fghaas> lurbs: that looks sweet, thanks
[22:35] <mjevans> disc: Same reason that ZFS users prefer to have redundancy in the ZFS layer instead of under it. Ceph can be smarter about duplicating the data and it's already doing block-stripe mangling so why not?
[22:36] <lurbs> fghaas: The TLDR is that for AES GCM was significantly faster than AES CBC, especially with AES-NI enabled.
[22:36] <kraken> http://i.imgur.com/dnMjc.gif
[22:36] <rotbart> lurbs, for now I use 2 replicas. the most full disks were on st04. iirc there were some near full on st01, some near full + one full on st03 and three or four full on st04. at the same time some OSDs had 15% space used
[22:36] <disc> mjevans: plan to do some "disaster" testing as well. If things go smoothly with this testing, I'll probably go for RAID0
[22:39] <rotbart> disc, why going for raid btw?
[22:39] <mjevans> disc: I did find that my crush ruleset, while ensuring data integrity, causes some deadlocks since my requested norms couldn't be fulfilled. I suspect this is due to me instructing Ceph incorrectly about my desires, but it's on my list of things to improve. For the moment I'm still deciding if demanding that many copies is desirable; and leaning towards my infrastructure not being well enough distributed to do with less copies (but one per node).
[22:40] * DoDzy (~Pirate@3N2AABDT7.tor-irc.dnsbl.oftc.net) Quit ()
[22:40] <disc> rotbart: sorry didnt get you. You mean why going for RAID0?
[22:41] * georgem (~Adium@184.151.178.3) has joined #ceph
[22:41] * georgem (~Adium@184.151.178.3) Quit ()
[22:41] <disc> mjevans: I plan to have two copies of each data. I am still in the learning phase of manipulating CRUSH
[22:41] * georgem (~Adium@fwnat.oicr.on.ca) has joined #ceph
[22:41] * thomnico (~thomnico@37.162.101.252) Quit (Ping timeout: 480 seconds)
[22:41] * mykola (~Mikolaj@91.225.202.26) Quit (Quit: away)
[22:41] * Bj_o_rn (~loft@tor-exit.xshells.net) has joined #ceph
[22:42] * xarses_ (~andreww@12.164.168.117) has joined #ceph
[22:42] <lurbs> rotbart: Other than dc01-ceph-st04 being significantly smaller in capacity that looks vaguely sane. What about 'ceph health detail' and 'ceph pg dump'?
[22:43] <rotbart> disc, <disc> mjevans: plan to do some "disaster" testings as well. If things go smooth with this testing, so probably will go for RAID0.
[22:43] <rotbart> just asking for why using raid at all
[22:44] <disc> mjevans: although i was pleasantly surprised that ceph gives you a HEALTH_WARN if you are under-utilizing your cluster.e.g. not configuring the right amount og PGs
[22:45] <disc> rotbart: sorry if i am mistaken but servers with multiple slots of hard disk require you to configure Virtual Disks (at BIOS level) as RAID0, RAID1 etc....
[22:45] <disc> right amount of* PGs
[22:45] <rotbart> lurbs, http://pastebin.com/jYjWc5st
[22:46] <rotbart> do you _really_ want the output of ceph pg dump? :>
[22:46] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[22:46] <lurbs> Nah, suppose not. :)
[22:47] <rotbart> disc, ah. you mean if the raid controller can't handle multiple jbods?
[22:48] <disc> rotbart: yes.
[22:48] * lcurtis (~lcurtis@47.19.105.250) Quit (Ping timeout: 480 seconds)
[22:49] <rotbart> what controllers do you use?
[22:50] <disc> rotbart: Not sure but they belong to Dell Servers :)
[22:50] <rotbart> lurbs, I have to say that I took some OSDs out, let ceph rebalance everything + put them back in. I know that this isn't recommended, but I got so many HEALTH_ERRs and a colleague wanted to write his backups to that cluster :>
[22:52] * jdillaman (~jdillaman@pool-108-56-67-212.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[22:52] <lurbs> The CRUSH map seems reasonable, but it can't be good that dc01-ceph-st02 and all its disks are still out. Are you able to bring that back, and maybe tweak 'osd backfill full ratio' up to encourage a rebalance?
[22:53] <disc> rotbart: plus i have a scenario in mind if one osd goes down...having raid1, i would simply replace the faulty hd and let the raid controller do the synching n stuff. if it is raid0, then i would have to put my skills and faith in ceph giant :)
[22:53] * hb9xar__ (ident@easytux.ch) Quit (Ping timeout: 480 seconds)
[22:53] <rotbart> I think I can do this next week. we just took st02 out because 4 or 5 disks died in that chassis, so maybe it has a backplane or controller problem. but sure, I can do this next week + check whether the balancing would be _more even_
[22:54] <rotbart> disc, so you use more than 1 disk for 1 raid0/1?
[22:54] <disc> yes
[22:54] <rotbart> ah ok
[22:55] <rotbart> we want to use 1 disk as 1 osd so far.
[22:55] * visualne (~oftc-webi@158-147-148-234.harris.com) has joined #ceph
[22:55] <rotbart> + in our new boxes we switched from LSI controllers to areca 1883ix-16
[22:56] <RayTracer> My recovery after changing pg_num got stuck on something like this: HEALTH_WARN 10 pgs backfill_toofull; 10 pgs stuck unclean; recovery 4042/525368 objects degraded (0.769%); 1 near full osd(s);
[22:56] <rotbart> we saw LSI controllers (but don't remember which exactly) dying, when starting 24 OSDs at the same time :>
[22:58] * bilco105 is now known as bilco105_
[22:59] <RayTracer> is there anything i can do right now? It's stuck at 0.7% :(
[22:59] * bilco105_ is now known as bilco105
[23:00] <lurbs> RayTracer: If you're sure your CRUSH map is correct then you can temporarily bump up 'osd backfill full ratio' in order to allow the backfill to continue.
[23:00] <lurbs> http://ceph.com/docs/master/rados/configuration/osd-config-ref/#backfilling
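(A hedged example of the temporary bump lurbs suggests; 0.90 is an arbitrary value, and it should be set back once the backfill finishes.)

    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'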
[23:01] * sjm (~sjm@pool-98-109-11-113.nwrknj.fios.verizon.net) has left #ceph
[23:03] * davidz1 (~davidz@2605:e000:1313:8003:fd3e:5130:89a1:ce04) has joined #ceph
[23:03] <RayTracer> lurbs: i reduced max backfills for this job to 4. Maybe that wasn't so great an idea after all
[23:04] <RayTracer> I thought that would decrease the number of jobs and reduce performance issues for the rest of our services
[23:04] <RayTracer> ok ok i understand
[23:05] <RayTracer> one of my osd have 86% usage
[23:07] <disc> ok guys.signing off. Thank you all for your help and suggestions.
[23:07] <disc> rotbart: hopefully we will remain in touch. :)
[23:08] * disc (~gabrielp1@x590d68b7.dyn.telefonica.de) Quit (Quit: Leaving)
[23:10] * davidz (~davidz@2605:e000:1313:8003:9468:af06:b5d6:f557) Quit (Ping timeout: 480 seconds)
[23:10] * Bj_o_rn (~loft@2BLAAFYTW.tor-irc.dnsbl.oftc.net) Quit ()
[23:10] * rhonabwy (~spidu_@digi00277.torproxy-readme-arachnide-fr-35.fr) has joined #ceph
[23:12] * cephed (506d7573@107.161.19.109) has joined #ceph
[23:13] <cephed> Hello
[23:14] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:14] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[23:14] <cephed> where do you "best" put journals
[23:17] <rotbart> cephed, on enterprise SSDs ;)
[23:22] <cephed> right - mirrored I guess?
[23:23] * dmick1 (~dmick@38.122.20.226) has joined #ceph
[23:24] * andreask (~andreask@h081217069051.dyn.cm.kabsi.at) has left #ceph
[23:27] * fghaas (~florian@91-119-130-192.dynamic.xdsl-line.inode.at) has left #ceph
[23:33] * georgem (~Adium@fwnat.oicr.on.ca) Quit (Quit: Leaving.)
[23:33] <rotbart> I don't know what the best practice for journal SSDs is. we try to use a smaller ratio of OSDs:Journal-SSDs without putting those SSDs into a raid1 or something like that
[23:34] <rotbart> for example our new chassis have 12 4TB sata disks + 4 intel dc s3700 200G journal SSDs. so if a SSD dies, _just_ 3 OSDs will die. should be ok
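(For example, with ceph-deploy a journal partition on the SSD is given after the data disk; the host name and device paths below are made up.)

    ceph-deploy osd prepare node1:/dev/sdb:/dev/sdm1
    ceph-deploy osd prepare node1:/dev/sdc:/dev/sdm2
    ceph-deploy osd prepare node1:/dev/sdd:/dev/sdm3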
[23:35] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[23:36] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:36] <RayTracer> lurbs: Thanks it helps.
[23:37] <cephed> there is a notion from a CERN presentation that they put the journal on a separate partition on each osd
[23:37] <lurbs> They have a massive number of OSDs, so are a bit of a special case.
[23:39] * joef1 (~Adium@2620:79:0:207:c08:1e5d:e2f3:81db) has joined #ceph
[23:39] <cephed> I'll bring up 96 tomorrow; having to partition SSDs and keep track - not easy
[23:40] * rhonabwy (~spidu_@2BLAAFYU5.tor-irc.dnsbl.oftc.net) Quit ()
[23:40] <cephed> what about symlinking the journals, file-system wise, to e.g. a pair of central spinning disks?
[23:40] * SweetGirl (~bildramer@exit-01d.noisetor.net) has joined #ceph
[23:41] * SweetGirl (~bildramer@3N2AABDX4.tor-irc.dnsbl.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * davidz1 (~davidz@2605:e000:1313:8003:fd3e:5130:89a1:ce04) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * nitti (~nitti@162.222.47.218) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * ShaunR (~ShaunR@staff.ndchost.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Concubidated (~Adium@71.21.5.251) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * alram (~alram@38.122.20.226) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * gregmark (~Adium@68.87.42.115) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * cholcombe973 (~chris@pool-108-42-144-175.snfcca.fios.verizon.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * bandrus (~brian@50.23.115.87-static.reverse.softlayer.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * markl_ (~mark@knm.org) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * sage (~quassel@cpe-76-95-230-100.socal.res.rr.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * debian112 (~bcolbert@24.126.201.64) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * PerlStalker (~PerlStalk@162.220.127.20) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * jfunk (~jfunk@2001:470:b:44d:7e7a:91ff:fee8:e80b) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * cronix1 (~cronix@5.199.139.166) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * KevinPerks (~Adium@cpe-071-071-026-213.triad.res.rr.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * marrusl (~mark@cpe-24-90-46-248.nyc.res.rr.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * guppy (~quassel@guppy.xxx) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * mookins (~mookins@induct3.lnk.telstra.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * yghannam (~yghannam@0001f8aa.user.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * smokedmeets (~smokedmee@c-67-174-241-112.hsd1.ca.comcast.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * jks (~jks@178.155.151.121) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * bdonnahue (~James@24-148-51-171.c3-0.mart-ubr1.chi-mart.il.cable.rcn.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * TiCPU (~ticpu@2001:470:b010:1::10) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * skorgu (skorgu@pylon.skorgu.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * chasmo77 (~chas77@158.183-62-69.ftth.swbr.surewest.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Doug (uid69720@id-69720.ealing.irccloud.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * linuxkidd (~linuxkidd@134.sub-70-210-196.myvzw.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * eternaleye (~eternaley@50.245.141.77) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * logan__ (~logan@63.143.49.103) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * yehudasa_ (~yehudasa@2607:f298:a:607:548b:86d1:f0e4:6ac5) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * mjevans (~mje@209.141.34.79) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * supay (sid47179@id-47179.uxbridge.irccloud.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * mjeanson (~mjeanson@00012705.user.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * \ask (~ask@oz.develooper.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * dlan (~dennis@116.228.88.131) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * blynch (~blynch@vm-nat.msi.umn.edu) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * qybl (~foo@maedhros.krzbff.de) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Bosse (~bosse@rifter2.klykken.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Georgyo (~georgyo@shamm.as) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * chutz (~chutz@rygel.linuxfreak.ca) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * classicsnail (~David@2600:3c01::f03c:91ff:fe96:d3c0) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Gugge-47527 (gugge@kriminel.dk) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * dwm (~dwm@northrend.tastycake.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * mondkalbantrieb (~quassel@mondkalbantrieb.de) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * j^2 (sid14252@id-14252.brockwell.irccloud.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * stj (~stj@2604:a880:800:10::2cc:b001) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * fvl (~fvl@ipjusup.net.tomline.ru) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * wer_ (~wer@206-248-239-142.unassigned.ntelos.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * kashyap (~kashyap@121.244.87.116) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Annttu (annttu@0001934a.user.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * HauM1 (~HauM1@login.univie.ac.at) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * fred`` (fred@earthli.ng) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * theanalyst (theanalyst@open.source.rocks.my.socks.firrre.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * nolan (~nolan@2001:470:1:41:a800:ff:fe3e:ad08) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * zerick (~zerick@irc.quassel.zerick.me) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * dosaboy (~dosaboy@65.93.189.91.lcy-01.canonistack.canonical.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * scuttlemonkey (~scuttle@nat-pool-rdu-t.redhat.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * fam (~famz@nat-pool-bos-t.redhat.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * visualne (~oftc-webi@158-147-148-234.harris.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * _prime_ (~oftc-webi@199.168.44.192) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * derjohn_mobi (~aj@tmo-113-25.customers.d1-online.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * oro (~oro@80-219-254-208.dclient.hispeed.ch) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * puffy (~puffy@50.185.218.255) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * VisBits (~textual@8.29.138.28) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * arbrandes (~arbrandes@191.8.36.246) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * joshd (~jdurgin@38.122.20.226) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * jclm (~jclm@209.49.224.62) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * elder (~elder@207.66.184.146) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * sputnik13 (~sputnik13@74.202.214.170) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * tserong (~tserong@203-173-33-52.dyn.iinet.net.au) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * wkennington (~william@76.77.180.204) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Nats (~natscogs@114.31.195.238) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * bearkitten (~bearkitte@cpe-66-27-98-26.san.res.rr.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * markl (~mark@knm.org) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * jamespd (~mucky@mucky.socket7.org) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * shaunm (~shaunm@74.215.76.114) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * evilrob00 (~evilrob00@cpe-72-179-3-209.austin.res.rr.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * tom (~tom@167.88.45.146) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * benh57 (~benh57@sceapdsd43-30.989studios.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Kupo1 (~tyler.wil@23.111.254.159) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * trociny (~mgolub@93.183.239.2) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * MK_FG (~MK_FG@00018720.user.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * fouxm (~foucault@ks01.commit.ninja) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * sig_wall (~adjkru@xn--hwgz2tba.lamo.su) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * ZyTer (~ZyTer@ghostbusters.apinnet.fr) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * singler (~singler@zeta.kirneh.eu) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * garphy`aw (~garphy@frank.zone84.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * Tene (~tene@173.13.139.236) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * gsilvis (~andovan@c-75-69-162-72.hsd1.ma.comcast.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * pdrakeweb (~pdrakeweb@cpe-65-185-74-239.neo.res.rr.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * bro_ (~flybyhigh@panik.darksystem.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * ifur (~osm@0001f63e.user.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * carter (~carter@li98-136.members.linode.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * athrift_ (~nz_monkey@203.86.205.13) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * med (~medberry@71.74.177.250) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * seapasul1i (~seapasull@95.85.33.150) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * kraken (~kraken@gw.sepia.ceph.com) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * ctd (~root@00011932.user.oftc.net) Quit (reticulum.oftc.net resistance.oftc.net)
[23:41] * mattronix (~quassel@fw1.sdc.mattronix.nl) Quit (reticulum.oftc.net resistance.oftc.net)
[23:42] * SweetGirl (~bildramer@3N2AABDX4.tor-irc.dnsbl.oftc.net) has joined #ceph
[23:42] * davidz1 (~davidz@2605:e000:1313:8003:fd3e:5130:89a1:ce04) has joined #ceph
[23:42] * derjohn_mobi (~aj@tmo-113-25.customers.d1-online.com) has joined #ceph
[23:42] * oro (~oro@80-219-254-208.dclient.hispeed.ch) has joined #ceph
[23:42] * puffy (~puffy@50.185.218.255) has joined #ceph
[23:42] * VisBits (~textual@8.29.138.28) has joined #ceph
[23:42] * nitti (~nitti@162.222.47.218) has joined #ceph
[23:42] * arbrandes (~arbrandes@191.8.36.246) has joined #ceph
[23:42] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[23:42] * joshd (~jdurgin@38.122.20.226) has joined #ceph
[23:42] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[23:42] * Concubidated (~Adium@71.21.5.251) has joined #ceph
[23:42] * alram (~alram@38.122.20.226) has joined #ceph
[23:42] * gregmark (~Adium@68.87.42.115) has joined #ceph
[23:42] * cholcombe973 (~chris@pool-108-42-144-175.snfcca.fios.verizon.net) has joined #ceph
[23:42] * jclm (~jclm@209.49.224.62) has joined #ceph
[23:42] * bandrus (~brian@50.23.115.87-static.reverse.softlayer.com) has joined #ceph
[23:42] * kashyap (~kashyap@121.244.87.116) has joined #ceph
[23:42] * markl_ (~mark@knm.org) has joined #ceph
[23:42] * sage (~quassel@cpe-76-95-230-100.socal.res.rr.com) has joined #ceph
[23:42] * elder (~elder@207.66.184.146) has joined #ceph
[23:42] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[23:42] * PerlStalker (~PerlStalk@162.220.127.20) has joined #ceph
[23:42] * sputnik13 (~sputnik13@74.202.214.170) has joined #ceph
[23:42] * jfunk (~jfunk@2001:470:b:44d:7e7a:91ff:fee8:e80b) has joined #ceph
[23:42] * cronix1 (~cronix@5.199.139.166) has joined #ceph
[23:42] * KevinPerks (~Adium@cpe-071-071-026-213.triad.res.rr.com) has joined #ceph
[23:42] * marrusl (~mark@cpe-24-90-46-248.nyc.res.rr.com) has joined #ceph
[23:42] * tserong (~tserong@203-173-33-52.dyn.iinet.net.au) has joined #ceph
[23:42] * guppy (~quassel@guppy.xxx) has joined #ceph
[23:42] * mookins (~mookins@induct3.lnk.telstra.net) has joined #ceph
[23:42] * wkennington (~william@76.77.180.204) has joined #ceph
[23:42] * yghannam (~yghannam@0001f8aa.user.oftc.net) has joined #ceph
[23:42] * Nats (~natscogs@114.31.195.238) has joined #ceph
[23:42] * smokedmeets (~smokedmee@c-67-174-241-112.hsd1.ca.comcast.net) has joined #ceph
[23:42] * bdonnahue (~James@24-148-51-171.c3-0.mart-ubr1.chi-mart.il.cable.rcn.com) has joined #ceph
[23:42] * bearkitten (~bearkitte@cpe-66-27-98-26.san.res.rr.com) has joined #ceph
[23:42] * markl (~mark@knm.org) has joined #ceph
[23:42] * jks (~jks@178.155.151.121) has joined #ceph
[23:42] * TiCPU (~ticpu@2001:470:b010:1::10) has joined #ceph
[23:42] * jamespd (~mucky@mucky.socket7.org) has joined #ceph
[23:42] * Doug (uid69720@id-69720.ealing.irccloud.com) has joined #ceph
[23:42] * shaunm (~shaunm@74.215.76.114) has joined #ceph
[23:42] * skorgu (skorgu@pylon.skorgu.net) has joined #ceph
[23:42] * evilrob00 (~evilrob00@cpe-72-179-3-209.austin.res.rr.com) has joined #ceph
[23:42] * chasmo77 (~chas77@158.183-62-69.ftth.swbr.surewest.net) has joined #ceph
[23:42] * _prime_ (~oftc-webi@199.168.44.192) has joined #ceph
[23:42] * tom (~tom@167.88.45.146) has joined #ceph
[23:42] * benh57 (~benh57@sceapdsd43-30.989studios.com) has joined #ceph
[23:42] * linuxkidd (~linuxkidd@134.sub-70-210-196.myvzw.com) has joined #ceph
[23:42] * eternaleye (~eternaley@50.245.141.77) has joined #ceph
[23:42] * Kupo1 (~tyler.wil@23.111.254.159) has joined #ceph
[23:42] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) has joined #ceph
[23:42] * logan__ (~logan@63.143.49.103) has joined #ceph
[23:42] * yehudasa_ (~yehudasa@2607:f298:a:607:548b:86d1:f0e4:6ac5) has joined #ceph
[23:42] * mjevans (~mje@209.141.34.79) has joined #ceph
[23:42] * trociny (~mgolub@93.183.239.2) has joined #ceph
[23:42] * supay (sid47179@id-47179.uxbridge.irccloud.com) has joined #ceph
[23:42] * mjeanson (~mjeanson@00012705.user.oftc.net) has joined #ceph
[23:42] * \ask (~ask@oz.develooper.com) has joined #ceph
[23:42] * dlan (~dennis@116.228.88.131) has joined #ceph
[23:42] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[23:42] * wer_ (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[23:42] * ifur (~osm@0001f63e.user.oftc.net) has joined #ceph
[23:42] * scuttlemonkey (~scuttle@nat-pool-rdu-t.redhat.com) has joined #ceph
[23:42] * zerick (~zerick@irc.quassel.zerick.me) has joined #ceph
[23:42] * fvl (~fvl@ipjusup.net.tomline.ru) has joined #ceph
[23:42] * stj (~stj@2604:a880:800:10::2cc:b001) has joined #ceph
[23:42] * bro_ (~flybyhigh@panik.darksystem.net) has joined #ceph
[23:42] * pdrakeweb (~pdrakeweb@cpe-65-185-74-239.neo.res.rr.com) has joined #ceph
[23:42] * ctd (~root@00011932.user.oftc.net) has joined #ceph
[23:42] * j^2 (sid14252@id-14252.brockwell.irccloud.com) has joined #ceph
[23:42] * gsilvis (~andovan@c-75-69-162-72.hsd1.ma.comcast.net) has joined #ceph
[23:42] * mondkalbantrieb (~quassel@mondkalbantrieb.de) has joined #ceph
[23:42] * Tene (~tene@173.13.139.236) has joined #ceph
[23:42] * dwm (~dwm@northrend.tastycake.net) has joined #ceph
[23:42] * garphy`aw (~garphy@frank.zone84.net) has joined #ceph
[23:42] * singler (~singler@zeta.kirneh.eu) has joined #ceph
[23:42] * HauM1 (~HauM1@login.univie.ac.at) has joined #ceph
[23:42] * dosaboy (~dosaboy@65.93.189.91.lcy-01.canonistack.canonical.com) has joined #ceph
[23:42] * ZyTer (~ZyTer@ghostbusters.apinnet.fr) has joined #ceph
[23:42] * athrift_ (~nz_monkey@203.86.205.13) has joined #ceph
[23:42] * Gugge-47527 (gugge@kriminel.dk) has joined #ceph
[23:42] * nolan (~nolan@2001:470:1:41:a800:ff:fe3e:ad08) has joined #ceph
[23:42] * kraken (~kraken@gw.sepia.ceph.com) has joined #ceph
[23:42] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[23:42] * carter (~carter@li98-136.members.linode.com) has joined #ceph
[23:42] * med (~medberry@71.74.177.250) has joined #ceph
[23:42] * classicsnail (~David@2600:3c01::f03c:91ff:fe96:d3c0) has joined #ceph
[23:42] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[23:42] * theanalyst (theanalyst@open.source.rocks.my.socks.firrre.com) has joined #ceph
[23:42] * Georgyo (~georgyo@shamm.as) has joined #ceph
[23:42] * Bosse (~bosse@rifter2.klykken.com) has joined #ceph
[23:42] * seapasul1i (~seapasull@95.85.33.150) has joined #ceph
[23:42] * mattronix (~quassel@fw1.sdc.mattronix.nl) has joined #ceph
[23:42] * qybl (~foo@maedhros.krzbff.de) has joined #ceph
[23:42] * sig_wall (~adjkru@xn--hwgz2tba.lamo.su) has joined #ceph
[23:42] * fouxm (~foucault@ks01.commit.ninja) has joined #ceph
[23:42] * fam (~famz@nat-pool-bos-t.redhat.com) has joined #ceph
[23:42] * blynch (~blynch@vm-nat.msi.umn.edu) has joined #ceph
[23:42] * Annttu (annttu@0001934a.user.oftc.net) has joined #ceph
[23:42] * fred`` (fred@earthli.ng) has joined #ceph
[23:42] * ChanServ sets mode +v sage
[23:42] * ChanServ sets mode +v scuttlemonkey
[23:43] <bdonnahue> dmcry
[23:43] <dmick1> ok. <cries>
[23:43] <bdonnahue> is anyone using ceph with dmcrypt?
[23:44] <bdonnahue> haha bad paste
[23:44] * cephed (506d7573@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[23:44] <lurbs> bdonnahue: Yes, although it's a patched version to allow for LUKS and choice of ciphers.
[23:44] <rotbart> cephed, we started with 84 OSDs with journals _on_ the OSDs. Pretty bad for the performance + rebalancing and stuff.
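
    A minimal sketch of the fix rotbart is alluding to: give each OSD a journal on a
    separate (usually SSD) device using ceph-deploy's documented HOST:DATA[:JOURNAL]
    form. The host and device names below are placeholders, not from this log.

        # create an OSD on /dev/sdb with its journal on an SSD partition (/dev/sdc1);
        # osd-node1, /dev/sdb and /dev/sdc1 are placeholder names
        ceph-deploy osd create osd-node1:/dev/sdb:/dev/sdc1
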
[23:45] * cephed (506d7573@107.161.19.109) has joined #ceph
[23:47] <bdonnahue> lurbs: so it's not an official release?
[23:47] * joef1 (~Adium@2620:79:0:207:c08:1e5d:e2f3:81db) Quit (Ping timeout: 480 seconds)
[23:49] <lurbs> The patches have been accepted upstream, but aren't in a release yet.
[23:49] <lurbs> https://github.com/ceph/ceph/pull/3092/commits
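
    For reference, the LUKS and cipher support in that pull request is driven by
    ceph.conf options read by ceph-disk. A minimal sketch of wiring them in; the
    option names and values here are assumptions to verify against the merged code,
    not settings confirmed in this log.

        # append the (assumed) LUKS options to ceph.conf before preparing OSDs,
        # or add them to an existing [osd] section instead
        printf '%s\n' '[osd]' 'osd dmcrypt type = luks' 'osd dmcrypt key size = 256' >> /etc/ceph/ceph.conf
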
[23:54] <championofcyrodi> Be-El: as you said... using custom bash scripts, I was able to traverse the osd data directories for all my osds, find rbd images that had <uid>-0000000000000000_head_<hash> or whatever... read the first block with hexdump and determine it was a QFI image... then traverse and get a list that builds a map of node, osd, and obj... then pipe into uniq to remove replicas... then scp ALL the objs to a single folder. then u
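
    A rough shell sketch of the object-harvesting approach championofcyrodi
    describes, assuming FileStore OSDs mounted under /var/lib/ceph/osd/ceph-*/current
    and a known block-name prefix for the image. The prefix, paths and output
    directory are placeholders, paths are assumed to contain no spaces, and when the
    OSDs span several hosts the final cp would become an scp to one collection node.

        #!/bin/bash
        # Harvest the on-disk objects of one rbd image from every locally mounted OSD.
        PREFIX='rb.0.1234.abcdef'          # placeholder: block_name_prefix of the image
        OUT=/root/recovered-objs           # placeholder: where to collect the objects
        LIST=/tmp/rbd-objs.txt

        mkdir -p "$OUT"; : > "$LIST"

        # 1. Walk every OSD data directory and list object files belonging to the image.
        #    FileStore mangles some characters in file names, so the pattern may need tweaking.
        for osd in /var/lib/ceph/osd/ceph-*/current; do
            find "$osd" -type f -name "${PREFIX}*head*" >> "$LIST"
        done

        # 2. Sanity check: the first object (name containing 0000000000000000) should
        #    start with "QFI", the qcow magic mentioned above.
        head -c 4 "$(grep -m1 '0000000000000000' "$LIST")" | hexdump -C

        # 3. Keep one copy per object name (drop replicas held by other local OSDs),
        #    then gather everything into one folder.
        awk -F/ '{print $NF, $0}' "$LIST" | sort -u -k1,1 | awk '{print $2}' |
        while read -r obj; do
            cp "$obj" "$OUT/"              # use scp here when the OSDs live on other hosts
        done
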
[23:55] <RayTracer> Thanks guys for help and good night!
[23:55] * RayTracer (~RayTracer@host-81-190-2-156.gdynia.mm.pl) Quit (Quit: Leaving...)
[23:55] <bdonnahue> lurbs: thanks for the info, I will give that a look. Do you have any notes on how you got your system up and running?
[23:56] <championofcyrodi> currently it's a couple of scripts that call each other... but I'll clean up and consolidate a bit, and post it with a link here for anyone who needs to recover images when their monitors are all dead and gone.
[23:58] <lurbs> bdonnahue: It was all just using ceph-deploy, as per the docs, but with --dmcrypt. I'd recommend against running those patches, BTW. Not unless you feel like building your own packages and running code untested by more than one cluster.
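
    A minimal sketch of the ceph-deploy invocation lurbs describes, with a
    placeholder host and disk; --dmcrypt is the documented flag, while the key
    directory shown is only the conventional default and should be treated as an
    assumption.

        # prepare and activate an encrypted OSD; osd-node1 and /dev/sdb are placeholders
        ceph-deploy osd create --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys osd-node1:/dev/sdb
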

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.