#ceph IRC Log

IRC Log for 2016-03-10

Timestamps are in GMT/BST.

[0:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[0:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[0:01] * bjornar_ (~bjornar@ti0099a430-0908.bb.online.no) Quit (Ping timeout: 480 seconds)
[0:01] * cathode (~cathode@50-198-166-81-static.hfc.comcastbusiness.net) has joined #ceph
[0:04] * AI (63f47aa5@107.161.19.53) has joined #ceph
[0:05] * AI (63f47aa5@107.161.19.53) Quit ()
[0:07] * AI (87f5310e@107.161.19.53) has joined #ceph
[0:11] * T1 (~the_one@87.104.212.66) Quit (Read error: Connection reset by peer)
[0:16] * T1 (~the_one@87.104.212.66) has joined #ceph
[0:17] * wyang (~wyang@116.216.30.3) has joined #ceph
[0:18] * Moriarty (~ChauffeR@193-238-47-212.rev.cloud.scaleway.com) has joined #ceph
[0:20] * wyang (~wyang@116.216.30.3) Quit ()
[0:27] * kawa2014 (~kawa@38.109.203.254) Quit (Ping timeout: 480 seconds)
[0:27] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Quit: Lost terminal)
[0:32] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) has joined #ceph
[0:48] * kbader (~kyle@64.169.30.57) Quit (Ping timeout: 480 seconds)
[0:48] * Moriarty (~ChauffeR@76GAAC64H.tor-irc.dnsbl.oftc.net) Quit ()
[0:48] * Diablodoct0r (~cmrn@84ZAADDLX.tor-irc.dnsbl.oftc.net) has joined #ceph
[0:48] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[0:51] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) Quit (Quit: Leaving.)
[0:52] * sudocat1 (~dibarra@192.185.1.19) has joined #ceph
[0:53] * krypto (~krypto@G68-121-13-132.sbcis.sbc.com) has joined #ceph
[0:57] <flaf> Hi, what is the correct Unix rights of /var/lib/ceph in Infernalis? Currently I have [drwxr-x--- ceph ceph] and it works well.
[0:57] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) has joined #ceph
[0:57] * tsg (~tgohad@134.134.139.74) Quit (Remote host closed the connection)
[0:57] <flaf> But my snmp agent can't read this directory and I have errors in syslog "snmpd[13898]: Cannot statfs /var/lib/ceph/osd/ceph-14#012: Permission denied"
[0:58] <flaf> Is it a problem if I set the rights to [drwxr-xr-x ceph ceph]?
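The two modes flaf is weighing can be tried safely on a scratch directory first; a minimal sketch (the /var/lib/ceph path and ceph:ceph ownership come from the discussion above — the demo deliberately uses a temp directory as a stand-in):

```shell
# Demo of the two directory modes discussed: 0750 (drwxr-x---, ceph-only)
# vs 0755 (drwxr-xr-x, world-readable), on a scratch dir standing in for
# /var/lib/ceph.
d=$(mktemp -d)

chmod 0750 "$d"        # drwxr-x--- : a non-ceph user such as snmpd cannot enter
stat -c '%a' "$d"      # prints: 750

chmod 0755 "$d"        # drwxr-xr-x : any user can read/traverse the directory
stat -c '%a' "$d"      # prints: 755

rmdir "$d"
```

Whether 0755 is acceptable is a site policy call; note it only loosens read/traverse access on the directory itself — per-file modes underneath (e.g. the 0600 OSD keyring files) are unchanged.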
[0:59] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[1:00] * sudocat1 (~dibarra@192.185.1.19) Quit (Ping timeout: 480 seconds)
[1:00] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) Quit (Ping timeout: 480 seconds)
[1:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[1:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[1:02] * densin (~densin@ppp-58-8-214-217.revip2.asianet.co.th) has joined #ceph
[1:11] * togdon (~togdon@74.121.28.6) Quit (Quit: Bye-Bye.)
[1:11] <motk> well poop
[1:11] <motk> module is missing interpreter line - config_template
[1:11] <motk> leseb: in done broke again
[1:17] * vbellur (~vijay@71.234.224.255) has joined #ceph
[1:18] * Diablodoct0r (~cmrn@84ZAADDLX.tor-irc.dnsbl.oftc.net) Quit ()
[1:18] * MJXII (~CoMa@tor2e1.privacyfoundation.ch) has joined #ceph
[1:21] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) Quit (Quit: Leaving.)
[1:22] * rotbeard (~redbeard@ppp-115-87-78-25.revip4.asianet.co.th) has joined #ceph
[1:22] <motk> anyone know if I can specify datacentre crushmapping in ceph-ansible
[1:23] * LeaChim (~LeaChim@host86-171-90-242.range86-171.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[1:27] * vicente (~~vicente@111-241-44-175.dynamic.hinet.net) has joined #ceph
[1:34] * xarses (~xarses@64.124.158.100) Quit (Ping timeout: 480 seconds)
[1:38] * vicente (~~vicente@111-241-44-175.dynamic.hinet.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[1:40] * rendar (~I@host123-4-dynamic.33-79-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[1:43] * wyang (~wyang@116.216.30.3) has joined #ceph
[1:44] * vicente (~~vicente@111-241-33-60.dynamic.hinet.net) has joined #ceph
[1:47] * fmeppo (~oftc-webi@208-64-171-38.lfytina1.metronetinc.net) has joined #ceph
[1:48] * MJXII (~CoMa@7V7AADATW.tor-irc.dnsbl.oftc.net) Quit ()
[1:49] * MentalRay_ (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Ping timeout: 480 seconds)
[1:49] <fmeppo> I'm not sure I'm able to fully delete a file when using data striping in cephfs. Is this normal?
[1:49] <fmeppo> If I delete a file, the space is reclaimed by the FS after a few seconds. But if I'm using a wider striping layout, only a portion of the file's space is ever reclaimed.
[1:50] <fmeppo> I've got a test FS (Infernalis) that's got 300+ GB of space allocated, but no files in it. Am I doing something wrong here?
[1:52] <diq> fmeppo, you got a cache?
[1:52] <diq> a cache PG pool in front of the data pool?
[1:52] <diq> if so, you'll need to flush the cache to release the disk sapce
[1:52] <diq> space
[1:53] <fmeppo> No - though that's an interesting side-effect of a cache pool I hadn't thought of.
[1:53] * wyang (~wyang@116.216.30.3) Quit (Quit: This computer has gone to sleep)
[1:55] <fmeppo> Just a plain ol' pool, pretty default.
[1:56] * wyang (~wyang@116.216.30.3) has joined #ceph
[1:56] * oms101 (~oms101@p20030057EA042D00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:57] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[2:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[2:01] * vicente (~~vicente@111-241-33-60.dynamic.hinet.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[2:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[2:01] * cathode (~cathode@50-198-166-81-static.hfc.comcastbusiness.net) Quit (Quit: Leaving)
[2:01] * wyang (~wyang@116.216.30.3) Quit (Quit: This computer has gone to sleep)
[2:01] * lcurtis_ (~lcurtis@47.19.105.250) Quit (Remote host closed the connection)
[2:02] * mx (~myeho@50.46.149.183) Quit (Quit: mx)
[2:05] * oms101 (~oms101@p20030057EA05C300C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[2:09] * angdraug (~angdraug@64.124.158.100) Quit (Quit: Leaving)
[2:11] * Kupo1 (~tyler.wil@23.111.254.159) Quit (Read error: Connection reset by peer)
[2:12] <diq> I'm out of ideas then ;)
[2:12] * wCPO (~Kristian@188.228.31.139) Quit (Ping timeout: 480 seconds)
[2:18] * biGGer (~Curt`@93.115.95.206) has joined #ceph
[2:20] * dnovosel (dee8350e@107.161.19.53) has joined #ceph
[2:23] <dnovosel> Hello all.. I was wondering if there was anyone around that could help me. We have a ceph cluster deployed on two nodes, with 14 OSDs per node. This is integrated directly with Openstack, plus we have an rbd device we mount on one system to store some files. During a rebalance operation it appears we had some networking and possibly HDD failure
[2:23] <dnovosel> situation occur, and the cluster is not working properly.
[2:23] <dnovosel> I tried to xfs_repair the rbd devices and that locks up.
[2:23] <dnovosel> And when I try and list the objects in the pool [rados ls -p shared-storage] it also locks up about half way down the list.
[2:24] * zhaochao (~zhaochao@125.39.112.6) has joined #ceph
[2:27] <dnovosel> We also have some of our PGs in various states [degraded / stale / stuck stale / stuck unclean / etc] although that number is around 150 - 200 of 4500.
[2:27] <dnovosel> One of the OSDs was also failing to come up [** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-7: (2) No such file or directory] which was fixed by manually starting it.
[2:27] <dnovosel> [root@storage1 ~]# ceph-osd -i 7 -f
[2:27] <dnovosel> starting osd.7 at :/0 osd_data /var/lib/ceph/osd/ceph-7 /var/lib/ceph/osd/ceph-7/journal
[2:27] <dnovosel> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[2:27] <dnovosel> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[2:27] <dnovosel> 2016-03-09 11:27:15.583054 7fd6ae958900 -1 osd.7 1836 log_to_monitors {default=true}
[2:27] <dnovosel> And it included that error.
[2:27] <dnovosel> At this point I'm at a loss as to what to look at next.. does anyone have a suggestion?
[2:30] * FjordPrefect (~soorya@103.231.216.194) has joined #ceph
[2:30] <dnovosel> Ultimately what I need to try and do is recover a qcow that is being stored on the rbd device. Or at least gain enough access to it to try and rip some files out of it.
[2:31] * BrianA (~BrianA@fw-rw.shutterfly.com) Quit (Read error: Connection reset by peer)
[2:32] * bliu (~liub@203.192.156.9) Quit (Quit: Leaving)
[2:33] * bliu (~liub@203.192.156.9) has joined #ceph
[2:34] * BrianA (~BrianA@fw-rw.shutterfly.com) has joined #ceph
[2:37] <motk> dnovosel: hardware failure?
[2:37] <dnovosel> I think it's possible an OSD failed, and at the point where it failed the PG could have been degraded.
[2:38] <dnovosel> But I have no idea how to get things back to working now.
[2:38] <motk> is your crushmap satisfiable by remaining hardware?
[2:38] <motk> have you gone through the troubleshooting pgs/osds doc?
[2:40] * BrianA (~BrianA@fw-rw.shutterfly.com) Quit (Read error: Connection reset by peer)
[2:40] <dnovosel> I have gone through them to some degree, and I am continuing to look over options in that regard.
[2:40] <dnovosel> I'm pursuing this conversation in tandem.
[2:43] <diq> has anyone ever used CephFS with autofs and idle times?
[2:43] <diq> the kernel driver doesn't ever appear to unmount on timeout
[2:44] <motk> dnovosel: yes, they're a bit out of date :(
[2:44] <motk> diq: sorry, no
[2:44] * krypto (~krypto@G68-121-13-132.sbcis.sbc.com) Quit (Read error: Connection reset by peer)
[2:46] * azizulhakim (~oftc-webi@c-73-84-211-145.hsd1.fl.comcast.net) has joined #ceph
[2:48] * biGGer (~Curt`@84ZAADDO2.tor-irc.dnsbl.oftc.net) Quit ()
[2:48] * Tenk (~storage@178-175-128-50.ip.as43289.net) has joined #ceph
[2:49] * kefu (~kefu@114.92.107.250) has joined #ceph
[2:50] <dnovosel> motk: That's pretty normal in my experience with OSS stuff..
[2:50] <motk> dnovosel: it's a notable issue with ceph though
[2:50] <azizulhakim> I've installed ceph from source. how do I test it now?
[2:50] <motk> the red hat docs are actually better I think
[2:53] <dnovosel> Well I'll continue hunting around to see what I can do to get the cluster back online. The worst part of all this is, we were about to integrate two more nodes which would have given us more redundancy which of course may have prevented this.. but that was scheduled for next week.. :(
[2:56] * yanzheng (~zhyan@125.71.108.197) has joined #ceph
[2:58] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[2:58] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[3:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[3:01] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[3:09] * wyang (~wyang@116.216.30.3) has joined #ceph
[3:11] * derjohn_mobi (~aj@x590d712b.dyn.telefonica.de) has joined #ceph
[3:11] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[3:12] * wer (~wer@216.197.66.226) Quit (Remote host closed the connection)
[3:12] * wer (~wer@216.197.66.226) has joined #ceph
[3:13] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) Quit (Ping timeout: 480 seconds)
[3:18] * Tenk (~storage@76GAAC69I.tor-irc.dnsbl.oftc.net) Quit ()
[3:18] * CoMa (~storage@lumumba.torservers.net) has joined #ceph
[3:18] * derjohn_mob (~aj@x590cea82.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[3:22] * MentalRay (~MentalRay@107.171.161.165) has joined #ceph
[3:23] * georgem (~Adium@69-196-182-134.dsl.teksavvy.com) has joined #ceph
[3:25] <fmeppo> dnovosel: How many disks are failing xfs_repair? If it's a low number, do you have enough spare drives to replace the damaged ones?
[3:25] <fmeppo> You may be able to dd off the damaged disk onto a good one, then xfs_repair the functional drive to get a FS back.
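fmeppo's dd-then-repair suggestion, sketched on scratch files (a real rescue would read from the failing block device, typically adding `conv=noerror,sync` so read errors don't abort the copy, or using GNU ddrescue instead — the temp files here are stand-ins):

```shell
# Stand-ins for the damaged disk and its replacement.
src=$(mktemp)
dst=$(mktemp)
printf 'pretend-xfs-image' > "$src"

# Block-for-block copy. Against a failing disk you would add
# conv=noerror,sync (skip unreadable sectors, pad with zeros) or
# reach for ddrescue, then xfs_repair the copy on the good drive.
dd if="$src" of="$dst" bs=4k status=none

cmp -s "$src" "$dst" && echo identical   # prints: identical
rm -f "$src" "$dst"
```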
[3:31] * _are_ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) Quit (Ping timeout: 480 seconds)
[3:37] * wyang (~wyang@116.216.30.3) Quit (Quit: This computer has gone to sleep)
[3:38] <dnovosel> It was the /dev/rbd I was trying to xfs_repair on.
[3:38] <dnovosel> Spare drives I can get, that's not really an issue, but the node is full, so I'd have to get somewhere to copy it off to first.
[3:39] <dnovosel> But I will try the suspected bad OSD as well.
[3:39] * azizulhakim (~oftc-webi@c-73-84-211-145.hsd1.fl.comcast.net) Quit (Ping timeout: 480 seconds)
[3:40] * wyang (~wyang@114.111.166.41) has joined #ceph
[3:42] * wyang (~wyang@114.111.166.41) Quit ()
[3:45] * georgem (~Adium@69-196-182-134.dsl.teksavvy.com) Quit (Quit: Leaving.)
[3:45] * krypto (~krypto@65.115.222.52) has joined #ceph
[3:48] * CoMa (~storage@7V7AADAXV.tor-irc.dnsbl.oftc.net) Quit ()
[3:48] * efirs (~firs@c-50-185-70-125.hsd1.ca.comcast.net) has joined #ceph
[3:48] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[3:53] * KeeperOfTheSoul (~ulterior@93.115.95.205) has joined #ceph
[3:54] * krypto (~krypto@65.115.222.52) Quit (Ping timeout: 480 seconds)
[3:54] * krypto (~krypto@G68-121-13-179.sbcis.sbc.com) has joined #ceph
[3:57] * georgem (~Adium@69-196-182-134.dsl.teksavvy.com) has joined #ceph
[3:59] * remix_tj is now known as Guest7292
[3:59] * remix_tj (~remix_tj@bonatti.remixtj.net) has joined #ceph
[4:00] * Guest7292 (~remix_tj@bonatti.remixtj.net) Quit (Read error: Connection reset by peer)
[4:05] * dyasny (~dyasny@cable-192.222.131.135.electronicbox.net) Quit (Ping timeout: 480 seconds)
[4:05] * naoto (~naotok@2401:bd00:b001:8920:27:131:11:254) has joined #ceph
[4:06] <m0zes> dnovosel: so, down+peering pgs? did you run a ceph pg $pgid query on them to see what osd they are waiting for?
[4:13] * krypto (~krypto@G68-121-13-179.sbcis.sbc.com) Quit (Read error: Connection reset by peer)
[4:13] * _are_ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) has joined #ceph
[4:14] * wyang (~wyang@114.111.166.41) has joined #ceph
[4:14] * mx (~myeho@50.46.149.183) has joined #ceph
[4:14] * scuttle|afk is now known as scuttlemonkey
[4:20] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:22] * bignose (~bignose@jigong.madmonks.org) has left #ceph
[4:23] * KeeperOfTheSoul (~ulterior@7V7AADAYR.tor-irc.dnsbl.oftc.net) Quit ()
[4:25] * wyang (~wyang@114.111.166.41) Quit (Quit: This computer has gone to sleep)
[4:27] * Defaultti1 (~zc00gii@185.65.200.93) has joined #ceph
[4:29] * overclk (~quassel@117.202.97.189) has joined #ceph
[4:31] * wyang (~wyang@114.111.166.41) has joined #ceph
[4:33] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[4:34] * wyang (~wyang@114.111.166.41) Quit ()
[4:37] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[4:45] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[4:48] * kefu (~kefu@114.92.107.250) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[4:48] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[4:57] * Defaultti1 (~zc00gii@7V7AADAZ2.tor-irc.dnsbl.oftc.net) Quit ()
[4:57] * sese_ (~matx@vps-1065056.srv.pa.infobox.ru) has joined #ceph
[4:57] * kefu (~kefu@114.92.107.250) has joined #ceph
[5:04] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[5:12] * densinz (~densin@ppp-58-8-240-34.revip2.asianet.co.th) has joined #ceph
[5:19] * Vacuum_ (~Vacuum@88.130.198.211) has joined #ceph
[5:19] * densin (~densin@ppp-58-8-214-217.revip2.asianet.co.th) Quit (Ping timeout: 480 seconds)
[5:21] * jtriley (~jtriley@c-73-249-255-187.hsd1.ma.comcast.net) has joined #ceph
[5:23] * georgem (~Adium@69-196-182-134.dsl.teksavvy.com) Quit (Quit: Leaving.)
[5:24] * kbader (~kyle@64.169.30.57) has joined #ceph
[5:26] * Vacuum__ (~Vacuum@88.130.198.126) Quit (Ping timeout: 480 seconds)
[5:27] * sese_ (~matx@7V7AADA0X.tor-irc.dnsbl.oftc.net) Quit ()
[5:28] * wyang (~wyang@116.216.30.3) has joined #ceph
[5:30] * karnan (~karnan@121.244.87.117) has joined #ceph
[5:33] * wyang (~wyang@116.216.30.3) Quit ()
[5:35] * krypto (~krypto@65.115.222.52) has joined #ceph
[5:38] <dnovosel> So regarding my earlier listed errors.. I found this on one of my OSDs that was in a failed state: ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-7: (2) No such file or directory
[5:38] <dnovosel> If I force start it, it seems to come up, but with this error: SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[5:39] <dnovosel> I tried an xfs_repair on the disk and it seemed to say everything was okay.
[5:39] * shyu_ (~shyu@111.201.78.91) Quit (Ping timeout: 480 seconds)
[5:40] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) Quit (Read error: Connection reset by peer)
[5:41] <dnovosel> One person [offline] suggested my error state could be related to the drive failing, and then not having enough space to satisfy the crush map. But I'm not sure how to confirm this detail TBH.
[5:42] * cooldharma06 (~chatzilla@14.139.180.40) Quit (Quit: ChatZilla 0.9.92 [Iceweasel 21.0/20130515140136])
[5:42] * chiluk (~quassel@172.34.213.162.lcy-01.canonistack.canonical.com) Quit (Remote host closed the connection)
[5:43] <m0zes> dnovosel: so, down+peering pgs? did you run a ceph pg $pgid query on them to see what osd they are waiting for?
[5:43] * chiluk (~quassel@172.34.213.162.lcy-01.canonistack.canonical.com) has joined #ceph
[5:44] * rongze (~rongze@36.110.78.250) has joined #ceph
[5:44] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[5:47] * jtriley (~jtriley@c-73-249-255-187.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[5:50] * kbader (~kyle@64.169.30.57) Quit (Ping timeout: 480 seconds)
[5:51] <dnovosel> Well I wasn't seeing down+peering, but I had to manually kick osd.7 to get it online.
[5:51] <m0zes> so, pastebin 'ceph health detail' ?
[5:52] * rdas (~rdas@121.244.87.116) has joined #ceph
[5:53] <m0zes> the "stale" and "stuck stale" pgs are the ones that are probably holding everything up...
[5:53] <dnovosel> http://pastebin.com/rrDByqQk
[5:54] * kbader (~kyle@64.169.30.57) has joined #ceph
[5:54] <m0zes> was this at least "size 2"
[5:56] <dnovosel> Yeah.. size 2.
[5:56] <dnovosel> osd_pool_default_size = 2
[5:56] <dnovosel> Right now it is two hosts, so we went with size 2.
[5:56] <dnovosel> We are moving to 4 hosts next week..
[5:56] <m0zes> so, in theory all the data *should* be there.
[5:56] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) has joined #ceph
[5:56] <m0zes> is min_size 1 or 2?
[5:57] <dnovosel> Hmm, I don't recall setting that specifically.
[5:57] <dnovosel> Let me chek.
[5:57] <dnovosel> *check
[5:57] <m0zes> ceph osd pool get ${name} min_size
[5:57] * Pirate (~thundercl@188.214.129.85) has joined #ceph
[5:58] <dnovosel> Yeah.. set min_size: 1
[5:59] <m0zes> that means there's a chance osd.7 had the only "current" copy of the data for about 14 pgs.
[5:59] <dnovosel> Hmm, would someone have a less current copy?
[6:00] <m0zes> there should be at least 1
[6:00] <m0zes> ceph pg 1.14b query
[6:01] <dnovosel> Umm, seems to lock up.
[6:01] <dnovosel> Error EINTR: problem getting command descriptions from pg.1.14b
[6:02] <m0zes> thats odd.
[6:03] <m0zes> what version of ceph?
[6:03] <dnovosel> 9.2.0
[6:03] <dnovosel> ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
[6:04] <m0zes> okay. same here. there must be something up when the pg is stale.
[6:05] * overclk (~quassel@117.202.97.189) Quit (Remote host closed the connection)
[6:05] * madkiss (~madkiss@2001:6f8:12c3:f00f:8448:b6e8:6f3a:c696) Quit (Quit: Leaving.)
[6:06] <dnovosel> Yeah, I tried a second one and same result on stale.
[6:07] <dnovosel> Ultimately, the things that were "actively" writing we can mostly restore off backups, lose maybe a few VMs.. I can live with that.. we have about 1-2TB of data though that would be a complete PITA to replace, but it's less often written.. so I'm hoping we can recover that somehow for the most part.
[6:08] <dnovosel> Not sure if that's likely or not though.
[6:08] * naoto (~naotok@2401:bd00:b001:8920:27:131:11:254) Quit (Quit: Leaving...)
[6:08] * AI (87f5310e@107.161.19.53) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[6:08] <dnovosel> Our cluster was running quite full.. like some of the OSDs were pushing 84%..
[6:09] <m0zes> okay. if you are okay with a potentially *destructive* move, you could mark osd 7 as lost. I expect ceph would allow the stale pgs to come up then.
[6:09] <m0zes> ideally you wouldn't do that.
[6:09] <dnovosel> Which is why we wanted to get 2 more nodes added. Not sure if that led to contributed to this.
[6:09] <dnovosel> led or contributed..
[6:09] <m0zes> did you upgrade from hammer? if so, did you change permissions on the osds to the ceph user?
[6:10] <m0zes> and when starting ceph-osd -i 7, did you run it as ceph?
[6:10] <dnovosel> Wasn't an upgrade from hammer.. install on 9.2.0. We've only been running this cluster for a month or two.
[6:10] <dnovosel> It should have been starting up as the ceph user.
[6:12] <m0zes> are there any logs out of osd 7 when starting it manually?
[6:12] <dnovosel> starting osd.7 at :/0 osd_data /var/lib/ceph/osd/ceph-7 /var/lib/ceph/osd/ceph-7/journal
[6:12] <dnovosel> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[6:12] <dnovosel> SG_IO: bad/missing sense data, sb[]: 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[6:12] <dnovosel> 2016-03-09 11:27:15.583054 7fd6ae958900 -1 osd.7 1836 log_to_monitors {default=true}
[6:12] <dnovosel> That happens when I tried it last time.. the SG_IO is a bit worrisome.
[6:13] <dnovosel> Automatic startup results in failing due to missing OSD superblock.
[6:13] <m0zes> I don't know what triggers the sg_io, but I've got that on some of mine as well.
[6:13] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[6:13] <dnovosel> Okay.. so maybe not the worst thing..
[6:15] <dnovosel> So if I set osd.7 to lost it's basically gone and I hope things come back online?
[6:15] <dnovosel> With an estimated potential issue of 14 PGs losing data.
[6:15] <m0zes> right, not the first choice to do.
[6:16] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:16] <dnovosel> In the morning (EST) I can add a new node and bring it into the cluster to increase my available space by around 3.5TB and adding 14 OSDs. Would it make more sense to wait for that?
[6:17] <dnovosel> I'm just a little worried that some of my OSDs are already quite full, and marking osd.7 as lost might push them up over 90%..
[6:17] <m0zes> right, I'd wait til then at least
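The over-90% worry above is easy to sanity-check with back-of-envelope arithmetic: if equally-full peers have to absorb a lost OSD's share, utilization scales by roughly n/(n-1). A sketch using the figures from the discussion (14 OSDs, ~84% full); the even-redistribution assumption is a simplification, since CRUSH won't spread the data perfectly evenly:

```shell
# If one of n equally-full OSDs is lost and its data respreads evenly
# over the remaining n-1, each survivor's utilization grows by n/(n-1).
awk 'BEGIN { util = 0.84; n = 14; printf "%.1f%%\n", 100 * util * n / (n - 1) }'
# prints: 90.5%
```

So losing one of fourteen ~84%-full OSDs lands the survivors right around the 90% mark dnovosel is worried about, which supports waiting for the extra node before marking osd.7 lost.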
[6:18] <m0zes> how about ls -lah /var/lib/ceph/osd/ceph-7/
[6:18] <dnovosel> [root@storage1 ~]# ls -lah /var/lib/ceph/osd/ceph-7/
[6:18] <dnovosel> total 84K
[6:18] <dnovosel> drwxr-xr-x 3 ceph ceph 217 Mar 10 00:14 .
[6:18] <dnovosel> drwxr-x--- 16 ceph ceph 4.0K Nov 25 19:26 ..
[6:18] <dnovosel> -rw-r--r-- 1 root root 194 Nov 25 19:25 activate.monmap
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 3 Nov 25 19:25 active
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 37 Nov 25 19:25 ceph_fsid
[6:18] <dnovosel> drwxr-xr-x 638 ceph ceph 20K Mar 9 17:49 current
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 37 Nov 25 19:25 fsid
[6:18] <dnovosel> lrwxrwxrwx 1 ceph ceph 58 Nov 25 19:25 journal -> /dev/disk/by-partuuid/0c9af0bc-c661-4e93-8299-e374db95447e
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 37 Nov 25 19:25 journal_uuid
[6:18] <dnovosel> -rw------- 1 ceph ceph 56 Nov 25 19:25 keyring
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 21 Nov 25 19:25 magic
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 6 Nov 25 19:25 ready
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 4 Nov 25 19:25 store_version
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 53 Nov 25 19:25 superblock
[6:18] <dnovosel> -rw-r--r-- 1 root root 0 Mar 9 23:32 systemd
[6:18] <dnovosel> -rw-r--r-- 1 ceph ceph 2 Nov 25 19:25 whoami
[6:19] * kefu (~kefu@114.92.107.250) Quit (Max SendQ exceeded)
[6:20] * kefu (~kefu@114.92.107.250) has joined #ceph
[6:21] * EinstCrazy (~EinstCraz@218.26.167.76) has joined #ceph
[6:22] <m0zes> start osd 7 with debug_osd 5?
[6:22] <m0zes> that might give you some more info
[6:22] <m0zes> ceph-osd -i 7 -f --debug_osd 5
[6:22] <m0zes> iirc
[6:24] <dnovosel> Well it's doing something now..
[6:24] <m0zes> it might give you some idea of where/when the hang is.
[6:27] * Pirate (~thundercl@7V7AADA20.tor-irc.dnsbl.oftc.net) Quit ()
[6:27] * PierreW1 (~Wijk@84ZAADDX9.tor-irc.dnsbl.oftc.net) has joined #ceph
[6:28] <dnovosel> Well it looks slightly better
[6:28] <dnovosel> http://pastebin.com/RRSdLRet
[6:28] <m0zes> that it does
[6:30] <dnovosel> That said.. the cluster still seems to be decently at fault.
[6:30] <dnovosel> Is there a way I can indicate that I don't want osd.7 to be the primary for a PG? I thought there was.
[6:31] <m0zes> I think you can, I can't remember how, though.
[6:32] <dnovosel> I'll go googling.
[6:33] <m0zes> 'ceph -s' should show a quick snapshot to see if the cluster is recovering (e.g. recovery io, number of pgs in the recovery state, and peering)
[6:35] <dnovosel> 152 pgs degraded
[6:35] <dnovosel> 9 pgs stale
[6:35] <dnovosel> 152 pgs stuck degraded
[6:35] <dnovosel> 9 pgs stuck stale
[6:35] <dnovosel> 210 pgs stuck unclean
[6:35] <dnovosel> 152 pgs stuck undersized
[6:35] <dnovosel> 152 pgs undersized
[6:35] <dnovosel> The undersized part I'm worried about.
[6:35] <dnovosel> Maybe adding the storage tomorrow will fix that.
[6:35] <dnovosel> Then I can remove osd.7 and replace the HDD.
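The pg-state counters pasted above can be pulled out of `ceph -s` output mechanically, which is handy for watching recovery progress. A sketch using the numbers from the paste (the heredoc function is a stand-in for a live `ceph -s` run):

```shell
# Extract a "NNN pgs <state>" counter from ceph -s style output.
# ceph_status is a stand-in that replays the counts pasted above;
# on a real cluster it would be `ceph -s`.
ceph_status() { cat <<'EOF'
152 pgs degraded
9 pgs stale
210 pgs stuck unclean
152 pgs undersized
EOF
}

ceph_status | awk '/stuck unclean/ { print "stuck unclean:", $1 }'
# prints: stuck unclean: 210
```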
[6:37] <m0zes> and you have no other pgs that are down?
[6:37] * sleinen1 (~Adium@2001:620:0:82::102) has joined #ceph
[6:37] <m0zes> s/pgs/osds/
[6:38] * krypto (~krypto@65.115.222.52) Quit (Ping timeout: 480 seconds)
[6:38] <dnovosel> Nope.
[6:39] <dnovosel> That was the only OSD that wasn't coming online right now.
[6:39] <m0zes> I'd look at restarting the osds that are 'last active' for the stale pgs...
[6:39] <m0zes> but that can probably wait for the morning, after you've added a new box ;)
[6:40] <m0zes> I've got to get to bed now, good luck.
[6:40] <m0zes> s/active/acting/
[6:41] <dnovosel> Okay, well thanks for all the help.
[6:41] <dnovosel> Definitely appreciated.
[6:41] <dnovosel> And I learned more about ceph today :)
[6:41] <dnovosel> ceph osd primary-affinity osd.7 0.5
[6:41] <dnovosel> That's how to change affinity.
[6:47] * kbader (~kyle@64.169.30.57) Quit (Ping timeout: 480 seconds)
[6:48] * sleinen1 (~Adium@2001:620:0:82::102) Quit (Read error: Connection reset by peer)
[6:49] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[6:50] * overclk (~quassel@121.244.87.117) has joined #ceph
[6:51] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[6:57] * PierreW1 (~Wijk@84ZAADDX9.tor-irc.dnsbl.oftc.net) Quit ()
[6:57] * x303 (~Deiz@64.18.82.164) has joined #ceph
[7:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[7:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[7:03] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[7:05] * EinstCrazy (~EinstCraz@218.26.167.76) Quit (Remote host closed the connection)
[7:08] * dnovosel (dee8350e@107.161.19.53) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[7:09] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[7:09] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[7:10] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[7:13] * pam (~pam@host52-104-dynamic.31-79-r.retail.telecomitalia.it) has joined #ceph
[7:16] * enax (~enax@94-21-125-222.pool.digikabel.hu) has joined #ceph
[7:18] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[7:18] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[7:22] * shylesh__ (~shylesh@121.244.87.118) has joined #ceph
[7:24] * enax (~enax@94-21-125-222.pool.digikabel.hu) Quit (Ping timeout: 480 seconds)
[7:25] * evelu (~erwan@37.161.46.181) has joined #ceph
[7:27] * x303 (~Deiz@84ZAADDY7.tor-irc.dnsbl.oftc.net) Quit ()
[7:27] * dontron (~Yopi@tor-exit4-readme.dfri.se) has joined #ceph
[7:29] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) has joined #ceph
[7:32] * pam (~pam@host52-104-dynamic.31-79-r.retail.telecomitalia.it) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[7:33] * dgurtner (~dgurtner@178.197.234.222) has joined #ceph
[7:35] * kbader_ (~kyle@64.169.30.57) has joined #ceph
[7:37] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[7:41] * mx (~myeho@50.46.149.183) Quit (Quit: mx)
[7:41] * naoto (~naotok@27.131.11.254) has joined #ceph
[7:44] * shinobu_ (~oftc-webi@nat-pool-nrt-t1.redhat.com) Quit (Ping timeout: 480 seconds)
[7:48] * BrianA (~BrianA@c-73-189-212-113.hsd1.ca.comcast.net) has joined #ceph
[7:48] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) Quit (Ping timeout: 480 seconds)
[7:57] * dontron (~Yopi@84ZAADD0H.tor-irc.dnsbl.oftc.net) Quit ()
[7:57] * Swompie` (~Hazmat@192.87.28.28) has joined #ceph
[7:57] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[7:58] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) has joined #ceph
[8:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[8:01] * pam (~pam@host52-104-dynamic.31-79-r.retail.telecomitalia.it) has joined #ceph
[8:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[8:04] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[8:04] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[8:08] * kbader_ (~kyle@64.169.30.57) Quit (Ping timeout: 480 seconds)
[8:08] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[8:08] * shohn (~shohn@arcotel154.linznet.at) has joined #ceph
[8:09] * m0zes (~mozes@ns1.beocat.ksu.edu) Quit (Ping timeout: 480 seconds)
[8:12] * pam (~pam@host52-104-dynamic.31-79-r.retail.telecomitalia.it) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[8:14] * toMeloos (~toMeloos@53568B3D.cm-6-7c.dynamic.ziggo.nl) has joined #ceph
[8:16] * rendar (~I@95.238.176.232) has joined #ceph
[8:17] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[8:23] * m0zes (~mozes@ns1.beocat.ksu.edu) has joined #ceph
[8:23] * shohn (~shohn@arcotel154.linznet.at) Quit (Ping timeout: 480 seconds)
[8:25] * enax (~enax@hq.ezit.hu) has joined #ceph
[8:25] * sonea (~sonea@212.218.127.222) has joined #ceph
[8:26] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[8:27] * Swompie` (~Hazmat@84ZAADD1D.tor-irc.dnsbl.oftc.net) Quit ()
[8:29] * shohn (~shohn@193.138.123.10) has joined #ceph
[8:32] * DoDzy (~djidis__@tor-exit-1.netdive.xyz) has joined #ceph
[8:34] * rakeshgm (~rakesh@106.51.29.94) has joined #ceph
[8:36] * dgurtner (~dgurtner@178.197.234.222) Quit (Ping timeout: 480 seconds)
[8:38] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[8:38] * dnovosel (dee8350e@107.161.19.53) has joined #ceph
[8:39] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) has joined #ceph
[8:51] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) Quit (Quit: Bye)
[8:53] * fmeppo (~oftc-webi@208-64-171-38.lfytina1.metronetinc.net) Quit (Ping timeout: 480 seconds)
[8:55] * rongze (~rongze@36.110.78.250) Quit (Remote host closed the connection)
[8:56] * evelu (~erwan@37.161.46.181) Quit (Ping timeout: 480 seconds)
[8:58] * dgurtner (~dgurtner@178.197.231.230) has joined #ceph
[8:58] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:59] * pam (~pam@193.106.183.1) has joined #ceph
[8:59] * pabluk__ is now known as pabluk_
[9:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[9:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[9:02] * DoDzy (~djidis__@84ZAADD2D.tor-irc.dnsbl.oftc.net) Quit ()
[9:02] * ZombieTree (~Salamande@76GAADBSC.tor-irc.dnsbl.oftc.net) has joined #ceph
[9:05] * rongze (~rongze@36.110.78.250) has joined #ceph
[9:05] * evelu (~erwan@37.162.44.14) has joined #ceph
[9:08] * sonea (~sonea@212.218.127.222) Quit (Remote host closed the connection)
[9:10] * analbeard (~shw@support.memset.com) has joined #ceph
[9:12] * wyang (~wyang@114.111.166.41) has joined #ceph
[9:13] * wyang (~wyang@114.111.166.41) Quit ()
[9:16] * dnovosel (dee8350e@107.161.19.53) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[9:19] * wyang (~wyang@116.216.30.3) has joined #ceph
[9:23] * wyang (~wyang@116.216.30.3) Quit ()
[9:23] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) Quit (Ping timeout: 480 seconds)
[9:26] * MrAbaddon (~MrAbaddon@a89-155-99-93.cpe.netcabo.pt) Quit (Quit: Leaving)
[9:30] * wyang (~wyang@114.111.166.41) has joined #ceph
[9:32] * ZombieTree (~Salamande@76GAADBSC.tor-irc.dnsbl.oftc.net) Quit ()
[9:32] * Shnaw (~demonspor@192.42.115.101) has joined #ceph
[9:32] * wyang (~wyang@114.111.166.41) Quit ()
[9:34] * wyang (~wyang@114.111.166.41) has joined #ceph
[9:35] * rongze (~rongze@36.110.78.250) Quit (Read error: Connection reset by peer)
[9:35] * derjohn_mobi (~aj@x590d712b.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[9:36] * zwu (~root@58.135.81.96) has joined #ceph
[9:36] * branto (~branto@178.253.157.93) has joined #ceph
[9:36] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) has joined #ceph
[9:37] * rongze (~rongze@li885-39.members.linode.com) has joined #ceph
[9:37] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[9:38] * overclk (~quassel@121.244.87.117) Quit (Remote host closed the connection)
[9:38] * overclk (~quassel@121.244.87.117) has joined #ceph
[9:39] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) Quit (Ping timeout: 480 seconds)
[9:40] * wyang (~wyang@114.111.166.41) Quit (Quit: This computer has gone to sleep)
[9:40] * dugravot61 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[9:40] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Read error: Connection reset by peer)
[9:43] * rongze (~rongze@li885-39.members.linode.com) Quit (Remote host closed the connection)
[9:43] * wyang (~wyang@114.111.166.41) has joined #ceph
[9:44] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[9:44] * rongze (~rongze@li885-39.members.linode.com) has joined #ceph
[9:45] * rongze_ (~rongze@36.110.78.250) has joined #ceph
[9:48] * RameshN (~rnachimu@121.244.87.117) Quit (Quit: Quit)
[9:49] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) has joined #ceph
[9:51] * rongze (~rongze@li885-39.members.linode.com) Quit (Read error: Connection reset by peer)
[9:54] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) Quit (Ping timeout: 480 seconds)
[9:55] * dinux (uid110348@id-110348.brockwell.irccloud.com) has joined #ceph
[9:56] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) has joined #ceph
[10:00] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[10:00] * hyperbaba (~hyperbaba@mw-at-rt-nat.mediaworksit.net) has joined #ceph
[10:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[10:01] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) has joined #ceph
[10:01] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) has joined #ceph
[10:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[10:02] * Shnaw (~demonspor@84ZAADD4K.tor-irc.dnsbl.oftc.net) Quit ()
[10:09] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[10:12] * EchoesOfAmit (~AmitSaroj@103.40.65.111) has joined #ceph
[10:13] * derjohn_mobi (~aj@2001:6f8:1337:0:18d0:69fe:138c:5290) has joined #ceph
[10:15] * EchoesOfAmit (~AmitSaroj@103.40.65.111) Quit ()
[10:15] * sleinen (~Adium@vpn-ho-a-5.switch.ch) has joined #ceph
[10:19] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) Quit (Ping timeout: 480 seconds)
[10:20] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) has joined #ceph
[10:27] * zhaochao_ (~zhaochao@124.202.191.138) has joined #ceph
[10:27] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:27] * User9 (~User9@84.241.41.148) has joined #ceph
[10:28] <User9> hello, how are you?
[10:28] <User9> hi everybody
[10:29] * markl (~mark@knm.org) Quit (Ping timeout: 480 seconds)
[10:29] <User9> can I mount one ceph node to multiple servers with rbd or cephfs?
[10:29] * pam (~pam@193.106.183.1) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:30] <User9> can I mount one ceph node to multiple servers with rbd or cephfs?
[10:33] * zhaochao (~zhaochao@125.39.112.6) Quit (Ping timeout: 480 seconds)
[10:33] * zhaochao_ is now known as zhaochao
[10:36] * TMM (~hp@185.5.122.2) has joined #ceph
[10:36] * rjdias (~rdias@bl7-92-98.dsl.telepac.pt) has joined #ceph
[10:36] * Mousey (~phyphor@62.102.148.67) has joined #ceph
[10:41] * ade (~abradshaw@82.199.64.84) has joined #ceph
[10:41] * ade (~abradshaw@82.199.64.84) Quit ()
[10:42] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[10:42] * rjdias is now known as rdias
[10:43] * MentalRay (~MentalRay@107.171.161.165) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:47] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) Quit (Quit: ZNC - http://znc.in)
[10:48] * LeaChim (~LeaChim@host86-171-90-242.range86-171.btcentralplus.com) has joined #ceph
[10:51] * yanzheng (~zhyan@125.71.108.197) Quit (Quit: ??????)
[10:51] * FjordPrefect (~soorya@103.231.216.194) has left #ceph
[10:52] * FjordPrefect (~soorya@103.231.216.194) has joined #ceph
[10:52] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) has joined #ceph
[10:53] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[10:55] * wyang (~wyang@114.111.166.41) Quit (Quit: This computer has gone to sleep)
[10:56] * cooldharma06 (~chatzilla@14.139.180.40) has joined #ceph
[11:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[11:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[11:01] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[11:01] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[11:02] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) Quit (Quit: ZNC - http://znc.in)
[11:03] * b0e (~aledermue@213.95.25.82) has joined #ceph
[11:03] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) has joined #ceph
[11:04] * wyang (~wyang@114.111.166.41) has joined #ceph
[11:06] * branto (~branto@178.253.157.93) Quit (Quit: Leaving.)
[11:06] * branto (~branto@178.253.157.93) has joined #ceph
[11:06] * Mousey (~phyphor@84ZAADD6M.tor-irc.dnsbl.oftc.net) Quit ()
[11:06] * Tumm (~xENO_@178.162.216.42) has joined #ceph
[11:07] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) Quit (Remote host closed the connection)
[11:07] * wyang (~wyang@114.111.166.41) Quit ()
[11:07] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) has joined #ceph
[11:11] * toMeloos (~toMeloos@53568B3D.cm-6-7c.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[11:12] * vikhyat is now known as vikhyat|brb
[11:12] * Heebie (~thebert@dub-bdtn-office-r1.net.digiweb.ie) has joined #ceph
[11:15] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[11:15] * bjornar_ (~bjornar@109.247.131.38) has joined #ceph
[11:18] * foosinn (~stefan@ipbcc34596.dynamic.kabel-deutschland.de) has joined #ceph
[11:20] * dnovosel (3a7b8acd@107.161.19.53) has joined #ceph
[11:23] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[11:31] * evelu (~erwan@37.162.44.14) Quit (Read error: Connection reset by peer)
[11:32] <skoude_> hmm.. I had a one-node failure, and ceph doesn't seem to be getting all the osd's back in.. http://pastebin.com/X61gCwRK any idea what to do?
[11:32] <skoude_> the node was away from the cluster for about a day...
[11:33] <skoude_> do I need to bring the osds in manually, or?
[11:34] <Gugge-47527> that should happen automatically when they start
[11:34] <Gugge-47527> are they running?
[11:35] * evelu (~erwan@37.162.44.14) has joined #ceph
[11:36] * lmb (~lmb@2a02:8109:8100:1d2c:d597:e030:1e6c:d289) Quit (Ping timeout: 480 seconds)
[11:36] * Tumm (~xENO_@84ZAADD7J.tor-irc.dnsbl.oftc.net) Quit ()
[11:36] * Silentspy (~Scrin@7V7AADBFN.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:37] * User9 (~User9@84.241.41.148) Quit (Read error: Connection reset by peer)
[11:37] * User9 (~User9@84.241.41.148) has joined #ceph
[11:40] * User9 (~User9@84.241.41.148) Quit (Read error: Connection reset by peer)
[11:41] * User9 (~User9@84.241.41.148) has joined #ceph
[11:42] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) has joined #ceph
[11:43] * shylesh__ (~shylesh@121.244.87.118) Quit (Ping timeout: 480 seconds)
[11:44] * naoto (~naotok@27.131.11.254) Quit (Quit: Leaving...)
[11:46] <skoude_> no they are down.. Any way to get them up and running manually?
[11:49] <boolman> how do I figure out the id of the mds I want to remove?
[11:50] * dinux (uid110348@id-110348.brockwell.irccloud.com) Quit (Quit: Connection closed for inactivity)
[11:51] * shinobu (~oftc-webi@p790356f6.tokynt01.ap.so-net.ne.jp) has joined #ceph
[11:52] <Heebie> skoude_: If the OSD's didn't come back up and go back in, there's probably something wrong with them, such as two separate OSD's configured to use the same SSD journal partition or something like that. (I did this one myself on my first test system) There is definitely likely to be a problem. The logs specific to those OSD's (on the nodes) might give you some clues.
[11:52] <boolman> startup arguments: -i ceph-mds02 , ceph mds rm mds.ceph-mds02 apparently this doesn't work; it should be mds.<int>
[11:54] <skoude_> Heebie: but I haven't done any changes to those, and they worked yesterday, before reboot..
[11:55] <skoude_> Heebie: hmm.. I will restart the node, see if that helps ..
[11:57] <overclk> boolman: any hints from ceph mds dump?
[11:57] * drankis (~drankis__@89.111.13.198) has joined #ceph
[11:57] * drankis (~drankis__@89.111.13.198) Quit ()
[11:57] <Heebie> That's exactly what happened with me. They appeared to work fine after I deployed them with ceph-deploy, but the next time I went to reboot the node after a software update, they didn't come back up. Checking the configurations for them, I found that they were both set up to use the same physical SSD disk partition for journaling (the symlink inside the OSD was the same.) so I went through my bash history to find the ceph-deploy commands
[11:59] <boolman> overclk: no not really, I dont even see the mds I want to remove there
[11:59] <skoude_> Heebie: but the oddest thing is that I have rebooted this node before, and it has always been working after that... :) Well, I will make some checks..
[12:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[12:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[12:02] <skoude_> Okay, new reboot solved the problem.. All the osd's are now in :D
[12:05] * treenerd_ (~gsulzberg@adrastea.r3-gis.com) has joined #ceph
[12:06] * Silentspy (~Scrin@7V7AADBFN.tor-irc.dnsbl.oftc.net) Quit ()
[12:07] * Kyso_ (~PappI@tor-exit-node.seas.upenn.edu) has joined #ceph
[12:07] <overclk> boolman: try "ceph node ls mds", see if it gives the thing you're looking for
[12:07] <overclk> boolman: it's a shot in the dark, but it looks like it might be
[12:08] <boolman> hm, that might look like rank?
[12:08] <boolman> since active is 0, and standby is -1
[12:09] * efirs (~firs@c-50-185-70-125.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[12:09] * dnovosel (3a7b8acd@107.161.19.53) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[12:10] <boolman> and the one I want to delete isnt there, is it automatically removed from the clustermap if I shutdown the daemon?
[12:10] <skoude_> hmm.. see this paste: http://pastebin.com/4aHZ9RzE for some reason crushmap has changed, because host d0c-c4-7a-1f-00-1e and d0c-c4-7a-1f-0a-44 osd's should be placed on ceph_slow and ceph_fast location in crushmap.. Any idea why this has happened after reboot? Do I need to reapply the crushmap?
[12:11] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[12:12] <skoude_> basically the crushmap was originally like this: http://pastebin.com/H3ARWPDi before rebooting the ceph3 and ceph4 nodes..
[12:13] <skoude_> So why is the crushmap not the same anymore after reboots or don't I just understand how it works?
[12:14] <overclk> boolman: it's definitely rank.. you're right.
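The exchange above boils down to: get the rank from the FS map, then address the daemon by rank rather than by name. A minimal sketch of that workflow; the rank value is an example:

```shell
# Inspect the FS map; active daemons are listed with their rank (0, 1, ...),
# standbys with rank -1:
ceph mds dump

# Stop the mds daemon on its host first, then mark it failed by rank
# (rank 0 used here as an example):
ceph mds fail 0
```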
[12:15] <skoude_> Okay, maybe it's because I'm missing this in ceph3 and ceph4 configs.. [osd]
[12:15] <skoude_> osd crush update on start = false
[12:16] <skoude_> or?
[12:19] <skoude_> So can I just reapply the crushmap and anything bad will not happen?
[12:19] * zhaochao (~zhaochao@124.202.191.138) Quit (Quit: ChatZilla 0.9.92 [Iceweasel 44.0.2/20160214092551])
[12:21] * pam (~pam@193.106.183.1) has joined #ceph
[12:22] * wyang (~wyang@114.111.166.41) has joined #ceph
[12:22] <skoude_> Okay, it's now recovered, and the cluster is okay.. I reapplied the crushmap
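For anyone hitting the same thing: a sketch of keeping and reapplying a known-good crush map, plus the ceph.conf knob skoude_ mentions that stops OSDs re-homing themselves at start-up (filenames are examples):

```shell
# Save and decompile the current crush map for inspection/editing:
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# Recompile and reapply the intended map after a reboot shuffled OSDs:
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new

# In ceph.conf on each OSD host, keep daemons from updating their own
# crush location when they start:
#   [osd]
#   osd crush update on start = false
```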
[12:23] <boolman> another mds question, how long should it take to recover the cluster if the active one dies? when I have mds cache size = 100k it takes around 1min, if I have 1M it times out and I have to increase the grace beacon time to 1800s
[12:23] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[12:23] * wyang (~wyang@114.111.166.41) Quit ()
[12:24] * treenerd_ (~gsulzberg@adrastea.r3-gis.com) Quit (Quit: treenerd_)
[12:36] * Kyso_ (~PappI@76GAADBZ5.tor-irc.dnsbl.oftc.net) Quit ()
[12:36] * Kizzi1 (~xanax`@7V7AADBHC.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:38] * alfredodeza (~alfredode@198.206.133.89) has joined #ceph
[12:40] * EinstCrazy (~EinstCraz@218.26.167.76) has joined #ceph
[12:42] * ira (~ira@24.34.255.34) has joined #ceph
[12:47] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[12:47] * morse_ (~morse@supercomputing.univpm.it) Quit (Read error: Connection reset by peer)
[12:59] * wCPO (~Kristian@188.228.31.139) has joined #ceph
[12:59] * rakeshgm (~rakesh@106.51.29.94) Quit (Remote host closed the connection)
[13:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[13:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[13:03] * davidz (~davidz@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[13:05] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[13:06] * davidz1 (~davidz@2605:e000:1313:8003:8152:a56f:99b9:e899) Quit (Ping timeout: 480 seconds)
[13:06] * Kizzi1 (~xanax`@7V7AADBHC.tor-irc.dnsbl.oftc.net) Quit ()
[13:06] * Cue (~ZombieL@vps-1065056.srv.pa.infobox.ru) has joined #ceph
[13:06] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[13:06] * vikhyat|brb is now known as vikhyat
[13:08] * The1w (~jens@node3.survey-it.dk) has joined #ceph
[13:09] * shinobu (~oftc-webi@p790356f6.tokynt01.ap.so-net.ne.jp) Quit (Ping timeout: 480 seconds)
[13:10] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[13:10] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[13:13] * lmb (~lmb@2a02:8109:8100:1d2c:e9f0:81c2:e816:849a) has joined #ceph
[13:14] * lmb (~lmb@2a02:8109:8100:1d2c:e9f0:81c2:e816:849a) Quit ()
[13:14] * lmb (~lmb@2a02:8109:8100:1d2c:e9f0:81c2:e816:849a) has joined #ceph
[13:14] * pam (~pam@193.106.183.1) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[13:14] * sleinen (~Adium@vpn-ho-a-5.switch.ch) Quit (Read error: Connection reset by peer)
[13:17] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[13:19] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[13:19] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[13:20] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[13:20] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) has joined #ceph
[13:20] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) Quit ()
[13:21] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) has joined #ceph
[13:27] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[13:27] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[13:30] * alecatuae (~alecatuae@vpn.novapontocom.com.br) has joined #ceph
[13:32] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[13:33] * Jesse (~oftc-webi@175.100.202.254) has joined #ceph
[13:33] * Jesse is now known as Guest7345
[13:35] <flaf> boolman: it's curious, IIRC in my last tests the cephfs was unreachable for ~30 seconds.
[13:36] * Cue (~ZombieL@84ZAADEAU.tor-irc.dnsbl.oftc.net) Quit ()
[13:36] <Guest7345> hi
[13:36] <Guest7345> http://pastebin.com/HMmPsR4w
[13:36] * The1w (~jens@node3.survey-it.dk) Quit (Remote host closed the connection)
[13:37] * GeoTracer (~Geoffrey@41.77.153.99) Quit (Read error: Connection reset by peer)
[13:37] <Guest7345> ceph-authtool --help has no -n option
[13:37] * alfredodeza (~alfredode@198.206.133.89) has left #ceph
[13:37] <Guest7345> but it can be used!!
[13:37] * GeoTracer (~Geoffrey@41.77.153.99) has joined #ceph
[13:37] <Guest7345> Is it a bug?
[13:38] <alecatuae> hi
[13:39] <boolman> flaf: while the mds cluster is recovering all reads/writes are suspended
[13:39] * bara (~bara@213.175.37.12) has joined #ceph
[13:40] * cooldharma06 (~chatzilla@14.139.180.40) Quit (Remote host closed the connection)
[13:41] * wyang (~wyang@114.111.166.41) has joined #ceph
[13:43] * smokedmeets (~smokedmee@c-73-158-201-226.hsd1.ca.comcast.net) Quit (Quit: smokedmeets)
[13:45] <flaf> yes indeed.
[13:45] * Drezil1 (~KrimZon@7V7AADBJS.tor-irc.dnsbl.oftc.net) has joined #ceph
[13:49] * rakeshgm (~rakesh@106.51.29.94) has joined #ceph
[13:49] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[13:55] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) Quit (Quit: KVIrc 4.3.1 Aria http://www.kvirc.net/)
[13:55] * wyang (~wyang@114.111.166.41) Quit (Quit: This computer has gone to sleep)
[13:55] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) has joined #ceph
[13:59] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[14:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[14:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[14:02] * Racpatel (~Racpatel@2601:87:3:3601::8a6f) has joined #ceph
[14:02] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) has joined #ceph
[14:03] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) Quit (Quit: Leaving.)
[14:06] * thumpba (~thumbpa@rrcs-67-79-8-124.sw.biz.rr.com) has joined #ceph
[14:06] * wyang (~wyang@116.216.30.3) has joined #ceph
[14:06] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[14:06] * davidz (~davidz@cpe-172-91-154-245.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[14:08] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[14:10] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[14:11] * Kurt1 (~Adium@2001:628:1:5:a597:aa48:4b8d:9ab7) Quit (Read error: Connection reset by peer)
[14:11] * Kurt (~Adium@2001:628:1:5:d87d:3091:d2bc:f58b) has joined #ceph
[14:11] * kiranos (~quassel@109.74.11.233) Quit (Quit: No Ping reply in 180 seconds.)
[14:11] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[14:11] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[14:11] * stefan (~quassel@109.74.11.233) has joined #ceph
[14:12] * wyang (~wyang@116.216.30.3) Quit (Quit: This computer has gone to sleep)
[14:13] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[14:14] * wyang (~wyang@114.111.166.41) has joined #ceph
[14:14] * davidz (~davidz@2605:e000:1313:8003:8152:a56f:99b9:e899) has joined #ceph
[14:14] * thesix (~thesix@leifhelm.mur.at) Quit (Remote host closed the connection)
[14:14] * thesix (~thesix@leifhelm.mur.at) has joined #ceph
[14:14] * T1 (~the_one@87.104.212.66) Quit (Read error: Connection reset by peer)
[14:15] * T1 (~the_one@87.104.212.66) has joined #ceph
[14:15] * wolsen (~quassel@152.34.213.162.lcy-01.canonistack.canonical.com) Quit (Quit: No Ping reply in 180 seconds.)
[14:15] <skoude_> does anybody have any idea how I can see the space used by pools?
[14:15] * Drezil1 (~KrimZon@7V7AADBJS.tor-irc.dnsbl.oftc.net) Quit ()
[14:15] * Shadow386 (~Lattyware@7V7AADBKV.tor-irc.dnsbl.oftc.net) has joined #ceph
[14:16] * wolsen (~quassel@152.34.213.162.lcy-01.canonistack.canonical.com) has joined #ceph
[14:16] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[14:16] <skoude_> Because I have pools assigned to different rulesets, and these rulesets use different disks that are defined in the crushmap
[14:17] <skoude_> I only found this: http://tracker.ceph.com/issues/8943
[14:18] * dneary (~dneary@pool-96-237-170-97.bstnma.fios.verizon.net) has joined #ceph
[14:18] <skoude_> okay, I can get it by using ceph df and monitor that, so problem solved..
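For reference, the standard CLI commands that show per-pool usage:

```shell
ceph df          # cluster-wide usage plus per-pool used/available
ceph df detail   # adds per-pool object counts and read/write totals
rados df         # per-pool object and KB counts as seen by rados
```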
[14:19] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Ping timeout: 480 seconds)
[14:20] * pam (~pam@193.106.183.1) has joined #ceph
[14:22] * rongze_ (~rongze@36.110.78.250) Quit (Remote host closed the connection)
[14:24] * RayTracer (~RayTracer@153.19.7.39) Quit (Ping timeout: 480 seconds)
[14:26] * bara (~bara@213.175.37.12) Quit (Ping timeout: 480 seconds)
[14:26] <mistur> hello
[14:26] <mistur> I have an erasure profile k=7 m=3 plugin=jerasure ruleset-failure-domain=host
[14:27] <mistur> but when I create a pool I have :
[14:27] <mistur> pool 12 '.rgw.buckets' erasure size 10 min_size 7 crush_ruleset 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 274 flags hashpspool stripe_width 4256
[14:28] * alecatuae (~alecatuae@vpn.novapontocom.com.br) Quit (Quit: This computer has gone to sleep)
[14:29] <mistur> is that ok, erasure size 10 min_size 7?
[14:29] <mistur> size = k + m and min = k
[14:32] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[14:35] * wyang (~wyang@114.111.166.41) Quit (Ping timeout: 480 seconds)
[14:35] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[14:36] <m0zes> mistur: yes, that is right.
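mistur's numbers can be reproduced directly; a sketch with example profile/pool names and pg counts:

```shell
# Create the profile from the log (k=7 data chunks, m=3 coding chunks):
ceph osd erasure-code-profile set myprofile \
    k=7 m=3 plugin=jerasure ruleset-failure-domain=host

# A pool built from it reports size = k + m = 10 and min_size = k = 7,
# matching the 'ceph osd dump' line pasted above:
ceph osd pool create ecbuckets 1024 1024 erasure myprofile
ceph osd dump | grep ecbuckets
```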
[14:40] <mistur> ok
[14:41] <mistur> other question: is it possible to create multiple rbd pools?
[14:41] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[14:41] <mistur> I'd like to do some performance tests on erasure-coded and replicated rbd pools
[14:41] * suom (~oftc-webi@eduroam-cl383.wl.lut.fi) has joined #ceph
[14:42] <mistur> in nominal configuration and degraded configuration
[14:42] * thumpba (~thumbpa@rrcs-67-79-8-124.sw.biz.rr.com) Quit (Remote host closed the connection)
[14:44] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[14:45] * alfredodeza (~alfredode@198.206.133.89) has joined #ceph
[14:45] * Shadow386 (~Lattyware@7V7AADBKV.tor-irc.dnsbl.oftc.net) Quit ()
[14:45] * ChauffeR1 (~cyphase@158.69.201.229) has joined #ceph
[14:45] * suom (~oftc-webi@eduroam-cl383.wl.lut.fi) Quit ()
[14:47] * yanzheng (~zhyan@125.71.108.197) has joined #ceph
[14:47] * kefu (~kefu@114.92.107.250) Quit (Max SendQ exceeded)
[14:48] * kefu (~kefu@114.92.107.250) has joined #ceph
[14:51] * sleinen (~Adium@vpn-ho-c-2.switch.ch) has joined #ceph
[14:51] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[14:51] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[14:52] * alecatuae (~alecatuae@vpn.novapontocom.com.br) has joined #ceph
[14:53] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[14:53] * ira (~ira@24.34.255.34) Quit (Ping timeout: 480 seconds)
[14:53] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[14:53] * ira (~ira@24.34.255.34) has joined #ceph
[14:55] * vbellur (~vijay@71.234.224.255) Quit (Ping timeout: 480 seconds)
[14:56] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:56] * thumpba (~thumbpa@rrcs-67-79-8-124.sw.biz.rr.com) has joined #ceph
[14:56] * pam (~pam@193.106.183.1) Quit (Quit: Textual IRC Client: www.textualapp.com)
[14:57] * pam (~pam@193.106.183.1) has joined #ceph
[14:57] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) Quit (Read error: Connection reset by peer)
[14:58] * karnan (~karnan@121.244.87.117) Quit (Quit: Leaving)
[14:58] * kefu (~kefu@114.92.107.250) Quit (Max SendQ exceeded)
[14:58] * kefu (~kefu@114.92.107.250) has joined #ceph
[14:59] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) has joined #ceph
[15:00] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[15:00] * foosinn (~stefan@ipbcc34596.dynamic.kabel-deutschland.de) Quit (Quit: Leaving)
[15:00] * kefu (~kefu@114.92.107.250) Quit (Max SendQ exceeded)
[15:01] * kefu (~kefu@114.92.107.250) has joined #ceph
[15:01] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[15:02] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[15:02] <m0zes> you can create any number of pools and use them for whatever you'd like.
[15:03] <m0zes> for EC rbd pools, you need a cache tier on top.
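A sketch of what m0zes describes: front the EC pool with a replicated cache pool so rbd can write to it (pool names, pg counts, and the image size are examples):

```shell
ceph osd pool create cachepool 128 128 replicated   # replicated cache pool
ceph osd tier add ecpool cachepool                  # attach it to the EC pool
ceph osd tier cache-mode cachepool writeback        # absorb writes in the cache
ceph osd tier set-overlay ecpool cachepool          # route client I/O via the cache
rbd create --pool ecpool --size 10240 testimg       # rbd now works on ecpool
```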
[15:04] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[15:04] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[15:06] * kefu is now known as kefu|afk
[15:07] <mbtamuli12> I have to repeatedly run the ceph-create-keys. What files do I have to remove or what do I have to do so that it works on the same mon-id the next time without any other changes?
[15:08] <mbtamuli12> Can anyone guide me or at least point me to a reading resource?
[15:08] * shohn (~shohn@193.138.123.10) Quit (Ping timeout: 480 seconds)
[15:08] * analbeard (~shw@support.memset.com) Quit (Ping timeout: 480 seconds)
[15:08] * alfredodeza (~alfredode@198.206.133.89) has left #ceph
[15:08] <mistur> m0zes: oh ok, I haven't played with cache tiering yet
[15:09] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[15:10] * DanFoster (~Daniel@2a00:1ee0:3:1337:c153:d78e:99bc:d064) has joined #ceph
[15:10] <mistur> can a radosgw use multiple pools for its backend?
[15:14] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[15:15] * overclk (~quassel@121.244.87.117) Quit (Remote host closed the connection)
[15:15] * ChauffeR1 (~cyphase@7V7AADBLT.tor-irc.dnsbl.oftc.net) Quit ()
[15:15] * zviratko (~TomyLobo@atlantic480.us.unmetered.com) has joined #ceph
[15:18] * dyasny (~dyasny@cable-192.222.131.135.electronicbox.net) has joined #ceph
[15:19] * Racpatel (~Racpatel@2601:87:3:3601::8a6f) Quit (Quit: Leaving)
[15:19] * Racpatel (~Racpatel@2601:87:3:3601::8a6f) has joined #ceph
[15:25] * vikhyat (~vumrao@121.244.87.116) Quit (Ping timeout: 480 seconds)
[15:28] * jtriley (~jtriley@140.247.242.54) has joined #ceph
[15:28] <rmart04> Hi Chaps, had an interesting problem this morning. I have a 6 node cluster (+3 dedicated mons) (cluster is slightly more dense than I would have liked) with a pci-e cache tier. I've had two different nodes with load skyrocketing up to 2000 at two different points of the morning. Both with OSD's showing ceph failed assert hit suicide timeout errors. I've had a look around, and it seems like that might be the symptom rather than the cause, but I'm a bit
[15:28] <rmart04> stuck on what to look for next! Unfortunately I don't get to look around much as the terminal is non-responsive, and a reboot's been needed both times!
[15:33] * vikhyat (~vumrao@49.248.94.136) has joined #ceph
[15:34] <m0zes> mistur: you can have zones in rgw that have different performance profiles
[15:35] <m0zes> (using different pools) or you can have completely separate radosgw installs using different pools.
[15:38] * shyu_ (~shyu@114.241.13.231) has joined #ceph
[15:38] * Racpatel (~Racpatel@2601:87:3:3601::8a6f) Quit (Ping timeout: 480 seconds)
[15:40] <mistur> m0zes: ok, I'm going to take a look at zoning
[15:41] * shyu_ (~shyu@114.241.13.231) Quit (Remote host closed the connection)
[15:41] * Guest7345 (~oftc-webi@175.100.202.254) Quit (Ping timeout: 480 seconds)
[15:41] <rmart04> Also, am I right in saying that performance of a cache tier is generally bad while data is being flushed? (How can this be improved?) 0.94.5 / Trusty
[15:42] * rotbeard (~redbeard@ppp-115-87-78-25.revip4.asianet.co.th) Quit (Quit: Leaving)
[15:42] * rotbeard (~redbeard@ppp-115-87-78-25.revip4.asianet.co.th) has joined #ceph
[15:43] * rakeshgm (~rakesh@106.51.29.94) Quit (Read error: No route to host)
[15:44] * FjordPrefect (~soorya@103.231.216.194) Quit (Ping timeout: 480 seconds)
[15:45] * zviratko (~TomyLobo@76GAADB7S.tor-irc.dnsbl.oftc.net) Quit ()
[15:48] * rotbeard (~redbeard@ppp-115-87-78-25.revip4.asianet.co.th) Quit (Quit: Leaving)
[15:48] <m0zes> rmart04: infernalis has high and low priorities for such things based on different high-water marks. and a unified i/o queue meaning priorities actually work.
[15:49] * vbellur (~vijay@nat-pool-bos-u.redhat.com) has joined #ceph
[15:50] * MatthewH12 (~Shadow386@tor2r.ins.tor.net.eu.org) has joined #ceph
[15:51] * alecatuae (~alecatuae@vpn.novapontocom.com.br) Quit (Quit: This computer has gone to sleep)
[15:53] * kawa2014 (~kawa@38.109.203.254) has joined #ceph
[15:56] <devicenull> cephx: verify_authorizercould not decrypt authorize request with error: NSS AES ply, op: auth(proto 2 165
[15:56] <devicenull> huh?
[16:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[16:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[16:01] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[16:01] <Aeso> Does anyone know what the w value in an erasure coding profile represents?
[16:02] * rwheeler (~rwheeler@1.186.34.66) has joined #ceph
[16:03] * hyperbaba (~hyperbaba@mw-at-rt-nat.mediaworksit.net) Quit (Ping timeout: 480 seconds)
[16:04] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[16:05] <rmart04> Hi m0zes, thanks for the info, I saw there is a cache_target_dirty_high_ratio option now, that does sound handy. I'm not entirely sure what a unified i/o queue actually means, but it sounds like a good thing!
[16:05] <rmart04> I'll have a read, 0.94.6 is the current LTS, right?
[16:06] <rmart04> just checking, the releases page looks to confirm that
[16:06] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) has joined #ceph
[16:07] <rmart04> There is now a unified queue (and thus prioritization) of client IO, recovery, scrubbing, and snapshot trimming. - nice… Can anyone comment on running infernalis in production? How have your experiences been?
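The knobs rmart04 mentions are per-pool settings on the cache pool; a sketch with example values (the pool name is assumed):

```shell
# Start background flushing earlier, at low priority:
ceph osd pool set cachepool cache_target_dirty_ratio 0.4
# Infernalis adds a second, higher-priority flush threshold:
ceph osd pool set cachepool cache_target_dirty_high_ratio 0.6
# Evict hard once the cache is this full:
ceph osd pool set cachepool cache_target_full_ratio 0.8
```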
[16:09] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:10] * BrianA (~BrianA@c-73-189-212-113.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[16:12] * dneary (~dneary@pool-96-237-170-97.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[16:14] * thomnico (~thomnico@38.109.203.11) has joined #ceph
[16:15] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[16:16] * allaok (~allaok@machine107.orange-labs.com) has left #ceph
[16:16] * FjordPrefect (~soorya@103.231.216.194) has joined #ceph
[16:17] * georgem (~Adium@206.108.127.16) has joined #ceph
[16:20] * MatthewH12 (~Shadow386@7V7AADBPM.tor-irc.dnsbl.oftc.net) Quit ()
[16:20] * Azru (~rushworld@torproxy02.31173.se) has joined #ceph
[16:20] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[16:21] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[16:23] * evelu is now known as erwan_taf
[16:24] * tsg (~tgohad@134.134.139.72) has joined #ceph
[16:25] * BrianA (~BrianA@fw-rw.shutterfly.com) has joined #ceph
[16:27] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[16:30] * erwan_taf is now known as evelu
[16:30] * enax (~enax@hq.ezit.hu) Quit (Ping timeout: 480 seconds)
[16:30] <devicenull> I seem to recall there being a command to show reads/writes per pool
[16:31] <devicenull> but I can't remember wtf it is
[16:31] <Be-El> ceph osd pool stats
[16:31] <m0zes> ceph osd pool stats
[16:31] <devicenull> that was it, thanks!
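A quick sketch of the command Be-El and m0zes point to, plus a related view (the pool name `rbd` is illustrative):

```shell
# Per-pool client I/O rates: reads/writes per second, recovery ops, etc.
ceph osd pool stats

# Restrict the output to a single pool:
ceph osd pool stats rbd

# "rados df" shows cumulative per-pool usage counters instead of rates:
rados df
```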
[16:33] * evelu is now known as erwan_taf
[16:33] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Quit: Leaving.)
[16:34] * erwan_taf (~erwan@37.162.44.14) Quit (Read error: Connection reset by peer)
[16:35] * evelu (~erwan@37.162.44.14) has joined #ceph
[16:36] * evelu is now known as erwan_taf
[16:36] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) Quit (Quit: Leaving.)
[16:36] * erwan_taf is now known as evelu
[16:37] * analbeard (~shw@support.memset.com) has joined #ceph
[16:38] * Racpatel (~Racpatel@2601:87:3:3601::8a6f) has joined #ceph
[16:38] * AI (87f5300e@107.161.19.53) has joined #ceph
[16:40] * rongze (~rongze@123.119.76.155) has joined #ceph
[16:41] * AI (87f5300e@107.161.19.53) Quit ()
[16:42] * AI (80cbd3cc@107.161.19.53) has joined #ceph
[16:42] * FjordPrefect (~soorya@103.231.216.194) Quit (Ping timeout: 480 seconds)
[16:45] * markl (~mark@knm.org) has joined #ceph
[16:45] * dyasny (~dyasny@cable-192.222.131.135.electronicbox.net) Quit (Remote host closed the connection)
[16:49] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[16:49] * EinstCrazy (~EinstCraz@218.26.167.76) Quit (Remote host closed the connection)
[16:49] * RameshN (~rnachimu@101.222.242.170) has joined #ceph
[16:50] * Azru (~rushworld@84ZAADEI1.tor-irc.dnsbl.oftc.net) Quit ()
[16:50] * Jamana (~xanax`@65.19.167.130) has joined #ceph
[16:51] * shyu_ (~shyu@114.241.13.231) has joined #ceph
[16:52] * angdraug (~angdraug@c-69-181-140-42.hsd1.ca.comcast.net) has joined #ceph
[16:52] * shyu_ (~shyu@114.241.13.231) Quit (Remote host closed the connection)
[16:54] * xarses (~xarses@64.124.158.100) has joined #ceph
[16:54] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[16:55] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) has joined #ceph
[16:58] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[16:58] <post-factum> 0.94.5 → 0.94.6 update went smoothly for us
[16:59] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[17:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[17:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[17:02] * kefu|afk (~kefu@114.92.107.250) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[17:06] * shyu_ (~shyu@114.241.13.231) has joined #ceph
[17:06] <mistur> m0zes: I'm trying to follow this article: http://cephnotes.ksperis.com/blog/2014/11/28/placement-pools-on-rados-gw
[17:06] <mistur> but it seems to be out of date
[17:07] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[17:08] * dgurtner (~dgurtner@178.197.231.230) Quit (Ping timeout: 480 seconds)
[17:09] * RameshN (~rnachimu@101.222.242.170) Quit (Ping timeout: 480 seconds)
[17:09] * yanzheng (~zhyan@125.71.108.197) Quit (Quit: This computer has gone to sleep)
[17:10] <mistur> when I do radosgw-admin region set < region.conf.json
[17:10] <mistur> nothing changes
[17:10] <Aeso> Trying to create a new cephfs using an erasure coded pool fails in 9.2.1. Is there a way around this?
[17:11] <m0zes> mistur: that should work... I use separate rgw instances, though...
[17:11] <m0zes> Aeso: errors?
[17:12] * dyasny (~dyasny@cable-192.222.131.135.electronicbox.net) has joined #ceph
[17:12] <Aeso> m0zes, Error EINVAL: pool 'cephfs_data' (id '3') is an erasure-code pool
[17:12] <mistur> m0zes: ok :(
[17:12] <Aeso> Seems intentional, though I'm not sure why.
[17:13] <m0zes> Aeso: do you have a cache tier on top?
[17:13] <Aeso> m0zes, no.
[17:13] * RameshN (~rnachimu@101.222.241.228) has joined #ceph
[17:13] <m0zes> EC pools require whole-object insertion. CephFS requires partial writes to objects, so you have to have a cache tier
[17:14] <m0zes> at least for the foreseeable future. I think there is some work to "fix" that, but I doubt it will happen before Jewel.
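The setup m0zes describes — an erasure-coded data pool fronted by a replicated cache tier for CephFS — looks roughly like this. Pool names, PG counts, and the default EC profile are illustrative assumptions, not from the log:

```shell
# Erasure-coded data pool, plus replicated cache and metadata pools:
ceph osd pool create cephfs_data 256 256 erasure
ceph osd pool create cephfs_cache 128 128 replicated
ceph osd pool create cephfs_metadata 128 128 replicated

# Put the replicated pool in front of the EC pool as a writeback cache:
ceph osd tier add cephfs_data cephfs_cache
ceph osd tier cache-mode cephfs_cache writeback
ceph osd tier set-overlay cephfs_data cephfs_cache

# CephFS then accepts the EC pool, because client writes land in the
# cache tier, which supports the partial writes CephFS needs:
ceph fs new cephfs cephfs_metadata cephfs_data
```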
[17:15] <Aeso> Ah, well that puts a damper on our plans. Back to the drawing board, thanks for the info.
[17:16] <m0zes> Aeso: why is that problematic?
[17:16] * davidz1 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[17:17] * densinz (~densin@ppp-58-8-240-34.revip2.asianet.co.th) Quit (Ping timeout: 480 seconds)
[17:20] * Jamana (~xanax`@84ZAADEKW.tor-irc.dnsbl.oftc.net) Quit ()
[17:20] * Kyso_ (~DoDzy@hessel3.torservers.net) has joined #ceph
[17:22] * smokedmeets (~smokedmee@c-73-158-201-226.hsd1.ca.comcast.net) has joined #ceph
[17:23] * davidz (~davidz@2605:e000:1313:8003:8152:a56f:99b9:e899) Quit (Ping timeout: 480 seconds)
[17:24] <Aeso> m0zes, the workload we were designing around has almost no performance requirements but needs to be as inexpensive as possible
[17:24] * dugravot61 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Quit: Leaving.)
[17:25] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[17:26] <Be-El> Aeso: you can use an HDD-backed pool as the cache layer; there's no requirement for extra SSD capacity
[17:28] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[17:29] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[17:31] * RameshN_ (~rnachimu@101.222.176.59) has joined #ceph
[17:31] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[17:32] * jdillaman_ (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[17:33] * RameshN (~rnachimu@101.222.241.228) Quit (Ping timeout: 480 seconds)
[17:33] * bjornar_ (~bjornar@109.247.131.38) Quit (Ping timeout: 480 seconds)
[17:34] * TMM (~hp@185.5.122.2) Quit (Quit: Ex-Chat)
[17:35] * sleinen (~Adium@vpn-ho-c-2.switch.ch) Quit (Ping timeout: 480 seconds)
[17:36] * RameshN__ (~rnachimu@101.222.178.159) has joined #ceph
[17:37] <m0zes> and you can set it to flush/evict practically immediately
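Making a cache tier flush and evict almost immediately, as m0zes suggests, comes down to a few pool settings. A hedged sketch; the pool name `cephfs_cache` is illustrative:

```shell
# Flush dirty objects as soon as any accumulate, instead of waiting
# for a dirty-ratio threshold:
ceph osd pool set cephfs_cache cache_target_dirty_ratio 0.0

# Don't require objects to reach a minimum age before flush/eviction:
ceph osd pool set cephfs_cache cache_min_flush_age 0
ceph osd pool set cephfs_cache cache_min_evict_age 0
```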
[17:38] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) has joined #ceph
[17:42] * vicente (~~vicente@111-241-33-60.dynamic.hinet.net) has joined #ceph
[17:43] * RameshN_ (~rnachimu@101.222.176.59) Quit (Ping timeout: 480 seconds)
[17:44] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[17:45] * RameshN__ (~rnachimu@101.222.178.159) Quit (Ping timeout: 480 seconds)
[17:46] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) has joined #ceph
[17:50] * Kyso_ (~DoDzy@76GAADCEJ.tor-irc.dnsbl.oftc.net) Quit ()
[17:50] * Keiya (~Pettis@176.10.99.206) has joined #ceph
[17:50] * karnan (~karnan@106.51.141.49) has joined #ceph
[17:54] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[17:54] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) has joined #ceph
[17:56] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[17:57] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:59] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[17:59] * kbader_ (~kyle@64.169.30.57) has joined #ceph
[18:00] * vicente (~~vicente@111-241-33-60.dynamic.hinet.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[18:01] * rakeshgm (~rakesh@106.51.29.94) has joined #ceph
[18:01] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[18:03] * kbader (~kyle@pool-100-9-203-202.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[18:03] * EinstCrazy (~EinstCraz@218.26.167.76) has joined #ceph
[18:05] * rmart04 (~rmart04@support.memset.com) Quit (Quit: rmart04)
[18:06] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[18:12] * thumpba (~thumbpa@rrcs-67-79-8-124.sw.biz.rr.com) Quit (Remote host closed the connection)
[18:12] * EinstCrazy (~EinstCraz@218.26.167.76) Quit (Ping timeout: 480 seconds)
[18:13] * thumpba (~thumbpa@rrcs-67-79-8-124.sw.biz.rr.com) has joined #ceph
[18:18] * pam (~pam@193.106.183.1) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[18:18] * sleinen (~Adium@2001:620:0:82::100) has joined #ceph
[18:18] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) Quit (Quit: Leaving.)
[18:20] * Kupo1 (~tyler.wil@23.111.254.159) has joined #ceph
[18:20] * Keiya (~Pettis@84ZAADENR.tor-irc.dnsbl.oftc.net) Quit ()
[18:20] * Misacorp (~mLegion@exit1.torproxy.org) has joined #ceph
[18:20] * Kioob`Taff (~plug-oliv@2a01:e35:2e8a:1e0::42:10) Quit (Quit: Leaving.)
[18:21] * overclk (~quassel@117.202.103.206) has joined #ceph
[18:21] * overclk (~quassel@117.202.103.206) Quit ()
[18:23] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[18:24] * Concubidated (~Adium@243.84-253-62.static.virginmediabusiness.co.uk) has joined #ceph
[18:27] <diq> it actually works decently well depending on sizing/spindles
[18:27] <diq> what m0zes is suggesting
[18:32] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[18:34] * wCPO (~Kristian@188.228.31.139) Quit (Ping timeout: 480 seconds)
[18:37] * georgem (~Adium@206.108.127.16) Quit (Read error: Network is unreachable)
[18:37] * georgem (~Adium@206.108.127.16) has joined #ceph
[18:40] * AI (80cbd3cc@107.161.19.53) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[18:45] * theCanadianBaker (~jason@p200300872A1AE54F601B7335539EC9A0.dip0.t-ipconnect.de) has joined #ceph
[18:45] * rakeshgm (~rakesh@106.51.29.94) Quit (Quit: Leaving)
[18:46] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[18:48] * davidz1 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Quit: Leaving.)
[18:48] * davidz (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[18:49] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[18:50] * Misacorp (~mLegion@76GAADCHB.tor-irc.dnsbl.oftc.net) Quit ()
[18:50] * Mousey (~Enikma@vps-1065056.srv.pa.infobox.ru) has joined #ceph
[18:51] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[18:52] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) Quit (Quit: Leaving.)
[18:53] * overclk (~quassel@117.202.103.206) has joined #ceph
[18:54] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[18:54] * davidz1 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[18:57] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[18:57] * bjornar_ (~bjornar@ti0099a430-0908.bb.online.no) has joined #ceph
[18:58] * pam (~pam@5.170.200.11) has joined #ceph
[18:58] * mykola (~Mikolaj@91.225.201.110) has joined #ceph
[18:59] * davidz (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[19:00] * pabluk_ is now known as pabluk__
[19:00] * alexxy (~alexxy@biod.pnpi.spb.ru) Quit (Remote host closed the connection)
[19:00] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) has joined #ceph
[19:00] * alexxy (~alexxy@biod.pnpi.spb.ru) has joined #ceph
[19:01] * alexxy (~alexxy@biod.pnpi.spb.ru) Quit (Remote host closed the connection)
[19:01] * alexxy (~alexxy@biod.pnpi.spb.ru) has joined #ceph
[19:05] * branto (~branto@178.253.157.93) Quit (Quit: Leaving.)
[19:05] * FjordPrefect (~soorya@103.231.216.194) has joined #ceph
[19:06] * sleinen (~Adium@2001:620:0:82::100) Quit (Ping timeout: 480 seconds)
[19:06] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys!)
[19:08] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[19:12] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) Quit (Quit: Leaving)
[19:16] * pam (~pam@5.170.200.11) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[19:20] * Mousey (~Enikma@7V7AADBYD.tor-irc.dnsbl.oftc.net) Quit ()
[19:20] * ylmson (~Guest1390@84ZAADESD.tor-irc.dnsbl.oftc.net) has joined #ceph
[19:20] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) Quit (Quit: Leaving.)
[19:22] * mx (~myeho@66.193.98.66) has joined #ceph
[19:23] * jclm (~jclm@ip68-96-198-45.lv.lv.cox.net) has joined #ceph
[19:23] * m87carlson (~m87carlso@207.111.246.196) has joined #ceph
[19:24] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[19:27] <m87carlson> Hey everyone. I posted a message to the ceph-users mailing list back in January (https://www.mail-archive.com/ceph-users@lists.ceph.com/msg26329.html) about our cluster fsid changing. Since this is a production environment, we wanted to know if anyone had experience with this situation and if there are steps to resolve it
[19:28] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[19:29] * ivancich (~ivancich@aa2.linuxbox.com) Quit (Ping timeout: 480 seconds)
[19:31] * ivancich (~ivancich@aa2.linuxbox.com) has joined #ceph
[19:33] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Ping timeout: 480 seconds)
[19:35] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[19:35] * fmeppo (~oftc-webi@yarf.rcac.purdue.edu) has joined #ceph
[19:36] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Ping timeout: 480 seconds)
[19:36] * derjohn_mobi (~aj@2001:6f8:1337:0:18d0:69fe:138c:5290) Quit (Ping timeout: 480 seconds)
[19:37] * DanFoster (~Daniel@2a00:1ee0:3:1337:c153:d78e:99bc:d064) Quit (Quit: Leaving)
[19:42] * angdraug (~angdraug@c-69-181-140-42.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[19:50] * ylmson (~Guest1390@84ZAADESD.tor-irc.dnsbl.oftc.net) Quit ()
[19:50] * Ralth (~SaneSmith@static-ip-85-25-103-119.inaddr.ip-pool.com) has joined #ceph
[19:58] * Daznis (~Darius@82-135-143-87.static.zebra.lt) has joined #ceph
[19:58] * alecatuae (~alecatuae@vpn.novapontocom.com.br) has joined #ceph
[20:04] <Daznis> Hello, anyone got some insight into why ceph might get partprobe errors on either the journal or the data drive?
[20:06] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[20:09] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[20:10] * Discovery (~Discovery@178.239.49.67) has joined #ceph
[20:15] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[20:15] * rwheeler (~rwheeler@1.186.34.66) Quit (Ping timeout: 480 seconds)
[20:20] * Ralth (~SaneSmith@76GAADCLP.tor-irc.dnsbl.oftc.net) Quit ()
[20:20] * cmrn (~Bonzaii@anonymous.sec.nl) has joined #ceph
[20:22] <nils_> Daznis, what kind of errors?
[20:23] * rendar (~I@95.238.176.232) Quit (Ping timeout: 480 seconds)
[20:23] <Daznis> DEBUG:ceph-disk:partprobe /dev/sdi failed : Error: Error informing the kernel about modifications to partition /dev/sdi1 -- Device or resource busy. This means Linux won't know about any changes you made to /dev/sdi1 until you reboot -- so you shouldn't mount it or use it in any way before rebooting.
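One commonly suggested workaround for this class of "device or resource busy" partprobe failures (a sketch, not from this log; whether it helps depends on what is holding the device) is to let udev finish processing events before re-reading the partition table, falling back to `partx`:

```shell
# Wait for in-flight udev events to finish, so udev workers aren't
# still holding the block device open:
udevadm settle --timeout=10

# Retry the partition-table re-read; if partprobe still fails, ask the
# kernel to update its view of existing partitions with partx instead:
partprobe /dev/sdi || partx -u /dev/sdi
```

The device name `/dev/sdi` is taken from the error message above.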
[20:24] * evelu (~erwan@37.162.44.14) Quit (Read error: Connection reset by peer)
[20:25] <flaf> My snmp agent can't read the mountpoints /var/lib/ceph/* because of Unix permissions, and I have errors in my syslog “snmpd[13898]: Cannot statfs /var/lib/ceph/osd/ceph-14#012: Permission denied” (for instance). It's on infernalis. Is there a problem if I set /var/lib/ceph to [rwxr-xr-x ceph:ceph] (instead of [rwxr-x--- ceph:ceph])?
[20:25] * rendar (~I@95.238.176.232) has joined #ceph
[20:26] <flaf> Or is it better to add the snmp account to the unix group "ceph"?
[20:28] <flaf> (I use snmp to collect data and have graphs)
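A minimal sketch of the group-membership approach flaf is weighing, which avoids loosening the mode for everyone. The snmpd service-account name varies by distribution (Debian/Ubuntu commonly use `Debian-snmp`; that name is an assumption here):

```shell
# Add the snmpd account to the "ceph" group instead of opening
# /var/lib/ceph to world:
usermod -a -G ceph Debian-snmp

# The existing rwxr-x--- mode already grants the group read/traverse:
chmod 750 /var/lib/ceph

# snmpd must be restarted to pick up the new supplementary group:
systemctl restart snmpd
```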
[20:28] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[20:30] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[20:31] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[20:32] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[20:33] * vikhyat (~vumrao@49.248.94.136) Quit (Quit: Leaving)
[20:33] * karnan (~karnan@106.51.141.49) Quit (Quit: Leaving)
[20:37] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) Quit (Quit: KVIrc 4.9.1 Aria http://www.kvirc.net/)
[20:38] * rwheeler (~rwheeler@1.186.34.66) has joined #ceph
[20:39] * evelu (~erwan@37.162.44.14) has joined #ceph
[20:39] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[20:41] * davidz (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[20:43] * davidz2 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[20:44] * thomnico (~thomnico@38.109.203.11) Quit (Quit: Ex-Chat)
[20:47] * davidz1 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[20:48] * pam (~pam@host52-104-dynamic.31-79-r.retail.telecomitalia.it) has joined #ceph
[20:49] * linuxkidd (~linuxkidd@149.sub-70-210-32.myvzw.com) Quit (Ping timeout: 480 seconds)
[20:50] * davidz (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[20:50] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) has joined #ceph
[20:50] * cmrn (~Bonzaii@76GAADCM5.tor-irc.dnsbl.oftc.net) Quit ()
[20:50] * Fapiko (~Curt`@chomsky.torservers.net) has joined #ceph
[20:54] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) Quit ()
[20:55] * gregmark (~Adium@68.87.42.115) has joined #ceph
[20:55] * overclk (~quassel@117.202.103.206) Quit (Read error: Connection reset by peer)
[20:55] * gregmark (~Adium@68.87.42.115) Quit ()
[20:55] * gregmark (~Adium@68.87.42.115) has joined #ceph
[20:56] * Muhlemmer (~kvirc@178-85-158-74.dynamic.upc.nl) has joined #ceph
[20:56] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[20:56] * evelu (~erwan@37.162.44.14) Quit (Ping timeout: 480 seconds)
[20:58] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[20:59] * linuxkidd (~linuxkidd@174.sub-70-197-164.myvzw.com) has joined #ceph
[21:00] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[21:01] * alecatuae (~alecatuae@vpn.novapontocom.com.br) Quit (Quit: This computer has gone to sleep)
[21:03] * alecatuae (~alecatuae@vpn.novapontocom.com.br) has joined #ceph
[21:04] * derjohn_mobi (~aj@x590d712b.dyn.telefonica.de) has joined #ceph
[21:05] * evelu (~erwan@37.161.187.176) has joined #ceph
[21:10] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[21:11] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[21:12] * allaok (~allaok@ARennes-658-1-79-162.w92-139.abo.wanadoo.fr) has joined #ceph
[21:13] * toMeloos (~toMeloos@53568B3D.cm-6-7c.dynamic.ziggo.nl) has joined #ceph
[21:14] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[21:18] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[21:18] * linuxkidd (~linuxkidd@174.sub-70-197-164.myvzw.com) Quit (Ping timeout: 480 seconds)
[21:18] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[21:20] * Fapiko (~Curt`@76GAADCOL.tor-irc.dnsbl.oftc.net) Quit ()
[21:20] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[21:20] * Qiasfah (~oracular@ns316491.ip-37-187-129.eu) has joined #ceph
[21:23] * masterom1 (~ivan@93-142-246-118.adsl.net.t-com.hr) has joined #ceph
[21:23] * gabrtv (sid36209@charlton.irccloud.com) Quit (Quit: Connection closed for inactivity)
[21:23] <rkeene> Improving the debuggability of my product -- now "strip" is an alias for "false", all kinds of software fails to build with that :-D
[21:23] * alecatuae (~alecatuae@vpn.novapontocom.com.br) Quit (Quit: This computer has gone to sleep)
[21:26] * kbader_ (~kyle@64.169.30.57) Quit (Ping timeout: 480 seconds)
[21:27] * linuxkidd (~linuxkidd@245.sub-70-210-58.myvzw.com) has joined #ceph
[21:27] * masteroman (~ivan@78-1-253-152.adsl.net.t-com.hr) Quit (Read error: Connection reset by peer)
[21:31] * pam (~pam@host52-104-dynamic.31-79-r.retail.telecomitalia.it) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[21:33] * sleinen (~Adium@83-103-7-16.ip.fastwebnet.it) has joined #ceph
[21:33] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[21:35] * sleinen1 (~Adium@2001:620:0:69::101) has joined #ceph
[21:38] * angdraug (~angdraug@64.124.158.100) has joined #ceph
[21:40] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[21:41] * sleinen (~Adium@83-103-7-16.ip.fastwebnet.it) Quit (Ping timeout: 480 seconds)
[21:43] * toMeloos (~toMeloos@53568B3D.cm-6-7c.dynamic.ziggo.nl) Quit (Quit: Ik ga weg)
[21:45] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[21:47] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[21:48] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:48] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) has joined #ceph
[21:50] * Qiasfah (~oracular@84ZAADEYK.tor-irc.dnsbl.oftc.net) Quit ()
[21:50] * thumpba (~thumbpa@rrcs-67-79-8-124.sw.biz.rr.com) Quit (Remote host closed the connection)
[21:50] * totalwormage (~Chaos_Lla@chomsky.torservers.net) has joined #ceph
[21:52] * evelu (~erwan@37.161.187.176) Quit (Read error: Connection reset by peer)
[21:53] * jdohms_ (~jdohms@flyingmonkey.concordia.ab.ca) has joined #ceph
[21:53] * johnavp19891 (~jpetrini@8.39.115.8) has joined #ceph
[21:54] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[21:55] * jdohms (~jdohms@flyingmonkey.concordia.ab.ca) Quit (Ping timeout: 480 seconds)
[21:55] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[21:56] * infernix (nix@spirit.infernix.net) Quit (Ping timeout: 480 seconds)
[21:56] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[21:58] * mykola (~Mikolaj@91.225.201.110) Quit (Quit: away)
[21:58] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[22:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[22:01] * m87carlson (~m87carlso@207.111.246.196) Quit (Quit: leaving)
[22:05] * Walex (~Walex@72.249.182.114) Quit (Remote host closed the connection)
[22:07] * evelu (~erwan@37.161.187.176) has joined #ceph
[22:09] * infernix (nix@2001:41f0::2) has joined #ceph
[22:11] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) Quit (Quit: LDA)
[22:12] * Walex (~Walex@72.249.182.114) has joined #ceph
[22:20] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:20] * totalwormage (~Chaos_Lla@76GAADCRA.tor-irc.dnsbl.oftc.net) Quit ()
[22:20] * rogst (~LRWerewol@Relay-J.tor-exit.network) has joined #ceph
[22:29] * dgurtner (~dgurtner@178.197.234.222) has joined #ceph
[22:30] * davidz (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[22:33] * alecatuae (~alecatuae@vpn.novapontocom.com.br) has joined #ceph
[22:34] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[22:36] * bjornar_ (~bjornar@ti0099a430-0908.bb.online.no) Quit (Ping timeout: 480 seconds)
[22:36] * davidz2 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[22:36] * owlbot (~supybot@pct-empresas-50.uc3m.es) Quit (Ping timeout: 480 seconds)
[22:37] * alecatuae (~alecatuae@vpn.novapontocom.com.br) Quit ()
[22:38] * owlbot (~supybot@pct-empresas-50.uc3m.es) has joined #ceph
[22:39] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[22:40] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[22:43] * LobsterRoll (~LobsterRo@140.247.242.44) has joined #ceph
[22:44] <LobsterRoll> Hi all, I am following this writeup on manually setting up a cluster: http://docs.ceph.com/docs/hammer/install/manual-deployment/#monitor-bootstrapping but when it comes time to “start the cluster” my process fails with “ceph-mon[140543]: Invalid argument: /var/lib/ceph/mon/ceph-ceph-mon01/store.db: does not exist (create_if_missing is false)”
[22:45] <LobsterRoll> now, the store.db in fact is not there, my question is, at what stage should that be created within this manual setup?
[22:47] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Ping timeout: 480 seconds)
[22:50] * rogst (~LRWerewol@84ZAADE1B.tor-irc.dnsbl.oftc.net) Quit ()
[22:50] * SEBI1 (~capitalth@84ZAADE2X.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:52] * johnavp19891 (~jpetrini@8.39.115.8) Quit (Remote host closed the connection)
[22:53] <flaf> LobsterRoll: during the command “ceph-mon --mkfs ...”
[22:54] <flaf> LobsterRoll: For me, the creation of the working dir of the ceph mon is “ceph-mon --mkfs -i "$id" --conf "/etc/ceph/$cluster.conf" --monmap "/tmp/monmap" --keyring "/tmp/$cluster.mon.keyring" --cluster "$cluster"”
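Expanded slightly, the bootstrap sequence from the hammer manual-deployment docs leading up to that `--mkfs` call looks roughly like this. The monitor ID, IP address, and fsid are placeholders (the fsid is the example value from the Ceph docs):

```shell
cluster=ceph
id=ceph-mon01

# Create the monitor keyring and the initial monmap:
ceph-authtool --create-keyring "/tmp/$cluster.mon.keyring" \
    --gen-key -n mon. --cap mon 'allow *'
monmaptool --create --add "$id" 192.0.2.10 \
    --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap

# This step populates the mon data dir, including store.db:
ceph-mon --mkfs -i "$id" --conf "/etc/ceph/$cluster.conf" \
    --monmap /tmp/monmap --keyring "/tmp/$cluster.mon.keyring" \
    --cluster "$cluster"
```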
[22:54] <LobsterRoll> hmm yeah, I was running this through Puppet; when I do it manually it complains that /var/lib/ceph/mon already exists
[22:55] <LobsterRoll> which it does (Puppet lays it down before ceph-mon --mkfs)
[22:55] <LobsterRoll> guess I need to alter my ordering
[22:55] <flaf> ah yes, it's probably not an idempotent command. :)
[22:56] <LobsterRoll> well I made it idempotent around whether the done file exists
[22:56] <LobsterRoll> I think the mkfs command needs to make /var/lib/ceph/mon/host
[22:56] <LobsterRoll> so I shouldn't attempt to do it via Puppet
[22:56] <LobsterRoll> thanks!
[22:57] <flaf> Personally I don't create mon, osd via puppet.
[22:57] <flaf> Too much work for a minimal gain of time. ;)
[22:58] <LobsterRoll> probably
[22:58] <LobsterRoll> but I'm already halfway down the road
[22:58] <flaf> (and risky)
[22:58] * evelu (~erwan@37.161.187.176) Quit (Ping timeout: 480 seconds)
[23:05] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[23:06] * alfredodeza (~alfredode@198.206.133.89) has joined #ceph
[23:06] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[23:06] * LobsterRoll (~LobsterRo@140.247.242.44) Quit (Ping timeout: 480 seconds)
[23:06] * jtriley (~jtriley@140.247.242.54) Quit (Ping timeout: 480 seconds)
[23:07] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[23:08] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[23:09] * alfredodeza (~alfredode@198.206.133.89) has left #ceph
[23:12] * dgurtner_ (~dgurtner@213.55.184.224) has joined #ceph
[23:14] * dgurtner (~dgurtner@178.197.234.222) Quit (Ping timeout: 480 seconds)
[23:15] * linuxkidd (~linuxkidd@245.sub-70-210-58.myvzw.com) Quit (Quit: Leaving)
[23:16] * LobsterRoll (~LobsterRo@140.247.242.44) has joined #ceph
[23:18] * Discovery (~Discovery@178.239.49.67) Quit (Read error: Connection reset by peer)
[23:20] * SEBI1 (~capitalth@84ZAADE2X.tor-irc.dnsbl.oftc.net) Quit ()
[23:20] * AG_Scott (~Arcturus@95.211.205.151) has joined #ceph
[23:22] * nhm (~nhm@172.56.7.97) has joined #ceph
[23:22] * ChanServ sets mode +o nhm
[23:24] * shyu_ (~shyu@114.241.13.231) Quit (Ping timeout: 480 seconds)
[23:31] * davidz1 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[23:32] * Rickus_ (~Rickus@office.protected.ca) has joined #ceph
[23:33] * davidz2 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[23:33] * vbellur (~vijay@71.234.224.255) has joined #ceph
[23:35] * davidz3 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) has joined #ceph
[23:35] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[23:37] * davidz (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[23:39] * Rickus (~Rickus@office.protected.ca) Quit (Ping timeout: 480 seconds)
[23:40] * davidz1 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[23:40] * allaok (~allaok@ARennes-658-1-79-162.w92-139.abo.wanadoo.fr) Quit (Quit: Leaving.)
[23:42] * davidz2 (~davidz@2605:e000:1313:8003:add4:7ea9:c6b7:3d75) Quit (Ping timeout: 480 seconds)
[23:42] * FjordPrefect (~soorya@103.231.216.194) Quit (Ping timeout: 480 seconds)
[23:42] * dgurtner_ (~dgurtner@213.55.184.224) Quit (Ping timeout: 480 seconds)
[23:48] * nhm (~nhm@172.56.7.97) Quit (Ping timeout: 480 seconds)
[23:50] * AG_Scott (~Arcturus@76GAADCVK.tor-irc.dnsbl.oftc.net) Quit ()
[23:50] * andrew_m (~darks@Relay-J.tor-exit.network) has joined #ceph
[23:51] * sleinen1 (~Adium@2001:620:0:69::101) Quit (Ping timeout: 480 seconds)
[23:52] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.