#ceph IRC Log

IRC Log for 2016-10-06

Timestamps are in GMT/BST.

[0:14] * kristen (~kristen@134.134.139.76) Quit (Quit: Leaving)
[0:15] * cathode (~cathode@50-232-215-114-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[0:19] * bene3 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[0:19] * bene3 (~bene@nat-pool-bos-t.redhat.com) Quit ()
[0:22] * bene3 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[0:23] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[0:23] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[0:23] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[0:26] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[0:29] * davidzlap (~Adium@2605:e000:1313:8003:683f:b1ef:b626:4aa1) has joined #ceph
[0:31] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[0:46] * bene3 (~bene@nat-pool-bos-t.redhat.com) Quit (Quit: Konversation terminated!)
[0:53] * xinli (~charleyst@32.97.110.55) Quit (Remote host closed the connection)
[0:57] * ledgr_ (~ledgr@88-222-11-185.meganet.lt) Quit (Quit: Leaving...)
[0:59] * ira (~ira@12.118.3.106) Quit (Ping timeout: 480 seconds)
[1:00] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) Quit (Ping timeout: 480 seconds)
[1:04] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[1:09] * vata (~vata@207.96.182.162) Quit (Quit: Leaving.)
[1:24] * oms101 (~oms101@p20030057EA49CC00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:29] * logan- (~logan@63.143.60.136) Quit (Ping timeout: 480 seconds)
[1:32] * oms101 (~oms101@p20030057EA3E1F00C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[1:38] * logan- (~logan@63.143.60.136) has joined #ceph
[1:42] * vata (~vata@96.127.202.136) has joined #ceph
[1:43] * sudocat1 (~dibarra@192.185.1.20) has joined #ceph
[1:47] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[1:49] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[1:49] * davidzlap1 (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[1:51] * sudocat1 (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[1:57] * davidzlap (~Adium@2605:e000:1313:8003:683f:b1ef:b626:4aa1) Quit (Ping timeout: 480 seconds)
[1:59] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) has joined #ceph
[2:03] * LegalResale (~LegalResa@66.165.126.130) Quit (Remote host closed the connection)
[2:03] * linuxkidd (~linuxkidd@ip70-189-214-97.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[2:04] * LegalResale (~LegalResa@66.165.126.130) has joined #ceph
[2:05] * andreww (~xarses@64.124.158.3) Quit (Ping timeout: 480 seconds)
[2:05] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[2:06] * Concubidated (~cube@68.140.239.164) Quit (Quit: Leaving.)
[2:11] * dgurtner_ (~dgurtner@109.236.136.226) Quit (Ping timeout: 480 seconds)
[2:13] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[2:14] * linuxkidd (~linuxkidd@mobile-166-171-122-121.mycingular.net) has joined #ceph
[2:18] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[2:20] * salwasser (~Adium@2601:197:101:5cc1:b1e3:cd9:d44d:6ad6) has joined #ceph
[2:22] * dneary (~dneary@pool-96-233-46-27.bstnma.fios.verizon.net) has joined #ceph
[2:27] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[2:27] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[2:28] * linuxkidd (~linuxkidd@mobile-166-171-122-121.mycingular.net) Quit (Ping timeout: 480 seconds)
[2:29] * sudocat (~dibarra@2602:306:8bc7:4c50:f913:a406:3cba:59e1) has joined #ceph
[2:35] * davidzlap1 (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit (Quit: Leaving.)
[2:35] * davidzlap (~Adium@2605:e000:1313:8003:3436:7c31:a0c1:d646) has joined #ceph
[2:37] * linuxkidd (~linuxkidd@ip70-189-232-202.lv.lv.cox.net) has joined #ceph
[2:37] * dneary (~dneary@pool-96-233-46-27.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[2:39] * efirs1 (~firs@209.49.4.114) Quit (Ping timeout: 480 seconds)
[2:40] * Concubidated (~cube@h4.246.129.40.static.ip.windstream.net) has joined #ceph
[2:42] * davidzlap (~Adium@2605:e000:1313:8003:3436:7c31:a0c1:d646) Quit (Quit: Leaving.)
[2:54] * rf`1 (~Hejt@104.156.228.192) has joined #ceph
[2:57] * linuxkidd (~linuxkidd@ip70-189-232-202.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[3:04] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[3:05] * vbellur (~vijay@71.234.224.255) has joined #ceph
[3:07] * linuxkidd (~linuxkidd@mobile-166-171-122-121.mycingular.net) has joined #ceph
[3:07] * Racpatel (~Racpatel@2601:87:3:31e3::34db) Quit (Ping timeout: 480 seconds)
[3:08] * jfaj (~jan@p20030084AD152E005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[3:15] * salwasser (~Adium@2601:197:101:5cc1:b1e3:cd9:d44d:6ad6) Quit (Quit: Leaving.)
[3:16] * Racpatel (~Racpatel@2601:87:3:31e3::34db) has joined #ceph
[3:17] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[3:17] * jfaj (~jan@p20030084AD1667005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) has joined #ceph
[3:18] * adamcrume__ (~quassel@2601:647:cb01:f890:9555:5234:1afb:604b) has joined #ceph
[3:18] * adamcrume___ (~quassel@2601:647:cb01:f890:a0dc:2825:3ec5:f9da) has joined #ceph
[3:19] * adamcrume (~quassel@2601:647:cb01:f890:c07:bc1:9ffe:aecb) Quit (Ping timeout: 480 seconds)
[3:20] * adamcrume_ (~quassel@2601:647:cb01:f890:c07:bc1:9ffe:aecb) Quit (Ping timeout: 480 seconds)
[3:23] * rf`1 (~Hejt@104.156.228.192) Quit ()
[3:23] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Quit: Leaving.)
[3:26] * dneary (~dneary@pool-96-233-46-27.bstnma.fios.verizon.net) has joined #ceph
[3:30] * sudocat (~dibarra@2602:306:8bc7:4c50:f913:a406:3cba:59e1) Quit (Ping timeout: 480 seconds)
[3:36] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[3:53] <tessier_> If ceph supposedly has an s3 API etc. and acts as an object store would that make cephfs decent for storing Maildir? I've read about people looking to implement libradosgw in dovecot etc. but nothing seems to have happened there. The world has needed a good mailstore for a long time.
[3:54] * yanzheng1 (~zhyan@125.70.23.12) has joined #ceph
[3:55] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[4:00] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit ()
[4:06] * jfaj (~jan@p20030084AD1667005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[4:09] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[4:16] * jfaj (~jan@p20030084AD3146005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) has joined #ceph
[4:20] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[4:20] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[4:22] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:28] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[4:31] * linuxkidd (~linuxkidd@mobile-166-171-122-121.mycingular.net) Quit (Ping timeout: 480 seconds)
[4:31] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[4:41] * linuxkidd (~linuxkidd@ip70-189-232-202.lv.lv.cox.net) has joined #ceph
[4:54] * ade_b (~abradshaw@p200300886B2C3100A6C494FFFE000780.dip0.t-ipconnect.de) has joined #ceph
[5:01] * ade (~abradshaw@p4FF7B414.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[5:03] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Ping timeout: 482 seconds)
[5:04] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[5:10] * linuxkidd (~linuxkidd@ip70-189-232-202.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[5:13] * dneary (~dneary@pool-96-233-46-27.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[5:22] * linuxkidd (~linuxkidd@mobile-166-171-122-121.mycingular.net) has joined #ceph
[5:30] * linuxkidd (~linuxkidd@mobile-166-171-122-121.mycingular.net) Quit (Ping timeout: 480 seconds)
[5:33] * rotbeard (~redbeard@aftr-109-90-233-215.unity-media.net) has joined #ceph
[5:34] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Quit: Leaving.)
[5:38] * linuxkidd (~linuxkidd@ip70-189-214-97.lv.lv.cox.net) has joined #ceph
[5:39] * krypto (~krypto@G68-121-13-23.sbcis.sbc.com) has joined #ceph
[5:39] * vimal (~vikumar@114.143.160.250) has joined #ceph
[5:58] * Vacuum__ (~Vacuum@88.130.193.8) has joined #ceph
[6:01] <tessier_> Finally got a VM working on ceph! Yeay!
[6:02] <tessier_> Odd thing: iostat on each of my two osd servers show 80MB/s being written to disk. But the VM itself only claims to be writing 45MB/s.
[6:02] * sep (~sep@2a04:2740:1ab:1::2) Quit (Ping timeout: 480 seconds)
[6:04] * sep (~sep@2a04:2740:1ab:1::2) has joined #ceph
[6:05] * Vacuum_ (~Vacuum@88.130.222.224) Quit (Ping timeout: 480 seconds)
[6:08] * vimal (~vikumar@114.143.160.250) Quit (Quit: Leaving)
[6:09] * John341 (~ceph@118.200.221.105) Quit (Ping timeout: 480 seconds)
[6:11] * walcubi (~walcubi@p5797A06E.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:12] * walcubi (~walcubi@p5797AEFD.dip0.t-ipconnect.de) has joined #ceph
[6:19] * SweetGirl (~hoopy@163.172.150.189) has joined #ceph
[6:34] * vimal (~vikumar@121.244.87.116) has joined #ceph
[6:36] <tessier_> Hmm...journal on same device. That's how.
[6:41] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[6:41] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[6:43] * vikhyat (~vumrao@49.248.82.78) has joined #ceph
[6:46] * vata (~vata@96.127.202.136) Quit (Quit: Leaving.)
[6:49] * SweetGirl (~hoopy@163.172.150.189) Quit ()
[6:57] * rdas (~rdas@121.244.87.116) has joined #ceph
[7:03] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[7:11] * mnc (~mnc@c-50-137-214-131.hsd1.mn.comcast.net) has joined #ceph
[7:12] * mnc (~mnc@c-50-137-214-131.hsd1.mn.comcast.net) Quit ()
[7:34] * karnan (~karnan@125.16.34.66) has joined #ceph
[7:35] * John341 (~ceph@118.200.221.105) has joined #ceph
[7:42] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[7:58] * vikhyat (~vumrao@49.248.82.78) Quit (Ping timeout: 480 seconds)
[7:58] * rwheeler (~rwheeler@46.189.28.81) Quit (Quit: Leaving)
[7:59] <FidoNet> morning ... so in the midst of a renumber I think I've broken my mds cluster ... any tips on recovering / restoring ?
[7:59] * branto (~branto@transit-86-181-132-209.redhat.com) has joined #ceph
[8:07] * vikhyat (~vumrao@49.248.94.97) has joined #ceph
[8:11] * rraja (~rraja@125.16.34.66) has joined #ceph
[8:13] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:19] * efirs (~firs@98.207.153.155) Quit (Quit: Leaving.)
[8:24] * Ivan1 (~ipencak@213.151.95.130) has joined #ceph
[8:25] * newdave (~newdave@14-202-180-170.tpgi.com.au) has joined #ceph
[8:25] * newdave (~newdave@14-202-180-170.tpgi.com.au) Quit (Remote host closed the connection)
[8:25] * lmb (~Lars@ip5b404bab.dynamic.kabel-deutschland.de) Quit (Ping timeout: 480 seconds)
[8:29] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) has joined #ceph
[8:33] * briner (~briner@129.194.16.54) Quit (Quit: briner)
[8:34] * b0e (~aledermue@213.95.25.82) has joined #ceph
[8:40] * fridim (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[8:41] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) Quit (Ping timeout: 480 seconds)
[8:46] * dgurtner (~dgurtner@178.197.233.113) has joined #ceph
[8:46] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:50] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) has joined #ceph
[9:00] * dux0r (~Aal@178-175-128-50.static.host) has joined #ceph
[9:06] * Hemanth (~hkumar_@125.16.34.66) has joined #ceph
[9:06] * b0e (~aledermue@213.95.25.82) Quit (Ping timeout: 480 seconds)
[9:14] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[9:15] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) has joined #ceph
[9:19] * b0e (~aledermue@213.95.25.82) has joined #ceph
[9:19] * nardial (~ls@p5DC06206.dip0.t-ipconnect.de) has joined #ceph
[9:22] * nilez (~nilez@104.129.29.42) Quit (Ping timeout: 480 seconds)
[9:22] <IcePic> tessier_: isn't Maildir based on doing smart renames and moves to (re)sort and tag mails?
[9:23] <IcePic> tessier_: I dont think the S3 interface will feel snappy for those kinds of operations, especially if the client is distant, network wise
[9:27] * lmb (~Lars@62.214.2.210) has joined #ceph
[9:30] * rotbeard (~redbeard@aftr-109-90-233-215.unity-media.net) Quit (Quit: Leaving)
[9:30] * dux0r (~Aal@178-175-128-50.static.host) Quit ()
[9:35] * nilez (~nilez@104.129.29.42) has joined #ceph
[9:35] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) Quit (Remote host closed the connection)
[9:38] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) has joined #ceph
[9:44] * Goodi (~Hannu@194.251.119.207) has joined #ceph
[9:45] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) Quit (Quit: Konversation terminated!)
[9:45] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) has joined #ceph
[9:47] * Ivan1 (~ipencak@213.151.95.130) Quit (Remote host closed the connection)
[9:49] * Ivan1 (~ipencak@213.151.95.130) has joined #ceph
[9:58] * sto_ is now known as sto
[10:00] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[10:01] * krypto (~krypto@G68-121-13-23.sbcis.sbc.com) Quit (Ping timeout: 480 seconds)
[10:01] * ivve (~zed@m176-68-29-112.cust.tele2.se) has joined #ceph
[10:04] * vikhyat (~vumrao@49.248.94.97) Quit (Ping timeout: 480 seconds)
[10:12] * ashah (~ashah@125.16.34.66) has joined #ceph
[10:12] * vikhyat (~vumrao@114.143.46.214) has joined #ceph
[10:13] * krypto (~krypto@G68-121-13-23.sbcis.sbc.com) has joined #ceph
[10:14] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[10:22] * b0e1 (~aledermue@213.95.25.82) has joined #ceph
[10:22] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[10:28] * derjohn_mob (~aj@67.red-176-83-15.dynamicip.rima-tde.net) has joined #ceph
[10:32] * lmb (~Lars@62.214.2.210) has joined #ceph
[10:33] * DanFoster (~Daniel@2a00:1ee0:3:1337:6c24:8eb3:9c6a:ee55) has joined #ceph
[10:34] * dgurtner (~dgurtner@178.197.233.113) Quit (Read error: Connection reset by peer)
[10:38] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[10:42] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) has joined #ceph
[10:44] * andrei__1 (~andrei@host81-151-140-236.range81-151.btcentralplus.com) has joined #ceph
[10:44] <andrei__1> hello guys
[10:44] <andrei__1> I am having an issue with starting radosgw service
[10:44] <andrei__1> after i've upgraded from 10.2.2 to 10.2.3
[10:45] <andrei__1> i get the following error:
[10:45] <andrei__1> 2016-10-05 22:15:10.049523 7f48bd0cba00 0 zonegroup default missing zone for master_zone=
[10:45] <andrei__1> 2016-10-05 22:15:10.056794 7f48bd0cba00 -1 Couldn't init storage provider (RADOS)
[10:45] <andrei__1> could someone help me out please?
[10:52] * hk135 (~horner@rs-mailrelay1.hornerscomputer.co.uk) Quit (Quit: leaving)
[10:55] * Ramakrishnan (~ramakrish@125.16.34.66) has joined #ceph
[10:56] * ggarg (~ggarg@host-82-135-29-34.customer.m-online.net) Quit (Ping timeout: 480 seconds)
[10:56] * JANorman (~JANorman@81.137.246.31) has joined #ceph
[10:57] * Chaos_Llama (~Chrissi_@tsn109-201-154-205.dyn.nltelcom.net) has joined #ceph
[10:57] * Ramakrishnan (~ramakrish@125.16.34.66) Quit ()
[10:57] <JANorman> Morning!
[10:57] * Ramakrishnan (~ramakrish@125.16.34.66) has joined #ceph
[10:57] <IcePic> andrei__1: see if this mail (and the follow up) help you https://www.mail-archive.com/ceph-users@lists.ceph.com/msg31764.html
[10:58] <IcePic> its about how the realm -> zonegroup -> zone things in Jewel should end up
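For reference, the checks that thread walks through look roughly like the following on Jewel; the "default" zonegroup/zone names and the JSON edit are assumptions taken from the error above, so compare against the post before committing anything:

    # dump the current default zonegroup and zone
    radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup.json
    radosgw-admin zone get --rgw-zone=default > zone.json
    # if "master_zone" in zonegroup.json is empty, set it to the "id" from zone.json,
    # then re-import the zonegroup and commit the new period
    radosgw-admin zonegroup set --rgw-zonegroup=default < zonegroup.json
    radosgw-admin period update --commit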
[11:01] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[11:03] * lmb (~Lars@62.214.2.210) has joined #ceph
[11:04] <andrei__1> IcePic, thanks ,i will take a look
[11:08] * ivve (~zed@m176-68-29-112.cust.tele2.se) Quit (Ping timeout: 480 seconds)
[11:10] * dgurtner (~dgurtner@178.197.228.112) has joined #ceph
[11:11] * Mika_c (~Mika@122.146.93.152) has joined #ceph
[11:11] * adamcrume___ (~quassel@2601:647:cb01:f890:a0dc:2825:3ec5:f9da) Quit (Ping timeout: 480 seconds)
[11:11] * adamcrume__ (~quassel@2601:647:cb01:f890:9555:5234:1afb:604b) Quit (Ping timeout: 480 seconds)
[11:12] * Ramakrishnan (~ramakrish@125.16.34.66) Quit (Quit: Leaving)
[11:12] * JANorman (~JANorman@81.137.246.31) Quit (Remote host closed the connection)
[11:12] * JANorman (~JANorman@81.137.246.31) has joined #ceph
[11:13] * Ramakrishnan (~ramakrish@125.16.34.66) has joined #ceph
[11:15] * Ramakrishnan (~ramakrish@125.16.34.66) has left #ceph
[11:16] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[11:16] * Ramakrishnan (~ramakrish@125.16.34.66) has joined #ceph
[11:17] * andrei__1 (~andrei@host81-151-140-236.range81-151.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[11:17] <JANorman> I'm using the C API libraries, and trying to write directly to an erasure coded pool. I get error 95 - operation not supported/permitted. We're using rados_append. Someone recommended rados_write_full but we don't want to use this due to memory limitations. We were also recommended to add a replicated pool in front of the erasure coded pool, but we don't actually control the client's Ceph installations. Is there any way to append to an object that works?
[11:20] * ggarg (~ggarg@host-82-135-29-34.customer.m-online.net) has joined #ceph
[11:21] * Concubidated (~cube@h4.246.129.40.static.ip.windstream.net) Quit (Quit: Leaving.)
[11:23] * KpuCko (~KpuCko@87-126-68-130.ip.btc-net.bg) has joined #ceph
[11:23] <KpuCko> hello, anybody alive?
[11:23] * rdas (~rdas@121.244.87.113) has joined #ceph
[11:24] <doppelgrau> JANorman: IIRC EC-Pools have a reduced feature set, either reduce to that or use a (small) cache pool
[11:25] <JANorman> doppelgrau: Is the cache not going away in 2.0?
[11:26] <doppelgrau> 2.0?
[11:26] * andrei__1 (~andrei@37.220.104.190) has joined #ceph
[11:26] <JANorman> Ceph 2.0
[11:26] <KpuCko> How can I see the status of the ceph cluster when a node is down? When i type ceph --status i'm only getting "2016-10-06 12:20:53.628571 7fa6396f9700 0 -- 10.1.85.110:0/1068598500 >> 10.1.85.102:6789/0 pipe(0x7fa630004f50 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fa6300051e0).fault"
[11:27] * Chaos_Llama (~Chrissi_@tsn109-201-154-205.dyn.nltelcom.net) Quit ()
[11:27] <JANorman> Or perhaps I heard wrong!
[11:27] <Gugge-47527> JANorman: are you looking for this? http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-September/004068.html
[11:28] <doppelgrau> JANorman: what version should ceph 2.0 be? current stable is 10.2.3...
[11:29] <peetaur2> doppelgrau: RH renumbered them to make sure to confuse paying users
[11:29] <JANorman> Gugge-47527: Thanks, yeah have come across that thread. Seems this should be handled by Ceph itself not the client trying to write the data?
[11:31] <doppelgrau> peetaur2: oh, what a great idea
[11:31] <Gugge-47527> JANorman: seems like you need to obey the alignment on appends in EC pools
[11:32] <Gugge-47527> The easy way is to put a replicated pool in front
[11:32] <Gugge-47527> then you will have access to all features
[11:32] <JANorman> Gugge-47527: It's not too clear if I were to respect that, how I would actually do that. Should I rados_append in 4k bytes chunks?
[11:32] <JANorman> Gugge-47527: Do you have reference to which features using EC disables?
[11:32] <Gugge-47527> in n*4k byte chunks
[11:32] <Gugge-47527> i have no idea
[11:33] <JANorman> Gugge-47527: ok, np.
[11:33] <JANorman> Gugge-47527: n being? (sorry if it's obvious)
[11:33] <Gugge-47527> its not about the size, its about the alignment
[11:36] * derjohn_mob (~aj@67.red-176-83-15.dynamicip.rima-tde.net) Quit (Ping timeout: 480 seconds)
[11:38] * vikhyat (~vumrao@114.143.46.214) Quit (Quit: Leaving)
[11:44] * egi (~egi@83.220.237.221) has joined #ceph
[11:44] <JANorman> This really does seem like something that should be handled by Ceph and not client's writing the data
[11:44] <JANorman> :/
[11:45] <Gugge-47527> everything requiring a read/modify/write is not allowed in EC pools
[11:45] <Gugge-47527> deal with it :P
[11:46] <doppelgrau> design decision, since append can be painfully expensive in EC pools (read all chunks, change object, do EC again, write back)
[11:46] <Gugge-47527> but this is one of the reasons why i dont use EC pools :)
[11:46] <doppelgrau> if you need these operations, use a replicated pool, a cache pool or find workarounds
[11:47] * derjohn_mob (~aj@189.red-176-83-104.dynamicip.rima-tde.net) has joined #ceph
[11:51] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[11:51] <FidoNet> morning ... so in the midst of a renumber I think I've broken my mds cluster ... any tips on recovering / restoring ?
[11:52] <JANorman> Gugge-47527: "everything requiring a read/modify/write is not allowed in EC pools" what exactly do you mean by this?
[11:58] <egi> Hello, can you help me? I'm trying to configure cephfs on 3 nodes. After install everything looked fine for a week. When the usage of ceph reached 5TB some osds got marked as down. The case is very similar to this http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-April/000291.html . I applied all the changes to the ceph cluster, but when I turn on deep-scrubbing or just scrubbing - the osds get marked down again.. Version of ceph - 9.2.1, CentOS 7.2.1511, kernel - 3.10. The OSDs use disk arrays over fibre channel (the journal is on the disk array too). Now I'm trying one hdd per osd and everything looks fine with deep-scrubbing, but I'd rather configure ceph on the disk arrays. Is it normal to use one osd per disk array?
[11:59] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Quit: Nettalk6 - www.ntalk.de)
[12:07] * wkennington (~wak@0001bde8.user.oftc.net) Quit (Quit: Leaving)
[12:11] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[12:17] <IcePic> egi: one osd per disk is normal
[12:18] <Gugge-47527> JANorman: I'm not sure I can explain it better
[12:18] <Gugge-47527> JANorman: but in ec pools you cant overwrite part of an object, without reading the object, modifying it, and writing the full object again
[12:19] <Gugge-47527> JANorman: rados does not do that for you, but expects you to do that yourself
[12:19] <Gugge-47527> JANorman: instead of allowing a slow write
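A minimal librados C sketch of what Gugge-47527 is describing, assuming a hypothetical EC pool "ecpool" and object "myobject": query the pool's required alignment and only hand rados_append() whole multiples of it, buffering any unaligned tail yourself.

    /*
     * Hedged sketch (not from the log): appending to an object on an EC pool
     * in aligned chunks with the librados C API.  The pool name "ecpool", the
     * object name "myobject" and the 4 MB dummy buffer are made-up examples.
     */
    #include <rados/librados.h>
    #include <stdint.h>
    #include <stdlib.h>

    int main(void)
    {
        rados_t cluster;
        rados_ioctx_t io;
        uint64_t align = 1;

        if (rados_create(&cluster, NULL) < 0)                 /* connect as client.admin */
            return 1;
        rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
        if (rados_connect(cluster) < 0)
            return 1;
        if (rados_ioctx_create(cluster, "ecpool", &io) < 0)   /* hypothetical EC pool */
            return 1;

        /* EC pools only accept appends that are whole multiples of the stripe width. */
        if (rados_ioctx_pool_requires_alignment(io))
            align = rados_ioctx_pool_required_alignment(io);

        size_t total = 4 * 1024 * 1024;                       /* pretend payload */
        char *buf = calloc(1, total);
        size_t off = 0;

        while (total - off >= align) {
            size_t chunk = ((total - off) / align) * align;   /* largest aligned piece */
            if (rados_append(io, "myobject", buf + off, chunk) < 0)
                break;
            off += chunk;
        }
        /* Any unaligned tail (total - off bytes) has to be buffered by the caller
         * until more data arrives, or written with a different strategy. */

        free(buf);
        rados_ioctx_destroy(io);
        rados_shutdown(cluster);
        return 0;
    }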
[12:23] <egi> IcePic: Can I use one disk array per osd?
[12:26] <peetaur2> you should probably let Ceph handle the redundancy and not use arrays .... but I think you can do it, just not a useful idea, more complex, more problems
[12:26] * andrei__1 (~andrei@37.220.104.190) Quit (Ping timeout: 480 seconds)
[12:28] * dgurtner (~dgurtner@178.197.228.112) Quit (Ping timeout: 480 seconds)
[12:29] <IcePic> there is nothing to prevent it, you may do it, but as peetaur2 said, better to leave redundancy to ceph
[12:29] <egi> peetaur2: I haven't got that many nodes for that =)
[12:30] <peetaur2> nodes? just add many osds to each host
[12:30] <peetaur2> I'm buying 2U 12 disk machines and will have up to 12 OSDs on each
[12:30] <egi> I have a certain number of HDDs and only 3 nodes, so I can only use one disk array per node to make use of them all
[12:30] <IcePic> same goes for zfs, better to leave mirroring/extra copies/raid stuff to the fs, instead of stacking them ontop of eachother
[12:31] <peetaur2> so is your disk array system a single point of failure?
[12:31] <egi> my nodes can fit only 3 HDD (
[12:32] <egi> I have 3 disk array, so system can tolerate one disk array failure
[12:32] <IcePic> the point is that you can tell ceph how you hardware is rigged, with x disks per host, x hosts per rack, x racks per site and so on, and ceph can make sure copies dont end up in the same disk, same host, same rack and so on
[12:32] <egi> but problem is in deep-scrubbing.. when i allow it - my osd's marked as down
[12:33] <doppelgrau> egi: reduce load? (max scrubs and so on)
[12:33] <IcePic> egi: check if your network get starved, or if your disks stack up long queues while scrubbing?
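The scrub-throttling knobs doppelgrau is alluding to can be injected at runtime; a sketch only, with illustrative values rather than recommendations (the same options can also live under [osd] in ceph.conf):

    # throttle scrubbing so it competes less with client and recovery I/O
    ceph tell osd.* injectargs '--osd_max_scrubs 1'
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
    # then watch whether the OSDs still get marked down during a deep scrub
    ceph -s
    ceph osd tree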
[12:34] <peetaur2> sounds like something runs slow due to increased load and times out and gives up (to make the cluster faster I guess if just 1 disk is dying... but if all go down that's not what you want)
[12:34] <IcePic> egi: and as the mail thread you pointed to, have you looked at similar things in your logs to see if your case actually is identical to that one?
[12:35] <peetaur2> I would at least test 1 disk per array per osd... like do all disks in 1 array on one node, or 1/3 disks in each array on each node, every disk in the 'array' is just a single disk ... jbod or pass through, not a real array
[12:35] <egi> I've fixed so many things that I don't know what to do next)) Fixed the tcp stack, reduced the priority of scrubbing and so on..
[12:38] <peetaur2> I wouldn't disable scrub... I think scrub is showing you a problem, not causing one
[12:38] <egi> Can my problem be on journal over fiber channel or something like that?
[12:38] <peetaur2> you don't want to have to disable all kinds of things just to make it work but still fragile... you want it to tolerate any abuse you give it
[12:38] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:38] <peetaur2> maybe the log would be useful
[12:39] <peetaur2> or just test with single disk per osd first
[12:39] <egi> peetaur2: But is it safe to disable scrub and deep-scrub permanently?
[12:39] * [0x4A6F] (~ident@p508CD485.dip0.t-ipconnect.de) has joined #ceph
[12:39] <peetaur2> single disk per osd is normal... it's what everyone does, so you will not run into unique bugs that way
[12:40] <peetaur2> scrub will prevent silent corruption... maybe that's safe, but still less safe
[12:40] <egi> peetaur2: log is pretty similar to http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-April/000291.html
[12:40] <peetaur2> it's like fsck on bootup...you hate it but don't disable it :D
[12:41] * Mika_c (~Mika@122.146.93.152) Quit (Remote host closed the connection)
[12:41] <egi> peetaur2: thank you so much )
[12:42] * Hemanth (~hkumar_@125.16.34.66) Quit (Quit: Leaving)
[12:44] <sep> one of my osd's keep crashing. and the last line say ;; -1> 2016-10-06 08:09:18.869687 7ffaa037f700 -1 osd.7 pg_epoch: 128840 pg[5.3as0( v 84797'30080 (67219'27080,84797'30080] local-les=128834 n=13146 ec=61149 les/c 128834/127358 128829/128829/128829) [7,109,4,0,62,32]/[7,109,32,0,62,39] r=0 lpr=128829 pi=127357-128828/12 rops=5 bft=4(2),32(5) crt=0'0 lcod 0'0 mlcod 0'0 active+remapped+backfilling] handle_recovery_read_complete: inconsistent shard sizes
[12:44] <sep> 5/abc6d43a/rbd_data.33640a238e1f29.000000000003b165/head the offending shard must be manually removed after verifying there are enough shards to recover (0, 8388608, [32(2),0, 39(5),0])
[12:44] <sep> googling this error gives 1 hit, the commit when the error message went into the source code.
[12:45] <sep> anyone know how i would go about manually verifying enough shards ?
[12:46] * derjohn_mob (~aj@189.red-176-83-104.dynamicip.rima-tde.net) Quit (Ping timeout: 480 seconds)
[12:48] * jeh (~jeh@76.16.206.198) Quit ()
[12:48] <peetaur2> sep: using EC?
[12:48] <peetaur2> sep: https://github.com/ceph/ceph/pull/6946
[12:50] <peetaur2> so.... not sure what it means exactly. maybe this is what they say they added in that link: (0, 8388608, [32(2),0, 39(5),0])
[12:50] * bniver (~bniver@pool-71-174-250-171.bstnma.fios.verizon.net) Quit (Remote host closed the connection)
[12:50] <peetaur2> but it also says "and what steps must be taken to fix the issue" ... does it say which steps?
[12:51] <sep> yes that's the one
[12:51] <sep> "the offending shard must be manually removed after verifying there are enough shards to recover"
[12:52] <sep> i am assuming "the offending shard" is the shard in the error 5/abc6d43a/rbd_data.33640a238e1f29.000000000003b165/head
[12:52] * Hemanth (~hkumar_@125.16.34.66) has joined #ceph
[12:58] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) Quit (Quit: Leaving.)
[12:58] * krypto (~krypto@G68-121-13-23.sbcis.sbc.com) Quit (Ping timeout: 480 seconds)
[12:59] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[12:59] <egi> Someone know, can I use current cephfs version with kernel 3.10?
[13:00] <peetaur2> egi: questions like that are in the docs/faq, ... they basically say you can mix whatever you want, because ceph daemons are userland; so only things like kernel cephfs and rbd drivers are affected
[13:00] <peetaur2> kernel drivers like those are in the clients
[13:01] <peetaur2> for rbd compat, here's a table http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client
[13:02] <egi> The question is not so simple to ask the docs =) I found a problem with "cache pressure" and googled that this problem is caused by too old a kernel version
[13:03] <egi> the table from the site said that version is ok, but the bug description said that 4.+ is better
[13:03] <peetaur2> ok let me answer another way then... see http://docs.ceph.com/docs/master/start/os-recommendations/ where in the table they write "B, I, C" for centos 7 and ubuntu 14.04 ... so if you use those, then you can bet it's well tested
[13:04] <peetaur2> so strange mismatches are unlikely in those daemon machines. (but maybe still clients will fail in some way, but without taking down the cluster)
[13:04] <egi> thank you again ) that was all my questions
[13:04] <peetaur2> ubuntu 14.04 uses kernel 3.13.x by the way
[13:06] * andrei__1 (~andrei@37.220.104.190) has joined #ceph
[13:07] * egi (~egi@83.220.237.221) Quit (Quit: Leaving)
[13:09] <sep> hum i actually have 2 osd's failing on that same object. how would i know which one is bad ?
[13:10] <peetaur2> someone once said that you should have size 3 so you can know which is bad
[13:10] * lmb (~Lars@62.214.2.210) has joined #ceph
[13:10] <peetaur2> and I once saw something about a checksum feature...maybe it was for bluestore only
[13:10] <peetaur2> without a checksum or a majority vote sort of thing, you can't know which is bad
[13:11] <peetaur2> (or knowing the data...like if you read it and one has readable text and other is binary, you could guess the readable text is the right one
[13:11] <peetaur2> maybe ceph-objectstore-tool can let you read the content
[13:13] * ade_b (~abradshaw@p200300886B2C3100A6C494FFFE000780.dip0.t-ipconnect.de) Quit (Quit: Too sexy for his shirt)
[13:13] <sep> peetaur2, i have size3 for replicated pools but this is Erasure coding
[13:13] * ade (~abradshaw@p4FF7B18D.dip0.t-ipconnect.de) has joined #ceph
[13:14] <sep> so the objects are not 3 identical copies; they all have unique md5sums.
[13:16] <peetaur2> but if you reverse the math, you should have 1-3 matching
[13:16] <sep> altho i would wish the osd moved such "offending shards" to a lost+found or offending-objects directory, instead of crashing and causing a lot of recovery
[13:16] <peetaur2> maybe ceph-objectstore-tool does that for you
[13:17] <peetaur2> yeah I agree... full safety stop and manual recovery is a fine option, but I would choose the auto recovery and rely on backups if that fails
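If it does come down to removing the shard by hand, the usual shape of it with ceph-objectstore-tool is roughly the following; a sketch only, with the OSD id and object name taken from sep's paste, a placeholder for the JSON identifier, and a copy exported before anything is removed:

    systemctl stop ceph-osd@7
    # find the object's JSON identifier on this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
        --journal-path /var/lib/ceph/osd/ceph-7/journal \
        --op list rbd_data.33640a238e1f29.000000000003b165
    # keep a copy of the shard, then remove it
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
        --journal-path /var/lib/ceph/osd/ceph-7/journal \
        '<json-from-list>' get-bytes /root/shard-backup.bin
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
        --journal-path /var/lib/ceph/osd/ceph-7/journal \
        '<json-from-list>' remove
    systemctl start ceph-osd@7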
[13:22] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[13:23] * karnan (~karnan@125.16.34.66) Quit (Remote host closed the connection)
[13:30] * mib_24qg6i (a668f648@64.62.228.82) has joined #ceph
[13:32] <mib_24qg6i> hello, can you help me? When I create an rbd on the client-node, there is a problem: "monclient(hunting): authenticate timed out after 300."
[13:33] <mib_24qg6i> The rbd was created successfully when I create it on mon-node.
[13:34] <peetaur2> mib_24qg6i: could be a firewall issue or some kind of network issue
[13:35] * andrei__1 (~andrei@37.220.104.190) Quit (Ping timeout: 480 seconds)
[13:35] <mib_24qg6i> I can connect by ssh from client-node to mon-node.
[13:36] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[13:37] * dgurtner (~dgurtner@178.197.224.203) has joined #ceph
[13:42] <sep> mib_24qg6i, client node can reach all osd's and mon's ? same mtu set all over ?
[13:50] <mib_24qg6i> the osds and mons are on one computer, and the client is a different computer.
[13:53] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[13:53] <thoht> SamYaple: what is a reasonable value for bcache, with a SATA disk of 1.1TB ? is 300G of SSD ok ? too much ?
[14:03] <FidoNet> how long does am nds replay take ?
[14:03] <FidoNet> meh .. an mds replay ...
[14:04] * andrei__1 (~andrei@37.220.104.190) has joined #ceph
[14:06] * ade_b (~abradshaw@p4FF79CD2.dip0.t-ipconnect.de) has joined #ceph
[14:07] * ade (~abradshaw@p4FF7B18D.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[14:08] * mib_24qg6i (a668f648@64.62.228.82) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[14:17] <darkfader> thoht: bcache is very keen on skipping sequential IO and stuff, so 300GB is quite a bit
[14:17] <darkfader> but since it's not easy to increase or manage i'd say go with anything between 100 and 300GB and just see how it goes
[14:18] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[14:20] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[14:33] * derjohn_mob (~aj@52.red-176-83-101.dynamicip.rima-tde.net) has joined #ceph
[14:33] * Hemanth (~hkumar_@125.16.34.66) Quit (Ping timeout: 480 seconds)
[14:33] * ashah (~ashah@125.16.34.66) Quit (Quit: Leaving)
[14:35] <sep> wrote to the mailing list to see if anyone had any more suggestions about the offending shards
[14:37] * lmb (~Lars@62.214.2.210) has joined #ceph
[14:38] * The1w (~jens@node3.survey-it.dk) has joined #ceph
[14:43] * KpuCko (~KpuCko@87-126-68-130.ip.btc-net.bg) Quit ()
[14:43] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[14:50] * Racpatel (~Racpatel@2601:87:3:31e3::34db) Quit (Quit: Leaving)
[14:50] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[14:52] * derjohn_mob (~aj@52.red-176-83-101.dynamicip.rima-tde.net) Quit (Ping timeout: 480 seconds)
[14:55] * The_Ball (~pi@20.92-221-43.customer.lyse.net) Quit (Read error: Connection reset by peer)
[14:56] * The1w (~jens@node3.survey-it.dk) Quit (Remote host closed the connection)
[14:57] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[14:57] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[14:58] * Racpatel (~Racpatel@2601:87:3:31e3::34db) has joined #ceph
[14:58] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[15:00] <andrei__1> hello guys
[15:00] <andrei__1> i've got a quick and simple question
[15:01] <andrei__1> I am running ubuntu server and I am unable to start radosgw service as user ceph
[15:01] * lmb (~Lars@62.214.2.210) has joined #ceph
[15:01] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[15:01] <andrei__1> by unable i mean that the service radosgw start by default starts the process as user root
[15:01] <andrei__1> and not ceph
[15:01] <andrei__1> all other ceph processes are started as user ceph
[15:02] <andrei__1> could someone please tell me where do I specify the start user for the radosgw process?
[15:03] * derjohn_mob (~aj@147.red-176-83-69.dynamicip.rima-tde.net) has joined #ceph
[15:06] * jarrpa (~jarrpa@2602:3f:e183:a600:a4c6:1a92:820f:bb6) has joined #ceph
[15:07] <btaylor> i was rereading the 'Learning Ceph' book last night and noticed it said that if the osd server has a raid controller then it's best to put the drives in raid0s as single drives, but it didn't offer any explanation as to why. Is it really better than running them in JBOD?
[15:07] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[15:11] <IcePic> btaylor: its better to let ceph handle all disks separately
[15:11] <btaylor> which is why i???d think JBOD would be better
[15:11] * fridim (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Ping timeout: 480 seconds)
[15:11] <IcePic> btaylor: then they can each fail on their own, instead of a large jbod/raidX that makes it all fail at the same time
[15:12] <btaylor> ... i thought jbod just presented them all to the OS as if they were all single disks 'regularly'
[15:12] <FidoNet> yes but what if you create 8 x RAID0 from 8 drives (ie 8 distinct virtual disks)
[15:12] <IcePic> jbod => just-a-bunch-of-disks => one large concatenation of all disks into one
[15:12] <FidoNet> as opposed to 1 virtual disk consisting of 8 drives (which is what I think you're thinking when there's talk of presenting the drive(s) as RAID0)
[15:13] <FidoNet> some controllers don't have the option for JBOD
[15:13] <IcePic> LSI will allow you to make single-disk raid0s so that may be one way of saying "let the ceph see the raw single disk"
[15:13] <btaylor> maybe jbod mode on my controller is different because i'm seeing sda - sdx in my OS
[15:14] <FidoNet> I've done that before with FreeNAS ... the H700 for example (LSI 9240?) doesn't do JBOD
[15:15] <IcePic> seems like different vendors use JBOD differently than I am used to also.
[15:15] <IcePic> some do actually mean "handle disks separately", which makes me wonder why that even got a name, but there you go.
[15:16] <btaylor> ok so i guess that's why there wasn't much more explanation. as long as they are single disks, it's in the best configuration
[15:16] <IcePic> yes, so that the failure domain is one-disk == one-osd in ceph
[15:17] <IcePic> which is why you shouldnt raid0 the lot and hand out 1-2-3-4 small parts to ceph or something like that.
[15:17] <btaylor> right
[15:18] <FidoNet> ok that's cool ... I had wondered if there was a reason why the advice was against using RAID0 ... when my thought was export the disks as a RAID0 unit of 1 disk ... I guess it's an interpretation thing :) ... so long as there isn't a valid reason for not using RAID0 of 1 disk then that makes life a lot simpler
[15:18] <btaylor> but putting them into their own raid0s would have some additional overhead because of the controller, right?
[15:18] <IcePic> its not about being of a small size or so, but rather allowing ceph to manage the failures in the best possible way.
[15:18] <andrei__1> IcePic, thanks for the link. i've run through what the guy did and it helped to start the service. Even though the errors were different in his case
[15:18] <IcePic> btaylor: that overhead is probably free in terms of time since it's solved in hardware or by a cpu on the raid card.
[15:18] * The_Ball (~pi@20.92-221-43.customer.lyse.net) has joined #ceph
[15:18] <IcePic> andrei__1: \o/
[15:19] <IcePic> andrei__1: it helped me also, and I didn't have exactly the same problem as him, nor you, but his research was good
[15:19] <andrei__1> IcePic, I am now having some minor issues trying to figure out why my radosgw service starts as user root and not ceph
[15:20] <andrei__1> which is strange as all other ceph services are run as user ceph
[15:22] <andrei__1> test
[15:22] <IcePic> andrei__1: perhaps so it can bind to 80/443 for the web stuff?
[15:23] * fridim (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[15:24] <andrei__1> IcePic, nope, not that. it starts okay as user ceph if I manually start it with the user options
[15:24] <IcePic> ok, I was just guessing there.
[15:24] <andrei__1> the init scripts are not starting it as ceph for some reason
[15:24] <andrei__1> i think it starts initially as root, binds and drops privs to user ceph
[15:24] <andrei__1> that's how it's able to run on ports 80/443 as ceph
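For reference, the Jewel packaging handles the user switch with the daemon's own --setuser/--setgroup flags rather than a systemd User= setting, so the unit actually in use is the thing to check. Roughly (unit path and instance name are from memory, treat as a sketch):

    # /lib/systemd/system/ceph-radosgw@.service (Jewel, approximately)
    #   ExecStart=/usr/bin/radosgw -f --cluster ${CLUSTER} --name client.%i --setuser ceph --setgroup ceph
    systemctl cat ceph-radosgw@rgw.$(hostname -s)
    systemctl status ceph-radosgw@rgw.$(hostname -s)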
[15:29] * ira (~ira@12.118.3.106) has joined #ceph
[15:29] * yanzheng1 (~zhyan@125.70.23.12) Quit (Quit: This computer has gone to sleep)
[15:32] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[15:32] * scuttle|afk is now known as scuttlemonkey
[15:37] * tvon (~tvon@50.58.105.131) has joined #ceph
[15:39] * salwasser (~Adium@72.246.3.14) has joined #ceph
[15:40] * diver (~diver@95.85.8.93) has joined #ceph
[15:41] <tvon> I don't suppose anyone has used the ceph/demo docker image with the xhyve-based Docker for Mac?
[15:43] <tvon> Something about how xhyve or docker handles networking leads to an "unable to find any IP address in networks" error. I have the same command working with docker-machine but at the moment ceph/demo is the only reason I have docker-machine installed and I'd like to get past that.
[15:43] * gmoro (~guilherme@193.120.208.221) Quit (Quit: Leaving)
[15:43] * gmoro (~guilherme@193.120.208.221) has joined #ceph
[15:44] <andrei__1> right, i've figured out where my problem was
[15:44] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[15:44] <andrei__1> it was the incorrect systemctl configuration
[15:44] <andrei__1> all sorted now
[15:44] <andrei__1> IcePic, thanks for your help mate!
[15:49] * marco208 (~root@159.253.7.204) Quit (Ping timeout: 480 seconds)
[15:50] * Ramakrishnan (~ramakrish@125.16.34.66) Quit (Ping timeout: 480 seconds)
[15:51] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[15:53] <FidoNet> I'm having issues with ceph-mon not starting after a reboot ... I can't see anything obvious in the logs and it seems to start ok if I run it manually ... this is all on ubuntu 16.04 and I'm kind of new to systemd .. what should I be looking for?
[15:53] * wiebalck (~wiebalck@pb-d-128-141-194-105.cern.ch) has joined #ceph
[15:53] * marco208 (~root@159.253.7.204) has joined #ceph
[15:55] * rendar (~I@82.61.125.178) has joined #ceph
[15:56] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[15:56] * wiebalck_ (~wiebalck@pb-d-128-141-7-249.cern.ch) has joined #ceph
[15:59] <andrei__1> FidoNet, you should have a servcie called something like
[15:59] <andrei__1> ceph-mon@<hostname>
[16:00] <FidoNet> yup .. if I do a systemctl start ceph-mon@mon01 it starts
[16:00] <andrei__1> so, in my case when I run service ceph-mon@arh-ibstorage1-ib status
[16:00] * wiebalck__ (~wiebalck@pb-d-128-141-6-130.cern.ch) has joined #ceph
[16:00] <FidoNet> it just doesn???t start on boot any more
[16:00] <andrei__1> it gives me info on the process
[16:00] <FidoNet> aha
[16:01] <FidoNet> disabled ??? ok now enabled :)
[16:01] <FidoNet> that was easy
[16:01] <andrei__1> FidoNet, what about systemctl is-enabled <service name>
[16:01] <andrei__1> it should give you a status
[16:01] <andrei__1> do enable and it should start at boot
[16:02] <FidoNet> testing .. but symlinks created so I guess it should be ok - thanks
[16:02] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[16:02] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[16:02] <FidoNet> have been doing a lot of juggling ... renumbering / etc ...
[16:02] <FidoNet> yup that worked
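For anyone hitting the same thing, the fix above boils down to (unit name as used in the conversation):

    systemctl is-enabled ceph-mon@mon01    # reported "disabled"
    systemctl enable ceph-mon@mon01        # creates the symlinks so it starts at boot
    systemctl start ceph-mon@mon01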
[16:03] <FidoNet> now anyone here know anything about mds and replay / etc
[16:03] <FidoNet> ?
[16:04] * jfaj (~jan@p20030084AD3146005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Quit: WeeChat 1.5)
[16:04] * wiebalck_ (~wiebalck@pb-d-128-141-7-249.cern.ch) Quit (Ping timeout: 480 seconds)
[16:04] * marco208 (~root@159.253.7.204) Quit (Ping timeout: 480 seconds)
[16:07] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:12] * jarrpa (~jarrpa@2602:3f:e183:a600:a4c6:1a92:820f:bb6) Quit (Ping timeout: 480 seconds)
[16:13] <wiebalck__> The OpenStack Manila CephFS Native driver evicts all clients upon initialization (unless the client is using the admin authID). From what I see, this prevents the use of multiple Manila share servers running at the same time, the last one will win. Does anyone know what the reason for this behaviour is?
[16:17] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) Quit (Remote host closed the connection)
[16:17] * andrei__1 (~andrei@37.220.104.190) Quit (Quit: Ex-Chat)
[16:18] * derjohn_mob (~aj@147.red-176-83-69.dynamicip.rima-tde.net) Quit (Ping timeout: 480 seconds)
[16:19] * marco208 (~root@159.253.7.204) has joined #ceph
[16:21] * andreww (~xarses@64.124.158.3) has joined #ceph
[16:21] * nardial (~ls@p5DC06206.dip0.t-ipconnect.de) Quit (Quit: Leaving)
[16:23] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[16:23] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[16:23] * jarrpa (~jarrpa@63.225.131.166) has joined #ceph
[16:23] * jfaj (~jan@p20030084AD3146006AF728FFFE6777FF.dip0.t-ipconnect.de) has joined #ceph
[16:27] * derjohn_mob (~aj@169.red-176-83-69.dynamicip.rima-tde.net) has joined #ceph
[16:27] * jcsp (~jspray@62.220.28.124) has joined #ceph
[16:32] * wiebalck_ (~wiebalck@pb-d-128-141-7-249.cern.ch) has joined #ceph
[16:33] * Goodi (~Hannu@194.251.119.207) Quit (Quit: This computer has gone to sleep)
[16:35] <wiebalck_> jcsp ^, maybe? :)
[16:35] <jcsp> Just connected, I don't have the scrollback
[16:35] <jcsp> (I'm on slightly unreliable wifi)
[16:36] <wiebalck_> The OpenStack Manila CephFS Native driver evicts all clients upon initialization (unless the client is using the admin authID). From what I see, this prevents the use of multiple Manila share servers running at the same time, the last one will win. Does anyone know what the reason for this behaviour is?
[16:36] <rraja> wiebalck__: so I guess you're talking about this https://github.com/openstack/manila/blob/master/manila/share/drivers/cephfs/cephfs_native.py#L143
[16:36] <wiebalck_> rraja: yes
[16:37] <rraja> wiebalck_: you're having multiple m-shr service using same ceph_auth_id?
[16:37] * wiebalck__ (~wiebalck@pb-d-128-141-6-130.cern.ch) Quit (Ping timeout: 480 seconds)
[16:37] <wiebalck_> rraja: yes
[16:38] <wiebalck_> that's at least what I planned to do, copying the way we run Cinder
[16:38] <jcsp> hmm, the eviction is necessary to avoid a long timeout waiting for a previous (dead) instance to have its capabilities taken away (because volumeclient acts as a libcephfs client)
[16:39] * Kurt (~Adium@2001:628:1:5:3cbc:8252:2296:6255) Quit (Quit: Leaving.)
[16:39] * georgem (~Adium@206.108.127.16) has joined #ceph
[16:39] <jcsp> so if you have multiple share server instances with multiple ceph backends that target the same cluster, then I guess that is an issue (didn't think of it)
[16:40] <jcsp> if you use different auth ids for different backends then the problem would go away, although I wonder if there is a neater way for us to handle this
[16:41] <rraja> wiebalck_: can you not have each ceph-driver-specific manila config section use a different cephfs_auth_id?
[16:42] <wiebalck_> it seems awkward to use different IDs to access the very same backend, no? the m_shr services are clones of each other.
[16:44] <wiebalck_> would it make sense to make the default ID configurable?
[16:45] <jcsp> hmm, I thought that Manila was only supposed to run one plugin instance at a time for a given backend (i.e. pick one share service and run it there)
[16:45] <jcsp> but honestly I've always been a bit fuzzy on that part
[16:45] * bvi (~Bastiaan@185.56.32.1) has joined #ceph
[16:46] <jcsp> or maybe it runs multiple instances but only calls through to one of them or something
[16:46] * vata (~vata@207.96.182.162) has joined #ceph
[16:47] <wiebalck_> I regarded the m-shr instances more like worker threads
[16:48] <rraja> jcsp: AFAIK manila's scheduler picks the backend host if the share type is same based on available space etc.
[16:49] <wiebalck_> yeah, but the backend host is not necessarily a host, but more a tag of identical hosts ... at least I have treated it like that :)
[16:49] <wiebalck_> as Manila is similar to Cinder I copied my usage
[16:51] <wiebalck_> and there you also had the notion of a host (now backend_hostname IIRC) which in the rbd driver, for instance, is rbd:<pool> I think
[16:52] <wiebalck_> are you aware of any issue if there were multiple m-shr servers?
[16:54] * rwheeler (~rwheeler@62.214.2.210) has joined #ceph
[16:55] <jcsp> I'm casting my mind back to http://lists.openstack.org/pipermail/openstack-dev/2016-March/088372.html
[16:55] * rwheeler (~rwheeler@62.214.2.210) Quit ()
[16:55] * devicenull (sid4013@id-4013.ealing.irccloud.com) Quit (Ping timeout: 480 seconds)
[16:55] * jnq (sid150909@id-150909.highgate.irccloud.com) Quit (Remote host closed the connection)
[16:56] * scalability-junk (sid6422@id-6422.ealing.irccloud.com) Quit (Ping timeout: 480 seconds)
[16:56] * JohnPreston78 (sid31393@id-31393.ealing.irccloud.com) Quit (Ping timeout: 480 seconds)
[16:58] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[17:00] * derjohn_mob (~aj@169.red-176-83-69.dynamicip.rima-tde.net) Quit (Read error: Connection reset by peer)
[17:00] <jcsp> in the multi-active Manila case, we can't do anything smart without Manila telling us something about whether our peers are meant to be alive or not
[17:01] * lmb (~Lars@62.214.2.210) has joined #ceph
[17:01] <jcsp> when you've got two identical Manila's running, how does it even decide when one has failed?
[17:02] * dgurtner (~dgurtner@178.197.224.203) Quit (Ping timeout: 480 seconds)
[17:03] * jeh (~jeh@76.16.206.198) has joined #ceph
[17:04] <wiebalck_> not sure it has to decide: if a req comes in one of the instances picks it from the queue, the other doesn't see it (maybe I'm just too naive :)
[17:05] * Concubidated (~cube@68.140.239.164) has joined #ceph
[17:05] <wiebalck_> so if both are up, the faster one does the job; if only one is up, well, it does it
[17:06] <jcsp> hmm, the guidance from the thread I had before said that shares were particularly assigned to one, so requests affecting a particular share couldn't just be sent to either
[17:06] <jcsp> it's possible that that is no longer the case I guess
[17:07] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[17:07] <wiebalck_> in Cinder (sorry), everyone was using the host param to make the volume servers identical and ensure that volume operations happen if at least one server is up
[17:08] <wiebalck_> this flag has been moved into the backend definition and by default does not use host specific information anymore
[17:08] <wiebalck_> doesn't mean that Manila works the same way of course
[17:09] * nardial (~ls@p5DC06206.dip0.t-ipconnect.de) has joined #ceph
[17:09] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[17:10] * kristen (~kristen@134.134.139.78) has joined #ceph
[17:10] * Ivan1 (~ipencak@213.151.95.130) Quit (Quit: Leaving.)
[17:13] <rraja> jcsp: I think each backend config section definition would lead to a unique host name constructed from a combination of config section name + share type (?)
[17:16] <wiebalck_> rraja: which is essentially the same as my setup where I have only one identical (!) config section on all m-shr instances and use the global "host" parameter :)
[17:16] * wiebalck_ (~wiebalck@pb-d-128-141-7-249.cern.ch) Quit (Quit: wiebalck_)
[17:18] <rraja> wiebalck: okay. for now can you try using a different cephfs_auth_id for each of those config sections. and as for whether having multiple backend config sections pointing to the same storage cluster works properly, i'll have to verify with the manila community
[17:18] <rraja> does that make sense?
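A sketch of what rraja is suggesting, with made-up section and auth id names; each manila-share host would point its backend section at a distinct cephfs_auth_id (Newton-era cephfs_native driver options):

    # manila.conf on the first manila-share host (hypothetical names)
    [DEFAULT]
    enabled_share_backends = cephfsnative1

    [cephfsnative1]
    share_backend_name = CEPHFSNATIVE1
    share_driver = manila.share.drivers.cephfs.cephfs_native.CephFSNativeDriver
    driver_handles_share_servers = False
    cephfs_conf_path = /etc/ceph/ceph.conf
    cephfs_auth_id = manila1        # use a different id (e.g. manila2) on the other host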
[17:20] * JohnPreston78 (sid31393@2604:8300:100:200b:6667:2:0:7aa1) has joined #ceph
[17:20] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[17:21] <rraja> wiebalck: so a share's host field is identified as mentioned here, https://github.com/openstack/manila/blob/stable/newton/doc/source/devref/pool-aware-manila-scheduler.rst#data-model-impact
[17:22] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) has joined #ceph
[17:22] * scalability-junk (sid6422@id-6422.ealing.irccloud.com) has joined #ceph
[17:25] * cetex (~oskar@nadine.juza.se) Quit (Remote host closed the connection)
[17:25] * cetex (~oskar@nadine.juza.se) has joined #ceph
[17:25] * devicenull (sid4013@id-4013.ealing.irccloud.com) has joined #ceph
[17:34] * dneary (~dneary@main-branch-wireless.portland.lib.me.us) has joined #ceph
[17:36] * lmb (~Lars@62.214.2.210) has joined #ceph
[17:36] * dgurtner (~dgurtner@109.236.136.226) has joined #ceph
[17:38] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[17:41] * jnq (sid150909@id-150909.highgate.irccloud.com) has joined #ceph
[17:43] * nardial (~ls@p5DC06206.dip0.t-ipconnect.de) Quit (Quit: Leaving)
[17:47] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) Quit (Ping timeout: 480 seconds)
[17:48] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[17:50] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[17:54] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) has joined #ceph
[17:55] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[17:56] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[18:00] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) Quit (Quit: leaving)
[18:00] * tvon (~tvon@50.58.105.131) Quit (Quit: My MacBook has gone to sleep. ZZZzzz???)
[18:01] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) has joined #ceph
[18:01] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[18:02] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) Quit ()
[18:02] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) has joined #ceph
[18:04] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[18:04] * ira (~ira@12.118.3.106) Quit (Remote host closed the connection)
[18:05] * tvon (~tvon@50.58.105.131) has joined #ceph
[18:06] * b0e1 (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[18:06] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:07] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[18:09] * lmb (~Lars@62.214.2.210) has joined #ceph
[18:10] * georgem (~Adium@24.114.57.180) has joined #ceph
[18:10] * dneary (~dneary@main-branch-wireless.portland.lib.me.us) Quit (Ping timeout: 480 seconds)
[18:18] * rdas (~rdas@121.244.87.113) Quit (Remote host closed the connection)
[18:20] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) Quit (Quit: leaving)
[18:21] * lmb (~Lars@62.214.2.210) Quit (Ping timeout: 480 seconds)
[18:21] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) has joined #ceph
[18:21] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:21] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[18:22] * efirs (~firs@98.207.153.155) has joined #ceph
[18:24] * georgem (~Adium@24.114.57.180) Quit (Quit: Leaving.)
[18:25] * nilez (~nilez@104.129.29.42) Quit (Ping timeout: 480 seconds)
[18:25] * jcsp (~jspray@62.220.28.124) Quit (Quit: Ex-Chat)
[18:28] * kefu (~kefu@114.92.125.128) has joined #ceph
[18:29] * kefu (~kefu@114.92.125.128) Quit ()
[18:30] * nilez (~nilez@104.129.29.42) has joined #ceph
[18:32] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) has joined #ceph
[18:33] * wedge (~wedge@modemcable104.203-131-66.mc.videotron.ca) has joined #ceph
[18:35] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[18:37] <limebyte> well after 4 days, my 100Mbit Internet cluster is still alive
[18:37] <limebyte> not damn slow, but slow
[18:38] <limebyte> doing fine so far
[18:40] * rdas (~rdas@121.244.87.113) has joined #ceph
[18:42] * ircolle (~Adium@2601:285:201:633a:ded:c0e5:7272:e568) Quit (Quit: Leaving.)
[18:44] * Ramakrishnan (~ramakrish@106.51.27.134) has joined #ceph
[18:45] * JANorman (~JANorman@81.137.246.31) Quit (Ping timeout: 480 seconds)
[18:46] * DanFoster (~Daniel@2a00:1ee0:3:1337:6c24:8eb3:9c6a:ee55) Quit (Quit: Leaving)
[18:48] * rotbeard (~redbeard@2a02:908:df13:bb00:b8e3:9985:4e8c:27db) has joined #ceph
[18:52] * wiebalck_ (~wiebalck@AAnnecy-653-1-50-224.w90-41.abo.wanadoo.fr) has joined #ceph
[18:54] * wedge (~wedge@modemcable104.203-131-66.mc.videotron.ca) Quit ()
[19:03] * sw3 (sweaung@2400:6180:0:d0::66:100f) Quit (Quit: need to zzZ)
[19:07] * JANorman (~JANorman@host31-48-185-146.range31-48.btcentralplus.com) has joined #ceph
[19:08] * sw3 (sweaung@2400:6180:0:d0::66:100f) has joined #ceph
[19:11] * branto (~branto@transit-86-181-132-209.redhat.com) Quit (Quit: ZNC 1.6.3 - http://znc.in)
[19:15] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[19:15] * nilez (~nilez@104.129.29.42) Quit (Ping timeout: 480 seconds)
[19:15] * JANorman (~JANorman@host31-48-185-146.range31-48.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[19:21] * nilez (~nilez@104.129.29.42) has joined #ceph
[19:23] * jdillaman (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[19:24] * bvi (~Bastiaan@185.56.32.1) Quit (Ping timeout: 480 seconds)
[19:32] <rraja> wiebalck: wiebalck_ : ping about manila's cephfs_native driver
[19:32] * tvon (~tvon@50.58.105.131) Quit (Read error: Connection reset by peer)
[19:33] * georgem (~Adium@69-165-135-139.dsl.teksavvy.com) has joined #ceph
[19:33] <gregsfortytwo> rraja: wiebalck_: I spent less time in those meetings and things than others, but I'm definitely under the impression that Cinder is a lot farther along toward having a sensible multi-server story than Manila is
[19:33] * nilez (~nilez@104.129.29.42) Quit (Ping timeout: 480 seconds)
[19:34] <gregsfortytwo> Manila's multi-server story is/was very bad
[19:35] * xinli (~charleyst@32.97.110.54) has joined #ceph
[19:37] <rraja> gregsfortytwo: so wiebalck is targeting HA of manila-share service, right? and is worried about possible races?
[19:37] <gregsfortytwo> I gather that's the case? and we just force blacklist any other share servers in the driver setup, right?
[19:38] <gregsfortytwo> because we don't want to have horrible races going on from a not-really-dead server, and Manila is definitely not ready for HA
[19:38] <gregsfortytwo> (manila-share is the Manila service running our driver, right?)
[19:38] <rraja> yes. manila-share is the service running the driver.
[19:39] <rraja> gregsfortytwo: look at design summit topic for manila from line 26 here https://etherpad.openstack.org/p/manila-ocata-design-summit-topics
[19:39] <rraja> manila is definitely not ready for HA yet.
[19:39] <gregsfortytwo> yeah
[19:40] <gregsfortytwo> in April/May they were discussing whether it should use db locks, Zookeeper locks, mutexes, or a combination to prevent data races o_0
[19:40] * nilez (~nilez@104.129.29.42) has joined #ceph
[19:41] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[19:41] <rraja> the community is definitely not for db locks. they're still trying to figure out an optimal solution.
[19:48] <rraja> gregsfortytwo: "and we just force blacklist any other share servers in the driver setup, right?" https://github.com/openstack/manila/blob/master/manila/share/drivers/cephfs/cephfs_native.py#L147, only if all those manila-share services are set to use the same `cephfs_auth_id`.
[19:49] <limebyte> Tiny question: usually you set up the file redundancy over the PGs, right? But there is also an option in the configs?
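For context on the redundancy question above: in Ceph, replication is a per-pool property (the pool's "size"); the PGs of a pool inherit it, and the config file only sets the defaults used when new pools are created. A minimal sketch, with the pool name "mypool" and the values chosen purely for illustration:

    # ceph.conf [global] - defaults applied to newly created pools
    osd pool default size = 3        # replicas per object
    osd pool default min size = 2    # minimum replicas needed to keep serving I/O

    # change or inspect an existing pool at runtime (pool name is illustrative)
    ceph osd pool set mypool size 3
    ceph osd pool get mypool size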
[19:49] <gregsfortytwo> rraja: right, that's what I meant
[19:49] <gregsfortytwo> not that I've looked at really any of the source for this, but given the above discussion with jcsp
[19:49] <gregsfortytwo> I just wanted to back up because the whole multi-Manila thing doesn't seem like a good idea regardless of what our driver does
[19:51] <rraja> gregsfortytwo: yeah, so we recommend users not to have multiple manila share servers point to the same clusters, until manila has sorted out its HA story and the races that I just pointed out too.
[19:52] <rraja> gregsfortytwo: so basically the control plane doesn't have HA, but the data plane does because of Ceph's HA.
[19:53] <gregsfortytwo> yeah
[19:53] <rraja> hopefully wiebalck wiebalck_ got that. thanks gregsfortytwo!
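A rough sketch of the single-backend setup being recommended here: one manila-share service per Ceph cluster, with its own cephfs_auth_id, so the driver's blacklisting (linked above) behaves as intended. Option names follow the cephfs_native driver of that era; the section name, backend name, and auth ID are illustrative and should be checked against the Manila release in use:

    # manila.conf (illustrative backend section)
    [cephfsnative1]
    driver_handles_share_servers = False
    share_backend_name = CEPHFSNATIVE1
    share_driver = manila.share.drivers.cephfs.cephfs_native.CephFSNativeDriver
    cephfs_conf_path = /etc/ceph/ceph.conf
    cephfs_auth_id = manila    # give each manila-share service its own auth ID

Running a second manila-share service against the same cluster with the same cephfs_auth_id would defeat that blacklisting, which is the race rraja and gregsfortytwo are warning about.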
[19:54] * vata (~vata@207.96.182.162) Quit (Ping timeout: 480 seconds)
[19:54] * mykola (~Mikolaj@91.245.73.11) has joined #ceph
[19:57] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[19:57] * Ramakrishnan (~ramakrish@106.51.27.134) Quit (Read error: Connection reset by peer)
[19:58] * Hemanth (~hkumar_@103.228.221.149) has joined #ceph
[19:59] * bene3 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[20:01] * dgurtner (~dgurtner@109.236.136.226) Quit (Ping timeout: 480 seconds)
[20:02] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[20:04] * xinli (~charleyst@32.97.110.54) Quit (Remote host closed the connection)
[20:04] * xinli (~charleyst@32.97.110.54) has joined #ceph
[20:05] <gregsfortytwo> np and thanks
[20:08] * vata (~vata@207.96.182.162) has joined #ceph
[20:13] * rotbeard (~redbeard@2a02:908:df13:bb00:b8e3:9985:4e8c:27db) Quit (Quit: Leaving)
[20:17] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:19] <cholcombe> gregsfortytwo, does cephfs only need a bootstrap key to get going? Does it also need an admin cephx key?
[20:19] <gregsfortytwo> I really have no idea :)
[20:20] <cholcombe> ok
[20:20] <cholcombe> gregsfortytwo, is there a better person to ask for auth crap?
[20:20] <gregsfortytwo> eh, rraja probably messes around with it for the Manila drivers but it's gotten late in India by now
[20:20] <cholcombe> ok
[20:21] <gregsfortytwo> I think you'd need an auth key to do the fs create commands and things
[20:21] <gregsfortytwo> the bootstrap keys are really about enabling daemons, right
[20:21] <gregsfortytwo> ?
[20:21] <cholcombe> yeah i'm not exactly sure what profile mds allows
[20:21] <cholcombe> the fs create thing i've already got squared away
[20:22] * minnesotags (~herbgarci@c-50-137-242-97.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[20:24] * vata (~vata@207.96.182.162) Quit (Ping timeout: 480 seconds)
[20:25] * dneary (~dneary@main-branch-wireless.portland.lib.me.us) has joined #ceph
[20:26] * bene3 (~bene@nat-pool-bos-t.redhat.com) Quit (Quit: Konversation terminated!)
[20:28] <gregsfortytwo> oh, yeah, it should just need the mds bootstrap key to...bootstrap an mds :p
[20:28] <gregsfortytwo> and once that's up and you've got an FS in the maps, you have a filesystem
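A hedged sketch of the split gregsfortytwo describes: the bootstrap-mds key is only able to mint a key for an MDS daemon, while creating the pools and the filesystem itself needs a client with full mon access (e.g. client.admin). Daemon name, pool names, and PG counts are illustrative:

    # mint the MDS daemon's key using the bootstrap-mds keyring
    ceph --name client.bootstrap-mds \
         --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring \
         auth get-or-create mds.myhost \
         mon 'allow profile mds' osd 'allow rwx' mds 'allow' \
         -o /var/lib/ceph/mds/ceph-myhost/keyring

    # filesystem creation needs an admin-capable key
    ceph osd pool create cephfs_metadata 64
    ceph osd pool create cephfs_data 256
    ceph fs new cephfs cephfs_metadata cephfs_data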
[20:29] * davidzlap1 (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[20:35] * dneary (~dneary@main-branch-wireless.portland.lib.me.us) Quit (Ping timeout: 480 seconds)
[20:36] * vata (~vata@207.96.182.162) has joined #ceph
[20:44] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[20:45] * jermudgeon (~jermudgeo@southend.mdu.whitestone.link) has joined #ceph
[20:48] * salwasser (~Adium@72.246.3.14) Quit (Quit: Leaving.)
[20:49] * wiebalck_ (~wiebalck@AAnnecy-653-1-50-224.w90-41.abo.wanadoo.fr) Quit (Quit: wiebalck_)
[20:51] * Hemanth (~hkumar_@103.228.221.149) Quit (Quit: Leaving)
[20:53] * wiebalck_ (~wiebalck@AAnnecy-653-1-50-224.w90-41.abo.wanadoo.fr) has joined #ceph
[20:54] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[20:58] <limebyte> Guys
[20:58] <limebyte> if you set up multiple MDS servers
[20:58] <limebyte> one is active and the others are in standby
[20:58] <limebyte> how does Ceph pick an active server?
[20:58] <limebyte> does it change over time?
[20:59] <limebyte> Well, I've had my cluster running for 4 days
[20:59] <limebyte> and noticed the active MDS server changed for whatever reason
[20:59] <s3an2> limebyte, You may want to look at standby-replay over standby for faster failover
[20:59] <s3an2> It should not really change, I would be looking at the mds log files
[20:59] <limebyte> but it did change..
[21:00] <limebyte> I didn't touch my cluster
[21:00] <limebyte> shit
[21:00] <s3an2> Yea, AFAIK it should not change unless there is a problem
[21:00] <limebyte> then there was a prob
[21:00] <limebyte> fine
[21:00] <limebyte> but the cluster is healthy
[21:00] <limebyte> so fixed I guess
[21:01] <limebyte> premium failover
[21:01] <limebyte> which log file?
[21:01] <limebyte> ceph.log?
[21:01] <s3an2> yea, just have a good look at the mds log file when it changed and see if it gives you any indication why.
[21:01] <s3an2> you should see the log file in /var/log/ceph on the mds server
[21:02] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[21:02] <limebyte> mds.beacon.deHetzner handle_mds_beacon no longer laggy
[21:02] <limebyte> it respawned
[21:03] <limebyte> then turned into standby
[21:03] <limebyte> hmmm
[21:04] * davidzlap1 (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[21:04] <limebyte> well s3an2, it's over the Internet
[21:05] <limebyte> maybe a route fucked up, but TINC should usually re-route traffic over another VPN
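A minimal sketch of the standby-replay setup s3an2 suggested above, using the Jewel-era per-daemon options (daemon names are illustrative, and the exact option names should be checked against the release in use):

    # ceph.conf on the standby MDS host
    [mds.standbynode]
    mds standby replay = true      # tail the active MDS's journal for a faster takeover
    mds standby for rank = 0       # follow rank 0, the single active MDS

    # show which daemon is currently active vs standby
    ceph mds stat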
[21:05] * davidzlap (~Adium@2605:e000:1313:8003:110:644b:20e3:b787) has joined #ceph
[21:07] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[21:09] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) Quit (Ping timeout: 480 seconds)
[21:09] * ira (~ira@12.118.3.106) has joined #ceph
[21:10] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[21:11] <wiebalck_> gregsfortytwo rraja: thanks for the follow-up on the Manila multi share server setup
[21:12] <wiebalck_> I guess in that sense Cinder is better (and improving), but is not there yet either
[21:12] <rraja> wiebalck_: I just checked with a manila core developer. active-active in manila is definitely not supported
[21:13] <wiebalck_> rraja: thanks!
[21:13] <rraja> and he says it's not supported in Cinder as well
[21:13] <wiebalck_> exactly my point!
[21:14] <wiebalck_> we (and I think many others) run Cinder already in such a config, though
[21:14] <wiebalck_> the remaining races seem to have limited impact in real-life setups
[21:14] <rraja> wiebalck_: http://pastebin.com/6yNWSh0c
[21:15] <rraja> wiebalck_: interesting!
[21:16] <wiebalck_> rraja: thanks, that's very helpful!
[21:17] <wiebalck_> for now, we ran Manila in testing
[21:17] <rraja> wiebalck_: you're welcome.
[21:17] <wiebalck_> I'll try removing the eviction and see what happens
[21:17] <rraja> wiebalck_: how is it coming along?
[21:17] <wiebalck_> so far it's really great
[21:18] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[21:18] <wiebalck_> the main work went into setting up nodes, the DB, RabbitMQ
[21:18] * davidzlap (~Adium@2605:e000:1313:8003:110:644b:20e3:b787) Quit (Quit: Leaving.)
[21:18] <wiebalck_> the CephFS native driver setup was 30 minutes or so :)
[21:18] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:19] * davidzlap (~Adium@2605:e000:1313:8003:110:644b:20e3:b787) has joined #ceph
[21:19] <rraja> wiebalck_: good to know. please let us know if you have issues with the documentation. which version of Manila are you using?
[21:19] <rraja> and Ceph?
[21:19] <wiebalck_> initial testing went w/o issues, that's why I wanted to beef up the controllers from 1 to 3 (which is when I ran into the eviction issue)
[21:19] <wiebalck_> I'm on Mitaka atm.
[21:20] <wiebalck_> Waiting for Newton :)
[21:20] <rraja> and Ceph?
[21:20] <wiebalck_> Jewel
[21:20] <wiebalck_> we have set up a cluster in one VM :)
[21:21] <wiebalck_> for this test
[21:21] <wiebalck_> we have a real CephFS cluster that I'll connect to once I have the Manila setup and configuration under control
[21:21] * diver (~diver@95.85.8.93) Quit ()
[21:22] <rraja> wiebalck_: you just used the devstack-plugin-ceph script? :)
[21:23] <wiebalck_> no, we have an in-house Ceph expert :)
[21:24] <rraja> cool!
[21:26] * rraja (~rraja@125.16.34.66) Quit (Quit: Leaving)
[21:28] * ade_b (~abradshaw@p4FF79CD2.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[21:31] * sileht (~sileht@gizmo.sileht.net) Quit (Quit: WeeChat 1.5)
[21:31] * jarrpa (~jarrpa@63.225.131.166) Quit (Ping timeout: 480 seconds)
[21:32] * sileht (~sileht@gizmo.sileht.net) has joined #ceph
[21:39] * delcake (~Shadow386@31.220.4.161) has joined #ceph
[21:40] <thoht> after updating from Hammer to Jewel, surprise! it went to HEALTH_WARN: crush map has legacy tunables
[21:40] <thoht> i had to run "ceph osd crush tunables optimal", and then the cluster became unusable due to slowness
[21:42] <thoht> will it always be like that during version upgrades? i mean, having a Ceph cluster in production is critical, and i thought upgrades were transparent, but it wasn't totally transparent for a couple of hours due to this crush optimization :/
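For reference on thoht's situation: jumping straight to the "optimal" tunables profile after a Hammer-to-Jewel upgrade remaps a large share of the data, so it is commonly done in a quiet window with recovery throttled, and sometimes via an intermediate profile. A hedged sketch; the values are illustrative, not tuning advice:

    # throttle backfill/recovery before touching tunables
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

    # smaller step: move to the hammer profile first...
    ceph osd crush tunables hammer

    # ...and only later, when the resulting rebalance is acceptable, go to optimal
    ceph osd crush tunables optimal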
[21:47] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[21:49] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[21:51] * delcake (~Shadow386@31.220.4.161) Quit (Ping timeout: 480 seconds)
[21:52] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:56] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit ()
[21:58] * Racpatel (~Racpatel@2601:87:3:31e3::34db) Quit (Ping timeout: 480 seconds)
[21:58] * blizzow (~jburns@c-50-152-51-96.hsd1.co.comcast.net) has joined #ceph
[21:59] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:59] * gucore (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[22:01] * bjozet (~bjozet@82.183.17.144) has joined #ceph
[22:02] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[22:03] * fridim (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Ping timeout: 480 seconds)
[22:04] * mykola (~Mikolaj@91.245.73.11) Quit (Quit: away)
[22:09] * gucore (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Read error: No route to host)
[22:15] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[22:18] * gucore (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[22:22] * davidzlap1 (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[22:23] * lmb (~Lars@ip5b404bab.dynamic.kabel-deutschland.de) has joined #ceph
[22:24] * davidzlap (~Adium@2605:e000:1313:8003:110:644b:20e3:b787) Quit (Ping timeout: 480 seconds)
[22:24] * wiebalck_ (~wiebalck@AAnnecy-653-1-50-224.w90-41.abo.wanadoo.fr) Quit (Quit: wiebalck_)
[22:26] <fusl> what could be the reason that ceph starts a recovery of about 50% of objects when i just remove a single osd out of a 34 osd cluster?
[22:26] * gucore (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Read error: Connection reset by peer)
[22:26] <jermudgeon> fusl: did you weight/crush reweight first?
[22:27] <fusl> i did not, am i supposed to?
[22:29] <jermudgeon> it can help; depends on what you're doing
[22:29] <jermudgeon> ceph osd reweight moves data to other OSDs on same host; crush reweight remaps everywhere
[22:29] * Miouge (~Miouge@208.143-65-87.adsl-dyn.isp.belgacom.be) Quit (Quit: Miouge)
[22:30] <jermudgeon> in general, you'll always have object movement; I don't know why you're seeing 50% movement though, is your crush map sensible?
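For clarity on the two commands being contrasted: "ceph osd reweight" sets a temporary 0-1 override weight on an OSD (it is not preserved across the OSD being marked out and in again), while "ceph osd crush reweight" changes the OSD's weight in the CRUSH map itself and remaps data cluster-wide. A sketch, with OSD id 25 used purely for illustration:

    # override weight, range 0.0-1.0 (not persisted across out/in)
    ceph osd reweight 25 0.8

    # CRUSH weight, stored in the CRUSH map (conventionally the disk size in TiB)
    ceph osd crush reweight osd.25 1.0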
[22:30] * wiebalck_ (~wiebalck@AAnnecy-653-1-50-224.w90-41.abo.wanadoo.fr) has joined #ceph
[22:31] * wiebalck_ (~wiebalck@AAnnecy-653-1-50-224.w90-41.abo.wanadoo.fr) Quit ()
[22:31] <lurbs> What does 'ceph osd df' give you? What does your current data placement look like?
[22:34] * xinli (~charleyst@32.97.110.54) Quit (Ping timeout: 480 seconds)
[22:36] <fusl> right now the cluster runs with many osds taken out (up, out), though i only removed one osd yesterday from the cluster in health_ok state, and it's at "recovery 3569490/5207509 objects misplaced (68.545%)" at the moment
[22:37] <fusl> ceph osd df -> https://scr.meo.ws/paste/1475786212628282738.txt / ceph osd tree -> https://scr.meo.ws/paste/1475786235663746164.txt
[22:37] <fusl> osd 25 is the one i marked out
[22:38] <fusl> and here is my crush map in case its needed https://scr.meo.ws/paste/1475786302267855959.txt
[22:39] <fusl> the recovery started at about 45% yesterday and, instead of decreasing, it has steadily increased to 69% by now
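A hedged sketch of the usual way to drain an OSD before removing it, which avoids moving data twice (once when the OSD is marked out, again when it is deleted from the CRUSH map); osd.25 is used since that is the OSD in question here:

    # drain: take the OSD's CRUSH weight to 0 (optionally in smaller steps)
    ceph osd crush reweight osd.25 0
    # ...wait for the cluster to return to active+clean...

    # then remove it without triggering a second rebalance
    ceph osd out 25
    systemctl stop ceph-osd@25      # or the distro's init script
    ceph osd crush remove osd.25
    ceph auth del osd.25
    ceph osd rm 25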
[22:41] * xinli (~charleyst@32.97.110.51) has joined #ceph
[22:42] * davidzlap1 (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[22:45] * ntpttr_ (~ntpttr@134.134.139.76) has joined #ceph
[22:45] * ntpttr_ (~ntpttr@134.134.139.76) Quit ()
[22:46] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[22:48] * cyphase (~cyphase@000134f2.user.oftc.net) Quit (Quit: cyphase.com)
[22:48] * cyphase (~cyphase@2601:640:c401:969a:468a:5bff:fe29:b5fd) has joined #ceph
[22:49] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[22:53] * bjozet (~bjozet@82.183.17.144) Quit (Ping timeout: 480 seconds)
[22:53] * davidzlap (~Adium@2605:e000:1313:8003:9c2b:4dee:7d49:9248) has joined #ceph
[22:58] <fusl> ceph status > https://scr.meo.ws/paste/1475787444726875887.txt
[22:58] <jermudgeon> fusl: only 20 osds in?
[22:59] <jermudgeon> 20 out of 34 is seriously degraded
[22:59] <jermudgeon> or presumably, assuming how many replicas?
[22:59] <fusl> i'm scaling down the cluster to 20 nodes as i need to replace the underlying hardware nodes asap
[22:59] <fusl> 3 replicas for data, 5 for metadata
[23:00] <s3an2> 15 mons?
[23:02] <s3an2> 14 up:standby (that's 14 standby MDS servers?)
[23:04] * dneary (~dneary@main-branch-wireless.portland.lib.me.us) has joined #ceph
[23:05] * Kingrat (~shiny@2605:6000:1526:4063:2c6d:6d69:9355:642c) Quit (Remote host closed the connection)
[23:07] * georgem (~Adium@69-165-135-139.dsl.teksavvy.com) Quit (Quit: Leaving.)
[23:10] <fusl> s3an2: yea
[23:12] * dneary (~dneary@main-branch-wireless.portland.lib.me.us) Quit (Ping timeout: 480 seconds)
[23:18] * AluAlu (~dux0r@108.61.166.135) has joined #ceph
[23:25] * rendar (~I@82.61.125.178) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[23:32] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[23:34] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[23:48] * AluAlu (~dux0r@108.61.166.135) Quit ()
[23:50] * marco208 (~root@159.253.7.204) Quit (Ping timeout: 480 seconds)
[23:52] * dneary (~dneary@rrcs-24-103-206-82.nys.biz.rr.com) has joined #ceph
[23:53] * marco208 (~root@159.253.7.204) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.