#ceph IRC Log


IRC Log for 2016-09-14

Timestamps are in GMT/BST.

[0:00] * thansen (~thansen@17.253.sfcn.org) Quit (Quit: Ex-Chat)
[0:03] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[0:08] * salwasser (~Adium@2601:197:101:5cc1:34a9:b556:f3be:51e6) Quit (Quit: Leaving.)
[0:16] <imcsk8> hello, i have this problem: # cephfs-journal-tool journal reset
[0:16] <imcsk8> 2016-09-13 16:15:48.644064 7fcc51149700 0 -- 192.168.1.68:6801/743639902 >> 192.168.1.11:6800/1303 pipe(0x562c1d54a000 sd=8 :40976 s=1 pgs=0 cs=0 l=0 c=0x562c1d494f00).connect claims to be 192.168.1.11:6800/11579 not 192.168.1.11:6800/1303 - wrong node!
[0:19] * bene2 (~bene@2601:193:4101:f410:ea2a:eaff:fe08:3c7a) has joined #ceph
[0:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:27] * diver (~diver@95.85.8.93) has joined #ceph
[0:33] * diver_ (~diver@cpe-2606-A000-111B-C12B-710B-A724-C591-36D1.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[0:38] * vata (~vata@96.127.202.136) has joined #ceph
[0:43] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[0:50] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[0:50] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[0:53] * wes_dillingham (~wes_dilli@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[0:56] * rendar (~I@host221-46-dynamic.31-79-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[1:01] * srk (~Siva@2605:6000:ed04:ce00:29b6:4f15:6ed6:f965) has joined #ceph
[1:10] * kuku (~kuku@119.93.91.136) has joined #ceph
[1:12] * Discovery (~Discovery@109.235.52.6) Quit (Read error: Connection reset by peer)
[1:13] * diver (~diver@95.85.8.93) Quit (Remote host closed the connection)
[1:13] * diver (~diver@cpe-98-26-71-226.nc.res.rr.com) has joined #ceph
[1:14] * xarses_ (~xarses@64.124.158.3) Quit (Ping timeout: 480 seconds)
[1:15] * bene2 (~bene@2601:193:4101:f410:ea2a:eaff:fe08:3c7a) Quit (Quit: Konversation terminated!)
[1:21] * diver (~diver@cpe-98-26-71-226.nc.res.rr.com) Quit (Ping timeout: 480 seconds)
[1:23] * diver (~diver@cpe-2606-A000-111B-C12B-48B9-5224-465E-9CEB.dyn6.twc.com) has joined #ceph
[1:23] * diver (~diver@cpe-2606-A000-111B-C12B-48B9-5224-465E-9CEB.dyn6.twc.com) Quit (Remote host closed the connection)
[1:23] * diver (~diver@95.85.8.93) has joined #ceph
[1:23] * srk (~Siva@2605:6000:ed04:ce00:29b6:4f15:6ed6:f965) Quit (Ping timeout: 480 seconds)
[1:28] * dack (~oftc-webi@gateway.ola.bc.ca) has joined #ceph
[1:28] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[1:28] * kuku (~kuku@119.93.91.136) Quit (Remote host closed the connection)
[1:29] <dack> has anyone got a working ceph-fuse entry in fstab? The example from the docs doesn't work - maybe it's distribution specific?
[1:29] * diver_ (~diver@cpe-2606-A000-111B-C12B-48B9-5224-465E-9CEB.dyn6.twc.com) has joined #ceph
[1:32] * diver_ (~diver@cpe-2606-A000-111B-C12B-48B9-5224-465E-9CEB.dyn6.twc.com) Quit (Remote host closed the connection)
[1:32] * diver_ (~diver@cpe-98-26-71-226.nc.res.rr.com) has joined #ceph
[1:35] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[1:36] * diver (~diver@95.85.8.93) Quit (Ping timeout: 480 seconds)
[1:39] <dack> never mind, found a mailing list thread and this: https://bugzilla.redhat.com/show_bug.cgi?id=1248003
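The fstab form the docs were describing at the time looks roughly like the sketch below; the client id "admin" and the mount point /mnt/cephfs are placeholders, not taken from this log.

    # /etc/fstab entry for mounting CephFS via ceph-fuse (a sketch)
    id=admin  /mnt/cephfs  fuse.ceph  defaults,_netdev  0  0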
[1:40] * diver_ (~diver@cpe-98-26-71-226.nc.res.rr.com) Quit (Ping timeout: 480 seconds)
[1:41] * diver (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) has joined #ceph
[1:45] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) Quit (Ping timeout: 480 seconds)
[1:45] * xarses_ (~xarses@73.93.155.167) has joined #ceph
[1:50] * northrup (~northrup@50-249-151-243-static.hfc.comcastbusiness.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:55] * jarrpa (~jarrpa@67-4-148-200.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[1:58] * cathode (~cathode@50.232.215.114) Quit (Quit: Leaving)
[2:02] * diver_ (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) has joined #ceph
[2:04] * diver__ (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) has joined #ceph
[2:08] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[2:08] * diver (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[2:10] * diver_ (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[2:12] * cholcombe (~chris@97.93.161.2) Quit (Ping timeout: 480 seconds)
[2:12] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[2:16] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[2:17] * srk (~Siva@2605:6000:ed04:ce00:b0f3:ee03:22a2:3dde) has joined #ceph
[2:22] * xarses_ (~xarses@73.93.155.167) Quit (Ping timeout: 480 seconds)
[2:26] * diver (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) has joined #ceph
[2:28] * diver_ (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) has joined #ceph
[2:32] * diver__ (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[2:34] * diver (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[2:35] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:39] <jiffe> so I'm curious, if I have an osd marked down/out it reorganizes the pgs, if I then remove it from crush it reorganizes again, why the double reorganization?
[2:40] * srk (~Siva@2605:6000:ed04:ce00:b0f3:ee03:22a2:3dde) Quit (Ping timeout: 480 seconds)
[2:40] * diver_ (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) Quit (Read error: Connection reset by peer)
[2:40] <SamYaple> jiffe: the first "reorganize" is setting a 4th location for the object (assuming initially there were three copies)
[2:41] * diver (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) has joined #ceph
[2:41] <SamYaple> that makes sure you have your desired number of replicas
[2:41] <SamYaple> REMOVING the osd from the crush map changes the crush map in a very different way
[2:41] <SamYaple> they do try to prevent this and there were some good features in jewel to help lessen that effect i believe
[2:42] <jiffe> I see, so in this case a drive died and that osd was marked down/out, so if this happens again do I just want to remove it from crush right away to prevent double reorganization?
[2:44] <SamYaple> jiffe: lower the weight of the drive to 0, you don't necessarily need to remove it
[2:44] <SamYaple> you dont want to rush a removal
[2:45] <jiffe> why is that?
[2:46] <SamYaple> you make mistakes when you rush
[2:46] <SamYaple> you can always change the weight back, you can't re-add the osd
[2:46] <SamYaple> changing the weight to 0 will have the same effect on the crush map though
[2:47] <jiffe> so I don't want to replace and readd the disk while reorganization is happening?
[2:48] <SamYaple> I would not advise it until the cluster is rebalanced, no
[2:49] * diver (~diver@cpe-2606-A000-111B-C12B-FC00-5AEB-44E2-2A27.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[2:49] <SamYaple> though if you are really just _replacing_ the disk, you can remove and re-add the disk and the only cost is filling the disk back up, technically.
[2:49] <SamYaple> you should test this in a lab so you see how it works
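A minimal sketch of the reweight-to-zero approach SamYaple describes; osd.12 is a placeholder id, and the removal steps at the end only come after backfill has finished.

    # drain the failed OSD by taking its crush weight to 0 (a single crush change)
    ceph osd crush reweight osd.12 0
    # watch recovery until the cluster is healthy again
    ceph -w
    # only then remove the OSD for good
    ceph osd out 12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12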
[2:57] * aNuposic (~aNuposic@192.55.54.38) Quit (Ping timeout: 480 seconds)
[2:57] * danieagle (~Daniel@179.98.53.95) has joined #ceph
[3:05] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[3:10] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[3:13] * aNuposic (~aNuposic@134.134.137.75) has joined #ceph
[3:16] * jfaj_ (~jan@p4FC25CD8.dip0.t-ipconnect.de) has joined #ceph
[3:19] <masber> is there an archive of this channel?
[3:21] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[3:23] * jfaj (~jan@p57983686.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[3:26] * praveen (~praveen@122.172.31.182) has joined #ceph
[3:31] * garphy`aw is now known as garphy
[3:36] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[3:38] * xarses_ (~xarses@73.93.152.143) has joined #ceph
[3:39] * xarses_ (~xarses@73.93.152.143) Quit (Remote host closed the connection)
[3:39] * xarses_ (~xarses@73.93.152.143) has joined #ceph
[3:41] * sebastian-w (~quassel@212.218.8.138) has joined #ceph
[3:41] * mattbenjamin (~mbenjamin@121.244.54.198) has joined #ceph
[3:42] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[3:42] * georgem (~Adium@69-165-135-139.dsl.teksavvy.com) has joined #ceph
[3:44] * sebastian-w_ (~quassel@212.218.8.139) Quit (Ping timeout: 480 seconds)
[3:47] * georgem (~Adium@69-165-135-139.dsl.teksavvy.com) Quit ()
[3:47] * georgem (~Adium@206.108.127.16) has joined #ceph
[3:48] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) has joined #ceph
[3:55] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[3:58] * mattbenjamin (~mbenjamin@121.244.54.198) Quit (Ping timeout: 480 seconds)
[3:58] * yanzheng (~zhyan@125.70.21.187) has joined #ceph
[3:58] * zapu (~airsoftgl@tsn109-201-154-188.dyn.nltelcom.net) has joined #ceph
[3:58] * garphy is now known as garphy`aw
[4:01] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[4:04] * srk (~Siva@2605:6000:ed04:ce00:a491:1fe0:90e8:8ebc) has joined #ceph
[4:08] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[4:09] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) Quit (Remote host closed the connection)
[4:09] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) has joined #ceph
[4:15] * xarses_ (~xarses@73.93.152.143) Quit (Ping timeout: 480 seconds)
[4:17] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[4:20] * ntpttr_ (~ntpttr@134.134.139.74) has joined #ceph
[4:21] * ntpttr (~ntpttr@134.134.139.77) Quit (Remote host closed the connection)
[4:22] * ntpttr_ (~ntpttr@134.134.139.74) Quit ()
[4:24] * srk (~Siva@2605:6000:ed04:ce00:a491:1fe0:90e8:8ebc) Quit (Ping timeout: 480 seconds)
[4:28] * zapu (~airsoftgl@tsn109-201-154-188.dyn.nltelcom.net) Quit ()
[4:30] * kefu (~kefu@114.92.125.128) has joined #ceph
[4:32] * garphy`aw is now known as garphy
[4:34] * wjw-freebsd2 (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[4:35] * srk (~Siva@2605:6000:ed04:ce00:6e40:8ff:fe9c:8b58) has joined #ceph
[4:41] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[4:43] * linuxkidd (~linuxkidd@ip70-189-202-62.lv.lv.cox.net) Quit (Quit: Leaving)
[4:45] * wes_dillingham (~wes_dilli@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wes_dillingham)
[4:48] * Skaag (~lunix@cpe-172-91-77-84.socal.res.rr.com) has joined #ceph
[4:55] * garphy is now known as garphy`aw
[4:58] * flisky (~Thunderbi@106.38.61.181) has joined #ceph
[4:59] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:00] * flisky (~Thunderbi@106.38.61.181) Quit ()
[5:02] * davidz (~davidz@2605:e000:1313:8003:4c6:c0a8:5969:efdb) Quit (Quit: Leaving.)
[5:08] * rwheeler (~rwheeler@202.62.94.195) Quit (Quit: Leaving)
[5:11] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[5:12] * bniver (~bniver@202.62.94.195) Quit (Quit: Leaving)
[5:18] * jcsp (~jspray@121.244.54.198) Quit (Ping timeout: 480 seconds)
[5:20] * Vacuum__ (~Vacuum@88.130.197.23) has joined #ceph
[5:20] * Vacuum_ (~Vacuum@88.130.211.4) Quit (Read error: Connection reset by peer)
[5:22] * Tene (~tene@173.13.139.236) Quit (Ping timeout: 480 seconds)
[5:25] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[5:26] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[5:28] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[5:29] * kefu (~kefu@114.92.125.128) Quit (Max SendQ exceeded)
[5:30] * kefu (~kefu@114.92.125.128) has joined #ceph
[5:34] * srk (~Siva@2605:6000:ed04:ce00:6e40:8ff:fe9c:8b58) Quit (Read error: Connection reset by peer)
[5:34] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[5:36] * aNuposic (~aNuposic@134.134.137.75) Quit (Remote host closed the connection)
[5:36] * aNuposic (~aNuposic@192.55.54.42) has joined #ceph
[5:44] * Tene (~tene@173.13.139.236) has joined #ceph
[5:50] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[5:51] * joshd (~jdurgin@125.16.34.66) has joined #ceph
[5:55] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[5:58] * bniver (~bniver@125.16.34.66) has joined #ceph
[6:04] * walcubi_ (~walcubi@p5795B386.dip0.t-ipconnect.de) has joined #ceph
[6:07] * mattbenjamin (~mbenjamin@125.16.34.66) has joined #ceph
[6:08] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[6:11] * walcubi (~walcubi@p5795B317.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:14] * bara (~bara@125.16.34.66) has joined #ceph
[6:15] * rwheeler (~rwheeler@125.16.34.66) has joined #ceph
[6:23] * vata (~vata@96.127.202.136) Quit (Quit: Leaving.)
[6:53] * cholcombe (~chris@97.93.161.2) has joined #ceph
[7:00] * Jeffrey4l__ (~Jeffrey@221.195.210.23) has joined #ceph
[7:07] * Jeffrey4l_ (~Jeffrey@110.244.242.55) Quit (Ping timeout: 480 seconds)
[7:09] * kefu is now known as kefu|afk
[7:09] * TomasCZ (~TomasCZ@yes.tenlab.net) Quit (Quit: Leaving)
[7:14] * cholcombe (~chris@97.93.161.2) Quit (Ping timeout: 480 seconds)
[7:14] * northrup (~northrup@173.14.101.193) has joined #ceph
[7:15] * praveen (~praveen@122.172.31.182) Quit (Remote host closed the connection)
[7:15] * kefu|afk is now known as kefu
[7:19] * kalleeen (~Wizeon@185.65.134.74) has joined #ceph
[7:23] * northrup (~northrup@173.14.101.193) Quit (Quit: Textual IRC Client: www.textualapp.com)
[7:32] * rdas (~rdas@121.244.87.116) has joined #ceph
[7:37] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[7:49] * kalleeen (~Wizeon@185.65.134.74) Quit ()
[7:54] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:05] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[8:07] * praveen (~praveen@121.244.155.9) has joined #ceph
[8:08] * doppelgrau_ (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[8:11] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) has joined #ceph
[8:14] * karnan (~karnan@121.244.87.117) has joined #ceph
[8:19] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[8:20] * ledgr (~ledgr@88-222-11-185.meganet.lt) has joined #ceph
[8:24] * ade (~abradshaw@194.169.251.11) has joined #ceph
[8:27] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[8:28] * ledgr (~ledgr@88-222-11-185.meganet.lt) Quit (Ping timeout: 480 seconds)
[8:34] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[8:35] * ashah (~ashah@121.244.87.117) has joined #ceph
[8:40] * jclm (~jclm@77.95.96.78) has joined #ceph
[8:43] * jclm (~jclm@77.95.96.78) Quit ()
[8:55] * EinstCrazy (~EinstCraz@61.165.253.98) has joined #ceph
[8:55] * imcsk8_ (~ichavero@189.155.163.170) has joined #ceph
[8:55] * imcsk8 (~ichavero@189.155.163.170) Quit (Read error: Connection reset by peer)
[8:57] * mattbenjamin (~mbenjamin@125.16.34.66) Quit (Ping timeout: 480 seconds)
[8:58] * EinstCra_ (~EinstCraz@61.165.253.98) has joined #ceph
[8:58] * EinstCrazy (~EinstCraz@61.165.253.98) Quit (Read error: Connection reset by peer)
[9:01] * dugravot61 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) has joined #ceph
[9:03] * joshd (~jdurgin@125.16.34.66) Quit (Quit: Leaving.)
[9:03] * oms101 (~oms101@p20030057EA024000C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[9:04] * EinstCra_ (~EinstCraz@61.165.253.98) Quit (Remote host closed the connection)
[9:06] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[9:06] * briner (~briner@129.194.16.54) Quit (Quit: briner)
[9:06] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) Quit (Ping timeout: 480 seconds)
[9:10] * briner (~briner@2001:620:600:1000:70cc:216f:578c:e9ba) has joined #ceph
[9:13] * ade (~abradshaw@194.169.251.11) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * aNuposic (~aNuposic@192.55.54.42) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * danieagle (~Daniel@179.98.53.95) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * dack (~oftc-webi@gateway.ola.bc.ca) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * erhudy (uid89730@id-89730.ealing.irccloud.com) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * wak-work (~wak-work@2620:15c:2c5:3:7c9e:3261:bdc9:bdc9) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * ErifKard (~ErifKard@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * post-factum (~post-fact@vulcan.natalenko.name) Quit (resistance.oftc.net beauty.oftc.net)
[9:13] * Kingrat (~shiny@2605:6000:1526:4063:ecdf:a098:2871:dc2c) Quit (resistance.oftc.net beauty.oftc.net)
[9:16] * ade (~abradshaw@194.169.251.11) has joined #ceph
[9:16] * danieagle (~Daniel@179.98.53.95) has joined #ceph
[9:16] * dack (~oftc-webi@gateway.ola.bc.ca) has joined #ceph
[9:16] * erhudy (uid89730@id-89730.ealing.irccloud.com) has joined #ceph
[9:16] * wak-work (~wak-work@2620:15c:2c5:3:7c9e:3261:bdc9:bdc9) has joined #ceph
[9:16] * ErifKard (~ErifKard@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[9:16] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) has joined #ceph
[9:16] * post-factum (~post-fact@vulcan.natalenko.name) has joined #ceph
[9:16] * Kingrat (~shiny@2605:6000:1526:4063:ecdf:a098:2871:dc2c) has joined #ceph
[9:16] * ChanServ sets mode +v nhm
[9:16] * sleinen (~Adium@2001:620:0:2d:a65e:60ff:fedb:f305) has joined #ceph
[9:21] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[9:23] <rmart04> Morning Guys, I have a QQ about cache tiers. I find that I keep hitting my target_max_bytes, and I am trying to work out how best to start flushing data earlier than that. I seem to find that performance is lower when the cache is flushing/evicting data, so ideally I want this to happen pre-emptively. Is there a "max age" for items in the cache? (I can only find reference to min_age) (infernalis on trusty)
[9:24] <rmart04> Sorry, I am getting my terms mixed up: evict data that's not been changed or accessed for a while, flush data that has been changed and has not been accessed for a while. This is a fairly general purpose cluster. I'd like to start that process say 24 hours or so after not being accessed
[9:26] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[9:27] * EinstCrazy (~EinstCraz@61.165.253.98) has joined #ceph
[9:27] * ledgr (~ledgr@84.15.178.214) has joined #ceph
[9:28] * Kingrat (~shiny@2605:6000:1526:4063:ecdf:a098:2871:dc2c) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * ErifKard (~ErifKard@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * erhudy (uid89730@id-89730.ealing.irccloud.com) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * dack (~oftc-webi@gateway.ola.bc.ca) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * danieagle (~Daniel@179.98.53.95) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * ade (~abradshaw@194.169.251.11) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * post-factum (~post-fact@vulcan.natalenko.name) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * wak-work (~wak-work@2620:15c:2c5:3:7c9e:3261:bdc9:bdc9) Quit (resistance.oftc.net beauty.oftc.net)
[9:28] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) Quit (resistance.oftc.net beauty.oftc.net)
[9:29] * fsimonce (~simon@host98-71-dynamic.1-87-r.retail.telecomitalia.it) has joined #ceph
[9:31] * ade (~abradshaw@194.169.251.11) has joined #ceph
[9:31] * dack (~oftc-webi@gateway.ola.bc.ca) has joined #ceph
[9:31] * erhudy (uid89730@id-89730.ealing.irccloud.com) has joined #ceph
[9:31] * wak-work (~wak-work@2620:15c:2c5:3:7c9e:3261:bdc9:bdc9) has joined #ceph
[9:31] * ErifKard (~ErifKard@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[9:31] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) has joined #ceph
[9:31] * post-factum (~post-fact@vulcan.natalenko.name) has joined #ceph
[9:31] * Kingrat (~shiny@2605:6000:1526:4063:ecdf:a098:2871:dc2c) has joined #ceph
[9:31] * ChanServ sets mode +v nhm
[9:32] * EinstCrazy (~EinstCraz@61.165.253.98) Quit (Remote host closed the connection)
[9:32] * TheSov2 (~TheSov@108-75-213-57.lightspeed.cicril.sbcglobal.net) has joined #ceph
[9:35] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[9:37] * doppelgrau_ (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau_)
[9:39] * TheSov (~TheSov@108-75-213-57.lightspeed.cicril.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[9:40] * EinstCrazy (~EinstCraz@61.165.253.98) has joined #ceph
[9:41] * EinstCrazy (~EinstCraz@61.165.253.98) Quit (Remote host closed the connection)
[9:43] * ledgr (~ledgr@84.15.178.214) Quit (Remote host closed the connection)
[9:43] * ledgr (~ledgr@84.15.178.214) has joined #ceph
[9:52] * ledgr (~ledgr@84.15.178.214) Quit (Ping timeout: 480 seconds)
[9:55] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[9:59] <rmart04> Appreciate any comments on this! Evicting/flushing at target_max_bytes is not ideal as it causes degraded performance!
[10:05] * ledgr (~ledgr@84.15.178.214) has joined #ceph
[10:07] * huats (~quassel@stuart.objectif-libre.com) has joined #ceph
[10:07] * TMM (~hp@dhcp-077-248-009-229.chello.nl) Quit (Quit: Ex-Chat)
[10:17] * ledgr (~ledgr@84.15.178.214) Quit (Remote host closed the connection)
[10:18] * ledgr (~ledgr@84.15.178.214) has joined #ceph
[10:19] * joshd (~jdurgin@125.16.34.66) has joined #ceph
[10:20] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[10:24] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[10:26] * ledgr (~ledgr@84.15.178.214) Quit (Ping timeout: 480 seconds)
[10:27] * brians (~brian@80.111.114.175) Quit (Quit: Textual IRC Client: www.textualapp.com)
[10:28] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[10:28] * brians (~brian@80.111.114.175) has joined #ceph
[10:34] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[10:38] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[10:42] * andihit (uid118959@id-118959.richmond.irccloud.com) has joined #ceph
[10:43] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[10:43] * Lokta (~Lokta@193.164.231.98) has joined #ceph
[10:44] * jcsp (~jspray@125.16.34.66) has joined #ceph
[10:49] <ivve> rmart04: best way is to set relative flushes and evicts to 0 (or very low) and then set min ages on evicts and flushes
[10:49] <ivve> hitting max_bytes will simply stop client IO until it has freed up space
[10:50] <ivve> anyways, that way min time effectively becomes max time ;)
[10:51] <rmart04> OK thanks. I will look into that. Slightly distracted as I just had some very strange behaviour when I set cache_target_dirty_ratio. My "cold store" seemed to shed a lot of data. This is worrying
[10:51] <ivve> since min is forcing to keep the data in cache while relative wants to out it
[10:51] <rmart04> :)
[10:51] <ivve> yes the ratio is relative
[10:51] <ivve> if you set it high it will wait until that % and start flushing/evicting
[10:51] <ivve> you can get all your pool data and paste it here or in a pastebin/hastebin
[10:52] * derjohn_mob (~aj@46.189.28.68) has joined #ceph
[10:52] <rmart04> Is it relative to target_max_bytes or to the max avail?
[10:52] <ivve> max bytes = max avail
[10:52] <ivve> since ceph doesn't know how large the cache is
[10:52] <ivve> or rather, how large you want it to be
[10:53] <ivve> you can use it for more than caching i guess
[10:53] <ivve> so max bytes is more like a "reserved for cache"
[10:53] <ivve> so yea, relative to max bytes
[10:54] <rmart04> It is unusual for the cold tier to have a significant data reduction when setting cache_target_dirty_ratio though, isn't it?
[10:55] <ivve> well dirty data will be rewritten, "clean" evicted
[10:55] <ivve> clean=unchanged
[10:55] <ivve> im guessing you are using writeback?
[10:55] <ivve> as cache-mode
[10:57] <ivve> so if you would write something that means that data in coldstorage is "useless", im betting it would write to coldstorage as available/empty space
[10:57] <ivve> i.e if the rbd is mounted with discard and you remove a file
[10:57] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) has joined #ceph
[10:58] <ivve> cache would get that info first and later on tell the coldstorage that its "gone"
[10:58] <ivve> not sure if it makes any sense
[10:58] <rmart04> sorry, give me a minuite to process :)
[11:00] <ivve> a good start is to ask whether you run writeback or something else
[11:00] * dugravot61 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) Quit (Ping timeout: 480 seconds)
[11:00] * TMM (~hp@185.5.121.201) has joined #ceph
[11:00] <ivve> bbl lunch :)
[11:02] <rmart04> No that makes a lot of sense, I hope that is the case!!
[11:03] <rmart04> Yes it is a writeback cache
[11:08] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) Quit (Quit: Leaving.)
[11:09] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) has joined #ceph
[11:09] <rmart04> Just for a full rundown: Currently there is ~80GB showing in use on the cache tier. What I was trying to do was set target dirty to 40% and dirty high to 60% of the target_max_bytes (100GB ish). These did not seem to be doing the trick, so I figured that it was a % of the Max Available (550GB). So I tried setting 0.1 (i.e. 55GB) to see if it shifted. At that point I noticed the backing store reducing significantly.
[11:11] <rmart04> In this instance the pool was for glance images. (Yes, it probably should not even have a cache tier really). I guess it's removing images that had previously been deleted but whose deletion hadn't yet taken effect in the cold tier
[11:13] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[11:16] * homosaur (~Nanobot@tor-exit.squirrel.theremailer.net) has joined #ceph
[11:32] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[11:38] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[11:38] * Fira (~oftc-webi@46.218.229.172) has joined #ceph
[11:39] <Fira> Hey guys
[11:39] * karnan (~karnan@121.244.87.117) Quit (Ping timeout: 480 seconds)
[11:40] <Fira> Considering usage of Ceph here for several usecases, and i'd have some questions
[11:43] * ledgr (~ledgr@84.15.178.214) has joined #ceph
[11:43] <Fira> Let's assume we store our data in RADOS via the Object Storage Gateway's REST API in a Cluster, and want to field-deploy the system later on. How easy would it be to extract data from our home ceph cluster and into a docker container for temporary usage? Does that kind of data extraction fall into application logic, or are there way to bulk export from CLI?
[11:46] * homosaur (~Nanobot@2RTAAACMQ.tor-irc.dnsbl.oftc.net) Quit ()
[11:47] <Fira> Another question, when it comes to deployment, would you use ceph-deploy in production, or rather only for testing ? Assuming you only want a handful of nodes, would you rather go 3 OSD/Monitor nodes, or 3 OSD + single dedicated monitor ?
[11:48] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[11:49] <ivve> yeah perhaps writeback is not the best for glance
[11:49] * derjohn_mob (~aj@46.189.28.68) Quit (Ping timeout: 480 seconds)
[11:50] <ivve> rmart04: so did you delete images or what? i mean since backing store was being reduced (as in space being freed up?)
[11:51] <rmart04> http://pastebin.com/TEvBsHRU
[11:51] <rmart04> Well images have been added and removed over time
[11:51] * karnan (~karnan@121.244.87.117) has joined #ceph
[11:51] <rmart04> this pools been running for a few months
[11:53] <ivve> btw you get the same data with 'ceph osd pool ls detail'
[11:54] <rmart04> oh yea, nice
[11:54] <rmart04> gets rid of all the OSD data
[11:54] <ivve> :)
[11:55] <ivve> although i could understand writeback if you keep the read data long enough
[11:55] <ivve> and you have few ssds but most of the images on sata
[11:55] <ivve> if you keep reading the same images over and over
[11:56] <ivve> then i would simply just use ratios
[11:57] <ivve> and no min keep
[11:57] <rmart04> I'm a bit surprised the cache pool isn't shrinking though, as I've set the ratio down
[11:57] <rmart04> Its still 76GB
[11:58] <rmart04> glance-hot 6 76256M 0.13 546G
[11:58] <Be-El> Fira: you are mixing up the ceph layers. there's rados (underlying object based layer), and the object storage gateway with s3/swift interfaces
[11:59] * derjohn_mob (~aj@46.189.28.52) has joined #ceph
[11:59] <Be-El> Fira: an application _can_ use the rados layer, but needs to manage its data on its own
[12:00] <Be-El> Fira: the gateway on the other hand is a standalone web server (or a bunch of webservers), that exposes objects via the amazon S3 api or the openstack swift api. clients do not care about the underlying storage in this scenario
[12:01] <Be-El> Fira: if you have multiple docker/vm/whatever/$HYPE instances, using the object gateway results in all data being passed through the gateway's webserver. but you can use standard libraries in this case with a wide support for programming languages / frameworks etc.
[12:02] <Be-El> Fira: using plain rados requires direct connections between the client and the ceph osd hosts. but there's no bottleneck, since there's no gateway server involved
[12:03] * bniver (~bniver@125.16.34.66) Quit (Remote host closed the connection)
[12:03] * rwheeler (~rwheeler@125.16.34.66) Quit (Quit: Leaving)
[12:05] <rmart04> Sorry, min keep? Do you mean min_write/read_recency for promote?
[12:08] <ivve> no
[12:08] <ivve> min flush age
[12:08] <ivve> and min evict age
[12:09] * joshd (~jdurgin@125.16.34.66) Quit (Ping timeout: 480 seconds)
[12:09] <ivve> guess you can toy with promote
[12:09] <ivve> but i don't see the point with images
[12:10] * jcsp (~jspray@125.16.34.66) Quit (Ping timeout: 480 seconds)
[12:10] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:6004:a51a:3c47:611d) has joined #ceph
[12:10] <Fira> Be-El: Yeah i understood as much.... I'm asking about migrating said data input via S3/Swift Interfaces
[12:10] <rmart04> OK. thanks
[12:11] <ivve> or depending on how many images you read
[12:11] <ivve> during a "work-day" set the minage to 8 hours if your cache size allows it
[12:11] <ivve> and low relative values
[12:11] <Be-El> Fira: why do you want to migrate data?
[12:12] <Fira> Be-El: <fira> .... and want to field-deploy the system later on .....
[12:12] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) has joined #ceph
[12:12] <Be-El> Fira: whatever that means...
[12:13] <rmart04> So last thing, and again appreciate all your help. If the max available is 500GB and the target max bytes is 100GB, and I want to keep the cache say half full most of the time (so that it's not always touching the target max bytes, causing IO issues), you'd recommend setting the dirty ratio / dirty high ratio to around 0.10 (i.e. 50GB of max size) / 0.15 (i.e. 75GB of max size)?
[12:13] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[12:13] <Fira> ok let me put that really simply
[12:13] <Fira> different place, different infrastructure, different cluster, how do i bring data from S3/Swift ? do you need to handle that at app level ?
[12:14] <ivve> 0.1 ratio would mean 10GB out of 100GB max_bytes_size
[12:14] <Be-El> Fira: different ceph clusters?
[12:15] <ivve> since we are talking about images i think i would set flush to a low value, and evict to a pretty high value (you want to keep unchanged objects just for reading longer)
[12:15] <Fira> Be-El: yeah?
[12:15] <ivve> and if you would perform changes (since we are talking images) they could flush sooner to keep cache free of that type of objects.
[12:16] <Be-El> Fira: use any method you would use in other s3 based setups. with ceph you get the additional opportunity to run multisite s3 gateways that are synchronizing objects between them
[12:16] <Be-El> Fira: the application itself in a container/vm only needs to be configured to use the correct s3 endpoint
[12:17] <Be-El> (+authentication if different credentials are necessary)
[12:18] <ivve> rmart04: so if you want to keep it ~half way full then set the ratio to 0.5, then 50gb will stay in cache. if it migrates too soon, keep it with min_evict_age, but you could leave min_flush_age at a lower value
[12:19] <ivve> you also have a new value with jewel
[12:19] <ivve> lemme see if i can find it
[12:20] * joshd (~jdurgin@125.16.34.66) has joined #ceph
[12:21] * diver (~diver@cpe-2606-A000-111B-C12B-D989-F641-2EC9-E9AA.dyn6.twc.com) Quit (Ping timeout: 480 seconds)
[12:21] * rraja (~rraja@121.244.87.117) has joined #ceph
[12:21] <ivve> When the dirty objects reach a certain percentage of the cache's capacity, flush dirty objects at a higher speed. That's what cache_target_dirty_high_ratio sets
[12:22] <ivve> you will mostly have clean objects im guessing
[12:22] <ivve> so that value can be set pretty low
[12:22] <ivve> maybe around your wanted ~50%
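The knobs being discussed here are per-pool settings; a sketch of how they might be set on the cache pool named in the paste above (the numeric values are illustrative only, not recommendations):

    # flush dirty objects at 40% of target_max_bytes, flush faster above 60%
    ceph osd pool set glance-hot cache_target_dirty_ratio 0.4
    ceph osd pool set glance-hot cache_target_dirty_high_ratio 0.6
    # start evicting clean objects at 80% of target_max_bytes
    ceph osd pool set glance-hot cache_target_full_ratio 0.8
    # keep objects at least this long (seconds) before flushing/evicting
    ceph osd pool set glance-hot cache_min_flush_age 600
    ceph osd pool set glance-hot cache_min_evict_age 1800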
[12:24] <Fira> Be-El: no, erm, you don't get it... let me start over.... it's actually really simple, i guess i'm not explaining it right... let's say i have this cool Ceph cluster @ home and put data in it through S3/Swift with a production web app.... and suddenly i need to package that app in a box and ship it to the other end of the world for *offline* use, with a subset of the data
[12:24] <Fira> Be-El: are there tools or facilities that let me manage/export objects or do i write an actual app for that?
[12:26] <Fira> i mean, i can spin a new ceph cluster through docker and bring it along, but how do i get data accross ?
[12:26] <Be-El> Fira: S3/swift are http based protocols. there are no files or whatever you can "package". if you need to bundle app and its data, then your app needs to be able to handle files or/and s3 based on the environment
[12:27] <Fira> i know ;_;
[12:27] <Be-El> Fira: if you just want to transfer data from a s3 bucket to another bucket (on a different cluster), then you can use standard s3 synchronisation tools
[12:27] <Fira> aaah
[12:28] <Fira> alright
[12:28] <Be-El> but the important point about s3 is the fact that you do not have plain objects only, but also metadata, and you probably also want to sync the metadata
[12:28] <rmart04> OK great, thanks ivve. Ill keep playing
[12:29] <Be-El> Fira: one simple way if you do not need the metadata is mounting one s3 bucket on the target host using s3fs, and importing the data into the other bucket as files using s3cmd sync or other tools
[12:30] <Be-El> but there're probably better solutions that also keep the metadata intact
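A sketch of the bucket-to-bucket copy Be-El describes, using two s3cmd configurations pointing at the two radosgw endpoints; the config file names, bucket name, and staging directory are placeholders.

    # pull the objects down from the home cluster...
    s3cmd --config=home.s3cfg sync s3://mybucket/ /tmp/export/
    # ...then push them into the field cluster's bucket
    s3cmd --config=field.s3cfg sync /tmp/export/ s3://mybucket/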
[12:30] * EinstCrazy (~EinstCraz@61.165.253.98) has joined #ceph
[12:31] <Fira> Be-El: yes, that makes sense, i'll look into syncing and s3fs then. thanks :)
[12:32] <Be-El> and regarding ceph-deploy....it will definitely help you in production setups, but I would recommend to also have a closer look at manual operations to understand what's going on under the hood
[12:34] <Be-El> ...and off for lunch
[12:34] * joshd (~jdurgin@125.16.34.66) Quit (Ping timeout: 480 seconds)
[12:41] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[12:41] * bara (~bara@125.16.34.66) Quit (Quit: Bye guys!)
[12:48] * EinstCrazy (~EinstCraz@61.165.253.98) Quit (Remote host closed the connection)
[12:48] * karnan (~karnan@121.244.87.117) Quit (Ping timeout: 480 seconds)
[12:48] * EinstCrazy (~EinstCraz@61.165.253.98) has joined #ceph
[12:53] * Behedwin1 (~PcJamesy@108.61.122.224) has joined #ceph
[12:54] * flisky (~Thunderbi@210.12.157.85) has joined #ceph
[12:56] * EinstCrazy (~EinstCraz@61.165.253.98) Quit (Ping timeout: 480 seconds)
[12:56] * karnan (~karnan@121.244.87.117) has joined #ceph
[13:03] <sep> i see this in my health detail ;; pg 5.ee is stuck unclean for 1718423.835182, current state down+remapped+peering, last acting [37,25,2147483647,55,103,6] ; i imagined the numbers in brackets were osd's. but i do not have more than 112 osd's so what is that large number ?
[13:04] <Be-El> sep: -1 as unsigned int
[13:04] * mattbenjamin (~mbenjamin@125.16.34.66) has joined #ceph
[13:05] <Be-El> sep: your crush rules are not able to calculate 6 distinct osds for the ec pg
[13:05] <sep> because of the uneven size of the nodes ?
[13:06] <Be-El> maybe
[13:11] * mattbenjamin (~mbenjamin@125.16.34.66) Quit (Read error: Connection reset by peer)
[13:12] * mattbenjamin (~mbenjamin@125.16.34.66) has joined #ceph
[13:14] * ledgr (~ledgr@84.15.178.214) Quit (Remote host closed the connection)
[13:14] * ledgr (~ledgr@84.15.178.214) has joined #ceph
[13:15] * salwasser (~Adium@c-76-118-229-231.hsd1.ma.comcast.net) has joined #ceph
[13:16] <sep> thanks
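When an EC pg shows 2147483647 (ITEM_NONE) in its acting set, one commonly suggested workaround is raising the number of placement attempts in the crush rule; a rough sketch of the edit cycle (the value 100 is illustrative):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # in the EC rule inside crush.txt, raise "set_choose_tries" (e.g. to 100), then recompile
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new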
[13:18] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[13:22] * salwasser1 (~Adium@2601:197:101:5cc1:cae0:ebff:fe18:8237) has joined #ceph
[13:22] * salwasser (~Adium@c-76-118-229-231.hsd1.ma.comcast.net) Quit (Read error: Connection reset by peer)
[13:22] * ledgr (~ledgr@84.15.178.214) Quit (Ping timeout: 480 seconds)
[13:23] * Behedwin1 (~PcJamesy@108.61.122.224) Quit ()
[13:26] * arbrandes1 (~arbrandes@ec2-54-172-54-135.compute-1.amazonaws.com) has joined #ceph
[13:26] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[13:27] * salwasser1 (~Adium@2601:197:101:5cc1:cae0:ebff:fe18:8237) Quit (Quit: Leaving.)
[13:27] * salwasser (~Adium@2601:197:101:5cc1:5c6f:37b1:a7de:4c23) has joined #ceph
[13:28] * flisky (~Thunderbi@210.12.157.85) Quit (Ping timeout: 480 seconds)
[13:28] * diver (~diver@216.85.162.38) has joined #ceph
[13:30] <rmart04> ivve, thanks again. I think what I was missing was the cache target full ratio (for flushing clean objects)!
[13:31] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[13:32] * arbrandes (~arbrandes@ec2-54-172-54-135.compute-1.amazonaws.com) Quit (Ping timeout: 480 seconds)
[13:33] * diver_ (~diver@95.85.8.93) has joined #ceph
[13:33] * kefu is now known as kefu|afk
[13:36] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:36] * salwasser (~Adium@2601:197:101:5cc1:5c6f:37b1:a7de:4c23) Quit (Quit: Leaving.)
[13:38] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[13:40] * diver (~diver@216.85.162.38) Quit (Ping timeout: 480 seconds)
[13:44] * mattbenjamin (~mbenjamin@125.16.34.66) Quit (Ping timeout: 480 seconds)
[13:47] * madkiss (~madkiss@2a02:8109:8680:2000:b8e3:4e23:c7f2:b34d) has joined #ceph
[13:58] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[14:04] * Frostshifter (~tallest_r@185.3.135.82) has joined #ceph
[14:05] * malevolent (~quassel@192.146.172.118) has joined #ceph
[14:14] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[14:18] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[14:19] * dack (~oftc-webi@gateway.ola.bc.ca) Quit (Ping timeout: 480 seconds)
[14:20] * georgem (~Adium@24.114.58.120) has joined #ceph
[14:24] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[14:27] <ivve> rmart04: cool, nice i could help :)
[14:29] <ivve> sep: did you solve the issue?
[14:30] * flisky (~Thunderbi@114.111.166.88) has joined #ceph
[14:30] * flisky (~Thunderbi@114.111.166.88) Quit ()
[14:30] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[14:32] * Frostshifter (~tallest_r@185.3.135.82) Quit (Ping timeout: 480 seconds)
[14:33] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[14:33] * wes_dillingham (~wes_dilli@140.247.242.44) Quit ()
[14:35] <sep> ivve, no i want to wait until the recovery/backfill have calmed down a bit. i also need to scrounge up some hardware for a temp node and osd
[14:36] <sep> note to self: do not add 40TB of storage at the same time...
[14:38] * georgem (~Adium@24.114.58.120) Quit (Read error: Connection reset by peer)
[14:38] <IcePic> sep: if you just could get $0.01 for every gig shuffled on the backplane network right now...
[14:40] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[14:41] <sep> haha :) yes
[14:49] <jiffe> so here's another question, is it possible for me to replace a disk for an osd without removing the osd from ceph?
[14:52] * serif (~serif@92.36.131.199) has joined #ceph
[14:53] <IcePic> jiffe: what would you expect ceph to do when an empty disk appears in that case?
[14:55] * serif (~serif@92.36.131.199) Quit ()
[14:55] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:56] <jiffe> allow me to reinitialize it as if it were the original osd and then repopulate?
[14:56] <andihit> what could be the reason that CephFS reads (~600MB/s) are slower than writes (~800-1000MB/s)? shouldn't it be the other way around?
[14:56] <sep> jiffe, if you remove the disk, then add a new one, it normally gets the osd.id of the old removed one; so if you set noout, nobackfill and norecover before removing the old disk, then you will have minimum data movement (other than backfilling the disk)
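The flags sep mentions are cluster-wide and are set and cleared like this (a sketch; remember to unset them once the replacement OSD is back in):

    ceph osd set noout
    ceph osd set nobackfill
    ceph osd set norecover
    # ...swap the disk and recreate the OSD with the old id...
    ceph osd unset norecover
    ceph osd unset nobackfill
    ceph osd unset noout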
[14:57] <sep> andihit, perhaps writes go to a ssd journal and reads need to find that data on spinning rust ?
[14:57] <andihit> it's a 3 nodes cluster with 25 hdds each, no ssd, no cache tiering
[14:59] <andihit> basically I'm reading the data back I wrote before, so it should be in the page cache of the OSDs
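One way to see whether the read/write asymmetry comes from the OSDs themselves rather than from CephFS is a raw RADOS benchmark; the pool name and runtime below are placeholders.

    # write benchmark, keeping the objects so they can be read back
    rados bench -p testpool 60 write --no-cleanup
    # sequential read of the objects written above
    rados bench -p testpool 60 seq
    # remove the benchmark objects afterwards
    rados -p testpool cleanup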
[15:02] * rmart04_ (~rmart04@5.153.255.226) has joined #ceph
[15:03] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[15:03] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[15:05] * rmart04 (~rmart04@support.memset.com) Quit (Ping timeout: 480 seconds)
[15:05] * rmart04_ is now known as rmart04
[15:07] * jarrpa (~jarrpa@63.225.131.166) has joined #ceph
[15:08] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) has joined #ceph
[15:10] * rmart04 (~rmart04@5.153.255.226) Quit (Quit: rmart04)
[15:13] <andihit> ceph osd perf shows unusually high numbers (> 200) for both fs_commit_latency and fs_apply_latency after the read benchmark, where it is < 40 after the write benchmark
[15:18] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[15:18] * mattbenjamin (~mbenjamin@121.244.54.198) has joined #ceph
[15:21] * rmart04 (~rmart04@5.153.255.226) has joined #ceph
[15:21] * rmart04 (~rmart04@5.153.255.226) Quit ()
[15:23] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[15:27] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[15:42] * vata (~vata@207.96.182.162) has joined #ceph
[15:44] * salwasser (~Adium@72.246.3.14) has joined #ceph
[15:44] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[15:45] * derjohn_mob (~aj@46.189.28.52) Quit (Ping timeout: 480 seconds)
[15:46] * jordan_c (~jconway@cable-192.222.129.3.electronicbox.net) has joined #ceph
[15:47] <jordan_c> can someone tell me what the default permissions for /var/log/ceph/ceph-mon.${hostname}.log are on Jewel EL7?
[15:48] * derjohn_mob (~aj@46.189.28.68) has joined #ceph
[15:48] <jordan_c> logrotate wasn't working for me because it was root:ceph 644 and the logrotate script does "su ceph ceph"
[15:48] <jordan_c> all of the other logs in /var/log/ceph/ were ceph:ceph 600 except for the mon specific ones
[15:48] <jordan_c> trying to figure out why
[15:49] <Gugge-47527> maybe you started the mon as root once
[15:49] <Gugge-47527> the first time
[15:49] <jordan_c> that's the best I can figure
[15:49] <jordan_c> since ceph also wasn't writing to that file anymore and hasn't been for like a week
[15:50] <jordan_c> I'm just unsure whether it was something I did, a coworker, or the puppet module we're using to setup ceph did =\
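A minimal sketch of the fix, assuming the daemons now run as the ceph user and matching the ceph:ceph 600 that the other logs in /var/log/ceph show:

    # give the mon log back to the ceph user so logrotate's "su ceph ceph" works
    chown ceph:ceph /var/log/ceph/ceph-mon.$(hostname -s).log
    chmod 600 /var/log/ceph/ceph-mon.$(hostname -s).log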
[15:52] * yanzheng (~zhyan@125.70.21.187) Quit (Quit: This computer has gone to sleep)
[15:52] * yanzheng (~zhyan@125.70.21.187) has joined #ceph
[15:54] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[15:57] * derjohn_mob (~aj@46.189.28.68) Quit (Ping timeout: 480 seconds)
[16:01] * CephFan1 (~textual@68-233-224-175.static.hvvc.us) has joined #ceph
[16:01] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:04] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:06] * krogon (~krogon@irdmzpr01-ext.ir.intel.com) Quit (Remote host closed the connection)
[16:12] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[16:14] * brianjjo (~Bonzaii@tor-exit.squirrel.theremailer.net) has joined #ceph
[16:20] * rmart04_ (~rmart04@5.153.255.226) has joined #ceph
[16:20] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[16:23] * ad (~oftc-webi@88.215.192.251) has joined #ceph
[16:23] * ad is now known as Guest359
[16:24] * rmart04 (~rmart04@support.memset.com) Quit (Ping timeout: 480 seconds)
[16:24] * rmart04_ is now known as rmart04
[16:26] * squizzi (~squizzi@107.13.237.240) has joined #ceph
[16:26] <Guest359> does anybody know how to migrate a radosgw index to a separate pool?
[16:27] * srk (~Siva@32.97.110.53) has joined #ceph
[16:29] * vbellur (~vijay@71.234.224.255) Quit (Ping timeout: 480 seconds)
[16:29] * yanzheng (~zhyan@125.70.21.187) Quit (Quit: This computer has gone to sleep)
[16:30] * mmm_c_n (~mmm_c_n@91.123.199.191) has joined #ceph
[16:33] <mmm_c_n> Hi. Wondering if anyone can help me with a ceph issue that I get. I'm trying to install ceph via the ceph ansible playbook. 5 servers that I'm currently testing with, all running centos 7. The error I get is when trying to create ceph keys: "INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'"
[16:33] <mmm_c_n> That message keeps repeating on all 5 nodes.
[16:34] * Jeffrey4l__ (~Jeffrey@221.195.210.23) Quit (Ping timeout: 480 seconds)
[16:34] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[16:35] * krogon (~krogon@irdmzpr01-ext.ir.intel.com) has joined #ceph
[16:36] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:36] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[16:36] <mmm_c_n> ceph conf: http://pastebin.com/rPiQ2eHV
[16:37] * ade (~abradshaw@194.169.251.11) Quit (Ping timeout: 480 seconds)
[16:37] * garphy`aw is now known as garphy
[16:38] * Jeffrey4l__ (~Jeffrey@221.195.210.23) has joined #ceph
[16:39] <mmm_c_n> I'm pretty much stuck there, not sure what to check next. Been googling like a maniac but haven't found anything useful yet. The services seem to start up just fine and that is more or less the only message that I can see in the logs related to ceph that points to an issue.
[16:39] <mmm_c_n> a
[16:39] <mmm_c_n> Any help appreciated
[16:39] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[16:40] <georgem> mmm_c_n: verify connectivity between the monitors
[16:41] <mmm_c_n> Gah so simple. I seem to have a routing issue. Slamming head against the wall for missing that one.
[16:41] <mmm_c_n> Will fix and recheck, thanks a lot so far :)
[16:44] * brianjjo (~Bonzaii@5AEAABOE7.tor-irc.dnsbl.oftc.net) Quit ()
[16:45] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[16:46] <mmm_c_n> And yeah that was it.
[16:46] <mmm_c_n> thanks georgem
[16:47] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[16:47] <georgem> mmm_c_n: welcome
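For the "not in quorum: u'probing'" loop, a couple of checks along the lines of georgem's suggestion; the monitor name and the peer address 10.0.0.11 are placeholders.

    # ask the local monitor what state it is in and which peers it can see
    ceph daemon mon.$(hostname -s) mon_status
    # confirm the monitor port on the other mon hosts is actually reachable
    ping -c 3 10.0.0.11
    nc -zv 10.0.0.11 6789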
[16:47] * ledgr (~ledgr@88-222-11-185.meganet.lt) has joined #ceph
[16:50] * rmart04_ (~rmart04@support.memset.com) has joined #ceph
[16:50] * vbellur (~vijay@nat-pool-bos-u.redhat.com) has joined #ceph
[16:51] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[16:51] * rmart04 (~rmart04@5.153.255.226) Quit (Ping timeout: 480 seconds)
[16:51] * rmart04_ is now known as rmart04
[16:54] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[16:54] * karnan (~karnan@121.244.87.117) Quit (Remote host closed the connection)
[16:55] * ledgr (~ledgr@88-222-11-185.meganet.lt) Quit (Ping timeout: 480 seconds)
[16:55] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[16:59] * ntpttr (~ntpttr@134.134.137.75) has joined #ceph
[17:00] * kefu|afk (~kefu@114.92.125.128) Quit (Max SendQ exceeded)
[17:00] * kefu (~kefu@114.92.125.128) has joined #ceph
[17:03] * andreww (~xarses@64.124.158.3) has joined #ceph
[17:04] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[17:06] * ntpttr (~ntpttr@134.134.137.75) Quit (Remote host closed the connection)
[17:10] * lmb (~Lars@nat.nue.novell.com) has joined #ceph
[17:12] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[17:12] * mmm_c_n (~mmm_c_n@91.123.199.191) Quit (Ping timeout: 480 seconds)
[17:13] * togdon (~togdon@74.121.28.6) has joined #ceph
[17:14] * rmart04 (~rmart04@support.memset.com) Quit (Ping timeout: 480 seconds)
[17:15] * Skaag (~lunix@cpe-172-91-77-84.socal.res.rr.com) Quit (Quit: Leaving.)
[17:16] * aNuposic (~aNuposic@134.134.137.75) has joined #ceph
[17:17] * kristen (~kristen@134.134.137.75) has joined #ceph
[17:24] * wushudoin (~wushudoin@38.140.108.2) has joined #ceph
[17:25] * praveen (~praveen@121.244.155.9) Quit (Remote host closed the connection)
[17:26] * praveen (~praveen@121.244.155.9) has joined #ceph
[17:31] * cholcombe (~chris@97.93.161.2) has joined #ceph
[17:34] * praveen (~praveen@121.244.155.9) Quit (Ping timeout: 480 seconds)
[17:38] * hassifa (~lobstar@213.61.149.100) has joined #ceph
[17:39] * cholcombe (~chris@97.93.161.2) Quit (Ping timeout: 480 seconds)
[17:40] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:43] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[17:44] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[17:45] * Skaag (~lunix@65.200.54.234) has joined #ceph
[17:51] * cholcombe (~chris@97.93.161.13) has joined #ceph
[17:51] * linuxkidd (~linuxkidd@ip70-189-202-62.lv.lv.cox.net) has joined #ceph
[17:53] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[17:53] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) Quit (Quit: Leaving.)
[17:53] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) has joined #ceph
[17:55] * togdon (~togdon@74.121.28.6) Quit (Quit: Sleeping...)
[17:58] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) Quit ()
[17:58] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) has joined #ceph
[18:00] <johnavp1989> Hi All, MDS question. If I create a second MDS will it automatically act as a standby or do I need to set mds standby for name in my ceph.conf?
[18:01] * [0x4A6F] (~ident@p508CD520.dip0.t-ipconnect.de) has joined #ceph
[18:02] * Brochacho (~alberto@97.93.161.13) has joined #ceph
[18:02] * walcubi_ is now known as walcubi
[18:02] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[18:03] <johnavp1989> Looking at mon force standby active being set to true by default, I assume that's the case; just looking for some confirmation
[18:03] <m0zes> unless you have increased the max_mds to greater than 1, it will act as a cold standby
[18:03] * togdon (~togdon@74.121.28.6) has joined #ceph
[18:03] <johnavp1989> m0zes: ah great thank you
[18:04] <m0zes> hot standby can be enabled with the following settings in ceph.conf; mds standby for rank = 0; mds standby replay = true;
[18:04] <m0zes> I've been told it is safe (and I've not had any issues with it)
[18:04] * Racpatel (~Racpatel@2601:87:3:31e3::2433) Quit (Ping timeout: 480 seconds)
[18:05] * praveen (~praveen@122.171.114.51) has joined #ceph
[18:05] * sleinen (~Adium@2001:620:0:2d:a65e:60ff:fedb:f305) Quit (Quit: Leaving.)
[18:05] <m0zes> hot standby just monitors the cephfs journals and pre-warms the cache with that data. in the event of a failover, it is ready sooned.
[18:05] <m0zes> s/sooned/sooner/
[18:05] <johnavp1989> m0zes: interesting. so it's not active-active, but it's faster recovery than cold standby?
[18:06] <m0zes> correct.
[18:06] * aNuposic (~aNuposic@134.134.137.75) Quit (Remote host closed the connection)
[18:06] <johnavp1989> m0zes: awesome. thank you!
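The standby-replay settings m0zes quotes go into the ceph.conf section of the standby MDS; a sketch, where the daemon name "b" is a placeholder:

    # ceph.conf on the standby MDS host (section name is an assumption)
    [mds.b]
        mds standby for rank = 0
        mds standby replay = true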
[18:07] * sudocat1 (~dibarra@192.185.1.19) has joined #ceph
[18:07] * garphy is now known as garphy`aw
[18:08] * hassifa (~lobstar@635AAALM0.tor-irc.dnsbl.oftc.net) Quit ()
[18:10] * sudocat2 (~dibarra@192.185.1.20) has joined #ceph
[18:11] * ntpttr (~ntpttr@134.134.139.76) has joined #ceph
[18:13] * Racpatel (~Racpatel@2601:87:3:31e3:4e34:88ff:fe87:9abf) has joined #ceph
[18:13] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[18:15] * sudocat1 (~dibarra@192.185.1.19) Quit (Ping timeout: 480 seconds)
[18:15] * garphy`aw is now known as garphy
[18:15] * lmb (~Lars@nat.nue.novell.com) Quit (Ping timeout: 480 seconds)
[18:16] * Brochacho (~alberto@97.93.161.13) Quit (Quit: Brochacho)
[18:22] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[18:26] * doppelgrau (~doppelgra@132.252.235.172) Quit (Read error: Connection reset by peer)
[18:26] * TMM (~hp@185.5.121.201) Quit (Quit: Ex-Chat)
[18:29] * derjohn_mob (~aj@tmo-107-20.customers.d1-online.com) has joined #ceph
[18:37] * vbellur (~vijay@nat-pool-bos-t.redhat.com) has joined #ceph
[18:38] * jclm (~jclm@92.66.244.229) has joined #ceph
[18:40] * garphy is now known as garphy`aw
[18:41] * thansen (~thansen@17.253.sfcn.org) has joined #ceph
[18:42] * jclm (~jclm@92.66.244.229) Quit ()
[18:42] * thansen (~thansen@17.253.sfcn.org) Quit ()
[18:43] * thansen (~thansen@17.253.sfcn.org) has joined #ceph
[18:44] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:45] * ircuser-1 (~Johnny@158.183-62-69.ftth.swbr.surewest.net) Quit (Quit: because)
[18:51] * tsg (~tgohad@134.134.137.71) has joined #ceph
[18:52] * kefu_ (~kefu@114.92.125.128) has joined #ceph
[18:55] * aNuposic (~aNuposic@192.55.54.44) has joined #ceph
[18:56] * aNuposic (~aNuposic@192.55.54.44) Quit (Remote host closed the connection)
[18:58] * kefu (~kefu@114.92.125.128) Quit (Ping timeout: 480 seconds)
[18:59] * andreww (~xarses@64.124.158.3) Quit (Ping timeout: 480 seconds)
[19:01] * Roy (~PeterRabb@172.98.67.112) has joined #ceph
[19:11] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[19:27] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[19:30] * andihit (uid118959@id-118959.richmond.irccloud.com) Quit (Quit: Connection closed for inactivity)
[19:31] * Roy (~PeterRabb@172.98.67.112) Quit ()
[19:31] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[19:34] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[19:36] * BrianA (~BrianA@fw-rw.shutterfly.com) has joined #ceph
[19:40] * Yopi (~SaneSmith@exit1.radia.tor-relays.net) has joined #ceph
[19:40] * Racpatel (~Racpatel@2601:87:3:31e3:4e34:88ff:fe87:9abf) Quit (Quit: Leaving)
[19:40] * Racpatel (~Racpatel@2601:87:3:31e3:4e34:88ff:fe87:9abf) has joined #ceph
[19:40] * mattbenjamin (~mbenjamin@121.244.54.198) Quit (Ping timeout: 480 seconds)
[19:42] * Brochacho (~alberto@97.93.161.13) has joined #ceph
[19:45] * ashah (~ashah@121.244.87.117) Quit (Quit: Leaving)
[19:47] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[19:55] * togdon (~togdon@74.121.28.6) Quit (Quit: Sleeping...)
[19:56] * togdon (~togdon@74.121.28.6) has joined #ceph
[19:58] * xarses_ (~xarses@73.93.153.115) has joined #ceph
[19:59] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[20:02] * TMM (~hp@dhcp-077-248-009-229.chello.nl) has joined #ceph
[20:04] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[20:07] * xarses_ (~xarses@73.93.153.115) Quit (Ping timeout: 480 seconds)
[20:10] * Yopi (~SaneSmith@26XAABXMF.tor-irc.dnsbl.oftc.net) Quit ()
[20:11] * tsg (~tgohad@134.134.137.71) Quit (Remote host closed the connection)
[20:14] * xarses_ (~xarses@64.124.158.3) has joined #ceph
[20:15] * derjohn_mob (~aj@tmo-107-20.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[20:21] * tsg (~tgohad@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[20:28] * salwasser (~Adium@72.246.3.14) Quit (Quit: Leaving.)
[20:29] * ade (~abradshaw@46.189.28.49) has joined #ceph
[20:29] * codice (~toodles@75-128-34-237.static.mtpk.ca.charter.com) has joined #ceph
[20:33] * rraja (~rraja@121.244.87.117) Quit (Quit: Leaving)
[20:41] * xarses_ (~xarses@64.124.158.3) Quit (Remote host closed the connection)
[20:41] * xarses_ (~xarses@64.124.158.3) has joined #ceph
[20:42] * garphy`aw is now known as garphy
[20:43] * ade (~abradshaw@46.189.28.49) Quit (Quit: Too sexy for his shirt)
[20:44] * tsg (~tgohad@jfdmzpr05-ext.jf.intel.com) Quit (Remote host closed the connection)
[20:45] * garphy is now known as garphy`aw
[20:52] * Brochacho (~alberto@97.93.161.13) Quit (Quit: Brochacho)
[20:59] * georgem1 (~Adium@206.108.127.16) has joined #ceph
[20:59] * georgem (~Adium@206.108.127.16) Quit (Read error: Connection reset by peer)
[20:59] * georgem (~Adium@206.108.127.16) has joined #ceph
[20:59] * georgem1 (~Adium@206.108.127.16) Quit (Read error: Connection reset by peer)
[21:01] * davidb (~David@MTRLPQ42-1176054809.sdsl.bell.ca) has left #ceph
[21:20] <diver_> does anyone use EnhanceIO with Ceph?
[21:21] <diver_> I just set that up for my test cluster and I'm running a write-only stress test
[21:24] * davidz (~davidz@2605:e000:1313:8003:4417:3d54:ddb5:fa77) has joined #ceph
[21:27] * kefu_ (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[21:28] * CephFan1 (~textual@68-233-224-175.static.hvvc.us) Quit (Ping timeout: 480 seconds)
[21:29] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:6004:a51a:3c47:611d) Quit (Ping timeout: 480 seconds)
[21:33] * sleinen (~Adium@2001:620:0:82::103) has joined #ceph
[21:37] <SamYaple> diver_: i tested enhanceIO, dmcache and bcache a few years back
[21:37] <SamYaple> I've done bcache + ceph for the past few years; it was the most performant
[21:38] <diver_> I read an article about running that in production, but I had issues setting up bcache on CentOS 7
[21:39] <diver_> it's great that EnhanceIO doesn't create a new block device, so I can use my existing ansible config
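For reference, a rough sketch of the bcache route SamYaple describes (not the EnhanceIO setup diver_ is testing). Device names are hypothetical (/dev/sdb as the backing HDD, /dev/nvme0n1p1 as the cache SSD) and it assumes the bcache-tools package is installed:

    modprobe bcache
    # create the backing device and the cache device in one go; they attach automatically
    make-bcache -B /dev/sdb -C /dev/nvme0n1p1
    # if the udev rules don't register them, do it by hand
    echo /dev/sdb       > /sys/fs/bcache/register
    echo /dev/nvme0n1p1 > /sys/fs/bcache/register
    # switch the resulting device to writeback caching, then build the OSD on /dev/bcache0
    echo writeback > /sys/block/bcache0/bcache/cache_mode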
[21:48] * togdon (~togdon@74.121.28.6) Quit (Quit: Sleeping...)
[21:49] <rkeene> Is there any convenient way to get a list of OSDs on a given node ?
[21:50] <rkeene> I guess "ceph osd crush tree" is the best ?
[21:52] <johnavp1989> Can I combine a RGW node with either a Mon node of MDS node?
[21:53] <diver_> rkeene: ceph osd dump|grep IP
[21:53] <diver_> probably
[21:54] <diver_> johnavp1989: yes, that's what I did
[21:54] <diver_> combined RGW, MON and HAProxy on the same nodes
[21:54] <diver_> but have SSD backed mon servers
[21:54] <diver_> works well. data load balanced with dns roundrobin
[21:55] <johnavp1989> diver_: oh nice. can i ask what you're using HAProxy for here?
[21:56] <diver_> johnavp1989: sorry, what do you mean?
[21:56] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[21:56] <diver_> I'm not sure that I understood you
[21:57] <diver_> I use it only for the S3 API (civetweb)
[21:57] <rkeene> diver_, Seems like ceph osd crush tree is easier to parse ?
[21:58] <johnavp1989> I was just curious what you were using it for. Are you load balancing your RGW's that way?
[21:59] <diver_> rkeene: true, it's json. but for basic shell I found the 'grep' way faster
[22:00] <diver_> johnavp1989: yes. haproxy in front of civetweb.
[22:00] <rkeene> Most of my code is written in Tcl anyway
[22:00] <diver_> 3 haproxy, 3 civetwebs on the 3 nodes. each node has 1 haproxy and 1 civetweb
[22:00] <diver_> haproxy has 3 backends
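A minimal haproxy.cfg sketch of the layout diver_ describes (addresses and the certificate path are placeholders; civetweb is assumed to listen on radosgw's default port 7480):

    frontend rgw_https
        bind *:443 ssl crt /etc/haproxy/certs/rgw.pem
        default_backend rgw

    backend rgw
        balance roundrobin
        option httpchk GET /
        server rgw1 10.0.0.11:7480 check
        server rgw2 10.0.0.12:7480 check
        server rgw3 10.0.0.13:7480 check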
[22:00] <rkeene> Parsing something regular is a lot better
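For what it's worth, a quick sketch of both approaches being discussed: scraping the plain tree output versus asking for JSON. The host bucket name ceph-node1 is a placeholder, and the column layout of 'ceph osd tree' can differ slightly between releases:

    # plain-text route: print the osd.* entries that sit under one host bucket
    ceph osd tree | awk -v host=ceph-node1 '
        $3 == "host"              { in_host = ($4 == host); next }
        in_host && $3 ~ /^osd\./  { print $3 }'

    # structured route: the same data as JSON, for anything stricter than grep
    ceph osd tree -f json-pretty | less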
[22:01] <diver_> frontend - mix of RSA/ECC (http://blog.haproxy.com/2015/07/15/serving-ecc-and-rsa-certificates-on-same-ip-with-haproxy/)
[22:01] * georgem (~Adium@206.108.127.16) Quit (Read error: Connection reset by peer)
[22:01] <diver_> 99% of clients support ECC, so I save 60% of the CPU
[22:01] * vbellur (~vijay@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving.)
[22:01] * georgem (~Adium@206.108.127.16) has joined #ceph
[22:02] <diver_> load balancing - tried first AWS Route53 with their 'weights' feature
[22:02] <rkeene> Hmm
[22:02] <rkeene> What does "ECC" stand for there ?
[22:02] <diver_> but found round robin more stable - near 33/33/33 between the haproxies
[22:02] <diver_> ECC - TLS certificate
[22:02] <rkeene> No.
[22:02] <rkeene> This is all kinds of wrong.
[22:03] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Ping timeout: 480 seconds)
[22:03] <rkeene> ECC = Elliptic Curve C??????
[22:03] <diver_> yes
[22:03] <rkeene> And what is the "??????" there ?
[22:04] <diver_> haproxy is in the chain only to do the encryption between ceph and the clients
[22:04] <diver_> I lost you again :)
[22:04] <rkeene> I'm just trying to figure out why they called ECDSA "ECC" here
[22:04] <diver_> elliptic cryptography certificate (ecc)
[22:05] <rkeene> That's gibberish.
[22:05] <diver_> you can really skip this part and have usual RSA
[22:05] <johnavp1989> diver_: awesome. thanks for all the info. i'm very surprised by the performance gain you've gotten by supporting ECC.
[22:05] <rkeene> TLS uses X.509 certificates, X.509 certificates can have RSA signatures, ECDSA signatures, etc
[22:06] <rkeene> So I have no idea what an "Elliptic Cryptography Certificate" is
[22:07] <diver_> johnavp1989: sure. it gives 40% less time on handshakes and 60-70% less CPU usage. as I had ~% of old clients that didn't support ECC, I set things up that way - works great
[22:07] <rkeene> diver_, Keep in mind I've implemented X.509 (v1,v3), RSA PKCS#1 (v1.5), PKCS#11 (v2.30 -- both modules and clients), ASN.1, etc
[22:07] <diver_> ecc certificate from comodo, ~80$ or so
[22:07] <diver_> rkeene: sorry man, I really don't get what you are talking about here
[22:08] <rkeene> diver_, Elliptic Cryptography certificates certificates ?
[22:08] <rkeene> diver_, I'm just trying to figure out why you keep adding more and more "c"s to the "EC" portion of "ECDSA"
[22:08] <diver_> just what I got used to. please ignore it
[22:09] <diver_> a. ecc
[22:09] <diver_> https://www.digicert.com/ecc.htm
[22:09] <diver_> Elliptic Curve Cryptography (ECC)
[22:09] <diver_> that's why two 'C'
[22:09] <rkeene> So why didn't they call it RSAC ?
[22:10] <rkeene> RSA Cryptography.
[22:10] <diver_> I don't know :) and don't care :)
[22:12] * tsg (~tgohad@192.55.55.41) has joined #ceph
[22:13] * W|ldCraze (~GuntherDW@tsn109-201-154-146.dyn.nltelcom.net) has joined #ceph
[22:13] <rkeene> It's likely they meant ECDSA and RSA... but there are several places where these are used in TLS -- X.509 certificates can have an RSA or an ECDSA signature and refer to an RSA or ECDSA key, and the DH key exchange can use ECDHE
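If anyone wants to check which certificate such a dual-cert frontend actually serves, here is a rough sketch with openssl s_client (the hostname is a placeholder; cipher names as in OpenSSL 1.0.x):

    # offer only ECDSA-authenticated ciphers; an ECC-capable frontend should return the ECDSA cert
    openssl s_client -connect s3.example.com:443 -servername s3.example.com \
        -cipher 'ECDHE-ECDSA-AES128-GCM-SHA256' </dev/null 2>/dev/null \
        | openssl x509 -noout -text | grep 'Public Key Algorithm'

    # offer only RSA-authenticated ciphers; the same endpoint should fall back to the RSA cert
    openssl s_client -connect s3.example.com:443 -servername s3.example.com \
        -cipher 'ECDHE-RSA-AES128-GCM-SHA256:AES128-SHA' </dev/null 2>/dev/null \
        | openssl x509 -noout -text | grep 'Public Key Algorithm'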
[22:16] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[22:16] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[22:17] * borei (~dan@216.13.217.230) has joined #ceph
[22:17] <borei> hi all
[22:18] <borei> im testing ceph cluster failover - 3 nodes, 6 OSDs. Powered off one node (unplugged the power cord); ceph-mon was kicked out pretty much instantly, but its OSDs were still shown as "up" in the tree
[22:19] <rkeene> This document, FWIW, specifies that if the client supports the latter then you should do the former (even though they aren't strictly correlated -- it's a good bet that anything that supports ECDHE can verify ECDSA certificates)
[22:19] <borei> i was expecting that they will be down pretty much instantly
[22:19] <borei> am i missing some configuration settings ?
[22:20] <rkeene> borei, There are heartbeats; missing several causes the OSD to be marked down
[22:20] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:20] <rkeene> There are options that specify the times more precisely
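The options rkeene is referring to live in ceph.conf; the values below are only illustrative (they roughly match the Jewel-era defaults quoted later in this conversation), not recommendations:

    [mon]
        # how long a "down" OSD may stay "in" before it is also marked "out" and data rebalances
        mon osd down out interval = 300
        # minimum ratio of "up" OSDs; below this the mon stops marking more OSDs down
        mon osd min up ratio = 0.3

    [osd]
        # seconds between OSD-to-OSD heartbeat pings
        osd heartbeat interval = 6
        # missed-heartbeat window after which peers report an OSD down to the monitors
        osd heartbeat grace = 20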
[22:22] * vbellur (~vijay@173-13-111-22-NewEngland.hfc.comcastbusiness.net) has joined #ceph
[22:22] <borei> http://docs.ceph.com/docs/jewel/rados/configuration/mon-osd-interaction ?
[22:23] <borei> mon osd down out interval - 5 minutes
[22:24] <borei> ok, more fundamental question
[22:24] <rkeene> mon osd down out interval
[22:24] <rkeene> Description: The number of seconds Ceph waits before marking a Ceph OSD Daemon down and out if it doesn't respond
[22:25] <borei> so if i have some OSDs that are down, ceph will mark them down in 5 minutes
[22:25] <rkeene> Maybe, check the percentage
[22:25] <borei> but within those 5 minutes, will IO be sent to those OSDs ?
[22:25] * ledgr (~ledgr@88-222-11-185.meganet.lt) has joined #ceph
[22:25] <rkeene> mon osd min up ratio
[22:25] <rkeene> Description: The minimum ratio of up Ceph OSD Daemons before Ceph will mark Ceph OSD Daemons down
[22:26] <borei> mon osd min up ratio - that one is not clear for me
[22:27] <rkeene> As I understand it: pgs whose primary is that OSD will still be sent to that OSD... the client will notice that it doesn't work and send it to a different OSD, causing it to be misplaced at that point
[22:27] <rkeene> borei, If there are 6 OSDs and not at least 6*0.3 are up, it won't mark any as down
[22:28] * vbellur (~vijay@173-13-111-22-NewEngland.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[22:28] * vbellur (~vijay@173-13-111-22-NewEngland.hfc.comcastbusiness.net) has joined #ceph
[22:28] * diver (~diver@216.85.162.34) has joined #ceph
[22:32] <borei> confused
[22:32] <borei> i have 6
[22:33] <borei> 2 were turned off, but they were marked as up/in in the tree
[22:33] <borei> for 5 minutes
[22:33] <rkeene> Yes
[22:34] <borei> im running VMs from that storage, so the VMs which were affected got stuck completely (i lost one, need to repair its FS)
[22:34] <borei> so it means IO wasn't stopped
[22:35] * diver_ (~diver@95.85.8.93) Quit (Ping timeout: 480 seconds)
[22:35] <borei> my "osd min up ratio" = 1.8, but i had 4 up
[22:35] <borei> so they were not marked as down ?
[22:36] <borei> is it correct ?
[22:36] * diver (~diver@216.85.162.34) Quit (Ping timeout: 480 seconds)
[22:37] * Brochacho (~alberto@97.93.161.13) has joined #ceph
[22:39] <rkeene> I presume your osd min up ratio is 0.3 (default), meaning if you had at least 1.8 (2) OSDs up, it will mark OSDs down
[22:39] * togdon (~togdon@74.121.28.6) has joined #ceph
[22:39] <rkeene> If you have only 2 OSDs up and you have 6 OSDs, it won't mark either of those 2 down
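Put as arithmetic, with borei's 6 OSDs and the default 0.3 ratio rkeene assumes:

    6 OSDs * 0.3 min up ratio = 1.8   -> at least 2 OSDs must remain "up"
    4 OSDs were still up, and 4 >= 2  -> the ratio does not stop the 2 dead OSDs from being marked down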
[22:39] * tsg (~tgohad@192.55.55.41) Quit (Remote host closed the connection)
[22:40] <borei> ok
[22:42] * georgem (~Adium@24.114.71.31) has joined #ceph
[22:43] * georgem (~Adium@24.114.71.31) Quit ()
[22:43] * W|ldCraze (~GuntherDW@tsn109-201-154-146.dyn.nltelcom.net) Quit ()
[22:43] * georgem (~Adium@206.108.127.16) has joined #ceph
[22:43] * vbellur (~vijay@173-13-111-22-NewEngland.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[22:45] <borei> anyway im missing something. that failover test shouldn't cause a service outage, but i don't see where my weak spot is
[22:45] <borei> monitor and 2 OSD went down
[22:46] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[22:47] <borei> client(s) were timing out talking to that monitor - was that the reason? or was IO being sent to those OSDs?
[22:48] * ledgr (~ledgr@88-222-11-185.meganet.lt) Quit (Remote host closed the connection)
[22:49] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[22:49] * derjohn_mob (~aj@x4db2ad83.dyn.telefonica.de) has joined #ceph
[22:51] * masber (~masber@129.94.15.152) Quit (Ping timeout: 480 seconds)
[22:51] * Fira (~oftc-webi@46.218.229.172) Quit (Ping timeout: 480 seconds)
[22:54] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[22:59] * sleinen (~Adium@2001:620:0:82::103) Quit (Ping timeout: 480 seconds)
[23:00] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[23:00] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[23:02] * tsg (~tgohad@134.134.139.82) has joined #ceph
[23:06] * jordan_c (~jconway@cable-192.222.129.3.electronicbox.net) Quit (Read error: Connection reset by peer)
[23:08] * BrianA (~BrianA@fw-rw.shutterfly.com) Quit (Read error: Connection reset by peer)
[23:09] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[23:17] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Remote host closed the connection)
[23:17] * vbellur (~vijay@71.234.224.255) has joined #ceph
[23:17] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[23:18] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[23:25] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[23:27] * kristen (~kristen@134.134.137.75) Quit (Remote host closed the connection)
[23:32] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[23:35] * tsg (~tgohad@134.134.139.82) Quit (Remote host closed the connection)
[23:42] * tsg (~tgohad@192.55.55.41) has joined #ceph
[23:54] * vbellur (~vijay@71.234.224.255) Quit (Quit: Leaving.)
[23:54] * vbellur (~vijay@71.234.224.255) has joined #ceph
[23:55] * Lokta (~Lokta@193.164.231.98) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.