#ceph IRC Log

Index

IRC Log for 2016-08-09

Timestamps are in GMT/BST.

[0:01] * cathode (~cathode@50.232.215.114) Quit (Quit: Leaving)
[0:01] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Remote host closed the connection)
[0:03] * wak-work (~wak-work@2620:15c:202:0:a490:e76c:794e:fb56) Quit (Ping timeout: 480 seconds)
[0:07] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[0:12] * wak-work (~wak-work@2620:15c:202:0:85ce:c4bc:c124:86cb) has joined #ceph
[0:18] * [0x4A6F]_ (~ident@p508CD169.dip0.t-ipconnect.de) has joined #ceph
[0:18] * ahmeni (~Azru@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[0:20] * aNupoisc (~adnavare@fmdmzpr03-ext.fm.intel.com) has joined #ceph
[0:20] * srk (~Siva@32.97.110.55) Quit (Ping timeout: 480 seconds)
[0:21] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:21] * [0x4A6F]_ is now known as [0x4A6F]
[0:22] * rendar (~I@host121-176-dynamic.52-79-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[0:27] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[0:31] * bene3 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[0:31] * badone (~badone@66.187.239.16) Quit (Quit: k?thxbyebyenow)
[0:32] * danieagle (~Daniel@179.110.8.48) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[0:36] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[0:38] * kevinc_ (~kevinc__@132.249.238.94) has joined #ceph
[0:40] * kevinc___ (~kevinc__@client64-35.sdsc.edu) has joined #ceph
[0:45] * kevinc (~kevinc__@client64-35.sdsc.edu) Quit (Ping timeout: 480 seconds)
[0:47] * kevinc_ (~kevinc__@132.249.238.94) Quit (Ping timeout: 480 seconds)
[0:48] * ahmeni (~Azru@26XAAAXBF.tor-irc.dnsbl.oftc.net) Quit ()
[0:51] * fsimonce (~simon@host203-44-dynamic.183-80-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[0:52] * theTrav (~theTrav@1.136.97.121) has joined #ceph
[0:53] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[0:55] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Quit: Konversation terminated!)
[0:56] * theTrav (~theTrav@1.136.97.121) Quit (Remote host closed the connection)
[0:57] * theTrav (~theTrav@1.136.97.121) has joined #ceph
[0:57] * ntpttr_ (~ntpttr@192.55.54.44) has joined #ceph
[1:02] * theTrav (~theTrav@1.136.97.121) Quit (Remote host closed the connection)
[1:03] * theTrav (~theTrav@1.136.97.121) has joined #ceph
[1:06] * _28_ria (~kvirc@opfr028.ru) has joined #ceph
[1:12] * kevinc___ (~kevinc__@client64-35.sdsc.edu) Quit (Quit: Leaving)
[1:12] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[1:14] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[1:16] * Nacer (~Nacer@pai34-5-88-176-168-157.fbx.proxad.net) has joined #ceph
[1:20] * _28_ria (~kvirc@opfr028.ru) Quit (Read error: Connection reset by peer)
[1:21] * Miouge (~Miouge@109.128.94.173) has joined #ceph
[1:22] * _28_ria (~kvirc@opfr028.ru) has joined #ceph
[1:22] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:24] * Nacer (~Nacer@pai34-5-88-176-168-157.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[1:29] * Miouge (~Miouge@109.128.94.173) Quit (Ping timeout: 480 seconds)
[1:29] * theTrav (~theTrav@1.136.97.121) Quit (Remote host closed the connection)
[1:29] <leandrojpg> someone online?
[1:41] * theTrav (~theTrav@203.35.9.142) has joined #ceph
[1:43] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) has joined #ceph
[1:45] * ntpttr_ (~ntpttr@192.55.54.44) Quit (Ping timeout: 480 seconds)
[1:47] * oms101 (~oms101@p20030057EA02AD00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:48] * andreww (~xarses@64.124.158.192) Quit (Ping timeout: 480 seconds)
[1:48] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[1:48] * theTrav (~theTrav@203.35.9.142) Quit (Read error: Connection timed out)
[1:51] * theTrav (~theTrav@203.35.9.142) has joined #ceph
[1:51] * Zyn (~click@tor-exit.squirrel.theremailer.net) has joined #ceph
[1:53] * salwasser (~Adium@a72-246-0-10.deploy.akamaitechnologies.com) has joined #ceph
[1:56] * oms101 (~oms101@p20030057EA020500C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[2:00] * blizzow (~jburns@50.243.148.102) Quit (Ping timeout: 480 seconds)
[2:01] * penguinRaider (~KiKo@14.139.82.6) has joined #ceph
[2:01] * salwasser (~Adium@a72-246-0-10.deploy.akamaitechnologies.com) Quit (Quit: Leaving.)
[2:01] <leandrojpg> health HEALTH_WARN
[2:01] <leandrojpg> 64 pgs degraded
[2:01] <leandrojpg> 64 pgs stuck degraded
[2:01] <leandrojpg> 64 pgs stuck unclean
[2:01] <leandrojpg> 64 pgs stuck undersized
[2:01] <leandrojpg> 64 pgs undersized
[2:02] * penguinRaider_ (~KiKo@23.27.206.118) Quit (Ping timeout: 480 seconds)
[2:02] <leandrojpg> I have only 2 OSDs; would that alone cause this message?
[2:03] <leandrojpg> can someone help me?
[2:06] <leandrojpg> please guys
[2:06] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[2:07] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[2:10] * penguinRaider (~KiKo@14.139.82.6) Quit (Ping timeout: 480 seconds)
[2:10] * ntpttr_ (~ntpttr@192.55.54.44) has joined #ceph
[2:13] <Kingrat_> you need more hosts
[2:13] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[2:15] * truan-wang (~truanwang@220.248.17.34) Quit (Ping timeout: 480 seconds)
[2:16] * Nacer (~Nacer@pai34-5-88-176-168-157.fbx.proxad.net) has joined #ceph
[2:17] * ntpttr_ (~ntpttr@192.55.54.44) Quit (Remote host closed the connection)
[2:18] * penguinRaider (~KiKo@204.152.207.173) has joined #ceph
[2:21] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:21] <leandrojpg> dfs
[2:21] <leandrojpg> in that case do I need to do it all again, I mean create the deploy user, set up passwordless ssh etc ... to bring in another host? Kingrat_, can you say?
[2:21] * Zyn (~click@26XAAAXC2.tor-irc.dnsbl.oftc.net) Quit ()
[2:21] * KungFuHamster (~Gecko1986@tor.exit.relay.dedicatedpi.com) has joined #ceph
[2:22] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:22] <leandrojpg> Kingrat?
[2:22] <leandrojpg> would it be this?
[2:28] <leandrojpg> help
[2:29] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[2:40] <Kingrat_> by default it spreads the data across multiple nodes, if you only have 1 node or only 2 osds on 1 node, you will need to change your replica size and crush map
[2:40] <Kingrat_> this is not a filesystem that you want to use on 2 osds
[2:40] * theTrav_ (~theTrav@ipc032.ipc.telstra.net) has joined #ceph
[2:44] <leandrojpg> in my case I have 1 node where I run the deploy and 2 OSDs, so in that case I guess it's not recommended?
[2:44] <leandrojpg> if I want to get my cluster to health OK, could you tell me how?
[2:44] <leandrojpg> Kingrat
[2:44] * Nacer (~Nacer@pai34-5-88-176-168-157.fbx.proxad.net) Quit (Remote host closed the connection)
[2:45] * theTrav (~theTrav@203.35.9.142) Quit (Read error: Connection timed out)
[2:46] <leandrojpg> how many OSDs is it ideal to have for the cluster to be healthy, or at least better?
[2:48] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[2:51] * KungFuHamster (~Gecko1986@26XAAAXDT.tor-irc.dnsbl.oftc.net) Quit ()
[2:51] * superdug (~poller@torrelay9.tomhek.net) has joined #ceph
[2:51] * aNupoisc (~adnavare@fmdmzpr03-ext.fm.intel.com) Quit (Remote host closed the connection)
[2:51] * aNupoisc (~adnavare@fmdmzpr03-ext.fm.intel.com) has joined #ceph
[2:53] <Kingrat_> if you have a size of 3 (number of replicas) by default you need 3 nodes with 1+ osd each, and that would be a bare minimum configuration to get health ok, it would be far from ideal
[2:53] <Kingrat_> if you need storage smaller/simpler than that, i would suggest using another technology
[2:54] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[2:57] <leandrojpg> I understand, but if I have an admin node where I run the deploy and two hosts that are OSDs, can I get health OK by changing the maps or not?
[2:57] <leandrojpg> I understand this is not ideal, but I wanted to see what possibilities ceph gives me. How is it done? Do I change the crush map?
[2:59] <leandrojpg> the number of replicas follows the number of hosts + OSDs, right?
[2:59] <Kingrat_> you can run monitors on the same nodes as your osds
[3:00] <leandrojpg> understood, that is what I did, but I still don't have health OK
[3:00] <Kingrat_> replicas is how many copies of each object, and by default the objects are not allowed to be duplicated on any node, so replica 3 requires 3+ nodes, replica 2 will work with 2
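A minimal sketch of what Kingrat_ describes, assuming a Jewel-era CLI and a pool named "rbd" (the pool name is an assumption, not taken from the log):

    # with two OSD hosts, size 2 / min_size 1 lets the default CRUSH rule place both copies
    ceph osd pool set rbd size 2
    ceph osd pool set rbd min_size 1
    # if instead both OSDs sit on a single host, CRUSH also has to be allowed to pick
    # separate OSDs rather than separate hosts, e.g. via a simple rule:
    ceph osd crush rule create-simple replicate-by-osd default osd
    ceph osd pool set rbd crush_ruleset 1   # rule id as shown by 'ceph osd crush rule dump'

As Kingrat_ says, this only gets HEALTH_OK on a bare-minimum setup; it is not a configuration to rely on for real data.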
[3:02] <leandrojpg> understood, one last doubt.
[3:02] <leandrojpg> so to add another host as an OSD I need to do the same configuration: create the deploy user on the host, passwordless ssh etc ...?
[3:03] <Kingrat_> im not sure what you mean, but it would probably need to be configured the same as the other hosts?
[3:05] <leandrojpg> to add another host / OSD to the cluster it must have the same ssh trust relationship that the documentation asks for? understand now? I'm sorry for my english
[3:07] * truan-wang (~truanwang@220.248.17.34) Quit (Ping timeout: 480 seconds)
[3:07] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[3:08] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:11] * yanzheng (~zhyan@125.70.20.176) has joined #ceph
[3:14] * penguinRaider (~KiKo@204.152.207.173) Quit (Ping timeout: 480 seconds)
[3:17] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[3:21] * superdug (~poller@9YSAAA7VZ.tor-irc.dnsbl.oftc.net) Quit ()
[3:24] * tsg (~tgohad@jfdmzpr04-ext.jf.intel.com) Quit (Remote host closed the connection)
[3:25] <leandrojpg> thanks <Kingrat_>
[3:25] * leandrojpg (~IceChat9@189-12-20-140.user.veloxzone.com.br) Quit (Quit: If you're not living on the edge, you're taking up too much space)
[3:26] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) Quit (Quit: Leaving.)
[3:28] * penguinRaider (~KiKo@204.152.207.173) has joined #ceph
[3:30] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:34] * scg (~zscg@181.122.37.47) has joined #ceph
[3:37] * penguinRaider (~KiKo@204.152.207.173) Quit (Ping timeout: 480 seconds)
[3:38] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[3:41] * haplo37 (~haplo37@107-190-44-23.cpe.teksavvy.com) has joined #ceph
[3:41] * sebastian-w (~quassel@212.218.8.139) has joined #ceph
[3:42] * wingwu (~wingwu@www.nisys.be) has joined #ceph
[3:43] * sebastian-w_ (~quassel@212.218.8.138) Quit (Read error: Connection reset by peer)
[3:44] * jamespd (~mucky@mucky.socket7.org) Quit (Remote host closed the connection)
[3:44] * malevolent (~quassel@192.146.172.118) Quit (Ping timeout: 480 seconds)
[3:45] * swami1 (~swami@27.7.171.153) has joined #ceph
[3:45] * vbellur (~vijay@71.234.224.255) has joined #ceph
[3:47] * theTrav_ (~theTrav@ipc032.ipc.telstra.net) Quit (Remote host closed the connection)
[3:48] * theTrav (~theTrav@203.35.9.142) has joined #ceph
[3:48] * tsg (~tgohad@134.134.139.83) has joined #ceph
[3:50] * kefu (~kefu@114.92.96.253) has joined #ceph
[3:51] * scg (~zscg@181.122.37.47) Quit (Remote host closed the connection)
[3:53] * cryptk (~csharp@178-175-128-50.static.host) has joined #ceph
[4:00] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[4:01] * davidzlap (~Adium@2605:e000:1313:8003:cdd9:3191:5b30:49ef) Quit (Quit: Leaving.)
[4:02] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[4:03] * kefu (~kefu@114.92.96.253) has joined #ceph
[4:04] * haomaiwang (~oftc-webi@61.149.85.206) has joined #ceph
[4:05] * srk (~Siva@2605:6000:ed04:ce00:19fd:42fb:1625:4207) has joined #ceph
[4:07] * marco208 (~root@159.253.7.204) Quit (Remote host closed the connection)
[4:07] * marco208 (~root@159.253.7.204) has joined #ceph
[4:08] * cyphase (~cyphase@000134f2.user.oftc.net) Quit (Ping timeout: 480 seconds)
[4:15] * srk (~Siva@2605:6000:ed04:ce00:19fd:42fb:1625:4207) Quit (Ping timeout: 480 seconds)
[4:16] * cyphase (~cyphase@c-50-148-131-137.hsd1.ca.comcast.net) has joined #ceph
[4:19] * T1 (~the_one@5.186.54.143) has joined #ceph
[4:21] * jfaj__ (~jan@p20030084AF32BF005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) has joined #ceph
[4:21] * Miouge (~Miouge@109.128.94.173) has joined #ceph
[4:22] * cryptk (~csharp@5AEAAAUZ5.tor-irc.dnsbl.oftc.net) Quit ()
[4:23] * kefu (~kefu@114.92.96.253) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[4:25] * jamespd (~mucky@mucky.socket7.org) has joined #ceph
[4:26] * The1_ (~the_one@5.186.54.143) Quit (Ping timeout: 480 seconds)
[4:28] * jfaj_ (~jan@p4FE4F5FB.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[4:28] * kefu (~kefu@114.92.96.253) has joined #ceph
[4:30] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[4:31] * kefu (~kefu@114.92.96.253) has joined #ceph
[4:32] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[4:33] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[4:34] * Miouge (~Miouge@109.128.94.173) Quit (Ping timeout: 480 seconds)
[4:34] * Racpatel (~Racpatel@2601:87:0:24af::53d5) Quit (Ping timeout: 480 seconds)
[4:40] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[4:45] * offender (~Lattyware@torrelay4.tomhek.net) has joined #ceph
[4:49] * wingwu (~wingwu@www.nisys.be) Quit (Remote host closed the connection)
[4:52] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[4:55] * LegalResale (~LegalResa@66.165.126.130) has joined #ceph
[4:56] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[5:03] * ntpttr_ (~ntpttr@192.55.54.40) has joined #ceph
[5:05] * swami1 (~swami@27.7.171.153) Quit (Ping timeout: 480 seconds)
[5:07] * ntpttr_ (~ntpttr@192.55.54.40) Quit (Remote host closed the connection)
[5:07] * ntpttr_ (~ntpttr@134.134.137.75) has joined #ceph
[5:15] * offender (~Lattyware@9YSAAA7YN.tor-irc.dnsbl.oftc.net) Quit ()
[5:16] * ntpttr_ (~ntpttr@134.134.137.75) Quit (Ping timeout: 480 seconds)
[5:19] * wushudoin (~wushudoin@2601:646:8281:cfd:2ab2:bdff:fe0b:a6ee) Quit (Ping timeout: 480 seconds)
[5:24] * FNugget (~Swompie`@108.61.122.114) has joined #ceph
[5:31] * vimal (~vikumar@114.143.165.8) has joined #ceph
[5:38] * Vacuum__ (~Vacuum@88.130.202.82) has joined #ceph
[5:45] * Vacuum_ (~Vacuum@88.130.195.58) Quit (Ping timeout: 480 seconds)
[5:47] * aNupoisc (~adnavare@fmdmzpr03-ext.fm.intel.com) Quit (Remote host closed the connection)
[5:49] * rdas (~rdas@121.244.87.116) has joined #ceph
[5:50] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[5:50] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[5:53] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[5:54] * FNugget (~Swompie`@26XAAAXHI.tor-irc.dnsbl.oftc.net) Quit ()
[5:56] * theTrav (~theTrav@203.35.9.142) Quit (Read error: Connection timed out)
[5:59] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[6:01] * jamespage (~jamespage@culvain.gromper.net) Quit (Read error: Connection reset by peer)
[6:01] * walcubi_ (~walcubi@p5797AF87.dip0.t-ipconnect.de) has joined #ceph
[6:03] * KUSmurf (~ricin@89-72-54-15.dynamic.chello.pl) has joined #ceph
[6:03] * truan-wang (~truanwang@220.248.17.34) Quit (Remote host closed the connection)
[6:07] * KUSmurf (~ricin@61TAAA6Z7.tor-irc.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[6:08] * Silentspy (~Chrissi_@tor-exit.gansta93.com) has joined #ceph
[6:08] * walcubi (~walcubi@p5795B41D.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:09] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[6:11] * vimal (~vikumar@114.143.165.8) Quit (Quit: Leaving)
[6:25] * tsg (~tgohad@134.134.139.83) Quit (Remote host closed the connection)
[6:27] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:32] * vimal (~vikumar@121.244.87.116) has joined #ceph
[6:36] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[6:36] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[6:38] * Silentspy (~Chrissi_@9YSAAA70L.tor-irc.dnsbl.oftc.net) Quit ()
[6:40] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[6:50] * truan-wang_ (~truanwang@220.248.17.34) has joined #ceph
[6:50] * _28_ria (~kvirc@opfr028.ru) Quit (Read error: Connection reset by peer)
[6:52] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:53] * truan-wang (~truanwang@220.248.17.34) Quit (Ping timeout: 480 seconds)
[6:54] * swami1 (~swami@49.38.1.162) has joined #ceph
[6:58] * karnan (~karnan@2405:204:5502:b48e:3602:86ff:fe56:55ae) has joined #ceph
[7:02] * ronrib (~boswortr@45.32.242.135) has joined #ceph
[7:03] * kefu is now known as kefu|afk
[7:08] * kefu|afk is now known as kefu
[7:09] * Unai (~Adium@208.80.71.24) has joined #ceph
[7:10] * swami1 (~swami@49.38.1.162) Quit (Read error: Connection timed out)
[7:12] * swami1 (~swami@49.38.1.162) has joined #ceph
[7:17] * hassifa (~ylmson@torsrva.snydernet.net) has joined #ceph
[7:18] * truan-wang_ (~truanwang@220.248.17.34) Quit (Ping timeout: 480 seconds)
[7:19] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[7:24] * swami1 (~swami@49.38.1.162) Quit (Quit: Leaving.)
[7:27] * Miouge (~Miouge@109.128.94.173) has joined #ceph
[7:30] * swami1 (~swami@49.38.1.162) has joined #ceph
[7:34] * haplo37 (~haplo37@107-190-44-23.cpe.teksavvy.com) Quit (Ping timeout: 480 seconds)
[7:35] * Miouge (~Miouge@109.128.94.173) Quit (Ping timeout: 480 seconds)
[7:35] * Unai (~Adium@208.80.71.24) Quit (Quit: Leaving.)
[7:36] * Xmd (~Xmd@78.85.35.236) has joined #ceph
[7:44] <Xmd> Hi.
[7:44] <Xmd> Problem: on the server an rbd is mounted at a directory that is exported over NFS, but the server that mounts the NFS share does not see the mounted image's contents in there. OpenSUSE Leap. Has anyone run into this? Why is the directory not visible over NFS? With smb there is no problem.
[7:47] * hassifa (~ylmson@26XAAAXI9.tor-irc.dnsbl.oftc.net) Quit ()
[7:52] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[7:53] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[7:58] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[8:00] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[8:02] <Xmd> for FAQ - answer: nfs export with param 'rw,async,wdelay,no_root_squash,no_subtree_check'
[8:08] <IcePic> those nfs options seem rather "normal", aren't they?
[8:09] <IcePic> as in "none of them look like make-invisible-stuff-seen"..
[8:10] <T1w> permissions on parent dirs?
[8:10] <T1w> NFS exported correctly?
[8:10] <T1w> nfs v3 or v4?
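For reference, Xmd's posted fix amounts to an export entry along these lines; a sketch, with the mount point path made up rather than taken from the log:

    # /etc/exports on the NFS server (the RBD image is assumed to be mounted at /srv/rbd-export)
    /srv/rbd-export  *(rw,async,wdelay,no_root_squash,no_subtree_check)
    # if the RBD filesystem is mounted below an already-exported parent directory,
    # adding crossmnt (or nohide) to the parent's options is commonly what makes
    # the nested mount visible to clients
    exportfs -ra            # re-read /etc/exports
    showmount -e localhost  # verify the export list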
[8:14] * ntpttr_ (~ntpttr@134.134.139.78) has joined #ceph
[8:20] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[8:22] * ntpttr_ (~ntpttr@134.134.139.78) Quit (Ping timeout: 480 seconds)
[8:24] * Miouge (~Miouge@109.128.94.173) has joined #ceph
[8:28] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[8:29] * kefu (~kefu@114.92.96.253) has joined #ceph
[8:38] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[8:40] * badone (~badone@66.187.239.16) has joined #ceph
[8:40] * kefu (~kefu@114.92.96.253) has joined #ceph
[8:43] * truan-wang (~truanwang@220.248.17.34) Quit (Remote host closed the connection)
[8:45] * davidzlap1 (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[8:46] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Ping timeout: 480 seconds)
[8:48] * babilen (~babilen@babilen.user.oftc.net) has left #ceph
[8:49] * rmart04 (~rmart04@host109-155-213-112.range109-155.btcentralplus.com) has joined #ceph
[8:52] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[8:56] * rmart04 (~rmart04@host109-155-213-112.range109-155.btcentralplus.com) Quit (Quit: rmart04)
[8:56] * analbeard (~shw@support.memset.com) has joined #ceph
[8:57] * ceph-ircslackbot2 (~ceph-ircs@ds9536.dreamservers.com) has joined #ceph
[9:00] * Pulp (~Pulp@63-221-50-195.dyn.estpak.ee) Quit (Read error: Connection reset by peer)
[9:03] * rendar (~I@host222-180-dynamic.12-79-r.retail.telecomitalia.it) has joined #ceph
[9:04] * ceph-ircslackbot (~ceph-ircs@ds9536.dreamservers.com) Quit (Ping timeout: 480 seconds)
[9:05] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[9:08] * aj__ (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[9:08] * ade (~abradshaw@tmo-100-60.customers.d1-online.com) has joined #ceph
[9:22] * truan-wang (~truanwang@58.247.8.186) has joined #ceph
[9:28] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[9:29] * boolman (boolman@79.138.78.238) has joined #ceph
[9:35] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[9:35] * fsimonce (~simon@host203-44-dynamic.183-80-r.retail.telecomitalia.it) has joined #ceph
[9:36] * kefu (~kefu@114.92.96.253) has joined #ceph
[9:37] * aj__ (~aj@fw.gkh-setu.de) has joined #ceph
[9:40] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[9:41] * kefu (~kefu@114.92.96.253) has joined #ceph
[9:45] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[9:45] * kefu (~kefu@114.92.96.253) has joined #ceph
[9:45] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[9:50] * aj__ (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[9:52] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[9:55] * Nacer (~Nacer@pai34-5-88-176-168-157.fbx.proxad.net) has joined #ceph
[9:59] * aj__ (~aj@fw.gkh-setu.de) has joined #ceph
[9:59] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[10:02] * jamespag` (~jamespage@culvain.gromper.net) has joined #ceph
[10:03] * rotbeard (~redbeard@185.32.80.238) Quit ()
[10:07] * Aal (~Tralin|Sl@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[10:08] * i_m (~ivan.miro@deibp9eh1--blueice2n4.emea.ibm.com) has joined #ceph
[10:09] * brians_ (~brianoftc@brian.by) Quit (Quit: ZNC - http://znc.in)
[10:11] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:15] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[10:16] * Miouge_ (~Miouge@109.128.94.173) has joined #ceph
[10:16] * art_yo (~kvirc@149.126.169.197) Quit (Read error: Connection reset by peer)
[10:19] * rotbeard (~redbeard@185.32.80.238) Quit ()
[10:21] * Miouge (~Miouge@109.128.94.173) Quit (Ping timeout: 480 seconds)
[10:21] * Miouge_ is now known as Miouge
[10:22] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[10:22] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[10:24] * kefu (~kefu@114.92.96.253) has joined #ceph
[10:27] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[10:30] * wkennington_ (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) has joined #ceph
[10:35] * truan-wang (~truanwang@58.247.8.186) Quit (Ping timeout: 480 seconds)
[10:36] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[10:37] * kefu (~kefu@114.92.96.253) has joined #ceph
[10:37] * Aal (~Tralin|Sl@26XAAAXMP.tor-irc.dnsbl.oftc.net) Quit ()
[10:37] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:40] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[10:41] * kefu (~kefu@114.92.96.253) has joined #ceph
[10:42] * SquallSeeD31 (~Jase@213.61.149.100) has joined #ceph
[10:45] * TMM (~hp@185.5.121.201) has joined #ceph
[10:53] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:4d98:7dea:2462:19d7) has joined #ceph
[10:59] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[11:02] * shyu (~Frank@218.241.172.114) has joined #ceph
[11:09] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[11:11] * SquallSeeD31 (~Jase@26XAAAXNA.tor-irc.dnsbl.oftc.net) Quit ()
[11:11] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[11:19] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:4d98:7dea:2462:19d7) Quit (Ping timeout: 480 seconds)
[11:20] * rakeshgm (~rakesh@121.244.87.118) has joined #ceph
[11:28] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[11:28] * analbeard (~shw@support.memset.com) has joined #ceph
[11:28] * adun153 (~adun153@130.105.147.50) has joined #ceph
[11:28] * Nacer (~Nacer@pai34-5-88-176-168-157.fbx.proxad.net) Quit (Remote host closed the connection)
[11:29] <adun153> Hello everyone, please look at this:
[11:29] <adun153> root@controller1:/var/log/nova# ceph -s
[11:29] <adun153> cluster 3175dc2e-bd5b-4cd7-91ce-1ab9454b4142
[11:29] <adun153> health HEALTH_ERR
[11:29] <adun153> 4 pgs inconsistent
[11:29] <adun153> 26 scrub errors
[11:29] <adun153> monmap e1: 3 mons at {storage1=192.168.0.15:6789/0,storage2=192.168.0.16:6789/0,storage3=192.168.0.17:6789/0}
[11:29] <adun153> election epoch 228, quorum 0,1,2 storage1,storage2,storage3
[11:29] <adun153> osdmap e16747: 66 osds: 66 up, 66 in
[11:29] <adun153> flags sortbitwise
[11:29] <adun153> pgmap v14320107: 8080 pgs, 17 pools, 2821 GB data, 500 kobjects
[11:29] <adun153> 5634 GB used, 55485 GB / 61119 GB avail
[11:29] <adun153> 8068 active+clean
[11:29] <adun153> 8 active+clean+scrubbing+deep
[11:29] <adun153> 4 active+clean+inconsistent
[11:29] <adun153> client io 0 B/s rd, 5265 kB/s wr, 795 op/s
[11:29] <adun153> What can I do to fix the HEALTH_ERR status of my cluster? What's causing the scrub errors, and what's causing the 4 pgs to be inconsistent?
[11:30] <adun153> Is there a general rule to follow regarding this?
[11:32] <IcePic> even if I haven't seen exactly that, the ceph -s output in my cases usually gave me a scare since it uses words for "working on fixing this-or-that" which on other storage systems would indicate "death imminent, get a new job now" ;)
[11:33] <IcePic> so watch it for a while and see if it fixes at least one of those 4 pgs, then you know ceph is on top of it and in the process of getting stuff in order
[11:33] <IcePic> but reading the output of "ceph -w" could give more hints on why it errors out
[11:33] <IcePic> did anything in particular happen before you got into this state?
[11:34] <adun153> IcePic: Yes.
[11:34] <adun153> I'm using this for OpenStack
[11:35] <adun153> the day admin reported that some users had complained about "slowness" in their VMs.
[11:35] <adun153> He rebooted one of the sluggish ones, and then the dashboard showed an "error" state.
[11:36] <adun153> Eventually, I found that one of the OSDs was down (I have about 60). Curious. I looked at the osd log, it kept showing a "wrong node!" line.
[11:36] <adun153> Failing to remember how to restart an individual OSD, I stopped all OSDs on that node, then started it up again.
[11:36] <adun153> That solved the problem with the VMs, everything went back to working normally.
[11:37] <adun153> That's the cluster state I ended up with.
[11:38] * kefu (~kefu@114.92.96.253) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[11:43] * aj__ (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[11:48] * brians_ (~brianoftc@brian.by) has joined #ceph
[11:52] <doppelgrau> adun153: http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent
[11:52] * aj__ (~aj@fw.gkh-setu.de) has joined #ceph
[11:53] <doppelgrau> adun153: you have probably a disk that is not very reliable anymore
[11:53] <doppelgrau> adun153: or broken timebase
[11:53] <adun153> Doppelgrau: thanks.
[11:53] <adun153> what's a timebase?
[11:54] <doppelgrau> adun153: not the same time everywhere
[11:54] <adun153> Ah.
[11:54] <doppelgrau> adun153: e.g. not a working ntp daemon
[11:56] <adun153> Ok, that gives me something to work with, thanks!
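Quick checks for doppelgrau's two theories (a flaky disk or a broken timebase); a sketch that assumes ntpd and smartmontools are installed, with /dev/sdX standing in for the OSD's disk:

    ceph health detail | grep -i "clock skew"    # the monitors report skew here if time drifts
    ntpq -p                                      # run on each node; a '*' marks a selected, synced peer
    smartctl -a /dev/sdX | grep -i -e realloc -e pending   # look for growing reallocated/pending sectors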
[12:01] * eXeler0n (~Kurimus@185.133.32.19) has joined #ceph
[12:01] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[12:07] <adun153> doppelgrau: I got it back to a HEALTH_OK state. I read from the link that inconsistent pgs are caused by errors in scrubbing. "scrub" and its variants aren't in the glossary. Where can I find out what scrubbing is?
[12:08] <adun153> Ah, nevermind found it in http://docs.ceph.com/docs/master/architecture/?highlight=scrub Thanks again!
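The troubleshooting page doppelgrau linked comes down to roughly this sequence; a sketch, with <pgid> a placeholder for each PG reported as inconsistent:

    ceph health detail | grep inconsistent                    # list the affected PGs
    rados list-inconsistent-obj <pgid> --format=json-pretty   # see which object copies disagree (Jewel and later)
    ceph pg repair <pgid>                                     # ask the primary OSD to repair from the good copies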
[12:12] * i_m (~ivan.miro@deibp9eh1--blueice2n4.emea.ibm.com) Quit (Ping timeout: 480 seconds)
[12:16] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[12:24] * walcubi_ is now known as walbuci
[12:25] <walbuci> SamYaple, I also found out that these two settings are important: http://docs.ceph.com/docs/master/rados/configuration/filestore-config-ref/#misc
[12:25] <walbuci> where filestore_merge_threshold must be negative, otherwise no pre-emptive directory creation will take place.
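A ceph.conf sketch of the two settings walbuci means; the values here are illustrative, not recommendations:

    [osd]
    # a negative merge threshold disables directory merging and enables pre-splitting
    # of PG directories when a pool is created with expected_num_objects
    filestore merge threshold = -10
    filestore split multiple  = 2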
[12:28] * rakeshgm (~rakesh@121.244.87.118) Quit (Ping timeout: 480 seconds)
[12:30] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[12:31] * eXeler0n (~Kurimus@b9852013.test.dnsbl.oftc.net) Quit ()
[12:35] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[12:35] * sebastian-w_ (~quassel@212.218.8.138) has joined #ceph
[12:37] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[12:39] * sebastian-w (~quassel@212.218.8.139) Quit (Ping timeout: 480 seconds)
[12:43] * jfaj__ (~jan@p20030084AF32BF005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[12:43] * jfaj__ (~jan@p4FE4ED78.dip0.t-ipconnect.de) has joined #ceph
[12:56] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[12:59] <koollman> Hi. I'm trying to think of the best way to partition and use some hardware I have to put ceph on it (and to make sure I can grow it later by adding more servers). I've got 2 ssds and 2 hdds per server, and I'm wondering how I can keep most of the data off the ssds (probably by making a specific pool for ssd-only storage)
[13:00] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[13:00] <koollman> if I understand correctly, I have to generate a crush map based on my hardware so that I can make some choices and rules about how the data will be stored. is that correct ?
[13:03] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[13:05] <doppelgrau> koollman: you can have two different roots (one for hdds, one for ssd) and crush-rules choosing between them
[13:06] * haomaiwang (~oftc-webi@61.149.85.206) Quit (Ping timeout: 480 seconds)
[13:06] <doppelgrau> koollman: but be careful, if you create rules like "first copy on ssd, other two copies on hdds", two copies might land on one server
[13:06] <doppelgrau> koollman: if you are thinking about such rules, better to put only hdds or only ssds in one box (but the hdds with an ssd journal)
[13:07] * bene2 (~bene@2601:193:4101:f410:ea2a:eaff:fe08:3c7a) has joined #ceph
[13:07] <koollman> I'm thinking more of 'hdd-only' pool (with a partition on ssd for journaling) and 'ssd-only' pool. so I should not have the problem for now
[13:08] <doppelgrau> koollman: then simply use two different "roots"
[13:09] <koollman> although the doc here http://docs.ceph.com/docs/master/rados/operations/crush-map/ does have a ssd-primary example. But I'm guessing it could land on the same host twice ? although the type host seems to be used (I cannot yet read this map easily :) )
[13:09] <koollman> but anyway, not doing that, or at least not now :)
[13:09] <koollman> two different roots seems good and simple enough to do
[13:09] <doppelgrau> koollman: the problem is, the host has two names so ceph does not recognize it as the same host
[13:10] <koollman> oh, right
[13:10] <doppelgrau> koollman: that's the reason one system I take care of has ssd-only and hdd+ssd-journal hosts, to ensure that one host failure does not take two copies down
[13:10] <koollman> yeah, seems like a nice property to keep in mind :)
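What doppelgrau's two-roots suggestion looks like in a decompiled crushmap; a sketch with made-up bucket names, ids and weights, using pre-Luminous rule syntax:

    root hdd {
            id -10
            alg straw
            hash 0
            item node1-hdd weight 7.200
    }
    root ssd {
            id -20
            alg straw
            hash 0
            item node1-ssd weight 2.200
    }
    rule hdd-only {
            ruleset 1
            type replicated
            min_size 1
            max_size 10
            step take hdd
            step chooseleaf firstn 0 type host
            step emit
    }
    rule ssd-only {
            ruleset 2
            type replicated
            min_size 1
            max_size 10
            step take ssd
            step chooseleaf firstn 0 type host
            step emit
    }

Each pool is then pointed at one rule or the other, e.g. ceph osd pool set ssdpool crush_ruleset 2.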
[13:11] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[13:11] <doppelgrau> koollman: fast (ssd) reads, fast writes (at least for small writes) thanks to the ssd journal; IMHO a good balance for a mixed-workload backend for virtual servers
[13:11] * Kalado (~Guest1390@ip95.ip-94-23-150.eu) has joined #ceph
[13:13] <koollman> another question. I have 2x1.2TB ssds (sda and sdb) and 2x4TB hdds per node (sdc and sdd). I could either do [sdc with journal on sda, sdd with journal on sdb], or [sdc with a raid1 journal on sda+sdb, sdd with a raid1 journal on sda+sdb]
[13:13] <koollman> the first might be faster but the second might be fast enough and more resilient. any thoughts on that ?
[13:14] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[13:15] <koollman> (also I'm finding wildly differing opinions/advice on journal sizing... hard to choose a size)
[13:16] <doppelgrau> koollman: since the journal is a very write-intensive part and SSDs do not like that too much, I do not use raid; if the ssd dies, the few platters die with it, but since I can easily tolerate a whole server loss that does not bother me (and it has not happened yet, whereas platters/hdds die nearly every week)
[13:17] <doppelgrau> rule of thumb: 2 times your commit time x max expected bandwidth
[13:17] <doppelgrau> either limit of the disks or the network
[13:17] <koollman> yeah ... I'm guessing I will go with "we may have to resize that later" :)
[13:17] <doppelgrau> I'm lazy and said 8GB should be more than enough and only a small part of the ssd
[13:18] <doppelgrau> but since I have mainly small writes (4-32kb) a spinning disk only moves a few mb per second...
[13:19] <doppelgrau> (so the journal is way oversized for that)
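Doppelgrau's rule of thumb written out with assumed numbers (the 5 s filestore max sync interval is the default; 100 MB/s is a guessed per-disk bandwidth):

    # journal size >= 2 * (max expected throughput * filestore max sync interval)
    #              ~= 2 * 100 MB/s * 5 s = 1000 MB per OSD
    [osd]
    osd journal size = 8192   # MB; the ~8 GB doppelgrau settled on leaves ample headroom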
[13:19] <flaf> Hi. koollman: Imho, you should avoid raid etc. The philosophy is: your ceph cluster should tolerate disk failures, that's all.
[13:20] <flaf> (and it's probably too late, but indeed there is no problem making "ssd" pools and "hdd" pools with the crushmap).
[13:25] <koollman> thanks. I'm guessing that's enough info for now to continue my cluster design
[13:29] * Nacer (~Nacer@37.165.118.163) has joined #ceph
[13:29] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[13:34] * rraja (~rraja@121.244.87.117) has joined #ceph
[13:34] * Nacer (~Nacer@37.165.118.163) Quit (Read error: Connection reset by peer)
[13:35] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:4d98:7dea:2462:19d7) has joined #ceph
[13:39] * inevity (~androirc@107.170.0.159) has joined #ceph
[13:41] * Kalado (~Guest1390@61TAAA67H.tor-irc.dnsbl.oftc.net) Quit ()
[13:42] * Nicho1as (~nicho1as@14.52.121.20) has joined #ceph
[13:47] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[13:49] * inevity (~androirc@107.170.0.159) Quit (Remote host closed the connection)
[13:55] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[13:55] * Behedwin (~JohnO@46.166.190.208) has joined #ceph
[13:55] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[13:56] * b0e (~aledermue@213.95.25.82) has joined #ceph
[14:07] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[14:18] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[14:21] * gluco (~gluco@milhouse.imppc.org) Quit (Remote host closed the connection)
[14:23] * georgem (~Adium@24.114.59.93) has joined #ceph
[14:24] * aj__ (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[14:24] * mschiff (~mschiff@mx10.schiffbauer.net) Quit (Remote host closed the connection)
[14:24] * mschiff (~mschiff@mx10.schiffbauer.net) has joined #ceph
[14:25] * Behedwin (~JohnO@46.166.190.208) Quit ()
[14:27] * Jeffrey4l__ (~Jeffrey@119.251.128.22) Quit (Ping timeout: 480 seconds)
[14:28] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[14:36] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[14:39] * Racpatel (~Racpatel@2601:87:0:24af::53d5) has joined #ceph
[14:47] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[14:48] * Lunk2 (~dontron@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[14:51] * aj__ (~aj@fw.gkh-setu.de) has joined #ceph
[14:52] * mhack (~mhack@nat-pool-bos-u.redhat.com) has joined #ceph
[14:54] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) Quit (Ping timeout: 480 seconds)
[14:55] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[14:55] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[14:56] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) has joined #ceph
[15:00] * georgem (~Adium@24.114.59.93) Quit (Quit: Leaving.)
[15:02] * vimal (~vikumar@114.143.165.8) has joined #ceph
[15:02] * adun153 (~adun153@130.105.147.50) Quit (Quit: Ex-Chat)
[15:04] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[15:07] * rraja (~rraja@121.244.87.117) Quit (Ping timeout: 480 seconds)
[15:14] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[15:15] * salwasser (~Adium@72.246.3.14) has joined #ceph
[15:17] * rraja (~rraja@121.244.87.118) has joined #ceph
[15:17] * Lunk2 (~dontron@61TAAA699.tor-irc.dnsbl.oftc.net) Quit ()
[15:19] * georgem (~Adium@206.108.127.16) has joined #ceph
[15:19] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[15:22] * Doodlepieguy (~clusterfu@ns316491.ip-37-187-129.eu) has joined #ceph
[15:22] * gregmark (~Adium@68.87.42.115) has joined #ceph
[15:23] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) has joined #ceph
[15:27] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[15:28] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[15:28] * aj__ (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[15:29] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[15:30] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Remote host closed the connection)
[15:39] * rraja (~rraja@121.244.87.118) Quit (Quit: Leaving)
[15:40] * roeland (~roeland@87.215.30.74) has joined #ceph
[15:40] * rraja (~rraja@121.244.87.117) has joined #ceph
[15:41] * Jeffrey4l__ (~Jeffrey@119.251.128.22) has joined #ceph
[15:43] * Jeffrey4l__ (~Jeffrey@119.251.128.22) Quit (Read error: Connection reset by peer)
[15:43] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[15:43] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[15:43] <roeland> anyone who has ceph connected as a datastore on opennebula by any chance?
[15:44] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[15:44] <roeland> I have it working fine on the CLI but the UI itself keeps showing a zero-byte datastore. A search hit from 2013 or so suggests disabling cephx auth, but it's 2016, so I'd rather not go down that route.
[15:45] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[15:45] <wes_dillingham> from what I understand the object-map is an in-memory table of object locations in the cluster and is a performance improvement, since the (typically) client can load the object map into memory and doesn't have to use crush when computing each object's location; is this an accurate description of object-map?
[15:47] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[15:48] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[15:49] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Remote host closed the connection)
[15:52] * Doodlepieguy (~clusterfu@26XAAAXRZ.tor-irc.dnsbl.oftc.net) Quit ()
[15:53] <wes_dillingham> Does the object map need to be rebuilt periodically as the rbd device changes and snapshots are taken, or is there a built-in auto-update of the object map for constantly changing rbd devices?
[15:54] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[15:54] <jdillaman> wes_dillingham: it would only need to be rebuilt if it is flagged as invalid (which shouldn't happen)
[15:54] <jdillaman> wes_dillingham: ... or if you dynamically enabled object map / fast-diff on an image
[15:54] * haomaiwang (~oftc-webi@61.148.242.76) has joined #ceph
[15:55] <jdillaman> wes_dillingham: as the writes/discards hit the image, the object map is updated. when snapshots are created, an associated snapshot version of the object map is created
[15:55] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Quit: Ex-Chat)
[15:55] <SamYaple> wes_dillingham: it's not only an in-memory table either. it is used to know _if_ an object has been allocated, and where, since rbds are created sparse
[15:55] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[15:56] <jdillaman> wes_dillingham: the crush map is still required to lookup the PGs for an object, but if librbd knows the object doesn't exist (as per the object map), it can optimize away the unnecessary IO
[15:58] <wes_dillingham> jdillaman: SamYaple: Thanks for the info, that is useful.
[15:58] * mhackett (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[15:58] <wes_dillingham> When attempting to take a snapshot recently I got 2016-08-08 11:07:25.642294 7f0aaf6fd700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.c171652eb141f2
[15:58] <wes_dillingham> 2016-08-08 11:07:25.642583 7f0aaf6fd700 -1 librbd::object_map::InvalidateRequest: 0x7f0a8c00d3c0 invalidating object map in-memory
[15:58] <wes_dillingham> 2016-08-08 11:07:25.642635 7f0aaeefc700 -1 librbd::object_map::InvalidateRequest: 0x7f0a8c00d3c0 should_complete: r=0
[15:58] <wes_dillingham> 2016-08-08 11:07:26.744934 7f0aaf6fd700 -1 librbd::object_map::Request: failed to update object map: (2) No such file or directory
[15:58] <wes_dillingham> Mon Aug 8 11:07:26 EDT 2016: Snapshot: 1ss_root_disk/one-76-1067-0@2016-08-08 creation completed
[15:59] <wes_dillingham> I see that it ultimately works, but it appears the object map is busted for this vm
[15:59] <jdillaman> wes_dillingham: was the object map dynamically enabled on the image after that snapshot was created?
[15:59] <wes_dillingham> no, its always been enabled
[16:00] <jdillaman> wes_dillingham: odd -- well, it definitely did the correct thing when it failed to locate the object map on disk, but not sure why you don't have an object map to begin with if you didn't dynamically enable it
[16:01] * thomnico (~thomnico@2a01:e35:8b41:120:1ced:c62b:3189:98e5) has joined #ceph
[16:01] <wes_dillingham> O, i thought I did have it, it was just not valid...
[16:01] <jdillaman> wes_dillingham: if you can figure out some steps that occurred to make that happen, i'd love to fix it
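If an image's object map does end up flagged invalid, as in wes_dillingham's pasted log, it can be rebuilt and re-checked; a sketch using the image name that appears in that snapshot log:

    rbd object-map rebuild 1ss_root_disk/one-76-1067-0
    rbd info 1ss_root_disk/one-76-1067-0    # the 'flags:' line should no longer show 'object map invalid'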
[16:02] * mhack (~mhack@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[16:02] * dnunez (~dnunez@nat-pool-bos-u.redhat.com) has joined #ceph
[16:02] * karnan (~karnan@2405:204:5502:b48e:3602:86ff:fe56:55ae) Quit (Quit: Leaving)
[16:02] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[16:05] <wes_dillingham> I might not have it, I presume the object-map object has the words "object-map" in it?
[16:05] * brannmar (~Arcturus@104.156.228.156) has joined #ceph
[16:06] * srk (~Siva@2605:6000:ed04:ce00:486f:83dc:c391:239f) has joined #ceph
[16:06] <wes_dillingham> We tried to launch 1000s of vms at once as a test and hit a ulimit issue on our hypervisors, and so I believe many things didn't finish, possibly the creation of the object map being one of them
[16:07] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:10] * dnunez (~dnunez@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[16:11] * lcurtis (~lcurtis@47.19.105.250) has joined #ceph
[16:13] * thomnico (~thomnico@2a01:e35:8b41:120:1ced:c62b:3189:98e5) Quit (Quit: Ex-Chat)
[16:14] * srk (~Siva@2605:6000:ed04:ce00:486f:83dc:c391:239f) Quit (Ping timeout: 480 seconds)
[16:17] * swami1 (~swami@49.38.1.162) Quit (Quit: Leaving.)
[16:19] * roeland (~roeland@87.215.30.74) Quit (Quit: Konversation terminated!)
[16:20] <wes_dillingham> jdillaman: I do have some object_map objects related to that device (5 objects)
[16:21] <jdillaman> wes_dillingham: do you have 4 snapshots?
[16:21] <wes_dillingham> yes
[16:22] <jdillaman> wes_dillingham: k -- that initial load failure would have created a dummy object after flagging the object map as invalid
[16:22] * jdillaman (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) has left #ceph
[16:22] * jdillaman (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) has joined #ceph
[16:23] * dnunez (~dnunez@nat-pool-bos-t.redhat.com) has joined #ceph
[16:23] <wes_dillingham> gotcha
[16:24] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[16:30] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:31] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:32] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[16:35] * brannmar (~Arcturus@104.156.228.156) Quit ()
[16:38] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[16:38] * vimal (~vikumar@114.143.165.8) Quit (Quit: Leaving)
[16:42] * rakeshgm (~rakesh@121.244.87.118) has joined #ceph
[16:42] * Jeffrey4l (~Jeffrey@119.251.128.22) has joined #ceph
[16:44] * andreww (~xarses@64.124.158.192) has joined #ceph
[16:44] * kefu (~kefu@114.92.96.253) has joined #ceph
[16:45] * joshd1 (~jdurgin@2602:30a:c089:2b0:1549:559d:48b2:b046) has joined #ceph
[16:54] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[16:57] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Ping timeout: 480 seconds)
[17:00] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[17:03] * srk (~Siva@32.97.110.50) has joined #ceph
[17:04] * yanzheng (~zhyan@125.70.20.176) Quit (Quit: This computer has gone to sleep)
[17:06] * kefu (~kefu@114.92.96.253) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[17:07] * wushudoin (~wushudoin@2601:646:8281:cfd:2ab2:bdff:fe0b:a6ee) has joined #ceph
[17:10] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:10] * kefu (~kefu@114.92.96.253) has joined #ceph
[17:12] * haomaiwang (~oftc-webi@61.148.242.76) Quit (Ping timeout: 480 seconds)
[17:14] * rdias (~rdias@2001:8a0:749a:d01:938:13ff:6e72:610b) Quit (Remote host closed the connection)
[17:14] * TMM (~hp@185.5.121.201) Quit (Quit: Ex-Chat)
[17:22] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) has joined #ceph
[17:29] * ntpttr_ (~ntpttr@134.134.139.78) has joined #ceph
[17:32] * rmart04 (~rmart04@support.memset.com) Quit (Quit: rmart04)
[17:32] * danieagle (~Daniel@201-69-183-143.dial-up.telesp.net.br) has joined #ceph
[17:33] * rmart04 (~rmart04@5.153.255.226) has joined #ceph
[17:36] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:37] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) Quit (Quit: Leaving.)
[17:37] * tsg (~tgohad@134.134.139.83) has joined #ceph
[17:39] * art_yo (~kvirc@149.126.169.197) has joined #ceph
[17:40] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Remote host closed the connection)
[17:42] * rmart04 (~rmart04@5.153.255.226) Quit (Ping timeout: 480 seconds)
[17:44] * kefu (~kefu@114.92.96.253) Quit (Ping timeout: 480 seconds)
[17:45] * rakeshgm (~rakesh@121.244.87.118) Quit (Ping timeout: 480 seconds)
[17:48] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[17:50] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys!)
[17:51] * ade (~abradshaw@tmo-100-60.customers.d1-online.com) Quit (Read error: Connection reset by peer)
[17:53] * swami1 (~swami@27.7.172.255) has joined #ceph
[17:54] * bara (~bara@213.175.37.12) has joined #ceph
[17:54] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[17:56] * efirs (~firs@5.128.174.86) has joined #ceph
[17:56] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[18:07] * kefu (~kefu@114.92.96.253) has joined #ceph
[18:10] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[18:19] * Knuckx (~Shnaw@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[18:21] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Quit: wes_dillingham)
[18:23] * swami1 (~swami@27.7.172.255) Quit (Quit: Leaving.)
[18:24] * efirs (~firs@5.128.174.86) Quit (Ping timeout: 480 seconds)
[18:25] * aNupoisc (~adnavare@fmdmzpr03-ext.fm.intel.com) has joined #ceph
[18:26] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Ping timeout: 480 seconds)
[18:28] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[18:29] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[18:30] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) has joined #ceph
[18:34] * vbellur (~vijay@71.234.224.255) Quit (Ping timeout: 480 seconds)
[18:36] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[18:41] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[18:41] * joshd1 (~jdurgin@2602:30a:c089:2b0:1549:559d:48b2:b046) Quit (Quit: Leaving.)
[18:44] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[18:44] * davidzlap1 (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Quit: Leaving.)
[18:46] * diq (~diq@2620:11c:f:2:c23f:d5ff:fe62:112c) has joined #ceph
[18:49] * Knuckx (~Shnaw@61TAAA7G1.tor-irc.dnsbl.oftc.net) Quit ()
[18:50] * bara (~bara@213.175.37.12) Quit (Quit: Bye guys!)
[18:52] * blizzow (~jburns@50.243.148.102) has joined #ceph
[18:52] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[18:56] * ntpttr__ (~ntpttr@134.134.139.78) has joined #ceph
[18:56] * ntpttr_ (~ntpttr@134.134.139.78) Quit (Remote host closed the connection)
[18:59] * pdrakewe_ (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Ping timeout: 480 seconds)
[19:00] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) has joined #ceph
[19:05] * kefu (~kefu@114.92.96.253) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[19:09] * cathode (~cathode@50.232.215.114) has joined #ceph
[19:17] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) Quit (Quit: Leaving.)
[19:19] * davidzlap (~Adium@2605:e000:1313:8003:688e:a2d8:f0a8:493b) has joined #ceph
[19:20] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) Quit (Quit: Leaving.)
[19:26] * swami1 (~swami@27.7.172.255) has joined #ceph
[19:28] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:4d98:7dea:2462:19d7) Quit (Ping timeout: 480 seconds)
[19:36] * mhackett (~mhack@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[19:36] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[19:36] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Read error: Connection reset by peer)
[19:43] * Kidlvr (~Lunk2@torrelay4.tomhek.net) has joined #ceph
[19:44] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[19:47] * mhackett (~mhack@nat-pool-bos-u.redhat.com) has joined #ceph
[19:47] * swami1 (~swami@27.7.172.255) Quit (Quit: Leaving.)
[19:51] * rakeshgm (~rakesh@106.51.29.33) has joined #ceph
[19:52] * vbellur (~vijay@nat-pool-bos-t.redhat.com) has joined #ceph
[19:54] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[19:57] * prpplague (~David@107-206-67-71.lightspeed.rcsntx.sbcglobal.net) has joined #ceph
[19:57] * krypto (~krypto@106.51.25.113) has joined #ceph
[19:58] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[19:58] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[20:03] * mhack (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[20:04] * krypto (~krypto@106.51.25.113) Quit (Quit: Leaving)
[20:09] * mhackett (~mhack@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[20:13] * Kidlvr (~Lunk2@9YSAAA8JV.tor-irc.dnsbl.oftc.net) Quit ()
[20:15] * aNupoisc (~adnavare@fmdmzpr03-ext.fm.intel.com) has left #ceph
[20:19] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:27] * wogri_ (~wolf@nix.wogri.at) has joined #ceph
[20:29] * wogri (~wolf@nix.wogri.at) Quit (Ping timeout: 480 seconds)
[20:33] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[20:35] * GuntherDW1 (~notarima@46.166.138.150) has joined #ceph
[20:36] * tsg (~tgohad@134.134.139.83) Quit (Remote host closed the connection)
[20:37] * nathani (~nathani@2607:f2f8:ac88::) Quit (Ping timeout: 480 seconds)
[20:38] * nathani (~nathani@2607:f2f8:ac88::) has joined #ceph
[20:38] * tsg (~tgohad@134.134.139.83) has joined #ceph
[20:44] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[20:45] * rakeshgm (~rakesh@106.51.29.33) Quit (Quit: Leaving)
[20:46] * bitserker (~toni@81.184.9.72.dyn.user.ono.com) has joined #ceph
[20:54] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) has joined #ceph
[20:58] * bitserker (~toni@81.184.9.72.dyn.user.ono.com) Quit (Ping timeout: 480 seconds)
[21:03] * wogri_ (~wolf@nix.wogri.at) Quit (Ping timeout: 480 seconds)
[21:05] * GuntherDW1 (~notarima@46.166.138.150) Quit ()
[21:06] * roaet (~aleksag@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[21:08] * t4nk852 (~oftc-webi@pubip.ny.tower-research.com) has joined #ceph
[21:09] * rendar (~I@host222-180-dynamic.12-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[21:10] * mykola (~Mikolaj@91.245.79.221) has joined #ceph
[21:12] * ntpttr__ (~ntpttr@134.134.139.78) Quit (Remote host closed the connection)
[21:12] * ntpttr_ (~ntpttr@192.55.54.40) has joined #ceph
[21:14] * Psi-Jack (~psi-jack@mx.linux-help.org) Quit (Quit: Where'd my terminal go?)
[21:16] <walbuci> Ah, documentation is wrong for pool creation
[21:16] <walbuci> ceph osd pool create rbd 256 256 replicated replicated_ruleset expected_num_objects 2147483648
[21:16] <walbuci> It doesn't say that you need to have "expected_num_objects". Only <int>
[21:17] * Psi-Jack (~psi-jack@mx.linux-help.org) has joined #ceph
[21:20] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[21:22] * wogri (~wolf@nix.wogri.at) has joined #ceph
[21:22] * wogri (~wolf@nix.wogri.at) Quit ()
[21:22] * wogri (~wolf@nix.wogri.at) has joined #ceph
[21:22] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[21:22] * wogri (~wolf@nix.wogri.at) Quit ()
[21:22] * mhack (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[21:22] * wogri (~wolf@nix.wogri.at) has joined #ceph
[21:23] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[21:29] * ntpttr_ (~ntpttr@192.55.54.40) Quit (Remote host closed the connection)
[21:29] * ntpttr__ (~ntpttr@134.134.139.78) has joined #ceph
[21:30] * carter (~carter@li98-136.members.linode.com) Quit (Quit: ZNC - http://znc.in)
[21:30] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[21:32] * essjayhch (sid79416@id-79416.highgate.irccloud.com) Quit (Quit: Connection closed for inactivity)
[21:33] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[21:35] * rendar (~I@host222-180-dynamic.12-79-r.retail.telecomitalia.it) has joined #ceph
[21:35] * roaet (~aleksag@26XAAAX3E.tor-irc.dnsbl.oftc.net) Quit ()
[21:36] * carter (~carter@li98-136.members.linode.com) has joined #ceph
[21:40] * srk (~Siva@32.97.110.50) Quit (Ping timeout: 480 seconds)
[21:41] * puffy (~puffy@c-71-198-18-187.hsd1.ca.comcast.net) has joined #ceph
[21:42] * puffy (~puffy@c-71-198-18-187.hsd1.ca.comcast.net) Quit ()
[21:42] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[21:46] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[21:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[21:48] * ntpttr__ (~ntpttr@134.134.139.78) Quit (Remote host closed the connection)
[21:51] * rorr (~rorr@45.73.146.238) has joined #ceph
[21:51] * rorr (~rorr@45.73.146.238) has left #ceph
[21:51] * oarra (~rorr@45.73.146.238) has joined #ceph
[21:52] <oarra> hi, anyone around that can help with a ceph-mds crashing issue?
[21:52] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[21:53] * hrast (~hrast@cpe-24-55-26-86.austin.res.rr.com) has joined #ceph
[21:54] * srk (~Siva@32.97.110.51) has joined #ceph
[21:55] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) has joined #ceph
[21:55] * Hemanth (~hkumar_@103.228.221.131) has joined #ceph
[21:59] * mgolub (~Mikolaj@91.245.73.133) has joined #ceph
[22:00] * carter (~carter@li98-136.members.linode.com) Quit (Quit: ZNC - http://znc.in)
[22:02] * srk (~Siva@32.97.110.51) Quit (Ping timeout: 480 seconds)
[22:03] <gregsfortytwo> oarra: probably best to put it on the mailing list
[22:03] * mykola (~Mikolaj@91.245.79.221) Quit (Ping timeout: 480 seconds)
[22:04] <oarra> gregsfortytwo: thanks will do
[22:05] <blizzow> I've created a rados gateway and want to use it to allow people to store files in s3 style buckets using s3cmd. I've successfully run the s3test.py script from here http://docs.ceph.com/docs/jewel/install/install-ceph-gateway/
[22:05] <blizzow> I still can't get my s3 client to work though. Do I need to create a swift subuser?
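For plain S3 access with s3cmd a Swift subuser shouldn't be needed; the access/secret key pair printed when a regular RGW user is created is what s3cmd expects. A sketch with a hypothetical uid and display name:

    radosgw-admin user create --uid=janedoe --display-name="Jane Doe"
    # copy the printed access_key / secret_key into ~/.s3cfg and point
    # host_base / host_bucket at the gateway's hostname:port
    s3cmd --configure
    s3cmd mb s3://test-bucket   # quick sanity check against the gateway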
[22:05] * mlupton (2d1fb434@107.161.19.109) has joined #ceph
[22:05] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[22:06] * srk (~Siva@32.97.110.51) has joined #ceph
[22:06] <mlupton> Hi everyone, I'm having a peculiar problem with two of my OSDs. Cluster health is OK, but two of my OSDs on one of my ceph nodes are down.
[22:07] * carter (~carter@li98-136.members.linode.com) has joined #ceph
[22:09] <mlupton> logs on the offending ceph node have the following error:
[22:09] <mlupton> ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-2
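The "unable to open OSD superblock" error usually means the OSD's data directory is not the mounted, readable filesystem the daemon expects. A few first checks one might run on the affected node (OSD id 2 taken from the path above):

    mount | grep /var/lib/ceph/osd/ceph-2   # is the data partition actually mounted here?
    ls /var/lib/ceph/osd/ceph-2             # a healthy FileStore dir shows superblock, whoami, current/, ...
    dmesg | tail -n 50                      # any block-device or filesystem errors underneath?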
[22:09] * wogri (~wolf@nix.wogri.at) Quit (Quit: Lost terminal)
[22:09] * wogri (~wolf@nix.wogri.at) has joined #ceph
[22:09] * wogri (~wolf@nix.wogri.at) Quit ()
[22:09] * wogri (~wolf@nix.wogri.at) has joined #ceph
[22:09] * ntpttr_ (~ntpttr@134.134.139.77) has joined #ceph
[22:10] * ntpttr_ (~ntpttr@134.134.139.77) Quit ()
[22:12] <blizzow> are your drives okay?
[22:12] <mlupton> lsblk shows that both osd are mounted to the appropriate devices as well.
[22:12] <blizzow> are you running xfs?
[22:12] <mlupton> vdb 253:16 0 10G 0 disk
[22:12] <mlupton> ├─vdb1 253:17 0 9G 0 part /var/lib/ceph/osd/ceph-2
[22:12] <mlupton> └─vdb2 253:18 0 999M 0 part
[22:12] <mlupton> vdc 253:32 0 10G 0 disk
[22:12] <mlupton> ├─vdc1 253:33 0 9G 0 part /var/lib/ceph/osd/ceph-3
[22:12] <mlupton> └─vdc2 253:34 0 999M 0 part
[22:12] <mlupton> vdb 253:16 0 10G 0 disk
[22:13] <mlupton> ├─vdb1 253:17 0 9G 0 part /var/lib/ceph/osd/ceph-2
[22:13] <mlupton> └─vdb2 253:18 0 999M 0 part
[22:13] <mlupton> vdc 253:32 0 10G 0 disk
[22:13] <mlupton> ├─vdc1 253:33 0 9G 0 part /var/lib/ceph/osd/ceph-3
[22:13] <mlupton> └─vdc2 253:34 0 999M 0 part
[22:13] <mlupton> Whoops didn't mean to paste twice
[22:13] <mlupton> Both of these should be ext4
[22:13] <blizzow> Have you unmounted and run a fsck on them?
[22:13] <mlupton> Actually they are xfs
[22:14] <mlupton> I'll go and do that now
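A sketch of the check being suggested, assuming systemd-managed OSDs and that ceph-2 sits on /dev/vdb1 as in the lsblk paste; the -n flag keeps xfs_repair read-only:

    systemctl stop ceph-osd@2         # make sure the daemon is not holding the filesystem
    umount /var/lib/ceph/osd/ceph-2
    xfs_repair -n /dev/vdb1           # no-modify mode: report problems without writing to the disk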
[22:15] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Remote host closed the connection)
[22:16] * mhack (~mhack@nat-pool-bos-u.redhat.com) has joined #ceph
[22:18] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:20] * art_yo (~kvirc@149.126.169.197) Quit (Read error: Connection reset by peer)
[22:20] * art_yo (~kvirc@149.126.169.197) has joined #ceph
[22:21] * XMVD (~Xmd@78.85.35.236) has joined #ceph
[22:23] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Quit: wes_dillingham)
[22:24] <mlupton> "bad primary supeblock - bad magic number !!!" on /dev/vdb
[22:24] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[22:24] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) Quit (Read error: Connection reset by peer)
[22:24] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) has joined #ceph
[22:24] <mlupton> Sounds pretty terrifying
[22:25] * Hemanth (~hkumar_@103.228.221.131) Quit (Quit: Leaving)
[22:25] <blizzow> mlupton: are you doing an xfs_repair on /dev/vdb or /dev/vdb1?
[22:26] <mlupton> I was doing it on /dev/vdb, using the -n option
[22:27] <mlupton> Do I need to temporarily remove this node from the cluster in order to run xfs_repair?
[22:27] <mlupton> I've manually unmounted /dev/vdb1 and /dev/vdc1 already
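The "bad magic number" result above is what xfs_repair reports when pointed at the whole disk: /dev/vdb carries only the partition table, while the XFS filesystem lives on /dev/vdb1. A sketch of checking the partitions themselves (device names from the lsblk paste; the dry run should target vdb1/vdc1, not vdb/vdc):

    blkid /dev/vdb1 /dev/vdb2 /dev/vdc1 /dev/vdc2   # confirm which partitions actually carry a filesystem
    xfs_repair -n /dev/vdb1                         # read-only check of the OSD data partitions
    xfs_repair -n /dev/vdc1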
[22:27] * mhackett (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[22:28] <blizzow> you may want to manually check those partitions in particular. Make sure xfs_repair isn't returning errors when checking the journal partition.
[22:28] <blizzow> Do you have smartctl checking on those drives?
[22:28] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) Quit (Quit: Leaving.)
[22:28] * Xmd (~Xmd@78.85.35.236) Quit (Ping timeout: 480 seconds)
[22:30] <mlupton> I'm not using smartctl as far as I know. I haven't been able to run xfs_repair on /dev/vdb1 and /dev/vdc1 as they are apparently still mounted, even though I manually unmounted them.
[22:32] * mgolub (~Mikolaj@91.245.73.133) Quit (Quit: away)
[22:32] <blizzow> Doubt it will work but, you could try: umount -l /dev/vdX1 then run partprobe
[22:33] * mhack (~mhack@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[22:34] * Jeffrey4l_ (~Jeffrey@110.252.71.193) has joined #ceph
[22:36] <SamYaple> blizzow: if something's using it, -l won't free it up
[22:36] <SamYaple> i mean the dev itself. it will unmount it
[22:36] <SamYaple> but that does little good
[22:37] <SamYaple> mlupton: are they getting remounted because of udev rules or some such?
[22:37] <mlupton> I don't believe so.
[22:38] * dnunez (~dnunez@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[22:38] * Jeffrey4l (~Jeffrey@119.251.128.22) Quit (Ping timeout: 480 seconds)
[22:38] <mlupton> I deployed and activated the nodes all the same way using xargs; so there shouldn't be any special rules applying to this particular node
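One way to see what is still holding the mount or the device (a leftover ceph-osd process, or ceph's udev/ceph-disk activation rules re-mounting the partition, are common culprits); paths and OSD id as above:

    fuser -vm /var/lib/ceph/osd/ceph-2   # processes still using the mount point
    lsof /dev/vdb1                       # anything with the block device open
    systemctl status ceph-osd@2          # is the OSD daemon still running or being restarted?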
[22:40] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[22:46] <mlupton> It's also worth noting that I'm co-locating OSDs and Mons as this is a test environment
[22:46] * tsg (~tgohad@134.134.139.83) Quit (Remote host closed the connection)
[22:49] * analbeard (~shw@support.memset.com) has joined #ceph
[22:51] * Miouge (~Miouge@109.128.94.173) Quit (Quit: Miouge)
[22:56] * hrast (~hrast@cpe-24-55-26-86.austin.res.rr.com) Quit (Quit: hrast)
[22:57] * georgem (~Adium@24.114.49.128) has joined #ceph
[22:58] * georgem (~Adium@24.114.49.128) Quit ()
[22:58] * georgem (~Adium@206.108.127.16) has joined #ceph
[23:00] <blizzow> watching ceph -w I'm suddenly seeing lots of "scrub starts", "scrub ok", "deep-scrub starts", and
[23:00] <blizzow> "deep-scrub ok" messages, is that okay?
[23:00] <blizzow> I've never seen them before while watching ceph output.
[23:01] * isaxi (~TGF@108.61.123.88) has joined #ceph
[23:02] <SamYaple> blizzow: 100% normal
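Scrubs are ceph's routine background consistency checks, so lines like these in ceph -w are expected. A sketch of keeping an eye on them (osd.0 is a placeholder; the option names assume a Jewel-era cluster):

    ceph -s                                      # overall health summary while scrubs run
    ceph pg dump | grep -c scrubbing             # how many PGs are scrubbing or deep-scrubbing right now
    ceph daemon osd.0 config show | grep scrub   # scrub intervals and limits (run on the host holding osd.0)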
[23:03] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Remote host closed the connection)
[23:04] * salwasser (~Adium@72.246.3.14) Quit (Quit: Leaving.)
[23:11] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[23:14] * mattbenjamin (~mbenjamin@12.118.3.106) Quit (Ping timeout: 480 seconds)
[23:15] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[23:16] * rmart04 (~rmart04@support.memset.com) Quit (Quit: rmart04)
[23:22] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) Quit (Quit: Leaving.)
[23:24] * davidzlap (~Adium@2605:e000:1313:8003:688e:a2d8:f0a8:493b) Quit (Quit: Leaving.)
[23:25] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[23:26] * mlupton (2d1fb434@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[23:27] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[23:32] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) has joined #ceph
[23:32] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit ()
[23:32] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[23:32] * davidzlap (~Adium@2605:e000:1313:8003:7cc8:a6be:b851:6099) has joined #ceph
[23:35] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) Quit ()
[23:38] * isaxi (~TGF@108.61.123.88) Quit (Ping timeout: 480 seconds)
[23:39] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[23:39] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[23:40] * cathode (~cathode@50.232.215.114) Quit (Quit: Leaving)
[23:42] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[23:44] * davidzlap (~Adium@2605:e000:1313:8003:7cc8:a6be:b851:6099) Quit (Quit: Leaving.)
[23:44] * oliveiradan (~doliveira@137.65.133.10) has joined #ceph
[23:45] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[23:45] * vbellur (~vijay@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[23:45] * oliveiradan_ (~doliveira@137.65.133.10) Quit (Ping timeout: 480 seconds)
[23:51] <blizzow> Is ceph node aware? I have one node with 8x4TB OSDs and my other nodes have 4x4TB OSDs. I also have a few nodes that have 4x1TB OSDs. I'm worried that if I lose/reboot my node with 8x4TB OSDs, I'll get hosed. I know I can manually reweight the CRUSH algorithm, but I'd rather not interfere with ceph, if it's smart.
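On the node-awareness question: with the default CRUSH map each host is a bucket and the default rule places replicas on distinct hosts, with data spread in proportion to CRUSH weight, so the 8x4TB node simply holds roughly twice as much as a 4x4TB node; whether losing it hurts depends on the pool's replica count and the remaining capacity. A sketch of verifying the layout (pool name "rbd" is a placeholder):

    ceph osd tree                # host buckets and per-OSD CRUSH weights
    ceph osd crush rule dump     # look for a chooseleaf step with "type": "host"
    ceph osd pool get rbd size   # replica count for the pool in question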
[23:57] * Unai (~Adium@50-115-70-150.static-ip.telepacific.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.