#ceph IRC Log

IRC Log for 2016-07-15

Timestamps are in GMT/BST.

[0:01] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[0:05] * rendar (~I@host83-178-dynamic.251-95-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[0:06] * vata (~vata@207.96.182.162) Quit (Quit: Leaving.)
[0:06] * overload (~oc-lram@79.108.113.172.dyn.user.ono.com) has joined #ceph
[0:06] <overload> hi
[0:08] <overload> I've a cluster with 5 osd's nodes, if i reboot one of them, everything freeze and i get this messages: cluster [WRN] slow request 30.625102 seconds old, received at 2016-07-14 19:38:37.371697: osd_op(client.593241.0:3283315 3.d8215fdb (undecoded) ondisk+write+known_if_redirected e11433) currently waiting for active
[0:08] <overload> what am i doing wrong?
[0:09] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Ping timeout: 480 seconds)
[0:13] * Discovery (~Discovery@109.235.52.9) Quit (Read error: Connection reset by peer)
[0:17] * squizzi_ (~squizzi@2001:420:2240:1268:a0b7:f4b7:490:2105) Quit (Ping timeout: 480 seconds)
[0:19] * Spessu (~Morde@93.115.95.207) has joined #ceph
[0:20] * dougf (~dougf@96-38-99-179.dhcp.jcsn.tn.charter.com) has joined #ceph
[0:23] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[0:30] * gillesMo (~gillesMo@00012912.user.oftc.net) Quit (Remote host closed the connection)
[0:32] * willi (~willi@p200300774E3708FC9D08C6E0F20E68B7.dip0.t-ipconnect.de) has joined #ceph
[0:37] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[0:40] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[0:41] * willi (~willi@p200300774E3708FC9D08C6E0F20E68B7.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[0:45] * debian112 (~bcolbert@64.235.157.198) Quit (Read error: Connection reset by peer)
[0:46] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[0:46] * debian112 (~bcolbert@64.235.157.198) has joined #ceph
[0:46] * nils_ (~nils_@doomstreet.collins.kg) Quit ()
[0:46] * vata (~vata@cable-173.246.3-246.ebox.ca) has joined #ceph
[0:48] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[0:49] * Spessu (~Morde@61TAAAKRB.tor-irc.dnsbl.oftc.net) Quit ()
[0:51] * moon (~moon@217-19-26-201.dsl.cambrium.nl) Quit (Ping timeout: 480 seconds)
[0:54] * DJComet (~hyst@89.34.237.12) has joined #ceph
[0:54] * moon (~moon@217-19-26-201.dsl.cambrium.nl) has joined #ceph
[0:55] * jermudgeon_ (~jhaustin@gw1.ttp.biz.whitestone.link) has joined #ceph
[1:00] * jermudgeon (~jhaustin@gw1.ttp.biz.whitestone.link) Quit (Ping timeout: 480 seconds)
[1:01] * jermudgeon_ is now known as jermudgeon
[1:02] * xarses (~xarses@64.124.158.100) Quit (Ping timeout: 480 seconds)
[1:06] * theTrav (~theTrav@CPE-124-188-218-238.sfcz1.cht.bigpond.net.au) Quit (Remote host closed the connection)
[1:09] * moon (~moon@217-19-26-201.dsl.cambrium.nl) Quit (Ping timeout: 480 seconds)
[1:12] * kuku (~kuku@119.93.91.136) has joined #ceph
[1:13] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[1:16] * fsimonce (~simon@host99-64-dynamic.27-79-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[1:16] * oms101 (~oms101@p20030057EA2D3E00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:24] * DJComet (~hyst@7EXAAAH42.tor-irc.dnsbl.oftc.net) Quit ()
[1:25] * oms101 (~oms101@p20030057EA731F00C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[1:33] * motk (~motk@2600:3c00::f03c:91ff:fe98:51ee) Quit (Remote host closed the connection)
[1:36] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[1:43] * krypto (~krypto@G68-121-13-32.sbcis.sbc.com) Quit (Ping timeout: 480 seconds)
[1:44] * jermudgeon_ (~jhaustin@gw1.ttp.biz.whitestone.link) has joined #ceph
[1:44] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[1:48] * jermudgeon (~jhaustin@gw1.ttp.biz.whitestone.link) Quit (Ping timeout: 480 seconds)
[1:48] * jermudgeon_ is now known as jermudgeon
[1:54] * sardonyx (~dicko@109.236.90.209) has joined #ceph
[1:57] * penguinRaider (~KiKo@204.152.207.173) Quit (Ping timeout: 480 seconds)
[1:58] * zero_shane (~textual@c-73-231-84-106.hsd1.ca.comcast.net) has joined #ceph
[1:59] * harbie (~notroot@2a01:4f8:211:2344:0:dead:beef:1) Quit (Ping timeout: 480 seconds)
[2:00] * mattbenjamin (~mbenjamin@12.118.3.106) Quit (Ping timeout: 480 seconds)
[2:01] * debian112 (~bcolbert@64.235.157.198) Quit (Quit: Leaving.)
[2:01] * DLange (~DLange@dlange.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:01] * debian112 (~bcolbert@64.235.157.198) has joined #ceph
[2:06] * penguinRaider (~KiKo@146.185.31.226) has joined #ceph
[2:06] * x86_g (~pedrox86@0001cc00.user.oftc.net) has joined #ceph
[2:06] * DLange (~DLange@dlange.user.oftc.net) has joined #ceph
[2:07] * borei (~dan@216.13.217.230) Quit (Ping timeout: 480 seconds)
[2:07] * harbie (~notroot@2a01:4f8:211:2344:0:dead:beef:1) has joined #ceph
[2:09] * theTrav (~theTrav@ipc032.ipc.telstra.net) has joined #ceph
[2:11] * reed (~reed@216.38.134.18) Quit (Ping timeout: 480 seconds)
[2:13] * debian112 (~bcolbert@64.235.157.198) Quit (Remote host closed the connection)
[2:13] * blizzow (~jburns@50.243.148.102) Quit (Ping timeout: 480 seconds)
[2:13] * debian112 (~bcolbert@64.235.157.198) has joined #ceph
[2:21] * wushudoin (~wushudoin@38.99.12.237) Quit (Ping timeout: 480 seconds)
[2:24] * sardonyx (~dicko@9YSAAAL3N.tor-irc.dnsbl.oftc.net) Quit ()
[2:24] * kiasyn (~Nijikokun@65.19.167.132) has joined #ceph
[2:28] * noahw (~noahw@eduroam-169-233-234-163.ucsc.edu) Quit (Ping timeout: 480 seconds)
[2:33] * willi (~willi@p200300774E3708FC9D08C6E0F20E68B7.dip0.t-ipconnect.de) has joined #ceph
[2:35] * CephFan1 (~textual@173-171-133-163.res.bhn.net) has joined #ceph
[2:42] * willi (~willi@p200300774E3708FC9D08C6E0F20E68B7.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[2:46] * Hazelesque (~hazel@phobos.hazelesque.uk) Quit (Ping timeout: 480 seconds)
[2:46] * aiicore (~aiicore@s30.linuxpl.com) Quit (Ping timeout: 480 seconds)
[2:47] * davidz (~davidz@2605:e000:1313:8003:f85b:1c82:1c0c:9737) Quit (Quit: Leaving.)
[2:47] * brians__ (~brian@80.111.114.175) has joined #ceph
[2:52] * Hazelesque (~hazel@phobos.hazelesque.uk) has joined #ceph
[2:53] * brians (~brian@80.111.114.175) Quit (Ping timeout: 480 seconds)
[2:54] * kiasyn (~Nijikokun@61TAAAKWJ.tor-irc.dnsbl.oftc.net) Quit ()
[2:54] * cmrn (~Inuyasha@hessel2.torservers.net) has joined #ceph
[2:55] * aiicore (~aiicore@s30.linuxpl.com) has joined #ceph
[2:58] * praveen (~praveen@122.171.81.192) Quit (Ping timeout: 480 seconds)
[2:58] * dnunez (~dnunez@c-73-38-0-185.hsd1.ma.comcast.net) has joined #ceph
[3:06] * yanzheng (~zhyan@125.70.22.67) has joined #ceph
[3:16] * x86_g (~pedrox86@0001cc00.user.oftc.net) Quit (Ping timeout: 480 seconds)
[3:17] * vbellur (~vijay@71.234.224.255) has joined #ceph
[3:19] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) has joined #ceph
[3:24] * cmrn (~Inuyasha@9YSAAAL51.tor-irc.dnsbl.oftc.net) Quit ()
[3:24] * jermudgeon (~jhaustin@gw1.ttp.biz.whitestone.link) Quit (Quit: jermudgeon)
[3:27] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[3:32] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) has joined #ceph
[3:34] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[3:36] * CephFan1 (~textual@173-171-133-163.res.bhn.net) Quit (Quit: My MacBook Pro has gone to sleep. ZZZzzz…)
[3:42] * brians (~brian@80.111.114.175) has joined #ceph
[3:43] * Racpatel (~Racpatel@2601:87:0:24af::4c8f) Quit (Ping timeout: 480 seconds)
[3:44] * aj__ (~aj@x590cffa3.dyn.telefonica.de) has joined #ceph
[3:45] * kefu (~kefu@183.193.119.183) has joined #ceph
[3:47] * brians__ (~brian@80.111.114.175) Quit (Ping timeout: 480 seconds)
[3:48] * kefu_ (~kefu@114.92.96.253) has joined #ceph
[3:49] * noahw (~noahw@2601:647:ca01:57bc:e057:8969:87d4:c96f) has joined #ceph
[3:50] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[3:51] * debian112 (~bcolbert@64.235.157.198) has left #ceph
[3:52] * derjohn_mobi (~aj@x590c247c.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[3:52] * swami1 (~swami@27.7.165.149) has joined #ceph
[3:53] * kefu (~kefu@183.193.119.183) Quit (Ping timeout: 480 seconds)
[3:54] * Esvandiary (~FierceFor@tsn109-201-154-182.dyn.nltelcom.net) has joined #ceph
[4:04] * noahw (~noahw@2601:647:ca01:57bc:e057:8969:87d4:c96f) Quit (Ping timeout: 480 seconds)
[4:04] * Jeffrey4l (~Jeffrey@221.192.178.67) has joined #ceph
[4:05] * circ-user-JjbSd (~circuser-@220.248.17.34) has joined #ceph
[4:06] * swami1 (~swami@27.7.165.149) Quit (Quit: Leaving.)
[4:08] * Jeffrey4l (~Jeffrey@221.192.178.67) Quit ()
[4:08] * scg (~zscg@146-115-134-246.c3-0.nwt-ubr1.sbo-nwt.ma.cable.rcn.com) has joined #ceph
[4:09] * bniver (~bniver@pool-173-48-58-27.bstnma.fios.verizon.net) has joined #ceph
[4:11] * circ-user-JjbSd is now known as truan-wang
[4:12] * Racpatel (~Racpatel@2601:87:0:24af::4c8f) has joined #ceph
[4:13] * debian112 (~bcolbert@173-164-167-200-SFBA.hfc.comcastbusiness.net) has joined #ceph
[4:14] * dnunez (~dnunez@c-73-38-0-185.hsd1.ma.comcast.net) Quit (Quit: Leaving)
[4:16] * kuku (~kuku@119.93.91.136) Quit (Remote host closed the connection)
[4:20] * Racpatel (~Racpatel@2601:87:0:24af::4c8f) Quit (Ping timeout: 480 seconds)
[4:23] * kuku (~kuku@119.93.91.136) has joined #ceph
[4:23] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[4:24] * Esvandiary (~FierceFor@tsn109-201-154-182.dyn.nltelcom.net) Quit ()
[4:24] * darkid (~biGGer@61TAAAK09.tor-irc.dnsbl.oftc.net) has joined #ceph
[4:25] * noahw (~noahw@2601:647:ca01:57bc:e057:8969:87d4:c96f) has joined #ceph
[4:29] * vbellur (~vijay@71.234.224.255) Quit (Ping timeout: 480 seconds)
[4:41] * vbellur (~vijay@2601:18f:700:55b0:5e51:4fff:fee8:6a5c) has joined #ceph
[4:45] * MentalRay (~MentalRay@LPRRPQ1401W-LP130-02-1242363207.dsl.bell.ca) has joined #ceph
[4:46] * praveen (~praveen@122.172.136.225) has joined #ceph
[4:54] * darkid (~biGGer@61TAAAK09.tor-irc.dnsbl.oftc.net) Quit ()
[4:56] * kefu_ (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[4:57] * kefu (~kefu@114.92.96.253) has joined #ceph
[4:59] * MentalRay (~MentalRay@LPRRPQ1401W-LP130-02-1242363207.dsl.bell.ca) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[5:00] * noahw (~noahw@2601:647:ca01:57bc:e057:8969:87d4:c96f) Quit (Ping timeout: 480 seconds)
[5:01] * noahw (~noahw@2601:647:ca01:57bc:e057:8969:87d4:c96f) has joined #ceph
[5:02] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[5:14] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:14] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[5:16] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:16] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[5:21] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:21] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[5:24] * FierceForm (~Popz@torrelay5.tomhek.net) has joined #ceph
[5:25] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:25] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[5:26] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:26] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[5:33] * Karcaw (~evan@71-95-122-38.dhcp.mdfd.or.charter.com) Quit (Ping timeout: 480 seconds)
[5:34] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:34] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[5:36] * noahw (~noahw@2601:647:ca01:57bc:e057:8969:87d4:c96f) Quit (Ping timeout: 480 seconds)
[5:42] * neurodrone_ (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[5:46] * vimal (~vikumar@114.143.167.115) has joined #ceph
[5:46] * scg (~zscg@146-115-134-246.c3-0.nwt-ubr1.sbo-nwt.ma.cable.rcn.com) Quit (Quit: Ex-Chat)
[5:48] * noahw (~noahw@c-24-5-210-58.hsd1.ca.comcast.net) has joined #ceph
[5:53] * Vacuum_ (~Vacuum@i59F79228.versanet.de) has joined #ceph
[5:54] * FierceForm (~Popz@7EXAAAIEE.tor-irc.dnsbl.oftc.net) Quit ()
[5:54] * SweetGirl (~verbalins@minos.blahz.info) has joined #ceph
[6:00] * Vacuum__ (~Vacuum@i59F79C3E.versanet.de) Quit (Ping timeout: 480 seconds)
[6:00] * ircolle (~Adium@66.228.68.185) has joined #ceph
[6:01] * walcubi_ (~walcubi@p5795A4B9.dip0.t-ipconnect.de) has joined #ceph
[6:02] * Karcaw (~evan@71-95-122-38.dhcp.mdfd.or.charter.com) has joined #ceph
[6:08] * truan-wang (~circuser-@220.248.17.34) Quit (Ping timeout: 480 seconds)
[6:08] * walcubi__ (~walcubi@p5795A823.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:08] * ircolle (~Adium@66.228.68.185) Quit (Ping timeout: 480 seconds)
[6:12] * ircolle (~Adium@66.228.68.185) has joined #ceph
[6:13] * vimal (~vikumar@114.143.167.115) Quit (Remote host closed the connection)
[6:20] * ircolle (~Adium@66.228.68.185) Quit (Ping timeout: 480 seconds)
[6:24] * SweetGirl (~verbalins@61TAAAK4K.tor-irc.dnsbl.oftc.net) Quit ()
[6:24] * Gibri (~Kwen@65.19.167.130) has joined #ceph
[6:32] * vimal (~vikumar@121.244.87.116) has joined #ceph
[6:33] * truan-wang (~truanwang@58.247.8.186) has joined #ceph
[6:34] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[6:45] * truan-wang (~truanwang@58.247.8.186) Quit (Read error: Connection reset by peer)
[6:45] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[6:54] * Gibri (~Kwen@9YSAAAMBL.tor-irc.dnsbl.oftc.net) Quit ()
[6:54] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[6:57] * swami1 (~swami@49.38.1.109) has joined #ceph
[6:58] * demonspork (~redbeast1@tor-exit7-readme.dfri.se) has joined #ceph
[7:00] <TheSov> anyone ever used inkscape?
[7:07] * theTrav (~theTrav@ipc032.ipc.telstra.net) Quit (Quit: Leaving...)
[7:08] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[7:13] * ircolle (~Adium@66.228.68.185) has joined #ceph
[7:18] * truan-wang (~truanwang@220.248.17.34) Quit (Remote host closed the connection)
[7:18] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[7:22] * ircolle (~Adium@66.228.68.185) Quit (Ping timeout: 480 seconds)
[7:23] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[7:25] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[7:27] * rjdias (~rdias@2001:8a0:749a:d01:dd9e:522e:b4f:77e2) has joined #ceph
[7:27] * karnan (~karnan@121.244.87.117) has joined #ceph
[7:27] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[7:27] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[7:28] * demonspork (~redbeast1@61TAAAK6V.tor-irc.dnsbl.oftc.net) Quit ()
[7:28] * Grimmer (~Moriarty@212.7.192.148) has joined #ceph
[7:29] * rdias (~rdias@bl7-92-98.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[7:29] * rjdias is now known as rdias
[7:32] * noahw (~noahw@c-24-5-210-58.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[7:32] * theTrav (~theTrav@ipc032.ipc.telstra.net) has joined #ceph
[7:40] * rdas (~rdas@121.244.87.116) has joined #ceph
[7:41] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[7:46] * tiantian (~oftc-webi@116.228.48.102) has joined #ceph
[7:57] <tiantian> hi, I've got a problem: radosgw supports the swift API, and I tried to use swiftclient to update a container's ACL with X-Container-Read: * (to allow anonymous requests). unfortunately, when I curl an object location under the container, the result is AccessDenied! please help me .. thanks!
[7:58] * Grimmer (~Moriarty@7EXAAAIH8.tor-irc.dnsbl.oftc.net) Quit ()
[7:58] <tiantian> How can I make a container's access control publicly accessible?
[7:59] <tiantian> with swift API.
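A possible cause, and a minimal sketch of a fix assuming a hypothetical container named "mycontainer": radosgw follows Swift's referrer-ACL syntax, so anonymous read is normally granted with ".r:*" rather than a bare "*".

    swift post -r '.r:*' mycontainer     # grant anonymous (referrer) read access
    swift stat mycontainer               # should now show Read ACL: .r:*
    curl -i http://<radosgw-host>/swift/v1/mycontainer/<object>   # URL path depends on your rgw config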
[7:59] * bara (~bara@ip4-83-240-10-82.cust.nbox.cz) has joined #ceph
[8:01] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[8:03] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:b1dd:1c39:b635:bd88) Quit (Ping timeout: 480 seconds)
[8:03] * Bwana (~ggg@212.83.40.239) has joined #ceph
[8:05] * kmajk (~kmajk@host-185-78-133-232.jmdi.pl) has joined #ceph
[8:07] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[8:15] * i_m (~ivan.miro@deibp9eh1--blueice4n6.emea.ibm.com) has joined #ceph
[8:23] * rraja (~rraja@121.244.87.117) has joined #ceph
[8:25] * shylesh (~shylesh@59.95.70.152) has joined #ceph
[8:33] * Bwana (~ggg@9YSAAAMCP.tor-irc.dnsbl.oftc.net) Quit ()
[8:35] * rakeshgm (~rakesh@106.51.225.53) has joined #ceph
[8:37] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) has joined #ceph
[8:48] * yanzheng (~zhyan@125.70.22.67) Quit (Ping timeout: 480 seconds)
[8:50] * yanzheng (~zhyan@125.70.22.67) has joined #ceph
[8:51] * swami2 (~swami@49.38.1.109) has joined #ceph
[8:54] * swami1 (~swami@49.38.1.109) Quit (Ping timeout: 480 seconds)
[8:57] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:00] * willi (~willi@212.124.32.5) has joined #ceph
[9:03] * moon (~moon@217-19-26-201.dsl.cambrium.nl) has joined #ceph
[9:09] * rendar (~I@95.234.180.40) has joined #ceph
[9:13] * moon (~moon@217-19-26-201.dsl.cambrium.nl) Quit (Ping timeout: 480 seconds)
[9:20] * kmajk (~kmajk@host-185-78-133-232.jmdi.pl) Quit (Ping timeout: 480 seconds)
[9:22] <willi> hi at all
[9:22] <willi> rbd map datastore
[9:22] <willi> rbd: sysfs write failed
[9:22] <willi> RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
[9:22] <willi> In some cases useful info is found in syslog - try "dmesg | tail" or so.
[9:22] <willi> rbd: map failed: (6) No such device or address
[9:22] <willi> where is the problem?
[9:22] <willi> ceph jewel
[9:23] <willi> rbd info says: features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
[9:28] <vikhyat> willi: Kernel RBD still won't support all these features: exclusive-lock, object-map, fast-diff, deep-flatten
[9:28] <vikhyat> you need to disable them
[9:28] <willi> all of them?
[9:28] <vikhyat> yes it only supports layering (clone)
[9:29] <willi> ah okay
[9:29] <vikhyat> I think may be in latest kernel
[9:29] <vikhyat> exclusive-lock, object-map, fast-diff support is available but I am not sure about version
[9:29] <Hatsjoe> I can confirm it still wont work in kernel 4.6.4
[9:29] <vikhyat> you need to google it
[9:30] <vikhyat> Hatsjoe: perfect here you go
[9:30] <vikhyat> thanks Hatsjoe
[9:30] <Hatsjoe> It might be possible to compile it in, but not sure about that
[9:30] <vikhyat> willi: so for now you have layering
[9:30] <vikhyat> okay
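For reference, a minimal sketch of disabling the unsupported features, assuming the image from above is called "datastore" and lives in the default "rbd" pool (dependent features listed first):

    rbd feature disable rbd/datastore deep-flatten fast-diff object-map exclusive-lock
    rbd info rbd/datastore     # features should now list only: layering
    rbd map rbd/datastore

If a feature refuses to be disabled on an existing image, another option is to create kernel-mapped images with layering only, e.g. "rbd create --image-feature layering --size 10G rbd/newimage".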
[9:31] <willi> i have another problem in jewel and ubuntu 16.04
[9:31] <willi> look at my crush map and my ceph.conf
[9:31] <willi> http://pastebin.com/Z6RkteAD
[9:31] <willi> http://pastebin.com/LUMbAAEV
[9:32] <willi> 18 data server over 3 racks. 6 data server per rack. 1 mon server per rack. 1 iscsi gateway server per rack.
[9:32] <willi> if i shut down 6 data server in rack 1
[9:32] <willi> rados bench -p rbd 60 write -b 4M -t 1 on iscsi node 1 dont work
[9:33] * rraja (~rraja@121.244.87.117) Quit (Ping timeout: 480 seconds)
[9:33] <willi> after 900 seconds it works again. ceph -w then told me that pg_stats timed out for 900 seconds
[9:33] <willi> yesterday i had the same problem. i thought that it was a network problem with ethernet bonding but it isn't
[9:34] <willi> i then get, after the shutdown of data servers 1-6, this in ceph -w
[9:34] <willi> 1 requests are blocked > 32 sec
[9:34] <willi> ceph health detail then told me
[9:34] <willi> 1 ops are blocked > 524.288 sec on osd.53
[9:34] <willi> 1 osds have slow requests
[9:35] <willi> but osd.53 is in data server 11 not in 1-6 which are shut down
[9:37] <vikhyat> willi: step chooseleaf firstn 3 type rack
[9:37] <willi> log from osd.53 is
[9:37] <willi> Jul 15 09:36:46 ceph11 ceph-osd[4587]: 2016-07-15 09:36:46.765607 7f9ebb34f700 -1 osd.53 13017 heartbeat_check: no reply from osd.29 since back 2016-07-15 09:28:11.355891 front 2016-07-15 09:28:11.355891 (cutoff 2016-07-15 09:36:26.765406)
[9:37] <vikhyat> I think you modified default ruleset
[9:37] <vikhyat> firstn 3
[9:37] <willi> yes
[9:37] <willi> correct
[9:37] <vikhyat> mostly it is 0
[9:38] <willi> i can change it back to 0
[9:38] <willi> but yesterday with 0 i had the same problem
[9:38] <vikhyat> http://docs.ceph.com/docs/master/rados/operations/crush-map/#crush-map-rules
[9:38] <vikhyat> this should help you
[9:39] <vikhyat> check with 0
[9:39] * theTrav (~theTrav@ipc032.ipc.telstra.net) Quit (Remote host closed the connection)
[9:39] <willi> i try it one moment
[9:39] <vikhyat> what is min_size in your pool
[9:40] <vikhyat> #mon osd adjust heartbeat grace = false
[9:40] <vikhyat> #mon osd adjust down out interval = false
[9:40] <vikhyat> #osd max markdown period = 120
[9:40] <vikhyat> #osd max markdown count = 2
[9:40] <vikhyat> I hope this is all disabled
[9:40] <vikhyat> not just commented
[9:41] <vikhyat> in osd and mon daemons also
[9:41] <vikhyat> means you injected them with ceph tell or you restarted the daemons
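As a side note, a minimal sketch of injecting mon-side options like these at runtime (the option and value are purely illustrative, not a recommendation):

    ceph tell mon.* injectargs '--mon_osd_adjust_heartbeat_grace=false --mon_osd_adjust_down_out_interval=false'

Changes made this way do not survive a daemon restart unless they are also written into ceph.conf.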
[9:41] <willi> yes it is false
[9:41] <willi> i show you
[9:41] <vikhyat> then it is fine
[9:41] <vikhyat> no need
[9:42] <willi> ceph daemon osd.0 config show | grep adjust
[9:42] <willi> "mon_osd_adjust_heartbeat_grace": "true",
[9:42] <willi> "mon_osd_adjust_down_out_interval": "true",
[9:42] * praveen (~praveen@122.172.136.225) Quit (Remote host closed the connection)
[9:42] <willi> and so on....
[9:42] <vikhyat> perfect
[9:42] <vikhyat> what about mon
[9:42] <vikhyat> because these are mon options
[9:43] <willi> what do you mean?
[9:43] * rraja (~rraja@121.244.87.118) has joined #ceph
[9:43] <vikhyat> Monitor daemon
[9:43] <willi> yes
[9:43] <willi> what do you want to know?
[9:43] <vikhyat> ceph daemon mon.x config show | grep adjust
[9:45] <willi> didnt work
[9:45] <willi> root@ceph-mon-1:~# ceph daemon mon.0 config show | grep adjust
[9:45] <willi> admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[9:45] <vikhyat> you need to give mon name
[9:45] <vikhyat> you can find in
[9:45] <vikhyat> /var/run/ceph
[9:45] <willi> ah sorry
[9:45] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[9:48] <willi> ehm...
[9:48] <willi> ceph-mon.ceph-mon-1.asok
[9:48] <willi> ceph daemon ceph-mon.ceph-mon-1 config show | grep adjust
[9:48] <willi> Can't get admin socket path: unable to get conf option admin_socket for ceph-mon.ceph-mon-1: error parsing 'ceph-mon.ceph-mon-1': expected string of the form TYPE.ID, valid types are: auth, mon, osd, mds, client
[9:49] <vikhyat> ceph daemon mon.`hostname -s` config show | grep adjust
[9:49] <vikhyat> this will work
[9:49] <willi> ceph daemon mon.`hostname -s` config show | grep adjust
[9:49] <willi> "mon_osd_adjust_heartbeat_grace": "true",
[9:49] <willi> "mon_osd_adjust_down_out_interval": "true",
[9:49] <vikhyat> perfect
[9:49] <willi> so crush map is set to 0
[9:50] <vikhyat> then modify your crushmap compile it back upload it back
[9:50] <vikhyat> test it
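For reference, a minimal sketch of the decompile/edit/recompile round trip being described (file names are arbitrary):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt, e.g. change "step chooseleaf firstn 3 type rack" back to "firstn 0"
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new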
[9:50] <willi> i try now to power down ceph1-6
[9:51] <vikhyat> sure
[9:51] <willi> :-(
[9:51] <vikhyat> if it wont help
[9:52] <willi> 1 ops are blocked > 65.536 sec on osd.40
[9:52] <willi> 1 osds have slow requests
[9:52] <willi> traffic stopped on iscsi gateway
[9:52] <vikhyat> see block requests have nothing to do with
[9:52] <vikhyat> because your data could be rebalancing now
[9:53] <willi> okay
[9:53] <vikhyat> and as you have not reduced the backfill and recovery it may slow down the request
[9:53] <willi> but you see here
[9:53] <willi> osd.84 [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.420386 secs
[9:53] <willi> 2016-07-15 09:52:52.138332 osd.84 [WRN] slow request 30.420386 seconds old, received at 2016-07-15 09:52:21.717883: osd_op(client.1304338.0:1 1.10c1276b benchmark_data_ceph-iscsi-1_6636_object0 [write 0~4194304] snapc 0=[] ack+ondisk+write+known_if_redirected e13050) currently waiting for subops from 12,34
[9:53] <vikhyat> subops from 12,34
[9:53] <willi> yes 12 is down
[9:53] <willi> server 1-6
[9:53] <willi> but 34 is sevrer 7
[9:54] <willi> not down
[9:54] <vikhyat> it is serving but slow
[9:54] <vikhyat> if you want a maintenance window
[9:54] <vikhyat> you can use flags
[9:54] <willi> no...
[9:54] <willi> root@ceph-iscsi-1:~# rados bench -p rbd 60 write -b 4M -t 1
[9:54] <willi> Maintaining 1 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 60 seconds or 0 objects
[9:54] <willi> Object prefix: benchmark_data_ceph-iscsi-1_6668
[9:54] <willi> sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
[9:54] <willi> 0 0 0 0 0 0 - 0
[9:54] <vikhyat> rebalance and noout
[9:54] <willi> 1 1 1 0 0 0 - 0
[9:54] <willi> 2 1 1 0 0 0 - 0
[9:54] <willi> 3 1 1 0 0 0 - 0
[9:54] <willi> cur MB/s 0
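The maintenance-window flags vikhyat mentions above are set and cleared like this (a sketch for planned downtime, not a fix for the blocked requests seen here):

    ceph osd set noout
    ceph osd set norebalance
    # ...perform the maintenance, then:
    ceph osd unset norebalance
    ceph osd unset noout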
[9:55] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:b1dd:1c39:b635:bd88) has joined #ceph
[9:55] <willi> my problem is... what happens if one server is down or one rack is down in my production environment. it could happen nightly at 2 or 3 for example
[9:55] <willi> then i have a full disruption
[9:56] <willi> health detail tells me now:
[9:56] <willi> 1 ops are blocked > 262.144 sec on osd.84
[9:56] <willi> 1 ops are blocked > 262.144 sec on osd.69
[9:56] <willi> 1 ops are blocked > 262.144 sec on osd.56
[9:57] <willi> 1 ops are blocked > 524.288 sec on osd.40
[9:57] <vikhyat> what is your replica count
[9:57] <willi> 3
[9:57] <vikhyat> and min size
[9:57] * [arx] (~arx@the.kittypla.net) Quit (Ping timeout: 480 seconds)
[9:57] <willi> there must be a problem or a bug with jewel and ubuntu 16.04
[9:57] <willi> 1
[9:57] <vikhyat> hmm
[9:57] <willi> ceph osd dump | grep -i rbd
[9:57] <willi> pool 1 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 4096 pgp_num 4096 last_change 12629 flags hashpspool stripe_width 0
[9:57] <vikhyat> is it re balancing
[9:57] <willi> no
[9:58] <vikhyat> ceph -s
[9:58] <willi> i get after 900 seconds traffic back
[9:58] <vikhyat> in pastebin
[9:58] <willi> http://pastebin.com/8Qx4R2eD
[9:58] <willi> traffic back after pg_stats timeout
[9:59] <willi> seems that heartbeat has a problem
[9:59] <willi> don't know what exactly
[9:59] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit (Quit: Leaving)
[9:59] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[9:59] <vikhyat> how you killed the rack
[9:59] <willi> and: "Monitor clock skew detected" is not the problem, only a little ... servers are in sync with our own ntp server
[10:00] <willi> i have shutdown the ports on the ethernet switches
[10:00] * kuku (~kuku@119.93.91.136) Quit (Remote host closed the connection)
[10:00] * aj__ (~aj@x590cffa3.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[10:00] <vikhyat> willi: could you please create a tracker in tracker.ceph.com
[10:01] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Ping timeout: 480 seconds)
[10:01] <vikhyat> with logs from mon and reproduce the issue with debug_ms = 1
[10:01] <vikhyat> and upload the logs from some of osds
[10:01] <willi> yes but that is the next problem
[10:01] <vikhyat> which are participating in slow request
[10:02] <vikhyat> I think some one should look into it
[10:02] <willi> subops from 12,34
[10:02] <willi> if i look into osd.34 log on server 7
[10:02] <willi> there is nothing
[10:02] <willi> i look one moment
[10:03] <willi> yes log file is empty on 34
[10:03] <vikhyat> I am not sure just try ceph daemon osd.34 log reopen
[10:03] <willi> http://pastebin.com/9w4ERFuG
[10:04] <willi> root@ceph7:~# ceph daemon osd.34 log reopen
[10:04] <willi> {}
[10:04] <willi> ok works now
[10:05] <vikhyat> slow requests are very much expected if you have down time in one replica set
[10:05] <vikhyat> you can tune your cluster
[10:05] <vikhyat> and for that you can mail in ceph-user
[10:05] <vikhyat> like throttling the backfill and recovery
[10:05] <vikhyat> tuning heartbeat
[10:06] <vikhyat> but it is very much environment specific
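A common way to apply the throttling vikhyat mentions, as a sketch (tune the numbers to your hardware, and mirror them under [osd] in ceph.conf to make them persistent):

    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'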
[10:06] * swami2 (~swami@49.38.1.109) Quit (Read error: Connection reset by peer)
[10:06] <vikhyat> I hope you will get good response in ceph-user
[10:06] <willi> here u can see
[10:06] <willi> 2016-07-15 10:04:04.972520 mon.0 10.250.250.4:6789/0 11130 : cluster [INF] osd.29 marked down after no pg stats for 904.036091seconds
[10:06] <willi> that is what i mean
[10:06] <willi> after that rbd works...
[10:07] <willi> 900 seconds
[10:07] <willi> long time
[10:07] <vikhyat> I think you modified something
[10:07] <willi> for a disruption
[10:07] <vikhyat> it is not 900
[10:07] <vikhyat> not sure
[10:07] <vikhyat> http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/
[10:07] <vikhyat> read this
[10:08] <willi> look
[10:08] <willi> ceph daemon osd.0 config show | grep 900
[10:08] <willi> "ms_tcp_read_timeout": "900",
[10:08] <willi> "mon_osd_report_timeout": "900",
[10:08] <willi> "osd_command_thread_suicide_timeout": "900",
[10:08] <vikhyat> we mark out in 300 seconds
[10:08] <willi> "rgw_keystone_revocation_interval": "900",
[10:08] <vikhyat> and down is very early
[10:08] <vikhyat> this is the issue
[10:08] <willi> that is my problem
[10:08] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit (Remote host closed the connection)
[10:08] <willi> reports are not really generated
[10:08] <vikhyat> mon osd down out interval
[10:08] <Hatsjoe> I have the same values, and when I shut down 1 node (failure domain is node), and have 2 left, I am not having the same issues
[10:08] <vikhyat> this is 300
[10:09] <vikhyat> right
[10:09] <Hatsjoe> So the interval has nothing to do with it, must be something else
[10:09] <vikhyat> some issue in other configuration
[10:09] <vikhyat> willi: I got it I think
[10:10] <vikhyat> mon osd down out subtree limit
[10:10] <vikhyat> this is by default rack
[10:10] <vikhyat> you are marking out rack
[10:10] * mashwo00 (~textual@51.179.162.234) has joined #ceph
[10:10] <vikhyat> means you are disrupting a full rack, correct
[10:10] <willi> look here
[10:10] <willi> https://paste.ee/p/GQGpu
[10:11] <willi> you can see at 9:50
[10:11] <willi> i powered down 1-6
[10:11] <willi> 9:51 slow requests
[10:11] <willi> 9:52 reports failed
[10:11] * shylesh (~shylesh@59.95.70.152) Quit (Ping timeout: 480 seconds)
[10:11] <vikhyat> willi: just modify this config to row
[10:11] <vikhyat> instead of rack
[10:11] * shylesh (~shylesh@45.124.225.209) has joined #ceph
[10:12] <vikhyat> mon osd down out subtree limit = row
[10:12] <vikhyat> and then test
[10:12] <vikhyat> it should help you
[10:12] <willi> ok one moment
[10:12] <vikhyat> set in [mon] section
[10:12] <willi> on global?
[10:12] <vikhyat> or global
[10:12] <willi> ok
[10:12] <vikhyat> restart three mons one by one
[10:13] <vikhyat> then disconnect full rack
[10:13] <willi> i reboot the whole cluster...
[10:13] <vikhyat> nope
[10:13] <vikhyat> only three Monitor
[10:13] * kawa2014 (~kawa@109.112.4.67) has joined #ceph
[10:13] <vikhyat> restart the service
[10:13] <vikhyat> no need of reboot
[10:14] <willi> ok
[10:14] * chengpeng_ (~chengpeng@180.168.126.179) Quit (Quit: Leaving)
[10:14] * analbeard (~shw@support.memset.com) has joined #ceph
[10:14] <vikhyat> ceph daemon mon.`hostname -s` config show | grep subtree
[10:14] <willi> takes only 2-3 minutes
[10:14] <vikhyat> from all three mons
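A sketch of what is being asked for here, assuming a jewel/systemd (Ubuntu 16.04) setup:

    # ceph.conf on the monitor hosts, in the [mon] or [global] section:
    mon osd down out subtree limit = row

    # then, on each monitor host, one at a time:
    systemctl restart ceph-mon@$(hostname -s)
    ceph daemon mon.$(hostname -s) config show | grep subtree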
[10:14] <willi> yes one moment
[10:17] * vikhyat is now known as vikhyat|away
[10:17] * kawa2014 (~kawa@109.112.4.67) Quit (Read error: Connection reset by peer)
[10:18] * fsimonce (~simon@host99-64-dynamic.27-79-r.retail.telecomitalia.it) has joined #ceph
[10:18] * Eric3 (~ke@180.168.197.82) has joined #ceph
[10:19] <willi> ok
[10:19] <willi> on all 3 mon row...
[10:19] <Eric3> 2016-07-15 16:12:06.573996 7f905589d700 0 log_channel(cluster) log [INF] : 1.2b2 repair starts
[10:19] <Eric3> 2016-07-15 16:12:08.934673 7f905589d700 -1 log_channel(cluster) log [ERR] : repair 1.2b2 984a72b2/rbd_data.99e7a54a9969b.0000000000002cbd/head//1 on disk size (4194304) does not match object info size (348160) adjusted for ondisk to (348160)
[10:19] <Eric3> 2016-07-15 16:12:25.122179 7f905589d700 -1 log_channel(cluster) log [ERR] : 1.2b2 repair 1 errors, 0 fixed
[10:19] <Eric3> someone know it
[10:20] * praveen (~praveen@121.244.155.12) has joined #ceph
[10:20] <Eric3> ll rbd\\udata.99e7a54a9969b.0000000000002cbd__head_984A72B2__1
[10:20] <Eric3> -rw-r--r-- 1 root root 4194304 7月 15 00:13 rbd\udata.99e7a54a9969b.0000000000002cbd__head_984A72B2__1
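One way to dig into a size-mismatch inconsistency like this, as a sketch, is to locate every replica of the object named in the error and compare the on-disk copies before deciding which one is authoritative (this assumes pool 1 is the "rbd" pool and filestore OSDs):

    ceph osd map rbd rbd_data.99e7a54a9969b.0000000000002cbd
    # on each OSD host returned, inside that OSD's copy of PG 1.2b2:
    find /var/lib/ceph/osd/ceph-*/current/1.2b2_head -name '*99e7a54a9969b.0000000000002cbd*' -exec ls -l {} \;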
[10:20] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[10:21] <willi> :-(
[10:21] <willi> mark_data_ceph-iscsi-1_2159_object130 [write 0~4194304] snapc 0=[] ack+ondisk+write+known_if_redirected e13086) currently waiting for subops from 23,72
[10:24] <willi> https://paste.ee/p/WY2RX
[10:28] * ade (~abradshaw@212.77.58.61) has joined #ceph
[10:29] <willi> https://paste.ee/p/U4whf
[10:30] <willi> i am relatively sure that my test cluster with infernalis and ubuntu 14.04 did not have this problem
[10:31] * rraja (~rraja@121.244.87.118) Quit (Ping timeout: 480 seconds)
[10:31] * rraja (~rraja@121.244.87.117) has joined #ceph
[10:32] * vZerberus (dog@00021993.user.oftc.net) Quit (Quit: Coyote finally caught me)
[10:36] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[10:36] * kawa2014 (~kawa@host154-114-static.124-81-b.business.telecomitalia.it) has joined #ceph
[10:37] * Skyrider (~notmyname@159.148.186.194) has joined #ceph
[10:40] * aj__ (~aj@2001:6f8:1337:0:2c0c:bbaa:e7c6:f313) has joined #ceph
[10:41] * vikhyat|away is now known as vikhyat
[10:42] <vikhyat> willi: this is scrub issue
[10:42] <vikhyat> willi: I hope the first problem is solved
[10:42] <vikhyat> your osds are getting marked out in 300 seconds
[10:42] <vikhyat> mon osd down out subtree limit = row
[10:43] <vikhyat> when you have set this option
[10:43] <vikhyat> and bringing down full rack
[10:50] * arcimboldo (~antonio@84-75-174-248.dclient.hispeed.ch) has joined #ceph
[10:54] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[11:02] <willi> i have done that
[11:02] <willi> same as before
[11:03] <vikhyat> willi: means still your osds are taking 900 seconds
[11:03] <willi> yes
[11:03] <vikhyat> then better you create a tracker
[11:03] <vikhyat> this is it from my end
[11:06] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit (Quit: Leaving)
[11:06] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[11:07] * micw (~micw@ip92346916.dynamic.kabel-deutschland.de) has joined #ceph
[11:07] <micw> hi
[11:07] * Skyrider (~notmyname@61TAAALEF.tor-irc.dnsbl.oftc.net) Quit ()
[11:07] <micw> i'm building a small cluster with 3-6 nodes (all storage nodes with 4 OSDs on each). i need to run my mons on these nodes.
[11:07] <micw> is it ok to run the mons on all nodes?
[11:08] <micw> afaik that only should increase the redundancy (and mon communication a bit)
[11:08] <boolman> i have a 4 node cluster, running mons on 3 of them
[11:09] <micw> sure. with 4 mons, you would need a quorum of 3
[11:10] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[11:11] * truan-wang (~truanwang@220.248.17.34) Quit (Remote host closed the connection)
[11:12] * F|1nt (~F|1nt@host37-212.lan-isdn.imaginet.fr) has joined #ceph
[11:19] * arcimboldo (~antonio@84-75-174-248.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[11:20] * vZerberus (~dogtail@00021993.user.oftc.net) has joined #ceph
[11:21] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit (Quit: Ex-Chat)
[11:21] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[11:23] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit ()
[11:23] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[11:24] * Mika_c (~Mika@122.146.93.152) has joined #ceph
[11:33] * TMM (~hp@185.5.121.201) has joined #ceph
[11:36] <willi> vikhyat i install now hammer
[11:37] <willi> i think it is a jewel problem
[11:37] * PierreW (~andrew_m@82.94.251.227) has joined #ceph
[11:42] * F|1nt (~F|1nt@host37-212.lan-isdn.imaginet.fr) Quit (Remote host closed the connection)
[11:45] <chrome0> Hi, I have a cluster that has 1 stuck degraded object which won't clean, and now requests are blocking on it (Jewel fwiw). Disk contents for the pg seem accessible, load is only light. Any ideas for debugging?
[11:50] * arcimboldo (~antonio@dhcp-y11-zi-s3it-130-60-34-054.uzh.ch) has joined #ceph
[12:02] * tiantian (~oftc-webi@116.228.48.102) Quit (Quit: Page closed)
[12:03] * overload (~oc-lram@79.108.113.172.dyn.user.ono.com) Quit (Remote host closed the connection)
[12:07] * PierreW (~andrew_m@7EXAAAIOW.tor-irc.dnsbl.oftc.net) Quit ()
[12:08] * wbg (~wbg@ave.zedat.fu-berlin.de) Quit (Quit: WeeChat 1.5)
[12:13] * raso1 (~raso@ns.deb-multimedia.org) has joined #ceph
[12:18] * raso (~raso@ns.deb-multimedia.org) Quit (Ping timeout: 480 seconds)
[12:21] * Mika_c (~Mika@122.146.93.152) Quit (Remote host closed the connection)
[12:24] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) has joined #ceph
[12:24] * ffilz (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) Quit (Server closed connection)
[12:24] * ffilz (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) has joined #ceph
[12:27] * kmajk (~kmajk@nat-hq.ext.getresponse.com) has joined #ceph
[12:28] * ccourtaut (~ccourtaut@157.173.31.93.rev.sfr.net) Quit (Quit: I'll be back!)
[12:29] <willi> ceph-deploy mon create-initial
[12:29] <willi> [ceph-mon-1][INFO ] Running command: systemctl enable ceph.target
[12:29] <willi> [ceph-mon-1][WARNIN] Failed to execute operation: No such file or directory
[12:29] <willi> [ceph-mon-1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[12:29] <willi> anyone knows??
[12:29] <willi> root@ceph-mon-1:~# systemctl enable ceph.target
[12:29] <willi> Failed to execute operation: No such file or directory
[12:31] <kmajk> willi: you dont have systemd installed?
[12:31] <willi> apt-get install systemd
[12:31] <willi> Reading package lists... Done
[12:31] <willi> Building dependency tree.
[12:31] <willi> Reading state information.... Done
[12:31] <willi> systemd is already the newest version (229-4ubuntu6).
[12:31] <willi> 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
[12:32] <kmajk> willi: dpkg -l|grep ceph
[12:32] <willi> dpkg -l|grep ceph
[12:32] <willi> ii ceph 0.94.7-1xenial amd64 distributed storage and file system
[12:32] <willi> ii ceph-common 0.94.7-1xenial amd64 common utilities to mount and interact with a ceph storage cluster
[12:32] <willi> ii ceph-deploy 1.5.34 all Ceph-deploy is an easy to use configuration tool
[12:32] <willi> ii ceph-mds 0.94.7-1xenial amd64 metadata server for the ceph distributed file system
[12:32] <willi> ii libcephfs1 0.94.7-1xenial amd64 Ceph distributed file system client library
[12:32] <willi> ii python-cephfs 0.94.7-1xenial amd64 Python libraries for the Ceph libcephfs library
[12:33] <kmajk> willi: this is from ceph-mon-1 host?
[12:33] <willi> ja
[12:33] <willi> yes
[12:33] <kmajk> willi: apt-get install ceph-mon
[12:34] <kmajk> or better: ceph-deploy install ceph-mon-1
[12:34] * linuxkidd (~linuxkidd@ip70-189-207-54.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[12:34] <kmajk> this should install all packages
[12:35] <kmajk> you can't run command ceph-deploy mon create-initial without all packages installed
[12:36] * red_nh (red@infidel.e-lista.pl) has joined #ceph
[12:36] <kmajk> willi: btw. i recommend installing the jewel version of ceph: 10.2.2-1~bpo80+1
[12:36] <willi> oh noooo
[12:37] <willi> i install hammer
[12:37] <willi> jewel is causing problems
[12:37] <kmajk> jewel is now LTS
[12:37] <willi> apt-cache search ceph-mon
[12:37] <willi> i know
[12:37] <willi> i have had problems with jewel for 3 days
[12:38] <willi> jewel was installed
[12:38] <kmajk> with support for next year, support for hammer will end soon
[12:39] <kmajk> ok, i forgot that hammer doesn't have a ceph-mon package
[12:40] <willi> look
[12:40] <willi> http://download.ceph.com/debian-hammer/pool/main/c/ceph/
[12:40] <willi> yes
[12:40] <willi> right
[12:40] <willi> and now?
[12:40] <kmajk> willi: did you run: ceph-deploy install ceph-mon-1
[12:40] <willi> where do i get ceph-mon?
[12:40] * scheuk (~scheuk@204.246.67.78) Quit (Server closed connection)
[12:40] * scheuk (~scheuk@204.246.67.78) has joined #ceph
[12:40] <kmajk> this should be in ceph/ceph-common for hammer release
[12:41] <kmajk> so its strange that your debs are broken
[12:41] <willi> i ran
[12:41] <willi> ceph-deploy install --release hammer ceph-mon-1 ceph-mon-2 ceph-mon-3 ceph-iscsi-1 ceph-iscsi-2 ceph-iscsi-3 ceph1 ceph2 ceph3 ceph4 ceph5 ceph6 ceph7 ceph8 ceph9 ceph10 ceph11 ceph12 ceph13 ceph14 ceph15 ceph16 ceph17 ceph18
[12:41] <kmajk> i have never installed hammer or jewel on ubuntu (only debian jessie)
[12:41] <willi> ah okay
[12:42] <willi> you mean it's better to install ceph on debian than on ubuntu?
[12:42] <kmajk> willi: ceph packages are from official ceph repo or ubuntu?
[12:42] <willi> http://download.ceph.com/debian-hammer/dists/xenial/
[12:42] <kmajk> ok
[12:42] <willi> apt-get update && apt-get dist-upgrade && apt-get install ntp git wget -y
[12:42] <willi> wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
[12:42] <willi> echo deb http://download.ceph.com/debian-hammer/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph.list
[12:43] <willi> apt-get update && apt-get dist-upgrade -y
[12:43] <willi> apt-get install ceph-deploy
[12:43] <kmajk> ok run: dpkg -L ceph && dpkg -L ceph-common | grep ceph.target
[12:43] * linuxkidd (~linuxkidd@ip70-189-207-54.lv.lv.cox.net) has joined #ceph
[12:45] <kmajk> willi: paste output
[12:45] <willi> https://paste.ee/p/7WYFL
[12:46] <kmajk> there is no file ceph.target
[12:46] <kmajk> strange
[12:46] * ccourtaut (~ccourtaut@157.173.31.93.rev.sfr.net) has joined #ceph
[12:46] * ccourtaut (~ccourtaut@157.173.31.93.rev.sfr.net) Quit ()
[12:48] * IvanJobs (~ivanjobs@103.50.11.146) Quit ()
[12:48] <kmajk> willi: ok i have solution for u
[12:49] * wgao (~wgao@106.120.101.38) Quit (Server closed connection)
[12:50] * wgao (~wgao@106.120.101.38) has joined #ceph
[12:51] <willi> ok and what?
[12:51] <kmajk> willi: https://github.com/ceph/ceph/blob/master/systemd/ceph.target download this file to: /lib/systemd/system/ceph.target
[12:51] <kmajk> .
[12:51] <willi> i test it
[12:52] <kmajk> willi: ..
[12:52] * yanzheng1 (~zhyan@125.70.23.222) has joined #ceph
[12:52] * kmajk (~kmajk@nat-hq.ext.getresponse.com) Quit (Quit: leaving)
[12:52] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) Quit (Remote host closed the connection)
[12:53] * kmajk (~kmajk@nat-hq.ext.getresponse.com) has joined #ceph
[12:53] <kmajk> willi: https://github.com/ceph/ceph/blob/master/systemd/ceph.target download this to /lib/systemd/system/ceph.target
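Note that a github.com/.../blob/... URL returns the HTML page rather than the unit file itself; a sketch of fetching the raw file instead:

    wget -O /lib/systemd/system/ceph.target \
      https://raw.githubusercontent.com/ceph/ceph/master/systemd/ceph.target
    systemctl daemon-reload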
[12:55] * yanzheng (~zhyan@125.70.22.67) Quit (Ping timeout: 480 seconds)
[12:55] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) has joined #ceph
[12:58] <willi> Failed to execute operation: Unit file is masked
[12:58] <willi> systemctl enable ceph.target
[12:59] <willi> argh
[13:00] * Eric3 (~ke@180.168.197.82) Quit (Quit: Leaving)
[13:00] * gregmark (~Adium@68.87.42.115) has joined #ceph
[13:01] * dingz (~rober@bahamas.zedat.fu-berlin.de) Quit (Quit: WeeChat 1.5)
[13:07] * Epi (~notmyname@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[13:08] * ccourtaut (~ccourtaut@157.173.31.93.rev.sfr.net) has joined #ceph
[13:10] * bniver (~bniver@pool-173-48-58-27.bstnma.fios.verizon.net) Quit (Remote host closed the connection)
[13:11] * swami1 (~swami@49.38.0.169) has joined #ceph
[13:12] * kawa2014 (~kawa@host154-114-static.124-81-b.business.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[13:13] <kmajk> willi: ceph.target should come from the packages, fixing it for systemd by hand is a waste of time imho
[13:14] * willi (~willi@212.124.32.5) Quit (Ping timeout: 480 seconds)
[13:14] <kmajk> willi: maybe remove systemd - it will revert to upstart
[13:14] <kmajk> willi: so ceph-deploy will not use systemctl but service
[13:15] * truan-wang (~truanwang@183.167.211.5) has joined #ceph
[13:16] <kmajk> willi: apt-get remove --auto-remove systemd and after reboot: apt-get purge systemd
[13:17] * karnan (~karnan@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:18] * kefu is now known as kefu|afk
[13:21] * i_m (~ivan.miro@deibp9eh1--blueice4n6.emea.ibm.com) Quit (Ping timeout: 480 seconds)
[13:25] * mashwo00 (~textual@51.179.162.234) Quit (Quit: Textual IRC Client: www.textualapp.com)
[13:26] <arcimboldo> hi all, I have an issue on Hammer: 3 osds are flapping and even if I put them out the rebalancing does not proceed
[13:26] <arcimboldo> There might be some issue with those 3 disks, but for some reason I only see errors on the upstart log, not in ceph log, kern.log or syslog
[13:29] * Racpatel (~Racpatel@2601:87:0:24af::1fbc) has joined #ceph
[13:31] * willi (~willi@p5797BB64.dip0.t-ipconnect.de) has joined #ceph
[13:32] * shyu (~Frank@218.241.172.114) has joined #ceph
[13:34] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[13:34] * neurodrone_ (~neurodron@162.243.191.67) has joined #ceph
[13:37] * Epi (~notmyname@61TAAALJX.tor-irc.dnsbl.oftc.net) Quit ()
[13:37] * Gecko1986 (~Yopi@7EXAAAIS0.tor-irc.dnsbl.oftc.net) has joined #ceph
[13:45] <micw> if i want to add a new mon, is it safe to create a fresh monmap on this mon, add all known mons and then start it?
[13:45] <micw> (in manual deployment the monmap is downloaded from the cluster but that's not good for my automated deployment because it makes a difference between the first and all other nodes
[13:45] <micw> )
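Fetching the current monmap from the cluster is the usual documented route for adding a monitor, and it can be scripted identically on every node after the first; a minimal sketch, with a hypothetical mon id "mon3":

    ceph mon getmap -o /tmp/monmap
    ceph auth get mon. -o /tmp/mon.keyring
    ceph-mon -i mon3 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
    # then start the new mon; it syncs up and joins the existing quorum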
[13:48] * valeech (~valeech@pool-108-44-162-111.clppva.fios.verizon.net) Quit (Quit: valeech)
[13:51] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[13:51] * truan-wang (~truanwang@183.167.211.5) Quit (Ping timeout: 480 seconds)
[13:58] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[14:05] <willi> systemctl enable ceph.target
[14:05] <willi> Failed to execute operation: Unit file is masked
[14:05] <willi> anyone knows?
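"Unit file is masked" means the unit is symlinked to /dev/null under /etc/systemd/system; assuming nothing masked it deliberately, a sketch of clearing that state:

    systemctl unmask ceph.target
    systemctl enable ceph.target
    systemctl start ceph.target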
[14:07] * Gecko1986 (~Yopi@7EXAAAIS0.tor-irc.dnsbl.oftc.net) Quit ()
[14:07] * DJComet (~aldiyen@exit1.torproxy.org) has joined #ceph
[14:22] * kuku (~kuku@112.202.165.64) has joined #ceph
[14:24] * kuku (~kuku@112.202.165.64) Quit (Remote host closed the connection)
[14:25] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[14:29] * kmajk (~kmajk@nat-hq.ext.getresponse.com) Quit (Ping timeout: 480 seconds)
[14:34] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Ping timeout: 480 seconds)
[14:37] * DJComet (~aldiyen@7EXAAAITZ.tor-irc.dnsbl.oftc.net) Quit ()
[14:45] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[14:47] * garphy is now known as garphy`aw
[14:48] * kmajk (~kmajk@nat-hq.ext.getresponse.com) has joined #ceph
[14:50] * penguinRaider (~KiKo@146.185.31.226) Quit (Ping timeout: 480 seconds)
[14:53] * shylesh (~shylesh@45.124.225.209) Quit (Ping timeout: 480 seconds)
[14:53] * praveen (~praveen@121.244.155.12) Quit (Remote host closed the connection)
[14:53] * praveen (~praveen@121.244.155.12) has joined #ceph
[14:56] * cronburg (~cronburg@wr-130-64-194-42.medford.tufts.edu) has joined #ceph
[15:02] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[15:02] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[15:04] * branto (~branto@213.175.37.12) has joined #ceph
[15:07] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[15:07] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[15:09] * Heebie (~thebert@dub-bdtn-office-r1.net.digiweb.ie) Quit (Read error: Connection reset by peer)
[15:10] * penguinRaider (~KiKo@14.139.82.6) has joined #ceph
[15:10] * Heebie (~thebert@dub-bdtn-office-r1.net.digiweb.ie) has joined #ceph
[15:12] * branto (~branto@213.175.37.12) Quit (Ping timeout: 480 seconds)
[15:14] * neurodrone_ (~neurodron@162.243.191.67) Quit (Quit: neurodrone_)
[15:14] * EthanL (~lamberet@cce02cs4035-fa12-z.ams.hpecore.net) Quit (Ping timeout: 480 seconds)
[15:17] * georgem (~Adium@2a02:c7d:14bf:7200:8930:87ee:81d8:cfc1) has joined #ceph
[15:17] * georgem (~Adium@2a02:c7d:14bf:7200:8930:87ee:81d8:cfc1) Quit ()
[15:17] * georgem (~Adium@206.108.127.16) has joined #ceph
[15:21] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[15:24] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[15:24] * branto (~branto@nat-pool-brq-t.redhat.com) has joined #ceph
[15:25] * EthanL (~lamberet@cce02cs4035-fa12-z.ams.hpecore.net) has joined #ceph
[15:28] <arcimboldo> hi all, I'm still struggling with this pg that doesn't want to come back
[15:28] <arcimboldo> anyone willing to help?
[15:28] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[15:34] * dugravot6 (~dugravot6@l-p-dn-in-4a.lionnois.site.univ-lorraine.fr) Quit (Remote host closed the connection)
[15:36] * Randleman (~jesse@89.105.204.182) Quit (Server closed connection)
[15:36] * Randleman (~jesse@89.105.204.182) has joined #ceph
[15:37] * Drezil1 (~homosaur@tor-exit7-readme.dfri.se) has joined #ceph
[15:37] * Psi-Jack (~psi-jack@mx.linux-help.org) Quit (Quit: Where'd my terminal go?)
[15:38] * rnowling (~rnowling@104-186-210-225.lightspeed.milwwi.sbcglobal.net) has joined #ceph
[15:38] * dec (~dec@71.29.197.104.bc.googleusercontent.com) has joined #ceph
[15:41] * Psi-Jack (~psi-jack@mx.linux-help.org) has joined #ceph
[15:41] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[15:41] * maybebuggy (~maybebugg@2a01:4f8:191:2350::2) has joined #ceph
[15:46] * kingcu (~kingcu@kona.ridewithgps.com) Quit (Server closed connection)
[15:46] * kingcu (~kingcu@kona.ridewithgps.com) has joined #ceph
[15:49] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[15:49] * yanzheng1 (~zhyan@125.70.23.222) Quit (Quit: This computer has gone to sleep)
[15:49] * Superdawg (~Superdawg@ec2-54-243-59-20.compute-1.amazonaws.com) Quit (Server closed connection)
[15:50] * Superdawg (~Superdawg@ec2-54-243-59-20.compute-1.amazonaws.com) has joined #ceph
[15:51] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[15:53] * penguinRaider (~KiKo@14.139.82.6) Quit (Ping timeout: 480 seconds)
[15:54] * Walex (~Walex@72.249.182.114) Quit (Server closed connection)
[15:54] * vimal (~vikumar@121.244.87.116) has joined #ceph
[15:54] * Walex (~Walex@SMTP.sabi.co.UK) has joined #ceph
[15:56] * branto (~branto@nat-pool-brq-t.redhat.com) Quit (Quit: Leaving.)
[15:56] * georgem1 (~Adium@24.114.64.150) has joined #ceph
[15:57] * s3an2 (~root@korn.s3an.me.uk) Quit (Server closed connection)
[15:57] * georgem1 (~Adium@24.114.64.150) Quit (Read error: Connection reset by peer)
[15:57] * s3an2 (~root@korn.s3an.me.uk) has joined #ceph
[15:57] * georgem1 (~Adium@2.222.31.80) has joined #ceph
[15:58] * jiffe (~jiffe@nsab.us) Quit (Server closed connection)
[15:59] * mnaser (~mnaser@162.253.53.193) Quit (Server closed connection)
[15:59] * jiffe (~jiffe@nsab.us) has joined #ceph
[15:59] * mnaser (~mnaser@162.253.53.193) has joined #ceph
[15:59] * georgem (~Adium@206.108.127.16) Quit (Read error: Connection reset by peer)
[15:59] * arthurh (~arthurh@38.101.34.128) Quit (Server closed connection)
[16:00] * arthurh (~arthurh@38.101.34.128) has joined #ceph
[16:00] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit (Quit: Ex-Chat)
[16:00] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[16:00] * owasserm_ (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[16:00] * owasserm_ (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit ()
[16:01] * ircolle (~Adium@dhcp-18-189-103-40.dyn.MIT.EDU) has joined #ceph
[16:01] * ircolle (~Adium@dhcp-18-189-103-40.dyn.MIT.EDU) Quit ()
[16:02] * swami1 (~swami@49.38.0.169) Quit (Quit: Leaving.)
[16:02] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) Quit (Remote host closed the connection)
[16:02] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[16:02] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) has joined #ceph
[16:04] * penguinRaider (~KiKo@103.6.219.219) has joined #ceph
[16:04] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Read error: Connection reset by peer)
[16:05] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[16:06] * gregmark (~Adium@68.87.42.115) has joined #ceph
[16:07] * Drezil1 (~homosaur@5AEAAAAAU.tor-irc.dnsbl.oftc.net) Quit ()
[16:07] * _303 (~ricin@26XAAAAEH.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:08] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[16:08] * swami1 (~swami@49.38.0.169) has joined #ceph
[16:10] * qman__ (~rohroh@2600:3c00::f03c:91ff:fe69:92af) Quit (Server closed connection)
[16:10] * qman (~rohroh@2600:3c00::f03c:91ff:fe69:92af) has joined #ceph
[16:10] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[16:12] * rendar (~I@95.234.180.40) Quit (Ping timeout: 480 seconds)
[16:14] * kjetijor (kjetijor@microbel.pvv.ntnu.no) Quit (Ping timeout: 480 seconds)
[16:14] * kuku (~kuku@112.202.165.64) has joined #ceph
[16:15] * valeech (~valeech@74-93-221-70-WashingtonDC.hfc.comcastbusiness.net) has joined #ceph
[16:15] * kefu|afk is now known as kefu
[16:15] * swami1 (~swami@49.38.0.169) Quit (Quit: Leaving.)
[16:16] * bara (~bara@ip4-83-240-10-82.cust.nbox.cz) Quit (Ping timeout: 480 seconds)
[16:17] * devicenull debates putting 'ceph tell mon.* compact' on a cron
[16:18] <cholcombe> arcimboldo, sure
[16:19] * vimal (~vikumar@121.244.87.116) Quit (Remote host closed the connection)
[16:19] * georgem1 (~Adium@2.222.31.80) Quit (Quit: Leaving.)
[16:23] * vimal (~vikumar@121.244.87.116) has joined #ceph
[16:25] * hughsaunders (~hughsaund@2001:4800:7817:101:1843:3f8a:80de:df65) Quit (Server closed connection)
[16:25] * hughsaunders (~hughsaund@2001:4800:7817:101:1843:3f8a:80de:df65) has joined #ceph
[16:27] * CephFan1 (~textual@68-233-224-175.static.hvvc.us) has joined #ceph
[16:27] * penguinRaider (~KiKo@103.6.219.219) Quit (Ping timeout: 480 seconds)
[16:28] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[16:28] <maybebuggy> hi all, we're currently struggling a bit with blocked io on our ceph cache pool. we are a bit unsure about the various parameters, so maybe somebody here could explain them a bit more detailed. - we're on hammer (0.94.7)
[16:29] <maybebuggy> target_full_ratio is set to 0.8, target_dirty_ratio to 0.4 - target_max_bytes is set (max_objects is not)
[16:30] * remix_tj (~remix_tj@bonatti.remixtj.net) Quit (Server closed connection)
[16:30] * remix_tj (~remix_tj@bonatti.remixtj.net) has joined #ceph
[16:30] <maybebuggy> from the docs target_full_ratio is "only" clean objects. so that would mean ceph is aiming for 80% clean objects and 40% dirty objects?
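For reference, these thresholds are fractions of target_max_bytes (or target_max_objects) and are set per cache pool; a sketch with a hypothetical cache pool named "cachepool":

    ceph osd pool set cachepool target_max_bytes 1099511627776    # 1 TiB
    ceph osd pool set cachepool cache_target_dirty_ratio 0.4      # start flushing dirty objects at 40% of the target
    ceph osd pool set cachepool cache_target_full_ratio 0.8       # start evicting clean objects at 80% of the target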
[16:31] * dougf (~dougf@96-38-99-179.dhcp.jcsn.tn.charter.com) Quit (Quit: bye)
[16:31] <arcimboldo> cholcombe, thnx, so the issue is the following: 3 osds are kind of flapping. I was able to trigger a rebalancing of most of the data
[16:31] <arcimboldo> but there is 1 pg that was on all of them
[16:31] * dougf (~dougf@96-38-99-179.dhcp.jcsn.tn.charter.com) has joined #ceph
[16:31] * squizzi_ (~squizzi@nat-pool-rdu-t.redhat.com) has joined #ceph
[16:32] <cholcombe> arcimboldo, what's the configuration of your cluster? osds, replicas, etc?
[16:32] <arcimboldo> I have 904 osds, most of them on platter on 36 nodes,
[16:32] <arcimboldo> 3 replicas
[16:32] <arcimboldo> mostly RBD images
[16:33] <cholcombe> ok
[16:33] * lcurtis (~lcurtis@47.19.105.250) has joined #ceph
[16:33] <arcimboldo> what happens is: sometimes one osd dies
[16:33] <cholcombe> anything interesting in the logs?
[16:33] <arcimboldo> https://cloud.s3it.uzh.ch:8080/v1/AUTH_aa92fbfda5c34db6a74e6f7d93165903/ceph-mess/ceph.osd.72.trace%2Bioerror.txt
[16:33] <ceph-ircslackbot1> <vdb> @arcimboldo: Were all 3 OSDs on the same host?
[16:33] <arcimboldo> looks like there is an op thread that commit suicide
[16:33] <devicenull> arcimboldo: looks like you have a bad drive
[16:34] <arcimboldo> ceph-ircslackbot1, no, 3 different hosts
[16:34] <ceph-ircslackbot1> <vdb> Oh and they all went down at once?
[16:34] <arcimboldo> well, I am not sure about the bad drive: I see this error every time an osd restarts. I think it's because the disks are on RAID controllers configured as JBOD
[16:34] <arcimboldo> so maybe it's sending some SCSI command that the controller does not accept
[16:34] <cholcombe> arcimboldo, what raid controller do you have running?
[16:34] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[16:35] <cholcombe> it looks like from the trace you're getting missed heartbeats which causes the osd to commit suicide
[16:35] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) Quit (Server closed connection)
[16:35] <arcimboldo> cholcombe, it's a dell machine, let me check
[16:35] * GooseYArd (~GooseYArd@ec2-52-5-245-183.compute-1.amazonaws.com) has joined #ceph
[16:36] <arcimboldo> cholcombe, that's what I have guessed, but I don't know how to proceed
[16:36] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[16:36] <cholcombe> arcimboldo, is the machine that osd is on especially busy?
[16:36] <arcimboldo> not particularly
[16:36] <cholcombe> arcimboldo, well if you don't suspect anything is physically wrong you could loosen the heartbeat timeouts
[16:36] <ceph-ircslackbot1> <vdb> If you increase the objecter and osd logging you might be able to see something more useful.
[16:36] <cholcombe> arcimboldo, is IO slow on that osd?
[16:36] <arcimboldo> I've also checked on the switch, I don't see any issue. And btw, it's strange that this happened on only these 3 osds, and not the other osds on the same hosts
[16:37] <arcimboldo> cholcombe, from iostat it doesn't look like it's doing much io
[16:37] <cholcombe> arcimboldo, interesting
[16:37] <ceph-ircslackbot1> <vdb> It is probably because of the PG. I have seen something like this before when working with RBD images.
[16:37] <cholcombe> i was figuring you had some io wait going on
[16:37] <arcimboldo> but ceph-osd is doing a lot of cpu
[16:37] * _303 (~ricin@26XAAAAEH.tor-irc.dnsbl.oftc.net) Quit ()
[16:37] * Kaervan (~elt@93.174.93.133) has joined #ceph
[16:38] <ceph-ircslackbot1> <vdb> I was playing with snapshotting and one of my clones was named badly. It had a -1 or something in its prefix. The OSDs that held those PGs kept crashing after that.
[16:38] <arcimboldo> so I don't know how to come back from this situation
[16:38] <ceph-ircslackbot1> <vdb> I tried moving objects for that image manually around but that didn't solve anything IIRC. I had to remove that image.
[16:38] <arcimboldo> I've tried to put out 2 of the 3 osds and to reweight them to 0
[16:39] <cholcombe> arcimboldo, well you could do as ceph-ircslackbot1 suggests and up the log level to see if anything comes up
[16:39] <cholcombe> arcimboldo, or you could relax the heartbeats a little
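A sketch of what "relaxing the heartbeats" could look like at runtime, assuming the default osd_heartbeat_grace of 20 s; the value 40 is only an example, and it would also need to go into ceph.conf to survive restarts:
    ceph tell osd.* injectargs '--osd_heartbeat_grace 40'
    # to persist it, in ceph.conf under [osd]:
    #   osd heartbeat grace = 40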
[16:39] <arcimboldo> now the pg map returns:
[16:39] <arcimboldo> osdmap e71632 pg 36.2dda (36.2dda) -> up [220,623,148] acting [591,72]
[16:39] * joshd1 (~jdurgin@2602:30a:c089:2b0:9504:35b4:c932:1817) has joined #ceph
[16:40] <cholcombe> right, or you could evacuate those drives
[16:40] <arcimboldo> cholcombe, evacuate those drives <= how exactly? is there a "manual" procedure?
[16:40] <arcimboldo> I tried increasing the logs but I couldn't see much
[16:41] <cholcombe> i'm not exactly sure which log level to increase for heartbeats
[16:41] * shaunm (~shaunm@74.83.215.100) Quit (Ping timeout: 480 seconds)
[16:41] <cholcombe> i'm not familiar with that code as much
[16:41] <cholcombe> arcimboldo, well try something simple first. run a bash script that times how long touching a file takes every 5 seconds or something
[16:41] <cholcombe> and show the timings
[16:41] <cholcombe> that might point to what's going on
[16:42] <arcimboldo> cholcombe, you mean on the osd drive?
[16:42] * shylesh (~shylesh@45.124.225.209) has joined #ceph
[16:42] <cholcombe> arcimboldo, yeah
[16:42] * ade (~abradshaw@212.77.58.61) Quit (Ping timeout: 480 seconds)
[16:42] <arcimboldo> I can run a dd
[16:42] <arcimboldo> root@osd-l2-36:/var/lib/ceph/osd/ceph-591# dd if=/dev/zero bs=1M count=1024 of=zero
[16:42] <arcimboldo> 1024+0 records in
[16:42] <arcimboldo> 1024+0 records out
[16:42] <arcimboldo> 1073741824 bytes (1.1 GB) copied, 0.708837 s, 1.5 GB/s
[16:42] <arcimboldo> maybe with direct io...?
[16:42] * [arx] (~arx@the.kittypla.net) has joined #ceph
[16:42] * penguinRaider (~KiKo@14.139.82.6) has joined #ceph
[16:43] <cholcombe> arcimboldo, that dd is writing to cache
[16:43] <arcimboldo> yes, I'm trying with oflag=direct
[16:43] <cholcombe> you need oflag=sync
[16:43] <arcimboldo> 1073741824 bytes (1.1 GB) copied, 14.2396 s, 75.4 MB/s
[16:44] <cholcombe> arcimboldo, which ceph version are you running?
[16:44] * aj__ (~aj@2001:6f8:1337:0:2c0c:bbaa:e7c6:f313) Quit (Ping timeout: 480 seconds)
[16:44] <arcimboldo> 0.94.7
[16:44] <cholcombe> ok
[16:44] <arcimboldo> with sync is a bit slower: 29.7 MB/s
[16:45] <cholcombe> exactly haha
[16:45] <arcimboldo> let me try on a "good" drive
[16:45] <cholcombe> well, what i was hoping was that by touching a file every few seconds and printing how long it took, maybe you could match that with something else that's going on in the cluster.
[16:46] <arcimboldo> also on a different drive is around 30MB/s
[16:46] <arcimboldo> which is quite strange...
[16:46] <arcimboldo> I remember I did some benchmarks and it was way faster
[16:46] <arcimboldo> (last year)
[16:46] <cholcombe> well linux cache is very deceptive
[16:46] <arcimboldo> but the nodes are in production now, so maybe...
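To spell out the dd variants being compared here (file name and size are arbitrary): the first run only measures the page cache, direct I/O bypasses it, and oflag=sync forces synchronous writes, which is why the numbers drop so sharply:
    dd if=/dev/zero of=zero bs=1M count=1024               # buffered: mostly measures the page cache
    dd if=/dev/zero of=zero bs=1M count=1024 oflag=direct  # bypasses the page cache
    dd if=/dev/zero of=zero bs=1M count=1024 oflag=sync    # synchronous writes
    rm -f zero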
[16:47] <arcimboldo> cholcombe, you mean "while true; do time touch testfile; sleep 1; done" ?
[16:47] <cholcombe> yeah
[16:47] <ceph-ircslackbot1> <vdb> Can you up the objecter, osd, filestore logs to 20/20 and paste the run?
[16:47] <arcimboldo> 0.001s
[16:48] <ceph-ircslackbot1> <vdb> Pretty sure I have seen this issue before. The OSD lockup I had seen was due to a broken image.
[16:48] <ceph-ircslackbot1> <vdb> And I was on hammer then.
[16:48] <cholcombe> oh i'm sure it's going to be a tiny number but maybe it bounces for some reason
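A slightly fleshed-out sketch of that latency probe; the OSD data directory and the 5 second interval are assumptions to adapt:
    #!/bin/bash
    DIR=/var/lib/ceph/osd/ceph-591        # hypothetical: the suspect osd's data dir
    while true; do
        start=$(date +%s.%N)
        touch "$DIR/.latency-probe" && sync
        end=$(date +%s.%N)
        echo "$(date -Is) touch+sync took $(echo "$end - $start" | bc) s"
        sleep 5
    done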
[16:48] * taleks (~oftc-webi@vpn-89-206-118-127.uzh.ch) has joined #ceph
[16:48] <arcimboldo> btw controller is a H830
[16:48] <ceph-ircslackbot1> <vdb> If the controller is an issue then all disks on that host should lock up and fail.
[16:48] * natarej (~natarej@101.188.54.14) Quit (Read error: Connection reset by peer)
[16:49] <cholcombe> ceph-ircslackbot1, yeah i'd suspect that also
[16:49] * natarej (~natarej@101.188.54.14) has joined #ceph
[16:49] * TMM (~hp@185.5.121.201) Quit (Quit: Ex-Chat)
[16:49] <ceph-ircslackbot1> <vdb> The OSDs are across different hosts, I think I heard earlier. And they're exactly the ones which have that degraded PG on them.
[16:51] <arcimboldo> interestingly enough, I've tried to inject --debug_heartbeatmap=20 on the osds, and one of them is stuck
[16:51] <arcimboldo> root@mon-k5-41:~# ceph tell osd.591 injectargs -- --debug_heartbeatmap=20
[16:51] <arcimboldo> 2016-07-15 16:49:14.101060 7fb6a82ff700 0 -- 10.129.31.224:0/3378414895 >> 10.129.31.136:6803/83663 pipe(0x7fb6a4066a40 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb6a405e6b0).fault
[16:51] <arcimboldo> and it's still there
[16:51] <arcimboldo> and indeed it's dead
[16:51] <arcimboldo> with the trace i've shown you :https://cloud.s3it.uzh.ch:8080/v1/AUTH_aa92fbfda5c34db6a74e6f7d93165903/ceph-mess/ceph-osd.591.tracelog.txt
[16:51] * penguinRaider (~KiKo@14.139.82.6) Quit (Ping timeout: 480 seconds)
[16:52] * CephFan1 (~textual@68-233-224-175.static.hvvc.us) Quit (Ping timeout: 480 seconds)
[16:53] <cholcombe> arcimboldo, are there any open bugs that match this?
[16:53] <arcimboldo> I think I've seen something similar, let me find them back
[16:53] <arcimboldo> none of them looked resolved though
[16:54] <arcimboldo> http://tracker.ceph.com/issues/4116
[16:54] <arcimboldo> http://tracker.ceph.com/issues/9554
[16:55] * portante (~portante@nat-pool-bos-t.redhat.com) Quit (Server closed connection)
[16:55] * portante (~portante@nat-pool-bos-t.redhat.com) has joined #ceph
[16:56] <arcimboldo> there is also http://tracker.ceph.com/issues/2116 but I don't know if it's relevant
[16:59] * xarses (~xarses@64.124.158.100) has joined #ceph
[16:59] <arcimboldo> a few logs from one of the osds: https://cloud.s3it.uzh.ch:8080/v1/AUTH_aa92fbfda5c34db6a74e6f7d93165903/ceph-mess/ceph-osd.72.log
[16:59] <arcimboldo> part of it should be with increased debug
[17:00] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:00] * babilen (~babilen@babilen.user.oftc.net) has left #ceph
[17:03] <arcimboldo> any idea on what's going on?
[17:03] * folivora (~out@devnull.drwxr-xr-x.eu) Quit (Server closed connection)
[17:03] * folivora (~out@devnull.drwxr-xr-x.eu) has joined #ceph
[17:04] * wushudoin (~wushudoin@38.99.12.237) has joined #ceph
[17:05] * kuku (~kuku@112.202.165.64) Quit (Remote host closed the connection)
[17:06] <arcimboldo> the currently stuck osd is 591, it doesn't accept "ceph tell osd.591 injectargs" and it has 19 ops with age > 120s
[17:07] * Kaervan (~elt@5AEAAAAGE.tor-irc.dnsbl.oftc.net) Quit ()
[17:15] * stupidnic (~tomwalsh@69.sub-70-193-165.myvzw.com) has joined #ceph
[17:16] <stupidnic> We have a minor issue with our production ceph cluster and I wanted to bounce our thought process off somebody before we actually did it.
[17:17] <stupidnic> We have a failed drive in one of our cluster nodes (media error confirmed in dmesg).
[17:17] * debian112 (~bcolbert@173-164-167-200-SFBA.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[17:17] <stupidnic> Our thinking is that we should just remove that OSD from the cluster completely and let Ceph reshuffle the data. Then when the time comes add the new drive to the cluster node, and then let Ceph shuffle the PGs again.
[17:18] <stupidnic> Is that the correct way to deal with this sort of failure?
[17:18] * jordan_c (~jconway@cable-192.222.199.37.electronicbox.net) has joined #ceph
[17:20] * logan (~logan@63.143.60.136) Quit (Server closed connection)
[17:20] <jordan_c> I'm having an issue where I'm trying to set up a test cluster, but when I reboot a VM the osds won't start (jewel on CentOS 7). 3 nodes, two osds per node with a single disk per osd and journal on a shared partition of a third disk. Is there something specific you need to do to recover after an unexpected reboot?
[17:20] * logan- (~logan@63.143.60.136) has joined #ceph
[17:21] <jordan_c> a monitor on each node, admin node is one of the 3 storage nodes
[17:23] * brians (~brian@80.111.114.175) Quit (Quit: Textual IRC Client: www.textualapp.com)
[17:23] * krypto (~krypto@G68-121-13-164.sbcis.sbc.com) has joined #ceph
[17:23] * jgornick (~jgornick@2600:3c00::f03c:91ff:fedf:72b4) Quit (Server closed connection)
[17:24] * jgornick (~jgornick@2600:3c00::f03c:91ff:fedf:72b4) has joined #ceph
[17:24] * kefu is now known as kefu|afk
[17:24] * valeech (~valeech@74-93-221-70-WashingtonDC.hfc.comcastbusiness.net) Quit (Quit: valeech)
[17:25] * brians (~brian@80.111.114.175) has joined #ceph
[17:26] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[17:26] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[17:26] <arcimboldo> cholcombe, any idea?
[17:27] <cholcombe> arcimboldo, sorry i was in meetings. lemme read what you wrote
[17:28] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[17:29] <maybebuggy> stupidnic: is the failing drive / osd already marked as out?
[17:29] <cholcombe> arcimboldo, the stuck pg is on 591 right?
[17:29] <stupidnic> maybebuggy: Let me check. I am pretty sure it is.
[17:29] <arcimboldo> stuck osds are 72, 272 and 591
[17:29] <maybebuggy> then ceph should have already reshuffled the data
[17:30] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Read error: Connection reset by peer)
[17:30] <arcimboldo> stuck pg is 36.2dda
[17:30] * reed (~reed@184-23-0-196.dsl.static.fusionbroadband.com) has joined #ceph
[17:30] <cholcombe> arcimboldo, can you ceph pg 36.2dda query ?
[17:30] <arcimboldo> the pg is currently remapped, so there are 3 other osds that are supposed to take care of it, but the data was not moved
[17:30] <arcimboldo> ceph pg query takes a lot of time, I'll post the output
[17:30] <maybebuggy> stupidnic: what does "ceph -s" print regarding pgs? if there is something like <number-of-pgs> active+clean it already did
[17:30] <arcimboldo> I have the output I've got some hours ago at https://cloud.s3it.uzh.ch:8080/v1/AUTH_aa92fbfda5c34db6a74e6f7d93165903/ceph-mess/ceph.pg.36.2dda.query.txt
[17:31] <stupidnic> maybebuggy: It doesn't actually, that's what told us there was a problem
[17:31] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[17:31] <cholcombe> arcimboldo, so that's down+peering. and the acting set is 220 623 148. Are all of those osds up?
[17:31] <stupidnic> maybebuggy: ceph osd tree shows it marked as down, but not out
[17:31] <arcimboldo> cholcombe, yes
[17:32] <arcimboldo> but they don't have all the data, I can see the pg is around 30G and they only have 4G
[17:32] <stupidnic> maybebuggy: we show 7 PGs as inconsistent.
[17:32] <arcimboldo> the pg goes from down+peering to active+undersized+degraded+remapped+backfilling then one of the original osds dies
[17:33] <cholcombe> arcimboldo, right cause the heartbeat fails right?
[17:33] * shylesh (~shylesh@45.124.225.209) Quit (Ping timeout: 480 seconds)
[17:33] <arcimboldo> cholcombe, that's what I think
[17:33] <cholcombe> arcimboldo, what's your max_backfills set to?
[17:34] <maybebuggy> stupidnic: is that osd process still running? or do you have set ceph to "noout"?
[17:34] <arcimboldo> after a while I see that in the logs of the other osd there is something like "this osd didn't reply since ..."
[17:34] <arcimboldo> and then the osd that is not replying dies with the stacktrace
[17:34] <stupidnic> maybebuggy: The OSD is not running.
[17:34] <arcimboldo> cholcombe, how can I see it? ceph --admin-daemon ... config show?
[17:34] <cholcombe> arcimboldo, well that circles me back to my original thought. you could try relaxing the heartbeat timeout time and maybe lowering the max_backfills and see if that gets it moving
[17:34] <cholcombe> arcimboldo, correct
[17:35] <arcimboldo> "osd_max_backfills": "10",
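For reference, a sketch of the two usual ways to read (or lower) that value, assuming the default admin socket path for osd.591:
    ceph --admin-daemon /var/run/ceph/ceph-osd.591.asok config get osd_max_backfills
    ceph tell osd.* injectargs '--osd_max_backfills 2'   # example: throttle backfill cluster-wide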
[17:35] <ceph-ircslackbot1> <vdb> @arcimboldo: From your logs it's the image with `rbd_data.4f23891b0f8fea` header that is broken.
[17:35] <ceph-ircslackbot1> <vdb> You need to find that image and do a `rbd info` on it.
[17:35] <ceph-ircslackbot1> <vdb> It will most probably hang, and rightfully so.
[17:35] <arcimboldo> I know which volume is
[17:35] <cholcombe> arcimboldo, ^^
[17:36] * penguinRaider (~KiKo@14.139.82.6) has joined #ceph
[17:36] <arcimboldo> actually, I think everything started when I tried to increase the size of a volume
[17:36] <cholcombe> arcimboldo, max backfills of 10 is prob fine given you have 900 osds
[17:36] <maybebuggy> stupidnic: hm, not sure but imho that should make ceph rebalance the cluster. do you have a pool with size=1 or min_size=1 so it could be that during write there was some data only in the failed pg?
[17:36] <arcimboldo> root@mon-k5-41:~# rbd -p cinder info volume-16dc1320-b478-4353-b856-16d470f287dd
[17:36] <arcimboldo> rbd image 'volume-16dc1320-b478-4353-b856-16d470f287dd':
[17:36] <arcimboldo> size 214 TB in 56320000 objects
[17:36] <arcimboldo> order 22 (4096 kB objects)
[17:36] <arcimboldo> block_name_prefix: rbd_data.4f23891b0f8fea
[17:36] <arcimboldo> format: 2
[17:36] <arcimboldo> features: layering, exclusive, object map
[17:36] <arcimboldo> flags: object map invalid
[17:37] <stupidnic> maybebuggy: we do have a min_size of one as our replica size is 2. The PGs failed during a scrub; they show up as scrub errors
[17:37] <arcimboldo> btw, rbd info takes an awful amount of time on this volume
[17:37] <ceph-ircslackbot1> <vdb> arcimboldo: I don't know what you can do with it. Can you export and import it perhaps? The image I was playing with was not in production so I could just delete it.
[17:37] <arcimboldo> ceph-ircslackbot1, can I get rid of that volume?
[17:37] <maybebuggy> stupidnic: http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ <- pgs inconsistent
[17:37] * Inuyasha (~Guest1390@torrelay5.tomhek.net) has joined #ceph
[17:37] * moon (~moon@tunnel.greenhost.nl) has joined #ceph
[17:37] <ceph-ircslackbot1> <vdb> Yes, ideal if you can simply rbd rm it.
[17:37] <maybebuggy> stupidnic: ceph pg repair {pg-id}
[17:38] <stupidnic> maybebuggy: so I should just remove that OSD?
[17:38] <arcimboldo> this image is in production but it's mine and I can delete it, but when I tried it was stuck
[17:38] <maybebuggy> stupidnic: you don't have to. the pg should be repairable without that osd being up...
[17:39] <ceph-ircslackbot1> <vdb> arcimboldo: Beautiful.
[17:39] <maybebuggy> stupidnic: regarding the osd it's up to you, either replace the failing disk, reformat it and bring the osd back up or remove it, and later add one again... in both cases the starting of the osd will make ceph rebalance itself
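A sketch of the inconsistent-PG repair flow being described, with a hypothetical pg id; see also the troubleshooting-pg page linked above:
    ceph health detail | grep inconsistent   # list the inconsistent pgs
    ceph pg repair 3.1a                      # hypothetical pg id taken from that output
    ceph -w                                  # watch for the scrub/repair result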
[17:39] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[17:40] <arcimboldo> ceph-ircslackbot1, it would be beautiful, but I'm not sure it works
[17:40] * vimal (~vikumar@114.143.163.114) has joined #ceph
[17:41] <arcimboldo> I wish there was a way to delete it manually...
[17:41] <arcimboldo> like deleting files around
[17:41] <cholcombe> arcimboldo, well that might not be so easy to do
[17:41] * sw3 (sweaung@2400:6180:0:d0::66:100f) Quit (Server closed connection)
[17:41] <cholcombe> because your rbd image is sharded into chunks you'd have to track down all chunks
[17:42] * sw3 (sweaung@2400:6180:0:d0::66:100f) has joined #ceph
[17:42] <arcimboldo> root@mon-k5-41:~# rbd -p cinder rm volume-16dc1320-b478-4353-b856-16d470f287dd
[17:42] <arcimboldo> 2016-07-15 17:41:40.209882 7f77e24f9700 0 -- 10.129.31.224:0/2541751862 >> 10.129.31.121:6804/107559 pipe(0x7f77e8103840 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x7f77e8107ae0).fault
[17:42] <arcimboldo> 2016-07-15 17:41:41.703029 7f77e24f9700 0 -- 10.129.31.224:0/2541751862 >> 10.129.31.121:6804/107559 pipe(0x7f77e8103840 sd=7 :51214 s=1 pgs=0 cs=0 l=1 c=0x7f77e8107ae0).connect claims to be 10.129.31.121:6804/108793 not 10.129.31.121:6804/107559 - wrong node!
[17:42] <arcimboldo> 2016-07-15 17:41:42.503835 7f77e24f9700 0 -- 10.129.31.224:0/2541751862 >> 10.129.31.121:6804/107559 pipe(0x7f77e8103840 sd=7 :51220 s=1 pgs=0 cs=0 l=1 c=0x7f77e8107ae0).connect claims to be 10.129.31.121:6804/108793 not 10.129.31.121:6804/107559 - wrong node!
[17:42] <arcimboldo> what does it mean, wrong node??
[17:43] <arcimboldo> maybe an osd was respawned in the meantime?
[17:43] * shaunm (~shaunm@cpe-192-180-17-174.kya.res.rr.com) has joined #ceph
[17:43] * dnunez (~dnunez@130.64.25.56) has joined #ceph
[17:44] * penguinRaider (~KiKo@14.139.82.6) Quit (Ping timeout: 480 seconds)
[17:44] * debian112 (~bcolbert@64.235.157.198) has joined #ceph
[17:44] <cholcombe> well on connect the client shares sock_t info. I think ceph caches old clients for a bit to speed up the cephx back and forth
[17:45] <cholcombe> i'm not exactly sure what it's trying to say here
[17:45] * swami1 (~swami@27.7.162.30) has joined #ceph
[17:45] <cholcombe> arcimboldo, it does look like the tcp socket changed
[17:46] <arcimboldo> I'm not sure rbd rm will ever succeed
[17:46] <arcimboldo> alternative "solution": how can I know which rbd images have objects in a PG?
[17:46] <cholcombe> arcimboldo, good question. time to go down the rabbit hole :)
[17:48] <arcimboldo> cholcombe, I don't feel prepared for it
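One way down that rabbit hole, as a sketch: map the image's objects to their PGs, or peek at the PG's directory on a filestore OSD. The pool, prefix, pg id and osd id below are the ones from this conversation; everything else is an assumption:
    # which pg does each object of the image map to? (slow: needs a pool listing)
    rados -p cinder ls | grep '^rbd_data.4f23891b0f8fea' | head -n 100 | \
      while read obj; do ceph osd map cinder "$obj"; done | grep '36.2dda'
    # or, on an osd that holds the pg (filestore), the on-disk file names carry the object prefixes
    find /var/lib/ceph/osd/ceph-591/current/36.2dda_head -type f | head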
[17:48] <arcimboldo> so, the delete fails because the osd is respawned?
[17:48] * ndru_ (~jawsome@104.236.94.35) Quit (Server closed connection)
[17:48] * ndru (~jawsome@104.236.94.35) has joined #ceph
[17:49] <arcimboldo> Jul 15 16:32:10 osd-k6-30 kernel: [ 1607.903929] init: ceph-osd (ceph/272) main process (7020) killed by ABRT signal
[17:49] <arcimboldo> Jul 15 16:32:10 osd-k6-30 kernel: [ 1607.903941] init: ceph-osd (ceph/272) main process ended, respawning
[17:49] <arcimboldo> since the OSDs do not stay alive for long enough...
[17:49] <ceph-ircslackbot1> <vdb> arcimboldo: Are you deleting all the objects?
[17:50] <arcimboldo> I was running rbd rm
[17:50] <ceph-ircslackbot1> <vdb> As long as you are doing them individually you should be fine. Make sure to remove the image metadata after.
[17:50] <ceph-ircslackbot1> <vdb> Yep, you'd need to run `rbd rm` at the last.
[17:50] <arcimboldo> 2016-07-15 17:49:31.914097 7f37693777c0 -1 librbd: cannot obtain exclusive lock - not removing
[17:50] <arcimboldo> Removing image: 0% complete...failed.
[17:50] <arcimboldo> rbd: error: image still has watchers
[17:50] <arcimboldo> This means the image is still open or the client using it crashed. Try again after closing/unmapping it or waiting 30s for the crashed client to timeout.
[17:52] <cholcombe> arcimboldo, looks like https://www.sebastien-han.fr/blog/2012/07/16/rbd-objects/ has an rbd info cmd to find the block_name_prefix
[17:52] * micw (~micw@ip92346916.dynamic.kabel-deutschland.de) Quit (Quit: Leaving)
[17:52] <stupidnic> maybebuggy: Thanks for the help. We have the PGs cleaned up now, and we are working on getting a replacement drive into the DC now.
[17:52] <cholcombe> arcimboldo, do you have anyone mapping it?
[17:52] * penguinRaider (~KiKo@103.6.219.219) has joined #ceph
[17:52] <cholcombe> arcimboldo, there's a list watchers command to see
[17:53] <maybebuggy> stupidnic: great to hear :)
[17:53] <arcimboldo> apparently yes
[17:53] <cholcombe> rados -p rbd listwatchers {rbd_image}
[17:53] <cholcombe> i believe that's it
[17:54] <ceph-ircslackbot1> <vdb> Yep. or `rbd lock list image`.
[17:54] <ceph-ircslackbot1> <vdb> Or even `rbd status image` I think.
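Putting those together as a sketch: for a format 2 image the watch sits on the header object, so listwatchers is usually pointed at rbd_header.<id>, where the id comes from the block_name_prefix shown earlier:
    rados -p cinder listwatchers rbd_header.4f23891b0f8fea
    rbd status cinder/volume-16dc1320-b478-4353-b856-16d470f287dd
    rbd lock list cinder/volume-16dc1320-b478-4353-b856-16d470f287dd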
[17:54] <arcimboldo> step back: I was trying to delete the rbd volume (which is a cinder volume), the watcher is the node running cinder-volume. I've restarted cinder-volume though, so I don't know why it's still listed as watcher
[17:54] <arcimboldo> (I did rbd status <image>)
[17:55] <ceph-ircslackbot1> <vdb> Clearly it's doing something with the image.
[17:55] <cholcombe> arcimboldo, can you shut down cinder-volume for a minute while you delete it?
[17:55] <ceph-ircslackbot1> <vdb> ^
[17:55] <arcimboldo> I've reset the status of the cinder volume and restarted the daemon, let's see if it's enough
[17:56] <arcimboldo> but rbd status takes a lot...
[17:56] <cholcombe> ok
[17:56] <arcimboldo> ok that was enough to have no watchers anymore. Let's try rbd rm now...
[17:57] * mattbenjamin (~mbenjamin@12.118.3.106) Quit (Remote host closed the connection)
[17:57] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[17:57] * cholcombe crosses fingers
[17:57] * sugoruyo (~georgev@paarthurnax.esc.rl.ac.uk) Quit (Quit: I'm going home!)
[17:57] * taleks (~oftc-webi@vpn-89-206-118-127.uzh.ch) Quit (Ping timeout: 480 seconds)
[17:58] * danieagle (~Daniel@201-95-100-147.dsl.telesp.net.br) has joined #ceph
[17:58] * arcimboldo crosses fingers as well
[17:59] * arcimboldo also lights a cigarette...
[17:59] <arcimboldo> btw, the pg is now again in remapped+peering
[17:59] <ceph-ircslackbot1> <vdb> Nice!
[18:00] <ceph-ircslackbot1> <vdb> If `rbd rm` doesn't work, I'd recommend removing individual objects that match `^rbd_data.4f23891b0f8fea` prefix.
[18:01] <arcimboldo> ceph-ircslackbot1, so, all the objects of the volume have names starting with rbd_data.4f23891b0f8fea?
[18:02] <arcimboldo> 2016-07-15 17:59:04.295101 7ff238bf5700 0 -- 10.129.31.224:0/1595742704 >> 10.129.31.136:6803/122684 pipe(0x7ff24010a820 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff24010eac0).fault
[18:02] <arcimboldo> 2016-07-15 18:01:41.036343 7ff2388f2700 0 -- 10.129.31.224:0/1595742704 >> 10.129.31.121:6809/115570 pipe(0x7ff240114050 sd=9 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff24010d2c0).fault
[18:02] <kmajk> anyone knowns how to add more rgw instances in one zone in new mutlisite jewel active/active setup?
[18:02] <ceph-ircslackbot1> <vdb> (Aside: I don't know why it's showing my nick as `ceph-ircslackbot1`, it should be `vdb`).
[18:02] <ceph-ircslackbot1> <vdb> arcimboldo: That is correct.
[18:02] <ceph-ircslackbot1> <vdb> All the rados objects for that rbd image will be prefixed as ^rbd_data.4f23891b0f8fea.
[18:03] * dalegaard-39554 (~dalegaard@vps.devrandom.dk) Quit (Server closed connection)
[18:03] * dalegaard-39554 (~dalegaard@vps.devrandom.dk) has joined #ceph
[18:03] * stupidnic (~tomwalsh@69.sub-70-193-165.myvzw.com) Quit (Quit: stupidnic)
[18:07] * swami1 (~swami@27.7.162.30) Quit (Quit: Leaving.)
[18:07] * cronburg_ (~cronburg@nat-pool-bos-t.redhat.com) Quit (Server closed connection)
[18:07] * Inuyasha (~Guest1390@5AEAAAAK5.tor-irc.dnsbl.oftc.net) Quit ()
[18:07] * cronburg_ (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[18:11] <arcimboldo> I'll work from home... the image is still not deleting, I had to restart another osd.
[18:11] <arcimboldo> I've increased osd_heartbeat_grace
[18:11] * _are__ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) Quit (Server closed connection)
[18:11] * _are_ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) has joined #ceph
[18:11] <arcimboldo> if it doesn't work I will try to delete all the objects
[18:12] <arcimboldo> can I use a wildcard when running rados rm?
[18:13] <ceph-ircslackbot1> <vdb> I don't think so.
[18:13] <ceph-ircslackbot1> <vdb> Has to be the object.
[18:13] <ceph-ircslackbot1> <vdb> But I haven't tried, so feel free to. ;)
[18:14] * blizzow (~jburns@50.243.148.102) has joined #ceph
[18:15] * kiranos_ (~quassel@109.74.11.233) Quit (Server closed connection)
[18:15] * kiranos (~quassel@109.74.11.233) has joined #ceph
[18:15] * cronburg (~cronburg@wr-130-64-194-42.medford.tufts.edu) Quit (Ping timeout: 480 seconds)
[18:16] * jermudgeon (~jhaustin@gw1.ttp.biz.whitestone.link) has joined #ceph
[18:16] * rraja (~rraja@121.244.87.117) Quit (Quit: Leaving)
[18:16] * goberle_ (~goberle@mid.ygg.tf) Quit (Server closed connection)
[18:16] <cholcombe> yeah i don't think it supports glob matching either
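Since rados rm only takes explicit object names, the usual workaround is to pipe a filtered listing into it. A sketch, assuming the prefix above, and saving the listing first because it can take a long time on ~300M objects:
    rados -p cinder ls > /tmp/cinder.objects
    grep '^rbd_data.4f23891b0f8fea' /tmp/cinder.objects | xargs -n 1 -P 8 rados -p cinder rm
    # afterwards, retry the image-level delete to clean up the remaining metadata
    rbd -p cinder rm volume-16dc1320-b478-4353-b856-16d470f287dd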
[18:16] * goberle (~goberle@mid.ygg.tf) has joined #ceph
[18:21] * arcimboldo (~antonio@dhcp-y11-zi-s3it-130-60-34-054.uzh.ch) Quit (Ping timeout: 480 seconds)
[18:21] * abhishekvrshny (~abhishekv@180.179.116.54) Quit (Server closed connection)
[18:22] * joshd1 (~jdurgin@2602:30a:c089:2b0:9504:35b4:c932:1817) Quit (Quit: Leaving.)
[18:22] * abhishekvrshny (~abhishekv@180.179.116.54) has joined #ceph
[18:24] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:27] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[18:30] * jordan_c (~jconway@cable-192.222.199.37.electronicbox.net) Quit (Ping timeout: 480 seconds)
[18:30] * ffilzwin (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) Quit (Quit: Leaving)
[18:34] * zirpu (~zirpu@00013c46.user.oftc.net) Quit (Server closed connection)
[18:34] * zirpu (~zirpu@2600:3c02::f03c:91ff:fe96:bae7) has joined #ceph
[18:35] * zirpu is now known as Guest3141
[18:35] * ffilzwin (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) has joined #ceph
[18:36] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[18:37] * Grum (~Dysgalt@static-ip-85-25-103-119.inaddr.ip-pool.com) has joined #ceph
[18:40] * aj__ (~aj@x590cffa3.dyn.telefonica.de) has joined #ceph
[18:42] * zerick_ (~zerick@104.131.101.65) Quit (Server closed connection)
[18:42] * zerick (~zerick@irc.quassel.zerick.me) has joined #ceph
[18:44] * shylesh (~shylesh@59.95.68.59) has joined #ceph
[18:44] * shylesh (~shylesh@59.95.68.59) Quit (autokilled: This host may be infected. Mail support@oftc.net with questions. BOPM (2016-07-15 16:44:38))
[18:45] * dustinm` (~dustinm`@68.ip-149-56-14.net) Quit (Server closed connection)
[18:45] * dustinm` (~dustinm`@68.ip-149-56-14.net) has joined #ceph
[18:52] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) Quit (Ping timeout: 480 seconds)
[18:52] * wdennis (~wdennis@138.15.207.164) has joined #ceph
[18:53] <wdennis> Hi all - getting an error trying to do 'ceph-deploy osd activate' on a CentOS 7.2 node
[18:54] <wdennis> [ceph4][INFO ] Running command: ceph-disk -v activate --mark-init sysvinit --mount /dev/sdc
[18:54] <wdennis> [ceph4][WARNIN] INFO:ceph-disk:Running command: /sbin/blkid -p -s TYPE -ovalue -- /dev/sdc
[18:54] <wdennis> [ceph4][WARNIN] ceph-disk: Cannot discover filesystem type: device /dev/sdc: Line is truncated:
[18:54] <wdennis> [ceph4][ERROR ] RuntimeError: command returned non-zero exit status: 1
[18:54] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[18:54] * stupidnic (~tomwalsh@73.106.74.212) has joined #ceph
[18:55] <wdennis> I found that if I modify the command to "/sbin/blkid -p -s PTTYPE -ovalue -- /dev/sdc" I get an answer ("gpt")
[18:55] * karnan (~karnan@106.51.130.90) has joined #ceph
[18:57] * arcimboldo (~antonio@84-75-174-248.dclient.hispeed.ch) has joined #ceph
[18:57] <wdennis> How may I proceed with 'ceph-deploy osd activate'?
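One direction to try, as a sketch rather than a confirmed fix: the PTTYPE=gpt answer suggests the whole disk was passed where ceph-disk expects a prepared data partition, so either activate the partition or (if the disk holds no data yet) prepare it first:
    ceph-deploy osd activate ceph4:/dev/sdc1          # activate the data partition, not the whole disk
    # or, if /dev/sdc was never prepared (warning: this wipes it):
    ceph-deploy disk zap ceph4:/dev/sdc
    ceph-deploy osd prepare ceph4:/dev/sdc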
[18:58] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[18:58] * rakeshgm (~rakesh@106.51.225.53) Quit (Quit: Leaving)
[18:59] * moon (~moon@tunnel.greenhost.nl) Quit (Ping timeout: 480 seconds)
[18:59] * moon (~moon@tunnel.greenhost.nl) has joined #ceph
[19:01] * rossdylan (~rossdylan@2605:6400:1:fed5:22:68c4:af80:cb6e) Quit (Server closed connection)
[19:01] * rossdylan (~rossdylan@2605:6400:1:fed5:22:68c4:af80:cb6e) has joined #ceph
[19:03] * bearkitten (~bearkitte@cpe-76-172-86-115.socal.res.rr.com) Quit (Server closed connection)
[19:04] * bearkitten (~bearkitte@cpe-76-172-86-115.socal.res.rr.com) has joined #ceph
[19:04] * yehudasa_ (~yehudasa@206.169.83.146) Quit (Server closed connection)
[19:04] * yehudasa_ (~yehudasa@206.169.83.146) has joined #ceph
[19:05] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[19:05] * joshd (~jdurgin@206.169.83.146) Quit (Server closed connection)
[19:05] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) has joined #ceph
[19:06] * joshd (~jdurgin@206.169.83.146) has joined #ceph
[19:06] * debian112 (~bcolbert@64.235.157.198) has left #ceph
[19:07] * Grum (~Dysgalt@5AEAAAAPG.tor-irc.dnsbl.oftc.net) Quit ()
[19:07] * Chaos_Llama (~luigiman@Relay-J.tor-exit.network) has joined #ceph
[19:07] * swami1 (~swami@27.7.162.30) has joined #ceph
[19:08] * moon (~moon@tunnel.greenhost.nl) Quit (Ping timeout: 480 seconds)
[19:08] * moon (~moon@tunnel.greenhost.nl) has joined #ceph
[19:08] * kefu|afk is now known as kefu
[19:09] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[19:12] * jermudgeon_ (~jhaustin@31.207.56.59) has joined #ceph
[19:13] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[19:14] * mykola (~Mikolaj@91.245.74.217) has joined #ceph
[19:15] * jermudgeon (~jhaustin@gw1.ttp.biz.whitestone.link) Quit (Ping timeout: 480 seconds)
[19:15] * jermudgeon_ is now known as jermudgeon
[19:16] * kefu is now known as kefu|afk
[19:16] * kefu|afk (~kefu@114.92.96.253) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[19:16] * niknakpaddywak (~xander.ni@outbound.lax.demandmedia.com) Quit (Quit: Lost terminal)
[19:18] * Skaag (~lunix@65.200.54.234) has joined #ceph
[19:18] * chutz (~chutz@rygel.linuxfreak.ca) Quit (Server closed connection)
[19:18] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[19:19] * moon (~moon@tunnel.greenhost.nl) Quit (Ping timeout: 480 seconds)
[19:21] * destrudo (~destrudo@tomba.sonic.net) Quit (Server closed connection)
[19:22] * destrudo (~destrudo@tomba.sonic.net) has joined #ceph
[19:22] * swami1 (~swami@27.7.162.30) Quit (Quit: Leaving.)
[19:23] * kuku (~kuku@112.202.165.64) has joined #ceph
[19:23] * zero_shane (~textual@c-73-231-84-106.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:24] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[19:25] * cathode (~cathode@50.232.215.114) has joined #ceph
[19:27] * kuku (~kuku@112.202.165.64) Quit (Remote host closed the connection)
[19:27] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[19:29] * jklare (~jklare@185.27.181.36) Quit (Server closed connection)
[19:29] * jklare (~jklare@185.27.181.36) has joined #ceph
[19:30] * zero_shane (~textual@208.46.223.218) has joined #ceph
[19:31] * masterpe (~masterpe@2a01:670:400::43) Quit (Server closed connection)
[19:31] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[19:35] * karnan (~karnan@106.51.130.90) Quit (Quit: Leaving)
[19:36] * jermudgeon_ (~jhaustin@gw1.ttp.biz.whitestone.link) has joined #ceph
[19:37] * Chaos_Llama (~luigiman@26XAAAAWX.tor-irc.dnsbl.oftc.net) Quit ()
[19:38] * kmajk (~kmajk@nat-hq.ext.getresponse.com) Quit (Ping timeout: 480 seconds)
[19:41] * mykola (~Mikolaj@91.245.74.217) Quit (Remote host closed the connection)
[19:41] * jermudgeon (~jhaustin@31.207.56.59) Quit (Ping timeout: 480 seconds)
[19:41] * jermudgeon_ is now known as jermudgeon
[19:41] * walcubi_ (~walcubi@p5795A4B9.dip0.t-ipconnect.de) Quit (Quit: Leaving)
[19:42] * toast (~Tralin|Sl@marinovi.xyz) has joined #ceph
[19:42] * JohnPreston78 (sid31393@ealing.irccloud.com) Quit (Server closed connection)
[19:42] * JohnPreston78 (sid31393@id-31393.ealing.irccloud.com) has joined #ceph
[19:52] * boolman (boolman@79.138.78.238) Quit (Ping timeout: 480 seconds)
[19:56] * mykola (~Mikolaj@91.245.74.217) has joined #ceph
[19:57] * maybebuggy (~maybebugg@2a01:4f8:191:2350::2) Quit (Quit: Leaving)
[20:00] * vimal (~vikumar@114.143.163.114) Quit (Quit: Leaving)
[20:02] * vikhyat (~vumrao@49.248.192.195) has joined #ceph
[20:03] * Sketch (~Sketch@2604:180:2::a506:5c0d) Quit (Server closed connection)
[20:03] * Sketch (~Sketch@new.rednsx.org) has joined #ceph
[20:05] * \ask (~ask@oz.develooper.com) Quit (Server closed connection)
[20:05] * \ask (~ask@oz.develooper.com) has joined #ceph
[20:12] * BranchPredictor (branch@00021630.user.oftc.net) Quit (Server closed connection)
[20:12] * toast (~Tralin|Sl@26XAAAA0Q.tor-irc.dnsbl.oftc.net) Quit ()
[20:12] * adept256 (~Plesioth@torrelay1.tomhek.net) has joined #ceph
[20:12] * BranchPredictor (branch@00021630.user.oftc.net) has joined #ceph
[20:20] * m0zes (~mozes@ns1.beocat.ksu.edu) Quit (Server closed connection)
[20:20] * m0zes (~mozes@ns1.beocat.ksu.edu) has joined #ceph
[20:25] <wdennis> If anyone on this channel is getting my messages, pls ACK (i.e., hello, is this thing on?)
[20:30] * karnan (~karnan@106.51.130.90) has joined #ceph
[20:31] * wdennis (~wdennis@138.15.207.164) Quit (Quit: Leaving...)
[20:33] * vikhyat (~vumrao@49.248.192.195) Quit (Quit: Leaving)
[20:35] <willi> hi
[20:35] <willi> is on
[20:35] <TheSov> i am on :D
[20:36] <icey> me too ;)
[20:42] * adept256 (~Plesioth@26XAAAA39.tor-irc.dnsbl.oftc.net) Quit ()
[20:42] * jacoo1 (~TomyLobo@5AEAAAA0T.tor-irc.dnsbl.oftc.net) has joined #ceph
[20:46] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[20:46] * jamespage (~jamespage@culvain.gromper.net) Quit (Server closed connection)
[20:46] * jamespage (~jamespage@culvain.gromper.net) has joined #ceph
[20:50] <arcimboldo> ceph-ircslackbot1, hi are you there?
[20:51] * LongyanG (~long@15255.s.t4vps.eu) Quit (Server closed connection)
[20:51] * LongyanG (~long@15255.s.t4vps.eu) has joined #ceph
[20:53] * stein (~stein@185.56.185.82) Quit (Server closed connection)
[20:53] <ceph-ircslackbot1> <vdb> arcimboldo: How goes it? Also `vdb` dings me.
[20:53] * stein (~stein@185.56.185.82) has joined #ceph
[20:54] <arcimboldo> well, after we removed the watcher, the cluster is now back in OK status
[20:54] <ceph-ircslackbot1> <vdb> Wooo!
[20:54] <arcimboldo> not so fast
[20:54] <ceph-ircslackbot1> <vdb> Aw.
[20:54] <arcimboldo> still, the volume is there
[20:55] <ceph-ircslackbot1> <vdb> Did you `rbd rm` after deleting the objects individually?
[20:55] <arcimboldo> whenever I try to delete it, the active osd of the good old pg goes down
[20:55] <arcimboldo> so, clearly the rbd is creating issues
[20:55] <ceph-ircslackbot1> <vdb> When you delete the image or the objects?
[20:55] <arcimboldo> I don't know if it's because it's a big volume or what
[20:55] <arcimboldo> I only tried to delete the image
[20:56] <ceph-ircslackbot1> <vdb> Yes, that won't work well. It at least did not for us.
[20:56] <arcimboldo> so, I think I need to delete the objects
[20:56] <ceph-ircslackbot1> <vdb> Do a `rbd ls`, make a list of objects prefixed with the appropriate header and then issue an operation to delete those.
[20:56] <arcimboldo> but I wonder: is it *ok* to create a big volume or not?
[20:56] <arcimboldo> or to extend a big volume
[20:56] <ceph-ircslackbot1> <vdb> How big of an image?
[20:56] <arcimboldo> 200T, I was trying to expand to 400T
[20:57] <arcimboldo> but rbd ls will show me the images. rados ls however will show me all the 300M objects....
[20:57] <ceph-ircslackbot1> <vdb> The max I have tried is 30T. Although I don't expect any issues I at least haven't tried this out.
[20:57] <arcimboldo> and I guess it will take some time
[20:57] <arcimboldo> it looks like it's doing a lot of IO on the object containing the header of the image
[20:57] <ceph-ircslackbot1> <vdb> Yes. It should take time proportional to the provisioned size rather than the actual usage.
[20:58] <ceph-ircslackbot1> <vdb> Oh `rados ls` indeed. That's what I meant. Not `rbd ls`.
[20:58] * scg (~zscg@valis.gnu.org) has joined #ceph
[20:58] <arcimboldo> it was ~200T size and it was full
[20:58] <arcimboldo> (which is the reason why I needed to extend it)
[20:58] <ceph-ircslackbot1> <vdb> Oh it was full? I see. Then your rados ls is probably the fastest way to do it. :slightly_smiling_face:
[20:59] <arcimboldo> is it *safe*?
[20:59] <arcimboldo> is there an order to delete the objects?
[20:59] <ceph-ircslackbot1> <vdb> Nope. No order. Just chuck them out in any order.
[20:59] <ceph-ircslackbot1> <vdb> It is safe in that your other objects shouldn't be impacted.
[20:59] <ceph-ircslackbot1> <vdb> Only the objects you delete should be deleted.
[21:00] <ceph-ircslackbot1> <vdb> So make sure you are picking the objects correctly.
[21:00] * rnowling (~rnowling@104-186-210-225.lightspeed.milwwi.sbcglobal.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[21:00] * stupidnic (~tomwalsh@73.106.74.212) Quit (Ping timeout: 480 seconds)
[21:00] * stupidnic_ (~tomwalsh@73.106.79.164) has joined #ceph
[21:02] <shaon> hey guys. we had an unfortunate power outage and when the power came back, I found all the osds were down. now what will be the appropriate way to bring them back with the correct journal?
[21:02] <shaon> *journal partition
[21:02] <shaon> is `ceph-disk activate-journal <dev>` safe to run?
[21:02] * irq0 (~seri@amy.irq0.org) Quit (Server closed connection)
[21:03] <shaon> according to the doc, looks like it activates the OSD as well
[21:03] <shaon> e.g http://docs.ceph.com/docs/hammer/man/8/ceph-disk/#activate-journal
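A sketch of the usual recovery steps with hammer-era ceph-disk, run on each OSD host; the device name is only an example:
    ceph-disk list                 # shows which partitions are ceph data and which are journals
    ceph-disk activate-all         # activate every prepared osd data partition it can find
    ceph-disk activate /dev/sdb1   # or per data partition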
[21:03] * irq0 (~seri@amy.irq0.org) has joined #ceph
[21:04] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[21:05] * andrewschoen (~andrewsch@192.237.167.184) Quit (Server closed connection)
[21:05] * andrewschoen (~andrewsch@2001:4801:7821:77:be76:4eff:fe10:afc7) has joined #ceph
[21:07] * karnan (~karnan@106.51.130.90) Quit (Quit: Leaving)
[21:09] * dlan (~dennis@116.228.88.131) Quit (Server closed connection)
[21:10] <willi> hey guys
[21:10] <willi> debian 8
[21:10] <willi> ceph hammer
[21:10] <willi> fresh install
[21:10] <willi> ceph-deploy mon create-initial
[21:10] <willi> [ceph-mon-1][WARNIN] Failed to execute operation: No such file or directory
[21:10] <willi> [ceph-mon-1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[21:10] <willi> [ceph_deploy.mon][ERROR ] Failed to execute command: systemctl enable ceph.target
[21:10] * Pies (~Pies@srv229.opcja.pl) Quit (Server closed connection)
[21:11] <willi> any ideas?
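One possible explanation, offered as an assumption rather than a confirmed diagnosis: newer ceph-deploy releases try to enable systemd units, while hammer's Debian 8 packages ship sysvinit scripts, so there is no ceph.target to enable. Two quick checks on the mon host:
    systemctl list-unit-files | grep -i ceph   # likely empty with hammer packages
    ls -l /etc/init.d/ceph                     # the sysvinit script should be there instead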
[21:11] * Pies (~Pies@srv229.opcja.pl) has joined #ceph
[21:12] * jacoo1 (~TomyLobo@5AEAAAA0T.tor-irc.dnsbl.oftc.net) Quit ()
[21:12] * JohnO (~oracular@tor-exit1-readme.dfri.se) has joined #ceph
[21:12] * sickology (~mio@vpn.bcs.hr) Quit (Server closed connection)
[21:13] * sickology (~mio@vpn.bcs.hr) has joined #ceph
[21:13] * nathani (~nathani@2607:f2f8:ac88::) Quit (Server closed connection)
[21:14] * nathani (~nathani@2607:f2f8:ac88::) has joined #ceph
[21:14] * mfa298 (~mfa298@krikkit.yapd.net) Quit (Server closed connection)
[21:14] * mfa298 (~mfa298@krikkit.yapd.net) has joined #ceph
[21:18] * rinek (~o@62.109.134.112) Quit (Server closed connection)
[21:18] * rinek (~o@62.109.134.112) has joined #ceph
[21:20] * dlan (~dennis@116.228.88.131) has joined #ceph
[21:22] * Nats_ (~natscogs@114.31.195.238) Quit (Server closed connection)
[21:22] * xarses (~xarses@64.124.158.100) Quit (Read error: Connection reset by peer)
[21:22] * Nats_ (~natscogs@114.31.195.238) has joined #ceph
[21:22] * xarses (~xarses@64.124.158.100) has joined #ceph
[21:22] * willi (~willi@p5797BB64.dip0.t-ipconnect.de) Quit ()
[21:23] * xarses (~xarses@64.124.158.100) Quit (Remote host closed the connection)
[21:23] * xarses (~xarses@64.124.158.100) has joined #ceph
[21:23] * xarses (~xarses@64.124.158.100) Quit (Remote host closed the connection)
[21:23] * xarses (~xarses@64.124.158.100) has joined #ceph
[21:25] * BlaXpirit (~irc@blaxpirit.com) Quit (Server closed connection)
[21:26] * BlaXpirit (~irc@blaxpirit.com) has joined #ceph
[21:26] * noahw (~noahw@96.82.80.65) has joined #ceph
[21:27] * georgem (~Adium@2.222.31.80) has joined #ceph
[21:28] * georgem (~Adium@2.222.31.80) Quit ()
[21:28] * georgem (~Adium@206.108.127.16) has joined #ceph
[21:29] * EthanL (~lamberet@cce02cs4035-fa12-z.ams.hpecore.net) Quit (Ping timeout: 480 seconds)
[21:31] * cyphase (~cyphase@000134f2.user.oftc.net) Quit (Server closed connection)
[21:31] * cyphase (~cyphase@000134f2.user.oftc.net) has joined #ceph
[21:32] * georgem (~Adium@206.108.127.16) Quit ()
[21:35] * F|1nt (~F|1nt@85-170-90-218.rev.numericable.fr) has joined #ceph
[21:35] * davidzlap (~Adium@2605:e000:1313:8003:c871:5ea9:97f4:fcb4) has joined #ceph
[21:36] * dis_ (~dis@nat-pool-brq-t.redhat.com) Quit (Server closed connection)
[21:36] * dis (~dis@00018d20.user.oftc.net) has joined #ceph
[21:37] * marcan (marcan@marcansoft.com) Quit (Server closed connection)
[21:37] * marcan (marcan@marcansoft.com) has joined #ceph
[21:41] * EthanL (~lamberet@cce02cs4035-fa12-z.ams.hpecore.net) has joined #ceph
[21:42] * lurbs_ (user@uber.geek.nz) Quit (Server closed connection)
[21:42] * lurbs (user@uber.geek.nz) has joined #ceph
[21:42] * JohnO (~oracular@5AEAAAA3W.tor-irc.dnsbl.oftc.net) Quit ()
[21:46] * post-factum (~post-fact@vulcan.natalenko.name) Quit (Killed (NickServ (Too many failed password attempts.)))
[21:46] * post-factum (~post-fact@vulcan.natalenko.name) has joined #ceph
[21:48] * Gugge-47527 (gugge@92.246.2.105) Quit (Server closed connection)
[21:48] * Gugge-47527 (gugge@92.246.2.105) has joined #ceph
[21:51] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[21:52] * react (~react@retard.io) Quit (Server closed connection)
[21:52] * react (~react@retard.io) has joined #ceph
[21:53] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[21:55] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Server closed connection)
[21:55] * brians_ (~brianoftc@brian.by) Quit (Server closed connection)
[21:55] * cholcombe (~chris@2001:67c:1562:8007::aac:40f1) has joined #ceph
[21:56] * brians_ (~brianoftc@brian.by) has joined #ceph
[21:57] * thadood (~thadood@slappy.thunderbutt.org) Quit (Server closed connection)
[21:57] * thadood (~thadood@slappy.thunderbutt.org) has joined #ceph
[22:00] * valeech (~valeech@pool-108-44-162-111.clppva.fios.verizon.net) has joined #ceph
[22:02] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[22:06] * debian112 (~bcolbert@64.235.157.198) has joined #ceph
[22:07] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[22:09] * dnunez (~dnunez@130.64.25.56) Quit (Quit: Leaving)
[22:09] * onyb (~ani07nov@203.92.59.74) has joined #ceph
[22:14] * jdillaman_ (~jdillaman@mobile-166-172-058-064.mycingular.net) has joined #ceph
[22:16] * DougalJacobs (~mps@Relay-J.tor-exit.network) has joined #ceph
[22:17] * med (~medberry@71.74.177.250) Quit (Server closed connection)
[22:17] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[22:17] * med (~medberry@71.74.177.250) has joined #ceph
[22:18] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[22:20] * truan-wang (~truanwang@114.111.166.5) has joined #ceph
[22:21] * HappyLoaf (~HappyLoaf@cpc93928-bolt16-2-0-cust133.10-3.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[22:22] * jdillaman_ (~jdillaman@mobile-166-172-058-064.mycingular.net) Quit (Ping timeout: 480 seconds)
[22:22] * jdillaman_ (~jdillaman@nat-pool-rdu-u.redhat.com) has joined #ceph
[22:24] * HappyLoaf (~HappyLoaf@cpc93928-bolt16-2-0-cust133.10-3.cable.virginm.net) has joined #ceph
[22:26] * HappyLoaf (~HappyLoaf@cpc93928-bolt16-2-0-cust133.10-3.cable.virginm.net) Quit (Read error: Connection reset by peer)
[22:26] * HappyLoaf (~HappyLoaf@cpc93928-bolt16-2-0-cust133.10-3.cable.virginm.net) has joined #ceph
[22:27] * rony (~rony@125-227-147-112.HINET-IP.hinet.net) Quit (Server closed connection)
[22:30] * karnan (~karnan@106.51.130.90) has joined #ceph
[22:32] * sileht (~sileht@gizmo.sileht.net) Quit (Server closed connection)
[22:32] * verleihnix (~verleihni@195-202-198-60.dynamic.hispeed.ch) Quit (Server closed connection)
[22:32] * sileht (~sileht@gizmo.sileht.net) has joined #ceph
[22:32] * verleihnix (~verleihni@195-202-198-60.dynamic.hispeed.ch) has joined #ceph
[22:43] * squizzi_ (~squizzi@nat-pool-rdu-t.redhat.com) Quit (Quit: bye)
[22:43] * scg (~zscg@valis.gnu.org) Quit (Ping timeout: 480 seconds)
[22:46] * DougalJacobs (~mps@26XAAABHT.tor-irc.dnsbl.oftc.net) Quit ()
[22:46] * ricin (~csharp@176.10.99.206) has joined #ceph
[22:47] * Larsen (~andreas@2001:67c:578:2::15) Quit (Server closed connection)
[22:47] * Larsen (~andreas@2001:67c:578:2::15) has joined #ceph
[22:50] * praveen (~praveen@121.244.155.12) Quit (Remote host closed the connection)
[22:53] * arcimboldo (~antonio@84-75-174-248.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[22:54] * `10` (~10@69.169.91.14) Quit (Server closed connection)
[22:55] * `10` (~10@69.169.91.14) has joined #ceph
[22:55] * valeech (~valeech@pool-108-44-162-111.clppva.fios.verizon.net) Quit (Quit: valeech)
[22:56] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[22:57] * truan-wang (~truanwang@114.111.166.5) Quit (Ping timeout: 480 seconds)
[22:57] * MrBy (~MrBy@85.115.23.2) Quit (Server closed connection)
[22:57] * MrBy (~MrBy@85.115.23.2) has joined #ceph
[22:59] * blizzow (~jburns@50.243.148.102) Quit (Ping timeout: 480 seconds)
[22:59] * HappyLoaf (~HappyLoaf@cpc93928-bolt16-2-0-cust133.10-3.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[23:00] <The_Ball> I'm replacing some OSDs in my home cluster with bigger drives. To minimise rebuilding, is it best to reweight the old OSD to zero, wait for balancing to finish, add the new OSD, wait for balancing to finish, then remove the old OSD? Or is it better/quicker to just out the old working OSD without "emptying" it, and add the new OSD before the rebalance is finished?
[23:00] * HappyLoaf (~HappyLoaf@cpc93928-bolt16-2-0-cust133.10-3.cable.virginm.net) has joined #ceph
[23:01] <The_Ball> I'm thinking it might be better to not out an OSD, but just shut it down, replace the disk with a bigger one and restart the OSD; with the same OSD id it should just rebuild, shouldn't it?
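A sketch of the "drain first" variant being weighed here; the osd id 12 is hypothetical and the service command assumes a sysvinit host:
    ceph osd crush reweight osd.12 0   # let data drain off while the osd is still up
    ceph -s                            # wait until all pgs are active+clean again
    ceph osd out 12
    service ceph stop osd.12           # on the osd's host
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    # then add the new, bigger disk as a fresh osd (e.g. with ceph-deploy osd create ...)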
[23:01] * benner (~benner@188.166.111.206) Quit (Server closed connection)
[23:01] * benner (~benner@188.166.111.206) has joined #ceph
[23:03] * Animazing (~Wut@94.242.217.235) Quit (Server closed connection)
[23:03] * Animazing (~Wut@94.242.217.235) has joined #ceph
[23:04] * willi (~willi@p200300774E3500FC2926216ED3A55C97.dip0.t-ipconnect.de) has joined #ceph
[23:06] * shaunm (~shaunm@cpe-192-180-17-174.kya.res.rr.com) Quit (Ping timeout: 480 seconds)
[23:08] * jdillaman_ (~jdillaman@nat-pool-rdu-u.redhat.com) Quit (Quit: jdillaman_)
[23:08] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) has joined #ceph
[23:08] * Sgaduuw (~eelco@willikins.srv.eelcowesemann.nl) Quit (Server closed connection)
[23:08] * Sgaduuw (~eelco@willikins.srv.eelcowesemann.nl) has joined #ceph
[23:10] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[23:10] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[23:10] * jdillaman_ (~jdillaman@nat-pool-rdu-u.redhat.com) has joined #ceph
[23:14] * lkoranda (~lkoranda@213.175.37.10) Quit (Server closed connection)
[23:14] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) has joined #ceph
[23:16] * EinstCrazy (~EinstCraz@60-249-152-164.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[23:16] * ricin (~csharp@26XAAABLS.tor-irc.dnsbl.oftc.net) Quit ()
[23:16] * mykola (~Mikolaj@91.245.74.217) Quit (Quit: away)
[23:20] * Kdecherf (~kdecherf@2001:bc8:35e0:142:7368:616f:6c61:6e00) Quit (Server closed connection)
[23:21] * Kdecherf (~kdecherf@2001:bc8:35e0:142:7368:616f:6c61:6e00) has joined #ceph
[23:21] * haplo37 (~haplo37@198-48-215-247.cpe.pppoe.ca) has joined #ceph
[23:23] * ^Spike^ (~Spike@188.cimarosa.openttdcoop.org) Quit (Server closed connection)
[23:23] * ^Spike^ (~Spike@188.cimarosa.openttdcoop.org) has joined #ceph
[23:26] * borei (~dan@node-1w7jr9qle4x5ix2kjybp8d4fv.ipv6.telus.net) has joined #ceph
[23:26] * borei (~dan@node-1w7jr9qle4x5ix2kjybp8d4fv.ipv6.telus.net) has left #ceph
[23:27] * fsimonce (~simon@host99-64-dynamic.27-79-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[23:27] * borei (~dan@node-1w7jr9qle4x5ix2kjybp8d4fv.ipv6.telus.net) has joined #ceph
[23:27] * xophe (~xophe@62-210-69-147.rev.poneytelecom.eu) Quit (Server closed connection)
[23:27] <borei> hi all
[23:27] * xophe (~xophe@62-210-69-147.rev.poneytelecom.eu) has joined #ceph
[23:28] * kaisan (~kai@213.222.7.5) Quit (Server closed connection)
[23:28] * kaisan (~kai@zaphod.kamiza.nl) has joined #ceph
[23:28] * praveen (~praveen@122.172.136.225) has joined #ceph
[23:29] <borei> i need some pointers on ceph authentication, i was following the docs at http://docs.ceph.com/docs/hammer/rbd/libvirt/, but still can't get through
[23:29] <borei> ceph-mon is saying client did not provide supported auth type
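That error usually means the client connected without cephx credentials. A sketch of the workflow from the libvirt doc linked above, where the client name, pool and secret.xml are the doc's examples:
    ceph auth get-or-create client.libvirt mon 'allow r' \
      osd 'allow class-read object_prefix rbd_children, allow rwx pool=libvirt-pool'
    virsh secret-define --file secret.xml       # secret.xml declares a <usage type='ceph'> entry
    virsh secret-set-value --secret <uuid> --base64 "$(ceph auth get-key client.libvirt)"
    # the domain's disk definition then references the same uuid and auth name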
[23:30] * nolan (~nolan@2001:470:1:41:a800:ff:fe3e:ad08) Quit (Server closed connection)
[23:30] * nolan (~nolan@2001:470:1:41:a800:ff:fe3e:ad08) has joined #ceph
[23:34] * yebyen (~yebyen@129.21.49.95) Quit (Server closed connection)
[23:34] * yebyen (~yebyen@martyfunkhouser.csh.rit.edu) has joined #ceph
[23:35] * mancdaz (~mancdaz@2a00:1a48:7806:117:be76:4eff:fe08:7623) Quit (Server closed connection)
[23:35] * mancdaz (~mancdaz@2a00:1a48:7806:117:be76:4eff:fe08:7623) has joined #ceph
[23:38] * Amto_res (~amto_res@ks312256.kimsufi.com) Quit (Server closed connection)
[23:38] * Amto_res (~amto_res@ks312256.kimsufi.com) has joined #ceph
[23:38] * jdillaman_ (~jdillaman@nat-pool-rdu-u.redhat.com) Quit (Quit: jdillaman_)
[23:39] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[23:39] * stupidnic_ (~tomwalsh@73.106.79.164) Quit (Quit: stupidnic_)
[23:41] * Svedrin (svedrin@elwing.funzt-halt.net) Quit (Server closed connection)
[23:41] * Svedrin (svedrin@elwing.funzt-halt.net) has joined #ceph
[23:43] * singler (~singler@zeta.kirneh.eu) Quit (Server closed connection)
[23:43] * singler (~singler@zeta.kirneh.eu) has joined #ceph
[23:45] * arcimboldo (~antonio@84-75-174-248.dclient.hispeed.ch) has joined #ceph
[23:47] * Georgyo (~georgyo@2600:3c03::f03c:91ff:feae:505c) Quit (Server closed connection)
[23:47] * Georgyo (~georgyo@shamm.as) has joined #ceph
[23:47] * F|1nt (~F|1nt@85-170-90-218.rev.numericable.fr) Quit (Quit: Oups, just gone away...)
[23:49] * corevoid (~lewis@ip68-5-125-61.oc.oc.cox.net) has joined #ceph
[23:51] * davidzlap (~Adium@2605:e000:1313:8003:c871:5ea9:97f4:fcb4) Quit (Ping timeout: 480 seconds)
[23:51] * espeer (~quassel@41.78.129.253) Quit (Server closed connection)
[23:51] * carter (~carter@li98-136.members.linode.com) Quit (Server closed connection)
[23:51] * espeer (~quassel@phobos.isoho.st) has joined #ceph
[23:51] * carter (~carter@li98-136.members.linode.com) has joined #ceph
[23:54] * corevoid (~lewis@ip68-5-125-61.oc.oc.cox.net) Quit (Quit: Ex-Chat)
[23:55] * corevoid (~corevoid@ip68-5-125-61.oc.oc.cox.net) has joined #ceph
[23:56] * jdillaman_ (~jdillaman@mobile-166-172-058-064.mycingular.net) has joined #ceph
[23:56] * haplo37 (~haplo37@198-48-215-247.cpe.pppoe.ca) Quit (Ping timeout: 480 seconds)
[23:56] * chrome0 (~chrome0@mail.sabaini.at) Quit (Server closed connection)
[23:57] * chrome0 (~chrome0@mail.sabaini.at) has joined #ceph
[23:57] * dmanchad (~dmanchad@66.187.233.206) Quit (Server closed connection)
[23:57] * dmanchad (~dmanchad@nat-pool-bos-t.redhat.com) has joined #ceph
[23:58] * debian112 (~bcolbert@64.235.157.198) Quit (Ping timeout: 480 seconds)
[23:59] * amospalla (~amospalla@0001a39c.user.oftc.net) Quit (Server closed connection)
[23:59] * amospalla (~amospalla@0001a39c.user.oftc.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.