#ceph IRC Log


IRC Log for 2016-03-30

Timestamps are in GMT/BST.

[0:03] * georgem (~Adium@206.108.127.16) Quit (Ping timeout: 480 seconds)
[0:04] * cathode (~cathode@50.232.215.114) Quit (Quit: Leaving)
[0:05] * thansen (~thansen@17.253.sfcn.org) has joined #ceph
[0:07] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[0:07] * csoukup (~csoukup@159.140.254.106) Quit (Ping timeout: 480 seconds)
[0:08] <descention> is it possible to recover an osd from a host that dies? say, by moving the disk to a different ceph host?
[0:09] <descention> I don't need to know how, I just need to know if it can be done
[0:09] <olid1981110> i think yes
[0:09] <descention> and maybe a topic I could use for google fu
[0:14] <khyron> but you will need the journal also? if it is stored on the same disk you are lucky!
[0:15] <khyron> sorry without the ?.... you need the data and the journal
[0:15] <olid1981110> well just search for "recover osd data" or something like that
[0:15] <olid1981110> there is also a chance to get the data without the journal
[0:16] * bjornar_ (~bjornar@ti0099a430-1561.bb.online.no) Quit (Ping timeout: 480 seconds)
[0:16] <khyron> really? that is new to me....
[0:17] <descention> I read somewhere you can create a new journal
[0:18] <descention> http://ceph.com/planet/ceph-recover-osds-after-ssd-journal-failure/
[0:18] <descention> thanks olid1981110, I found an article for recovering
[0:19] <descention> I just didn't use the right words for google to find it
[0:20] * gopher_49 (~gopher_49@host2.drexchem.com) Quit (Ping timeout: 480 seconds)
[0:21] <khyron> thanks for the url descention, I need to try that
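
For reference: moving a surviving data disk to another host generally works because the OSD's identity lives on the disk itself, and the linked article's journal-recovery approach boils down to recreating an empty journal for the affected OSD. A rough sketch, assuming osd.2 and default paths; the exact service commands depend on the init system and release:

    ceph osd set noout            # avoid rebalancing while the OSD is down
    # stop the osd.2 daemon, then recreate its journal
    ceph-osd -i 2 --mkjournal
    # start the osd.2 daemon again, then
    ceph osd unset noout
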
[0:23] * Rehevkor (~kiasyn@76GAAD242.tor-irc.dnsbl.oftc.net) Quit ()
[0:23] * olid1981110 (~olid1982@aftr-185-17-206-242.dynamic.mnet-online.de) Quit (Ping timeout: 480 seconds)
[0:25] * ibravo (~ibravo@72.198.142.104) Quit (Quit: This computer has gone to sleep)
[0:29] * gopher_49 (~gopher_49@host2.drexchem.com) has joined #ceph
[0:31] * Skaag (~lunix@65.200.54.234) has joined #ceph
[0:35] * Skaag (~lunix@65.200.54.234) Quit ()
[0:36] * davidccc_ (~dcasier@230.251.90.92.rev.sfr.net) has joined #ceph
[0:36] * davidz (~davidz@2605:e000:1313:8003:e02b:e137:b5c4:ac21) has joined #ceph
[0:36] * davidz1 (~davidz@2605:e000:1313:8003:e02b:e137:b5c4:ac21) Quit (Read error: Connection reset by peer)
[0:36] * dcasier__ (~dcasier@84.197.151.77.rev.sfr.net) has joined #ceph
[0:38] * ircolle (~Adium@2601:285:201:2bf9:21f1:ec17:142a:b48d) Quit (Quit: Leaving.)
[0:42] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) Quit (Ping timeout: 480 seconds)
[0:44] * davidccc_ (~dcasier@230.251.90.92.rev.sfr.net) Quit (Ping timeout: 480 seconds)
[0:45] * Skaag (~lunix@65.200.54.234) has joined #ceph
[0:45] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Ping timeout: 480 seconds)
[0:46] * xarses (~xarses@64.124.158.100) Quit (Ping timeout: 480 seconds)
[0:46] * scubacuda (sid109325@0001fbab.user.oftc.net) Quit (Server closed connection)
[0:47] * scubacuda (sid109325@0001fbab.user.oftc.net) has joined #ceph
[0:48] * dneary (~dneary@12.139.153.2) has joined #ceph
[0:51] * thansen (~thansen@17.253.sfcn.org) Quit (Quit: Ex-Chat)
[0:52] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[0:52] * chasmo77 (~chas77@158.183-62-69.ftth.swbr.surewest.net) Quit (Ping timeout: 480 seconds)
[0:53] * gopher_49 (~gopher_49@host2.drexchem.com) Quit (Quit: Leaving)
[0:54] * fsimonce (~simon@host201-70-dynamic.26-79-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[0:57] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[0:57] * bene2 (~bene@2601:18c:8501:25e4:ea2a:eaff:fe08:3c7a) Quit (Quit: Konversation terminated!)
[0:59] * wwdillingham (~LobsterRo@mobile-166-186-168-86.mycingular.net) has joined #ceph
[0:59] * rendar (~I@host120-23-dynamic.247-95-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[1:00] * Skaag (~lunix@65.200.54.234) has joined #ceph
[1:01] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[1:01] * Skaag (~lunix@65.200.54.234) Quit ()
[1:01] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) Quit (Ping timeout: 480 seconds)
[1:05] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[1:09] * wwdillingham (~LobsterRo@mobile-166-186-168-86.mycingular.net) Quit (Quit: wwdillingham)
[1:10] * wCPO (~Kristian@188.228.31.139) Quit (Ping timeout: 480 seconds)
[1:10] * jrojas (~jrojas@68-95-184-105.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[1:11] <jrojas> anyone know of a way to help troubleshoot radosgw triggering OOM-kill on centos 7?
[1:15] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[1:16] * Skaag (~lunix@65.200.54.234) has joined #ceph
[1:18] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[1:20] * vicente (~~vicente@111-241-25-53.dynamic.hinet.net) has joined #ceph
[1:23] * JWilbur (~Moriarty@192.42.115.101) has joined #ceph
[1:27] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[1:29] * Skaag (~lunix@65.200.54.234) has joined #ceph
[1:29] * mfa298 (~mfa298@krikkit.yapd.net) Quit (Server closed connection)
[1:29] * mfa298 (~mfa298@krikkit.yapd.net) has joined #ceph
[1:30] * vicente (~~vicente@111-241-25-53.dynamic.hinet.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[1:33] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[1:34] * angdraug (~angdraug@64.124.158.100) Quit (Quit: Leaving)
[1:39] * vicente (~~vicente@111-241-25-53.dynamic.hinet.net) has joined #ceph
[1:39] * vicente (~~vicente@111-241-25-53.dynamic.hinet.net) Quit ()
[1:40] <descention> when removing a disk, after setting 'noin' and the osd to out, should I wait for HEALTH_OK?
[1:44] * yanzheng (~zhyan@125.70.23.194) has joined #ceph
[1:45] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[1:48] * vata (~vata@207.96.182.162) Quit (Quit: Leaving.)
[1:49] * yanzheng (~zhyan@125.70.23.194) Quit ()
[1:53] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[1:53] * JWilbur (~Moriarty@06SAAAL2S.tor-irc.dnsbl.oftc.net) Quit ()
[1:53] * Grimhound (~Azru@185.100.86.100) has joined #ceph
[1:54] * yuxiaozou (~yuxiaozou@128.135.100.113) Quit (Ping timeout: 480 seconds)
[1:55] <khyron> if you remove a disk, with or without the noout flag the health is going to be affected (warning), but with noout set the cluster will not rebalance; with the noin flag the osd will stay marked out, so no rebalancing either
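
For reference, the usual removal sequence those flags fit into looks roughly like this (osd id 12 is just an example; init syntax varies by distro and release):

    ceph osd out 12                  # start draining; watch "ceph -s" until PGs are active+clean again
    # stop the osd.12 daemon on its host once backfill has finished
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
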
[1:55] * yanzheng (~zhyan@125.70.23.194) has joined #ceph
[2:02] * oms101 (~oms101@p20030057EA079D00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[2:03] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[2:05] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[2:09] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:83e:b71d:7525:859a) Quit (Ping timeout: 480 seconds)
[2:10] * oms101 (~oms101@p20030057EA064F00C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[2:13] * MentalRay (~MentalRay@107.171.161.165) has joined #ceph
[2:14] * SWAT (~swat@cyberdyneinc.xs4all.nl) Quit (Read error: Connection reset by peer)
[2:14] * SWAT_ (~swat@cyberdyneinc.xs4all.nl) has joined #ceph
[2:17] * Skaag (~lunix@65.200.54.234) has joined #ceph
[2:18] * Skaag (~lunix@65.200.54.234) Quit ()
[2:18] * yanzheng (~zhyan@125.70.23.194) Quit (Quit: This computer has gone to sleep)
[2:23] * Grimhound (~Azru@7V7AADY0G.tor-irc.dnsbl.oftc.net) Quit ()
[2:25] * Lea (~LeaChim@host86-168-120-216.range86-168.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[2:28] * dneary (~dneary@12.139.153.2) Quit (Ping timeout: 480 seconds)
[2:29] * yuxiaozou (~yuxiaozou@128.135.100.113) has joined #ceph
[2:31] * khyron (~khyron@200.77.224.239) Quit (Quit: The computer fell asleep)
[2:32] * khyron (~khyron@200.77.224.239) has joined #ceph
[2:32] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[2:40] * khyron (~khyron@200.77.224.239) Quit (Ping timeout: 480 seconds)
[2:41] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[2:46] * beardo (~beardo__@beardo.cc.lehigh.edu) Quit (Server closed connection)
[2:47] * beardo (~beardo__@beardo.cc.lehigh.edu) has joined #ceph
[2:48] * georgem (~Adium@69-165-151-116.dsl.teksavvy.com) has joined #ceph
[2:53] * neobenedict (~click@Relay-J.tor-exit.network) has joined #ceph
[2:55] * Skaag (~lunix@cpe-172-91-77-84.socal.res.rr.com) has joined #ceph
[3:00] * csoukup (~csoukup@2605:a601:9c8:6b00:1592:ec6:f528:cde6) has joined #ceph
[3:08] * csoukup (~csoukup@2605:a601:9c8:6b00:1592:ec6:f528:cde6) Quit (Ping timeout: 480 seconds)
[3:10] * yuxiaozou (~yuxiaozou@128.135.100.113) Quit (Ping timeout: 480 seconds)
[3:13] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:16] * MentalRay (~MentalRay@107.171.161.165) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[3:18] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[3:19] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[3:20] * Pintomatic (sid25118@id-25118.ealing.irccloud.com) Quit (Server closed connection)
[3:20] * Pintomatic (sid25118@id-25118.ealing.irccloud.com) has joined #ceph
[3:21] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[3:23] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[3:23] * neobenedict (~click@76GAAD3BJ.tor-irc.dnsbl.oftc.net) Quit ()
[3:23] * anadrom (~maku@Relay-J.tor-exit.network) has joined #ceph
[3:26] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[3:30] * MentalRay (~MentalRay@107.171.161.165) has joined #ceph
[3:30] * MentalRay (~MentalRay@107.171.161.165) Quit ()
[3:34] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[3:34] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Read error: Connection reset by peer)
[3:36] * chengmao (~chengmao@113.57.168.154) has joined #ceph
[3:38] * Zabidin (~oftc-webi@124.13.35.225) has joined #ceph
[3:38] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[3:45] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[3:47] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[3:47] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[3:50] * yanzheng (~zhyan@125.70.23.194) has joined #ceph
[3:51] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[3:52] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[3:53] * anadrom (~maku@06SAAAMAH.tor-irc.dnsbl.oftc.net) Quit ()
[3:53] * Quatroking (~PierreW@tor-exit.dhalgren.org) has joined #ceph
[3:55] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:56] * zhaochao (~zhaochao@124.202.191.130) has joined #ceph
[3:58] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[4:01] * gopher_49 (~yaaic@mobile-166-172-057-212.mycingular.net) has joined #ceph
[4:04] * kefu (~kefu@183.193.128.175) has joined #ceph
[4:05] * dscastro (~azureuser@191.232.35.215) Quit (Quit: WeeChat 1.4)
[4:13] * yuxiaozou (~yuxiaozou@128.135.100.110) has joined #ceph
[4:15] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) has joined #ceph
[4:16] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) Quit (Remote host closed the connection)
[4:17] * gopher_49 (~yaaic@mobile-166-172-057-212.mycingular.net) Quit (Ping timeout: 480 seconds)
[4:18] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[4:22] <Zabidin> May I know why ceph is using the infernalis repo? I already set it to use hammer. Any idea why?
[4:22] <Zabidin> I'm using centos 7.
[4:23] * Quatroking (~PierreW@06SAAAMCL.tor-irc.dnsbl.oftc.net) Quit ()
[4:26] <[arx]> so the repo was originally pointing to infernalis?
[4:26] <[arx]> if so, have you cleared your cache since the change?
[4:26] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:28] <Zabidin> I'll try changing the repo url..
[4:28] <Zabidin> Hope it points to hammer
[4:36] * ftuesca (~ftuesca@181.170.121.36) Quit (Quit: Leaving)
[4:38] * naoto (~naotok@27.131.11.254) has joined #ceph
[4:38] * swami1 (~swami@27.7.160.116) has joined #ceph
[4:39] <Zabidin> Even when I put hammer in the repo it still takes infernalis > [ceph_deploy.install][DEBUG ] Installing stable version infernalis on cluster ceph hosts
[4:40] * swami1 (~swami@27.7.160.116) Quit ()
[4:42] <[arx]> i've only used ceph-deploy briefly over a year ago, can't help there.
[4:52] * georgem (~Adium@69-165-151-116.dsl.teksavvy.com) Quit (Quit: Leaving.)
[4:52] <Zabidin> Continuing to install infernalis.. If I get any errors I will post them here..
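
A hedged note on the repo problem above: ceph-deploy picks its own release unless told otherwise, so passing the release explicitly and clearing the yum cache on the nodes usually pins it (hostnames below are placeholders):

    yum clean all                                          # on each target node, drop cached infernalis metadata
    ceph-deploy install --release hammer node1 node2 node3
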
[4:53] * Nijikokun (~w2k@marylou.nos-oignons.net) has joined #ceph
[4:57] * IvanJobs (~hardes@103.50.11.146) has joined #ceph
[4:59] * davidz1 (~davidz@2605:e000:1313:8003:e02b:e137:b5c4:ac21) has joined #ceph
[4:59] * davidz (~davidz@2605:e000:1313:8003:e02b:e137:b5c4:ac21) Quit (Read error: Connection reset by peer)
[5:02] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) Quit (Ping timeout: 480 seconds)
[5:03] * chasmo77 (~chas77@158.183-62-69.ftth.swbr.surewest.net) has joined #ceph
[5:04] * chasmo77 (~chas77@158.183-62-69.ftth.swbr.surewest.net) Quit ()
[5:07] * Vacuum__ (~Vacuum@88.130.192.143) has joined #ceph
[5:08] * chasmo77 (~chas77@158.183-62-69.ftth.swbr.surewest.net) has joined #ceph
[5:09] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[5:10] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) has joined #ceph
[5:13] * Vacuum_ (~Vacuum@i59F79484.versanet.de) Quit (Ping timeout: 480 seconds)
[5:23] * Nijikokun (~w2k@7V7AADY48.tor-irc.dnsbl.oftc.net) Quit ()
[5:23] * clarjon1 (~Lite@2.tor.exit.babylon.network) has joined #ceph
[5:35] * overclk (~quassel@121.244.87.117) has joined #ceph
[5:43] * bilsted (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[5:49] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[5:50] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[5:52] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[5:53] * clarjon1 (~Lite@4MJAADNC9.tor-irc.dnsbl.oftc.net) Quit ()
[5:53] * Phase (~Lite@94.242.228.107) has joined #ceph
[5:53] * magicrobotmonkey (~magicrobo@8.29.8.68) Quit (Server closed connection)
[5:53] * Phase is now known as Guest9711
[5:54] * magicrobotmonkey (~magicrobo@8.29.8.68) has joined #ceph
[5:57] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[6:07] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[6:07] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:08] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[6:09] <Zabidin> Based on this tutorial > http://linoxide.com/storage/setup-red-hat-ceph-storage-centos-7-0/, I have 2 disks on node1.
[6:09] * MentalRay (~MentalRay@107.171.161.165) has joined #ceph
[6:10] <Zabidin> Getting error when adding disk 2 on node 1 to osd. > http://pastebin.com/rBR69pfP
[6:10] <Zabidin> Where is my mistake?
[6:12] <Zabidin> Attach image > http://postimg.org/image/x6t7b8msr/
[6:12] <Zabidin> sdc and sdd are 3TB hard disks
[6:12] <Zabidin> sde is an SSD
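
A guess at the usual fix for the prepare error above, assuming the second disk still carries an old partition table or filesystem (hostname and devices follow Zabidin's layout but are otherwise examples):

    ceph-deploy disk zap node1:sdd            # wipe old partitions/signatures first
    ceph-deploy osd prepare node1:sdd:sde     # data on sdd, journal on the SSD sde
    # depending on the ceph-deploy version an explicit "ceph-deploy osd activate" step may follow
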
[6:14] * kanagaraj (~kanagaraj@121.244.87.117) has joined #ceph
[6:17] * MentalRay (~MentalRay@107.171.161.165) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[6:17] * MentalRay (~MentalRay@107.171.161.165) has joined #ceph
[6:22] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) Quit (Ping timeout: 480 seconds)
[6:23] * Guest9711 (~Lite@06SAAAMJT.tor-irc.dnsbl.oftc.net) Quit ()
[6:23] * Sigma (~Xa@anonymous6.sec.nl) has joined #ceph
[6:29] * swami1 (~swami@49.32.0.92) has joined #ceph
[6:30] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) has joined #ceph
[6:32] * MentalRay (~MentalRay@107.171.161.165) Quit (Ping timeout: 480 seconds)
[6:34] * Randleman (~jesse@89.105.204.182) Quit (Server closed connection)
[6:34] * Randleman (~jesse@89.105.204.182) has joined #ceph
[6:50] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Quit: Leaving...)
[6:53] * Sigma (~Xa@06SAAAMK6.tor-irc.dnsbl.oftc.net) Quit ()
[6:53] * PcJamesy (~delcake@tor1e1.privacyfoundation.ch) has joined #ceph
[7:00] * TomasCZ (~TomasCZ@yes.tenlab.net) Quit (Quit: Leaving)
[7:07] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[7:08] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[7:10] * khyron (~khyron@187.190.152.61) has joined #ceph
[7:22] * kefu_ (~kefu@114.92.120.83) has joined #ceph
[7:23] * PcJamesy (~delcake@06SAAAMMS.tor-irc.dnsbl.oftc.net) Quit ()
[7:23] * luigiman (~Kizzi@79.134.255.200) has joined #ceph
[7:29] * kefu (~kefu@183.193.128.175) Quit (Ping timeout: 480 seconds)
[7:34] * chengmao (~chengmao@113.57.168.154) Quit (Ping timeout: 480 seconds)
[7:37] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) has joined #ceph
[7:48] * karnan (~karnan@121.244.87.124) has joined #ceph
[7:51] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[7:53] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[7:53] * luigiman (~Kizzi@06SAAAMOB.tor-irc.dnsbl.oftc.net) Quit ()
[7:53] * Wizeon (~zapu@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[7:55] <IvanJobs> I'm wondering, does ceph rgw support batch deletion of objects?
[7:55] <IvanJobs> anyone knows anything about it?
[7:59] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[7:59] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[8:04] <Zabidin> Why does ceph install selinux? Can it be excluded from the install?
[8:05] <vikhyat> because selinux support came in ceph with this package
[8:05] <vikhyat> you can now run ceph with selinux in enforcing mode
[8:06] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Ping timeout: 480 seconds)
[8:06] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[8:07] * krobelus (~krobelus@193-154-252-133.adsl.highway.telekom.at) has joined #ceph
[8:14] * krobelus_ (~krobelus@194-118-144-95.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[8:15] * rdas (~rdas@121.244.87.116) has joined #ceph
[8:22] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) Quit (Quit: Bye)
[8:23] * Wizeon (~zapu@76GAAD3JF.tor-irc.dnsbl.oftc.net) Quit ()
[8:23] * legion (~cooey@tor-exit4-readme.dfri.se) has joined #ceph
[8:27] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[8:28] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit ()
[8:31] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[8:32] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[8:32] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[8:32] <wgao> After setting up the mon and osds, why are my osds down?
[8:33] <wgao> Following the document, I just created osd.0 and osd.1 on one node (node1)
[8:34] <wgao> Can anybody tell me the reason, and should ceph.conf be changed after adding an osd?
[8:34] <wgao> Should I restart ceph-mon after adding an osd?
[8:44] * Concubidated (~cube@c-50-173-245-118.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:46] * huangjun (~kvirc@113.57.168.154) has joined #ceph
[8:47] * jrojas (~jrojas@68-95-184-105.lightspeed.irvnca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[8:50] * rwheeler (~rwheeler@bzq-82-81-161-51.red.bezeqint.net) has joined #ceph
[8:53] * olid1981110 (~olid1982@aftr-185-17-204-107.dynamic.mnet-online.de) has joined #ceph
[8:53] * legion (~cooey@06SAAAMR2.tor-irc.dnsbl.oftc.net) Quit ()
[8:53] * rogst (~Xylios@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[8:57] * jrojas (~jrojas@68-95-184-105.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[9:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[9:01] * bilsted (~~vicente@125-227-238-55.HINET-IP.hinet.net) has left #ceph
[9:01] * winston-d (~ubuntu@104.236.185.214) Quit (Server closed connection)
[9:01] * winston-d (~ubuntu@104.236.185.214) has joined #ceph
[9:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[9:01] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[9:04] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[9:04] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Remote host closed the connection)
[9:05] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[9:07] <khyron> wgao... no need to restart the mon daemon when adding an osd
[9:07] * fattaneh (~fattaneh@151.240.132.30) has joined #ceph
[9:08] <khyron> how did you add the osd? did you use ceph-deploy?
[9:08] <wgao> I didn't use it, I just use ceph
[9:09] <wgao> I use virtualbox machines, do you think that's not suited for this case?
[9:10] <khyron> don't think so....
[9:12] <wgao> OK
[9:12] <khyron> if ceph osd tree shows osd.0 and osd.1 down can you check /var/log/ceph/ceph-osd.0.log and /var/log/ceph/ceph-osd.1.log for more info? maybe post them in a pastebin
[9:13] * analbeard (~shw@support.memset.com) has joined #ceph
[9:14] * kefu_ is now known as kefu
[9:15] <wgao> root@node1:/home/s1user# ceph osd tree
[9:15] <wgao> # id weight type name up/down reweight
[9:15] <wgao> -1 2 root default
[9:15] <wgao> -2 2 host node1
[9:15] <wgao> 0 1 osd.0 down 1
[9:15] <wgao> 1 1 osd.1 down 1
[9:16] <IcePic> ..
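
A few generic things worth checking when freshly created OSDs sit in the down state, along the lines khyron suggests (paths and ids are examples; init syntax varies):

    ps aux | grep ceph-osd                     # are the osd processes actually running?
    tail -n 50 /var/log/ceph/ceph-osd.0.log    # look for mount, journal or auth errors
    ceph-osd -i 0 -f                           # run one in the foreground to see why it exits
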
[9:17] * dneary (~dneary@12.139.153.2) has joined #ceph
[9:23] * rogst (~Xylios@76GAAD3K3.tor-irc.dnsbl.oftc.net) Quit ()
[9:23] * PcJamesy (~Mraedis@06SAAAMVJ.tor-irc.dnsbl.oftc.net) has joined #ceph
[9:27] * evelu (~erwan@37.162.242.49) has joined #ceph
[9:30] * yuxiaozou (~yuxiaozou@128.135.100.110) Quit (Ping timeout: 480 seconds)
[9:30] <sep> morning. this is perhaps more xfs related, but i experience it only on my ceph cluster ;; an osd goes down, and my dmesg is filled, and keeps on filling, with lines like these: "XFS (sdh1): xfs_log_force: error 5 returned." ; the process is dead, but sdh1 is still mounted ;; now i want to replace the drive since i do not trust it anymore. ; how can i make xfs give up on the drive? i have tried unmounting but that usually never returns, and i can not
[9:30] <sep> shake it loose until i reboot. googling led me nowhere. any of you experienced something similar? and know what to do?
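
One generic (non-Ceph-specific) approach to the stuck-XFS problem above, offered as a sketch and to be used with care since it forcibly drops the device:

    umount -l /var/lib/ceph/osd/ceph-NN             # lazy unmount; mountpoint is an example
    echo 1 > /sys/block/sdh/device/delete           # ask the SCSI layer to forget the dead disk
    echo "- - -" > /sys/class/scsi_host/host0/scan  # rescan once the replacement drive is in
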
[9:31] * jrojas (~jrojas@68-95-184-105.lightspeed.irvnca.sbcglobal.net) Quit (Quit: Lost terminal)
[9:34] * fattaneh (~fattaneh@151.240.132.30) Quit (Quit: Leaving.)
[9:35] * hyperbaba (~hyperbaba@private.neobee.net) has joined #ceph
[9:36] * fsimonce (~simon@host201-70-dynamic.26-79-r.retail.telecomitalia.it) has joined #ceph
[9:37] * masterpe (~masterpe@2a01:670:400::43) Quit (Server closed connection)
[9:37] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[9:38] * fattaneh (~fattaneh@151.240.132.30) has joined #ceph
[9:38] * hyperbaba (~hyperbaba@private.neobee.net) Quit (Read error: Connection reset by peer)
[9:39] * pabluk__ is now known as pabluk_
[9:39] * CustosLim3n (~CustosLim@ns343343.ip-91-121-210.eu) has joined #ceph
[9:39] * red_nh (red@infi.e-lista.pl) Quit (Read error: Connection reset by peer)
[9:40] * evelu (~erwan@37.162.242.49) Quit (Ping timeout: 480 seconds)
[9:41] * CustosLimen (~CustosLim@2001:41d0:1:ff97::1) Quit (Ping timeout: 480 seconds)
[9:42] * MrAbaddon (~MrAbaddon@a89-155-99-93.cpe.netcabo.pt) Quit (Ping timeout: 480 seconds)
[9:47] * Hemanth (~hkumar_@121.244.87.117) has joined #ceph
[9:48] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[9:49] * evelu (~erwan@37.164.33.222) has joined #ceph
[9:49] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[9:50] * branto (~borix@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[9:50] * lightspeed (~lightspee@2001:8b0:16e:1:8326:6f70:89f:8f9c) Quit (Ping timeout: 480 seconds)
[9:53] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[9:53] * shyu_ (~shyu@119.254.120.71) has joined #ceph
[9:53] * PcJamesy (~Mraedis@06SAAAMVJ.tor-irc.dnsbl.oftc.net) Quit ()
[9:53] * KeeperOfTheSoul (~AG_Clinto@06SAAAMXD.tor-irc.dnsbl.oftc.net) has joined #ceph
[9:54] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[9:56] * rendar (~I@host99-1-dynamic.52-79-r.retail.telecomitalia.it) has joined #ceph
[9:58] * DanFoster (~Daniel@2a00:1ee0:3:1337:c8b6:196a:fc04:2b4) has joined #ceph
[9:58] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[9:59] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[10:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[10:01] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) has joined #ceph
[10:01] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[10:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[10:05] * nhm_ (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) has joined #ceph
[10:07] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[10:10] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:10] * linjan_ (~linjan@176.195.77.43) has joined #ceph
[10:11] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Remote host closed the connection)
[10:12] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[10:13] * fattaneh (~fattaneh@151.240.132.30) Quit (Quit: Leaving.)
[10:14] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:15] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[10:15] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[10:16] <brians__> morning cephers.
[10:16] <brians__> If I have 3 hosts that are identical with 4 OSDs in each with 1 lightning fast SSD for journals
[10:17] <brians__> If I have replication 1, 2 or 3 in this case - shouldn't the rados bench be identical for all 3 if I have crush set to host in the replicated ruleset ?
[10:17] * linjan (~linjan@176.195.212.248) Quit (Ping timeout: 480 seconds)
[10:18] <brians__> I mean, if I rados bench a pool with repl 1, then 2, then 3, and the hosts are identical and writes are synchronous, why does the throughput drop with each r=n increase?
[10:20] * kefu (~kefu@114.92.120.83) Quit (Max SendQ exceeded)
[10:20] * kefu (~kefu@114.92.120.83) has joined #ceph
[10:23] * KeeperOfTheSoul (~AG_Clinto@06SAAAMXD.tor-irc.dnsbl.oftc.net) Quit ()
[10:23] * Coestar (~Guest1390@217.23.13.129) has joined #ceph
[10:29] * branto_out (~branto@nat-pool-brq-t.redhat.com) Quit (Quit: Leaving.)
[10:32] * o0c_ (~o0c@chris.any.mx) has joined #ceph
[10:33] * fattaneh (~fattaneh@151.240.132.30) has joined #ceph
[10:33] * fattaneh (~fattaneh@151.240.132.30) Quit ()
[10:35] * o0c (~o0c@chris.any.mx) Quit (Ping timeout: 480 seconds)
[10:36] * fattaneh (~fattaneh@151.240.132.30) has joined #ceph
[10:38] <IcePic> each r=n increase means "talk to yet one host more before acking the write"
[10:38] <IcePic> at least generally.
[10:39] <IcePic> or "talk to one more osd over the network", even if network is localhost in some cases.
[10:41] * bjornar_ (~bjornar@109.247.131.38) has joined #ceph
[10:43] * TMM (~hp@185.5.122.2) has joined #ceph
[10:46] * evelu (~erwan@37.164.33.222) Quit (Read error: Connection reset by peer)
[10:48] * fattaneh (~fattaneh@151.240.132.30) Quit (Ping timeout: 480 seconds)
[10:50] * evelu (~erwan@62.147.161.106) has joined #ceph
[10:53] * Coestar (~Guest1390@76GAAD3NW.tor-irc.dnsbl.oftc.net) Quit ()
[10:53] * Lea (~LeaChim@host86-168-120-216.range86-168.btcentralplus.com) has joined #ceph
[10:54] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[10:57] * wjw-freebsd (~wjw@vpn.ecoracks.nl) has joined #ceph
[10:58] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[11:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[11:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[11:02] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[11:04] <brians__> IcePic thanks - I guess the osd's job is to replicate - so I send the write to one osd and it sees the r=n and decides it needs a further n-1 writes ?
[11:04] <brians__> I'm just really wanting to get a proper understanding of this - its great stuff :)
[11:04] <IcePic> yes
[11:05] <IcePic> which is why a journal is important, if your osds have slow disks, since as soon as it hits the journal, the OSD can ack the write, and the original osd can "know" it has hit enough disks to send ACK back up to the writer
[11:05] <IcePic> ..why a fast journal is ..
[11:06] * MrAbaddon (~MrAbaddon@193.137.26.66) has joined #ceph
[11:07] <brians__> Thanks IcePic
[11:07] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[11:07] <IcePic> it's also why having a backend network separate from the frontend network can be important, since a 10k write from the outside will become (n-1)*10k of replication traffic between OSDs
[11:10] <brians__> indeed - we're testing with 10Gbps backend net only for ceph
[11:10] <brians__> with replication of 2 I'm getting about 450MB/s consistently with rados bench writes
[11:10] <brians__> quite happy with that.
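
For context, the kind of rados bench run being discussed looks roughly like this (pool name and parameters are examples):

    ceph osd pool create bench 128 128
    rados bench -p bench 60 write -t 16 --no-cleanup   # 60s of 4MB object writes, 16 in flight
    rados bench -p bench 60 seq -t 16                   # sequential reads of the objects just written
    rados -p bench cleanup
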
[11:16] * fattaneh (~fattaneh@151.240.132.30) has joined #ceph
[11:17] <swami1> joshd: Hi
[11:17] <stein> anybody know the status for hadoop over rgw that the folks from intel has been working on?
[11:21] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[11:23] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:83e:b71d:7525:859a) has joined #ceph
[11:28] * Kaervan (~ZombieL@politkovskaja.torservers.net) has joined #ceph
[11:33] * MrAbaddon (~MrAbaddon@193.137.26.66) Quit (Remote host closed the connection)
[11:33] * evelu (~erwan@62.147.161.106) Quit (Ping timeout: 480 seconds)
[11:33] <Zabidin> what does this mean > is currently at the state of electing?
[11:34] <Zabidin> Error > [osd04][INFO ] monitor: mon.osd04 is currently at the state of electing
[11:34] * evelu (~erwan@37.160.182.228) has joined #ceph
[11:34] <Zabidin> osd02,osd03 no problem
[11:34] <Zabidin> only osd04 have error..
[11:38] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[11:39] * shyu_ (~shyu@119.254.120.71) Quit (Ping timeout: 480 seconds)
[11:39] * huangjun (~kvirc@113.57.168.154) Quit (Ping timeout: 480 seconds)
[11:40] * fattaneh (~fattaneh@151.240.132.30) Quit (Ping timeout: 480 seconds)
[11:45] * MrAbaddon (~MrAbaddon@193.137.26.66) has joined #ceph
[11:46] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[11:46] * IvanJobs (~hardes@103.50.11.146) Quit (Read error: Connection reset by peer)
[11:47] * IvanJobs (~hardes@103.50.11.146) has joined #ceph
[11:47] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Quit: Leaving...)
[11:57] * Kaervan (~ZombieL@76GAAD3P1.tor-irc.dnsbl.oftc.net) Quit ()
[11:57] * Esvandiary (~rushworld@193.90.12.90) has joined #ceph
[12:01] * erwan_taf (~erwan@62.147.161.106) has joined #ceph
[12:01] * evelu (~erwan@37.160.182.228) Quit (Read error: Connection reset by peer)
[12:10] * overclk (~quassel@121.244.87.117) Quit (Remote host closed the connection)
[12:20] <Zabidin> See you tomorrow guys..
[12:20] <Zabidin> Out now..
[12:20] * Zabidin (~oftc-webi@124.13.35.225) Quit (Quit: Page closed)
[12:22] * kefu is now known as kefu|afk
[12:23] * krypto (~krypto@125.16.137.146) has joined #ceph
[12:27] * kefu|afk is now known as kefu
[12:27] * Esvandiary (~rushworld@7V7AADZFG.tor-irc.dnsbl.oftc.net) Quit ()
[12:28] * yuastnav1 (~oracular@76GAAD3R8.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:28] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[12:31] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[12:32] <titzer> hey hey
[12:35] <titzer> with a 4 node, 8 osd (10K-sas) cluster running fio testing for iops against an rbd image, with 1 job, 1 QD, I'm seeing fewer read/write iops than when testing against one disk directly
[12:36] <titzer> this doesn't feel right to me and I'm not sure what the bottleneck is. Any ideas?
[12:38] <koollman> I would guess the network
[12:39] <titzer> oh, even though the amount of data isn't very much?
[12:44] * rraja (~rraja@121.244.87.117) has joined #ceph
[12:51] <T1w> yes
[12:51] <titzer> around ~150 read iops, ~100 write iops, less than 1 MB/s
[12:53] <T1w> use larger blocksizes
[12:53] <T1w> and mulitple jobs
[12:54] <titzer> this is a cisco SG-200 26 port gbit switch, all ceph nodes + the server doing the testing connected to it
[12:54] <T1w> doesn't matter
[12:55] <T1w> every write has to be trnasmissted over the network, so the only way of increasing bandwidth is to use larger packets and do multiple writes at once
[12:55] <T1w> transmitted even
[12:55] <titzer> we also test with 16 jobs and 1, 16, 32 and 64 QD, and iops of course increases as one would expect
[12:55] * shyu_ (~shyu@111.201.70.50) has joined #ceph
[12:55] <titzer> for reads as well?
[12:56] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[12:56] <T1w> for reads as well, yes
[12:56] <T1w> or.. not nearly as much for reads as for writes
[12:56] <titzer> yeah
[12:56] <T1w> but each block must be read from an OSD/PG and transmitted from the OSD to the client
[12:57] <T1w> for simple reads/writes a single direct disk will always be faster - ceph can never outperform that
[12:57] * yuastnav1 (~oracular@76GAAD3R8.tor-irc.dnsbl.oftc.net) Quit ()
[12:57] * MatthewH12 (~Shesh@192.42.115.101) has joined #ceph
[12:58] <T1w> but for more complex patterns and usages it can easily outperform a single disk
[12:58] <titzer> that's very interesting
[12:58] <T1w> someone who had an ssd-only based cluster saw reads and writes above 2GB/s (yes, gigabytes!)
[12:58] <titzer> for some reason I wouldn't have expected that
[12:58] <titzer> random?
[12:59] <T1w> my smallish pretty simple 3 node/6OSD cluster gives me around 300MB/s writes and 500MB/s reads
[12:59] <T1w> no, that test was sequential, but random reads was also above 1GB/s
[12:59] <titzer> that's stellar
[13:00] <titzer> but to be expected of ssds I guess
[13:00] <T1w> indeed
[13:00] * gregmark (~Adium@68.87.42.115) has joined #ceph
[13:00] <titzer> I know I am bandwidth limited on the sequential reads/writes with the current setup, but our testing isn't really about sequential performance
[13:00] <T1w> or at least when you have "enough" SSD based journals
[13:00] <titzer> yeah
[13:01] * jclm (~jclm@rrcs-70-60-108-14.midsouth.biz.rr.com) Quit (Ping timeout: 480 seconds)
[13:01] <T1w> you could try and redo Sebastien's SSD tests up against your RBD
[13:01] <titzer> the servers we have here don't jive too well with SSDs, so we aren't able to get a whole lot of performance out of them unfortunately
[13:02] <T1w> and see if you can squeeze a bit of performance out of it by adjusting some of aio's settings
[13:02] <titzer> oh, aio itself has different settings?
[13:02] <T1w> ah, sorry - he uses fio, but it's the same principle
[13:02] <T1w> http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
[13:03] <titzer> so far we've tested different max sync intervals
[13:03] <titzer> I believe somewhere around 2 sec gave the best results
[13:03] <titzer> thanks, I'll for sure have a look at that
[13:04] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[13:04] <T1w> .. and yes that blog page is a bit old, but it still gets updated as people get new models and run the tests
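
The linked post's journal test is roughly the following fio invocation (an approximation, and destructive if pointed at a raw device):

    fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-test
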
[13:04] <titzer> whoa, he disables dwc? That completely obliterates the performance of our ssds
[13:05] <titzer> I'll read it all and see what he finds
[13:05] <T1w> no it doesn't..
[13:05] <T1w> for a journal it's pretty important
[13:06] <T1w> .. or at least if you want your data safe
[13:06] <T1w> some drives (like the Intel S3710) have power loss protection
[13:06] <T1w> but my tests did not show any difference with or without write cache as fio instructed the disk subsystem to flush after each write anyway
[13:08] <titzer> hmm
[13:09] <titzer> let me grab my numbers
[13:10] * overclk (~quassel@117.202.104.118) has joined #ceph
[13:12] <titzer> can't find the dwc disabled numbers right now, but they were lower than a hdd
[13:12] * madkiss (~madkiss@2001:6f8:12c3:f00f:9081:9723:7c47:d918) Quit (Quit: Leaving.)
[13:13] <titzer> testing combined random read/write
[13:13] <titzer> with dwc, I got up to ~2k iops read/write
[13:14] <titzer> without dwc, the cluster would completely freeze up from the horrid performance, with slow requests constantly and osds falling down
[13:14] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has left #ceph
[13:15] <titzer> note, these are hp dl380G5 servers, so very old controllers that are not aware of ssds. I mostly blame them for all the issues :)
[13:15] * jclm (~jclm@rrcs-70-60-108-15.midsouth.biz.rr.com) has joined #ceph
[13:15] <titzer> thanks for the info, I gotta run out for lunch. bbl!
[13:15] * IvanJobs (~hardes@103.50.11.146) Quit (Quit: Leaving)
[13:15] * kanagaraj (~kanagaraj@121.244.87.117) Quit (Quit: Leaving)
[13:18] <darkfader> titzer: depends a lot on the ssd model, i've got some hitachi where emc played with the firmware and I can't enable the write cache, and they still do the iops they're specified at
[13:18] * ivancich (~ivancich@aa2.linuxbox.com) Quit (Ping timeout: 480 seconds)
[13:18] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) has joined #ceph
[13:19] <darkfader> (I really don't dare changing the firmware)
[13:19] * jklare (~jklare@185.27.181.36) Quit (Server closed connection)
[13:19] * jklare (~jklare@185.27.181.36) has joined #ceph
[13:21] * zhaochao (~zhaochao@124.202.191.130) Quit (Quit: ChatZilla 0.9.92 [Firefox 45.0.1/20160318172635])
[13:23] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[13:24] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[13:25] * bniver (~bniver@pool-173-48-58-27.bstnma.fios.verizon.net) Quit (Remote host closed the connection)
[13:27] * MatthewH12 (~Shesh@06SAAAM7P.tor-irc.dnsbl.oftc.net) Quit ()
[13:28] * Pieman (~Azerothia@ded31663.iceservers.net) has joined #ceph
[13:29] * ibravo (~ibravo@72.83.69.64) has joined #ceph
[13:32] * bene2 (~bene@2601:18c:8501:25e4:ea2a:eaff:fe08:3c7a) has joined #ceph
[13:33] * Gugge-47527 (gugge@92.246.2.105) Quit (Server closed connection)
[13:33] * Gugge-47527 (gugge@92.246.2.105) has joined #ceph
[13:34] * ibravo (~ibravo@72.83.69.64) Quit ()
[13:37] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[13:37] * ibravo (~ibravo@72.83.69.64) has joined #ceph
[13:40] * georgem (~Adium@24.114.68.109) has joined #ceph
[13:41] * wyang (~wyang@114.111.166.44) has joined #ceph
[13:42] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[13:45] * naoto (~naotok@27.131.11.254) Quit (Quit: Leaving...)
[13:48] * rdas (~rdas@106.221.153.248) has joined #ceph
[13:53] <nils_> so what are the risks in increasing the number of placement groups in a running cluster?
[13:55] * ibravo (~ibravo@72.83.69.64) Quit (Quit: Leaving)
[13:56] * jclm (~jclm@rrcs-70-60-108-15.midsouth.biz.rr.com) Quit (Quit: Leaving.)
[13:57] * Pieman (~Azerothia@4MJAADN5T.tor-irc.dnsbl.oftc.net) Quit ()
[13:59] * EinstCrazy (~EinstCraz@180.174.59.248) has joined #ceph
[14:01] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[14:01] * b0e (~aledermue@213.95.25.82) has joined #ceph
[14:02] * Dragonshadow (~MJXII@67.ip-92-222-38.eu) has joined #ceph
[14:05] * kutija (~kutija@89.216.27.139) has joined #ceph
[14:09] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[14:10] <etienneme> increasing the cpu load
[14:10] * erwan_taf (~erwan@62.147.161.106) Quit (Ping timeout: 480 seconds)
[14:10] * wyang (~wyang@114.111.166.44) Quit (Quit: This computer has gone to sleep)
[14:10] <etienneme> Not really a risk, just increase in steps
[14:11] <etienneme> Obviously you will get misplaced objects
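
In practice "increase in steps" looks roughly like this (pool name and numbers are examples):

    ceph osd pool get rbd pg_num            # current value
    ceph osd pool set rbd pg_num 256        # step up gradually rather than jumping straight to the target
    ceph osd pool set rbd pgp_num 256       # then raise pgp_num and wait for the misplaced objects to recover
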
[14:12] * wyang (~wyang@114.111.166.44) has joined #ceph
[14:13] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[14:13] <titzer> darkfader: ah, interesting
[14:14] <titzer> these are Samsung 840s. Did upgrade the firmware, but no noticeable increase in performance :\
[14:16] <nils_> what are you using them for?
[14:16] <titzer> journals
[14:16] <nils_> I had three of them die recently.
[14:16] <nils_> or maybe they are 850
[14:16] <titzer> ouch
[14:17] <nils_> 850 pro
[14:17] <titzer> oh really
[14:17] <nils_> now it's Intel all the way
[14:17] <titzer> I thought they were supposed to be good
[14:17] <nils_> I think most consumer SSD have issues with power loss
[14:18] <nils_> although the systems never lost power
[14:18] <darkfader> titzer: it's a consumer ssd. anyone who tells you "ah, that'll just be fine" needs to go in a special box in your head
[14:18] <nils_> they all died after 60 TBW
[14:18] <darkfader> sure, at home for testing it's a diff story
[14:18] <nils_> I still have two left which I have to try out
[14:18] <titzer> darkfader: this is testing for a bachelor thesis
[14:19] <titzer> replica of the real setup, but the real setup has intel 3700 ssds
[14:19] <darkfader> ok :)
[14:19] <titzer> :>
[14:19] <nils_> the real issue with consumer SSD is that they don't really follow the SATA standard I guess. Power Loss Protection should be standard, if the drive acknowledges a write while it still may be lost it's a broken drive
[14:19] <darkfader> it'll behave slightly different (as you see) ;)
[14:19] <nils_> broken by design
[14:19] <titzer> yeah heh
[14:19] <nils_> but it's really interesting that they all failed after ~60 TBW except for one
[14:20] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[14:20] <titzer> yeah, that should be well within their limits
[14:20] <darkfader> nils_: sysadmin version of "interesting" :(
[14:20] <titzer> did you send them to samsung and/or get them replaced?
[14:20] <nils_> well I have a few of them which are working fine however the systems they are in have different I/O patterns
[14:20] <darkfader> 60TB is really not much, i think like what the old 830's were specced for
[14:21] <nils_> well there is a small box here with the label ceph cemetery, we might have them replaced.
[14:21] <darkfader> nils_: photo?
[14:21] <nils_> they are still under warranty.
[14:21] <nils_> but it's a bit of an issue in a corporate setting to send them out.
[14:21] <darkfader> hehe
[14:22] <nils_> looking forward to Bluestore, should lessen the journal issue
[14:22] * t4nk453 (~oftc-webi@178.237.98.13) has joined #ceph
[14:24] * Foloex (~foloex@185.31.151.106) has joined #ceph
[14:24] <Foloex> Hello world
[14:24] <titzer> oh, haven't heard of bluestore. Guess I should look into that
[14:24] <titzer> hello Foloex!
[14:25] <red> samsung ssds are not the best ones, even for consumer
[14:25] <red> 840 250gb rated for 72tb
[14:26] <red> in my micro ceph cluster i use kingston 120gb rate for 290tb
[14:26] <red> d*
[14:26] <titzer> that much, huh? Which model?
[14:26] <red> Kingston HyperX 3K
[14:26] <Foloex> I have trouble with my Ceph cluster, cephfs won't mount anymore. When trying to mount, it hangs for a while before displaying "mount: 192.168.0.60:6789:/: can't read superblock"
[14:27] <Foloex> everything seems right from a cluster's health point of view
[14:27] <Foloex> "health HEALTH_OK"
[14:27] <red> 240gb model being rated at 765tb
[14:27] <t4nk453> hi, I have issues starting radosgw; with the "-d" option, I can see this error message: "ERROR: no socket server point defined, cannot start fcgi frontend"
[14:27] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[14:28] <titzer> ahh, I see
[14:28] <titzer> yeah
[14:28] <titzer> fairly old one, right?
[14:28] <red> yes, older model
[14:29] <titzer> more resilient nand in those
[14:31] <red> there are others with better tbw too
[14:32] * kefu is now known as kefu|afk
[14:32] * Dragonshadow (~MJXII@06SAAANAV.tor-irc.dnsbl.oftc.net) Quit ()
[14:32] * Teddybareman (~Keiya@anonymous6.sec.nl) has joined #ceph
[14:33] <titzer> btw, anyone happen to be a fio expert?
[14:34] <IcePic> titzer: pm Anticimex and see, he runs fio a lot
[14:34] <titzer> US based?
[14:34] <IcePic> nopes
[14:34] <IcePic> .se
[14:34] <titzer> ah, cool
[14:34] <titzer> cheers
[14:37] <red> and there are different levels of power loss protections
[14:38] <red> basically _some_ consumer ssds only have protection to ensure correctness of data already written from the controller buffer
[14:38] <red> enterprise ones have protection for write cache from host
[14:38] * georgem (~Adium@24.114.68.109) Quit (Ping timeout: 480 seconds)
[14:39] * karnan (~karnan@121.244.87.124) Quit (Remote host closed the connection)
[14:40] <nils_> red, well if the drive reports a successful write/flush/barrier whatever it is these days I think it's reasonable to assume that the data is indeed durably written, otherwise the drive is broken.
[14:41] <t4nk453> Well to continue on my radosgw, I got it to work on a previous cluster, but here it doesn't start well.
[14:42] <t4nk453> finally, it give me a 503 error when i try to conect onit
[14:42] <t4nk453> connect on it*
[14:42] * bene2 (~bene@2601:18c:8501:25e4:ea2a:eaff:fe08:3c7a) Quit (Quit: Konversation terminated!)
[14:43] <titzer> I haven't used radosgw, so I can't help :\
[14:43] <t4nk453> thanks ayway titzer
[14:43] <red> nils_ yup, but we have to keep in mind that many firmwares, consumer and not only consumer, are hell buggy
[14:44] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[14:44] <Foloex> by the way, my mount -t ceph fails but in dmesg I don't see an error, it says "libceph: client264353 fsid cdcdcffc-beaf-49ef-ba62-e4e58aecaee4" followed by "libceph: mon0 192.168.0.60:6789 session established"
[14:45] <nils_> red, I remember back in the day there were HDDs which also didn't honour flushes, which made them shine in benchmarks. In my opinion that's defrauding the customer.
[14:45] <Foloex> I don't know how to obtain more relevant information that would help me diagnose why the mount fails
[14:45] * brians_ (~brianoftc@brian.by) Quit (Ping timeout: 480 seconds)
[14:46] <Foloex> I tried to mount using different monitors and different linux kernel: same issue
[14:46] <Foloex> it seems to be a cluster wide issue
[14:46] <darkfader> nils_: yes, all desktop drives did that, i think
[14:47] <darkfader> so, it's stayed pretty consistent and part of the answer to "why is my dev laptop faster than the server" back then
[14:47] <nils_> yeah and they are pulling the same crap with SSD these days, never mind things like throttling etc.
[14:48] * alram_ (~alram@82.199.64.68) has joined #ceph
[14:48] <nils_> yeah that's usually also a thing with macbooks since I think the filesystem also fakes durability.
[14:48] * brian-mac (~textual@5.149.168.66) has joined #ceph
[14:48] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:49] * brian-mac (~textual@5.149.168.66) Quit (Max SendQ exceeded)
[14:49] * brian-mac (~textual@5.149.168.66) has joined #ceph
[14:49] <darkfader> nils_: do you mean the recent jump in reports about data corruption with encrypted disks is related?
[14:50] * darkfader got lucky so far, and bacula for when it's no longer lucky
[14:50] <nils_> darkfader, encrypted disks?
[14:50] * brian-mac (~textual@5.149.168.66) Quit (Max SendQ exceeded)
[14:50] * amospalla (~amospalla@0001a39c.user.oftc.net) Quit (Quit: WeeChat 1.0.1)
[14:50] <nils_> you mean the integrated encryption?
[14:50] <darkfader> nils_: i read like a dozen people angry about losing data on macbooks with sw crypto
[14:50] <darkfader> no sorry
[14:50] <darkfader> can't type that fast
[14:51] <darkfader> macbook + osx crypto + data corruption is something that apparently happens. and you just said the filesystem is not really reliable
[14:51] <darkfader> that would explain things
[14:52] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[14:52] <nils_> I'm not really following that, at my current client it doesn't seem like it happens often.
[14:52] <nils_> and there's like 100 macbooks around here
[14:53] <brians__> hi channel
[14:53] <brians__> if I add some pools then delete them
[14:53] <brians__> the pool number increments
[14:53] <Foloex> I don't have any critical data on my ceph cluster (I didn't trust cephfs too much) but I'd like to learn how to fix this issue
[14:53] <brians__> I have no pools right now - if I want to reset the pool number to start at 1 again is that possible?
[14:53] * amospalla (~amospalla@0001a39c.user.oftc.net) has joined #ceph
[14:53] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[14:54] <Foloex> is there a way to view the metadata of cephfs ?
[14:54] <brians__> this is just my OCD kicking in I know the pool number is totally irrelevant
[14:54] <brians__> :)
[14:54] * ngoswami (~ngoswami@121.244.87.116) Quit ()
[14:54] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[14:57] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[14:59] * alram_ (~alram@82.199.64.68) Quit (Ping timeout: 480 seconds)
[14:59] * simona (~oftc-webi@static.ip-171-033-130-093.signet.nl) has joined #ceph
[15:01] <simona> Hi everyone. Currently we are experiencing a problem on a production ceph cluster, one PG is incomplete
[15:02] <simona> We've tried importing/exporting PG data with the ceph-objectstore-tool
[15:02] * Teddybareman (~Keiya@76GAAD3VR.tor-irc.dnsbl.oftc.net) Quit ()
[15:02] <simona> but to no avail
[15:02] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) Quit (Quit: treenerd_)
[15:04] * neurodrone (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[15:04] * rdas (~rdas@106.221.153.248) Quit (Quit: Leaving)
[15:05] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[15:05] * guerby (~guerby@2a03:7220:8080:a500::1) Quit (Quit: Leaving)
[15:05] * guerby (~guerby@2a03:7220:8080:a500::1) has joined #ceph
[15:06] * mattbenjamin1 (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[15:07] * ivancich (~ivancich@aa3.linuxbox.com) has joined #ceph
[15:08] <Foloex> anyone have tips on how to troubleshoot/fix my cephfs mount issue ?
[15:09] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[15:09] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[15:11] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[15:20] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[15:22] * ibravo (~ibravo@72.198.142.104) has joined #ceph
[15:26] * huangjun (~kvirc@117.152.64.193) has joined #ceph
[15:28] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[15:31] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[15:31] * erwan_taf (~erwan@46.231.131.178) has joined #ceph
[15:32] * Curt` (~Shesh@3.tor.exit.babylon.network) has joined #ceph
[15:34] * wyang (~wyang@114.111.166.44) Quit (Quit: This computer has gone to sleep)
[15:36] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[15:36] * brians (~brianoftc@brian.by) has joined #ceph
[15:37] * wwdillingham (~LobsterRo@65.112.8.194) has joined #ceph
[15:40] * wyang (~wyang@114.111.166.44) has joined #ceph
[15:40] * wwdillingham (~LobsterRo@65.112.8.194) Quit (Read error: Connection reset by peer)
[15:43] * yanzheng (~zhyan@125.70.23.194) Quit (Quit: This computer has gone to sleep)
[15:46] * Racpatel (~Racpatel@2601:87:3:3601::4edb) has joined #ceph
[15:46] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[15:46] <Anticimex> is it possible to change a pool's erasure coding profile online?
[15:47] * krypto (~krypto@125.16.137.146) Quit (Read error: Connection reset by peer)
[15:48] * rwheeler (~rwheeler@bzq-82-81-161-51.red.bezeqint.net) Quit (Quit: Leaving)
[15:49] * Racpatel (~Racpatel@2601:87:3:3601::4edb) Quit ()
[15:50] * olid1981111 (~olid1982@p54848DBA.dip0.t-ipconnect.de) has joined #ceph
[15:54] * mattbenjamin1 (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) Quit (Quit: Leaving.)
[15:55] * yanzheng (~zhyan@125.70.23.194) has joined #ceph
[15:58] * olid1981110 (~olid1982@aftr-185-17-204-107.dynamic.mnet-online.de) Quit (Ping timeout: 480 seconds)
[16:00] * rotbeard (~redbeard@aftr-95-222-30-121.unity-media.net) has joined #ceph
[16:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[16:01] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[16:02] * Curt` (~Shesh@4MJAADOED.tor-irc.dnsbl.oftc.net) Quit ()
[16:02] * xolotl (~cyphase@192.42.115.101) has joined #ceph
[16:04] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[16:05] * Bartek (~Bartek@dynamic-78-9-152-42.ssp.dialog.net.pl) has joined #ceph
[16:06] * linuxkidd (~linuxkidd@29.sub-70-193-113.myvzw.com) Quit (Remote host closed the connection)
[16:06] * csoukup (~csoukup@159.140.254.100) has joined #ceph
[16:07] * linuxkidd (~linuxkidd@29.sub-70-193-113.myvzw.com) has joined #ceph
[16:07] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[16:07] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:07] * Hemanth_ (~hkumar_@121.244.87.117) has joined #ceph
[16:07] <Foloex> anyone have tips on how to troubleshoot/fix my cephfs mount issue (mount: 192.168.0.60:6789:/: can't read superblock) ?
[16:07] * EinstCrazy (~EinstCraz@180.174.59.248) Quit (Remote host closed the connection)
[16:09] * yuxiaozou (~yuxiaozou@128.135.100.110) has joined #ceph
[16:09] * Racpatel (~Racpatel@2601:87:3:3601::4edb) has joined #ceph
[16:10] * Hemanth (~hkumar_@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:10] <devicenull> q: Error E2BIG: specified pg_num 4096 is too large (creating 3496 new PGs on ~48 OSDs exceeds per-OSD max of 32)
[16:11] <devicenull> why is this a problem?
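
That limit appears to come from the monitors' per-OSD split cap (mon_osd_max_split_count, which defaulted to 32 at the time); the usual workaround is to grow pg_num in increments the monitors will accept rather than in one jump, sketched here with a placeholder pool name:

    ceph osd pool set <pool> pg_num 2048     # a first step within the per-OSD split limit
    # wait for the new PGs to be created and peer, then continue
    ceph osd pool set <pool> pg_num 4096
    ceph osd pool set <pool> pgp_num 4096
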
[16:12] <etienneme> Foloex: how do you try to mount it?
[16:12] * alram_ (~alram@82.199.64.68) has joined #ceph
[16:12] <Foloex> etienneme: sudo mount -t ceph 192.168.0.60:6789:/ /data -o context="system_u:object_r:tmp_t:s0"
[16:13] <Foloex> I tried also pointing to another monitor with same effect
[16:13] * thomnico (~thomnico@2a01:e35:8b41:120:752f:f2a9:2bf5:53b4) has joined #ceph
[16:13] <Foloex> I also tried mounting from different host (with different kernel version), same result
[16:13] <Foloex> So I think it's a cluster wide issue
[16:14] <Foloex> and everything is on my home network so there are no firewalls
[16:14] * ftuesca (~ftuesca@181.170.107.140) has joined #ceph
[16:16] <etienneme> I was expecting that you forgot the -t ceph :(
[16:16] <Foloex> the cluster's is in "health HEALTH_OK"
[16:16] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[16:17] <Foloex> etienneme: haha
[16:17] <etienneme> I'm sad :p
[16:17] <Foloex> etienneme: I wish it was only that
[16:17] * olid1981112 (~olid1982@aftr-185-17-204-107.dynamic.mnet-online.de) has joined #ceph
[16:18] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[16:20] * wyang (~wyang@114.111.166.44) Quit (Quit: This computer has gone to sleep)
[16:22] * olid1981111 (~olid1982@p54848DBA.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[16:22] * shyu_ (~shyu@111.201.70.50) Quit (Ping timeout: 480 seconds)
[16:22] <Foloex> any other idea on how to troubleshoot/fix ?
[16:22] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: This computer has gone to sleep)
[16:29] * vicente (~~vicente@111-241-25-53.dynamic.hinet.net) has joined #ceph
[16:30] * mattbenjamin (~mbenjamin@aa3.linuxbox.com) has joined #ceph
[16:30] * Larsen (~andreas@2001:67c:578:2::15) Quit (Server closed connection)
[16:30] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[16:31] * Larsen (~andreas@2001:67c:578:2::15) has joined #ceph
[16:32] * xolotl (~cyphase@06SAAANJ7.tor-irc.dnsbl.oftc.net) Quit ()
[16:32] * VampiricPadraig (~Throlkim@static-ip-85-25-103-119.inaddr.ip-pool.com) has joined #ceph
[16:32] * ngoswami (~ngoswami@121.244.87.116) Quit ()
[16:35] * shyu_ (~shyu@123.123.55.156) has joined #ceph
[16:37] * neurodrone (~neurodron@162.243.191.67) has joined #ceph
[16:37] <m0zes> Foloex: permissions? name and key need to be specified for mount.
[16:38] <m0zes> iirc, it should output more debug info in dmesg.
[16:38] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[16:38] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[16:38] <Foloex> I did not set any permissions, I was able to mount it like this before
[16:39] <Foloex> dmesg doesn't say much: "libceph: client264353 fsid cdcdcffc-beaf-49ef-ba62-e4e58aecaee4" and "libceph: mon0 192.168.0.60:6789 session established"
[16:39] * Superdawg (~Superdawg@ec2-54-243-59-20.compute-1.amazonaws.com) Quit (Server closed connection)
[16:39] * Superdawg (~Superdawg@ec2-54-243-59-20.compute-1.amazonaws.com) has joined #ceph
[16:40] <etienneme> You can increase verbosity of logs
[16:40] <Foloex> how ?
[16:41] <etienneme> http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#subsystem-log-and-debug-settings
[16:41] * yanzheng (~zhyan@125.70.23.194) Quit (Quit: This computer has gone to sleep)
[16:41] <etienneme> You can use admin socket while mon is running
[16:41] <etienneme> ( 20/20 is really really verbose)
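As a rough sketch of what etienneme means, debug levels can be bumped at runtime through the admin socket and reverted afterwards; the monitor id "a" below is a placeholder, and 20/20 really is extremely chatty:

    ceph daemon mon.a config set debug_mon 20/20
    ceph daemon mon.a config set debug_ms 1/1
    # reproduce the failing mount, read /var/log/ceph/ceph-mon.a.log, then revert:
    ceph daemon mon.a config set debug_mon 1/5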
[16:42] <s3an2> Anyone successfully using cephfs quotas? I tried setting attr ceph.quota.max_bytes="100000000" - but it seems the client (cephfs-fuse) never stops me from writing data.
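For comparison, the documented way to set and read a CephFS quota is via extended attributes on a directory; the mount point below is a placeholder, and with clients of this era only ceph-fuse enforces the limit, approximately and with some delay rather than at the exact byte:

    setfattr -n ceph.quota.max_bytes -v 100000000 /mnt/cephfs/somedir
    getfattr -n ceph.quota.max_bytes /mnt/cephfs/somedir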
[16:44] <Foloex> etienneme: my monitors are fine, I can use regular commands ;)
[16:45] * gregmark (~Adium@68.87.42.115) has joined #ceph
[16:46] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Quit: Leaving.)
[16:47] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[16:51] * yanzheng (~zhyan@125.70.23.194) has joined #ceph
[16:53] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[16:53] <Foloex> what should I be looking for in those logs?
[16:53] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[16:56] * Skaag (~lunix@cpe-172-91-77-84.socal.res.rr.com) Quit (Quit: Leaving.)
[17:00] * yanzheng (~zhyan@125.70.23.194) Quit (Quit: This computer has gone to sleep)
[17:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[17:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[17:01] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[17:01] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:02] * VampiricPadraig (~Throlkim@4MJAADOJC.tor-irc.dnsbl.oftc.net) Quit ()
[17:03] * harold (~hamiller@71-94-227-123.dhcp.mdfd.or.charter.com) has joined #ceph
[17:03] * harold (~hamiller@71-94-227-123.dhcp.mdfd.or.charter.com) Quit ()
[17:03] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[17:03] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:04] * dneary (~dneary@12.139.153.2) Quit (Ping timeout: 480 seconds)
[17:06] * yanzheng (~zhyan@125.70.23.194) has joined #ceph
[17:07] * analbeard (~shw@support.memset.com) Quit (Ping timeout: 480 seconds)
[17:08] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[17:10] * yanzheng (~zhyan@125.70.23.194) Quit ()
[17:10] * Foloex (~foloex@185.31.151.106) Quit (Quit: leaving)
[17:11] * fattaneh (~fattaneh@151.241.56.246) has joined #ceph
[17:11] * fattaneh (~fattaneh@151.241.56.246) has left #ceph
[17:12] * mattbenjamin (~mbenjamin@aa3.linuxbox.com) Quit (Ping timeout: 480 seconds)
[17:13] * huangjun (~kvirc@117.152.64.193) Quit (Ping timeout: 480 seconds)
[17:14] * RameshN (~rnachimu@223.227.226.181) has joined #ceph
[17:15] * Skaag (~lunix@65.200.54.234) has joined #ceph
[17:17] * t4nk453 (~oftc-webi@178.237.98.13) Quit (Quit: Page closed)
[17:19] * swami1 (~swami@49.32.0.92) Quit (Quit: Leaving.)
[17:19] * TMM (~hp@185.5.122.2) Quit (Quit: Ex-Chat)
[17:21] * bjornar_ (~bjornar@109.247.131.38) Quit (Ping timeout: 480 seconds)
[17:22] * ivancich (~ivancich@aa3.linuxbox.com) Quit (Ping timeout: 480 seconds)
[17:23] * Hemanth_ (~hkumar_@121.244.87.117) Quit (Ping timeout: 480 seconds)
[17:23] * wwdillingham (~LobsterRo@140.247.242.44) has joined #ceph
[17:26] * Concubidated (~cube@c-50-173-245-118.hsd1.ca.comcast.net) has joined #ceph
[17:31] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[17:31] * fabioFVZ (~fabiofvz@213.187.10.8) has joined #ceph
[17:31] * fabioFVZ (~fabiofvz@213.187.10.8) Quit ()
[17:32] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[17:32] * neobenedict (~Xeon06@anonymous.sec.nl) has joined #ceph
[17:38] * infernix (nix@2001:41f0::2) Quit (Remote host closed the connection)
[17:44] * karnan (~karnan@106.51.128.205) has joined #ceph
[17:44] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) has joined #ceph
[17:45] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[17:45] <wwdillingham> Hi cephers, I am attempting to add a new monitor using the cluster expansion method described here: http://docs.ceph.com/docs/hammer/dev/mon-bootstrap/#cluster-expansion (initially peerless expansion). I am at the second phase, where I feed the ceph daemon peer IP addresses; the problem is I don't have an admin socket in /var/run. The daemon is running and is posting the following to its log (over and over):
[17:46] <wwdillingham> 2016-03-30 11:45:00.204217 7feaacf88700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
[17:46] <wwdillingham> 2016-03-30 11:45:00.204312 7feaacf88700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
[17:46] <wwdillingham> actually, now it's there…
[17:48] <wwdillingham> problem solved!
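For anyone else following that mon-bootstrap document, the step wwdillingham was stuck on is driven through the admin socket once it appears under /var/run/ceph; the monitor id and peer address below are placeholders:

    ceph daemon mon.newmon mon_status
    ceph daemon mon.newmon add_bootstrap_peer_hint 10.0.0.1:6789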
[17:49] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Quit: Leaving.)
[17:51] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) has joined #ceph
[17:51] * RameshN (~rnachimu@223.227.226.181) Quit (Ping timeout: 480 seconds)
[17:54] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:57] * kefu|afk is now known as kefu
[17:57] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[17:57] * infernix (~nix@2001:41f0::2) has joined #ceph
[17:58] * davidz (~davidz@2605:e000:1313:8003:a95c:f09:afdb:4cbe) has joined #ceph
[17:58] * davidz1 (~davidz@2605:e000:1313:8003:e02b:e137:b5c4:ac21) Quit (Read error: Connection reset by peer)
[18:00] * lcurtis_ (~lcurtis@47.19.105.250) has joined #ceph
[18:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[18:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[18:02] * RameshN (~rnachimu@223.227.115.193) has joined #ceph
[18:02] * neobenedict (~Xeon06@06SAAANQ0.tor-irc.dnsbl.oftc.net) Quit ()
[18:02] * KapiteinKoffie (~KrimZon@199.68.196.126) has joined #ceph
[18:05] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[18:06] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[18:07] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[18:08] * ade (~abradshaw@217.192.191.82) has joined #ceph
[18:08] * ade (~abradshaw@217.192.191.82) Quit ()
[18:08] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:10] * RameshN (~rnachimu@223.227.115.193) Quit (Read error: Connection reset by peer)
[18:11] * yuxiaozou (~yuxiaozou@128.135.100.110) Quit (Ping timeout: 480 seconds)
[18:15] * BrianA (~BrianA@216.145.48.133) has joined #ceph
[18:16] <BrianA> CephDay at Yahoo Campus about to start :0
[18:17] * rmart04 (~rmart04@support.memset.com) Quit (Quit: rmart04)
[18:17] * rotbeard (~redbeard@aftr-95-222-30-121.unity-media.net) Quit (Quit: Leaving)
[18:17] * wwdillingham (~LobsterRo@140.247.242.44) Quit (Quit: wwdillingham)
[18:17] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[18:23] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[18:24] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys!)
[18:24] * dneary (~dneary@64.55.107.4) has joined #ceph
[18:25] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[18:28] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[18:28] * wwdillingham (~LobsterRo@140.247.242.44) has joined #ceph
[18:28] * BrianA (~BrianA@216.145.48.133) Quit (Ping timeout: 480 seconds)
[18:32] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[18:32] * KapiteinKoffie (~KrimZon@06SAAANS4.tor-irc.dnsbl.oftc.net) Quit ()
[18:36] * branto (~borix@ip-78-102-208-181.net.upcbroadband.cz) has left #ceph
[18:36] * kefu (~kefu@114.92.120.83) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[18:39] * kefu (~kefu@183.193.128.175) has joined #ceph
[18:39] * herrsergio (~herrsergi@200.77.224.239) has joined #ceph
[18:40] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:40] * herrsergio is now known as Guest9772
[18:42] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Remote host closed the connection)
[18:42] * BrianA (~BrianA@216.145.48.133) has joined #ceph
[18:42] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: This computer has gone to sleep)
[18:43] * kefu_ (~kefu@114.92.120.83) has joined #ceph
[18:44] <T1> devicenull: you cannot increase the number of PGs by more than a factor of 2 at a time - i.e. going from 32 to 64 is OK - from 32 to 128 is not OK
[18:44] <devicenull> T1: my question is *why*
[18:44] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[18:44] <T1> "because"
[18:44] <T1> it's bad m'kay?
[18:44] <T1> just bad
[18:45] <devicenull> uhh
[18:45] <T1> .. jokes aside, it's got something to do with CRUSH and a sudden increase that's hard to handle
[18:46] <T1> if your pool is empty you can just drop it and recreate a new pool with a larger number of PGs
[18:46] <T1> .. with an initial larger number that is..
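In practice that means stepping pg_num up in stages rather than in one jump; a rough sketch, with "rbd" standing in for whatever pool devicenull is resizing, and waiting for the cluster to settle between steps. (The per-OSD cap in the E2BIG error is believed to come from the mon_osd_max_split_count option, but gradual steps are the safer route than raising it.)

    ceph osd pool set rbd pg_num 1024
    # wait for the new PGs to be created and peered, then continue:
    ceph osd pool set rbd pg_num 2048
    ceph osd pool set rbd pg_num 4096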
[18:46] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[18:47] * ngoswami (~ngoswami@121.244.87.116) Quit ()
[18:48] * pabluk_ is now known as pabluk__
[18:48] * kefu (~kefu@183.193.128.175) Quit (Ping timeout: 480 seconds)
[18:49] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[18:49] * TiCPU (~owrt@c216.218.54-96.clta.globetrotter.net) Quit (Server closed connection)
[18:49] * TiCPU (~owrt@2001:470:1c:40::2) has joined #ceph
[18:50] * ngoswami (~ngoswami@121.244.87.116) Quit ()
[18:52] * erwan_taf (~erwan@46.231.131.178) Quit (Ping timeout: 480 seconds)
[18:54] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[18:54] * ngoswami (~ngoswami@121.244.87.116) Quit ()
[18:55] * bjornar_ (~bjornar@ti0099a430-1561.bb.online.no) has joined #ceph
[18:56] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) Quit (Quit: Leaving)
[18:58] <nils_> I just hit the same problem, my pool is broken now since someone tried to increase the number of placement groups
[18:59] * wjw-freebsd (~wjw@vpn.ecoracks.nl) Quit (Ping timeout: 480 seconds)
[19:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[19:01] <m0zes> huge increases can do bad things. massively increase memory and cpu load on the osds and potentially cause massive instability in a running cluster. all of the peering for new pgs could potentially cause timeouts in active pgs, leading to more peering and a chain of suicides ;)
[19:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[19:01] * wCPO (~Kristian@188.228.31.139) has joined #ceph
[19:03] <T1> like I said.. "because" .. :)
[19:04] * DanFoster (~Daniel@2a00:1ee0:3:1337:c8b6:196a:fc04:2b4) Quit (Quit: Leaving)
[19:04] <nils_> yeah
[19:04] <nils_> anything I can do for it to recover?
[19:05] * wjw-freebsd (~wjw@vpn.ecoracks.nl) has joined #ceph
[19:06] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[19:07] <etienneme> Does the amount of misplaced objects decrease?
[19:07] <etienneme> You just have to wait
[19:08] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[19:12] <nils_> it's not really misplaced objects, it's that a lot of osds dropped off so a lot are stuck inactive.
[19:13] <devicenull> nils_: http://tracker.ceph.com/issues/10411 ?
[19:13] <devicenull> I've run into the stuck inactive thing twice now, I was able to fix it, but with data loss
[19:14] * MrAbaddon (~MrAbaddon@193.137.26.66) Quit (Ping timeout: 480 seconds)
[19:18] * fattaneh (~fattaneh@151.241.23.58) has joined #ceph
[19:19] * overclk (~quassel@117.202.104.118) Quit (Remote host closed the connection)
[19:21] * garphy is now known as garphy`aw
[19:23] * ndevos (~ndevos@nat-pool-ams2-5.redhat.com) Quit (Server closed connection)
[19:23] * ndevos (~ndevos@nat-pool-ams2-5.redhat.com) has joined #ceph
[19:24] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[19:24] * fattaneh (~fattaneh@151.241.23.58) Quit (Quit: Leaving.)
[19:25] * kefu_ (~kefu@114.92.120.83) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[19:27] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[19:27] <nils_> devicenull, well a lot of osds aborted
[19:29] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit ()
[19:32] * danieagle (~Daniel@189-47-91-188.dsl.telesp.net.br) has joined #ceph
[19:34] * Miouge (~Miouge@h-72-233.a163.priv.bahnhof.se) has joined #ceph
[19:35] * winston-d_ (uid98317@id-98317.richmond.irccloud.com) has joined #ceph
[19:40] * garphy`aw is now known as garphy
[19:41] * kutija (~kutija@89.216.27.139) Quit (Quit: Textual IRC Client: www.textualapp.com)
[19:43] * thomnico (~thomnico@2a01:e35:8b41:120:752f:f2a9:2bf5:53b4) Quit (Quit: Ex-Chat)
[19:45] * fattaneh (~fattaneh@151.241.23.58) has joined #ceph
[19:45] * Hemanth_ (~hkumar_@103.228.221.149) has joined #ceph
[19:45] * Hemanth_ (~hkumar_@103.228.221.149) Quit ()
[19:46] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[19:48] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[19:50] * dneary (~dneary@64.55.107.4) Quit (Ping timeout: 480 seconds)
[19:53] * fattaneh (~fattaneh@151.241.23.58) Quit (Quit: Leaving.)
[19:54] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[19:56] * alram_ (~alram@82.199.64.68) Quit (Ping timeout: 480 seconds)
[19:57] * frozensky (~mart@209.116.65.82) has joined #ceph
[19:57] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[19:58] * wjw-freebsd2 (~wjw@vpn.ecoracks.nl) has joined #ceph
[19:59] * wjw-freebsd (~wjw@vpn.ecoracks.nl) Quit (Ping timeout: 480 seconds)
[20:00] * MrAbaddon (~MrAbaddon@a89-155-99-93.cpe.netcabo.pt) has joined #ceph
[20:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[20:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[20:02] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[20:03] * garphy is now known as garphy`aw
[20:05] * wjw-freebsd (~wjw@vpn.ecoracks.nl) has joined #ceph
[20:05] * fattaneh (~fattaneh@151.241.23.58) has joined #ceph
[20:06] * rraja_ (~rraja@121.244.87.117) has joined #ceph
[20:07] * drupal (~Mousey@93.115.95.205) has joined #ceph
[20:07] * rraja (~rraja@121.244.87.117) Quit (Quit: Leaving)
[20:07] * rraja_ (~rraja@121.244.87.117) Quit ()
[20:07] * rraja (~rraja@121.244.87.117) has joined #ceph
[20:07] <m0zes> you can increase the suicide timeout, set noout and restart any dead osds.
[20:07] * fattaneh (~fattaneh@151.241.23.58) has left #ceph
[20:08] <nils_> m0zes, yup, already done.
[20:08] <m0zes> then give it time to recover. after that, you can reduce the suicide timeout back to its original values.
[20:08] <nils_> it's recovered
[20:08] <m0zes> hooray
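For the record, the sequence m0zes describes looks roughly like this; the timeout option name and values are from memory and may differ between releases, so treat them as illustrative:

    ceph osd set noout
    ceph tell osd.* injectargs '--osd-op-thread-suicide-timeout 600'
    # restart the dead OSDs, wait for recovery to finish, then revert:
    ceph tell osd.* injectargs '--osd-op-thread-suicide-timeout 150'
    ceph osd unset noout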
[20:09] * wjw-freebsd3 (~wjw@vpn.ecoracks.nl) has joined #ceph
[20:10] * wjw-freebsd2 (~wjw@vpn.ecoracks.nl) Quit (Ping timeout: 480 seconds)
[20:10] * Hemanth (~hkumar_@103.228.221.149) has joined #ceph
[20:13] <wwdillingham> Is it normal for my pgmap version (pgmap v248818:) to be changing constantly? It seems to tick a new version every second or so
[20:13] * wjw-freebsd (~wjw@vpn.ecoracks.nl) Quit (Ping timeout: 480 seconds)
[20:13] <m0zes> yes
[20:13] <m0zes> it should be increasing approximately every second
[20:14] <wwdillingham> is that a map of objects to placement groups or pgs to osds ?
[20:14] <wwdillingham> or neither
[20:15] <m0zes> I'm not entirely sure what is in it. I think it is pgs->osds, though.
[20:15] <wwdillingham> thanks m0zes
[20:17] <gregsfortytwo1> pgmap is *mostly* just informational logging from the OSDs about their PGs
[20:17] <gregsfortytwo1> it's how we generate the approximate IO rates in your cluster, for instance
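If anyone wants to see what the pgmap actually holds, the summary commands are enough to confirm gregsfortytwo1's description (per-PG state plus the aggregate IO rates):

    ceph pg stat
    ceph pg dump summary --format json-pretty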
[20:25] * wjw-freebsd3 (~wjw@vpn.ecoracks.nl) Quit (Ping timeout: 480 seconds)
[20:25] * Hemanth (~hkumar_@103.228.221.149) Quit (Quit: Leaving)
[20:28] * infernix (~nix@2001:41f0::2) Quit (Remote host closed the connection)
[20:32] * infernix (~nix@spirit.infernix.net) has joined #ceph
[20:34] * vicente (~~vicente@111-241-25-53.dynamic.hinet.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[20:37] * drupal (~Mousey@4MJAADOYG.tor-irc.dnsbl.oftc.net) Quit ()
[20:40] * Guest9772 is now known as herrsergio
[20:41] * herrsergio is now known as Guest9785
[20:41] * hassifa (~Sliker@tollana.enn.lu) has joined #ceph
[20:41] * dneary (~dneary@64.55.107.4) has joined #ceph
[20:42] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) Quit (Ping timeout: 481 seconds)
[20:44] * redf_ (~red@chello080108089163.30.11.vie.surfer.at) has joined #ceph
[20:47] * simona (~oftc-webi@static.ip-171-033-130-093.signet.nl) Quit (Ping timeout: 480 seconds)
[20:49] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[20:49] * TheSov-mobile (~TheSov-mo@2607:fb90:1701:1aa0:e061:eb5:30f3:2707) has joined #ceph
[20:50] <TheSov-mobile> well this is it. I just started my job as an architect at docusign!
[20:50] * azizulhakim (~oftc-webi@neptune.cs.fiu.edu) has joined #ceph
[20:51] * red (~red@chello080108089163.30.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[20:54] * wjw-freebsd3 (~wjw@smtp.digiware.nl) has joined #ceph
[20:57] * Miouge (~Miouge@h-72-233.a163.priv.bahnhof.se) Quit (Quit: Miouge)
[20:58] * vata1 (~vata@207.96.182.162) has joined #ceph
[20:59] * lightspeed (~lightspee@2001:8b0:16e:1:8326:6f70:89f:8f9c) has joined #ceph
[21:00] <nils_> so since this cluster now has pg_num > pgp_num, how would I go about increasing pgp_num? I suppose I have to do it gradually?
[21:00] * xarses (~xarses@172.56.39.46) has joined #ceph
[21:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[21:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[21:03] * pam (~pam@host186-106-dynamic.14-87-r.retail.telecomitalia.it) has joined #ceph
[21:07] * frozensky1 (~mart@31.21.28.78) has joined #ceph
[21:08] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[21:10] * frozensky (~mart@209.116.65.82) Quit (Ping timeout: 480 seconds)
[21:11] * hassifa (~Sliker@4MJAADO0K.tor-irc.dnsbl.oftc.net) Quit ()
[21:11] * Helleshin (~ChauffeR@93.115.95.206) has joined #ceph
[21:14] * simona_ (~oftc-webi@static.ip-171-033-130-093.signet.nl) has joined #ceph
[21:15] <m0zes> since that is when the actual data migration starts, I'd bump it fully.
[21:15] <m0zes> nobody likes migrating data multiple times ;)
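Concretely, assuming nils_'s pool is called "rbd" and pg_num was raised to 4096, that is a single bump of pgp_num up to match pg_num once the OSDs are stable:

    ceph osd pool get rbd pg_num         # confirm the target value
    ceph osd pool set rbd pgp_num 4096   # same value as pg_num; the data migration starts here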
[21:24] * dneary (~dneary@64.55.107.4) Quit (Ping timeout: 480 seconds)
[21:26] <simona_> A PG's peering is flapping between 2 acting sets. Any ideas how to point it to the one it should peer from?
[21:32] <frozensky1> does anybody know how the filename of rbd blocks is formed? I mean files like: udata.900a62ae8944a.0000000000048b36__head_3EA5F117__3
[21:33] <frozensky1> is it true that 900a62ae8944a is the rbd prefix, and 48b36 points to some position in the file?
[21:33] <frozensky1> what about clones and snapshots?
[21:33] <frozensky1> as simona stated earlier, we have a problem with one PG
[21:34] <frozensky1> we have the raw files, but ceph is unable to restore the data in the proper way
[21:34] <frozensky1> so we are thinking of ditching the PG and recreating an empty one
[21:35] <frozensky1> as we still have the data, it might be a possibility to read it and then map those pieces of missing data to the correct positions in the block devices
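frozensky1's reading of the name seems essentially right: for format-2 RBD images the objects are named rbd_data.<prefix>.<object number> (filestore escapes the underscore, hence the "udata" in the on-disk filename), where the prefix comes from rbd info and the object number is the hex index of a fixed-size chunk of the image, 4 MB by default. A rough sketch of checking that for a hypothetical image "vm-disk" in pool "rbd":

    rbd info rbd/vm-disk                          # shows block_name_prefix: rbd_data.900a62ae8944a
    # byte offset of object 0x48b36 within the image, assuming 4 MB objects:
    python -c 'print(0x48b36 * 4 * 1024 * 1024)'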
[21:38] * ftuesca (~ftuesca@181.170.107.140) Quit (Quit: Leaving)
[21:41] * Helleshin (~ChauffeR@06SAAAN6N.tor-irc.dnsbl.oftc.net) Quit ()
[21:41] * bildramer (~ahmeni@tor-exit0-readme.dfri.se) has joined #ceph
[21:42] <georgem> frozensky1: did you try this procedure? http://www.sebastien-han.fr/blog/2015/01/29/ceph-recover-a-rbd-image-from-a-dead-cluster/
[21:44] * Miouge (~Miouge@h-72-233.a163.priv.bahnhof.se) has joined #ceph
[21:45] <frozensky1> looks useful!
[21:46] <frozensky1> thx.
[21:53] * rraja (~rraja@121.244.87.117) Quit (Quit: Leaving)
[21:55] * angdraug (~angdraug@64.124.158.100) has joined #ceph
[21:57] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[21:59] * Bartek (~Bartek@dynamic-78-9-152-42.ssp.dialog.net.pl) Quit (Ping timeout: 480 seconds)
[22:02] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[22:02] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[22:02] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[22:02] * rendar (~I@host99-1-dynamic.52-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[22:03] * garphy`aw is now known as garphy
[22:05] * rendar (~I@host99-1-dynamic.52-79-r.retail.telecomitalia.it) has joined #ceph
[22:08] * batrick (~batrick@2600:3c00::f03c:91ff:fe96:477b) Quit (Ping timeout: 480 seconds)
[22:10] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[22:11] * bildramer (~ahmeni@4MJAADO4L.tor-irc.dnsbl.oftc.net) Quit ()
[22:11] * Zyn (~Bromine@ns381528.ip-94-23-247.eu) has joined #ceph
[22:11] * xar (~xar@retard.io) Quit (Server closed connection)
[22:11] * xar (~xar@retard.io) has joined #ceph
[22:11] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[22:12] * angdraug (~angdraug@64.124.158.100) Quit (Quit: Leaving)
[22:13] * angdraug (~angdraug@64.124.158.100) has joined #ceph
[22:13] * angdraug (~angdraug@64.124.158.100) Quit ()
[22:14] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[22:14] * qman (~rohroh@2600:3c00::f03c:91ff:fe69:92af) Quit (Quit: No Ping reply in 180 seconds.)
[22:15] * qman (~rohroh@2600:3c00::f03c:91ff:fe69:92af) has joined #ceph
[22:18] * shyu_ (~shyu@123.123.55.156) Quit (Ping timeout: 480 seconds)
[22:19] * TheSov-mobile (~TheSov-mo@2607:fb90:1701:1aa0:e061:eb5:30f3:2707) Quit (Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org)
[22:20] * swami1 (~swami@27.7.172.152) has joined #ceph
[22:22] * swami1 (~swami@27.7.172.152) Quit ()
[22:25] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[22:26] * yuxiaozou (~yuxiaozou@128.135.100.112) has joined #ceph
[22:26] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:83e:b71d:7525:859a) Quit (Ping timeout: 480 seconds)
[22:29] * garphy is now known as garphy`aw
[22:30] * karnan (~karnan@106.51.128.205) Quit (Ping timeout: 480 seconds)
[22:31] * xarses (~xarses@172.56.39.46) Quit (Ping timeout: 480 seconds)
[22:31] * batrick (~batrick@2600:3c00::f03c:91ff:fe96:477b) has joined #ceph
[22:32] * angdraug (~angdraug@64.124.158.100) has joined #ceph
[22:33] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[22:33] * Bartek (~Bartek@dynamic-78-9-152-42.ssp.dialog.net.pl) has joined #ceph
[22:33] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) Quit (Ping timeout: 480 seconds)
[22:33] * joshd (~jdurgin@206.169.83.146) Quit (Ping timeout: 480 seconds)
[22:36] * garphy`aw is now known as garphy
[22:37] * yuxiaozou (~yuxiaozou@128.135.100.112) Quit (Ping timeout: 480 seconds)
[22:40] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) Quit (Quit: LDA)
[22:41] * jowilkin (~jowilkin@2601:644:4000:b0bf:56ee:75ff:fe10:724e) has joined #ceph
[22:41] * Zyn (~Bromine@06SAAAOAC.tor-irc.dnsbl.oftc.net) Quit ()
[22:41] * capitalthree (~loft@static.148.176.243.136.clients.your-server.de) has joined #ceph
[22:45] * frozensky (~mart@209.116.65.82) has joined #ceph
[22:48] * xarses (~xarses@64.124.158.100) has joined #ceph
[22:48] * joshd (~jdurgin@66-194-8-225.static.twtelecom.net) has joined #ceph
[22:50] <aarontc> anyone have thoughts on upgrading from infernalis to jewel with a few inconsistent pgs?
[22:51] * frozensky1 (~mart@31.21.28.78) Quit (Ping timeout: 480 seconds)
[22:53] * BrianA (~BrianA@216.145.48.133) Quit (Read error: Connection reset by peer)
[22:54] * daiver (~daiver@95.85.8.93) has joined #ceph
[22:54] <daiver> hi everyone
[22:56] * simona_ (~oftc-webi@static.ip-171-033-130-093.signet.nl) Quit (Quit: Page closed)
[22:57] <daiver> I'm really new to ceph. trying to configure a cluster with 3 systems.
[22:57] <daiver> can't get it to healthy status
[22:57] <daiver> # ceph status
[22:57] <daiver> cluster 7ad4ada7-94f6-42bf-b39b-ade615f0a8a0
[22:57] <daiver> health HEALTH_WARN
[22:57] <daiver> 512 pgs degraded
[22:57] <daiver> 512 pgs stuck degraded
[22:57] <daiver> 512 pgs stuck unclean
[22:57] <daiver> 512 pgs stuck undersized
[22:57] <daiver> 512 pgs undersized
[22:57] <daiver> monmap e7: 3 mons at {ceph-node1=10.0.2.5:6789/0,ceph-node2=10.0.2.6:6789/0,ceph-node3=10.0.2.7:6789/0}
[22:57] <daiver> election epoch 38, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
[22:57] <daiver> osdmap e108: 9 osds: 9 up, 9 in
[22:57] <daiver> pgmap v311: 512 pgs, 1 pools, 0 bytes data, 0 objects
[22:57] <daiver> 314 MB used, 45666 MB / 45980 MB avail
[22:57] <daiver> 512 active+undersized+degraded
[22:57] <daiver> # ceph osd pool get datapool pg_num
[22:57] <daiver> pg_num: 512
[22:58] <m0zes> pastebin 'ceph osd tree' and 'ceph osd pool ls detail'
[22:58] * reed (~reed@75-101-54-18.dsl.static.fusionbroadband.com) Quit (Quit: Ex-Chat)
[22:59] <daiver> sure
[22:59] <daiver> http://pastebin.com/skUe9MNm
[22:59] <m0zes> 5GB disks for testing?
[22:59] <daiver> 10
[23:00] <m0zes> is the journal a separate partition on the disks?
[23:00] <daiver> no, didn't touch that
[23:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) Quit (Remote host closed the connection)
[23:01] <m0zes> want to pastebin lsblk of one of the nodes?
[23:01] * reed (~reed@75-101-54-18.dsl.static.fusionbroadband.com) has joined #ceph
[23:01] * dneary (~dneary@64.55.107.4) has joined #ceph
[23:01] <daiver> 3 VMs with CentOS 6.7, 4 drives each: one drive for the OS, 3 drives for ceph
[23:01] * haomaiwang (~haomaiwan@li745-113.members.linode.com) has joined #ceph
[23:01] <daiver> sure, lsblk: http://pastebin.com/VCyzWNhD
[23:02] * winston-d_ (uid98317@id-98317.richmond.irccloud.com) Quit (Quit: Connection closed for inactivity)
[23:03] <m0zes> the key thing here is that the osds are all weighted 0 because their partition is small enough to round down when using the automatic weight calculation. note the 5G "mounted" partition and the second 'unmounted' partition for each osd. anything less than 10GB for the data partition is too small and gets rounded down to 0
[23:03] * wwdillingham (~LobsterRo@140.247.242.44) Quit (Ping timeout: 480 seconds)
[23:04] <m0zes> you can set the weight yourself for testing purposes. ceph osd crush reweight $osdnum 1.000
[23:05] <m0zes> the "weight" column from ceph osd tree was the key factor here.
[23:05] <lurbs> s/$osdnum/osd.$osdnum/
[23:05] * Miouge (~Miouge@h-72-233.a163.priv.bahnhof.se) Quit (Quit: Miouge)
[23:05] <m0zes> ahh. I haven't played with anything that small. didn't know the precise command :)
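Putting lurbs's correction together with m0zes's earlier line, the full form for one of daiver's nine OSDs would be something like:

    ceph osd crush reweight osd.0 1.000
    ceph osd tree        # the WEIGHT column should now show 1.00000 for osd.0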
[23:06] * dneary (~dneary@64.55.107.4) Quit (Quit: Ex-Chat)
[23:07] <daiver> so you're saying that with 20GB test drives I'll be good?
[23:07] * joshd (~jdurgin@66-194-8-225.static.twtelecom.net) Quit (Ping timeout: 480 seconds)
[23:07] <m0zes> 20GB should be fine to test with.
[23:08] <m0zes> or just bump the weight up for your testing.
[23:08] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)
[23:10] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[23:11] * capitalthree (~loft@06SAAAOCY.tor-irc.dnsbl.oftc.net) Quit ()
[23:11] * Kayla (~ZombieTre@192.42.115.101) has joined #ceph
[23:13] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[23:14] <daiver> huh. set the weight to 1.00 on all osds, now it's healthy
[23:15] <daiver> thank you for help!
[23:17] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[23:19] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[23:22] * joshd (~jdurgin@206.169.83.146) has joined #ceph
[23:31] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[23:33] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:7468:1eb7:fcc9:c6fc) has joined #ceph
[23:34] * dyasny (~dyasny@cable-192.222.176.13.electronicbox.net) Quit (Ping timeout: 480 seconds)
[23:41] * Kayla (~ZombieTre@06SAAAOE5.tor-irc.dnsbl.oftc.net) Quit ()
[23:41] * mollstam (~Esvandiar@marylou.nos-oignons.net) has joined #ceph
[23:44] * frozensky (~mart@209.116.65.82) Quit (Ping timeout: 480 seconds)
[23:44] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[23:45] * jclm (~jclm@rrcs-70-60-108-15.midsouth.biz.rr.com) has joined #ceph
[23:50] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.