#ceph IRC Log

IRC Log for 2016-04-13

Timestamps are in GMT/BST.

[0:02] * yguang11 (~yguang11@nat-dip27-wl-a.cfw-a-gci.corp.yahoo.com) has joined #ceph
[0:02] * LongyanG (~long@15255.s.time4vps.eu) has joined #ceph
[0:05] * dgurtner (~dgurtner@82.199.64.68) has joined #ceph
[0:06] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[0:07] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[0:08] * ircolle (~Adium@2601:285:201:2bf9:8483:3587:6860:ca5) Quit (Quit: Leaving.)
[0:08] * olid1981111114 (~olid1982@aftr-185-17-206-203.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[0:08] * redf (~red@chello080108089163.30.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[0:08] * olid1981111114 (~olid1982@aftr-185-17-206-203.dynamic.mnet-online.de) has joined #ceph
[0:08] * Long_yanG (~long@15255.s.time4vps.eu) Quit (Ping timeout: 480 seconds)
[0:13] * angdraug (~angdraug@64.124.158.100) Quit (Ping timeout: 480 seconds)
[0:14] * dkrdkr (uid110802@id-110802.tooting.irccloud.com) Quit ()
[0:17] * ircolle (~Adium@c-71-229-136-109.hsd1.co.comcast.net) has joined #ceph
[0:17] * olid1981111114 (~olid1982@aftr-185-17-206-203.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[0:17] * olid1981111114 (~olid1982@aftr-185-17-206-203.dynamic.mnet-online.de) has joined #ceph
[0:19] * ircolle (~Adium@c-71-229-136-109.hsd1.co.comcast.net) Quit ()
[0:22] * redf (~red@chello080108089163.30.11.vie.surfer.at) has joined #ceph
[0:23] <nils_> AntonE, did you replace the journal disks?
[0:23] * yguang11 (~yguang11@nat-dip27-wl-a.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[0:24] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[0:25] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[0:27] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[0:27] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[0:27] <nils_> so infernalis got rid of the journal_uuid thing?
[0:28] <AntonE> no, the journal_uuid still has to be right
[0:29] <AntonE> but I tried running ceph-objectstore-tool on some OSDs: it gives the pg list but cannot export pgs; on others it would not even give the pg list
[0:29] <AntonE> I only tried to rebuild the journal on 1 OSD, for fear of causing more harm
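
A sketch of the ceph-objectstore-tool workflow being described, for reference. Paths are the filestore defaults; the OSD id (9) and PG id (1.2f) are hypothetical, and the OSD must be stopped before the tool can open its store:

    # List the PGs held by a stopped OSD:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 \
        --journal-path /var/lib/ceph/osd/ceph-9/journal --op list-pgs

    # Export a single PG to a file (the step AntonE reports failing):
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 \
        --journal-path /var/lib/ceph/osd/ceph-9/journal \
        --pgid 1.2f --op export --file /backup/pg.1.2f.export
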
[0:29] * vata (~vata@207.96.182.162) Quit (Quit: Leaving.)
[0:31] * yguang11 (~yguang11@nat-dip27-wl-a.cfw-a-gci.corp.yahoo.com) has joined #ceph
[0:32] * AntonE (~AntonE@196.34.18.253) has left #ceph
[0:33] * AntonE (~AntonE@196.34.18.253) has joined #ceph
[0:33] * olid1981111114 (~olid1982@aftr-185-17-206-203.dynamic.mnet-online.de) Quit (Ping timeout: 480 seconds)
[0:34] * Bartek (~Bartek@dynamic-78-9-153-220.ssp.dialog.net.pl) has joined #ceph
[0:36] * alexm (~alexm@000128d2.user.oftc.net) Quit (Quit: Coyote finally caught me)
[0:37] * alexm (~alexm@chulak.ac.upc.edu) has joined #ceph
[0:40] * Bartek (~Bartek@dynamic-78-9-153-220.ssp.dialog.net.pl) Quit (Remote host closed the connection)
[0:44] * bjornar__ (~bjornar@ti0099a430-1561.bb.online.no) Quit (Ping timeout: 480 seconds)
[0:47] * fsimonce (~simon@host201-70-dynamic.26-79-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[0:48] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[0:48] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[0:51] * legion (~Bored@76GAAEFT0.tor-irc.dnsbl.oftc.net) has joined #ceph
[0:55] * BrianA1 (~BrianA@c-50-168-46-112.hsd1.ca.comcast.net) has joined #ceph
[0:55] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[0:56] <AntonE> Does anyone know of a company that can provide emergency Ceph support? I contacted RedHat today to get a quote on subscription/support but they have not gotten back to me with a quote yet. We are sitting with a down cluster that needs to be up in a few hours...
[0:57] * lcurtis (~lcurtis@47.19.105.250) Quit (Remote host closed the connection)
[0:57] * BrianA1 (~BrianA@c-50-168-46-112.hsd1.ca.comcast.net) has left #ceph
[0:58] <diq> I'd wave several hundred bitcoin around in the developers chat
[0:59] <diq> otherwise hang out and wait for RH
[0:59] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[1:04] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[1:09] * dkrdkr (~dkrdkr@2604:180:3:6ee::c764) has joined #ceph
[1:10] * dkrdkr (~dkrdkr@2604:180:3:6ee::c764) Quit ()
[1:11] * rotbeard (~redbeard@2a02:908:df18:b980:6267:20ff:feb7:c20) Quit (Ping timeout: 480 seconds)
[1:11] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[1:16] * yguang11 (~yguang11@nat-dip27-wl-a.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[1:18] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[1:18] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[1:18] * dkrdkr (~dkrdkr@2604:180:3:6ee::c764) has joined #ceph
[1:20] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[1:20] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[1:21] * legion (~Bored@76GAAEFT0.tor-irc.dnsbl.oftc.net) Quit ()
[1:22] * hyst (~adept256@193.90.12.87) has joined #ceph
[1:25] * yguang11 (~yguang11@nat-dip27-wl-a.cfw-a-gci.corp.yahoo.com) has joined #ceph
[1:29] * wushudoin (~wushudoin@38.140.108.2) Quit (Ping timeout: 480 seconds)
[1:32] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[1:34] * swami1 (~swami@27.7.172.119) has joined #ceph
[1:36] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[1:37] * swami1 (~swami@27.7.172.119) Quit ()
[1:39] * xarses_ (~xarses@50.141.35.71) has joined #ceph
[1:41] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[1:46] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[1:46] * oms101 (~oms101@p20030057EA078000C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:47] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[1:48] * Racpatel (~Racpatel@2601:87:3:3601:4e34:88ff:fe87:9abf) Quit (Ping timeout: 480 seconds)
[1:48] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[1:51] * hyst (~adept256@4MJAAD61W.tor-irc.dnsbl.oftc.net) Quit ()
[1:51] * GuntherDW (~Maariu5_@tor.effi.org) has joined #ceph
[1:53] * xarses_ (~xarses@50.141.35.71) Quit (Ping timeout: 480 seconds)
[1:54] * Lea (~LeaChim@host86-159-239-193.range86-159.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[1:55] * oms101 (~oms101@p20030057EA022D00C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[1:56] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[2:02] <jclm> AntonE: How big is this cluster and what's the exact nature of the problem?
[2:05] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[2:07] * RameshN (~rnachimu@101.222.247.122) has joined #ceph
[2:07] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[2:10] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[2:17] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[2:21] * huangjun (~kvirc@113.57.168.154) has joined #ceph
[2:21] * GuntherDW (~Maariu5_@6AGAAATK9.tor-irc.dnsbl.oftc.net) Quit ()
[2:21] * JohnO (~pico@193.90.12.86) has joined #ceph
[2:24] * c_soukup (~csoukup@136.63.84.142) has joined #ceph
[2:25] * huangjun|2 (~kvirc@113.57.168.154) has joined #ceph
[2:26] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) Quit (Quit: Leaving)
[2:29] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[2:29] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[2:29] * huangjun (~kvirc@113.57.168.154) Quit (Ping timeout: 480 seconds)
[2:32] * c_soukup (~csoukup@136.63.84.142) Quit (Ping timeout: 480 seconds)
[2:38] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[2:39] * yguang11 (~yguang11@nat-dip27-wl-a.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[2:44] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[2:44] <AntonE> jclm 3 servers, 6 osd's each
[2:44] * wongda (~wongda@CPEbc4dfb5bc333-CMbc4dfb5bc330.cpe.net.cable.rogers.com) has joined #ceph
[2:44] <wongda> hello
[2:44] <AntonE> Problem is corrupt OSDs (physical disks are fine). Corrupt journals
[2:45] <wongda> I am trying to compile ceph v10.1.1 and encountered a problem with uuid::generate_random():
[2:46] <wongda> CXXLD xio_client
[2:46] <wongda> tools/monmaptool.o: In function `uuid_d::generate_random()':
[2:46] <wongda> /mnt/cephfs/ceph/src/./include/uuid.h:29: undefined reference to `boost::random::random_device::random_device(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
[2:46] <wongda> collect2: error: ld returned 1 exit status
[2:46] <wongda> Am I missing some libraries?
[2:47] <wongda> I am on CentOS 7.2
[2:47] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Read error: Connection reset by peer)
[2:48] <wongda> I also enabled xio since I need IB support
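
For reference, the undefined reference above points at libboost_random: boost::random::random_device is defined in that compiled library, not in the headers, so the link fails if it is missing from the system or from the link line. A hedged fix for CentOS 7 (package names assumed):

    # Install the boost development and random libraries:
    sudo yum install boost-devel boost-random
    # If they are present but not being linked, force the library onto the
    # link line as a quick test:
    make LDFLAGS="$LDFLAGS -lboost_random"
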
[2:49] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[2:51] * JohnO (~pico@6AGAAATMB.tor-irc.dnsbl.oftc.net) Quit ()
[2:52] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[2:52] <wongda> hello, let me take a step back. I am new to this IRC; I wish to ask a question regarding compiling ceph v10.1.1, am I on the right channel?
[2:56] * vend3r (~Jaska@tor-relay.zwiebeltoralf.de) has joined #ceph
[2:56] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[2:57] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[3:00] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[3:00] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[3:04] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Read error: Connection reset by peer)
[3:05] * georgem (~Adium@45.72.132.68) has joined #ceph
[3:05] * bearkitten (~bearkitte@cpe-76-172-86-115.socal.res.rr.com) has joined #ceph
[3:06] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[3:11] * dgurtner (~dgurtner@82.199.64.68) Quit (Ping timeout: 480 seconds)
[3:22] * atheism (~atheism@182.48.117.114) has joined #ceph
[3:26] * vend3r (~Jaska@4MJAAD64D.tor-irc.dnsbl.oftc.net) Quit ()
[3:26] * Random (~Aramande_@tor-exit.dhalgren.org) has joined #ceph
[3:26] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[3:27] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:29] * IvanJobs (~hardes@103.50.11.146) has joined #ceph
[3:35] * yanzheng (~zhyan@125.70.21.212) has joined #ceph
[3:36] * Skaag (~lunix@cpe-172-91-77-84.socal.res.rr.com) has joined #ceph
[3:36] * dneary (~dneary@pool-96-237-170-97.bstnma.fios.verizon.net) has joined #ceph
[3:40] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) Quit (Quit: Leaving.)
[3:41] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[3:42] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[3:43] * Mika_c (~quassel@122.146.93.152) has joined #ceph
[3:47] * ronrib (~boswortr@45.32.242.135) has joined #ceph
[3:49] * IvanJobs (~hardes@103.50.11.146) Quit (Read error: Connection reset by peer)
[3:50] * zhaochao (~zhaochao@125.39.112.5) has joined #ceph
[3:56] * Random (~Aramande_@6AGAAATO4.tor-irc.dnsbl.oftc.net) Quit ()
[3:56] * loft (~Revo84@exit.tor.uwaterloo.ca) has joined #ceph
[3:56] * joshd1 (~jdurgin@71-92-201-212.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[4:03] * askb (~askb@117.208.164.87) has joined #ceph
[4:04] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[4:07] * vbellur (~vijay@71.234.224.255) has joined #ceph
[4:11] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Ping timeout: 480 seconds)
[4:17] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:20] * Eduardo__ (~Eduardo@bl5-4-245.dsl.telepac.pt) has joined #ceph
[4:22] * RameshN (~rnachimu@101.222.247.122) Quit (Ping timeout: 480 seconds)
[4:26] * loft (~Revo84@76GAAEF1F.tor-irc.dnsbl.oftc.net) Quit ()
[4:26] * Deiz (~JamesHarr@marylou.nos-oignons.net) has joined #ceph
[4:27] * Eduardo_ (~Eduardo@85.193.28.37.rev.vodafone.pt) Quit (Ping timeout: 480 seconds)
[4:32] <nils_> wongda, right channel, wrong time ;)
[4:35] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[4:35] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[4:36] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[4:36] * ndevos (~ndevos@nat-pool-ams2-5.redhat.com) Quit (Remote host closed the connection)
[4:53] * georgem (~Adium@45.72.132.68) Quit (Quit: Leaving.)
[4:55] * Eduardo_ (~Eduardo@85.193.28.37.rev.vodafone.pt) has joined #ceph
[4:56] * Deiz (~JamesHarr@76GAAEF21.tor-irc.dnsbl.oftc.net) Quit ()
[4:56] * OODavo (~Swompie`@185.62.190.38) has joined #ceph
[5:01] * Eduardo__ (~Eduardo@bl5-4-245.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[5:02] * overclk (~quassel@117.202.103.63) has joined #ceph
[5:03] * kefu_ (~kefu@114.92.120.83) has joined #ceph
[5:07] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[5:07] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[5:09] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[5:09] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) has joined #ceph
[5:11] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[5:13] * wschulze (~wschulze@cpe-72-225-192-123.nyc.res.rr.com) has joined #ceph
[5:13] * wschulze (~wschulze@cpe-72-225-192-123.nyc.res.rr.com) has left #ceph
[5:16] * wongda (~wongda@CPEbc4dfb5bc333-CMbc4dfb5bc330.cpe.net.cable.rogers.com) Quit (Quit: Leaving)
[5:23] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[5:23] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[5:26] * OODavo (~Swompie`@6AGAAATSS.tor-irc.dnsbl.oftc.net) Quit ()
[5:39] * vata1 (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[5:39] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Remote host closed the connection)
[5:53] * Vacuum_ (~Vacuum@88.130.211.174) has joined #ceph
[5:56] * Arfed1 (~Blueraven@6AGAAATUR.tor-irc.dnsbl.oftc.net) has joined #ceph
[5:58] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[5:58] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[6:00] * Vacuum__ (~Vacuum@88.130.205.88) Quit (Ping timeout: 480 seconds)
[6:06] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:12] * rakeshgm (~rakesh@106.51.225.4) Quit (Quit: Leaving)
[6:12] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[6:21] * thansen (~thansen@162.219.43.108) has joined #ceph
[6:24] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[6:25] * yguang11 (~yguang11@c-73-189-35-7.hsd1.ca.comcast.net) has joined #ceph
[6:26] * Arfed1 (~Blueraven@6AGAAATUR.tor-irc.dnsbl.oftc.net) Quit ()
[6:26] * Jourei (~Gibri@6AGAAATVW.tor-irc.dnsbl.oftc.net) has joined #ceph
[6:26] * yguang11_ (~yguang11@2001:4998:effd:7804::1087) has joined #ceph
[6:31] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[6:33] * yguang11 (~yguang11@c-73-189-35-7.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[6:33] * yatin (~yatin@161.163.44.8) has joined #ceph
[6:39] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[6:43] * vata1 (~vata@cable-21.246.173-197.electronicbox.net) Quit (Quit: Leaving.)
[6:43] * boogibugs (~boogibugs@gandalf.csc.fi) Quit (Remote host closed the connection)
[6:52] * dneary (~dneary@pool-96-237-170-97.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[6:56] * Jourei (~Gibri@6AGAAATVW.tor-irc.dnsbl.oftc.net) Quit ()
[6:56] * blank (~Kottizen@hessel2.torservers.net) has joined #ceph
[6:56] * Skaag (~lunix@cpe-172-91-77-84.socal.res.rr.com) Quit (Quit: Leaving.)
[6:57] * swami1 (~swami@49.44.57.244) has joined #ceph
[7:00] * swami2 (~swami@49.32.0.58) has joined #ceph
[7:03] * _ndevos (~ndevos@nat-pool-ams2-5.redhat.com) has joined #ceph
[7:03] * _ndevos is now known as ndevos
[7:05] * swami1 (~swami@49.44.57.244) Quit (Ping timeout: 480 seconds)
[7:05] * kefu_ is now known as kefu
[7:18] * Skaag (~lunix@108.47.204.128) has joined #ceph
[7:20] * rraja (~rraja@121.244.87.117) has joined #ceph
[7:22] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[7:26] * blank (~Kottizen@6AGAAATW1.tor-irc.dnsbl.oftc.net) Quit ()
[7:26] * Quatroking (~isaxi@94.155.49.47) has joined #ceph
[7:51] * rdas (~rdas@106.221.130.67) has joined #ceph
[7:53] * Skaag (~lunix@108.47.204.128) Quit (Quit: Leaving.)
[7:56] * Quatroking (~isaxi@76GAAEGAD.tor-irc.dnsbl.oftc.net) Quit ()
[7:56] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:05] * shylesh__ (~shylesh@59.95.71.132) has joined #ceph
[8:09] * MannerMan (~oscar@user170.217-10-117.netatonce.net) has joined #ceph
[8:10] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[8:14] * rdas (~rdas@106.221.130.67) Quit (Quit: Leaving)
[8:21] * vikhyat (~vumrao@121.244.87.116) Quit (Remote host closed the connection)
[8:23] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[8:24] * AvengerMoJo (~alex@27.147.36.228) has joined #ceph
[8:33] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) Quit (Quit: Bye)
[8:34] * dyasny (~dyasny@46-117-8-108.bb.netvision.net.il) has joined #ceph
[8:35] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[8:37] * rendar (~I@host248-181-dynamic.19-79-r.retail.telecomitalia.it) has joined #ceph
[8:39] * rdas (~rdas@121.244.87.116) has joined #ceph
[8:41] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) has joined #ceph
[8:41] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) Quit ()
[8:41] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) has joined #ceph
[8:43] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) Quit ()
[8:43] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) has joined #ceph
[8:47] * yguang11_ (~yguang11@2001:4998:effd:7804::1087) Quit (Ping timeout: 480 seconds)
[8:49] * garphy is now known as garphy`aw
[8:52] <swami2> Hi
[8:53] <swami2> I have imported a RAW image to the rbd 'images' pool and created a snapshot from that image... when I tried to set protected for this snapshot, it failed with the below error
[8:53] * Geph (~Geoffrey@41.77.153.99) has joined #ceph
[8:53] <swami2> rbd --pool images snap protect --snap snap a88c6600-5781-475c-8806-9723a976425c rbd: protecting snap failed: (38) Function not implemented 2016-04-13 06:47:11.879706 7f9697d4f780 -1 librbd: snap_protect: image must support layering
[8:54] <swami2> and I don't see an option to add "--image-features layering", but it's working with RAW format images..
[8:54] <swami2> is there an issue with RAW images?
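
For reference: the (38) "image must support layering" error usually means the image is format 1 (the default in pre-Jewel releases), which cannot be layered; the RAW source format itself is not the problem. A sketch with hypothetical names:

    # Import the RAW file as a format 2 image, which supports layering:
    rbd import --image-format 2 ./image.raw images/myimage
    # Snapshot and protect as before:
    rbd snap create images/myimage@snap1
    rbd snap protect images/myimage@snap1
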
[8:55] * kawa2014 (~kawa@31.218.153.205) has joined #ceph
[8:58] * bvi (~bastiaan@185.56.32.1) has joined #ceph
[9:03] * dgurtner (~dgurtner@82.199.64.68) has joined #ceph
[9:03] * MannerMan (~oscar@user170.217-10-117.netatonce.net) Quit (Ping timeout: 480 seconds)
[9:05] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[9:05] * dyasny (~dyasny@46-117-8-108.bb.netvision.net.il) Quit (Ping timeout: 480 seconds)
[9:05] * ZombieTree (~Esge@64.18.82.164) has joined #ceph
[9:06] * wwdillingham (~LobsterRo@209-6-222-74.c3-0.hdp-ubr1.sbo-hdp.ma.cable.rcn.com) Quit (Quit: wwdillingham)
[9:10] * MannerMan (~oscar@user170.217-10-117.netatonce.net) has joined #ceph
[9:12] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Ping timeout: 480 seconds)
[9:13] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[9:13] * Miouge (~Miouge@94.136.92.20) has joined #ceph
[9:15] * dyasny (~dyasny@46-117-8-108.bb.netvision.net.il) has joined #ceph
[9:15] * ade (~abradshaw@dslb-088-075-179-056.088.075.pools.vodafone-ip.de) has joined #ceph
[9:16] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[9:20] * yatin (~yatin@161.163.44.8) Quit (Remote host closed the connection)
[9:21] * analbeard (~shw@support.memset.com) has joined #ceph
[9:22] * rraja is now known as rraja|afk
[9:22] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[9:26] * adun153 (~ljtirazon@112.198.90.40) has joined #ceph
[9:27] * natarej (~natarej@CPE-101-181-149-113.lnse5.cha.bigpond.net.au) has joined #ceph
[9:29] * fsimonce (~simon@host201-70-dynamic.26-79-r.retail.telecomitalia.it) has joined #ceph
[9:30] * olid1981111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[9:32] * natarej_ (~natarej@CPE-101-181-53-14.lnse4.cha.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[9:34] * derjohn_mobi (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[9:35] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[9:35] * ZombieTree (~Esge@6AGAAAT17.tor-irc.dnsbl.oftc.net) Quit ()
[9:35] * demonspork (~MKoR@exit1.ipredator.se) has joined #ceph
[9:37] * garphy`aw is now known as garphy
[9:38] * jordanP (~jordan@bdv75-2-81-57-250-57.fbx.proxad.net) has joined #ceph
[9:40] * olid1981111115 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[9:40] * olid1981111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[9:42] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[9:49] * olid1981111116 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[9:49] * olid1981111115 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[9:53] * jordanP (~jordan@bdv75-2-81-57-250-57.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[9:55] * derjohn_mobi (~aj@fw.gkh-setu.de) has joined #ceph
[9:57] * IvanJobs (~hardes@103.50.11.146) has joined #ceph
[9:58] * olid1981111116 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[9:59] * olid1981111116 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[9:59] <IvanJobs> Hi cephers
[10:00] <IvanJobs> I have a question about ceph's config item "osd pool default size".
[10:00] <IvanJobs> if I set it to 2, does that mean I will have 2 copies of my data object in the cluster, or 3?
[10:01] <IvanJobs> I have done some testing and it seems to be 3. I just want to confirm it right here.
[10:02] * laevar (~jschulz1@134.76.80.11) has joined #ceph
[10:02] * Miouge (~Miouge@94.136.92.20) has joined #ceph
[10:05] * demonspork (~MKoR@4MJAAD7DM.tor-irc.dnsbl.oftc.net) Quit ()
[10:05] * ghostnote (~Nephyrin@marylou.nos-oignons.net) has joined #ceph
[10:06] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[10:08] * olid1981111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[10:08] * olid1981111116 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[10:09] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[10:09] * b0e (~aledermue@213.95.25.82) has joined #ceph
[10:10] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[10:10] * derjohn_mobi (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[10:13] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:14] * MannerMan (~oscar@user170.217-10-117.netatonce.net) Quit (Ping timeout: 480 seconds)
[10:15] <vikhyat> IvanJobs: you will have 2
[10:15] * rraja|afk is now known as rraja
[10:15] <vikhyat> ceph osd map
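
A few ways to verify the replica count vikhyat describes (the pool name "images" and the object name are examples; note that "osd pool default size" only affects pools created after it is set):

    ceph osd pool get images size           # prints "size: 2" for 2 replicas
    ceph osd dump | grep 'replicated size'  # per-pool replication settings
    ceph osd map images some-object         # the OSDs one object maps to
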
[10:16] * yatin (~yatin@161.163.44.8) has joined #ceph
[10:16] * olid1981111118 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[10:16] * olid1981111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[10:19] * derjohn_mobi (~aj@fw.gkh-setu.de) has joined #ceph
[10:19] <IvanJobs> Well, I found another question with an rbd device: when I write a file of size 2G into the mounted dir, no error occurs
[10:20] * bcao (~oftc-webi@223.80.96.63) has joined #ceph
[10:20] <IvanJobs> why? the rbd device I mounted has a capacity of just 1G
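
For reference: rbd images are thin-provisioned, and buffered writes can appear to succeed until the page cache is flushed, which may explain the missing error. A hedged check, with example names:

    rbd info rbd/myimage      # confirm the image really is 1G
    df -h /mnt/myrbd          # what the filesystem reports
    # Force writeback so any ENOSPC surfaces at the end of the copy:
    dd if=/dev/zero of=/mnt/myrbd/big bs=1M count=2048 conv=fsync
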
[10:20] * yatin (~yatin@161.163.44.8) Quit (Remote host closed the connection)
[10:21] <bcao> Hi all. We are testing ceph. We disconnected a cable of a ceph host which contains 14 OSDs, then found it takes 10 mins until all the OSDs are marked down in ceph osd tree. Any idea how to speed up that time?
[10:22] * yatin (~yatin@161.163.44.8) has joined #ceph
[10:24] * dw (~dwaas@nat1.scz.suse.com) has joined #ceph
[10:25] * olid1981111119 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[10:25] * olid1981111118 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[10:28] * karnan (~karnan@121.244.87.117) has joined #ceph
[10:28] * evelu (~erwan@46.231.131.178) has joined #ceph
[10:28] * Lea (~LeaChim@host86-159-239-193.range86-159.btcentralplus.com) has joined #ceph
[10:29] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) has joined #ceph
[10:29] * MannerMan (~oscar@user170.217-10-117.netatonce.net) has joined #ceph
[10:31] <IcePic> bcao: http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/
[10:31] <IcePic> in particular, "mon osd down out interval" I guess
[10:33] * todin (tuxadero@kudu.in-berlin.de) has joined #ceph
[10:33] <T1w> bcao: be careful - you do not want to start backfill and reshuffle of data unless the node (host) has no chance of coming back within reasonable time
[10:34] <T1w> 10 mins is not that long in real time - i.e. a reboot of a node with UEFI boot, spinup of disks, firmware init etc etc etc..
[10:34] * olid1981111119 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[10:34] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:88a6:1d20:ea65:3830) has joined #ceph
[10:34] * olid1981111119 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[10:35] <T1w> you can of course use "noout" to have ceph not "out" any missing OSDs, but..
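
The knobs from the linked mon-osd-interaction page, sketched with example values only (tune with care, per T1w's warning):

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [mon]
    ; failure reports needed before an OSD is marked down
    mon osd min down reports = 3
    ; seconds a down OSD waits before being marked out
    mon osd down out interval = 300
    EOF
    # T1w's "noout" flag, for planned maintenance:
    ceph osd set noout
    ceph osd unset noout    # remember to clear it afterwards
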
[10:35] * ghostnote (~Nephyrin@6AGAAAT5S.tor-irc.dnsbl.oftc.net) Quit ()
[10:35] * rapedex (~Kwen@5.135.65.145) has joined #ceph
[10:39] * efirs (~firs@c-50-185-70-125.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[10:40] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[10:43] * olid19811111110 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[10:43] * olid1981111119 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[10:44] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:49] * TMM (~hp@2a02:a210:500:9c80:3602:86ff:fe6f:bb84) has joined #ceph
[10:52] * olid19811111111 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[10:52] * olid19811111110 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[10:55] <post-factum> is there some possibility to reparent an image against a new parent? let's say I have image1 and image2 created by copying, and now I want image1 to be a parent of image2, to save some space
[10:55] * shyu (~Shanzhi@119.254.120.66) Quit (Ping timeout: 480 seconds)
[10:58] * TMM_ (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[11:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Remote host closed the connection)
[11:01] * smerz (~ircircirc@37.74.194.90) Quit (Quit: Leaving)
[11:01] * TMM (~hp@2a02:a210:500:9c80:3602:86ff:fe6f:bb84) Quit (Ping timeout: 480 seconds)
[11:01] * olid19811111111 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:01] * olid19811111111 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:02] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[11:03] * linjan (~linjan@76GAAED1T.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[11:05] * rapedex (~Kwen@6AGAAAT7M.tor-irc.dnsbl.oftc.net) Quit ()
[11:05] * KapiteinKoffie (~Crisco@ns316491.ip-37-187-129.eu) has joined #ceph
[11:08] * shyu (~Shanzhi@119.254.120.67) has joined #ceph
[11:10] * olid19811111112 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:10] * olid19811111111 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:20] * olid19811111113 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:20] * olid19811111112 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:28] * olid19811111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:28] * olid19811111113 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:30] * garphy is now known as garphy`aw
[11:31] * kefu (~kefu@114.92.120.83) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[11:32] * garphy`aw is now known as garphy
[11:33] * yatin (~yatin@161.163.44.8) Quit (Remote host closed the connection)
[11:35] * KapiteinKoffie (~Crisco@06SAAA7HG.tor-irc.dnsbl.oftc.net) Quit ()
[11:35] * Rosenbluth (~Spessu@4MJAAD7GH.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:38] * huangjun|2 (~kvirc@113.57.168.154) Quit (Ping timeout: 480 seconds)
[11:38] * olid19811111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:38] * olid19811111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:41] * thomnico (~thomnico@cable-46.253.163.149.coditel.net) has joined #ceph
[11:42] * yatin (~yatin@161.163.44.8) has joined #ceph
[11:43] * bcao (~oftc-webi@223.80.96.63) Quit (Quit: Page closed)
[11:45] * pabluk__ is now known as pabluk_
[11:46] * shyu (~Shanzhi@119.254.120.67) Quit (Ping timeout: 480 seconds)
[11:47] <Jeeves_> Q: Can I safely give customers access to an rbd pool without interfering with my own production pools?
[11:47] * olid19811111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:47] * olid19811111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:50] * vvb (~vvb@168.235.85.239) has joined #ceph
[11:52] * bla (~b.laessig@chimeria.ext.pengutronix.de) Quit (Ping timeout: 480 seconds)
[11:54] * bla (~b.laessig@chimeria.ext.pengutronix.de) has joined #ceph
[11:55] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[11:55] * shyu (~Shanzhi@119.254.120.66) has joined #ceph
[11:56] * olid19811111115 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[11:56] * olid19811111114 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[11:58] * Hemanth (~hkumar_@121.244.87.117) has joined #ceph
[11:59] * adun153 (~ljtirazon@112.198.90.40) Quit (Ping timeout: 480 seconds)
[11:59] * adun153 (~ljtirazon@112.198.90.40) has joined #ceph
[12:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Remote host closed the connection)
[12:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[12:01] * vvb (~vvb@168.235.85.239) Quit (Quit: leaving)
[12:02] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[12:02] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[12:03] * dkrdkr (~dkrdkr@2604:180:3:6ee::c764) Quit (Quit: ZNC - http://znc.in)
[12:05] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[12:05] * Rosenbluth (~Spessu@4MJAAD7GH.tor-irc.dnsbl.oftc.net) Quit ()
[12:05] * olid19811111115 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[12:05] * tokie (~Vidi@91.213.8.235) has joined #ceph
[12:05] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[12:05] * olid19811111115 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[12:05] * adun153 (~ljtirazon@112.198.90.40) Quit (Remote host closed the connection)
[12:05] * thomnico (~thomnico@cable-46.253.163.149.coditel.net) Quit (Ping timeout: 480 seconds)
[12:06] * Nickname (~Name@168.235.85.239) has joined #ceph
[12:07] * Nickname (~Name@168.235.85.239) Quit ()
[12:07] * vvb (~vvb@168.235.85.239) has joined #ceph
[12:09] <IcePic> Jeeves_: you would then have to allow them to talk directly to your mons and your osds, so if they screw up (un)intentionally, it would affect the rest, would it not?
[11:11] <vvb> hi all.. I am trying to wrap my head around the difference between mons that participate in the cluster and the ones that don't.
[12:11] * alexxy (~alexxy@biod.pnpi.spb.ru) Quit (Ping timeout: 480 seconds)
[12:12] <Jeeves_> IcePic: I don't know. That's what I'm wondering :)
[11:12] <vvb> I see the mon_initial_members variable defines the nodes that participate in the cluster..
[12:12] <Jeeves_> vvb: afaik, that's where the monitors start looking initially, to get a monmap
[12:14] * olid19811111116 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[12:14] * olid19811111115 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[12:16] * mancdaz (~mancdaz@2a00:1a48:7806:117:be76:4eff:fe08:7623) Quit (Quit: ZNC - http://znc.in)
[12:16] * mancdaz (~mancdaz@2a00:1a48:7806:117:be76:4eff:fe08:7623) has joined #ceph
[12:17] <IcePic> Jeeves_: I'm not aware of any specific threats, but the general principle seems to apply that if you let them in, they could disturb the rest of your production, in ways you and I cant imagine right now, but that do appear down the line.
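
One common mitigation, sketched with example names: scope each customer's cephx key to its own pool so the data plane is at least partitioned. As IcePic notes, this is not full isolation, since customers still talk to the same mons and OSDs:

    ceph osd pool create customer1-rbd 128
    ceph auth get-or-create client.customer1 \
        mon 'allow r' \
        osd 'allow rwx pool=customer1-rbd'
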
[12:19] <vvb> Jeeves_: ah! so if I were bringing 3 nodes up manually, I should bring up the first node as a mon.. and specify mon_initial_members = node1 in the ceph.conf of the other two nodes
[12:21] * Mika_c (~quassel@122.146.93.152) Quit (Remote host closed the connection)
[12:21] <Jeeves_> vvb: Well, I think i would make it a 'ring'
[12:21] <Jeeves_> so each node points to another
[12:24] * olid19811111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[12:24] * olid19811111116 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: Connection reset by peer)
[12:31] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[12:34] * olid19811111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Ping timeout: 480 seconds)
[12:34] * Miouge (~Miouge@94.136.92.20) has joined #ceph
[12:35] * tokie (~Vidi@6AGAAAUAE.tor-irc.dnsbl.oftc.net) Quit ()
[12:35] * MatthewH12 (~vend3r@2.tor.exit.babylon.network) has joined #ceph
[12:37] <IvanJobs> vvb: hm, mons always participate in the ceph cluster; what do you mean by the ones that don't?
[12:38] <IvanJobs> vvb: mon_initial_members is just used in the deployment phase, not in the running phase, right?
[12:41] <IvanJobs> vvb: something related I found: http://docs.ceph.com/docs/hammer/dev/mon-bootstrap/
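
A minimal sketch of the bootstrap settings under discussion; the fsid, hostnames, and addresses are hypothetical, and the same [global] section would normally be shipped to every node:

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [global]
    fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
    mon initial members = node1, node2, node3
    mon host = 10.0.0.1, 10.0.0.2, 10.0.0.3
    EOF
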
[12:43] * kawa2014 (~kawa@31.218.153.205) Quit (Quit: Leaving)
[12:43] * alexxy (~alexxy@biod.pnpi.spb.ru) has joined #ceph
[12:45] * olid19811111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[12:50] * Racpatel (~Racpatel@2601:87:3:3601:4e34:88ff:fe87:9abf) has joined #ceph
[12:50] * bara_ (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[12:51] * Racpatel (~Racpatel@2601:87:3:3601:4e34:88ff:fe87:9abf) Quit ()
[12:51] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Remote host closed the connection)
[12:51] * bara_ (~bara@nat-pool-brq-t.redhat.com) Quit (Remote host closed the connection)
[12:52] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[12:53] * olid19811111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) Quit (Read error: No route to host)
[12:54] * shyu (~Shanzhi@119.254.120.66) Quit (Remote host closed the connection)
[12:54] * olid19811111117 (~olid1982@aftr-185-17-206-44.dynamic.mnet-online.de) has joined #ceph
[12:55] <vvb> IvanJobs: ah! thanks . going through the link
[12:56] <IvanJobs> vvb: you're welcome
[12:57] <brians> IvanJobs was it you who was writing about running 100s of PM863s as OSDs the other day? Apols if it wasn't
[12:59] <IvanJobs> brians: no, I'm not
[12:59] <brians> no worries - thanks IvanJobs
[13:00] * Geoff (~Geoffrey@41.77.153.99) has joined #ceph
[13:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Remote host closed the connection)
[13:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[13:02] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[13:02] * Hemanth (~hkumar_@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:03] <T1w> brians: I think it was TMM
[13:04] * nebuchadnezzar (~dad@zion.baby-gnu.net) has joined #ceph
[13:04] <nebuchadnezzar> hello
[13:04] <brians> ok thanks T1w
[13:05] <brians> I will ping him later with a question :)
[13:05] * MatthewH12 (~vend3r@06SAAA7I0.tor-irc.dnsbl.oftc.net) Quit ()
[13:05] * zapu (~Yopi@50.7.151.127) has joined #ceph
[13:06] * Geph (~Geoffrey@41.77.153.99) Quit (Ping timeout: 480 seconds)
[13:09] <Jeeves_> brians: I run with 9 PM863's .. :)
[13:10] <brians> Hi Jeeves_ Aha!
[13:10] <brians> interesting
[13:10] <brians> I'm thinking of replacing some spinners with them
[13:11] <brians> purely for iops for db apps
[13:11] <brians> how do you find the performance with 9 ?
[13:11] <Jeeves_> they're pretty fast.
[13:11] <Jeeves_> I'm not sure what you would expect. I saw the cluster topping at 9k ceph iops last night
[13:12] <Jeeves_> SSD's were pretty much idle, I'm guessing qemu was the iops limiting factor
[13:17] * Frank___ (~Frank@149.210.210.150) Quit (Ping timeout: 480 seconds)
[13:17] * atheism (~atheism@182.48.117.114) Quit (Ping timeout: 480 seconds)
[13:20] * dw (~dwaas@nat1.scz.suse.com) Quit (Quit: ZNC 1.6.2 - http://znc.in)
[13:23] * vvb (~vvb@168.235.85.239) Quit (Quit: leaving)
[13:23] * vvb (~vvb@168.235.85.239) has joined #ceph
[13:24] <brians> ok cool
[13:24] <brians> Thanks Jeeves_ how is your MB/s throughput with 9? Are you using r=3?
[13:25] <Jeeves_> I'm doing r=2
[13:25] <Jeeves_> There is 4gbit connectivity. I've seen that full upon resync
[13:28] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Remote host closed the connection)
[13:28] * EinstCrazy (~EinstCraz@58.247.16.138) has joined #ceph
[13:32] * EinstCrazy (~EinstCraz@58.247.16.138) Quit (Remote host closed the connection)
[13:33] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[13:33] * Geph (~Geoffrey@41.77.153.99) has joined #ceph
[13:33] * dgurtner_ (~dgurtner@82.199.64.68) has joined #ceph
[13:35] * zapu (~Yopi@76GAAEGOP.tor-irc.dnsbl.oftc.net) Quit ()
[13:35] * Pirate (~Peaced@nl3x.mullvad.net) has joined #ceph
[13:35] * zhaochao_ (~zhaochao@124.202.191.137) has joined #ceph
[13:35] * dgurtner (~dgurtner@82.199.64.68) Quit (Ping timeout: 480 seconds)
[13:37] * Geoff (~Geoffrey@41.77.153.99) Quit (Ping timeout: 480 seconds)
[13:38] * ira (~ira@c-24-34-255-34.hsd1.ma.comcast.net) has joined #ceph
[13:38] * wjw-freebsd (~wjw@176.74.240.1) has joined #ceph
[13:40] * zhaochao (~zhaochao@125.39.112.5) Quit (Ping timeout: 480 seconds)
[13:40] * zhaochao_ is now known as zhaochao
[13:43] * shylesh__ (~shylesh@59.95.71.132) Quit (Remote host closed the connection)
[13:45] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[13:46] * wyang (~wyang@122.225.69.4) has joined #ceph
[13:48] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[13:49] * dw (~devin@nat1.scz.suse.com) has joined #ceph
[13:49] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[13:50] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) has joined #ceph
[13:50] * ChanServ sets mode +o nhm
[13:53] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[13:54] * georgem (~Adium@24.114.67.147) has joined #ceph
[13:54] <nebuchadnezzar> We need to deploy a new private OpenNebula and wonder if ceph is something for us. Everything I read designs ceph for really huge storage, but we need to start very small (a dozen VMs, around 2TB) and be able to grow as needed. Do you think we could start with 3 servers (like Dell R630) and add bigger rack servers when needed?
[13:54] <brians> Thanks Jeeves_ how is the reliability ? Any issues yet?
[13:54] * georgem (~Adium@24.114.67.147) Quit ()
[13:54] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[13:54] <Jeeves_> brians: Not running that long yet
[13:54] * georgem (~Adium@206.108.127.16) has joined #ceph
[13:55] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[13:56] <brians> last question - are you using seperate jornals Jeeves_ ?
[13:56] <brians> journals even
[13:56] <Jeeves_> No, of course not
[13:58] * kawa2014 (~kawa@83.111.58.108) has joined #ceph
[13:58] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[14:00] * yatin (~yatin@161.163.44.8) Quit (Remote host closed the connection)
[14:01] <Jeeves_> That would slow things down
[14:02] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[14:03] * dugravot61 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) has joined #ceph
[14:05] * Pirate (~Peaced@76GAAEGPU.tor-irc.dnsbl.oftc.net) Quit ()
[14:05] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Ping timeout: 480 seconds)
[14:05] * drupal (~MonkeyJam@tor2r.ins.tor.net.eu.org) has joined #ceph
[14:06] <brians> Jeeves_ you have 3 samsung ssds in each osd server? Potentially offloading the journal to a PCIe DC3700 could actually improve performance and also the lifetime of the samsungs, no?
[14:07] <Jeeves_> brians: Yes. It might. But in my case the investment in three DC3700s would be useless. I prefer to buy cheapish ssds and scale horizontally
[14:08] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[14:10] <brians> Thanks for your time Jeeves_ :)
[14:11] <brians> you've done the libreNMS integration?
[14:12] <Jeeves_> brians: You're welcome, and yes.
[14:12] <brians> Cool..
[14:12] <brians> Another reason to take a look at it so :)
[14:13] <Jeeves_> afaik it's the only graphing thing for Ceph
[14:16] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[14:17] * pam (~pam@193.106.183.1) has joined #ceph
[14:17] * RayTracer (~RayTracer@153.19.7.39) Quit (Quit: Leaving...)
[14:18] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[14:18] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[14:21] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[14:22] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[14:22] <brians> ok
[14:23] * Eduardo_ (~Eduardo@85.193.28.37.rev.vodafone.pt) Quit (Quit: Leaving)
[14:26] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) has joined #ceph
[14:27] * kefu (~kefu@183.193.38.176) has joined #ceph
[14:28] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[14:30] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[14:30] * c_soukup (~csoukup@136.63.84.142) has joined #ceph
[14:33] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[14:34] * Racpatel (~Racpatel@2601:87:3:3601::baf1) has joined #ceph
[14:35] * wyang (~wyang@122.225.69.4) Quit (Quit: This computer has gone to sleep)
[14:35] * drupal (~MonkeyJam@4MJAAD7J5.tor-irc.dnsbl.oftc.net) Quit ()
[14:35] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[14:35] * Zeis (~Jase@freedom.ip-eend.nl) has joined #ceph
[14:36] * wjw-freebsd2 (~wjw@176.74.240.1) has joined #ceph
[14:37] * wyang (~wyang@122.225.69.4) has joined #ceph
[14:39] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[14:39] * dugravot61 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) Quit (Quit: Leaving.)
[14:39] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:40] * wyang (~wyang@122.225.69.4) Quit ()
[14:41] * wjw-freebsd (~wjw@176.74.240.1) Quit (Ping timeout: 480 seconds)
[14:42] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[14:44] <Jeeves_> nebuchadnezzar: Sure
[14:49] * dyasny (~dyasny@46-117-8-108.bb.netvision.net.il) Quit (Ping timeout: 480 seconds)
[14:50] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) has joined #ceph
[14:50] * Bartek (~Bartek@78.10.129.82) has joined #ceph
[14:51] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:54] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[14:57] * zwu (~root@58.135.81.96) Quit (Ping timeout: 480 seconds)
[14:58] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[14:58] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[15:00] <nebuchadnezzar> Jeeves_: thanks, we already have an OpenNebula connected to a SAN and managing LUNs is quite painful, we don't want that anymore ;-)
[15:00] <Jeeves_> :)
[15:02] * dyasny (~dyasny@46-117-8-108.bb.netvision.net.il) has joined #ceph
[15:02] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) has joined #ceph
[15:02] <nebuchadnezzar> I'm reading documentation and mailing-list archives for what hardware to get; the "800TB - Ceph Physical Architecture Proposal" is interesting, but too high for us
[15:04] <m0zes> there is never too much storage.
[15:04] <m0zes> data expands to fit all limits.
[15:05] <nebuchadnezzar> m0zes: yes, but $ is limited for now :-D
[15:05] * Zeis (~Jase@76GAAEGSX.tor-irc.dnsbl.oftc.net) Quit ()
[15:05] * HoboPickle (~Ralth@46.166.138.162) has joined #ceph
[15:05] * huangjun (~kvirc@117.152.69.112) has joined #ceph
[15:06] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[15:07] * Geph (~Geoffrey@41.77.153.99) Quit (Ping timeout: 480 seconds)
[15:08] <nebuchadnezzar> m0zes: so the idea of starting small, make it useful, then expand
[15:08] * atheism (~atheism@106.38.140.252) has joined #ceph
[15:09] * neurodrone_ (~neurodron@pool-100-35-226-97.nwrknj.fios.verizon.net) Quit (Quit: neurodrone_)
[15:10] * wyang (~wyang@122.225.69.4) has joined #ceph
[15:10] * zwu (~root@58.135.81.96) has joined #ceph
[15:11] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[15:14] <Jeeves_> nebuchadnezzar: I started with 3x https://www.ahead-it.eu/en/shop/servers/1u-servers/si18b/supermicro-si18b--up-to-8-x-satasas-25-hot-swap--8-x-ddr-4--redundant-psu--1-x-pci-e-expansion/server=13269/config_now=1 with 3x 960GB SSD's each
[15:14] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[15:17] <T1> be very very careful about non-DC ssds in OSDs and in OSD nodes for OSD journals
[15:17] <T1> if the SSD dies, all OSDs whose journal was on that SSD die too
[15:18] <T1> .. and non-DC SSDs used for journals are not a good idea
[15:18] <tafelpoot> what is non-DC ?
[15:18] <T1> samsung 850 pro
[15:18] <T1> eg
[15:19] <T1> read http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
[15:19] <T1> the list of SSDs is still kept up to date
[15:19] <T1> stay away from the consumer-models
[15:19] <nebuchadnezzar> thanks a lot
[15:20] <T1> they could be used for OSD data, but not for journals
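
The gist of the linked test: ceph writes its journal with O_DSYNC, so what matters is synchronous 4k write performance, not the drive's headline numbers. Roughly the fio invocation from that post (destructive; /dev/sdX must be a scratch device):

    fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-test
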
[15:20] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[15:21] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[15:22] * Geph (~Geoffrey@41.77.153.99) has joined #ceph
[15:23] <T1> I have good experience using md-raid to create a mirrored device for journals so I can survive a single ssd failure
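
A sketch of that layout, with example device names: mirror two SSDs with md-raid, then carve one journal partition per OSD out of the md device:

    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    parted -s /dev/md0 mklabel gpt
    parted -s /dev/md0 mkpart journal-0 0% 25%    # one partition per OSD
    # then point each OSD at its partition, e.g. in ceph.conf:
    #   [osd.0]
    #   osd journal = /dev/md0p1
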
[15:23] * zhaochao (~zhaochao@124.202.191.137) Quit (Quit: ChatZilla 0.9.92 [Firefox 45.0.1/20160318172635])
[15:24] <nebuchadnezzar> thanks, I keep the links for later
[15:25] <T1> Sebastien Han's (leseb here on IRC) blog contains really good stuff
[15:26] <T1> from regular maintenance stuff to exciting new features
[15:26] <Jeeves_> I switched from 850 Pro's to PM863
[15:26] <Jeeves_> Helps a lot, in terms of performance :)
[15:26] <T1> if PM863 proves to be as solid as Intel S36xx or S37xx then, yes..
[15:27] <tafelpoot> any ballpark figures in the gain when moving journal to ssd?
[15:27] <T1> otherwise I'd still go for Intel 37xx
[15:27] <T1> tafelpoot: lower overhead for all client writes?
[15:27] <Jeeves_> T1: Pricing is quite different
[15:27] <T1> writing to a ssd has lower latency compared to rotating rust
[15:28] <Jeeves_> tafelpoot: Any write
[15:28] <T1> .. and since ceph only acks a write once it's been written as many times as the number of copies you have requested - it could be quite a lot
[15:28] <T1> Jeeves_: nah, not really
[15:29] <T1> when I looked at 850 pro the S3710s were only some 100USD more expensive
[15:29] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[15:29] <T1> that's not even worth arguing about - especially compared to the horror stories the mailing list contains about failed samsung ssds over time
[15:29] <tafelpoot> T1: lower latency -> higher throughput?
[15:30] <T1> .. in one thread someone wrote that within a year they had replaced 25+ 840 Pros
[15:30] <T1> and were now replacing them proactively
[15:30] * askb_ (~askb@117.208.167.42) has joined #ceph
[15:30] <T1> tafelpoot: yes
[15:31] <analbeard> hi guys, can anyone suggest any good information sources regarding how a cluster's performance scales as it gets bigger? obviously the bigger it gets the less of a performance increase you see by adding a new node
[15:31] <T1> .. unless your clients have all the time in the world..
[15:31] <analbeard> i'm after some ballpark guestimations, just for a better idea and understanding
[15:31] <analbeard> we're at quite a small scale at the mo, 2 separate clusters at 165tb raw each
[15:32] <T1> Jeeves_: and in regards to the price difference between S3710 and PM863, it's simple: the PM863 is unproven and it's not possible to know how it fares over some years to come..
[15:32] * Bartek (~Bartek@78.10.129.82) Quit (Ping timeout: 480 seconds)
[15:32] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[15:32] <T1> while the S3700 series has a proven track record
[15:33] <Jeeves_> T1: Yes, that's true. But we'll see. :)
[15:33] * vikhyat (~vumrao@123.252.242.114) has joined #ceph
[15:33] <leseb> T1: thanks :)
[15:33] <T1> leseb: :)
[15:33] <nebuchadnezzar> Jeeves_: so, I end up at around $3k per server, but we must buy from Dell though, since as a French administration we can't buy where we want, and SSDs are very expensive ($1,118.90 for 960GB)
[15:34] <Jeeves_> nebuchadnezzar: Yes, good luck with that. :)
[15:34] <T1> nebuchadnezzar: alas..
[15:34] <T1> actually we ended up with getting dell servers with no disks
[15:34] <Jeeves_> Dell often gives high discounts
[15:34] <Jeeves_> unless they know you MUST buy from them, I suspect :)
[15:34] <T1> and then bought carriers, 3.5"->2.5" converters and disks from local retail
[15:35] <analbeard> leseb: by chance, have you written anything at all on how a cluster's performance scales as it grows?
[15:35] * HoboPickle (~Ralth@46.166.138.162) Quit ()
[15:35] * Epi (~Bwana@93.115.95.206) has joined #ceph
[15:35] <T1> much much MUCH lower than our regular rates - and I knew exactly what ssds we got and what disks we put in
[15:36] <T1> and we do get rather good prices from them
[15:36] * rakeshgm (~rakesh@121.244.87.117) Quit (Quit: Leaving)
[15:37] * wjw-freebsd (~wjw@176.74.240.1) has joined #ceph
[15:37] * askb (~askb@117.208.164.87) Quit (Ping timeout: 480 seconds)
[15:37] <leseb> analbeard: nop not really
[15:37] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[15:38] <analbeard> leseb: ok, no worries, thought it was worth a try!
[15:40] * EinstCrazy (~EinstCraz@101.85.214.4) has joined #ceph
[15:40] <tafelpoot> T1: problem is that dell doesn't like it, and if something goes booboo with your sata controller or something, they claim it's your own fault
[15:41] <T1> tafelpoot: just say something went wrong during a firmware update..
[15:41] <tafelpoot> we had the same with HP
[15:41] * Bartek (~Bartek@78.10.129.82) has joined #ceph
[15:41] <T1> .. I tried to flash the PERC H310 from IR to IT firmware..
[15:41] <T1> first step went well..
[15:41] <T1> reboot..
[15:41] <tafelpoot> lol
[15:41] * wjw-freebsd2 (~wjw@176.74.240.1) Quit (Ping timeout: 480 seconds)
[15:41] <tafelpoot> good idea :D
[15:41] <tafelpoot> hah
[15:41] <T1> and then it never made it past POST
[15:42] <T1> "there is an unknown storage controller detected in the internale storage PCIe controller slot"
[15:42] <T1> .. that slot MIGHT have been PCIe, but it was not any old regular PCie slot - propritary as hell..
[15:43] <Jeeves_> An ex-colleague of mine placed parts in a microwave if dell/hp/sun was making trouble about replacing stuff
[15:43] * IvanJobs (~hardes@103.50.11.146) Quit (Ping timeout: 480 seconds)
[15:43] <T1> sooo I called them up and told them that I had a power failure during firmware update and they shipped a new controller to me
[15:43] <T1> done deal
[15:44] <darkfader> iirc that's called fraud but it might be my memory ;p
[15:44] <T1> .. and then I found out that the H310 has a nice non-raid feature, so I have passthrough on all disks without messing about with firmware reflashing
[15:44] <T1> darkfader: yes, and you never drive above the speed limit or cross when it's red etc etc etc..
[15:45] <T1> afk - picking up my son and wife
[15:49] <darkfader> T1: sorry, but at least be professional enough to not run around and talk about it. that's all i meant. no big deal.
[15:53] * c_soukup (~csoukup@136.63.84.142) Quit (Ping timeout: 480 seconds)
[15:56] * remix_auei (~remix_tj@bonatti.remixtj.net) Quit (Remote host closed the connection)
[15:59] <germano> Hi, who is gonna attend the Linux Vault conference, apart from the speakers?
[15:59] * wwdillingham (~LobsterRo@140.247.242.44) has joined #ceph
[15:59] * remix_tj (~remix_tj@bonatti.remixtj.net) has joined #ceph
[16:00] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[16:01] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:04] * remix_tj (~remix_tj@bonatti.remixtj.net) Quit ()
[16:05] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:05] * remix_tj (~remix_tj@bonatti.remixtj.net) has joined #ceph
[16:05] * Epi (~Bwana@6AGAAAUJI.tor-irc.dnsbl.oftc.net) Quit ()
[16:05] * Aal (~LRWerewol@tor2.asmer.com.ua) has joined #ceph
[16:08] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[16:12] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[16:14] * vata (~vata@207.96.182.162) has joined #ceph
[16:15] * rakeshgm (~rakesh@121.244.87.124) has joined #ceph
[16:15] * overclk_ (~quassel@117.202.96.59) has joined #ceph
[16:15] * EinstCrazy (~EinstCraz@101.85.214.4) Quit (Remote host closed the connection)
[16:16] * Bartek (~Bartek@78.10.129.82) Quit (Ping timeout: 480 seconds)
[16:16] * EinstCrazy (~EinstCraz@101.85.214.4) has joined #ceph
[16:18] * overclk (~quassel@117.202.103.63) Quit (Ping timeout: 480 seconds)
[16:21] <rkeene> My pgmap epoch increases at a rate of about 1 per second... is this bad ?
[16:24] * EinstCrazy (~EinstCraz@101.85.214.4) Quit (Ping timeout: 480 seconds)
[16:24] * guerby (~guerby@ip165.tetaneutral.net) Quit (Ping timeout: 480 seconds)
[16:26] * karnan (~karnan@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:31] * karnan (~karnan@121.244.87.117) has joined #ceph
[16:31] * wyang (~wyang@122.225.69.4) Quit (Read error: Connection reset by peer)
[16:35] * Aal (~LRWerewol@06SAAA7PF.tor-irc.dnsbl.oftc.net) Quit ()
[16:35] * rwheeler (~rwheeler@pool-173-48-195-215.bstnma.fios.verizon.net) Quit (Quit: Leaving)
[16:35] * wyang (~wyang@114.111.166.44) has joined #ceph
[16:35] * yanzheng (~zhyan@125.70.21.212) Quit (Quit: This computer has gone to sleep)
[16:36] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:37] * wjw-freebsd2 (~wjw@176.74.240.1) has joined #ceph
[16:39] * swami2 (~swami@49.32.0.58) Quit (Quit: Leaving.)
[16:39] * smerz (~ircircirc@37.74.194.90) has joined #ceph
[16:41] * c_soukup (~csoukup@159.140.254.100) has joined #ceph
[16:41] * Frank_ (~Frank@149.210.210.150) has joined #ceph
[16:42] * wjw-freebsd (~wjw@176.74.240.1) Quit (Ping timeout: 480 seconds)
[16:42] * rwheeler (~rwheeler@pool-173-48-195-215.bstnma.fios.verizon.net) has joined #ceph
[16:42] * Sliker (~Schaap@customer-46-39-102-250.stosn.net) has joined #ceph
[16:43] * wyang (~wyang@114.111.166.44) Quit (Quit: This computer has gone to sleep)
[16:43] * med (~medberry@71.74.177.250) Quit (Remote host closed the connection)
[16:44] * med (~medberry@71.74.177.250) has joined #ceph
[16:44] * Geph (~Geoffrey@41.77.153.99) Quit (Ping timeout: 480 seconds)
[16:45] * rakeshgm (~rakesh@121.244.87.124) Quit (Quit: Leaving)
[16:46] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[16:47] * wyang (~wyang@122.225.69.4) has joined #ceph
[16:48] * overclk (~quassel@117.202.96.59) has joined #ceph
[16:49] * guerby (~guerby@ip165.tetaneutral.net) has joined #ceph
[16:50] * shaunm (~shaunm@74.83.215.100) Quit (Ping timeout: 480 seconds)
[16:50] * wushudoin (~wushudoin@2601:646:8201:7769:2ab2:bdff:fe0b:a6ee) has joined #ceph
[16:51] * overclk_ (~quassel@117.202.96.59) Quit (Ping timeout: 480 seconds)
[16:52] * rmart04 (~rmart04@support.memset.com) Quit (Quit: rmart04)
[16:55] * Miouge (~Miouge@94.136.92.20) Quit (Quit: Miouge)
[16:55] * Bartek (~Bartek@78.10.129.82) has joined #ceph
[16:56] * dw (~devin@nat1.scz.suse.com) Quit (Quit: leaving)
[16:57] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[16:58] * ircolle (~Adium@c-71-229-136-109.hsd1.co.comcast.net) has joined #ceph
[17:00] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) Quit (Quit: Leaving.)
[17:00] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) has joined #ceph
[17:04] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[17:04] * pabluk_ is now known as pabluk__
[17:04] * pabluk__ is now known as pabluk_
[17:05] * Geph (~Geoffrey@41.77.153.99) has joined #ceph
[17:05] * wjw-freebsd2 (~wjw@176.74.240.1) Quit (Ping timeout: 480 seconds)
[17:06] * garphy is now known as garphy`aw
[17:08] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[17:11] * briun (~oftc-webi@178.237.98.13) Quit (Quit: Page closed)
[17:12] * Sliker (~Schaap@6AGAAAUNB.tor-irc.dnsbl.oftc.net) Quit ()
[17:12] * Miouge (~Miouge@94.136.92.20) has joined #ceph
[17:16] * karnan (~karnan@121.244.87.117) Quit (Remote host closed the connection)
[17:16] <debian112> ok guys, I need some serious help here.
[17:16] <rkeene> 42
[17:16] <m0zes> rkeene: it is expected.
[17:16] <rkeene> m0zes, Thanks
[17:16] <debian112> I have a server that is not starting its OSDs
[17:17] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) has joined #ceph
[17:17] <rkeene> debian112, Why not ?
[17:17] * Kayla (~tZ@06SAAA7R5.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:17] * davidz1 (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[17:18] <debian112> not sure, when I start the osd
[17:18] <debian112> like this: start ceph-osd id=9
[17:18] <etienneme> Check the logs of the OSD :)
[17:18] <debian112> the process is running in the background, or it seems like it is
[17:19] <debian112> but ceph osd tree shows different
[17:19] <debian112> posting logs:
[17:20] * RameshN (~rnachimu@121.244.87.117) has joined #ceph
[17:20] <debian112> http://paste.debian.net/432575/
[17:21] <debian112> the OSDs are 50% loaded, which shouldn't stop them from starting
[17:21] <etienneme> And what does ceph -s output?
[17:21] * pam (~pam@193.106.183.1) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[17:22] <debian112> http://paste.debian.net/432577/
[17:24] <debian112> we set up this cluster a while back, and it was just running. We had one team that was testing, but apparently they went prod, and didn't notify anyone
[17:25] <debian112> when I checked it was using 30TB
[17:25] <debian112> what a joy!
[17:26] <debian112> the journals are on ssd, with a ratio of 1ssd to 3hdd
[17:27] <etienneme> You don't have any new logs on the OSD?
[17:27] <etienneme> Would be great to have some error logs :(
[17:28] * shaunm (~shaunm@72.49.2.237) has joined #ceph
[17:28] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[17:29] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[17:29] <debian112> that's all in the logs
[17:29] <debian112> I assume I can crank up more logging..
[17:30] <hoonetorg> hi guys, can i do something about the fact that OSDs of equal weight and size differ in %used between 32.43% and 68.62%?
[17:30] <etienneme> Yep, you can increase verbosity to 20 (but there will be many logs)
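[Editor's note: a minimal sketch of raising OSD debug verbosity as suggested above, assuming osd.9 from the earlier "start ceph-osd id=9"; the same settings can be made persistent under [osd] in ceph.conf.]
    # inject higher debug levels into a running OSD daemon
    ceph tell osd.9 injectargs '--debug-osd 20 --debug-filestore 20'
    # or, persistently, in /etc/ceph/ceph.conf before restarting the daemon:
    #   [osd]
    #   debug osd = 20
    #   debug filestore = 20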
[17:31] <etienneme> hoonetorg: check your pg/pgp count
[17:31] * atheism (~atheism@106.38.140.252) Quit (Ping timeout: 480 seconds)
[17:31] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[17:34] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[17:34] <hoonetorg> etienneme: how will i do this
[17:34] <hoonetorg> the hosts are all equally deployed
[17:34] <etienneme> ceph df to list your pools
[17:35] <debian112> etienneme: http://paste.debian.net/432579/
[17:35] <etienneme> and ceph osd pool get poolname pg_num
[17:35] <etienneme> ceph osd pool get poolname pgp_num
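[Editor's note: the checks etienneme lists, gathered into one hedged sketch; "one" stands in for the pool name, and "ceph osd df" assumes hammer or later.]
    ceph df                         # list pools and how much data each holds
    ceph osd pool get one pg_num    # placement-group count for the pool
    ceph osd pool get one pgp_num   # should normally match pg_num
    ceph osd df                     # per-OSD %used, to quantify the imbalance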
[17:35] <etienneme> debian112: I have no idea, sorry :/
[17:35] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[17:35] * wyang (~wyang@122.225.69.4) Quit (Quit: This computer has gone to sleep)
[17:36] <etienneme> hoonetorg: read this :) http://docs.ceph.com/docs/master/rados/operations/placement-groups/
[17:36] <debian112> it is very weird, all the OSDs on that one server will not start
[17:37] <m0zes> debian112: is it running through an hba or raid controller?
[17:37] <debian112> hba
[17:37] * karnan (~karnan@121.244.87.124) has joined #ceph
[17:37] <m0zes> it is still in the startup process of the osd. it seems to be taking forever to list and start the pgs on the osd.
[17:37] <m0zes> perhaps some form of hardware failure?
[17:38] <debian112> m0zes: I hope not, pretty new servers
[17:39] <m0zes> I just had a raid controller (in jbod mode) fail in a box that was about a year old... a few disks would routinely drop out, or slow to a crawl...
[17:41] * askb_ (~askb@117.208.167.42) Quit (Quit: Leaving)
[17:41] <debian112> let me investigate more and report back
[17:42] <Jeeves_> debian112: what does iostat say?
[17:42] <debian112> http://paste.debian.net/432581/
[17:42] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:43] * yehudasa (~yehuda@cpe-104-172-237-141.socal.res.rr.com) has joined #ceph
[17:43] <Jeeves_> debian112: seems to be doing something?
[17:43] * bvi (~bastiaan@185.56.32.1) Quit (Quit: Ex-Chat)
[17:43] <Jeeves_> Not an awful lot, but still
[17:44] * itamarl (~itamarl@194.90.7.244) Quit (Quit: itamarl)
[17:44] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[17:44] * itamarl (~itamarl@194.90.7.244) Quit ()
[17:44] * joshd1 (~jdurgin@71-92-201-212.dhcp.gldl.ca.charter.com) has joined #ceph
[17:44] * linuxkidd (~linuxkidd@174.sub-70-195-201.myvzw.com) Quit (Remote host closed the connection)
[17:45] <debian112> there are no, active ceph processes
[17:45] * linuxkidd (~linuxkidd@174.sub-70-195-201.myvzw.com) has joined #ceph
[17:45] <hoonetorg> etienneme: ok rbd has pg_num/pgp_num: 64, all other pools 256
[17:46] <hoonetorg> how is this connected to the unequal distribution of data on the osd?
[17:46] <etienneme> Use ceph.com/pgcalc/ to get the value needed
[17:46] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:46] <etienneme> Maybe, maybe not :) it depends on the amount of data you have in your pools
[17:46] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[17:47] <etienneme> If you have the right values, check the weight of your OSDs
[17:47] * thomnico (~thomnico@37.162.219.177) has joined #ceph
[17:47] * Kayla (~tZ@06SAAA7R5.tor-irc.dnsbl.oftc.net) Quit ()
[17:47] * Lattyware (~Nephyrin@tor-relay.bahiadelsol.io) has joined #ceph
[17:47] <debian112> lspci
[17:47] <debian112> hangs
[17:47] <debian112> could be hardware
[17:47] <m0zes> if lspci hangs, you've got hardware problems.
[17:47] <evelu> the system hangs ?
[17:47] <evelu> or lspci is stopped
[17:48] <etienneme> You could also use reweight-by-utilization, but first check your pg/pgp values, osd weights
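[Editor's note: a hedged sketch of the reweight-by-utilization step mentioned above; 120 is the conventional default threshold (reweight OSDs more than 20% above average) and is an assumption here.]
    ceph osd reweight-by-utilization 120
    # newer releases also ship a dry-run variant worth trying first:
    # ceph osd test-reweight-by-utilization 120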
[17:48] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[17:48] <evelu> if you have pci troubles, the process will trigger MCEs
[17:48] <evelu> if you don't have MCEs it's very unlikely you have hw issues
[17:48] <m0zes> hoonetorg: data is sent to placement groups, placement groups get assigned to osds. if an osd has more placement groups than another osd, that can lead to imbalance.
[17:49] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[17:49] <m0zes> the flip side is that more placement groups mean more resources needed for the osd. memory goes up, threads increase. and it can take longer to peer due to context switching.
[17:50] <debian112> evelu: no MCEs
[17:51] <hoonetorg> m0zes: thx
[17:52] * thansen (~thansen@162.219.43.108) Quit (Quit: Ex-Chat)
[17:52] <hoonetorg> i don't get the thing with one osd having more pgs than another osd. how can this happen with 6 equal servers, each with 4x600gb sas 10k?
[17:53] <hoonetorg> when will it happen that an osd gets more pgs than another?
[17:55] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[17:55] <etienneme> How many OSD do you have?
[17:55] <hoonetorg> 24
[17:55] <m0zes> the crush algorithm is much better at being fast and consistent than being fair and balanced.
[17:55] * yguang11 (~yguang11@2001:4998:effd:600:716d:a75:65c2:b481) has joined #ceph
[17:56] <hoonetorg> can i delete unused pools
[17:56] <hoonetorg> can i delete the default rbd pool that gets created?
[17:56] <m0zes> you can delete any pools that you don't need.
[17:57] <hoonetorg> and then raise the pg/pgp num of the only pool i use (which is named "one" for opennebula)
[17:57] <hoonetorg> ?
[17:57] <etienneme> Sure :)
[17:57] <hoonetorg> then i'll do
[17:57] <etienneme> If you only use 1 pool on 24 OSD, 256 is low.
[17:57] <etienneme> Check how many you should have
[17:57] <etienneme> But don't do it in one step
[17:58] <etienneme> For example, if you should have 1024, do 256->512->768->1024
[17:58] * evelu (~erwan@46.231.131.178) Quit (Ping timeout: 480 seconds)
[17:58] <etienneme> You will get a small amount of data misplaced, and you can't decrease PGs
[17:58] <etienneme> So if your servers can't handle the load of 1024 pg you could have huge troubles :)
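[Editor's note: a sketch of the procedure discussed above, deleting an unused default pool and stepping pg_num up gradually; the pool names and the 512 target follow the conversation and are assumptions.]
    # deleting a pool is irreversible; the name must be typed twice
    ceph osd pool delete rbd rbd --yes-i-really-really-mean-it
    # raise placement groups in steps, letting the cluster settle between each
    ceph osd pool set one pg_num 512
    ceph osd pool set one pgp_num 512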
[17:58] <sugoruyo> anyone have any idea why [migration/N] threads periodically eat all my CPU on OSD hosts?
[17:59] <m0zes> sugoruyo: are you running numad?
[17:59] <sugoruyo> m0zes: I don't know what that is, so not that I'm aware of
[18:00] <sugoruyo> numactl numademo numastat do exist
[18:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Remote host closed the connection)
[18:01] * karnan (~karnan@121.244.87.124) Quit (Quit: Leaving)
[18:01] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[18:01] <T1> numad is baad
[18:02] <m0zes> numad will try to migrate processes to the cores closest to their allocated memory. I was assuming that was what was triggering migration/N threads.
[18:02] <T1> while it redistributes processes they become slow
[18:02] <sugoruyo> m0zes: yea I looked it up, we don't seem to be running it (we're on SL6)
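[Editor's note: a quick, hedged way to confirm whether numad is present on an EL6-family host such as SL6; these are standard RPM/SysV commands, not something from the channel.]
    rpm -q numad             # is the package installed at all?
    service numad status     # is the daemon currently running?
    chkconfig --list numad   # is it enabled at boot?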
[18:02] * wwdillingham_ (~LobsterRo@65.112.8.201) has joined #ceph
[18:03] <hoonetorg> etienneme: thx
[18:03] <hoonetorg> will try 512 1st
[18:03] <hoonetorg> will i see rebalancing?
[18:03] <etienneme> Read the documentation I linked first, to understand what will happen :)
[18:04] <etienneme> Yes and it will impact the performances
[18:04] <hoonetorg> k
[18:04] <hoonetorg> i know
[18:04] <hoonetorg> i already tuned some params to slow balance/recover stuff
[18:05] <hoonetorg> works fine for me
[18:05] * joshd1 (~jdurgin@71-92-201-212.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[18:05] <etienneme> ok :)
[18:06] * yguang11 (~yguang11@2001:4998:effd:600:716d:a75:65c2:b481) Quit ()
[18:06] * dugravot6 (~dugravot6@dn-infra-04.lionnois.site.univ-lorraine.fr) Quit (Quit: Leaving.)
[18:07] * wwdillingham (~LobsterRo@140.247.242.44) Quit (Ping timeout: 480 seconds)
[18:07] * wwdillingham_ is now known as wwdillingham
[18:07] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[18:07] * thomnico (~thomnico@37.162.219.177) Quit (Ping timeout: 480 seconds)
[18:08] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[18:08] * RayTracer (~RayTracer@153.19.7.39) Quit (Remote host closed the connection)
[18:08] * shaunm (~shaunm@72.49.2.237) Quit (Ping timeout: 480 seconds)
[18:09] <sugoruyo> any other ideas on the migration threads? the weird thing is I have two types of machines and this only happens on one
[18:10] * RameshN (~rnachimu@121.244.87.117) Quit (Ping timeout: 480 seconds)
[18:10] * Vacuum__ (~Vacuum@i59F79EB4.versanet.de) has joined #ceph
[18:12] * kefu is now known as kefu|afk
[18:13] * dyasny (~dyasny@46-117-8-108.bb.netvision.net.il) Quit (Ping timeout: 480 seconds)
[18:14] <hoonetorg> etienneme: m0zes: thx very much, did not know that pg_num/pgp_num influences data distribution that much
[18:14] <etienneme> np :)
[18:14] <hoonetorg> but now it's more clear to me what happens underneath
[18:15] * Vacuum_ (~Vacuum@88.130.211.174) Quit (Ping timeout: 480 seconds)
[18:16] * thomnico (~thomnico@62.119.166.9) has joined #ceph
[18:17] * Lattyware (~Nephyrin@06SAAA7TB.tor-irc.dnsbl.oftc.net) Quit ()
[18:17] * Quatroking1 (~KapiteinK@94-245-57-237.customer.t3.se) has joined #ceph
[18:18] * xcezzz (~xcezzz@97-96-111-106.res.bhn.net) has joined #ceph
[18:18] <xcezzz> is a rolling upgrade from 0.87 to 0.94 as easy as it is made to seem through docs/guides? should i expect data migration if i'm not changing any tunables?
[18:19] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Remote host closed the connection)
[18:24] <sugoruyo> xcezzz: my experience is thus: I did one of my mons and ran mixed mon quorum for a day, no problems there. Then did the other mons and ran 0.94 mons and 0.87 OSDs and RGWs, no problems there either. Then did half the OSDs, again, no problems. Then did the rest and finally the RGWs
[18:25] <sugoruyo> I didn't see data move around until I changed the tunables and switched the buckets to straw2. I think straw2 was the biggest cause of data moving around.
[18:26] <sugoruyo> caveats, we're not in prod yet so low fill rate and low load, YMMV
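[Editor's note: a sketch of inspecting CRUSH tunables around such an upgrade; switching the profile (and separately converting buckets to straw2) is what triggers the data movement sugoruyo describes, so expect rebalancing.]
    ceph osd crush show-tunables   # dump the current tunables profile
    # adopting the hammer profile enables straw2 support; data will move:
    ceph osd crush tunables hammer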
[18:27] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[18:32] * kefu (~kefu@114.92.120.83) has joined #ceph
[18:33] * dgurtner_ (~dgurtner@82.199.64.68) Quit (Ping timeout: 480 seconds)
[18:35] * IvanJobs (~hardes@103.50.11.146) has joined #ceph
[18:35] * kefu (~kefu@114.92.120.83) Quit ()
[18:36] * kefu|afk (~kefu@183.193.38.176) Quit (Read error: No route to host)
[18:37] * kefu (~kefu@183.193.38.176) has joined #ceph
[18:37] * huangjun (~kvirc@117.152.69.112) Quit (Ping timeout: 480 seconds)
[18:38] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[18:41] * pam (~pam@host98-91-dynamic.50-82-r.retail.telecomitalia.it) has joined #ceph
[18:41] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:43] * IvanJobs (~hardes@103.50.11.146) Quit (Ping timeout: 480 seconds)
[18:45] * kawa2014 (~kawa@83.111.58.108) Quit (Quit: Leaving)
[18:47] * rraja (~rraja@121.244.87.117) Quit (Quit: Leaving)
[18:47] * Quatroking1 (~KapiteinK@6AGAAAURJ.tor-irc.dnsbl.oftc.net) Quit ()
[18:47] * Skaag (~lunix@65.200.54.234) has joined #ceph
[18:48] <jnq> we seem to get deep scrubbing happening everywhere at midnight every night, causing quite a big IO spike and some lagging on the rbd requests. is there any way to see/change when this is scheduled and to space it out a bit? anyone seen this before?
[18:49] <m0zes> what release of ceph?
[18:50] <jnq> currently 0.94.2 but we're due to upgrade to $latest soon
[18:50] <m0zes> http://tracker.ceph.com/issues/13409
[18:50] <m0zes> latest should help
[18:50] <jnq> hoorah!
[18:50] <jnq> will do that asap then, thanks for that
[18:50] <jnq> my endless googling didn't get me anywhere
[18:52] <m0zes> I know, before that got fixed, some people would cycle the noscrub flag for a while to spread out the scrub start times.
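[Editor's note: a sketch of the noscrub-cycling workaround m0zes mentions; the one-hour interval is arbitrary.]
    # toggling the flags staggers when PGs next become eligible to scrub
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    sleep 3600                 # arbitrary spacing interval
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub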
[18:52] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[18:53] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[18:53] <jnq> ahh ok
[18:54] <jnq> it's starting to hurt us quite badly with rbd lags
[18:56] <georgem> jnq: take a look at this as well http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-June/002039.html
[18:56] <jnq> thanks georgem!
[18:58] * laevar (~jschulz1@134.76.80.11) Quit (Quit: WeeChat 1.4)
[18:58] * mykola (~Mikolaj@91.245.72.200) has joined #ceph
[19:00] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[19:01] * derjohn_mobi (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[19:02] * Geph (~Geoffrey@41.77.153.99) Quit (Ping timeout: 480 seconds)
[19:05] * thomnico (~thomnico@62.119.166.9) Quit (Ping timeout: 480 seconds)
[19:05] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[19:05] * dgurtner (~dgurtner@217.149.140.193) has joined #ceph
[19:07] * BrianA1 (~BrianA@fw-rw.shutterfly.com) has joined #ceph
[19:07] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys!)
[19:09] * davidz1 (~davidz@2605:e000:1313:8003:8d16:5723:2897:b2f4) has joined #ceph
[19:09] * davidz (~davidz@2605:e000:1313:8003:1d34:bd8d:a1bf:be6a) Quit (Read error: Connection reset by peer)
[19:12] * dgurtner_ (~dgurtner@82.199.64.68) has joined #ceph
[19:14] * dgurtner (~dgurtner@217.149.140.193) Quit (Ping timeout: 480 seconds)
[19:14] * kefu is now known as kefu|afk
[19:14] * vikhyat (~vumrao@123.252.242.114) Quit (Quit: Leaving)
[19:15] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[19:17] * jacoo (~Gecko1986@tor-exit7-readme.dfri.se) has joined #ceph
[19:17] * andreww is now known as xarses
[19:18] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[19:18] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[19:19] * kefu (~kefu@114.92.120.83) has joined #ceph
[19:19] * krypto (~krypto@G68-90-105-22.sbcis.sbc.com) has joined #ceph
[19:19] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[19:20] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[19:21] * dgurtner_ (~dgurtner@82.199.64.68) Quit (Ping timeout: 480 seconds)
[19:21] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[19:23] * pam (~pam@host98-91-dynamic.50-82-r.retail.telecomitalia.it) Quit (Max SendQ exceeded)
[19:25] * b_rake (~b_rake@69-195-66-67.unifiedlayer.com) has joined #ceph
[19:26] * b_rake is now known as BlakeA
[19:26] * kefu|afk (~kefu@183.193.38.176) Quit (Ping timeout: 480 seconds)
[19:27] <lae> is a petabyte RBD a bad idea if I'm having performance issues with cephfs?
[19:32] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[19:33] * Bartek (~Bartek@78.10.129.82) Quit (Ping timeout: 480 seconds)
[19:37] * olc- (~olecam@93.184.35.82) Quit (Ping timeout: 480 seconds)
[19:37] * derjohn_mobi (~aj@88.128.81.42) has joined #ceph
[19:40] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[19:40] * olc- (~olecam@93.184.35.82) has joined #ceph
[19:42] * overclk (~quassel@117.202.96.59) Quit (Remote host closed the connection)
[19:42] * Bartek (~Bartek@78.10.129.82) has joined #ceph
[19:43] * Rickus_ (~Rickus@office.protected.ca) Quit (Quit: Leaving)
[19:46] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) Quit (Quit: Leaving)
[19:47] * jacoo (~Gecko1986@4MJAAD7TM.tor-irc.dnsbl.oftc.net) Quit ()
[19:47] * curtis864 (~neobenedi@06SAAA7XJ.tor-irc.dnsbl.oftc.net) has joined #ceph
[19:47] * swami1 (~swami@27.7.161.226) has joined #ceph
[19:48] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Ping timeout: 480 seconds)
[19:48] * cathode (~cathode@50.232.215.114) has joined #ceph
[19:49] <BlakeA> Hi guys. We've been seeing an issue in our cluster where we haven't been able to get 1 pg to go clean. This is causing a large amount of slow requests and blocked ops at certain intervals. We've gone through the network infrastructure and are unable to locate any issues. We have 18 nodes, bonded 40G nics on each, replication level set to 3, we do have a caching tier, running 0.94.5 on cent7 4.2.1-1.el7.elrepo.x86_64. Any thoughts on how
[19:49] <BlakeA> to get this unstuck?
[19:49] <BlakeA> pg 17.ca is stuck unclean for 403363.872691, current state active+recovering+degraded+remapped,
[19:50] <BlakeA> We have gone through the standard troubleshooting pg and osd articles along with the mailing lists
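[Editor's note: a first diagnostic pass for a PG stuck like this, using pg 17.ca from BlakeA's report; standard commands, offered as a sketch rather than channel advice.]
    ceph pg 17.ca query          # full peering/recovery state of the PG
    ceph pg dump_stuck unclean   # list every PG stuck unclean
    ceph health detail           # which OSDs the slow requests point at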
[19:51] <xcezzz> jnq: check out CERN's ceph scripts... lots of good utilities for doing your own scrub cycles, gentle split of PGs... all sorts of other useful goodies too https://github.com/cernceph
[19:57] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[20:01] * daiver (~daiver@95.85.8.93) has joined #ceph
[20:01] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[20:03] <daiver> hi all
[20:04] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[20:05] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[20:08] <daiver> facing a problem adding new OSDs to a CentOS 7 box with ceph 9.2.1
[20:08] <daiver> here's the output
[20:08] <daiver> http://pastebin.com/WPSHn7GP
[20:08] * Hemanth (~hkumar_@103.228.221.145) has joined #ceph
[20:09] <daiver> ceph-deploy creates the disk structure, but doesn't add it to the auth list
[20:09] <daiver> doesn't mount it
[20:09] <daiver> in that output - new OSD - osd.21.
[20:09] <daiver> osd.11 I added to crush map manually, together with host
[20:10] <daiver> but other hosts - node01, node03, node04 - ceph-deploy added everything smoothly
[20:10] <daiver> any ideas? thanks
[20:10] * kefu is now known as kefu|afk
[20:10] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[20:12] * olc- (~olecam@93.184.35.82) Quit (Ping timeout: 480 seconds)
[20:12] <xcezzz> looks like they already existed before, right?
[20:12] <daiver> tried to re-install everything from scratch. doesn't help. the other 3 hosts with exactly the same conf worked fine. some magic I don't get
[20:12] <daiver> yes...
[20:13] <daiver> 11 for sure, but if I add more new OSDs with ceph-deploy so they get osd.33..34..35 - same behavior
[20:14] <daiver> do you think I need to clean-up something?
[20:14] <xcezzz> i had stuff like that happen before when you have dangling auth entries, or you have flags like noout/noin/noX set
[20:14] <daiver> I didn't set such flags
[20:14] * kefu|afk (~kefu@114.92.120.83) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[20:15] <xcezzz> since they're still down... remove them completely again... ssh to the node they are on... stop the ceph osd services
[20:16] <daiver> yeah, tried that. removed them from crush map
[20:16] <xcezzz> ceph osd crush remove osd.X; ceph auth del osd.X; ceph osd rm X;
[20:16] <daiver> service doesn't even start
[20:16] <daiver> as it doesn't have a mount
[20:16] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[20:16] <xcezzz> ceph-deploy osd create --zap-disk node:sdX
[20:16] <daiver> yes, I tried that. auth del fails as it didn't add the osd to the auth map
[20:17] <daiver> okay, let me try again, I'll show the output
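[Editor's note: the removal-and-recreate sequence xcezzz outlines, assembled in order; osd.21 and node02:sdb are placeholders for daiver's actual OSD id and device.]
    # on the OSD node: make sure the daemon is stopped
    systemctl stop ceph-osd@21     # CentOS 7 / systemd; upstart hosts use: stop ceph-osd id=21
    # from an admin node: remove every trace of the OSD
    ceph osd crush remove osd.21
    ceph auth del osd.21
    ceph osd rm 21
    # then recreate it, zapping the disk
    ceph-deploy osd create --zap-disk node02:sdb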
[20:17] <xcezzz> ok... also, do you have any errors showing in /var/log/ceph on the node? or in syslog
[20:17] * curtis864 (~neobenedi@06SAAA7XJ.tor-irc.dnsbl.oftc.net) Quit ()
[20:18] * derjohn_mobi (~aj@88.128.81.42) Quit (Ping timeout: 480 seconds)
[20:18] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[20:18] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[20:21] * olc- (~olecam@93.184.35.82) has joined #ceph
[20:21] * WedTM (~ain@06SAAA7YB.tor-irc.dnsbl.oftc.net) has joined #ceph
[20:22] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[20:22] <daiver> same problem
[20:22] <daiver> here's the new try:
[20:22] <daiver> http://pastebin.com/H20vG2sK
[20:23] * krypto (~krypto@G68-90-105-22.sbcis.sbc.com) Quit (Read error: Connection reset by peer)
[20:25] <xcezzz> did you copy over the keyring?
[20:25] * rendar (~I@host248-181-dynamic.19-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[20:25] * swami1 (~swami@27.7.161.226) Quit (Quit: Leaving.)
[20:27] * rendar (~I@host248-181-dynamic.19-79-r.retail.telecomitalia.it) has joined #ceph
[20:28] <xcezzz> i guess im not sure... i think my saltstack copies my ceph.conf and admin keyring to all osd nodes... so i'm not in the same boat as you... plus we're ubuntu... i would think something didn't get installed right... if you do ceph-deploy purgedata node does it even do that?
[20:29] <xcezzz> double check that
[20:30] <xcezzz> if no osds are getting created on that host i'd be thinking something didn't install right... it's not able to generate the keyring on the node side... very odd
[20:30] <BlakeA> Does anyone have any thoughts on bringing this stuck pg back to healthy?
[20:31] <xcezzz> blakea: what status of "stuck"?
[20:31] <BlakeA> unclean
[20:31] <xcezzz> have you tried to force repair or scrub on it?
[20:32] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Quit: Leaving)
[20:32] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[20:33] <georgem> BlakeA: did you decrease the weight of some OSDs?
[20:33] <BlakeA> What operations would the pg repair perform? I was under the impression that it was going to mostly resolve inconsistency issues which we have seen reports of
[20:34] <xcezzz> ceph pg dump | grep unc
[20:34] <BlakeA> Yes, we have changed the weight of a couple in an attempt to get it to rebalance to other osds
[20:34] <xcezzz> ceph pg repair X.fff
[20:36] <BlakeA> no results when grepping for unc in the pg dump output
[20:36] <xcezzz> whats `ceph health detail` say...
[20:36] * jordanP (~jordan@bdv75-2-81-57-250-57.fbx.proxad.net) has joined #ceph
[20:38] * aboyle (~aboyle__@ardent.csail.mit.edu) Quit (Remote host closed the connection)
[20:38] * daiver (~daiver@95.85.8.93) Quit (Remote host closed the connection)
[20:40] <BlakeA> one sec
[20:41] <BlakeA> http://pastebin.com/q4DQS1DW
[20:41] * getzburg (sid24913@id-24913.ealing.irccloud.com) Quit (Remote host closed the connection)
[20:41] * devicenull (sid4013@id-4013.ealing.irccloud.com) Quit (Remote host closed the connection)
[20:41] * ElNounch (sid150478@id-150478.ealing.irccloud.com) Quit (Remote host closed the connection)
[20:41] * JohnPreston78 (sid31393@id-31393.ealing.irccloud.com) Quit (Remote host closed the connection)
[20:41] * scalability-junk (sid6422@id-6422.ealing.irccloud.com) Quit (Remote host closed the connection)
[20:41] * Pintomatic (sid25118@id-25118.ealing.irccloud.com) Quit (Remote host closed the connection)
[20:43] * MannerMan (~oscar@user170.217-10-117.netatonce.net) Quit (Ping timeout: 480 seconds)
[20:43] * devicenull (sid4013@id-4013.ealing.irccloud.com) has joined #ceph
[20:44] <BlakeA> what is the pg repair actually going to do? Is it just going to look for inconsistencies between objects and 'resync' of sorts?
[20:44] <xcezzz> ya... but that doesn't seem to be your problem
[20:44] <xcezzz> i was thinking incomplete...
[20:45] <xcezzz> it's a bit weird... you're stuck on active+degraded+remapped+backfilling... have you tried to restart the osd daemons of the OSDs listed there?
[20:45] * scalability-junk (sid6422@id-6422.ealing.irccloud.com) has joined #ceph
[20:46] <xcezzz> or see if /var/log/ceph/ceph-osd.65.log has any weirdness showing up?
[20:46] <xcezzz> plus how long have you been noscrubbing?
[20:46] <xcezzz> ceph pg scrub 17.ca
[20:47] <xcezzz> if you're worried about scrubs using up your IOPS and degrading performance if you haven't done them in a while... check out CERN's scripts for doing gentle manual scrubs...
[20:47] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[20:47] <xcezzz> https://github.com/cernceph/ceph-scripts/tree/master/tools/scrubbing
[20:49] <xcezzz> the only time i see stuck/unclean PGs like that is because of an osd failure on a size 2 pool... but then lowering min_size to 1 and restarting the OSDs cleans it up... maybe not specific to your issue but restarting OSDs seems to kick things into gear sometimes
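[Editor's note: the min_size workaround xcezzz describes, as a hedged sketch; lowering min_size reduces the write-safety floor, so restore it once the PGs go clean. "rbd" is a placeholder pool name.]
    ceph osd pool set rbd min_size 1   # temporarily accept I/O with a single replica
    # restart the OSDs hosting the stuck PG, wait for recovery, then restore:
    ceph osd pool set rbd min_size 2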
[20:49] <BlakeA> we have restarted the osds connected to the pg as well as reweighting to try to rebalance to other osds. Nothing in those osd logs is particularly standing out, even with turning up the debugging
[20:50] <xcezzz> odd... but im curious how long it has been since they have been scrubbed
[20:51] <BlakeA> It's been quite a while, taking a look at those scripts
[20:51] * The1_ (~the_one@87.104.212.66) has joined #ceph
[20:51] <xcezzz> http://pastebin.com/2zyRUyUz
[20:51] * WedTM (~ain@06SAAA7YB.tor-irc.dnsbl.oftc.net) Quit ()
[20:52] <xcezzz> as an idea of the info it provides...
[20:52] * khyron (~khyron@187.188.11.61) has joined #ceph
[20:52] * shylesh (~shylesh@59.95.71.203) has joined #ceph
[20:53] * getzburg (sid24913@id-24913.ealing.irccloud.com) has joined #ceph
[20:53] * Pintomatic (sid25118@id-25118.ealing.irccloud.com) has joined #ceph
[20:53] <BlakeA> oh awesome, let me take a look
[20:53] <xcezzz> but ya... their gentle pg splitting, the gentle scrub, and other utilities in there are a lifesaver
[20:54] <xcezzz> and they're doing huge PBs of data lol
[20:56] * JohnPreston78 (sid31393@id-31393.ealing.irccloud.com) has joined #ceph
[20:58] <BlakeA> lol, yeah. I'll run this by the team and let you guys know if we have any issues. Thanks much!
[20:58] * T1 (~the_one@87.104.212.66) Quit (Ping timeout: 480 seconds)
[21:02] * georgem1 (~Adium@206.108.127.16) has joined #ceph
[21:02] * pabluk_ is now known as pabluk__
[21:02] * georgem2 (~Adium@206.108.127.16) has joined #ceph
[21:02] * georgem1 (~Adium@206.108.127.16) Quit (Read error: Connection reset by peer)
[21:03] * georgem2 (~Adium@206.108.127.16) Quit ()
[21:04] * Hemanth (~hkumar_@103.228.221.145) Quit (Ping timeout: 480 seconds)
[21:04] * georgem1 (~Adium@206.108.127.16) has joined #ceph
[21:09] * georgem (~Adium@206.108.127.16) Quit (Ping timeout: 480 seconds)
[21:16] * jordanP (~jordan@bdv75-2-81-57-250-57.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[21:17] * ElNounch (sid150478@id-150478.ealing.irccloud.com) has joined #ceph
[21:17] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) Quit (Quit: Leaving.)
[21:24] * natarej_ (~natarej@CPE-101-181-149-113.lnse5.cha.bigpond.net.au) has joined #ceph
[21:26] * Rickus (~Rickus@office.protected.ca) has joined #ceph
[21:29] * natarej (~natarej@CPE-101-181-149-113.lnse5.cha.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[21:31] * mykola (~Mikolaj@91.245.72.200) Quit (Quit: away)
[21:35] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[21:36] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[21:37] <wwdillingham> I am taking rbd snapshots of child devices (which are backing running vms). when this process occurs, the filesystem within that running VM (the source of the snapshot) goes read only. I was under the impression that this would not impact the child device (though the snapshot's filesystem may be in an inconsistent state). Do I need to flatten these rbd children before snapshotting them, or am I entirely mistaken that this should work?
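[Editor's note: a common pattern for taking a crash-consistent snapshot of an in-use clone, sketched under the assumption that the guest filesystem can be frozen; pool, image, and mountpoint names are placeholders. Flattening is only needed to detach a clone from its parent, not for snapshotting.]
    # inside the VM, quiesce the filesystem first (optional but safer)
    fsfreeze -f /mnt/data
    # on a client with access to the pool:
    rbd snap create pool/child-image@backup-20160413
    # unfreeze the guest filesystem
    fsfreeze -u /mnt/data
    # flattening copies the parent's data into the clone (slow, uses space):
    rbd flatten pool/child-image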
[21:42] * debian112 (~bcolbert@24.126.201.64) Quit (Ping timeout: 480 seconds)
[21:49] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) has joined #ceph
[21:51] * Hazmat (~ylmson@torexit.headstrong.de) has joined #ceph
[21:54] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) Quit (Quit: Leaving.)
[21:56] * shylesh (~shylesh@59.95.71.203) Quit (Remote host closed the connection)
[21:57] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) Quit (Quit: Leaving.)
[21:57] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[21:59] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) has joined #ceph
[22:00] * shohn (~shohn@dslb-178-008-198-190.178.008.pools.vodafone-ip.de) Quit ()
[22:07] * elmo (~james@faun.canonical.com) has joined #ceph
[22:11] * davidz (~davidz@2605:e000:1313:8003:8d16:5723:2897:b2f4) has joined #ceph
[22:11] * davidz1 (~davidz@2605:e000:1313:8003:8d16:5723:2897:b2f4) Quit (Read error: Connection reset by peer)
[22:16] * thansen (~thansen@17.253.sfcn.org) has joined #ceph
[22:17] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[22:20] * haomaiwang (~haomaiwan@li401-170.members.linode.com) has joined #ceph
[22:21] * Hazmat (~ylmson@76GAAEHGR.tor-irc.dnsbl.oftc.net) Quit ()
[22:21] * Yopi (~straterra@91.219.236.222) has joined #ceph
[22:25] * derjohn_mobi (~aj@x4db29a50.dyn.telefonica.de) has joined #ceph
[22:28] * haomaiwang (~haomaiwan@li401-170.members.linode.com) Quit (Ping timeout: 480 seconds)
[22:28] * Bartek (~Bartek@78.10.129.82) Quit (Ping timeout: 480 seconds)
[22:29] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[22:34] * rendar (~I@host248-181-dynamic.19-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[22:37] * rendar (~I@host248-181-dynamic.19-79-r.retail.telecomitalia.it) has joined #ceph
[22:39] * post-factum (~post-fact@vulcan.natalenko.name) Quit (Quit: leaving)
[22:39] * post-factum (~post-fact@vulcan.natalenko.name) has joined #ceph
[22:40] * georgem1 (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:43] * post-factum (~post-fact@vulcan.natalenko.name) Quit ()
[22:44] * post-factum (~post-fact@vulcan.natalenko.name) has joined #ceph
[22:45] * georgem (~Adium@206.108.127.16) has joined #ceph
[22:47] * \ask (~ask@oz.develooper.com) has joined #ceph
[22:48] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[22:51] * pabluk__ is now known as pabluk_
[22:51] * Yopi (~straterra@6AGAAAU22.tor-irc.dnsbl.oftc.net) Quit ()
[22:52] * starcoder (~Yopi@5.135.65.145) has joined #ceph
[22:53] * linuxkidd (~linuxkidd@174.sub-70-195-201.myvzw.com) Quit (Remote host closed the connection)
[22:54] * linuxkidd (~linuxkidd@174.sub-70-195-201.myvzw.com) has joined #ceph
[23:00] * wwdillingham (~LobsterRo@65.112.8.201) Quit (Quit: wwdillingham)
[23:00] * cathode (~cathode@50.232.215.114) Quit (Quit: Leaving)
[23:00] * LDA (~lda@host217-114-156-249.pppoe.mark-itt.net) Quit (Quit: LDA)
[23:04] * angdraug (~angdraug@64.124.158.100) has joined #ceph
[23:07] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[23:08] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[23:09] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) has joined #ceph
[23:11] <debian112> I have some OSDs that will not start. any ideas:
[23:11] <debian112> http://paste.debian.net/432629/
[23:12] <debian112> journaling is on ssd here
[23:13] * Bartek (~Bartek@78.10.129.82) has joined #ceph
[23:15] * ircolle (~Adium@c-71-229-136-109.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[23:15] * davidz (~davidz@2605:e000:1313:8003:8d16:5723:2897:b2f4) Quit (Read error: Connection reset by peer)
[23:15] * davidz (~davidz@2605:e000:1313:8003:8d16:5723:2897:b2f4) has joined #ceph
[23:18] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[23:19] * _28_ria (~kvirc@opfr028.ru) Quit (Read error: Connection reset by peer)
[23:21] * starcoder (~Yopi@6AGAAAU4E.tor-irc.dnsbl.oftc.net) Quit ()
[23:21] * Kakeru (~Izanagi@06SAAA721.tor-irc.dnsbl.oftc.net) has joined #ceph
[23:21] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[23:22] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Quit: Konversation terminated!)
[23:23] * RayTracer (~RayTracer@host-81-190-123-128.gdynia.mm.pl) Quit (Remote host closed the connection)
[23:24] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)
[23:30] * _28_ria (~kvirc@opfr028.ru) has joined #ceph
[23:30] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[23:31] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Remote host closed the connection)
[23:34] * dgurtner (~dgurtner@217.149.140.193) has joined #ceph
[23:51] * Kakeru (~Izanagi@06SAAA721.tor-irc.dnsbl.oftc.net) Quit ()
[23:52] * RaidSoft (~Sketchfil@176.10.99.201) has joined #ceph
[23:52] * cathode (~cathode@50.232.215.114) has joined #ceph
[23:53] <khyron> hi cephers....
[23:53] <khyron> quick question, does anyone know how to find the provisioned space in cephfs?
[23:55] * pabluk_ is now known as pabluk__
[23:58] <khyron> df shows 79T used... but du shows 31T... so my problem is the provisioned space, because /var/lib/nova/instances is living on cephfs... and has lots of qcow2 disks... please don't ask me why :( (this configuration was before my time)
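[Editor's note: a sketch of separating allocated from provisioned size for sparse qcow2 images, one way to investigate a df/du gap like this; paths and image names are placeholders. Note also that df against cephfs typically reports cluster-wide raw usage, replication included, which can further inflate the number.]
    cd /var/lib/nova/instances
    du -sh .                        # blocks actually allocated on cephfs
    du -sh --apparent-size .        # provisioned (sparse) size of the files
    qemu-img info <uuid>/disk       # virtual size vs. disk size per qcow2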
[23:59] * xarses_ (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.