#ceph IRC Log

IRC Log for 2015-11-03

Timestamps are in GMT/BST.

[0:00] <markednmbr1> m0zes, what are your recommendations for target_max_bytes, cache_target_dirty_ratio and cache_target_full_ratio?
[0:02] <markednmbr1> should target_max_bytes be the maximum size of your cache?
[0:03] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) has joined #ceph
[0:04] * jclm (~jclm@ip-64-134-184-248.public.wayport.net) has joined #ceph
[0:04] <m0zes> I'd give it some buffer. about 75% personally.
[0:05] * linjan__ (~linjan@176.195.239.174) Quit (Ping timeout: 480 seconds)
[0:08] <markednmbr1> so with this cache tier, I guess if your SSD dies before its written to the sata you lose your data?
[0:08] <markednmbr1> actually no sorry, because the cache tier is replicated also of course
[0:11] * fridim_ (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Ping timeout: 480 seconds)
[0:17] <markednmbr1> m0zes, the target_max_bytes is for the whole pool right, not per osd?
[0:17] <markednmbr1> so if I have 4 x 94G ssd - half lost for replication
[0:18] <markednmbr1> 150G target_max_bytes would be sensible?
[0:18] <m0zes> correct.
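For reference, a minimal sketch of how those thresholds get applied, assuming a cache pool named hot-cache (the pool name and the two ratio values are placeholders, not from the conversation); the ratios are fractions of target_max_bytes:

    ceph osd pool set hot-cache target_max_bytes 161061273600    # ~150 GiB, per the sizing above
    ceph osd pool set hot-cache cache_target_dirty_ratio 0.4     # start flushing dirty objects at 40%
    ceph osd pool set hot-cache cache_target_full_ratio 0.8      # start evicting clean objects at 80%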
[0:22] * segutier (~segutier@sfo-vpn1.shawnlower.net) Quit (Ping timeout: 480 seconds)
[0:23] * shawniverson (~shawniver@208.38.236.111) has joined #ceph
[0:23] <markednmbr1> thanks for your help m0zes, i'm off now. cheers
[0:24] * markednmbr1 (~Diego@cpc1-lewi13-2-0-cust267.2-4.cable.virginm.net) Quit (Quit: Leaving)
[0:27] <mfa298_> cetex: reading scrollback, radosgw may not be too happy with 300-500m objects if they're all in one bucket (you'll find it slows down as you add more objects due to how radosgw does the meta data)
[0:28] <mfa298_> sharding the metadata and also putting the metadata pools onto faster drives (ssds) can help with that to some extent
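A minimal sketch of the index sharding mfa298_ describes, as it would appear in ceph.conf; the section name, shard count and ruleset id below are placeholders, and in releases of this era the shard count only applies to buckets created after it is set:

    [client.radosgw.gateway]
    rgw override bucket index max shards = 16

    # optionally move the index pool onto SSD-backed OSDs with a dedicated CRUSH ruleset:
    #   ceph osd pool set .rgw.buckets.index crush_ruleset 1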
[0:28] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Remote host closed the connection)
[0:29] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[0:32] * segutier (~segutier@sfo-vpn1.shawnlower.net) has joined #ceph
[0:33] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) Quit (Remote host closed the connection)
[0:35] * vata1 (~vata@207.96.182.162) Quit (Quit: Leaving.)
[0:41] * xarses (~xarses@118.103.8.153) Quit (Remote host closed the connection)
[0:41] * xarses (~xarses@118.103.8.153) has joined #ceph
[0:46] * sudocat (~dibarra@192.185.1.20) Quit (Quit: Leaving.)
[0:47] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[0:48] * stiopa (~stiopa@cpc73828-dals21-2-0-cust630.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[0:52] * bene2 (~bene@2601:18c:8300:f3ae:ea2a:eaff:fe08:3c7a) has joined #ceph
[0:52] * sileht (~sileht@sileht.net) Quit (Ping timeout: 480 seconds)
[0:56] * s3an2 (~root@korn.s3an.me.uk) has joined #ceph
[1:07] * rendar (~I@host114-183-dynamic.37-79-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[1:10] * moore (~moore@64.202.160.88) Quit (Remote host closed the connection)
[1:12] * davidzlap1 (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) has joined #ceph
[1:12] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) Quit (Read error: Connection reset by peer)
[1:14] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[1:18] * dyasny (~dyasny@198.251.57.99) Quit (Ping timeout: 480 seconds)
[1:21] * fsimonce (~simon@host30-173-dynamic.23-79-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[1:24] * ircolle (~Adium@2601:285:201:2bf9:8919:8fda:7dbc:888c) Quit (Quit: Leaving.)
[1:24] * xarses (~xarses@118.103.8.153) Quit (Ping timeout: 480 seconds)
[1:27] * angdraug (~angdraug@12.164.168.117) Quit (Quit: Leaving)
[1:32] * lcurtis_ (~lcurtis@47.19.105.250) Quit (Ping timeout: 480 seconds)
[1:32] * Venturi (Venturi@93-103-91-169.dynamic.t-2.net) Quit (Ping timeout: 480 seconds)
[1:32] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[1:32] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[1:34] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) has joined #ceph
[1:35] * qhartman (~qhartman@den.direwolfdigital.com) has joined #ceph
[1:35] * davidzlap1 (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) Quit (Read error: Connection reset by peer)
[1:42] * Snowman (~Guest1390@ns213861.ovh.net) has joined #ceph
[1:44] * davidzlap1 (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) has joined #ceph
[1:45] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) Quit (Read error: Connection reset by peer)
[1:51] * joshd (~jdurgin@206.169.83.146) Quit (Quit: Leaving.)
[2:02] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) has joined #ceph
[2:02] * davidzlap1 (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) Quit (Read error: Connection reset by peer)
[2:02] * kevinc (~kevinc__@client65-125.sdsc.edu) Quit (Quit: Leaving)
[2:03] * nihilifer (nihilifer@s6.mydevil.net) Quit (Read error: Connection reset by peer)
[2:06] * nihilifer (nihilifer@s6.mydevil.net) has joined #ceph
[2:13] * Snowman (~Guest1390@7V7AAAYN3.tor-irc.dnsbl.oftc.net) Quit ()
[2:13] * segutier (~segutier@sfo-vpn1.shawnlower.net) Quit (Ping timeout: 480 seconds)
[2:13] * dlan_ (~dennis@116.228.88.131) Quit (Ping timeout: 480 seconds)
[2:15] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Ping timeout: 480 seconds)
[2:17] * dack (~darrelle@gateway.ola.bc.ca) Quit (Quit: WeeChat 1.3)
[2:22] * dlan (~dennis@116.228.88.131) has joined #ceph
[2:26] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[2:26] * wushudoin (~wushudoin@2601:646:8201:7769:2ab2:bdff:fe0b:a6ee) Quit (Ping timeout: 480 seconds)
[2:29] * yanzheng (~zhyan@182.139.23.79) has joined #ceph
[2:29] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[2:38] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) Quit (Quit: Leaving.)
[2:38] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[2:43] * ibravo (~ibravo@72.83.69.64) Quit (Quit: Leaving)
[2:46] * segutier (~segutier@12.51.50.4) has joined #ceph
[2:47] * kefu (~kefu@114.92.106.70) has joined #ceph
[2:50] * kefu (~kefu@114.92.106.70) Quit (Max SendQ exceeded)
[2:51] * kefu (~kefu@114.92.106.70) has joined #ceph
[3:04] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[3:07] * sankarshan (~sankarsha@121.244.87.124) has joined #ceph
[3:09] * bliu (~liub@203.192.156.9) Quit (Ping timeout: 480 seconds)
[3:11] * bene2 (~bene@2601:18c:8300:f3ae:ea2a:eaff:fe08:3c7a) Quit (Quit: Konversation terminated!)
[3:14] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[3:28] * bliu (~liub@203.192.156.9) has joined #ceph
[3:29] * debian112 (~bcolbert@24.126.201.64) Quit (Quit: Leaving.)
[3:30] * nihilifer1 (nihilifer@s6.mydevil.net) has joined #ceph
[3:30] * zhaochao (~zhaochao@125.39.8.237) has joined #ceph
[3:30] * nihilifer (nihilifer@s6.mydevil.net) Quit (Read error: Connection reset by peer)
[3:35] * jpetrini (~jpetrini@pool-100-34-141-60.phlapa.fios.verizon.net) has joined #ceph
[3:37] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[3:39] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[3:40] * eth00 (~none@cpe-24-162-254-201.nc.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:45] * jpetrini (~jpetrini@pool-100-34-141-60.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[3:45] * stein (~stein@185.56.185.82) Quit (Ping timeout: 480 seconds)
[3:59] * mhack (~mhack@66-168-117-78.dhcp.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[4:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[4:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[4:09] * kefu (~kefu@114.92.106.70) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[4:12] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[4:17] * kefu (~kefu@114.92.106.70) has joined #ceph
[4:28] * stein (~stein@185.56.185.82) has joined #ceph
[4:28] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) Quit (Read error: Connection reset by peer)
[4:28] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) has joined #ceph
[4:32] * leseb_ (~leseb@81-64-223-102.rev.numericable.fr) Quit (Ping timeout: 480 seconds)
[4:33] * leseb_ (~leseb@81-64-223-102.rev.numericable.fr) has joined #ceph
[4:38] * overclk (~overclk@59.93.67.28) has joined #ceph
[4:56] * overclk_ (~overclk@59.93.67.28) has joined #ceph
[4:56] * davidzlap (~Adium@2605:e000:1313:8003:2128:c8c7:1de0:6989) Quit (Quit: Leaving.)
[4:59] * overclk_ (~overclk@59.93.67.28) Quit (Remote host closed the connection)
[5:00] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[5:00] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Remote host closed the connection)
[5:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[5:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[5:01] * kefu (~kefu@114.92.106.70) Quit (Max SendQ exceeded)
[5:03] * overclk (~overclk@59.93.67.28) Quit (Ping timeout: 480 seconds)
[5:06] * kefu (~kefu@114.92.106.70) has joined #ceph
[5:06] * Vacuum__ (~Vacuum@88.130.220.142) has joined #ceph
[5:11] * overclk (~overclk@59.93.67.28) has joined #ceph
[5:11] * overclk (~overclk@59.93.67.28) Quit (Remote host closed the connection)
[5:11] * overclk (~overclk@59.93.67.28) has joined #ceph
[5:13] * Vacuum_ (~Vacuum@i59F79F0F.versanet.de) Quit (Ping timeout: 480 seconds)
[5:17] * kefu is now known as kefu|afk
[5:17] * kefu|afk (~kefu@114.92.106.70) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[5:42] * amote (~amote@121.244.87.116) has joined #ceph
[5:44] * jwilkins (~jowilkin@67.204.149.211) Quit (Ping timeout: 480 seconds)
[5:44] * bearkitten (~bearkitte@cpe-76-172-86-115.socal.res.rr.com) Quit (Quit: WeeChat 1.3)
[5:45] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[5:45] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[5:45] * overclk (~overclk@59.93.67.28) Quit (Ping timeout: 480 seconds)
[5:46] * fretb (~fretb@n-3n.static-37-72-162.as30961.net) Quit (Remote host closed the connection)
[5:47] * wonko_be (bernard@november.openminds.be) Quit (Remote host closed the connection)
[5:47] * fretb (~fretb@n-3n.static-37-72-162.as30961.net) has joined #ceph
[5:47] * wonko_be (bernard@november.openminds.be) has joined #ceph
[5:47] * joshd (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[5:53] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) Quit (Ping timeout: 480 seconds)
[5:55] * bearkitten (~bearkitte@cpe-76-172-86-115.socal.res.rr.com) has joined #ceph
[5:55] * kefu (~kefu@114.92.106.70) has joined #ceph
[5:58] * overclk (~overclk@59.93.65.11) has joined #ceph
[6:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[6:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[6:03] * hoonetorg (~hoonetorg@77.119.226.254.static.drei.at) has joined #ceph
[6:03] * janos_ (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) Quit (Quit: Leaving)
[6:04] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[6:07] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Quit: Leaving.)
[6:11] * tiantian (~oftc-webi@116.228.88.99) has joined #ceph
[6:18] * espeer (~quassel@phobos.isoho.st) Quit (Read error: Connection reset by peer)
[6:18] * darkfaded (~floh@88.79.251.60) has joined #ceph
[6:18] * darkfader (~floh@88.79.251.60) Quit (Read error: Connection reset by peer)
[6:18] * joshd (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[6:20] * espeer (~quassel@phobos.isoho.st) has joined #ceph
[6:28] * overclk (~overclk@59.93.65.11) Quit (Remote host closed the connection)
[6:28] * overclk (~overclk@59.93.65.11) has joined #ceph
[6:33] * jclm (~jclm@ip-64-134-184-248.public.wayport.net) Quit (Ping timeout: 480 seconds)
[6:35] * rdas (~rdas@182.70.152.17) has joined #ceph
[6:39] * kefu (~kefu@114.92.106.70) Quit (Max SendQ exceeded)
[6:39] * kefu (~kefu@114.92.106.70) has joined #ceph
[6:40] * sileht (~sileht@sileht.net) has joined #ceph
[6:40] * davidzlap (~Adium@2605:e000:1313:8003:adad:f438:b5e5:dd40) has joined #ceph
[6:43] * overclk (~overclk@59.93.65.11) Quit (Ping timeout: 480 seconds)
[6:45] * overclk (~overclk@59.93.66.131) has joined #ceph
[6:45] * davidzlap (~Adium@2605:e000:1313:8003:adad:f438:b5e5:dd40) Quit ()
[6:46] * espeer (~quassel@phobos.isoho.st) Quit (Ping timeout: 480 seconds)
[6:47] * espeer (~quassel@phobos.isoho.st) has joined #ceph
[6:50] * Venturi (~Venturi@93-103-91-169.dynamic.t-2.net) has joined #ceph
[6:50] * joao (~joao@8.184.114.89.rev.vodafone.pt) Quit (Ping timeout: 480 seconds)
[6:51] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[6:51] * dlan (~dennis@116.228.88.131) Quit (Ping timeout: 480 seconds)
[6:52] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) has joined #ceph
[6:53] * toabctl (~toabctl@toabctl.de) Quit (Ping timeout: 480 seconds)
[6:54] * toabctl (~toabctl@toabctl.de) has joined #ceph
[6:55] * adun153 (~ljtirazon@112.198.90.112) has joined #ceph
[7:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[7:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[7:03] * nardial (~ls@dslb-178-009-201-039.178.009.pools.vodafone-ip.de) has joined #ceph
[7:07] * overclk (~overclk@59.93.66.131) Quit (Remote host closed the connection)
[7:08] * Venturi (~Venturi@93-103-91-169.dynamic.t-2.net) Quit (Ping timeout: 480 seconds)
[7:10] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[7:10] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[7:10] * dlan (~dennis@116.228.88.131) has joined #ceph
[7:12] * kefu is now known as kefu|afk
[7:14] * overclk (~overclk@59.93.66.131) has joined #ceph
[7:15] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[7:22] * RaidSoft (~Coe|work@89.248.173.115) has joined #ceph
[7:22] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[7:22] * kefu|afk (~kefu@114.92.106.70) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[7:27] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[7:27] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[7:34] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) Quit (Quit: Bye)
[7:36] * zenpac (~zenpac3@66.55.33.66) Quit (Ping timeout: 480 seconds)
[7:44] * zenpac (~zenpac3@66.55.33.66) has joined #ceph
[7:47] * linjan__ (~linjan@176.195.239.174) has joined #ceph
[7:48] * yuan (~yzhou67@shzdmzpr01-ext.sh.intel.com) has joined #ceph
[7:48] * overclk_ (~overclk@59.93.66.67) has joined #ceph
[7:48] * overclk_ (~overclk@59.93.66.67) Quit (autokilled: This host may be infected. Mail support@oftc.net with questions. BOPM (2015-11-03 06:48:57))
[7:49] * overclk (~overclk@59.93.66.131) Quit (Read error: Connection reset by peer)
[7:52] * RaidSoft (~Coe|work@89.248.173.115) Quit ()
[7:52] * roaet1 (~FNugget@4K6AACDQX.tor-irc.dnsbl.oftc.net) has joined #ceph
[7:57] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) has joined #ceph
[7:59] * adun153 (~ljtirazon@112.198.90.112) Quit (Ping timeout: 480 seconds)
[8:00] <Be-El> hi
[8:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[8:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[8:06] <MrBy> hi, is it possible to deploy multiple radosgw entities within once ceph cluster, whereas each of them has different pool settings for domain_root, control_pool, gc_pool, etc... maybe only with a shared users pool?
[8:12] * daviddcc (~dcasier@80.215.226.131) has joined #ceph
[8:14] * enax (~enax@hq.ezit.hu) has joined #ceph
[8:15] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[8:18] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[8:18] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[8:21] * kefu (~kefu@114.92.106.70) has joined #ceph
[8:22] * roaet1 (~FNugget@4K6AACDQX.tor-irc.dnsbl.oftc.net) Quit ()
[8:22] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[8:26] * SquallSeeD31 (~Neon@195-154-191-67.rev.poneytelecom.eu) has joined #ceph
[8:30] * linjan__ (~linjan@176.195.239.174) Quit (Ping timeout: 480 seconds)
[8:33] * adun153 (~ljtirazon@112.198.78.80) has joined #ceph
[8:33] * daviddcc (~dcasier@80.215.226.131) Quit (Read error: No route to host)
[8:33] * kefu (~kefu@114.92.106.70) Quit (Max SendQ exceeded)
[8:34] * kefu (~kefu@114.92.106.70) has joined #ceph
[8:42] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[8:44] * adun153 (~ljtirazon@112.198.78.80) Quit (Ping timeout: 480 seconds)
[8:45] * dgurtner (~dgurtner@77.95.96.78) has joined #ceph
[8:48] * joao (~joao@8.184.114.89.rev.vodafone.pt) has joined #ceph
[8:48] * ChanServ sets mode +o joao
[8:50] * xarses (~xarses@118.103.8.153) has joined #ceph
[8:52] * garphy`aw is now known as garphy
[8:53] * xarses (~xarses@118.103.8.153) Quit (Remote host closed the connection)
[8:56] * SquallSeeD31 (~Neon@0WFAAAMPB.tor-irc.dnsbl.oftc.net) Quit ()
[8:56] * Wielebny (~Icedove@cl-927.waw-01.pl.sixxs.net) has joined #ceph
[8:58] * xarses (~xarses@118.103.8.153) has joined #ceph
[8:58] * fsimonce (~simon@host30-173-dynamic.23-79-r.retail.telecomitalia.it) has joined #ceph
[9:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[9:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[9:01] * bjozet (~bjozet@82-183-17-144.customers.ownit.se) has joined #ceph
[9:02] * houming-wang (~houming-w@103.10.86.234) has joined #ceph
[9:03] * bjozet (~bjozet@82-183-17-144.customers.ownit.se) Quit ()
[9:03] * houming-wang (~houming-w@103.10.86.234) has left #ceph
[9:03] * evl (~chatzilla@39.138.216.139.sta.dodo.net.au) has joined #ceph
[9:04] * evl (~chatzilla@39.138.216.139.sta.dodo.net.au) Quit ()
[9:04] * bjozet (~bjozet@82-183-17-144.customers.ownit.se) has joined #ceph
[9:05] * houming-wang (~houming-w@103.10.86.234) has joined #ceph
[9:05] * houming-wang (~houming-w@103.10.86.234) has left #ceph
[9:06] * jasuarez (~jasuarez@186.Red-83-37-101.dynamicIP.rima-tde.net) has joined #ceph
[9:07] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[9:07] <mistur> Hello
[9:07] <mistur> I'm looking for a way to define the size of the
[9:07] <mistur> ssd journal partition
[9:08] <mistur> I have 10x6TB per node
[9:08] <mistur> is there a calculation to define this?
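The rule of thumb from the documentation of this era is osd journal size = 2 * (expected throughput * filestore max sync interval). A rough sketch, assuming each 6TB SATA drive sustains ~150 MB/s and the default ~5 s sync interval: 2 * 150 * 5 = 1500 MB, so a few GB per journal partition is plenty; in ceph.conf that might look like:

    [osd]
    osd journal size = 10240    # 10 GB per journal partition, comfortably above the computed minimum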
[9:09] * bjozet (~bjozet@82-183-17-144.customers.ownit.se) Quit ()
[9:09] * bjozet (~bjozet@82-183-17-144.customers.ownit.se) has joined #ceph
[9:10] * dgurtner (~dgurtner@77.95.96.78) Quit (Ping timeout: 480 seconds)
[9:11] * olid14 (~olid1982@185.17.206.44) has joined #ceph
[9:13] * thomnico (~thomnico@2a01:e35:8b41:120:546:2d4f:48d3:b4a1) has joined #ceph
[9:17] * ade (~abradshaw@tmo-113-50.customers.d1-online.com) has joined #ceph
[9:18] * pabluk_ is now known as pabluk
[9:20] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) has joined #ceph
[9:21] * analbeard (~shw@support.memset.com) has joined #ceph
[9:23] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[9:23] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[9:25] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[9:28] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[9:30] * adun153 (~ljtirazon@112.198.78.80) has joined #ceph
[9:34] * mewald (~mewald@p54AFE274.dip0.t-ipconnect.de) has joined #ceph
[9:35] * b0e (~aledermue@213.95.25.82) has joined #ceph
[9:35] <mewald> I am getting these random error messages https://gist.github.com/anonymous/0bbf058d41f105a9bd73 after second try they disappear. happens with other commands, too. What do they mean?
[9:40] * alrick (~alrick@91.218.144.129) has joined #ceph
[9:41] * garphy is now known as garphy`aw
[9:42] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) has joined #ceph
[9:43] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:c11e:ac88:f68b:f7e8) has joined #ceph
[9:44] * drankis (~drankis__@89.111.13.198) has joined #ceph
[9:44] * drankis (~drankis__@89.111.13.198) Quit ()
[9:47] * rendar (~I@host225-177-dynamic.21-87-r.retail.telecomitalia.it) has joined #ceph
[9:49] * garphy`aw is now known as garphy
[9:50] * Wijk (~vend3r@195-154-191-67.rev.poneytelecom.eu) has joined #ceph
[9:50] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[9:50] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[9:51] * fridim_ (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[9:53] * bitserker (~toni@63.pool85-52-240.static.orange.es) has joined #ceph
[9:55] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[9:55] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[9:57] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[9:57] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[10:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[10:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[10:02] * RayTracer (~RayTracer@public-gprs517434.centertel.pl) has joined #ceph
[10:02] * dgurtner (~dgurtner@77.95.96.78) has joined #ceph
[10:06] * bara (~bara@nat-pool-brq-u.redhat.com) has joined #ceph
[10:09] * dalgaaf (uid15138@id-15138.ealing.irccloud.com) has joined #ceph
[10:10] * dgurtner (~dgurtner@77.95.96.78) Quit (Ping timeout: 480 seconds)
[10:15] * kanagaraj (~kanagaraj@121.244.87.117) has joined #ceph
[10:18] * kefu is now known as kefu|afk
[10:18] * kawa2014 (~kawa@89.184.114.246) Quit (Read error: Connection reset by peer)
[10:18] * kefu|afk (~kefu@114.92.106.70) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:19] * linjan__ (~linjan@86.62.112.22) has joined #ceph
[10:19] * kawa2014 (~kawa@2001:67c:1560:8007::aac:c1a6) has joined #ceph
[10:20] * Wijk (~vend3r@4Z9AAAQU6.tor-irc.dnsbl.oftc.net) Quit ()
[10:21] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[10:22] * RayTracer (~RayTracer@public-gprs517434.centertel.pl) Quit (Remote host closed the connection)
[10:23] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[10:29] * bara (~bara@nat-pool-brq-u.redhat.com) Quit (Ping timeout: 480 seconds)
[10:29] * bara (~bara@nat-pool-brq-u.redhat.com) has joined #ceph
[10:30] * kefu (~kefu@114.92.106.70) has joined #ceph
[10:31] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[10:33] * branto1 (~branto@ip-213-220-232-132.net.upcbroadband.cz) has joined #ceph
[10:36] * dgurtner (~dgurtner@77.95.96.78) has joined #ceph
[10:41] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[10:41] * markednmbr1 (~markednmb@109.239.90.187) has joined #ceph
[10:43] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[10:43] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[10:44] * kefu (~kefu@114.92.106.70) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:49] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[10:49] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[10:50] * RayTracer (~RayTracer@153.19.7.39) has joined #ceph
[10:51] * garphy is now known as garphy`aw
[10:55] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[10:55] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Remote host closed the connection)
[10:56] * Kwen (~tZ@46.19.139.126) has joined #ceph
[10:58] * jluis (~joao@8.184.114.89.rev.vodafone.pt) has joined #ceph
[10:58] * ChanServ sets mode +o jluis
[11:00] * garphy`aw is now known as garphy
[11:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[11:01] * haomaiwang (~haomaiwan@li414-102.members.linode.com) has joined #ceph
[11:02] * lpabon (~quassel@24-151-54-34.dhcp.nwtn.ct.charter.com) has joined #ceph
[11:02] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[11:04] * joao (~joao@8.184.114.89.rev.vodafone.pt) Quit (Ping timeout: 480 seconds)
[11:09] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[11:09] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[11:10] * dgurtner (~dgurtner@77.95.96.78) Quit (Ping timeout: 480 seconds)
[11:10] * longguang_ (~chatzilla@123.126.33.253) has joined #ceph
[11:12] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) has joined #ceph
[11:15] * kawa2014 (~kawa@2001:67c:1560:8007::aac:c1a6) Quit (Ping timeout: 480 seconds)
[11:15] * longguang (~chatzilla@123.126.33.253) Quit (Ping timeout: 480 seconds)
[11:15] * longguang_ is now known as longguang
[11:16] <mewald> I am getting these random error messages https://gist.github.com/anonymous/0bbf058d41f105a9bd73 after second try they disappear. happens with other commands, too. What do they mean?
[11:22] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[11:22] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[11:23] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[11:25] * tiantian (~oftc-webi@116.228.88.99) Quit (Quit: Page closed)
[11:26] * rdas (~rdas@182.70.152.17) Quit (Quit: Leaving)
[11:28] * RayTracer (~RayTracer@153.19.7.39) Quit (Ping timeout: 480 seconds)
[11:29] * Kwen (~tZ@5P6AAAREP.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[11:34] * shawniverson (~shawniver@208.38.236.111) Quit (Remote host closed the connection)
[11:37] * xarses (~xarses@118.103.8.153) Quit (Ping timeout: 480 seconds)
[11:48] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[11:55] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[11:55] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[11:56] * adun153 (~ljtirazon@112.198.78.80) Quit (Quit: Leaving)
[11:57] * haomaiwang (~haomaiwan@li414-102.members.linode.com) Quit (Remote host closed the connection)
[12:01] * amote (~amote@121.244.87.116) Quit (Ping timeout: 480 seconds)
[12:02] <kiranos> is there a command to verify that syntax in /etc/ceph/ceph.conf is valid
[12:02] <kiranos> ?
[12:04] <kiranos> my conf is not matching https://github.com/ceph/ceph/blob/master/src/sample.ceph.conf
[12:04] <kiranos> here all entries are without _
[12:04] <kiranos> ;mon host = cephhost01,cephhost02
[12:04] <kiranos> ;mon addr = 192.168.0.101,192.168.0.102
[12:04] <kiranos> but when my ceph.conf was created I got
[12:04] <kiranos> mon_initial_members =
[12:04] <kiranos> mon_host =
[12:04] <kiranos> etc
[12:04] <kiranos> ;osd pool default size = 3
[12:05] <kiranos> osd_pool_default_size = 3
[12:05] <kiranos> which is correct, and it would be nice to have a verification tool that shows which variables ceph doesn't understand
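In ceph.conf, spaces and underscores in option names are interchangeable (and ';' and '#' both start comments), so both styles above are valid. A minimal sketch of checking what actually got parsed, assuming a locally running mon.a; the exact ceph-conf flags vary by release, so treat them as an assumption and check the man page:

    [global]
    osd pool default size = 3        # the same option as osd_pool_default_size = 3

    # query the value as a daemon sees it:
    #   ceph-conf -c /etc/ceph/ceph.conf --lookup osd_pool_default_size
    #   ceph daemon mon.a config show | grep osd_pool_default_size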
[12:10] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[12:10] * no-thing_ (~arvydas@88-118-134-149.static.zebra.lt) has joined #ceph
[12:17] * zhaochao (~zhaochao@125.39.8.237) Quit (Quit: ChatZilla 0.9.92 [Iceweasel 38.3.0/20150922225347])
[12:36] * bene2 (~bene@2601:18c:8300:f3ae:ea2a:eaff:fe08:3c7a) has joined #ceph
[12:37] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Remote host closed the connection)
[12:40] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[12:40] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[12:42] * olid14 (~olid1982@185.17.206.44) Quit (Ping timeout: 480 seconds)
[12:42] <no-thing_> hello, what if i have 100TB rbd block device with 60TB used for filesystem, and 40TB left free in PV. Is it safe to shrink rbd image to 60TB ?
[12:46] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[12:46] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[12:51] * nardial (~ls@dslb-178-009-201-039.178.009.pools.vodafone-ip.de) Quit (Quit: Leaving)
[12:52] * dlan (~dennis@116.228.88.131) Quit (Read error: Connection reset by peer)
[12:53] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[12:54] * dlan (~dennis@116.228.88.131) has joined #ceph
[12:55] * ira (~ira@c-71-233-225-22.hsd1.ma.comcast.net) has joined #ceph
[12:56] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[12:58] * dlan_ (~dennis@116.228.88.131) has joined #ceph
[12:58] * dlan (~dennis@116.228.88.131) Quit (Read error: Connection reset by peer)
[13:01] * Cybertinus (~Cybertinu@cybertinus.customer.cloud.nl) Quit (Remote host closed the connection)
[13:03] * xarses (~xarses@118.103.8.153) has joined #ceph
[13:06] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) has joined #ceph
[13:08] * Cybertinus (~Cybertinu@cybertinus.customer.cloud.nl) has joined #ceph
[13:09] * dgurtner (~dgurtner@77.95.96.78) has joined #ceph
[13:10] * mario7 (~mario@5.172.252.49) has joined #ceph
[13:12] * bara (~bara@nat-pool-brq-u.redhat.com) Quit (Ping timeout: 480 seconds)
[13:12] * mario7 (~mario@5.172.252.49) Quit (Read error: Connection reset by peer)
[13:12] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[13:12] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[13:19] * dneary (~dneary@pool-96-237-170-97.bstnma.fios.verizon.net) has joined #ceph
[13:20] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[13:20] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[13:20] * kanagaraj (~kanagaraj@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:21] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[13:22] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[13:23] * vbellur (~vijay@122.172.75.21) has joined #ceph
[13:25] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:c11e:ac88:f68b:f7e8) Quit (Ping timeout: 480 seconds)
[13:25] <Anticimex> that seems like a lvm question primarily
[13:26] <Anticimex> lvm must make sure the data is in the first blocks, but i believe lvm does that?
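A rough sketch of that shrink, assuming the image is mapped as /dev/rbd0 and named rbd/myimage (both placeholders) and that a snapshot/backup exists; pvresize refuses to drop allocated extents, so any extents sitting beyond the new size have to be relocated (pvmove) first:

    pvresize --setphysicalvolumesize 60T /dev/rbd0           # shrink the PV before the image
    rbd resize --allow-shrink --size 62914560 rbd/myimage    # 60 TiB in MB; blocks past the new size are discarded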
[13:28] * kanagaraj (~kanagaraj@121.244.87.124) has joined #ceph
[13:37] * kanagaraj (~kanagaraj@121.244.87.124) Quit (Ping timeout: 480 seconds)
[13:37] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) Quit (Ping timeout: 480 seconds)
[13:38] * andreask (~andreask@2001:67c:1933:800::3200) has joined #ceph
[13:38] * ChanServ sets mode +v andreask
[13:40] * lkoranda (~lkoranda@213.175.37.10) has joined #ceph
[13:41] * dneary (~dneary@pool-96-237-170-97.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[13:41] * xarses (~xarses@118.103.8.153) Quit (Remote host closed the connection)
[13:45] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[13:45] * lkoranda (~lkoranda@213.175.37.10) Quit ()
[13:45] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[13:47] * xarses (~xarses@118.103.8.153) has joined #ceph
[13:49] <MrBy> hi, is it possible to deploy multiple radosgw entities within once ceph cluster, whereas each of them has different pool settings for domain_root, control_pool, gc_pool, etc... maybe only with a shared users pool?
[13:51] * dgurtner (~dgurtner@77.95.96.78) Quit (Ping timeout: 480 seconds)
[13:52] <mewald> I am getting these random error messages https://gist.github.com/anonymous/0bbf058d41f105a9bd73 after second try they disappear. happens with other commands, too. What do they mean?
[13:53] * nhm (~nhm@c-50-171-139-246.hsd1.mn.comcast.net) has joined #ceph
[13:53] * ChanServ sets mode +o nhm
[13:53] * dgurtner (~dgurtner@77.95.96.78) has joined #ceph
[13:54] <Heebie> Has anyone come up with a good guide for testing latency etc, and what different numbers in say rados bench actually mean?
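One hedged starting point for that, assuming a pool named rbd exists and that the runtime, block size and thread count below are arbitrary; rados bench reports average and max latency alongside throughput, and --no-cleanup leaves the written objects in place so they can be read back (remove them afterwards):

    rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup   # 60s of 4K writes, 16 in flight
    rados bench -p rbd 60 seq -t 16                          # sequential reads of the objects written above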
[13:55] * lkoranda (~lkoranda@nat-pool-brq-t.redhat.com) has joined #ceph
[13:58] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:c11e:ac88:f68b:f7e8) has joined #ceph
[14:01] * andreask (~andreask@2001:67c:1933:800::3200) has left #ceph
[14:08] * daviddcc (~dcasier@LCaen-656-1-144-187.w217-128.abo.wanadoo.fr) has joined #ceph
[14:11] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:c11e:ac88:f68b:f7e8) Quit (Ping timeout: 480 seconds)
[14:20] * olid14 (~olid1982@193.24.209.54) has joined #ceph
[14:23] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[14:35] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[14:35] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:c11e:ac88:f68b:f7e8) has joined #ceph
[14:39] * _are_ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) Quit (Remote host closed the connection)
[14:42] * olid14 (~olid1982@193.24.209.54) Quit (Ping timeout: 480 seconds)
[14:46] * kefu (~kefu@101.81.133.6) has joined #ceph
[14:48] * delattec (~cdelatte@cpe-71-75-20-42.carolina.res.rr.com) Quit (Quit: This computer has gone to sleep)
[14:51] * kefu (~kefu@101.81.133.6) Quit (Max SendQ exceeded)
[14:51] * kefu (~kefu@101.81.133.6) has joined #ceph
[14:52] * kefu (~kefu@101.81.133.6) Quit ()
[14:57] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[14:57] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[15:00] * dgurtner (~dgurtner@77.95.96.78) Quit (Ping timeout: 480 seconds)
[15:02] * _are_ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) has joined #ceph
[15:02] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[15:02] * mhack (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[15:04] * ibravo (~ibravo@72.83.69.64) has joined #ceph
[15:06] * danieagle (~Daniel@179.98.86.152) has joined #ceph
[15:06] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) has joined #ceph
[15:10] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[15:11] * brutusca_ (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) has joined #ceph
[15:11] * brutuscat (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) Quit (Read error: Connection reset by peer)
[15:14] * bara (~bara@213.175.37.12) has joined #ceph
[15:23] * yanzheng (~zhyan@182.139.23.79) Quit (Quit: This computer has gone to sleep)
[15:23] * krypto (~krypto@103.252.26.214) has joined #ceph
[15:23] * overclk (~overclk@59.93.64.233) has joined #ceph
[15:23] * overclk (~overclk@59.93.64.233) Quit (autokilled: This host may be infected. Mail support@oftc.net with questions. BOPM (2015-11-03 14:23:58))
[15:24] * amote (~amote@1.39.13.48) has joined #ceph
[15:24] * bara (~bara@213.175.37.12) Quit (Ping timeout: 480 seconds)
[15:25] * yanzheng (~zhyan@182.139.23.79) has joined #ceph
[15:28] * jwilkins (~jowilkin@67.204.149.211) has joined #ceph
[15:29] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[15:29] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[15:34] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[15:36] * krypto (~krypto@103.252.26.214) Quit (Ping timeout: 480 seconds)
[15:41] * danieagle (~Daniel@179.98.86.152) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[15:42] * sankarshan (~sankarsha@121.244.87.124) Quit (Quit: Are you sure you want to quit this channel (Cancel/Ok) ?)
[15:45] * tiagonux (~oftc-webi@186-232-188-6.tiviths.com.br) has joined #ceph
[15:45] * alrick (~alrick@91.218.144.129) Quit (Remote host closed the connection)
[15:48] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[15:51] * dyasny (~dyasny@198.251.57.99) has joined #ceph
[15:54] * yanzheng (~zhyan@182.139.23.79) Quit (Quit: This computer has gone to sleep)
[15:54] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[15:54] * TheSov (~TheSov@38.106.143.234) has joined #ceph
[15:56] * krypto (~krypto@103.252.26.214) has joined #ceph
[15:56] * TheSov2 (~TheSov@204.13.200.248) has joined #ceph
[15:57] * kawa2014 (~kawa@89.184.114.246) Quit (Ping timeout: 480 seconds)
[15:57] * kefu (~kefu@114.92.106.70) has joined #ceph
[15:58] * kawa2014 (~kawa@2001:67c:1560:8007::aac:c1a6) has joined #ceph
[16:02] <TheSov2> has anyone done a cache pool on top of an EC pool?
[16:02] * segutier (~segutier@12.51.50.4) Quit (Quit: segutier)
[16:02] * derjohn_mob (~aj@94.119.1.2) has joined #ceph
[16:03] * TheSov (~TheSov@38.106.143.234) Quit (Ping timeout: 480 seconds)
[16:04] * Gorazd (~Gorazd@89-212-99-37.dynamic.t-2.net) has joined #ceph
[16:05] * m0zes
[16:05] <TheSov2> m0zes, you have?
[16:05] <m0zes> yes I have
[16:05] <TheSov2> what level of replication do you have on your cache tier?
[16:05] <m0zes> 3
[16:05] <TheSov2> hmmm
[16:05] <TheSov2> i figured since your backend pool was ec
[16:05] * linjan__ (~linjan@86.62.112.22) Quit (Ping timeout: 480 seconds)
[16:06] <TheSov2> you would do 2 at most
[16:06] <m0zes> the problem is that the likelihood of two disks failing simultaneously is fairly high.
[16:07] <TheSov2> well with 3, you are essentially keeping 5 copies of same data which seems bad
[16:07] <TheSov2> 3 from the cache tier and what is essentially 2 from the EC pool
[16:07] <m0zes> no, I keep 3 copies in the cache tier and 1.5x from the ec pool, but not all data is hot.
[16:08] <TheSov2> wait what ec level do you have?
[16:08] <m0zes> k=8,m=4
[16:08] <TheSov2> so 8 blocks and 4 parity?
[16:09] * doppelgrau (~doppelgra@pd956d116.dip0.t-ipconnect.de) has joined #ceph
[16:09] <m0zes> yes. I would probably go k=6,m=3 next time. or k=9,m=3 depending on how many disks, and how long I want recalculating parities to take.
[16:11] <m0zes> it is all pretty tunable, but there are tradeoffs.
[16:11] <TheSov2> hmmm
[16:11] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[16:14] * jasuarez (~jasuarez@186.Red-83-37-101.dynamicIP.rima-tde.net) Quit (Quit: WeeChat 1.2)
[16:14] * jasuarez (~jasuarez@186.Red-83-37-101.dynamicIP.rima-tde.net) has joined #ceph
[16:16] * krypto (~krypto@103.252.26.214) Quit (Ping timeout: 480 seconds)
[16:20] * pabluk is now known as pabluk_
[16:22] * markednmbr1 (~markednmb@109.239.90.187) Quit (Quit: Leaving)
[16:26] * CheKoLyN (~saguilar@bender.parc.xerox.com) has joined #ceph
[16:26] <TheSov2> so if i do k=3 and m=3 thats basically a slower version of replication eh
[16:27] * tenshi (~David@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[16:30] <m0zes> yep, that would be replication, and it would require *all* the osds to respond to read requests, on top of having 3 times as many disks writing.
[16:30] <m0zes> so, probably a *much* slower version of size 2, but it would also be safer, as you could lose 3 disks and still have all your data.
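For reference, a minimal sketch of the layout being discussed (an EC base pool with a replicated cache tier in front of it); pool names, the profile name and the PG counts are placeholders:

    ceph osd erasure-code-profile set ec-8-4 k=8 m=4
    ceph osd pool create ecpool 1024 1024 erasure ec-8-4
    ceph osd pool create hot-cache 128 128
    ceph osd tier add ecpool hot-cache
    ceph osd tier cache-mode hot-cache writeback
    ceph osd tier set-overlay ecpool hot-cache
    ceph osd pool set hot-cache hit_set_type bloom    # a hit set is required for a writeback cache tier

In practice the cache pool would also be pinned to the SSDs via its own CRUSH ruleset and capped with target_max_bytes as discussed at the top of the log.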
[16:31] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) has joined #ceph
[16:33] * georgem (~Adium@206.108.127.16) has joined #ceph
[16:34] <babilen> Does it make sense to use SSDs in dedicated MON nodes? And if so: How would you use/configure them ?
[16:34] <TheSov2> yes
[16:34] <TheSov2> in fact
[16:34] <TheSov2> its highly recommended to put /var on ssd for monitors
[16:34] <TheSov2> and your monitors should be dedicated
[16:35] <TheSov2> low cost environments tend to pair monitors with osd
[16:35] <babilen> I'm just planning hardware for a testlab and am not sure if what I came up with makes sense ...
[16:35] * pabluk_ is now known as pabluk
[16:35] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[16:35] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[16:36] <babilen> https://www.refheap.com/111336 is what I came up with for now, but I'm not entirely sure about SSD usage for MONs
[16:37] <tenshi> no need
[16:37] <babilen> Wanted to order 3 each (3 OSD nodes, 3 MONs)
[16:38] <tenshi> you can run with regular drives for mon servers.
[16:38] <babilen> I planned to put the journal on SSDs, but I'm not sure if it makes sense to use it for MONs too
[16:38] <TheSov2> like i said for monitors
[16:38] <TheSov2> use a normal disk for OS
[16:38] <m0zes> monitors have to sync every change, much like a traditional database, changes happen multiple times per second, and spinning disks (for monitors) often can't keep up with clusters at load.
[16:39] <TheSov2> and a ssd for /var
[16:39] <babilen> I'm also trying to decide if I'd like the journal on SSD RAID, given that journal loss is rather hard to deal with
[16:39] <m0zes> for testing, I doubt it matters as much.
[16:39] <TheSov2> m0zes, look at his reference, those are serious test nodes
[16:39] <TheSov2> btw the monitors do not need 10gig nics
[16:39] <babilen> Okay, so you would definitely use SSDs for /var in production on MONs then
[16:39] <TheSov2> you are wasting money
[16:40] <TheSov2> babilen, correct
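A minimal sketch of that layout on a mon node, assuming the SSD shows up as /dev/sdb1 (a placeholder) and the mon store stays in the default /var/lib/ceph:

    mkfs.xfs /dev/sdb1
    echo '/dev/sdb1  /var/lib/ceph  xfs  noatime  0 2' >> /etc/fstab
    mount /var/lib/ceph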
[16:40] <babilen> So 2x1G bond would suffice there?
[16:40] <TheSov2> everytime you make a write the pgmap is updated on all monitors
[16:40] <m0zes> latency. 10Gb would help with latency.
[16:40] <TheSov2> oh yeah
[16:40] <TheSov2> m0zes, for a monitor? seriously?
[16:40] <m0zes> ssds from the same manufacturer shouldn't be raided, as they generally have the same wear patterns. leading to both ssds failing at the same time.
[16:41] <TheSov2> you wont need raids on monitors, they are designed for failure, just have an odd number, more than 1
[16:41] <babilen> I plan to start with 3 nodes each (OSD and MON)
[16:41] <TheSov2> for test i say use 3 monitors, use at least 5 in production
[16:41] <babilen> So the hardware specs in my paste look halfway alright?
[16:41] <TheSov2> more than adequate
[16:42] <TheSov2> in fact i would say you have a high end test setup
[16:42] <babilen> Yeah, I plan to up those numbers dramatically before we take it into production. This would be just for "playing"
[16:42] <TheSov2> my ceph test has been in amazon's cloud/crappy old equipment i have here
[16:42] <TheSov2> amazon's cloud isnt anywhere near as fast as they say :/
[16:43] <babilen> Well, I planned to order fewer spinning drives and less RAM (we can always get more). The specs there are for a full OSD and MON node
[16:43] <TheSov2> how big will your prod disks be
[16:43] <babilen> I have been playing with VMs now and got the go ahead for a "real" setup ..
[16:44] <babilen> 1.8T per OSD was my idea on 10k spinning
[16:45] <TheSov2> what do you mean 1.8
[16:45] <TheSov2> you mean 2tb disks
[16:45] <TheSov2> dont spend real money on disks it wont help you
[16:45] <TheSov2> the idea is to use commodity drives
[16:45] <TheSov2> unless you are building an SSD cache tier, thats worth it
[16:45] <babilen> 1.8 = 1.8TB
[16:46] <TheSov2> so ideally you should have a gig of ram for every 1tb of disk
[16:46] <TheSov2> so a 32tb osd system should have at least 32gigs
[16:47] * gregmark (~Adium@68.87.42.115) has joined #ceph
[16:47] <babilen> So the OSD node would have too much RAM?
[16:47] <TheSov2> what do you mean?
[16:47] * joshd (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[16:47] <TheSov2> you can have more ram it doesnt hurt
[16:47] <m0zes> depending on the price, I might say go with 2 or 3 of the P3700 series for journals, rather than the S3700 disks, and you'd have more space for more osds in those nodes http://ark.intel.com/compare/71918,71917,79624
[16:47] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[16:47] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[16:47] <TheSov2> in fact it helps with read caches
[16:48] <TheSov2> and get ssd's with ultracaps
[16:48] <babilen> such as?
[16:48] <m0zes> but, that being said, it looks like a great setup to me.
[16:49] <TheSov2> intel disks, the enterprise ones have supercaps inside, so when power cuts off all the ram cached writes on ssd get written to the ssd before it actually shuts off
[16:49] <babilen> I just spent some time reading up on "magic numbers" and ratios you'd like to keep in a ceph setup and tried to map that to hardware we could order from Dell; the setup you see is the result of that.
[16:49] <TheSov2> http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/ssd-320-series-power-loss-data-protection-brief.pdf
[16:50] * moore (~moore@64.202.160.88) has joined #ceph
[16:50] <TheSov2> well you want about 100 pg's per osd
[16:50] <TheSov2> in total
[16:50] <TheSov2> you want it to be a power of 2 as well
[16:50] <TheSov2> so try to get as close as possible
[16:50] <TheSov2> also its better to have too few pg's than too many
[16:51] <TheSov2> http://ceph.com/pgcalc/ <--- this will help you
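A worked example of that rule, assuming (purely for illustration) 3 OSD nodes with 18 OSDs each and a single replicated pool of size 3: 54 OSDs * 100 PGs / 3 replicas = 1800, and the nearest power of 2 is 2048 (or drop to 1024 to err on the low side, as suggested above):

    ceph osd pool create rbd-pool 2048 2048 replicated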
[16:52] <TheSov2> sorry, i've been on a ceph binge recently :)
[16:53] <babilen> I'm not too sure how to take that information into account to be honest ..
[16:54] <m0zes> that isn't relevant until you're at the software configuration stage.
[16:55] <babilen> I played with pgcalc before and am also aware of the fact that you want it to be close to a power of 2, but how to correlate that to the order I'm trying to come up with somewhat escapes me right now
[16:57] * vata (~vata@207.96.182.162) has joined #ceph
[16:57] <m0zes> right. magic numbers. from what I've seen 1GHz of processing power per spinning disk. 2-2.5GHz of processing power per SSD. 1-4GB of memory per TB of OSD space, for osd memory usage and page cache. a decent network, as you don't want the network to be the bottleneck of your storage.
[16:58] * rwheeler (~rwheeler@pool-173-48-214-9.bstnma.fios.verizon.net) Quit (Quit: Leaving)
[16:59] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:00] <m0zes> unless you've got PCI-E ssds, 1 good SSD per 3-6 spinning disks for journaling.
[17:00] * mewald (~mewald@p54AFE274.dip0.t-ipconnect.de) Quit (Quit: Lost terminal)
[17:01] <babilen> m0zes: Regarding the SSDs: I read/heard that you'd use one SSD for roughly 4 OSD which gives me 18/4 = 4 SSDs. Naturally the capacity would be too high if I buy P3700 (smallest is 400G) which means that I'd probably get 3 .. Does that still make sense?
[17:01] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[17:01] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[17:01] * lcurtis (~lcurtis@47.19.105.250) has joined #ceph
[17:02] <m0zes> PCI-E SSDs have more IOPs, and the large sizes will accept more writes, so 3x P3700's is probably better than 4x S3700s,
[17:03] <babilen> aye
[17:04] * angdraug (~angdraug@c-50-174-102-105.hsd1.ca.comcast.net) has joined #ceph
[17:05] * brutusca_ (~brutuscat@60.Red-193-152-185.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[17:05] <babilen> How much space do I need for /var on the MONs (SSD) ?
[17:06] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[17:06] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[17:08] * ScOut3R (~ScOut3R@business-89-135-189-217.business.broadband.hu) has joined #ceph
[17:08] <m0zes> depends on the number of objects, osds. My cluster (200M objects, 432 OSDs) needs about 4-5 GB under ideal situations, It has ballooned up to 30GB before (during a 650TB backfill), and then dropped back down to 4-5GB
[17:08] <babilen> So a single P3700 with 400G would be more than enough
[17:08] <babilen> (on each MON)
[17:08] <m0zes> yes.
[17:09] <babilen> right, ta
[17:09] <lincolnb> does SSD for the mon store make a noticeable difference?
[17:09] * no-thing_ (~arvydas@88-118-134-149.static.zebra.lt) Quit (Remote host closed the connection)
[17:09] <ScOut3R> Hey everyone! After a serious hardware failure i have two OSDs which don't want to start with the following error: journal Unable to read past sequence 833326183 but header indicates the journal has committed up through 833326591, journal is corrupt. I tried to make a new journal for them with --mkjournal, but the error is the same. Should I reformat the drives or is there a way to clear out the corrupted journal? Thanks
[17:09] <ScOut3R> !
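One common approach when a filestore journal really is unrecoverable, sketched here on the assumption that the surviving replicas are healthy and that osd.12 is a placeholder id, is to drop the OSD and rebuild it so it backfills from its peers:

    ceph osd out 12
    # wait for recovery to finish, then remove the OSD entirely
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    # re-create the OSD on the reformatted drive and let it backfill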
[17:10] <tenshi> lincolnb it shouldn't
[17:10] <m0zes> lincolnb: It can, I know my monitors eat 65% of the performance of my current SSDs. 90-100MB/s of constant fsyncs...
[17:10] <lincolnb> hmm
[17:11] <babilen> TheSov2, m0zes: Thank you for your comments!
[17:11] <tenshi> we are using regular drive
[17:11] <tenshi> and having less than 14% io
[17:12] <tenshi> on them. How can you reach 100MB/s on your monitors ?
[17:12] <lincolnb> well, it's a good potential bottleneck to be aware of. will have to watch ganglia for I/O wait on the mons..
[17:14] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[17:15] <m0zes> I guess that 100MB/s isn't entirely accurate. that is the difference in performance of my monitor disks vs SSDs for OSDs. I think it has more to do with IOPs than bandwidth proper.
[17:17] * wushudoin (~wushudoin@38.140.108.2) has joined #ceph
[17:18] * kefu is now known as kefu|afk
[17:19] * LPG (~LPG@c-50-181-212-148.hsd1.wa.comcast.net) has joined #ceph
[17:20] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[17:20] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[17:20] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[17:23] * TheSov2 (~TheSov@204.13.200.248) Quit (Ping timeout: 480 seconds)
[17:23] * kefu|afk (~kefu@114.92.106.70) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[17:26] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[17:26] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[17:34] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[17:34] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[17:36] * olid14 (~olid1982@185.17.206.44) has joined #ceph
[17:37] * ira (~ira@c-71-233-225-22.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[17:41] * lpabon (~quassel@24-151-54-34.dhcp.nwtn.ct.charter.com) Quit (Ping timeout: 480 seconds)
[17:44] * sudocat (~dibarra@192.185.1.20) Quit (Remote host closed the connection)
[17:48] * vbellur (~vijay@122.172.75.21) Quit (Ping timeout: 480 seconds)
[17:50] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:52] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[17:53] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[17:53] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[17:54] * shylesh__ (~shylesh@59.95.68.86) has joined #ceph
[17:57] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[17:58] * vbellur (~vijay@122.172.251.132) has joined #ceph
[18:00] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[18:00] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[18:00] * TheSov (~TheSov@cip-248.trustwave.com) has joined #ceph
[18:01] * ircolle (~Adium@2601:285:201:2bf9:8af:2fa5:450b:60dc) has joined #ceph
[18:06] <TheSov> tenshi, it would make a difference, the monitors have to write the new maps in place before they can be disseminated, so latency on your monitor translates to latency on your cluster for writes
[18:11] * amote (~amote@1.39.13.48) Quit (Ping timeout: 480 seconds)
[18:11] * amote (~amote@1.39.13.48) has joined #ceph
[18:14] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[18:14] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[18:16] * shylesh__ (~shylesh@59.95.68.86) Quit (Ping timeout: 480 seconds)
[18:16] * shylesh__ (~shylesh@59.95.68.86) has joined #ceph
[18:18] * rwheeler (~rwheeler@nat-pool-bos-u.redhat.com) has joined #ceph
[18:22] * ksperis (~laurent@46.218.42.103) has left #ceph
[18:23] * amote (~amote@1.39.13.48) Quit (Ping timeout: 480 seconds)
[18:24] * jluis (~joao@8.184.114.89.rev.vodafone.pt) Quit (Ping timeout: 480 seconds)
[18:24] * tiagonux (~oftc-webi@186-232-188-6.tiviths.com.br) Quit (Ping timeout: 480 seconds)
[18:24] * jluis (~joao@8.184.114.89.rev.vodafone.pt) has joined #ceph
[18:24] * ChanServ sets mode +o jluis
[18:24] * wushudoin (~wushudoin@38.140.108.2) Quit (Ping timeout: 480 seconds)
[18:25] * wushudoin (~wushudoin@38.140.108.3) has joined #ceph
[18:25] * Kupo1 (~tyler.wil@23.111.254.159) has joined #ceph
[18:26] <lookcrabs> is there a way to set bucket policy as well as all of the objects inside the bucket for ceph? I know you can in S3 but it doesn't look like ceph supports this yet.
[18:27] <lookcrabs> right now i am iterating through all of the keys inside of a bucket but does this mean I need to do this again for any new keys created inside the bucket? is there a way for me to just set the policy of the bucket so that any objects inside the bucket have the same policy/acl?
[18:29] <TheSov> you are using the radosgw?
[18:29] <lookcrabs> yup
[18:29] <lookcrabs> radosgw
[18:29] * bara_ (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[18:30] <lookcrabs> I tried setting bucket ACL but it doesn't seem to affect the rest of the objects.
[18:31] <TheSov> it should work
[18:31] <TheSov> try creating a new object
[18:31] <TheSov> inside the bucket
[18:31] <TheSov> and see if the correct acl is applied
[18:32] * tiagonux (~oftc-webi@lbl-sp.tiviths.com.br) has joined #ceph
[18:36] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) Quit (Quit: Leaving)
[18:37] * bara_ (~bara@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[18:38] * bara_ (~bara@213.175.37.10) has joined #ceph
[18:38] <TheSov> lookcrabs, did it do it properly?
[18:39] <lookcrabs> I am not sure yet. I am about to try again now : )
[18:40] <lookcrabs> so I should be able to set the ACL on a bucket and all objects in the bucket should be accessible at that point?
[18:40] * wushudoin (~wushudoin@38.140.108.3) Quit (Quit: Leaving)
[18:40] <TheSov> well
[18:40] * wushudoin (~wushudoin@38.140.108.3) has joined #ceph
[18:40] * dgurtner (~dgurtner@185.10.235.250) has joined #ceph
[18:40] <TheSov> im thinking its not applying to existing objects
[18:40] <TheSov> but any new objects should be
[18:41] <TheSov> so if you create a policy on a new bucket
[18:41] <TheSov> it should be fully accessible by that policy
[18:41] <TheSov> but you are correct, it's supposed to allow full access via the policy to existing and new objects
[18:42] <TheSov> i dont have a radosgw setup but ill set one up to test it
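One hedged way to do the per-object pass from the command line, assuming s3cmd is configured against the radosgw endpoint and the bucket is s3://mybucket (a placeholder). In stock S3 semantics the bucket ACL and the object ACLs are independent, and radosgw of this era has no bucket-policy support, so existing objects need their ACL set individually (new ones can be uploaded with an x-amz-acl header):

    s3cmd setacl --acl-public s3://mybucket                 # the bucket itself
    s3cmd setacl --acl-public --recursive s3://mybucket     # every existing object in the bucket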
[18:43] <lookcrabs> it hasn't worked for me yet but i'm trying again.
[18:43] <kiranos> I posted about this earlier: http://pastebin.com/xaXghDpj and didnt get an input :) maybe someone on here now knows
[18:43] <lookcrabs> need to set it all up again to test :)
[18:43] <kiranos> its about ceph.conf syntax
[18:44] * branto1 (~branto@ip-213-220-232-132.net.upcbroadband.cz) Quit (Quit: Leaving.)
[18:44] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[18:44] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[18:45] <lookcrabs> kiranos: I think the underscore is optional as I see it both ways. I use the underscore myself
[18:47] * garphy is now known as garphy`aw
[18:48] * wushudoin_ (~wushudoin@38.140.108.2) has joined #ceph
[18:51] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys! (??????????????????? ?????????)
[18:52] * thomnico (~thomnico@2a01:e35:8b41:120:546:2d4f:48d3:b4a1) Quit (Quit: Ex-Chat)
[18:52] <TheSov> can you pastebin your policy?
[18:54] * thomnico (~thomnico@2a01:e35:8b41:120:ec8c:221f:ae09:d94d) has joined #ceph
[18:55] * wushudoin (~wushudoin@38.140.108.3) Quit (Ping timeout: 480 seconds)
[18:57] * jasuarez (~jasuarez@186.Red-83-37-101.dynamicIP.rima-tde.net) Quit (Quit: WeeChat 1.2)
[18:59] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[18:59] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[18:59] * bara_ (~bara@213.175.37.10) Quit (Ping timeout: 480 seconds)
[19:04] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[19:04] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[19:06] * Destreyf_ (~quassel@email.newagecomputers.info) Quit (Read error: Connection reset by peer)
[19:07] * tiagonux_ (~oftc-webi@186-232-188-6.tiviths.com.br) has joined #ceph
[19:07] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Remote host closed the connection)
[19:10] * segutier (~segutier@sfo-vpn1.shawnlower.net) has joined #ceph
[19:12] * dupont-y (~dupont-y@familledupont.org) has joined #ceph
[19:12] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[19:12] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[19:12] * bitserker (~toni@63.pool85-52-240.static.orange.es) Quit (Ping timeout: 480 seconds)
[19:13] * tiagonux (~oftc-webi@lbl-sp.tiviths.com.br) Quit (Ping timeout: 480 seconds)
[19:17] * thomnico (~thomnico@2a01:e35:8b41:120:ec8c:221f:ae09:d94d) Quit (Ping timeout: 480 seconds)
[19:17] * pabluk is now known as pabluk_
[19:22] * ScOut3R (~ScOut3R@business-89-135-189-217.business.broadband.hu) Quit (Quit: Leaving...)
[19:28] * shylesh__ (~shylesh@59.95.68.86) Quit (Ping timeout: 480 seconds)
[19:28] * shylesh__ (~shylesh@59.95.68.86) has joined #ceph
[19:29] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[19:30] * angdraug (~angdraug@c-50-174-102-105.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[19:31] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[19:32] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[19:32] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[19:32] * derjohn_mob (~aj@94.119.1.2) Quit (Ping timeout: 480 seconds)
[19:33] * ade (~abradshaw@tmo-113-50.customers.d1-online.com) Quit (Quit: Too sexy for his shirt)
[19:34] <lookcrabs> policy? Yeah sure sorry. haven't changed it yet
[19:34] <lookcrabs> http://pastebin.com/6u11Q7sw
[19:34] <lookcrabs> it's an older bucket
[19:35] <lookcrabs> changing it now and will post paste of updated policy
[19:39] <lookcrabs> bagh and I am mixing policy and ACL. I will print acl too
[19:39] * ira (~ira@c-71-233-225-22.hsd1.ma.comcast.net) has joined #ceph
[19:39] * dgbaley27 (~matt@c-67-176-93-83.hsd1.co.comcast.net) has joined #ceph
[19:40] * daviddcc (~dcasier@LCaen-656-1-144-187.w217-128.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[19:42] * dalgaaf (uid15138@id-15138.ealing.irccloud.com) Quit (Quit: Connection closed for inactivity)
[19:43] <lookcrabs> http://pastebin.com/TDGeGQgM
[19:44] * BManojlovic (~steki@cable-89-216-194-159.dynamic.sbb.rs) has joined #ceph
[19:45] * stiopa (~stiopa@cpc73828-dals21-2-0-cust630.20-2.cable.virginm.net) has joined #ceph
[19:46] * mykola (~Mikolaj@91.225.202.134) has joined #ceph
[19:48] * kawa2014 (~kawa@2001:67c:1560:8007::aac:c1a6) Quit (Quit: Leaving)
[19:48] * mgolub (~Mikolaj@91.225.203.72) has joined #ceph
[19:49] * sasha1 (~achuzhoy@BURLON0309W-LP140-03-1279478798.dsl.bell.ca) has joined #ceph
[19:55] * mykola (~Mikolaj@91.225.202.134) Quit (Ping timeout: 480 seconds)
[19:58] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[19:58] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[20:03] * ircolle (~Adium@2601:285:201:2bf9:8af:2fa5:450b:60dc) Quit (Ping timeout: 480 seconds)
[20:03] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[20:03] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[20:05] * Scaevolus (~airsoftgl@176.123.6.155) has joined #ceph
[20:07] <TheSov> odd
[20:08] * angdraug (~angdraug@12.164.168.117) has joined #ceph
[20:10] * dupont-y (~dupont-y@familledupont.org) Quit (Ping timeout: 480 seconds)
[20:13] * enax (~enax@hq.ezit.hu) Quit (Ping timeout: 480 seconds)
[20:13] * ircolle (~Adium@2601:285:201:2bf9:8af:2fa5:450b:60dc) has joined #ceph
[20:15] <TheSov> i need help understanding how ceph gives out ip's
[20:15] <TheSov> is it possible to have osd systems on dhcp?
[20:15] <TheSov> since the monitors give out ip addresses?
[20:17] <TheSov> ahh nevermind
[20:17] <TheSov> http://tracker.ceph.com/issues/3550
[20:21] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) has joined #ceph
[20:22] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[20:22] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[20:28] * tiagonux_ (~oftc-webi@186-232-188-6.tiviths.com.br) Quit (Ping timeout: 480 seconds)
[20:35] * Scaevolus (~airsoftgl@5P6AAAR2G.tor-irc.dnsbl.oftc.net) Quit ()
[20:35] * geegeegee (~Chrissi_@pei69-1-78-193-103-77.fbxo.proxad.net) has joined #ceph
[20:38] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[20:38] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[20:41] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) Quit (Remote host closed the connection)
[20:48] * Docta (~Docta@pool-108-46-37-8.nycmny.fios.verizon.net) has joined #ceph
[20:49] * rwheeler (~rwheeler@nat-pool-bos-u.redhat.com) Quit (Quit: Leaving)
[20:49] <Docta> anyone seeing an issue with CEPH where all IOPS completely drop to the cluster until you bounce a single OSD?
[20:50] * skmarin (~oftc-webi@74.112.38.14) has joined #ceph
[20:50] <tenshi> you mean one of your OSDs slows down the whole cluster ? if you remove it then your cluster is fast again ?
[20:51] <Docta> the cluster was unable to write/read; an OSD drive was brought down, then back up again, and cluster iops resumed as normal
[20:51] * shylesh__ (~shylesh@59.95.68.86) Quit (Remote host closed the connection)
[20:52] <Docta> OSD was chosen at random, not correlated to any issues reported
[20:52] <tenshi> any hardware issues reported regarding this osd ?
[20:52] <Docta> none
[20:52] <tenshi> mmm..
[20:53] <Docta> we're running hammer if it makes any difference
[20:53] <Docta> all OSD/Mon servers have same version
[20:53] <tenshi> it comes back up automagically ?
[20:54] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[20:54] <Docta> after bouncing the OSD yes
[20:54] <Docta> very odd
[20:54] <tenshi> waooo
[20:54] <tenshi> yes it is.
[20:54] <tenshi> any issues with the journal?
[20:54] <tenshi> what is your replica ratio ?
[20:54] <Docta> no issues as far as i can tell
[20:55] <Docta> replica ratio is 3
[20:55] <tenshi> and min_size ?
[20:55] <Docta> checking
[20:55] <tenshi> ty
[20:55] <Docta> no, thank you!
[20:57] <Docta> not seeing that set, where can i find that value?
[20:57] <Docta> likely still the default, which is 1
[20:58] <m0zes> default in hammer is 0, which equates to ceil(size/2)
[20:59] <m0zes> iirc
[20:59] <Docta> ah, thanks, apologies
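To check what a pool is actually using, something like the following works ('rbd' is just an example pool name):

    # per-pool values
    ceph osd pool get rbd size
    ceph osd pool get rbd min_size
    # or list size/min_size for all pools at once
    ceph osd dump | grep 'replicated size'
    # and set it explicitly if needed
    ceph osd pool set rbd min_size 2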
[20:59] * skmarin (~oftc-webi@74.112.38.14) Quit (Quit: Page closed)
[21:02] <Gorazd> is it ok in a 5 node cluster, when choosing CephFS, to put the MONs and MDSs on the same nodes? In a 5 node cluster, is it ok to have 3 MONs and 2 MDSs, or is it better to also have 3 MDSs? Each node is a mix of compute (to run VMs) and storage (1x SSD for OS and journaling, plus 2x 1 TB SATA as the storage infrastructure for the VMs).
[21:05] * geegeegee (~Chrissi_@4Z9AAARON.tor-irc.dnsbl.oftc.net) Quit ()
[21:05] * Lite (~Kristophe@179.43.151.234) has joined #ceph
[21:05] <Gorazd> Also a question: is the journal data (which is stored on a shared OS SSD) also replicated across the cluster when a node fails? Each of the 5 nodes in the cluster has its own SSD for OS and journaling, and 2x 1TB for OSDs. How big should the journal partition on the SSD be?
[21:06] * shinanigan (~oftc-webi@74.112.38.14) has joined #ceph
[21:07] <mfa298_> journals are per OSD and there's a formula for determining size (2 * max speed of the underlying device * max sync interval)
[21:07] <mfa298_> which in most cases suggests 1G for the journal is enough (I think ceph-disk normally creates 5G by default, which gives some spare if you want to change settings)
[21:11] <Gorazd> aha ok. thx mfa298_
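As a rough worked example of that formula (the throughput figure is only illustrative): a spinner that sustains ~100 MB/s with the default filestore max sync interval of 5 seconds gives 2 * 100 MB/s * 5 s = 1000 MB, i.e. about 1 GB of journal. The corresponding ceph.conf setting takes a value in MB:

    [osd]
    # 2 * expected throughput * filestore max sync interval
    osd journal size = 1024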
[21:11] <Docta> any other thoughts tenshi? we're grasping at straws on this one as no errors were reported, and if this happens again when no one is actively watching iops we'll be less able to resolve the issues in a timely fashion
[21:13] * georgem1 (~Adium@206.108.127.16) has joined #ceph
[21:15] <kutija> what is the best way to get the read/write ratio in Ceph?
[21:16] <kutija> and how can I calculate current IOPS usage through Ceph stats
[21:16] * georgem (~Adium@206.108.127.16) Quit (Read error: Connection reset by peer)
[21:17] <lurbs> Basic stats are available through something like 'ceph -w'. If you're after something (a lot) more comprehensive then look at: http://docs.ceph.com/docs/hammer/dev/perf_counters/
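For the counters behind those stats, the admin socket is the usual route ('osd.0' is just an example daemon, and the command has to run on the host where that OSD lives):

    # rolling cluster-wide client IO summary
    ceph -w
    # full perf counter dump from one OSD
    ceph daemon osd.0 perf dump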
[21:19] <kutija> well I have to purchase new storage in order to replace what I have now
[21:19] <kutija> so actually I need to calculate how much IOPS I really need
[21:19] <kutija> with some extra of course
[21:20] * dupont-y (~dupont-y@familledupont.org) has joined #ceph
[21:20] <kutija> currently I have 2 nodes with 4 OSD's each on 7200RPM drives with replication factor 2
[21:21] <kutija> so if I'm correct, that means I have 4 disks' worth of IOPS for writes and 8 disks' worth for reads?
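That is roughly the right way to think about it. A back-of-the-envelope version, assuming ~75-100 IOPS per 7200 RPM drive, replication 2 (each client write lands on two OSDs), and journals co-located on the same spinners (so each OSD write hits the disk twice):

    # reads:  8 OSDs * ~100 IOPS                                  = ~800 IOPS
    # writes: 8 OSDs * ~100 IOPS / (2 replicas * 2 journal+data)  = ~200 IOPS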
[21:21] <Gorazd> any recommendations on CephFS for how many MDSs to set up in a 5 node cluster? should the number of MONs and MDSs be the same from an availability standpoint?
[21:24] <tenshi> Docta, thinking outside the box
[21:24] <Docta> all ears my friend
[21:25] <tenshi> nothing in ceph logs ? no error writing to osd ?
[21:25] <tenshi> seems pretty weird.
[21:25] <Docta> having our engineer who's in here going through them now, so far nothing of note
[21:25] <Docta> actually we just had iops drop to 0 again
[21:25] <Docta> and are attempting to reboot another OSD to see if we can replicate the solution
[21:27] <tenshi> yeah, you need to replicate the bug and look closely at the ceph logs
[21:27] <tenshi> on your monitor.
[21:27] <Docta> roger that
[21:28] <tenshi> ;p
[21:29] * MentalRay (~MentalRay@MTRLPQ42-1176054809.sdsl.bell.ca) has joined #ceph
[21:30] <Docta> rebooting OSD fixed the issue again
[21:30] <Docta> having tech collect logs from Monitors
[21:30] <mfa298_> Gorazd: at the moment there may be little benefit in having more than 2 mds servers as you probably want to run with only one active at a time
[21:32] * erhudy (uid89730@id-89730.ealing.irccloud.com) has joined #ceph
[21:33] <Gorazd> mfa298_: ok thx. What's the recommendation, should they run on different nodes than the MONs (if possible)?
[21:33] <Gorazd> or is there no problem running an MDS on the same node as a MON?
[21:34] <Gorazd> When CephFS is stable in Q1 2016, would it still be recommended to run it on XFS, or is btrfs already a good choice?
[21:34] <mfa298_> that may depend on the sort of load you're looking at and what your hardware is like, we tried mds on the same hardware as some of our mons, but we have very over-specced hardware for the mons
[21:35] * Lite (~Kristophe@7V7AAAZDJ.tor-irc.dnsbl.oftc.net) Quit ()
[21:35] <m0zes> the mds can be memory hungry. I think the default settings can let it balloon to 16-24GB under heavy load... (at least in my experience)
[21:35] <tenshi> Docta, first: one OSD down should not block reads/writes to the pool
[21:35] <tenshi> there is a problem here first; it really seems like a hardware issue that freezes your OSD node
[21:35] <Docta> agreed, just to clarify, the OSD's are never down
[21:35] <Docta> they are reporting as green according to CEPH
[21:35] <Docta> but rebooting a random OSD resolves the issue
[21:36] <tenshi> waoo, thinking.
[21:36] <Docta> really appreciate the help tenshi
[21:36] <tenshi> no prob, have been here asking questions and someone helped me too
[21:36] <tenshi> this is how open community works
[21:36] <tenshi> ;)
[21:36] <tenshi> which makes me think - try to reproduce it
[21:37] <tenshi> and do a ceph -s
[21:37] <tenshi> on your monitor at the same time
[21:37] <tenshi> see how it behaves.
[21:37] <TheSov> i need 80k to build my idea cluster. 1.1 petabytes
[21:37] <TheSov> ideal*
[21:37] <Docta> i can provide logs from mon server, but don't want to spam the #ceph channel
[21:37] <TheSov> someone give me 80 thousand dollars
[21:38] <Docta> just sent it to you; we can't see any issues in it, but if you can spot anything please don't be shy :)
[21:39] <m0zes> $80,000 isn't too bad for 1.1PB. we spent $350 for 2.2PB...
[21:39] <m0zes> s/350/350K/
[21:40] <m0zes> Docta: pastebin, dpaste, fpaste or past.ie are all good places to put some logs to share publicly.
[21:41] <m0zes> nvm about past.ie apparently I can't remember what that site was.
[21:41] <via> pastee.org
[21:43] <m0zes> ahh, I remember now. paste.ie mostly because I liked the tag line. "Paste.ie - Irish for Pastebin"
[21:52] * rendar (~I@host225-177-dynamic.21-87-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[21:56] * rendar (~I@host225-177-dynamic.21-87-r.retail.telecomitalia.it) has joined #ceph
[22:08] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[22:09] * bene2 is now known as bene_in_alu_mtg
[22:10] * DV (~veillard@2001:41d0:1:d478::1) Quit (Remote host closed the connection)
[22:10] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[22:12] * rwheeler (~rwheeler@pool-173-48-214-9.bstnma.fios.verizon.net) has joined #ceph
[22:13] * dgurtner_ (~dgurtner@185.10.235.250) has joined #ceph
[22:15] * dgurtner (~dgurtner@185.10.235.250) Quit (Ping timeout: 480 seconds)
[22:15] * sbfox (~Adium@vancouver.xmatters.com) has joined #ceph
[22:16] * sbfox (~Adium@vancouver.xmatters.com) has left #ceph
[22:16] * sbfox (~Adium@vancouver.xmatters.com) has joined #ceph
[22:17] <sbfox> Hi everyone, can someone tell me what the impact of disabling scrub and deepscrub would be?
[22:18] <lurbs> Pro: Less background IO. Con: No periodic check of {meta,}data consistency.
[22:18] <lurbs> TLDR: Don't.
[22:19] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[22:21] * georgem1 (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:24] * doppelgrau (~doppelgra@pd956d116.dip0.t-ipconnect.de) Quit (Quit: doppelgrau)
[22:25] <TheSov> sbfox, bad things will happen
[22:25] <TheSov> dogs and cats living together
[22:25] <TheSov> mass histeria!
[22:25] <TheSov> hysteria!
[22:25] <TheSov> even
[22:25] * linjan__ (~linjan@176.195.239.174) has joined #ceph
[22:26] * lurbs prefers TheSov's answer to his own.
[22:26] * shaunm (~shaunm@208.102.161.229) has joined #ceph
[22:27] <TheSov> HA when sigourney weaver shows up it gets really bad!
[22:27] <sbfox> Ok so its more than just a defragger then?
[22:27] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[22:27] * boredatwork (~overonthe@199.68.193.54) has joined #ceph
[22:27] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[22:27] <TheSov> no
[22:27] <TheSov> no
[22:27] <TheSov> no no no!
[22:28] <TheSov> each block of data is hashed
[22:28] <TheSov> and then the hash is checked against the data during deep scrub
[22:28] <TheSov> if they dont match, its broken data
[22:28] <TheSov> and then it gets fixed
[22:28] <TheSov> if u disable that the COSMIC RAYS WILL GET YOU!
[22:29] <TheSov> its called bitrot, and the bigger your data set, the bigger chance it has of being an actual thing
[22:29] <sbfox> The reason I ask is because I had to take down a node for a day for a mobo replacement; when the node came back up io went thru the roof. scrubs/deepscrubs/rebalancing etc. This (I think) caused a vm volume to iowait severely and lose a block.
[22:30] <sbfox> the block happened to be a postgres datablock, problems and panicking happened. a lot
[22:30] <lurbs> It's okay to temporarily disable it, but not to leave it in that state.
[22:30] <sbfox> Understood
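For reference, temporarily disabling and re-enabling scrubbing cluster-wide looks like this:

    # disable
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # re-enable once things have settled
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub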
[22:31] <lurbs> If a rebalance causes IO stalls I'd recommend tweaking that first.
[22:31] <sbfox> Happy to do whatever the recommended path is
[22:31] <sbfox> How does one throttle rebalancing?
[22:32] <sbfox> (posh english voice came out there, sorry :) )
[22:32] <TheSov> yeah if backfilling, rebalancing or anything like that causes high iowait you may need to re-evaluate how your cluster is built
[22:32] <lurbs> Specifically 'osd max backfills' and 'osd recovery max active', to begin with.
[22:32] <lurbs> http://docs.ceph.com/docs/hammer/rados/configuration/osd-config-ref/
[22:32] <lurbs> Those default, in my opinion, to values that are far too high.
[22:33] <sbfox> I inherited a cluster where compute and ceph are mixed on the same nodes (very very bad)
[22:33] <TheSov> OMG
[22:33] <TheSov> STOP IT IMMEDIATELY
[22:34] <TheSov> kill the bastard who did that
[22:34] <sbfox> If only I could
[22:34] <lurbs> I'm guessing that it was effectively the accountants. :)
[22:34] <sbfox> You guess correctly
[22:34] * dynamicudpate (~overonthe@199.68.193.54) Quit (Ping timeout: 480 seconds)
[22:35] <TheSov> it also sounds like you may not have dedicated ssd journals
[22:35] * bene_in_alu_mtg is now known as bene2
[22:35] <sbfox> We do, 1 partitioned ssd per 4 osd
[22:36] <TheSov> ok thats not a bad ratio at all
[22:36] <lurbs> sbfox: Those values default to 10 and 15, I think. We run them at 1 and 1 and tweak if/when necessary.
[22:36] <TheSov> i would definitely move compute off
[22:36] <TheSov> when you say compute, do you mean a hypervisor, or just some apps installed?
[22:37] * mgolub (~Mikolaj@91.225.203.72) Quit (Quit: away)
[22:37] <lurbs> You can change them live with, for example: ceph tell osd.* injectargs '--osd_recovery_max_active 1'
[22:37] <sbfox> The cluster is only 7 nodes and rebalancing 14.5% of the data means heavy io load, even on ssd's. At some point you still have to flush to sas drives
[22:37] <lurbs> That won't survive a restart of the OSDs, you'd need to amend ceph.conf for that.
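A minimal ceph.conf sketch for making those throttles persistent (the values mirror the ones discussed above; tune for your own cluster):

    [osd]
    osd max backfills = 1
    osd recovery max active = 1
    # optionally deprioritise recovery ops relative to client IO
    osd recovery op priority = 1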
[22:39] * chiluk (~quassel@172.34.213.162.lcy-01.canonistack.canonical.com) has joined #ceph
[22:39] <m0zes> we ran 2 months with noscrub and nodeep-scrub. out of necessity, but it certainly isn't ideal. and when you turn it back on -- OMG SCRUBBING!
[22:39] <cetex> :>
[22:39] <cetex> i'm actually planning to run compute and ceph on same nodes
[22:39] <m0zes> some places will turn scrubbing off during business hours, and turn it back on overnight.
[22:39] <cetex> but at most 3 osd's per node though..
[22:39] <sbfox> Thanks for the tweaks, I'll get them pushed out. @TheSov, yes compute == hypervisors. My plan is to purchase more compute in the next round of spending, then migrate the running vm's off the ceph nodes and disable nova-compute
[22:39] <lurbs> m0zes: Yeah, I've had to extend the intervals to spread it out.
[22:39] <chiluk> what is the current ceph recommendation in relation to ZFS as the osd filesystem?
[22:40] <TheSov> well since ZOL sucks
[22:40] <TheSov> and ceph doesnt have a build for bsd
[22:40] <TheSov> good luck with that :)
[22:40] <m0zes> zfs isn't in-tree, so I doubt there will be a *recommendation* for it.
[22:40] <lurbs> chiluk: http://docs.ceph.com/docs/hammer/rados/configuration/filesystem-recommendations/#filesystems
[22:41] * enax (~enax@94-21-125-223.pool.digikabel.hu) has joined #ceph
[22:41] <chiluk> m0zes: things change..
[22:41] <m0zes> that doesn't mean it won't work ;)
[22:41] <lurbs> Given that it's not even mentioned there, I'd guess the answer is "don't". :)
[22:41] * enax (~enax@94-21-125-223.pool.digikabel.hu) has left #ceph
[22:41] <chiluk> has anyone tested it?
[22:41] <m0zes> it just means that it is a path less travelled.
[22:41] <chiluk> yeah that's my conclusion as well..
[22:42] <TheSov> chiluk, will you be using ZOL?
[22:42] <lurbs> chiluk: Yep, the mailing lists have details of a few intrepid souls and their issues/fixes.
[22:43] <chiluk> TheSov: I'm just trying to convince some poor misinformed souls about the errors of their ways.
[22:43] <TheSov> thats not an answer...
[22:46] <Thunderbird> should the ceph-mon script during mkfs place a 'keyring' file in the directory for that monitor (e.g. /var/lib/ceph/mon/ceph-0/)? some docs say it should, but I don't see the keyring there, which prevents the monitor from starting
[22:47] <TheSov> Thunderbird, did you already create a cluster
[22:47] <TheSov> ?
[22:47] <Thunderbird> I created a config file, various keys and the monitor map
[22:47] <Thunderbird> the only thing which ceph-mon outputs in that place is store.db
[22:48] <Thunderbird> this is on Infernalis btw
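For comparison, the manual-deployment style invocation looks like this (mon id 'a' and the /tmp paths are placeholders); whether the file passed via --keyring actually gets copied into the mon data dir is exactly what's in question on Infernalis:

    ceph-mon --mkfs -i a --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
    ls /var/lib/ceph/mon/ceph-a/
    # on Hammer this typically shows: keyring  store.db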
[22:48] * ira (~ira@c-71-233-225-22.hsd1.ma.comcast.net) Quit (Quit: Leaving)
[22:49] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[22:50] * bjozet (~bjozet@82-183-17-144.customers.ownit.se) Quit (Quit: leaving)
[22:53] <diq_> I understand that EC pools can't perform partial writes, is that why you can't run CephFS data on an EC pool?
[22:54] <gregsfortytwo> among others - EC pools have some other limitations as well, but that's the big one
[22:55] <lurbs> Are you able to do so if you put a replicated cache pool in front?
[22:56] <chiluk> thanks TheSov, lurbs... you were very helpful.
[22:56] <diq_> lurbs, I haven't tried. Is that something people do?
[22:57] <diq_> I guess it is
[22:57] <lurbs> It was my understanding that was the workaround for not being able to run RBD directly on EC pools, but not sure if it also worked for CephFS.
[22:57] <lurbs> We don't run EC, so I'm not sure.
[22:58] <diq_> well, it's mentioned in the use case http://docs.ceph.com/docs/v0.86/dev/erasure-coded-pool/
[22:58] <gregsfortytwo> yes, you can put CephFS on EC+cache
[22:58] <diq_> A replicated pool is created and set as a cache tier for the erasure-coded pool. An agent demotes objects (i.e. moves them from the replicated pool to the erasure-coded pool) if they have not been accessed in a week.
[22:59] <gregsfortytwo> there are some annoyances around it with monitor commands in terms of set-up and teardown in most of the releases :( but functionally it's transparent to CephFS (and to everything) if you have replicated cache + EC
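A rough sketch of that setup (pool names, PG counts and the default EC profile are placeholders; the EC base pool is then used as the CephFS data pool):

    # erasure-coded base pool plus a replicated cache pool
    ceph osd pool create ecdata 128 128 erasure
    ceph osd pool create ecdata-cache 128 128
    # stack the replicated pool on top as a writeback cache tier
    ceph osd tier add ecdata ecdata-cache
    ceph osd tier cache-mode ecdata-cache writeback
    ceph osd tier set-overlay ecdata ecdata-cache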
[23:00] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Remote host closed the connection)
[23:01] * sudocat (~dibarra@192.185.1.20) Quit (Read error: Connection reset by peer)
[23:01] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[23:01] * fridim_ (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Ping timeout: 480 seconds)
[23:02] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[23:02] <diq_> thanks for the advice. much appreciated everyone!
[23:03] <cetex> hm.. so if we run an EC pool it's not possible to edit data?
[23:04] <cetex> that's nice in a way. as long as you plan for it accordingly :)
[23:04] <diq_> depends on data mutability and use case
[23:04] <diq_> our use case provides for immutable data, so that's why I'm tinkering with it
[23:05] <cetex> yeah. same here.
[23:05] <cetex> what would happen if you lose a couple of blocks from an ec-coded object?
[23:05] <cetex> more data loss than the coding can handle
[23:06] <cetex> without losing the pg's it's on
[23:06] <cetex> :>
[23:06] <diq_> you mean multi OSD bitrot?
[23:06] <cetex> yeah, for example
[23:07] <cetex> or another one, ramdisk for journal and dc-wide powerloss
[23:07] <cetex> some blocks have been written, others haven't.
[23:07] <diq_> Not sure. I'd hope it would throw a notfound error or something rather than partial data
[23:07] <cetex> yeah. maybe. but what happens to the data? does it get cleaned up? stays forever?
[23:07] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Remote host closed the connection)
[23:08] <cetex> deletes as usual without hiccups when asked for deletion?
[23:09] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[23:10] <cetex> and what happens on deep scrub? :)
[23:11] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) Quit (Remote host closed the connection)
[23:11] * georgem (~Adium@184.151.190.158) has joined #ceph
[23:13] <Thunderbird> did some more testing with the ceph-mon command; it feels like it indeed doesn't copy the keyring file on Infernalis, while on a Hammer box I have it seems to do it
[23:14] * Dysgalt (~Coestar@4Z9AAARU5.tor-irc.dnsbl.oftc.net) has joined #ceph
[23:14] * georgem (~Adium@184.151.190.158) Quit ()
[23:14] * georgem (~Adium@206.108.127.16) has joined #ceph
[23:16] * foxxx0 (~fox@nano-srv.net) Quit (Remote host closed the connection)
[23:16] * foxxx0 (~fox@mail.nano-srv.net) has joined #ceph
[23:17] * Icey (~IceyEC_@0001bbad.user.oftc.net) Quit (Ping timeout: 480 seconds)
[23:17] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[23:17] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[23:18] * shinanigan (~oftc-webi@74.112.38.14) Quit (Quit: Page closed)
[23:19] <Thunderbird> I noticed ceph-mon has a large number of internal debug messages, how do you enable these? I tried playing with debug_ms / debug_mon but not much luck, not sure where the messages should even appear
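A few ways to get at those, assuming a mon id of 'a' (messages land in the monitor's log, typically /var/log/ceph/ceph-mon.a.log, or on stderr when the daemon runs in the foreground):

    # run the monitor in the foreground, logging to stderr
    ceph-mon -d -i a
    # or bump the levels live on a running monitor
    ceph tell mon.a injectargs '--debug_mon 10 --debug_ms 1'
    # or persistently in ceph.conf
    [mon]
    debug mon = 10
    debug ms = 1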
[23:19] * mattbenjamin (~mbenjamin@aa2.linuxbox.com) Quit (Quit: Leaving.)
[23:19] <mfa298_> cetex: if you're thinking of putting the journals on a ramdisk only, I suspect you might be setting yourself up for some nasty data loss
[23:20] <cetex> mfa298_: yeah. i know. but it seems i only lose what's in the journals. the OSDs seem to reindex the hdd's upon boot so they find the old data there.
[23:20] <cetex> we'll solve redundancy of data in case of datacenter outages another way. (two datacenters we write to in parallel is one solution)
[23:21] * xarses (~xarses@118.103.8.153) Quit (Ping timeout: 480 seconds)
[23:21] <mfa298_> if you lost several osds at the same time you could be looking at losing several seconds' worth of data writes.
[23:22] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[23:22] <mfa298_> two data centers (unless they're really close) could have more latency than using something like ssd journals - and if the dc's are that close you could find that an event taking out one could take out both.
[23:25] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[23:25] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[23:25] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[23:25] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[23:26] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[23:27] * TheSov (~TheSov@cip-248.trustwave.com) Quit (Quit: Leaving)
[23:28] <cetex> mfa298_: yeah. but if we loose a datacenter we're looking at loosing quite a bit of data that's not on ceph anyways.
[23:28] <lurbs> s/loos/los/
[23:28] <cetex> stuff in buffers, data transmitted to us (udp streams) that we can't handle (since, powerloss..)
[23:29] <cetex> :D
[23:30] <mfa298_> depending on what you're using ceph to store, loss of data in ceph could be fairly disastrous for other things (e.g. if you're using rbd for vm images - which would seem the most likely reason you want to minimise latency that much)
[23:31] <cetex> yeah. not going to be an issue in this case.
[23:31] <cetex> we're going to write segments of video to it.
[23:34] * moore (~moore@64.202.160.88) Quit (Remote host closed the connection)
[23:34] <Gorazd> Trying to understand Manila. Is Manila intended to serve data, saved in RADOS, through CephFS or NFS to the VM or other guests?
[23:34] * moore (~moore@64.202.160.88) has joined #ceph
[23:34] <lurbs> Gorazd: https://www.youtube.com/watch?v=dNTCBouMaAU
[23:35] <lurbs> ^^ Sage talking about Ceph and Manila.
[23:36] <cetex> but the reason for building without ssd's is that it's much cheaper, and imho simpler to setup. and kinda guaranteed latency.
[23:37] <mfa298_> cetex: I'm not really sure why you mentioned putting journals in ramdisk in that case. If it's just bandwidth you need, lots of drives can do that
[23:37] <cetex> you still get serious latency hits (like, many seconds) if you have a short burst of writes.
[23:37] <cetex> at least in my testing.
[23:38] <mfa298_> you may not find it's simpler to setup, my experience so far is that ceph is designed to work in particular ways and if you deviate from that things can get interesting
[23:39] <cetex> i've done tests already, haven't managed to kill the cluster with journal on ramdisk even though i've powercycled hosts randomly and stuff. although, i need to implement a constant-write service and then try to re-read the data afterwards to see how much is lost.
[23:41] * dgurtner_ (~dgurtner@185.10.235.250) Quit (Ping timeout: 480 seconds)
[23:42] * moore (~moore@64.202.160.88) Quit (Ping timeout: 480 seconds)
[23:43] <mfa298_> if you're getting many second latency hits with ssd journals you might have something very odd in your setup.
[23:44] * Dysgalt (~Coestar@4Z9AAARU5.tor-irc.dnsbl.oftc.net) Quit ()
[23:44] * dgurtner (~dgurtner@185.10.235.250) has joined #ceph
[23:48] <cetex> no ssd's in that case. :)
[23:48] <cetex> problem with ssd's is that we'll kill most of them according to my math.
[23:49] <cetex> an intel p3700 400GB should last us some time, but that's getting quite expensive (it triples the cost of building our own storage cluster since we only run 2-3 hdd's per node)
[23:50] * davidzlap1 (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[23:52] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) Quit (Read error: Connection reset by peer)
[23:52] <olid14> hi, cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
[23:52] <olid14> system time is in sync
[23:54] * dgurtner_ (~dgurtner@185.10.235.250) has joined #ceph
[23:54] <cetex> the nice case: assume we'll write 0.75TB/day/hdd, 2 hdd's per node so 1.5TBW/day on the journal drive; an intel s3700 100GB (very cheap) handles 10DWPD, so we'll write 1.5x its max (and we've lost one disk slot since we put an ssd there), GC may become an issue on the SSD and we'll kill the drive in ~3 years.
[23:55] * dgurtner (~dgurtner@185.10.235.250) Quit (Ping timeout: 480 seconds)
[23:55] <cetex> the bad case: assume we'll write 1TB/day/hdd, 3 hdd's per node so 3TBW/day on the journal drive; an intel s3700 100GB (if we could fit it..) handles 10DWPD, so we'll write 3x its max, GC will most likely become an issue and we'll kill the drive in ~1.5 years.
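Spelling that arithmetic out, using the same assumptions (S3700 100GB rated at 10 drive writes per day over its 5-year warranty):

    # rated journal endurance: 100 GB * 10 DWPD = ~1 TB/day, sustained for ~5 years
    # nice case: 2 hdds * 0.75 TB/day = 1.5 TB/day -> ~1.5x the rating -> roughly 3 years
    # bad case:  3 hdds * 1.0  TB/day = 3.0 TB/day -> ~3x the rating   -> roughly 1.5 years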
[23:56] <Gorazd> lurbs: i do not have any experience with cephfs yet. is the size of the CephFS mount point on a guest (let's say within a VM) dependent on the size of the partition within the VM, or is it similar to an rbd image, whose size is set when it is created within RADOS?
[23:56] <cetex> to scale this we'd need the 400GB ssd's to handle 3TBW/day of journal writes, and then we're looking at a cost increase of ~40-50% for the storage environment.
[23:57] <mfa298_> cetex: what spinning drives are you planning on using ?
[23:57] <lurbs> It's the size of the pool that the CephFS is backed with, I believe. Not sure about quotas etc. We don't use CephFS.
[23:57] <Thunderbird> cetex, note that for DWPD estimations, 10 is a bit of a worst case (it's rated for tiny 4kB blocks); in general you get 3-5x more since your blocks are bigger
[23:57] <cetex> i haven't planned very much there yet, this example is with 8TB hdd's, but since they're slow when you write a lot we'll most likely go for 6TB max.
[23:57] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[23:58] <mfa298_> cetex: if you're thinking 8TB SMR drives don't, really don't
[23:58] <Gorazd> lurbs: ok thx. I guess yes, when mounting CephFS you actually mount a pool: http://docs.ceph.com/docs/v0.80.5/man/8/cephfs/ . so the size is dependent on the size of the pool
[23:59] <cetex> mfa298_: yeah. i've done the math on 8TB (before i even investigated what drives would actually be useful) and it's pretty clear that we won't get those.
[23:59] * davidzlap (~Adium@2605:e000:1313:8003:e83b:414f:4015:1417) has joined #ceph
[23:59] <mfa298_> you may find 8x 1TB drives are much better than 1x 8T drives, lots more iops that way even if you have journals on the spinners

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.