#ceph IRC Log


IRC Log for 2016-08-11

Timestamps are in GMT/BST.

[0:00] * mhackett (~mhack@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[0:02] * lxxl (~oftc-webi@177.135.35.215.dynamic.adsl.gvt.net.br) Quit (Ping timeout: 480 seconds)
[0:07] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[0:14] * BrianA (~BrianA@fw-rw.shutterfly.com) Quit (Ping timeout: 480 seconds)
[0:17] * srk (~Siva@32.97.110.55) Quit (Ping timeout: 480 seconds)
[0:19] * cooey (~Xerati@178.162.205.28) has joined #ceph
[0:19] * BrianA (~BrianA@fw-rw.shutterfly.com) has joined #ceph
[0:20] * fsimonce (~simon@host203-44-dynamic.183-80-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[0:20] * omg_im_dead (8087646b@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[0:21] * omg_im_dead (8087646b@107.161.19.109) has joined #ceph
[0:23] * [0x4A6F]_ (~ident@p4FC26FC3.dip0.t-ipconnect.de) has joined #ceph
[0:24] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:24] * [0x4A6F]_ is now known as [0x4A6F]
[0:25] * rendar (~I@host118-131-dynamic.59-82-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[0:29] * Meths_ (~meths@95.151.244.141) has joined #ceph
[0:34] * Meths (~meths@95.151.244.156) Quit (Ping timeout: 480 seconds)
[0:35] * neurodrone (~neurodron@158.106.193.162) Quit (Ping timeout: 480 seconds)
[0:35] * neurodrone__ is now known as neurodrone
[0:41] * omg_im_dead (8087646b@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[0:41] * omg_im_dead (8087646b@107.161.19.109) has joined #ceph
[0:42] * lcurtis_ (~lcurtis@47.19.105.250) Quit (Remote host closed the connection)
[0:43] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[0:45] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[0:45] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[0:47] * vbellur (~vijay@71.234.224.255) has joined #ceph
[0:48] * linuxkidd (~linuxkidd@ip70-189-207-54.lv.lv.cox.net) has joined #ceph
[0:49] * cooey (~Xerati@178.162.205.28) Quit ()
[0:53] * hrast (~hrast@cpe-24-55-26-86.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[0:56] * BrianA (~BrianA@fw-rw.shutterfly.com) Quit (Ping timeout: 480 seconds)
[1:02] * BrianA (~BrianA@fw-rw.shutterfly.com) has joined #ceph
[1:06] * vbellur (~vijay@71.234.224.255) Quit (Remote host closed the connection)
[1:17] * KindOne (sillyfool@h156.225.28.71.dynamic.ip.windstream.net) has joined #ceph
[1:28] <omg_im_dead> does anyone know how to recover a cluster if all three monitors have corrupt fs / store.db ?
[1:29] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[1:34] * danieagle (~Daniel@187.75.21.173) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[1:34] * analbeard (~shw@host86-142-132-208.range86-142.btcentralplus.com) Quit (Quit: Leaving.)
[1:41] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[1:45] * oms101 (~oms101@p20030057EA01F800C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:54] * oms101 (~oms101@p20030057EA01D600C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[2:00] * BrianA (~BrianA@fw-rw.shutterfly.com) has left #ceph
[2:01] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[2:04] <badone> omg_im_dead: out of interest, how could that possibly happen?
[2:04] * truan-wang (~truanwang@220.248.17.34) has joined #ceph
[2:09] * Scymex (~CoZmicShR@tor-exit.squirrel.theremailer.net) has joined #ceph
[2:15] * MACscr (~Adium@c-73-9-230-5.hsd1.il.comcast.net) Quit (Quit: Leaving.)
[2:15] <omg_im_dead> a power failure in our dc badone
[2:16] <vasu> but how do you know its corrupt after power failure?
[2:17] <omg_im_dead> i mean.. I can't be certain, but I'm getting this in the ceph-mon log:
[2:17] <omg_im_dead> error opening mon data directory at '/var/lib/ceph/mon/ceph-kh08-8': (22) Invalid argument
[2:17] <omg_im_dead> and then when I try to take a look:
[2:17] * MACscr_ (~MACscr@c-73-9-230-5.hsd1.il.comcast.net) has joined #ceph
[2:18] <omg_im_dead> 2016-08-10 19:17:51.163476 7f2bfd22d8c0 0 ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403), process ceph-mon, pid 21819
[2:18] <omg_im_dead> Corruption: error in middle of record
[2:18] <omg_im_dead> 2016-08-10 19:17:55.628950 7f2bfd22d8c0 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-kh08-8': (22) Invalid argument
[2:18] <omg_im_dead> so that to me looks like the database is corrupt
[2:18] <vasu> on all mon nodes?
[2:18] <omg_im_dead> on all of em. Different error though
[2:18] <omg_im_dead> we had a ground fault in our data center and fs corruption across all of our 1us. The odd thing is that we have write-through mode so I don't know how this happened
[2:19] <omg_im_dead> well of the ceph 1us
[2:19] <omg_im_dead> so my monitor and gateways
[2:19] <omg_im_dead> monitors* and gateways*
[2:19] <vasu> how many mon nodes and how many osd's did you have?
[2:19] <omg_im_dead> but yes, vasu, it's happening on all of the monitors
[2:19] <omg_im_dead> 3 mons 630 osds
[2:19] <vasu> all of them went through power failure at the same time?
[2:19] * wushudoin (~wushudoin@38.140.108.2) Quit (Ping timeout: 480 seconds)
[2:20] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[2:20] <omg_im_dead> ground fault at our datacenter which knocked out 1/3rd of the rack including all 3 monitor nodes
[2:20] <omg_im_dead> here is the failure of the 2nd monitor:
[2:20] <omg_im_dead> 2016-08-10 19:17:51.150928 7fae2c7628c0 0 ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403), process ceph-mon, pid 36715
[2:20] <omg_im_dead> Corruption: 17 missing files; e.g.: /var/lib/ceph/mon/ceph-kh09-8/store.db/10845998.ldb
[2:20] <omg_im_dead> 2016-08-10 19:17:57.470679 7fae2c7628c0 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-kh09-8': (22) Invalid argument
[2:21] <omg_im_dead> and 3rd
[2:21] <omg_im_dead> 2016-08-10 19:17:51.164710 7f20683088c0 0 ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403), process ceph-mon, pid 27786
[2:21] <omg_im_dead> Corruption: 5 missing files; e.g.: /var/lib/ceph/mon/ceph-kh10-8/store.db/10882319.ldb
[2:21] <omg_im_dead> 2016-08-10 19:17:57.411920 7f20683088c0 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-kh10-8': (22) Invalid argument
[2:21] <omg_im_dead> so i think the leveldb is corrupt on all 3
[2:22] <omg_im_dead> i installed plyvel on all of them and I am trying to recover at least one of the monitors now if possible
[2:22] <vasu> you should raise an issue on tracker.ceph.com; i guess you have to upload the mon logs as well
[2:22] <omg_im_dead> vasu: will that help? Technically this isn't a bug right? Just an unfortunate circumstance
[2:23] <vasu> both i would say :)
[2:24] <vasu> do you have ssd's or normal drives?
[2:25] <vasu> the /var/lib/ceph/mon
[2:26] <omg_im_dead> normal drives
[2:26] <vasu> hmmm
[2:26] <omg_im_dead> well sas spinning rust
[2:27] * xarses (~xarses@64.124.158.32) Quit (Ping timeout: 480 seconds)
[2:29] * KindOne (sillyfool@0001a7db.user.oftc.net) Quit (Remote host closed the connection)
[2:31] <infernix> anyone running ssd cache tier with many hitsets (and accompanying high min recency) and short periods?
[2:31] * MACscr_ (~MACscr@c-73-9-230-5.hsd1.il.comcast.net) Quit (Quit: Textual IRC Client: www.textualapp.com)
[2:32] * KindOne (sillyfool@h156.225.28.71.dynamic.ip.windstream.net) has joined #ceph
[2:36] * omg_im_dead (8087646b@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[2:36] * georgem (~Adium@76-10-180-154.dsl.teksavvy.com) has joined #ceph
[2:37] * omg_im_dead (8087646b@107.161.19.109) has joined #ceph
[2:39] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[2:39] * Scymex (~CoZmicShR@5AEAAAWNQ.tor-irc.dnsbl.oftc.net) Quit ()
[2:41] * georgem (~Adium@76-10-180-154.dsl.teksavvy.com) Quit ()
[2:41] * georgem (~Adium@206.108.127.16) has joined #ceph
[2:41] * chunmei (~chunmei@134.134.139.83) Quit (Remote host closed the connection)
[2:44] * MACscr_ (~MACscr@c-73-9-230-5.hsd1.il.comcast.net) has joined #ceph
[2:48] * Miho (~Chrissi_@2.tor.exit.babylon.network) has joined #ceph
[3:03] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[3:11] <omg_im_dead> is there any way to pull/piece together the information I would need to rebuild a monitor from the osd nodes?
[3:11] <omg_im_dead> or with 3/3 corrupt monitors am I basically screwed and have to delete almost 1PB of data?
[3:13] * devster (~devsterkn@2001:41d0:1:a3af::1) Quit (Quit: cya)
[3:16] * derjohn_mobi (~aj@x590cb994.dyn.telefonica.de) has joined #ceph
[3:17] <badone> omg_im_dead: did you try to fix the file systems?
[3:17] <omg_im_dead> unfortunately
[3:17] <omg_im_dead> the servers were not booting otherwise
[3:17] <omg_im_dead> they are ext4
[3:17] <omg_im_dead> the monitors are
[3:17] <omg_im_dead> the OSDs are xfs
[3:18] * Miho (~Chrissi_@5AEAAAWOI.tor-irc.dnsbl.oftc.net) Quit ()
[3:21] <omg_im_dead> with gmane down anyone know an alternative to search through the mailing lists?
[3:22] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[3:23] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[3:23] * derjohn_mob (~aj@x590e2d72.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[3:28] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[3:30] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[3:30] * penguinRaider (~KiKo@146.185.31.226) Quit (Ping timeout: 480 seconds)
[3:33] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:36] * Skyrider (~thundercl@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[3:36] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[3:36] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:37] <omg_im_dead> I found this "http://www.spinics.net/lists/ceph-devel/msg06662.html" which says that all of the data is in the rest of the cluster
[3:38] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Read error: Connection reset by peer)
[3:38] <omg_im_dead> but that recovery is tedious. I am okay with tedious if it is at all possible.
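
For anyone hitting the same failure: later Ceph releases document rebuilding the monitor store from the cluster maps held by the OSDs. A minimal single-host sketch follows; the tooling (ceph-objectstore-tool --op update-mon-db, ceph-monstore-tool rebuild) is not available in hammer 0.94.x, the OSDs on the host must be stopped, and the paths and keyring location are placeholder assumptions.

    ms=/tmp/mon-store
    mkdir -p "$ms"
    # harvest osdmaps, crush maps etc. from every (stopped) OSD on this host
    for osd in /var/lib/ceph/osd/ceph-*; do
        ceph-objectstore-tool --data-path "$osd" --op update-mon-db --mon-store-path "$ms"
    done
    # rebuild a fresh store.db from the harvested maps, then copy it over the
    # broken store.db of one monitor and re-form the quorum from there
    ceph-monstore-tool "$ms" rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring
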
[3:38] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:38] * penguinRaider (~KiKo@146.185.31.226) has joined #ceph
[3:41] * sebastian-w_ (~quassel@212.218.8.139) Quit (Remote host closed the connection)
[3:41] * sebastian-w (~quassel@212.218.8.138) has joined #ceph
[3:47] * joshd1 (~jdurgin@2602:30a:c089:2b0:e8be:edb:d7be:db8e) Quit (Quit: Leaving.)
[3:55] * omg_im_dead (8087646b@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[3:56] * omg_im_dead (8087646b@107.161.19.109) has joined #ceph
[4:00] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) Quit (Read error: Connection reset by peer)
[4:03] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[4:06] * Skyrider (~thundercl@61TAAA8UM.tor-irc.dnsbl.oftc.net) Quit ()
[4:19] * jfaj_ (~jan@p20030084AF3449005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) has joined #ceph
[4:23] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[4:23] * srk (~Siva@2605:6000:ed04:ce00:2dc5:1278:78ca:a08a) has joined #ceph
[4:24] * kefu (~kefu@114.92.96.253) has joined #ceph
[4:24] * FNugget (~ain@tor2r.ins.tor.net.eu.org) has joined #ceph
[4:24] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[4:25] * jfaj (~jan@p4FE4FF94.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[4:29] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[4:32] * Long_yanG (~long@15255.s.t4vps.eu) has joined #ceph
[4:34] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:35] * srk (~Siva@2605:6000:ed04:ce00:2dc5:1278:78ca:a08a) Quit (Ping timeout: 480 seconds)
[4:36] * LongyanG (~long@15255.s.t4vps.eu) Quit (Ping timeout: 480 seconds)
[4:40] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) has joined #ceph
[4:40] * kefu_ (~kefu@114.92.96.253) has joined #ceph
[4:41] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Remote host closed the connection)
[4:42] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[4:43] * kefu (~kefu@114.92.96.253) Quit (Read error: No route to host)
[4:43] * kefu (~kefu@114.92.96.253) has joined #ceph
[4:48] * Racpatel (~Racpatel@2601:87:0:24af::53d5) Quit (Ping timeout: 480 seconds)
[4:49] * kefu_ (~kefu@114.92.96.253) Quit (Ping timeout: 480 seconds)
[4:50] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[4:54] * FNugget (~ain@26XAAAZB7.tor-irc.dnsbl.oftc.net) Quit ()
[5:07] * truan-wang_ (~truanwang@220.248.17.34) has joined #ceph
[5:08] * jarrpa (~jarrpa@2602:3f:e183:a600:eab1:fcff:fe47:f680) has joined #ceph
[5:13] * truan-wang (~truanwang@220.248.17.34) Quit (Ping timeout: 480 seconds)
[5:15] * truan-wang_ (~truanwang@220.248.17.34) Quit (Remote host closed the connection)
[5:16] * omg_im_dead (8087646b@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[5:24] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[5:24] * kefu (~kefu@114.92.96.253) Quit (Max SendQ exceeded)
[5:24] * kefu (~kefu@114.92.96.253) has joined #ceph
[5:32] * Jeffrey4l__ (~Jeffrey@110.252.58.4) Quit (Ping timeout: 480 seconds)
[5:34] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[5:34] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[5:36] * Vacuum__ (~Vacuum@88.130.203.172) has joined #ceph
[5:43] * vimal (~vikumar@114.143.165.8) has joined #ceph
[5:43] * Vacuum_ (~Vacuum@88.130.223.234) Quit (Ping timeout: 480 seconds)
[5:46] * dnunez (~dnunez@209-6-91-147.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com) has joined #ceph
[5:53] * davidzlap (~Adium@2605:e000:1313:8003:49a:34df:fe9e:be17) Quit (Quit: Leaving.)
[6:00] * AGaW (~Bored@177.154.139.199) has joined #ceph
[6:01] * walcubi_ (~walcubi@p5795BF9D.dip0.t-ipconnect.de) has joined #ceph
[6:04] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[6:06] * Jeffrey4l (~Jeffrey@119.251.252.117) has joined #ceph
[6:07] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[6:07] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[6:08] * vimal (~vikumar@114.143.165.8) Quit (Quit: Leaving)
[6:08] * walcubi (~walcubi@p5797A25F.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:11] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit ()
[6:30] * AGaW (~Bored@177.154.139.199) Quit ()
[6:32] * vimal (~vikumar@121.244.87.116) has joined #ceph
[6:37] * omg_im_dead (180e4376@107.161.19.109) has joined #ceph
[6:41] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:46] * kefu is now known as kefu|afk
[6:46] * yanzheng (~zhyan@125.70.22.133) has joined #ceph
[6:47] * toastyde1th (~toast@pool-71-255-253-39.washdc.fios.verizon.net) has joined #ceph
[6:48] * mrapple (~HoboPickl@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[6:53] * dnunez (~dnunez@209-6-91-147.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com) Quit (Quit: Leaving)
[6:53] * TomasCZ (~TomasCZ@yes.tenlab.net) Quit (Quit: Leaving)
[6:53] * toastydeath (~toast@pool-71-255-253-39.washdc.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[6:55] * theTrav (~theTrav@203.35.9.142) has joined #ceph
[7:00] * penguinRaider (~KiKo@146.185.31.226) Quit (Ping timeout: 480 seconds)
[7:00] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[7:00] * kefu|afk is now known as kefu
[7:08] * masber (~masber@129.94.15.152) has joined #ceph
[7:08] <masber> hi
[7:09] <masber> was wondering how ceph performs with small files like 20-30kbits compared to glusterFS
[7:15] * chengpeng__ (~chengpeng@180.168.170.2) has joined #ceph
[7:16] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[7:18] * mrapple (~HoboPickl@5AEAAAWSO.tor-irc.dnsbl.oftc.net) Quit ()
[7:19] * penguinRaider (~KiKo@146.185.31.226) has joined #ceph
[7:22] * chengpeng_ (~chengpeng@180.168.170.2) Quit (Ping timeout: 480 seconds)
[7:26] <badone> masber: there is documentation like this around, http://ceph.com/community/ceph-performance-part-1-disk-controller-write-throughput/
[7:27] <badone> masber: a little dated but a place to start research
[7:27] <masber> badone: thank you!
[7:27] * theTrav_ (~theTrav@ipc032.ipc.telstra.net) has joined #ceph
[7:28] <masber> I will have a look into it
[7:29] <badone> masber: np, there's this for an extreme example as well, http://sssslide.com/www.slideshare.net/Inktank_Ceph/accelerating-cassandra-workloads-on-ceph-with-allflash-pcie-ssds
[7:29] <badone> masber: speed depends on many variables
[7:29] <badone> one being $ spent
[7:31] <masber> what I currently have is 5x nodes with 10x 2TB ssd nvme drives, 28x intel cores and 512GB RAM
[7:31] * rdas (~rdas@121.244.87.116) has joined #ceph
[7:32] <masber> my problem is how to make it cost efficiently
[7:32] <masber> I am currently using Panasas and I want to know if using ceph would be more cost efficiently
[7:33] * theTrav (~theTrav@203.35.9.142) Quit (Read error: Connection timed out)
[7:33] <masber> *cost effective
[7:41] <badone> masber: I know nothing about panasas :)
[7:42] <masber> Panasas is an appliance
[7:42] <badone> yes, one which I know nothing about :)
[7:42] <masber> basically each shelf costs $100000 and gives less than 15000 IOPs
[7:43] * badone nods
[7:44] <masber> so I want to see if I can improve performance using supermicro hardware and either cephFS or glusterFS
[7:44] <masber> and make it more cost efficiently
[7:45] <badone> masber: this outlines the difficulties faced in creating off-the-shelf performance figures, https://www.mail-archive.com/ceph-users@lists.ceph.com/msg31207.html
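
If you want baseline small-object numbers out of your own hardware rather than published figures, rados bench is the quickest tool to reach for; a rough sketch, where the pool name, object size and runtime are arbitrary examples:

    ceph osd pool create benchtest 128 128
    # 60 s of 4 KB writes with 16 concurrent ops; keep the objects for the read pass
    rados bench -p benchtest 60 write -b 4096 -t 16 --no-cleanup
    rados bench -p benchtest 60 rand -t 16
    rados -p benchtest cleanup
    ceph osd pool delete benchtest benchtest --yes-i-really-really-mean-it
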
[7:45] <badone> masber: I know gluster had issues with small file perf but I think they have done some work in that area recently
[7:46] <masber> hum, I'm also in contact with them on the irc channel
[7:46] <badone> masber: if you need a network/distributed file system in production atm then glusterfs is far more mature than cephfs
[7:46] <masber> thank you for your 2 links I will go through them
[7:47] <badone> masber: ceph object and block storage is mature, cephfs still has a bit to go
[7:47] <badone> np
[7:47] <masber> yeah cephfs only became production ready a few days ago
[7:47] <masber> I am also considering about alluxio
[7:48] <masber> too many exciting things happening now on distributed storage
[7:50] * haomaiwang (~oftc-webi@114.242.248.120) has joined #ceph
[7:50] <badone> masber: yes, it is a fast moving area :)
[7:50] <haomaiwang> kefu: sorry for the missing comment, I always only look at github email notifications, so it seems I lost your later reply on the existing comment.
[7:52] * theTrav_ (~theTrav@ipc032.ipc.telstra.net) Quit (Remote host closed the connection)
[7:54] * Averad (~cooey@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[7:59] * efirs (~firs@31.173.240.8) has joined #ceph
[7:59] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[7:59] <Be-El> hi
[8:02] * swami1 (~swami@49.38.1.248) has joined #ceph
[8:02] * haomaiwang (~oftc-webi@114.242.248.120) Quit (Ping timeout: 480 seconds)
[8:14] * Miouge (~Miouge@109.128.94.173) has joined #ceph
[8:17] * karnan (~karnan@121.244.87.117) has joined #ceph
[8:19] * swami1 (~swami@49.38.1.248) Quit (Read error: Connection timed out)
[8:22] * flisky (~Thunderbi@210.12.157.90) has joined #ceph
[8:23] * swami1 (~swami@49.38.1.248) has joined #ceph
[8:24] * Averad (~cooey@61TAAA8Z1.tor-irc.dnsbl.oftc.net) Quit ()
[8:24] * clarjon1 (~dicko@173.244.223.122) has joined #ceph
[8:29] * saintpablo (~saintpabl@gw01.mhitp.dk) has joined #ceph
[8:30] * saintpablo (~saintpabl@gw01.mhitp.dk) Quit ()
[8:39] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[8:39] * swami1 (~swami@49.38.1.248) Quit (Read error: Connection timed out)
[8:41] * swami1 (~swami@49.38.1.248) has joined #ceph
[8:42] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[8:44] * flisky1 (~Thunderbi@210.12.157.90) has joined #ceph
[8:44] * flisky1 (~Thunderbi@210.12.157.90) Quit ()
[8:46] * branto (~branto@nat-pool-brq-t.redhat.com) has joined #ceph
[8:48] * rendar (~I@host50-34-dynamic.25-79-r.retail.telecomitalia.it) has joined #ceph
[8:49] * flisky (~Thunderbi@210.12.157.90) Quit (Ping timeout: 480 seconds)
[8:50] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) has joined #ceph
[8:54] * clarjon1 (~dicko@173.244.223.122) Quit ()
[8:55] * Jeffrey4l (~Jeffrey@119.251.252.117) Quit (Quit: Leaving)
[8:55] * Jeffrey4l (~Jeffrey@119.251.252.117) has joined #ceph
[8:58] * swami1 (~swami@49.38.1.248) Quit (Read error: Connection timed out)
[8:59] * ade (~abradshaw@tmo-108-29.customers.d1-online.com) has joined #ceph
[9:00] * swami1 (~swami@49.38.1.248) has joined #ceph
[9:06] * x4x5 (~oftc-webi@107-192-81-136.lightspeed.nsvltn.sbcglobal.net) has joined #ceph
[9:06] * x4x5 (~oftc-webi@107-192-81-136.lightspeed.nsvltn.sbcglobal.net) Quit ()
[9:14] * garphy is now known as garphy`aw
[9:15] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[9:24] * fsimonce (~simon@host203-44-dynamic.183-80-r.retail.telecomitalia.it) has joined #ceph
[9:34] * toastyde1th (~toast@pool-71-255-253-39.washdc.fios.verizon.net) Quit (Read error: Connection reset by peer)
[9:35] * omg_im_dead (180e4376@107.161.19.109) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[9:38] * rakeshgm (~rakesh@106.51.29.33) has joined #ceph
[9:41] * toast-work (~arabassa@pool-71-255-253-39.washdc.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[9:42] * analbeard (~shw@support.memset.com) has joined #ceph
[9:42] * art_yo (~kvirc@149.126.169.197) has joined #ceph
[9:43] * rakeshgm (~rakesh@106.51.29.33) Quit (Quit: Leaving)
[9:44] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) has joined #ceph
[9:48] * AGaW (~loft@162.216.46.57) has joined #ceph
[9:49] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) Quit (Quit: treenerd_)
[9:51] <sep> when running a cephfs on a replicated pool: the pool has 33TB free space, but since its size is 3 it's really just 11TB. is it possible to have df -h show 11TB instead of 33TB on the client mounting the cephfs?
[9:53] * swami2 (~swami@49.44.57.242) has joined #ceph
[9:54] <Be-El> sep: no, since cephfs does not know about other pools on ceph that may have different replication factors/ec setups
[9:54] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[9:55] <sep> true, but that would in the best case just make the 11TB too optimistic. 33TB is guaranteed to be 3x too much; having other pools in addition would make 11TB more or less wrong as well, but 11TB is more accurate than 33TB
[9:55] <Be-El> and i'm not sure whether the cephfs client even takes care about the pool setup
[9:56] <sep> if the cephfs client is not aware whether it's a replicated or EC pool it lives on, then it's more difficult, yes
[9:57] <Be-El> the ec pool case is even more complicated, since cephfs cannot use ec pool directly and needs a cache layer as data pool (that in turn may use a ec pool as backend)
[9:58] <sep> but if it knows it's replicated, and knows the size, then df -h = space/size should be easy
[9:58] <Be-El> the information that a pool is a cache pool is also part of the pool table, so in that case getting the correct available space is a matter of redirection
[9:58] <Be-El> for simple replicated pools it should be easy, yes
[9:59] <sep> if it was on an EC pool one could ignore the cache and just do the math on the EC pool
[9:59] <Be-El> but i'm not sure how accurate the value will be in case of an uneven crush rule setup, e.g. distribution about racks with different available storage
[10:00] * swami1 (~swami@49.38.1.248) Quit (Ping timeout: 480 seconds)
[10:00] <Be-El> i agree that the current way is not optimal, but getting a correct value might be impossible
[10:01] * madkiss2 (~madkiss@2a02:8109:8680:2000:69de:61b5:bbb:5f61) has joined #ceph
[10:02] * kaisan_ is now known as kaisan
[10:05] * madkiss (~madkiss@2a02:8109:8680:14f2:9568:de2d:adfd:9e9d) Quit (Ping timeout: 480 seconds)
[10:06] * codice (~toodles@75-128-34-237.static.mtpk.ca.charter.com) Quit (Ping timeout: 480 seconds)
[10:06] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[10:08] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:11] * jayjay (~jayjay@2a00:f10:121:400:444:3cff:fe00:4bc) Quit (Quit: WeeChat 0.4.2)
[10:13] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[10:14] <sep> Be-El, that's a pity. perhaps it's possible to hide the number in nfs and samba shares, since no answer is better than a very wrong answer
[10:15] <Be-El> what about the case of multiple pools with different replication factors?
[10:16] <sep> cephfs only needs to deal with its own pool's replication or (m+k) factor, since data moved into other pools will reduce the space available on the cluster. the total free space of cephfs will be correct.
[10:17] <Be-El> you can assign a different data pool to different directories, e.g. a scratch directory may use a pool with a replication factor of two or even one.
[10:18] * derjohn_mobi (~aj@x590cb994.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[10:18] * KindOne_ (sillyfool@h125.161.186.173.dynamic.ip.windstream.net) has joined #ceph
[10:18] * AGaW (~loft@162.216.46.57) Quit ()
[10:18] <sep> didn't know that. have only used one single cephfs mount
[10:19] <sep> does indeed complicate things. but dividing by the highest size will be most correct: if one writes to the scratch area, one will fill slower than expected, and that's a good thing and does not lead to out of space scenarios
[10:19] <Be-El> http://docs.ceph.com/docs/master/cephfs/file-layouts/
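
To make the per-directory data pool idea concrete: file layouts are set via extended attributes on a directory, and new files created below it follow that layout. A sketch, with the pool name and mount point as placeholders (on hammer the pool is added with "ceph mds add_data_pool", newer releases use "ceph fs add_data_pool <fs> <pool>"):

    ceph osd pool create scratch_data 128 128
    ceph osd pool set scratch_data size 2
    ceph mds add_data_pool scratch_data
    # everything created under this directory from now on lands in scratch_data
    setfattr -n ceph.dir.layout.pool -v scratch_data /mnt/cephfs/scratch
    getfattr -n ceph.dir.layout /mnt/cephfs/scratch
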
[10:21] <Be-El> i remember seeing a more in-depth discussion about this on the mailing list or in a ticket
[10:21] * KindOne (sillyfool@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[10:21] * KindOne_ is now known as KindOne
[10:22] <boolman> seems like my cluster doesnt honor the osd_scrub_end_hour config param
[10:22] <boolman> 2016-08-11 08:20:17.789149 osd.33 [INF] 1.1e5 deep-scrub starts
[10:23] <boolman> osd_scrub_start_hour = 0 ; osd_scrub_end_hour = 6
[10:23] <sep> personally i would also want to deduct 20%, since that's when one runs into near full situations. mounted shares in windows showing 3X + nearfull overhead of wrong space is less than useful, at least
[10:24] <sep> hum, actually for samba there is a dfree command that can accomplish this.
[10:24] <sep> does not work for nfs or sshfs, but samba users are mostly the ones it's difficult to explain the concept to
[10:24] <sep> and why they cannot put 10TB of data on a share that has 12TB free.
[10:27] <doppelgrau> sep: other use case where a correct df is hard to compute: first coppy on ssd, other two copies on hdd but in two different racks
[10:29] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[10:30] <sep> i agree that a 100% correct df is hard. but i still claim that free space / replication would be a lot closer to the truth than showing free raw space.
[10:32] <doppelgrau> sep: it's a question whether a correct, reliable value that is hard to interpret is better, or an unreliable value that is more intuitive but in many cases misleading
[10:36] <sep> doppelgrau, indeed. the problem occurs when the "technically correct" but hard to interpret value is visible to end users. samba dfree solves it in this case.
[10:36] <IcePic> not being able to write a 10TB file on a 12TB filesystem is a common problem in lots of scenarios, ranging from out-of-inodes to the underlying backing store lied due to thin provisioning so we didnt get all blocks promised to us
[10:37] <IcePic> and the reverse is true of course, for filesystems with dedup/compression and so on, which allow you to write far more than df claims.
[10:37] <IcePic> so most of the time, programs just start writing and see if it works.
[10:38] <IcePic> by the time "cp" is near the end of a 12TB copy, someone might have added disks or removed other files. Who knows..
[10:38] <sep> IcePic, true and true. but both thin provisioning and deduplication give some % error margin. cephfs currently gives an X-times-size plus overhead error margin
[10:38] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[10:39] <sep> i completely understand that it's a problem. and perhaps solving it with samba dfree is the "correct" solution.
[10:39] <IcePic> I can see the point in free/replication, but there are a ton of other factors working both ways
[10:39] <sep> wonder if nfs has something similar.
[10:40] <IcePic> I remember AFS clients always reporting max_file_size-1 byte which of course differs for platforms
[10:40] * haomaiwang (~oftc-webi@114.242.248.120) has joined #ceph
[10:40] * shaunm (~shaunm@nat-eduroam-02.scc.kit.edu) has joined #ceph
[10:40] <IcePic> but since there was theoretically "infinite space" if the AFS admin was faster in buying than you in filling, it was as true as any other number. And quotas will hit you first in most cases anyhow.
[10:41] <sep> IcePic, absolutely, getting it 100% right would be very hard and tricky. but going from 3X+ wrong to some % wrong is still a huge improvement.
[10:41] * codice (~toodles@75-128-34-237.static.mtpk.ca.charter.com) has joined #ceph
[10:42] <IcePic> yeah, just saying that programs got fooled so many times, they just stop caring. ;)
[10:42] <IcePic> for humans, it might matter more
[10:42] * thomnico (~thomnico@2a01:e35:8b41:120:d491:7abc:d665:4d84) has joined #ceph
[10:43] <WildyLion> Could anyone take a look at this bug: http://tracker.ceph.com/issues/16981 - we've been seeing this behaviour on some OSDs lately (relevant: jewel, bluestore, bad alloc)
[10:43] <sep> IcePic, absolutely a human problem
[10:43] <IcePic> sep: if you keep on writing 1M-sized files, you would be able to notice how 3M goes away for each file written, and as you near the limit, the df would get more and more true as it fills up, until it ends up being 11TB filled and <1M bytes free
[10:44] <IcePic> so as time goes, it will become more true.
[10:44] <sep> although it's solvable with samba dfree, i'd not always want to have that layer between ceph and users
[10:44] * valeech_ (~valeech@pool-96-247-203-33.clppva.fios.verizon.net) has joined #ceph
[10:44] <IcePic> so the solution is to grow the ceph faster than the users write to it. =)
[10:44] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:4d98:7dea:2462:19d7) has joined #ceph
[10:45] <IcePic> (ie, throw money and tech at the human issue =)
[10:46] <sep> IcePic, the money usually comes from the human problem. :)
[10:46] * shyu (~Frank@218.241.172.114) Quit (Read error: Connection reset by peer)
[10:46] * valeech (~valeech@pool-96-247-203-33.clppva.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[10:46] * valeech_ is now known as valeech
[10:50] * TomyLobo (~aleksag@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[10:52] * ade (~abradshaw@tmo-108-29.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[10:53] * derjohn_mobi (~aj@2001:6f8:1337:0:30f3:25a4:f8bd:7c69) has joined #ceph
[10:54] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[11:07] <haomaiwang> kefu: could you give a look today?
[11:07] <kefu> haomaiwang, sorry, i have a 2.0 issue in my hand.
[11:08] <haomaiwang> kefu: ok...
[11:11] * kyson (~kyson@116.1.3.200) has joined #ceph
[11:14] * kyson (~kyson@116.1.3.200) Quit ()
[11:15] * kyson (~kyson@116.1.3.200) has joined #ceph
[11:16] * kyson (~kyson@116.1.3.200) Quit ()
[11:17] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[11:19] * TomyLobo (~aleksag@26XAAAZJM.tor-irc.dnsbl.oftc.net) Quit ()
[11:26] * chengpeng_ (~chengpeng@180.168.170.2) has joined #ceph
[11:33] * chengpeng__ (~chengpeng@180.168.170.2) Quit (Ping timeout: 480 seconds)
[11:36] * haomaiwang (~oftc-webi@114.242.248.120) Quit (Ping timeout: 480 seconds)
[11:38] <sep> samba dfree worked wonders, went from 33TB free to 8TB, and that's approximately correct in this case
[11:40] * thomnico (~thomnico@2a01:e35:8b41:120:d491:7abc:d665:4d84) Quit (Quit: Ex-Chat)
[11:42] <sep> 2 lines in the script it calls, and it grabs the correct size of my pool; samba caches it for some minutes so it does not run too often: size=$(sudo /usr/bin/ceph osd pool get cephfs_data size | awk '{print $2}') ; df -k $1 | tail -1 | awk "{print int(\$2/$size),int(\$4/$size)}"
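
For completeness, roughly how that hooks into samba: the dfree script is called with the share path as its argument and must print total and free space in 1 KiB blocks. The script path, pool name and cache time below are assumptions mirroring sep's two-liner.

    #!/bin/sh
    # /usr/local/bin/ceph-dfree -- divide the raw df numbers by the pool's replica count
    size=$(sudo /usr/bin/ceph osd pool get cephfs_data size | awk '{print $2}')
    df -k "$1" | tail -1 | awk -v s="$size" '{print int($2/s), int($4/s)}'

    # smb.conf, in [global] or the share section:
    #     dfree command = /usr/local/bin/ceph-dfree
    #     dfree cache time = 60
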
[11:56] * garphy`aw is now known as garphy
[11:57] * penguinRaider (~KiKo@146.185.31.226) Quit (Ping timeout: 480 seconds)
[11:59] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[12:09] * thomnico (~thomnico@2a01:e35:8b41:120:d491:7abc:d665:4d84) has joined #ceph
[12:10] * penguinRaider (~KiKo@146.185.31.226) has joined #ceph
[12:29] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[12:41] * kefu is now known as kefu|afk
[12:46] <swami2> which malloc (tcmalloc or jemalloc) is used in the hammer release?
[12:46] <doppelgrau> swami2: default is NOT jemalloc => guess tcmalloc
[12:51] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[12:53] <swami2> doppelgrau: Thanks...
[12:53] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[12:55] * Nacer_ (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[12:55] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[12:56] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[12:57] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[12:57] * Nacer_ (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[13:00] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Remote host closed the connection)
[13:00] <doppelgrau> tcmaloc: http://ceph.com/planet/ceph-release-ram-used-by-tcmalloc/
[13:00] * Nacer (~Nacer@176.31.89.99) has joined #ceph
[13:00] * zeestrat (uid176159@id-176159.brockwell.irccloud.com) has joined #ceph
[13:00] <doppelgrau> and the ugly ceph tell osd.* heap release that is probably in some cron-scripts
[13:01] <doppelgrau> (in one of mine too)
[13:14] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[13:21] * wjw-freebsd (~wjw@vpn.ecoracks.nl) has joined #ceph
[13:29] * rakeshgm (~rakesh@106.51.29.33) has joined #ceph
[13:36] * hellertime (~Adium@a23-79-238-10.deploy.static.akamaitechnologies.com) has joined #ceph
[13:38] <hellertime> I have a pg that appears to have all its peer osds active and up, but in its recovery_state it has an osd that is currently down, and this is blocking it from being up??? why does it try to peer with this osd?
[13:39] * rakeshgm (~rakesh@106.51.29.33) Quit (Quit: Leaving)
[13:39] * garphy is now known as garphy`aw
[13:42] * georgem (~Adium@24.114.66.118) has joined #ceph
[13:42] * georgem (~Adium@24.114.66.118) Quit ()
[13:43] * georgem (~Adium@206.108.127.16) has joined #ceph
[13:43] * garphy`aw is now known as garphy
[13:45] * rakeshgm (~rakesh@106.51.29.33) has joined #ceph
[13:46] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) has joined #ceph
[13:47] <hellertime> here is the pg query output if you're interested: http://pastebin.com/Rg2hK9GE
[13:50] * georgem1 (~Adium@24.114.66.118) has joined #ceph
[13:51] * georgem1 (~Adium@24.114.66.118) Quit ()
[13:51] * georgem1 (~Adium@206.108.127.16) has joined #ceph
[13:52] * efirs (~firs@31.173.240.8) Quit (Ping timeout: 480 seconds)
[13:52] * georgem (~Adium@206.108.127.16) Quit (Ping timeout: 480 seconds)
[13:54] * garphy is now known as garphy`aw
[13:56] * garphy`aw is now known as garphy
[13:57] * garphy is now known as garphy`aw
[13:57] * garphy`aw is now known as garphy
[14:01] * Racpatel (~Racpatel@2601:87:0:24af::53d5) has joined #ceph
[14:08] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[14:12] * wjw-freebsd (~wjw@vpn.ecoracks.nl) Quit (Ping timeout: 480 seconds)
[14:15] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[14:18] * ade (~abradshaw@tmo-110-223.customers.d1-online.com) has joined #ceph
[14:20] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[14:25] * ade (~abradshaw@tmo-110-223.customers.d1-online.com) Quit (Quit: Too sexy for his shirt)
[14:30] * swami2 (~swami@49.44.57.242) Quit (Read error: Connection reset by peer)
[14:30] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[14:32] * georgem1 (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[14:40] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) has joined #ceph
[14:42] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[14:47] <boolman> if I have a 6h window where I want to scrub OSDs, but I see a lot of scrubbing occur outside my window, any tip on forcing the scrubs to happen during this time? will it help to reduce the "osd scrub min interval"?
[14:47] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[14:54] * pdrakewe_ (~pdrakeweb@oh-76-5-108-60.dhcp.embarqhsd.net) has joined #ceph
[14:54] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[14:54] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:55] * swami1 (~swami@49.44.57.242) has joined #ceph
[14:56] * Randleman (~jesse@89.105.204.182) Quit (Quit: leaving)
[14:56] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[14:57] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[14:58] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[15:01] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[15:01] * swami1 (~swami@49.44.57.242) Quit (Read error: Connection reset by peer)
[15:02] <doppelgrau> boolman: perhaps your load is too high => scrubbing happens@max?
[15:03] <doppelgrau> boolman: or if you want to use deep scrubs, schedule it using a shell script
[15:03] <boolman> doppelgrau: yes I just found out my load is above 0.5, i'm about to increase that
[15:04] <boolman> doppelgrau: the deep-scrubs, aren't they scheduled in the window as well?
[15:04] <boolman> or is it just maximum interval?
[15:04] <doppelgrau> boolman: normal scrubs I let run always; deep scrubs I run with the script every 10 days, and I told ceph to deep scrub after 20 days, so I have some headroom if I want to skip one night of deep scrubs for faster recovery e.g.
[15:05] <doppelgrau> boolman: IIRC the rules are: "normal scrubs" run anytime between min and max if the load is below the limit; at max the normal scrub is started either way
[15:06] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) has joined #ceph
[15:06] <doppelgrau> boolman: and at a normal scrub it is checked whether the last deep scrub was too long ago
[15:06] <doppelgrau> so a normal scrub "triggers" deep-scrub
[15:07] <doppelgrau> (in my case that is usually not the case, since the manually triggered deep-scrub happened earlier)
[15:07] <boolman> ok, but if the normal scrubs trigger the deep-scrub, the window should work if I increase my load threshold
[15:11] * neurodrone (~neurodron@pool-100-35-225-168.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[15:11] <boolman> if its not working, im doing the cron like u, thanks for the help doppelgrau
[15:13] <doppelgrau> boolman: https://www.formann.de/2015/05/cronjob-to-enable-timed-deep-scrubbing-in-a-ceph-cluster/
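
For reference, the two pieces that usually go together here: the scrub window and load settings (on the releases I'm aware of the option is osd_scrub_begin_hour rather than osd_scrub_start_hour), and a cron-style deep-scrub kick along the lines of the linked script. The values and the "ceph pg dump" field numbers are examples and may differ between releases.

    # apply at runtime; put the same keys under [osd] in ceph.conf to survive restarts
    ceph tell osd.* injectargs '--osd_scrub_begin_hour 0 --osd_scrub_end_hour 6 --osd_scrub_load_threshold 2.0'

    # nightly cron: deep-scrub the 20 PGs with the oldest deep-scrub timestamp
    ceph pg dump 2>/dev/null \
        | awk '/^[0-9]+\.[0-9a-f]+/ {print $20, $21, $1}' \
        | sort \
        | head -n 20 \
        | awk '{print $3}' \
        | while read pg; do ceph pg deep-scrub "$pg"; done
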
[15:15] * hrast (~hrast@cpe-24-55-26-86.austin.res.rr.com) has joined #ceph
[15:20] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:4d98:7dea:2462:19d7) Quit (Ping timeout: 480 seconds)
[15:25] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[15:30] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) has joined #ceph
[15:31] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[15:33] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[15:33] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[15:34] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[15:36] * lxxl (~oftc-webi@localhost) has joined #ceph
[15:37] * yanzheng (~zhyan@125.70.22.133) Quit (Quit: This computer has gone to sleep)
[15:42] * dnunez (~dnunez@nat-pool-bos-u.redhat.com) has joined #ceph
[15:43] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[15:48] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[15:50] * thomnico (~thomnico@2a01:e35:8b41:120:d491:7abc:d665:4d84) Quit (Quit: Ex-Chat)
[15:50] * dnunez (~dnunez@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[15:57] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[15:57] * fenfen (~fenfen@mail.pbsnetwork.eu) has joined #ceph
[15:58] <fenfen> hi... is it better to have more buckets in radosgw or doesn't it matter?
[15:58] <fenfen> can we just pour lots and lots of objects into one bucket?
[16:00] <m0zes> shareding bucket indexes will be your friend with lots of objects in a single bucket.
[16:00] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[16:00] <m0zes> s/share/shar/
[16:01] <fenfen> so we should structure it hierarchical?
[16:01] <fenfen> within one bucket
[16:01] * dnunez (~dnunez@nat-pool-bos-t.redhat.com) has joined #ceph
[16:02] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[16:03] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[16:03] * branto (~branto@nat-pool-brq-t.redhat.com) Quit (Quit: Leaving.)
[16:03] <m0zes> so, by default radosgw needs to keep an index of objects for each bucket. that index is stored in 1 object per bucket by default. if you store more than about 20 million objects in a bucket, I'd either recommend splitting it into multiple buckets, or enabling sharding. lookups are slightly slower with sharding, but write and recovery performance is better.
[16:04] <fenfen> where is the documentation on enabling sharding located?
[16:04] <fenfen> i can't seem to find it..
[16:04] * xarses (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:05] <m0zes> I'd point you at a howto for sharding, but I've not played enough with it in ceph jewel to get it working. this used to work, not sure if it's changed: https://access.redhat.com/documentation/en/red-hat-ceph-storage/version-1.3/red-hat-ceph-storage-13-ceph-object-gateway-for-rhel-x86-64/chapter-7-configure-bucket-sharding
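
The knob behind that howto is the gateway-side shard count applied to newly created buckets; a sketch, with the client section name and shard count as arbitrary examples (changing it does not reshard buckets that already exist):

    # ceph.conf on the radosgw host(s)
    [client.radosgw.gateway]
        rgw override bucket index max shards = 16
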
[16:06] <fenfen> thanks!
[16:07] * Nacer (~Nacer@176.31.89.99) Quit (Remote host closed the connection)
[16:08] <fenfen> do you by any chance know if this can be activated later or is it required to be activated prior to filling up the bucket
[16:08] <fenfen> ?
[16:09] * ZombieL (~Guest1390@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[16:10] * bandrus (~brian@67.sub-70-211-74.myvzw.com) has joined #ceph
[16:10] * vbellur (~vijay@nat-pool-bos-t.redhat.com) has joined #ceph
[16:11] <m0zes> I honestly have no idea.
[16:12] * wschulze (~wschulze@cpe-72-225-192-123.nyc.res.rr.com) has joined #ceph
[16:12] <art_yo> Hi guys. I can't figure it out. When I copy a lot of files, it suddenly stops after some time (without any errors) and ceph -s shows HEALTH_WARN (1 near full osd(s)). In fact one of the OSDs really is full, but my question is: is it normal that it just stops without errors? And why are the OSDs filled differently?
[16:13] <fenfen> m0zes: thanks anyway - it was still very helpful
[16:14] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:17] * pdrakewe_ (~pdrakeweb@oh-76-5-108-60.dhcp.embarqhsd.net) Quit (Read error: Connection reset by peer)
[16:18] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[16:18] <m0zes> art_yo: the crush algorithm values speed of mapping objects over fairness with regard to the fullness of the disks. you probably don't have enough pgs for your pool, or the weights of your disks are wrong, or one disk is significantly smaller than the rest.
[16:19] <doppelgrau> art_yo: yes, ceph simply blocks IO till the problem resolves (an admin could add more disks e.g.) and an uneven distribution can have different reasons: wrong weights, exotic crush rules or simply a bit of bad luck
[16:19] <art_yo> And there is another issue:
[16:19] <m0zes> art_yo: pastebin 'ceph osd tree' and 'ceph osd pool ls detail'
[16:20] <art_yo> one sec
[16:20] <doppelgrau> the distribution of PGs and objects happens with some probabilistic behavior, so there is always some variance
[16:21] <art_yo> http://pastebin.com/5m2GMM69
[16:22] <doppelgrau> art_yo: default size=3 replicated pool?
[16:23] <art_yo> Here is ceph.conf: http://pastebin.com/2NKG6nnp
[16:23] <doppelgrau> in that case, problem is that all nodes have all data and with uneven capacity/weight...
[16:23] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[16:26] <m0zes> I'd probably change pg num to 512 (which will help a little with balancing) and look at spinning up at least one more server, as ceph doesn't like being above about 70-75% full.
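
A rough sketch of the usual first aid for that situation; the pool name is a placeholder and pg_num can only ever be raised, never lowered:

    ceph osd df                             # per-OSD utilisation and weights
    ceph osd pool set rbd pg_num 512
    ceph osd pool set rbd pgp_num 512       # pgp_num must follow pg_num before data actually rebalances
    ceph osd reweight-by-utilization 120    # gently lower the weight of OSDs above 120% of average utilisation
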
[16:30] * bandrus (~brian@67.sub-70-211-74.myvzw.com) Quit (Ping timeout: 480 seconds)
[16:30] * Miouge (~Miouge@109.128.94.173) Quit (Quit: Miouge)
[16:31] * bandrus (~brian@67.sub-70-211-74.myvzw.com) has joined #ceph
[16:31] * Miouge (~Miouge@109.128.94.173) has joined #ceph
[16:37] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Remote host closed the connection)
[16:38] * xarses (~xarses@64.124.158.32) has joined #ceph
[16:38] * vbellur (~vijay@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[16:38] * Miouge (~Miouge@109.128.94.173) Quit (Quit: Miouge)
[16:39] <lxxl> Hi all, do we need to keep an updated list of mon_initial_members and mon_host on each host when adding more monitor nodes? I guess i am hitting this https://bugs.launchpad.net/fuel/+bug/1268579
[16:39] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[16:39] * ZombieL (~Guest1390@9YSAAA95A.tor-irc.dnsbl.oftc.net) Quit ()
[16:39] <lxxl> shouldn't this be somehow automated?
[16:41] <Be-El> lxxl: afaik the initial_members are only used to bootstrap a cluster; ceph clients use mon_host to contact one mon and get its current monmap (which is in turn used to contact the other mons)
[16:41] <doppelgrau> lxxl: the list of monitors is needed when a client starts, but your devops tool should be able to take care of it (I use ansible)
[16:41] <Be-El> (time to write the feature request for getting mons via DNS SRV entries... )
[16:43] <lxxl> Be-El: that would make much more sense
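
For illustration, the two settings being discussed live in the ceph.conf that is pushed to every host; mon_host is what running clients actually use, so it is the one worth keeping current (hostnames and addresses below are placeholders):

    [global]
        mon_initial_members = mon1, mon2, mon3
        mon_host = 10.0.0.1,10.0.0.2,10.0.0.3
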
[16:44] <art_yo> doppelgrau: Thanks, I will decrease pg_num to 512. What about space? Why does df show 7.2TB, but ceph -s shows that 15TB are used?
[16:45] <art_yo> And how am I supposed to find out the real free space I can expect?
[16:45] * Hemanth (~hkumar_@103.228.221.135) has joined #ceph
[16:46] * rakeshgm (~rakesh@106.51.29.33) Quit (Quit: Leaving)
[16:47] * penguinRaider (~KiKo@146.185.31.226) Quit (Ping timeout: 480 seconds)
[16:48] * cathode (~cathode@50.232.215.114) has joined #ceph
[16:50] <SamYaple> you can't decrease pg_num for a pool
[16:50] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) Quit (Quit: Leaving.)
[16:50] <SamYaple> art_yo: where are you running `df`?
[16:51] <art_yo> SamYaple: at server with RBD
[16:51] * joshd1 (~jdurgin@2602:30a:c089:2b0:b0b3:6a24:e6e6:d1b1) has joined #ceph
[16:51] <SamYaple> and you have 2 copies of the object (size=2) im assuming?
[16:52] <art_yo> osd_pool_default_size = 2
[16:52] <art_yo> yep
[16:52] <SamYaple> `ceph -s` should also show you "data" and that will match your 7.2TB
[16:52] * vbellur (~vijay@nat-pool-bos-u.redhat.com) has joined #ceph
[16:52] <SamYaple> the used is showing you the raw used space. so in your case 7.2 x 2 == ~15TB
[16:53] * EdGruberman (~VampiricP@tor-exit.squirrel.theremailer.net) has joined #ceph
[16:54] <art_yo> SamYaple: thank you. stupid question, I know
[16:54] <SamYaple> the only stupid questions are the ones not asked
[16:54] <art_yo> I didn't notice this: 7358 GB data
[16:55] <art_yo> So, which way is the best to find out a free space?
[16:55] <art_yo> ceph df?
[16:55] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[16:56] <SamYaple> `ceph -s` should also have available space, like this '6478 GB used, 18614 GB / 25093 GB avail'
[16:56] * penguinRaider (~KiKo@146.185.31.226) has joined #ceph
[16:56] <SamYaple> but available space is the raw space available, so in your case you'd need to split it by 2 to get an accurate number
[16:57] <art_yo> just split it by 2? ok, ok
[16:57] * kefu|afk is now known as kefu
[16:58] <art_yo> event(OnJoin,supressJoinMsg) { halt }
[16:58] <art_yo> sory
[16:59] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[16:59] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Quit: Quit)
[17:00] <doppelgrau> art_yo: you split it by the number of copies (size), and deduct some (filling OSDs more than 75-80% is not the best idea)
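
The per-pool view spares you some of the mental arithmetic; a sketch assuming the pool is named rbd (in recent releases the per-pool MAX AVAIL column of "ceph df" should already account for the replica count, while the GLOBAL section stays raw):

    ceph df
    ceph osd pool get rbd size
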
[17:00] <art_yo> doppelgrau: thx! I will
[17:01] * ircolle (~Adium@2601:285:201:633a:1c8b:71f5:60f8:4f79) has joined #ceph
[17:01] * [0x4A6F] (~ident@p4FC26FC3.dip0.t-ipconnect.de) has joined #ceph
[17:04] * wushudoin (~wushudoin@2601:646:8281:cfd:2ab2:bdff:fe0b:a6ee) has joined #ceph
[17:07] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[17:07] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[17:09] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:16] * pdrakeweb (~pdrakeweb@oh-76-5-108-60.dhcp.embarqhsd.net) has joined #ceph
[17:17] * kmroz (~kilo@00020103.user.oftc.net) has joined #ceph
[17:18] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[17:22] * kefu (~kefu@114.92.96.253) Quit (Ping timeout: 480 seconds)
[17:22] * EdGruberman (~VampiricP@9YSAAA96D.tor-irc.dnsbl.oftc.net) Quit ()
[17:26] * srk_ (~Siva@32.97.110.50) has joined #ceph
[17:28] <lxxl> is it possible to specify the number of replicas per rbd ? i have for example a 'clientdata' and a 'notsoimportantdata', could i somehow specify that clientdata needs to be replicated 4 times and 'notsoimportantdata' 3 times ?
[17:29] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[17:30] <Be-El> lxxl: nope, replication factor is a pool-wide setting. but you can create two pools with different settings and put rbd images on both of them
[17:31] <lxxl> Be-El: great, thanks again
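
A sketch of what Be-El describes, with pool names, pg counts and image sizes as placeholders (rbd sizes are in MB here):

    ceph osd pool create clientdata 256 256
    ceph osd pool set clientdata size 4
    ceph osd pool create scratchdata 128 128
    ceph osd pool set scratchdata size 3
    rbd create clientdata/important-vol --size 102400
    rbd create scratchdata/throwaway-vol --size 102400
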
[17:34] <lxxl> Are there any good GUIs for ceph? need to show something a bit prettier to management
[17:34] <doppelgrau> cephdash
[17:35] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Read error: Connection reset by peer)
[17:36] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[17:36] <Be-El> or openattic
[17:36] <lxxl> cephdash, inkscope, vsm, calamari, openattic
[17:37] <lxxl> gonna review those
[17:37] <lxxl> ty
[17:37] <doppelgrau> lxxl: cephdash is only a nice status board, no management interface
[17:37] * pdrakeweb (~pdrakeweb@oh-76-5-108-60.dhcp.embarqhsd.net) Quit (Ping timeout: 480 seconds)
[17:38] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[17:40] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:41] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[17:43] * karnan (~karnan@121.244.87.117) Quit (Remote host closed the connection)
[17:47] * blizzow (~jburns@50.243.148.102) has joined #ceph
[17:50] * kefu (~kefu@114.92.99.82) has joined #ceph
[17:50] * Bromine (~clusterfu@78.130.128.106) has joined #ceph
[17:50] * vbellur (~vijay@nat-pool-bos-t.redhat.com) has joined #ceph
[17:52] <lxxl> at a quick glance VSM seems to be the most feature rich
[17:52] <lxxl> or openattic
[17:56] * TMM (~hp@188.200.6.137) has joined #ceph
[17:59] * ntpttr_ (~ntpttr@192.55.55.41) has joined #ceph
[17:59] * danieagle (~Daniel@200-100-212-200.dial-up.telesp.net.br) has joined #ceph
[18:04] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[18:07] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[18:07] * walcubi_ is now known as walcubi
[18:08] * joshd1 (~jdurgin@2602:30a:c089:2b0:b0b3:6a24:e6e6:d1b1) Quit (Quit: Leaving.)
[18:09] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[18:12] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[18:13] * thansen (~thansen@17.253.sfcn.org) has joined #ceph
[18:14] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[18:17] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:17] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys!)
[18:19] * Bromine (~clusterfu@61TAAA9CZ.tor-irc.dnsbl.oftc.net) Quit ()
[18:20] * Nephyrin (~Jase@tor3.pbin.co) has joined #ceph
[18:21] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[18:24] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Ping timeout: 480 seconds)
[18:25] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[18:26] * dlan (~dennis@116.228.88.131) Quit (Ping timeout: 480 seconds)
[18:27] * rmart04 (~rmart04@support.memset.com) Quit (Ping timeout: 480 seconds)
[18:30] * dlan (~dennis@116.228.88.131) has joined #ceph
[18:32] * ntpttr_ (~ntpttr@192.55.55.41) Quit (Remote host closed the connection)
[18:33] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Remote host closed the connection)
[18:38] * efirs (~firs@31.173.240.8) has joined #ceph
[18:39] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[18:44] <blizzow> Is Ceph catching up to Gluster in terms of performance since this paper?
[18:44] <blizzow> http://iopscience.iop.org/article/10.1088/1742-6596/513/4/042014/pdf
[18:45] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[18:47] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) Quit (Ping timeout: 480 seconds)
[18:49] * shaunm (~shaunm@nat-eduroam-02.scc.kit.edu) Quit (Ping timeout: 480 seconds)
[18:49] * Nephyrin (~Jase@61TAAA9DU.tor-irc.dnsbl.oftc.net) Quit ()
[18:51] * wschulze (~wschulze@cpe-72-225-192-123.nyc.res.rr.com) has left #ceph
[18:52] * [0x4A6F]_ (~ident@p4FC27D8D.dip0.t-ipconnect.de) has joined #ceph
[18:54] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:54] * [0x4A6F]_ is now known as [0x4A6F]
[18:58] * ntpttr_ (~ntpttr@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[18:59] * chengpeng__ (~chengpeng@180.168.126.179) has joined #ceph
[19:03] * ntpttr_ (~ntpttr@jfdmzpr05-ext.jf.intel.com) Quit (Quit: Leaving)
[19:06] * chengpeng_ (~chengpeng@180.168.170.2) Quit (Ping timeout: 480 seconds)
[19:07] * [0x4A6F]_ (~ident@p4FC26AD9.dip0.t-ipconnect.de) has joined #ceph
[19:07] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:07] * [0x4A6F]_ is now known as [0x4A6F]
[19:08] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[19:11] * raso (~raso@ns.deb-multimedia.org) Quit (Read error: Connection reset by peer)
[19:12] * dbbyleo (~dbbyleo@50-198-202-93-static.hfc.comcastbusiness.net) has joined #ceph
[19:12] * raso (~raso@ns.deb-multimedia.org) has joined #ceph
[19:13] <dbbyleo> Hello... I'm new to IRC chats and also to Ceph. I hope I don't break any etiquette while I'm here. But I read this is a good place to ask for help.
[19:14] <dbbyleo> I'm installing my first CEPH system. I just had a quick question about preparing an OSD.
[19:16] * davidzlap (~Adium@2605:e000:1313:8003:e5e5:2192:fa28:3054) has joined #ceph
[19:16] <dbbyleo> I've provisioned a disk dedicated for OSD data on my OSD node (/dev/vdb). So to prepare it, I do: ceph-deploy osd prepare myosdnode01:vdb
[19:18] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) Quit (Ping timeout: 480 seconds)
[19:19] <dbbyleo> When I ls /dev/vdb*, I find this command created /dev/vdb1 and /dev/vdb2. I activate the osd with: ceph-deploy osd activate myosdnode01:/dev/vdb1, but I'm wondering what /dev/vdb2 is for.
[19:20] * mr_flea1 (~smf68@tor-exit.talyn.se) has joined #ceph
[19:20] <blizzow> /dev/vdb2 is for journaling.
[19:21] <blizzow> It's for recovery in case your OSD goes down.
[19:21] <dbbyleo> ok... I had a suspicion that it probably is for journaling, but the doc doesn't clarify that.
[19:22] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[19:22] <blizzow> ceph documentation leaves something to be desired for sure.
[19:22] <dbbyleo> So my activate command puts both data and journaling on the same disk/partition... leaving /dev/vdb2 unused.
[19:22] <dbbyleo> right?
[19:23] * diq (~diq@2620:11c:f:2:c23f:d5ff:fe62:112c) Quit (Quit: Leaving)
[19:23] <blizzow> looks like it.
[19:24] <dbbyleo> ok... just wanted to confirm.
[19:24] <blizzow> I can't remember off the top of my head, but you may be able to skip the prepare step and go straight to activate, and it will take care of putting both journal and the data store on the same drive. Someone here can probably chime in.
[19:25] <blizzow> *Or you could try and see ;)
[19:26] <dbbyleo> Yes... I can. I'll spin up another node with its own set of disks and try to add another OSD.
[19:27] * kmroz (~kilo@00020103.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:29] * kmroz (~kilo@node-1w7jr9qmiwgsf8m0emgm5go1z.ipv6.telus.net) has joined #ceph
[19:29] * diq (~diq@2620:11c:f:2:c23f:d5ff:fe62:112c) has joined #ceph
[19:29] <blizzow> on my OSDs (where I specified the second partition as journal): "ls /var/lib/ceph/osd/ceph-X" shows journal is a symbolic link to the UUID of the second partition of the drive that OSD is hosted on. You could take a look at your /var/lib/ceph/osd/ceph-X/ directory and see if it's a symlink or an actual file.
[19:30] <dbbyleo> ok will do
[19:32] * kefu (~kefu@114.92.99.82) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[19:32] <dbbyleo> it's a link: journal -> /dev/disk/by-partuuid/48c06ff3-1a63-4c42-90ba-69b80a92ec51
[19:32] * wgao (~wgao@106.120.101.38) Quit (Ping timeout: 480 seconds)
[19:33] <dbbyleo> Are you saying "/dev/disk/by-partuuid/48c06ff3-1a63-4c42-90ba-69b80a92ec51" might actually be pointed to my /dev/vdb2?
[19:33] <blizzow> What does running blkid as root show?
[19:33] <dbbyleo> Oh snap - you're right:
[19:33] <blizzow> if it says /dev/vdb2 is that uuid, then I think you're set.
[19:34] <infernix> does hammer always promote objects to cache when they are written to since it does not have min_write_recency_for_promote?
[19:34] <blizzow> So you could probably stop osd services, dd if=/dev/disk/by-partuuid/48c06ff3-1a63-4c42-90ba-69b80a92ec51 of=/dev/vdb2
[19:34] <blizzow> Then you'd remove the journal symlink and re-add it pointing to the blkid of /dev/vdb2
[19:35] * fenfen (~fenfen@mail.pbsnetwork.eu) Quit (Remote host closed the connection)
[19:35] <infernix> or do only reads cause promotion in hammer?
[19:36] * oliveiradan (~doliveira@137.65.133.10) Quit (Ping timeout: 480 seconds)
[19:36] <dbbyleo> But the bottom line is /dev/vdb2 is where my journal is stored (as it is now).
[19:36] <dbbyleo> that uuid is my /dev/vdb2
[19:36] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[19:39] <dbbyleo> So I guess when you prepare a disk (without specifying a location for journal), ceph-deploy will create two partitions on that disk. And when you activate the OSD (without specifying a journal location) it will automatically store the journal on the 2nd partition (via a symlink in the 1st partition). Is that what happened here?
[19:39] * wgao (~wgao@106.120.101.38) has joined #ceph
[19:39] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[19:41] <blizzow> What does blkid say the uuid of /dev/vdb2 is?
[19:41] * dmick (~dmick@206.169.83.146) has joined #ceph
[19:44] <dbbyleo> It is: /dev/vdb2: PARTLABEL="ceph journal" PARTUUID="48c06ff3-1a63-4c42-90ba-69b80a92ec51"
[19:45] * chunmei (~chunmei@134.134.137.75) has joined #ceph
[19:48] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) has joined #ceph
[19:50] * mr_flea1 (~smf68@61TAAA9FP.tor-irc.dnsbl.oftc.net) Quit ()
[19:50] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[19:50] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[19:51] <blizzow> Then it looks like ceph-deploy is smarter than I'd assumed.
[19:53] <dbbyleo> Yes... that seems to be what it did: prepared 2 partitions and used the 2nd one for journalling (when I don't specify a location for journaling)
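A minimal sketch of the flow confirmed above, assuming a hypothetical OSD host myosdnode01, a raw disk /dev/vdb, and an OSD id of 0; when no journal device is given, ceph-deploy (via ceph-disk) carves out both the data and the "ceph journal" partition itself:

    # prepare the whole disk; a data partition (vdb1) and a "ceph journal" partition (vdb2) are created
    ceph-deploy osd prepare myosdnode01:vdb
    # activate the data partition
    ceph-deploy osd activate myosdnode01:/dev/vdb1
    # on the OSD node: the journal should be a symlink to the journal partition's PARTUUID
    ls -l /var/lib/ceph/osd/ceph-0/journal
    sudo blkid /dev/vdb2    # the PARTUUID shown here should match the symlink target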
[19:54] * efirs (~firs@31.173.240.8) Quit (Ping timeout: 480 seconds)
[20:00] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[20:03] * visored (~Mattress@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[20:04] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[20:06] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) Quit (Ping timeout: 480 seconds)
[20:09] * swami1 (~swami@27.7.170.40) has joined #ceph
[20:10] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[20:15] <sep> blizzow, quite a lot of changes have happened in ceph since 0.70. I have never used that version, so I can't compare its performance to a more current one.
[20:16] * bandrus (~brian@67.sub-70-211-74.myvzw.com) Quit (Quit: Leaving.)
[20:19] * shaunm (~shaunm@5.158.137.118) has joined #ceph
[20:22] * david_ (~david@207.107.71.71) Quit (Quit: Leaving)
[20:22] * david_ (~david@207.107.71.71) has joined #ceph
[20:27] <blizzow> sep, yeah, I know it's a busy project. It's weird that there is a section in the documentation called "cluster examples", with ZERO benchmarking numbers. It's nearly impossible to find public numbers about what kind of speed to expect out of ceph. Even a theory about speed scalability would be nice. I know adding more disks leads to more space, but SPEED is what users care about.
[20:28] <dbbyleo> Just to confirm, do you need a client node for each ceph service (block, object, and FS)? Or can all 3 be serviced through one client node?
[20:28] * dmick (~dmick@206.169.83.146) has left #ceph
[20:29] * swami1 (~swami@27.7.170.40) Quit (Quit: Leaving.)
[20:32] <evilrob> dbbyleo: I'm currently doing block and object through the same nodes. In my test environment I configured all 3 on the same set of nodes.
[20:33] <dbbyleo> you're using the same node (client) to do block and object?
[20:33] * visored (~Mattress@5AEAAAW9H.tor-irc.dnsbl.oftc.net) Quit ()
[20:33] <evilrob> yes
[20:33] <doppelgrau> blizzow: sequential or combined (parallel) speed?
[20:34] <doppelgrau> the latter grows with more OSDs
[20:35] <dbbyleo> in the docs, it says to do this to "create block device image"...
[20:35] <dbbyleo> rbd create foo --size 4096 [-m {mon-IP}] [-k /path/to/ceph.client.admin.keyring]
[20:35] <dbbyleo> is {mon-IP} supposed to be the IP address of my monitor node?
[20:36] <dbbyleo> and if so, do I just pick one of my monitor nodes? I have 3 mon daemons on 3 nodes...
[20:36] <doppelgrau> dbbyleo: only need -m if not defined in the ceph.conf
[20:37] <doppelgrau> dbbyleo: and you can choose freely
[20:37] <dbbyleo> ah ok.
[20:37] <dbbyleo> so -k is also optional if my keyring is in the default location
[20:37] <dbbyleo> thanks!
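A short sketch of the optional flags just discussed, assuming the monitors are listed in /etc/ceph/ceph.conf and the keyring sits in the default /etc/ceph/ceph.client.admin.keyring ("foo" is the image name from the doc example):

    # with the defaults in place this is enough (--size is in MB, so this is a 4 GB image)
    rbd create foo --size 4096
    # the same call with the optional flags spelled out (monitor IP is hypothetical)
    rbd create foo --size 4096 -m 192.168.0.1 -k /etc/ceph/ceph.client.admin.keyring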
[20:38] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[20:41] * lxxl (~oftc-webi@localhost) Quit (Remote host closed the connection)
[20:43] * derjohn_mobi (~aj@2001:6f8:1337:0:30f3:25a4:f8bd:7c69) Quit (Ping timeout: 480 seconds)
[20:48] * devster (~devsterkn@2001:41d0:1:a3af::1) has joined #ceph
[20:49] <blizzow> doppelgrau: both. If one virtual machine using ceph as its backend maxes out at 50MB/s, then eep! I've set up ceph with three OSDs that can write to their disks at better than 800MB/sec, and a single VM using that ceph rbd image is still not able to write at better than 60MB/sec with a 10GbE network.
[20:49] <blizzow> but that's my personal experience, and I have NO idea what to expect because there is practically no public benchmark.
[20:52] <doppelgrau> blizzow: sequential or multi-threaded IO? And is the management instance limiting somehow? (e.g. qemu eating all CPU)
[20:52] <doppelgrau> blizzow: so dd or fio with e.g. 16 threads
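An illustrative fio run along the lines doppelgrau suggests; the mount point and sizes are hypothetical, and libaio with direct I/O is assumed to be available inside the guest:

    # one sequential writer vs. 16 parallel random writers against the rbd-backed filesystem
    fio --name=seqwrite  --directory=/mnt/test --size=1G --rw=write     --bs=4M \
        --ioengine=libaio --direct=1 --numjobs=1  --group_reporting
    fio --name=randwrite --directory=/mnt/test --size=1G --rw=randwrite --bs=4k \
        --ioengine=libaio --direct=1 --numjobs=16 --group_reporting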
[20:53] <blizzow> doppelgrau: you're not hearing me. My complaint isn't about my performance. It's about the fact that there is NOOOOO information about what kind of numbers to expect.
[20:53] <SamYaple> blizzow: doppelgrau one virtual machine peaking at 50MB/s sounds like you are using the virtio driver. Using virtio-scsi I can saturate 10Gb from a single VM for large seq writes. I can do ~5k IOPS of random 4k
[20:54] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[20:54] * Pulp (~Pulp@63-221-50-195.dyn.estpak.ee) has joined #ceph
[20:54] * rendar (~I@host50-34-dynamic.25-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[20:54] <doppelgrau> blizzow: that depends on so many factors (network, speed of dics/journals, access-method, access-pattern)
[20:59] <blizzow> doppelgrau: Again, it doesn't matter, there is no way to even say to myself, is my ceph cluster functioning optimally? Is it a waste of time to investigate (network, speed of dics[sic]/journals, access-method, access-pattern) because my ceph cluster is already optimal out of the box?
[20:59] <dbbyleo> I'm getting an error creating my first block device... help.
[20:59] <dbbyleo> cephadmin@cephcl:~$ sudo rbd map cephdd --name client.admin
[20:59] <dbbyleo> rbd: sysfs write failed
[20:59] <dbbyleo> RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
[20:59] <dbbyleo> In some cases useful info is found in syslog - try "dmesg | tail" or so.
[20:59] <dbbyleo> rbd: map failed: (6) No such device or address
[21:00] <dbbyleo> I'm on Debian8 (Jessie)
[21:00] <blizzow> dbbyleo: start at the beginning. What does ceph osd lspools show?
[21:00] <SamYaple> dbbyleo: i believe you must use the <poolname>/<rbdname> format
[21:00] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[21:01] <jdillaman> dbbyleo: you need to use "rbd feature disable <image name> exclusive-lock,object-map,fast-diff,deep-flatten" on your image before it can be used by krbd
[21:01] <SamYaple> jdillaman: but would that show up as "No such device"?
[21:01] <dbbyleo> cephadmin@cephcl:~$ ceph osd lspools
[21:01] <dbbyleo> 0 rbd,1 .rgw.root,2 default.rgw.control,3 default.rgw.data.root,4 default.rgw.gc,5 default.rgw.log,6 data,
[21:02] <dbbyleo> [cephcl is my client node]
[21:02] <blizzow> dbbyleo: I assume you're using rbd as your pool. What does, rbd -p rbd ls show?
[21:03] <doppelgrau> dbbyleo: your kernel is too old for the features enabled in the rbd-device. Upgrade the kernel, disable some features, or use qemu-nbd
[21:03] * pdrakewe_ (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[21:03] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[21:04] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[21:04] <dbbyleo> so this has to do with my OS kernel? Would CentOS 7 be less limiting? IE, I don't have to disable features with CentOS 7?
[21:04] <SamYaple> dbbyleo: the kernel module itself doesn't have the code to deal with that
[21:04] <SamYaple> it's far behind librbd
[21:04] <doppelgrau> dbbyleo: rbd map uses the kernel rbd-driver to mount a image
[21:04] <SamYaple> pretty sure they are trying to phase it out completely
[21:05] <doppelgrau> dbbyleo: qemu-nbd is a simple workaround
[21:05] <doppelgrau> (running in userspace)
[21:06] <jdillaman> SamYaple: no plans to phase out krbd, but it definitely takes longer to (re)implement the code
[21:06] <dbbyleo> I'm just following the ceph documentation (quick start guide). And that doc states to use rbd create ... followed by rbd map ...
[21:06] <jdillaman> dbbyleo: can you provide the link so i can get the docs updated?
[21:07] <dbbyleo> http://docs.ceph.com/docs/master/start/quick-rbd/
[21:07] <jdillaman> dbbyleo: thx
[21:08] <jdillaman> dbbyleo: starting w/ the Jewel (latest) release, rbd creates images with features that are not supported by krbd, so the docs should be updated to reflect that
[21:08] <dbbyleo> ah...
[21:09] <dbbyleo> can y'all help me create a block device then. IE, what's the command look like? My setup is vanilla (again, just following the quick start guide so far)
[21:09] <jdillaman> "rbd feature disable <image name> exclusive-lock,object-map,fast-diff,deep-flatten"
[21:09] * Jones (~luigiman@5AEAAAXAR.tor-irc.dnsbl.oftc.net) has joined #ceph
[21:10] <jdillaman> rbd create --image-shared --size XYZ <image name>
[21:10] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[21:11] <dbbyleo> I previously had done this and it succeeded: rbd create cephdd --size 40G
[21:11] <dbbyleo> is cephdd = <image_name>
[21:11] <jdillaman> dbbyleo: yup -- image just has features enabled that krbd doesn't support
[21:11] <jdillaman> ... so use "rbd feature disable" to disable them so that you can map it
[21:12] <dbbyleo> ok...
[21:12] <dbbyleo> rbd feature disable cephdd exclusive-lock,object-map,fast-diff,deep-flatten
[21:12] <dbbyleo> done!
[21:12] <jdillaman> dbbyleo: you might also have to adjust your crush map if your kernel is too old for the new tunables
[21:12] <dbbyleo> What's next?
[21:12] <jdillaman> dmesg | tail would show you the reason why it fails
[21:12] <dbbyleo> Ok... I'll check that out later
[21:13] <dbbyleo> I have done the rbd feature disable on the image. What's next?
[21:13] <jdillaman> dbbyleo: does it work?
[21:13] <dbbyleo> the disable worked
[21:13] <dbbyleo> what do I do after the disable command?
[21:14] <dbbyleo> Do I try: rbd map cephdd --name client.admin (again)
[21:14] <jdillaman> yes
[21:15] <dbbyleo> ah yes! map command worked!
[21:15] <dbbyleo> cephadmin@cephcl:~$ sudo rbd map cephdd --name client.admin
[21:15] <dbbyleo> ... /dev/rbd0
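A recap of the krbd workflow that just worked, assuming a Jewel cluster and the image name cephdd used above:

    rbd create cephdd --size 40G                 # Jewel enables features krbd can't handle yet
    rbd feature disable cephdd exclusive-lock,object-map,fast-diff,deep-flatten
    sudo rbd map cephdd --name client.admin      # prints a device node such as /dev/rbd0
    rbd showmapped                               # list currently mapped images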
[21:16] * TMM (~hp@188.200.6.137) Quit (Ping timeout: 480 seconds)
[21:16] <dbbyleo> jdillaman: this is our first attempt at standing up a ceph system.
[21:17] * mattbenjamin (~mbenjamin@12.118.3.106) Quit (Ping timeout: 480 seconds)
[21:17] <dbbyleo> If we decide to run it on CentOS 7 (instead of Debian8), do I have to worry about crush map adjustments (for tunables)?
[21:18] <SamYaple> jdillaman: good to know. I just saw something on the roadmap about deprecating support for rbds that _don't_ have exclusive-lock (or one of those features), and yet krbd doesn't support that at all yet
[21:18] <jdillaman> dbbyleo: not sure where RHEL7.x is right now but it does receive backports for support
[21:18] <jdillaman> dbbyleo: http://docs.ceph.com/docs/master/rados/operations/crush-map/#which-client-versions-support-crush-tunables
[21:18] <dbbyleo> thanks!
[21:18] <jdillaman> SamYaple: we want to deprecate RBD image format v1
[21:18] <jdillaman> SamYaple: ... but we need a good upgrade path story first
[21:19] <SamYaple> jdillaman: ah. that sounds more logical. thanks for the information
[21:20] * rendar (~I@host50-34-dynamic.25-79-r.retail.telecomitalia.it) has joined #ceph
[21:23] * squizzi (~squizzi@71-34-69-94.ptld.qwest.net) has joined #ceph
[21:24] <doppelgrau> out of curiosity: has someone optimized the qdisk backend for xen to get the best rbd performance inside the vm?
[21:27] * vbellur (~vijay@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving.)
[21:34] <dbbyleo> where do you normally store secretfiles? In the user's home dir?
[21:35] <dbbyleo> The doc has this note:
[21:36] <dbbyleo> Note: Mount the Ceph FS filesystem on the admin node, not the server node. See FAQ for details.
[21:36] <dbbyleo> I assume the FS can also be mounted on the client node, right?
[21:37] * oliveiradan (~doliveira@67.214.238.80) has joined #ceph
[21:37] * pdrakewe_ (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[21:38] <doppelgrau> jdillaman: I have a 4.6 kernel and still get "image uses unsupported features: 0x3c" - which rbd features do I need to remove? (all of deep-flatten,fast-diff,object-map,exclusive-lock or only some of them? I can't find the bitmap to see which feature is which value)
[21:39] <jdillaman> doppelgrau: all features -- exclusive-lock will be added soon
[21:39] <doppelgrau> jdillaman: thanks
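For reference, the RBD feature bits behind that 0x3c value (layering=0x1, striping=0x2, exclusive-lock=0x4, object-map=0x8, fast-diff=0x10, deep-flatten=0x20, journaling=0x40):

    # 0x3c = 0x04 + 0x08 + 0x10 + 0x20
    #      = exclusive-lock + object-map + fast-diff + deep-flatten
    rbd feature disable <image name> exclusive-lock,object-map,fast-diff,deep-flatten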
[21:39] * Jones (~luigiman@5AEAAAXAR.tor-irc.dnsbl.oftc.net) Quit ()
[21:39] * Blueraven (~Swompie`@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[21:40] <jdillaman> np
[21:40] * Psi-Jack (~psi-jack@mx.linux-help.org) Quit (Quit: Where'd my terminal go?)
[21:41] * Psi-Jack (~psi-jack@mx.linux-help.org) has joined #ceph
[21:50] * blizzow (~jburns@50.243.148.102) Quit (Ping timeout: 480 seconds)
[21:51] * hellertime (~Adium@a23-79-238-10.deploy.static.akamaitechnologies.com) Quit (Ping timeout: 480 seconds)
[21:52] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[21:56] <dbbyleo> is the number of pg (that you set) for a pool supposed to be some factor of how many OSDs you currently have?
[21:57] * verleihnix (~verleihni@195.12.46.2) Quit (Ping timeout: 480 seconds)
[21:58] * vbellur (~vijay@71.234.224.255) has joined #ceph
[22:01] * verleihnix (~verleihni@195.12.46.2) has joined #ceph
[22:01] <doppelgrau> dbbyleo: yes
[22:01] <dbbyleo> what's the factor?
[22:01] <doppelgrau> dbbyleo: ceph pg calculation
[22:01] * blizzow (~jburns@50.243.148.102) has joined #ceph
[22:01] <doppelgrau> dbbyleo: depends on replication size and expected growth
[22:02] <dbbyleo> Ok... are there resources online that dive into this calculation?
[22:02] <doppelgrau> dbbyleo: rule of thumb: ~100 PGs per OSD (all pools combined, counting replication) at full size, but not (much) more than 300
[22:03] <doppelgrau> dbbyleo: http://ceph.com/pgcalc/
[22:03] <dbbyleo> perfect. Thanks for the link!
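A worked example of the rule of thumb above, for a hypothetical cluster with 9 OSDs, a single pool, and replication size 3:

    # target PGs ~= (OSDs * ~100) / replication size = (9 * 100) / 3 = 300
    # round to a nearby power of two, e.g. 256
    ceph osd pool create mypool 256 256    # pg_num and pgp_num; pool name is hypothetical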
[22:03] * squizzi (~squizzi@71-34-69-94.ptld.qwest.net) Quit (Quit: bye)
[22:09] <dbbyleo> I've created a ceph filesystem and now I'm trying to mount it on the client node. I've created a secretfile per the doc (using client.admin's key). It didn't say where to put the secret file so I just placed it in my user's home dir.
[22:09] * Blueraven (~Swompie`@5AEAAAXA6.tor-irc.dnsbl.oftc.net) Quit ()
[22:09] * eXeler0n (~K3NT1S_aw@tor-exit.squirrel.theremailer.net) has joined #ceph
[22:09] <doppelgrau> dbbyleo: default /etc/ceph/
[22:10] <dbbyleo> oh... ok. So /etc/ceph is where secretfiles should be stored
[22:12] <SamYaple> dbbyleo: that is what most people do, yes
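A small sketch of generating that secretfile on the client, assuming the admin keyring is already available there (e.g. pushed out with ceph-deploy admin):

    # extract just the key for client.admin and store it where the mount will look for it
    ceph auth get-key client.admin | sudo tee /etc/ceph/admin.secret > /dev/null
    sudo chmod 600 /etc/ceph/admin.secret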
[22:13] <dbbyleo> I was also wondering because of the "secretfile=" option in the mount command. In the doc example, it just shows secretfile=admin.secret
[22:13] <dbbyleo> I was wondering if that's a relative path or not?
[22:14] <doppelgrau> dbbyleo: in that case you can simply use -i with rbd/ceph instead of providing the full path to the key file
[22:14] <dbbyleo> so is the default /etc/ceph if no path is specified?
[22:14] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[22:14] <dbbyleo> Here's the doc example I'm talking about...
[22:15] <dbbyleo> sudo mount -t ceph 192.168.0.1:6789:/ /mnt/mycephfs -o name=admin,secretfile=admin.secret
[22:16] <dbbyleo> At any rate... my admin.secret file is in /etc/ceph now. But my mount command still fails with:
[22:16] <dbbyleo> mount: wrong fs type, bad option, bad superblock on 192.168.42.12:6789:/,
[22:16] <dbbyleo> missing codepage or helper program, or other error
[22:16] <dbbyleo> In some cases useful info is found in syslog - try
[22:16] <dbbyleo> dmesg | tail or so.
[22:16] <dbbyleo> dmesg shows:
[22:16] <dbbyleo> [ 7239.487720] libceph: bad option at 'secretfile=admin.secret'
[22:17] <SamYaple> dbbyleo: do you have ceph-fs-common installed?
[22:17] <dbbyleo> let me check
[22:17] <SamYaple> I seem to remember a message similar to that when it was missing
[22:17] <SamYaple> rather confusing
[22:18] <dbbyleo> I don't have that package installed.
[22:19] <dbbyleo> http://docs.ceph.com/docs/master/start/quick-cephfs/
[22:19] <dbbyleo> ^^^ doesn't mention that package
[22:19] * derjohn_mobi (~aj@x590cb994.dyn.telefonica.de) has joined #ceph
[22:21] <SamYaple> it is a required package
[22:21] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:21] <SamYaple> that would make it a packaging bug if you don't have it installed
[22:22] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[22:22] <dbbyleo> I did the ceph-deploy install on this client node... and it must have missed it. Is that what you mean by "packaging bug"?
[22:22] * gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) Quit (Remote host closed the connection)
[22:23] <SamYaple> I would think ceph-client should have at least a suggested dep on ceph-fs-common, that's what I mean
[22:23] <SamYaple> but try to install it and attempt to mount again
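A sketch of the kernel-client mount being attempted here, assuming Debian package names and the monitor address from the error above; the mount.ceph helper shipped in ceph-fs-common is what understands the secretfile= option and hands the key to the kernel:

    sudo apt-get install ceph-fs-common      # provides the mount.ceph helper
    sudo mkdir -p /mnt/mycephfs
    sudo mount -t ceph 192.168.42.12:6789:/ /mnt/mycephfs \
         -o name=admin,secretfile=/etc/ceph/admin.secret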
[22:24] <doppelgrau> is ",secretfile=admin.secret" necessary? I mount it only with id=<nameofkeyring>
[22:25] <dbbyleo> I installed the package and that seemed to do the trick. EXCEPT... when I did the mount command, my client got completely hung up :(
[22:26] <doppelgrau> Keyfile in /etc/ceph/ceph.client.<name>.keyring
[22:26] <SamYaple> dbbyleo: sounds like another issue
[22:26] <dbbyleo> CPU is completely pegged out at 100% and is completely unresponsive.
[22:27] <SamYaple> awesome sauce
[22:27] <SamYaple> cant really help you there
[22:29] <dbbyleo> The ceph doc has this note:
[22:29] <dbbyleo> Mount the Ceph FS filesystem on the admin node, not the server node. See FAQ for details.
[22:29] <dbbyleo> I was trying to mount it on the client node... is this related to what I just experienced? [my client node had to be hard-cycled]
[22:31] <doppelgrau> dbbyleo: I use the fuse part, not kernel, seems safer to me
[22:32] <dbbyleo> This was at the bottom of the ceph doc:
[22:32] <dbbyleo> Mount the Ceph FS filesystem on the admin node, not the server node. See FAQ for details.
[22:32] <dbbyleo> Sorry I meant to paste this:
[22:32] <dbbyleo> See Ceph FS for additional information. Ceph FS is not quite as stable as the Ceph Block Device and Ceph Object Storage. See Troubleshooting if you encounter trouble.
[22:33] * TMM (~hp@188.200.6.137) has joined #ceph
[22:34] <dbbyleo> also... when I created the ceph file system, I called it "cephfs". When I do the mount, should I be referencing that filesystem name somewhere...
[22:34] <dbbyleo> The doc wasn't clear on that...
[22:34] <dbbyleo> The doc's example seems like it mounts the ceph's root filesystem
[22:35] <dbbyleo> doc example:
[22:35] <dbbyleo> sudo mount -t ceph 192.168.0.1:6789:/ /mnt/mycephfs -o name=admin,secretfile=admin.secret
[22:35] * Jeffrey4l_ (~Jeffrey@110.244.109.79) has joined #ceph
[22:35] * oliveiradan2 (~doliveira@67.214.238.80) has joined #ceph
[22:36] * gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) has joined #ceph
[22:36] <blizzow> Can an channel op in here change the channel greeting to add a link to where the channel is logged?
[22:36] <blizzow> Please?
[22:36] <doppelgrau> dbbyleo: I'd try the fuse variant, newer libs and less risky since mostly userspace
[22:37] <dbbyleo> Ok I'll try fuse... but question is: where/when do I reference the name of filesystem I previously created?
[22:38] * Hemanth (~hkumar_@103.228.221.135) Quit (Quit: Leaving)
[22:38] * Jeffrey4l (~Jeffrey@119.251.252.117) Quit (Ping timeout: 480 seconds)
[22:39] <dbbyleo> would it be like this:
[22:39] <dbbyleo> sudo ceph-fuse -m {ip-address-of-monitor}:6789:/<filesystem_name> ~/mycephfs
[22:39] <doppelgrau> dbbyleo: IIRC currently not necessary, one mds-daemon one cephfs
[22:39] * eXeler0n (~K3NT1S_aw@9YSAABADT.tor-irc.dnsbl.oftc.net) Quit ()
[22:39] <dbbyleo> doppel: I don't understand...
[22:39] <doppelgrau> dbbyleo: name not necessary
[22:39] * georgem (~Adium@24.114.71.120) has joined #ceph
[22:40] <dbbyleo> What if there's multiple cephfs on the system?
[22:40] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Read error: Connection reset by peer)
[22:40] <dbbyleo> Or can there only be one cephfs??
[22:41] <dbbyleo> (I'm obviously looking at this like unix exports)
[22:44] * garphy is now known as garphy`aw
[22:45] <m0zes> there can be multiple cephfs out of a ceph cluster starting with jewel. not currently recommended. but you can do nested mounts and set file layouts to split your directory structure into different pools.
[22:45] <doppelgrau> dbbyleo: currently only one cephfs (IIRC with some black magic different mds might provide different cephfs, but not really supported/tested)
[22:46] <dbbyleo> ah ok. So me thinking cephfs like unix exports is totally wrong. Thanks for clarifying that.
[22:46] * m0zes has 3 pools for cephfs, homes (replicated x3), scratch (replicated x2) and bulk (EC 6+2).
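A hedged sketch of the directory-layout approach m0zes describes, assuming a CephFS mounted at /mnt/mycephfs and an extra data pool named bulk that has already been added to the filesystem (both names hypothetical):

    # files created under this directory land in the "bulk" data pool
    sudo mkdir -p /mnt/mycephfs/bulk
    sudo setfattr -n ceph.dir.layout.pool -v bulk /mnt/mycephfs/bulk
    getfattr -n ceph.dir.layout /mnt/mycephfs/bulk    # inspect the layout that was set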
[22:51] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[22:53] * ulterior (~datagutt@torsrvq.snydernet.net) has joined #ceph
[22:55] <dbbyleo> Ok... had to download ceph-fuse, but it worked in mounting the cephfs.
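For reference, a minimal ceph-fuse invocation matching what dbbyleo just did, assuming the admin keyring in /etc/ceph and a hypothetical monitor at 192.168.42.12 (no filesystem name needed while there is only one cephfs):

    sudo apt-get install ceph-fuse
    sudo mkdir -p ~/mycephfs
    sudo ceph-fuse -m 192.168.42.12:6789 ~/mycephfs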
[22:59] <doppelgrau> BTW, has anyone else seen a huge drop in CPU demand from hammer to jewel? Reduced to one third or less on the OSDs
[23:03] * icey is now known as icey|vacation
[23:08] <dbbyleo> I had to bounce my client node since it froze up when I tried to kernel mount the cephfs. But since I rebooted, I tried to remount my Block Device I created (and mounted earlier), but I find that /dev/rbd/rbd/<myimage> is no longer there.
[23:09] <dbbyleo> Is that supposed to happen? Am I supposed to remap that image every time I reboot the client?
[23:11] <doppelgrau> dbbyleo: yes
[23:13] <jdillaman> dbbyleo: there is a helper systemd service "rbdmap" that will handle that automatically on boot if desired
[23:15] <dbbyleo> ok thanks. I'll check it out.
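A hedged sketch of the rbdmap helper jdillaman mentions, assuming the cephdd image in the default rbd pool and the admin keyring; the entry format and unit name are assumptions based on Jewel-era packaging and may differ per distro:

    # list the image in /etc/ceph/rbdmap so it is mapped again at boot
    echo "rbd/cephdd id=admin,keyring=/etc/ceph/ceph.client.admin.keyring" | sudo tee -a /etc/ceph/rbdmap
    sudo systemctl enable rbdmap.service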
[23:18] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[23:19] * LegalResale (~LegalResa@66.165.126.130) Quit (Quit: Leaving)
[23:20] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)
[23:21] * srk_ (~Siva@32.97.110.50) Quit (Ping timeout: 480 seconds)
[23:23] * ulterior (~datagutt@5AEAAAXC6.tor-irc.dnsbl.oftc.net) Quit ()
[23:23] * LegalResale (~LegalResa@66.165.126.130) has joined #ceph
[23:25] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) has joined #ceph
[23:37] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) Quit (Read error: Connection reset by peer)
[23:38] * oliveiradan (~doliveira@67.214.238.80) Quit (Ping timeout: 480 seconds)
[23:40] * georgem (~Adium@24.114.71.120) Quit (Ping timeout: 480 seconds)
[23:40] * dnunez (~dnunez@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[23:45] * oliveiradan (~doliveira@67.214.238.80) has joined #ceph
[23:53] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) Quit (Ping timeout: 480 seconds)
[23:54] * haplo37 (~haplo37@199.91.185.156) Quit (Remote host closed the connection)
[23:54] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[23:57] * xarses (~xarses@64.124.158.32) Quit (Ping timeout: 480 seconds)
[23:59] * hellertime (~Adium@pool-71-162-119-41.bstnma.fios.verizon.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.