#ceph IRC Log

IRC Log for 2016-10-04

Timestamps are in GMT/BST.

[0:06] * mattbenjamin (~mbenjamin@12.118.3.106) Quit (Ping timeout: 480 seconds)
[0:10] * scuttlemonkey is now known as scuttle|afk
[0:20] * mhackett (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[0:22] * squizzi (~squizzi@107.13.237.240) Quit (Quit: bye)
[0:23] * m0zes__ (~mozes@n117m02.cs.ksu.edu) has left #ceph
[0:32] * ircolle (~Adium@2601:285:201:633a:10d1:4a44:9d14:9551) Quit (Quit: Leaving.)
[0:34] * davidzlap (~Adium@2605:e000:1313:8003:99b0:b74c:52c:cd1d) Quit (Quit: Leaving.)
[0:35] * ffilzwin (~ffilz@c-67-170-185-135.hsd1.or.comcast.net) Quit (Quit: Leaving)
[0:35] * bene3 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[0:36] * Rehevkor (~nih@46.166.138.175) has joined #ceph
[0:37] <lincolnb> does the MDS have a maximum number of caps that it hands out to clients?
[0:41] * shaunm (~shaunm@ms-208-102-105-216.gsm.cbwireless.com) Quit (Ping timeout: 480 seconds)
[0:51] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[0:51] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[0:55] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[1:01] * stiopa (~stiopa@81.110.229.198) Quit (Ping timeout: 480 seconds)
[1:01] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[1:03] * ira (~ira@12.118.3.106) Quit (Ping timeout: 480 seconds)
[1:05] * wak-work (~wak-work@2620:15c:2c5:3:2497:7a21:8815:f0e7) Quit (Remote host closed the connection)
[1:06] * Rehevkor (~nih@46.166.138.175) Quit ()
[1:06] * wak-work (~wak-work@2620:15c:2c5:3:f0a4:c5dc:c888:f520) has joined #ceph
[1:06] * vata1 (~vata@207.96.182.162) Quit (Quit: Leaving.)
[1:07] * ffilzwin (~ffilz@c-67-170-185-135.hsd1.or.comcast.net) has joined #ceph
[1:12] * evelu (~erwan@37.160.202.220) has joined #ceph
[1:26] * oms101 (~oms101@p20030057EA002300C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:35] * oms101 (~oms101@p20030057EA000200C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[1:35] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) Quit (Quit: Leaving)
[1:35] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit (Quit: Leaving.)
[1:36] * blizzow (~jburns@50-243-148-102-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[1:41] * ffilzwin (~ffilz@c-67-170-185-135.hsd1.or.comcast.net) Quit (Quit: Leaving)
[1:44] * hidekazu (~oftc-webi@210.143.35.18) has joined #ceph
[1:44] * ffilzwin (~ffilz@c-67-170-185-135.hsd1.or.comcast.net) has joined #ceph
[1:45] * davidzlap (~Adium@2605:e000:1313:8003:7119:5e48:db66:a916) has joined #ceph
[1:46] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[1:52] <hidekazu> Hi, I am evaluating RGW Multisite for disaster recovery and have a question.
[1:52] <hidekazu> Please help.
[1:53] <hidekazu> I put an object to the master zone and it is replicated to the non-master zone. But if replication has started and then the master zone goes down, how can I see which objects failed to replicate?
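A rough sketch of how replication progress could be inspected with radosgw-admin in a Jewel-era multisite setup; the zone name below is a placeholder and the exact subcommands available vary by release:

    # overall metadata/data sync state, run against the zone doing the pulling
    radosgw-admin sync status
    # per-shard data sync progress from a specific source zone (placeholder name)
    radosgw-admin data sync status --source-zone=us-east

These only show where the sync shards currently stand; interrupted replication should be retried automatically once the source zone is reachable again.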
[1:54] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[2:05] * Concubidated (~cube@68.140.239.164) Quit (Quit: Leaving.)
[2:05] * sudocat (~dibarra@2602:306:8bc7:4c50:b0a9:c99d:a621:b157) has joined #ceph
[2:11] * guerby (~guerby@ip165.tetaneutral.net) Quit (Ping timeout: 480 seconds)
[2:15] * malevolent_ (~quassel@192.146.172.118) Quit (Quit: No Ping reply in 210 seconds.)
[2:16] * malevolent (~quassel@192.146.172.118) has joined #ceph
[2:23] * brians (~brian@80.111.114.175) has joined #ceph
[2:26] * verleihnix (~verleihni@46-127-202-48.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[2:27] * verleihnix (~verleihni@195.12.46.2) has joined #ceph
[2:27] * brians__ (~brian@80.111.114.175) Quit (Ping timeout: 480 seconds)
[2:35] * Concubidated (~cube@h4.246.129.40.static.ip.windstream.net) has joined #ceph
[2:45] * Bwana (~xanax`@46.166.138.147) has joined #ceph
[2:46] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[2:52] * cyphase (~cyphase@000134f2.user.oftc.net) Quit (Quit: cyphase.com)
[2:52] * jarrpa (~jarrpa@2602:3f:e183:a600:a4c6:1a92:820f:bb6) has joined #ceph
[2:52] * cyphase (~cyphase@000134f2.user.oftc.net) has joined #ceph
[2:54] * valeech (~valeech@pool-96-247-203-33.clppva.fios.verizon.net) has joined #ceph
[3:00] * guerby (~guerby@ip165.tetaneutral.net) has joined #ceph
[3:02] * cyphase (~cyphase@000134f2.user.oftc.net) Quit (Quit: cyphase.com)
[3:02] * cyphase (~cyphase@2601:640:c401:969a:468a:5bff:fe29:b5fd) has joined #ceph
[3:07] * natarej_ (~natarej@149.56.5.90) has joined #ceph
[3:10] * jfaj (~jan@p20030084AD11F4005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[3:13] * Racpatel (~Racpatel@2601:87:3:31e3::4d2a) Quit (Quit: Leaving)
[3:13] * vata (~vata@96.127.202.136) Quit (Quit: Leaving.)
[3:14] * natarej__ (~natarej@101.188.54.14) Quit (Ping timeout: 480 seconds)
[3:15] * Bwana (~xanax`@46.166.138.147) Quit ()
[3:16] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[3:20] * kristen (~kristen@jfdmzpr02-ext.jf.intel.com) Quit (Quit: Leaving)
[3:20] * jfaj (~jan@p20030084AD1B01005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) has joined #ceph
[3:20] * hidekazu (~oftc-webi@210.143.35.18) Quit (Quit: Page closed)
[3:24] * sto_ (~sto@121.red-2-139-229.staticip.rima-tde.net) has joined #ceph
[3:25] * sto (~sto@121.red-2-139-229.staticip.rima-tde.net) Quit (Read error: Connection reset by peer)
[3:41] * Charlie[m] (~charliesh@2001:470:1af1:101::4b) has joined #ceph
[3:42] * Charlie[m] (~charliesh@2001:470:1af1:101::4b) has left #ceph
[3:46] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[3:49] * vata (~vata@96.127.202.136) has joined #ceph
[3:49] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) has joined #ceph
[3:49] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[3:54] * davidzlap (~Adium@2605:e000:1313:8003:7119:5e48:db66:a916) Quit (Quit: Leaving.)
[4:23] * Plesioth (~anadrom@tor2r.ins.tor.net.eu.org) has joined #ceph
[4:29] * kefu (~kefu@114.92.125.128) has joined #ceph
[4:35] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:50] * raphaelsc (~raphaelsc@177.97.216.86) Quit (Read error: Connection reset by peer)
[4:53] * Plesioth (~anadrom@tor2r.ins.tor.net.eu.org) Quit ()
[5:00] * Realmy (~Realmy@0002243f.user.oftc.net) Quit (Quit: ZNC - http://znc.in)
[5:02] * Realmy (~Realmy@ec2-54-172-129-45.compute-1.amazonaws.com) has joined #ceph
[5:03] * Vacuum_ (~Vacuum@88.130.203.105) has joined #ceph
[5:09] * oracular (~Tralin|Sl@108.61.166.135) has joined #ceph
[5:09] * johnavp1989 (~jpetrini@pool-100-14-10-2.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[5:10] * Vacuum__ (~Vacuum@88.130.195.66) Quit (Ping timeout: 480 seconds)
[5:15] * jdillaman_ (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) Quit (Quit: jdillaman_)
[5:17] * rotbeard (~redbeard@aftr-109-90-233-215.unity-media.net) has joined #ceph
[5:33] * vimal (~vikumar@114.143.160.250) has joined #ceph
[5:39] * oracular (~Tralin|Sl@108.61.166.135) Quit ()
[5:39] * wkennington (~wak@0001bde8.user.oftc.net) has joined #ceph
[5:39] * CoZmicShReddeR (~ZombieL@tor2r.ins.tor.net.eu.org) has joined #ceph
[5:45] * karnan (~karnan@125.16.34.66) has joined #ceph
[5:56] * evelu (~erwan@37.160.202.220) Quit (Ping timeout: 480 seconds)
[6:03] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[6:04] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[6:05] * natarej__ (~natarej@101.188.54.14) has joined #ceph
[6:06] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[6:09] * CoZmicShReddeR (~ZombieL@tor2r.ins.tor.net.eu.org) Quit ()
[6:11] * walcubi (~walcubi@p5797AC35.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:11] * walcubi (~walcubi@p5797A087.dip0.t-ipconnect.de) has joined #ceph
[6:13] * natarej_ (~natarej@149.56.5.90) Quit (Ping timeout: 480 seconds)
[6:14] * kefu (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[6:18] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[6:18] * vimal (~vikumar@114.143.160.250) Quit (Quit: Leaving)
[6:19] * rdas (~rdas@121.244.87.116) has joined #ceph
[6:21] * nilez (~nilez@104.129.28.50) Quit (Read error: Connection reset by peer)
[6:24] * nilez (~nilez@104.129.28.194) has joined #ceph
[6:25] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Quit: Leaving.)
[6:27] * kefu (~kefu@114.92.125.128) has joined #ceph
[6:31] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[6:37] * kefu (~kefu@114.92.125.128) Quit (Max SendQ exceeded)
[6:38] * kefu (~kefu@114.92.125.128) has joined #ceph
[6:39] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[6:41] * kefu (~kefu@114.92.125.128) Quit (Max SendQ exceeded)
[6:42] * kefu (~kefu@114.92.125.128) has joined #ceph
[6:44] * vimal (~vikumar@121.244.87.116) has joined #ceph
[6:46] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[6:47] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:54] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Quit: Leaving.)
[6:56] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[6:56] * masber (~masber@129.94.15.152) has joined #ceph
[6:57] <masber> has anyone run mdtest?
[7:01] * TomasCZ (~TomasCZ@yes.tenlab.net) Quit (Quit: Leaving)
[7:14] * vata (~vata@96.127.202.136) Quit (Quit: Leaving.)
[7:23] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[7:25] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[7:30] * mollstam (~straterra@185.3.135.186) has joined #ceph
[7:55] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[8:00] * mollstam (~straterra@185.3.135.186) Quit ()
[8:11] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit (Quit: Leaving.)
[8:14] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[8:16] * lmb (~Lars@ip5b404bab.dynamic.kabel-deutschland.de) Quit (Ping timeout: 480 seconds)
[8:18] * rendar (~I@87.13.171.75) has joined #ceph
[8:24] * rotbeard (~redbeard@aftr-109-90-233-215.unity-media.net) Quit (Quit: Leaving)
[8:32] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:33] * malevolent (~quassel@192.146.172.118) Quit (Quit: No Ping reply in 180 seconds.)
[8:34] * ade (~abradshaw@p4FF7BEEA.dip0.t-ipconnect.de) has joined #ceph
[8:35] * malevolent (~quassel@192.146.172.118) has joined #ceph
[8:40] * brians (~brian@80.111.114.175) Quit (Quit: Textual IRC Client: www.textualapp.com)
[8:41] * brians (~brian@80.111.114.175) has joined #ceph
[8:42] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[8:45] * Meths (~meths@95.151.244.152) Quit (Read error: Connection reset by peer)
[8:45] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[8:46] * Meths (~meths@95.151.244.152) has joined #ceph
[8:48] * hk135 (~horner@rs-mailrelay1.hornerscomputer.co.uk) has joined #ceph
[8:48] <hk135> Hi All, I was wondering if anyone could help with the following error i am seeing trying to setup cephfs
[8:48] <hk135> log_channel(cluster) log [ERR] : error reading table object 'mds0_inotable' -2
[8:49] <hk135> cluster has lots of rados images working fine
[8:49] <hk135> but I am trying to add cephfs
[8:49] <hk135> so I brought up an mds and instantly got rank 0 degraded
[8:49] <hk135> have tried googling but to no avail
[8:51] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[8:53] * wkennington (~wak@0001bde8.user.oftc.net) Quit (Read error: Connection reset by peer)
[8:59] * chutz (~chutz@rygel.linuxfreak.ca) Quit (Quit: Leaving)
[8:59] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[9:01] * dgurtner (~dgurtner@178.197.224.255) has joined #ceph
[9:05] * efirs (~firs@98.207.153.155) Quit (Quit: Leaving.)
[9:07] * gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) Quit (Quit: Textual IRC Client: www.textualapp.com)
[9:08] * gila (~gila@5ED4FE92.cm-7-5d.dynamic.ziggo.nl) has joined #ceph
[9:15] * analbeard (~shw@support.memset.com) has joined #ceph
[9:21] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[9:23] * Hemanth (~hkumar_@125.16.34.66) has joined #ceph
[9:34] * nilez (~nilez@104.129.28.194) Quit (Ping timeout: 480 seconds)
[9:39] * analbeard (~shw@support.memset.com) has left #ceph
[9:40] * analbeard (~shw@support.memset.com) has joined #ceph
[9:42] * nilez (~nilez@104.129.29.50) has joined #ceph
[9:54] * TMM (~hp@dhcp-077-248-009-229.chello.nl) Quit (Quit: Ex-Chat)
[9:57] * pdrakewe_ (~pdrakeweb@oh-76-5-101-140.dhcp.embarqhsd.net) Quit (Read error: Connection reset by peer)
[9:57] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[9:58] * nilez (~nilez@104.129.29.50) Quit (Ping timeout: 480 seconds)
[10:01] * dgurtner_ (~dgurtner@195.238.25.37) has joined #ceph
[10:03] * dgurtner (~dgurtner@178.197.224.255) Quit (Ping timeout: 480 seconds)
[10:03] * jcsp (~jspray@62.214.2.210) has joined #ceph
[10:04] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Read error: Connection reset by peer)
[10:04] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[10:08] * jcsp (~jspray@62.214.2.210) Quit ()
[10:08] * jcsp (~jspray@62.214.2.210) has joined #ceph
[10:11] * nilez (~nilez@104.129.29.2) has joined #ceph
[10:14] * kefu (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:14] * branto (~branto@transit-86-181-132-209.redhat.com) has joined #ceph
[10:14] * malevolent (~quassel@192.146.172.118) Quit (Read error: Connection reset by peer)
[10:16] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[10:21] * b0e (~aledermue@213.95.25.82) has joined #ceph
[10:21] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[10:36] * kefu (~kefu@114.92.125.128) has joined #ceph
[10:44] * nilez (~nilez@104.129.29.2) Quit (Ping timeout: 480 seconds)
[10:52] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[10:53] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) has joined #ceph
[11:03] * karnan (~karnan@125.16.34.66) Quit (Ping timeout: 480 seconds)
[11:15] * karnan (~karnan@125.16.34.66) has joined #ceph
[11:16] * cyphase (~cyphase@000134f2.user.oftc.net) Quit (Quit: cyphase.com)
[11:17] * cyphase (~cyphase@2601:640:c401:969a:468a:5bff:fe29:b5fd) has joined #ceph
[11:17] * kefu (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[11:21] * TMM (~hp@185.5.121.201) has joined #ceph
[11:23] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[11:25] * jcsp (~jspray@62.214.2.210) has joined #ceph
[11:31] * mason1 (~VampiricP@exit1.radia.tor-relays.net) has joined #ceph
[11:31] <fusl> is there anyone who got jewel running on debian jessie?
[11:33] <fusl> i'm stuck at the step "ceph-deploy mon create-initial" here http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ failing with some errors i can't seem to find the root cause: https://scr.meo.ws/paste/1475573465628792607.txt
[11:36] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[11:36] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[11:37] <peetaur2> that's what I see most of the time when I ctrl+c out of something "You cannot perform that operation on a Rados object in state configuring."
[11:37] * kefu (~kefu@114.92.125.128) has joined #ceph
[11:37] * ashah (~ashah@125.16.34.66) has joined #ceph
[11:38] <peetaur2> I decided to use the fully manual method instead because of little issues now and then like that... (issues are fine, but having them be so mysterious is not fine at all)
[11:38] <peetaur2> and looking at puppet config people posted, it seems to be more like the manual way...so I think that was a good plan if I will use puppet for ceph
[11:39] * jcsp (~jspray@62.214.2.210) has joined #ceph
[11:39] <peetaur2> the command that fails: Running command: /usr/bin/ceph [...] auth get-or-create client.admin osd allow * mds allow * mon allow *
[11:40] <peetaur2> so... does that mean the mon works already, just the admin key is not added?
[11:41] <peetaur2> you could check that by changing "cephx" to "none" in the conf and restarting the mons
[11:41] <peetaur2> and then ceph auth list
[11:42] <peetaur2> in the manual method, you first create the mon key and admin key as separate files, then put the admin one in the mon one, then add it to the monmap, then when you create the mon you use that monmap, so there is no "connect" when adding the key...no daemon running yet
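For reference, a condensed sketch of the manual monitor bootstrap peetaur2 describes; hostname, IP and fsid are placeholders, and the upstream manual-deployment docs carry the full sequence:

    ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
    ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin \
        --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'
    # fold the admin key into the mon keyring, then build the initial monmap
    ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
    monmaptool --create --add <hostname> <ip> --fsid <fsid> /tmp/monmap
    # create the mon data directory from that monmap and keyring; no daemon needs to be running
    ceph-mon --mkfs -i <hostname> --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring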
[11:42] <peetaur2> so your output is hard to understand
[11:43] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[11:46] <fusl> is there a minimum amount of mon nodes i need for ceph or is a single node fine?
[11:47] <peetaur2> ceph is intended to be on a larger scale, 3 minimum
[11:47] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) Quit (Ping timeout: 480 seconds)
[11:47] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[11:47] <peetaur2> 1 is likely possible, but sounds like a bad use of ceph
[11:49] <peetaur2> how many osds do you plan?
[11:50] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[11:50] * nardial (~ls@p5DC07F95.dip0.t-ipconnect.de) has joined #ceph
[11:50] <fusl> 4 on a single hardware node
[11:51] <peetaur2> ok I think that's ok, but you should try to get 3 mons
[11:52] <peetaur2> did you make sure to change "osd crush chooseleaf type" to 0 like it says here? http://docs.ceph.com/docs/jewel/rados/troubleshooting/troubleshooting-pg/
[11:52] <peetaur2> if not, it will break with just one osd lost
[11:53] <peetaur2> (and in my testing, it breaks (hang with easy fix, not data loss) even with that setting if you rm an osd, even after safely migrating data out using "ceph osd out ${id}", but didn't test reweight yet)
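The setting peetaur2 is referring to, as it would look in ceph.conf before a small single-host cluster is created (on an existing cluster the CRUSH rule has to be edited instead):

    [global]
    # replicate across OSDs instead of hosts; needed when all OSDs sit on a single node
    osd crush chooseleaf type = 0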
[11:53] * sickology (~root@vpn.bcs.hr) Quit (Read error: Connection reset by peer)
[11:53] * sickology (~root@vpn.bcs.hr) has joined #ceph
[11:54] <peetaur2> (I tested 2 OSDs btw... so 4 is different...need to test)
[12:00] * Arfed (~maku@watchme.tor-exit.network) has joined #ceph
[12:01] * mason1 (~VampiricP@exit1.radia.tor-relays.net) Quit ()
[12:11] <ledgr> I'm using mix of cephfs clients (some ceph-fuse, some kernel clients)
[12:12] <ledgr> But i get different metadata info on each of those nodes
[12:12] <ledgr> for example 'ceph.dir.rentries' is different on all nodes
[12:12] <ledgr> issuing 'touch' command on that directory updates metadata
[12:13] <ledgr> is this because of lazy MDS updates?
[12:14] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[12:17] * jcsp (~jspray@62.214.2.210) has joined #ceph
[12:18] * minnesotags (~herbgarci@c-50-137-242-97.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[12:26] <analbeard> does anyone have any experience with ceph-ansible?
[12:26] <analbeard> aside from leseb of course, unless he's there
[12:27] <analbeard> specifically - has anyone used it to install on Debian hosts?
[12:27] <leseb> analbeard oui ?
[12:28] <analbeard> ah morning Seb!
[12:28] <analbeard> I'm finding that instead of installing the packages listed in 'debian_package_dependencies', it's attempting to use 'debian_package_dependencies' as the name of the package to install
[12:29] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) Quit (Ping timeout: 480 seconds)
[12:29] <analbeard> ansible 2.2.0 on Debian Jessie and an up to date pull from git
[12:29] * peetaur2_ (~peter@i4DF67CD2.pool.tripleplugandplay.com) has joined #ceph
[12:30] * Arfed (~maku@watchme.tor-exit.network) Quit ()
[12:31] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[12:33] * rraja (~rraja@125.16.34.66) has joined #ceph
[12:37] * [0x4A6F]_ (~ident@p4FC2718B.dip0.t-ipconnect.de) has joined #ceph
[12:37] * dgurtner_ (~dgurtner@195.238.25.37) Quit (Ping timeout: 480 seconds)
[12:40] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:40] * [0x4A6F]_ is now known as [0x4A6F]
[12:47] * nilez (~nilez@ec2-52-37-170-77.us-west-2.compute.amazonaws.com) has joined #ceph
[12:50] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[12:50] <analbeard> leseb: sussed it - the debian dependencies block syntax isn't valid for ansible 2
[12:51] <analbeard> at least i suspect that's it, i've not had much experience with ansible
[12:51] <analbeard> I made some alterations after perusing the Ansible docs and it works just fine
[12:51] <leseb> analbeard to be honest it has been a while since I installed anything on Debian, do you mind opening an issue for this? :)
[12:52] <analbeard> leseb: certainly! i'll do that later today
[12:52] <leseb> analbeard thanks that's easier for me to track :)
[12:52] <analbeard> no probs!
[12:58] * rraja (~rraja@125.16.34.66) Quit (Ping timeout: 480 seconds)
[13:00] * kefu (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[13:02] * kefu (~kefu@114.92.125.128) has joined #ceph
[13:05] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[13:10] * rraja (~rraja@125.16.34.66) has joined #ceph
[13:10] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[13:10] <hk135> Hi All, I was wondering if anyone could help with the following error i am seeing trying to setup cephfs
[13:10] <hk135> log_channel(cluster) log [ERR] : error reading table object 'mds0_inotable' -2
[13:11] <hk135> cluster has lots of rados images working fine
[13:11] <hk135> but I am trying to add cephfs
[13:11] <hk135> so I brought up an mds and instantly got rank 0 degraded
[13:11] <hk135> have tried googling but to no avail
[13:13] <hk135> in the mds log I get error reading table object 'mds0_inotable' -2 ((2) No such file or directory)
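A hedged sketch of the kind of steps the Jewel disaster-recovery page prescribes for a damaged rank 0; these are destructive, and only reasonable here because the filesystem holds no data yet:

    # reset the per-rank tables the MDS failed to load, then the MDS journal
    cephfs-table-tool all reset session
    cephfs-table-tool all reset inode
    cephfs-journal-tool journal reset
    # tell the monitors the rank is repaired so a standby MDS can take it
    ceph mds repaired 0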
[13:20] * ashah (~ashah@125.16.34.66) Quit (Quit: Leaving)
[13:20] * dgurtner (~dgurtner@178.197.233.213) has joined #ceph
[13:21] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[13:30] * TMM (~hp@185.5.121.201) Quit (Quit: Ex-Chat)
[13:34] * ade (~abradshaw@p4FF7BEEA.dip0.t-ipconnect.de) Quit (Quit: Too sexy for his shirt)
[13:34] * ade (~abradshaw@p4FF7BEEA.dip0.t-ipconnect.de) has joined #ceph
[13:36] * georgem (~Adium@206.108.127.16) has joined #ceph
[13:38] * rraja (~rraja@125.16.34.66) Quit (Quit: Leaving)
[13:41] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Remote host closed the connection)
[13:44] * ade_b (~abradshaw@p200300886B2AAA00A6C494FFFE000780.dip0.t-ipconnect.de) has joined #ceph
[13:45] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[13:47] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[13:47] * zokko (zok@neurosis.pl) has joined #ceph
[13:47] <zokko> hi guys
[13:50] * ade (~abradshaw@p4FF7BEEA.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[13:51] * jcsp (~jspray@62.214.2.210) has joined #ceph
[13:53] * rraja (~rraja@125.16.34.66) has joined #ceph
[13:55] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[13:56] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[13:57] * kefu (~kefu@114.92.125.128) Quit (Max SendQ exceeded)
[13:58] * kefu (~kefu@114.92.125.128) has joined #ceph
[13:59] * billwebb (~billwebb@162-207-226-157.lightspeed.tukrga.sbcglobal.net) has joined #ceph
[14:01] <peetaur2_> hk135: which version?
[14:03] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[14:04] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[14:06] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[14:06] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[14:10] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) has joined #ceph
[14:14] * jidar_ (~jidar@104.207.140.225) has joined #ceph
[14:14] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[14:14] * bene2 (~bene@2601:193:4101:f410:ea2a:eaff:fe08:3c7a) has joined #ceph
[14:17] <hk135> peetaur2_: Jewel, 10.2.3
[14:17] <hk135> peetaur2: Jewel, 10.2.3
[14:17] <hk135> peetaur2: totally fresh install, as soon as I created my first fs it was damaged
[14:18] * shubjero (~shubjero@107.155.107.246) Quit (Ping timeout: 480 seconds)
[14:18] <peetaur2_> ok well... let's see ceph -s
[14:18] * shubjero (~shubjero@107.155.107.246) has joined #ceph
[14:19] * nigwil (~Oz@li1416-21.members.linode.com) Quit (Quit: leaving)
[14:19] * jidar (~jidar@104.207.140.225) Quit (Ping timeout: 480 seconds)
[14:22] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:23] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[14:28] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[14:32] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[14:33] * georgem (~Adium@206.108.127.16) has left #ceph
[14:33] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[14:33] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:34] <hk135> peetaur2_:health HEALTH_ERR, mds rank 0 is damaged, mds cluster is degraded, monmap e1: 1 mons at {bertie=192.168.2.3:6789/0}, fsmap e94: 0/1/1 up, 1 up:standby, 1 damaged, osdmap e954: 4 osds: 4 up, 4 in, 4057 GB used, 7115 GB / 11172 GB avail, 420 active+clean
[14:36] <hk135> peetaur2: was getting log_channel(cluster) log [ERR] : error reading table object 'mds0_inotable' -2 in log but I have reset everything as per http://docs.ceph.com/docs/jewel/cephfs/disaster-recovery/
[14:37] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[14:37] <hk135> peetaur2: there is no data in the cephfs yet, straight after I created the first fs it reported rank 0 damaged, so I am happy to wipe the mds, there are other pools I am using so I can't wipe the whole cluster unfort
[14:39] <peetaur2_> hk135: ok and let's see ceph mds dump -f json-pretty | nc termbin.com 9999
[14:40] <hk135> peetaur2: thats run, thanks in advance btw
[14:40] <hk135> peetaur2: appreciate someone taking the time
[14:40] <peetaur2_> ok let's see the url
[14:41] <hk135> peetaur2: http://termbin.com/zxba
[14:41] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[14:41] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) Quit (Quit: Leaving...)
[14:42] * shaunm (~shaunm@ms-208-102-105-216.gsm.cbwireless.com) has joined #ceph
[14:42] <peetaur2_> did you always have only that version 10.2.3, or did you have a mix?
[14:43] * Racpatel (~Racpatel@2601:87:3:31e3::34db) has joined #ceph
[14:43] * jcsp (~jspray@62.214.2.210) has joined #ceph
[14:43] <peetaur2_> also let's see the log... /var/log/ceph/ceph-mds.*.log
[14:43] * jdillaman_ (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) has joined #ceph
[14:43] <peetaur2_> meanwhile, I'll try deleting my cephfs and making a new one to see.... mine doesn't say "file layout v2", but "no anchor table" instead
[14:43] <peetaur2_> http://termbin.com/p1du
[14:44] <hk135> peetaur2: http://termbin.com/mcrz
[14:45] <hk135> peetaur2: I have the mds running in the foreground atm as it doesn't generate any logs otherwise
[14:46] * hk135 (~horner@rs-mailrelay1.hornerscomputer.co.uk) Quit (Quit: leaving)
[14:47] * hk135 (~horner@rs-mailrelay1.hornerscomputer.co.uk) has joined #ceph
[14:47] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[14:47] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[14:49] <peetaur2_> yours still looks different.... where does the "file layout v2" come from? http://termbin.com/js4k
[14:50] * nardial (~ls@p5DC07F95.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[14:51] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[14:51] <hk135> peetaur2: I'm not sure, it appears to be part of cephfs rather than file layouts on the osd
[14:52] <hk135> peetaur2_: I don't think I specified it anywhere
[14:53] * mhack (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[14:53] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[14:54] <peetaur2_> neither did I, but our results are different... dunno what it is
[14:54] <peetaur2_> so should we try just recreating the mds then?
[14:54] <peetaur2_> deleting all the cephfs data
[14:55] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[14:55] <hk135> peetaur2_: I have tried that numerous times, including removing all the pools and creating a new mds under a different name
[14:55] <hk135> peetaur2_: sry, all the cephfs pools, metadata and data
[14:56] * pdrakeweb (~pdrakeweb@oh-76-5-101-140.dhcp.embarqhsd.net) has joined #ceph
[14:57] * jlayton (~jlayton@cpe-2606-A000-1125-405B-14D9-DFF4-8FF1-7DD8.dyn6.twc.com) Quit (Quit: ZNC 1.6.2 - http://znc.in)
[14:58] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[14:59] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:00] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:00] * jlayton (~jlayton@cpe-2606-A000-1125-405B-14D9-DFF4-8FF1-7DD8.dyn6.twc.com) has joined #ceph
[15:00] * mhackett (~mhack@nat-pool-bos-u.redhat.com) has joined #ceph
[15:02] <hk135> peetaur2_: thanks for taking the time, I am going to whack my mds and pools this evening and try again. Hopefully it will work!
[15:02] * hk135 (~horner@rs-mailrelay1.hornerscomputer.co.uk) Quit (Quit: leaving)
[15:02] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[15:02] * mattbenjamin (~mbenjamin@76-206-42-50.lightspeed.livnmi.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[15:05] * mhack (~mhack@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[15:05] * Ivan1 (~ipencak@213.151.95.130) has joined #ceph
[15:08] * jlayton (~jlayton@cpe-2606-A000-1125-405B-14D9-DFF4-8FF1-7DD8.dyn6.twc.com) Quit (Quit: ZNC 1.6.2 - http://znc.in)
[15:08] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:09] * karnan (~karnan@125.16.34.66) Quit (Remote host closed the connection)
[15:09] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:10] * evelu (~erwan@37.162.45.209) has joined #ceph
[15:12] <Ivan1> can I configure cache-tier to move data to backing pool after they are not accessed in certain amount of time? I know I can set bytes size limit but how about access time?
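There is no direct "flush after N idle seconds" knob; flushing and eviction are driven by the dirty/full target ratios, and the closest age-related controls are minimum ages below which objects are never flushed or evicted. A sketch with placeholder pool name and values:

    ceph osd pool set cachepool cache_min_flush_age 600
    ceph osd pool set cachepool cache_min_evict_age 1800
    # these ratios, not idle time, are what actually trigger flushing/eviction
    ceph osd pool set cachepool cache_target_dirty_ratio 0.4
    ceph osd pool set cachepool cache_target_full_ratio 0.8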
[15:15] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[15:16] * TMM (~hp@dhcp-077-248-009-229.chello.nl) has joined #ceph
[15:17] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:18] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:19] * jlayton (~jlayton@107.13.71.30) has joined #ceph
[15:19] * kefu (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[15:19] * jcsp (~jspray@62.214.2.210) Quit (Quit: Ex-Chat)
[15:25] * minnesotags (~herbgarci@c-50-137-242-97.hsd1.mn.comcast.net) has joined #ceph
[15:25] * derjohn_mob (~aj@42.red-176-83-90.dynamicip.rima-tde.net) has joined #ceph
[15:26] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:28] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:36] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:36] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:41] * salwasser (~Adium@72.246.3.14) has joined #ceph
[15:44] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:45] * mhack (~mhack@nat-pool-bos-t.redhat.com) has joined #ceph
[15:45] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:49] * mhackett (~mhack@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[15:49] * ira (~ira@12.118.3.106) has joined #ceph
[15:54] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[15:55] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[15:58] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[15:59] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[16:03] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[16:04] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[16:07] * squizzi (~squizzi@107.13.237.240) has joined #ceph
[16:12] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[16:13] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[16:14] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:19] <rkeene> Rebooting any Ceph node still causes a near outage on clients doing I/O, the performance degradation is so bad
[16:20] <rkeene> I've tried norecover, nobackfill, and norebalance (and a combination of all)
[16:21] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[16:22] <rkeene> while true; do dd if=/dev/zero of=x bs=128k count=$[8*128] conv=fsync; done => Before reboot: 134217728 bytes (128.0MB) copied, 37.329114 seconds, 3.4MB/s during Reboot: 134217728 bytes (128.0MB) copied, 421.353013 seconds, 311.1KB/s
[16:22] <rkeene> And it was only so fast "During reboot" there because it was partially running during the time before a reboot
[16:23] * fridim (~fridim@56-198-190-109.dsl.ovh.fr) has joined #ceph
[16:24] * sudocat (~dibarra@2602:306:8bc7:4c50:b0a9:c99d:a621:b157) Quit (Ping timeout: 480 seconds)
[16:31] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[16:39] * billwebb (~billwebb@162-207-226-157.lightspeed.tukrga.sbcglobal.net) Quit (Quit: billwebb)
[16:40] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[16:40] * Hemanth (~hkumar_@125.16.34.66) Quit (Ping timeout: 480 seconds)
[16:41] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) has joined #ceph
[16:42] * peetaur2_ (~peter@i4DF67CD2.pool.tripleplugandplay.com) Quit (Ping timeout: 480 seconds)
[16:44] * davidzlap (~Adium@rrcs-74-87-213-28.west.biz.rr.com) Quit ()
[16:44] * Ivan1 (~ipencak@213.151.95.130) Quit (Quit: Leaving.)
[16:45] * kefu (~kefu@114.92.125.128) has joined #ceph
[16:46] <rkeene> 134217728 bytes (128.0MB) copied, 1548.487718 seconds, 84.6KB/s
[16:46] * vata (~vata@207.96.182.162) has joined #ceph
[16:49] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[16:49] * ivve (~zed@m176-68-57-242.cust.tele2.se) has joined #ceph
[16:50] <rkeene> ceph-run also doesn't pass the exit code out, all these init scripts are a mess actually
[16:51] * Concubidated (~cube@h4.246.129.40.static.ip.windstream.net) Quit (Quit: Leaving.)
[16:53] <Be-El> rkeene: it requires some time for ceph to recognize an OSD as down. as long as this hasn't happened yet, client io is sent to an unavailable OSD
[16:53] <rkeene> The OSD was marked down
[16:53] <rkeene> And out
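For a planned reboot the usual approach is to stop the cluster from marking the node's OSDs out at all, and to accept the short down-detection window Be-El describes; a sketch using standard flags and settings:

    ceph osd set noout      # before rebooting the node
    # ... reboot ...
    ceph osd unset noout    # once its OSDs have rejoined
    # an OSD is reported down after roughly 'osd heartbeat grace' seconds (default ~20),
    # and marked out after 'mon osd down out interval' (several minutes by default)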
[16:55] * ade_b (~abradshaw@p200300886B2AAA00A6C494FFFE000780.dip0.t-ipconnect.de) Quit (Quit: Too sexy for his shirt)
[16:55] * nilez (~nilez@ec2-52-37-170-77.us-west-2.compute.amazonaws.com) Quit (Ping timeout: 480 seconds)
[16:58] * sudocat1 (~dibarra@192.185.1.20) has joined #ceph
[16:58] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[17:00] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[17:02] * btaylor_ (~btaylor@static-173-61-199-211.cmdnnj.fios.verizon.net) has joined #ceph
[17:02] * lmb (~Lars@ip5b404bab.dynamic.kabel-deutschland.de) has joined #ceph
[17:03] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:03] * btaylor_ (~btaylor@static-173-61-199-211.cmdnnj.fios.verizon.net) has left #ceph
[17:04] * dgurtner (~dgurtner@178.197.233.213) Quit (Ping timeout: 480 seconds)
[17:05] * peetaur2 (~peter@i4DF67CD2.pool.tripleplugandplay.com) has joined #ceph
[17:06] <peetaur2> so I did a test.... killed one of my node vms with osd.2 and then osd.0 decided to quit too, blocking IO... what's with that? common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")
[17:07] * jcsp (~jspray@62.214.2.210) has joined #ceph
[17:08] * jcsp (~jspray@62.214.2.210) Quit ()
[17:08] * jcsp (~jspray@62.214.2.210) has joined #ceph
[17:09] * wushudoin (~wushudoin@2601:646:8200:c9f0:2ab2:bdff:fe0b:a6ee) has joined #ceph
[17:12] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[17:12] * vbellur (~vijay@71.234.224.255) Quit (Ping timeout: 480 seconds)
[17:13] * thansen (~thansen@173-14-224-98-Utah.hfc.comcastbusiness.net) has joined #ceph
[17:13] * ivve (~zed@m176-68-57-242.cust.tele2.se) Quit (Ping timeout: 480 seconds)
[17:14] * billwebb (~billwebb@50-203-47-138-static.hfc.comcastbusiness.net) has joined #ceph
[17:14] * ntpttr_ (~ntpttr@192.55.54.42) has joined #ceph
[17:16] * kefu is now known as kefu|afk
[17:20] * dgurtner (~dgurtner@178.197.233.213) has joined #ceph
[17:21] <ledgr> Hi. We are using cephfs with a mix of fuse and kernel clients. The problem is that if file operations are done on one client, metadata (ceph.dir.rsize, ceph.dir.bytes etc.) is not updated on other clients.
[17:21] * lalalal (~oftc-webi@49.35.46.20) has joined #ceph
[17:21] <ledgr> If other clients do any file operations (e.g. 'touch') on said directory, only then metadata is updated
[17:22] * Concubidated (~cube@68.140.239.164) has joined #ceph
[17:23] * nilez (~nilez@ec2-52-37-170-77.us-west-2.compute.amazonaws.com) has joined #ceph
[17:23] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:25] <peetaur2> ledgr: and umount and mount again fixes it too, right?
[17:28] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[17:28] * andreww (~xarses@64.124.158.3) has joined #ceph
[17:28] * lalalal (~oftc-webi@49.35.46.20) Quit (Remote host closed the connection)
[17:28] * thansen (~thansen@173-14-224-98-Utah.hfc.comcastbusiness.net) Quit (Quit: Ex-Chat)
[17:31] <ledgr> peetaur2: I didn't try it as it's in semi-production state, but I guess that would do the trick too.
[17:32] <ledgr> It seems like the MDS is broadcasting/pushing those values in some kind of lazy way, on some kind of trigger or something, but hey, it's a distributed filesystem and I want to see consistent values everywhere :)
[17:33] * kristen (~kristen@134.134.139.72) has joined #ceph
[17:35] * ntpttr_ (~ntpttr@192.55.54.42) Quit (Remote host closed the connection)
[17:36] * ntpttr_ (~ntpttr@134.134.139.82) has joined #ceph
[17:37] * nilez (~nilez@ec2-52-37-170-77.us-west-2.compute.amazonaws.com) Quit (Ping timeout: 480 seconds)
[17:37] <ledgr> peetaur2: But what I can tell you for sure is that "ceph daemon mds.<id> flush journal" didn't work out
[17:39] <peetaur2> what about using stat instead of ls -l?
[17:39] * dgurtner (~dgurtner@178.197.233.213) Quit (Ping timeout: 480 seconds)
[17:41] <ledgr> It's not about inode metadata, it's about MDS metadata :)
[17:41] <ledgr> e.g. /usr/bin/getfattr -n ceph.dir.rentries /path/to/directory
[17:42] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:42] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[17:43] * nilez (~nilez@104.129.29.42) has joined #ceph
[17:44] <Be-El> ledgr: ceph.dir.rentries? are these quota settings?
[17:44] <ledgr> Be-El: yes
[17:44] <Be-El> ledgr: quota support is not implemented for the kernel client yet. you have to use ceph-fuse for it
[17:44] * vbellur (~vijay@nat-pool-bos-t.redhat.com) has joined #ceph
[17:44] <ledgr> ceph.dir.entries ceph.dir.files ceph.dir.rbytes ceph.dir.rctime ceph.dir.rentries ceph.dir.rfiles ceph.dir.rsubdirs ceph.dir.subdirs
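For reference, a small sketch of how those xattrs are read and how quotas are set; the path is a placeholder, Jewel-era quotas are enforced only by ceph-fuse, and the recursive stats propagate lazily so they can lag behind recent changes:

    # recursive statistics maintained by the MDS
    getfattr -n ceph.dir.rbytes   /mnt/cephfs/somedir
    getfattr -n ceph.dir.rentries /mnt/cephfs/somedir
    # quotas are separate xattrs set explicitly by the admin
    setfattr -n ceph.quota.max_bytes -v 10737418240 /mnt/cephfs/somedir
    getfattr -n ceph.quota.max_bytes /mnt/cephfs/somedir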
[17:45] * post-factum (~post-fact@vulcan.natalenko.name) Quit (Quit: ZNC 1.6.3 - http://znc.in)
[17:46] * post-factum (~post-fact@vulcan.natalenko.name) has joined #ceph
[17:48] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[17:53] * jidar_ is now known as jidar
[17:55] * rraja (~rraja@125.16.34.66) Quit (Quit: Leaving)
[17:56] * evelu (~erwan@37.162.45.209) Quit (Ping timeout: 480 seconds)
[17:56] * rakeshgm (~rakesh@66-194-8-225.static.twtelecom.net) has joined #ceph
[18:03] * vbellur (~vijay@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[18:05] * dgurtner (~dgurtner@109.236.136.226) has joined #ceph
[18:06] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[18:15] <ledgr> Be-El: I have just created a new file (/bin/dd) on fuse client and even same client does not have correct (+1) rentries data even after waiting 10+minutes
[18:15] <ledgr> Not even talking about other nodes
[18:17] * vbellur (~vijay@nat-pool-bos-u.redhat.com) has joined #ceph
[18:17] <minnesotags> fusl: I just got ceph working on Debian yesterday, after days of flogging on it and no advice.
[18:20] * thoht (~fox@2a01:e34:ef3e:5270:ba27:ebff:fe18:4629) has joined #ceph
[18:20] <thoht> hi
[18:21] <thoht> this afternoon, i tried to move an OSD journal onto an SSD but encountered an error with --flush-journal
[18:21] <thoht> i got a coredump each time i tried to run ceph-osd -i osd.id --flush-journal
[18:22] <thoht> finally; i just moved the journal to the SSD then created a symlink and it worked
[18:22] <thoht> but i m confused about flush-journal
[18:25] <minnesotags> fusl; I have it working on a single machine, with 6 osds. I tried using the manual method, but the instructions are incomplete and confused. You are better off resigning yourself to use ceph-deploy as far as you can get. If ceph-deploy fails at some point, pick up with the manual method. You need to make sure that you have also added the users, ceph, client.admin, etc, and have created keys and added those keys to the host (as desc
[18:26] <SamYaple> thoht: to move a journal, you need to stop the osd, run --flush-journal, update the journal location
[18:26] <SamYaple> thoht: without the error, i dont know how to help exactly
[18:26] <SamYaple> thoht: if you didnt properly flush the journal, that leads me to think the data on that osd is suspect
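The usual sequence for relocating a FileStore journal, roughly as SamYaple describes; the OSD id, paths and systemd unit below are placeholders for whatever the host actually uses:

    systemctl stop ceph-osd@2                  # or the distro's init-script equivalent
    ceph-osd -i 2 --flush-journal              # write out anything still sitting in the journal
    # point the OSD at the new journal: set 'osd journal' in ceph.conf, or replace
    # /var/lib/ceph/osd/ceph-2/journal with a symlink to the new SSD partition
    ceph-osd -i 2 --mkjournal                  # initialise the journal at the new location
    systemctl start ceph-osd@2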
[18:26] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[18:26] <minnesotags> Also fusl, Apparently there are bugs with the older versions of systemd that Debian uses, and there is zero documentation anymore for running with sysvinit. Firefly was a lot easier.
[18:28] * jermudgeon_ (~jermudgeo@wpc-pe-l2.whitestone.link) has joined #ceph
[18:29] <thoht> SamYaple: i stopped the osd and tried the flush-journal, but as it was core dumping i just moved the "journal" file onto the SSD, then started the osd back up and it worked
[18:30] <thoht> (with a symlink from ssd journal to initial place)
[18:30] <SamYaple> thoht: so the file journal now just lives on a file system on the ssd?
[18:30] <thoht> SamYaple: yes
[18:30] <thoht> through a symlink
[18:30] <SamYaple> oh. that sounds less than ideal
[18:31] <thoht> SamYaple: why ?
[18:31] <SamYaple> how are you ensuring it automounts?
[18:31] <SamYaple> thoht: that is not how journals in ceph work, so automation and udev tools will be a problem
[18:31] <thoht> SamYaple: there is no need of automount
[18:31] <thoht> the SSD is used for the OS
[18:31] <thoht> so it is already automounted
[18:31] <SamYaple> eh. then it should be ok for you
[18:32] <SamYaple> but you are certainly non-standard
[18:32] <SamYaple> that may cause you problems down the road
[18:32] <thoht> SamYaple: i tried to use flush journal but it was coredumping
[18:33] <SamYaple> i hear what you are saying. but that doesn't change the fact that you are in a configuration very few (if any) others are in. so you may experience problems that no one else has
[18:33] <thoht> i can put the exact error on gist
[18:33] <SamYaple> external journals typically reside in their own partitions without a filesystem, rather than as a file on a filesystem external to the osd
[18:35] <thoht> SamYaple: https://gist.github.com/anonymous/2aca8790acce054b9032311f8f14628b
[18:36] <thoht> SamYaple: previously the journal was in same partition of the OSD (SATA)
[18:36] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:36] <SamYaple> thoht: it is normal to have the journal as a file when it is colocated
[18:36] <thoht> journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
[18:37] <thoht> i didn't understand this error
[18:38] <SamYaple> you are using a file journal not a block journal (its not its own partition)
[18:38] <thoht> SamYaple: i saw that it is better to give a raw partition /dev/sdaX for the SSD journal but it was not possible on this device, that's why i put the journal on a FS, but i guess it is better than nothing
[18:38] <thoht> yes using a file journal
[18:38] <SamYaple> thats what that error is, it does say it plainly
[18:39] <thoht> ok so flush-journal is for block journal
[18:39] <thoht> and i had to use journal_force_aio but how ?
[18:40] <thoht> i don' t see this option in man ceph-osd
[18:41] <SamYaple> no, flush-journal is not just for block journal
[18:42] <SamYaple> why do you want to enable journal_force_aio for a file based journal again?
[18:43] <thoht> i wanted only to flush the journal in order to move it to SSD
[18:45] <SamYaple> technically since you are moving the file like that flushing the journal is not needed. but like i said you are in a strange unsupported setup right now. though you should be safe
[18:45] <SamYaple> personally, I wouldn't be doing what you are doing. but you aren't doing anything wrong, just untested/unsupported
[18:45] <thoht> SamYaple: you mean it is recommended to use a journal file when it is colocated inside the OSD partition, but it is not recommended to use a file journal when it is an SSD for journal and SATA for OSD ?
[18:47] <SamYaple> there are three supported scenarios that I know of. OSD with 1 partition and journal colocated as a file; OSD with 2 partitions and journal colocated on a partition; OSD with 1 partition with journal as a partition on another device (SSD)
[18:47] <SamYaple> this isn't counting bluestore of course
[18:48] <thoht> i don't see the harm in having the journal as a file on an SSD when it is supported to have the journal as a file on SATA, colocated or in a separate partition
[18:49] <thoht> and i'm open to any explanation
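The third scenario SamYaple lists is what ceph-disk produces when handed a separate journal device; a sketch with placeholder device names, where ceph-disk carves its own journal partition on the SSD:

    # /dev/sdd is the spinning data disk, /dev/sdb the SSD that will hold journal partitions
    ceph-disk prepare --cluster ceph /dev/sdd /dev/sdb
    ceph-disk activate /dev/sdd1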
[18:50] <SamYaple> you are now in a configuration that is untested and i don't know of anyone else in the configuration you are describing (not to say someone isnt doing it). That means you *may* experience bugs and problems no one else has because its an untested configuration. you may not. no one knows, because its not tested against
[18:51] <thoht> i understand but technically what does it really change to have it on /my_osd/journal or in /my_other_paritition_ssd_/journal
[18:51] <thoht> now, it is temporary; i'm in the process of moving to a dedicated ssd partition for the journal, but that will come in a second step
[18:55] * jdillaman__ (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) has joined #ceph
[18:57] <thoht> and i still don't understand the part about "journal force aio = true"
[18:57] <thoht> :/
[18:57] * billwebb (~billwebb@50-203-47-138-static.hfc.comcastbusiness.net) Quit (Quit: billwebb)
[18:58] * kefu|afk (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[18:58] <SamYaple> thoht: enabling AIO on your file based journal on your root filesystem is going to be awful. I would recommend against it
[18:59] * billwebb (~billwebb@50-203-47-138-static.hfc.comcastbusiness.net) has joined #ceph
[18:59] * kefu (~kefu@114.92.125.128) has joined #ceph
[19:01] * jdillaman_ (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[19:01] * jdillaman (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[19:01] * jdillaman__ is now known as jdillaman
[19:06] * kefu (~kefu@114.92.125.128) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[19:06] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) has joined #ceph
[19:07] * jcsp (~jspray@62.214.2.210) has joined #ceph
[19:08] * kristen (~kristen@134.134.139.72) Quit (Remote host closed the connection)
[19:11] * dillaman (~jdillaman@pool-108-18-97-95.washdc.fios.verizon.net) has joined #ceph
[19:13] * xinli (~charleyst@32.97.110.51) has joined #ceph
[19:19] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[19:20] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[19:21] * davidzlap (~Adium@cpe-172-91-154-245.socal.res.rr.com) Quit (Quit: Leaving.)
[19:27] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[19:28] <lincolnb> is there a good rule of thumb for scaling mds cache size to the size of RAM?
[19:28] * morse_ (~morse@supercomputing.univpm.it) has joined #ceph
[19:29] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[19:30] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[19:31] * vikhyat (~vumrao@49.248.86.245) has joined #ceph
[19:31] * ircolle (~Adium@c-50-155-137-227.hsd1.co.comcast.net) has joined #ceph
[19:32] * mykola (~Mikolaj@91.245.79.65) has joined #ceph
[19:33] <lincolnb> i.e., i have 64GB RAM, so how large should my mds cache size be?
[19:33] * ntpttr_ (~ntpttr@134.134.139.82) Quit (Remote host closed the connection)
[19:33] <gregsfortytwo> lincolnb: at last count a dentry+inode took 2KB of RAM; limited user evidence suggests ~4KB but it's not clear how much of that is scaling RAM usage versus the base daemon requirements
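Rough arithmetic under that assumption: at 2-4KB per cached dentry+inode, a million cached inodes costs roughly 2-4GB of resident memory, so a 64GB machine has room for a few million entries while still leaving headroom for the rest of the daemon. The pre-Luminous knob is an entry count, not bytes:

    [mds]
    # number of inodes/dentries to keep cached (default 100000)
    mds cache size = 1000000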
[19:33] * ntpttr_ (~ntpttr@192.55.54.42) has joined #ceph
[19:35] <lincolnb> gregsfortytwo: thanks. right now i'm using ~50G RAM (0.94.9) w/ the default mds cache size, which seems a bit curious maybe
[19:35] * aNuposic (~aNuposic@134.134.139.76) has joined #ceph
[19:35] <lincolnb> i'll try bumping up the cache size and see if I can OOM my machine :)
[19:37] <aNuposic> Hi folks, I tried RADOSGW with Swift temp URLs and it worked, but now I want to try the radosgw temp URL format. With that, do I need to integrate with OpenStack Keystone?
[19:37] <gregsfortytwo> lincolnb: that definitely doesn't sound right
[19:37] <gregsfortytwo> is it rss or vss?
[19:37] <gregsfortytwo> and how many clients are connected?
[19:37] <aNuposic> is Keystone the only way radosgw can communicate with OpenStack services?
[19:38] <gregsfortytwo> lincolnb: and are there any warnings in ceph -s? ;)
[19:38] <aNuposic> Input would be very much appreciated. :)
[19:39] <lincolnb> ya, persistent 'failing to respond to cache pressure' warnings from fairly new clients (4.4.x kernel). from the mds log:
[19:39] <lincolnb> "2016-10-04 13:39:04.266177 7fe4c7ad7700 2 mds.0.cache check_memory_usage total 51802068, rss 50772516, heap 1412172, malloc -2084661 mmap 0, baseline 1412172, buffers 0, max 1048576, 112838 / 150014 inodes have caps, 113044 caps, 0.753556 caps per inode"
[19:40] <lincolnb> about 35 clients connected
[19:41] * vbellur (~vijay@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[19:41] <lincolnb> been tracking my num_caps and i'm noticing clients will suck up ~100k caps for a bit then slow everything down (stat'ing files, reading directories, etc) until they're released
[19:41] <aNuposic> Anyone have an idea how radosgw temp URLs work with OpenStack?
[19:41] <gregsfortytwo> lincolnb: well, you've definitely got 150k inodes sitting in memory then, rather than the default 100k
[19:41] * jermudgeon__ (~jermudgeo@wpc-pe-r2.whitestone.link) has joined #ceph
[19:42] <lincolnb> ah, yeah, i did change the mds cache size to 120k, but not 150k
[19:42] <gregsfortytwo> it's vaguely possible that continuously trying to evict them, and not being able to keep new data in-memory, is leading to much larger actual memory usage so I'd try bumping it up quite a bit
[19:42] <gregsfortytwo> and see what happens
[19:43] <lincolnb> yeah, I definitely noticed some interesting behavior w/ clients eating up all of the caps. i had a 0.94.5 client that had been sitting on 100k caps for a week and was causing heavy reads (bursts of 100MB/s for several minutes) to the metadata pool continuously
[19:44] <lincolnb> the bad client was using the FUSE client rather than the kclient. not sure if the reboot or the update fixed it. either way, exhausting caps definitely causes some fun behavior :)
[19:44] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Remote host closed the connection)
[19:45] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[19:47] <lincolnb> *sigh* i really need to update to EL7/jewel!
[19:48] * jermudgeon_ (~jermudgeo@wpc-pe-l2.whitestone.link) Quit (Ping timeout: 480 seconds)
[19:50] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Remote host closed the connection)
[19:50] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) has joined #ceph
[19:51] * vbellur (~vijay@nat-pool-bos-t.redhat.com) has joined #ceph
[19:52] * jcsp (~jspray@62.214.2.210) has joined #ceph
[19:52] * kristen (~kristen@134.134.139.72) has joined #ceph
[19:53] * derjohn_mob (~aj@42.red-176-83-90.dynamicip.rima-tde.net) Quit (Ping timeout: 480 seconds)
[19:56] * jermudgeon (~jermudgeo@wpc-pe-l2.whitestone.link) has joined #ceph
[19:57] * vikhyat (~vumrao@49.248.86.245) Quit (Quit: Leaving)
[19:58] * ledgr (~ledgr@88-119-196-104.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[20:02] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[20:02] * jermudgeon__ (~jermudgeo@wpc-pe-r2.whitestone.link) Quit (Ping timeout: 480 seconds)
[20:05] * Hemanth (~hkumar_@103.228.221.190) has joined #ceph
[20:07] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) has joined #ceph
[20:09] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) has joined #ceph
[20:12] * jermudgeon (~jermudgeo@wpc-pe-l2.whitestone.link) Quit (Ping timeout: 480 seconds)
[20:20] * xinli (~charleyst@32.97.110.51) Quit (Ping timeout: 480 seconds)
[20:21] * efirs (~firs@98.207.153.155) has joined #ceph
[20:22] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[20:22] <tessier_> I can't believe it's been a month and I still don't even have the simplest ceph cluster working. :(
[20:23] * rendar (~I@87.13.171.75) Quit (Ping timeout: 480 seconds)
[20:23] * ntpttr_ (~ntpttr@192.55.54.42) Quit (Remote host closed the connection)
[20:25] * jcsp (~jspray@62.214.2.210) has joined #ceph
[20:30] * ircolle (~Adium@c-50-155-137-227.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[20:31] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[20:38] * ira (~ira@12.118.3.106) Quit (Remote host closed the connection)
[20:40] * ira (~ira@12.118.3.106) has joined #ceph
[20:44] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:84c4:a35d:32e9:a5e9) Quit (Ping timeout: 480 seconds)
[20:49] * rendar (~I@87.13.171.75) has joined #ceph
[20:50] * salwasser (~Adium@72.246.3.14) Quit (Quit: Leaving.)
[20:51] * efirs (~firs@98.207.153.155) Quit (Quit: Leaving.)
[20:54] * branto (~branto@transit-86-181-132-209.redhat.com) Quit (Quit: ZNC 1.6.3 - http://znc.in)
[20:55] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) Quit (Quit: treenerd_)
[20:59] * Behedwin (~spate@46.166.186.243) has joined #ceph
[21:02] * bniver (~bniver@pool-71-174-250-171.bstnma.fios.verizon.net) has joined #ceph
[21:02] * ledgr (~ledgr@88-222-11-185.meganet.lt) has joined #ceph
[21:03] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) Quit (Quit: Leaving.)
[21:03] * ntpttr_ (~ntpttr@192.55.54.42) has joined #ceph
[21:11] <thoht> i need a recommendation. i got 3 physical servers running ceph, 2 OSDs of 1TB per server, so 6 OSDs, replica 3. and now i got the chance to add 2 x 480GB SSDs to each of them. so the question is: how to use them in the best way? i'm guessing to use the first SSD partition for the journal. the other options are to use the SSD for the primary replica to speed up writes, or to use tier caching. what would you do ?
[21:11] * blynch (~blynch@vm-nat.msi.umn.edu) has joined #ceph
[21:15] * Flynn (~stefan@ip-185-87-117-140.fiber.nl) has joined #ceph
[21:21] <Flynn> Hi all, I have a working ceph cluster with 3 nodes, 9 SATA disks per node and 9 SAS disks per node (all disks with a journal partition on an SSD). I made a crush map with two rules, one with all SATA disks, and one with all SAS disks. I have an rbd pool on the SAS rule, with size 2. That works fine. Cluster is active+clean. Now, I removed a SAS disk (by ceph osd out osd.17) and watched the cluster rebalance. It does a good job with the first disk, but on
[21:21] <Flynn> the second disk, I end up with 5 pg's active+remapped and 290 objects misplaced. After putting the osd "in" again, this is fixed. I cannot see why it wouldn't completely balance in this case. Anybody an idea?
[21:24] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[21:28] <thoht> anybody tried using ZFS as a backend for SATA OSDs, with L2ARC+ZIL on SSD to boost perf with those read/write caches?
[21:29] * Behedwin (~spate@46.166.186.243) Quit ()
[21:31] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) has joined #ceph
[21:33] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[21:41] * mykola (~Mikolaj@91.245.79.65) Quit (Quit: away)
[21:43] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) Quit (Quit: Leaving.)
[21:44] * Hemanth (~hkumar_@103.228.221.190) Quit (Ping timeout: 480 seconds)
[21:45] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) has joined #ceph
[21:47] <SamYaple> thoht: yes. it's not great
[21:47] <SamYaple> thoht: the best performance I get is with bcache
[21:47] <SamYaple> bcache + XFS
[21:47] <thoht> SamYaple: is there any howto about bcache/ceph? i don't know bcache, honestly
[21:48] <thoht> "Bcache is a Linux kernel block layer cache. It allows one or more fast disk drives such as flash-based solid state drives (SSDs) to act as a cache for one or more slower hard disk drives."
[21:48] <thoht> omg !
[21:48] <SamYaple> you'll want a howto on setting up bcache. once it is set up you'll have a new device at /dev/bcache0 or similar. just use that device when setting up the osd
[21:49] <SamYaple> Arch has one of the better howtos on bcache https://wiki.archlinux.org/index.php/Bcache
[21:49] <thoht> so once i got /dev/bcache0, i just add it as an OSD ?
[21:49] <SamYaple> yea
[21:49] <thoht> how does ceph know to use it as a cache ?
[21:50] <SamYaple> ceph doesn't. bcache is the caching software
[21:50] <thoht> i mean i will have on my 3 servers 4 SATA OSD and 2 bcache OSD
[21:50] * nigwil (~Oz@li1416-21.members.linode.com) has joined #ceph
[21:50] <SamYaple> wait, I'm confused. what are you trying to do?
[21:50] <SamYaple> you want caching osds?
[21:50] <SamYaple> or you want your osds cached?
[21:50] <thoht> no no i want to use bcache
[21:51] <thoht> i mean to try your suggestion
[21:51] <thoht> my Ceph cluster is for now 6 OSD of 1TB SATA
[21:51] <thoht> it is a bit slow
[21:52] <thoht> and now; i ve 2x480G SSD for each server in addition
[21:52] <SamYaple> a bcache setup would be /dev/sda as the cache (SSD) and /dev/sdb, /dev/sdc, /dev/sdx as backing devices (spinning disks), which would produce /dev/bcache0, /dev/bcache1, /dev/bcacheN
[21:52] <SamYaple> ceph would only ever talk to /dev/bcache*
[21:52] <thoht> oh ok
[21:52] <SamYaple> of note, upstream bcache does not have support for partitions on the /dev/bcache* device. I wrote an article about this you may be interested in https://yaple.net/2016/03/31/bcache-partitions-and-dkms/
[21:52] <thoht> so /dev/bcache0 will be a cache for my current OSD /dev/sda for instance
[21:52] <SamYaple> thoht: yes
[21:52] <thoht> i get it !
[21:52] <thoht> so nothing related to ceph in the end
[21:52] <thoht> nothing to configure in ceph, i mean
[21:53] <SamYaple> ceph is not aware of any caching, correct
[21:53] <thoht> perfect !!
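A rough sketch of the workflow SamYaple is describing, assuming bcache-tools is installed, /dev/sdb is the SSD, /dev/sdc is one of the spinning disks, and the OSD is prepared with the Jewel-era ceph-deploy tooling; the device names, hostname and the ceph-deploy step are assumptions, not something stated in the chat:

    # format the SSD as a cache device and the spinner as a backing device
    make-bcache -C /dev/sdb
    make-bcache -B /dev/sdc
    # attach the backing device to the cache set (uuid comes from the SSD's superblock)
    bcache-super-show /dev/sdb | grep cset.uuid
    echo <cset-uuid> > /sys/block/bcache0/bcache/attach
    # ceph only ever sees the bcache device; prepare the OSD on it as usual
    # (keep SamYaple's caveat about partitions on /dev/bcache* in mind)
    ceph-deploy osd prepare node1:/dev/bcache0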
[21:56] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) Quit (Quit: Leaving.)
[21:57] * ircolle (~Adium@c-50-155-137-227.hsd1.co.comcast.net) has joined #ceph
[21:59] * aNuposic (~aNuposic@134.134.139.76) Quit (Remote host closed the connection)
[21:59] * Hemanth (~hkumar_@103.228.221.190) has joined #ceph
[22:00] <thoht> SamYaple: http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043513.html <== bcache doesn't seem to be stable?
[22:00] <thoht> "We have also got unrecoverable XFS errors with bcache"
[22:04] <SamYaple> thoht: bcache has never let me down before. I have been running with ceph and bcache for ~3 years now, bcache for even longer. a random report where someone blames bcache but provides no evidence isn't evidence against bcache
[22:06] <thoht> SamYaple: which kernel version do you use ?
[22:09] * ircolle (~Adium@c-50-155-137-227.hsd1.co.comcast.net) Quit (Ping timeout: 480 seconds)
[22:11] <SamYaple> thoht: I have a bunch of different kernels running in a bunch of different places
[22:11] <thoht> SamYaple: i mean with bcache
[22:11] <SamYaple> the answer is the same
[22:12] <SamYaple> i use bcache in most places
[22:12] <thoht> i got a kernel 3.13
[22:12] <thoht> is it suitable for bcache ?
[22:13] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[22:14] <SamYaple> Yes. i run 3.13 on some of the servers
[22:14] <thoht> SamYaple: i was reading this article from Sebastien Han blog: https://www.sebastien-han.fr/blog/2015/12/21/ceph-crush-rule-1-copy-ssd-and-1-copy-sata/
[22:14] <thoht> he suggests using 1 copy on SSD and 1 copy on SATA
[22:14] <thoht> so the first replica is written to SSD drives
[22:14] <thoht> and second to SATA drives
[22:15] <thoht> it requires a crush rule; so it is very different from bcache
[22:15] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[22:16] <SamYaple> that's to have read affinity, so you read from the fast disk all the time
[22:16] <SamYaple> very different
[22:16] * ntpttr__ (~ntpttr@192.55.54.42) has joined #ceph
[22:18] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[22:19] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Quit: doppelgrau)
[22:20] <thoht> so there are different approaches to using SSDs to speed up a ceph cluster with SATA disks
[22:20] <thoht> 1. bcache is transparent, no ceph config; it will improve writes/reads
[22:20] <thoht> 2. affinity on SSD needs ceph config; it will improve writes/reads
[22:21] * ntpttr_ (~ntpttr@192.55.54.42) Quit (Remote host closed the connection)
[22:22] <SamYaple> 2 will not improve writes. It still has to write to the spinning disk before returning
[22:23] <thoht> oh, then it will only improve reads
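For reference, the kind of rule that blog post describes is injected with the standard getcrushmap/crushtool round trip. A rough sketch, assuming 'ssd' and 'platter' roots already exist in the CRUSH map, rule id 5 is free, and the pool is called rbd; none of these names come from the chat:

    # pull, decompile, edit, recompile and re-inject the crush map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # add a rule along these lines to crushmap.txt:
    #   rule ssd-primary {
    #           ruleset 5
    #           type replicated
    #           min_size 1
    #           max_size 10
    #           step take ssd
    #           step chooseleaf firstn 1 type host
    #           step emit
    #           step take platter
    #           step chooseleaf firstn -1 type host
    #           step emit
    #   }
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
    # point a pool at the new rule (pre-Luminous syntax)
    ceph osd pool set rbd crush_ruleset 5

As SamYaple says, the primary-on-SSD layout mostly helps reads: a client write is only acknowledged once every replica, including the SATA one, has it.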
[22:24] <m0zes> so, does anyone know the codepath followed for the monitors to promote a standby mds to active? I think it is just this one: https://github.com/ceph/ceph/blob/master/src/mon/MDSMonitor.cc#L2865
[22:25] <thoht> SamYaple: does bcache require an SSD mirror (RAID)?
[22:27] * thundercloud (~MJXII@exit0.radia.tor-relays.net) has joined #ceph
[22:33] <thoht> SamYaple: what about flashcache ?
[22:33] * fridim (~fridim@56-198-190-109.dsl.ovh.fr) Quit (Ping timeout: 480 seconds)
[22:34] * Jeffrey4l__ (~Jeffrey@110.244.241.218) Quit (Read error: Connection reset by peer)
[22:34] * vbellur (~vijay@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving.)
[22:35] * Jeffrey4l__ (~Jeffrey@110.252.62.76) has joined #ceph
[22:36] * haplo37 (~haplo37@199.91.185.156) Quit (Read error: Connection reset by peer)
[22:37] * Rickus (~Rickus@office.protected.ca) has joined #ceph
[22:37] * Hemanth (~hkumar_@103.228.221.190) Quit (Quit: Leaving)
[22:37] * Hemanth (~hkumar_@103.228.221.190) has joined #ceph
[22:40] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) has joined #ceph
[22:41] * davidzlap (~Adium@2605:e000:1313:8003:9899:8d85:63bf:27e7) Quit ()
[22:49] <diq> flashcache is bitrot
[22:49] <diq> it's not maintained
[22:51] <thoht> make-bcache -C /dev/sda3 <== returns: Not enough buckets: 2, need 128
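That error usually just means the target partition is too small for the bucket size make-bcache wants to use: it needs at least 128 buckets and only found room for 2. A possible workaround, assuming /dev/sda3 really is meant to stay that small, is a smaller bucket size; otherwise use a bigger partition:

    # shrink the bucket size so at least 128 buckets fit in the partition
    make-bcache --bucket 64k -C /dev/sda3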
[22:51] <diq> I know the people formerly responsible for flashcache, and they would not tell you to deploy it today.
[22:51] <thoht> diq: thanks; will use bcache
[22:52] <darkfader> thanks to synology it's got the largest installed base now :)
[22:52] * georgem (~Adium@24.114.68.241) has joined #ceph
[22:52] <darkfader> idk what got into them to use flashcache this late
[22:55] <diq> it was all the hotness in the mysql world about 5-6 years ago
[22:55] <diq> then flash got cheap and people just put DB's on flash
[22:56] <darkfader> yeah but synology has been pushing it for like a year now
[22:56] <diq> then that's silly ;)
[22:56] <darkfader> yeah
[22:56] <darkfader> from the stuff i tried, enhanceio (also dead) is the only one that has nice performance characteristics
[22:56] <darkfader> but dead's dead
[22:57] * thundercloud (~MJXII@exit0.radia.tor-relays.net) Quit ()
[23:02] * malevolent (~quassel@192.146.172.118) has joined #ceph
[23:03] * malevolent (~quassel@192.146.172.118) Quit (Remote host closed the connection)
[23:03] * jcsp (~jspray@62.214.2.210) Quit (Ping timeout: 480 seconds)
[23:05] * newbie (~kvirc@host217-114-156-249.pppoe.mark-itt.net) Quit (Ping timeout: 480 seconds)
[23:06] * dug (~Coe|work@46.166.138.132) has joined #ceph
[23:11] * efirs (~firs@98.207.153.155) has joined #ceph
[23:15] * pinkypie (~AusGersw6@S010600212986d401.ed.shawcable.net) has joined #ceph
[23:18] <SamYaple> darkfader: it's because it's the only one that allows you to attach a cache on the fly
[23:18] <SamYaple> that's why synology is using it
[23:18] <SamYaple> well, and enhanceio, but that is a fork IIRC
[23:19] <darkfader> makes sense
[23:19] <tessier_> Any Ceph experts here who are also on the ceph-users list? I emailed the list with subject "Can't activate OSD". I've been trying for a month to get a simple cluster going following the quickstart guide but keep getting hung up on this particular step. I don't even see any error messages other than the 300 second timeout.
[23:20] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) Quit (Ping timeout: 480 seconds)
[23:22] * billwebb (~billwebb@50-203-47-138-static.hfc.comcastbusiness.net) Quit (Quit: billwebb)
[23:32] * kristen (~kristen@134.134.139.72) Quit (Quit: Leaving)
[23:32] <minnesotags> tessier_ pastebin?
[23:32] * georgem (~Adium@24.114.68.241) Quit (Quit: Leaving.)
[23:32] <tessier_> minnesotags: Yeah, just a sec...logs coming up...
[23:33] <minnesotags> What are you installing on?
[23:33] <minnesotags> What OS, what hardware, etc.?
[23:34] * rendar (~I@87.13.171.75) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[23:35] * evelu (~erwan@37.160.181.8) has joined #ceph
[23:36] * dug (~Coe|work@46.166.138.132) Quit ()
[23:38] * rakeshgm (~rakesh@66-194-8-225.static.twtelecom.net) Quit (Quit: Peace :))
[23:42] * Flynn (~stefan@ip-185-87-117-140.fiber.nl) Quit (Quit: Flynn)
[23:44] * squizzi (~squizzi@107.13.237.240) Quit (Quit: bye)
[23:47] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[23:49] * ntpttr_ (~ntpttr@192.55.54.42) has joined #ceph
[23:49] * ntpttr__ (~ntpttr@192.55.54.42) Quit (Remote host closed the connection)
[23:54] <tessier_> minnesotags: Sorry for the delay, had a network issue.
[23:54] <tessier_>
[23:54] <tessier_> http://pastebin.com/A2kP28c4 has the logs
[23:54] <tessier_> Installing on CentOS 7, Supermicro hardware.
[23:55] <tessier_> I've got 4 machines: a deploy server (which is a VM), a monitor, and two OSD nodes. Each OSD node has 4GB of RAM and 4x 1TB disks. This is just a tiny test cluster.
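For anyone hitting the same 300-second activate timeout on a CentOS 7 / ceph-deploy install, a rough first-pass checklist would look like the following; nothing below comes from the chat, and the osd id is a placeholder:

    # on the OSD host: did ceph-disk partition and recognize the disk?
    ceph-disk list
    # the osd daemon's own log usually says why activation hangs
    tail -n 100 /var/log/ceph/ceph-osd.*.log
    # can the host reach the monitor, and does the cluster have quorum?
    ceph -s
    # CentOS 7 firewalld blocking 6789 (mon) and 6800-7300 (osd) is a classic cause
    firewall-cmd --list-all
    # check the systemd unit for the osd that should have come up (assuming osd.0)
    systemctl status ceph-osd@0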
[23:56] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.