#ceph IRC Log

IRC Log for 2016-08-15

Timestamps are in GMT/BST.

[0:00] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[0:03] * efirs (~firs@98.207.153.155) has joined #ceph
[0:06] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[0:09] * darthbacon (~darthbaco@67-61-63-35.cpe.cableone.net) Quit (Read error: Connection reset by peer)
[0:09] * danieagle (~Daniel@187.34.0.61) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[0:14] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[0:18] * dnunez (~dnunez@209-6-91-147.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com) Quit (Remote host closed the connection)
[0:19] * Kottizen (~brianjjo@9YSAABCK6.tor-irc.dnsbl.oftc.net) Quit ()
[0:32] * doppelgrau1 (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[0:32] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[0:47] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[0:49] * Georgyo (~georgyo@shamm.as) Quit (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
[0:55] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[0:55] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[1:03] * bret1 (~colde@185.100.85.101) has joined #ceph
[1:05] * _mrp (~mrp@77-46-170-96.dynamic.isp.telekom.rs) has joined #ceph
[1:06] * _mrp (~mrp@77-46-170-96.dynamic.isp.telekom.rs) Quit ()
[1:12] * [0x7c1] (~1985@terminator.vision) has left #ceph
[1:14] * Racpatel (~Racpatel@2601:87:0:24af::53d5) Quit (Ping timeout: 480 seconds)
[1:30] * yanzheng (~zhyan@125.70.22.133) has joined #ceph
[1:33] * bret1 (~colde@26XAAA2KH.tor-irc.dnsbl.oftc.net) Quit ()
[1:41] * oms101 (~oms101@p20030057EA3C6A00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:47] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Remote host closed the connection)
[1:50] * oms101 (~oms101@p20030057EA021A00C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[2:00] * yanzheng (~zhyan@125.70.22.133) Quit (Quit: This computer has gone to sleep)
[2:00] * bene2 (~bene@2601:193:4101:f410:ea2a:eaff:fe08:3c7a) has joined #ceph
[2:02] * bene2 (~bene@2601:193:4101:f410:ea2a:eaff:fe08:3c7a) Quit ()
[2:03] * cronburg (~cronburg@209-6-121-249.c3-0.arl-ubr1.sbo-arl.ma.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[2:07] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:16] * nils_ (~nils_@doomstreet.collins.kg) has joined #ceph
[2:19] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[2:30] * Thononain (~SweetGirl@185.100.85.101) has joined #ceph
[2:34] * yanzheng (~zhyan@125.70.22.133) has joined #ceph
[2:38] * ronrib (~boswortr@45.32.242.135) has joined #ceph
[2:45] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[2:47] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[2:55] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[2:58] * hellertime (~Adium@pool-71-162-119-41.bstnma.fios.verizon.net) has joined #ceph
[2:58] * vbellur (~vijay@71.234.224.255) Quit (Remote host closed the connection)
[3:00] * Thononain (~SweetGirl@26XAAA2ME.tor-irc.dnsbl.oftc.net) Quit ()
[3:11] * derjohn_mobi (~aj@x590d6735.dyn.telefonica.de) has joined #ceph
[3:18] * nils_ (~nils_@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[3:19] * derjohn_mob (~aj@x590db90b.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[3:21] * hellertime (~Adium@pool-71-162-119-41.bstnma.fios.verizon.net) Quit (Quit: Leaving.)
[3:22] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:22] * Swompie` (~Kottizen@ip95.ip-94-23-150.eu) has joined #ceph
[3:28] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) has joined #ceph
[3:31] * vbellur (~vijay@2601:18f:700:55b0:5e51:4fff:fee8:6a5c) has joined #ceph
[3:34] * zhen (~Thunderbi@130.57.30.250) has joined #ceph
[3:35] * zhen (~Thunderbi@130.57.30.250) Quit ()
[3:41] * sebastian-w_ (~quassel@212.218.8.139) Quit (Remote host closed the connection)
[3:41] * sebastian-w (~quassel@212.218.8.139) has joined #ceph
[3:48] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[3:52] * Swompie` (~Kottizen@61TAABBMW.tor-irc.dnsbl.oftc.net) Quit ()
[3:57] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[3:59] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) has joined #ceph
[4:12] * Racpatel (~Racpatel@2601:87:0:24af::cd3c) has joined #ceph
[4:16] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph
[4:17] * Racpatel (~Racpatel@2601:87:0:24af::cd3c) Quit (Quit: Leaving)
[4:18] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[4:46] * jfaj (~jan@p20030084AF7EC2005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[4:47] * bildramer (~Oddtwang@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[4:55] * jfaj (~jan@p20030084AF31C4005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) has joined #ceph
[4:59] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) Quit (Quit: Leaving.)
[5:11] * jermudgeon (~jhaustin@31.207.56.59) has joined #ceph
[5:12] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[5:12] * jermudgeon (~jhaustin@31.207.56.59) Quit ()
[5:12] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[5:16] * bildramer (~Oddtwang@26XAAA2PO.tor-irc.dnsbl.oftc.net) Quit ()
[5:20] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[5:26] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[5:30] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[5:31] * Vacuum__ (~Vacuum@i59F79BC4.versanet.de) has joined #ceph
[5:32] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[5:38] * Vacuum_ (~Vacuum@88.130.206.185) Quit (Ping timeout: 480 seconds)
[5:40] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[5:43] * vimal (~vikumar@114.143.167.9) has joined #ceph
[5:53] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[5:58] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) Quit (Remote host closed the connection)
[6:01] * walcubi_ (~walcubi@p5795B235.dip0.t-ipconnect.de) has joined #ceph
[6:04] * [0x4A6F]_ (~ident@p4FC26104.dip0.t-ipconnect.de) has joined #ceph
[6:07] * [0x4A6F] (~ident@0x4a6f.user.oftc.net) Quit (Ping timeout: 480 seconds)
[6:07] * [0x4A6F]_ is now known as [0x4A6F]
[6:08] * walcubi__ (~walcubi@p5795A2FB.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[6:15] * Chrissi_ (~osuka_@2.tor.exit.babylon.network) has joined #ceph
[6:16] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[6:16] * pdrakeweb (~pdrakeweb@oh-76-5-108-60.dhcp.embarqhsd.net) Quit (Ping timeout: 480 seconds)
[6:24] * ffilzwin2 (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) has joined #ceph
[6:27] * ffilzwin (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) has joined #ceph
[6:31] * ffilzwin3 (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) Quit (Ping timeout: 480 seconds)
[6:33] * ffilzwin2 (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) Quit (Ping timeout: 480 seconds)
[6:33] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[6:36] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[6:39] * vimal (~vikumar@114.143.167.9) Quit (Quit: Leaving)
[6:45] * Chrissi_ (~osuka_@5AEAAAZUQ.tor-irc.dnsbl.oftc.net) Quit ()
[6:48] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[6:54] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[6:54] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[6:59] * vimal (~vikumar@121.244.87.116) has joined #ceph
[6:59] * TomasCZ (~TomasCZ@yes.tenlab.net) Quit (Quit: Leaving)
[6:59] * EinstCra_ (~EinstCraz@58.247.117.134) has joined #ceph
[7:02] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[7:03] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Ping timeout: 480 seconds)
[7:05] * pdrakeweb (~pdrakeweb@cpe-71-74-153-111.neo.res.rr.com) has joined #ceph
[7:09] * sudocat (~dibarra@192.185.1.20) Quit (Ping timeout: 480 seconds)
[7:11] * vikhyat (~vumrao@49.248.198.96) has joined #ceph
[7:16] * cronburg_ (~cronburg@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[7:16] * cronburg (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[7:19] * portante (~portante@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[7:19] * portante (~portante@nat-pool-bos-t.redhat.com) has joined #ceph
[7:24] * oliveiradan (~doliveira@67.214.238.80) Quit (Ping timeout: 480 seconds)
[7:26] * EinstCra_ (~EinstCraz@58.247.117.134) Quit (Remote host closed the connection)
[7:27] * EinstCrazy (~EinstCraz@58.247.117.134) has joined #ceph
[7:30] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[7:35] * EinstCrazy (~EinstCraz@58.247.117.134) Quit (Ping timeout: 480 seconds)
[7:43] * kefu (~kefu@114.92.101.38) has joined #ceph
[7:47] * derjohn_mobi (~aj@x590d6735.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[7:57] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[8:00] * Altitudes (~Dragonsha@torland1-this.is.a.tor.exit.server.torland.is) has joined #ceph
[8:03] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:07] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Ping timeout: 480 seconds)
[8:20] * badone (~badone@66.187.239.16) has joined #ceph
[8:30] * sw3 (sweaung@2400:6180:0:d0::66:100f) Quit (autokilled: This host violated network policy. Contact support@oftc.net for further information and assistance. (2016-08-15 06:30:24))
[8:30] * Altitudes (~Dragonsha@9YSAABCWP.tor-irc.dnsbl.oftc.net) Quit ()
[8:36] * ade (~abradshaw@p4FF78731.dip0.t-ipconnect.de) has joined #ceph
[8:36] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[8:45] * derjohn_mobi (~aj@88.128.80.107) has joined #ceph
[8:51] * dennis_ (~dennis@2a00:801:7:1:1a03:73ff:fed6:ffec) has joined #ceph
[8:56] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[8:57] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[8:58] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[9:05] * raso (~raso@ns.deb-multimedia.org) Quit (Read error: Connection reset by peer)
[9:05] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[9:06] * raso (~raso@ns.deb-multimedia.org) has joined #ceph
[9:15] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[9:18] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) has joined #ceph
[9:19] * doppelgrau (~doppelgra@132.252.235.172) has joined #ceph
[9:21] * kuku (~kuku@119.93.91.136) has joined #ceph
[9:23] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[9:25] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[9:28] * derjohn_mobi (~aj@88.128.80.107) Quit (Ping timeout: 480 seconds)
[9:30] * TMM (~hp@185.5.121.201) has joined #ceph
[9:45] * DV__ (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[9:45] * ChanServ sets mode +v scuttlemonkey
[9:45] * ChanServ sets mode +v nhm
[9:47] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[9:48] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has left #ceph
[9:49] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[9:54] * SinZ|offline1 (~pakman__@46.166.137.231) has joined #ceph
[9:56] * debian112 (~bcolbert@c-73-184-103-26.hsd1.ga.comcast.net) has joined #ceph
[9:58] * kuku (~kuku@119.93.91.136) Quit (Remote host closed the connection)
[10:00] * analbeard (~shw@support.memset.com) has joined #ceph
[10:02] <schegi> Hey there.
[10:02] * DanFoster (~Daniel@office.34sp.com) has joined #ceph
[10:04] <schegi> Is there any support for sed disks to be unlocked before mounting if used as osd drives, or does this have to be done manually as a pre-osd-start job?
[10:06] <schegi> I read about dmcrypt support in ceph 0.6 and later but only in context of ceph-deploy and not for manual deployment and nothing about key management for sed encrypted disks
[10:08] * derjohn_mobi (~aj@fw.gkh-setu.de) has joined #ceph
[10:11] <schegi> Also found this, but again only for dmcrypt http://tracker.ceph.com/projects/ceph/wiki/Osd_-_simple_ceph-mon_dm-crypt_key_management
[10:14] * sw3 (sweaung@2400:6180:0:d0::66:100f) has joined #ceph
[10:15] * dgurtner (~dgurtner@84-73-130-19.dclient.hispeed.ch) has joined #ceph
[10:15] * kefu (~kefu@114.92.101.38) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[10:18] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[10:19] * rendar (~I@host63-44-dynamic.51-82-r.retail.telecomitalia.it) has joined #ceph
[10:20] * _mrp (~mrp@93-87-226-235.dynamic.isp.telekom.rs) has joined #ceph
[10:24] * SinZ|offline1 (~pakman__@46.166.137.231) Quit ()
[10:29] * walcubi_ is now known as walcubi
[10:31] <walcubi> SamYaple, I did some last ditch testing using bluestore. Very naive setup, just a filesystem and sparse file for the block storage. XFS was once again remarkably slow, but at least it didn't degrade throughput over time.
[10:32] <walcubi> Switched to btrfs before the weekend, and looking at the results, I almost fell off my chair.
[10:32] * MACscr_ (~MACscr@c-73-9-230-5.hsd1.il.comcast.net) Quit (Ping timeout: 480 seconds)
[10:33] * bara (~bara@ip4-83-240-10-82.cust.nbox.cz) has joined #ceph
[10:33] * mattch (~mattch@w5430.see.ed.ac.uk) has joined #ceph
[10:36] * _mrp (~mrp@93-87-226-235.dynamic.isp.telekom.rs) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[10:37] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[10:39] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[10:46] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) has joined #ceph
[10:47] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[10:50] * DV__ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[10:50] * towo (~towo@towo.netrep.oftc.net) has joined #ceph
[10:51] <towo> Cheers. Quick question: why is a radosgw able to return "103" as an error code via HTTP, and what is it actually trying to say?
[10:58] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) has joined #ceph
[11:02] <schegi> Is there any support for self encrypted disks to be unlocked before mounting, if used as osd drives or has this to be done manually as pre osd start job? Found something about support for dmcrypt but not for seds.
[11:06] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[11:10] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[11:11] * DV__ (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[11:13] <badone> towo: 103 could be ECONNABORTED /* Software caused connection abort */
[11:16] <badone> but that is not an HTTP code of course
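A quick way to confirm badone's reading on a Linux box (the header path can vary by distro): errno 103 is ECONNABORTED, which matches the comment he quotes.

    $ grep ECONNABORTED /usr/include/asm-generic/errno.h
    #define ECONNABORTED    103     /* Software caused connection abort */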
[11:18] <TMM> walcubi, is btrfs so much faster than xfs for osds?
[11:22] <wkennington> TMM: yeah but btrfs is kinda dangerous if you value your data
[11:22] * zhen (~Thunderbi@130.57.30.250) has joined #ceph
[11:22] <TMM> wkennington, I don't know, I've used btrfs (without ceph) on all my desktops and personal servers for years, since about 3.16 I've not had any data breakage or instability with the featureset I use
[11:22] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[11:23] <wkennington> TMM: yeah same here
[11:23] <wkennington> it should be okay if you dont use the btrfs snapshotting feature
[11:23] <wkennington> but i really dont recommend it
[11:23] <wkennington> my osds were much more stable on zfs
[11:23] <TMM> I'm using xfs for my osds
[11:24] <TMM> I don't think I've had any xfs related issues
[11:24] <wkennington> probably not
[11:24] <wkennington> xfs should be rock solid for the osd backing fs
[11:24] <wkennington> i really want to start running one of my nodes with bluestore
[11:24] <TMM> I've had one strange issue this weekend where a hitset disappeared. Someone helped me write a patch to skip hitsets that are missing and my cluster was fine after that
[11:25] <TMM> I don't know if that was in any way related to my backing store though
[11:25] <wkennington> idk
[11:26] <TMM> me neither, but it is kind of worrying :-/
[11:27] <TMM> I wonder if me using the 'isa' ec plugin for my storage tier is what's causing me grief
[11:27] <TMM> it's a somewhat less standard deployment
[11:27] * zhen (~Thunderbi@130.57.30.250) Quit (Quit: zhen)
[11:27] <TMM> I'm thinking of just moving o jerasure everywhere
[11:28] <wkennington> yeah im just doing standard replication
[11:28] <TMM> we have a lot of never-touched data, it just seemed kind of pointless
[11:29] <TMM> it's all rbd, but people here never redeploy their instances, so we have osses sitting there read-only for years sometimes
[11:29] <TMM> having all that stuff in an ec pool seemed like a good idea :)
[11:32] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[11:35] * huangjun (~kvirc@113.57.168.154) has joined #ceph
[11:45] * derjohn_mobi (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[11:48] <walcubi> TMM - depends on the workload, I guess.
[11:49] <walcubi> For any of the client solutions (rgw, rbd, cephfs). Because data is stored in equal block sizes, it's all the same.
[11:50] <walcubi> For what I'm using it for, XFS is just horrible.
[11:55] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[11:57] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[12:01] * bvi (~Bastiaan@185.56.32.1) has joined #ceph
[12:03] <towo> badone: Yeah, that seems the most likely option. Wondering why it leaks via radosgw, though
[12:03] * Jeeves_ (~mark@2a03:7900:1:1:4cac:cad7:939b:67f4) Quit (Remote host closed the connection)
[12:10] <badone> towo: you'd need more information and matching logs with timestamps and high debug log level to zero in on it I suspect
[12:11] * penguinRaider (~KiKo@14.139.82.6) has joined #ceph
[12:11] <badone> turn up debugging, reproduce and look at and around that timestamp in the logs
[12:14] <Be-El> i'm planning to upgrade our hammer cluster (0.94.7 on centos 7.3) to jewel. mons are colocated on three hosts; are there any particular problems during update that have to be taken care of, or does the standard procedure (package upgrade, mons restart, osd restart, other restart) still apply in this case?
[12:20] * dennis_ (~dennis@2a00:801:7:1:1a03:73ff:fed6:ffec) Quit (Quit: Lämnar)
[12:20] <doppelgrau> Be-El: should work; I had a problem when I waited too long to update the third mon
[12:20] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[12:20] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[12:21] <doppelgrau> (the problem showed up after two days), and since it was combined with a kernel and xen update, the reboot also updated the osds on the mon host => only some annoying (harmless) "failed to decode map .. with expected crc" messages
[12:21] <Be-El> doppelgrau: how does systemd react in such an update scenario? does it detect the new target/services files and tries to restart the services automatically (which would be bad)? the hammer release still uses init scripts on centos
[12:22] <Be-El> (<- systemd newbie.... )
[12:22] <doppelgrau> with Debian there was no automatic (re)start, but in general... (also not too much systemd experience)
[12:23] * Nicho1as (~nicho1as@00022427.user.oftc.net) has joined #ceph
[12:24] * huangjun (~kvirc@113.57.168.154) Quit (Ping timeout: 480 seconds)
[12:24] <Be-El> ok, last question (before trashing all data :-). do you start with the leader mon or do you upgrade that mon last?
[12:35] * shaunm (~shaunm@nat-eduroam-02.scc.kit.edu) has joined #ceph
[12:39] * _mrp (~mrp@82.117.199.26) has joined #ceph
[12:39] <doppelgrau> I started with the leading mon, but that was not really planned
[12:40] <IcePic> Be-El: as soon as you take the daemon offline in the daemon, the others would re-elect a new leader, would they not?
[12:41] * netmare (~skrasnopi@188.93.16.2) has left #ceph
[12:44] * JohnO (~Ralth@tor2r.ins.tor.net.eu.org) has joined #ceph
[12:45] <schegi> Is there any support for self encrypted disk (sed) unlocking before mounting or has this to be done manually as pre osd start job? Found something about support for dmcrypt but not for seds.
[12:52] <Be-El> IcePic: sure, but as soon as the former mon with the lowest IP is back, it will become the leader again
[12:54] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[12:58] <Be-El> schegi: i assume self encrypted disks encrypt the whole disk content; in that case no OSDs are detected in the default setup, since detection is based on GPT partition type uuids
[12:58] * _mrp (~mrp@82.117.199.26) Quit (Read error: Connection reset by peer)
[12:58] * _mrp (~mrp@82.117.199.26) has joined #ceph
[13:01] * jfaj (~jan@p20030084AF31C4005EC5D4FFFEBB68A4.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[13:01] * jfaj (~jan@p578E773F.dip0.t-ipconnect.de) has joined #ceph
[13:06] <schegi> Just read about the dmcrypt support for ceph-disk, storing the encryption keys in the mon. I am aware that detection is not possible due to completely encrypted disks. My plan was to unlock the seds (using hdparm) before starting the osd daemons; then I read about the dmcrypt support in ceph and was wondering if there is something similar for seds.
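A minimal sketch of the kind of pre-start unlock schegi describes, assuming a jewel-style systemd ceph-osd@.service; the map file, password file and helper script names are made up, and hdparm's ATA security unlock only applies to SEDs locked via the ATA security feature set:

    # hypothetical drop-in: /etc/systemd/system/ceph-osd@.service.d/sed-unlock.conf
    [Service]
    ExecStartPre=/usr/local/sbin/sed-unlock %i

    # hypothetical helper /usr/local/sbin/sed-unlock: look up the OSD's device in a
    # local "id device" map and unlock it before the OSD daemon mounts and starts
    #!/bin/sh
    set -e
    dev=$(awk -v id="$1" '$1 == id {print $2}' /etc/ceph/sed-devices)
    hdparm --security-unlock "$(cat /etc/ceph/sed-pass)" "$dev"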
[13:09] * hellertime (~Adium@72.246.3.14) has joined #ceph
[13:10] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[13:13] * JohnO (~Ralth@61TAABBYO.tor-irc.dnsbl.oftc.net) Quit ()
[13:18] <doppelgrau> schegi: IIRC the main goal of the dmcrypt support is the possibility to safely RMA disks with potentially sensitive data even if you're no longer able to erase them safely, hence the relatively weak key management
[13:25] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[13:31] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) Quit (Ping timeout: 480 seconds)
[13:34] * derjohn_mob (~aj@46.189.28.72) has joined #ceph
[13:35] * jmn (~jmn@nat-pool-bos-t.redhat.com) Quit (Quit: Coyote finally caught me)
[13:35] * jmn (~jmn@nat-pool-bos-t.redhat.com) has joined #ceph
[13:36] * hifi1 (~Diablodoc@tor-exit.squirrel.theremailer.net) has joined #ceph
[13:40] * treenerd_ (~gsulzberg@cpe90-146-148-47.liwest.at) Quit (Quit: treenerd_)
[13:47] * TMM_ (~hp@185.5.121.201) has joined #ceph
[13:47] * TMM (~hp@185.5.121.201) Quit (Read error: Connection reset by peer)
[13:55] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[13:57] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[14:05] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[14:05] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[14:06] * hifi1 (~Diablodoc@61TAABBZK.tor-irc.dnsbl.oftc.net) Quit ()
[14:10] * huangjun (~kvirc@117.152.73.81) has joined #ceph
[14:14] * janos (~messy@static-71-176-211-4.rcmdva.fios.verizon.net) has joined #ceph
[14:16] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[14:17] * ivve (~zed@cust-gw-11.se.zetup.net) has joined #ceph
[14:18] * sebastian-w_ (~quassel@212.218.8.138) has joined #ceph
[14:20] * sebastian-w (~quassel@212.218.8.139) Quit (Read error: Connection reset by peer)
[14:22] <jiffe> how does one turn off strict mode in the pymongo driver?
[14:24] <jiffe> I'm coming up with the error 'utf8' codec can't decode byte 0xf1 in position 108: invalid continuation byte; and it sounds like this is due to corruption and disabling strict mode will ignore this
[14:28] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[14:30] <liiwi> source data is not utf8, sure you want to pass that forward? It will simply blow up on next point of translation.
[14:31] <liiwi> you do get those routinely when data is converted from non-utf8 strings to utf8
[14:31] * xarses_ (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[14:33] * Skyrider (~Aal@108.61.123.72) has joined #ceph
[14:33] <jiffe> yeah I just need these reads to complete successfully even if the data isn't valid
[14:37] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[14:37] * Racpatel (~Racpatel@2601:87:0:24af::53d5) has joined #ceph
[14:41] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[14:45] <towo> badone: Aye, I've already hassled the customer to do that, I was just hoping there was some pretty obvious misconfiguration.
[14:52] * kuku (~kuku@124.104.85.105) has joined #ceph
[14:54] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[14:55] * ivve (~zed@cust-gw-11.se.zetup.net) Quit (Ping timeout: 480 seconds)
[14:55] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[14:55] * kuku (~kuku@124.104.85.105) Quit (Read error: Connection reset by peer)
[14:56] * kuku (~kuku@124.104.85.105) has joined #ceph
[15:02] * ircolle (~Adium@2601:285:201:633a:3114:56ea:8433:8286) has joined #ceph
[15:03] * mhack (~mhack@24-151-36-149.dhcp.nwtn.ct.charter.com) has joined #ceph
[15:03] * Skyrider (~Aal@5AEAAAZ6B.tor-irc.dnsbl.oftc.net) Quit ()
[15:11] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[15:12] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) has joined #ceph
[15:15] * kuku (~kuku@124.104.85.105) Quit (Remote host closed the connection)
[15:24] * kuku (~kuku@124.104.85.105) has joined #ceph
[15:25] * kuku (~kuku@124.104.85.105) Quit (Remote host closed the connection)
[15:25] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[15:25] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[15:27] * srk (~Siva@2605:6000:ed04:ce00:d961:c55c:ce6b:1a1d) has joined #ceph
[15:28] <walcubi> ok, I've just spoken to the people I'm working with, and the takeaway is this.
[15:28] <walcubi> They want only 1 copy of data, no replication.
[15:29] <walcubi> Also, if a disk goes bad, or a server goes down, we don't care about data lost.
[15:30] <walcubi> Everything else about ceph is great though.
[15:31] <walcubi> In other words, just being distributed without increasing the capacity of our current setup.
[15:34] <walcubi> Maybe using an erasure coded pool would be the best thing.
[15:35] * cronburg_ (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[15:35] <doppelgrau> walcubi: which "application"? rbd?
[15:36] <doppelgrau> walcubi: rbd (and I think cephfs also, not sure about s3 gw) size=1 means one lost disk = all data (more or less) lost
[15:37] <doppelgrau> (and with enough disks, that means more or less every week oO)
[15:37] <walcubi> doppelgrau, we wrote our own librados client
[15:38] * salwasser (~Adium@72.246.3.14) has joined #ceph
[15:39] <doppelgrau> walcubi: ok, in that case it can work, although it appears strange to me that the data is of so little value that no replication/redundancy (EC) is wanted
[15:39] <walcubi> We have a very unique use-case. :-)
[15:39] <walcubi> Anything lost, can be regenerated
[15:41] * vbellur (~vijay@2601:18f:700:55b0:5e51:4fff:fee8:6a5c) Quit (Ping timeout: 480 seconds)
[15:41] <walcubi> We store roughly 1.8 billion images, zero reference of locality over what is read or even accessed.
[15:41] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) Quit (Quit: Leaving.)
[15:42] <walcubi> *locality of reference, even.
[15:42] <doppelgrau> walcubi: I would test how the client behaves with broken PGs (size=1 and down/lost osds means PGs are down/stale/inactive) - that will be the result of a failure with size=1, but apart from that I think it should work, although it "feels wrong" :)
[15:43] <rkeene> Down is different from out and lost -- down causes reads to hang, out/lost will fail or seek alternate copies
[15:43] * art_yo (~art_yo@149.126.169.197) has joined #ceph
[15:43] <walcubi> There was even one time where for 3 months, we had 20% of this data being deleted every night, then regenerated the next morning.
[15:44] <walcubi> No one noticed that there was a problem, until we saw the graphs one day. =)
[15:44] <art_yo> Hi guys! Could you help me? I hav RBD device, that is maped and mounted to directory
[15:45] <art_yo> I increased RBD size and now rbd info shows me the new size, but df -h /directory shows old size
[15:45] <walcubi> doppelgrau, I've done some preliminary testing. Mostly just seeing how the cluster can handle writing 80 million randomly sized images, and I have one server in production that is getting images from ceph right now.
[15:46] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) has joined #ceph
[15:46] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) Quit ()
[15:46] * vbellur (~vijay@71.234.224.255) has joined #ceph
[15:46] <rkeene> I'd really just create the images as RADOS objects and leave RBD out of it
[15:46] <walcubi> doppelgrau, it seems not to really matter if data goes missing. The client gets back an appropriate "not found" response.
[15:47] <walcubi> doppelgrau, and in that instance, we just trim and regenerate the object.
[15:47] <walcubi> doppelgrau, if a PG goes missing, however, then all reads and writes get blocked.
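A small way to see the difference walcubi describes, using the plain rados CLI against a healthy pool (pool and object names are examples):

    rados -p images stat no-such-object   # fails with (2) No such file or directory instead of hanging
    rados -p images get no-such-object -  # same: a clean error, not a blocked read
    # only when the PG holding an object is down/incomplete do reads block rather than error out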
[15:47] <art_yo> got it, I had to execute "xfs_growfs". sory
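For reference, the sequence art_yo ended up with looks roughly like this (pool, image and mountpoint names are examples; rbd resize takes the new size in MB here):

    rbd resize --size 20480 rbd/myimage    # grow the image to 20 GB
    rbd info rbd/myimage                   # shows the new size
    xfs_growfs /directory                  # grow the XFS filesystem on the mapped device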
[15:48] <doppelgrau> walcubi: nice, I was surprised there was no "blocking" until the PGs are manually fixed/recreated or something like that. And cool application
[15:48] <walcubi> doppelgrau, this is where things become not so nice. Because in our case, losing 80 million objects is no big deal.
[15:49] <doppelgrau> walcubi: eeks, blocking again
[15:49] * srk (~Siva@2605:6000:ed04:ce00:d961:c55c:ce6b:1a1d) Quit (Ping timeout: 480 seconds)
[15:52] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[15:52] <doppelgrau> first use case I have seen for size=1, min_size=0 :D
[15:52] <walcubi> doppelgrau, the application itself is simple. It's the seek speed that matters more really - we're getting 5-10ms stat() and read() times.
[15:53] <walcubi> doppelgrau, there's a first for everything. :-P
[15:53] <walcubi> What we store is really just a very *very* big cache for clients.
[15:54] <walcubi> Because clients are too slow to service the images, or have them in the wrong sizes.
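For the record, the size=1 pool discussed above could be set up with nothing more than the standard CLI (the pool name and PG counts are placeholders; min_size is left at 1):

    ceph osd pool create images 2048 2048 replicated
    ceph osd pool set images size 1
    ceph osd pool set images min_size 1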
[15:54] <rkeene> What you need is RAID-CDN :-D
[15:54] * vbellur (~vijay@71.234.224.255) Quit (Ping timeout: 480 seconds)
[15:54] <rkeene> https://github.com/lorriexingfang/webRTC-CDN-raidcdn-sample
[15:55] <walcubi> Ba-ha. No. :-|
[15:57] <rkeene> :-)
[15:58] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[16:00] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[16:02] * vimal (~vikumar@121.244.87.116) Quit (Quit: Leaving)
[16:03] <walcubi> Yeah, so I guess next thing to look at is maybe getting away with using erasure coded pools rather than replicated.
[16:03] <walcubi> I *guess* if I can get away with one server being down without affecting the cluster, that may be ok
[16:03] <rkeene> I thought you didn't care about redundancy ?
[16:03] <walcubi> I don't
[16:04] <walcubi> Ceph does, however.
[16:04] <rkeene> Ceph doesn't care about redundancy.
[16:05] <doppelgrau> rkeene: but ceph blocks IO to a PG on a down/lost OSD, and the data loss is no problem for walcubi, but the blocked IO
[16:05] <walcubi> If that were true, then I could rados_write() an object with missing placement groups. ;-)
[16:05] <rkeene> doppelgrau, Blocking I/O happens for DOWN OSDs, not Lost OSDs or out OSDs
[16:06] <rkeene> And you can just decrease the time after down before out occurs (at the risk of a really busy OSD being marked out and back in, causing more I/O, causing more OSDs to be really busy, causing them to be marked out, causing more I/O, ...)
[16:07] <doppelgrau> rkeene: even with size=1? the PGs can't repair themselves in that case (and even if they could, the timeout till down+out is too long an outage for that use case)
[16:10] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[16:10] * doppelgrau has to put his "toy cluster" back online :D
[16:14] * huangjun (~kvirc@117.152.73.81) Quit (Ping timeout: 480 seconds)
[16:16] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[16:19] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[16:19] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[16:25] * i_m (~ivan.miro@31.173.120.48) has joined #ceph
[16:32] <walcubi> Actually, maybe erasure coded pools won't be so good.
[16:34] <walcubi> Sure, can save space by splitting the objects. But these are around 10kbs in size to begin with. Gah.
[16:36] * evelu (~erwan@poo40-1-78-231-184-196.fbx.proxad.net) has joined #ceph
[16:36] * rraja (~rraja@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:37] * wushudoin (~wushudoin@2601:646:8281:cfd:2ab2:bdff:fe0b:a6ee) has joined #ceph
[16:39] * joshd1 (~jdurgin@2602:30a:c089:2b0:15dd:dcf1:f5e0:b5d0) has joined #ceph
[16:39] * hellertime1 (~Adium@72.246.3.14) has joined #ceph
[16:40] * hellertime (~Adium@72.246.3.14) Quit (Read error: Connection reset by peer)
[16:40] * vbellur (~vijay@65.209.111.106) has joined #ceph
[16:42] * kuku (~kuku@124.104.85.105) has joined #ceph
[16:45] * jtw (~john@2601:644:4000:b0bf:a455:c53f:4a2b:be52) has joined #ceph
[16:48] * MrBy (~MrBy@85.115.23.2) has joined #ceph
[16:51] <doppelgrau> walcubi: One Idea I got is: putting the second copy on large hdds (with rust ^^), I guess it's mostly read-IO. But if the primary fails, performance for these PGs would be very bad (but the other PGs should be still up & running)
[16:51] * penguinRaider (~KiKo@14.139.82.6) Quit (Ping timeout: 480 seconds)
[16:52] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[16:53] * xarses_ (~xarses@mbc0536d0.tmodns.net) has joined #ceph
[16:53] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[17:01] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[17:03] * xarses_ (~xarses@mbc0536d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[17:04] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[17:07] * onyb (~ani07nov@119.82.105.66) has joined #ceph
[17:07] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[17:08] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[17:08] * evelu (~erwan@poo40-1-78-231-184-196.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[17:09] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[17:11] <wes_dillingham> I am having an issue where about once a week I randomly get an OSD that will not restart. I have been dealing with it so far by just removing the osd from the cluster, blasting it, and recreating the OSD, but this is just a workaround. Anyways, I am now looking at the log for the individual osd that is currently down and I am seeing the following error in its log; does anyone know what this error might be indicative of?
[17:11] <wes_dillingham> 2016-08-15 10:44:34.685936 7faa6597b800 -1 os/filestore/DBObjectMap.cc: In function 'DBObjectMap::Header DBObjectMap::lookup_parent(DBObjectMap::Header)' thread 7faa6597b800 time 2016-08-15 10:44:34.682096
[17:11] <wes_dillingham> os/filestore/DBObjectMap.cc: 1148: FAILED assert(0)
[17:12] <wes_dillingham> this is the error that gets thrown when I attempt to restart the osd service.
[17:12] * doppelgrau (~doppelgra@132.252.235.172) Quit (Quit: Leaving.)
[17:13] <walcubi> doppelgrau, yeah - that came up, whether we can have secondary PGs on normal HDDs.
[17:13] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[17:13] <walcubi> I'm not sure if there's a way to control where PGs are created though?
[17:14] <walcubi> You'd have to have two pools, right?
[17:19] * bvi (~Bastiaan@185.56.32.1) Quit (Ping timeout: 480 seconds)
[17:19] * hyst (~KrimZon@108.61.122.121) has joined #ceph
[17:21] * yanzheng (~zhyan@125.70.22.133) Quit (Quit: This computer has gone to sleep)
[17:21] * ntpttr_ (~ntpttr@192.55.54.40) has joined #ceph
[17:25] * jarrpa (~jarrpa@adsl-72-50-86-240.prtc.net) has joined #ceph
[17:28] * kuku (~kuku@124.104.85.105) Quit (Remote host closed the connection)
[17:31] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:35] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[17:36] * blizzow (~jburns@50.243.148.102) has joined #ceph
[17:38] * bene2 (~bene@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[17:38] * ffilzwin (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) Quit (Quit: Leaving)
[17:41] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[17:45] * ffilzwin (~ffilz@c-76-115-190-27.hsd1.or.comcast.net) has joined #ceph
[17:46] * TMM_ is now known as TMM
[17:47] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[17:47] * ntpttr_ (~ntpttr@192.55.54.40) Quit (Remote host closed the connection)
[17:49] * hyst (~KrimZon@61TAABB6Y.tor-irc.dnsbl.oftc.net) Quit ()
[17:49] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) has joined #ceph
[17:49] * danieagle (~Daniel@187.35.180.9) has joined #ceph
[17:55] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) Quit (Ping timeout: 480 seconds)
[17:57] * xarses_ (~xarses@216.9.110.12) has joined #ceph
[17:57] * sudocat (~dibarra@104-188-116-197.lightspeed.hstntx.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[17:58] * shaunm (~shaunm@nat-eduroam-02.scc.kit.edu) Quit (Quit: Ex-Chat)
[17:58] * xarses_ (~xarses@216.9.110.12) Quit (Remote host closed the connection)
[17:58] * xarses_ (~xarses@216.9.110.12) has joined #ceph
[18:00] * bara (~bara@ip4-83-240-10-82.cust.nbox.cz) Quit (Quit: Bye guys!)
[18:02] * ade (~abradshaw@p4FF78731.dip0.t-ipconnect.de) Quit (Quit: Too sexy for his shirt)
[18:04] * oliveiradan (~doliveira@67.214.238.80) has joined #ceph
[18:04] * _mrp (~mrp@82.117.199.26) Quit (Read error: Connection reset by peer)
[18:05] * _mrp (~mrp@82.117.199.26) has joined #ceph
[18:05] * oliveiradan (~doliveira@67.214.238.80) Quit ()
[18:06] * oliveiradan2 (~doliveira@67.214.238.80) has joined #ceph
[18:06] <m0zes> you can control that. through a crush ruleset.
[18:07] <wes_dillingham> I have an osd which won't restart and seems to be failing on the journal replay, which then turns into a segfault. Would using ceph-osd --flush-journal against this osd be a good way to deal with the failure? After flushing the journal would I likely be able to restart the osd? The documentation says regarding --flush-journal: "This can be useful if you want to resize the journal or need to otherwise destroy it: this guarantees you won't lose
[18:07] <wes_dillingham> data."
[18:08] <m0zes> step take root; step chooseleaf firstn 1 ssd; step chooseleaf firstn -1 spinners; step emit;
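Written out in full, m0zes's shorthand corresponds to the "ssd-primary" style rule from the CRUSH documentation, assuming the CRUSH map already has separate roots named ssd and spinners:

    rule ssd-primary {
            ruleset 5
            type replicated
            min_size 1
            max_size 10
            step take ssd
            step chooseleaf firstn 1 type host
            step emit
            step take spinners
            step chooseleaf firstn -1 type host
            step emit
    }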
[18:08] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[18:08] <T1> wes_dillingham: it sounds like the journal is the problem then..
[18:09] <T1> wes_dillingham: where does the journal reside?
[18:09] <T1> is it a device or a file?
[18:09] <wes_dillingham> the journal resides on the same disk as the data partition
[18:09] <wes_dillingham> its a partition
[18:09] <T1> on rotating rust or?
[18:09] <wes_dillingham> rotating disk
[18:10] <T1> probably bad blocks where the journal is then
[18:11] <wes_dillingham> so --flush-journal would buy me nothing as it needs the journal to be in a good state?
[18:11] <T1> yes
[18:11] <wes_dillingham> so should i just remake the entire osd ?
[18:12] <T1> sounds like it to me, yes
[18:12] <wes_dillingham> ok
[18:12] <T1> if the journal is the problem, then the entire OSD is in jeopardy anyway
[18:12] * mykola (~Mikolaj@193.93.217.44) has joined #ceph
[18:14] * penguinRaider (~KiKo@104.194.0.35) has joined #ceph
[18:16] <wes_dillingham> T1: If I need to colocate the journal with the data partition, is the best method to use a raw partition (this is what ceph-deploy sets up for me), or to use a file on a filesystem?
[18:16] <T1> I'd use a raw partition on an SSD
[18:17] <T1> a journal on rotating rust is slow..it makes for slow writes
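With ceph-deploy (which wes_dillingham is using), the journal placement is chosen at prepare time; the host and device names below are examples:

    ceph-deploy osd prepare osd-node1:sdb:/dev/sdc1   # data on sdb, journal on an SSD partition
    ceph-deploy osd prepare osd-node1:sdb             # journal as a partition on the same disk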
[18:18] * yanzheng (~zhyan@125.70.22.133) has joined #ceph
[18:18] <dbbyleo> Hey guys... trying to add a ceph fs into my /etc/fstab. But the docs aren't clear about how exactly to do this. All it says is to add something like this to my /etc/fstab:
[18:19] <dbbyleo> id=admin /mnt/ceph fuse.ceph defaults 0 0
[18:19] <dbbyleo> Should this include the IP address (and port) of the monitor node??
[18:20] * ntpttr_ (~ntpttr@134.134.139.83) has joined #ceph
[18:23] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[18:25] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Ping timeout: 480 seconds)
[18:25] <dbbyleo> if I add conf= , what is suppose to be in the ceph.conf file??
[18:26] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[18:26] <wes_dillingham> T1: I know but its my only option for the time being
[18:27] <dbbyleo> help anyone?
[18:30] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) has joined #ceph
[18:33] * yanzheng1 (~zhyan@118.116.114.80) has joined #ceph
[18:33] <T1> wes_dillingham: then just use a different disk
[18:34] <T1> dbbyleo: sorry, no idea
[18:34] <dbbyleo> T1: I don't understand how it can mount the fs without the IP address of the ceph node???
[18:35] <wes_dillingham> dbbyleo: I use the kernel driver and use a similar setup as desribed here: http://docs.ceph.com/docs/master/cephfs/fstab/
[18:35] <wes_dillingham> you need to provide the monitor ip and port as the source in fstab
[18:35] <wes_dillingham> more ideally, a comma separated list of monitors
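A typical kernel-client fstab line of the kind wes_dillingham describes (the monitor addresses and secret file are examples):

    10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789:/  /mnt/cephfs  ceph  name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev  0  2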
[18:35] * yanzheng (~zhyan@125.70.22.133) Quit (Ping timeout: 480 seconds)
[18:36] <wes_dillingham> i dont have any experience using the FUSE driver
[18:36] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[18:36] * xarses_ (~xarses@216.9.110.12) Quit (Ping timeout: 480 seconds)
[18:36] <dbbyleo> wes_dillingham I've tried kernel mount and it freaks out. People here have said ceph-fuse is the safer route. It works when I mount it manually. Now just trying to add it to fstab so it mounts automatically at server restarts
[18:36] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) Quit (Max SendQ exceeded)
[18:37] * Jeffrey4l_ (~Jeffrey@110.244.109.184) Quit (Ping timeout: 480 seconds)
[18:39] <wes_dillingham> well you can always tinker with it by messing with your fstab and running "mount -a", which will mostly tell you if you've done it right
[18:39] * efirs (~firs@98.207.153.155) Quit (Quit: Leaving.)
[18:40] <blizzow> can I force mon servers to listen on both (or all) network cards?
[18:40] * ntpttr_ (~ntpttr@134.134.139.83) Quit (Quit: Leaving)
[18:41] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[18:41] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[18:48] <dbbyleo> Ok... I figured out that even with just this entry in my /etc/fstab:
[18:48] <dbbyleo> id=admin /mnt/ceph fuse.ceph defaults 0 0
[18:48] <dbbyleo> when I do a mount -a, it DOES mount the ceph FS.
[18:49] <dbbyleo> (I still don't understand how the heck it knows the IP address of the node, though).
[18:49] * DanFoster (~Daniel@office.34sp.com) Quit (Quit: Leaving)
[18:51] <dbbyleo> Anyway... so mount -a works, but when I reboot the server, it doesn't get mounted automatically.
[18:52] <T1> perhaps the network is not ready at mount time
[18:52] <dbbyleo> oh right
[18:52] <dbbyleo> whats that option that says to wait until network is ready??
[18:53] <T1> netdev I think
[18:53] <T1> look it up on the manpage for mount
[18:53] * Nicho1as (~nicho1as@00022427.user.oftc.net) Quit (Quit: A man from the Far East; using WeeChat 1.5)
[18:53] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[18:53] <dbbyleo> _netdev or something. Yes will look it up
[18:55] * onyb (~ani07nov@119.82.105.66) Quit (Quit: raise SystemExit())
[18:55] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[18:56] * vikhyat (~vumrao@49.248.198.96) Quit (Quit: Leaving)
[18:57] <dbbyleo> yup that was it: _netdev
[18:57] <dbbyleo> (I still don't get how it manages to mount the resource without the IP address in the fstab)
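The reason no address is needed is that the fuse.ceph mount helper just runs ceph-fuse, which reads the monitor addresses (mon_host) from /etc/ceph/ceph.conf; a complete entry with the boot-ordering fix dbbyleo found would look roughly like this:

    id=admin,conf=/etc/ceph/ceph.conf  /mnt/ceph  fuse.ceph  defaults,_netdev  0 0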
[19:06] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) Quit (Quit: Leaving.)
[19:11] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) has joined #ceph
[19:12] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[19:17] * yanzheng1 (~zhyan@118.116.114.80) Quit (Quit: This computer has gone to sleep)
[19:18] * yanzheng1 (~zhyan@118.116.114.80) has joined #ceph
[19:19] * yanzheng1 (~zhyan@118.116.114.80) Quit ()
[19:20] * kuku (~kuku@124.104.85.105) has joined #ceph
[19:20] * _mrp (~mrp@82.117.199.26) Quit (Read error: Connection reset by peer)
[19:20] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) has joined #ceph
[19:21] * _mrp (~mrp@82.117.199.26) has joined #ceph
[19:21] * TMM (~hp@185.5.121.201) Quit (Quit: Ex-Chat)
[19:23] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Remote host closed the connection)
[19:23] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) Quit (Quit: Leaving.)
[19:24] <dbbyleo> Help with rbdmap:
[19:24] <dbbyleo> ... /usr/bin/rbdmap: 32: /usr/bin/rbdmap: Bad substitution
[19:25] <dbbyleo> When I look at the script, I think the problem is at the while loop (while read DEV PARAMS). I don't know what "read DEV PARAMS" does, but running that line manually (as root) just hangs on my terminal
[19:28] * bene2 (~bene@nat-pool-bos-t.redhat.com) has joined #ceph
[19:28] * scg (~zscg@181.122.4.166) has joined #ceph
[19:28] * Kaervan (~Hazmat@108.61.122.88) has joined #ceph
[19:34] * jarrpa (~jarrpa@adsl-72-50-86-240.prtc.net) Quit (Ping timeout: 480 seconds)
[19:34] * madkiss (~madkiss@ip5b406a0a.dynamic.kabel-deutschland.de) has joined #ceph
[19:36] <dbbyleo> Is "PARAMS" supposed to mean something in /usr/bin/rbdmap ??
[19:36] <dbbyleo> How can you have a line in that script as "while read DEV PARAMS;" ??
[19:37] <dbbyleo> Am I suppose to replace PARAMS" with a file name?
[19:37] * madkiss1 (~madkiss@2a02:8109:8680:2000:4073:55d5:eac2:4ac4) has joined #ceph
[19:42] * kuku (~kuku@124.104.85.105) Quit (Remote host closed the connection)
[19:42] * madkiss (~madkiss@ip5b406a0a.dynamic.kabel-deutschland.de) Quit (Ping timeout: 480 seconds)
[19:50] * tserong (~tserong@203-214-92-220.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[19:50] * doppelgrau (~doppelgra@dslb-088-072-094-200.088.072.pools.vodafone-ip.de) has joined #ceph
[19:51] * tserong (~tserong@203-214-92-220.dyn.iinet.net.au) has joined #ceph
[19:52] * mattch (~mattch@w5430.see.ed.ac.uk) Quit (Ping timeout: 480 seconds)
[19:52] * chunmei (~chunmei@134.134.139.70) has joined #ceph
[19:52] * roeland (~roeland@77.108.157.165) Quit (Quit: leaving)
[19:53] <blizzow> Can an op please change the subject to show where the ceph bot logs to??
[19:54] * dbbyleo (~dbbyleo@50-198-202-93-static.hfc.comcastbusiness.net) has left #ceph
[19:55] * dbbyleo (~dbbyleo@50-198-202-93-static.hfc.comcastbusiness.net) has joined #ceph
[19:56] * dougf (~dougf@96-38-99-179.dhcp.jcsn.tn.charter.com) Quit (Ping timeout: 480 seconds)
[19:56] <dbbyleo> Hi... just wondering if this is a good channel to be in to get beginner-type help?
[19:56] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:f01f:657e:17e3:bfd5) Quit (Ping timeout: 480 seconds)
[19:57] * dougf (~dougf@75-131-32-223.static.kgpt.tn.charter.com) has joined #ceph
[19:58] * Kaervan (~Hazmat@26XAAA3D8.tor-irc.dnsbl.oftc.net) Quit ()
[20:00] * wushudoin (~wushudoin@2601:646:8281:cfd:2ab2:bdff:fe0b:a6ee) Quit (Quit: Leaving)
[20:04] <dbbyleo> doppelgrau can you help me?
[20:05] <dbbyleo> I think my issue is pretty basic but can't seem to get anyone to help.
[20:05] <blizzow> dbbyleo: just ask the question please.
[20:05] <dbbyleo> blizzow ... /usr/bin/rbdmap: 32: /usr/bin/rbdmap: Bad substitution
[20:06] <dbbyleo> I don't get why it's gettig this error
[20:06] <dbbyleo> its like its an unhandled exception in the script
[20:06] <dbbyleo> The line:
[20:06] <dbbyleo> while read DEV PARAMS
[20:06] <dbbyleo> in the script boggles me...
[20:06] <dbbyleo> What the heck is PARAMS??
[20:07] <blizzow> You may want pastebin the script then ask that in #bash or #linux.
[20:08] <dbbyleo> pastebin? I don't undersstand
[20:09] <dbbyleo> are you talking about going into a a different channel??
[20:10] <dbbyleo> blizzow I'm also new to ceph, so just in case there's an updated script I didn't know about, it would be good to know. But I deployed this using ceph-deploy.
[20:10] <blizzow> dbbyleo: what command are you running?
[20:10] <dbbyleo> rbdmap map
[20:11] <dbbyleo> for example:
[20:11] <dbbyleo> root@cephcl:~# rbdmap map
[20:11] <blizzow> try this: man rbdmap
[20:11] <dbbyleo> I have man'd rbdmap
[20:11] <blizzow> do you have a file called: /etc/ceph/rbdmap
[20:11] <dbbyleo> yes
[20:12] <blizzow> What's in the file? (post it to a pastebin)
[20:12] <dbbyleo> I haven't used pastebin before... whats that?
[20:13] <blizzow> dbbyleo: https://is.gd/I2u3Ot
[20:14] <dbbyleo> Ok I'll check it out, but can I just paste the /etc/ceph/rbdmap here for now. It's not much.
[20:15] <dbbyleo> ok... wait is this how it works:
[20:15] <dbbyleo> https://is.gd/I2u3Ot
[20:15] <blizzow> dbbyleo: please use a pastebin service, people can modify your paste then.
[20:15] <dbbyleo> I just created a "paste"
[20:16] * stiopa (~stiopa@81.110.229.198) has joined #ceph
[20:16] <dbbyleo> That's what's in my /etc/ceph/rbdmap file
[20:17] * blizzow slaps forehead.
[20:17] <blizzow> you just reposted a the same link I posted to you.
[20:17] * wushudoin (~wushudoin@2601:646:8281:cfd:2ab2:bdff:fe0b:a6ee) has joined #ceph
[20:18] <dbbyleo> Oh shoot!
[20:18] <dbbyleo> I just created a paste
[20:18] <dbbyleo> How do you see the paste I created??
[20:19] <dbbyleo> Sorry I'm a total newb at this
[20:19] <blizzow> Just copy the link from the top of your paste....
[20:19] <dbbyleo> http://pastebin.com/fWU1bR6f
[20:19] <dbbyleo> I thought I did that!!! Doh!
[20:19] <dbbyleo> There you go
[20:27] <jdillaman> dbbyleo: i answered your query from the ceph-users mailing list
[20:29] <wes_dillingham> in terms of getting which OSDs are doing the most IO, which perf metric is of the most interest? I'm seeing that there are a lot of similar metrics, just curious if there is something like read/write ops per second per osd
[20:31] <dbbyleo> jdillaman - ok thanks, I'll check my email
[20:39] <dbbyleo> jdillaman: THANK YOU!!!! That was it! sh --> dash was the way the server had it.
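For anyone hitting the same thing: the script was being run by /bin/sh rather than bash, and dash does not support some bash-only substitutions. A quick check and workaround, assuming a Debian-style system:

    readlink -f /bin/sh          # often points at dash on Debian/Ubuntu
    bash /usr/bin/rbdmap map     # running it under bash sidesteps the "Bad substitution" error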
[20:44] <dbbyleo> blizzow: thanks for showing how to use pastebin!
[20:47] <blizzow> dbbyleo: of course.
[20:47] * _mrp (~mrp@82.117.199.26) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[20:47] <blizzow> Is there a way to see performance metrics (writes/reads/IOPs) per image to see which one is busiest?
[20:54] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[20:54] <jdillaman> blizzow: nothing built in -- but you could use collectd / diamond perf tools to collect per-image stats and send those stats to a central perf db for querying / graphing
[21:00] * i_m (~ivan.miro@31.173.120.48) Quit (Ping timeout: 480 seconds)
[21:00] <wes_dillingham> blizzow: my colleague wrote this diamond collector, which, while specific to opennebula, maps opennebula vms to rbd disks and reports on those client-side rbd metrics; might be something to start with.
[21:00] <wes_dillingham> https://github.com/fasrc/nebula-ceph-diamond-collector
[21:02] <jdillaman> wes_dillingham: nice -- it actually maps the images back to the VMs
[21:02] * dbbyleo (~dbbyleo@50-198-202-93-static.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[21:03] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[21:04] <blizzow> jdillaman: I already pull iops/read/write stats for /dev/vda from each of my VMs directly. I was just hoping there was a way to see it from the ceph side.
[21:06] * dnunez (~dnunez@209-6-91-147.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com) has joined #ceph
[21:06] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[21:07] <jdillaman> blizzow: yeah, i was suggesting putting the collectors on the hypervisor nodes to read the stats generated from librbd
[21:09] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[21:10] * derjohn_mob (~aj@46.189.28.72) Quit (Ping timeout: 480 seconds)
[21:11] <wes_dillingham> the collector i linked to is intended to be run on the hypervisors to get ceph side librbd stats blizzow
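A sketch of how those librbd counters get exposed on a hypervisor, which is what such a collector reads; the socket path pattern and filename are examples, and the admin socket has to be enabled in the client section of ceph.conf:

    # /etc/ceph/ceph.conf on the hypervisor
    [client]
        admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

    # each running librbd client then answers perf queries on its socket:
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.67890.asok perf dump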
[21:13] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[21:13] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[21:14] * wes_dillingham_ (~wes_dilli@65.112.8.199) has joined #ceph
[21:16] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[21:18] * dmick (~dmick@206.169.83.146) has joined #ceph
[21:19] * dbbyleo (~dbbyleo@50-198-202-93-static.hfc.comcastbusiness.net) has joined #ceph
[21:20] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Ping timeout: 480 seconds)
[21:20] * wes_dillingham_ is now known as wes_dillingham
[21:21] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[21:22] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[21:23] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[21:24] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[21:27] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) Quit (Remote host closed the connection)
[21:28] * boredatwork (~overonthe@199.68.193.62) has joined #ceph
[21:30] * wes_dillingham (~wes_dilli@65.112.8.199) Quit (Quit: wes_dillingham)
[21:30] * dyasny (~dyasny@modemcable030.61-37-24.static.videotron.ca) has joined #ceph
[21:35] * vbellur (~vijay@65.209.111.106) Quit (Ping timeout: 480 seconds)
[21:35] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[21:35] * rburkholder (~overonthe@199.68.193.54) Quit (Ping timeout: 480 seconds)
[21:36] * rburkholder (~overonthe@199.68.193.54) has joined #ceph
[21:40] * dynamicudpate (~overonthe@199.68.193.54) Quit (Ping timeout: 480 seconds)
[21:40] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[21:40] * wes_dillingham (~wes_dilli@65.112.8.199) has joined #ceph
[22:00] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[22:00] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[22:02] * kmroz (~kilo@00020103.user.oftc.net) Quit (Ping timeout: 480 seconds)
[22:05] * danieagle (~Daniel@187.35.180.9) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[22:09] * srk_ (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[22:10] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[22:12] * blizzow (~jburns@50.243.148.102) Quit (Ping timeout: 480 seconds)
[22:12] * kmroz (~kilo@node-1w7jr9qmjt7nuki0ztjbfjw66.ipv6.telus.net) has joined #ceph
[22:15] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) has joined #ceph
[22:17] * davidzlap (~Adium@2605:e000:1313:8003:f01b:8940:89d5:6266) has joined #ceph
[22:17] * georgem (~Adium@107-179-157-134.cpe.teksavvy.com) Quit ()
[22:17] * georgem (~Adium@206.108.127.16) has joined #ceph
[22:20] * dyasny (~dyasny@modemcable030.61-37-24.static.videotron.ca) Quit (Ping timeout: 480 seconds)
[22:25] * rendar (~I@host63-44-dynamic.51-82-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[22:31] * blizzow (~jburns@50.243.148.102) has joined #ceph
[22:34] * rwheeler (~rwheeler@pool-173-48-195-215.bstnma.fios.verizon.net) has joined #ceph
[22:38] * wes_dillingham (~wes_dilli@65.112.8.199) Quit (Quit: wes_dillingham)
[22:41] * mykola (~Mikolaj@193.93.217.44) Quit (Quit: away)
[22:43] * TomyLobo (~Epi@46.166.190.221) has joined #ceph
[22:47] * wes_dillingham (~wes_dilli@65.112.8.199) has joined #ceph
[22:51] * rendar (~I@host63-44-dynamic.51-82-r.retail.telecomitalia.it) has joined #ceph
[22:54] * mattbenjamin (~mbenjamin@12.118.3.106) Quit (Ping timeout: 480 seconds)
[22:58] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[22:59] * wkennington (~wkenningt@c-71-204-170-241.hsd1.ca.comcast.net) has joined #ceph
[23:04] * wes_dillingham (~wes_dilli@65.112.8.199) Quit (Quit: wes_dillingham)
[23:06] * haplo37 (~haplo37@199.91.185.156) Quit (Ping timeout: 480 seconds)
[23:08] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)
[23:09] * penguinRaider (~KiKo@104.194.0.35) Quit (Ping timeout: 480 seconds)
[23:10] * hellertime1 (~Adium@72.246.3.14) Quit (Quit: Leaving.)
[23:13] * TomyLobo (~Epi@46.166.190.221) Quit ()
[23:17] * haplo37 (~haplo37@199.91.185.156) has joined #ceph
[23:22] * cronburg__ (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[23:22] * cronburg_ (~cronburg@nat-pool-bos-t.redhat.com) Quit (Read error: Connection reset by peer)
[23:27] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[23:28] * penguinRaider (~KiKo@104.250.141.44) has joined #ceph
[23:31] * cathode (~cathode@50.232.215.114) has joined #ceph
[23:33] * AXJ (~oftc-webi@static-108-47-170-18.lsanca.fios.frontiernet.net) has joined #ceph
[23:34] <AXJ> Hey everyone, has anyone ever used one set of ceph monitors to monitor 2 different clusters?
[23:36] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[23:39] <[arx]> why
[23:39] <AXJ> We're running out of room in the rack
[23:40] <AXJ> I've read several times that no one recommends running ceph monitors in vm
[23:40] <AXJ> and I'd hate to burn 3 more U for monitors unless I really have to
[23:40] <[arx]> you can run multiple ceph-mon daemons with different --cluster flags, but i am pretty sure you can't use one set of daemons to monitor two clusters
[23:40] <dmick> ^ correct
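What [arx] describes amounts to giving each cluster its own name and config; the cluster names, mon ids and addresses below are examples:

    ceph-mon --cluster ssd -i a --public-addr 10.0.0.1   # reads /etc/ceph/ssd.conf
    ceph-mon --cluster hdd -i a --public-addr 10.0.0.2   # reads /etc/ceph/hdd.conf
    ceph --cluster ssd -s                                 # talk to one cluster or the other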
[23:41] <AXJ> got it. thanks for letting me know
[23:41] <AXJ> if I try to run the mons in a vm will everything just explode?
[23:41] <dmick> note that, also, if you do that, of course, then both clusters share the same failure domain
[23:42] <dmick> so if you lose a physical host you lose redundancy for both clusters
[23:42] <AXJ> That makes sense. Maybe we've been lucky but we've never had any issues with the monitors
[23:43] <dmick> everything's great until something breaks :)
[23:43] <AXJ> We're going to have one set of OSDs with SSDs and another set with disks. I know there are ways to create pools that just use one or the other, but that just looked more complex than having separate clusters
[23:47] * srk (~Siva@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[23:49] * narthollis (~dicko@178.162.205.1) has joined #ceph
[23:55] * bniver (~bniver@pool-98-110-180-234.bstnma.fios.verizon.net) has joined #ceph
[23:58] * Nacer (~Nacer@vir78-1-82-232-38-190.fbx.proxad.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.