#ceph IRC Log


IRC Log for 2013-09-23

Timestamps are in GMT/BST.

[0:03] * rovar (~rick@pool-96-246-17-104.nycmny.fios.verizon.net) has joined #ceph
[0:11] * diegows (~diegows@ has joined #ceph
[0:13] * yanzheng (~zhyan@ has joined #ceph
[0:17] * ScOut3R_ (~scout3r@540099D1.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[0:29] * malcolm_ (~malcolm@silico24.lnk.telstra.net) has joined #ceph
[0:30] * rendar (~s@host39-118-dynamic.53-82-r.retail.telecomitalia.it) Quit ()
[0:44] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[1:00] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[1:03] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:40] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) has joined #ceph
[2:19] * The_Bishop (~bishop@2001:470:50b6:0:c459:c1aa:327b:d446) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[2:35] * rudolfsteiner (~federicon@ has joined #ceph
[2:35] * LeaChim (~LeaChim@host86-135-252-168.range86-135.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[2:39] * rudolfsteiner (~federicon@ Quit ()
[2:53] * markbby (~Adium@ has joined #ceph
[3:03] * yy-nm (~Thunderbi@ has joined #ceph
[3:04] * freedomhui (~freedomhu@ has joined #ceph
[3:13] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[3:15] * freedomhui (~freedomhu@ has joined #ceph
[3:18] * shang (~ShangWu@ has joined #ceph
[3:22] * rudolfsteiner (~federicon@ has joined #ceph
[3:22] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[3:28] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[3:28] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[3:28] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[3:31] * freedomhui (~freedomhu@ has joined #ceph
[3:48] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:48] * erice (~erice@host-sb226.res.openband.net) has joined #ceph
[3:52] * markbby (~Adium@ Quit (Quit: Leaving.)
[3:52] * markbby (~Adium@ has joined #ceph
[3:52] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[3:57] * markbby (~Adium@ Quit (Remote host closed the connection)
[4:02] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[4:15] * dalegaard (~dalegaard@vps.devrandom.dk) Quit (Read error: Operation timed out)
[4:15] * dalegaard (~dalegaard@vps.devrandom.dk) has joined #ceph
[4:34] * freedomhui (~freedomhu@ has joined #ceph
[4:55] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) has joined #ceph
[5:05] * fireD_ (~fireD@93-136-89-153.adsl.net.t-com.hr) has joined #ceph
[5:07] * fireD (~fireD@93-139-128-56.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:28] * julian (~julianwa@ has joined #ceph
[5:33] * erice (~erice@host-sb226.res.openband.net) Quit (Ping timeout: 480 seconds)
[5:39] * Andes (~oftc-webi@ has joined #ceph
[5:39] <Andes> anyone ever try to use rbd client??
[5:40] <Andes> I update my ubuntu kernel to 3.8.0-27, and run modprobe rbd, but get the error:'FATAL: Error inserting rbd (/lib/modules/3.8.0-27-generic/kernel/drivers/block/rbd.ko): Operation not permitted'
[5:41] <Andes> sorry, get the wrong user. 'FATAL: Error inserting rbd (/lib/modules/3.8.0-27-generic/kernel/drivers/block/rbd.ko): Invalid module format'
[5:41] <Andes> it shows this info
[5:42] <xarses> maybe try rebuilding the kernel driver
[5:42] <xarses> otherwise i'm not sure
[5:58] * yy-nm (~Thunderbi@ Quit (Quit: yy-nm)
[6:02] <Andes> i am not familiar with the kernel.
[6:02] * rovar (~rick@pool-96-246-17-104.nycmny.fios.verizon.net) Quit (Quit: Ex-Chat)
[6:02] <Andes> now I use ubuntu13.04 directly
[6:02] <Andes> I work anyway
[6:02] <Andes> it works anyway
[6:03] <Andes> try to find some docs about managing the block device.
[6:05] <malcolm_> how did you go with your Rados on centos xarses?
[6:05] <Andes> I am going for lunch, see you all.
[6:11] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:12] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[6:22] * julian (~julianwa@ Quit (Quit: afk)
[6:23] <xarses> malcolm_ it's still going, can't figure out where i have some errors from
[6:29] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:37] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[6:38] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:56] * MrNPP (~MrNPP@ Quit (Ping timeout: 480 seconds)
[7:00] * MrNPP (~MrNPP@ has joined #ceph
[7:05] <Qu310> hey guys any issues with ceph repo? http://ceph.com/debian-dumpling/pool/
[7:10] <Qu310> ah never mind apt-get was stuffed
[7:16] * glzhao (~glzhao@li565-182.members.linode.com) has joined #ceph
[7:30] * glzhao_ (~glzhao@ has joined #ceph
[7:32] * glzhao (~glzhao@li565-182.members.linode.com) Quit (Ping timeout: 480 seconds)
[7:36] * yy-nm (~Thunderbi@ has joined #ceph
[8:04] * Andes (~oftc-webi@ Quit (Quit: Page closed)
[8:15] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[8:17] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:17] * foosinn (~stefan@office.unitedcolo.de) has joined #ceph
[8:28] * Vjarjadian (~IceChat77@ Quit (Quit: If you think nobody cares, try missing a few payments)
[8:30] * sleinen (~Adium@2001:620:0:46:60a9:ec36:7e36:50f4) has joined #ceph
[8:32] * sleinen1 (~Adium@2001:620:0:46:446c:e700:245d:9f89) has joined #ceph
[8:36] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[8:38] * sleinen (~Adium@2001:620:0:46:60a9:ec36:7e36:50f4) Quit (Ping timeout: 480 seconds)
[8:46] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[8:57] * jks (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[8:59] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[9:00] * topro (~topro@host-62-245-142-50.customer.m-online.net) has joined #ceph
[9:05] * andrei_ (~andrei@ has joined #ceph
[9:06] * BManojlovic (~steki@ has joined #ceph
[9:07] * andrei (~andrei@ Quit (Read error: No route to host)
[9:08] * freedomhui (~freedomhu@ has joined #ceph
[9:10] * xarses1 (~andreww@c-71-202-167-197.hsd1.ca.comcast.net) has joined #ceph
[9:10] * xarses (~andreww@c-71-202-167-197.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[9:12] * andrei_ (~andrei@ Quit (Read error: Operation timed out)
[9:21] * andrei_ (~andrei@ has joined #ceph
[9:35] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) Quit (Quit: Bye)
[9:37] * malcolm_ (~malcolm@silico24.lnk.telstra.net) Quit (Ping timeout: 480 seconds)
[9:39] * ScOut3R (~ScOut3R@catv-89-133-21-203.catv.broadband.hu) has joined #ceph
[9:46] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) has joined #ceph
[9:51] * syed_ (~chatzilla@ has joined #ceph
[10:04] * dalegaar1 (~dalegaard@vps.devrandom.dk) has joined #ceph
[10:04] * dalegaard (~dalegaard@vps.devrandom.dk) Quit (Read error: Operation timed out)
[10:07] * allsystemsarego (~allsystem@ has joined #ceph
[10:08] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[10:09] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[10:11] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) Quit (Quit: Leaving.)
[10:12] * JustEra (~JustEra@ has joined #ceph
[10:17] * LeaChim (~LeaChim@host86-135-252-168.range86-135.btcentralplus.com) has joined #ceph
[10:17] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:18] * cofol1986 (~xwrj@ has joined #ceph
[10:19] <cofol1986> Hello, everyone, I always get this problem when using ceph-deploy to deploy my cluster:[node0][DEBUG ] === mon.node0 ===
[10:19] <cofol1986> [node0][DEBUG ] Starting Ceph mon.ruijie-node0 on ruijie-node0...
[10:19] <cofol1986> [node0][DEBUG ] Starting ceph-create-keys on ruijie-node0...
[10:19] <cofol1986> [node0][WARNIN] No data was received after 7 seconds, disconnecting...
[10:20] <cofol1986> any idea?
[10:31] * jcfischer (~fischer@macjcf.switch.ch) has joined #ceph
[10:34] <syed_> cofol1986: hi
[10:35] <syed_> cofol1986: which ceph release you are using and which os ?
[10:35] * jbd_ (~jbd_@2001:41d0:52:a00::77) has joined #ceph
[10:38] <JustEra> Someone can help me to debug a mon ? it can't start properly and suicide after 5s :( : http://pastebin.com/uaJikB17
[10:40] * guppy_ (~quassel@guppy.xxx) Quit (Quit: No Ping reply in 180 seconds.)
[10:41] <syed_> JustEra: can you check the cluster health with "ceph health detail"
[10:42] * guppy (~quassel@guppy.xxx) has joined #ceph
[10:43] <JustEra> syed_, http://pastebin.com/4TccSvYS
[10:44] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[10:47] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[10:50] * rendar (~s@ has joined #ceph
[11:00] * wido__ is now known as wido
[11:00] <wido> joao: Are you around?
[11:00] <wido> joao: http://ceph.com/docs/next/dev/mon-bootstrap/
[11:00] <wido> A bootstrap keyfile with a mon. and client.admin key should be correct right?
[11:01] <wido> I always used that, but suddenly my client.admin key doesn't work, I get an access denied
[11:02] <andrei_> hello guys
[11:02] <andrei_> does anyone know if the bug 6278 is going to be addressed anytime soon?
[11:07] <andrei_> not sure if this affects any of your servers, but i've noticed that without scrubbing my servers are not crashing
[11:08] <joao> wido, I believe it should
[11:08] <joao> wido, getting access denied on what?
[11:08] <wido> joao: A simple 'ceph -s'
[11:08] <wido> but I just looked at the mkcephfs source, it also adds --caps to the ceph-authtool
[11:08] <wido> Retried with that and works
[11:08] <wido> So the docs have to be fixed there
[11:09] <joao> ah
[11:09] <wido> I have push permissions nowadays, so I might fix it :)
[11:09] <joao> you need mon 'allow *' on client.admin
[11:09] <joao> right
[11:09] <joao> the docs are in need of a good scrubbing
[11:10] <joao> wido, are we going to have the pleasure of your company in London? :p
[11:10] <wido> joao: Yes, I'll be in London
[11:10] <joao> kickass
[11:10] <ccourtaut> we'll see you there so :)
[11:11] <wido> ccourtaut: Cool :)
[11:11] <joao> loicd is coming too
[11:11] <wido> joao: You are right, the docs need some scrubbing. I'll take care of this one
[11:11] <joao> gonna be great
[11:11] <loicd> yes absolutely yes
[11:11] <joao> wido, yeah, there are a few other things that should either be addressed or made clear
[11:11] <joao> trying to find the time in this sprint or the next to dedicate a couple of days just to the docs
[11:13] <wido> joao: Yes, since this can really scare off new users
[11:27] <JustEra> Hello, I've a weird bug: doing a clean install with 3 servers, all 3 mons are up, but if I restart one it doesn't come back up..
[11:28] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (Remote host closed the connection)
[11:40] <andrei_> has anyone tried ceph with zfs? Is it stable for production yet?
[11:40] <andrei_> I am currently using xfs with ceph, but I can foresee a great deal of performance boost if I switch to zfs
[11:40] * claenjoy (~leggenda@ Quit (Quit: Leaving.)
[11:41] * claenjoy (~leggenda@ has joined #ceph
[11:42] * yy-nm (~Thunderbi@ Quit (Quit: yy-nm)
[11:44] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[11:45] <andrei_> what is the safe size for this value: "rbd cache max dirty age = 5"
[12:04] * guppy_ (~quassel@guppy.xxx) has joined #ceph
[12:04] * guppy (~quassel@guppy.xxx) Quit (Read error: Connection reset by peer)
[12:06] * cofol1986 (~xwrj@ Quit (Quit: Leaving.)
[12:09] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[12:10] * freedomhui (~freedomhu@ has joined #ceph
[12:17] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[12:41] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has left #ceph
[12:46] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[12:52] * erice (~erice@ has joined #ceph
[12:54] * shang (~ShangWu@ Quit (Quit: Ex-Chat)
[12:57] * AfC (~andrew@2001:44b8:31cb:d400:6e88:14ff:fe33:2a9c) has joined #ceph
[13:01] * glzhao_ (~glzhao@ Quit (Quit: leaving)
[13:04] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[13:06] * rudolfsteiner (~federicon@ has joined #ceph
[13:12] * thomnico (~thomnico@ has joined #ceph
[13:16] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[13:23] * syed_ (~chatzilla@ Quit (Ping timeout: 480 seconds)
[13:31] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (Read error: Connection reset by peer)
[13:32] * mnash (~chatzilla@vpn.expressionanalysis.com) has joined #ceph
[13:32] * syed_ (~chatzilla@ has joined #ceph
[13:32] * mnash (~chatzilla@vpn.expressionanalysis.com) Quit (Read error: Connection reset by peer)
[13:33] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[13:34] * kuba (~kuba@ has joined #ceph
[13:37] * mnash_ (~chatzilla@vpn.expressionanalysis.com) has joined #ceph
[13:37] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (Read error: Connection reset by peer)
[13:37] * mnash_ is now known as mnash
[13:39] * mnash_ (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[13:39] * mnash (~chatzilla@vpn.expressionanalysis.com) Quit (Read error: Connection reset by peer)
[13:39] * mnash_ is now known as mnash
[13:42] * mnash_ (~chatzilla@vpn.expressionanalysis.com) has joined #ceph
[13:42] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (Read error: Connection reset by peer)
[13:42] * mnash_ is now known as mnash
[13:46] * thomnico (~thomnico@ Quit (Ping timeout: 480 seconds)
[13:52] * mnash (~chatzilla@vpn.expressionanalysis.com) Quit (Ping timeout: 480 seconds)
[13:53] * syed_ (~chatzilla@ Quit (Quit: ChatZilla [Firefox 23.0/20130803193131])
[14:08] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:10] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[14:17] * AfC (~andrew@2001:44b8:31cb:d400:6e88:14ff:fe33:2a9c) Quit (Quit: Leaving.)
[14:18] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[14:20] * freedomhui (~freedomhu@ has joined #ceph
[14:21] * diegows (~diegows@ has joined #ceph
[14:35] * janisg (~troll@ Quit (Quit: The lyf so short, the craft so longe to lerne)
[14:35] * mattt_ (~mattt@ has joined #ceph
[14:36] * janisg (~troll@ has joined #ceph
[14:44] * glzhao (~glzhao@ has joined #ceph
[14:48] * markbby (~Adium@ has joined #ceph
[14:56] * alexxy[home] (~alexxy@2001:470:1f14:106::2) has joined #ceph
[14:56] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Read error: Connection reset by peer)
[14:58] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) has joined #ceph
[14:58] * andrei_ (~andrei@ Quit (Ping timeout: 480 seconds)
[14:59] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:03] * wido_ (~wido@2a00:f10:121:100:4a5:76ff:fe00:199) has joined #ceph
[15:04] * wido (~wido@2a00:f10:121:100:4a5:76ff:fe00:199) Quit (Read error: Connection reset by peer)
[15:05] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) has joined #ceph
[15:09] * tsnider (~tsnider@nat-216-240-30-23.netapp.com) has joined #ceph
[15:15] * markl (~mark@tpsit.com) has joined #ceph
[15:20] <swinchen> So for running a HA ceph cluster... obviously the OSDs need to be replicated, multiple monitor nodes (probably with multipath routing?), .... what do you do with the metadata node?
[15:26] * shang (~ShangWu@1-171-115-249.dynamic.hinet.net) has joined #ceph
[15:29] * julian (~julianwa@ has joined #ceph
[15:29] <JustEra> How I can "restart" a client ? I'm getting a bad fsid
[15:30] <yanzheng> swinchen, active mds + standby mds
[15:36] * MACscr1 (~Adium@c-98-214-103-147.hsd1.il.comcast.net) Quit (Ping timeout: 480 seconds)
[15:36] <swinchen> yanzheng: Ok, so it is safe to say that for a production HA storage cluster you need (at minimum) 2 storage OSD nodes, 2 metadata nodes and 2 monitor nodes? Is it acceptable to combine the metadata and monitor nodes?
[15:38] <yanzheng> what does "combine the metadata and monitor nodes" mean?
[15:39] <yanzheng> you need odd number of monitor nodes
[15:39] <swinchen> yanzheng: run both daemons on the same node.
[15:39] <yanzheng> you can
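A note on the monitor count discussed above: Ceph monitors need a strict majority of their number to form a quorum, so an even count tolerates no more failures than the odd count just below it. The Python sketch below only illustrates that arithmetic. The function names are invented for this example and are not part of any Ceph tool.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors`."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be down while a majority is still up."""
    return monitors - quorum_size(monitors)


# Compare a few cluster sizes: 2 monitors tolerate no failure at all,
# and 4 monitors tolerate exactly as many failures as 3.
for n in range(1, 6):
    print(f"{n} monitor(s): quorum {quorum_size(n)}, "
          f"can lose {tolerated_failures(n)}")
```

With two monitors, as in swinchen's proposed layout, losing either one breaks the quorum, which is why three is the usual minimum for an HA setup.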
[15:39] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[15:40] <tsnider> Hi -- Am I going to run into problems if I have the same node act as both monitor and MDS? i.e. operational and/or performance problems
[15:43] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[15:43] <yanzheng> mds may use lots of memory, it may affect the monitor if your node doesn't have enough memory
[15:45] <tsnider> yanzheng: ok - I'll keep that in mind. thx
[15:45] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[15:46] * dmsimard (~Adium@ has joined #ceph
[15:46] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[15:47] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[15:47] <swinchen> yanzheng: what would be the disk requirements for a combined mds/monitor node? I guess if I used lvm I could easily expand it...
[15:47] * alfredod_ (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[15:47] * alfredod_ (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[15:48] * alfredod_ (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[15:48] <yanzheng> no desk space is required for mds
[15:48] <yanzheng> s/desk/disk
[15:48] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Read error: Connection reset by peer)
[15:49] <yanzheng> all data used by mds is stored in the object store
[15:49] * alfredod_ is now known as alfredodeza
[15:49] * mjeanson (~mjeanson@bell.multivax.ca) Quit (Remote host closed the connection)
[15:49] * mjeanson (~mjeanson@bell.multivax.ca) has joined #ceph
[15:49] <swinchen> yanzheng: oh, that is pretty cool.
[15:51] <joelio_> may as well leverage the system :)
[15:51] * joelio_ is now known as joelio
[15:51] * sjm (~sjm@ has joined #ceph
[15:52] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[15:55] * rudolfsteiner (~federicon@ has joined #ceph
[15:55] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[15:56] * mnash_ (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[15:56] * mnash_ is now known as mnash
[15:57] * zackc (~zack@formosa.juno.dreamhost.com) has joined #ceph
[15:57] * zackc is now known as Guest79
[15:58] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[15:59] <swinchen> Can someone take a look at how I installed ceph? I constantly have "192 pgs stuck inactive; 192 pgs stuck unclean" http://pastie.org/8348940# (the first time this shows up on a brand new install is on line 116)
[16:00] <swinchen> I have three nodes (ceph-admin is the monitor, ceph0 and ceph1 are the osds with two disks each)
[16:00] <joelio> swinchen: all on the same node?
[16:00] <joelio> ahh, ignore
[16:01] <swinchen> I should also mention this is Fedora 19 running as kvm virtual machines. I have iptables completely disabled and "setenforce 0" (selinux disabled)
[16:02] * pieter (~pieter@105-237-50-229.access.mtnbusiness.co.za) has joined #ceph
[16:02] <pieter> Hi guys, am I right that more x OSD = faster speed?
[16:03] <swinchen> I have tried installing several times. Each time I start over I run "ceph-deploy purge ceph-admin ceph0 ceph1" then go to each node and make sure the OSD disks are unmounted, delete /var/run/ceph and /var/lib/ceph. I don't know how to clean it up much more than that.
[16:03] <alfredodeza> swinchen: have you used purgedata ?
[16:03] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[16:03] * Guest79 is now known as zackc
[16:04] * vata (~vata@2607:fad8:4:6:7891:2cb3:fd76:a5e3) has joined #ceph
[16:05] <swinchen> alfredodeza: no, I was under the impression that purge also did "purgedata" oops
[16:08] <yanzheng> swinchen, that's strange, all osd are up and in
[16:08] <yanzheng> try create a new pool
[16:09] * rudolfsteiner (~federicon@host135.181-14-188.telecom.net.ar) has joined #ceph
[16:09] <swinchen> I just tried cleaning up the cluster, running purge, then purgedata... then I ran new, install mon create, gatherkeys... it still lists 192 stuck pgs.
[16:09] <swinchen> ok, I will give that a shot
[16:10] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[16:11] * yanzheng still uses mkcephfs to create his test cluster
[16:12] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[16:13] <swinchen> yanzheng: http://pastie.org/pastes/8348976/text
[16:13] <swinchen> That is creating a new pool
[16:15] <yanzheng> can OSDs communicate with each other?
[16:17] <yanzheng> enable osd debug, the log will give you some hints
[16:18] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:22] * rudolfsteiner (~federicon@host135.181-14-188.telecom.net.ar) Quit (Quit: rudolfsteiner)
[16:24] <swinchen> yanzheng: do you recommend "debug osd = 20/5"?
[16:25] <yanzheng> yes
[16:25] <yanzheng> add "debug osd = 20" to the osd section of ceph.conf
[16:25] * ScOut3R (~ScOut3R@catv-89-133-21-203.catv.broadband.hu) Quit (Quit: Leaving...)
[16:27] <swinchen> alright, will give it a shot
[16:29] <joelio> there won't be an OSD section if using ceph-deploy, but should be a global iirc
[16:29] * ishkbabob (~c7a82cc0@webuser.thegrebs.com) has joined #ceph
[16:30] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[16:34] <ishkbabob> hey guys, does anyone know if it's a problem to mount an RBD on a Ceph cluster node? (i.e. - mounting an RBD on machine that also has MONs and OSDs)?
[16:35] <joelio> yes, badness
[16:35] <ishkbabob> joelio: how come? any resources on this?
[16:35] <mikedawson> ishkbabob: I believe it is bad if you are using kernel rbd. I do it frequently with userspace rbd (qemu/rbd guests)
[16:36] <ishkbabob> we're trying to re-export an RBD over CIFS, and everything seems fine, but when we start moving data to it we get tons of slow requests
[16:36] <ishkbabob> which would normally seem reasonable, but we're not even touching the bandwidth on these nodes (40 gigabit bonded ethernet)
[16:36] <ishkbabob> mikedawson: thanks for the info, that might actually help in troubleshooting
[16:37] * rudolfsteiner (~federicon@host135.181-14-188.telecom.net.ar) has joined #ceph
[16:37] <swinchen> yanzheng: http://paste.openstack.org/show/47380/ do you see anything out of the ordinary? right near the top there are several lines with "-1" that look like errors. :/
[16:38] <mikedawson> ishkbabob: that being said, I do run into slow requests under certain situations (mostly backfilling / recovery, and possibly scrub / deep-scrub)
[16:38] * rudolfsteiner (~federicon@host135.181-14-188.telecom.net.ar) Quit ()
[16:38] <swinchen> -1 journal check: ondisk fsid d8711a8a-99ed-4525-8bd0-66fd5cc5924a doesn't match expected b3ffa8b8-685d-40e9-ab0d-7080f24574ed, invalid (someone else's?) journal
[16:38] <swinchen> -1 filestore(/var/lib/ceph/tmp/mnt.oprwFG) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
[16:40] <swinchen> It is almost like purgedata is not deleting the journal. I wonder if I should dd 0's to the OSD disks... :/
[16:40] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[16:40] * via (~via@smtp2.matthewvia.info) Quit (Remote host closed the connection)
[16:41] <mikedawson> swinchen: I see the journal check error when I am re-creating an OSD with a previously used journal. It is benign.
[16:42] <mikedawson> in that context anyway.
[16:42] * danieagle (~Daniel@ has joined #ceph
[16:43] <swinchen> Well I wonder what the heck is going on then :/ I have scanned through the osd log and don't see anything other than those. :/
[16:44] <yanzheng> try deleting all data in /var/lib/ceph/ while the osd disks are mounted
[16:44] <swinchen> yanzheng: ok
[16:45] <mikedawson> swinchen: i haven't read back too far, but it looks like something isn't right in your CRUSH map perhaps
[16:45] * julian (~julianwa@ Quit (Read error: Connection reset by peer)
[16:46] <mikedawson> yanzheng: that's some serious brute force
[16:47] * via (~via@smtp2.matthewvia.info) has joined #ceph
[16:48] <swinchen> mikedawson: I am trying to install fresh.
[16:49] <yanzheng> don't forget to delete old monitor data
[16:49] <swinchen> yanzheng: how do I do that?
[16:50] <yanzheng> delete data in /var/lib/ceph/mon on the monitor node
[16:50] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has left #ceph
[16:52] * sjm (~sjm@ Quit (Quit: Leaving)
[16:53] <swinchen> yanzheng: ok, it looks like either purge or purgedata takes care of that for me.
[16:53] <tsnider> another question: for mds installation --- instructions on http://ceph.com/docs/next/start/quick-ceph-deploy/#add-a-mds say:
[16:53] <tsnider> "To use CephFS, you need at least one metadata node. Execute the following to create a metadata node:
[16:53] <tsnider> ceph-deploy mds create {node-name} "
[16:53] <tsnider> does this imply that "ceph-deploy install {node-name}" has been done first?
[16:54] <swinchen> Instead of doing "ceph-deploy osd prepare ceph0:vdc ceph0:vdd ceph1:vdc ceph1:vdd" should I just do one disk first, then add them one at a time?
[16:55] <pieter> Hi guys, am I right that more x OSD = faster speed?
[16:56] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[16:56] * ChanServ sets mode +o scuttlemonkey
[16:57] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[16:58] <dmsimard> tsnider: Yes, ceph-deploy install installs the ceph dependencies.
[16:58] <swinchen> Ugh! same exact thing. This is after scraping /var/lib/ceph with the OSD disks still mounted, then doing all the purging and recreating :( http://pastie.org/pastes/8349073/text
[16:58] * swinchen doesn't understand what he is doing wrong.
[16:58] <tsnider> dmsimard: thx -- sometimes instructions aren't clear
[16:59] <tsnider> for newbies
[16:59] <dmsimard> pieter: It's more complicated than that - I guess it's more a matter of capacity and distributing the load between the OSDs. The speed is more dependent on your disks and your network throughput.
[17:00] * BillK (~BillK-OFT@124-169-207-19.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:00] <dmsimard> The more OSDs you have, the better and faster you will recover from a single OSD failure
[17:00] <JustEra> Someone have a solution for backup file inside a ceph cluster ?
[17:00] <mattt_> swinchen: what is the problem you are having ?
[17:01] <pieter> thank you
[17:01] * pieter (~pieter@105-237-50-229.access.mtnbusiness.co.za) Quit (Quit: Konversation terminated!)
[17:02] <swinchen> mattt_: http://pastie.org/8348940# (starts @ line 116) This is my install log. No matter what I try I always get 192 stuck pgs.
[17:03] <swinchen> Now I am dd'ing 0's to the OSD disks. ugh.
[17:03] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[17:04] <swinchen> I am going to try adding the OSDs one at a time as well.
[17:05] <swinchen> after that I am going to try installing without ceph-deploy, but I haven't found much documentation on that.
[17:05] <JustEra> swinchen, do "yum install redhat-lsb" maybe it will help for the cmd to run successfully
[17:05] <swinchen> JustEra: I will give that a shot once I wipe these disks.
[17:06] <mattt_> swinchen: so did you have a running config before you started step 1 ?
[17:06] <swinchen> mattt_: I had several failed attempts
[17:07] <mattt_> swinchen: yeah, i wonder if those stuck pgs are from old data
[17:08] <swinchen> mattt_: that is what I am thinking too... but I mounted the OSD disks and deleted all the data from them. I am not sure where else it could be stored.
[17:08] <mattt_> swinchen: the MONs will store stuff too under /var/lib/ceph/mon i believe
[17:08] <swinchen> mattt_: and I deleted all /var/lib/ceph directories on all the nodes
[17:08] * sprachgenerator (~sprachgen@ has joined #ceph
[17:08] <mattt_> swinchen: ok, that answers that question then :P
[17:08] * kraken (~kraken@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[17:08] <mattt_> swinchen: but you did that on ceph-admin also ?
[17:09] <swinchen> mattt_: yep. all nodes... actually I think "ceph-deploy purgedata" deletes that directory.
[17:10] * shang (~ShangWu@1-171-115-249.dynamic.hinet.net) Quit (Ping timeout: 480 seconds)
[17:10] <mattt_> swinchen: on a side note, i have a dummy cluster running on fedora 19 too
[17:10] <swinchen> mattt_: Did you use ceph-deploy?
[17:10] <mattt_> swinchen: yap
[17:10] <swinchen> :/
[17:10] * JustEra (~JustEra@ Quit (Read error: Operation timed out)
[17:10] <mattt_> swinchen: using the ceph.com packages
[17:10] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[17:11] <mattt_> 0.67.2
[17:11] <swinchen> mattt_: me too. Hrmm.. I have selinux disabled, I stopped iptables... I can't imagine what is causing this.
[17:11] <mattt_> swinchen: not sure if it matters, but you may want to unmount the disks before you re-prepare / re-activate them
[17:13] <swinchen> mattt_: I do each time.
[17:14] <jnq> 12
[17:14] <mattt_> swinchen: and you're killing all daemons etc. before you nuke stuff right ?
[17:15] <yanzheng> swinchen's osd log shows each pg only chooses one acting osd
[17:15] <yanzheng> maybe it's crush rule issue
[17:16] <swinchen> mattt_: I verify that all daemons are killed, yes.
[17:16] * clayb (~kvirc@ has joined #ceph
[17:16] <mattt_> swinchen: sorry, new enough to ceph myself to not have any further suggestions :(
[17:17] * diegows (~diegows@ Quit (Read error: Operation timed out)
[17:18] <swinchen> mattt_: no worries. Thanks for the help
[17:18] <yanzheng> swinchen, all osd is on the same node?
[17:19] <mattt_> swinchen: may also want to make sure your ceph.conf is wiped before you start from scratch … not sure if ceph-deploy will overwrite that when it does its thing
[17:19] * glzhao (~glzhao@ Quit (Quit: leaving)
[17:19] <mattt_> and that references a cluster ID, so having a stale file around will presumably cause probs
[17:20] <swinchen> yanzheng: I have two OSDs per node, 2 OSD nodes.
[17:20] <swinchen> mattt_: I make sure that /etc/ceph is deleted.
[17:22] <swinchen> Gosh, it just doesn't matter... I scraped the disk ... still 192 pgs stuck. I am about to give up I think.
[17:22] <mattt_> swinchen: noooo
[17:23] <swinchen> mattt_: I just don't know what else to try.
[17:24] * tobru_ (~quassel@217-162-50-53.dynamic.hispeed.ch) has joined #ceph
[17:24] * sagelap1 (~sage@2600:1012:b02f:7df3:8c9f:39bf:9cd8:1d1c) has joined #ceph
[17:24] * foosinn (~stefan@office.unitedcolo.de) Quit (Quit: Leaving)
[17:29] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[17:30] <yanzheng> swinchen, what is the output of osd crush rule dump
[17:30] <yanzheng> "ceph osd crush rule dump"
[17:31] <swinchen> yanzheng: one second
[17:35] * rudolfsteiner (~federicon@ has joined #ceph
[17:35] <yanzheng> and "ceph pg map 0.1"
[17:35] * sagelap (~sage@2600:1012:b017:aa5:ccb1:69b8:aff2:596f) has joined #ceph
[17:39] * tobru_ (~quassel@217-162-50-53.dynamic.hispeed.ch) Quit (Remote host closed the connection)
[17:39] * danieagle (~Daniel@ Quit (Quit: inte+ e Obrigado Por tudo mesmo! :-D)
[17:39] * sagelap1 (~sage@2600:1012:b02f:7df3:8c9f:39bf:9cd8:1d1c) Quit (Ping timeout: 480 seconds)
[17:45] * jcfischer (~fischer@macjcf.switch.ch) Quit (Ping timeout: 480 seconds)
[17:49] <swinchen> yanzheng: http://pastie.org/8349175.
[17:49] <swinchen> err. http://pastie.org/8349175
[17:50] <swinchen> health HEALTH_WARN 192 pgs stuck inactive; 192 pgs stuck unclean
[17:50] <yanzheng> everything looks ok, no idea what's wrong.
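For readers following the numbers: the health line swinchen pasted reports the same count for both stuck states. The Python sketch below pulls those counts out of the line and compares them with 3 x 64, on the assumption that this fresh install has three default pools with 64 placement groups each. That pool layout is an assumption made for this example, not something shown in the log; if it holds, every placement group in the cluster is stuck rather than a subset.

```python
import re

# Health line copied from the channel.
HEALTH_LINE = "health HEALTH_WARN 192 pgs stuck inactive; 192 pgs stuck unclean"


def stuck_pg_counts(line: str) -> dict:
    """Return {state: count} for each 'N pgs stuck STATE' clause in the line."""
    return {state: int(count)
            for count, state in re.findall(r"(\d+) pgs stuck (\w+)", line)}


counts = stuck_pg_counts(HEALTH_LINE)
print(counts)

# Assumption (not from the log): three default pools, 64 PGs each.
ASSUMED_POOLS = 3
ASSUMED_PGS_PER_POOL = 64
total = ASSUMED_POOLS * ASSUMED_PGS_PER_POOL
print("all PGs stuck inactive:", counts.get("inactive") == total)
```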
[17:52] * ircolle (~Adium@c-67-165-237-235.hsd1.co.comcast.net) has joined #ceph
[17:52] <swinchen> They are VMs so if anyone wants to paste their public ssh key to look around I am fine with it.
[17:54] <swinchen> Maybe next I will try creating a cluster using a folder on the OS disk instead of using vdc/vdd
[17:57] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[17:57] * topro (~topro@host-62-245-142-50.customer.m-online.net) Quit (Quit: Konversation terminated!)
[17:57] <xarses1> swinchen: i just quickly glanced over your issue
[17:57] * xarses1 is now known as xarses
[17:58] <swinchen> xarses: Any idea?
[17:58] * mattt_ pulls up a chair
[17:58] <xarses> it looks like there is a problem with the crushmap believing that the osds are from separate nodes
[17:59] * swinchen double checks his /etc/hosts file
[17:59] <xarses> normally, it will start replicating the pages around as long as there are 2 hosts with at least 1 osd each
[17:59] <xarses> this can of course be changed (up or down)
[18:00] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:00] <xarses> I'm assuming you haven't changed it so it's odd that it hasn't come up
[18:00] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit ()
[18:00] <swinchen> xarses: No, I am using the default ceph.conf from ceph-deploy
[18:01] <swinchen> xarses: at one point (MANY attempts ago) I set
[18:01] <swinchen> osd crush chooseleaf type = 0
[18:01] <xarses> before creating the osds?
[18:01] <swinchen> xarses: yes. Back when I was trying it out on a single node.
[18:01] <xarses> ok
[18:02] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[18:03] <xarses> the hostnames on ceph0 and ceph1 are set correctly?
[18:03] <xarses> btw, i have no problems with virt-io disks in centos 6
[18:03] <swinchen> xarses: Oh crap. Perhaps not. /etc/hostname -> ceph0.novalocal
[18:04] <xarses> as long as thats your domain
[18:04] <xarses> ceph mostly cares about `hostname -s`
[18:04] <xarses> just as long as they both aren't the same name
[18:05] <xarses> and that they match the dns name
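The hostname check described here can be sketched as two small shell helpers. This is a minimal sketch, not part of Ceph: `short_name` and `duplicate_short_names` are hypothetical names, and the only assumption is that the short name is everything before the first dot, as `hostname -s` reports it.

```shell
# Strip the domain part, mirroring what `hostname -s` reports.
short_name() {
    printf '%s\n' "${1%%.*}"
}

# Read one hostname per line on stdin and print any short name
# that occurs more than once (no output means all are distinct).
duplicate_short_names() {
    while IFS= read -r fqdn; do
        short_name "$fqdn"
    done | sort | uniq -d
}
```

For the nodes in this log, `printf '%s\n' ceph0.novalocal ceph1.novalocal | duplicate_short_names` should print nothing, since the short names differ.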
[18:06] * sagelap (~sage@2600:1012:b017:aa5:ccb1:69b8:aff2:596f) Quit (Read error: No route to host)
[18:07] <swinchen> xarses: just to be sure I am going to change the entries in /etc/hosts to ceph[0,1,-admin] to ceph[0,1,-admin].novalocal
[18:09] * sleinen1 (~Adium@2001:620:0:46:446c:e700:245d:9f89) Quit (Ping timeout: 480 seconds)
[18:09] <xarses> ceph doesn't care about the domain so much, just the short name
[18:09] <xarses> swinchen, heading to work will be back in ~20
[18:13] <swinchen> xarses: ok, thanks. I need to head to lunch soon. I am also trying to add OSDs in an alternating pattern this time: ceph0:vdc ceph1:vdc ceph0:vdd ceph1:vdd
[18:13] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Read error: Connection reset by peer)
[18:13] <swinchen> That didn't do anything :(
[18:14] <swinchen> bbiab
[18:15] * dalegaard (~dalegaard@vps.devrandom.dk) has joined #ceph
[18:17] * dalegaar1 (~dalegaard@vps.devrandom.dk) Quit (Ping timeout: 480 seconds)
[18:17] * xarses (~andreww@c-71-202-167-197.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[18:19] * angdraug (~angdraug@ has joined #ceph
[18:23] * gregaf (~Adium@2607:f298:a:607:c2b:8d87:d52:b902) has joined #ceph
[18:24] * sagelap (~sage@2600:1012:b017:aa5:8c9f:39bf:9cd8:1d1c) has joined #ceph
[18:28] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) has joined #ceph
[18:28] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) Quit (Remote host closed the connection)
[18:29] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[18:31] * xarses (~andreww@ has joined #ceph
[18:36] * Yapi (~agent@ has joined #ceph
[18:40] * Yapi (~agent@ has left #ceph
[18:41] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[18:43] * Tamil (~Adium@cpe-108-184-71-119.socal.res.rr.com) has joined #ceph
[18:44] * simulx (~simulx@vpn.expressionanalysis.com) has joined #ceph
[18:44] <simulx> question about the exec reduce(....) python thing
[18:45] <simulx> basically i want to start over
[18:45] <simulx> and all my command are hanging
[18:45] <simulx> has anyone seen this?
[18:45] <xarses> simulx: not enough information here
[18:46] <simulx> if you ps-ef on any node... you'll see a command " sudo python -u -c exec reduce(lambda a,b: a+b, map(chr, (105,109,1
[18:46] <simulx> and it's stuck
[18:46] <simulx> i ran through the quick start
[18:46] <simulx> realized that i wanted to start over, and tried to zap all the osd's
[18:46] <simulx> but then commands started hanging
[18:48] <dmsimard> I added you
[18:48] <dmsimard> Wrong channel
[18:48] <simulx> ok
[18:49] <simulx> i'm running on deb 12 lts ... following pretty standard stuff
[18:50] <xarses> swinchen: back, caught up with most of your logs, i would 1) double, triple check your interfaces. 2) start really, really over. 3) add public_network and cluster_network to ceph.conf 4) use osd zap before prepare
[18:51] <xarses> use this to reset
[18:51] <xarses> export all="ceph-admin ceph0 ceph1"; ceph-deploy purge $all && ceph-deploy purgedata $all && ceph-deploy install $all && rm ceph* && ceph-deploy new
[18:51] * aliguori (~anthony@ has joined #ceph
[18:55] * mattt_ (~mattt@ Quit (Read error: Connection reset by peer)
[18:55] * jbd_ (~jbd_@2001:41d0:52:a00::77) has left #ceph
[18:57] * wusui (~Warren@2607:f298:a:607:cd8d:afc3:8329:e10) has joined #ceph
[19:00] * JustEra (~JustEra@ALille-555-1-127-163.w90-7.abo.wanadoo.fr) has joined #ceph
[19:02] * ishkbabob (~c7a82cc0@webuser.thegrebs.com) Quit (Quit: TheGrebs.com CGI:IRC)
[19:12] * dpippenger (~riven@cpe-75-85-17-224.socal.res.rr.com) Quit (Quit: Leaving.)
[19:13] * JustEra (~JustEra@ALille-555-1-127-163.w90-7.abo.wanadoo.fr) Quit (Quit: This computer has gone to sleep)
[19:15] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Remote host closed the connection)
[19:17] * carif (~mcarifio@cpe-74-78-54-137.maine.res.rr.com) Quit (Quit: Ex-Chat)
[19:21] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[19:22] * sagelap (~sage@2600:1012:b017:aa5:8c9f:39bf:9cd8:1d1c) Quit (Ping timeout: 480 seconds)
[19:26] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) has joined #ceph
[19:26] * ScOut3R (~scout3r@540099D1.dsl.pool.telekom.hu) has joined #ceph
[19:26] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[19:27] * tryggvil (~tryggvil@ has joined #ceph
[19:30] * rudolfsteiner (~federicon@ has joined #ceph
[19:33] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) has joined #ceph
[19:33] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Remote host closed the connection)
[19:37] * rturk-away is now known as rturk
[19:39] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[19:39] * sjustlaptop (~sam@2607:f298:a:607:41f:1c55:2baa:311d) has joined #ceph
[19:41] <swinchen> xarses: back. I only have one network on the VMs
[19:42] <xarses> ok
[19:43] <mikedawson> sagelap: Scrub has been hammering my OSDs when it runs. Added two more nodes running OSDs over the weekend. They are fully replicated now. Turned on scrub for a test. The two new nodes have way less iowait. Any idea why/where to look? http://www.gammacode.com/scrub-iowait-with-new-nodes.jpg
[19:43] <xarses> based on your ceph osd logs, i'd still scrub the osd partitions with ceph-zap
[19:43] <xarses> and build a new ceph.conf
[19:43] <swinchen> xarses: I do that every time: "ceph-deploy disk zap ceph-admin ceph0 ceph1"
[19:45] <xarses> close, ceph-deploy disk zap ceph0:vdb ceph0:vdc ceph1:vdb ceph1:vdc
[19:45] <xarses> you don't have any osd's on your admin, (zap blows away the partition table)
[19:45] <swinchen> xarses: that was a typo sorry... I include the disk when I do it.
[19:46] <xarses> ok, i didn't notice it in the log
[19:46] <xarses> ...
[19:47] <xarses> can you give me a ceph osd tree?
[19:47] * sleinen1 (~Adium@2001:620:0:25:fd72:2d18:2b1:e45c) has joined #ceph
[19:48] <swinchen> xarses: How does this look: export all="ceph-admin ceph0 ceph1"; ceph-deploy purge $all && ceph-deploy purgedata $all && ceph-deploy install $all && rm ceph* && ceph-deploy new ceph-admin && ceph-deploy mon create ceph-admin && ceph-deploy gatherkeys ceph-admin && ceph-deploy disk zap ceph0:vdc ceph0:vdd ceph1:vdc ceph1:vdd && ceph-deploy osd prepare ceph0:vdc ceph1:vdc ceph0:vdd ceph1:vdd && ceph-deploy osd activate ceph0:vdc ceph1:vdc
[19:49] * nwat (~nwat@eduroam-237-79.ucsc.edu) has joined #ceph
[19:49] <simulx> ceph purge is ok, that's what i was looking for (how to really, really, start over)
[19:50] <xarses> simulx do both purge, and then purgedata
[19:50] <simulx> ok
[19:50] <swinchen> xarses: sure... how do I generate it? I will look in the docs...
[19:50] <xarses> purge first
[19:50] <simulx> right now i can't even kill those processes, so i'm rebooting
[19:50] <xarses> swinchen. ceph osd tree
[19:52] <swinchen> xarses: http://pastie.org/pastes/8349460/text
[19:52] <xarses> swinchen, looks like your activate is missing disks or got truncated, but looks good
[19:52] <swinchen> xarses: must have gotten truncated: ceph-deploy osd activate ceph0:vdc ceph1:vdc ceph0:vdd ceph1:vdd
[19:53] <xarses> ok
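The full reset sequence discussed above can be collected in one place. The sketch below only prints the commands rather than running them; the host and disk names are the ones from this log, and `reset_commands` is a hypothetical helper name. The `rm ceph*` step is meant to run in the ceph-deploy working directory.

```shell
# Print (rather than run) the reset sequence, one command per line.
# Hosts and disks are the ones named in the log; adjust for another cluster.
reset_commands() {
    all="ceph-admin ceph0 ceph1"
    osds="ceph0:vdc ceph1:vdc ceph0:vdd ceph1:vdd"
    cat <<EOF
ceph-deploy purge $all
ceph-deploy purgedata $all
ceph-deploy install $all
rm ceph*
ceph-deploy new ceph-admin
ceph-deploy mon create ceph-admin
ceph-deploy gatherkeys ceph-admin
ceph-deploy disk zap $osds
ceph-deploy osd prepare $osds
ceph-deploy osd activate $osds
EOF
}
```

One way to use it is `reset_commands > reset.sh` followed by `sh -e reset.sh`, which stops at the first failing step much like the `&&` chain does.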
[19:53] * sjustlaptop (~sam@2607:f298:a:607:41f:1c55:2baa:311d) Quit (Ping timeout: 480 seconds)
[19:54] <xarses> swinchen, food for exercise, have you tried changing to cuttlefish? maybe this is a ceph bug
[19:54] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[19:56] <swinchen> xarses: I haven't tried other releases. I am trying 0.67.3 right now
[19:57] <xarses> ya, which is the current dumpling release
[19:57] <xarses> I'm at a loss as to why the crushmap wont let you start writing
[19:57] <xarses> i don't know enough how to tune it to force it either
[19:58] <swinchen> both OSD nodes mount their disks: /dev/vdc1 on /var/lib/ceph/osd/ceph-1 type xfs (rw,noatime,seclabel,attr2,inode64,noquota)
[19:58] <swinchen> hrm
[19:58] <xarses> you can look at the crushmap docs http://ceph.com/docs/master/rados/operations/crush-map/
[19:58] <xarses> hey for fun
[19:58] <xarses> add any spare disk from the monitor
[19:59] * freedomhui (~freedomhu@ Quit (Quit: Leaving...)
[19:59] <swinchen> xarses: is it ok if it is a directory? If not I can create a volume and add it.
[19:59] <xarses> swinchen: the documentation implies that it's ok to use a directory, just pass the full path
[19:59] <xarses> i haven't tested it
[20:00] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * rudolfsteiner (~federicon@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * Tamil (~Adium@cpe-108-184-71-119.socal.res.rr.com) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * dalegaard (~dalegaard@vps.devrandom.dk) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * kraken (~kraken@c-24-131-46-23.hsd1.ga.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * zackc (~zack@0001ba60.user.oftc.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * lx0 (~aoliva@lxo.user.oftc.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * dmsimard (~Adium@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * kuba (~kuba@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * MrNPP (~MrNPP@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * eternaleye_ (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * eternaleye (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * tchmnkyz (~jeremy@0001638b.user.oftc.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * Guest7383 (~coyo@thinks.outside.theb0x.org) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * torment_ (~torment@pool-72-64-182-81.tampfl.fios.verizon.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * joao (~joao@89-181-152-211.net.novis.pt) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * fretb (~fretb@frederik.pw) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * DarkAceZ (~BillyMays@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * athrift (~nz_monkey@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * AaronSchulz (~chatzilla@ Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] * DLange (~DLange@dlange.user.oftc.net) Quit (resistance.oftc.net oxygen.oftc.net)
[20:00] <swinchen> I will just create a new disk for the monitor node
[20:00] <xarses> ok
[20:00] <tsnider> I have a ceph filesystem mounted on a remote client and I'm able to do I/Os to it and see data go to the storage. Yipee. :) :) and have a question:
[20:00] <tsnider> If no authentication is needed -- I changed auth_supported = cephx to auth_supported = none in /etc/ceph/ceph.conf on the monitor node and restarted the ceph service. however authentication was still required on the remote client mount. What has to be done to get no auth. to take effect on the monitor node? I assume that the line doesn't have to be changed on all nodes in the cluster.
[20:00] <gregaf> tsnider: you will need to config each node to not use auth; yes, every one in the cluster
[20:01] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[20:01] * rudolfsteiner (~federicon@ has joined #ceph
[20:01] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) has joined #ceph
[20:01] * dalegaard (~dalegaard@vps.devrandom.dk) has joined #ceph
[20:01] * kraken (~kraken@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[20:01] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[20:01] * zackc (~zack@0001ba60.user.oftc.net) has joined #ceph
[20:01] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[20:01] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:01] * dmsimard (~Adium@ has joined #ceph
[20:01] * kuba (~kuba@ has joined #ceph
[20:01] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[20:01] * MrNPP (~MrNPP@ has joined #ceph
[20:01] * eternaleye_ (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) has joined #ceph
[20:01] * eternaleye (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) has joined #ceph
[20:01] * tchmnkyz (~jeremy@0001638b.user.oftc.net) has joined #ceph
[20:01] * Guest7383 (~coyo@thinks.outside.theb0x.org) has joined #ceph
[20:01] * torment_ (~torment@pool-72-64-182-81.tampfl.fios.verizon.net) has joined #ceph
[20:01] * joao (~joao@89-181-152-211.net.novis.pt) has joined #ceph
[20:01] * fretb (~fretb@frederik.pw) has joined #ceph
[20:01] * athrift (~nz_monkey@ has joined #ceph
[20:01] * AaronSchulz (~chatzilla@ has joined #ceph
[20:01] * DLange (~DLange@dlange.user.oftc.net) has joined #ceph
[20:01] <gregaf> sounds like you had the monitor not requiring auth but the client was insisting on it
[20:01] * ChanServ sets mode +v scuttlemonkey
[20:01] <gregaf> as a security system you hardly want one node to be telling all the others what security they require!
[20:03] <swinchen> xarses: http://pastie.org/pastes/8349484/text
[20:03] <swinchen> (all disks are 250GB)
[20:04] <tsnider> gregaf: understood -- just getting it up and kicking the tires.
[20:05] <xarses> seriously odd
[20:05] <tsnider> gregaf: I only switched the monitor node orginally and not MDS or object nodes
[20:06] <xarses> tsnider, since the client will talk directly to the osd nodes, the osd's will need to not require cephx too
[20:07] <xarses> since mds is technically a client, it will also need cephx disabled at some point
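Since every node has to agree on the auth setting, it can help to check what each node's ceph.conf actually says. A minimal sketch, assuming the setting is written as `auth_supported` (or `auth supported`) on its own line as in tsnider's description; `auth_supported_value` is a hypothetical helper name.

```shell
# Print the last auth_supported value set in a ceph.conf-style file
# (empty output if the option is not set at all).
auth_supported_value() {
    sed -n 's/^[[:space:]]*auth[_ ]supported[[:space:]]*=[[:space:]]*\([^[:space:];#]*\).*$/\1/p' "$1" | tail -n 1
}
```

Running `auth_supported_value /etc/ceph/ceph.conf` on the monitor, OSD, and MDS nodes and on the client should print `none` everywhere before expecting an unauthenticated mount to work.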
[20:08] * JustEra (~JustEra@ALille-555-1-127-163.w90-7.abo.wanadoo.fr) has joined #ceph
[20:09] <xarses> swinchen, I'd try the older cuttlefish release, I'm at a loss for you here and am worried that it's a ceph bug. the default crushmap should allow your cluster to come clean at this point
[20:10] <swinchen> pgmap v18: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail <--- this is interesting. It looks like it thinks there is 0 KB avail>
[20:10] <tsnider> xarses: understood - thx
[20:10] <xarses> swinchen, oh
[20:10] <xarses> ya
[20:10] <xarses> that is odd
[20:10] <xarses> didn't catch that
[20:10] <xarses> that would help to explain some of the behavior
[20:11] <xarses> but on the previous cluster it was OK for space
[20:11] * sjustlaptop (~sam@ has joined #ceph
[20:11] <xarses> http://pastie.org/pastes/8348976/text
[20:11] <swinchen> xarses: very strange. also the 192pgs are stuck on "creating" now it looks like.
[20:12] <xarses> explains the "creating" state instead of "active+degraded"
[20:12] * dmick (~dmick@2607:f298:a:607:8db2:46e7:7f9:f6a6) has joined #ceph
[20:12] * xarses is baffled
[20:12] <swinchen> /dev/vdc1 249G 34M 249G 1% /var/lib/ceph/osd/ceph-0
[20:12] * swinchen scratches his head
[20:12] <swinchen> ok... so to use cuddlefish do I just pass "cuddlefish" as a version parameter in ceph-deploy?
[20:13] <swinchen> cuttlefish... sorrt
[20:13] <swinchen> sorry
[20:14] <swinchen> nevermind... found it
[20:16] <swinchen> just to verify... I am going to run the exact same thing but replace: "ceph-deploy install $all" with "ceph-deploy install --stable cuttlefish $all"
[20:17] <xarses> yes
[20:18] <swinchen> eeek. that didn't work
[20:18] <swinchen> http://pastie.org/pastes/8349511/text
[20:19] <mikedawson> nhm: ping
[20:19] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[20:19] <swinchen> It is like it didn't update the repo data
[20:20] * diegows (~diegows@ has joined #ceph
[20:22] <dmsimard> Oh, that's a fun typo. "Cuddlefish" http://images.elfwood.com/art/m/e/meglyman/meglyman_cuddlefishcolor.jpg :D
[20:22] <swinchen> hrm. It looks like the repo file is ok.
[20:23] <swinchen> ahh ok... think I got it
[20:27] * sjustlaptop1 (~sam@ has joined #ceph
[20:27] * sjustlaptop (~sam@ Quit (Read error: Connection reset by peer)
[20:28] <swinchen> xarses: same thing with cuttlefish
[20:29] * swinchen is intrigued by the 0 KB problem :/
[20:31] * tryggvil (~tryggvil@ Quit (Quit: tryggvil)
[20:32] <tsnider> I modified /etc/ceph/ceph.conf for no authentication and to specify both private and public networks. Distributed /etc/ceph/ceph.conf to all nodes in the cluster and restarted ceph on all nodes using "service ceph -a restart". But auth is still required and no ports are open on the cluster facing ( network. http://pastie.org/8349534. Am I missing something?
[20:33] * joshd (~joshd@2607:f298:a:607:18ea:cdc0:52c9:7878) has joined #ceph
[20:34] <xarses> swinchen: still getting 0kb?
[20:34] <swinchen> xarses: yes.
[20:34] <swinchen> xarses: I am deleting the virtual disks and recreating them
[20:35] * carif (~mcarifio@cpe-74-78-54-137.maine.res.rr.com) has joined #ceph
[20:35] * claenjoy (~leggenda@ Quit (Quit: Leaving.)
[20:38] <swinchen> xarses: well it isn't a problem with the disks... I recreated them and it still shows up as 0 KB. I am at a loss.
[20:38] <nwat> Is there a way to generate a PG map with a given set of crush rules and osds without actually spinning up any cluster?
[20:39] <gregaf> osdmaptool
[20:39] <gregaf> or crushtool
[20:39] <gregaf> I forget which exactly
[20:39] <gregaf> oh wait, a pg map, not osdmap
[20:39] <nwat> hrm, ok i thought i checked those out.
[20:39] <gregaf> no, not really
[20:39] <xarses> swinchen: what is your hypervisor?
[20:39] <swinchen> xarses: kvm
[20:40] <xarses> virtio for disk controller?
[20:40] <swinchen> xarses: yep
[20:40] <nwat> gregaf: it seems like such a tool would be possible (though maybe not useful). does any major roadblock come to mind... before i go start hacking to make this happen?
[20:40] <gregaf> not really, no
[20:40] <nwat> cool thanks
[20:43] * carif (~mcarifio@cpe-74-78-54-137.maine.res.rr.com) Quit (Ping timeout: 480 seconds)
[20:44] <swinchen> This shouldn't cause any problems (for testing) should it? [ceph0][ERROR ] INFO:ceph-disk:Will colocate journal with data on /dev/vdc
[20:45] <xarses> it shouldn't
[20:46] * sjustlaptop1 (~sam@ Quit (Ping timeout: 480 seconds)
[20:46] * DarkAceZ (~BillyMays@ has joined #ceph
[20:47] <swinchen> Maybe I will reboot all the nodes. Sometimes things just need a good reboot
[20:53] <swinchen> Ok, rebooting didn't work.
[20:53] * Camilo (~Adium@ has joined #ceph
[20:54] <swinchen> Where is data about the cluster stored? /var/lib/ceph, /var/run/ceph, /etc/ceph and the OSD disks? I have changed/delete all of those and have the same problem.
[20:55] <xarses> /var/lib/ceph/*
[20:55] * rturk is now known as rturk-away
[20:55] <xarses> /var/lib/ceph/osd/ceph-0 will be the mount for osd.0
[20:57] * yeled (~yeled@spodder.com) Quit (Ping timeout: 480 seconds)
[20:58] <swinchen> Here is my network configuration: http://pastie.org/pastes/8349600/text
[20:58] <cmdrk> silly question but.. when I put files into CephFS, how is object size determined in the underlying object store? does a 1GB file become a 1GB object or does it become, say, 256x4MB objects?
[21:00] * yeled (~yeled@spodder.com) has joined #ceph
[21:04] * sjustlaptop (~sam@ has joined #ceph
[21:07] <swinchen> Huh... isn't ceph-deploy suppose to update ceph.conf?
[21:09] <swinchen> OH MY GOD!
[21:09] <xarses> not really
[21:09] <swinchen> xarses: please don't kill me...
[21:09] <xarses> it only writes ceph.conf if /etc/ceph/ceph.conf is empty
[21:10] <swinchen> xarses: for some reason ceph-deploy was not starting the OSD daemons. /etc/init.d/ceph start osd.[0-3] on the OSD nodes... fixed it.
[21:10] <swinchen> I can not believe I didn't check to make sure the daemons were running...
[21:10] <xarses> ceph-deploy should have started the daemons
[21:11] <swinchen> xarses: that must be a bug?
[21:11] <xarses> otherwise they shouldn't have shown up in the ceph -s or ceph osd tree
[21:12] <swinchen> xarses: I have no idea. Out of curiosity I did a "ps aux" and noticed they were not running so I started them. Now I have HEALTH_OK
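The check that finally found the problem (are the OSD daemons actually running?) can be sketched as a filter over `ps ax` output. This assumes each `ceph-osd` process carries its id as `-i <id>` on the command line; `running_osd_ids` and `missing_osd_ids` are hypothetical helper names.

```shell
# Read `ps ax` output on stdin and print the id of every ceph-osd process found.
running_osd_ids() {
    awk '/ceph-osd/ {
        for (i = 1; i < NF; i++)
            if ($i == "-i") print $(i + 1)
    }'
}

# Print each expected id (given as arguments) that is not among the
# ceph-osd processes listed on stdin.
missing_osd_ids() {
    running="$(running_osd_ids)"
    for id in "$@"; do
        printf '%s\n' "$running" | grep -qx "$id" || printf '%s\n' "$id"
    done
    return 0
}
```

On a node expected to host osd.0 and osd.2, `ps ax | missing_osd_ids 0 2` prints the ids that are not running; each can then be started with `/etc/init.d/ceph start osd.<id>` as swinchen did.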
[21:12] <xarses> random
[21:13] <xarses> so when you have lots of time re-image your machines and start over, if it does it again, it's definitely a ceph-deploy bug
[21:13] <xarses> I am glad there is a reason that it wasn't working
[21:13] <xarses> just not that you ran into it
[21:15] <simulx> ok, i figured out how to purge everything and i figured out what's wrong... the "/etc/ceph/ceph.client.admin.keyring" file isn't being created during the install
[21:15] <simulx> is that something i can back-fill?
[21:15] * phantomcircuit (~phantomci@covertinferno.org) Quit (Quit: quit)
[21:15] * phantomcircuit (~phantomci@covertinferno.org) has joined #ceph
[21:15] <swinchen> xarses: thank you for all your help.
[21:16] <simulx> i guess i can run mkcephfs by hand... but shouldn't ceph deploy do that for you?
[21:17] <xarses> simulx, ceph.client.admin.keyring doesn't usually need to be in /etc/ceph
[21:17] <xarses> simulx you should be able to ceph-deploy gatherkeys <node> after ceph-deploy mon create <node>...
[21:17] <xarses> if gatherkeys cant find all of the keys, then there isn't a quorum
[21:17] <simulx> ah
[21:18] <simulx> i get this error:
[21:18] <simulx> [ceph_deploy.gatherkeys][WARNIN] Unable to find /etc/ceph/ceph.client.admin.keyring on ['usadc-nasea05', 'usadc-nasea06', 'usadc-nasea07', 'usadc-nasea08']
[21:18] <simulx> all the other keys gather
[21:18] * rturk-away is now known as rturk
[21:18] <simulx> i get the ceph.bootstrap-mds.keyring, ceph.bootstrap-osd.keyring, ceph.mon.keyring
[21:18] <simulx> but not the admin keyring
[21:18] <xarses> ps ax | grep ceph
[21:19] <simulx> 6637 ? Ssl 0:00 /usr/bin/ceph-mon --cluster=ceph -i usadc-nasea05 -f
[21:20] <xarses> can you share the command lines you used for ceph-deploy new, and ceph-deploy mon create?
[21:20] <simulx> ceph-deploy new usadc-nasea05 usadc-nasea06 usadc-nasea07 usadc-nasea08
[21:20] <simulx> ceph-deploy mon create usadc-nasea05 usadc-nasea06 usadc-nasea07 usadc-nasea08
[21:21] <xarses> run /usr/sbin/ceph-create-keys
[21:21] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) Quit (Remote host closed the connection)
[21:21] <simulx> ceph-create-keys: error: argument --id/-i is required
[21:21] <simulx> "id of a ceph-mon that is coming up"
[21:21] <xarses> oh
[21:21] <simulx> not sure what that id would be
[21:21] <simulx> i guess a hostname?
[21:22] <xarses> ceph-create-keys -i mon.`hostname -s`
[21:22] <xarses> if it dosn't like that, then just `hostname -s`
[21:22] <simulx> admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[21:22] <simulx> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
[21:23] <simulx> if i kill it... it's in "wait_for_quorum"
[21:23] <simulx> so it does sound quorumy
[21:23] <xarses> correct
[21:23] <xarses> the admin socket isn't open so ceph-mon is not running correctly
[21:23] <simulx> note: i got no warnings/errors during the install
[21:23] <xarses> do a service ceph -a stop
[21:23] <simulx> on all nodes?
[21:23] <xarses> and then service ceph -a start
[21:24] <xarses> just on the one for now
[21:24] <simulx> done
[21:24] <xarses> we want create-keys to be running in the background with 'probing for quorum' as the message, or the keys created
[21:24] <xarses> see if create-keys is running in the bg now
[21:25] <simulx> nope
[21:25] <kraken> http://i.imgur.com/2xwe756.gif
[21:25] <simulx> and if i run it again i get the same error
[21:26] <simulx> 2013-09-23 15:26:07.128854 7f6a5c0eb700 1 mon.usadc-nasea05@0(leader).paxos(paxos active c 1..45) is_readable now=2013-09-23 15:26:07.128862 lease_expire=2013-09-23 15:26:10.233116 has v0 lc 45
[21:26] <simulx> that's the tail of the mon log
[21:27] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[21:27] * TiCPU (~jeromepou@190-130.cgocable.ca) Quit (Ping timeout: 480 seconds)
[21:30] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[21:30] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[21:32] <xarses> hmm. based on other users, I'd guess that the hostname you passed on the command-line doesn't match `hostname -s`, otherwise, there was residual data in /var/lib/ceph/* from the last install
[21:32] <simulx> hostnames match
[21:33] <tsnider> I modified /etc/ceph/ceph.conf for no authentication and to specify both private and public networks. Distributed /etc/ceph/ceph.conf to all nodes in the cluster and restarted ceph on all nodes using "service ceph -a restart". But auth is still required and no ports are open on the cluster facing ( network. http://pastie.org/8349534. Am I missing something?
[21:33] <simulx> i'll try purging again and making sure nothing residual is there
[21:34] <xarses> simulx, do a purge, purge-data and then delete ceph* from the working directory on your ceph-deploy node
[21:36] * cjh_ (~cjh@ps123903.dreamhost.com) has joined #ceph
[21:36] <simulx> doin it
[21:37] <simulx> i noticed that after a purge, there still was some var lib stuff
[21:37] <simulx> so that could be it
[21:38] <xarses> purgedata should resolve that
[21:39] <simulx> yep
[21:39] <simulx> that was it
[21:39] <xarses> iirc it rm -rf /etc/ceph /var/lib/ceph
[21:40] <xarses> you will also want to erase ceph* from the ceph-deploy working directory
[21:40] <xarses> and you will need to new and install as well as mon create
[21:41] <mikedawson> Anyone experienced with xfs filesystem fragmentation under your OSDs?
[21:42] * rudolfsteiner (~federicon@ Quit (Quit: rudolfsteiner)
[21:42] <simulx> everything worked
[21:42] <simulx> the rm -rf's were really needed
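Because leftover state was the culprit here, a quick check for residual directories after purging may save a retry. A minimal sketch; `residual_paths` is a hypothetical helper name, and the two directories are the ones xarses mentions.

```shell
# Print each of the given paths that still exists;
# no output means nothing was left behind.
residual_paths() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            printf '%s\n' "$p"
        fi
    done
    return 0
}
```

Run `residual_paths /etc/ceph /var/lib/ceph` on each node after `purge` and `purgedata`; anything it prints is a candidate for manual removal before the next `ceph-deploy new`.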
[21:43] * sjustlaptop (~sam@ Quit (Ping timeout: 480 seconds)
[21:44] <mikedawson> joshd: ping
[21:46] <sleinen1> I'm trying to get rid of an unclean pg in my cluster. I tried "ceph pg force_create_pg 0.cfa", but the request seems stuck at the lead OSD. Any ideas?
[21:47] * clayb (~kvirc@ Quit (Read error: Connection reset by peer)
[21:47] <joshd> mikedawson: pong
[21:48] * BManojlovic (~steki@fo-d- has joined #ceph
[21:49] <mikedawson> joshd: are sparse rbd images and/or CoW clones expected to have any bearing on xfs filesystem fragmentation on the drives backing OSDs?
[21:51] <mikedawson> joshd: I added two new servers with OSDs over the weekend, let them fully balance, then enabled scrub today. Old OSDs have iowait approaching 100% when scrub is active. New ones hardly notice it. http://www.gammacode.com/scrub-iowait-with-new-nodes.jpg
[21:53] <sleinen1> If anyone wants to look at my issue trying to force_create_pg a pg that has been unclean for months: http://pastebin.ca/2457484
[21:53] <mikedawson> joshd: so I am suspecting fragmentation. New drives are about 2% fragmented after 3 days. 6-month old drives show high fragmentation (~80%).
[21:56] * Vjarjadian (~IceChat77@ has joined #ceph
[21:58] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) has joined #ceph
[21:59] <swinchen> Does replication replicate data across OSD or across nodes?
[22:00] <xarses> osd's as defined by the crushmap
[22:00] <xarses> by default it doesn't allow replica data on the same host
[22:03] <swinchen> xarses: perfect. Thanks :)
[22:03] <xarses> also, the default replica is 2
[22:04] <swinchen> xarses: even better... that is what I will want most of the time.
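Since replicas are placed on different hosts by default, it is worth confirming that the OSDs really are spread over at least two host buckets. The sketch below summarizes `ceph osd tree` output; it assumes the layout where the third whitespace-separated field is either the word `host` (followed by the host name) or an `osd.N` name, which is an assumption about the output format rather than something shown in this log. `osds_per_host` is a hypothetical helper name.

```shell
# Read `ceph osd tree` output on stdin; print "<host> <osd count>" per host bucket.
osds_per_host() {
    awk '
        $3 == "host" { host = $4; count[host] = 0; order[++n] = host; next }
        $3 ~ /^osd\.[0-9]+$/ && host != "" { count[host]++ }
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}
```

With two replicas, `ceph osd tree | osds_per_host` should list at least two hosts, each with a count of one or more.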
[22:16] * diegows (~diegows@ has joined #ceph
[22:17] <swinchen> Alright, now I need to come up with a recommended hardware list for our cluster. I would like to make the storage cluster HA. How does this sound: 2x 24-bay Drive Array w/ dual power-loss protected SSD (for journal), 3x monitor/metadata node (high RAM)?
[22:17] * sjustlaptop (~sam@2607:f298:a:607:740c:859b:4c33:46f2) has joined #ceph
[22:18] <swinchen> That sounds like a lot of overhead... I wonder if I am missing something.
[22:18] <joshd> mikedawson: I don't think anyone's looked at fragmentation too closely (maybe nhm), but in general sparse files are more likely to become fragmented
[22:22] <mikedawson> joshd: I have an excellent case study if someone wants to look ;-)
[22:23] <mikedawson> joshd: I have 4MB chunks set. Despite having sparse rbd images, I thought Ceph would allocate non-sparse 4MB chunks. Seems like that would minimize fragmentation on the physical drive.
[22:25] * lx0 is now known as lxo
[22:25] <joshd> mikedawson: the underlying files on the osd are sparse to save space, but it may be worth looking into fully allocating them to avoid fragmentation like you're seeing
[22:26] <swinchen> So... the monitor does not seem to be CPU intensive. What are the reasons for not running a monitor daemon on OSD nodes?
[22:26] <lxo> what's the type I have to tell ceph-dencoder to use for it to decode the information held in a cephfs directory, say 1.0/100001234.00000000__head* ?
[22:27] <mikedawson> joshd: that would be a huge help in this case. Or perhaps it is a documentation issue left to the operator to deal with (defrag constantly).
[22:29] <nhm> mikedawson: pong
[22:29] <mikedawson> joshd: In the case of RBD, would allocating the next chunk as a full 4MB only waste a max of 4MB, or would it waste a max of 4MB * some number of chunks? I.E. does RBD always write in sequential order of new chunks when filling out a sparse image?
[22:31] <joshd> mikedawson: rbd doesn't do any address translation - it just does the writes at the locations specified by the vm
[22:31] <xarses> swinchen, the production recommendation is to run them stand-alone however it does appear to be common to co-habitate them with other services. however if the monitors don't get enough resources to do their thing, they can cause the cluster to stop
[22:31] <joshd> mikedawson: so it's not actually doing any allocation, the first write to an object creates it implicitly
[22:33] <xarses> swinchen, putting them with busy osd's will lead to this issue, however mingling them with other services might not be such an issue.
[22:33] <mikedawson> nhm: seeing lots of xfs fragmentation under my osds that are around 6 months old (and have never been defragmented). That appears to be the cause of my scrub-related issues
[22:33] <xarses> sinchen, it's a do what feels best kind of thing
[22:33] <xarses> swinchen ^^
[22:34] <nhm> mikedawson: fun! looking via xfs_bmap?
[22:34] * sjustwork (~sam@2607:f298:a:607:d6be:d9ff:fe8e:1a8e) has joined #ceph
[22:34] * JustEra (~JustEra@ALille-555-1-127-163.w90-7.abo.wanadoo.fr) Quit (Quit: This computer has gone to sleep)
[22:34] <swinchen> xarses: Hmmm, ok. I think I will suggest we find some other place to put monitors. I would like to have three of them to keep the system HA. Any recommendations on how to handle connecting to multiple monitors? multipath routing?
[22:35] <mikedawson> joshd: once the object is allocated and full, would the xfs fragmentation issue largely go away permanently afterwards?
[22:36] <mikedawson> nhm: xfs_db -c frag -r /dev/sdb1
[22:36] <mikedawson> joshd: after it was defrag'ed, I mean
[22:37] <joshd> mikedawson: I'm not sure it'd go away permanently, since I don't know if there are other sources (maybe leveldb) that could produce fragmentation, but i'd expect there to be much less of a problem
[22:38] <xarses> swinchen, ceph (and client) is extremely intelligent in this matter, the monitor map and osd map are downloaded initially from one of the monitors listed in /etc/ceph/{cluster}.conf, afterwards, the client will use the mon map to find other mons if one isn't available. the monitor is only needed to manage updates to the mon map and osd map. the client will talk directly to the osd's as much as possible
[22:38] <mikedawson> joshd: yep. I'm seeing stuff like "extents before:239 after:1 DONE ino=537971636" when defragging. Seems like a lot of extents for a 4MB object
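To put the quoted xfs_fsr line in perspective, here is a minimal sketch that parses lines of that shape and estimates the mean extent size. It assumes the line format is exactly as quoted above and that the object file is a full 4MB; a sparse or partly written object would be smaller. The names parse_fsr_line and mean_extent_bytes are made up for this illustration.

```python
import re

# Matches the per-file summary line quoted in the log, e.g.
# "extents before:239 after:1 DONE ino=537971636"
FSR_LINE = re.compile(r"extents before:(\d+) after:(\d+)\s+DONE\s+ino=(\d+)")

# Assumption: a fully written 4MB RBD object, as discussed in the channel.
OBJECT_SIZE = 4 * 1024 * 1024


def parse_fsr_line(line):
    """Return extent counts and inode from one summary line, or None."""
    m = FSR_LINE.search(line)
    if m is None:
        return None
    before, after, ino = (int(g) for g in m.groups())
    return {"ino": ino, "before": before, "after": after}


def mean_extent_bytes(extents, size=OBJECT_SIZE):
    """Average bytes per extent for a file of `size` bytes."""
    return size / extents


if __name__ == "__main__":
    rec = parse_fsr_line("extents before:239 after:1 DONE ino=537971636")
    print(rec, round(mean_extent_bytes(rec["before"])))
```

Under those assumptions, 239 extents in a 4MB object works out to roughly 17 KB per extent, which fits the picture of many small writes landing in a sparse file.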
[22:40] * sjustlaptop (~sam@2607:f298:a:607:740c:859b:4c33:46f2) Quit (Ping timeout: 480 seconds)
[22:40] <swinchen> xarses: Users connect initially to a monitor though, correct? So what happens if you have three monitors (mon0 - mon2) and you tell everyone to connect to mon0's IP address. The next day the motherboard blows a cap in mon0.
[22:40] <nhm> mikedawson: it's been a while since I dug into XFS fragmentation. We used to see a fair amount of it back in the old old days.
[22:41] <nhm> mikedawson: are you sure it only happened after scrub?
[22:41] <mikedawson> nhm: it may be back. RBD, sparse images, and CoW. We're also filling sparse multi-TB volumes with surveillance video over time
[22:41] <xarses> swinchen, ceph.conf [global] mon_host contains all monitors listed from ceph-deploy new, so it would have 3 nodes in your example.
[22:42] <xarses> swinchen, also the clients should cache the mon_map for some amount of time (not sure how long)
[22:43] <xarses> so as long as their copy of the monmap was still usable, they should be able to reconnect to the cluster
[22:43] <mikedawson> nhm: no, it's likely getting more fragmented the whole time. Scrub just makes it an acute problem. This image shows scrub off, then later enabled. node1 and node25 are a few days old and not very fragmented yet. Bet you can guess when I enabled scrub http://www.gammacode.com/scrub-iowait-with-new-nodes.jpg
[22:43] <swinchen> xarses: ahh ok.
[22:43] <nhm> mikedawson: is the fragmentation pretty evenly spread across files?
[22:43] <swinchen> xarses: thanks.
[22:43] <xarses> swinchen even if all the monitors in mon_host were offline (but there were still enough other monitors to keep a quorum)
[22:44] <mikedawson> nhm: it seems fragmentation is quite evenly spread, but I'm not certain of how to tell for sure
[22:44] <xarses> swinchen, if you add more monitors, the best practice would be to update the ceph.conf to include them in the mon_host list
[22:45] <xarses> swinchen, but the magic of the mon_map doesn't require this
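The fallback behaviour described above can be modelled with a toy sketch. This is not how librados actually selects a monitor; it only illustrates the idea that a client has two sources of monitor addresses (the mon_host list from ceph.conf and a previously fetched monmap) and needs just one reachable entry. The function name pick_monitor and the is_reachable callback are hypothetical.

```python
def pick_monitor(mon_host, cached_monmap, is_reachable):
    """Return the first reachable monitor address, or None.

    mon_host:      addresses listed in ceph.conf (tried first)
    cached_monmap: addresses learned from an earlier monmap
    is_reachable:  callable(addr) -> bool, injected so the sketch
                   runs without touching the network
    """
    tried = []
    for addr in list(mon_host) + list(cached_monmap):
        if addr in tried:
            continue  # skip duplicates that appear in both lists
        tried.append(addr)
        if is_reachable(addr):
            return addr
    return None


if __name__ == "__main__":
    conf = ["mon0", "mon1", "mon2"]
    monmap = ["mon0", "mon1", "mon2", "mon3"]
    down = {"mon0"}  # mon0's motherboard has failed
    print(pick_monitor(conf, monmap, lambda a: a not in down))
```

In the mon0 failure scenario from the conversation, the sketch simply moves on to mon1; if every configured monitor were down, an address known only from the cached monmap (mon3 here) would still be found.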
[22:46] * Tamil (~Adium@cpe-108-184-71-119.socal.res.rr.com) has joined #ceph
[22:46] <swinchen> xarses: Ok, I haven't really dug around much in the ceph.conf file yet. That will be next on my list. mon_map sounds very interesting... seems like it really removes the need to multipath, or a pacemaker/corosync type setup.
[22:47] * TiCPU (~jeromepou@190-130.cgocable.ca) has joined #ceph
[22:48] <xarses> swinchen, correct, that should not be necessary for the monitors or osd's. you will have a slightly different outlook for radosgw
[22:48] <xarses> and probably mds (CephFS) too, but that one is even more weird
[22:49] <swinchen> xarses: luckily I don't think I need that for OpenStack!
[22:49] <swinchen> :)
[22:49] <nhm> mikedawson: you can check fragmentation on a per-file basis. It could be interesting to sweep through files and see if it matters how old they are or what kind of file it is.
[22:49] <xarses> swinchen, well then you are following exactly what I'm doing
[22:49] <xarses> we put the mon's on our ha controllers
[22:50] <xarses> and osd's wherever else the user wants to cook up.
[22:50] <xarses> I'm a developer for Fuel
[22:51] <swinchen> xarses: hrmm.. currently my HA controllers are coupled quite tightly into my OpenStack nodes. By that, I mean they are the same.
[22:52] <swinchen> xarses: Cool! That looks like an amazing product. I found that right after I created my own (by hand) HA Grizzly install.
[22:52] <mikedawson> nhm: I'll try to cook something up. Luckily it looks like xfs_fsr is relatively good at running in the background without killing performance.
[22:52] <nhm> mikedawson: might not need to do the whole FS either, just take a sample
[22:53] <xarses> swinchen, our new 3.2 version will support ceph =)
[22:53] <nhm> mikedawson: having said that, I have no idea if the results will be interesting or not...
[22:53] <mikedawson> nhm: unlike scrub (which stalls RBD guests on my albeit heavily fragmented cluster)
[22:53] <nhm> mikedawson: we saw something like this way back when, but we never did a deep investigation, and then it seemed to kind of get better on its own.
[22:54] * jskinner (~jskinner@ has joined #ceph
[22:54] <nhm> mikedawson: Pretend it's like going through a really long car wash. ;)
[22:55] <swinchen> xarses: That is awesome! How about OVS? One thing that is stopping me from using it is the choice between VLAN and flat DHCP.
[22:55] <mikedawson> nhm: I'll do my best to get you some data. My guess is the problem will largely go away for us now that these originally sparse images are fully allocated. My bet is overwriting data will behave much better
[22:56] <xarses> swinchen, the fuel ui should work with neutron/quantum in 3.2. the CLI (hack puppet by hand) side can configure quantum currently
[22:56] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Quit: Leaving.)
[22:57] <joshd> nhm: mikedawson: yeah, I'd guess it's related to the workload doing lots of small sequential writes to the sparse rbd files
[22:57] <joshd> mikedawson: fully allocating probably means adding a CEPH_OSD_OP_ALLOCATE in the osd and librados and having rbd send one with each write
[22:57] * sjm (~sjm@wr1.pit.paircolo.net) has joined #ceph
[22:58] <nhm> joshd: any reason the fragmentation wouldn't show up until scrubs are done?
[22:58] <swinchen> xarses: Nice. I am certainly keeping an eye on it. I never again want to build an HA cluster by hand. I was bad and went against several recommendations (rabbitmq cluster, galera cluster). Galera is working fine... rabbit cluster is a bit flaky.
[22:58] <mikedawson> joshd: yeah, the other people suffering similar issues seem to be running databases that could grow in a similar way
[22:58] <joshd> nhm: no, I missed that part
[22:59] <joshd> nhm: could be scrub doing something I guess, I didn't think it was writing a lot though
[22:59] <mikedawson> nhm: the fragmentation was always there (I think). It only rears its head as an issue during scrub starts, because I don't have many reads otherwise
[22:59] <nhm> ah
[23:00] <nhm> ok, so it might be interesting then to see if you do some big 4MB sequential writes and can track down the objects (maybe do a rados bench run) if those files are any more or less fragmented than the other ones.
[23:01] <mikedawson> nhm: We write lots of little frames of video which rbd writeback cache coalesces very nicely, and don't read very often. When scrub starts, it reads aggressively causing high spindle contention. Then our VMs tend to have stalls in i/o with Read latency getting ugly
[23:01] <nhm> I suppose assuming you've got enough free space left that it doesn't end up fragmented due to the existing fragmentation.
[23:03] <nhm> mikedawson: you could also try using blktrace and seekwatcher during some of your normal write activity.
[23:03] <mikedawson> nhm: I believe so. We're 65-75% utilized on 3TB drives and need to rearrange a bunch of 4MB objects, so I think it should work
[23:04] <mikedawson> nhm: under normal load (no scrub and no deep-scrub) drives peak around 20%util
[23:06] * themgt (~themgt@201-223-232-27.baf.movistar.cl) has joined #ceph
[23:13] * Kupo1 (~tyler.wil@wsip-68-14-231-140.ph.ph.cox.net) has joined #ceph
[23:13] <Kupo1> Hey All, any CentOS/RHEL users have issues with ceph-deploy mon not creating keys?
[23:14] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[23:14] <Kupo1> eg 'Unable to find /var/lib/ceph/bootstrap-osd/ceph.keyring on'
[23:16] <Camilo> kupo: I didn't have any
[23:18] <Kupo1> what scripts generate ceph.client.admin.keyring & ceph.keyring
[23:18] <Kupo1> ?
[23:19] <dmick> ceph-create-keys
[23:19] <Kupo1> what id would i use for that command?
[23:20] <xarses> Kupo1: what version of ceph-deploy and ceph?
[23:20] <Kupo1> ceph 0.67.3, deploy 1.2.6
[23:22] <cjh_> why does ceph crc all data coming in over the network when the network drivers crc the data also? Do we not trust that the network driver verified everything?
[23:23] <xarses> kupo1: ceph-deploy mon create should initialize the monitors, seed the key-rings and cause ceph-create-keys to run
[23:23] * rendar (~s@ Quit ()
[23:23] * sprachgenerator (~sprachgen@ Quit (Quit: sprachgenerator)
[23:24] <Kupo1> I don't think ceph-create-keys is running
[23:24] <Kupo1> how do i invoke that manually?
[23:24] <xarses> ceph-create-keys -i mon.`hostname -s`
[23:25] <Kupo1> hmm
[23:25] <Kupo1> admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[23:25] <Kupo1> INFO:ceph-create-keys:ceph-mon admin socket not ready yet
[23:26] <xarses> yep
[23:27] <xarses> that is because of 1) left over bits from a previous install 2) hostname passed from the commandline isn't exactly the same as `hostname -s`
[23:31] <Kupo1> trying uninstall/purge
[23:31] * sjm (~sjm@wr1.pit.paircolo.net) Quit (Quit: Leaving)
[23:33] <Kupo1> same error; http://pastebin.mozilla.org/3131298
[23:33] <xarses> kupo1: do a purge and then purge-data and rm ceph* from the ceph-deploy working directory
[23:35] <Kupo1> how do i find the working directory?
[23:36] <xarses> where ever you are running the ceph-deploy command from
[23:36] <Kupo1> after that run the installation again?
[23:36] <xarses> yes
[23:37] <xarses> it should really delete everything
[23:38] <Kupo1> Same error
[23:38] <Kupo1> http://pastebin.mozilla.org/3131321
[23:40] * Tamil (~Adium@cpe-108-184-71-119.socal.res.rr.com) Quit (Quit: Leaving.)
[23:40] <xarses> odd, ceph-create-keys mon.`hostname -s` returns socket error still?
[23:41] <Kupo1> yes
[23:41] <Kupo1> ceph-create-keys -i mon.`hostname -s`
[23:41] <Kupo1> admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[23:41] <Kupo1> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
[23:41] * tsnider (~tsnider@nat-216-240-30-23.netapp.com) Quit (Quit: Leaving.)
[23:43] <dmick> is the mon running?
[23:44] <Kupo1> whats the command to check?
[23:44] <dmick> uh, ps?
[23:44] * Tamil (~Adium@cpe-108-184-71-119.socal.res.rr.com) has joined #ceph
[23:44] * imjustmatthew (~imjustmat@pool-71-251-233-166.rcmdva.fios.verizon.net) has joined #ceph
[23:44] <Kupo1> yes; root 18525 0.0 0.0 159704 11072 ? Sl 14:37 0:00 /usr/bin/ceph-mon -i hv1 --pid-file /var/run/ceph/mon.hv1.pid -c /etc/ceph/ceph.conf
[23:47] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[23:49] <Kupo1> should i try a different version?
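One thing that may be worth checking in the exchange above is whether the id passed to ceph-create-keys matches the id the monitor was started with (the ps output shows "-i hv1"). The sketch below assumes the default admin socket layout /var/run/ceph/{cluster}-mon.{id}.asok; that layout is an assumption here, not something confirmed in the log, and the helper name mon_admin_socket is hypothetical.

```python
import os


def mon_admin_socket(mon_id, cluster="ceph", run_dir="/var/run/ceph"):
    """Build the admin socket path expected for a monitor id.

    Assumes the default naming {cluster}-mon.{id}.asok under run_dir.
    """
    return os.path.join(run_dir, "{0}-mon.{1}.asok".format(cluster, mon_id))


if __name__ == "__main__":
    # Compare the id from the ps output with the id typed on the command line.
    for mon_id in ("hv1", "mon.hv1"):
        path = mon_admin_socket(mon_id)
        state = "exists" if os.path.exists(path) else "missing"
        print(mon_id, path, state)
```

Under that assumption, an id of "hv1" and an id of "mon.hv1" resolve to different socket files, so a mismatch would produce a "No such file or directory" error even with the monitor running.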
[23:54] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:54] * rturk is now known as rturk-away
[23:56] * vata (~vata@2607:fad8:4:6:7891:2cb3:fd76:a5e3) Quit (Quit: Leaving.)
[23:57] <sleinen1> I'm trying to repair our cluster, which has had a broken pg (0.cfa) for months now. As a last resort, I have tried to recreate that pg using ceph pg force_create_pg. But that seems stuck on the pg's lead OSD for 2.5 hours now.
[23:57] <sleinen1> http://pastebin.ca/2457516
[23:57] <sleinen1> Any chance to get this unstuck?

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.