#ceph IRC Log

IRC Log for 2013-11-19

Timestamps are in GMT/BST.

[0:00] * bandrus (~Adium@108.249.90.53) has joined #ceph
[0:01] <aarontc> hey guys, how are things in cephland?
[0:02] * linuxkidd (~linuxkidd@2607:f298:a:607:e8b:fdff:fe5a:47c7) Quit (Quit: Konversation terminated!)
[0:02] * linuxkidd (~linuxkidd@38.122.20.226) has joined #ceph
[0:06] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[0:06] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[0:07] <unit3> aarontc: distributed.
[0:07] <aarontc> unit3: niiice
[0:08] <JoeGruher> is there any performance advantage to multiple block devices (rbds) on a host? like if i have a system with a 16 thread workload, does running four threads each on four rbds have any advantage (or disadvantage) versus 16 threads on one rbd, if all other aspects of the workload remain constant?
[0:08] <aarontc> I was hoping to ask mikedawson for some more tips on solving my latency issues but he just left. I'm seeing average op_w_latency of > 3 seconds for the last hour :(
[0:09] <unit3> JoeGruher: I'm not super familiar with the internals yet, but that just sounds like more overhead to me. A single rbd should do the right thing, from how I understand it.
[0:10] <unit3> aarontc: he posted the following command for someone else to show the 20 slowest ops in the last 10m: ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops
[0:10] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[0:11] <unit3> might be worth examining that, see if you can identify common ops that are running slowly.
[0:11] <unit3> or pastebin it for other people in here to look at.
[0:11] <aarontc> unit3: Thanks, let me grab that
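As a minimal sketch of examining that dump (the `ops` list with `description` and `duration` fields is an assumed shape of the `dump_historic_ops` JSON; the sample data is invented for illustration):

```python
import json

# Sample output in the assumed shape dump_historic_ops emits: an "ops"
# list where each entry carries a "description" and a "duration" (seconds).
sample = json.loads("""
{"num_ops": 3,
 "ops": [
   {"description": "osd_op(client.4123 rbd_data.1 [write])", "duration": 3.21},
   {"description": "osd_op(client.4123 rbd_data.2 [read])",  "duration": 0.04},
   {"description": "osd_op(client.4123 rbd_data.3 [write])", "duration": 1.75}
 ]}
""")

# Sort ops slowest-first to spot common operations that run slowly.
slowest = sorted(sample["ops"], key=lambda op: op["duration"], reverse=True)
for op in slowest:
    print("%.2fs  %s" % (op["duration"], op["description"]))
```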
[0:13] * onizo (~onizo@wsip-70-166-5-159.sd.sd.cox.net) Quit (Remote host closed the connection)
[0:14] * jskinner (~jskinner@69.170.148.179) Quit (Ping timeout: 480 seconds)
[0:17] <aarontc> (slightly unrelated) looks like running more than one OSD per CPU core on a single host causes some big latency issues, at least with kernel 3.10.7
[0:21] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[0:27] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[0:28] <JoeGruher> when running performance testing on an RBD we should write in all the capacity once before starting testing, right? otherwise ceph may not actually fetch data from disk if it gets a read request to a block it knows hasn't been written (pretty standard for thin provisioning)?
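A hedged sketch of such a preconditioning pass (the device path and fio parameters are illustrative, assuming the image is mapped as a kernel RBD):

```shell
# Write the whole device once so later reads hit real data rather than
# thin-provisioned holes (assumes the image is mapped at /dev/rbd0):
fio --name=precondition --filename=/dev/rbd0 \
    --rw=write --bs=1M --iodepth=16 --direct=1
```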
[0:28] * dmsimard (~Adium@108.163.152.2) Quit (Ping timeout: 480 seconds)
[0:28] <via> is there any way to change a pool id?
[0:31] * ircolle (~Adium@2001:468:1f07:da:458a:75ed:728c:cb4a) has joined #ceph
[0:31] <aarontc> unit3: http://hastebin.com/koraficeve
[0:32] * mschiff (~mschiff@dslb-088-075-247-250.pools.arcor-ip.net) has joined #ceph
[0:32] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[0:34] * japuzzo (~japuzzo@ool-4570886e.dyn.optonline.net) Quit (Quit: Leaving)
[0:34] <aarontc> unit3: Associated latency from the first 9 OSDs: http://i.imgur.com/sRDwlNR.png
[0:35] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[0:36] * dxd828 (~dxd828@host-92-24-127-29.ppp.as43234.net) Quit (Quit: Textual IRC Client: www.textualapp.com)
[0:37] * nhm (~nhm@wlan-rooms-2395.sc13.org) has joined #ceph
[0:37] * ChanServ sets mode +o nhm
[0:38] <unit3> aarontc: hmmm. I don't know, unfortunately. I'm pretty new to ceph myself. Hopefully someone more familiar will be able to take a look.
[0:39] <aarontc> unit3: No worries, thanks. Mikedawson seems to be an expert on latency issues, last weekend he said he could help if I was able to monitor things - and now I am :)
[0:40] <unit3> awesome. hopefully you can catch him tomorrow then. :)
[0:40] * astark (~astark@ool-4576c894.dyn.optonline.net) has joined #ceph
[0:41] <aarontc> I managed to set up discovery rules for Zabbix so any host on my network that is running any ceph daemons will start being interrogated and have almost every param that 'perf dump' reports logged at 10 second intervals :)
[0:43] <aarontc> I wish it was easier to share zabbix configuration stuff, if anyone else uses zabbix they might want to use the rules
[0:44] * markbby (~Adium@168.94.245.2) Quit (Quit: Leaving.)
[0:45] <unit3> yeah, that does sound very handy.
[0:45] * mxmln (~mxmln@212.79.49.66) Quit (Quit: mxmln)
[0:48] <aarontc> awesome one of my hosts just died
[0:52] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[0:54] * tsnider (~tsnider@nat-216-240-30-23.netapp.com) Quit (Ping timeout: 480 seconds)
[0:57] * Cube (~Cube@12.248.40.138) Quit (Quit: Leaving.)
[0:57] * Cube (~Cube@12.248.40.138) has joined #ceph
[0:57] * Cube (~Cube@12.248.40.138) Quit ()
[0:59] * unit3 (~Unit3@72.2.49.50) has left #ceph
[1:01] * xmltok (~xmltok@216.103.134.250) has joined #ceph
[1:04] * dmsimard (~Adium@69.165.206.93) has joined #ceph
[1:04] * dmsimard (~Adium@69.165.206.93) Quit ()
[1:05] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[1:05] <via> i can't get my mds to start, keeps dying with: https://pastee.org/2kxz5
[1:05] * KindTwo (~KindOne@h78.24.131.174.dynamic.ip.windstream.net) has joined #ceph
[1:05] * sarob (~sarob@ip-64-134-231-115.public.wayport.net) has joined #ceph
[1:06] <via> not really sure whats going on there, but i dont have pool ids 0, 1, and 2 ... would that affect it?
[1:06] <via> the pools are still there, they're just 3 4 5
[1:06] <aarontc> how did you change the pool IDs, via?
[1:06] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:06] * KindTwo is now known as KindOne
[1:07] <via> i deleted the original pools and recreated
[1:07] <via> i didn't expect it to not start at 0
[1:07] <aarontc> there is some way to tell mds what pool ID to use but I don't know what it is, offhand
[1:07] <via> my pgnum needed to be significantly higher, i didn't know of another way to do it
[1:07] <via> would that cause my crash?
[1:07] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) Quit (Ping timeout: 480 seconds)
[1:08] <aarontc> you can increase pgnum with 'ceph osd pool set poolname pg_num 1234' or similar, don't recall the precise syntax
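The precise syntax is roughly as follows (a sketch; `rbd` and `256` are a placeholder pool name and PG count, and the second command is the companion step that lets data actually rebalance onto the new PGs):

```shell
# Increase the placement-group count for an existing pool in two steps:
# pg_num creates the new PGs, pgp_num allows data to rebalance onto them.
ceph osd pool set rbd pg_num 256
ceph osd pool set rbd pgp_num 256
```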
[1:08] <via> well, crap, i could have done that
[1:08] <via> but surely there's a way to just get back to 0,1,2 without starting completely over again
[1:08] <aarontc> but yes, via, I think your mds is crashing because you don't have pools 0 and 1, which it uses
[1:09] <via> is there any way for me to create those pool ids?
[1:09] <aarontc> I can't answer that, sorry
[1:10] <via> there's ceph mds newfs, i'll give that a try
[1:11] * ircolle (~Adium@2001:468:1f07:da:458a:75ed:728c:cb4a) Quit (Quit: Leaving.)
[1:11] <via> that actually worked
[1:11] <aarontc> cool, maybe it does name->id conversion on newfs and stores the ID somewhere
[1:14] * nwat (~textual@eduroam-243-4.ucsc.edu) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:14] * astark (~astark@ool-4576c894.dyn.optonline.net) Quit (Quit: astark)
[1:15] * noahmehl_ (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[1:18] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[1:20] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Ping timeout: 480 seconds)
[1:20] * noahmehl_ is now known as noahmehl
[1:27] * nhm (~nhm@wlan-rooms-2395.sc13.org) Quit (Ping timeout: 480 seconds)
[1:27] * mschiff (~mschiff@dslb-088-075-247-250.pools.arcor-ip.net) Quit (Quit: No Ping reply in 180 seconds.)
[1:27] * mschiff (~mschiff@dslb-088-075-247-250.pools.arcor-ip.net) has joined #ceph
[1:27] * Cube (~Cube@66-87-64-40.pools.spcsdns.net) has joined #ceph
[1:29] * astark (~astark@ool-4576c894.dyn.optonline.net) has joined #ceph
[1:32] * yanzheng (~zhyan@134.134.139.74) has joined #ceph
[1:33] * Siva (~sivat@vpnnat.eglbp.corp.yahoo.com) has joined #ceph
[1:33] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) Quit (Quit: Leaving.)
[1:34] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[1:37] * astark (~astark@ool-4576c894.dyn.optonline.net) Quit (Read error: Operation timed out)
[1:37] * astark (~astark@ool-4576c894.dyn.optonline.net) has joined #ceph
[1:38] * Steki (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[1:38] * andreask (~andreask@h081217067008.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[1:39] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Read error: Operation timed out)
[1:40] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[1:42] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Read error: Operation timed out)
[1:42] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[1:44] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[1:44] * niks (~oftc-webi@17.149.234.100) has joined #ceph
[1:46] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[1:46] <niks> Finally got ceph installed in ubuntu..is CentOS / RHEL even expected to work?
[1:47] <aarontc> niks: I think RHEL is supported but I hear a lot of people having issues with it
[1:48] <niks> i tried and found tons of problems...osd's do not come up etc..
[1:48] <niks> while in ubuntu its much better
[1:48] * diegows (~diegows@190.190.11.42) has joined #ceph
[1:49] <aarontc> I'm not surprised, RHEL/CentOS are running really old versions of pretty much everything
[1:49] <aarontc> Personally my entire cluster is on Gentoo
[1:50] <niks> yeah..docs seem to be a little misleading..unless you look closely..it seems it should work
[1:51] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:55] * DarkAce-Z (~BillyMays@50.107.53.200) has joined #ceph
[1:57] * DarkAceZ (~BillyMays@50.107.53.200) Quit (Read error: Operation timed out)
[1:58] * yy-nm (~Thunderbi@122.224.154.38) has joined #ceph
[1:59] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[1:59] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[2:01] * mwarwick (~mwarwick@2407:7800:200:1011:3e97:eff:fe91:d9bf) has joined #ceph
[2:01] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[2:04] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) has joined #ceph
[2:08] * DarkAce-Z is now known as DarkAceZ
[2:09] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[2:15] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[2:16] * xarses (~andreww@64-79-127-122.static.wiline.com) Quit (Ping timeout: 480 seconds)
[2:20] * yy-nm (~Thunderbi@122.224.154.38) Quit (Quit: yy-nm)
[2:20] * dpippenger (~riven@66-192-9-78.static.twtelecom.net) Quit (Quit: Leaving.)
[2:20] * mschiff (~mschiff@dslb-088-075-247-250.pools.arcor-ip.net) Quit (Remote host closed the connection)
[2:23] * LeaChim (~LeaChim@host86-162-2-255.range86-162.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[2:23] * niks (~oftc-webi@17.149.234.100) Quit (Quit: Page closed)
[2:24] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[2:26] * angdraug (~angdraug@64-79-127-122.static.wiline.com) Quit (Quit: Leaving)
[2:26] * glzhao (~glzhao@118.195.65.67) has joined #ceph
[2:29] * japuzzo (~japuzzo@ool-4570886e.dyn.optonline.net) has joined #ceph
[2:30] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) Quit (Quit: Leaving.)
[2:31] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) has joined #ceph
[2:33] * Siva (~sivat@vpnnat.eglbp.corp.yahoo.com) Quit (Ping timeout: 480 seconds)
[2:35] * japuzzo (~japuzzo@ool-4570886e.dyn.optonline.net) Quit (Quit: Leaving)
[2:40] * onizo (~onizo@cpe-98-155-117-134.san.res.rr.com) has joined #ceph
[2:40] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[2:45] * rongze_ (~rongze@117.79.232.205) Quit (Remote host closed the connection)
[2:48] * RoddieKieley (~RoddieKie@47.55.80.53) Quit (Read error: No route to host)
[2:48] * RoddieKieley (~RoddieKie@47.55.80.53) has joined #ceph
[2:49] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) Quit (Quit: Leaving.)
[2:50] * astark (~astark@ool-4576c894.dyn.optonline.net) Quit (Quit: astark)
[2:50] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[2:51] * wenjianhn (~wenjianhn@123.118.215.163) has joined #ceph
[2:56] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Quit: noahmehl)
[3:00] * linuxkidd (~linuxkidd@38.122.20.226) Quit (Read error: Operation timed out)
[3:02] * alram (~alram@38.122.20.226) Quit (Ping timeout: 480 seconds)
[3:02] * julian_ (~julianwa@125.70.135.211) Quit (Read error: Connection reset by peer)
[3:04] * otisspud (~otisspud@198.15.79.50) Quit (Quit: kill -9 idle)
[3:04] * julian (~julian@125.70.135.211) has joined #ceph
[3:05] * diegows (~diegows@190.190.11.42) Quit (Ping timeout: 480 seconds)
[3:05] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[3:05] * xmltok (~xmltok@216.103.134.250) Quit (Ping timeout: 480 seconds)
[3:07] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[3:07] * aliguori (~anthony@74.202.210.82) Quit (Quit: Ex-Chat)
[3:08] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[3:10] * onizo (~onizo@cpe-98-155-117-134.san.res.rr.com) Quit (Remote host closed the connection)
[3:11] * rongze (~rongze@117.79.232.218) has joined #ceph
[3:12] <jhujhiti> well i'm late to the conversation but it works beautifully on centos...
[3:15] * rongze_ (~rongze@14.18.203.18) has joined #ceph
[3:16] <pmatulis> how do i get the list of PGs associated with an OSD?
[3:19] * rongze (~rongze@117.79.232.218) Quit (Ping timeout: 480 seconds)
[3:23] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[3:24] * AfC1 (~andrew@59.167.244.218) has joined #ceph
[3:24] * AfC (~andrew@2407:7800:200:1011:2ad2:44ff:fe08:a4c) Quit (Read error: Connection reset by peer)
[3:26] <dmick> pmatulis: parse ceph pg dump output. I have a oneliner I continually replicate that does something like "ceph pg dump pgs_brief --format=json | python -c 'import json,sys; tree=json.load(sys.stdin); ...'
[3:27] * sarob (~sarob@ip-64-134-231-115.public.wayport.net) Quit (Remote host closed the connection)
[3:27] * Steki (~steki@fo-d-130.180.254.37.targo.rs) Quit (Read error: Operation timed out)
[3:27] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[3:28] <pmatulis> dmick: ok, i figured it would be something like that, just wanted to confirm i wasn't missing something
[3:28] * sarob (~sarob@ip-64-134-231-115.public.wayport.net) has joined #ceph
[3:30] <dmick> ceph pg dump pgs_brief --format=json | python -c 'import sys,json; o=json.load(sys.stdin); print [p["pgid"] for p in o if 10 in p["acting"]];' finds all PGs on osd 10
[3:30] <dmick> every so often I think about wrapping this up into a more-usable command
[3:31] <dmick> l=[str(p["pgid"]) for p in o if 10 in p["acting"]]; print sorted(l) makes it a little prettier
[3:32] <pmatulis> dmick: i think it's worth it. a ceph subcommand or somesuch
[3:32] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:32] <dmick> it is, of course, dynamic. but yeah.
[3:32] <aarontc> you can also use jsawk to process the json output in-stream
[3:32] <dmick> could also indicate if primary, etc.
[3:33] <dmick> aarontc: js. blech. :)
[3:33] <dmick> but yes
[3:33] <dmick> any json'll do
[3:33] <dmick> python list comps are great for this sort of thing if you speak them
[3:33] <aarontc> dmick: I use jsawk to simply coalesce values from 'perf dump' on admin sockets into scalar values for performance logging
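The same coalescing can be sketched in Python (assuming the `{avgcount, sum}` pairs that `perf dump` reports for latency counters; the numbers are invented sample data): the scalar value is simply sum divided by count.

```python
import json

# Sample "perf dump" fragment (assumed shape: latency counters reported
# as {"avgcount": N, "sum": seconds} pairs under the "osd" section).
perf = json.loads("""
{"osd": {"op_w_latency": {"avgcount": 120, "sum": 384.0},
         "op_r_latency": {"avgcount": 500, "sum": 12.5}}}
""")

def avg_latency(counter):
    # Coalesce the pair into a scalar: mean seconds per op.
    n = counter["avgcount"]
    return counter["sum"] / n if n else 0.0

print("avg op_w_latency: %.3fs" % avg_latency(perf["osd"]["op_w_latency"]))
```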
[3:33] <pmatulis> today i did a quick & dirty search for all PGs associated with a primary OSD
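dmick's one-liner can be expanded into a small self-contained sketch (the input is invented sample data in the assumed shape of `ceph pg dump pgs_brief --format=json`; the primary is taken to be the first OSD in each `acting` list):

```python
import json

# Sample pg dump output (assumed fields: "pgid" plus an "acting" list of
# OSD ids, with the primary first).
pgs = json.loads("""
[
  {"pgid": "0.1", "acting": [10, 3, 7]},
  {"pgid": "0.2", "acting": [3, 10, 5]},
  {"pgid": "0.3", "acting": [4, 5, 6]}
]
""")

osd = 10
# All PGs whose acting set includes the OSD.
on_osd = sorted(p["pgid"] for p in pgs if osd in p["acting"])
# PGs for which that OSD is the primary (first entry of "acting").
primary_for = sorted(p["pgid"] for p in pgs if p["acting"][0] == osd)

print("on osd.%d:    %s" % (osd, on_osd))
print("primary for: %s" % primary_for)
```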
[3:36] * sarob (~sarob@ip-64-134-231-115.public.wayport.net) Quit (Ping timeout: 480 seconds)
[3:39] * davidzlap (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[3:40] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[3:41] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[3:42] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:44] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[3:47] * SvenPHX1 (~Adium@wsip-174-79-34-244.ph.ph.cox.net) Quit (Read error: Connection reset by peer)
[3:48] * SvenPHX (~Adium@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[3:51] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:52] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[3:53] * shang (~ShangWu@175.41.48.77) has joined #ceph
[3:55] * KindTwo (KindOne@198.14.206.201) has joined #ceph
[3:58] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[3:58] * KindTwo is now known as KindOne
[4:02] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) has joined #ceph
[4:05] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[4:06] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[4:11] * gregsfortytwo (~Adium@2607:f298:a:607:f428:d81b:4739:7126) Quit (Ping timeout: 480 seconds)
[4:12] * gregsfortytwo (~Adium@2607:f298:a:607:cd79:6581:302c:5b69) has joined #ceph
[4:12] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[4:15] <dmick> pmatulis: aarontc: here's a little hack for fun: http://fpaste.org/55000/38483092/
[4:16] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[4:16] <dmick> (it could be hooked directly to the ceph python bindings, but, I have some aliases for contacting different clusters)
[4:18] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[4:20] <pmatulis> dmick: dunno, i get a traceback
[4:20] <dmick> paste it?...
[4:21] <pmatulis> dmick: and, right, i'm not a programmer
[4:21] <pmatulis> http://paste.ubuntu.com/6440816/
[4:22] <dmick> doh. my bad.
[4:23] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[4:24] <dmick> http://fpaste.org/55001/48314901/ should be better
[4:26] <pmatulis> dmick: yeah, that works :)
[4:26] <pmatulis> dmick: stars denote what again?
[4:26] <dmick> this osd is a primary for that pg
[4:26] <pmatulis> dmick: ah of course
[4:28] <pmatulis> dmick: sweet, thanks for that
[4:29] <dmick> no worries. perhaps it can serve as a talking point for how best to present such info
[4:32] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:05] * fireD_ (~fireD@93-142-206-36.adsl.net.t-com.hr) has joined #ceph
[5:07] * fireD (~fireD@93-142-226-129.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:07] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) Quit (Quit: Leaving.)
[5:10] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[5:10] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[5:12] * Hakisho (~Hakisho@0001be3c.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:15] * Hakisho (~Hakisho@0001be3c.user.oftc.net) has joined #ceph
[5:26] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[5:29] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[5:32] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[5:40] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[5:46] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) has joined #ceph
[5:50] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[5:52] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:53] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[5:57] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[5:59] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[6:00] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[6:15] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[6:29] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:36] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[6:36] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[6:48] <nwf> Hey channel; we enjoy riding with noout set; is there any way to have effectively that behavior without it triggering HEALTH_WARN?
[6:48] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[7:07] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[7:11] <aarontc> nwf: I'm interested in this as well, currently I've patched my monitoring system to parse the status and ignore noout, but another solution would be preferable
[7:18] <aarontc> nwf: just occurred to me, you could set the out delay to something really long, like 1209600 for 2 weeks
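As a sketch of where that delay lives (assuming the `mon osd down out interval` option and ceph.conf's usual section syntax; the value is the two-week figure from the discussion):

```
[mon]
    ; keep down OSDs from being automatically marked out for 2 weeks
    mon osd down out interval = 1209600
```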
[7:22] * BillK (~BillK-OFT@106-68-78-193.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[7:22] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[7:23] * BillK (~BillK-OFT@106-68-249-37.dyn.iinet.net.au) has joined #ceph
[7:27] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit ()
[7:29] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[7:34] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[7:34] <dmick> nwf: you might well be able to form your own idea about health by parsing ceph health detail (preferably in, say --format=json)
[7:42] * AfC1 (~andrew@59.167.244.218) Quit (Quit: Leaving.)
[7:45] <bloodice> is there a document for correctly shutting down a cluster so that there is no data loss and then bring it back online cleanly? ( i need to power off the servers and move them )
[7:47] <nwf> aarontc: Oh, that's viable, perhaps.
[7:47] <bloodice> i found this: http://eu.ceph.com/docs/wip-msgauth/init/stop-cluster/ ( but it doesnt say that this command will do what i need done )
[7:59] * davidzlap (~Adium@ip68-5-239-214.oc.oc.cox.net) Quit (Quit: Leaving.)
[8:08] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[8:11] <bloodice> ok this is interesting, i am running the stop commands and ceph just keeps going..
[8:14] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[8:16] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[8:27] <bloodice> oh ffs, when i reinstalled the monitor, it changed the home directory for ceph... that explains why the commands require sudo again..
[8:30] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[8:33] * Sysadmin88 (~IceChat77@94.1.37.151) Quit (Quit: If you think nobody cares, try missing a few payments)
[8:37] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:cd93:9e3b:ed66:f8f8) has joined #ceph
[8:42] * madkiss (~madkiss@2001:6f8:12c3:f00f:cd93:9e3b:ed66:f8f8) Quit (Ping timeout: 480 seconds)
[8:46] * mattt_ (~textual@94.236.7.190) has joined #ceph
[8:50] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Ping timeout: 480 seconds)
[8:59] * wogri_risc (~wogri_ris@2a00:1860:104:0:405b:a780:d52:405b) has joined #ceph
[9:08] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:09] * sarob (~sarob@2601:9:7080:13a:451f:bf55:8a0f:5935) has joined #ceph
[9:10] <Pauline> bloodice: you might want to look into "ceph osd set noout", as it prevents recovery modes of the other servers when you turn them off one by one. Just make sure you have no clients connected, they will be so upset with ya :P ("ceph osd unset noout" when move is complete)
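The suggested procedure can be sketched as follows (a hedged outline, not a definitive runbook; how you stop the daemons depends on your init system):

```shell
# Sketch of a whole-cluster move using noout:
ceph osd set noout        # stop down OSDs from being marked out
# ...stop ceph daemons on each host, power off, move, power back on...
ceph osd unset noout      # restore normal out handling once healthy
```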
[9:15] * sarob_ (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[9:17] * sarob (~sarob@2601:9:7080:13a:451f:bf55:8a0f:5935) Quit (Ping timeout: 480 seconds)
[9:19] * simulx (~simulx@66-194-114-178.static.twtelecom.net) Quit (Quit: Nettalk6 - www.ntalk.de)
[9:21] * madkiss (~madkiss@2001:6f8:12c3:f00f:d0d:2252:c1cf:8249) has joined #ceph
[9:21] * mnash (~chatzilla@vpn.expressionanalysis.com) Quit (Remote host closed the connection)
[9:23] * sarob_ (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:27] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:cd93:9e3b:ed66:f8f8) Quit (Ping timeout: 480 seconds)
[9:31] * wogri_risc (~wogri_ris@2a00:1860:104:0:405b:a780:d52:405b) Quit (Quit: wogri_risc)
[9:32] * thomnico (~thomnico@81.253.41.52) has joined #ceph
[9:33] * houkouonchi-work (~linux@12.248.40.138) Quit (Read error: Operation timed out)
[9:33] * xdeller (~xdeller@91.218.144.129) has joined #ceph
[9:36] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[9:40] * mxmln (~mxmln@212.79.49.65) has joined #ceph
[9:45] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[9:46] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:48] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) has joined #ceph
[9:49] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[9:49] * thomnico (~thomnico@81.253.41.52) Quit (Quit: Ex-Chat)
[9:53] * rendar (~s@host41-179-dynamic.7-87-r.retail.telecomitalia.it) has joined #ceph
[10:00] * mxmln_ (~mxmln@212.79.49.65) has joined #ceph
[10:01] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[10:05] * mxmln (~mxmln@212.79.49.65) Quit (Ping timeout: 480 seconds)
[10:05] * mxmln_ is now known as mxmln
[10:08] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[10:10] * thomnico (~thomnico@81.253.41.52) has joined #ceph
[10:11] * mwarwick (~mwarwick@2407:7800:200:1011:3e97:eff:fe91:d9bf) Quit (Quit: Leaving.)
[10:16] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:28] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:40] * LeaChim (~LeaChim@host86-162-2-255.range86-162.btcentralplus.com) has joined #ceph
[10:45] * shang (~ShangWu@175.41.48.77) Quit (Quit: Ex-Chat)
[10:45] * shang (~ShangWu@175.41.48.77) has joined #ceph
[10:50] * foosinn (~stefan@office.unitedcolo.de) has joined #ceph
[10:57] * yanzheng (~zhyan@134.134.139.74) Quit (Remote host closed the connection)
[11:01] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[11:03] * pressureman (~pressurem@62.217.45.26) has joined #ceph
[11:04] <pressureman> i just set up a new cluster using emperor, and noticed that the default pools (data, metadata, rbd) have pg_num 64, whereas my previous clusters defaulted to 128. has this changed recently?
[11:08] * sarob (~sarob@2601:9:7080:13a:79ce:7441:1ed6:c7be) has joined #ceph
[11:10] * thomnico (~thomnico@81.253.41.52) Quit (Ping timeout: 480 seconds)
[11:16] * sarob (~sarob@2601:9:7080:13a:79ce:7441:1ed6:c7be) Quit (Ping timeout: 480 seconds)
[11:21] * Cube (~Cube@66-87-64-40.pools.spcsdns.net) Quit (Quit: Leaving.)
[11:33] * andreask (~andreask@h081217067008.dyn.cm.kabsi.at) has joined #ceph
[11:33] * ChanServ sets mode +v andreask
[11:35] * alaind (~dechorgna@161.105.182.35) has joined #ceph
[11:39] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[11:45] <pressureman> what time does community help arrive?
[11:49] <topro> pressureman: normally around 17:00 UTC traffic in this channel picks up
[11:50] <pressureman> oh... got a while to wait then.
[11:50] <pressureman> that's pretty much the end of my day... i'm UTC+1
[11:50] <topro> i know, same timezone here :/
[11:53] <topro> pressureman: from http://ceph.com/docs/master/rados/configuration/pool-pg-config-ref/ i can see that "osd pool default pg num" has recently become "8", so maybe from the number of osds and replication level you configured, ceph-deploy decided to use 64 pgs for your pools now. don't know for sure as I am not aware of ceph-deploy.
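Those defaults can be pinned explicitly (a sketch of a ceph.conf fragment, assuming the option names from that pool-pg configuration page; 128 is an illustrative value, not a recommendation):

```
[global]
    ; defaults applied when new pools are created
    osd pool default pg num = 128
    osd pool default pgp num = 128
```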
[11:54] <topro> ^^ i assume you used ceph-deploy
[11:54] <pressureman> topro, i'm not a big fan of ceph-deploy... i tried to use it, but actually ended up using a slightly patched mkcephfs (yes, it's still lurking in emperor)
[11:55] <pressureman> i'll have to make friends with ceph-deploy eventually though
[11:55] <topro> well I never used ceph-deploy. when i last set up a cluster it was still the supported way to use manual setup with mkcephfs. dont like ceph-deploy either
[11:55] <topro> the only one time i gave it a try it was not working at all
[11:55] <topro> ^^ ceph-deploy
[11:56] <topro> bb
[11:56] <joao> ceph-deploy changed a lot since its inception
[11:56] <joao> alfredo has gone to great lengths to make it work really well
[11:57] <joao> but if you guys find it hard to use or if it doesn't seem to work for your purpose, I'd suggest sending an email to the list with suggestions :)
[12:00] * bandrus1 (~Adium@107.216.174.136) has joined #ceph
[12:00] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) Quit (Remote host closed the connection)
[12:01] * i_m (~ivan.miro@deibp9eh1--blueice1n2.emea.ibm.com) has joined #ceph
[12:01] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[12:02] * shang (~ShangWu@175.41.48.77) Quit (Quit: Ex-Chat)
[12:03] * ScOut3R (~ScOut3R@catv-89-133-22-210.catv.broadband.hu) has joined #ceph
[12:03] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Read error: Connection reset by peer)
[12:03] * ScOut3R (~ScOut3R@catv-89-133-22-210.catv.broadband.hu) Quit (Remote host closed the connection)
[12:04] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[12:04] * ScOut3R (~ScOut3R@catv-89-133-22-210.catv.broadband.hu) has joined #ceph
[12:05] * bandrus (~Adium@108.249.90.53) Quit (Ping timeout: 480 seconds)
[12:06] * ScOut3R (~ScOut3R@catv-89-133-22-210.catv.broadband.hu) Quit (Remote host closed the connection)
[12:08] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) has joined #ceph
[12:08] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[12:16] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[12:17] * ScOut3R (~ScOut3R@dslC3E4E249.fixip.t-online.hu) Quit (Ping timeout: 480 seconds)
[12:18] * rongze_ (~rongze@14.18.203.18) Quit (Remote host closed the connection)
[12:21] * ScOut3R (~ScOut3R@catv-89-133-32-3.catv.broadband.hu) has joined #ceph
[12:56] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[13:02] <topro> joao: you are right, just blaming it for what it was some time ago is not right. but anyway, once I learned from alfredo that the coverage of ceph-deploy will always only be to get a new user started, and anything which is not plain standard will always need you to manually administrate your cluster, that was when I completely lost interest in ceph-deploy
[13:08] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[13:13] * rongze (~rongze@211.155.113.236) has joined #ceph
[13:16] <pressureman> the main thing that kinda stuck out when i tried ceph-deploy was that the default directory naming conventions were different to what i had learned earlier from docs
[13:17] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[13:28] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[13:35] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[13:49] * nhm (~nhm@ma62636d0.tmodns.net) has joined #ceph
[13:49] * ChanServ sets mode +o nhm
[14:00] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) has joined #ceph
[14:02] * diegows (~diegows@190.190.11.42) has joined #ceph
[14:03] * japuzzo (~japuzzo@ool-4570886e.dyn.optonline.net) has joined #ceph
[14:03] * tsnider (~tsnider@nat-216-240-30-23.netapp.com) has joined #ceph
[14:08] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[14:16] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[14:26] * wschulze (~wschulze@cpe-72-229-37-201.nyc.res.rr.com) has joined #ceph
[14:28] * mschiff (~mschiff@port-13485.pppoe.wtnet.de) has joined #ceph
[14:31] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[14:32] * tziOm (~bjornar@194.19.106.242) has joined #ceph
[14:35] * markbby (~Adium@168.94.245.2) has joined #ceph
[14:54] * wenjianhn (~wenjianhn@123.118.215.163) Quit (Ping timeout: 480 seconds)
[15:05] * sleinen (~Adium@2001:620:0:26:18a8:fbc9:b8ff:e4d7) has joined #ceph
[15:06] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Remote host closed the connection)
[15:08] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[15:10] * The_Bishop (~bishop@2001:470:50b6:0:7c44:dca:c92:9dec) has joined #ceph
[15:10] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (Remote host closed the connection)
[15:12] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[15:18] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[15:20] * ScOut3R (~ScOut3R@catv-89-133-32-3.catv.broadband.hu) Quit (Read error: Operation timed out)
[15:24] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[15:24] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) Quit (Remote host closed the connection)
[15:27] * BillK (~BillK-OFT@106-68-249-37.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:28] * peetaur (~peter@CPE788df73fb301-CM788df73fb300.cpe.net.cable.rogers.com) Quit (Ping timeout: 480 seconds)
[15:30] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) has joined #ceph
[15:31] * peetaur (~peter@CPE788df73fb301-CM788df73fb300.cpe.net.cable.rogers.com) has joined #ceph
[15:36] * yanzheng (~zhyan@jfdmzpr06-ext.jf.intel.com) has joined #ceph
[15:38] * mnash (~chatzilla@vpn.expressionanalysis.com) has joined #ceph
[15:46] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[15:46] * xevwork (~xevious@6cb32e01.cst.lightpath.net) Quit (Remote host closed the connection)
[15:46] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:49] * xevwork (~xevious@6cb32e01.cst.lightpath.net) has joined #ceph
[15:50] * yanzheng (~zhyan@jfdmzpr06-ext.jf.intel.com) Quit (Ping timeout: 480 seconds)
[15:53] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[15:57] * sjm (~sjm@38.98.115.250) has joined #ceph
[15:59] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[15:59] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[16:01] * scuttlemonkey (~scuttlemo@c-174-51-178-5.hsd1.co.comcast.net) has joined #ceph
[16:01] * ChanServ sets mode +o scuttlemonkey
[16:02] * yanzheng (~zhyan@jfdmzpr06-ext.jf.intel.com) has joined #ceph
[16:03] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit ()
[16:04] * rongze (~rongze@211.155.113.236) Quit (Remote host closed the connection)
[16:04] * sarob (~sarob@2601:9:7080:13a:d095:1182:6b1d:abb1) has joined #ceph
[16:04] * dmsimard (~Adium@108.163.152.2) has joined #ceph
[16:05] * yanzheng (~zhyan@jfdmzpr06-ext.jf.intel.com) Quit (Remote host closed the connection)
[16:06] * haomaiwang (~haomaiwan@211.155.113.163) Quit (Remote host closed the connection)
[16:07] * haomaiwang (~haomaiwan@211.155.113.236) has joined #ceph
[16:11] * ircolle (~Adium@2601:1:8380:2d9:e566:1bef:b138:dfb6) has joined #ceph
[16:12] * ircolle (~Adium@2601:1:8380:2d9:e566:1bef:b138:dfb6) Quit ()
[16:13] <pmatulis> is there any point in introducing external load balancers or ha proxies within a ceph environment?
[16:15] * haomaiwang (~haomaiwan@211.155.113.236) Quit (Ping timeout: 480 seconds)
[16:17] * julian (~julian@125.70.135.211) Quit (Quit: Leaving)
[16:17] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[16:19] * simulx (~simulx@vpn.expressionanalysis.com) has joined #ceph
[16:29] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[16:30] <baffle> How does a configuration management system fit in with Ceph? Just for generating ceph.conf and installing the packages?
[16:31] <madkiss> works fairly well. ttbomk both for chef and puppet.
[16:35] * rongze (~rongze@117.79.232.205) has joined #ceph
[16:35] * mattt_ (~textual@94.236.7.190) Quit (Ping timeout: 480 seconds)
[16:36] * mattt_ (~textual@92.52.76.140) has joined #ceph
[16:38] * sage (~sage@76.89.177.113) Quit (Ping timeout: 480 seconds)
[16:40] <baffle> Hmm, seems most manifests/recipes (for somewhat newer Ceph versions) use ceph-deploy and ssh-auth extensively. Guess that is the way to go. :)
[16:40] <alfredodeza> baffle: ceph-deploy is very limited as compared to all the things you can do with ceph command line utilities
[16:41] <alfredodeza> it is meant to be helpful by assuming a bunch of defaults
[16:41] <alfredodeza> so if you are OK with those defaults and using them, then sure, use ceph-deploy with the configuration management system
[16:42] <mikedawson> pmatulis: no load balancing or ha proxy is needed for the RBD use-case.
[16:43] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[16:45] <baffle> I don't really want to; I'd prefer to use the ceph tools directly and have more control over the infrastructure. Is there any good reason for having a distributed root ssh key if you don't use ceph-deploy?
[16:48] * sage (~sage@cpe-23-242-158-79.socal.res.rr.com) has joined #ceph
[16:51] * jamespage (~jamespage@culvain.gromper.net) Quit (Quit: Coyote finally caught me)
[16:51] * jamespage (~jamespage@culvain.gromper.net) has joined #ceph
[16:51] * noahmehl_ (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) has joined #ceph
[16:52] * gregmark (~Adium@68.87.42.115) has joined #ceph
[16:54] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[16:56] * noahmehl (~noahmehl@cpe-71-67-115-16.cinci.res.rr.com) Quit (Ping timeout: 480 seconds)
[16:56] * noahmehl_ is now known as noahmehl
[16:58] <pmatulis> mikedawson: why is that again?
[16:59] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[16:59] <mikedawson> pmatulis: ceph's architecture has ha/load balancing built in. no need for anything else to use rbd
[17:00] <pmatulis> mikedawson: ok, but why just rbd?
[17:00] <mikedawson> pmatulis: that's all I use. not sure of cephfs or radosgw
[17:01] <pmatulis> mikedawson: ok
[17:02] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[17:07] * sarob (~sarob@2601:9:7080:13a:d095:1182:6b1d:abb1) Quit (Remote host closed the connection)
[17:07] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[17:07] * jbd_ (~jbd_@2001:41d0:52:a00::77) has joined #ceph
[17:08] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[17:08] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[17:09] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[17:11] * andreask (~andreask@h081217067008.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[17:11] <baffle> Should I have a partition table on dedicated osd disks, or should the filesystem be directly on the device?
[17:12] <baffle> Nvm, I guess it should have a gpt.
[17:13] * rongze (~rongze@117.79.232.205) Quit (Remote host closed the connection)
[17:14] * scuttlemonkey (~scuttlemo@c-174-51-178-5.hsd1.co.comcast.net) Quit (Read error: Operation timed out)
[17:17] * rongze (~rongze@117.79.232.175) has joined #ceph
[17:18] * haomaiwang (~haomaiwan@117.79.232.205) has joined #ceph
[17:19] * nhm (~nhm@ma62636d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[17:24] * mozg (~andrei@host217-36-17-226.in-addr.btopenworld.com) has joined #ceph
[17:24] <mozg> hello guys
[17:24] <mozg> I was wondering if anyone managed to setup a working S3 API with SSL using radosgw daemon?
[17:25] <mozg> i've followed the instructions on setting up the radosgw, but the ssl bit is not working
[17:26] * haomaiwang (~haomaiwan@117.79.232.205) Quit (Ping timeout: 480 seconds)
[17:26] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[17:29] * foosinn (~stefan@office.unitedcolo.de) Quit (Quit: Leaving)
[17:29] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[17:30] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[17:30] * pressureman (~pressurem@62.217.45.26) Quit (Quit: Ex-Chat)
[17:33] * Tamil1 (~Adium@cpe-76-168-18-224.socal.res.rr.com) has joined #ceph
[17:34] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[17:37] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[17:37] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[17:38] <L2SHO> is there a way to ask a monitor which rbd images are mounted?
[17:43] <xdeller> rados -p rbd listwatchers rbd_header.xxx
[17:45] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[17:46] <L2SHO> xdeller, am I supposed to replace .xxx with something? I'm getting "rbd_header.xxx: No such file or directory"
[17:47] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[17:47] <xdeller> ofc, rbd info volume-name | grep prefix
[17:47] * glzhao (~glzhao@118.195.65.67) Quit (Quit: leaving)
[17:47] <xdeller> block_name_prefix: rbd_data.xxx
[17:51] <L2SHO> that didn't seem to work either http://apaste.info/kGY0
[17:52] * alex__ (~quassel@85.14.154.66) has joined #ceph
[17:52] <pmatulis> L2SHO: what do you mean by 'mounted'?
[17:52] <alex__> hi all.
[17:52] * sleinen (~Adium@2001:620:0:26:18a8:fbc9:b8ff:e4d7) Quit (Quit: Leaving.)
[17:52] * sleinen (~Adium@130.59.94.209) has joined #ceph
[17:53] <L2SHO> pmatulis, I mean I want to make sure this rbd image isn't mapped on any clients before I remove it
[17:53] <xdeller> what version of ceph you are using?
[17:53] <pmatulis> L2SHO: ah 'mapped', not 'mounted'
[17:53] <L2SHO> 0.67.4-1precise
[17:53] <pmatulis> L2SHO: why not use the 'showmapped' command?
[17:54] <xdeller> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-August/003718.html here is the reference
[17:54] <L2SHO> pmatulis, it looks like that only works on the client side
[17:54] <L2SHO> pmatulis, I need to check from the ceph cluster side
[17:55] <pmatulis> L2SHO: why wouldn't it work?
[17:55] <L2SHO> pmatulis, http://apaste.info/9bbx
[17:55] * jluis (~JL@a79-168-11-205.cpe.netcabo.pt) has joined #ceph
[17:55] * ChanServ sets mode +o jluis
[17:56] <L2SHO> xdeller, thanks, I'll check this out
[17:58] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[17:58] * mattt_ (~textual@92.52.76.140) Quit (Read error: Connection reset by peer)
[17:59] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[18:01] * sleinen (~Adium@130.59.94.209) Quit (Ping timeout: 480 seconds)
[18:04] <Pauline> L2SHO: if you're using a format=2 image, you need to use rbd_header.xxx
[18:05] <L2SHO> Pauline, it's format: 1
[18:05] * xarses (~andreww@64-79-127-122.static.wiline.com) has joined #ceph
[18:05] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Read error: Connection reset by peer)
[18:07] * i_m (~ivan.miro@deibp9eh1--blueice1n2.emea.ibm.com) Quit (Quit: Leaving.)
[18:08] * xmltok (~xmltok@162-236-144-149.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[18:08] <pmatulis> L2SHO: interesting, i get that too on a random monitor
[18:08] * mxmln (~mxmln@212.79.49.65) Quit (Remote host closed the connection)
[18:08] <L2SHO> pmatulis, I believe showmapped is meant to be run on the client
[18:09] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[18:09] * mxmln (~mxmln@212.79.49.65) has joined #ceph
[18:09] * jbd_ (~jbd_@2001:41d0:52:a00::77) has left #ceph
[18:09] <pmatulis> L2SHO: sure, but what does client mean? i figured it was (1) ceph.conf, (2) auth/credentials/keyring and (3) base ceph packages (ceph, ceph-common on ubuntu)
[18:10] <L2SHO> pmatulis, the client is the machine that has the rbd mounted
[18:10] <L2SHO> pmatulis, I mean mapped
[18:10] <pmatulis> L2SHO: ah
[18:10] * sjm (~sjm@38.98.115.250) has left #ceph
[18:10] * jluis is now known as joao|lap
[18:10] <Pauline> L2SHO, ok, just created a fmt=1 image called t, "rados -p rbd listwatchers t.rbd" shows me the mounter.
[18:11] * nhm (~nhm@ma62636d0.tmodns.net) has joined #ceph
[18:11] * ChanServ sets mode +o nhm
[18:12] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[18:12] * mxmln (~mxmln@212.79.49.65) Quit (Remote host closed the connection)
[18:13] <L2SHO> Pauline, yep, that looks like it's working! thanks
[18:13] <pmatulis> L2SHO: re rbd client, understood, thanks for clarifying
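The cluster-side "who has this rbd mapped?" check worked out above can be sketched as a small shell snippet. The `rbd info` output here is a canned sample for a hypothetical image named `myimage`; on a real cluster you would capture it from `rbd info myimage`, and the commented `rados` calls at the end do the actual watcher query:

```shell
# Sample "rbd info" output (hypothetical image and id):
INFO='rbd image "myimage":
	size 30720 GB in 7864320 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.102f74b0dc51
	format: 2'

# Pull the object id out of the block_name_prefix line
PREFIX=$(printf '%s\n' "$INFO" | awk '/block_name_prefix/ {print $2}')
ID=${PREFIX#rbd_data.}
HEADER_OBJ="rbd_header.$ID"
echo "$HEADER_OBJ"

# Format-2 images: list watchers on the header object
#   rados -p rbd listwatchers "$HEADER_OBJ"
# Format-1 images: the header object is simply "<image>.rbd"
#   rados -p rbd listwatchers myimage.rbd
```

Any client that currently has the image mapped shows up as a watcher on that header object, which is why the check works from the cluster side without touching the clients.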
[18:13] * mxmln (~mxmln@212.79.49.65) has joined #ceph
[18:18] * ScOut3R (~scout3r@4E5C7289.dsl.pool.telekom.hu) has joined #ceph
[18:18] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[18:20] * nhm (~nhm@ma62636d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[18:20] * mozg (~andrei@host217-36-17-226.in-addr.btopenworld.com) Quit (Ping timeout: 480 seconds)
[18:22] * smiley (~smiley@205.153.36.170) has left #ceph
[18:27] <L2SHO> wow, I didn't realize removing an rbd image would take hours
[18:28] * rongze (~rongze@117.79.232.175) Quit (Remote host closed the connection)
[18:28] * alaind (~dechorgna@161.105.182.35) Quit (Ping timeout: 480 seconds)
[18:29] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[18:30] * nhm (~nhm@wlan-exhibits-2605.sc13.org) has joined #ceph
[18:30] * ChanServ sets mode +o nhm
[18:30] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[18:30] * joshd1 (~jdurgin@2602:306:c5db:310:b9b1:7989:e38a:cac) has joined #ceph
[18:31] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) has joined #ceph
[18:33] <Pauline> yeah, async deletes would speed things up considerably from the user's perspective ^^ But hey, if you think that takes long, imagine copying an image to another pool...
[18:33] * thomnico (~thomnico@81.253.41.52) has joined #ceph
[18:34] <L2SHO> zzzzzz...... 2% complete. I hope it doesn't stop/break if my ssh dies before it finishes
[18:35] <L2SHO> why doesn't it just mark it as deleted and then go and actually delete the data when it's doing a scrub or something
[18:35] * linuxkidd (~linuxkidd@2607:f298:a:607:9eeb:e8ff:fe07:6658) has joined #ceph
[18:37] * Pauline jumps up and down, waving arm "I know the answer, i know the answer: patches welcome!"
[18:38] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[18:40] * mattt (~mattt@lnx1.defunct.ca) Quit (Remote host closed the connection)
[18:44] * wrale (~wrale@wrk-28-217.cs.wright.edu) Quit (Quit: Leaving)
[18:47] <JoeGruher> funny i just ran into the same thing... deleting a 1TB RBD is taking forever... and if I never even wrote anything to it and it is thin provisioned what is there to delete?
[18:48] <L2SHO> JoeGruher, I'm deleting a 30TB rbd. It's taking about 15min/1% :(
[18:48] <JoeGruher> ungh
[18:50] <L2SHO> actually, it looks like it's starting to speed up a little bit maybe?
[18:50] <L2SHO> it just jumped from 4 to 7
[18:51] <Pauline> the image is sparse, if you didnt fill it up completely it goes fast sometimes...
[18:52] * mattt (~mattt@lnx1.defunct.ca) has joined #ceph
[18:55] * dpippenger (~riven@66-192-9-78.static.twtelecom.net) has joined #ceph
[18:59] * rongze (~rongze@117.79.232.205) has joined #ceph
[18:59] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[18:59] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[19:00] * Perplexed (~perplexed@mobile-166-137-184-237.mycingular.net) has joined #ceph
[19:03] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[19:04] * nhm (~nhm@wlan-exhibits-2605.sc13.org) Quit (Ping timeout: 480 seconds)
[19:04] * mtanski (~mtanski@69.193.178.202) Quit (Quit: mtanski)
[19:07] * WarrenU (~Warren@2607:f298:a:607:2d2a:9a49:8111:f26d) has joined #ceph
[19:09] <aarontc> kind of off the wall question - I have two OSD hosts that in the last few days have started kernel panicking at least once a day. I can't see enough of the stack trace to know exactly why, but it's related to xfs, and the last line is "Fixing recursive fault but reboot is needed!"
[19:09] * Perplexed (~perplexed@mobile-166-137-184-237.mycingular.net) Quit (Quit: Colloquy for iPad - http://colloquy.mobi)
[19:09] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) Quit (Ping timeout: 480 seconds)
[19:11] * rongze (~rongze@117.79.232.205) Quit (Ping timeout: 480 seconds)
[19:11] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[19:11] * dpippenger (~riven@66-192-9-78.static.twtelecom.net) Quit (Read error: Connection reset by peer)
[19:11] * dpippenger (~riven@66-192-9-78.static.twtelecom.net) has joined #ceph
[19:12] * davidzlap (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[19:13] * nhm (~nhm@wlan-rooms-4019.sc13.org) has joined #ceph
[19:13] * ChanServ sets mode +o nhm
[19:16] * smiley (~smiley@205.153.36.170) has joined #ceph
[19:18] * xmltok (~xmltok@162-236-144-149.lightspeed.irvnca.sbcglobal.net) Quit (Quit: Leaving...)
[19:18] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[19:18] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:19] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[19:22] * mxmln (~mxmln@212.79.49.65) Quit (Quit: mxmln)
[19:23] <aarontc> all my OSDs run the same kernel version and the rest are okay, so I'm not sure what the problem is. http://i.imgur.com/DGLLam4.png is all I've got :/
[19:24] * xmltok (~xmltok@162-236-144-149.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[19:28] <pmatulis> aarontc: i guess dial up the debug on those hosts
[19:28] <pmatulis> aarontc: on the OS side, configure kexec/kdump
[19:29] <aarontc> pmatulis: I could try that, but the system quits responding when the kernel panics, nothing else is written to disk
[19:29] <jhujhiti> trying to add a new mon -- "cephx: verify_reply couldn't decrypt with error: error decoding block for decryption" what did i do wrong?
[19:29] <xdeller> aarontc which kernel version you`re using?
[19:29] <aarontc> xdeller: 3.10.7
[19:29] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[19:29] <pmatulis> aarontc: there may be symptoms leading up the panic however
[19:29] <L2SHO> aarontc, You can try setting up a serial console to get the whole stack trace. My gut feeling is that it's a corrupt filesystem or hardware issue
[19:30] <Pauline> or netdump
[19:30] <xdeller> netconsole will be fine
[19:30] <aarontc> xdeller: I'll setup netconsole when I reboot the hosts
[19:30] <pmatulis> aarontc: 'leading up *to the panic' even
[19:30] <L2SHO> I guess serial console is old school, lol.
[19:31] <aarontc> L2SHO: I don't have any serial ports, unfortunately
[19:31] * ScOut3R (~scout3r@4E5C7289.dsl.pool.telekom.hu) Quit ()
[19:32] <L2SHO> aarontc, ya, I think most enterprise servers have a way to emulate a serial port over ipmi or something like that, but I've never used it
[19:32] <L2SHO> I didn't even know netconsole was a thing
[19:32] * nhm (~nhm@wlan-rooms-4019.sc13.org) Quit (Ping timeout: 480 seconds)
[19:35] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[19:35] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[19:38] <aarontc> L2SHO: Yeah, it is; unfortunately it can have issues in case of a panic, because sometimes the irq handler for the network card won't be able to run anymore, but I would imagine the same is true of a serial port
[19:38] * Cube (~Cube@66-87-64-40.pools.spcsdns.net) has joined #ceph
[19:40] <aarontc> L2SHO: if you're curious, you can read more: https://www.kernel.org/doc/Documentation/networking/netconsole.txt
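A minimal netconsole setup along the lines of the kernel document linked above looks like the fragment below. The interface name, IP addresses, ports, and MAC address are placeholder examples, and this needs root on the sending host plus a listener on the receiving one, so it is a sketch rather than something to paste verbatim:

```shell
# On the host that panics: send kernel messages over UDP.
# Parameter syntax (from Documentation/networking/netconsole.txt):
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@[tgt-ip]/[tgt-macaddr]
modprobe netconsole \
    netconsole=6665@192.168.0.10/eth0,6666@192.168.0.20/00:11:22:33:44:55

# On the receiving host: capture the stream to a file.
nc -u -l 6666 | tee osd-panic.log
```

As noted above, this shares the serial-console caveat: if the panic takes down the NIC's irq handling, the final messages may never make it off the box.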
[19:40] <L2SHO> aarontc, thanks
[19:41] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[19:41] * mattt (~mattt@lnx1.defunct.ca) Quit (Read error: Operation timed out)
[19:43] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:43] * mattt (~mattt@lnx1.defunct.ca) has joined #ceph
[19:44] * sleinen (~Adium@2001:620:0:25:4d2e:a816:ee36:2e3f) has joined #ceph
[19:49] * aliguori (~anthony@74.202.210.82) has joined #ceph
[19:53] * todin (tuxadero@kudu.in-berlin.de) Quit (Remote host closed the connection)
[19:53] * dxd828 (~dxd828@host-92-24-127-29.ppp.as43234.net) has joined #ceph
[19:58] * sleinen (~Adium@2001:620:0:25:4d2e:a816:ee36:2e3f) Quit (Quit: Leaving.)
[19:58] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[19:59] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) Quit (Quit: Leaving.)
[20:00] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[20:00] * angdraug (~angdraug@64-79-127-122.static.wiline.com) has joined #ceph
[20:01] * sarob (~sarob@ip-64-134-225-149.public.wayport.net) has joined #ceph
[20:02] * xmltok (~xmltok@162-236-144-149.lightspeed.irvnca.sbcglobal.net) Quit (Quit: Leaving...)
[20:04] * xmltok (~xmltok@162-236-144-149.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[20:06] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[20:06] * mtanski (~mtanski@69.193.178.202) Quit (Quit: mtanski)
[20:06] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[20:09] * ScOut3R (~ScOut3R@4E5C7289.dsl.pool.telekom.hu) has joined #ceph
[20:12] * andreask (~andreask@h081217067008.dyn.cm.kabsi.at) has joined #ceph
[20:12] * ChanServ sets mode +v andreask
[20:14] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[20:18] * ScOut3R (~ScOut3R@4E5C7289.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[20:22] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[20:24] * sleinen1 (~Adium@2001:620:0:25:50d0:4dee:a96c:a9f5) has joined #ceph
[20:26] * wusui1 (~Warren@2607:f298:a:607:a8d9:5660:cbb1:e25d) has joined #ceph
[20:29] * tsnider1 (~tsnider@nat-216-240-30-23.netapp.com) has joined #ceph
[20:30] * sarob_ (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:30] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[20:30] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[20:31] * gkoch (~gkoch@38.86.161.178) has joined #ceph
[20:32] * thomnico (~thomnico@81.253.41.52) Quit (Ping timeout: 480 seconds)
[20:32] * aardvark1 (~Warren@2607:f298:a:607:2d2a:9a49:8111:f26d) Quit (Ping timeout: 480 seconds)
[20:32] * WarrenUsui (~Warren@2607:f298:a:607:a8d9:5660:cbb1:e25d) has joined #ceph
[20:32] * wusui (~Warren@2607:f298:a:607:2d2a:9a49:8111:f26d) Quit (Ping timeout: 480 seconds)
[20:32] * WarrenU (~Warren@2607:f298:a:607:2d2a:9a49:8111:f26d) Quit (Ping timeout: 480 seconds)
[20:32] * wusui (~Warren@2607:f298:a:607:a8d9:5660:cbb1:e25d) has joined #ceph
[20:33] <gkoch> Hello, working on a first time ceph install and having trouble with ceph auth and keyring generation. Anyone available to help with this issue?
[20:34] * rongze (~rongze@117.79.232.205) has joined #ceph
[20:34] <pmatulis> gkoch: there are no dedicated people to deal with someone's problem. just ask your question and if someone knows the answer they will respond
[20:34] * sarob_ (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[20:34] * sarob_ (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:35] * tsnider (~tsnider@nat-216-240-30-23.netapp.com) Quit (Ping timeout: 480 seconds)
[20:36] <gkoch> Following the cephx configuration and the ceph auth get-or-create fails. Appears it is trying to connect to the cluster, but it isn't running yet because I don't have a keyring. Am I right in saying I can't start the cluster without a .admin key and keyring?
[20:36] <gkoch> Following the steps on http://ceph.com/docs/master/rados/operations/authentication/
[20:37] * sarob (~sarob@ip-64-134-225-149.public.wayport.net) Quit (Ping timeout: 480 seconds)
[20:39] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:39] * sarob_ (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Read error: Connection reset by peer)
[20:40] <andreask> gkoch: you already deployed the ceph cluster? ... with ceph-deploy?
[20:42] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[20:43] * rongze (~rongze@117.79.232.205) Quit (Ping timeout: 480 seconds)
[20:44] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:44] <gkoch> andreask: we are trying to do this manually so we can replicate it via puppet. I think we have the config file all set though. global, mon, and osd are in config.
[20:46] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[20:47] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[20:47] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:48] <L2SHO> gkoch, I believe you need to create a keyring with ceph-authtool, then bootstrap the monitors with that key
[20:48] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[20:49] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:49] <andreask> gkoch: you have seen this: https://github.com/enovance/puppet-ceph ?
[20:50] <gkoch> I'll check that doc out - I think it's the same thing on puppet forge. Might need to make some changes to it for RHEL though.
[20:52] <gkoch> L2SHO I believe I need a client.admin key before I can create a keyring? I can't create this. I receive a bunch of pipe().fault errors and then finally "Error connecting to cluster: Error"
[20:52] <gkoch> This is step 1 here: http://ceph.com/docs/master/rados/operations/authentication/#enabling-cephx
[20:53] <L2SHO> gkoch, I think you want to start with this: http://ceph.com/docs/master/man/8/ceph-authtool/
[20:53] <pmatulis> gkoch: if you created a cluster you should have an admin keyring
[20:54] * xmltok (~xmltok@162-236-144-149.lightspeed.irvnca.sbcglobal.net) Quit (Quit: Leaving...)
[20:54] <gkoch> I've tried "ceph-authtool /etc/ceph/keyring --create-keyring --name mon.0 --gen-key". Then I try to print it out and get "entity client.admin not found"
[20:54] * Steki (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[20:54] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Read error: Connection reset by peer)
[20:55] <gkoch> Actually, I think just putting this out in writing had helped me. I can create a client.admin entity with this also!
[20:55] <L2SHO> gkoch, http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#adding-a-monitor-manual
[20:55] <L2SHO> gkoch, yes, you want to create a client.admin key
[20:55] <L2SHO> and then use that to create the first monitor
[20:55] <gkoch> Thanks, I think you just got me moving along again
[20:56] <L2SHO> gkoch, when you start up the monitor for the first time, it calls some other script to get the mon.0 keys that get generated I think
[20:57] <L2SHO> it should be in the logs
[20:57] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[20:57] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[20:58] <gkoch> L2SHO it calls ceph-create-keys when starting up. Seems to do the same pipe().fault stuff though.
[20:58] <gkoch> I'll see if I can continue through the cephx setup doc though. Thanks for your help so far.
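The manual bootstrap sequence pieced together in the exchange above (keyring first, then monitor, per the add-or-rm-mons doc) looks roughly like this. The fsid, IP address, and mon id are placeholder values, and this is an unverified sketch of the flow rather than a complete recipe — it requires the ceph packages and root:

```shell
# 1. mon. key plus client.admin key, merged into one bootstrap keyring
ceph-authtool --create-keyring /tmp/ceph.mon.keyring \
    --gen-key -n mon. --cap mon 'allow *'
ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring \
    --gen-key -n client.admin \
    --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
ceph-authtool /tmp/ceph.mon.keyring \
    --import-keyring /etc/ceph/ceph.client.admin.keyring

# 2. Initial monmap and monitor data directory (placeholder fsid/IP)
monmaptool --create --add 0 192.168.0.10:6789 \
    --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap
ceph-mon --mkfs -i 0 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

# 3. Start the monitor; once it is up, "ceph -s" should respond
#    instead of looping on pipe().fault retries
service ceph start mon.0
```

The pipe().fault errors seen earlier are what the client prints while it retries a monitor that is not answering, which is why every auth command fails until step 3 succeeds.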
[20:59] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[20:59] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[21:00] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[21:01] * xdeller (~xdeller@91.218.144.129) Quit (Quit: Leaving)
[21:02] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[21:02] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[21:02] * nhm (~nhm@wlan-rooms-4019.sc13.org) has joined #ceph
[21:02] * ChanServ sets mode +o nhm
[21:03] <L2SHO> gkoch, this might be helpful too: https://github.com/ceph/ceph/blob/master/doc/dev/mon-bootstrap.rst
[21:06] <gkoch> L2SHO: thanks, been very helpful thus far!
[21:07] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Remote host closed the connection)
[21:07] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[21:09] * sarob_ (~sarob@ip-64-134-225-149.public.wayport.net) has joined #ceph
[21:11] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[21:13] <aarontc> gkoch: you may also find the "manual deployment" notes here useful: https://github.com/aarontc/ansible-playbooks/blob/master/roles/ceph.notes-on-deployment.rst
[21:13] * sarob (~sarob@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Read error: Connection reset by peer)
[21:16] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) has joined #ceph
[21:20] * andreask (~andreask@h081217067008.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[21:20] * Tamil1 (~Adium@cpe-76-168-18-224.socal.res.rr.com) Quit (Quit: Leaving.)
[21:21] <gkoch> aarontc: great notes on manual setup - thanks!
[21:24] <aarontc> gkoch: thanks, I only added the top few paragraphs, the rest I found on github as is :)
[21:28] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[21:30] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[21:34] <jhujhiti> i've been building new ceph boxes and i've noticed that the running qemu VMs that started before any migration work are not connected to the new OSDs... is this normal?
[21:34] <jhujhiti> netstat -nt only shows mon connections to the new boxes - not osd
[21:34] * mozg (~andrei@host81-151-251-29.range81-151.btcentralplus.com) has joined #ceph
[21:34] <jhujhiti> i'm nervous about taking down the last of the original machines because of this
[21:36] * nwat (~textual@eduroam-227-103.ucsc.edu) has joined #ceph
[21:37] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[21:43] * wusui (~Warren@2607:f298:a:607:a8d9:5660:cbb1:e25d) Quit (Quit: Leaving)
[21:43] <aarontc> jhujhiti: is the cluster done rebalancing?
[21:44] * tsnider (~tsnider@nat-216-240-30-23.netapp.com) has joined #ceph
[21:44] <aarontc> qemu should (I believe) only connect to the "primary" OSD for any given placement group it needs to access
[21:44] <jhujhiti> aarontc: yes, it was done
[21:44] <aarontc> unrelated, is this of concern? [498748.392577] ceph-osd (6507) used greatest stack depth: 1816 bytes left
[21:44] <jhujhiti> aarontc: that would explain it, this would always be the third osd in the crush map
[21:47] * tsnider1 (~tsnider@nat-216-240-30-23.netapp.com) Quit (Ping timeout: 480 seconds)
[21:49] * sleinen1 (~Adium@2001:620:0:25:50d0:4dee:a96c:a9f5) Quit (Quit: Leaving.)
[21:49] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[21:50] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Read error: Connection timed out)
[21:50] <pmatulis> afaiu, clients do not connect to an OSD. they calculate where their data is and this leads them to PGs that are spread all over a pool, which involves multiple OSDs. each PG does have a primary but there are many PGs so therefore many OSDs
[21:50] <pmatulis> can someone please confirm this? :) ^^^
[21:51] <jhujhiti> pmatulis: yes but they still have an open tcp connection to a socket on a machine hosting an osd
[21:52] <jhujhiti> but it must be because it's not primary - that makes sense and i remember having read about the osd architecture before
[21:52] <pmatulis> i suppose it's possible that your original OSDs are on the same host though
[21:52] <pmatulis> although that is not supposed to happen
[21:52] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[21:54] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[21:55] <aarontc> I would think it'd be beneficial to direct reads to non-primary OSDs for better throughput, but I think writes are always sent to the primary
[21:56] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit ()
[21:56] <pmatulis> aarontc: why does reading from only primaries decrease throughput?
[21:57] <aarontc> pmatulis: because you have (potentially) many spindles with the same data, so you can serve more reads in parallel by using secondaries also
[21:57] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[21:59] * nhm (~nhm@wlan-rooms-4019.sc13.org) Quit (Ping timeout: 480 seconds)
[21:59] <pmatulis> aarontc: the client is reading already from multiple OSDs. it doesn't just read from one
[22:00] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[22:00] * The_Bishop (~bishop@2001:470:50b6:0:7c44:dca:c92:9dec) Quit (Ping timeout: 480 seconds)
[22:01] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[22:02] <pmatulis> aarontc: objects are spread over PGs, with each one potentially having a different primary OSD
[22:02] <aarontc> pmatulis: yes, but I don't know if clients are smart enough to go "oh, pg 0.1af is on OSD 0, 12, 14, and 24, so make connections to all four OSDs to do reads in parallel"
[22:03] <aarontc> pmatulis: or if the client will only look to the primary for a given pg to do a read
[22:05] <pmatulis> aarontc: that's not my understanding. hopefully someone else here can confirm that or not. good questions!
[22:05] <aarontc> pmatulis: I am assuming clients DO do that, otherwise the scalability of the cluster would be severely limited
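The PG-to-OSD mapping being debated above can be inspected directly with the `ceph osd map` command, which prints the PG and the acting OSD set (the first OSD listed is the primary) for a given object. A minimal sketch; the pool and object names here are made-up examples:

```shell
# Show which PG an object hashes to and which OSDs (primary first) serve it.
# "rbd" is the pool and "myobject" an arbitrary object name - both hypothetical.
ceph osd map rbd myobject
# Typical output shape:
#   osdmap e1234 pool 'rbd' (2) object 'myobject' -> pg 2.5e1ab ... -> up [0,12,24] acting [0,12,24]
```

Running this for a few objects makes it visible that different PGs have different primaries, which is pmatulis's point that reads already fan out across many OSDs even if each individual read goes to a primary.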
[22:06] <jhujhiti> is it 100% safe to remove OSDs while i have pages active+remapped+backfilling?
[22:06] <jhujhiti> trying to save some time migrating to new machines
[22:07] <aarontc> jhujhiti: as long as no pages are degraded, I think so, but you should confirm with others who know more :)
[22:07] <aarontc> s/pages/placement groups
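For context on jhujhiti's migration question, the standard OSD-removal sequence is to drain the OSD first and only remove it once the cluster has rebalanced. A hedged sketch, assuming the OSD to remove has id 3 (hypothetical):

```shell
# 1. Mark the OSD out so CRUSH re-maps its PGs elsewhere and backfilling starts.
ceph osd out 3

# 2. Watch cluster status until backfilling finishes and no PGs are degraded.
ceph -s

# 3. Stop the daemon on its host, then remove it from CRUSH, auth, and the OSD map.
ceph osd crush remove osd.3
ceph auth del osd.3
ceph osd rm 3
```

Removing an OSD while other PGs are still active+remapped+backfilling adds more data movement on top of the in-flight recovery, which is why waiting for a clean state first is the cautious route.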
[22:07] * Tamil1 (~Adium@cpe-76-168-18-224.socal.res.rr.com) has joined #ceph
[22:08] * The_Bishop (~bishop@2001:470:50b6:0:ec49:a07a:74b3:4c77) has joined #ceph
[22:09] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[22:10] * japuzzo (~japuzzo@ool-4570886e.dyn.optonline.net) Quit (Quit: Leaving)
[22:12] * andreask (~andreask@h081217067008.dyn.cm.kabsi.at) has joined #ceph
[22:12] * ChanServ sets mode +v andreask
[22:14] * BillK (~BillK-OFT@58-7-109-226.dyn.iinet.net.au) has joined #ceph
[22:16] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[22:18] * The_Bishop (~bishop@2001:470:50b6:0:ec49:a07a:74b3:4c77) Quit (Ping timeout: 480 seconds)
[22:21] * The_Bishop (~bishop@2001:470:50b6:0:ec49:a07a:74b3:4c77) has joined #ceph
[22:21] <aarontc> anyone else have issues with MDS failing to reconnect after an OSD outage + recovery?
[22:22] <aarontc> I get to "2013-11-19 13:20:43.654510 7f54b1827700 1 mds.0.51 rejoin_joint_start" and then nothing
[22:25] * mtanski (~mtanski@69.193.178.202) Quit (Quit: mtanski)
[22:30] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[22:31] * yanzheng (~zhyan@134.134.139.76) has joined #ceph
[22:32] <aarontc> I can't seem to find the right documentation - how do I get the kernel rbd module to understand partitions on an rbd?
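aarontc's partition question never gets an answer in-channel; the usual approach is to have the kernel (or device-mapper) re-read the partition table after mapping the image. A sketch under the assumption of a pool/image named `mypool/myimage` (hypothetical names):

```shell
# Map the image; the kernel assigns a device such as /dev/rbd0.
rbd map mypool/myimage

# Ask the kernel to re-read the partition table so /dev/rbd0p1 etc. appear.
partprobe /dev/rbd0

# On kernels whose rbd module does not expose partitions, device-mapper
# can create them instead, as /dev/mapper/rbd0p1 etc.:
kpartx -a /dev/rbd0
```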
[22:33] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[22:34] * madkiss (~madkiss@2001:6f8:12c3:f00f:d0d:2252:c1cf:8249) Quit (Quit: Leaving.)
[22:36] * yanzheng (~zhyan@134.134.139.76) Quit (Remote host closed the connection)
[22:36] * sleinen1 (~Adium@2001:620:0:26:d929:19bb:3d9c:d068) has joined #ceph
[22:36] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Read error: Operation timed out)
[22:38] * xarses (~andreww@64-79-127-122.static.wiline.com) Quit (Ping timeout: 480 seconds)
[22:41] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[22:43] * mwarwick (~mwarwick@110-174-133-236.static.tpgi.com.au) has joined #ceph
[22:43] * madkiss (~madkiss@2001:6f8:12c3:f00f:956b:85b3:bb5:e176) has joined #ceph
[22:44] * mwarwick (~mwarwick@110-174-133-236.static.tpgi.com.au) Quit ()
[22:44] * mwarwick (~mwarwick@110-174-133-236.static.tpgi.com.au) has joined #ceph
[22:46] * Sysadmin88 (~IceChat77@94.1.37.151) has joined #ceph
[22:48] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[22:49] * xarses (~andreww@64-79-127-122.static.wiline.com) has joined #ceph
[22:53] <loicd> I have ntp installed on two machines that run the monitor, it has been running for three hours and they are still 7 seconds apart. How can that be?
[22:53] * rovar (~rovar@pool-108-6-176-201.nycmny.fios.verizon.net) Quit (Remote host closed the connection)
[22:54] <jhujhiti> loicd: really not a ceph question, but how far apart were the clocks when you started?
[22:55] <loicd> true it's not a ceph question ;-) mon complains and i've always been puzzled by the ability of ntp to *not* set the date ;-) The time was 7 seconds apart when it started, IIRC.
[22:55] <loicd> maybe 10
[22:55] <loicd> I did not accurately check
[22:55] * mnash (~chatzilla@vpn.expressionanalysis.com) Quit (Read error: No route to host)
[22:55] <loicd> definitely less than 15 seconds
[22:55] * mnash (~chatzilla@vpn.expressionanalysis.com) has joined #ceph
[22:55] <gregsfortytwo> does ntp refuse to correct clocks if they're too far off? (unless you pass in an override flag, obviously) I feel like I've heard something about that
[22:55] * markbby (~Adium@168.94.245.2) Quit (Quit: Leaving.)
[22:56] <jhujhiti> gregsfortytwo: yes but i believe it's five minutes by default
[22:56] <loicd> # ntpdate -q ntp.ubuntu.com
[22:56] <loicd> server 91.189.94.4, stratum 2, offset -8.342132, delay 0.03047
[22:56] <loicd> server 91.189.89.199, stratum 2, offset -8.340332, delay 0.02988
[22:56] <loicd> 19 Nov 22:56:16 ntpdate[11804]: step time server 91.189.89.199 offset -8.340332 sec
[22:57] <loicd> I'll wait to see if this slowly improves
[22:57] <jhujhiti> loicd: ntp drifts into sync, so it can take a while
[22:57] <jhujhiti> loicd: stop ntpd and run ntpdate instead to force it if you want, then start ntpd
[22:57] <loicd> jhujhiti: thanks, now I remember that it takes a really long time ;-)
[22:57] <loicd> jhujhiti: I'm not really in a hurry, I'll check tomorrow morning.
[22:57] <loicd> sorry for the slightly off-topic question
[22:58] <jhujhiti> i haven't seen clocks more than a couple seconds out of alignment in years so i'm not sure how long it's expected to take
[22:59] <janos> yeah i typically run ntpdate -u <sync target> once or twice if it's really off. then start the ntpd daemon
[22:59] <loicd> If the improvement is not significant by tomorrow morning I'll set it manually. This is kind of fun actually.
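The force-sync procedure jhujhiti and janos describe can be sketched as follows (service names vary by distro; `ntp` and the Ubuntu pool server are assumptions from the conversation):

```shell
# ntpd slews large offsets very slowly (or refuses very large ones),
# so stop it before stepping the clock directly.
service ntp stop

# Step the clock immediately; -u uses an unprivileged source port,
# which avoids conflicts with a lingering ntpd socket.
ntpdate -u ntp.ubuntu.com

# Restart ntpd to resume gradual clock discipline from the corrected time.
service ntp start
```

Once the monitors' clocks are within the mon clock-drift threshold, the HEALTH_WARN loicd mentions should clear on its own.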
[22:59] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[23:00] <loicd> the drift is not big enough to trigger problems between the mons, except for a HEALTH_WARN status
[23:01] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit (Quit: Computer has gone to sleep.)
[23:06] * xarses (~andreww@64-79-127-122.static.wiline.com) Quit (Ping timeout: 480 seconds)
[23:07] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[23:09] * DarkAce-Z (~BillyMays@50.107.53.200) has joined #ceph
[23:09] * DarkAce-Z (~BillyMays@50.107.53.200) Quit (Max SendQ exceeded)
[23:11] * DarkAce-Z (~BillyMays@50.107.53.200) has joined #ceph
[23:13] * AfC (~andrew@2407:7800:200:1011:6e88:14ff:fe33:2a9c) has joined #ceph
[23:14] * DarkAceZ (~BillyMays@50.107.53.200) Quit (Ping timeout: 480 seconds)
[23:17] * rendar (~s@host41-179-dynamic.7-87-r.retail.telecomitalia.it) Quit ()
[23:23] * mrjack_ (mrjack@office.smart-weblications.net) has joined #ceph
[23:25] * xarses (~andreww@64-79-127-122.static.wiline.com) has joined #ceph
[23:30] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[23:36] * DarkAce-Z is now known as DarkAceZ
[23:36] * Steki (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[23:37] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[23:37] * rongze (~rongze@117.79.232.205) has joined #ceph
[23:37] * jskinner (~jskinner@69.170.148.179) has joined #ceph
[23:42] * pvsa (~pvsa@184.121.113.82.net.de.o2.com) has joined #ceph
[23:45] * rongze (~rongze@117.79.232.205) Quit (Ping timeout: 480 seconds)
[23:45] * jskinner (~jskinner@69.170.148.179) Quit (Ping timeout: 480 seconds)
[23:47] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[23:47] * mschiff (~mschiff@port-13485.pppoe.wtnet.de) Quit (Remote host closed the connection)
[23:51] * pvsa (~pvsa@184.121.113.82.net.de.o2.com) Quit (Quit: pvsa)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.