#ceph IRC Log

IRC Log for 2014-08-07

Timestamps are in GMT/BST.

[0:01] * b0e (~aledermue@x2f277dd.dyn.telefonica.de) has joined #ceph
[0:04] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[0:05] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) has joined #ceph
[0:09] * brad_mssw (~brad@shop.monetra.com) Quit (Quit: Leaving)
[0:18] * dmsimard is now known as dmsimard_away
[0:19] * sarob_ (~sarob@ip-64-134-225-62.public.wayport.net) Quit (Remote host closed the connection)
[0:19] * sarob (~sarob@ip-64-134-225-62.public.wayport.net) has joined #ceph
[0:21] * ikrstic (~ikrstic@109-93-162-27.dynamic.isp.telekom.rs) Quit (Quit: Konversation terminated!)
[0:21] * ircolle is now known as ircolle-afk
[0:21] * bkopilov (~bkopilov@213.57.18.214) Quit (Ping timeout: 480 seconds)
[0:24] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[0:26] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Quit: Leaving.)
[0:27] * sarob (~sarob@ip-64-134-225-62.public.wayport.net) Quit (Ping timeout: 480 seconds)
[0:28] * primechuck (~primechuc@host-95-2-129.infobunker.com) Quit (Remote host closed the connection)
[0:29] * primechuck (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[0:30] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) has joined #ceph
[0:34] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[0:35] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[0:35] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[0:37] * primechuck (~primechuc@host-95-2-129.infobunker.com) Quit (Ping timeout: 480 seconds)
[0:38] * fsimonce (~simon@host225-92-dynamic.21-87-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[0:42] * rendar (~I@host163-182-dynamic.3-79-r.retail.telecomitalia.it) Quit ()
[0:45] <Anticimex> hmmm ceph.com unreachable for anyone else?
[0:46] <Kioob> +1 Anticimex
[0:46] <burley> http://www.isitdownrightnow.com/ceph.com.html
[0:47] <Anticimex> typical, i was just mirroring rpms :s
[0:48] <dmick> Anticimex: being worked
[0:48] <dmick> it was probably your fault :)
[0:48] <Anticimex> haha
[0:48] <Kioob> if you're searching packages, look at http://eu.ceph.com/ Anticimex
[0:48] <Anticimex> Kioob: no i want my mirror to work
[0:48] <runfromnowhere> I wonder if there's a way to make the ceph cookbook such that it doesn't hang all runs when ceph.com is down
[0:48] <Kioob> :)
[0:49] <Anticimex> Kioob: oh, well, actually, perhaps i get better speed from eu mirror there
[0:49] <Anticimex> Kioob: i should... or rather, ceph people should / could host a mirrorlist for yum for those two mirrors then
[0:51] <Kioob> mirroring all that stuff !? It probably requires a lot of storage space, how can we handle that ? ;)
[0:51] <Anticimex> not everything
[0:51] <Anticimex> only rpm-firefly and rpm-testing , both for rhel7
[0:51] <Anticimex> mirrorlists are yums way of letting you point a repository to multiple servers
[0:51] <Kioob> ok, it was just a (bad) joke :)
[0:51] <Anticimex> ahhh sorry, im tired :)
[0:52] <Anticimex> my humor seems to be turned off :(
[0:55] <dmick> Anticimex: if you have a proposal please suggest it. eu.ceph.com is the only mirror I'm aware of, and it's kinda volunteer, but if a mirrorlist would still help, by all means
[0:56] * lofejndif (~lsqavnbok@5IFAAAH98.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[0:56] <Anticimex> im making them now, makes most sense to store local anyway :)
[0:56] <Anticimex> i suppose yum caches them though
[0:56] <Anticimex> if theyre fetched from the server
[0:57] <Anticimex> not sure what order yum iterates over those lists either, but worst case using secondary mirror as fallback would still be quite fine..
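A minimal sketch of the mirrorlist setup Anticimex describes, assuming a locally stored list file; the file paths, repo layout and key URL below are illustrative, not an official Ceph-provided configuration:

    # /etc/yum/ceph-mirrorlist.txt -- one base URL per line, preferred mirror first
    http://eu.ceph.com/rpm-firefly/rhel7/x86_64/
    http://ceph.com/rpm-firefly/rhel7/x86_64/

    # /etc/yum.repos.d/ceph.repo
    [ceph]
    name=Ceph packages
    mirrorlist=file:///etc/yum/ceph-mirrorlist.txt
    enabled=1
    gpgcheck=1
    gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc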
[1:00] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) Quit (Ping timeout: 480 seconds)
[1:02] <Anticimex> ok did i just kill it again with my curl line, or was it never up here in between?
[1:02] <Anticimex> s/here in between/until now/ or sth
[1:02] <kraken> Anticimex meant to say: ok did i just kill it again with my curl line, or was it never up until now?
[1:02] <Anticimex> thanks kraken
[1:02] * kraken is overwhelmed by the muscular glorification of contribution
[1:02] <dmick> looking
[1:03] <Anticimex> i'm doing this from a machine behind a squid:
[1:03] <Anticimex> curl -x proxy:3128 -O 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
[1:03] <dmick> it's still having problems
[1:03] <Anticimex> k
[1:03] <dmick> something is causing apache to blow up on restart
[1:03] <Anticimex> perhaps wasn't me then
[1:05] * b0e (~aledermue@x2f277dd.dyn.telefonica.de) Quit (Quit: Leaving.)
[1:09] * bandrus1 (~oddo@216.57.72.205) has joined #ceph
[1:09] * bandrus (~oddo@216.57.72.205) Quit (Read error: Connection reset by peer)
[1:09] * wer_ (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Read error: Connection reset by peer)
[1:11] * danieagle (~Daniel@179.184.165.184.static.gvt.net.br) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[1:11] * oms101 (~oms101@p20030057EA24DB00EEF4BBFFFE0F7062.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:11] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[1:12] * zack_dolby (~textual@p8505b4.tokynt01.ap.so-net.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[1:20] * oms101 (~oms101@p20030057EA262C00EEF4BBFFFE0F7062.dip0.t-ipconnect.de) has joined #ceph
[1:22] <Kioob> Anticimex: the key can be found here too : https://raw.githubusercontent.com/ceph/ceph/master/keys/release.asc
[1:30] <Kioob> don't know if it can helps, since it's not a mirror, just a caching gateway : http://apt.daevel.fr/ceph/
[1:30] <Anticimex> mmm i found it on github
[1:30] <Anticimex> the mirrorlist worked great
[1:31] <Anticimex> got ~5MB/s from EU rather than sub-MB/s from us. (im in EU)
[1:31] <Kioob> ok
[1:40] * primechuck (~primechuc@173-17-128-36.client.mchsi.com) has joined #ceph
[1:50] <dmick> back up now
[1:50] <dmick> a bunch of things were hitting php that shouldn't have been
[1:51] * baylight (~tbayly@204.15.85.169) has left #ceph
[1:51] * yguang11 (~yguang11@2406:2000:ef96:e:a987:115f:a38f:25c1) has joined #ceph
[1:58] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) has left #ceph
[2:03] * via (~via@smtp2.matthewvia.info) Quit (Ping timeout: 480 seconds)
[2:03] <runfromnowhere> Hmm
[2:03] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) has joined #ceph
[2:03] <runfromnowhere> So the new version of the Ceph cookbook has an issue with Chef when installed via the Omnibus Installer
[2:04] <runfromnowhere> It attempts to run "require 'netaddr'" but the netaddr gem isn't available to Chef
[2:04] <runfromnowhere> Not sure if Chef packaging error or cookbook prereq error :(
[2:04] <kraken> ???_???
[2:04] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[2:05] <runfromnowhere> Well, let's learn together - testing a gem-based chef install and seeing if that has the right gem :)
[2:08] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[2:08] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Remote host closed the connection)
[2:09] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Quit: Leaving.)
[2:09] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[2:13] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Quit: leaving)
[2:14] * rweeks (~rweeks@50.141.85.14) has joined #ceph
[2:15] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[2:15] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[2:17] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[2:19] * TiCPU (~jeromepou@12.160.0.155) Quit (Ping timeout: 480 seconds)
[2:25] * rweeks (~rweeks@50.141.85.14) Quit (Ping timeout: 480 seconds)
[2:30] * yguang11 (~yguang11@2406:2000:ef96:e:a987:115f:a38f:25c1) Quit (Remote host closed the connection)
[2:31] * yguang11 (~yguang11@2406:2000:ef96:e:a987:115f:a38f:25c1) has joined #ceph
[2:45] <runfromnowhere> Positive motion - the recipe properly tries to install the gem, it's my silliness that I didn't notice it. Now to figure out why it doesn't believe in any of my monitors
[2:52] * joef (~Adium@2620:79:0:131:e8a4:2a19:e991:1273) Quit (Remote host closed the connection)
[2:54] * xarses (~andreww@12.164.168.117) Quit (Ping timeout: 480 seconds)
[2:55] * andreask (~andreask@91.224.48.154) has joined #ceph
[2:55] * ChanServ sets mode +v andreask
[2:55] * huangjun (~kvirc@219.140.143.160) has joined #ceph
[2:58] <huangjun> executing ./autogen.sh, it reports: http://pastebin.com/jkX9pPM1
[3:00] * tcos (~will@s11-241.rb.gh.centurytel.net) Quit (Read error: Operation timed out)
[3:01] <huangjun> after cp README.md README, the errors disappear
[3:04] * lucas1 (~Thunderbi@222.247.57.50) has joined #ceph
[3:06] * danieljh (~daniel@0001b4e9.user.oftc.net) Quit (Ping timeout: 480 seconds)
[3:09] * vz (~vz@122.178.204.139) has joined #ceph
[3:10] <andreask> does a ceph client that is writing data always have to wait until the next "filestore sync", so at most until the next "filestore max sync interval", to be able to read it again?
[3:12] * tcos (~will@s11-241.rb.gh.centurytel.net) has joined #ceph
[3:22] * LeaChim (~LeaChim@host86-159-115-162.range86-159.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:22] * diegows (~diegows@190.190.5.238) Quit (Ping timeout: 480 seconds)
[3:22] * via (~via@smtp2.matthewvia.info) has joined #ceph
[3:25] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[3:28] * allsystemsarego (~allsystem@79.115.170.35) Quit (Quit: Leaving)
[3:32] * andreask (~andreask@91.224.48.154) has left #ceph
[3:35] * yguang11 (~yguang11@2406:2000:ef96:e:a987:115f:a38f:25c1) Quit (Remote host closed the connection)
[3:36] * yguang11 (~yguang11@2406:2000:ef96:e:a987:115f:a38f:25c1) has joined #ceph
[3:44] * yguang11 (~yguang11@2406:2000:ef96:e:a987:115f:a38f:25c1) Quit (Ping timeout: 480 seconds)
[3:47] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[3:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:48] * yguang11 (~yguang11@vpn-nat.peking.corp.yahoo.com) has joined #ceph
[3:50] <sage> huangjun: fixed in latest master..
[3:52] * primechuck (~primechuc@173-17-128-36.client.mchsi.com) Quit (Remote host closed the connection)
[3:52] * primechuck (~primechuc@173-17-128-36.client.mchsi.com) has joined #ceph
[3:53] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[4:00] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[4:00] * primechuck (~primechuc@173-17-128-36.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[4:04] * zhaochao (~zhaochao@124.207.139.17) has joined #ceph
[4:08] * bandrus1 (~oddo@216.57.72.205) Quit (Quit: Leaving.)
[4:08] * scuttle|afk is now known as scuttlemonkey
[4:10] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[4:10] * angdraug (~angdraug@12.164.168.117) Quit (Quit: Leaving)
[4:20] * haomaiwa_ (~haomaiwan@223.223.183.114) Quit (Remote host closed the connection)
[4:20] * haomaiwang (~haomaiwan@203.69.59.199) has joined #ceph
[4:21] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has joined #ceph
[4:27] * tupper (~chatzilla@108-83-203-37.lightspeed.rlghnc.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[4:29] * haomaiwa_ (~haomaiwan@223.223.183.114) has joined #ceph
[4:30] * shang (~ShangWu@220-135-203-169.HINET-IP.hinet.net) has joined #ceph
[4:35] * haomaiwang (~haomaiwan@203.69.59.199) Quit (Ping timeout: 480 seconds)
[4:38] * sarob (~sarob@mobile-166-137-179-150.mycingular.net) has joined #ceph
[4:38] * sarob (~sarob@mobile-166-137-179-150.mycingular.net) Quit (Read error: Connection reset by peer)
[4:39] * sarob (~sarob@2601:9:1d00:1328:4ca6:3378:c792:e431) has joined #ceph
[4:46] * shang (~ShangWu@220-135-203-169.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[4:49] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[4:50] <huangjun> yes, it works
[4:55] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:56] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[5:00] * sarob (~sarob@2601:9:1d00:1328:4ca6:3378:c792:e431) Quit (Remote host closed the connection)
[5:00] * sarob (~sarob@2601:9:1d00:1328:4ca6:3378:c792:e431) has joined #ceph
[5:03] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Quit: Leaving.)
[5:08] * sarob (~sarob@2601:9:1d00:1328:4ca6:3378:c792:e431) Quit (Ping timeout: 480 seconds)
[5:10] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Read error: Connection reset by peer)
[5:10] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[5:18] * lupu (~lupu@86.107.101.214) Quit (Ping timeout: 480 seconds)
[5:22] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[5:23] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[5:23] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[5:24] * Vacum (~vovo@88.130.206.134) has joined #ceph
[5:26] * Nats (~natscogs@2001:8000:200c:0:c11d:117a:c167:16df) Quit (Read error: Connection reset by peer)
[5:29] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:31] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) Quit (Remote host closed the connection)
[5:31] * Vacum_ (~vovo@i59F79B73.versanet.de) Quit (Ping timeout: 480 seconds)
[5:32] * debian_ (~oftc-webi@116.212.137.13) Quit (Quit: Page closed)
[5:33] <Jakey> dmick: sage you here???
[5:33] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[5:34] * Nats (~natscogs@2001:8000:200c:0:c11d:117a:c167:16df) has joined #ceph
[5:35] <Jakey> how do i add more monitors using ceph-deploy
[5:35] <Jakey> correctly?
[5:35] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[5:35] <Jakey> i use ceph-deploy mon create but all the mon daemon ranks are -1
[5:35] <Jakey> what does that mean
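For adding a monitor to a cluster that already has a quorum, ceph-deploy has a separate "mon add" subcommand; "mon create" is intended for the hosts listed in "mon initial members". A sketch, with "node2" as a placeholder hostname (a rank of -1 generally means the daemon is running but has never joined the quorum):

    ceph-deploy mon add node2
    ceph quorum_status --format json-pretty   # check the new mon now has a real rank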
[5:36] * davidz (~Adium@cpe-23-242-12-23.socal.res.rr.com) Quit (Quit: Leaving.)
[5:57] * KevinPerks (~Adium@2606:a000:80a1:1b00:8550:145d:4b79:1890) Quit (Quit: Leaving.)
[6:01] * sarob (~sarob@2601:9:1d00:1328:9825:19ec:bf0d:cec4) has joined #ceph
[6:03] * kanagaraj (~kanagaraj@121.244.87.117) has joined #ceph
[6:03] * Nats (~natscogs@2001:8000:200c:0:c11d:117a:c167:16df) Quit (Quit: Leaving)
[6:03] * Nats (~natscogs@2001:8000:200c:0:c11d:117a:c167:16df) has joined #ceph
[6:03] <Jakey> hello!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[6:09] * sarob (~sarob@2601:9:1d00:1328:9825:19ec:bf0d:cec4) Quit (Ping timeout: 480 seconds)
[6:10] <tserong> Jakey, give it a few more hours - might be more noise when daylight hits europe, then later when it hits the US ;)
[6:13] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) has joined #ceph
[6:18] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:28] * jianingy (~jianingy@211.151.238.51) has joined #ceph
[6:30] * swami (~swami@49.32.0.58) has joined #ceph
[6:30] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[6:31] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) has joined #ceph
[6:32] * Aea (~aea@66.185.106.232) has joined #ceph
[6:32] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) Quit (Quit: Leaving.)
[6:32] * twx (~twx@rosamoln.org) Quit (Ping timeout: 480 seconds)
[6:34] <Aea> Hello, I'm trying to set up ceph-fs on ubuntu. When I try to start ceph-fuse with the following command: ceph-fuse -k /etc/ceph/ceph.client.admin.keyring -m all-003, all-004, all005 /opt/shared ... it will simply hang at "ceph-fuse[6524]: starting ceph client" with no further relevant stdout or messages in the ceph or system logs. Hostnames are created and exist, the keyring file exists, the /opt/shared directory exists. Any ideas?
[6:37] * vbellur (~vijay@122.166.147.17) Quit (Ping timeout: 480 seconds)
[6:41] * rdas (~rdas@110.227.45.63) has joined #ceph
[6:41] <Aea> Ah probably a lack of an MDS, my fault.
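For reference, a sketch of the same mount with the monitor list passed as a single comma-separated argument (the spaces after the commas above turn the later hostnames into separate arguments), and assuming at least one MDS is running, which CephFS needs before the mount can complete:

    ceph-fuse -k /etc/ceph/ceph.client.admin.keyring \
        -m all-003,all-004,all-005 /opt/shared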
[6:44] * bkopilov (~bkopilov@213.57.64.132) has joined #ceph
[6:46] * zerick (~eocrospom@190.187.21.53) Quit (Ping timeout: 480 seconds)
[6:50] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[6:50] * vz (~vz@122.178.204.139) Quit (Quit: Leaving...)
[6:59] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[7:01] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) has joined #ceph
[7:03] * lalatenduM (~lalatendu@121.244.87.117) has joined #ceph
[7:04] * vbellur (~vijay@121.244.87.117) has joined #ceph
[7:09] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[7:13] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) Quit (Quit: Leaving.)
[7:17] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[7:21] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[7:22] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) has joined #ceph
[7:23] * Nats_ (~Nats@2001:8000:200c:0:5038:e2f9:345e:cbf5) has joined #ceph
[7:24] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) Quit (Quit: Leaving.)
[7:26] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) has joined #ceph
[7:30] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[7:31] <flaf> Hello. When I add an osd in my cluster (I follow this http://ceph.com/docs/master/install/manual-deployment/#long-form), the command to initialize the OSD data directory prints some errors. Why? --> http://pastealacon.com/35172
[7:31] * Nats__ (~Nats@2001:8000:200c:0:f4b4:821:1f5a:23a8) Quit (Ping timeout: 480 seconds)
[7:32] <flaf> The command succeeds because the return value is 0. But what is the meaning of the error messages?
[7:33] * zhangdongmao (~zhangdong@203.192.156.9) has joined #ceph
[7:37] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[7:38] * twx (~twx@rosamoln.org) has joined #ceph
[7:39] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[7:41] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[7:44] * ashishchandra (~ashish@49.32.0.102) has joined #ceph
[7:50] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[7:52] * v2 (~vshankar@121.244.87.117) has joined #ceph
[7:52] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[7:52] * danieljh (~daniel@0001b4e9.user.oftc.net) has joined #ceph
[7:52] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[8:05] * ikrstic (~ikrstic@77-46-171-25.dynamic.isp.telekom.rs) has joined #ceph
[8:07] * morfair (~morfair@office.345000.ru) Quit (Remote host closed the connection)
[8:09] * reed (~reed@75-101-54-131.dsl.static.sonic.net) Quit (Ping timeout: 480 seconds)
[8:13] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) Quit (Ping timeout: 480 seconds)
[8:17] * thb (~me@0001bd58.user.oftc.net) has joined #ceph
[8:18] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) has joined #ceph
[8:22] * morfair (~morfair@office.345000.ru) has joined #ceph
[8:22] * sarob (~sarob@2601:9:1d00:1328:58f1:4b91:7a42:d99b) has joined #ceph
[8:30] * sarob (~sarob@2601:9:1d00:1328:58f1:4b91:7a42:d99b) Quit (Ping timeout: 480 seconds)
[8:34] <steveeJ> flaf: it checks for the keyring to see if one has to be created for the osd or not. i also find the keyword error irritating in such a case, since it seems to be perfectly normal behaviour.
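For context, the long-form step flaf is asking about is roughly the following (OSD id 0 is a placeholder); --mkkey first reports that it cannot open an existing keyring and then creates one, which is the harmless "error" steveeJ describes:

    sudo ceph-osd -i 0 --mkfs --mkkey
    sudo ceph auth add osd.0 osd 'allow *' mon 'allow profile osd' \
        -i /var/lib/ceph/osd/ceph-0/keyring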
[8:35] <longguang> hi
[8:39] * rendar (~I@host29-178-dynamic.19-79-r.retail.telecomitalia.it) has joined #ceph
[8:40] * lcavassa (~lcavassa@89.184.114.246) has joined #ceph
[8:40] * cookednoodles (~eoin@eoin.clanslots.com) Quit (Ping timeout: 480 seconds)
[8:42] <longguang> is there a formula to calculate the reliability of ceph?
[8:46] * morfair (~morfair@office.345000.ru) Quit (Quit: xchat 2.4.5)
[8:46] * morfair (~mf@office.345000.ru) has joined #ceph
[8:46] * cok (~chk@2a02:2350:18:1012:c59a:51de:f9cc:8c85) has joined #ceph
[8:51] * shang (~ShangWu@175.41.48.77) has joined #ceph
[8:58] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) Quit (Quit: Leaving.)
[8:59] * lupu (~lupu@86.107.101.214) has joined #ceph
[9:00] <dvanders_> longguang: reliability model here: https://github.com/ceph/ceph-tools/tree/master/models/reliability
[9:08] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) Quit (Ping timeout: 480 seconds)
[9:08] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) has joined #ceph
[9:11] <Jakey> tserong: what the fu u talking about
[9:11] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:13] * lucas1 (~Thunderbi@222.247.57.50) Quit (Quit: lucas1)
[9:13] * kalleh (~kalleh@37-46-175-162.customers.ownit.se) has joined #ceph
[9:15] * analbeard (~shw@support.memset.com) has joined #ceph
[9:17] <longguang> dvanders_: like Amazon, who can assure 11 9's.
[9:18] <longguang> how about ceph
[9:18] <dvanders_> that reliability model shows you how many 9's you can get with different replication level and disk reliabilities
[9:20] <tserong> Jakey, you were saying hello and asking questions earlier
[9:21] <tserong> i was trying to suggest you hang out for a bit and wait 'til more people are likely to be online and responding
[9:21] <tserong> that's all
[9:21] <tserong> :)
[9:22] <longguang> dvanders_: does the number of 9's mean how many years the data can survive?
[9:23] <dvanders_> you should try the calculator... it answers all these questions for you
[9:23] * sarob (~sarob@2601:9:1d00:1328:999f:f9ba:eef6:451b) has joined #ceph
[9:25] <longguang> ok
[9:25] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) has joined #ceph
[9:30] * haomaiwa_ (~haomaiwan@223.223.183.114) Quit (Remote host closed the connection)
[9:31] * haomaiwang (~haomaiwan@203.69.59.199) has joined #ceph
[9:31] * sarob (~sarob@2601:9:1d00:1328:999f:f9ba:eef6:451b) Quit (Ping timeout: 480 seconds)
[9:32] * ashishchandra (~ashish@49.32.0.102) Quit (Ping timeout: 480 seconds)
[9:33] * clamorti (~francois@net-93-65-135-170.cust.vodafonedsl.it) has joined #ceph
[9:34] * Sysadmin88 (~IceChat77@2.218.9.98) Quit (Quit: Some folks are wise, and some otherwise.)
[9:36] <loicd> ceph.com is out houkouonchi-home ?
[9:36] <loicd> joao: do you know who should be notified when that happens?
[9:39] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[9:43] * lczerner (~lczerner@nat-pool-brq-t.redhat.com) has joined #ceph
[9:46] * ashishchandra (~ashish@49.32.0.102) has joined #ceph
[9:46] * mourgaya (~kvirc@80.124.164.139) has joined #ceph
[9:46] * haomaiwa_ (~haomaiwan@223.223.183.114) has joined #ceph
[9:49] <Vacum> loicd: ceph.com is reachable for me. but it seems other regions in the world receive a 403 Forbidden (see a few days ago in the chatlog). and you can't reach it from a pure ipv6 network due to the authoritative dns server not being reachable on ipv6
[9:52] * haomaiwang (~haomaiwan@203.69.59.199) Quit (Ping timeout: 480 seconds)
[9:58] * lucas1 (~Thunderbi@222.247.57.50) has joined #ceph
[10:00] <absynth> works fine for me, Vacum
[10:01] <absynth> but yes, the fact that dreamhost's auth dnses have no ipv6 kinda sucks
[10:03] * kalleeh (~kalleh@37-46-175-162.customers.ownit.se) has joined #ceph
[10:04] <Vacum> especially as ceph.com does have a AAAA record :)
[10:07] * clamorti (~francois@net-93-65-135-170.cust.vodafonedsl.it) Quit (Quit: Quitte)
[10:07] * haomaiwa_ (~haomaiwan@223.223.183.114) Quit (Remote host closed the connection)
[10:08] * haomaiwang (~haomaiwan@124.248.208.2) has joined #ceph
[10:08] * kalleh (~kalleh@37-46-175-162.customers.ownit.se) Quit (Ping timeout: 480 seconds)
[10:11] <loicd> interesting
[10:11] <loicd> it works right now, houkouonchi-home sorry for the noise :-/
[10:12] <absynth> probably got migrated from a decent distro to Redhat. :P
[10:16] * lucas1 (~Thunderbi@222.247.57.50) Quit (Quit: lucas1)
[10:18] * dvanders_ (~dvanders@2001:1458:202:180::101:f6c7) Quit (Read error: Connection reset by peer)
[10:20] * leseb (~leseb@81-64-215-19.rev.numericable.fr) Quit (Quit: ZNC - http://znc.in)
[10:21] * leseb (~leseb@81-64-215-19.rev.numericable.fr) has joined #ceph
[10:23] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[10:24] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) has joined #ceph
[10:26] * vbellur (~vijay@121.244.87.117) Quit (Ping timeout: 480 seconds)
[10:32] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:34] * kalleh (~kalleh@37-46-175-162.customers.ownit.se) has joined #ceph
[10:36] <longguang> dvanders_: is there a reference value, approximate ?
[10:36] <longguang> how many 9
[10:37] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[10:37] * vmx (~vmx@dslb-084-056-015-098.084.056.pools.vodafone-ip.de) has joined #ceph
[10:39] * LeaChim (~LeaChim@host86-159-115-162.range86-159.btcentralplus.com) has joined #ceph
[10:40] * kalleeh (~kalleh@37-46-175-162.customers.ownit.se) Quit (Ping timeout: 480 seconds)
[10:44] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:45] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) Quit (Ping timeout: 480 seconds)
[10:49] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[10:52] * haomaiwa_ (~haomaiwan@223.223.183.114) has joined #ceph
[10:56] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) has joined #ceph
[10:57] * vbellur (~vijay@122.172.224.64) has joined #ceph
[11:00] * haomaiwang (~haomaiwan@124.248.208.2) Quit (Ping timeout: 480 seconds)
[11:02] * lucas1 (~Thunderbi@222.247.57.50) has joined #ceph
[11:03] * steki (~steki@85.222.179.216) has joined #ceph
[11:05] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[11:05] <tnt_> Anyone familiar with radosgw internals ? When will an object copy result in an optimized copy that doesn't copy data ?
[11:10] * BManojlovic (~steki@91.195.39.5) Quit (Ping timeout: 480 seconds)
[11:12] * steki (~steki@85.222.179.216) Quit (Ping timeout: 480 seconds)
[11:12] * cookednoodles (~eoin@eoin.clanslots.com) has joined #ceph
[11:16] * lucas1 (~Thunderbi@222.247.57.50) Quit (Quit: lucas1)
[11:19] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[11:20] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[11:24] * sarob (~sarob@2601:9:1d00:1328:4d4d:218a:57f7:8f74) has joined #ceph
[11:27] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[11:32] * sarob (~sarob@2601:9:1d00:1328:4d4d:218a:57f7:8f74) Quit (Ping timeout: 480 seconds)
[11:33] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[11:36] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[11:36] * lucas1 (~Thunderbi@222.247.57.50) has joined #ceph
[11:37] * lucas1 (~Thunderbi@222.247.57.50) Quit ()
[11:41] <nizedk> what is the correct way to entirely shut down - and start up - a ceph cluster (after you have stopped IO to the cluster), in order to not have a lot of data reshuffle starting?
[11:41] <nizedk> stop the monitors first and start the monitors last?
[11:43] <Vacum> nizedk: you can set the global noout flag. then shutdown the mons, then the osds. for startup you need to start the mons first though, as the osds will need to connect them
[11:50] <nizedk> ok, thanks
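A rough sequence for the shutdown Vacum describes, assuming the sysvinit "service ceph" wrapper (the exact service commands differ under upstart or systemd):

    ceph osd set noout        # stopped OSDs won't be marked out, so no rebalancing
    service ceph stop         # on each node; mons and osds go down
    # ... maintenance / power off ...
    service ceph start mon    # on startup, monitors come back first
    service ceph start osd    # then the OSDs, which need the mons to rejoin
    ceph osd unset noout      # clear the flag once everything is up and in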
[11:58] * lalatenduM (~lalatendu@121.244.87.117) Quit (Quit: Leaving)
[11:59] * lucas1 (~Thunderbi@222.240.148.130) has joined #ceph
[12:07] * lucas1 (~Thunderbi@222.240.148.130) Quit (Quit: lucas1)
[12:10] * joao|lap (~JL@78.29.191.247) has joined #ceph
[12:10] * ChanServ sets mode +o joao|lap
[12:15] * pieterl (~pieterl@194.134.32.8) Quit (Ping timeout: 480 seconds)
[12:17] * sz0 (~sz0@94.55.197.185) has joined #ceph
[12:19] * t0rn (~ssullivan@c-68-62-1-186.hsd1.mi.comcast.net) has joined #ceph
[12:21] * lczerner (~lczerner@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[12:25] * sarob (~sarob@2601:9:1d00:1328:819f:4faf:7d00:5538) has joined #ceph
[12:27] * huangjun (~kvirc@219.140.143.160) Quit (Ping timeout: 480 seconds)
[12:33] * sarob (~sarob@2601:9:1d00:1328:819f:4faf:7d00:5538) Quit (Ping timeout: 480 seconds)
[12:37] * ade (~abradshaw@h31-3-227-203.host.redstation.co.uk) has joined #ceph
[12:39] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[12:41] <flaf> steveeJ: Ah ok, so it's normal. Indeed, it looks like error messages. Thx for your answer.
[12:42] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) has joined #ceph
[12:45] * mivaho (~quassel@2001:983:eeb4:1:c0de:69ff:fe2f:5599) has joined #ceph
[12:46] * Jakey (uid1475@id-1475.uxbridge.irccloud.com) Quit (Quit: Connection closed for inactivity)
[12:48] * zack_dol_ (~textual@e0109-114-22-13-4.uqwimax.jp) has joined #ceph
[12:48] * vbellur (~vijay@122.172.224.64) Quit (Read error: Connection reset by peer)
[12:48] * vbellur (~vijay@122.172.224.64) has joined #ceph
[12:50] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) Quit (Read error: Connection reset by peer)
[12:55] * sjm (~sjm@108.53.250.33) Quit (Remote host closed the connection)
[12:58] * zack_dol_ (~textual@e0109-114-22-13-4.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[13:00] * lczerner (~lczerner@nat-pool-brq-t.redhat.com) has joined #ceph
[13:01] * lucas1 (~Thunderbi@222.240.148.130) has joined #ceph
[13:02] * haomaiwa_ (~haomaiwan@223.223.183.114) Quit (Remote host closed the connection)
[13:02] * haomaiwang (~haomaiwan@124.248.208.2) has joined #ceph
[13:04] * lucas1 (~Thunderbi@222.240.148.130) Quit ()
[13:07] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) has joined #ceph
[13:11] * zhaochao (~zhaochao@124.207.139.17) has left #ceph
[13:12] * i_m (~ivan.miro@gbibp9ph1--blueice4n2.emea.ibm.com) has joined #ceph
[13:14] * Defcon_102KALI_LINUX (~Defcon_10@46.191.159.193) has joined #ceph
[13:16] * Defcon_102KALI_LINUX (~Defcon_10@46.191.159.193) Quit ()
[13:19] <houkouonchi-home> sage: Saw tons of POST's to xmlrpc in a short period of time which was causing php crazyness on the ceph.com machine again and making it go OOM
[13:19] <houkouonchi-home> I chmoded the files to 000 for now
[13:22] <zack_dolby> We are having Ceph event in Tokyo right now. Taking place at RedHat building:)
[13:23] * jianingy (~jianingy@211.151.238.51) Quit (Quit: WeeChat 0.4.2)
[13:23] * ade (~abradshaw@h31-3-227-203.host.redstation.co.uk) Quit (Quit: Too sexy for his shirt)
[13:24] <houkouonchi-home> zack_dolby: I miss Tokyo, wish I could be there :)
[13:26] * sarob (~sarob@2601:9:1d00:1328:7df9:f141:a3f1:306d) has joined #ceph
[13:26] <houkouonchi-home> zack_dolby: this? http://sssslide.com/www.slideshare.net/VirtualTech-JP/rhel-osp ?
[13:26] <zack_dolby> houkouonchi-home When you are back in tokyo, let's hang out!
[13:27] <houkouonchi-home> I haven't lived there since 2009 but I want to go back sometime?
[13:27] <zack_dolby> Yeah, http://ljstudy.connpass.com/event/7291/
[13:28] <zack_dolby> houkouonchi-home I am getting ready to give a Lightning Talk about introducing the Japan Ceph User Group tonight :)
[13:29] <absynth> houkouonchi-home: do you have a WP installation on ceph.com?
[13:29] <houkouonchi-home> absynth: yes
[13:29] <absynth> houkouonchi-home: update to 3.9.2 then
[13:29] <absynth> DoS exploit
[13:29] <houkouonchi-home> yeah. I will talk to our web-guy
[13:29] <absynth> http://www.breaksec.com/?p=6362
[13:33] <houkouonchi-home> earlier today too many 404 requests for the debian repo's we have (for like translation files and stuff) were causing OOM too
[13:33] <houkouonchi-home> no luck today
[13:33] * ircolle-afk is now known as ircolle
[13:34] <absynth> how can a 404 cause OOM?
[13:34] * sarob (~sarob@2601:9:1d00:1328:7df9:f141:a3f1:306d) Quit (Ping timeout: 480 seconds)
[13:34] <houkouonchi-home> well 404 is processed by php due to mod_rewrite
[13:34] <houkouonchi-home> and if you get a lot in a short period of time because the repo's are being hammered
[13:34] <absynth> see your mistake there? ;)
[13:34] <houkouonchi-home> rather than a low-memory, quick apache 404 you get a php-generated 404 :)
[13:34] <houkouonchi-home> yeah well I fixed the mod_rewrite so it wouldn't happen :)
[13:35] <houkouonchi-home> default config causes that though
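A hedged sketch of the kind of rewrite exclusion being described, assuming a stock WordPress .htaccess and repos served under /debian-* and /rpm-* (the actual paths and rule used on ceph.com may differ):

    RewriteEngine On
    # let apache answer repo requests (and their 404s) directly instead of php
    RewriteRule ^(debian|rpm)- - [L]
    # stock WordPress fall-through
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]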
[13:35] * shang (~ShangWu@175.41.48.77) Quit (Ping timeout: 480 seconds)
[13:38] * laurie (~laurie@45.164.250.195.sta.estpak.ee) has joined #ceph
[13:39] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[13:39] * pieterl (~pieterl@194.134.32.8) has joined #ceph
[13:42] * laurie_ (~laurie@195.50.209.94) has joined #ceph
[13:43] <laurie_> Hi! I am having a problem with flapping OSD-s. Seems that if there is any network contention, one of the OSD-s will struggle for hours after all networking issues have been resolved.
[13:43] * dmsimard_away is now known as dmsimard
[13:43] <laurie_> It gets marked as down and then it calls out "I was wrongly marked down", and it goes on forever?
[13:44] <laurie_> I have to restart the OSD process to stabilize the situation (with noout, nodown)
[13:45] <laurie_> I am surprised that the OSD-s still struggle when network is already back up
[13:46] * laurie (~laurie@45.164.250.195.sta.estpak.ee) Quit (Ping timeout: 480 seconds)
[13:46] <absynth> mh... that points in the general direction of some underlying network issues between the mons and the osd in question
[13:47] <absynth> are you positive that there's no issues there?
[13:48] <laurie_> Yup. We had some network contention and understandably, the cluster started having issues
[13:48] <laurie_> But 12hrs later it is still flapping and consuming resources
[13:48] <absynth> no, i meant _ongoing_ issues
[13:48] <laurie_> nope
[13:48] <kraken> http://i.imgur.com/iSm1aZu.gif
[13:48] <laurie_> network is fine now
[13:50] <laurie_> https://www.dropbox.com/s/fqpfzne843w7vm8/graph_image.png
[13:50] * KevinPerks (~Adium@2606:a000:80a1:1b00:b58a:1264:a43a:35e3) has joined #ceph
[13:50] <laurie_> And this seems to get progressively worse until I lose my nerve and restart the process
[13:51] <laurie_> Seems like the OSD cannot "get over it"
[13:58] <Kioob`Taff> Hi. Strange behavior (for me at least) : if I use ceph-disk-prepare --journal-dev --data-dev /dev/sda5 /dev/sda4 && ceph-disk-activate /dev/sda5
[13:58] <Kioob`Taff> the OSD start, without using sda4 as a journal
[13:59] <Kioob`Taff> (it uses file-based journal)
[13:59] <Kioob`Taff> Is it the expected behaviour ?
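One way to check which journal an activated OSD actually ended up with: the OSD data directory keeps a "journal" entry, which is a symlink when a separate device is in use:

    ls -l /var/lib/ceph/osd/ceph-*/journal
    # a regular file here means a file-based journal; a symlink pointing at
    # /dev/sda4 (or its by-partuuid name) means the separate journal device is used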
[14:01] * v2 (~vshankar@121.244.87.117) Quit (Quit: Leaving)
[14:02] <absynth> laurie_: does the memory usage for that OSD process look unusual, i.e. is it leaking?
[14:04] * yuriw1 (~Adium@c-76-126-35-111.hsd1.ca.comcast.net) has joined #ceph
[14:05] <laurie_> absynth: the memory usage seems stable throughout
[14:05] * huangjun (~kvirc@117.151.40.225) has joined #ceph
[14:05] <laurie_> it happened in the middle of a scrub, so it is elevated, but still okay
[14:06] * vmx (~vmx@dslb-084-056-015-098.084.056.pools.vodafone-ip.de) Quit (Quit: Leaving)
[14:06] <laurie_> absynth: https://www.dropbox.com/s/r3bl5z9l7xiwtwu/graph_image2.png
[14:07] <laurie_> this is the "problematic node"
[14:07] <laurie_> other nodes have a similar graph, but mirrored, i.e inbound is higher
[14:08] <laurie_> I reckon it is recovery traffic, but why is it getting higher in such a linear progression ?
[14:08] <absynth> uhm
[14:08] <absynth> that is very, very unusual traffic
[14:09] <absynth> recovery traffic is usually very random and doesn't follow a pattern, sawtooth, triangle or whatnot
[14:09] <absynth> at least on my cluster it is like that
[14:09] <laurie_> yup
[14:09] <absynth> can you use iptraf to check where the traffic goes?
[14:09] <laurie_> I already terminated the traffic by restarting the OSD that went crazy
[14:10] <laurie_> by the "pattern" it definitely is exchanged between the nodes in the cluster
[14:10] <absynth> hm
[14:10] <laurie_> presumably OSD-s
[14:10] <absynth> that's something for the ceph people, i.e. nhm or dmick or sage or joao or so
[14:11] <laurie_> https://www.dropbox.com/s/bgoshdnnvgf0ayh/graph_image3.png
[14:11] <laurie_> compare with the previous graph
[14:11] <laurie_> alright, thanks!
[14:11] <laurie_> I will stick around and hope to talk to them
[14:11] * yuriw (~Adium@c-76-126-35-111.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[14:13] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[14:15] * joao|lap (~JL@78.29.191.247) Quit (Ping timeout: 480 seconds)
[14:15] * marrusl (~mark@209-150-43-182.c3-0.wsd-ubr2.qens-wsd.ny.cable.rcn.com) has joined #ceph
[14:16] * zack_dolby (~textual@e0109-114-22-13-4.uqwimax.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[14:16] * baylight (~tbayly@204.15.85.169) has joined #ceph
[14:19] <nhm> laurie_: whoa, interesting. And everything can talk to everything else properly once the network problems are over?
[14:19] <laurie_> seems so
[14:19] <laurie_> once i
[14:19] <laurie_> once I add nodown and noout the flapping stops
[14:20] * lalatenduM (~lalatendu@121.244.87.117) has joined #ceph
[14:20] <nhm> do you have any kind of continued heartbeat timeouts or anything?
[14:20] <laurie_> but the strange traffic continues..
[14:20] * dvanders (~dvanders@dvanders-hpi5.cern.ch) has joined #ceph
[14:20] <laurie_> nope
[14:20] <kraken> http://i.imgur.com/c4gTe5p.gif
[14:20] <absynth> nope
[14:20] <kraken> http://i.minus.com/iUgVCKwjISSke.gif
[14:20] <absynth> nope
[14:20] <kraken> http://i.imgur.com/zCtbl.gif
[14:21] <absynth> nope
[14:21] <kraken> http://i.imgur.com/foEHo.gif
[14:22] <laurie_> nhm: could I send you the logs for the period?
[14:22] * kanagaraj (~kanagaraj@121.244.87.117) Quit (Quit: Leaving)
[14:23] <nhm> laurie_: can you reproduce easily? We probably need increased logging
[14:24] <laurie_> 2014-08-07 06:33:55.040287 7fa3643bc700 0 log [WRN] : map e5508 wrongly marked me down
[14:24] <laurie_> 2014-08-07 06:34:28.074442 7fa370501700 0 -- x.x.x.A:6804/153005334 >> x.x.x.A:6802/2029205 pipe(0x7fa3a6293480 sd=19 :59416 s=2 pgs=4457 cs=1 l=0 c=0x7fa38
[14:24] <laurie_> 7566100).fault with nothing to send, going to standby
[14:24] <laurie_> 2014-08-07 06:34:28.074573 7fa365e5b700 0 -- x.x.x.A:6804/153005334 >> x.x.x.B:6801/1001558 pipe(0x7fa3a6293200 sd=22 :37293 s=2 pgs=518578 cs=1 l=0 c=0x7fa
[14:24] <laurie_> 37e6ae940).fault with nothing to send, going to standby
[14:24] <laurie_> 2014-08-07 06:34:28.074899 7fa356351700 0 -- x.x.x.A:6804/153005334 >> x.x.x.B:6806/5001413 pipe(0x7fa3b81cb980 sd=27 :49725 s=2 pgs=484797 cs=1 l=0 c=0x7fa
[14:24] <laurie_> 381aee580).fault with nothing to send, going to standby
[14:24] <laurie_> this goes on and on
[14:24] <laurie_> and after 3 "going to standby", it again gets marked as down
[14:24] <nizedk> kill it? remove it from crush and auth? deploy new?
[14:25] <laurie_> I cannot reproduce
[14:25] <laurie_> it seems that some random network issue causes it
[14:25] <laurie_> which logging levels do you reccommend ?
[14:26] <nhm> laurie_: what version of ceph?
[14:26] <laurie_> ceph version 0.79 (4c2d73a5095f527c3a2168deb5fa54b3c8991a6e)
[14:26] * sarob (~sarob@2601:9:1d00:1328:e4d8:6822:5a75:d500) has joined #ceph
[14:27] <laurie_> I will increase logging and wait for some more issues a couple of days..
[14:28] <nhm> laurie_: oh! you might want to upgrade. That's a development release.
[14:28] <laurie_> it was bundled with Ubuntu..
[14:28] * rdas (~rdas@110.227.45.63) Quit (Quit: Leaving)
[14:29] <laurie_> from Ubuntu repos I mean
[14:29] <nhm> laurie_: that's unfortunate. :/
[14:29] <laurie_> hmm
[14:29] <nhm> laurie_: the newest stable release is 0.80.5
[14:29] <laurie_> okay, I will start upgrading for sure then
[14:29] * yabalu (~yabalu007@2001:1680:10:300:7941:9404:c484:f400) Quit (Ping timeout: 480 seconds)
[14:29] <laurie_> maybe it will sort it out
[14:30] <nhm> laurie_: there were a ton of fixes landing right around then, both leading up to the 0.80 release and after in the point releases.
[14:30] <laurie_> sounds good!
[14:30] <laurie_> http://packages.ubuntu.com/trusty/admin/ceph
[14:30] <nhm> fwiw, there's a similar looking bug report for 0.79 here: http://tracker.ceph.com/issues/8098
[14:30] <laurie_> Ubuntu is still on the same 0.79
[14:31] <nhm> this is 14.04?
[14:31] <laurie_> yes! this looks exacly like my problem
[14:31] <laurie_> yup
[14:31] <laurie_> Trusty
[14:32] <absynth> 0.80.1-0ubuntu1.1 (/var/lib/apt/lists/de.archive.ubuntu.com_ubuntu_dists_trusty-updates_main_binary-amd64_Packages)
[14:32] <absynth> that's the current version in trusty
[14:32] <absynth> or, in 14.04 at least
[14:32] <absynth> i'm not good with names
[14:32] <nhm> laurie_: that's why I was wondering about thread heartbeat timeouts (at the very end of the osd.51 log in that bug report)
[14:33] <nhm> sorry, those are general heartbeat timeouts, not thread
[14:33] <absynth> (i still think that cuttlefish should have been named cthulhu though)
[14:33] <nhm> absynth: :D
[14:33] <laurie_> one moment, will grep some logs
[14:33] <nhm> absynth: huh, I wonder why they aren't on a newer 0.80 point release.
[14:34] <nhm> absynth: but what do I know, fedora seems to be on 0.81, it all seems slightly random.
[14:34] * bjornar (~bjornar@ns3.uniweb.no) Quit (Read error: No route to host)
[14:35] * sarob (~sarob@2601:9:1d00:1328:e4d8:6822:5a75:d500) Quit (Ping timeout: 480 seconds)
[14:35] <laurie_> 2014-08-07 12:54:46.648267 7fa366a60700 10 osd.1 7212 reset_heartbeat_peers
[14:36] <laurie_> that's the only reference to heartbeat in my logs
[14:36] <nhm> laurie_: btw, if it's a memory/swap related timeout issue, I've had success in the past decreasing /proc/sys/vm/swappiness to 10 to reduce the likelihood that the OSD processes hit swap.
[14:36] <laurie_> but the rest looks similar
[14:36] <laurie_> cat /proc/sys/vm/swappiness
[14:36] <laurie_> 1
[14:36] <nhm> oh, I guess you've already got that covered. :)
[14:37] <nhm> did you do that yourself? ubuntu's default is 60 afaik.
[14:37] <laurie_> yes
[14:37] * vbellur (~vijay@122.172.224.64) Quit (Read error: Connection reset by peer)
[14:37] <nhm> excellent. :)
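For reference, the tunable nhm mentions; the sysctl.d file name below is just an example:

    sysctl -w vm.swappiness=10                                      # takes effect immediately
    echo 'vm.swappiness = 10' > /etc/sysctl.d/60-swappiness.conf    # persists across reboots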
[14:38] * gregmark (~Adium@68.87.42.115) has joined #ceph
[14:38] <laurie_> If I can afford to bring the whole cluster down, that would be the safest upgrade route, right?
[14:39] <laurie_> bring it all down, upgrade all daemons and restart the cluster
[14:39] <nizedk> I'd love a monitor-controlled "tell everything to enter the planned shutdown sequence" command, and a "wait until every OSD is up, in the post-shutdown state, and bring everything online again" command for the storage cluster. The reason being that I have to set up a cluster that will be shut down completely, repeatedly. :-)
[14:39] * sjm (~sjm@172.56.36.97) has joined #ceph
[14:39] <nhm> probably, though I think some effort has been made to make incremental upgrade work (I honestly have no idea how well this gets tested)
[14:40] <laurie_> nhm: thank you very much for sparing your time looking into it. I hope the upgrade resolves the issue!
[14:42] <nhm> nizedk: My performance testing do that if you replace "shutdown" with "destroy the cluster" ;)
[14:42] <nhm> sorry, performance testing scripts
[14:43] <nhm> laurie_: at the very least if it still happens, at least it's on a current release.
[14:43] <nhm> laurie_: and we can get some folks who know what they are doing looking at it. ;)
[14:43] <nizedk> nhm: You are not a real sysadmin until you've destroyed a few petabytes of customer data, right? ;-)
[14:43] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[14:44] <laurie_> Yes, indeed!
[14:44] <nhm> nizedk: I used to admin about 900TB of lustre. I never lost anything major, but minor data loss was like a monthly occurrence.
[14:46] <nhm> nizedk: this was for supercomputer scratch space, and it was faster to refund the computational resource costs to recreate the data than to try to recover it.
[14:46] * baylight (~tbayly@204.15.85.169) Quit (Ping timeout: 480 seconds)
[14:46] <nizedk> nhm: "Only a distributed 3% total chunks of all the rbd images were lost. Who cares about 3% of anything?!!111"
[14:46] <nhm> nizedk: :D
[14:47] <ircolle> I'd take 3% of $1billion
[14:47] * lalatenduM (~lalatendu@121.244.87.117) Quit (Quit: Leaving)
[14:47] <nizedk> ircolle: *Boss-voice* Stop listening to what I say, understand what I mean!
[14:48] <darkfader> that should like what AWS said after their outage
[14:48] <nizedk> ircolle *left pinky to chin* 1 BILLION dollars!
[14:48] <darkfader> "only affected 0.09% of data"
[14:49] * lalatenduM (~lalatendu@121.244.87.117) has joined #ceph
[14:50] <ircolle> darkfader - exactly - kinda like a "minimal heart attack" is only minimal if it happens to someone else
[14:50] <nizedk> When you start the post mortem description with "It's actually quite interesting|funny..." - it isn't. :-)
[14:50] * rwheeler (~rwheeler@173.48.207.57) Quit (Quit: Leaving)
[14:51] <nizedk> small time pregnant... a little dead... almost innocent...
[14:51] * vbellur (~vijay@122.167.208.126) has joined #ceph
[14:52] <nizedk> How many in here with an ICE subscription?
[14:53] * laurie_ (~laurie@195.50.209.94) Quit ()
[14:55] * vbellur (~vijay@122.167.208.126) Quit (Read error: Connection reset by peer)
[15:01] * primechuck (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[15:08] * brad_mssw (~brad@shop.monetra.com) has joined #ceph
[15:11] * vbellur (~vijay@122.167.201.111) has joined #ceph
[15:26] * yguang11 (~yguang11@vpn-nat.peking.corp.yahoo.com) Quit (Remote host closed the connection)
[15:27] * yguang11 (~yguang11@vpn-nat.peking.corp.yahoo.com) has joined #ceph
[15:27] * ashishchandra (~ashish@49.32.0.102) Quit (Quit: Leaving)
[15:27] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) has joined #ceph
[15:32] * ircolle is now known as ircolle-afk
[15:35] * yguang11 (~yguang11@vpn-nat.peking.corp.yahoo.com) Quit (Ping timeout: 480 seconds)
[15:35] * sz0 (~sz0@94.55.197.185) Quit ()
[15:35] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[15:35] * mgarcesMZ (~mgarces@5.206.228.5) has joined #ceph
[15:35] <mgarcesMZ> hi there
[15:36] <mgarcesMZ> hi cookednoodles
[15:36] <mgarcesMZ> small question
[15:37] <mgarcesMZ> I followed the easy install, using ceph-deploy, but now I am trying to do it manually, also following the tutorial
[15:37] <mgarcesMZ> I am doing this in centos7
[15:38] <mgarcesMZ> when I do sudo /etc/init.d/ceph start mon.node1 it fails..
[15:38] <mgarcesMZ> usually you start services using 'systemctl start ceph.service'
[15:38] <mgarcesMZ> when I do 'ceph osd lspools' he complains about the keys (even though I have created them)
[15:38] <mgarcesMZ> can you help me?
[15:39] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[15:42] * sz0 (~sz0@94.55.197.185) has joined #ceph
[15:45] * baylight (~tbayly@204.15.85.169) has joined #ceph
[15:48] * ganders (~root@200-127-158-54.net.prima.net.ar) has joined #ceph
[15:49] * hufman (~hufman@cpe-184-58-235-28.wi.res.rr.com) has joined #ceph
[15:51] <hufman> hello!!
[15:52] <hufman> does anyone know how i would get disk usage stats for each pool and rbd?
[15:52] <hufman> i can total up the provisioned size and multiply by the replica size, but how do i factor in snapshots?
[15:53] * jianingy_afk (~jianingy@li653-173.members.linode.com) Quit (Remote host closed the connection)
[15:55] <cookednoodles> mgarcesMZ, I have no idea
[15:55] * jianingy_afk (~jianingy@li653-173.members.linode.com) has joined #ceph
[15:55] <mgarcesMZ> :(
[15:55] <mgarcesMZ> going back to easy way
[15:56] <hufman> ah ha, rados df shows many fun statistics, awesome!
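Both of these cover the per-pool side of hufman's question; snapshots are copy-on-write at the object level, so only objects written since the snapshot consume extra space, which is why provisioned size times replica count overestimates:

    rados df          # per-pool objects, KB used and clone counts
    ceph df detail    # cluster-wide and per-pool usage as the monitors account it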
[15:56] <cookednoodles> I dont use centos :(
[15:56] <mgarcesMZ> ubuntu?
[15:56] <mgarcesMZ> we just have centos here
[15:57] <mgarcesMZ> cant switch to ubuntu
[15:57] <cookednoodles> I use ubuntu yep
[15:57] <mgarcesMZ> perhaps in a few months this is smoother on rhel/centos 7
[15:57] <mgarcesMZ> since redhat is envolved now
[15:58] <cookednoodles> what happens if you give ceph the key file ?
[15:59] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) has joined #ceph
[16:05] * ircolle-afk (~Adium@2601:1:a580:145a:1521:a6ec:445c:1933) Quit (Quit: Leaving.)
[16:05] <mgarcesMZ> cookednoodles: how? in the init script?
[16:06] <cookednoodles> well manually for now
[16:06] <cookednoodles> but I think there is a flag for it
[16:06] <mgarcesMZ> did not try that
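The flag cookednoodles is thinking of does exist on the ceph CLI; a quick sketch:

    ceph -k /etc/ceph/ceph.client.admin.keyring -n client.admin osd lspools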
[16:06] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[16:06] * baylight (~tbayly@204.15.85.169) Quit (Ping timeout: 480 seconds)
[16:06] <mgarcesMZ> the docs say to /etc/init.d/ceph start mon.node1
[16:06] * sjm (~sjm@172.56.36.97) Quit (Ping timeout: 480 seconds)
[16:07] <mgarcesMZ> but in centos7, the init script does not understand that
[16:07] <mgarcesMZ> now I am doing a mix of manual and easy install :)
[16:08] * TiCPU (~jeromepou@12.160.0.155) has joined #ceph
[16:08] * lalatenduM (~lalatendu@121.244.87.117) Quit (Quit: Leaving)
[16:08] <cookednoodles> errr
[16:09] <cookednoodles> isn't it just "start" with no ceph infront ?
[16:10] * sjm (~sjm@172.56.36.97) has joined #ceph
[16:18] * swami (~swami@49.32.0.58) Quit (Quit: Leaving.)
[16:19] * baylight (~tbayly@204.15.85.169) has joined #ceph
[16:22] <mgarcesMZ> nope
[16:22] <kraken> http://i.imgur.com/xKYs9.gif
[16:22] <mgarcesMZ> in the manual it says otherwise
[16:22] * sjm (~sjm@172.56.36.97) Quit (Ping timeout: 480 seconds)
[16:22] <mgarcesMZ> question
[16:22] <mgarcesMZ> when I add an extra monitor
[16:23] <mgarcesMZ> do I have to change the ceph.conf, `mon initial members`?
[16:23] <mgarcesMZ> when it finishes adding the monitor, it complains it is not part of the `mon initial members`
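For what it's worth, "mon initial members" only matters when the cluster first forms a quorum; a monitor added later is injected into the live monmap instead, e.g. (hostname and address are placeholders):

    ceph mon add node2 192.168.0.12:6789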
[16:23] * diegows (~diegows@190.190.5.238) has joined #ceph
[16:25] <mgarcesMZ> cookednoodles: the manual says: 'sudo /etc/init.d/ceph start mon.node1'
[16:26] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[16:27] * ircolle (~Adium@mobile-166-147-083-005.mycingular.net) has joined #ceph
[16:28] * sarob (~sarob@2601:9:1d00:1328:564:e42f:818c:6799) has joined #ceph
[16:29] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[16:33] * mgarcesMZ (~mgarces@5.206.228.5) Quit (Ping timeout: 480 seconds)
[16:36] * sarob (~sarob@2601:9:1d00:1328:564:e42f:818c:6799) Quit (Ping timeout: 480 seconds)
[16:36] * joao|lap (~JL@78.29.191.247) has joined #ceph
[16:36] * ChanServ sets mode +o joao|lap
[16:37] * cok (~chk@2a02:2350:18:1012:c59a:51de:f9cc:8c85) Quit (Quit: Leaving.)
[16:38] * kalleh (~kalleh@37-46-175-162.customers.ownit.se) Quit (Read error: Connection reset by peer)
[16:39] * kalleh (~kalleh@37-46-175-162.customers.ownit.se) has joined #ceph
[16:39] * mgarcesMZ (~mgarces@5.206.228.5) has joined #ceph
[16:39] <mgarcesMZ> back again
[16:39] <mgarcesMZ> someone kicked the fiber that connects this building to the datacenter
[16:39] <mgarcesMZ> amazing
[16:48] * vmx (~vmx@dslb-084-056-015-098.084.056.pools.vodafone-ip.de) has joined #ceph
[16:50] * danieagle (~Daniel@179.184.165.184.static.gvt.net.br) has joined #ceph
[16:58] <bloodice> probably time to add more protection on that cable
[16:59] * baylight (~tbayly@204.15.85.169) Quit (Ping timeout: 480 seconds)
[17:00] <mgarcesMZ> bloodice: the problem is.. they are installing a new one, also the failover wifi link, and some guy unplugged the cable, thinking it was the old cable...
[17:00] <mgarcesMZ> so no primary link and no failover
[17:00] <mgarcesMZ> urray
[17:01] <mgarcesMZ> when they are finished with the job, I bet no one will trip on the fiber :)
[17:01] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[17:04] * pressureman (~pressurem@62.217.45.26) Quit (Remote host closed the connection)
[17:05] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[17:05] * bandrus (~oddo@216.57.72.205) has joined #ceph
[17:06] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:08] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[17:09] * mourgaya (~kvirc@80.124.164.139) Quit (Quit: KVIrc 4.1.3 Equilibrium http://www.kvirc.net/)
[17:10] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz...)
[17:13] * sarob (~sarob@2601:9:1d00:1328:0:c198:9315:9de1) has joined #ceph
[17:13] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[17:14] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[17:20] * kalleh (~kalleh@37-46-175-162.customers.ownit.se) Quit (Ping timeout: 480 seconds)
[17:20] <seapasulli> hardy har
[17:23] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[17:23] * sjm (~sjm@12.130.117.134) has joined #ceph
[17:24] * sarob (~sarob@2601:9:1d00:1328:0:c198:9315:9de1) Quit (Remote host closed the connection)
[17:24] * sarob (~sarob@2601:9:1d00:1328:0:c198:9315:9de1) has joined #ceph
[17:30] * lalatenduM (~lalatendu@121.244.87.117) has joined #ceph
[17:30] * adamcrume (~quassel@50.247.81.99) has joined #ceph
[17:32] * huangjun (~kvirc@117.151.40.225) Quit (Read error: Connection reset by peer)
[17:32] * sarob (~sarob@2601:9:1d00:1328:0:c198:9315:9de1) Quit (Ping timeout: 480 seconds)
[17:35] <yuriw1> loicd: ping
[17:35] <yuriw1> loicd: you have 177 vps locked, do you need then for ec testing?
[17:36] * sjm (~sjm@12.130.117.134) has left #ceph
[17:36] <loicd> yes, I launched 2 suites a few hours ago
[17:38] <loicd> yuriw1: they seem to be making progress, do you see a problem indicating they are stalled ?
[17:39] <yuriw1> loicd: no, I was just wondering if you really needed 100+ machines
[17:39] <yuriw1> loicd: did you see my email with latest run with ec?
[17:40] <yuriw1> on dumpling-firefly-x suite
[17:40] <loicd> I think I did, and replied, let me check
[17:42] * joao|lap (~JL@78.29.191.247) Quit (Ping timeout: 480 seconds)
[17:42] <loicd> to be honest, I don't need as much. I only need the machines required to run the ubuntu suite and that would be enough. But when I tried to figure out when the plana suite was going to run, I was horrified to discover my suite was job 2200 and up in a 2400 jobs queue ;-) I tried against vps and was happy to discover it started running right away :-) Only after that did I remember that I should have trimmed manually to only run ubuntu. But I did not have
[17:42] <loicd> the heart to cancel the run and start over. That's what resource starvation does to me.
[17:43] <yuriw1> you could use the '-l x' option to run only a limited number of tests, the only catch is it will run off the top of your suite, e.g. the first '-l' number of tests
[17:43] <loicd> yuriw1: I did not reply to the last mail. Checking http://pulpito.front.sepia.ceph.com/teuthology-2014-08-05_15:03:02-upgrade:dumpling-firefly-x-next-testing-basic-vps/ it looks like the ubuntu passes.
[17:44] <yuriw1> loicd: there are still some ec failures if remember correctly
[17:44] <loicd> http://pulpito.front.sepia.ceph.com/teuthology-2014-08-05_15:03:02-upgrade:dumpling-firefly-x-next-testing-basic-vps/400542/
[17:45] <loicd> yuriw1: http://pulpito.front.sepia.ceph.com/teuthology-2014-08-05_15:03:02-upgrade:dumpling-firefly-x-next-testing-basic-vps/400543/ fails
[17:45] * i_m (~ivan.miro@gbibp9ph1--blueice4n2.emea.ibm.com) Quit (Quit: Leaving.)
[17:45] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:46] <loicd> but it looks like it did not even start for a reason unrelated to code
[17:46] <loicd> yuriw1: where could I see an ec related failure ?
[17:46] * lalatenduM (~lalatendu@121.244.87.117) Quit (Quit: Leaving)
[17:46] <yuriw1> loicd: here is the latest run - http://pulpito.front.sepia.ceph.com/teuthology-2014-08-06_17:35:01-upgrade:dumpling-firefly-x-next-testing-basic-vps/
[17:46] <loicd> hum
[17:46] <loicd> funny
[17:46] <loicd> it looks the exact opposite of the first one : all green *but* ubuntu ;-)
[17:46] <loicd> almost
[17:47] <yuriw1> we had some unrelated issues in the first one
[17:48] <loicd> yuriw1: which job has ec failure ?
[17:48] <yuriw1> i thought* :) let me look
[17:50] * joef (~Adium@2620:79:0:131:5d44:98ee:a2ab:be5a) has joined #ceph
[17:50] * lczerner (~lczerner@nat-pool-brq-t.redhat.com) Quit (Ping timeout: 480 seconds)
[17:50] <loicd> 2014-08-06T21:15:37.380 INFO:teuthology.orchestra.run.vpm141.stderr:Error EINVAL: pool 'unique_pool_0' cannot change to type erasure
[17:50] <loicd> http://pulpito.front.sepia.ceph.com/teuthology-2014-08-06_17:35:01-upgrade:dumpling-firefly-x-next-testing-basic-vps/404389/
[17:50] * angdraug (~angdraug@c-67-169-181-128.hsd1.ca.comcast.net) has joined #ceph
[17:51] <yuriw1> aha!
[17:51] <loicd> it's because unique_pool_0 already exists and cannot be converted to an erasure coded pool
[17:51] <loicd> :-)
[17:51] <yuriw1> did you do anything for http://tracker.ceph.com/issues/9027 ?
[17:52] <loicd> http://pulpito.front.sepia.ceph.com/teuthology-2014-08-06_17:35:01-upgrade:dumpling-firefly-x-next-testing-basic-vps/404390/ is the same
[17:52] <yuriw1> I just wanted to help you with results :)
[17:52] <yuriw1> brb
[17:53] <loicd> yuriw1: I did not do anything related to http://tracker.ceph.com/issues/9027 : I'm not familiar with the dumpling to firefly upgrade suite and I would need to dig to figure out why this pool name is already used
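Background on the EINVAL above: in firefly the pool type is fixed at creation time, so an erasure-coded pool has to be created fresh; an existing replicated pool cannot be switched over. A hedged sketch (pool name and PG counts are illustrative):

    # creating a brand-new erasure coded pool works
    ceph osd pool create ecpool 12 12 erasure
    # but if 'unique_pool_0' already exists as a replicated pool, asking for it
    # as erasure fails, which matches the error quoted above:
    #   Error EINVAL: pool 'unique_pool_0' cannot change to type erasure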
[17:57] * xarses (~andreww@12.164.168.117) has joined #ceph
[18:00] <yuriw1> loicd: ok, let me know if I can help with anything, meanwhile I will re-run this suite tonight, i hate to see inconsistent results
[18:02] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[18:03] * fsimonce (~simon@host225-92-dynamic.21-87-r.retail.telecomitalia.it) has joined #ceph
[18:06] * bitserker (~toni@48.58.79.188.dynamic.jazztel.es) has joined #ceph
[18:07] <loicd> yuriw1: do you need me to learn about the dumpling suite and fix this problem ?
[18:08] * ircolle is now known as ircolle-crappyconnection
[18:09] <yuriw1> loicd: I'd be happy to do anything as long as it will be helpful, this particular suite has README https://github.com/ceph/ceph-qa-suite/tree/master/suites/upgrade/dumpling-firefly-x/parallel
[18:09] * masterpe (~masterpe@2a01:670:400::43) Quit (Read error: Connection reset by peer)
[18:10] <yuriw1> nickname: yuriw-xfiniti_2x
[18:10] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[18:11] * JayJ (~jayj@157.130.21.226) has joined #ceph
[18:13] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[18:17] * zack_dolby (~textual@p8505b4.tokynt01.ap.so-net.ne.jp) has joined #ceph
[18:17] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[18:19] * fsimonce (~simon@host225-92-dynamic.21-87-r.retail.telecomitalia.it) Quit (Quit: Coyote finally caught me)
[18:22] * bitserker1 (~toni@169.38.79.188.dynamic.jazztel.es) has joined #ceph
[18:22] * tracphil (~tracphil@130.14.71.217) has joined #ceph
[18:23] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[18:25] * sarob (~sarob@2601:9:1d00:1328:d91e:e5a3:2931:f94a) has joined #ceph
[18:26] * bitserker (~toni@48.58.79.188.dynamic.jazztel.es) Quit (Ping timeout: 480 seconds)
[18:27] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[18:28] * mgarcesMZ (~mgarces@5.206.228.5) Quit (Quit: mgarcesMZ)
[18:28] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[18:33] * sarob (~sarob@2601:9:1d00:1328:d91e:e5a3:2931:f94a) Quit (Ping timeout: 480 seconds)
[18:34] * astellwag (~astellwag@209.132.181.86) Quit (Read error: Connection reset by peer)
[18:34] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[18:37] * astellwag (~astellwag@209.132.181.86) has joined #ceph
[18:39] * thb (~me@0001bd58.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:43] * ircolle-crappyconnection (~Adium@mobile-166-147-083-005.mycingular.net) Quit (Quit: Leaving.)
[18:49] * sjm (~sjm@12.130.117.134) has joined #ceph
[18:54] * lcavassa (~lcavassa@89.184.114.246) Quit (Quit: Leaving)
[18:56] * adamcrume (~quassel@50.247.81.99) Quit (Remote host closed the connection)
[18:58] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[19:01] * ircolle (~Adium@2601:1:a580:145a:4d9:433e:91bd:1a5a) has joined #ceph
[19:01] * Tamil1 (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[19:02] * davidz (~Adium@cpe-23-242-12-23.socal.res.rr.com) has joined #ceph
[19:03] * TiCPU (~jeromepou@12.160.0.155) Quit (Ping timeout: 480 seconds)
[19:03] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[19:04] * rweeks (~rweeks@pat.hitachigst.com) has joined #ceph
[19:05] <rweeks> quick question
[19:05] * Sysadmin88 (~IceChat77@2.218.9.98) has joined #ceph
[19:05] <rweeks> at ceph.com/debian/dists/wheezy
[19:05] <rweeks> what version of Ceph is that?
[19:05] <rweeks> is that 0.83?
[19:06] <gleam> anyone know if issue 8912 is being worked on?
[19:06] <kraken> gleam might be talking about http://tracker.ceph.com/issues/8912 [librbd segfaults when creating new image (rbd-ephemeral-clone-stable-icehouse)]
[19:09] * bitserker1 (~toni@169.38.79.188.dynamic.jazztel.es) Quit (Quit: Leaving.)
[19:09] * kapil (~ksharma@2620:113:80c0:5::2222) Quit (Quit: Leaving)
[19:10] <joshd> gleam: looked over it, haven't dug into it yet though
[19:10] <gleam> okie doke
[19:11] <joshd> are you the one who reported it?
[19:11] * mm (~Michael_M@2a01:198:688:0:d6c9:efff:fe51:ea11) has joined #ceph
[19:12] <mm> Hi there
[19:12] <joshd> it's incredibly well detailed
[19:13] <gleam> nope, and i'm similarly impressed
[19:13] <mm> Anyone who ever had something like this? -832/189228127 objects degraded (-0.000%)
[19:13] <gleam> it's a hell of a bug report
[19:14] * KevinPerks (~Adium@2606:a000:80a1:1b00:b58a:1264:a43a:35e3) Quit (Quit: Leaving.)
[19:15] * sjm (~sjm@12.130.117.134) has left #ceph
[19:19] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[19:19] <stj> rweeks: looks like that repo contains 0.80.5-1~bpo70+1
[19:20] <rweeks> thanks stj
[19:20] <stj> not sure if that actually maps back to 0.83 or not... :/
[19:20] <kraken> ಠ_ಠ
[19:21] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[19:21] <rweeks> no, that looks like 0.80
[19:22] <rweeks> but I need an armel port, so I'm looking at https://packages.debian.org/wheezy-backports/ceph
[19:22] <rweeks> 0.80.4-1~bpo70+1 is the latest armel backport I can find
[19:23] * sigsegv (~sigsegv@188.25.123.201) has joined #ceph
[19:25] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[19:25] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) has joined #ceph
[19:26] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) has joined #ceph
[19:30] * Sysadmin88 (~IceChat77@2.218.9.98) Quit (Quit: Some folks are wise, and some otherwise.)
[19:31] * lcavassa (~lcavassa@2-229-47-79.ip195.fastwebnet.it) has joined #ceph
[19:31] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[19:31] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[19:32] * danieljh (~daniel@0001b4e9.user.oftc.net) Quit (Quit: leaving)
[19:33] * mm (~Michael_M@2a01:198:688:0:d6c9:efff:fe51:ea11) Quit (Quit: ChatZilla 0.9.87-8.1450hg.fc20 [XULRunner 30.0/20140605102323])
[19:33] * sarob (~sarob@c-76-102-72-171.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:38] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[19:47] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[19:50] * sputnik13 (~sputnik13@207.8.121.241) Quit ()
[19:54] * tracphil (~tracphil@130.14.71.217) Quit (Quit: leaving)
[19:55] * lupu (~lupu@86.107.101.214) Quit (Ping timeout: 480 seconds)
[20:00] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[20:03] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[20:06] * jeff-YF (~jeffyf@67.23.117.122) Quit (Quit: jeff-YF)
[20:07] * angdraug (~angdraug@c-67-169-181-128.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[20:16] * Cataglottism (~Cataglott@dsl-087-195-030-184.solcon.nl) has joined #ceph
[20:21] * yuriw1 (~Adium@c-76-126-35-111.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[20:22] * yuriw (~Adium@c-76-126-35-111.hsd1.ca.comcast.net) has joined #ceph
[20:24] * KevinPerks (~Adium@rrcs-96-10-254-138.midsouth.biz.rr.com) has joined #ceph
[20:25] <seapasulli> probably not advisable but is it easy to downgrade ceph monitors from 82.x to 80.5 (or latest stable?)
[20:26] * vmx (~vmx@dslb-084-056-015-098.084.056.pools.vodafone-ip.de) Quit (Quit: Leaving)
[20:27] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[20:28] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[20:30] * sputnik13 (~sputnik13@207.8.121.241) Quit ()
[20:32] * Gamekiller77 (~Gamekille@128-107-239-235.cisco.com) has joined #ceph
[20:34] <ganders> does anyone know what could be the problem in this scenario.. i just added a new OSD server to the cluster, ran the disk zap, then the prepare with fs-type btrfs, and finally the activation... everything goes fine.. but.. when issuing a ceph -w or ceph osd tree i see that the osd.X is out and down... i tried to run service ceph start for the osd daemon on the server but nothing happens
[20:34] <Gamekiller77> yah i just saw this
[20:35] <Gamekiller77> for me it was a cephx problem
[20:35] <Gamekiller77> do you have the right keys in the /etc/ceph dir
[20:35] <ganders> yes, i have the admin node's keys on all the osd and mon nodes
[20:35] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[20:36] * danieljh (~daniel@0001b4e9.user.oftc.net) has joined #ceph
[20:36] * b0e (~aledermue@x2f35fa5.dyn.telefonica.de) has joined #ceph
[20:37] * TiCPU (~jeromepou@12.160.0.155) has joined #ceph
[20:37] * angdraug (~angdraug@12.164.168.117) has joined #ceph
[20:38] <dmick> client.admin key is not enough to start OSDs
[20:38] <dmick> that's only for client access (i.e. ceph commands)
[20:39] * rendar (~I@host29-178-dynamic.19-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[20:39] <ganders> do i also need to scp the bootstrap-osd and -mds keyrings to /etc/ceph?
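For context on dmick's point: the admin keyring only authorizes client commands, while OSD activation and startup rely on other keyrings. Typical default locations, assuming a stock ceph-deploy/ceph-disk layout:

    /etc/ceph/ceph.client.admin.keyring        # client access only, e.g. 'ceph -s'
    /var/lib/ceph/bootstrap-osd/ceph.keyring   # used by 'ceph-disk activate' to register new OSDs
    /var/lib/ceph/osd/ceph-<id>/keyring        # per-OSD key, written when the OSD is activated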
[20:40] * abonilla (~abonilla@ptcc-208-115-82-243.smartcity.com) has joined #ceph
[20:40] <abonilla> Hi - does anyone have a post on how to use mapreduce over ceph?
[20:41] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[20:41] * rendar (~I@host29-178-dynamic.19-79-r.retail.telecomitalia.it) has joined #ceph
[20:42] <scuttlemonkey> Gamekiller77: hey there
[20:42] <scuttlemonkey> happy to chat blueprints or w/e (this is Patrick)
[20:44] <ganders> i also copied the osd and mds keyrings to /etc/ceph on the osd server and am getting the same error msg
[20:45] * lcavassa (~lcavassa@2-229-47-79.ip195.fastwebnet.it) Quit (Quit: Leaving)
[20:46] <ircolle> adamcrume - do you have anything you could point abonilla to?
[20:46] * b0e (~aledermue@x2f35fa5.dyn.telefonica.de) Quit (Quit: Leaving.)
[20:46] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[20:46] <adamcrume> Hm, I thought Joe or Noah did something with that. Lemme see if I can dig it up.
[20:47] <ircolle> adamcrume - tks!
[20:54] <adamcrume> There was a thread about Hadoop on Ceph on the ceph-devel mailing list a couple of years ago: http://www.spinics.net/lists/ceph-devel/msg11032.html
[20:54] <adamcrume> It doesn't look very instructive, though.
[20:56] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[20:56] * ultimape (~Ultimape@c-174-62-192-41.hsd1.vt.comcast.net) Quit (Ping timeout: 480 seconds)
[20:57] * ultimape (~Ultimape@c-174-62-192-41.hsd1.vt.comcast.net) has joined #ceph
[20:57] * badger32d (~badger@71-209-36-209.bois.qwest.net) has joined #ceph
[20:58] <badger32d> hey all, I'm having a problem with calamari, looks like there isn't a calamari channel so I figured I'd ask in here. I've built calamari/calamari-clients with the vagrant environments. After install / calamari-ctl initialize I get a 500 and an ImportError in the logs. pastie to follow
[20:58] <badger32d> http://pastie.org/9453749
[20:59] <ganders> any idea? http://pastebin.com/raw.php?i=4Q8Sykny
[20:59] * zerick (~eocrospom@190.187.21.53) has joined #ceph
[20:59] <adamcrume> abonilla: It looks like we have instructions at http://ceph.com/docs/master/cephfs/hadoop/
[21:00] <adamcrume> abonilla: I assume you're using Hadoop.
[21:00] <ganders> sorry this is the correct one: http://pastebin.com/raw.php?i=pEYrHZw8
[21:02] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[21:03] * ircolle is now known as ircolle-afk
[21:04] <abonilla> adamcrume: yeah, just found that too. thanks
[21:05] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[21:13] <badger32d> has anyone else had issues building calamari via http://calamari.readthedocs.org/en/latest/development/building_packages.html ?
[21:19] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[21:20] <bloodice> alfredo here? Just replied to your tweet about ceph-deploy
[21:21] * wrencsok (~wrencsok@wsip-174-79-34-244.ph.ph.cox.net) Quit (Quit: Leaving.)
[21:21] * wrencsok (~wrencsok@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[21:22] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[21:23] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[21:23] <alfredodeza> hey bloodice
[21:23] <bloodice> heya
[21:24] <alfredodeza> so in that link I can't find anywhere that says that you can upgrade a cluster with ceph-deploy (you can't)
[21:24] <alfredodeza> ceph-deploy can help upgrade individual portions
[21:24] <alfredodeza> but there is no one command that will upgrade a whole cluster
[21:25] <bloodice> yea, thats correct
[21:25] <alfredodeza> and I can't find anywhere that says that
[21:25] <bloodice> i follow you there
[21:25] <bloodice> http://ceph.com/docs/master/install/upgrading-ceph/
[21:25] <alfredodeza> however! there might be a spot where we could be more clear?
[21:25] <alfredodeza> or that we are actually ambiguous and we should be more explicit
[21:25] * scuttlemonkey is now known as scuttle|afk
[21:26] <alfredodeza> but I can't tell which section/part you went through that caused you confusion in that aspect
[21:26] <bloodice> "Use ceph-deploy to upgrade the packages for multiple hosts"
[21:26] <bloodice> very top
[21:26] <alfredodeza> right
[21:26] <alfredodeza> that upgrades the ceph package
[21:26] <alfredodeza> not the cluster
[21:26] <bloodice> correct
[21:26] * scuttle|afk is now known as scuttlemonkey
[21:27] <bloodice> i wasnt saying... at least i didnt think i was saying... you could upgrade the whole cluster...
[21:27] <bloodice> i was just talking about updating each node via ceph-deploy
[21:27] <alfredodeza> ok
[21:27] * alfredodeza misunderstood
[21:27] <bloodice> i was saying that page says you can upgrade nodes via ceph-deploy... but there was no command in ceph-deploy that actually stated it "upgraded"
[21:28] <bloodice> like ceph-deploy upgrade <node>
[21:28] <alfredodeza> right
[21:28] <alfredodeza> further down in the document it says how you can upgrade each node
[21:28] <bloodice> which is fine there isnt.... i just didnt know for certain that "ceph-deploy install <node>" would actually upgrade the node
[21:29] <alfredodeza> so further down in here: http://ceph.com/docs/master/install/upgrading-ceph/#upgrade-procedures
[21:29] <alfredodeza> it does tell you to do `ceph-deploy install` to start the upgrade
[21:29] <alfredodeza> bloodice: what you pointed at in the beginning is a summary
[21:29] <alfredodeza> not a detailed step by step
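A condensed sketch of the flow being described, driven from the admin node; host names and the 'firefly' release are placeholders, and the restart commands assume Upstart-based hosts:

    ceph-deploy install --release firefly mon1 mon2 mon3   # upgrade the packages on the monitor hosts
    # then, on each monitor host:  sudo restart ceph-mon id=<hostname>
    ceph-deploy install --release firefly osd1 osd2        # upgrade the packages on the OSD hosts
    # then, on each OSD host:      sudo restart ceph-osd id=<n>   (or 'sudo restart ceph-osd-all')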
[21:29] <bloodice> ahh now i feel stupid
[21:30] <bloodice> i see what you mean
[21:30] <alfredodeza> oh no
[21:30] <alfredodeza> don't feel that way
[21:30] <alfredodeza> (not my intention)
[21:30] <bloodice> one thing to note
[21:31] <bloodice> the restart osd function didnt work for me
[21:31] * alfredodeza looks
[21:31] <bloodice> our cluster is entirely ceph-deploy driven/deployed
[21:31] <alfredodeza> what function is that?
[21:31] <bloodice> i had to login to the osd host and run the restart ceph-osd-all
[21:31] <alfredodeza> `sudo restart ceph-osd id=N` ?
[21:32] <bloodice> sudo restart ceph-osd id=N
[21:32] <bloodice> yea
[21:32] <alfredodeza> hrmn, yeah that would be odd
[21:32] <alfredodeza> any specific error?
[21:32] <bloodice> it may be because our OSD hosts only run OSDs
[21:32] <bloodice> i can login again and grab that
[21:33] * zerick (~eocrospom@190.187.21.53) Quit (Read error: Connection reset by peer)
[21:33] <bloodice> just a sec
[21:33] <alfredodeza> in the meantime, I will add a note to be a bit more specific at the very top that those steps are just a summary
[21:33] <bloodice> that section seems to be under upgrading from dumpling.. and i was already on emperor and going to firefly
[21:33] <bloodice> probably why i skipped it
[21:34] <bloodice> i had like 20 tabs open
[21:34] <alfredodeza> heh
[21:34] <bloodice> "restart: Unknown instance: ceph/41
[21:35] <bloodice> This was the command: sudo restart ceph-osd id=41
[21:35] <bloodice> that was issued from the monitor/admin node
[21:35] * jeff-YF (~jeffyf@67.23.117.122) Quit (Quit: jeff-YF)
[21:36] <bloodice> i think that is because ceph-deploy doesnt enter the osd's in the ceph.conf
[21:36] <alfredodeza> hrmn
[21:36] <bloodice> which is fine from my understanding... it just makes it so most of the manual commands that reference ceph.conf don't work
[21:37] <alfredodeza> I haven't done these myself
[21:37] <alfredodeza> but is it possible you are required to be in the host itself?
[21:37] <alfredodeza> the actual OSD host?
[21:37] <alfredodeza> and not from the admin node?
[21:37] <bloodice> it fails on the host as well... says its not in the ceph.conf
[21:38] <bloodice> let me double check that.... real quick
[21:38] <bloodice> make sure i am not misstating this..
[21:39] <bloodice> i was wrong.. you can run that command on the osd host and it works
[21:39] <bloodice> just not from the monitor
[21:40] <bloodice> that wasnt the command i was running then... it must have been another one specified in another doc i was reading... let me check
[21:41] * zerick (~eocrospom@190.187.21.53) has joined #ceph
[21:42] * Cube (~Cube@tacocat.concubidated.com) has joined #ceph
[21:43] <bloodice> it was this one: "service ceph restart osd.0"
[21:43] <bloodice> but i realized that would never work after i researched more
[21:43] <bloodice> ceph-deploy runs everything directly, not daemonized
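To summarize the exchange: the Upstart form has to be run on the OSD host itself, and the sysvinit form only works when the OSD is listed in that host's ceph.conf, which ceph-deploy does not write by default. A sketch, reusing id 41 from the example above:

    # on the OSD host itself, not the monitor/admin node
    sudo restart ceph-osd id=41    # restart one daemon
    sudo restart ceph-osd-all      # restart every OSD on this host
    # 'service ceph restart osd.41' (sysvinit) only works if osd.41 has a
    # section in this host's ceph.conf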
[21:45] * barnim (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[21:46] <Gamekiller77> scuttlemonkey, sup man
[21:48] <bloodice> many times in the documents, it says "run this command" but then it doesnt state on which server.. a monitor? the OSD host? the radosgw?
[21:48] <bloodice> that upgrade document says to upgrade the daemons, but ceph-deploy only launches processes from what i can tell.... perhaps there is a switch i should have used to daemonize during install?
[21:50] <bloodice> personally, i think the keyring docs are the most confusing.. at least it was for me because we are running the radosgw
[21:50] <bloodice> I am glad there are docs out there though! :)
[21:51] * JC1 (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) has joined #ceph
[21:52] * JC (~JC@AMontpellier-651-1-420-97.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[21:53] <bloodice> also note that the upgrade doc says to restart the monitors, but ceph-deploy install does that for you....
[21:54] <alfredodeza> bloodice: you are firing on multiple ends and I can't follow :/
[21:54] <bloodice> my bad
[21:55] <alfredodeza> so for inconsistencies in the docs we can definitely fix them
[21:55] <alfredodeza> those are easy
[21:55] <alfredodeza> the best you could do to help us improve them is to note how it didn't work for you because they seemed to suggest one thing and ended up being something else, and create a ticket with that information
[21:55] <bloodice> great news there! :)
[21:55] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[21:56] <alfredodeza> http://tracker.ceph.com/projects/ceph/issues/new
[21:56] <bloodice> ah ok.. i was about to ask
[21:56] <alfredodeza> you would need an account, and if possible, you could change the "Tracker" dropdown to say 'Documentation'
[21:57] <bloodice> can do
[21:57] <alfredodeza> it is going to be much better if you open those (vs me) because you have a fresh understanding on how they are lacking (or inconsistent)
[21:58] <bloodice> yea, that works
[21:59] <bloodice> i do realize it depends on how you deploy your environment too.. mine is a server for each role...
[21:59] * Cube (~Cube@tacocat.concubidated.com) Quit (Ping timeout: 480 seconds)
[21:59] * Cube (~Cube@tacocat.concubidated.com) has joined #ceph
[22:00] * Cataglottism (~Cataglott@dsl-087-195-030-184.solcon.nl) Quit (Quit: Textual IRC Client: www.textualapp.com)
[22:01] <ganders> people any idea on this?
[22:01] <ganders> http://pastebin.com/raw.php?i=JE91k2WZ
[22:01] <ganders> im having problems when trying to add a osd
[22:01] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) Quit (Quit: ZNC - http://znc.in)
[22:02] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) has joined #ceph
[22:02] <ganders> prepare work fine and also activate, but the osd never get IN and UP
[22:05] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[22:06] * thb (~me@port-5823.pppoe.wtnet.de) has joined #ceph
[22:07] * thb (~me@port-5823.pppoe.wtnet.de) Quit ()
[22:08] <dmick> ganders: you checked the log?
[22:09] <ganders> yeah the log said that the osd in this case osd.0 is already mounted in pos
[22:10] <dmick> what's the exact log msg?
[22:10] <ganders> i think i got the problem
[22:11] * abonilla (~abonilla@ptcc-208-115-82-243.smartcity.com) Quit (Ping timeout: 480 seconds)
[22:13] <ganders> it was a problem with the clust net
[22:13] <dmick> mmk....
[22:13] <dmick> dunno what that would have to do with a mounted fs but...
[22:15] <ganders> it was not able to start the osd daemon since it lost connection to the cluster net, so it can't communicate with the rest of the members
[22:20] <bloodice> DNS correct?
[22:26] <kitz> When I'm doing a ceph -w and I see "osd.74 10.40.0.104:6857/4492 boot" does that mean that osd 74 actually just restarted or that it just rejoined the cluster in some way?
[22:27] * sage___ (~quassel@gw.sepia.ceph.com) Quit (Remote host closed the connection)
[22:28] * sarob (~sarob@2001:4998:effd:600:cd26:43d3:c222:d927) has joined #ceph
[22:30] <bloodice> restarted... from what i have seen when i sent the restart command to an osd
[22:30] * rwheeler (~rwheeler@173.48.207.57) has joined #ceph
[22:31] <bloodice> run a "ceph osd tree" command and look for anything outside of "up"
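For reference, 'ceph osd tree' output from this era looks roughly like the following (names and weights illustrative, columns approximate); the up/down column is the one to scan:

    ceph osd tree
    # id   weight   type name          up/down   reweight
    # -1   2.72     root default
    # -2   1.36         host osd-host1
    #  0   0.68             osd.0      up        1
    # 74   0.68             osd.74     down      0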
[22:37] * scuttlemonkey is now known as scuttle|afk
[22:39] * KevinPerks (~Adium@rrcs-96-10-254-138.midsouth.biz.rr.com) Quit (Quit: Leaving.)
[22:40] * KevinPerks (~Adium@rrcs-96-10-254-138.midsouth.biz.rr.com) has joined #ceph
[22:44] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Quit: Leaving.)
[22:56] * sage___ (~quassel@gw.sepia.ceph.com) has joined #ceph
[22:57] <rweeks> I glanced up at that last sentence and read "up a ceph osd tree"
[22:57] <rweeks> <.<
[22:58] * JayJ (~jayj@157.130.21.226) Quit (Remote host closed the connection)
[22:59] * JayJ (~jayj@157.130.21.226) has joined #ceph
[23:00] * sage___ (~quassel@gw.sepia.ceph.com) Quit (Remote host closed the connection)
[23:02] * ninkotech__ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[23:03] * ninkotech__ (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[23:04] * t0rn (~ssullivan@c-68-62-1-186.hsd1.mi.comcast.net) Quit (Quit: Leaving.)
[23:08] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[23:09] * KevinPerks (~Adium@rrcs-96-10-254-138.midsouth.biz.rr.com) has left #ceph
[23:10] * ganders (~root@200-127-158-54.net.prima.net.ar) Quit (Quit: WeeChat 0.4.1)
[23:14] <bloodice> indeed
[23:14] <kraken> http://i.imgur.com/bQcbpki.gif
[23:15] * stupidnic (~foo@cpe-70-94-232-191.sw.res.rr.com) has joined #ceph
[23:16] <stupidnic> I have a question about ceph-deploy and I want to make sure I understand it. Does ceph-deploy look at the remote node or the admin node to determine what OS and version you are using?
[23:17] <alfredodeza> stupidnic: remote node
[23:17] <stupidnic> hmmm... odd
[23:17] <alfredodeza> issues?
[23:17] <stupidnic> I am running ceph-deploy on a CentOS6 box and deploying to a CentOS7 box (yes I know there is no repo)
[23:17] * dmsimard is now known as dmsimard_away
[23:17] <alfredodeza> there is actually an el7 repo stupidnic
[23:17] <stupidnic> and it keeps trying to install the EL6 repo in yum
[23:17] <stupidnic> is there?
[23:17] <alfredodeza> yes
[23:17] <stupidnic> not for firefly
[23:17] <alfredodeza> hrmn
[23:18] * alfredodeza checks
[23:18] <alfredodeza> and you would be right
[23:18] <alfredodeza> I guess we haven't done any firefly releases since we added el7 repos
[23:18] <stupidnic> is that something that can be pushed through on the next build cycle?
[23:19] <stupidnic> I would love to have them
[23:19] <alfredodeza> yes of course
[23:19] <alfredodeza> that should be the case
[23:19] * TiCPU (~jeromepou@12.160.0.155) Quit (Ping timeout: 480 seconds)
[23:20] <stupidnic> so moving back to the original issue...
[23:20] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) has joined #ceph
[23:20] <stupidnic> I had to physically edit my ceph-deploy install to get it to even use rhel7
[23:20] <stupidnic> it kept picking up el6
[23:21] <stupidnic> I assumed since my admin node was CentOS6 that is where it was picking it up from
[23:21] <alfredodeza> stupidnic: what version of ceph-deploy are you using
[23:21] <stupidnic> latest 1.5.10
[23:21] <alfredodeza> stupidnic: can you show me the logs? that sounds like a bug
[23:21] <alfredodeza> also, keep in mind that you can explicitly point to a repo
[23:22] <stupidnic> alfredodeza: sure... give me a minute to purge and run it again
[23:22] * lupu (~lupu@86.107.101.214) has joined #ceph
[23:23] * ircolle-afk is now known as ircolle
[23:24] * b0e (~aledermue@x2f35fa5.dyn.telefonica.de) has joined #ceph
[23:24] <stupidnic> alfredodeza: http://pastebin.com/QEG5BiPx
[23:25] <stupidnic> alfredodeza: I edited /usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py on the admin node and forced the two returns to return rhel7 and el7 as required
[23:25] <alfredodeza> argh
[23:25] <alfredodeza> that looks like a bug
[23:25] <stupidnic> and I was able to install on the two nodes running CentOS7
[23:25] <alfredodeza> yeah no need to edit the python file :)
[23:25] <alfredodeza> you can pass in a flag
[23:25] <alfredodeza> or you can set it on your cephdeploy.conf file
[23:26] <stupidnic> You mean hardcoding the repos?
[23:26] <alfredodeza> you can define repos in your cephdeploy.conf file and/or pass the repo url and gpg url to install
[23:26] <alfredodeza> do `ceph-deploy install --help` to see some of those options
[23:26] <stupidnic> I tried that, but I am not 100% sure I did it correctly
[23:26] <stupidnic> I edited the .cephdeploy.conf file and put the repos
[23:27] <alfredodeza> show me that file
[23:27] <stupidnic> k
[23:27] <alfredodeza> you need to specify it
[23:27] <stupidnic> Oh okay... I might have not done that
[23:27] <alfredodeza> e.g. if you have a [foo] repo, then you would `ceph-deploy install --release foo node1 node2`
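A hedged illustration of what is being described; the section name 'foo' comes from the example above, while the URL and the exact option keys are placeholders ('ceph-deploy install --help' lists the supported ones):

    # ~/.cephdeploy.conf
    [foo]
    baseurl = http://example.com/rpm-firefly/el7/x86_64/
    gpgkey = http://example.com/release.asc
    default = true

    # then install/upgrade against that repo section:
    ceph-deploy install --release foo node1 node2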
[23:28] <stupidnic> k let me try that using the correct method
[23:28] <alfredodeza> oh I see where the bug is, we added all this logic but only for redhat
[23:28] <alfredodeza> not for centos
[23:28] <alfredodeza> because that was just released
[23:28] * alfredodeza opens bug
[23:28] <stupidnic> Bleeding edge sucks that way sometimes :)
[23:29] <stupidnic> I really wanted to use btrfs with ceph
[23:29] <stupidnic> which is why I went with CentOS 7
[23:29] <stupidnic> I can probably grind out the code if needed
[23:30] <stupidnic> is there a github repo for this? I can clone and do a pull request
[23:30] <alfredodeza> issue 9041
[23:30] <kraken> alfredodeza might be talking about http://tracker.ceph.com/issues/9041 [ceph-deploy fails to detect el7 repos for CentOS7]
[23:30] <alfredodeza> stupidnic: ^ ^
[23:30] <alfredodeza> stupidnic: and yes there is: https://github.com/ceph/ceph-deploy
[23:31] <stupidnic> Alright. I will bang this out and submit a pull request after that
[23:31] <alfredodeza> stupidnic: you are looking at https://github.com/ceph/ceph-deploy/blob/master/ceph_deploy/hosts/centos/install.py#L5 and at https://github.com/ceph/ceph-deploy/blob/master/ceph_deploy/hosts/centos/install.py#L12
[23:31] <alfredodeza> those are the two functions that you need to change
[23:31] <stupidnic> one problem though...
[23:31] <stupidnic> even after I do this... the EL7 repo will still be missing :)
[23:31] * Gamekiller77 (~Gamekille@128-107-239-235.cisco.com) Quit (Ping timeout: 480 seconds)
[23:32] <alfredodeza> that is about to be fixed in the next couple of days (iirc)
[23:32] <alfredodeza> stupidnic: one more thing, you will need to add some tests to verify your changes work
[23:32] <alfredodeza> here https://github.com/ceph/ceph-deploy/blob/master/ceph_deploy/tests/unit/hosts/test_centos.py
[23:32] <alfredodeza> as you can see, we were just testing redhat there, not centos
[23:32] <stupidnic> shouldn't be hard
[23:33] <alfredodeza> stupidnic: if you have any issues or need help, don't hesitate to ping me
[23:34] <stupidnic> no problem. Should be pretty straight forward in getting it taken care of
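A sketch of the contribution workflow being agreed on here; the repository and test-file paths are the ones linked above, and the pytest invocation assumes the project's usual Python test setup:

    git clone https://github.com/ceph/ceph-deploy
    cd ceph-deploy
    python setup.py develop   # or: pip install -e .
    # edit ceph_deploy/hosts/centos/install.py, extend
    # ceph_deploy/tests/unit/hosts/test_centos.py with CentOS cases, then:
    py.test ceph_deploy/tests/unit/hosts/test_centos.py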
[23:38] * b0e (~aledermue@x2f35fa5.dyn.telefonica.de) Quit (Quit: Leaving.)
[23:41] * barnim (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[23:43] * rweeks (~rweeks@pat.hitachigst.com) Quit (Quit: Leaving)
[23:52] * barnim (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[23:54] * hufman (~hufman@cpe-184-58-235-28.wi.res.rr.com) Quit (Quit: leaving)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.