#ceph IRC Log

IRC Log for 2013-08-05

Timestamps are in GMT/BST.

[0:00] <mozg> and the third one is arh-cloud2-ib
[0:00] <mozg> not sure if this matters
[0:00] <sage> sorry yeah those are the internal ranks. it's the 2nd in the list when you dump the monmap
[0:00] <sage> .200 and .2 are the two that seem to be participating currently
[0:01] <mozg> okay
[0:01] <mozg> and .201 is not?
[0:01] <sage> not in the log you sent me
[0:01] <sage> do those debug commands and capture the log on that node to see what is up..
[0:02] <mozg> sage: i think restarting mons solved the problem
[0:02] <mozg> however, the health status is warning
[0:02] <mozg> health HEALTH_WARN 1 mons down, quorum 0,2,3 arh-cloud2-ib,arh-ibstorage1-ib,arh-ibstorage2-ib
[0:02] <mozg> it seems to have added the 4th one
[0:02] <mozg> but it is not joining
[0:02] <sage> on the one that is still missing, enable debug and capture a log
[0:02] <sage> before restarting
[0:02] <sage> so we can see why
[0:03] <mozg> one sec please
[0:03] <mozg> let me check if it is running
[0:03] <mozg> ah
[0:03] <mozg> the mon was not running
[0:04] <sage> ah
[0:04] <sage> is there a crash dump in the log?
[0:04] <mozg> i guess i must have stopped it after you've asked me if the quorum was forming after the 4th mon is stopped
[0:05] <mozg> there is no "crash" in the logs
[0:05] <sage> is there a 'got signal Terminated' message?
[0:05] <sage> just prior to the most recent 'ceph version ...' startup banner?
[0:06] <mozg> grep -i Terminated *.log
[0:06] <mozg> ceph-mon.arh-cloud11-ib.log:2013-08-04 22:41:25.374298 7f0646459700 -1 mon.arh-cloud11-ib@1(electing) e2 *** Got Signal Terminated ***
[0:06] <mozg> ceph-mon.arh-cloud11-ib.log:2013-08-04 22:48:15.337395 7fa0626ab700 -1 mon.arh-cloud11-ib@1(electing) e2 *** Got Signal Terminated ***
[0:06] <mozg> let me check if it is the most recent one or not
[0:07] <sage> that's what you normally see with a 'stop ceph-mon id=foo' or 'service ceph stop' fwiw
[0:08] <mozg> after the latest version ... there is no terminated stuff
[0:08] <mozg> sage: the new mon is not joining for some reason
[0:08] <mozg> i've started it
[0:09] <mozg> and ceph -s is still showing 1 mon down
[0:09] <sage> it might be syncing.. tail the log?
[0:09] <sage> or ceph --admin-daemon /var/run/ceph/ceph-mon.*.asok mon_status
[0:10] <mozg> one sec
[0:10] <mozg> by the way, should i disable debugging on the server which i've previously set?
[0:10] <sage> yeah probably
[0:10] <sage> or just restart it if you're lazy
[0:11] <mozg> restart all ceph services?
[0:11] <mozg> or just the mons?
[0:12] <sage> just the one mon
[0:12] <sage> (just an easier way to get debugging back to default)
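(For reference, the debug levels sage is talking about can also be flipped without a restart through the admin socket he mentions above; a sketch, using the mon id from this log and assuming the cuttlefish-era 'config set' socket command:)

    ceph --admin-daemon /var/run/ceph/ceph-mon.arh-cloud11-ib.asok config set debug_mon 20
    ceph --admin-daemon /var/run/ceph/ceph-mon.arh-cloud11-ib.asok config set debug_ms 1
    # and roughly back to the defaults once the log is captured
    ceph --admin-daemon /var/run/ceph/ceph-mon.arh-cloud11-ib.asok config set debug_mon 1
    ceph --admin-daemon /var/run/ceph/ceph-mon.arh-cloud11-ib.asok config set debug_ms 0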
[0:12] <mozg> sage: this is the mon_status I get: http://ur1.ca/ewamm
[0:13] <sage> can you turn up debugging on that one and capture a log?
[0:13] <mozg> the other mon server shows:
[0:13] <mozg> 2013-08-04 23:12:51.866984 7fbd75eaf700 1 mon.arh-ibstorage1-ib@2(peon) e2 handle_timecheck_peon got wrong epoch (ours 659 theirs: 658) -- discarding
[0:13] <mozg> i will turn on debug on the 4th server now
[0:15] <mozg> will upload the log file from the 4th mon in a sec
[0:16] <mozg> done
[0:16] <mozg> it's in the mozg folder
[0:17] * grepory (~Adium@209.119.62.120) has joined #ceph
[0:19] <mozg> sage: did you get the log file?
[0:19] <sage> looking now
[0:21] * BillK_ (~BillK-OFT@124-148-246-233.dyn.iinet.net.au) has joined #ceph
[0:21] <mozg> thanks for your help!!!
[0:23] * BillK (~BillK-OFT@124-148-246-233.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[0:23] <sage> it looks like that one is winning and the others are calling new elections. can you capture a log from the .11 or .200 node?
[0:25] <mozg> sage: the last log i've sent was from .11
[0:26] <mozg> that's the 4th mon that i've added
[0:26] <mozg> would you like me to get from .200?
[0:26] <sage> oh sorry
[0:26] <sage> i mean .200 or .201
[0:26] * yanzheng (~zhyan@134.134.139.74) has joined #ceph
[0:28] <mozg> should I enable debugging before?
[0:28] <mozg> or just as it is?
[0:29] <mozg> http://ur1.ca/ewb1o
[0:29] <mozg> sage: that's without the debug
[0:29] <mozg> from .200
[0:29] <sage> with debug please
[0:29] <sage> to see why it is calling an election
[0:31] * grepory (~Adium@209.119.62.120) Quit (Quit: Leaving.)
[0:31] <mozg> one moment please
[0:32] * dalgaaf (~dalgaaf@nrbg-4dbe2844.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[0:33] <mozg> done
[0:33] <mozg> uploaded in the mozg folder
[0:33] <mozg> ceph-mon.arh-ibstorage1-ib.log
[0:37] <sage> can you capture the log from .2 ?
[0:38] <sage> there is some strangeness going on in the election.. both mon.0 and mon.1 are claiming victory
[0:39] <mozg> one sec
[0:41] <mozg> sage: done. file name is ceph-mon.arh-cloud2-ib.log
[0:44] <sage> is there some firewall/network thing going on? they can't seem to reach each other
[0:44] <sage> debug ms = 20 would tell us a bit more
[0:45] <mozg> ah, f*ck
[0:45] <mozg> you are right
[0:45] <mozg> there might be the fw on .11
[0:45] <mozg> damn, it's rather late
[0:45] <mozg> why didn't i check
[0:45] <sage> phew, that makes me feel better :)
[0:45] <mozg> let me double check
[0:45] <mozg> sage: sorry about that
[0:45] <sage> np
[0:46] <mozg> by the way, is there a firewall guide for ceph?
[0:46] <mozg> which ports should I open?
[0:46] <mozg> there is no fw issue between .200, .201 and .2
[0:46] <mozg> these i had working before without any issues
[0:47] <sage> 6789 for the mons
[0:47] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[0:47] <sage> the other daemons bind to the 6800-7100 range
[0:47] <mozg> ah, i see
[0:47] <sage> it's configurable.. ceph-mon --show-config | grep port
[0:47] <mozg> it's all working fine once i've done ufw disable
[0:47] <sage> yay!
[0:47] <mozg> thanks
[0:47] <sage> np
[0:48] <mozg> so, i need to switch fw back on with 6789 open
[0:48] <sage> for the mon nodes, yeah.
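(A sketch of the ufw rules implied by the ports sage lists above; the 6800-7100 range is the default he mentions and is configurable:)

    ufw allow 6789/tcp          # ceph-mon
    ufw allow 6800:7100/tcp     # default range for the other daemons (osd/mds)
    ufw enable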
[0:48] <mozg> sage: by the way, after adding a new mon, should i restart existing mons?
[0:48] <mozg> is this the reason why the cluster got frozen right after adding the 4th mon
[0:49] <sage> normally not needed. my guess is somewhere along the way one of the daemons got stopped by accident
[0:49] <sage> unless you can find a crash dump or some other indicator in the log
[0:50] <sage> the trick is that 2/3 is a quorum but 2/4 is not. that and the firewall issue is probably where things went wrong.
[0:50] <mozg> sage: last question, if i want to remove one of the mons (.2 actually), should I do with ceph-deploy?
[0:50] <mozg> will it cause any issues from 4>3 mons?
[0:51] <sage> you can, or you can just 'ceph mon remove <name>' (or whatever it is) and move/remove the /var/lib/ceph/mon/* directory yourself
[0:51] <sage> no problem as long as you still have a quorum with the reduced cluster size
[0:51] <mozg> i will try it now
[0:51] <mozg> coz i want to rebuild .2 server
[0:52] <sage> you'll probably want to update your ceph.conf's on the other nodes so that they have the revised list of mon addresses. it should work ok as-is, though, as long as at least one of the listed addrs is still valid
[0:52] <sage> tho they may take a few seconds searching to find a live mon
[0:52] <mozg> actually, .2 was one of the original ones that i've done when i created my ceph cluster with ceph-deploy
[0:52] <mozg> it is still in ceph.conf
[0:53] <mozg> coz the other servers are not in ceph.conf
[0:53] <mozg> would it cause any issues?
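(The manual removal path sage describes might look roughly like this on the .2 node; the upstart-style stop command mirrors the 'stop ceph-mon id=foo' form mentioned earlier, and the data directory is moved aside rather than deleted:)

    stop ceph-mon id=arh-cloud2-ib
    ceph mon remove arh-cloud2-ib
    mv /var/lib/ceph/mon/ceph-arh-cloud2-ib /var/lib/ceph/mon/ceph-arh-cloud2-ib.old
    # then drop the host from mon_initial_members / mon_host in ceph.conf on the other nodes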
[0:56] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:03] * markbby (~Adium@168.94.245.2) has joined #ceph
[1:07] * LeaChim (~LeaChim@2.122.178.96) Quit (Ping timeout: 480 seconds)
[1:12] * AfC (~andrew@2001:44b8:31cb:d400:cc05:1c07:9192:74e2) Quit (Quit: Leaving.)
[1:24] * joao (~JL@216.1.187.162) has joined #ceph
[1:24] * ChanServ sets mode +o joao
[1:27] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[1:29] <n1md4> hi. how can i generate an ceph.client.admin.keyring ?
[1:33] <n1md4> .found ceph-create-keys .. now need to know the ceph-mon id ..> ?
[1:41] <n1md4> ..never mind. i've totally screwed ceph up (for the 2nd time now) I'll reinstall the OS as it's the only sure way I know to ensure everything is fresh.
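(For anyone hitting the same wall: ceph-create-keys wants the id of the local monitor, which with a ceph-deploy install is normally the short hostname, and it only returns once that mon is in quorum; a rough sketch:)

    ceph-create-keys --id $(hostname -s)
    # writes /etc/ceph/ceph.client.admin.keyring once the monitors answer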
[1:46] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[1:46] * yanzheng (~zhyan@134.134.139.74) Quit (Remote host closed the connection)
[2:04] * nwat (~nwat@eduroam-251-132.ucsc.edu) Quit (Ping timeout: 480 seconds)
[2:08] * joao (~JL@216.1.187.162) Quit (Ping timeout: 480 seconds)
[2:09] * joao (~JL@216.1.187.162) has joined #ceph
[2:09] * ChanServ sets mode +o joao
[2:12] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[2:18] * huangjun (~kvirc@111.174.44.49) has joined #ceph
[2:42] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Remote host closed the connection)
[2:42] * markbby (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[2:42] * xmltok (~xmltok@relay.els4.ticketmaster.com) has joined #ceph
[2:46] * rturk-away is now known as rturk
[2:49] * yanzheng (~zhyan@134.134.139.70) has joined #ceph
[2:51] * rturk is now known as rturk-away
[3:11] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[3:12] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit ()
[3:15] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:20] * julian (~julianwa@125.69.104.58) has joined #ceph
[3:21] <huangjun> hi,all,
[3:21] <huangjun> an error occured last week and google found no answers
[3:22] <huangjun> when i compiled ceph on centos 5.9
[3:22] <huangjun> it returns compile error:
[3:22] <huangjun> cls/lock/cls_lock_client.cc:59: error: conflict with 'int rados::cls::lock::lock(librados::IoCtx*, const std::string&, const std::string&, ClsLockType, const std::string&, const std::string&, const std::string&, const utime_t&, uint8_t)'
[3:23] <huangjun> so has anyone met this problem before?
[3:25] * mozg (~andrei@host109-151-35-94.range109-151.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:32] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[3:41] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[3:45] * yy-nm (~chatzilla@218.74.33.110) has joined #ceph
[3:55] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[3:55] * Vincent_Valentine (~Vincent_V@49.206.158.155) has joined #ceph
[4:03] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[4:03] * xmltok (~xmltok@relay.els4.ticketmaster.com) Quit (Quit: Leaving...)
[4:05] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[4:07] * xmltok (~xmltok@relay.els4.ticketmaster.com) has joined #ceph
[4:08] * xmltok_ (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[4:09] * xmltok_ (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[4:11] * xmltok (~xmltok@relay.els4.ticketmaster.com) Quit (Read error: Operation timed out)
[4:35] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[4:48] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[4:49] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[5:02] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[5:05] * fireD (~fireD@93-142-207-243.adsl.net.t-com.hr) has joined #ceph
[5:06] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit ()
[5:07] * fireD_ (~fireD@93-139-173-101.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:09] * xinxinsh_ (~xinxinsh@134.134.139.70) has joined #ceph
[5:11] * danieagle (~Daniel@177.205.183.226.dynamic.adsl.gvt.net.br) has joined #ceph
[5:20] * xinxinsh_ (~xinxinsh@134.134.139.70) Quit (Quit: Leaving)
[5:21] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[5:21] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[5:24] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[5:37] * julian (~julianwa@125.69.104.58) Quit (Quit: afk)
[5:38] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Read error: No route to host)
[5:49] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[5:53] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[6:11] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[6:12] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[6:20] * nerdtron (~kenneth@202.60.8.252) has joined #ceph
[6:21] <nerdtron> hi all
[6:21] <nerdtron> is anyone here able to use ceph as a datastore in opennebula?
[6:33] * Vincent_Valentine (~Vincent_V@49.206.158.155) Quit (Ping timeout: 480 seconds)
[6:33] * ivan` (~ivan`@000130ca.user.oftc.net) Quit (Read error: Connection reset by peer)
[6:37] * ivan` (~ivan`@000130ca.user.oftc.net) has joined #ceph
[6:46] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[7:09] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[7:12] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[7:13] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[7:18] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[7:21] * huangjun (~kvirc@111.174.44.49) Quit (Ping timeout: 480 seconds)
[7:27] * rongze_ (~quassel@notes4.com) has joined #ceph
[7:27] * haomaiwa_ (~haomaiwan@117.79.232.144) has joined #ceph
[7:34] * haomaiwang (~haomaiwan@li565-182.members.linode.com) Quit (Ping timeout: 480 seconds)
[7:34] * rongze (~quassel@li565-182.members.linode.com) Quit (Ping timeout: 480 seconds)
[7:37] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[7:38] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[7:40] * nerdtron (~kenneth@202.60.8.252) Quit (Quit: Leaving)
[7:53] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[7:55] * sleinen1 (~Adium@2001:620:0:26:90f9:3bf3:dabd:e353) has joined #ceph
[7:58] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:01] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[8:14] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[8:16] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit ()
[8:21] * saabylaptop (~saabylapt@2a02:2350:18:1010:ac98:bef:86c7:dfaf) has joined #ceph
[8:22] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[8:28] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[8:29] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Read error: Operation timed out)
[8:30] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Read error: Operation timed out)
[8:47] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:48] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[8:52] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[8:52] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[8:52] * joao (~JL@216.1.187.162) Quit (Ping timeout: 480 seconds)
[9:00] * yy-nm (~chatzilla@218.74.33.110) Quit (Read error: Connection reset by peer)
[9:01] * yy-nm (~chatzilla@218.74.33.110) has joined #ceph
[9:10] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[9:16] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[9:16] * ChanServ sets mode +v andreask
[9:18] * joelio (~Joel@88.198.107.214) Quit (Ping timeout: 480 seconds)
[9:19] * huangjun (~kvirc@111.173.155.201) has joined #ceph
[9:21] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:22] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[9:32] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[9:34] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[9:36] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[9:36] * ChanServ sets mode +v andreask
[9:36] * danieagle (~Daniel@177.205.183.226.dynamic.adsl.gvt.net.br) Quit (Quit: inte+ e Obrigado Por tudo mesmo! :-D)
[9:54] * joelio (~Joel@88.198.107.214) has joined #ceph
[9:55] * dalgaaf (~dalgaaf@nrbg-4dbfcce3.pool.mediaWays.net) has joined #ceph
[9:55] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[10:00] * jamespage (~jamespage@culvain.gromper.net) has joined #ceph
[10:00] * mschiff (~mschiff@port-2854.pppoe.wtnet.de) has joined #ceph
[10:08] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Quit: Leaving.)
[10:18] * LeaChim (~LeaChim@2.122.178.96) has joined #ceph
[10:19] * Vincent_Valentine (~Vincent_V@49.206.158.155) has joined #ceph
[10:20] * dobber_ (~dobber@213.169.45.222) has joined #ceph
[10:27] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[10:31] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[10:33] * tziOm (~bjornar@194.19.106.242) has joined #ceph
[10:35] * rongze_ (~quassel@notes4.com) Quit (Read error: Connection reset by peer)
[10:36] * rongze (~quassel@117.79.232.202) has joined #ceph
[10:36] * maciek (maciek@0001bab6.user.oftc.net) Quit (Ping timeout: 480 seconds)
[10:39] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[10:40] * root_ (~chatzilla@218.94.22.130) has joined #ceph
[10:55] * frank9999 (~frank@kantoor.transip.nl) has joined #ceph
[11:10] * yanzheng (~zhyan@134.134.139.70) Quit (Remote host closed the connection)
[11:14] * allsystemsarego (~allsystem@188.25.130.190) has joined #ceph
[11:22] * yy-nm (~chatzilla@218.74.33.110) Quit (Quit: ChatZilla 0.9.90.1 [Firefox 22.0/20130618035212])
[11:29] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[11:40] * dosaboy_ (~dosaboy@host81-156-121-242.range81-156.btcentralplus.com) has joined #ceph
[11:41] * dosaboy (~dosaboy@host109-158-236-137.range109-158.btcentralplus.com) Quit (Read error: Operation timed out)
[11:47] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:e1ed:f163:805c:87f7) has joined #ceph
[11:47] * madkiss (~madkiss@2001:6f8:12c3:f00f:9540:7eb0:e1ac:1100) Quit (Ping timeout: 480 seconds)
[11:49] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) has joined #ceph
[11:52] * maciek (maciek@2001:41d0:2:2218::dead) has joined #ceph
[12:06] * AfC (~andrew@2001:44b8:31cb:d400:401c:245e:d36a:a517) has joined #ceph
[12:21] * AfC (~andrew@2001:44b8:31cb:d400:401c:245e:d36a:a517) Quit (Quit: Leaving.)
[12:21] <Gugge-47527> is there any command to show how much space is used by an rbd?
[12:22] <Kioob`Taff> not directly Gugge-47527, but you can do it with a script. I'll give you that
[12:24] <Gugge-47527> a script would do i guess :)
[12:24] <Kioob`Taff> so : BLOCKNAMEPREFIX=`rbd info $IMG | grep block_name_prefix: | awk '{ print $2 }'`
[12:24] <Kioob`Taff> then : NBBLOCK=`rados --pool $POOL ls | grep -c $BLOCKNAMEPREFIX`
[12:24] <Kioob`Taff> here, with NBBLOCK * SIZE_OF_BLOCK (4MB by default), you have an approximate value
[12:25] <Kioob`Taff> if you want the exact value : (much longer)
[12:25] <Kioob`Taff> rados --pool $POOL ls | grep $BLOCKNAMEPREFIX | xargs -r -iOBJECT rados --pool $POOL stat OBJECT | awk '{ SUM += $5 } END { print SUM/1024/1024 " MB" }'
[12:26] <Kioob`Taff> I should add: not 100% sure, maybe I made a mistake, but it's what I understood
[12:27] <Gugge-47527> and i guess running 3 million rados stat commands is gonna take time :P
[12:27] <Kioob`Taff> yes :)
[12:27] <Kioob`Taff> there is probably a better way too
[12:28] <Gugge-47527> rados ls could use a prefix too :)
[12:28] <Kioob`Taff> I mainly use the "approximative" way
[12:28] <Kioob`Taff> how, much faster I suppose !
[12:28] <Gugge-47527> it doesnt exist, but i wish it did :)
[12:28] <Gugge-47527> rados ls takes ages to run :)
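(Pulled together, Kioob`Taff's recipe reads roughly like the script below; POOL and IMG are placeholders, and the awk field for the object size is taken as-is from his one-liner:)

    #!/bin/sh
    POOL=rbd
    IMG=myimage
    BLOCKNAMEPREFIX=$(rbd -p "$POOL" info "$IMG" | awk '/block_name_prefix:/ {print $2}')
    # approximate: count allocated objects, multiply by the object size (4MB by default)
    NBBLOCK=$(rados --pool "$POOL" ls | grep -c "$BLOCKNAMEPREFIX")
    echo "approx $((NBBLOCK * 4)) MB"
    # exact but slow: stat every object and sum the sizes
    rados --pool "$POOL" ls | grep "$BLOCKNAMEPREFIX" \
      | xargs -r -I{} rados --pool "$POOL" stat {} \
      | awk '{ SUM += $5 } END { print SUM/1024/1024 " MB" }'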
[12:30] * sleinen1 (~Adium@2001:620:0:26:90f9:3bf3:dabd:e353) Quit (Remote host closed the connection)
[12:31] <Kioob`Taff> note : I see a "kernel BUG at net/ceph/osd_client.c:582!" on a Ceph client (RBD kernel client), from 29 july, thrown by a an OSD restart (on an other physical server). Should I make a bug report for that, or was it already fixed since this date ?
[12:40] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[12:54] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[12:59] * huangjun (~kvirc@111.173.155.201) Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[13:15] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:19] * Maskul (~Maskul@host-78-148-89-70.as13285.net) has joined #ceph
[13:20] * Maskul (~Maskul@host-78-148-89-70.as13285.net) Quit ()
[13:28] * link0 (~dennisdeg@backend0.link0.net) Quit (Remote host closed the connection)
[13:34] * diegows (~diegows@190.190.11.42) has joined #ceph
[13:35] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[13:37] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[13:37] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[13:37] * ChanServ sets mode +v andreask
[13:40] * dragonfly (~dragonfly@171.215.185.30) has joined #ceph
[13:42] * dragonfly (~dragonfly@171.215.185.30) Quit ()
[13:49] * topro (~topro@host-62-245-142-50.customer.m-online.net) Quit (Quit: Konversation terminated!)
[13:49] * topro (~topro@host-62-245-142-50.customer.m-online.net) has joined #ceph
[14:09] * yanzheng (~zhyan@101.82.135.209) has joined #ceph
[14:37] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:38] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[14:39] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:44] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Quit: ChatZilla 0.9.90.1 [Firefox 22.0/20130618035212])
[14:44] * huangjun (~kvirc@106.120.176.42) has joined #ceph
[14:50] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit (Quit: ZNC - http://znc.in)
[14:50] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[14:52] * dosaboy (~dosaboy@host109-158-237-180.range109-158.btcentralplus.com) has joined #ceph
[14:55] * tchmnkyz (~jeremy@0001638b.user.oftc.net) Quit (Quit: Lost terminal)
[14:58] * dosaboy_ (~dosaboy@host81-156-121-242.range81-156.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[15:02] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit (Quit: ZNC - http://znc.in)
[15:02] * oliver1 (~oliver@p4FD07DAC.dip0.t-ipconnect.de) has joined #ceph
[15:03] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:04] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:06] * sleinen (~Adium@2001:620:0:25:7908:9cda:2745:1fab) has joined #ceph
[15:08] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[15:26] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[15:28] * xdeller (~xdeller@91.218.144.129) has joined #ceph
[15:28] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit (Quit: ZNC - http://znc.in)
[15:28] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:29] * diegows (~diegows@190.190.11.42) Quit (Ping timeout: 480 seconds)
[15:32] * zhyan_ (~zhyan@114.81.215.37) has joined #ceph
[15:36] <niklas> Hi. How do I create a new pool using librados giving the amount of PGs?
[15:39] <joelio> niklas: docs are your friend? http://ceph.com/docs/master/rados/operations/pools/
[15:39] <joelio> or specifically via librados?
[15:39] * Psi-Jack (~Psi-Jack@psi-jack.user.oftc.net) Quit (Quit: ZNC - http://znc.in)
[15:39] * yanzheng (~zhyan@101.82.135.209) Quit (Ping timeout: 480 seconds)
[15:40] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:40] <niklas> yes, specifically via librados
[15:40] <niklas> default seems to be 8 (wtf??)
[15:41] <niklas> (In which scenario is 8 a valid amount of pgs?)
[15:41] <joelio> yea, I think that's crazy
[15:41] <joelio> no idea why that's default
[15:42] <joelio> especially in docs "The default value 8 is NOT suitable for most systems"
[15:42] <joelio> so why have it default then????
[15:42] * Psi-Jack (~Psi-Jack@psi-jack.user.oftc.net) Quit ()
[15:43] <niklas> yep
[15:43] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:43] <niklas> Anyways: do you know how to set the amount of pgs via librados?
[15:46] <joelio> niklas: looking at the stub examples in http://ceph.com/docs/next/rados/api/librados/ - it seems not possible.. perhaps you could overload something
[15:46] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit ()
[15:47] <joelio> maybe rados_pool_create_with_crush_rule is getting you closer?
[15:47] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:48] <niklas> I thought so, too. But what is a crush rule?
[15:48] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit ()
[15:50] <cfreak201> I'm currently thinking of possible outage scenarios and if ceph would "survive" those: Given 3 racks, each rack has 1 mon, 2 racks have osd's and "consumers" (nova-compute), now the inter-rack communication fails (for whatever reason).. Each compute node should still work right? What happens if the connection gets reestablished? Does it merge the data without conflicts since each rack/node has its dedicated ceph volumes?
[15:50] <joelio> niklas: well, that's more for placement rules.. I'd expect there to be extra parameterisation to 'int rados_pool_create(rados_t cluster, const char *pool_name)'
[15:50] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:50] <joelio> have a PG num..
[15:50] <joelio> I think this is an oversight perhaps?
[15:51] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit ()
[15:51] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[15:52] * diegows (~diegows@190.190.11.42) has joined #ceph
[15:52] <joelio> cfreak201: if the hypervisor supports native Ceph and you're not baking in via an rbd mount or something, then the compute node will keep going. It'll look up replicas if primary data is not available
[15:53] * Mithril (~Melkor@208.175.141.7) has joined #ceph
[15:53] <joelio> conflicts resolved based on time of outages, delta of drift, map version.. (I think!)
[15:53] * joelio no Ceph dev
[15:53] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[15:53] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit ()
[15:53] <niklas> joelio: thats what I worked out
[15:53] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[15:54] <niklas> I don't see how to set the amount of PGs
[15:54] * sleinen (~Adium@2001:620:0:25:7908:9cda:2745:1fab) Quit (Quit: Leaving.)
[15:54] <joelio> maybe a feature request needed?
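(Until librados exposes a pg_num parameter, the practical workaround behind this exchange is the CLI; a sketch with a placeholder pool name and PG count:)

    # create with an explicit PG count from the start
    ceph osd pool create mypool 512 512
    # or fix up a pool created through librados with the 8-PG default
    ceph osd pool set mypool pg_num 512
    ceph osd pool set mypool pgp_num 512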
[15:54] <cfreak201> joelio: thanks, i really hope it just merges clean otherwise it will be hard to argue for ceph as active-active solution...
[15:56] * BillK_ (~BillK-OFT@124-148-246-233.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:56] <joelio> cfreak201: I've had stuff turned off for a week at a time and it resolved everything ok - was really shocked it worked!
[15:57] <joelio> also active-active?
[15:57] <mozg> hello guys
[15:57] <joelio> cfreak201: there's no standby concept (bar mds mons)
[15:57] <mozg> I was wondering if anyone using -lowlatency kernels with ceph?
[15:58] <mozg> if so, does it provide any benefit?
[15:59] <joelio> mozg: lowlatency generally for desktop usage? stuff like audio and video capture?
[16:00] <mozg> joelio: is it not also used for the HPC setups?
[16:00] <mozg> to decrease the latency in networking
[16:00] <phantomcircuit> mozg, usually low latency kernels are tweaked so that audio/video is low latency
[16:00] <mozg> for things like infiniband?
[16:00] <phantomcircuit> which probably wouldn't do anything with ceph
[16:01] <joelio> thought later kernels were tickless now anyways
[16:01] <mozg> i remember when i was reading docs on setting up the infiniband infrastructure they were mentioning that it is recommended to use low latency kernels
[16:01] <mozg> to decrease latency
[16:01] <joelio> try it, bench it, analyse it..
[16:01] <joelio> doubt there's all that much in it tbh
[16:01] <mozg> so, I thought that ceph might also benefit from these tunes
[16:01] <mozg> i see
[16:01] <mozg> probably a waste of effort
[16:03] * zhyan_ (~zhyan@114.81.215.37) Quit (Ping timeout: 480 seconds)
[16:04] <phantomcircuit> mozg, the low latency kernel just means preemption is enabled and the clock frequency is set to 1000
[16:05] <phantomcircuit> in most cases it actually reduces throughput
[16:05] <phantomcircuit> since you waste a bunch of time constantly context switching
[16:05] <mozg> i see
[16:05] <mozg> thanks
[16:09] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[16:13] * skm (~smiley@205.153.36.170) has left #ceph
[16:18] * saabylaptop (~saabylapt@2a02:2350:18:1010:ac98:bef:86c7:dfaf) Quit (Quit: Leaving.)
[16:24] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit (Quit: ZNC - http://znc.in)
[16:24] * doubleg (~doubleg@69.167.130.11) has joined #ceph
[16:25] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[16:26] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) Quit ()
[16:26] * Psi-Jack (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[16:26] * zhyan_ (~zhyan@114.81.215.37) has joined #ceph
[16:29] <zhyan_> cfreak201, i think io hangs in this case
[16:30] <cfreak201> zhyan_: even thought i have min_size 1 and size==2 ?
[16:33] <zhyan_> if you can't access objects whose primary osd is on the disconnected node
[16:34] <cfreak201> ok, what if the primary osd is permanently gone? I'll have to resolve it manually ?
[16:36] <zhyan_> ceph remaps objects automatically if the primary osd is gone
[16:36] <cfreak201> ok in my above scenario it would remap in both racks ?
[16:37] <zhyan_> no
[16:38] <zhyan_> your network is partitioned; the sub-network that contains two mons will remap objects
[16:38] <cfreak201> ok
[16:38] <zhyan_> the sub-network that contains only one mon won't
[16:39] <cfreak201> ok, thats what i wanted to hear
[16:40] <Psi-Jack> Blah. I forgot to update my ceph cluster this weekend from 0.61.5
[16:40] <cfreak201> always (n/2)+1 mon wins ?
[16:40] <zhyan_> yes
[16:40] <Psi-Jack> Is there any issues updating from 0.61.5 to 0.61.7 as I saw with 0.61.4->0.61.5 where OSDs and MONs wouldn't communicate at all until they were all up-to-date?
[16:46] <mozg> Psi-Jack: i've upgraded from .5 to .7 skipping .6
[16:46] <mozg> and didn't have any issues
[16:46] <mozg> but having said this, my cluster has only 2 osd servers
[16:46] <mozg> with 16 osds in total
[16:47] <Psi-Jack> I have 3 OSD servers totalling 9 OSDs in total.
[16:47] * zhyan__ (~zhyan@101.82.248.189) has joined #ceph
[16:48] <Psi-Jack> But, my issue between .4 to .5 was when I updated ceph1 from .4 to .5, it was no longer able to join the cluster group, and so it was literally still down until I took down ceph2, updated it, then that dissolved quorum, and re-made a new quorum.
[16:48] <Psi-Jack> dissolving quorum and making a new quorum is bad. :)
[16:48] <mozg> I think there was a warning about mons during the upgrade
[16:48] <mozg> coz i had done similar upgrade
[16:48] <Psi-Jack> Hmm
[16:49] <mozg> and the guys here recommended to do the upgrade and ceph restart all at the same time
[16:49] <mozg> and it worked like a charm
[16:49] <Psi-Jack> Did you verify yours re-joined the cluster and obtained quorum, up status?
[16:49] <mozg> didn't have any issues
[16:49] <Psi-Jack> Hmmm, okay.
[16:50] <Psi-Jack> I usually do mine in sequence. Ceph1->Ceph2->Ceph3, that way, I lose no quorum during the process, in most cases, and HA retains availability. When I did .4->.5 the loss of quorum and re-joining of new quorum definitely caused some annoyance.
[16:51] <Psi-Jack> Ceph is my primary storage for RBD disks for VM guest machines, and CephFS for shared data amongst multiple VMs.
[16:52] <mozg> i am also running ceph for vms
[16:52] <mozg> had an issue yesterday, which sage kindly helped me to resolve
[16:52] * tziOm (~bjornar@194.19.106.242) Quit (Remote host closed the connection)
[16:52] <mozg> when i've added the 4th mon server my ceph cluster stopped working
[16:52] <mozg> at the end it was the firewall on the 4th mon that was blocking access to port 6789 (((
[16:53] * zhyan_ (~zhyan@114.81.215.37) Quit (Ping timeout: 480 seconds)
[16:53] <Psi-Jack> Heh, fun
[16:53] <Psi-Jack> Yeah, I have my osd, mon, and mds all on the same 3 servers.
[16:53] <Psi-Jack> SO upgrading is generally easy, only 3 physical servers to deal with.
[16:54] <Psi-Jack> But, temporary disk loss due to quorum dissolving can be bad. :)
[16:55] * dosaboy (~dosaboy@host109-158-237-180.range109-158.btcentralplus.com) Quit (Quit: leaving)
[16:55] * dosaboy (~dosaboy@host109-158-237-180.range109-158.btcentralplus.com) has joined #ceph
[16:59] * al (d@niel.cx) Quit (Ping timeout: 480 seconds)
[17:04] * al (quassel@niel.cx) has joined #ceph
[17:10] <absynth> any inktank employee around?
[17:12] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:15] * sprachgenerator (~sprachgen@130.202.135.214) has joined #ceph
[17:19] * scuttlemonkey (~scuttlemo@2607:f298:a:607:b440:aef4:bd00:1f5f) has joined #ceph
[17:19] * ChanServ sets mode +o scuttlemonkey
[17:19] * zhyan__ (~zhyan@101.82.248.189) Quit (Ping timeout: 480 seconds)
[17:26] <madkiss1> absynth: why?
[17:27] <wschulze> absynth: why? ;-)
[17:32] * yehudasa__ (~yehudasa@2602:306:330b:1410:2420:498a:1917:b8f9) Quit (Ping timeout: 480 seconds)
[17:33] * scuttlemonkey_ (~scuttlemo@2607:f298:a:607:b440:aef4:bd00:1f5f) has joined #ceph
[17:34] * scuttlemonkey (~scuttlemo@2607:f298:a:607:b440:aef4:bd00:1f5f) Quit (Read error: Connection reset by peer)
[17:35] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[17:36] * devoid (~devoid@130.202.135.215) has joined #ceph
[17:37] <ccourtaut> loicd: regarding S3 compliance documentation page
[17:38] <ccourtaut> here comes a few points to discuss
[17:38] <ccourtaut> How do we keep track of the S3 API documentation changes? Local mirror and wget? custom crawler?
[17:38] <ccourtaut> How do we keep code links up to date?
[17:38] <ccourtaut> How do we determine the support status of rgw?
[17:38] <ccourtaut> What level of detail do we need in our documentation page? One entry in table per feature? one page per feature?
[17:38] <ccourtaut> there might be more to discuss tought
[17:39] <loicd> ccourtaut: how would you link a http://tracker.ceph.com/ issue to the matching S3 feature being worked on ?
[17:40] <loicd> yehudasa: may have an opinion on this :-)
[17:40] <ccourtaut> loicd: that's a good point
[17:40] <ccourtaut> i think the most important issue behind all this is the structure of the document
[17:43] * scuttlemonkey (~scuttlemo@38.122.20.226) has joined #ceph
[17:43] * ChanServ sets mode +o scuttlemonkey
[17:43] * scuttlemonkey_ (~scuttlemo@2607:f298:a:607:b440:aef4:bd00:1f5f) Quit (Read error: Connection reset by peer)
[17:43] <loicd> https://github.com/kri5/ceph/blob/wip-s3-compliance-doc/doc/radosgw/s3_compliance.rst shows a lot of research and would certainly be very useful to a new contributor
[17:43] <ccourtaut> loicd: i hope so! :)
[17:44] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:44] <loicd> aren't you afraid that it will quickly become obsolete as code drifts away ? I mean that you have urls tagged with the commit so it will still be a good reference for a past implementation.
[17:44] <ccourtaut> loicd: but there is a lot of information by now, and it is starting to become hard to read because we have horizontal scrolling on the tables
[17:45] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:45] <ccourtaut> loicd: yes but it's for now the only way i've found to give links that really mean something, even if they become obsolete
[17:45] * dobber_ (~dobber@213.169.45.222) Quit (Remote host closed the connection)
[17:46] <ccourtaut> loicd: btw, even if the code becomes obsolete, the link still refers to the implementation of an S3 feature
[17:46] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[17:46] <ccourtaut> if the status of the feature is supported
[17:46] <ccourtaut> then even if the code has moved around, by that time the feature was fully implemented
[17:46] <loicd> ok
[17:47] <ccourtaut> though we still need to figure out a way to spot obsolete code links and to be able to update them
[17:47] * wonkotheinsane (~jf@jf.ccs.usherbrooke.ca) has joined #ceph
[17:47] <loicd> GET Object N/A
[17:47] * wonkotheinsane (~jf@jf.ccs.usherbrooke.ca) Quit ()
[17:47] <n1md4> hi. how can I completely remove all ceph config and start again?
[17:48] <loicd> ccourtaut: from your research, is it implemented fully or partially? I assume N/A was not yet updated, right?
[17:48] <joelio> n1md4: uninstall binaries, purge working directories and config dirs..
[17:48] <joelio> reinstall
[17:48] * wonkotheinsane (~jf@jf.ccs.usherbrooke.ca) has joined #ceph
[17:49] <ccourtaut> loicd: you're right, all the N/A features are still N/A because i'm not sure of the completeness of the feature in radosgw
[17:49] * sagelap (~sage@2600:1012:b00f:3d53:5040:250a:77d2:ee44) has joined #ceph
[17:49] <n1md4> joelio: yeah, i thought that would be the simplest too, ta
[17:50] <joelio> n/p - ceph doesn't leave that much cruft.. /var/{lib,log,run}/ceph /etc/ceph
[17:50] <joelio> unless you chose to custom deploy osd locations etc..
[17:51] <joelio> in which case just purge those instead
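(Roughly what that uninstall/purge amounts to on a Debian/Ubuntu node; package names are the usual ones for this release, and the rm destroys all local ceph state:)

    service ceph stop
    apt-get purge ceph ceph-common ceph-mds
    rm -rf /etc/ceph /var/lib/ceph /var/log/ceph /var/run/ceph
    # then reinstall and redeploy from scratch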
[17:51] <loicd> ccourtaut: are you sure that it implements parts of it ( for GET Object ) ?
[17:52] <loicd> if so it would be very useful to display which parts you think are fully implemented or not. http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadListMPUpload.html
[17:52] <loicd> is going to be in the same case, don't you think ?
[17:52] <loicd> ccourtaut: ^
[17:53] <ccourtaut> yes it might be, that's why the question about the level of detail of the documentation page
[17:53] * wonkotheinsane (~jf@jf.ccs.usherbrooke.ca) Quit (Quit: WeeChat 0.3.7)
[17:53] * wonkotheinsane (~jf@jf.ccs.usherbrooke.ca) has joined #ceph
[17:53] <ccourtaut> i think multiple pages would keep things clearer, but would be harder to maintain
[17:54] * huangjun (~kvirc@106.120.176.42) Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[17:54] * skm (~smiley@205.153.36.170) has joined #ceph
[17:54] <loicd> you could use GET object as an example of what such a detailed entry will be, otherwise it will be difficult for people to picture since they are not so familiar with the problem.
[17:54] <absynth> wschulze: is dona still with you?
[17:54] <loicd> ccourtaut: if not multiple pages, one page with more entries ?
[17:54] <absynth> wschulze: or shouuld i talk about contract issues with you?
[17:55] <absynth> issues as in questions, not problems
[17:55] <ccourtaut> loicd: maybe with more sections to separate the feature?
[17:55] <ccourtaut> s/feature/features/
[17:55] * scuttlemonkey (~scuttlemo@38.122.20.226) Quit (Quit: my troubles seem so far away, now yours are too...)
[17:55] <loicd> ccourtaut: yes
[17:55] <wschulze> absynth: Dona is of course still with us ;-)
[17:55] * scuttlemonkey (~scuttlemo@38.122.20.226) has joined #ceph
[17:55] * ChanServ sets mode +o scuttlemonkey
[17:56] * tnt (~tnt@109.130.80.16) has joined #ceph
[17:56] <absynth> ok, so i can just mail her for our contract?
[17:57] <wschulze> Yes, you can.
[17:57] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[17:59] <loicd> ccourtaut: http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking&action=diff&revision=11&diff=12 for the record
[17:59] <ccourtaut> loicd: great!
[18:00] * oliver1 (~oliver@p4FD07DAC.dip0.t-ipconnect.de) has left #ceph
[18:01] * xmltok (~xmltok@pool101.bizrate.com) has joined #ceph
[18:02] <loicd> ccourtaut: "transversal rules / behavior should be listed and matched against code " ( from the description of http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking ) did you introduce an example of how it should look like in your latest https://github.com/kri5/ceph/blob/wip-s3-compliance-doc/doc/radosgw/s3_compliance.rst ?
[18:04] <ccourtaut> loicd: no, there is still no example of that
[18:06] * diegows (~diegows@190.190.11.42) Quit (Ping timeout: 480 seconds)
[18:08] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[18:08] <loicd> ccourtaut: I proposed a direction for keeping track of the S3 pages : http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking&action=diff&revision=12&diff=13
[18:09] <loicd> if someone has a better idea tomorrow during the discussion... it will be changed
[18:10] <ccourtaut> ok sound good to me
[18:10] <ccourtaut> as they seems to version the API
[18:10] <ccourtaut> but not really :D
[18:10] <loicd> ccourtaut: regarding "transversal rules / behavior should be listed and matched against code " I think an example would be most useful for yehudasa to comment on. At this point it is probably more important to show examples ( even if not thought thru ) of what the table would look like than exploring the code extensively.
[18:11] <loicd> ccourtaut: ahahah +1 on the "not really versioned API"
[18:11] <ccourtaut> "Amazon Simple Storage Service
[18:11] <ccourtaut> API Reference (API Version 2006-03-01"
[18:11] <ccourtaut> might be a little outdated :P
[18:14] * Cube (~Cube@12.248.40.138) has joined #ceph
[18:17] <loicd> I proposed this to maintain the links to the code : http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Emperor/Create_and_Maintain_S3_feature_list_for_compatibility_tracking&action=diff&revision=13&diff=14
[18:18] <loicd> feel free to adjust if it does not make sense. It's a bootstrap for discussion, not really thought thru ;-)
[18:18] <loicd> ccourtaut: ^
[18:19] <ccourtaut> loicd: ok
[18:22] <loicd> ccourtaut: gtg, will reconnect in 30min
[18:22] <ccourtaut> loicd: ok
[18:27] * dosaboy_ (~dosaboy@host86-156-252-32.range86-156.btcentralplus.com) has joined #ceph
[18:28] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[18:29] * dosaboy (~dosaboy@host109-158-237-180.range109-158.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[18:30] <mikedawson> joshd: ping
[18:33] * alfredodeza (~alfredode@38.122.20.226) has joined #ceph
[18:34] * scuttlemonkey changes topic to 'TODAY && TOMORROW -- Ceph Developer Summit: Emperor - http://ceph.com/cds || Latest stable (v0.61.7 "Cuttlefish") -- http://ceph.com/get'
[18:38] * scuttlemonkey_ (~scuttlemo@2607:f298:a:607:24c9:9969:aa27:163b) has joined #ceph
[18:38] * sagelap (~sage@2600:1012:b00f:3d53:5040:250a:77d2:ee44) Quit (Ping timeout: 480 seconds)
[18:41] * alram (~alram@38.122.20.226) has joined #ceph
[18:42] * sagelap (~sage@38.122.20.226) has joined #ceph
[18:42] * gregmark (~Adium@cet-nat-254.ndceast.pa.bo.comcast.net) has joined #ceph
[18:42] * scuttlemonkey (~scuttlemo@38.122.20.226) Quit (Read error: Operation timed out)
[18:47] <cfreak201> Boot from an image with nova-compute shouldn't be a problem right? create a new volume from the image (cinder create --image...) and then it should be bootable ? qemu doesn't seem to be able to find/boot from that volume. qemu-img info on the same volume works fine..
[18:47] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:49] * joao (~JL@38.122.20.226) has joined #ceph
[18:49] * ChanServ sets mode +o joao
[18:49] * bergerx_ (~bekir@78.188.101.175) Quit (Quit: Leaving.)
[18:51] * duff_ (~duff@199.181.135.135) has joined #ceph
[18:51] <mikedawson> cfreak201: Yes. We create a Cinder volume from a Glance image, then we launch an instance from that cinder volume.
[18:52] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[18:52] <cfreak201> mikedawson: what are you using as base image ?
[18:52] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[18:53] <mikedawson> cfreak201: Anything we've prepped for Glance (Windows Server, Ubuntu Cloud Image, Cirros, etc)
[18:53] <cfreak201> mhm
[18:54] <cfreak201> qcow2 -> raw converted i guess ?
[18:55] <joelio> is qcow2 supported in rbd? Thought it's all raw
[18:56] <cfreak201> joelio: as far as i know yes, but glance ofc does not complain ;)
[18:56] * joelio not an openstack user
[18:57] <joelio> I know in OpenNebula you have to use raw images as qcow2 not supported - so no state snapshots (which I can live without personally)
[18:57] <joelio> (within Ceph I mean.. qcow2 supported by OpenNebula ofc)
[18:58] * scuttlemonkey (~scuttlemo@2607:f298:a:607:94a3:6831:7ee1:8fdd) has joined #ceph
[18:58] * ChanServ sets mode +o scuttlemonkey
[18:58] <cfreak201> and again "No bootable device" :(
[18:58] <loicd> ccourtaut: if I'm not mistaken there is a way to introduce permalinks in RST and also make sure they show on github ( something like *foo or something )
[18:59] * yehudasa__ (~yehudasa@2607:f298:a:607:ea03:9aff:fe98:e8ff) has joined #ceph
[18:59] <joelio> cfreak201: are you using rbd or cephfs - no support for qcow2 in rbd afaik - look here http://ceph.com/docs/next/rbd/qemu-rbd/
[19:00] <joelio> has to be raw
[19:00] <cfreak201> joelio: i've created a disk from a qcow2 -> raw converted image
[19:01] <joelio> can you see what XML is being generated, or how the vm is beind instantiated?
[19:01] <joelio> s/see/share/
[19:01] <cfreak201> joelio: the qemu cmdline looks fine
[19:01] <n1md4> hello, again. I have a testing cluster setup, with 2 nodes, with 3 standalone-disk osds on each. I would like to make this storage available as shared storage for a pair of xenserver boxes I've setup. With the storage ready, what would be my next step to making this available?
[19:01] <ccourtaut> loicd: i'll take a look
[19:01] <cfreak201> joelio: atleast as much as i can tell but i'll share 1 sec..
[19:03] * scuttlemonkey_ (~scuttlemo@2607:f298:a:607:24c9:9969:aa27:163b) Quit (Ping timeout: 480 seconds)
[19:04] <cfreak201> joelio: http://nopaste.info/4842f91c28.html
[19:05] <loicd> ccourtaut: http://pad.ceph.com/p/rgw-s3-compatibility a pad has been created and linked to ccourtaut: http://pad.ceph.com/p/rgw-s3-compatibility a pad has been created and linked to ccourtaut: http://pad.ceph.com/p/rgw-s3-compatibility a pad has been created and linked to http://wiki.ceph.com/01Planning/CDS/Emperor
[19:05] <loicd> ccourtaut: sorry about that. buggy copy / paste ;)
[19:06] <ccourtaut> it happens! :)
[19:07] <mikedawson> cfreak201: yes, we convert qcow2 images into raw before uploading to glance
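(A sketch of that prep path; image names, the volume size and the image UUID are placeholders, and the CLI flags are the glance/cinder ones of this era:)

    qemu-img convert -f qcow2 -O raw ubuntu.qcow2 ubuntu.raw
    glance image-create --name ubuntu-raw --disk-format raw --container-format bare --file ubuntu.raw
    cinder create --image-id <IMAGE_UUID> --display-name boot-vol 10
    # then launch the instance from the resulting cinder volume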
[19:07] <joelio> cfreak201: and that command line works?
[19:08] <cfreak201> joelio: it works except for the part that it's not booting :P
[19:08] <joelio> so, no then :)
[19:08] <joelio> you have bootindex set on one of the devices
[19:09] <ccourtaut> loicd: dropped some thoughts on the pad to keep track of what we've talked about on the subject
[19:09] <joelio> cfreak201: here's one of mine that work https://gist.github.com/anonymous/5fc7f8a2e96189aee437/raw/ffe302ca339b494e415cc90c3a26b8051e82d2de/gistfile1.txt
[19:10] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[19:10] <loicd> ideally the pad ( very much like the openstack summit pads ) should be an agenda for the session.
[19:10] <joelio> cfreak201: I don't know what additional stuff those extra devices provide, but I'd remove them if possible (especially the one with bootindex) and retry?
[19:10] <joelio> just try command line instantiation, remove that dev
[19:10] <ccourtaut> loicd: oh, ok
[19:11] * bandrus (~Adium@12.248.40.138) has joined #ceph
[19:12] <loicd> ccourtaut: there are no rules though ;-) I found it most useful during the summit to use it in this way. Agenda + URLs + things to be discussed + summarize the discussions
[19:13] <ccourtaut> yes i see
[19:13] <ccourtaut> i have written the various questions that will be discussed on the pad, the ones that i showed you earlier
[19:13] * scuttlemonkey changes topic to 'TODAY && TOMORROW -- Ceph Developer Summit: Emperor - http://ceph.com/cds JOIN: #ceph-summit for chat || Latest stable (v0.61.7 "Cuttlefish") -- http://ceph.com/get'
[19:14] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) Quit (Ping timeout: 480 seconds)
[19:15] <duff_> dmick: Well I'm back. So I deleted the instances and started again just to make sure everything was fresh. This time putting the mon on both nodes in the cluster. I'm still getting the ceph-create-keys command just spinning. Copied all of the output and a bit of poking at things to: https://friendpaste.com/3sRuNdM862zwA37bDT3PHg.
[19:16] * joao (~JL@38.122.20.226) Quit (Ping timeout: 480 seconds)
[19:16] * L2SHO (~adam@office-nat.choopa.net) has joined #ceph
[19:16] * gregaf (~Adium@2607:f298:a:607:112c:1fa8:77e1:af2e) Quit (Ping timeout: 480 seconds)
[19:20] * odyssey4me (~odyssey4m@41-133-58-101.dsl.mweb.co.za) has joined #ceph
[19:20] <mikedawson> duff_: I haven't used ceph-deploy, but it looks like you should have valid addresses in the monmap for node01 and node02
[19:21] <mikedawson> duff_: until the monitors achieve quorum, ceph-create-keys will hang forever
[19:21] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) Quit (Quit: Leaving.)
[19:23] <mikedawson> duff_: or perhaps you don't need valid addr's in the monmap yet, as I see potentially valid addresses in extra_probe_peers. Perhaps someone more knowledgeable with ceph-deploy could help
[19:23] * joao (~JL@2607:f298:a:607:9eeb:e8ff:fe0f:c9a6) has joined #ceph
[19:23] * ChanServ sets mode +o joao
[19:23] * nhm (~nhm@184-97-255-87.mpls.qwest.net) has joined #ceph
[19:23] * ChanServ sets mode +o nhm
[19:25] <sagewk> loicd: if you want to stick around after the standup we can do it now too
[19:25] <loicd> sagewk: sure :-)
[19:28] * Vincent_Valentine (~Vincent_V@49.206.158.155) Quit (Ping timeout: 480 seconds)
[19:31] * davidzlap (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[19:31] <duff_> mikedawson: thanks, that got it. I got a little lost by the instructions to edit ~/.ssh/config on http://ceph.com/docs/master/start/quick-start-preflight/. I was just picking names that were easy for me, then the nodes had no idea what the names resolved to. Used the actual hostnames and that problem went away.
[19:35] * sagelap (~sage@38.122.20.226) Quit (Quit: Leaving.)
[19:37] <mikedawson> duff_: glad to help
[19:38] * xmltok (~xmltok@pool101.bizrate.com) Quit (Remote host closed the connection)
[19:38] * xmltok (~xmltok@relay.els4.ticketmaster.com) has joined #ceph
[19:39] <L2SHO> What would be the best way to disable cephx on a running cluster?
[19:41] <tnt> not ?
[19:42] <L2SHO> unfortunately that's not going to work for me
[19:42] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[19:43] <cfreak201> yay i succeeded... dd if=/dev/vda of=/dev/vdb... started a vm from the image (on local FS), copied the content to the rbd disk and now I could boot from it... cinder create --image ... doesn't seem to work quite well
[19:43] <L2SHO> can I shutdown all osd's and mon's and then set auth_supported=none in all the ceph.conf's and then restart everything? Or is there something else stored somewhere in the mon's that I'll need to change too?
[19:44] <tnt> L2SHO: no that'll work. But that's not "on a running cluster" ... you're shutting it down.
[19:45] <L2SHO> tnt, maybe I should have said existing cluster
[19:46] <tnt> Then yes, it's totally possible and just changing the ceph.conf will work. You also need to change the config of clients where applicable.
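(The ceph.conf change tnt describes would look something like this; the three 'auth ... required' options are the cuttlefish-era spelling and the commented line is the older equivalent. Set it on every node, clients included, then restart the daemons:)

    [global]
        auth cluster required = none
        auth service required = none
        auth client required = none
        # pre-cuttlefish equivalent:
        # auth supported = none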
[19:48] <L2SHO> tnt, I don't suppose there's a way to keep cephx enabled, but allow certain IP's to bypass it?
[19:49] <joao> sagewk, wip-5648-b looks good
[19:52] <tnt> L2SHO: don't think so no.
[19:52] <tnt> but I'm no expert ..
[19:53] <sagewk> joao: cool, want to squash it down?
[19:53] <sagewk> also i didn't actually test it :) so you should finish your tests before we merge it
[19:57] * odyssey4me (~odyssey4m@41-133-58-101.dsl.mweb.co.za) Quit (Ping timeout: 480 seconds)
[19:57] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[19:58] * gentleben (~sseveranc@c-98-207-40-73.hsd1.ca.comcast.net) Quit (Quit: gentleben)
[19:58] * diegows (~diegows@200.68.116.185) has joined #ceph
[19:58] * xmltok_ (~xmltok@pool101.bizrate.com) has joined #ceph
[20:01] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[20:05] * xmltok (~xmltok@relay.els4.ticketmaster.com) Quit (Ping timeout: 480 seconds)
[20:13] <joelio> L2SHO: just recompile libvirt to force auth=cephx ;)
[20:14] <joelio> as mentioned in #opennebula it's dirty, but does work well
[20:15] <joao> sagewk, sure
[20:15] <joao> taking care of that now
[20:15] <joelio> cfreak201: any luck?
[20:18] * joelio spots success
[20:18] <joelio> nice
[20:19] <joelio> cfreak201: If it helps, I use veewee for image curation. Really nice system for maintaining multiple images
[20:21] * gentleben (~sseveranc@216.55.31.102) has joined #ceph
[20:26] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[20:30] * devoid (~devoid@130.202.135.215) Quit (Ping timeout: 480 seconds)
[20:36] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[21:02] * mozg (~andrei@host109-151-35-94.range109-151.btcentralplus.com) has joined #ceph
[21:02] * sagelap (~sage@2607:f298:a:607:c5e:7bb0:c323:186c) has joined #ceph
[21:06] * tziOm (~bjornar@ti0099a340-dhcp0395.bb.online.no) has joined #ceph
[21:12] * terje (~joey@184-96-143-206.hlrn.qwest.net) Quit (Ping timeout: 480 seconds)
[21:14] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[21:14] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[21:33] * allsystemsarego (~allsystem@188.25.130.190) Quit (Quit: Leaving)
[21:44] * devoid (~devoid@130.202.135.215) has joined #ceph
[21:51] <MACscr> is there any sort of ceph management/reporting gui available?
[21:54] * andreask (~andreask@h081217135028.dyn.cm.kabsi.at) has joined #ceph
[21:54] * ChanServ sets mode +v andreask
[21:58] * sagelap (~sage@2607:f298:a:607:c5e:7bb0:c323:186c) Quit (Ping timeout: 480 seconds)
[21:58] <dmick> MACscr: not yet. Plans are afoot, however.
[21:58] <scuttlemonkey> http://wiki.ceph.com/01Planning/02Blueprints/Dumpling/Ceph_management_API
[21:58] <dmick> look, there's one afoot now
[21:58] <scuttlemonkey> no foots were harmed in the making of this blueprint
[21:59] <dmick> actually, that piece is coming in dumpling, but it's just a building block for the larger console project
[22:00] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[22:00] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[22:02] <mikedawson> did the cephdrop password change?
[22:02] <mikedawson> got it
[22:04] <MACscr> well at least its in the works. Seems quite a ways off, but still promising
[22:07] <scuttlemonkey> MACscr: not as far as you might think
[22:09] <scuttlemonkey> Dumpling is being released this month, and the management console (which is part of the enterprise subscription) should follow it before the Emperor release (Nov)
[22:12] <janos> is there a definitive bobtail-to-cuttlefish upgrade document and procedure?
[22:13] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:13] <janos> i need to get on that process before i get too far behind
[22:14] <dmick> http://ceph.com/docs/master/install/upgrading-ceph/#upgrading-from-bobtail-to-cuttlefish; also review the release notes for Cuttlefish releases: http://ceph.com/docs/master/release-notes
[22:14] <janos> thanks!
[22:14] <janos> will look
[22:16] * sagelap (~sage@38.122.20.226) has joined #ceph
[22:16] <janos> party. i hope it goes as straight-forward as this suggests ;)
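(The per-node flow from that doc boils down to something like the following; Debian/Ubuntu packaging assumed, monitors upgraded one at a time before the OSDs and MDSes:)

    apt-get update && apt-get install ceph ceph-common
    service ceph restart mon    # wait for it to rejoin quorum before moving on
    service ceph restart osd
    service ceph restart mds
    # upstart-based installs: restart ceph-mon-all / ceph-osd-all / ceph-mds-all instead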
[22:17] <mikedawson> joshd: RBD client logs for the qemu deadlock are on cephdrop, called mikedawson-rbd-qemu-deadlock. Let me know what else you need
[22:17] <joshd> mikedawson: thanks
[22:18] <mikedawson> joshd: my untrained eye can spot a pattern. Hopefully your trained eye can spot the issue!
[22:26] * TirixTa (~TirixTa@187.252.46.172) has joined #ceph
[22:27] <TirixTa> [racist spam link removed]
[22:27] <paravoid> wow, seriously?
[22:28] <janos> impressive, in all the wrong ways
[22:29] * nhm sets mode +b *!*TirixTa@187.252.46.*
[22:29] * TirixTa was kicked from #ceph by nhm
[22:30] <janos> oh right, because THAT was my problem with it
[22:30] <janos> *cough*
[22:30] <janos> thank you, nhm
[22:31] <nhm> janos: np. :)
[22:33] <scuttlemonkey> hooray for trolls
[22:36] <nhm> I'm surprised there are as many trolls on this network as there are. I thought they all hung out on efnet and undernet.
[22:36] <scuttlemonkey> hehe
[22:38] <nhm> wow, freenode is biggest now.
[22:38] <nhm> efnet is down to like 33k users at peak hours.
[22:40] <nhm> oftc has like 9k
[22:41] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[22:42] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[22:46] * odyssey4me2 (~odyssey4m@41-133-58-101.dsl.mweb.co.za) has joined #ceph
[22:46] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[22:46] <cjh_> when a 1MB write is issued to ceph does it write that 1MB in 1 chunk or does it cut it up into smaller pieces? I think the default rados object size is 4MB, so I would guess it's 1 write call.
[22:50] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[22:53] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[22:54] * odyssey4me2 (~odyssey4m@41-133-58-101.dsl.mweb.co.za) Quit (Ping timeout: 480 seconds)
[22:55] * dalgaaf (~dalgaaf@nrbg-4dbfcce3.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[22:57] * Vincent_Valentine (~Vincent_V@49.206.158.155) has joined #ceph
[22:58] <nhm> cjh_: Ceph won't break it up typically, but it's possible that it could get broken up at the block layer or lower.
[23:01] <jeff-YF> I attempted to follow the example of placing a pool on a set of OSDs as shown here: http://ceph.com/docs/master/rados/operations/crush-map/ After I created the pools and set the crush_ruleset, ceph shows all the pgs for my SSD ruleset as stuck unclean and "active+remapped". Anyone have any experience with this?
[23:02] <cjh_> nhm: ok cool
[23:02] <cjh_> that's what i was hoping was the answer :)
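To see the striping unit that makes this true for RBD, for instance: images default to 4 MiB objects (order 22, i.e. 2^22 bytes), so a 1 MiB write that stays within one object boundary reaches the OSDs as a single write op. A quick way to check (the image name below is just an example):

    # create a test image with an explicit 4 MiB object size (the default)
    rbd create --pool rbd --size 1024 --order 22 striping-test
    # 'rbd info' reports the order / object size used for striping
    rbd info --pool rbd striping-test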
[23:02] <scuttlemonkey> jeff-YF: assuming you are injecting that new map on an existing cluster?
[23:02] <cjh_> i'm excited for dumpling to land. i want to do some more perf tests on it
[23:02] <jeff-YF> scuttlemonkey: yes, that is correct.
[23:03] <scuttlemonkey> active+remapped is the cluster rebalancing pgs. Did you give it a bit of time to chunk things around?
[23:03] <devoid> what's the process for getting a pull request reviewed?
[23:04] <jeff-YF> it's been a few hours and it hasn't changed
[23:04] <scuttlemonkey> devoid: link it in here and ask someone to look
[23:04] <scuttlemonkey> devoid: that said my task after this week is to start capturing governance and procedure on the wiki...so hopefully I will be able to point you at something a bit more comprehensive soon
[23:04] <devoid> https://github.com/ceph/ceph/pull/479 (small docs fix that nhm wanted)
[23:05] <devoid> scuttlemonkey: cool, I just wanted to make sure I hadn't missed a step.
[23:05] <jeff-YF> scuttlemonkey: the cluster is actually freshly created.. there's no data on it at all
[23:06] <scuttlemonkey> jeff-YF: can you pastebin your crushmap?
[23:06] <sagewk> http://hub.github.com/
[23:07] <nhm> cjh_: we implemented some new wbthrottle code that may improve and/or hurt performance depending on the IO size. We also had to implement a workaround for a bug in XFS that may cause a bit of slow down. :(
[23:08] <nhm> cjh_: so testing (especially compared to cuttlefish!) is welcome!
[23:11] <scuttlemonkey> ** ======= CDS =======
[23:11] <scuttlemonkey> ** Under 1 hour until the online Ceph Developer Summit
[23:11] <scuttlemonkey> ** Join #ceph-summit for CDS-specific chat
[23:11] <scuttlemonkey> ** More details available at http://ceph.com/cds
[23:11] <scuttlemonkey> ** ======= CDS =======
[23:11] <jeff-YF> scuttlemonkey: here is the crush map http://pastebin.com/0vHzsWa9
[23:15] <scuttlemonkey> jeff-YF: worth noting you have 2 'ruleset 3'
[23:16] <scuttlemonkey> oh nm, they were just split by one
[23:16] <scuttlemonkey> I see
[23:16] <jeff-YF> scuttlemonkey: I was following the example in the ceph documentation
[23:17] <scuttlemonkey> yeah, my brain wasn't aggregating them w/ the sata rule interrupt :)
[23:17] * zhyan__ (~zhyan@101.83.141.22) has joined #ceph
[23:18] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[23:20] <cjh_> nhm: ok i'll see what i can find
[23:22] <scuttlemonkey> jeff-YF: what version are you running? I know you said it was essentially an empty cluster, but is this a new install, or have you messed with tuneables or whatnot?
[23:22] <jeff-YF> scuttlemonkey: ceph version 0.61.2
[22:23] <jeff-YF> I haven't messed with tunables. it's not a new install .. but i did just re-create the cluster today
[23:24] <scuttlemonkey> jeff-YF: gotcha
[23:24] <jeff-YF> scuttlemonkey: i experienced this same result before re-creating the cluster.
[23:26] * yanzheng (~zhyan@jfdmzpr04-ext.jf.intel.com) has joined #ceph
[23:29] <dmick> jeff-YF: it's legal to have two rules with the same ruleset number, but in order for it to make sense, they'd need to be arranged to kick in for non-overlapping numbers of replicas
[23:30] <dmick> that is, you might want a different strategy for pools with 2 replicas vs. pools with 3, but could use the same ruleset number if the first had min_size/max_size 2 and the second had min_size 3, max at least 3
[23:30] <dmick> (or, more safely, min0max2 and min3max10)
[23:30] <dmick> the doc example is not correct
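To make that concrete, here is a sketch of two rules legitimately sharing ruleset 3 by covering non-overlapping replica counts (the rule names, the 'sata' root and the 'rack' bucket type are made up for illustration, not taken from jeff-YF's map); CRUSH picks whichever rule in the ruleset matches the pool's size:

    rule sata-2rep {
        ruleset 3
        type replicated
        min_size 0
        max_size 2
        step take sata
        step chooseleaf firstn 0 type host   # 1-2 replicas: spread across hosts
        step emit
    }
    rule sata-3rep {
        ruleset 3
        type replicated
        min_size 3
        max_size 10
        step take sata
        step chooseleaf firstn 0 type rack   # 3+ replicas: spread across racks
        step emit
    }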
[23:32] <dmick> are the pools with pgs in trouble using that ruleset?
[23:32] <dmick> (ceph osd dump will tell you quickly)
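For reference, those quick checks might look like this (pool names will of course differ):

    # which crush_ruleset each pool uses, and its replica count (size)
    ceph osd dump | grep '^pool'
    # which pgs are stuck and where they are currently mapped
    ceph pg dump_stuck unclean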
[23:33] <jeff-YF> dmick/scuttlemonkey: I just deleted my ssd pool, reinjected a new map and left out the ssd-primary rule… the pgs for the SSD ruleset are unclean
[23:33] * zhyan__ (~zhyan@101.83.141.22) Quit (Ping timeout: 480 seconds)
[23:33] <jeff-YF> (after creating a new SSD pool and assigning the ruleset)
[23:36] <dmick> jeff-YF: can you show your current crush rules and osd tree?
[23:37] <jeff-YF> dmick: http://pastebin.com/Xt029rrM
[23:39] <dmick> so with one host in root ssd, you will only ever be able to have one replica when doing chooseleaf...host
[23:39] * john (~john@astound-64-85-225-33.ca.astound.net) has joined #ceph
[23:39] <dmick> if any pools are set with size > 1, they're unable to map
[23:39] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) Quit (Quit: erice)
[23:41] <dmick> ^ make sense jeff-YF?
[23:41] <jeff-YF> dmick: oh I see.. i need at least 2 hosts to make it work correctly for the default replica size of 2?
[23:41] <dmick> if you're going to do chooseleaf...host
[23:41] <dmick> that says choose N hosts (with firstn 0, N is the pool's replica count), and then find a leaf on each
[23:42] <dmick> you could do chooseleaf....device, I think, and that would allow choosing two OSDs on the same host (with, of course, that one host as a failure domain)
[23:42] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) has joined #ceph
[23:42] * mmercer (~kvirc@c-67-180-16-120.hsd1.ca.comcast.net) has joined #ceph
[23:42] <mmercer> lo all
[23:43] <jeff-YF> dmick: I follow you now.. thank you for clearing that up for me
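A minimal sketch of the rule change dmick describes, assuming the default type names (the leaf type is 'osd' in a stock map, which is what dmick calls 'device') and a root bucket named 'ssd' as in jeff-YF's map; the rule name and ruleset number are placeholders. Choosing leaves by osd lets both replicas land on the one ssd host, with that host as the single failure domain:

    rule ssd-single-host {
        ruleset 4
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type osd   # pick OSDs, not distinct hosts
        step emit
    }

One possible way to apply it, edit-and-reinject style:

    ceph osd getcrushmap -o crush.bin      # grab the current map
    crushtool -d crush.bin -o crush.txt    # decompile to text and add/edit the rule
    crushtool -c crush.txt -o crush.new    # recompile
    ceph osd setcrushmap -i crush.new      # inject the edited map
    ceph osd pool set <ssd-pool> crush_ruleset 4
    # (alternatively, 'ceph osd pool set <ssd-pool> size 1' keeps the host-based
    #  rule but drops to a single replica)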
[23:44] * rturk-away is now known as rturk
[23:46] * moslemsry121311 (~dasdasd@41.46.223.54) has joined #ceph
[23:46] * tziOm (~bjornar@ti0099a340-dhcp0395.bb.online.no) Quit (Remote host closed the connection)
[23:46] * joao sets mode +b *!*dasdasd@41.46.223.*
[23:46] * moslemsry121311 was kicked from #ceph by joao
[23:47] * yanzheng (~zhyan@jfdmzpr04-ext.jf.intel.com) Quit (Remote host closed the connection)
[23:48] <mmercer> what is the performance of the cephfs fuse and kernel clients like when mounting the fs?
[23:53] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[23:54] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[23:55] <scuttlemonkey> ** ======= CDS =======
[23:55] <scuttlemonkey> ** 5 minutes until the online Ceph Developer Summit
[23:55] <scuttlemonkey> ** Join #ceph-summit for CDS-specific chat
[23:55] <scuttlemonkey> ** Video will be broadcast at: http://youtu.be/j0JQvE5uGgs
[23:55] <scuttlemonkey> ** More details available at http://ceph.com/cds
[23:56] <scuttlemonkey> ** ======= CDS =======
[23:56] <mozg> good luck guys
[23:56] <scuttlemonkey> mozg: thanks :)
[23:57] <joelio> +1
[23:57] * Vincent_Valentine (~Vincent_V@49.206.158.155) Quit ()
[23:57] * Vincent_Valentine (Vincent_Va@49.206.158.155) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.