#ceph IRC Log


IRC Log for 2013-04-02

Timestamps are in GMT/BST.

[0:02] * ninkotech_ (~duplo@ip-89-102-24-167.net.upcbroadband.cz) has joined #ceph
[0:04] <andreask> a question regarding using rbd block devices as backend for iscsi target luns ... does it work well with multi-threaded access?
[0:05] * rustam (~rustam@5e0f5b1e.bb.sky.com) has joined #ceph
[0:05] <andreask> concrete ... lio and rbd kernel module
[0:10] <Elbandi_> hmm
[0:10] <Elbandi_> 2013-04-02 00:10:48.558270 7f6136b0a700 10 mds.1.241 beacon_send up:replay seq 68 (currently up:replay)
[0:11] <Elbandi_> 2013-04-02 00:10:48.560022 7f6138d10700 10 mds.1.241 handle_mds_beacon up:replay seq 68 rtt 0.001718
[0:11] <Elbandi_> 2013-04-02 00:10:49.555944 7f6136b0a700 15 mds.1.bal get_load mdsload<[0,0 0]/[0,0 0], req 0, hr 0, qlen 0, cpu 0.19>
[0:11] <Elbandi_> and mds still replay state
[0:11] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[0:16] <slang1> Elbandi_: can you post the full mds log somewhere?
[0:18] * BillK (~BillK@ has joined #ceph
[0:19] * tnt (~tnt@ Quit (Ping timeout: 480 seconds)
[0:21] <Elbandi_> slang1: 3 giga... :)
[0:26] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:29] <slang1> Elbandi_: compress and post to sftp://cephdrop@ceph.com?
[0:30] <Elbandi_> ok
[0:33] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[0:35] * BManojlovic (~steki@fo-d- Quit (Read error: Operation timed out)
[0:38] <slang1> Elbandi_: is that the one stuck in replay?
[0:40] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:40] * BManojlovic (~steki@ has joined #ceph
[0:41] <Elbandi_> no, two mds, max_mds = 2, current state are:
[0:41] <Elbandi_> 'ceph-mds2' mds.0.272 up:resolve seq 12
[0:41] <Elbandi_> 'ceph-mds1' mds.1.241 up:replay seq 25
[0:41] <Elbandi_> mds1 log still compressing
[0:46] * Steki (~steki@fo-d- has joined #ceph
[0:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[0:47] * PerlStalker (~PerlStalk@ Quit (Quit: ...)
[0:48] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[0:48] * BManojlovic (~steki@ Quit (Ping timeout: 480 seconds)
[0:58] <dmick> andreask: fwiw, there is also an stgt block driver for rbd if you're in a mood to experiment. I don't know how well either does with threading
[0:59] <andreask> thx dmick
[1:11] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[1:16] * diegows (~diegows@ has joined #ceph
[1:17] <sjust> sagewk, gregaf: updated wip_4510
[1:18] * portante is now known as portante|afk
[1:29] * jlogan1 (~Thunderbi@2600:c00:3010:1:fc52:a0e0:824c:3a1d) Quit (Ping timeout: 480 seconds)
[1:44] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[1:45] <Elbandi_> slang1: i have to go to sleep, but tomorrow i'll create a ticket
[1:46] <Elbandi_> and i try to upgrade the mds
[1:47] <Elbandi_> will it work, if mds are trunk version, but the mons are bobtail?
[1:52] * Steki (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:58] * LeaChim (~LeaChim@02d9ee2d.bb.sky.com) Quit (Ping timeout: 480 seconds)
[2:00] <humbolt> Hey, I am wondering, when I have 3 servers with 4 HDDs each, how do I make sure, my replicas do not end up on the HDDs of only one host?
[2:05] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[2:07] <wer_> nick wer
[2:08] * wer_ (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Quit: Leaving)
[2:08] * dpippenger (~riven@ Quit (Remote host closed the connection)
[2:11] <BillK> pgmap v881533: 1418 pgs: 1418 active+clean; 1180 GB data, 3541 GB used, 1963 GB / 5509 GB avail; 0B/s rd, 431KB/s wr, 21op/s
[2:11] <BillK> this is on .58, and doesn't make sense
[2:11] * ivotron (~ivo@dhcp-59-180.cse.ucsc.edu) Quit (Read error: Operation timed out)
[2:12] <BillK> on .56 data used was 2x data + some overhead, on .58 it's 3x data + overhead ... why?
[2:12] <BillK> same system, upgraded
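The ratio BillK is asking about can be checked directly from the pgmap line he pasted. This is only the arithmetic on those two figures. A ratio near 3 would be consistent with pools that keep three copies of each object, but the log does not say what the pool sizes are, so treat that reading as an assumption.

```python
# Figures copied from the pgmap line in the log, in GB.
data_gb = 1180
used_gb = 3541

# Raw space consumed per GB of stored data.
ratio = used_gb / data_gb
print(f"used/data = {ratio:.2f}")
```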
[2:16] * alram (~alram@ Quit (Quit: leaving)
[2:18] * chutzpah (~chutz@ has joined #ceph
[2:21] * rturk is now known as rturk-away
[2:25] * Loffler (~Loffler@115-166-35-130.ip.adam.com.au) Quit (Quit: Leaving)
[2:28] * xmltok (~xmltok@pool101.bizrate.com) has joined #ceph
[2:29] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[2:38] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[2:38] * rustam (~rustam@5e0f5b1e.bb.sky.com) Quit (Remote host closed the connection)
[2:41] <dmick> humbolt: that's what the CRUSH map is for
[3:01] * maxiz (~pfliu@ has joined #ceph
[3:04] * Downchuck (62e8084e@ircip2.mibbit.com) has joined #ceph
[3:05] <Downchuck> How do I force a mon to accept that it's the only node in a now defunct quorum?
[3:05] <Downchuck> Currently, since the mon can't connect to others, it's trying to forward the client indefinitely so I can't submit any ceph commands.
[3:11] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[3:15] <elder> dmick, is there anybody around that might know something about mds connections dropping while I run the kernel_untar_build.sh test?
[3:15] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:17] <elder> Actually, maybe I know what's going on now...
[3:18] <dmick> elder: not many left in this physical room
[3:18] <elder> Seems to be socket failures being injected. Just at a rate that seems higher than I expected.
[3:18] <elder> And I think it's new--I don't remember getting injected errors on the mds connections.
[3:19] <dmick> Downchuck: possibly ceph mon remove on the ones that aren't there anymore
[3:20] <dmick> but I don't know. You might have to start another few, which wouldn't be super-hard
[3:20] <dmick> (they can run on the same host)
[3:25] <Downchuck> "ceph mon" fails with the forward; is there an easy initialize command for the new mon directory?
[3:25] <Downchuck> mkcephfs seems a little heavy
[3:26] <Downchuck> (basically getting: "unable to read magic from mon data..")
[3:31] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[3:36] <Downchuck> ceph-mon --mkfs -i 2 -c /etc/ceph/ceph.conf --fsid <fsid>; then adding a keyring got the mon up quickly.
[3:39] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[3:41] * yanzheng (~zhyan@ has joined #ceph
[3:54] <Downchuck> well, I clobbered authentication but at least it moved.
[3:59] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[4:06] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:17] * Downchuck (62e8084e@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[4:18] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[4:21] * absynth (~absynth@irc.absynth.de) Quit (Ping timeout: 480 seconds)
[4:27] * winston-d (~zhiteng@pgdmzpr01-ext.png.intel.com) has joined #ceph
[4:29] * Cube (~Cube@cpe-76-172-67-97.socal.res.rr.com) has joined #ceph
[4:31] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[4:34] * maxiz (~pfliu@ Quit (Ping timeout: 480 seconds)
[4:35] * maxiz (~pfliu@ has joined #ceph
[4:35] * jbarth (~yaaic@host-184-166-122-97.but-mt.client.bresnan.net) has joined #ceph
[4:39] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[4:41] * absynth (~absynth@irc.absynth.de) has joined #ceph
[4:48] * BillK (~BillK@ Quit (Remote host closed the connection)
[4:48] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has joined #ceph
[4:49] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has left #ceph
[4:50] <jbarth> hello all, how would one list all the user accounts for radosgw? I've looked around a bit and maybe I'm missing it. is there something that provides such output.
[5:06] * winston-d (~zhiteng@pgdmzpr01-ext.png.intel.com) Quit (Ping timeout: 480 seconds)
[5:23] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) Quit (Ping timeout: 480 seconds)
[5:27] * chutzpah (~chutz@ Quit (Quit: Leaving)
[5:28] * chutzpah (~chutz@ has joined #ceph
[5:34] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[5:38] * chutzpah (~chutz@ Quit (Quit: Leaving)
[5:45] * winston-d (~zhiteng@pgdmzpr01-ext.png.intel.com) has joined #ceph
[6:16] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) has joined #ceph
[6:16] * ChanServ sets mode +o scuttlemonkey
[6:28] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) Quit (Ping timeout: 480 seconds)
[6:28] * winston-d (~zhiteng@pgdmzpr01-ext.png.intel.com) Quit (Ping timeout: 480 seconds)
[6:40] * absynth (~absynth@irc.absynth.de) Quit (Ping timeout: 480 seconds)
[6:49] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[6:54] * sleinen (~Adium@user-28-15.vpn.switch.ch) Quit (Quit: Leaving.)
[6:54] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[7:02] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[7:13] * madkiss1 (~madkiss@port-213-160-22-242.static.qsc.de) Quit (Quit: Leaving.)
[7:33] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[7:41] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[7:41] * rustam (~rustam@5e0f5b1e.bb.sky.com) has joined #ceph
[7:52] * tnt (~tnt@ has joined #ceph
[7:55] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) Quit (Quit: Leaving.)
[7:58] * rustam (~rustam@5e0f5b1e.bb.sky.com) Quit (Remote host closed the connection)
[8:00] * rustam (~rustam@5e0f5b1e.bb.sky.com) has joined #ceph
[8:01] * Cube (~Cube@cpe-76-172-67-97.socal.res.rr.com) Quit (Quit: Leaving.)
[8:06] * norbi (~nonline@buerogw01.ispgateway.de) has joined #ceph
[8:10] <norbi> mornin #ceph
[8:10] <norbi> wget http://ceph.com/download/ceph-60.tar.gz -> 404 Not Found ?:)
[8:10] * absynth (~absynth@irc.absynth.de) has joined #ceph
[8:16] <yanzheng> wget http://ceph.com/download/ceph-0.60.tar.gz
[8:16] <norbi> yes have seen, only the link in the blog is false
[8:23] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[8:23] * madkiss (~madkiss@business-213-023-158-038.static.arcor-ip.net) has joined #ceph
[8:29] <norbi> great work! ceph 0.60 resolves upgrade bugs from version 0.59 ! :)
[8:30] * Cube (~Cube@cpe-76-172-67-97.socal.res.rr.com) has joined #ceph
[8:47] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[8:49] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[8:55] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[8:56] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: Beware of programmers who carry screwdrivers.)
[8:56] * sleinen1 (~Adium@2001:620:0:26:f808:44f2:f593:c7b1) has joined #ceph
[8:58] * ScOut3R (~ScOut3R@ has joined #ceph
[9:00] * winston-d (~zhiteng@pgdmzpr01-ext.png.intel.com) has joined #ceph
[9:03] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[9:07] * rustam (~rustam@5e0f5b1e.bb.sky.com) Quit (Remote host closed the connection)
[9:08] * rustam (~rustam@5e0f5b1e.bb.sky.com) has joined #ceph
[9:08] * madkiss (~madkiss@business-213-023-158-038.static.arcor-ip.net) Quit (Quit: Leaving.)
[9:09] * sleinen1 (~Adium@2001:620:0:26:f808:44f2:f593:c7b1) Quit (Quit: Leaving.)
[9:10] * tnt (~tnt@ Quit (Read error: Operation timed out)
[9:14] * gerard_dethier (~Thunderbi@ has joined #ceph
[9:18] * loicd (~loic@ has joined #ceph
[9:22] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) has joined #ceph
[9:22] * ChanServ sets mode +o scuttlemonkey
[9:25] * sleinen (~Adium@2001:620:0:46:3833:b8dc:96ac:ccb7) has joined #ceph
[9:25] * rustam (~rustam@5e0f5b1e.bb.sky.com) Quit (Remote host closed the connection)
[9:27] * leseb (~Adium@ has joined #ceph
[9:29] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:32] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) Quit (Ping timeout: 480 seconds)
[9:44] * winston-d (~zhiteng@pgdmzpr01-ext.png.intel.com) Quit (Quit: Leaving)
[9:48] * LeaChim (~LeaChim@02d9ee2d.bb.sky.com) has joined #ceph
[9:49] * l0nk (~alex@ has joined #ceph
[9:53] * dosaboy (~user1@host86-161-164-218.range86-161.btcentralplus.com) has joined #ceph
[9:58] <joelio> morning all
[9:59] <matt_> morning' (well afternoon for me)
[10:00] <joelio> heh, afternoon then!
[10:00] * joelio sees 0.60 released
[10:00] <joelio> nice
[10:04] <matt_> Has anyone upgraded to 0.60 yet?
[10:09] <joelio> I will once I've cleared some post-eggster work.. probably in a few hours
[10:10] * BManojlovic (~steki@ has joined #ceph
[10:11] <topro> is 0.60 going to be the next supported stable release like bobtail before?
[10:12] <matt_> 0.61 is an LTS release I believe, 4 weeks or so away
[10:18] <absynth> yep
[10:18] <absynth> cuttlefish, it will be called
[10:18] <absynth> (iirc)
[10:19] <topro> looks like codenames will follow the a-b-c scheme
[10:24] <matt_> I wonder if there's a cephalopod for every letter of the alphabet..
[10:26] <absynth> http://en.wikipedia.org/wiki/Category:Cephalopods
[10:26] <absynth> looks a lot like we won't have naming issues until well into 2026.
[10:30] <loicd> :-)
[10:31] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[10:31] * agh (~oftc-webi@gw-to-666.outscale.net) has joined #ceph
[10:31] <agh> hello
[10:32] <agh> I've a problem : I've a ceph cluster with 4 nodes, 8disks each.
[10:32] <agh> For testing purpose, I've rebooted one node.
[10:32] <agh> Everything goes fine (degraded, then recovering)
[10:32] <agh> oK.
[10:33] <agh> But when the node had rebooted, 2 OSD daemons did not start again. And here is the error:
[10:33] <agh> 2013-04-02 08:14:37.749074 7fef36f56760 -1 OSD id 11 != my id 12
[10:33] <agh> do you have an idea?
[10:48] * bithin (~bithin@ has joined #ceph
[10:49] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[10:50] <bithin> elder, hi I am not able to register to tracker.ceph.com. Not getting any mail to confirm my account.
[11:04] * Morg (b2f95a11@ircip2.mibbit.com) has joined #ceph
[11:06] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Quit: Page closed)
[11:07] <Morg> mornin'
[11:16] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[11:27] * madkiss (~madkiss@business-213-023-158-038.static.arcor-ip.net) has joined #ceph
[11:29] * madkiss (~madkiss@business-213-023-158-038.static.arcor-ip.net) Quit ()
[11:30] <humbolt> dmick: Yes, I figured that crush map is for that. But how?
[11:31] <humbolt> Is it enough to define hosts and racks and CRUSH does the magic by itself?
[11:33] <humbolt> For the rest of you, I am trying to understand, how I can make sure the 3 redundant copies of a file are not placed on the three OSDs in one and the same host.
[11:38] * ninkotech (~duplo@ip-89-102-24-167.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[11:39] * ninkotech (~duplo@ip-89-102-24-167.net.upcbroadband.cz) has joined #ceph
[11:45] * schlitzer|work (~schlitzer@ has joined #ceph
[11:46] <topro> humbolt: i think for most recent installations of ceph this is default behavior, for older installations you have to change that behaviour by modifying crushmap. anyway I would recommend you double-check that the crush-map does what you want it to do.
[11:46] <topro> this is controlled by crushmap rules/rulesets
[11:47] <topro> humbolt: have a look there http://ceph.com/docs/master/rados/operations/crush-map/
[11:52] * maxiz (~pfliu@ Quit (Remote host closed the connection)
[12:07] * humbolt_ (~elias@91-113-41-56.adsl.highway.telekom.at) has joined #ceph
[12:12] * loicd (~loic@ Quit (Read error: Operation timed out)
[12:12] * humbolt (~elias@91-113-41-56.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[12:12] * humbolt_ is now known as humbolt
[12:51] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) has joined #ceph
[12:51] * ChanServ sets mode +o scuttlemonkey
[12:53] <joelio> There's something intensely satisfying about upgrading the storage backend software and simultaneously upgrading storage hosts' RAM allocations whilst the VMs being backed still chug along with aplomb
[12:54] * joelio now rocks 0.60
[12:56] * yanzheng (~zhyan@ has joined #ceph
[12:56] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[12:59] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) Quit (Ping timeout: 480 seconds)
[13:03] <loicd> Hi, what is the proper way to contribute to the documentation ? I would like to add to http://ceph.com/docs/master/rados/configuration/filestore-config-ref/
[13:05] * loicd reading http://ceph.com/docs/master/dev/generatedocs/
[13:12] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[13:12] * schlitzer|work (~schlitzer@ Quit (Remote host closed the connection)
[13:19] * schlitzer|work (~schlitzer@ has joined #ceph
[13:24] * goldfish (~goldfish@ has joined #ceph
[13:28] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:37] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) Quit (Quit: Leaving)
[13:51] * ninkotech_ (~duplo@ip-89-102-24-167.net.upcbroadband.cz) Quit (Remote host closed the connection)
[13:53] * ninkotech_ (~duplo@ip-89-102-24-167.net.upcbroadband.cz) has joined #ceph
[13:53] * ninkotech_ (~duplo@ip-89-102-24-167.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[14:06] * sleinen1 (~Adium@ has joined #ceph
[14:08] * sleinen2 (~Adium@2001:620:0:26:71ee:7d9e:58ed:c73f) has joined #ceph
[14:08] <humbolt> topro: I read all this CRUSH documentation, but it does not make clear, how I make sure replicas don't end up on the same host.
[14:09] <humbolt> It does not make clear I mean
[14:10] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Remote host closed the connection)
[14:10] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[14:14] * sleinen (~Adium@2001:620:0:46:3833:b8dc:96ac:ccb7) Quit (Ping timeout: 480 seconds)
[14:14] * l0nk (~alex@ Quit (Remote host closed the connection)
[14:14] * sleinen1 (~Adium@ Quit (Ping timeout: 480 seconds)
[14:15] * l0nk (~alex@ has joined #ceph
[14:18] * humbolt (~elias@91-113-41-56.adsl.highway.telekom.at) Quit (Quit: humbolt)
[14:22] * loicd trying to figure out why the index of the documentation has not been generated ( http://dachary.org/loic/ceph-doc/ ) ( followed the steps in http://ceph.com/docs/master/dev/generatedocs/ on the master branch and got a BUILD SUCCESSFUL at the end )
[14:24] <loicd> s/index/table of content/
[14:27] <loicd> http://paste.debian.net/ is the full output of build-doc and I can't spot anything obvious
[14:29] <loicd> http://dachary.org/loic/ceph-doc.txt is the full output of build-doc and I can't spot anything obvious ( too big for pastebin ;-)
[14:33] <topro> humbolt: in the rule of your pool you would want something like "step take default" (or whatever your root type is called), followed by something like "step chooseleaf firstn 0 type host" (or whatever your failure domain type is called), followed by "step emit"
[14:34] <topro> ^^ this would distribute replicas across hosts
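The effect of the rule topro describes (take the root, choose one leaf under each of N distinct hosts, emit) can be illustrated with a toy model. This is not the CRUSH algorithm and uses no Ceph API: the function name `choose_across_hosts`, the host names, and the OSD ids are all made up for illustration, and the layout mirrors humbolt's 3 servers with 4 disks each.

```python
import random

def choose_across_hosts(osds_by_host, replicas, seed=0):
    """Pick `replicas` OSDs with at most one per host (toy model, not CRUSH)."""
    rng = random.Random(seed)
    hosts = sorted(osds_by_host)
    if replicas > len(hosts):
        # Arbitrary choice for this sketch: refuse rather than reuse a host.
        raise ValueError("not enough hosts for the requested replica count")
    # First choose distinct hosts (the failure domain), then one OSD in each.
    chosen_hosts = rng.sample(hosts, replicas)
    return [(host, rng.choice(osds_by_host[host])) for host in chosen_hosts]

# Hypothetical cluster: 3 hosts, 4 OSDs each.
cluster = {
    "host-a": [0, 1, 2, 3],
    "host-b": [4, 5, 6, 7],
    "host-c": [8, 9, 10, 11],
}
placement = choose_across_hosts(cluster, 3)
print(placement)
```

Whatever the seed, the three replicas land on three different hosts, which is the property humbolt wants from the "type host" step.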
[14:40] <goldfish> Hello, I am doing a proof of concept with a ceph cluster and I am just wondering what the recommended layout of osd, mds and mon should be. The hardware I currently have is 3 servers each with Dual Core 1.66GHz, 3GB RAM and 8 x 1.5TB HDD. Can I put mds, mon, osd on each or should I really be running each of those services on separate machines?
[14:46] <nhm> goldfish: under heavy load those machines are probably going to be CPU and memory limited if you have 8 OSDs on each one.
[14:46] <nhm> goldfish: if you can put the mons and mds on an alternate node you may be better off.
[14:47] <nhm> goldfish: we typically recommend 1GHz of 1 core and 1-2GB of ram per OSD in production.
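nhm's rule of thumb can be applied to goldfish's hardware with a few lines of arithmetic. The sketch below only encodes the figures quoted in the log (1 GHz of one core and 1-2 GB of RAM per OSD; dual core 1.66 GHz, 3 GB RAM, 8 disks per server). The function name `osd_node_budget` is made up, and one OSD per disk is an assumption.

```python
def osd_node_budget(cores, ghz_per_core, ram_gb, osds,
                    ghz_per_osd=1.0, ram_gb_per_osd=(1.0, 2.0)):
    """Compare a node's CPU and RAM with the per-OSD rule of thumb."""
    return {
        "cpu_have_ghz": cores * ghz_per_core,
        "cpu_need_ghz": osds * ghz_per_osd,
        "ram_have_gb": ram_gb,
        # Low and high end of the 1-2 GB per OSD guideline.
        "ram_need_gb": (osds * ram_gb_per_osd[0], osds * ram_gb_per_osd[1]),
    }

# One of goldfish's servers, assuming one OSD per disk.
budget = osd_node_budget(cores=2, ghz_per_core=1.66, ram_gb=3, osds=8)
for key, value in budget.items():
    print(key, value)
```

By this guideline each server has well under half the suggested CPU and RAM for 8 OSDs, which matches nhm's warning about heavy load and recovery.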
[14:50] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:52] <goldfish> ok, the final system is mainly going to be used as backup storage so not performance heavy. I am primarily after the redundancy and quantity of storage for this setup.
[14:53] <goldfish> asides from performance issues, can i run each system with all three components ?
[14:56] <nhm> only concern is memory I think.
[14:56] <goldfish> cheers
[14:56] <goldfish> thanks
[14:57] <goldfish> I think if I put it in and it works then I should be able to make the case for more memory at a later date :-)
[14:58] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[14:59] <nhm> goldfish: you may want to explicitly test a recovery scenario where one of the OSDS has been down for a while. That will use more memory than normal operation.
[15:01] <tnt> Anyone using radosgw with nginx as front end here ?
[15:02] <nhm> tnt: Thought about it, but haven't tried it yet.
[15:05] <goldfish> and how would you recommend installing the OS on these systems. should I partition off a GB from the front of the disks and raid the 1GB partitions or should I sacrifice 2 disks in each and mirror them ? or just not bother because each of the nodes are the same so as long as I stay at a storage usage of N-1 nodes then I should be able to recover ?
[15:06] * mcclurmc (~mcclurmc@firewall.ctxuk.citrix.com) has joined #ceph
[15:06] <tnt> nhm: I've hit an issue where nginx sends an empty CONTENT_LENGTH param (for get request for eg) and radosgw doesn't like that, there is an explicit 'return -EINVAL' in that case, not sure why ... (instead of just considering 0 or absent)
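The lenient handling tnt suggests (treat an empty or absent CONTENT_LENGTH as 0 instead of rejecting the request) could look like the sketch below. This is not radosgw's code and not the patch discussed in the log. `parse_content_length` is a hypothetical helper that only illustrates the idea.

```python
def parse_content_length(raw):
    """Return the body length, treating a missing or empty value as 0."""
    if raw is None or raw.strip() == "":
        # A front end may pass the parameter with no value on GET requests.
        return 0
    value = int(raw)  # raises ValueError for non-numeric input
    if value < 0:
        raise ValueError("negative Content-Length")
    return value

print(parse_content_length(""))
print(parse_content_length("42"))
```

Genuinely malformed values still fail, so only the empty case changes behaviour.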
[15:10] <nhm> goldfish: we tend to recommend raid1 system disk for enterprise, but it's not a hard rule. It all depends on how much you want to avoid recovery in production vs maximizing density. Some systems have a couple of 2.5" bays that make it easier to devote those to system disks.
[15:11] <nhm> goldfish: normally we recommend not putting OSDS on the system disks, but if you really don't care about performance it won't break anything to carve off a partition for the system.
[15:11] <goldfish> ok, thanks again
[15:14] <Azrael> has anybody made use of the ceph chef cookbook?
[15:14] <Azrael> seems it doesn't work out of the box. issues with keyring generation.
[15:15] <nhm> tnt: hrm. not sure.
[15:17] <nhm> tnt: https://github.com/carsonoid/ceph/commit/96896eb092c3b4e0760e56d5228ef0d604951a12
[15:21] <tnt> yeah, that's pretty much the patch I made in my repo
[15:23] <tnt> oh, that was apparently pulled in master but not backported in bobtail
[15:28] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[15:37] * Morg (b2f95a11@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[15:39] * yanzheng (~zhyan@ has joined #ceph
[15:40] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[15:42] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[15:43] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[15:44] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[15:54] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[15:54] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[16:01] * scuttlemonkey (~scuttlemo@fl-184-1-34-163.dhcp.embarqhsd.net) has joined #ceph
[16:01] * ChanServ sets mode +o scuttlemonkey
[16:04] <loicd> leseb: good day sir :-) I'm having troubles generating the documentation : the table of content does not show and I don't see any error. Does this ring a bell ?
[16:06] <Azrael> anybody available to help with ceph + chef?
[16:09] <scuttlemonkey> Azrael, I'm no chef wizard...but I have tinkered with it a bit
[16:09] <scuttlemonkey> what seems to be the trouble?
[16:09] <Azrael> scuttlemonkey: thanks for reaching out
[16:09] <Azrael> scuttlemonkey: seems to be an issue with the client.admin keyring
[16:09] <scuttlemonkey> not generating?
[16:10] <Azrael> yeah
[16:10] <Azrael> so
[16:10] <Azrael> https://github.com/ceph/ceph-cookbooks/blob/master/recipes/mon.rb
[16:10] <Azrael> looks like ruby_block lines 79-89 fail
[16:10] * PerlStalker (~PerlStalk@ has joined #ceph
[16:10] <Azrael> ceph auth get-key client.bootstrap-osd
[16:10] <Azrael> cant do that if there's no client.admin key heh
[16:11] <Azrael> the docs for ceph state that one must first start ceph w/o cephx in order to generate the keys
[16:11] <Azrael> but that isn't handled with the ceph chef recipes for your first mon deployment
[16:11] <scuttlemonkey> well, I dunno about Chef...but when I was working with juju the problem with generating keyrings was almost always 1) no passwordless ssh 2) hostname issues or 3) file permissions problems
[16:11] <scuttlemonkey> really? The doc says to start w/o cephx?
[16:12] <Azrael> [2013-04-02T16:01:46+02:00] INFO: Processing ruby_block[ceph client admin keyring] action run (ceph::mon line 80)
[16:12] <Azrael> 2013-04-02 16:01:46.040786 7f48a2749760 -1 unable to authenticate as client.admin
[16:12] <scuttlemonkey> Dunno that I have ever started a cluster w/o auth
[16:12] <Azrael> Execute the following procedures to enable cephx on a cluster with cephx disabled. If you (or your deployment utility) have already generated the keys, you may skip the steps related to generating keys.
[16:12] <Azrael> -EOWAIT
[16:12] <Azrael> hmm
[16:12] <Azrael> thats different
[16:12] <Azrael> my bad
[16:12] <scuttlemonkey> hehe
[16:12] <Azrael> thats if your cluster currently doesn't have cephx
[16:12] <Azrael> right, ok
[16:12] <Azrael> but hmm
[16:13] <scuttlemonkey> yeah, still sounds like there is an underlying issue
[16:13] <Azrael> now also
[16:13] <Azrael> yeah
[16:13] <Azrael> i'm doing this on debian (sigh)
[16:13] <scuttlemonkey> /eject
[16:13] <Azrael> i wonder if the ubuntu ceph packages are different
[16:13] <scuttlemonkey> :)
[16:13] <Azrael> if they create /etc/ceph/keyring for you upon installation, for example
[16:13] <Azrael> heh yeah
[16:13] <Azrael> debian... ubuntu... rawr...
[16:14] <scuttlemonkey> hmmm
[16:14] <scuttlemonkey> it has been quite a while since I dropped ceph on debian...pre-bobtail I think
[16:14] <scuttlemonkey> that said...it should still work, in theory
[16:14] <Azrael> yup
[16:14] <scuttlemonkey> what kernel version are you rockin?
[16:14] <Azrael> i have ceph on debian no prob
[16:15] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Quit: Ex-Chat)
[16:15] <Azrael> atm its 2.6.32 but thats because i'm deploying with vagrant/berkshelf, for testing
[16:15] <Azrael> once the cookbook is working, i'll deploy on 3.whatever in prod
[16:15] <scuttlemonkey> gotcha
[16:16] <scuttlemonkey> iirc the ceph stuff didn't hit the kernel until 2.6.34
[16:16] <Azrael> ruby... vagrant... chef... berkshelf... i feel very web scale.
[16:16] <scuttlemonkey> so that may be causing issue
[16:16] <scuttlemonkey> hehe
[16:16] <Azrael> hmm
[16:16] <Azrael> well
[16:16] <Azrael> mon starts ok
[16:17] <Azrael> its just the daemon, not a ceph client
[16:17] <scuttlemonkey> yeah, it was the client-stuff
[16:17] <scuttlemonkey> right
[16:17] <scuttlemonkey> you doing this as a quick-start single node? Or is this a multi-node setup?
[16:17] <Azrael> (side note: i spent a day troubleshooting ceph + chef, only to finally figure out the next day that i had typos with "chef" where it should say "ceph" haha)
[16:17] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[16:17] <Azrael> quick start single node. once that works, then i'll do multi-node.
[16:18] <scuttlemonkey> haha, yeah that's one of the reasons I quit poking at chef....I couldn't stop typoing back and forth chef/ceph
[16:18] <Azrael> just trying to bring up a mon. once that works, i'll try an osd.
[16:19] <Azrael> i have a single-node ceph system on debian deployed already, without using chef. its for the developers to have their stuff use librados. thats working great.
[16:19] <scuttlemonkey> k
[16:19] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[16:19] <scuttlemonkey> so it's definitely something w/ chef deploying things
[16:21] <Azrael> yup
[16:22] <scuttlemonkey> which chef cookbooks are you using? Ours?
[16:22] <Azrael> yes
[16:22] <scuttlemonkey> k, I should spin up a box and play w/ chef again
[16:22] <Azrael> https://github.com/ceph/ceph-cookbooks
[16:23] <scuttlemonkey> I'll do a debian one and poke at it
[16:23] <Azrael> wow, thanks man
[16:23] <Azrael> high five for #ceph support
[16:23] <scuttlemonkey> hehe
[16:23] <scuttlemonkey> motivated self-interest :)
[16:23] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) Quit (Quit: Leaving.)
[16:23] <Azrael> heh
[16:23] <Azrael> btw, i am using chef 11
[16:23] <Azrael> shouldn't matter
[16:23] <Azrael> but yeah chef 11
[16:23] <scuttlemonkey> gotcha
[16:23] <Azrael> also
[16:23] <Azrael> chef-solo will not work
[16:23] <Azrael> so you'll need a chef server somewheres
[16:23] <scuttlemonkey> I'll probably just grab w/e dpkg gives me
[16:23] <Azrael> with the cookbook
[16:24] <scuttlemonkey> oh
[16:24] <scuttlemonkey> dang, chef solo was my plan
[16:24] <Azrael> chef-solo will not work because libraries/* uses .search(...) to ask the chef system to do a search of other nodes in the ceph cluster.
[16:24] <scuttlemonkey> wonder if that's the digital equivalent of dropping a hammer on your foot
[16:24] <Azrael> and .search isn't available in chef-solo, since there's nothing to search against
[16:24] <Azrael> haha
[16:25] <Azrael> good news is, setting up chef-server 11 is very easy in debian
[16:25] <scuttlemonkey> cool
[16:25] <Azrael> just grab the package from opscode.com for ubuntu. installs and works on debian.
[16:26] <scuttlemonkey> oh, well that's handy
[16:26] <Azrael> chef-server_11.0.6-1.ubuntu.10.04_amd64.deb is what i used
[16:27] <scuttlemonkey> what debian version?
[16:27] <scuttlemonkey> squeeze / wheezy?
[16:27] <Azrael> squeeze
[16:28] <scuttlemonkey> k
[16:38] * jrisch (~Adium@83-95-19-94-static.dk.customer.tdc.net) has joined #ceph
[16:42] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[16:43] <Azrael> scuttlemonkey: so, if i disable cephx (set to none) and then run
[16:43] <Azrael> $ ceph auth list
[16:43] <Azrael> no installed auth entries!
[16:43] <Azrael> is what i get
[16:43] <Azrael> in other words, the chef bootstrap isn't generating a client.admin key
[16:44] <Azrael> i'm doing this with stable (bobtail). i'll switch to testing and see what happens.
[16:44] * Azrael throws darts
[16:46] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[16:47] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[16:47] * diegows (~diegows@ has joined #ceph
[16:48] * jbarth (~yaaic@host-184-166-122-97.but-mt.client.bresnan.net) Quit (Ping timeout: 480 seconds)
[16:50] * aliguori (~anthony@ has joined #ceph
[16:50] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[16:52] * jmlowe (~Adium@2001:18e8:2:28cf:f000::5ab8) has joined #ceph
[16:52] <Azrael> scuttlemonkey: well haha... 0.60 doesn't install on debian either. sigh :-)
[16:53] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[16:54] * norbi (~nonline@buerogw01.ispgateway.de) Quit (Quit: Miranda IM! Smaller, Faster, Easier. http://miranda-im.org)
[16:56] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) Quit (Quit: Leaving)
[16:56] <absynth> libleveldb dependency?
[16:57] * gerard_dethier (~Thunderbi@ Quit (Quit: gerard_dethier)
[17:04] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[17:07] <Azrael> close
[17:07] <Azrael> well
[17:07] <Azrael> yes
[17:07] <Azrael> actually
[17:08] <Azrael> libleveldb1, libsnappy1, etc
[17:09] <Azrael> absynth: do you know of a workaround?
[17:09] <absynth> i think ours is "apt-get dist-upgrade"
[17:09] <absynth> obviously, only on our test environment
[17:09] <absynth> i suggest waiting for new packages
[17:09] <Azrael> ok
[17:09] <Azrael> thanks
[17:10] * ScOut3R (~ScOut3R@ Quit (Ping timeout: 480 seconds)
[17:11] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:11] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[17:25] * jlogan1 (~Thunderbi@2600:c00:3010:1:fc52:a0e0:824c:3a1d) has joined #ceph
[17:27] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[17:27] <lxo> 0.60's ceph-mds crashes on startup, AFAICT decoding the session map. 0.59's ceph-mds starts fine, even though the rest of the cluster is running 0.60
[17:31] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[17:33] <dosaboy> hi all, I am trying to get ceph working with cinder in Openstack grizzly. I have hit a problem when trying to attach the volume to an instance whereby I get "libvirtError: internal error rbd username 'cinder' specified but secret not found" in nova-compute. I have followed instructions to add secret to virsh etc but don't get why this is not working
[17:34] <dosaboy> has anyone come across this?
[17:38] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[17:40] * jmlowe (~Adium@2001:18e8:2:28cf:f000::5ab8) Quit (Quit: Leaving.)
[17:42] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) has joined #ceph
[17:47] * sleinen (~Adium@ has joined #ceph
[17:47] <Azrael> scuttlemonkey: wonder if i have to adjust the recipe to use ceph-authtool?
[17:54] * vata (~vata@2607:fad8:4:6:61f5:900a:247f:e5d1) has joined #ceph
[17:54] * sleinen2 (~Adium@2001:620:0:26:71ee:7d9e:58ed:c73f) Quit (Ping timeout: 480 seconds)
[17:54] <Elbandi_> # ceph mds getmap -o map
[17:54] <Elbandi_> got mdsmap epoch 49233
[17:54] <Elbandi_> How can I decode this file?
[17:55] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[17:59] * BManojlovic (~steki@fo-d- has joined #ceph
[18:04] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:09] <pioto> hi, so, with cephfs... it seems that by default you only have a single active MDS at once, and any others you may add are just hot standbys? i guess that means that the MDS is a possible bottleneck?
[18:09] <pioto> is there any reason you should/shouldn't increase the 'max mds' setting?
[18:09] <pioto> like, if you had 2 active MDSes, i assume half your cephfs clients would talk to one, and half to the other?
[18:12] <lxo> pioto, multi-mds settings are not regarded as anywhere close to stable yet
[18:12] <scuttlemonkey> Azrael: hmm, I know they are using the chef recipes in production
[18:12] <pioto> lxo: ok
[18:12] <scuttlemonkey> but the charms used ceph-authtool
[18:13] <pioto> lxo: is there a summary somewhere of what does/doesn't work, and what is/isn't stable, with cephfs?
[18:14] <pioto> so i can judge how many kittens it'll kill if i try to use it, etc
[18:14] <scuttlemonkey> Elbandi_: iirc that is undocumented (which I assume is why you're asking) lemme see what the magic is for that
[18:16] * kincl (~kincl@0001aeba.user.oftc.net) has left #ceph
[18:17] <loicd> joshd: would you have a few minutes to discuss the OpenStack summit session "Roadmap for Ceph integration with OpenStack" ( http://summit.openstack.org/cfp/details/76 ) as described in https://etherpad.openstack.org/roadmap-for-ceph-integration-with-openstack ?
[18:17] <loicd> and hi :-)
[18:19] <scuttlemonkey> Elbandi_: sounds like the best bet is to just use 'ceph mds dump' instead
[18:19] * bithin (~bithin@ Quit (Quit: Leaving)
[18:20] <lxo> pioto, there was a recent blog post on the plans for the first stable (as in supportable) cephfs feature set. check the community blog at ceph.com
[18:21] <scuttlemonkey> http://ceph.com/dev-notes/cephfs-mds-status-discussion/
[18:28] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[18:29] <Elbandi_> scuttlemonkey: i get that command from docs: http://ceph.com/docs/master/rados/operations/control/#mds-subsystem
[18:30] <Elbandi_> Todo ceph mds subcommands missing docs: set_max_mds, dump, getmap, stop, setmap
[18:30] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[18:31] <Elbandi_> "let me see, what are those commands", and i can't find how to use getmap and setmap
[18:32] * jrisch (~Adium@83-95-19-94-static.dk.customer.tdc.net) Quit (Read error: Operation timed out)
[18:32] <Elbandi_> so getmap/setmap are useless or internal use :)
[18:32] <scuttlemonkey> Elbandi_: yeah, I was just chatting w/ greg and he didn't think we had a separate tool for decoding
[18:32] <scuttlemonkey> but you can get what you need from mds dump right now
[18:32] <scuttlemonkey> (mds stuff is still going through lots of change)
[18:33] * tnt (~tnt@ has joined #ceph
[18:33] <scuttlemonkey> I'm guessing the getmap/setmap stuff will be hashed in more detail when multi-mds stuff is closer to stable
[18:33] <scuttlemonkey> (pure conjecture)
[18:34] <scuttlemonkey> but yeah, lots of dev stubs atm
[18:35] <Elbandi_> ok, thx
[18:41] * sleinen (~Adium@2001:620:0:25:346c:a8c2:f4e5:3ea2) has joined #ceph
[18:46] * l0nk (~alex@ Quit (Quit: Leaving.)
[18:48] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) has joined #ceph
[18:49] <lxo> sage, the session_info_t::decode compat code should decode into s, not into completed_requests
[18:50] <joao> matt_, around?
[18:50] <lxo> that's what broke ceph-mds 0.60 for me, AFAICT
[18:50] <lxo> testing the fix now, will post the patch momentarily
[18:50] * leseb (~Adium@ Quit (Quit: Leaving.)
[18:51] * jrisch (~Adium@80-62-117-240-mobile.dk.customer.tdc.net) has joined #ceph
[18:56] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[19:00] * diegows (~diegows@ has joined #ceph
[19:02] * Cube (~Cube@cpe-76-172-67-97.socal.res.rr.com) Quit (Quit: Leaving.)
[19:02] * jrisch (~Adium@80-62-117-240-mobile.dk.customer.tdc.net) Quit (Read error: Connection reset by peer)
[19:03] * joshd1 (~jdurgin@2602:306:c5db:310:a4de:64a:4873:7ec1) has joined #ceph
[19:04] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[19:05] <loicd> https://etherpad.openstack.org/roadmap-for-ceph-integration-with-openstack
[19:09] * mcclurmc (~mcclurmc@firewall.ctxuk.citrix.com) Quit (Read error: Operation timed out)
[19:12] * alram (~alram@ has joined #ceph
[19:13] <Anticimex> loicd: interesting, nice, thanks
[19:15] * rturk-away is now known as rturk
[19:15] * jrisch (~Adium@80-62-117-240-mobile.dk.customer.tdc.net) has joined #ceph
[19:17] <lxo> sage, posted
[19:18] <scuttlemonkey> Azrael: ah hah...I have an update if you're still around
[19:22] * Cube (~Cube@ has joined #ceph
[19:23] * madkiss (~madkiss@port-213-160-22-242.static.qsc.de) has joined #ceph
[19:23] * madkiss (~madkiss@port-213-160-22-242.static.qsc.de) Quit ()
[19:25] * The_Bishop (~bishop@2001:470:50b6:0:25dc:e446:71f6:f1b0) has joined #ceph
[19:29] * calebamiles (~caleb@pool-70-109-184-32.burl.east.myfairpoint.net) Quit (Ping timeout: 480 seconds)
[19:29] <phantomcircuit> updating from 0.56.1 to 0.56.3 shouldn't require anything more than restarting daemons right?
[19:30] <sagewk> phantomcircuit: right
[19:30] <loicd> joshd I saved the revision and I'll get back home, feel free to edit https://etherpad.openstack.org/roadmap-for-ceph-integration-with-openstack as you see fit ;-)
[19:30] <sagewk> 0.56.4 is out though; you should upgrade to that :)
[19:30] <janos> is there an order to the restarts?
[19:31] <janos> like mons first?
[19:31] <janos> i keep eyeballing my .56.3. haven't done .56.4 yet
[19:32] * jrisch (~Adium@80-62-117-240-mobile.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[19:33] * loicd (~loic@3.46-14-84.ripe.coltfrance.com) Quit (Quit: Leaving.)
[19:37] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:37] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:42] * lx0 is now known as lxo
[19:43] <scuttlemonkey> Azrael: gonna drop offline to focus for a bit...but if you wander back through the chef stuff is apparently in a bit of flux
[19:44] <scuttlemonkey> basically, they will be broken if used stock until cuttlefish
[19:44] <absynth> is cuttlefish tasty, btw?
[19:44] <scuttlemonkey> you could add a call in the mon deployment for ceph-createkeys or ceph-authtool before the actual process of getting keys
[19:45] <scuttlemonkey> also, starting services may be a bit dicey b/c of the init scripts used
[19:45] <scuttlemonkey> but there is a config section you can monkey with to make it work
[19:45] <scuttlemonkey> fair bit of work until the new cuttlefish stuff lands, but I wanted to let you know
[19:45] <scuttlemonkey> absynth: that which does not eat me first...is delicious :)
[19:46] <absynth> in East Asia, dried, shredded cuttlefish is a popular snack food.
[19:46] <absynth> nom, nom, nom!
[19:47] <scuttlemonkey> hehe
[19:47] <absynth> joao: In Portugal, cuttlefish is present in many popular dishes, chocos com tinta (cuttlefish in black ink) being among the most popular. This dish is made with grilled cuttlefish served in a sauce of its own ink.
[19:56] * jbarth (~jbarth@a.clients.kiwiirc.com) has joined #ceph
[19:57] * madkiss (~madkiss@port-213-160-22-242.static.qsc.de) has joined #ceph
[19:57] <jbarth> hello #ceph, how would one list all the user accounts for radosgw?  What am I missing?
[20:01] * dpippenger (~riven@ has joined #ceph
[20:02] * madkiss1 (~madkiss@tmo-103-41.customers.d1-online.com) has joined #ceph
[20:05] <dmick> jbarth: I think there may be a RESTful interface that would help, but I'm not sure; did you check out http://ceph.com/docs/master/radosgw/admin/adminops/#get-usage? Otherwise, ISTR that you need to rados ls a particular pool.
[20:06] * chutzpah (~chutz@ has joined #ceph
[20:06] <gregaf> I don't think the restful interface for admin ops is implemented yet, though one's under development — yehudasa?
[20:06] <gregaf> look at the radosgw-admin CLI tool
[20:07] <dmick> get-usage/trim-usage were the only ones, I thought
[20:07] <dmick> but I could be wrong
[20:07] <gregaf> ah, that'd be cool if they did exist
[20:07] <jbarth> so what you mention about the admin API is thus far for usage reporting
[20:07] <dmick> yes, but it can give you detail from which you could maybe extract users. I dunno if it shows you entries for 0-usage users tho
[20:08] <gregaf> I suppose the current design does expect you to have a separate database of users since you need a different system for billing, etc anyway
[20:08] * madkiss (~madkiss@port-213-160-22-242.static.qsc.de) Quit (Ping timeout: 480 seconds)
[20:08] * maoy (~maoy@ has joined #ceph
[20:09] <jbarth> gregaf - I have looked at the radosgw-admin, seems that this should be there but, unless I'm blind I don't see how you get such info using this tool.  Sure you get the keys as output, one time when you create them but, what if I need to go back and retrieve a key?
[20:10] <dmick> yeah, I don't think it's there. I've never really been clear on why
[20:10] <gregaf> user info will give you that — you do need to know the user exists though, yes
[20:11] <gregaf> dmick: we don't maintain a list of the users anywhere; I suppose it wouldn't be that expensive to do so but there hasn't been demand for it that I'm aware of
[20:12] * madkiss (~madkiss@tmo-107-67.customers.d1-online.com) has joined #ceph
[20:14] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Read error: Connection reset by peer)
[20:15] <jbarth> well if you have to know the user is there that is not very manageable, what if you made a typo...that user isn't "there" when you go back and try and retrieve their info, i.e. because it was Tim not Tom
[20:16] <dmick> jbarth: you can derive this from ls of one of the pools; I just don't remember its name right now
[20:16] * madkiss1 (~madkiss@tmo-103-41.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[20:17] <dmick> rados -p .users.uid ls
[20:17] <jbarth> seriously, no demand for this?  how would anyone ever use it for multi-tenant access?
[20:17] <jbarth> sweet, let me try that
[20:17] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[20:18] <gregaf> jbarth: there's no billing support in here either; the expectation is that administrators will have a separate database holding all the users they have
[20:18] <gregaf> there are ways to retrieve the data but we don't present them nicely because they're relatively expensive
[20:18] <gregaf> where "expensive" here means "not looking up a list in one object"
[20:18] <jbarth> gotcha so one would create a user and better be damned sure they saved the output
[20:19] <gregaf> well, yeah…how else in a service environment would you return the user info to the person who created it?
[20:19] <gregaf> it's not like they can mystically peer into a screen in your DC so it's got to go somewhere to get put into a web view, etc ;)
[20:19] <jbarth> well usually when I create a user in a userbase, it is queryable
[20:19] <gregaf> (this isn't to say it's a bad feature request!)
[20:19] <jbarth> LDAP, SQL, NIS on and on
[20:24] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:24] <dmick> it should at least be an FAQ methings
[20:24] <dmick> *ks
[20:25] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[20:26] <joshd> loicd: great! thanks again for organizing the session
[20:28] <nwl> loicd, joshd: i've tidied up the etherpad
[20:31] <loicd> joshd my pleasure :-)
[20:31] <loicd> nwl: thanks !
[20:32] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[20:33] <jbarth> dmick: thank you for the info
[20:33] <dmick> no worries. good to know you're not nuts? :)
[20:35] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[20:35] * ivotron (~ivo@dhcp-59-232.cse.ucsc.edu) has joined #ceph
[20:36] * dosaboy (~user1@host86-161-164-218.range86-161.btcentralplus.com) Quit (Remote host closed the connection)
[20:36] * dosaboy (~user1@host86-161-164-218.range86-161.btcentralplus.com) has joined #ceph
[20:37] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[20:38] <jbarth> I think that it is a missing piece and maybe needing mention in the FAQ or maybe just in the RADOSGW guide, I'd like to hear how everyone else manages their users, it almost seems that a wrapper would be needed to grab the radosgw-admin create output and populate that to another system
[20:46] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:46] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[20:48] * yasu` (~yasu`@dhcp-59-149.cse.ucsc.edu) has joined #ceph
[20:48] * yasu` (~yasu`@dhcp-59-149.cse.ucsc.edu) Quit (Remote host closed the connection)
[20:48] * yasu` (~yasu`@dhcp-59-149.cse.ucsc.edu) has joined #ceph
[20:50] * brahmanda (~brahmanda@ip-95-220-238-248.bb.netbynet.ru) has joined #ceph
[20:50] <brahmanda> Hi! Are there any Russian speakers here?
[20:50] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[20:53] <SvenPHX> Anyone done any OS performance tuning for a ceph cluster?
[20:56] <Elbandi_> i want to set max_mds back to 1
[20:56] <Elbandi_> but always 2 mds are up and in :(
[20:57] * jrisch (~Adium@4505ds2-hi.0.fullrate.dk) has joined #ceph
[20:59] <brahmanda> Help!
[20:59] * maoy (~maoy@ Quit (Quit: maoy)
[21:00] <brahmanda> I install ceph by http://pve.proxmox.com/wiki/Storage:_Ceph
[21:01] <brahmanda> I have the same performance
[21:01] * jlogan1 (~Thunderbi@2600:c00:3010:1:fc52:a0e0:824c:3a1d) Quit (Quit: jlogan1)
[21:01] <brahmanda> CrystalDiskMark 3.0.2 Shizuku Edition x64 (C) 2007-2013 hiyohiyo
[21:01] <brahmanda> Crystal Dew World : http://crystalmark.info/
[21:01] <brahmanda> -----------------------------------------------------------------------
[21:01] <brahmanda> * MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
[21:01] <brahmanda> Sequential Read : 63.282 MB/s
[21:01] <brahmanda> Sequential Write : 6.273 MB/s
[21:01] <brahmanda> Random Read 512KB : 63.769 MB/s
[21:01] <brahmanda> Random Write 512KB : 6.685 MB/s
[21:01] <brahmanda> Random Read 4KB (QD=1) : 4.241 MB/s [ 1035.5 IOPS]
[21:01] <brahmanda> Random Write 4KB (QD=1) : 0.228 MB/s [ 55.7 IOPS]
[21:01] <brahmanda> Random Read 4KB (QD=32) : 4.424 MB/s [ 1080.1 IOPS]
[21:02] <brahmanda> Random Write 4KB (QD=32) : 0.252 MB/s [ 61.4 IOPS]
[21:02] <brahmanda> Test : 100 MB [C: 34.6% (11.0/31.9 GB)] (x1)
[21:02] <brahmanda> Date : 2013/04/02 22:57:04
[21:02] <brahmanda> OS : Windows Server 2008 R2 Server Standard Edition (full installation) SP1 [6.1 Build 7601] (x64)
[21:02] <brahmanda> Why is the write speed so low?
[21:02] <dmick> Elbandi_: try set_max_mds to 1, and then mds stop/mds rm
[21:04] <Elbandi_> nothing happens
[21:05] <Elbandi_> # ceph mds rm 1
[21:05] <Elbandi_> mds gid 1 dne
[21:05] <Elbandi_> but mds.1 still there in dump
[21:05] <Elbandi_> mds.1.247 up:replay seq 1 laggy since 2013-04-02 20:36:52.139847 (standby for rank 1 'ceph-mds1')
[21:05] <dmick> yes. dne is "does not exist". the gid isn't 1; look in the dump
[21:05] <dmick> (it's not very friendly)
[21:06] <yasu`> Did you replace monmap ? > Elbandi_
[21:06] <dmick> the number after the =, or before the :, is the gid
[21:07] * jrisch (~Adium@4505ds2-hi.0.fullrate.dk) has left #ceph
[21:08] <dmick> Elbandi_: i.e. up {0=4597,1=4599,2=4598}
[21:08] <dmick> gids are the 45xx
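dmick's point above is that in the `up {...}` line of `ceph mds dump`, each entry reads rank=gid, and `ceph mds rm` wants the gid. A minimal sketch of pulling that mapping out of a pasted line follows; the function name `parse_up_map` is made up for illustration, and the line format is assumed to be exactly what appears in this log.

```python
import re


def parse_up_map(line):
    """Parse a line such as 'up {0=4597,1=4599}' into a {rank: gid} dict.

    Assumes every entry inside the braces has the form rank=gid,
    as in the 'ceph mds dump' output pasted in this channel.
    """
    match = re.search(r"\{([^}]*)\}", line)
    if match is None:
        raise ValueError("no {...} map found in line")
    mapping = {}
    for pair in match.group(1).split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate an empty map: 'up {}'
        rank, gid = pair.split("=")
        mapping[int(rank)] = int(gid)
    return mapping


if __name__ == "__main__":
    for rank, gid in sorted(parse_up_map("up {0=4597,1=4599,2=4598}").items()):
        print(f"rank {rank} -> gid {gid}")
```

With the example line, rank 1 maps to gid 4599, which is the number to hand to the gid-based commands rather than the rank itself.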
[21:09] <Elbandi_> ah
[21:10] <Elbandi_> haha
[21:10] <Elbandi_> 0: :/0 '' mds.-1.0 up:stopping seq 0 laggy since 2013-04-02 21:08:12.452789
[21:10] <Elbandi_> :D
[21:10] <dmick> yw
[21:14] <Elbandi_> # ceph mds dump|grep up
[21:14] <Elbandi_> up {0=20707,1=20702,20276=0,20702=0}
[21:16] * alram (~alram@ Quit (Quit: Lost terminal)
[21:16] * rustam (~rustam@5e0f5b1e.bb.sky.com) has joined #ceph
[21:16] * alram (~alram@ has joined #ceph
[21:16] <Elbandi_> how does this big number come? :/
[21:20] * jlogan (~Thunderbi@2600:c00:3010:1:943e:a21b:f1f8:c84e) has joined #ceph
[21:20] * madkiss (~madkiss@tmo-107-67.customers.d1-online.com) Quit (Read error: No route to host)
[21:28] <Psi-jack> How do you kick out an osd instead of waiting for it to time out while it's down?
[21:31] <dmick> Elbandi_: sorry, what's the question?
[21:31] <dmick> Psi-jack: mark it down or out as appropriate?
[21:31] <dmick> ceph osd <down|out>
[21:32] <Psi-jack> Ahhh, ceph osd out, I believe is the one I'm looking for.
[21:33] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) Quit (Quit: Leaving.)
[21:34] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[21:35] <Psi-jack> Ahh, there he is!
[21:36] <Psi-jack> fghaas: Heh, tomorrow's the day I'm finally doing my ceph presentation to a local LUG group. :)
[21:36] <fghaas> Psi-jack: good luck for that! if anyone's taking pics or videos, be sure to post them somewhere
[21:37] <fghaas> (ceph community on g+ would be one option)
[21:37] <Psi-jack> hehe, I don't think there'd be any of that. Mostly just people curiously interested and excited about it.
[21:38] <Psi-jack> My presentation is mostly just going to be in front of about 15~20 people, of which some will take the knowledge to their place of work and most likely use it.
[21:38] <fghaas> Psi-jack: did you steal^Wborro^Wuse any of my material on github?
[21:39] <Psi-jack> Yep. I actually have the LCA 2013 presentation which I will be borrowing, along with all the kvm images and all for the technical presentation parts. ;)
[21:40] <fghaas> Psi-jack: sure, with the caveat that some of the bugs we highlighted then have been fixed since
[21:40] <Psi-jack> hehe yep. I'm in heavy note-taking phase, about all that, and stuff. LOL
[21:41] <rturk> Psi-jack: let me know if you find yourself short on material, I've built a few Ceph talks as well
[21:41] <Psi-jack> Building myself a Zim Wiki I'll be using to present the material and have quick notes on specific details, notes I may not know fully 100% on my own.
[21:41] <fghaas> rturk: are you doing oscon this year?
[21:41] <rturk> fghaas: they encouraged me to submit another ignite talk, so I will do that
[21:42] <Psi-jack> rturk: Nice. So far, Florian's here is pretty good. I have some of my own material after shortening his up a little bit and focusing some more on the technical aspect, and I have my own cluster at home I can use to demonstrate more technical aspects already done and in production use. :)
[21:42] <fghaas> rturk, they turned down both of my ceph talks, so I won't be able to give you crazy ideas this year :)
[21:42] <Psi-jack> Wow. Turned down?
[21:42] <rturk> not unusual for OSCON, it's a popular show
[21:43] <fghaas> Psi-jack: yup, got the notification last week. suppose they had enough ceph talks already. :) is joshd speaking?
[21:43] <Psi-jack> fghaas: I never could actually find a place to reasonably buy that presenter remote you suggested.. I ended up with a Logitech R400, which works out pretty well for me. I can just re-map the keys appropriately for my presentation as need-be. :)
[21:43] <fghaas> Psi-jack: I used that one for a time too, it's fine
[21:43] <Psi-jack> Yeah, it works. Not sure how it is on battery life, yet. heh
[21:43] <rturk> fghaas: as far as I know we haven't heard yet
[21:43] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) has joined #ceph
[21:43] <Psi-jack> I didn't bother spending the extra $40 for the LED timer and green laser. LOL
[21:44] <Psi-jack> Err, lcd timer.
[21:46] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:46] <Psi-jack> fghaas: heh, because my /personal/ laptop is so underpowered, I almost needed to just put the ceph demo servers onto my existing proxmox VE environment, on ceph storage, basically doubling up ceph on top of ceph, but thankfully I was able to tune down the memory and balloon them so if they needed more memory, I could tune it on demand. LOL
[21:46] <Psi-jack> 3 GB RAM and 4 1GB RAM VM's was a bit heavy. :)
[21:47] <joshd> fghaas: not at OSCON, but at openstack
[21:47] <fghaas> joshd: ahum. so are there _any_ ceph talks at oscon?
[21:47] <fghaas> if they didn't accept any, that would be... bad. like, real bad.
[21:48] * gentleben (~sseveranc@ has joined #ceph
[21:49] * mcclurmc (~mcclurmc@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[21:49] <gentleben> hi, what leveldb package is needed to build master?
[21:50] <joshd> fghaas: I didn't submit any to OSCON, it sounds like rturk may still get through
[21:51] <fghaas> well if rturk is doing an ignite talk that's great, but if there's not a single ceph talk in the main program, and as per http://www.oscon.com/oscon2013/public/schedule/full it seems like there isn't, that's really poor scheduling
[21:53] <rturk> agree. I know we submitted a few
[21:54] <rturk> but that's OSCON :) they get *tons* of submissions
[21:55] <Psi-jack> heh
[21:55] <dmick> gentleben: debian/control says libleveldb-dev
[21:55] <Psi-jack> wow.. yeah.
[21:56] <dmick> gentleben: you can also find that in README
[21:57] <gentleben> dmick: I think my problem may be with gentoo. It doesn't seem to find it. I will play with it for a while
[21:57] <gentleben> thanks
[21:57] <dmick> gentleben: it's pretty easy to build from source if you have to
[21:57] <dmick> IIRC
[21:58] <dmick> and: I dunno gentoo, but I googled leveldb gentoo and found http://packages.gentoo.org/package/dev-libs/leveldb
[21:58] <dmick> fwiw
[21:59] <Elbandi_> dmick: the initial issue is: i change back max_mds to 1, but two mds want to be active :(
[21:59] <Elbandi_> the question is: why? :D
[21:59] <dmick> yes, but I thought we solved that
[22:00] <dmick> and the reason is, probably, the mdsmap has two mds's in it from when it was created
[22:00] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[22:00] <Elbandi_> and how can i remove one mds?
[22:01] <Elbandi_> because
[22:01] <Elbandi_> # ceph mds stop 20755
[22:01] <Elbandi_> mds.20755 not active (down:dne)
[22:01] <Elbandi_> # ceph mds rm 20755
[22:01] <Elbandi_> cannot remove active mds.ceph-mds2 rank 1
[22:01] <Elbandi_> and
[22:01] <Elbandi_> up {0=20967,1=20755}
[22:01] <Elbandi_> 20755: x.x.34.83:6800/17545 'ceph-mds2' mds.1.253 up:replay seq 24
[22:02] <dmick> max_mds is indeed 1?
[22:02] <Elbandi_> # ceph mds dump|grep max_mds
[22:02] <Elbandi_> max_mds 1
[22:04] <PerlStalker> Is there a doc that explains the various log formats?
[22:05] <dmick> Elbandi_: you got me there. gregaf?
[22:05] <dmick> PerlStalker: not that I'm aware of. it is knowable from examining the source and its various "operator<<"s, but it's not easy.
[22:06] <fghaas> PerlStalker: sure you mean formats? or levels?
[22:06] <gregaf> you can't bring down an MDS unless it's active, and yours aren't because they're stuck in replay
[22:07] <gregaf> Elbandi_ & dmick ^
[22:07] <PerlStalker> dmick: That's what I was afraid of. I was hoping to build a simple logstash filter. It doesn't look like it will be simple.
[22:07] <PerlStalker> fghaas: formats
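For the common prefix at least, a filter may be less painful than it sounds. The mds lines pasted near the top of this log (for example `2013-04-02 00:10:48.558270 7f6136b0a700 10 mds.1.241 beacon_send ...`) appear to start with a timestamp, a hex thread id, and a numeric debug level, followed by free-form text. The sketch below splits only that prefix; the field names and the regex are assumptions inferred from those pasted lines, not a documented format, and the message part is left unparsed.

```python
import re

# Assumed prefix layout, inferred from the mds log lines pasted in this channel:
#   <date> <time.micros> <hex thread id> <debug level> <free-form message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>[0-9a-f]+)\s+"
    r"(?P<level>-?\d+)\s+"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of prefix fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("2013-04-02 00:10:48.560022 7f6138d10700 10 "
              "mds.1.241 handle_mds_beacon up:replay seq 68 rtt 0.001718")
    print(parse_log_line(sample))
```

The same pattern could be carried over to a logstash grok expression; anything past the debug level varies per subsystem, which is the part dmick says is only knowable from the source.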
[22:07] <gregaf> you can set max_mds to what you like but that sets limits on new ones, not on existing ones
[22:07] <dmick> gregaf: I brought mine down from active after setting max_mds, and then I rm'ed it, which succeeded, but it's still listed in mds dump
[22:08] <dmick> had 3, and now mds dump shows
[22:08] <gregaf> not sure what sequence you're describing
[22:08] <dmick> e97: 2/2/2 up {0=a=up:active,1=c=up:active}, 1 up:standby
[22:08] <fghaas> PerlStalker: sorry, no expert there. I once got stuck elbow deep in dout levels, but formats no, sorry :)
[22:08] <gregaf> dmick: right, that's correct
[22:08] <dmick> have 3: set_max_mds to 2; stop 1; rm 1
[22:09] <dmick> I would have thought stop would move it from active to standby, and rm would remove it altogether
[22:09] <dmick> but apparently not.
[22:09] <gregaf> yeah, if you leave the daemon running then it will put itself back into the map
[22:09] <gregaf> and it will be a standby because you're only letting 2 be active
[22:10] <gregaf> so you stopped it (moved it from active to standby), then rm'ed it (which removed it from the map), and then the daemon got a new map and said "hey, I'm not in the map, put me in it" and the monitor did so
[22:10] <dmick> awesome
[22:11] <Elbandi_> gregaf: if i stop all mds (service ceph stop mds), and after i start them (service ceph start mds), _two_ want to be active (start replay, etc)
[22:11] <gregaf> Elbandi_: yes, that's because you had them both active before
[22:11] <gregaf> now there's a bug of some kind so they get stuck in recovery
[22:12] <gregaf> but you need them both active (and out of recovery) in order to stop them
[22:12] <dmick> so to stop them *and* keep them out of the map, stop, rm, kill. ok.
[22:12] <Elbandi_> yes, mds never go to active...
[22:12] <Elbandi_> so i can stop&rm it :(
[22:12] * eschnou (~eschnou@191.208-201-80.adsl-dyn.isp.belgacom.be) Quit (Quit: Leaving)
[22:12] <Elbandi_> i cant *
[22:13] * brahmanda (~brahmanda@ip-95-220-238-248.bb.netbynet.ru) Quit ()
[22:14] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) Quit (Remote host closed the connection)
[22:19] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[22:19] <Elbandi_> so, there is no way to stop/rm mds when they are not active?
[22:22] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:24] <gregaf> there is not; removing them involves doing data migrations
[22:31] <Elbandi_> thats not good :(
[22:32] <absynth> gregaf: are there any public holidays or something in cali about 15th of april?
[22:32] <absynth> (2nd half of april, that is)
[22:32] <gregaf> don't think so?
[22:32] <janos> april 15th is a "holiday" of sorts in the US ;(
[22:32] <janos> uncle sam's favorite day!
[22:33] <gregaf> haha, true, tax deadline is the 15th
[22:33] <dmick> 4/22 is Earth Day :)
[22:34] <absynth> heh, there's a simpsons episode about tax deadline
[22:34] <absynth> and that reminds me
[22:34] <absynth> FUCK
[22:34] <dmick> why do you ask absynth
[22:34] <absynth> (pardon my french)
[22:34] <absynth> gotta do my taxes
[22:34] <janos> lol
[22:34] <absynth> dmick: planning 0.56.4 upgrade
[22:34] <yasu`> how Ceph treats null blocks ? What if I write from 4K offset of a new file ? does it take physical storage space ?
[22:35] <absynth> i wanna do that when someone from inktank is around, or at least awake
[22:35] <gregaf> lol
[22:36] <absynth> srsly
[22:36] <absynth> i kid you not
[22:36] <gregaf> yasu`: it's all sparsely-allocated, assuming the filesystem you have underneath your OSDs is
[22:36] <dmick> better make the criterion "awake"; "around" is insufficiently-specified
[22:36] <absynth> "someone around" == "green dot next to sage's IM"
[22:36] <yasu`> thanks gregaf
[22:37] <yasu`> the FS is btrfs
[22:38] <yasu`> I don't fully understand layout and striping, but does the underlying FS need to handle the sparsity ?
[22:38] <gregaf> btrfs will do nicely; it will all be sparse
[22:39] <yasu`> I thought file contents are split and saved in the underlying fs. No ?
[22:41] <gregaf> when using CephFS or RBD or RGW yes, but they're by default in 4MB chunks so if you're talking about 4KB of sparseness the filesystem needs to handle it too
[22:42] <gregaf> I think they all do so but there are some weird FSes out there
[22:42] <yasu`> I see. Thanks !
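To make gregaf's answer concrete: with the default 4MB chunks he mentions, a write at a 4K offset of a new file lands 4096 bytes into the first chunk, so the hole before it sits inside a single backing object and the OSD's filesystem is what has to keep it sparse. A small sketch of that arithmetic follows, assuming the simplest layout where a file is cut into consecutive fixed-size objects; real layouts are configurable, and `locate` is a made-up helper name.

```python
DEFAULT_OBJECT_SIZE = 4 * 1024 * 1024  # the 4MB default chunk size mentioned above


def locate(offset, object_size=DEFAULT_OBJECT_SIZE):
    """Map a byte offset in a file to (object index, offset inside that object).

    Assumes consecutive fixed-size objects with no further striping.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, object_size)


if __name__ == "__main__":
    for off in (4096, 4 * 1024 * 1024, 10 * 1024 * 1024):
        index, inner = locate(off)
        print(f"file offset {off} -> object {index}, offset {inner} within it")
```

A hole that spans whole chunks simply means those objects are never created; a hole smaller than a chunk, like the 4K case, relies on the underlying filesystem.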
[22:43] <pioto> i'm trying to compare rbd with an existing iscsi setup, and rbd is coming out orders of magnitude slower... any suggestions of things i may need to tweak in the config, that i haven't thought of yet? i've upped the journal size from 1GB to 5GB, and i still see very short, small bursts in writes, and then long lulls
[22:44] <yasu`> by the way, I am reading Ceph source to create a hook framework in it. If anybody is working on it or interested in it,, please let me know.
[22:44] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[22:45] * calebamiles (~caleb@c-50-138-218-203.hsd1.vt.comcast.net) has joined #ceph
[22:45] <pioto> like, a really naive comparison using 'hdparm -t' from within 2 VMs, one running with iscsi, one with the native libvirt rbd... 183.86 MB/sec vs. 7.56 MB/sec
[22:46] <nhm> pioto: is this kernel rbd or qemu/kvm?
[22:46] <gregaf> iscsi is probably a lot lower latency and it looks like hdparm -t is doing synchronous requests to the disk
[22:47] <gregaf> just as a guess
[22:47] <nhm> pioto: you should talk to xiaoxi who is in the channel here in the mornings and evenings. He's done some comparisons with iscsi and RBD as well with some favorable results with 0.58+
[22:47] <pioto> nhm: qemu/kvm
[22:48] <pioto> but the kernel is no better/worse
[22:48] <nhm> pioto: virtio driver?
[22:48] <pioto> yes
[22:48] <pioto> for both
[22:48] <nhm> rbd cache?
[22:48] <pioto> hm
[22:48] <pioto> how do i find that?
[22:48] <gregaf> yasu`: not sure what exactly you mean by a hook framework, but if you're talking about object classes you should know that 1) they exist, and 2) you should talk to nwat when he's around as he's written a lua interpreter for them
[22:48] <pioto> (it'll be whatever the default is i guess)
[22:49] <nhm> http://ceph.com/docs/master/rbd/rbd-config-ref/
[22:49] <pioto> so "off" i guess...
[22:49] <pioto> but that's just to help with reads, not writes?
[22:49] <pioto> oh, hm...
[22:49] <nhm> it will help quite a bit with small sequential writes, and even helps some with random writes in some cases.
[22:49] <pioto> interesting...
[22:50] <pioto> so...just [client] rbd cache = true
[22:50] <pioto> let's see
[22:50] <absynth> uh, what version are you running?
[22:50] <pioto> i assume i'll have to restart my vm for it to pick that up...
[22:50] <lurbs> pioto: You can also turn it on/off at the level of a single disk in libvirt's XML config.
[22:50] <pioto> bobtail
[22:50] <nhm> possibly. I also added something to my VM xml but it may not have been needed.
[22:50] <absynth> but no qemu / kvm, right?
[22:50] <pioto> hm
[22:50] <absynth> just rbd as an iscsi alternative
[22:51] <absynth> because you don't want rbd_cache with qemu/kvm and bobtail.
[22:51] <pioto> ok
[22:51] <lurbs> And I believe that if you turn it on then you need cache=writeback, yes?
[22:51] <yasu`> thanks gregaf, so there's a hook framework to object storage, where we can hook a callback function for creating/modifying an object ?
[22:51] <absynth> i believe the issue we are seeing is only fixed in 0.60 (right?)
[22:51] <janos> i thought rbd cache was fine as long as you set writeback
[22:51] <janos> for the guests
[22:51] <pioto> so, basically, rbd is just gonna be orders of magnitude slower in this version? or...?
[22:52] <gregaf> yasu`: there's a framework called "object classes" that lets you put shared libraries on the OSDs, and then call into them via the librados library "execute" functions
[22:52] <absynth> janos: as soon as you have i/o going on in the guests, i/o to the guests is stalled, that's what we saw with all combinations of rbd_cache and qemu cache
[22:52] <absynth> (qemu 1.2, iirc)
[22:52] <gregaf> the interface looks like "execute(object_id, function_to_call, data_for_function)" or something
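The call shape gregaf describes can be sketched with a toy in-memory model. This is not librados and not a real object class: `ToyOSD`, `register`, and `append_and_count` are hypothetical names invented for the illustration, and only the three-argument `execute(object_id, function_to_call, data_for_function)` form comes from the line above. The point it shows is that the function runs next to the stored object and only a small reply travels back to the caller.

```python
from typing import Callable, Dict, Tuple

# A class method takes (current object bytes, input bytes) and
# returns (new object bytes, reply bytes). Hypothetical signature.
ClassMethod = Callable[[bytes, bytes], Tuple[bytes, bytes]]


class ToyOSD:
    """Hypothetical stand-in for an OSD holding objects and loaded class methods."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.methods: Dict[str, ClassMethod] = {}

    def register(self, name: str, fn: ClassMethod) -> None:
        # Stands in for installing a shared library on the OSD.
        self.methods[name] = fn

    def execute(self, object_id: str, function_to_call: str,
                data_for_function: bytes) -> bytes:
        # Look up the named method and run it against the stored object.
        fn = self.methods[function_to_call]
        current = self.objects.get(object_id, b"")
        new_value, reply = fn(current, data_for_function)
        self.objects[object_id] = new_value
        return reply


def append_and_count(current: bytes, data: bytes) -> Tuple[bytes, bytes]:
    # Example method: append the input and reply with the new length.
    updated = current + data
    return updated, str(len(updated)).encode()


osd = ToyOSD()
osd.register("append_and_count", append_and_count)
first_reply = osd.execute("obj1", "append_and_count", b"abc")
second_reply = osd.execute("obj1", "append_and_count", b"de")
print(first_reply, second_reply)
```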
[22:52] <pioto> absynth: well, what i see now is, basically... i'm trying to run, say, a more serious benchmark (say, mysql's sql-benchmark)
[22:52] <janos> absynth: interesting. i'll try out killing rbd_cache then. did you still keep writeback in guests?
[22:53] <pioto> and i see tiny write bursts, basically no reads
[22:53] <gregaf> pioto: what you're seeing here is latency issues due to small IO going to disk; if you have multiple IOs in parallel (like you would behind a page cache) you won't have any trouble
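The latency argument above can be made concrete with a little arithmetic: with a fixed per-request latency, throughput scales with the number of requests in flight. The sketch below assumes a 4 KiB request size, a 2 ms round trip, and queue depths of 1 and 32. These numbers are illustrative assumptions, not measurements from this cluster, and the model ignores the point at which the disks or network saturate.

```python
def throughput_mib_s(io_size_kib: float, latency_ms: float, queue_depth: int) -> float:
    """Estimate throughput when `queue_depth` requests are always in flight.

    Each request takes `latency_ms` to complete, so the completion rate is
    queue_depth / latency (Little's law), ignoring device saturation.
    """
    ios_per_second = queue_depth * 1000.0 / latency_ms
    return ios_per_second * io_size_kib / 1024.0


# One synchronous 4 KiB request at a time (what hdparm -t appears to do).
sync_small = throughput_mib_s(io_size_kib=4, latency_ms=2.0, queue_depth=1)

# The same request size with 32 requests in flight (e.g. behind a page cache).
parallel_small = throughput_mib_s(io_size_kib=4, latency_ms=2.0, queue_depth=32)

print(f"queue depth 1:  {sync_small:.2f} MiB/s")
print(f"queue depth 32: {parallel_small:.2f} MiB/s")
```

Under these assumed numbers the synchronous case is bounded at about 2 MiB/s however fast the disks are, which is consistent with a lower-latency iSCSI path looking much better on a synchronous benchmark.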
[22:53] <pioto> but, when i look at the same test on an otherwise identical VPS, running over iscsi
[22:53] <absynth> janos: good question, let me see
[22:53] <pioto> it's a nice smooth write
[22:53] <gregaf> mysql is probably still going to be unhappy though, yes
[22:53] <nhm> absynth: I had good results with wip-rbd-cache-aio but that doesn't exist anymore. I'm not sure if it was merged in 0.59 or 0.60.
[22:53] <absynth> janos: yes
[22:53] <janos> cool, thanks for confirming
[22:54] <yasu`> interesting.
[22:54] <absynth> nhm: in any case, you are not going to backport that into the bobtail maintenance releases, right?
[22:54] <pioto> ok. so... i can try some qemu-specific caching stuff (but don't wanna use the rbd cache i take it?)
[22:54] <nhm> absynth: I doubt it, but not my call.
[22:55] <absynth> pioto: in my opinion and from our experience, correct
[22:55] <absynth> nhm: doesn't make a lot of sense, i think, so i'm fine with that
[22:55] <absynth> it will take us a year or two to get all our vms back to rbd_cache=on, anyway
[22:55] <yasu`> there's no documentation for the "object classes" ? I'll try to talk to nwat.
[22:55] <nhm> absynth: Good news is that I'm getting pretty close to krbd level of sequential write performance with QEMU/KVM with RBD cache enabled now and much much better small IO performance.
[22:56] <absynth> that's good news
[22:56] <absynth> did oliver ever tell you about our cronjob issue?
[22:56] <absynth> brb
[22:56] <nhm> absynth: and with the pg_info change in 0.58, performance has improved quite a bit across the board.
[22:58] <yasu`> gregaf: does anybody use the "object classes" framework for any purpose ?
[22:58] <gregaf> yasu`: not much, but you can look at the interface in ceph/src/cls/* (those are classes we wrote and use) and ceph/src/objclass
[22:58] <gregaf> yeah, we use it extensively for RBD and RGW functionality
[22:58] <lurbs> pioto: Personally, if I were building a cluster now I'd probably use 0.60, leave RBD caching etc on, and upgrade to the 0.61.x Cuttlefish series when it comes out in a month or so.
[22:58] <lurbs> And skip the 0.56.x entirely.
[22:59] <yasu`> okay... what I'm trying is a hook framework for a filesystem (Ceph-FS), so I guess it is a bit different
[22:59] <yasu`> but thanks, good to know
[22:59] <pioto> lurbs: hm.
[23:00] <pioto> well. maybe i'll try that
[23:00] <pioto> the qemu cache doesn't seem to be doing much for the mysql tests
[23:00] <dmick> yasu`: what are you attempting to hook, and for what?
[23:01] <yasu`> for what is "research", I guess I'm trying to develop a semantic FS on Ceph
[23:02] <yasu`> and I'm attempting to hook file systems' read/write/readdir/link/...etc
[23:02] <dmick> an
[23:02] <dmick> er, ah, I see
[23:02] <yasu`> but on the Ceph cluster side
[23:02] <nhm> pioto: I don't know a whole lot about tweaking mysql, but there may be some things that can be done to improve it too.
[23:02] <dmick> yeah, object classes aren't the right level there
[23:03] <absynth> nhm: yeah. primarily, throw RAM at the problem, whatever it may be
[23:03] * ivotron (~ivo@dhcp-59-232.cse.ucsc.edu) Quit (Ping timeout: 480 seconds)
[23:04] * ivotron (~ivo@eduroam-225-108.ucsc.edu) has joined #ceph
[23:04] <yasu`> and currently there's no attempt like that, is there ? (important in research :)
[23:04] <nhm> absynth: looks like there is quite a bit here: https://blogs.oracle.com/luojiach/entry/mysql_innodb_performance_tuning_for
[23:04] <yasu`> at least for Ceph
[23:04] <yasu`> ?
[23:05] <dmick> I haven't heard of anything, but that doesn't mean much
[23:06] <yasu`> okay :)
[23:06] <yasu`> and I'm now struggling with understanding policy/authorizer ...
[23:06] <nhm> pioto: FWIW, I do plan to do DB testing on RBD at some point, just haven't had the time to dig into it yet. :/
[23:12] <pioto> nhm: i understand you have other priorities, of course
[23:12] <pioto> i'm just trying to find the best solution for my own needs
[23:12] <pioto> and i wanna have faith in ceph
[23:12] <pioto> because it seems like it could scale so much better.
[23:16] <pioto> so, i guess i am wondering... what workloads is ceph good at?
[23:16] <Azrael> scuttlemonkey: ahh ok. guess i'll have to wait til cuttlefish.
[23:17] <pioto> where can i get it to have a nice solid amount of throughput, instead of lots of little bursts?
[23:17] <scuttlemonkey> Azrael: yeah, sorry it wasn't better news
[23:18] <nhm> pioto: Any time you can avoid or hide latency it does pretty well. For large sequential writes I've gotten around a gigabyte/s from one node with sufficient IO depth.
[23:18] <pioto> ok, so... the caching in the current testing branch is a way to 'hide' the latency i guess
[23:18] <pioto> hm
[23:19] <nhm> pioto: it can help a lot with small sequential writes and helps a little with small random writes, but small random writes are definitely the hardest to do well imho.
[23:21] <Psi-jack> Whew! Okay.. I think I'm FINALLY ready for tomorrow! hehe
[23:21] <pioto> nhm: yeah. so, the other angle of attack is "reduce latency at the network layer" (if possible)?
[23:22] <nhm> pioto, btw, how much replication are you using?
[23:22] * diegows (~diegows@ has joined #ceph
[23:23] <Azrael> scuttlemonkey: so yeah, i did manage to get things going by using the config section with the chef cookbook
[23:23] <Azrael> scuttlemonkey: so the init scripts would play nice
[23:24] <Azrael> script*
[23:24] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[23:25] * mjblw3 (~mbaysek@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[23:25] * SvenPHX1 (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[23:25] <Azrael> scuttlemonkey: so whats gonna happen around cuttlefish? new chef cookbook will be released?
[23:26] * rturk is now known as rturk-away
[23:26] * sleinen (~Adium@2001:620:0:25:346c:a8c2:f4e5:3ea2) Quit (Ping timeout: 480 seconds)
[23:27] <scuttlemonkey> well, that's the issue
[23:27] <scuttlemonkey> the cookbooks are already getting updated
[23:27] <pioto> nhm: the default, so, 2 i think
[23:27] <scuttlemonkey> but they'll _work_ out of the box w/ cuttlefish b/c of new init and upstart changes
[23:27] <pioto> my test cluster is very small. only 4 OSDs across 3 hosts
[23:27] <Azrael> scuttlemonkey: ahh, hehe
[23:28] <Azrael> scuttlemonkey: so. basically, besides using the config section w/in attributes, i'll have to add a call to ceph-authtool?
[23:29] <scuttlemonkey> yeah, just a fair bit of duct tape and baling wire :)
[23:29] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[23:29] * mjblw1 (~mbaysek@wsip-174-79-34-244.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[23:29] <Azrael> scuttlemonkey: would the "right" way thus be to have the client.admin key also set as an attribute? just like the monitorsecret?
[23:30] <scuttlemonkey> as I understand it the crux is the SysV inits for cuttlefish
[23:30] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[23:30] <scuttlemonkey> I'm not the guy to answer that question
[23:30] <scuttlemonkey> :)
[23:31] <Azrael> yeah, the init was "fun"
[23:32] <pioto> oh, i think i remember seeing that cuttlefish is using upstart instead of sysv init?
[23:32] <Azrael> i read through the code. basically its trying to be smart and match your hostname to the right section(s) in ceph.conf and then start things accordingly.
[23:32] <pioto> at least in ubuntu?
[23:32] <Azrael> but, the cookbook recipes dont create that section in ceph.conf, at least for mon heh
[23:35] <Azrael> scuttlemonkey: has the ceph team thought about using Zookeeper perhaps for config/cluster management? instead of ceph.conf and fun config parsing?
[23:37] <scuttlemonkey> Azrael: I know design paradigms have been drawn from zookeeper
[23:38] <scuttlemonkey> as for using it, I haven't heard one way or the other
[23:38] <scuttlemonkey> ok, time to go pick up dinner
[23:38] <scuttlemonkey> back in a bit
[23:38] <Azrael> ok
[23:38] <Azrael> thanks for the help scuttlemonkey
[23:38] <Azrael> i'm gonna head to sleep. its late here.
[23:38] <scuttlemonkey> no problem
[23:38] <scuttlemonkey> enjoy :)
[23:39] * ivotron (~ivo@eduroam-225-108.ucsc.edu) Quit (Ping timeout: 480 seconds)
[23:41] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:42] <gregaf> scuttlemonkey: *cough* Zookeeper and Ceph monitors draw their design paradigms from the same sources — not from each other at all, though :)
[23:47] * rustam (~rustam@5e0f5b1e.bb.sky.com) Quit (Remote host closed the connection)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.