#ceph IRC Log


IRC Log for 2012-04-04

Timestamps are in GMT/BST.

[0:10] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[0:32] <perplexed> Let's say I have 2 racks, and 4 hosts = 2 per rack... and that I have a pool configured to use 3 copies. If I configure the crush map to use a rule that assigns equal priority to rack1 & rack2, how is the rack that will carry 2 copies of an object identified? Is it just random based on the hash and OSD peering? Is it reasonable to expect that rack1 & rack2 will ultimately be balanced from a storage/traffic perspective?
[0:33] <sagewk> perplexed: it won't. crush is strict about the 'choose' constraints, so you'll only get 2 replicas
[0:33] <sagewk> you want to make sure you have enough racks given your rule and replication count
[0:33] <perplexed> ah.. so replication needs to be <= racks
[0:34] <perplexed> thx
[0:34] <gregaf> you could also set it up so racks can choose more than one host
[0:34] <gregaf> but each rack has to choose the same number (for now...)
[0:35] <perplexed> Right now I defined rack1 as server1/server2, rack2 as server3/server4.
[0:36] <sagewk> perplexed: i should mention that there are some crush issues right now when the "racks" are super small like that. for your test, i would just make it a flat collection of 4 hosts
[0:37] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Remote host closed the connection)
[0:37] <perplexed> Guess I could define 2 more logical racks (1 rack per server), but then I'm really back to the same effect (in my test env) as 4 servers with placement based on host... I'll just drop the pool replication down to 2 for the moment, noting that in a real-world scenario we'd want to ensure number of racks >= number of replicas
[0:38] <sagewk> yeah
[0:38] <perplexed> Is it possible to create a complex rule that distributes, say, 2 copies in the local datacenter (across 2 racks), and one copy in any rack in the remote datacenter?
[0:39] <sagewk> yeah
[0:40] <sagewk> step take localdc
[0:40] <sagewk> step choose firstn -1 type osd
[0:40] <sagewk> step emit
[0:40] <sagewk> step take remotedc
[0:40] <sagewk> step choose firstn 1 type osd
[0:40] <sagewk> step emit
[0:40] <sagewk> will choose n-1 from the local dc and 1 from the remote dc
[0:40] <sagewk> where localdc and remotedc are two different hierarchies, presumably
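A minimal sketch of how the replica counts in the rule above work out, assuming the usual CRUSH convention that a non-positive `firstn` value is taken relative to the pool's replica count (so `-1` means "replicas minus one" and `0` means "all replicas"). The function names are illustrative only and are not part of any Ceph API.

```python
def resolve_firstn(n: int, num_replicas: int) -> int:
    """Number of items a 'choose firstn n' step selects.

    Assumption: n > 0 is an absolute count; n <= 0 is relative to the
    pool's replica count (0 -> all replicas, -1 -> replicas minus one).
    """
    return n if n > 0 else max(num_replicas + n, 0)


def split_replicas(num_replicas: int) -> dict:
    """Replica split for the two-step rule: firstn -1 local, firstn 1 remote."""
    local = resolve_firstn(-1, num_replicas)
    # The rule can never emit more than the pool's replica count in total.
    remote = min(resolve_firstn(1, num_replicas), num_replicas - local)
    return {"localdc": local, "remotedc": remote}


if __name__ == "__main__":
    for replicas in (2, 3, 4):
        print(replicas, split_replicas(replicas))
```

For a pool with 3 copies this gives 2 placements under localdc and 1 under remotedc, which is the split asked about above.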
[0:48] <perplexed> Thx. I was wondering if datacenter would somehow need to be defined in the crush map file as a new type (type 4 datacenter), and the rule would look something like "choose firstn N type datacenter". Seems that complexity isn't needed with your approach anyhow. Are the 4 crush map types hardcoded? osd/host/rack/pool? I likely need to do some further digging into the crush logic / rule config syntax...
[0:50] <sagewk> you can define whatever types you want.. the default just defines a few common ones
[0:50] <gregaf> perplexed: none of them are hardcoded except osd (aka device); although the ceph setup will auto-create host and rack if you put them in your conf
[0:51] <perplexed> Thanks all
[1:07] * yehudasa_ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[1:07] * gregaf1 (~Adium@aon.hq.newdream.net) has joined #ceph
[1:07] * sjust1 (~sam@aon.hq.newdream.net) has joined #ceph
[1:07] * mkampe1 (~markk@aon.hq.newdream.net) has joined #ceph
[1:08] * sagewk1 (~sage@aon.hq.newdream.net) has joined #ceph
[1:08] * yehudasa (~yehudasa@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:08] * joshd (~joshd@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:08] * sjust (~sam@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:08] * sagewk (~sage@aon.hq.newdream.net) Quit (Write error: connection closed)
[1:08] * mkampe (~markk@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:09] * Tv|work (~Tv_@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:11] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[1:13] * gregaf (~Adium@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:13] * dmick (~dmick@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:14] <iggy> remotedc still shouldn't be something with low bandwidth right?
[1:15] <sagewk1> depends on how slow you want your writes to go :)
[1:15] <sagewk1> replication is still synchronous
[1:16] <sagewk1> for some people (nearby datacenters, small countries) that is still fine.
[1:16] <iggy> so if you say 3 replicas, 2 local and 1 remote, all 3 have to complete before writes are ack'ed, right?
[1:16] * steki-BLAH (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:17] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[1:22] * adjohn (~adjohn@s24.GtokyoFL16.vectant.ne.jp) has joined #ceph
[1:34] <sagewk1> right
[1:37] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[1:47] * rturk (~textual@aon.hq.newdream.net) has joined #ceph
[1:50] * lofejndif (~lsqavnbok@28IAADQHM.tor-irc.dnsbl.oftc.net) Quit (Quit: Leaving)
[1:53] <sjust1> anyone holding on to plana machines they can give up?
[1:53] <sjust1> or rather, aren't using?
[1:54] <joshd> sjust1: you need more than 18?
[1:55] <sjust1> there don't seem to be that many free
[1:56] * lofejndif (~lsqavnbok@83TAAEPLG.tor-irc.dnsbl.oftc.net) has joined #ceph
[1:57] <perplexed> Back... Our DC's would all be within 14ms of each other. Replication cross DC would be ideal for DR reasons (DC1 offline, DC2 has the data available). We'd have to accept the added write latency associated with cross DC replication.
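A rough sketch of what synchronous replication means for write latency, assuming the client writes to a primary OSD in the local datacenter, the primary forwards to the other replicas in parallel, and the write is acknowledged only after the slowest replica has committed. The 14 ms figure is treated as a round-trip time, which the log does not specify, and all other numbers are made up for illustration.

```python
def write_ack_latency_ms(client_to_primary_rtt, replica_rtts, commit_ms):
    """Estimate ack latency for a synchronously replicated write.

    Assumptions (a simplified model, not taken from Ceph itself): replicas
    are written in parallel, every OSD needs commit_ms to persist the data,
    and the primary waits for the slowest replica before acking the client.
    """
    slowest_replica = max(replica_rtts, default=0.0)
    return client_to_primary_rtt + slowest_replica + commit_ms


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 ms inside a DC, 14 ms between DCs, 5 ms commit.
    all_local = write_ack_latency_ms(0.5, [0.5, 0.5], 5.0)
    one_remote = write_ack_latency_ms(0.5, [0.5, 14.0], 5.0)
    print(f"3 local replicas:   {all_local:.1f} ms")
    print(f"2 local + 1 remote: {one_remote:.1f} ms")
```

Under this model a single remote replica sets the floor for every write, however fast the local replicas are.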
[1:58] <dmick> sjust1: I gave up plana28 :)
[1:58] <sjust1> 1 down, 8 to go
[1:59] <joshd> is anyone using 40-49? they're locked with no owner
[1:59] <dmick> there are a couple marked down as well
[1:59] <joao> sjust1, I can release the two I've been using
[1:59] <joao> I'm heading to bed anyway
[1:59] <sjust1> if it's convenient
[2:00] <dmick> plana07 seems to be up and usable to me
[2:00] <joao> released
[2:00] <dmick> although..hang on, let me check one more thing
[2:00] <perplexed> A few observations during testing. I wrote ~16k small (13KB avg) files to a new pool (replication 3) across a 4 node cluster (40 sod's - 10 per server) in the same physical rack. Writes were initiated from one of the cluster members (not ideal perhaps). Write performance was ~29Mbps. I used rados import to import the files into the cluster, and it looked to me that they were being distributed across a range of OSD's...
[2:01] <perplexed> which is good :)
[2:02] <perplexed> The distribution of writes to 1st OSD's in the PG (I believe) was as follows though:
[2:02] <joao> sjust1, do you know if there's anything that depends on the format outputted by the transaction's dump() functions?
[2:02] <sagewk1> by default the new pools have only 8 pgs. you need to specify an appropriate number of pgs on creation...
[2:03] <sjust1> I am not sure
[2:03] <sagewk1> ceph osd pool create <name> <pg_num>
[2:03] <perplexed> Server1
[2:03] <perplexed> osd 6 : 1976 writes
[2:03] <perplexed> osd 7 : 1963 writes
[2:03] <perplexed> Server2:
[2:03] <perplexed> osd 12 : 1971 writes
[2:03] <sagewk1> where pg_num should be maybe 50 * number of servers
[2:03] <perplexed> osd 14 : 2034 writes
[2:03] <perplexed> osd 15 : 1998 writes
[2:03] <perplexed> Server3:
[2:03] <perplexed> osd 29 : 2081 writes
[2:03] <perplexed> Server4:
[2:03] <perplexed> osd 34 : 2007 writes
[2:03] <perplexed> osd 36 : 1978 writes
[2:03] <perplexed> TOTAL : 15997 writes
[2:03] <sjust1> joao: you mean the one to an ostream?
[2:03] <perplexed> (sorry for the multi-line spam)
[2:03] <sagewk1> joao: nothing yet
[2:03] <joao> sagewk1, good to know
[2:03] * adjohn (~adjohn@s24.GtokyoFL16.vectant.ne.jp) Quit (Quit: adjohn)
[2:04] <joao> the chances of breaking anything are slim
[2:04] <perplexed> Is that to be expected? I'm assuming the replication will even out the effect once the dust settles
[2:04] <joao> I'll push this to the repo in the morning after cleaning the code a bit
[2:04] <joao> see you tomorrow guys o/
[2:05] <sagewk1> perplexed: not sure which part you're asking about.. the distribution of primaries and replicas should both be uniform
[2:05] <sagewk1> .. but will probably be skewed if you didn't specify a large pg_num on pool creation (see above)
[2:06] <perplexed> Ah... default is 8. I'll retry with a pool defined with more. thx
[2:07] <sagewk1> perplexed: someday soon that will be adjustable after pool creation, but not yet
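A toy illustration of why a pool with only 8 PGs sends its first writes to at most 8 OSDs, as in the counts pasted above. It assumes a simplified mapping in which each object hashes to one PG and each PG has one pseudo-randomly chosen primary OSD; real CRUSH placement is more involved, and every name here is hypothetical.

```python
import hashlib
import random
from collections import Counter


def stable_hash(name: str) -> int:
    """Deterministic hash of an object name (stand-in for Ceph's own hash)."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:8], "big")


def primary_write_counts(num_objects, pg_num, num_osds, seed=0):
    """Count writes per primary OSD for a toy object -> PG -> OSD mapping.

    Assumption: each PG gets one pseudo-randomly chosen primary OSD. This
    only shows the effect of pg_num, not real CRUSH behaviour.
    """
    rng = random.Random(seed)
    pg_primary = [rng.randrange(num_osds) for _ in range(pg_num)]
    counts = Counter()
    for i in range(num_objects):
        pg = stable_hash(f"object-{i}") % pg_num
        counts[pg_primary[pg]] += 1
    return counts


if __name__ == "__main__":
    for pg_num in (8, 200, 2000):
        counts = primary_write_counts(16000, pg_num, num_osds=40)
        print(f"pg_num={pg_num}: {len(counts)} of 40 OSDs act as primary")
```

With 16000 objects spread over 8 PGs, each primary ends up with roughly 2000 writes, which is the pattern in the per-OSD counts above.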
[2:07] <perplexed> Gotta love how my Mac keeps on spell-correcting osd's to sod's :)
[2:09] <dmick> heh
[2:28] * lofejndif (~lsqavnbok@83TAAEPLG.tor-irc.dnsbl.oftc.net) Quit (Quit: Leaving)
[2:30] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[2:35] * yoshi (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:38] * cattelan is now known as cattelan_away
[2:39] * cattelan_away is now known as cattelan
[2:52] * joao (~JL@89-181-151-120.net.novis.pt) Quit (Ping timeout: 480 seconds)
[2:52] <perplexed> sagewk1: moving to pools created with 50x#servers PG's does improve write performance significantly. With 8 PG's across 40 osd's my write performance was about 29Mbps. With 200 PG's across 40 osd's this jumped up to 218Mbps... Same number of files, same average file size imported.
[2:53] <perplexed> Does this sound reasonable?
[2:54] * cattelan (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[2:56] <gregaf1> perplexed: general rule of thumb is 50-200 PGs per OSD (it's highly flexible and not really experimentally verified); I would certainly set up more than 5/OSD by default :)
[2:56] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:57] <gregaf1> and with 4 servers and GigE on each of them I would hope for more than that, though it can depend on config
[2:57] <gregaf1> gotta run though, later!
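The arithmetic behind the rule of thumb above, as a small sketch. It assumes the "50-200 PGs per OSD" figure is applied to the raw PG count of a single pool and ignores replication; the function names are made up for illustration.

```python
def pg_num_range(num_osds, low_per_osd=50, high_per_osd=200):
    """PG count range from the '50-200 PGs per OSD' rule of thumb."""
    return num_osds * low_per_osd, num_osds * high_per_osd


def pgs_per_osd(pg_num, num_osds):
    """Average number of PGs per OSD, ignoring replication."""
    return pg_num / num_osds


if __name__ == "__main__":
    osds = 40  # 4 servers x 10 OSDs, as in the test cluster above
    for pg_num in (8, 200):
        print(f"pg_num={pg_num}: {pgs_per_osd(pg_num, osds):.1f} PGs per OSD")
    low, high = pg_num_range(osds)
    print(f"rule of thumb for {osds} OSDs: {low}-{high} PGs")
```

For the 40-OSD test cluster, 200 PGs is 5 per OSD, and the low end of the rule of thumb is 2000.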
[2:57] <perplexed> Thx.. sounds like I need 10x more then. I've been using this approach to capture what I assume to be the actual routing of writes to OSD's: rados import --workers 16 <directory with 16k 13KB files> <target pool name> --debug-objecter 10 --log-to-stderr 2>&1 | grep op_submit >>log.txt
[2:58] <perplexed> then filtering for "write" operations in the log to see what osd's are being targeted.
[2:58] <perplexed> not sure if this is the right way to go though.
[2:59] <gregaf1> yeah, that should work
[2:59] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit ()
[2:59] <gregaf1> you can also look into perfcounters and admin socket, which provides less data but is more machine readable, but I'm not sure how well-documented it is and unfortunately it's getting a little late for somebody to walk you through it
[3:00] <gregaf1> (really am off now, though I'll check in later this evening)
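A sketch of the filtering step described above: counting write operations per target OSD from the captured objecter output. The log line format is an assumption (lines that contain "op_submit", the word "write", and a target written as "osd.<id>" or "osd<id>"); the sample lines are invented for the example and real output will look different.

```python
import re
from collections import Counter

# Assumption: each logged op names its target as "osd.<id>" or "osd<id>".
OSD_RE = re.compile(r"\bosd\.?(\d+)\b")


def count_writes_per_osd(lines):
    """Count write ops per target OSD in captured objecter debug lines."""
    counts = Counter()
    for line in lines:
        if "op_submit" not in line or "write" not in line:
            continue
        match = OSD_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines in a hypothetical format, for illustration only.
    sample = [
        "objecter op_submit oid obj1 [write 0~13000] tid 1 osd.6",
        "objecter op_submit oid obj2 [write 0~13000] tid 2 osd.29",
        "objecter op_submit oid obj3 [read 0~13000] tid 3 osd.6",
    ]
    for osd, n in sorted(count_writes_per_osd(sample).items()):
        print(f"osd.{osd}: {n} writes")
```

To run it against a real capture, pass the open log file (for example `open("log.txt")`) instead of the sample list and adjust the pattern to whatever the lines actually contain.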
[3:03] * cattelan (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[3:13] * rturk (~textual@aon.hq.newdream.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[3:15] * cattelan (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Read error: Operation timed out)
[3:16] <perplexed> I'm not sure if the distribution of writes to primary OSD's is to be expected. It looks a little off to me. Actually, with 2000 PG's defined for a test pool I see worse skewing of write operations to specific OSD's: osd.6:4052 writes, osd.22:4030 writes, osd.23:1998 writes, osd.27:1963 writes, osd.29:1976 writes, osd.33:1978 writes.
[3:17] <perplexed> osd.0-9 Server1, 10-19 Server2, 20-29 Server3, 30-39 Server4
[3:18] <perplexed> Server2 isn't touched for initial writes. But perhaps I'm mis-interpreting the data.
[3:23] * perplexed_ (~ncampbell@ has joined #ceph
[3:27] * cattelan (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[3:30] * perplexed (~ncampbell@ Quit (Ping timeout: 480 seconds)
[3:31] * perplexed_ (~ncampbell@ Quit (Ping timeout: 480 seconds)
[3:45] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:06] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[4:24] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:51] * chutzpah (~chutz@ Quit (Quit: Leaving)
[6:45] * cattelan is now known as cattelan_away
[6:46] * f4m8_ is now known as f4m8
[6:55] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[7:02] * darkfaded (~floh@ has joined #ceph
[7:02] * darkfader (~floh@ Quit (synthon.oftc.net charm.oftc.net)
[8:00] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[8:56] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:59] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[9:23] * gregorg (~Greg@ has joined #ceph
[9:29] * loicd (~loic@ has joined #ceph
[10:13] * andreask (~andreas@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[10:13] * andreask (~andreas@chello062178057005.20.11.vie.surfer.at) has left #ceph
[10:32] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[10:38] * brambles (brambles@ Quit (Remote host closed the connection)
[10:46] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[10:49] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[10:49] <chaos_> while running ceph -s / ceph -w, I've noticed something like this http://wklej.org/hash/0c6d8638f29/. Is it common? Should I file a bug report?
[10:52] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[10:56] * brambles (brambles@ has joined #ceph
[10:59] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Read error: Operation timed out)
[11:14] * joao (~JL@89-181-151-120.net.novis.pt) has joined #ceph
[11:37] * yoshi (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:41] * lofejndif (~lsqavnbok@09GAAELT0.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:07] * loicd (~loic@ has joined #ceph
[12:25] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[12:41] * lofejndif (~lsqavnbok@09GAAELT0.tor-irc.dnsbl.oftc.net) Quit (Quit: Leaving)
[13:03] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[13:42] * rosco (~r.nap@ Quit (Ping timeout: 480 seconds)
[13:44] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[13:45] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[13:52] * andreask (~andreas@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[13:52] * andreask (~andreas@chello062178057005.20.11.vie.surfer.at) has left #ceph
[14:08] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[14:44] * oliver1 (~oliver@p4FFFEEA0.dip.t-dialin.net) has joined #ceph
[15:08] * wi3dzma (3eb503c8@ircip3.mibbit.com) has joined #ceph
[15:29] <wi3dzma> Hi, I've got a question about the rbd kernel driver. Is it possible to compile just the rbd driver for kernel 2.6.32 (I've got CentOS 6.2)?
[15:29] * aliguori (~anthony@ has joined #ceph
[15:30] <nhm> wi3dzma: 2.6.32 is pretty old. I think in the past we've told people to avoid it.
[15:33] <wi3dzma> nhm: So instead of compiling the module for a pretty old kernel I should compile the newest kernel, right?
[15:36] <nhm> wi3dzma: That's what I would do...
[15:36] <nhm> wi3dzma: We're testing on 3.3 now...
[15:46] * f4m8 is now known as f4m8_
[16:07] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[16:15] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[16:30] * rosco (~r.nap@ has joined #ceph
[16:30] * rosco is now known as Rocky
[16:35] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[16:45] * wi3dzma (3eb503c8@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[17:04] * aliguori (~anthony@ Quit (Remote host closed the connection)
[17:23] <gregaf1> chaos_: that is not a common stack trace/bug to run into, and you should file a bug for it
[17:30] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:40] * loicd (~loic@ Quit (Quit: Leaving.)
[17:49] * Tv|work (~Tv_@aon.hq.newdream.net) has joined #ceph
[17:59] * Tv|work (~Tv_@aon.hq.newdream.net) Quit (Quit: Tv|work)
[18:07] <chaos_> gregaf1, k
[18:07] * oliver1 (~oliver@p4FFFEEA0.dip.t-dialin.net) has left #ceph
[18:10] * sagewk1 (~sage@aon.hq.newdream.net) Quit (Quit: Leaving.)
[18:15] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[18:15] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[18:16] * brambles (brambles@ Quit (Remote host closed the connection)
[18:18] * MK_FG (~MK_FG@219.91-157-90.telenet.ru) Quit (Ping timeout: 480 seconds)
[18:20] * brambles (brambles@ has joined #ceph
[18:20] * MK_FG (~MK_FG@ has joined #ceph
[18:23] <chaos_> gregaf1, http://tracker.newdream.net/issues/2234
[18:23] <gregaf1> cool, thanks
[18:24] <chaos_> np;-)
[18:25] * brambles_ (brambles@ has joined #ceph
[18:33] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[18:40] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[18:43] <chaos_> is there documentation anywhere for crushmap rules? I'm especially interested in "type", "min_size" and "max_size": what they are for and how changes will affect the cluster
[18:51] * yehudasa_ (~yehudasa@aon.hq.newdream.net) Quit (Remote host closed the connection)
[18:53] * yehudasa (~yehudasa@aon.hq.newdream.net) has joined #ceph
[18:54] * sagewk (~sage@aon.hq.newdream.net) has joined #ceph
[18:55] <chaos_> hi sagewk
[18:55] <sagewk> chaos_: hi
[18:56] <joao> hey sage
[19:09] * Oliver1 (~oliver1@ip-37-24-160-195.unitymediagroup.de) has joined #ceph
[19:11] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[19:15] * chutzpah (~chutz@ has joined #ceph
[19:19] <imjustmatthew> gregaf: Did you get everything you need for #2218? I have to nuke that cluster this week.
[19:19] * rturk (~textual@aon.hq.newdream.net) has joined #ceph
[19:21] * adjohn (~adjohn@s24.GtokyoFL16.vectant.ne.jp) has joined #ceph
[19:28] * adjohn (~adjohn@s24.GtokyoFL16.vectant.ne.jp) Quit (Quit: adjohn)
[19:29] * LarsFronius (~LarsFroni@g231136036.adsl.alicedsl.de) has joined #ceph
[19:33] * perplexed (~ncampbell@outbound4.ebay.com) has joined #ceph
[19:36] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[20:02] * loicd (~loic@AMontsouris-757-1-26-219.w90-46.abo.wanadoo.fr) has joined #ceph
[20:19] * nyeates (~nyeates@mobile-198-228-192-051.mycingular.net) has joined #ceph
[20:22] * loicd (~loic@AMontsouris-757-1-26-219.w90-46.abo.wanadoo.fr) Quit (Quit: Leaving.)
[20:25] * perplexed (~ncampbell@outbound4.ebay.com) has left #ceph
[20:34] * yehudasa (~yehudasa@aon.hq.newdream.net) Quit (Quit: Ex-Chat)
[20:34] <dmick> woot, plana58 can be added to the teuthology lock database. (I assume that's joshd?)
[20:35] <joshd> yeah, I'll add it
[20:35] * yehudasa (~yehudasa@aon.hq.newdream.net) has joined #ceph
[20:36] * nyeates (~nyeates@mobile-198-228-192-051.mycingular.net) Quit (Quit: I'm outta here!)
[20:41] * ss7pro (~3e57f74f@webuser.thegrebs.com) has joined #ceph
[20:43] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[20:44] <dmick> hm. and just like that, someone reboots it out from under me :)
[20:45] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit ()
[20:46] * ss7pro (~3e57f74f@webuser.thegrebs.com) Quit (Quit: TheGrebs.com CGI:IRC (Ping timeout))
[20:48] * Oliver1 (~oliver1@ip-37-24-160-195.unitymediagroup.de) has left #ceph
[20:58] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:06] <nhm> hrm... is there any reason the tracker wouldn't be accepting email replies?
[21:09] <dmick> wasn't aware it did..
[21:22] <gregaf1> nhm: it never has accepted email replies; apparently we can configure it to do that but nobody has :(
[21:23] <gregaf1> imjustmatthew: I haven't had much time to look into it, but I have the log and I don't think anything else is likely to help. Thanks!
[21:23] <imjustmatthew> gregaf1: awesome, good luck
[21:24] * imjustmatthew (~imjustmat@pool-71-176-237-208.rcmdva.fios.verizon.net) Quit (Remote host closed the connection)
[21:31] * BManojlovic (~steki@ has joined #ceph
[21:39] <nhm> hrm, saw a 502 Bad Gateway error on one of the plana nodes.
[21:52] <dmick> accessing what, nhm, do you know?
[21:54] <nhm> dmick: sourceforge
[21:54] <nhm> dmick: seems to be working now though.
[22:14] <sagewk> sourceforge?
[22:15] <nhm> sagewk: for collectl
[22:15] <nhm> sagewk: I could put it somewhere locally, though TV mentioned he's going to put a caching proxy in place at some point.
[22:16] * lofejndif (~lsqavnbok@9YYAAE3I8.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:17] <sagewk> i wouldn't bother for now. hopefully they're better than github
[22:20] <sagewk> tv|work: /var/lib/ceph/$cluster-$name for osd_data and mon_data
[22:39] * elder (~elder@ has joined #ceph
[22:39] <elder> sagewk, around?
[22:40] <sagewk> elder: hey
[22:40] <elder> I can't get to the teuthology machine from here.
[22:40] <sagewk> you need to use the dh vpn
[22:40] <elder> Can you identify what test was running when http://tracker.newdream.net/issues/2242 reported the problem?
[22:41] <sagewk> dbench on rbd
[22:41] <sagewk> - ceph: null
[22:41] <sagewk> - rbd:
[22:41] <sagewk>     all: null
[22:41] <sagewk> - workunit:
[22:41] <sagewk>     all:
[22:41] <sagewk>     - suites/dbench.sh
[22:41] <elder> Great. I just want to fire off a test.
[22:41] <elder> I think the VPN problem has to do with private IP address ranges (10.)
[22:41] <sagewk> yeah
[22:42] <sagewk> the dh vpn config is very impolite
[22:42] <yehudasa> sagewk: any other info for objcache perfcounters, other than hit/miss?
[22:44] <joshd> yehudasa: maybe space used and num reads and writes?
[22:45] * lofejndif (~lsqavnbok@9YYAAE3I8.tor-irc.dnsbl.oftc.net) Quit (Quit: Leaving)
[22:46] <gregaf1> yehudasa: joshd: data invalidated while in cacher, amount read into cacher, amount written from cacher?
[22:47] <sagewk> user-facing reads/writes (as compared to backend reads and writes)
[22:48] <sagewk> everything in terms of ops and bytes, where appropriate
[22:48] <joshd> amount of dirty data overwritten
[22:49] <gregaf1> more precisely, amount written without ever having been flushed
[22:52] * Tv_ (~tv@aon.hq.newdream.net) has joined #ceph
[22:53] * Tv_ is now known as Tv|work
[22:53] * lofejndif (~lsqavnbok@28IAADREN.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:53] <yehudasa> joshd: gregaf1: isn't that just user facing writes minus backend writes?
[22:54] * lofejndif (~lsqavnbok@28IAADREN.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[22:54] <gregaf1> yeah, suppose so
[22:55] <gregaf1> although if we had separate counters for "data overwritten while flushing" and "data overwritten without flushing" it would be easier to tune windows and such
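A sketch of the counter set being discussed, assuming a simple object cache that tracks hits, misses, and bytes written on the user-facing and backend sides. The class and field names are hypothetical and are not the actual Ceph perfcounter names.

```python
from dataclasses import dataclass


@dataclass
class CacheCounters:
    """Hypothetical counter set for an object cache."""
    hits: int = 0
    misses: int = 0
    user_write_bytes: int = 0     # bytes written by the client into the cache
    backend_write_bytes: int = 0  # bytes the cache flushed to the backend

    def hit_ratio(self) -> float:
        """Fraction of lookups served from the cache (0.0 if no lookups)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def overwritten_bytes(self) -> int:
        """Dirty bytes overwritten before ever being flushed.

        Derived as user-facing writes minus backend writes, as suggested
        in the discussion; only meaningful once all dirty data is flushed.
        """
        return self.user_write_bytes - self.backend_write_bytes


if __name__ == "__main__":
    c = CacheCounters(hits=3, misses=1,
                      user_write_bytes=100, backend_write_bytes=60)
    print(f"hit ratio: {c.hit_ratio():.2f}")
    print(f"overwritten before flush: {c.overwritten_bytes()} bytes")
```

Deriving the overwritten amount keeps the counter set small, but it cannot separate data overwritten during a flush from data overwritten without one, which is the case made above for separate counters.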
[22:58] * sjust1 (~sam@aon.hq.newdream.net) Quit (Quit: Leaving.)
[22:58] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[23:03] * rturk (~textual@aon.hq.newdream.net) Quit (Quit: Computer has gone to sleep.)
[23:04] * lofejndif (~lsqavnbok@1RDAAAI18.tor-irc.dnsbl.oftc.net) has joined #ceph
[23:06] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Read error: Operation timed out)
[23:06] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[23:06] * perplexed (~ncampbell@ has joined #ceph
[23:20] * BManojlovic (~steki@ Quit (Ping timeout: 480 seconds)
[23:21] * BManojlovic (~steki@ has joined #ceph
[23:24] * bchrisman (~Adium@ has joined #ceph
[23:28] * rturk (~rturk@aon.hq.newdream.net) has joined #ceph
[23:30] <yehudasa> perplexed: about your imbalanced rados import: trying to reproduce, did you first upload everything through another tool and then export it?
[23:32] <perplexed> Hi, yes.. I uploaded initially using rados, but file by file into the pool. I then exported the pool contents to a directory and used that export directory to do imports from that point on. I initially tried importing the directory of files, but rados complained that the directory didn't look like one that had been created with an export....
[23:44] <yehudasa> perplexed: are the object sizes even or could it be that you have certain objects that are much larger than the others?
[23:48] <perplexed> I'll take a closer look... should be able to get to this in the next couple of hours (meeting to head to unfortunately). Will keep you posted.
[23:48] <yehudasa> thanks
[23:52] * perplexed_ (~ncampbell@ has joined #ceph
[23:56] * perplexed (~ncampbell@ Quit (Ping timeout: 480 seconds)
[23:56] * perplexed_ is now known as perplexed

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.