#ceph IRC Log


IRC Log for 2013-05-01

Timestamps are in GMT/BST.

[0:00] * aliguori (~anthony@32.97.110.51) Quit (Quit: Ex-Chat)
[0:11] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[0:17] * vata (~vata@2607:fad8:4:6:221:5aff:fe2a:d1dd) Quit (Quit: Leaving.)
[0:24] * jskinner (~jskinner@69.170.148.179) Quit (Remote host closed the connection)
[0:28] * median1 (~0x00@85-220-109-89.dsl.dynamic.simnet.is) has joined #ceph
[0:31] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) has joined #ceph
[0:32] * median (~0x00@85-220-109-89.dsl.dynamic.simnet.is) Quit (Ping timeout: 480 seconds)
[0:39] * sstan (~chatzilla@dmzgw2.cbnco.com) Quit (Quit: ChatZilla 0.9.90 [Firefox 19.0/2013021500])
[0:39] * median1 (~0x00@85-220-109-89.dsl.dynamic.simnet.is) Quit (Quit: Leaving.)
[0:47] * portante|afk (~user@66.187.233.206) Quit (Quit: upgrades)
[0:48] <cjh_> do i have to remove the osd from the tree if i want to reimage a machine?
[0:50] <gregaf> it's aesthetically pleasing to keep your map clean, and will help make sure that you've pulled all the data off of it
[0:51] <cjh_> it looks like when i do a ceph mkfs and start the osd back up the cluster figured out what it needed to backfill
[0:51] <cjh_> that's amazing :)
[0:53] <gregaf> not much would work if it didn't do that automatically ;)
[0:53] * atb (~chatzilla@d24-141-198-231.home.cgocable.net) Quit (Remote host closed the connection)
[0:53] * tnt (~tnt@109.130.111.118) Quit (Ping timeout: 480 seconds)
[0:55] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[0:56] <cjh_> gregaf: gluster can't do that
[0:56] <cjh_> you nuke it and it doesn't heal
[0:56] <gregaf> ah, right
[0:57] <dmick> new topic: Ceph isn't Gluster ;)
[0:57] <cjh_> lol
[0:57] <gregaf> *censors self*
[0:57] <cjh_> yeah i don't want to start a flame war
[0:57] <cjh_> just saying
[0:57] <cjh_> i'm noticing some odd behavior on my cluster of 20
[0:58] <cjh_> if i fire up rados bench on 1 node i get 800MB/s. if i fire 1 instance on all 20 nodes i get 40MB/s
[0:58] <cjh_> shouldn't it scale a little bit better ?
[0:58] * portante (~user@66.187.233.206) Quit (Ping timeout: 480 seconds)
[0:58] <cjh_> i'm in the process of switching from xfs to btrfs to test it but it looks like from the perf top that it's spending a lot of time doing crc processing
[0:58] <gregaf> so you're getting 800MB/s out of your cluster
[0:59] <cjh_> basically yeah
[0:59] <gregaf> how many disks and how many OSDs per node?
[0:59] <cjh_> 1 raid6 per osd composed of 12 sata drives
[0:59] <cjh_> 1 osd per node
[0:59] <cjh_> i can try jbod if that would help
[0:59] <gregaf> 2x replication?
[0:59] <gregaf> (at the pool level)
[1:00] <cjh_> yup
[1:00] <cjh_> 2x
[1:00] <Vjarjadian> there was a blog showing that 1 OSD per drive and not using large raid gave better performance on the ceph website
[1:00] <cjh_> ok so the osd threads are bound in a way then
[1:00] <cjh_> and we need to fire up more of them per host
[1:00] <gregaf> so 40MB/s per RAID6
[1:00] <gregaf> that's a lot slower than an OSD can go
[1:00] <gregaf> what CPUs?
[1:00] <cjh_> well i benched it at 500MB/s write speed
[1:01] <cjh_> um 8 cores of intel xeon 5630's
[1:02] <gregaf> you said something about crc processing so I was hoping you were on an Atom or something that would bottleneck it ;)
[1:02] <gregaf> sjust, do we document smalliobenchfs anywhere?
[1:03] <gregaf> it's possible that the raid6 is just killing us the way we process updates, I guess...
[1:03] <gregaf> but that's much worse than I'd have expected
[1:03] <gregaf> try varying the block sizes in rados bench and see if larger ones give you more throughput, that would be an indicator
[1:04] <cjh_> yeah i tried using the default 4M and then 10M and it didn't seem to improve at all
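A minimal sketch of the rados bench runs being discussed, assuming a pre-created test pool named "testpool" (the pool name, run length, and thread count are illustrative, not taken from the conversation):

    # default 4 MB objects, 16 concurrent ops, 60 second run
    rados bench -p testpool 60 write -b 4194304 -t 16
    # same run with ~10 MB objects to check whether larger writes help
    rados bench -p testpool 60 write -b 10485760 -t 16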
[1:05] <cjh_> with each node having a 10Gb nic i would expect something like 8GB/s cluster throughput
[1:05] <cjh_> or just completely max out the rack switch
[1:05] <gregaf> yeah
[1:05] <gregaf> what version are you running right now?
[1:06] <cjh_> 56.4
[1:08] * gmason_ (~gmason@hpcc-fw.net.msu.edu) Quit (Quit: Computer has gone to sleep.)
[1:08] <gregaf> what benchmark gave you 500MB/s out of them?
[1:09] <nhm> cjh_: I suspect you'll see better performance with smaller RAID6 arrays or switching to 1 OSD per drive.
[1:09] <gregaf> and yeah, if you're storing both journals and backing stores on there don't forget you're doubling writes (again) to the same RAID6, and it might not handle dual streams as well as it does a single stream
[1:09] <nhm> cjh_: btw, do you have writeback cache on your controller?
[1:10] * LeaChim (~LeaChim@90.205.52.127) Quit (Ping timeout: 480 seconds)
[1:10] <gregaf> *sigh of relief* nhm can do this better than I can
[1:10] <gregaf> *goes back to log diving*
[1:12] <nhm> gregaf: don't worry, I'll be bugging you plenty about RGW performance at some point. ;)
[1:13] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[1:16] <cjh_> nhm: i think i turned it off. i need to check again
[1:17] <nhm> cjh_: turning it on will probably help
[1:17] <cjh_> gregaf: i'm seeing a sine wave type pattern as it flushes and then fills up the journal
[1:18] <cjh_> nhm: what were you getting out of your supermicro cluster again?
[1:18] <nhm> cjh_: 1 node, around 2GB/s depending on how it's configured.
[1:18] <cjh_> aggregate sequential throughput and aggregate random
[1:19] <cjh_> nhm: how about across the cluster?
[1:19] <nhm> cjh_: that is the cluster. :D
[1:19] <cjh_> oh :)
[1:19] <cjh_> how many nodes is it?
[1:19] <nhm> cjh_: 1
[1:19] <cjh_> what? haha
[1:19] <nhm> cjh_: it's our performance testing platform. :)
[1:19] <cjh_> gotcha
[1:20] * LeaChim (~LeaChim@176.250.176.2) has joined #ceph
[1:20] <nhm> cjh_: we've got a big cluster of old supermicro nodes and Dell R515s. This is our test box to see how far we can push OSD performance in different configurations.
[1:20] <cjh_> so on your big cluster how does it perform?
[1:21] <nhm> cjh_: far worse, but the machines don't seem to be able to perform consistently and that really hurts Ceph.
[1:21] <cjh_> i see
[1:21] <nhm> cjh_: I've fought with them quite a bit. I suspect the expanders in the backplane.
[1:23] <gregaf> they still get more than 40MB/s across 12 drives, though
[1:26] <nhm> gregaf: I saw some very strange behavior on those nodes. Large direct IO writes would go very fast as long as there was a single stream hitting the RAID array. As soon as a 2nd stream was added, performance plummeted from 800MB/s to ~70MB/s.
[1:26] <nhm> gregaf: that was to a big RAID0 array.
[1:27] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) Quit (Read error: Connection reset by peer)
[1:27] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[1:30] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[1:31] <cjh_> nhm: i'm seeing something similar
[1:31] <cjh_> with the big raid6
[1:32] * tkensiski (~tkensiski@209.66.64.134) has joined #ceph
[1:33] * tkensiski (~tkensiski@209.66.64.134) has left #ceph
[1:36] <cjh_> i'll change over to jbod and see if that helps the situation
[1:38] <nhm> cjh_: what kind of nodes?
[1:39] <cjh_> they're dell something or other with perc 700 controllers
[1:39] <nhm> yeah, that's what our R515s do. :(
[1:39] <nhm> Ok, gotta put the kids to bed. Will bbl
[1:40] <cjh_> see ya
[1:50] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[1:50] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[1:52] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[1:53] * LeaChim (~LeaChim@176.250.176.2) Quit (Read error: Connection reset by peer)
[1:54] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[1:56] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[1:56] <cjh_> could they make megacli any harder to use?
[1:59] * alram (~alram@cpe-75-83-127-87.socal.res.rr.com) Quit (Quit: leaving)
[1:59] * MK_FG (~MK_FG@00018720.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:04] <nigwil> further to my question yesterday with our Openstack+Ceph deployment, does anyone have an opinion or preference (all things being equal) between co-resident Openstack+Ceph (sharing nodes) versus dedicated nodes for Openstack and Ceph? The former gets us more compute overall but with the complexity of the interactions. For first time deployers and to make life simpler/minimize risk should we choose dedicated nodes?
[2:05] <gregaf> I would probably do that, you'll see less cascade issues under recovery and such
[2:05] <gregaf> with dedicated nodes
[2:05] <nigwil> We'
[2:05] <nigwil> oops
[2:07] <nigwil> we've spent the last several weeks only thinking in terms of dedicated nodes, so the option of co-resident (although we know that works in theory) was a new consideration. Sage indicated that people are doing it but as this is our first time we're hedging.
[2:08] <nigwil> I'm still leaning towards dedicated but open to arguments that co-residency is fine.
[2:08] <gregaf> well, it should be fine and if it's a big cost savings it's probably worth it
[2:08] <nigwil> I'm thinking that tuning separate clusters will be a lot easier
[2:08] <gregaf> but since you were asking, the thing that I imagine would complicate it is resource competition
[2:09] <nigwil> We get 25% more cores with the co-resident option but less network bandwidth and slightly less overall storage
[2:09] <gregaf> ah, dunno then
[2:09] <mikedawson> nigwil: When deploying openstack and ceph, there are many architectural decisions. Your use case should drive the decision.
[2:10] <nigwil> cloud use-case?
[2:11] <nhm> nigwil: there's some controversy on this. I think Josh and I tend to fall on the side of not liking sharing resources. If you use cgroups I might be willing to endorse it. :)
[2:11] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[2:12] * loicd (~loic@magenta.dachary.org) has joined #ceph
[2:12] <nigwil> we expected cgroups would be essential to quarantine resources for Ceph, especially against rogue VMs
[2:12] <nhm> nigwil: some folks are doing it without them. :)
[2:13] <nigwil> they're brave :-)
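A rough sketch of the cgroup quarantine being discussed, using the libcgroup tools; the group name, core list, and memory cap are made-up values for illustration, not a recommendation:

    # create a cgroup for the OSDs and pin it to a subset of cores and RAM
    cgcreate -g cpuset,memory:ceph-osd
    cgset -r cpuset.cpus=0-3 ceph-osd            # illustrative core list
    cgset -r cpuset.mems=0 ceph-osd
    cgset -r memory.limit_in_bytes=8G ceph-osd   # illustrative cap
    # launch an OSD inside the group
    cgexec -g cpuset,memory:ceph-osd ceph-osd -i 0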
[2:37] * ggreg (~ggreg@92.243.7.223) has joined #ceph
[2:38] * ggreg_ (~ggreg@int.0x80.net) Quit (Read error: Connection reset by peer)
[2:44] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[2:44] * loicd (~loic@magenta.dachary.org) has joined #ceph
[2:47] * kfox1111 (bob@205.205.214.5) Quit (Quit: Lost terminal)
[3:00] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[3:07] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[3:09] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[3:09] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[3:09] * loicd (~loic@magenta.dachary.org) has joined #ceph
[3:16] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[3:45] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[3:48] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) Quit (Quit: Leaving.)
[3:57] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[4:08] * DarkAceZ (~BillyMays@50.107.54.92) Quit (Ping timeout: 480 seconds)
[4:10] * DarkAceZ (~BillyMays@50.107.54.92) has joined #ceph
[4:11] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[4:15] * danieagle (~Daniel@186.214.77.43) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[4:24] * portante|ltp (~user@c-24-63-226-65.hsd1.ma.comcast.net) has joined #ceph
[4:29] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[4:36] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:41] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[4:41] * loicd (~loic@magenta.dachary.org) has joined #ceph
[4:47] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[4:50] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: my troubles seem so far away, now yours are too...)
[4:51] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[4:51] * ChanServ sets mode +o scuttlemonkey
[4:58] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[4:59] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[5:02] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[5:28] * wca (~will@198.105.209.36) has joined #ceph
[5:31] * tserong (~tserong@124-171-116-238.dyn.iinet.net.au) Quit (Read error: Connection reset by peer)
[5:32] <wca> hi, I'm reading about the ceph architecture, and was wondering: how does ceph distribute large (say, 2TB+) objects onto nodes? it doesn't appear to be any different than for small objects?
[5:36] * tserong (~tserong@124-171-115-108.dyn.iinet.net.au) has joined #ceph
[5:36] <sjustlaptop> wca: nope
[5:37] <sjustlaptop> cephfs, radosgw, rbd all use some striping/chunking strategy for mapping stuff onto objects
[5:40] <wca> sjustlaptop: so a RBD or file object might map to multiple ceph objects?
[5:41] <wca> to perform the chunking?
[5:52] <sjustlaptop> wca: exactly
[5:53] <wca> sjustlaptop: my expectation would be that the chunks would be something like a blocksize, so 4k, 32k, 128k?
[5:53] <sjustlaptop> 4MB is the default for both rbd and cephfs
[5:54] <sjustlaptop> no real advantage to going smaller
[5:54] <wca> ah, ok
[5:55] <wca> does it use CRUSH to calculate the chunk locations, like it does for objects?
[5:55] <sjustlaptop> the chunks are just objects with predictable names
[5:55] <sjustlaptop> simplifies matters
[5:55] <athrift> nhm: thanks for your advice around R720XD SAS issues, we suspected an issue but were not sure what was causing it. We just ordered a bunch of dual socket SuperMicro machines with "Direct Attach" backplanes to test
[5:56] <wca> sjustlaptop: ok, so a ceph client would say "I want to access rbd:/foo/bar at offset 128M", calculates the object using a string like "rbd:/foo/bar%128M" or something like that?
[5:57] <sjustlaptop> pretty much
[5:57] <wca> cool.
[5:57] <sjustlaptop> though the rbd images have ids so that they can be cheaply renamed
[5:57] <sjustlaptop> so you first look up the id and then it's something like <id>_<blockno>
[5:58] <wca> makes sense.
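One way to see the chunking sjustlaptop describes is to compare an image's header with the objects behind it; the pool and image names below are placeholders:

    # block_name_prefix and order (22 => 4 MB objects) come from the image header
    rbd -p rbd info myimage
    # the backing RADOS objects are named <block_name_prefix>.<object number>
    rados -p rbd ls | grep <block_name_prefix>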
[5:59] <wca> I would guess the S3 frontend probably does something similar
[6:00] <sjustlaptop> I don't know as much about that, but iirc, the chunks are based on how the user uploaded the file
[6:01] <sjustlaptop> and each s3 object has a manifest rados object explaining which offsets map to which rados object
[6:01] * dmick wonders how to get a make variable into the environment of a make command
[6:03] <wca> sjustlaptop: well, S3 objects are basically a <bucket-id,object-id> tuple, and from there essentially look the same as a block device object, so you could imagine generating a predictable name given <bucket,object,offset> tuple.
[6:04] <sjustlaptop> that would be one approach, but radosgw I think instead arranges for each part of a multi-part upload to wind up as a rados object
[6:04] <sjustlaptop> to reduce the number of objects put
[6:04] <sjustlaptop> since there is already a limit on the size of a single put
[6:04] <sjustlaptop> I could be wrong about that though
[6:05] <sjustlaptop> it's also how radosgw gets atomicity
[6:08] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[6:08] * rustam (~rustam@94.15.91.30) has joined #ceph
[6:08] <wca> sjustlaptop: thanks for the info.. I need to read more, but it sounds like a good design. :)
[6:09] <sjustlaptop> wca: happy to help
[6:13] <wca> sjustlaptop: I am curious, has there been any research into adaptive load distribution, eg replicating certain objects to more nodes in response to load?
[6:13] <sjustlaptop> it's tricky
[6:13] <wca> yeah, I mean, you have to offset that with the desire to use an algorithm for distribution vs. lookup tables
[6:14] <sjustlaptop> it also makes it difficult to ensure consistency
[6:15] <sjustlaptop> there isn't really anywhere in the architecture to put a lookup table, so that's probably a non-starter
[6:15] <sjustlaptop> it is possible to configure things to allow a client to talk to an object replica instead of the primary for a read
[6:15] <wca> right, 'cause you have to distribute updates to all replicas
[6:16] <wca> preferably in some sort of atomic fashion
[6:16] <sjustlaptop> but it's only really possible for a read only object
[6:16] <sjustlaptop> it's mostly there so hadoop workloads can read from the local copy if it's there
[6:16] <sjustlaptop> wca: more that you'd have to notify replicas to stop allowing reads prior to permitting a write or something more complicated
[6:17] <wca> something less scalable, too.
[6:18] <sjustlaptop> not necessarily, but definitely more complicated
[6:19] <sjustlaptop> and in most cases, you'd prefer to minimize the number of OSDs caching an object
[6:43] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:55] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[7:00] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[7:00] * loicd (~loic@magenta.dachary.org) has joined #ceph
[7:20] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[7:20] * loicd (~loic@magenta.dachary.org) has joined #ceph
[7:38] * jerker (jerker@Psilocybe.Update.UU.SE) Quit (Ping timeout: 480 seconds)
[7:41] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[7:41] * loicd (~loic@magenta.dachary.org) has joined #ceph
[7:46] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[7:55] * rustam (~rustam@94.15.91.30) has joined #ceph
[7:55] * eschnou (~eschnou@252.94-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[7:57] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[8:04] * jerker (jerker@Psilocybe.Update.UU.SE) has joined #ceph
[8:04] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:04] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[8:09] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[8:12] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[8:13] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: Depression is merely anger without enthusiasm)
[8:17] * eternaleye (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[8:36] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[8:36] * tnt (~tnt@109.130.111.118) has joined #ceph
[8:44] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[8:52] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[8:58] * eschnou (~eschnou@252.94-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:36] * eschnou (~eschnou@252.94-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[9:49] * LeaChim (~LeaChim@176.250.176.2) has joined #ceph
[10:05] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[10:16] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[10:23] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[10:26] * madkiss (~madkiss@089144192030.atnat0001.highway.a1.net) has joined #ceph
[10:37] * eternaleye (~eternaley@c-50-132-41-203.hsd1.wa.comcast.net) has joined #ceph
[10:44] * lerrie2 (~Larry@remote.compukos.nl) has joined #ceph
[10:54] * vo1d (~v0@62-46-172-22.adsl.highway.telekom.at) has joined #ceph
[11:01] * v0id (~v0@91-115-228-70.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[11:35] * lofejndif (~lsqavnbok@83TAAA2SX.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:46] * rustam (~rustam@94.15.91.30) has joined #ceph
[11:47] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[11:57] <LeaChim> Hi, is there a way to see which objects the mds is intending to delete? And why it hasn't yet? My cluster is reporting 55GB used in the data pool, but the mounted filesystem is showing only 12GB used.
[12:38] * syed_ (~chatzilla@180.151.28.182) has joined #ceph
[12:40] <syed_> hello
[12:40] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[12:47] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Read error: No route to host)
[12:48] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[12:53] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:05] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[13:11] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:19] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[13:23] <nhm> athrift: cool! What I've found on the one we have is that if you use JBOD controllers you really need SSD backed journals. If you are using controllers with WB cache (say a 9265 with single disk RAID0 arrays), you can get reasonable performance with just spinning disks.
[13:25] <athrift> nhm: thanks, we got the LSI 9201-16i and two S3700 200GB SSD's per node
[13:26] <nhm> athrift: good deal. I've never tried the 9201, but afaik it's basically like 2 2008s on 1 board.
[13:27] <athrift> nhm: Yep, we figured it would be fine being original LSI
[13:27] <athrift> unlike the h310...
[13:28] * madkiss (~madkiss@089144192030.atnat0001.highway.a1.net) Quit (Quit: Leaving.)
[13:29] <athrift> I do hope Dell fix the issues at some point as they have excellent warranties, and for us are quite a bit cheaper than the supermicros
[13:29] <nhm> athrift: the 200GB S3700s are slightly slower than the bigger capacities, but I think it's still good for close to 400MB/s so I expect you should be able to push around 700-800MB/s for sequential 4MB object writes with that setup.
[13:30] <nhm> athrift: not counting replication, and overhead at other layers (rbd, rgw, etc)
[13:30] <athrift> nhm: we were planning to use bcache so that they are bypassed for sequential writes which should bring that up a little
[13:30] <athrift> in theory!
[13:31] <nhm> athrift: you'll be using them for ceph journals still though I expect?
[13:31] <athrift> nhm: we are going to test with ceph journal on a LV on the SSD's as well as with it just on the OSD on top of bcache and see which performs better
[13:32] <nhm> athrift: neat
[13:33] <nhm> athrift: with 12 spinning disks you have a decent amount of backend throughput. If bcache can eliminate the seek penalty for switching between journal writes and osd writes on the same disk it's not a bad plan.
[13:34] <athrift> nhm: with something like bcache is there any benefit of ceph having a journal ?
[13:34] <nhm> athrift: I heavily expect it's the little things like xattrs and dentry lookups for osd writes combined with direct IO journal writes that hurts on combined journal setups with no WB cache.
[13:36] <athrift> nhm: ok that makes sense.
[13:36] <nhm> athrift: Ceph does direct IO writes to the journal to "guarantee" that the data is written in place before an ACK gets sent out, then lazily copies the data via buffered writes out to the OSD data store.
[13:37] <athrift> similar to how bcache works. Do you think maybe ceph will integrate smarts to push sequential writes directly to the OSD rather than journalling them ?
[13:37] <nhm> athrift: if you eliminate the journal, you have to do direct IO writes to the filesystem which typically is much worse. In the journal data can just be appended so there are no real seek penalties for small writes. That's not true on the FS.
[13:38] <nhm> athrift: We've discussed that. It's semi-easily doable for btrfs using clone if the journal resides on the same FS as the OSD.
[13:38] <athrift> Awesome, for some reason I had not joined those dots, that is logical
[13:39] <nhm> athrift: for XFS and EXT4 it's not as straightforward.
[13:41] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:51] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[13:52] <nhm> So anyone here know anything about ZFS performance tuning?
[13:59] <matt_> nhm, I've messed with ZFS a while ago
[14:00] <matt_> What are you trying to do?
[14:01] <nhm> matt_: playing with Ceph on top of ZOL. SA xattrs are slightly broken so I can't use those. Otherwise I'm using 1 pool per OSD and sicking the ZIL on an SSD partition (which apparently isn't ideal, but for now I'm stuck with it).
[14:02] <nhm> matt_: basically I'm a total noob though so I don't know if there are any other things I should be looking at. anything that can speed up xattrs is likely a win.
[14:03] <matt_> nhm, there isn't a whole lot you can do with ZFS. Making sure your ashift is correct if you use 4kb drives is a start. You can also modify the pool to focus on throughput or latency
[14:03] <matt_> If you really wanted to see what it can do then disabling the ZIL would be the way to go
[14:04] <nhm> interesting, ok. SA xattrs are pretty major. Need to figure out why they aren't working.
[14:05] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[14:07] <matt_> nhm, I haven't played with any of the new builds so I'm probably not much help there
[14:09] <nhm> matt_: We've been working with Brian Behlendorf on a couple of xattr issues so we probably just didn't fully fix it.
[14:12] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[14:12] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Ping timeout: 480 seconds)
[14:22] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[14:35] * diegows (~diegows@190.190.2.126) has joined #ceph
[14:38] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[14:42] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:43] * Havre (~Havre@2a01:e35:8a2c:b230:80ab:a678:e1d9:96ac) Quit (Ping timeout: 480 seconds)
[14:46] <jtangwk> in the docs for the radosgw on centos/sl it says to do a "/etc/init.d/radosgw start"
[14:46] <jtangwk> to start off the daemon
[14:46] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:46] <jtangwk> it seems the 0.56.x rpms in epel dont have this init script
[14:47] <jtangwk> also, the docs at http://ceph.com/docs/master/radosgw/config-ref/ — are they for 0.56 ?
[14:48] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:50] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Ping timeout: 480 seconds)
[14:55] * eschnou (~eschnou@252.94-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[14:56] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[14:56] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Quit: Ex-Chat)
[14:59] * Wolff_John (~jwolff@64.132.197.18) has joined #ceph
[15:01] <jtangwk> btw, my issues stem from install ceph from EPEL
[15:04] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[15:04] <syed_> jtangwk: are you using this one http://pkgs.org/centos-6-rhel-6/epel-x86_64/ceph-0.56.3-1.el6.x86_64.rpm.html
[15:06] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[15:07] * m0zes (~oftc-webi@dhcp251-10.cis.ksu.edu) Quit (Quit: Page closed)
[15:08] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Ping timeout: 480 seconds)
[15:10] <jtangwk> syed_: im trying out the rpms from http://www.ceph.com/rpm-bobtail/el6/x86_64/ after the failed attempt at the ones from EPEL
[15:10] <jtangwk> seems there is a dependency on start-stop-daemon in the init scripts
[15:10] <jtangwk> which doesnt exist in centos/sl land
[15:10] <jtangwk> *sigh*
[15:11] <jtangwk> i could just start it manually
[15:11] <jtangwk> or fix the init scripts
[15:11] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[15:11] <jtangwk> im kinda in no mood to do things, i might just install ubuntu precise to test the radosgw rather than fix the packaging
[15:11] <jtangwk> so far the radosgw experience on centos/sl isn't great
[15:12] <syed_> jtangwk: it seems to be a rather feasible option
[15:17] <jtangwk> seems that /lib/lsb/init-functions doesn't have the start-stop-daemon call
[15:17] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[15:18] <jtangwk> nor does /etc/init.d/functions
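Absent a working init script, the gateway can be started by hand with roughly the invocation the packaged script would use; the client name below is the conventional one from the docs and may differ per setup:

    # start the daemon directly instead of via the missing init script
    radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway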
[15:18] <darkfaded> it's upstart crap iirc
[15:19] <darkfaded> nothing to relate with LSB or linux or unix ;)
[15:19] <jtangwk> heh yea
[15:19] <jtangwk> its going to be even more fun when el7 decides to use systemd
[15:19] <jtangwk> :)
[15:19] <jtangwk> since this is just a VM that i am testing, i think i might just re-install my test system with ubuntu
[15:20] <jtangwk> there goes a morning of attempting to setup ceph/radosgw
[15:20] <darkfaded> ouch :/
[15:20] <jtangwk> its ansible'd up and automated
[15:20] <jtangwk> so im not losing much time
[15:22] <darkfaded> oki ;)
[15:24] <darkfaded> last upstart script i looked at had me constantly thinking "we'd have fired the guy", i guess technology-wise it may be better or worse, doesn't matter. what kills it is that apparently no distro considers the init system the most critical piece (some may think dbus hehe)
[15:25] <darkfaded> jtangwk: how do you ansible'ize the osd / mds dir names?
[15:25] <darkfaded> template and some listing or what's your way for that?
[15:25] <darkfaded> last time i just had a long list of commands run via ansible :)
[15:29] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[15:33] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[15:40] <LeaChim> Is it possible to have radosgw read from replica OSDs? i.e. the nearest one, if there's one in the same rack etc?
[15:45] * madkiss (~madkiss@178.188.60.118) has joined #ceph
[15:47] * madkiss1 (~madkiss@089144192030.atnat0001.highway.a1.net) has joined #ceph
[15:50] * rustam (~rustam@94.15.91.30) has joined #ceph
[15:51] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[15:54] * madkiss (~madkiss@178.188.60.118) Quit (Ping timeout: 480 seconds)
[15:58] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[16:03] <wido> LeaChim: No, RADOS will always read from the Primary OSD
[16:03] <wido> In fact, it never reads from a replica, always from the primary OSD of that Placement Group
[16:10] <jtangwk> darkfaded: not yet
[16:10] * syed_ (~chatzilla@180.151.28.182) Quit (Quit: ChatZilla 0.9.90 [Firefox 19.0.2/20130307122351])
[16:10] <jtangwk> i was planning on creating roles with take parameters for building up configs
[16:11] <Vjarjadian> can you change the primary OSD to another one? or set where the primary will be?
[16:11] <jtangwk> i have a small ceph_facts module for pulling out cluster information which i plan on using to build up configs
[16:11] <Vjarjadian> might accomplish the same thing for him
[16:11] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[16:12] <jtangwk> our storage person needs to be replaced (im just filling in that spot for now to progress our project)
[16:14] <darkfaded> jtangwk: but that sounds like a very good approach ( i had "actually you might just build it from ceph.conf" on my mind)
[16:14] <darkfaded> nice you went that way with your module
[16:18] <jtangwk> darkfaded: im looking forward to the admin rest-api to ceph
[16:18] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[16:18] <jtangwk> that would be far more desirable than the current way of doing things
[16:18] <jtangwk> well im templating up ceph.conf right now, but will be making it more dynamic as i get time
[16:18] * portante|ltp (~user@c-24-63-226-65.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[16:18] <jtangwk> right now i just want radosgw :)
[16:18] <darkfaded> jtangwk: that was the one thing i filled in the ansible survey "if i could wish for something": standard api to all services for node add/remove similar tasks
[16:19] <darkfaded> (that would include ansible, so it was me being a little mean)
[16:19] <darkfaded> but yeah the admin sockets already made me SOOOOO happy
[16:19] <darkfaded> i know. lol. so why don't you have it working yet hehe
[16:20] <jtangwk> halting my sl6 vm
[16:20] <jtangwk> going to start up an ubuntu lts vm in a few minutes
[16:20] <jtangwk> im interested in a test environment for development rather than production
[16:28] <ron-slc> Hey all. Are custom / user-defined xattrs properly supported and MAINTAINED in CephFS? I can't find assurance of user-defined xattrs this in the Documentation.
[16:29] <ron-slc> err: I can't find assurance of user-defined xattrs being allowed in the Documentation.
[16:32] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[16:36] * ron-slc2 (~ron-slc@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[16:36] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[16:38] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[16:39] * ron-slc2 (~ron-slc@173-165-129-125-utah.hfc.comcastbusiness.net) Quit ()
[16:54] <TiCPU> on 2 different kernels, 3.8 from Debian Wheezy and 3.5 from Ubuntu 12.04 LTS, rbd map <someimage> == Kernel panic.
[17:02] <elder> Is the image non-existent?
[17:03] <dspano> I've got 5 600GB 15K OSDs with the journals writing to the same disks on two hosts. I'm noticing horrible random read and write performance. If I take a disk per host to dedicate to the journal, will that boost performance considerably?
[17:04] <dspano> 5 600GB OSDs per host. I worded it wrong. Not 5 disks between two hosts.
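For reference, moving a journal off the data disk is a per-OSD setting, and the OSD's journal has to be flushed and recreated for the change to take effect; the partition path below is a placeholder:

    # stop the OSD and flush what's left in its current journal
    service ceph stop osd.0
    ceph-osd -i 0 --flush-journal
    # point the journal at a partition on the dedicated disk
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd.0]
        osd journal = /dev/sdf1
    EOF
    ceph-osd -i 0 --mkjournal
    service ceph start osd.0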
[17:06] <jtangwk> hmm
[17:06] <jtangwk> seems ceph-deploy does a lot already
[17:06] <jtangwk> my life just got easier
[17:10] <jtangwk> saves me from messing with the ceph_facts for now
[17:11] * itamar (~itamar@82.166.185.149) has joined #ceph
[17:14] * vata (~vata@2607:fad8:4:6:221:5aff:fe2a:d1dd) has joined #ceph
[17:16] * itamar (~itamar@82.166.185.149) Quit (Quit: Leaving)
[17:17] <darkfaded> jtangwk: ah right, if you're on ubuntu now then -deploy must work
[17:18] <jtangwk> im taking the easy route
[17:19] <jtangwk> to get me partially or most of the way there
[17:20] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[17:20] <jtangwk> it'd be nice if there was a ceph-deploy radosgw as well :)
[17:24] * rustam (~rustam@94.15.91.30) has joined #ceph
[17:25] * diegows (~diegows@190.190.2.126) has joined #ceph
[17:26] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[17:31] <nhm> dspano: what kind of random read and write numbers are you seeing?
[17:32] <nhm> more questions: what controller? and also, how are you doing the testing?
[17:33] * Yudong (Yudong@hp1.cs.fiu.edu) has joined #ceph
[17:37] <dspano> nhm: Dell H700. When running something like this dd if=/dev/zero of=here bs=4K count=10000 oflag=direct I get 961 kB/s.
[17:38] <nhm> dspano: is this krbd, qemu/kvm, cephfs?
[17:38] <dspano> When I run this root@test:/data# dd if=/dev/zero of=here bs=4M count=100 oflag=direct. I get 41MB/s
[17:38] <dspano> qemu/kvm with ubuntu 12.04 as the guest. RHEL 5 is way worse, but I think that's kernel related.
[17:39] <dspano> On RBDs.
[17:39] <nhm> dspano: rbd cache enabled?
[17:39] <nhm> also, virtio disks?
[17:39] <jmlowe> specifically virtio-block or virtio-scsi
[17:39] <nhm> yes, good call jmlowe. :)
[17:40] <nhm> I've got my head dug into rados so much I forget all of the qemu/kvm details.
[17:40] <dspano> nhm: virtio
[17:40] <jmlowe> if virtio-block, grab the quantal or raring kernel, and use virtio-scsi
[17:40] <nhm> dspano: ok, rbd cache enabled?
[17:40] <dspano> nhm: no
[17:41] <nhm> dspano: that will help with sequential write throughput dramatically
[17:41] <nhm> dspano: it'll help with random write throughput too, but not as much. What kind of dell nodes are these?
[17:42] <nhm> dspano: we have some R515s in house and unfortunately have some strange performance issues on them. I think it's related to the expanders in the backplane. The R720xds seem to do better, but not as good as they should be able to do it.
[17:43] <dspano> nhm: R515s
[17:43] <nhm> dspano: ugh, ok. :)
[17:43] <dspano> nhm: lol
[17:43] <nhm> dspano: wb cache enabled on the H700?
[17:44] <jmlowe> ceph is copy on write and rbd has 4MB objects, so you are doing many more writes with the 4k size than the 4M size without a writeback cache to coalesce them
[17:44] <mikedawson> kylehutson: did you get back to HEALTH_OK?
[17:44] <dspano> nhm: Yes. I've got each of the OSDs in Raid0 on the controller. There's no option for JBOD.
[17:44] <nhm> dspano: that's how we do it too, 1 RAID0 per disk with WB cache enabled.
[17:45] <dspano> nhm: So Raid0x1 with writeback enabled.
[17:45] * DGMurdockIII (~chatzilla@c-69-243-167-125.hsd1.in.comcast.net) has joined #ceph
[17:45] <nhm> dspano: that's about the best you can do on those.
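The per-disk RAID0 layout nhm describes is typically created with something like the following MegaCLI commands; the enclosure:slot and adapter numbers are examples only, and the Dell-branded controller may ship a differently named but compatible tool:

    # one RAID0 logical drive per physical disk, controller write-back cache enabled
    MegaCli -CfgLdAdd -r0 [32:0] WB RA Direct NoCachedBadBBU -a0
    # keep the on-disk cache disabled, matching the setup described above
    MegaCli -LDSetProp -DisDskCache -LAll -aAll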
[17:45] <jmlowe> dspano: apt-get install linux-current-generic is what you want, using libvirt?
[17:45] <nhm> dspano: one of the very strange things I noticed on our boxes is that sending multiple direct IO streams to a single LUN seemed to tank performance.
[17:46] <DGMurdockIII> how come you official channel is not on freenode
[17:46] <nhm> dspano: so if I had a big RAID0 with 7 disks in it, I could do 800MB/s to the LUN with 1 writer and really big IOs, but as soon as I added a 2nd writer the performance dropped to like 70MB/s.
[17:46] <dspano> jmlowe: I'll try that. I was thinking of rolling the mainstream kernel on the RHEL vm I need this to work on.
[17:46] <ron-slc> DGMurdockII: Is OFTC not good enough??
[17:47] <DGMurdockIII> dont forget guys live interview going on now http://live.twit.tv/ with ceph dev
[17:47] <nhm> dspano: that wasn't with ceph involved at all.
[17:47] <jmlowe> rhel 6.4 has virtio-scsi backported and in the standard distro version
[17:47] <dspano> nhm: I'm really wishing I hadn't purchased Dell servers now.
[17:47] <nhm> Hey, I'm a live ceph dev (sort of). ;)
[17:47] <DGMurdockIII> with sage
[17:48] <janos> DGMurdockIII: cool. thanks for the head's up
[17:48] <DGMurdockIII> it's going on right now, watch it before it's over
[17:48] <janos> i am
[17:49] <jmlowe> dspano: if you are using libvirt, I can give you the xml I use that has caching,trim, virtio-serial for guest agent enabled
[17:49] <dspano> jmlowe: Yes, please.
[17:49] <janos> jmlowe: i'd be interested inseeing that too
[17:50] <jmlowe> dspano: would you prefer my bridged or nat for network?
[17:50] <dspano> jmlowe: I saw your comment about RHEL 6. I'm going to try to convince the vendor to use it, but they say they only support RHEL 5, so I may be stuck with that version.
[17:50] <nhm> dspano: I posted some comments a while back on the dell mailing list: http://lists.us.dell.com/pipermail/linux-poweredge/2012-September/047117.html
[17:50] <nhm> dspano: and http://web.archiveorange.com/archive/v/dg7HTRCQyfSpjqPmfMzV
[17:50] <jmlowe> dspano: pretty good chance it will run on rhel6 unmodified
[17:51] <dspano> jmlowe: I'm using Openstack Folsom, so it's bridged on br100
[17:51] <nhm> dspano: ignore the iodepth flag in those tests, it doesn't work with the sync engine. I was doing some testing with libaio as well and forgot to remove it (it doesn't hurt, just doesn't do anything)
[17:52] <nhm> dspano: notice especially how the latency increases
[17:52] <dspano> nhm: I'm just glad, it's not me doing anything wrong. Despite the disappointment, at least I have something concrete to go on.
[17:53] <nhm> dspano: One of these days when I have time I'm going to rip apart one of those boxes, manually by-pass the expander backplane, stick in an LSI controller, and see if I can narrow down whatever is causing it.
[17:53] <nhm> dspano: I suspect it's the expander backplane.
[17:54] <nhm> dspano: we have 64 of those machines. :)
[17:54] <dspano> nhm: That is much more than my 2!
[17:55] <dspano> nhm: If I wanted to try a different controller, what one would you suggest?
[17:55] <nhm> dspano: I was just going to try an LSI 9260 or 9265 or something. It's almost the same thing as the H700, but with different firmware.
[17:56] <jmlowe> dspano: http://pastebin.com/ypPJamm1
[17:57] <dspano> jmlowe: Thank you!
[17:57] <jmlowe> np
[17:57] <nhm> yes, my rant has derailed into complaining about our nodes. jmlowe is right, probably the first order of business is to look at your VM config, enable RBD cache, and make sure you are running with something like 0.60+
[17:58] * madkiss1 (~madkiss@089144192030.atnat0001.highway.a1.net) Quit (Quit: Leaving.)
[17:58] <dspano> nhm: I'm on 0.58.4
[17:58] <jmlowe> I can do about 80MB/s inside of our vm's with 18 osd's backed by 4 nodes
[17:59] <jmlowe> 0.56.4
[17:59] <nhm> dspano: If you are using xfs, you may see a nice random write improvement just by upgrading.
[18:00] <nhm> dspano: though if this is for production, you might want to wait for cuttlefish since it's so close to release.
[18:00] <jmlowe> nhm: I'm looking forward to that, but I'm waiting for cuttlefish
[18:00] <nhm> jmlowe: yeah
[18:00] <dspano> nhm: Yeah, it's production. I just noticed I put 58.4, I meant 56.4
[18:01] <nhm> dspano: yeah, figured as much. :)
[18:01] <dspano> nhm: I appreciate your rant. I'll try the cache, but I my gut feeling is that it's what you're saying.
[18:02] <nhm> dspano: it should help with sequential writes at least. Random writes may improve a little bit at least.
[18:02] <matt_> joao, mikedawson, are you around? I think I found a problem with the WIP-compact branch
[18:03] <joao> I'm here
[18:03] <joao> sup?
[18:03] <dspano> nhm: Yeah, I'm running xfs. What mkfs do you guys use on your OSDs on those servers?
[18:03] <matt_> joao, compact worked perfectly but I've just re-created an OSD and the OSD is crashing. Rebooting an existing osd works fine
[18:04] <dspano> nhm: I was just building an fs directly on the device rather than creating a partition.
[18:04] <matt_> -2> 2013-05-01 23:58:30.402619 7f3ed3434700 1 -- 172.16.0.3:6851/31643 <== mon.1 172.16.0.17:6789/0 8 ==== osd_map(42853..42853 src has 42853..48062) $
[18:04] <matt_> -1> 2013-05-01 23:58:30.402682 7f3ed3434700 3 osd.83 0 handle_osd_map epochs [42853,42853], i have 0, src has [42853,48062]
[18:04] <matt_> 0> 2013-05-01 23:58:30.407180 7f3ed3434700 -1 *** Caught signal (Aborted) **
[18:04] * Wolff_John (~jwolff@64.132.197.18) Quit (Ping timeout: 480 seconds)
[18:04] <nhm> dspano: currently I'm using -i size=2048 and mounting with -o inode64,noatime.
[18:05] <joao> matt_, better wait a bit for sjust; he might know better what is wrong with that
[18:05] <jmlowe> I use nobarrier with my battery backed controllers
[18:05] <joao> so we can look into it on the monitor side
[18:05] <dspano> nhm: I did the same. My mount options are LABEL=osd0 /srv/ceph/osd0 xfs rw,noexec,nodev,noatime,nodiratime,barrier=0 0 0
[18:06] <dspano> nhm: Sorry I meant I did the same at first.
[18:06] <dspano> I tried the mount options I just posted yesterday to see if it made a difference.
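Pulled together, the XFS settings mentioned above look like this; the device is a placeholder and the mount point is the one dspano quoted:

    # larger inodes leave room for ceph's xattrs; inode64/noatime at mount time
    mkfs.xfs -f -i size=2048 /dev/sdb
    mount -o inode64,noatime /dev/sdb /srv/ceph/osd0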
[18:07] <matt_> joao, no worries. Would he be far away from being online?
[18:07] <nhm> dspano: Christoph mentioned a couple of other things on the mailing list that may help, but I didn't see much improvement with them.
[18:07] <joao> matt_, he should be showing up soon
[18:07] <joao> as in, at most an hour
[18:08] <jmlowe> dspano: may or may not make a big difference, but you can go with a noop scheduler inside the vm and for any dev backed by a controller with cache
[18:08] <mikedawson> matt_: I haven't added any osds against wip-compact, but it's been working for me so far
[18:09] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[18:09] <matt_> mikedawson, Yeh likewise. It's just adding new osd's which seems to be broken. The rest seems fine
[18:09] <dspano> jmlowe: Tried that and deadline in the vm and all the OSDs. It runs better in Ubuntu, but I didn't see a huge difference in the RHEL vm.
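The scheduler switch jmlowe suggests is a runtime knob inside the guest; vda is whatever the virtio disk shows up as:

    # inside the VM: let the host/controller do the reordering
    echo noop > /sys/block/vda/queue/scheduler
    cat /sys/block/vda/queue/scheduler    # confirm: [noop] deadline cfq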
[18:10] <dspano> nhm: Did you write the performance blogs I've been reading over and over for the last couple of weeks?
[18:10] <jmlowe> dspano: it never gave me much either
[18:11] <nhm> dspano: if it's the one on the ceph blog, yes. :)
[18:11] <dspano> nhm: Just saw your info. Thank you for writing those. I really appreciate all the effort.
[18:12] <nhm> dspano: Thanks! It's really nice to hear that it helps people. :) I've got some more coming with the cuttlefish release.
[18:13] <nhm> This time 24 disks, bonded 10GbE, and RBD and QEMU/KVM tests.
[18:14] <jmlowe> cant' wait
[18:14] <dspano> nhm: I wish I could afford 10Gbe. I think I will shed real tears the day I'm able to buy my first switch.
[18:15] <jmlowe> we've got 2 48 port 10Gig switches sitting in storage
[18:15] <jmlowe> dozens of chelsio cards
[18:16] <nhm> dspano: I'm cheating, I just have 2 X520s connected together with SFP+. :)
[18:16] * loicd (~loic@2a01:e35:2eba:db10:7dd3:6565:e69f:225f) has joined #ceph
[18:17] * wag2 (~wag2@node001.ds.geneseo.edu) has joined #ceph
[18:18] <wag2> wow, finally got in.
[18:18] <nhm> wag2: was there a line? :)
[18:20] <dspano> jmlowe: I'm extremely jealous
[18:21] <jmlowe> high price of doing live demos on the floor of SC
[18:21] <nhm> jmlowe: oh, I think I might actually remember that.
[18:21] <jmlowe> 100Gig bandwidth challenge from SC'11 I think
[18:21] <nhm> jmlowe: yeah, I was there
[18:22] <nhm> jmlowe: I felt sorry for everyone involved. ;)
[18:22] <jmlowe> they were on a death march, that's for sure
[18:34] <janos> good interview with sage
[18:34] <sage> janos: thanks!
[18:35] <janos> i enjoyed it
[18:36] <janos> thank *you*
[18:37] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[18:39] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit ()
[18:40] * dxd828 (~dxd828@195.191.107.205) Quit (Remote host closed the connection)
[18:42] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Quit: Computer has gone to sleep.)
[18:42] * Havre (~Havre@2a01:e35:8a2c:b230:2d8b:cae5:ff86:48e6) has joined #ceph
[18:43] <scuttlemonkey> sage: were you the even 250?
[18:44] <sage> that'd be cool
[18:44] <scuttlemonkey> in other news...crap I missed it, will have to wait for the recording to go up
[18:44] <sage> not sure
[18:44] <sage> that'll be better, there were technical difficulties :)
[18:45] <scuttlemonkey> hehe
[18:45] <scuttlemonkey> always are!
[18:45] <dspano> nhm: I wonder if I bought a PCI-Xpress SSD, using that as a journal drive would overcome the issue with the raid controller.
[18:46] <dspano> nhm: The writeback cache didn't change things much.
[18:50] * loicd (~loic@2a01:e35:2eba:db10:7dd3:6565:e69f:225f) Quit (Quit: Leaving.)
[18:50] * gmason (~gmason@13-176-175.client.wireless.msu.edu) has joined #ceph
[18:50] <nhm> dspano: it'd probably help, but I don't know for sure that it'd really fix it.
[18:50] * gmason (~gmason@13-176-175.client.wireless.msu.edu) Quit ()
[18:50] <nhm> dspano: in qemu/kvm or on controller?
[18:51] * gmason (~gmason@13-176-175.client.wireless.msu.edu) has joined #ceph
[18:52] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[18:52] <dspano> nhm: qemu/kvm
[18:53] <dspano> nhm: Do you have the disk cache policy disabled when you create the raid0 array? Mine is.
[18:53] <nhm> dspano: yeah, disk cache is disabled.
[18:54] <dspano> nhm: Sounds like everything is the same. If I overcome this hurdle, you'll be the first to know about it.
[18:54] * wag2 (~wag2@node001.ds.geneseo.edu) Quit (Ping timeout: 480 seconds)
[18:56] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[18:57] <nhm> dspano: sorry, doing like 3 things at once. Do you have "rbd cache = true" in your ceph.conf file? You may need that still, I don't remember.
[18:57] <nhm> along with the stuff in the vm xml
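A minimal sketch of the client-side setting nhm is asking about, added to ceph.conf on the hypervisor; the libvirt disk additionally needs cache='writeback' in the VM XML jmlowe pasted:

    cat >> /etc/ceph/ceph.conf <<'EOF'
    [client]
        rbd cache = true
    EOF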
[18:57] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[18:59] * gmason (~gmason@13-176-175.client.wireless.msu.edu) Quit (Ping timeout: 480 seconds)
[19:03] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[19:03] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[19:06] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[19:07] * gmason_ (~gmason@13-248-55.client.wireless.msu.edu) has joined #ceph
[19:08] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: ASCII a stupid question, get a stupid ANSI!)
[19:10] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:11] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Ping timeout: 480 seconds)
[19:16] * DGMurdockIII (~chatzilla@c-69-243-167-125.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 20.0.1/20130409194949])
[19:19] * madkiss (~madkiss@089144192030.atnat0001.highway.a1.net) has joined #ceph
[19:20] * portante (~user@66.187.233.206) has joined #ceph
[19:21] * dwt (~dwt@128-107-239-233.cisco.com) has joined #ceph
[19:23] * gmason (~gmason@35.9.172.169) has joined #ceph
[19:24] * gmason_ (~gmason@13-248-55.client.wireless.msu.edu) Quit (Ping timeout: 480 seconds)
[19:26] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[19:26] * Wolff_John (~jwolff@ftp.monarch-beverage.com) has joined #ceph
[19:29] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[19:33] * madkiss (~madkiss@089144192030.atnat0001.highway.a1.net) Quit (Ping timeout: 480 seconds)
[19:34] * rturk-away is now known as rturk
[19:34] <matt_> sjust, sjusthm, sjustlaptop , if you're around I have a new bug that joao thinks you might be able to help with
[19:34] <sjusthm> matt_: what's up?
[19:35] * gmason (~gmason@35.9.172.169) Quit (Ping timeout: 480 seconds)
[19:36] <matt_> Using the WIP-compact build, a fresh OSD crashes with - $
[19:36] <matt_> -1> 2013-05-02 00:48:56.550134 7f0ad37cf700 3 osd.83 0 handle_osd_map epochs [42853,42853], i have 0, src has [42853,48076]
[19:37] * gmason (~gmason@13-248-55.client.wireless.msu.edu) has joined #ceph
[19:37] <sjusthm> can I get the 10 lines on either side?
[19:40] * rustam (~rustam@94.15.91.30) has joined #ceph
[19:43] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[19:44] * gmason (~gmason@13-248-55.client.wireless.msu.edu) Quit (Quit: Computer has gone to sleep.)
[19:44] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[19:48] * tkensiski (~tkensiski@209.66.64.134) has joined #ceph
[19:48] * tkensiski (~tkensiski@209.66.64.134) has left #ceph
[19:49] <dspano> nhm: I totally forgot about that!
[19:51] * psomas (~psomas@inferno.cc.ece.ntua.gr) Quit (Read error: Operation timed out)
[19:52] * paravoid (~paravoid@scrooge.tty.gr) Quit (Read error: Operation timed out)
[19:52] <dspano> nhm: Much faster
[19:53] * gmason (~gmason@35.9.32.192) has joined #ceph
[19:53] * LeaChim (~LeaChim@176.250.176.2) Quit (Ping timeout: 480 seconds)
[19:53] <dspano> nhm: dd if=/dev/zero of=test bs=4k count=16k oflag=direct
[19:53] <dspano> nhm: 67108864 bytes (67 MB) copied, 2.63155 s, 25.5 MB/s
[19:53] * psomas (~psomas@inferno.cc.ece.ntua.gr) has joined #ceph
[19:54] * paravoid (~paravoid@scrooge.tty.gr) has joined #ceph
[19:54] <joao> gregaf, sage, I think I recall seeing some election bug fixes going into next in the last couple of days; any idea what they were supposed to fix?
[19:54] <gregaf> joao: they got rolled back
[19:54] <joao> I just hit an election cycle on one of the monitors after shutting down the leader
[19:55] <gregaf> I was trying to prevent out-of-date monitors from becoming leader
[19:55] <joao> while the third monitor is still a peon
[19:55] <joao> hmm... I wonder if this might be it
[19:55] <joao> okay, looking for a couple more minutes
[19:59] <gregaf> joao: sagewk: I don't understand the bug that's fixed by using rank instead of name in pick_random_mon
[19:59] <joao> gregaf, using the rank is just to make things more obvious
[19:59] <joao> I can rebase those two patches
[20:00] <joao> and make things a bit clearer
[20:00] <gregaf> so it wasn't broken as it was?
[20:00] <gregaf> use wip-mon-rank that sagewk created
[20:00] <sagewk> fwiw, i think that rand loop and the search that follows should be collapsed into a single 'start at a random point, then walk forward until we find one that is good'
[20:00] <sagewk> except that the debug_* stuff confuses me
[20:01] <joao> gregaf, the function was broken in the sense that it was allowing for a get_name(n) with n == monmap.size()
[20:01] <sagewk> sjust: wip-4872 looks ok?
[20:02] <sjusthm> looking
[20:02] <joao> gregaf, I'll rework those two patches, just a sec
[20:02] * LeaChim (~LeaChim@176.250.181.7) has joined #ceph
[20:03] <gregaf> joao: yeah, but only if you specified an "other" to avoid
[20:03] <joao> sagewk, the debug_* stuff used to be nifty to force a monitor to sync from another monitor
[20:03] <sjusthm> ah.
[20:03] <sjusthm> ok, looks right
[20:03] <gregaf> so yes, that was broken, but the interface change had nothing to do with fixing a bug
[20:03] <joao> gregaf, I think I did, in the other patch
[20:03] <gregaf> yes, I just thought you'd said the interface change fixed a bug too
[20:06] <joao> apologies then; these are two patches with different purposes: the first solves the bug on sync_timeout() comparing a monmap name to the monitor's name by using ranks instead of names; the second fixes a bug on _pick_random_mon() when specifying a 'other'
[20:06] <elder> Tornado sirens going off (just testing) and snow is falling.
[20:07] <gregaf> joao: okay, so there is a bug that's fixed. what's the monmap name to monitor's name comparison issue?
[20:09] <joao> gregaf, Monitor::name holds whatever you call your monitor ('a', 'b', 'foo'); the monmap name appears to hold a '0', '1', ...
[20:09] <joao> I honestly didn't expect that, so I wonder if this is a different bug of sorts
[20:10] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[20:10] <sagewk> joao: hmm, they should always be in sync..
[20:10] <joao> but when the time comes to compare those on sync_timeout(), 'a' != '0', and we may pick ourselves
[20:11] <gregaf> okay, well now I'm really confused because I don't think those can be out of sync like that
[20:11] <gregaf> unless you're comparing the wrong things
[20:11] <joao> yeah, now I'm confused too
[20:12] <joao> let me recheck this whole thing
[20:12] <gregaf> maybe the messenger entity names are based on ranks? but I don't think they are
[20:15] * dwt (~dwt@128-107-239-233.cisco.com) Quit (Quit: Leaving)
[20:16] * eschnou (~eschnou@252.94-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:18] * portante|ltp (~user@66.187.233.207) has joined #ceph
[20:19] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[20:20] <joao> gregaf, entity names appear to be based on rank; e.g., sync_send_heartbeat mon.1 127.0.0.1:6790/0
[20:21] <gregaf> okay, but that should just mean we need to match them up properly, not that we can't match them up properly, right?
[20:22] <gregaf> (my inclination actually is that we should work in terms of ranks all the time, but we need to make sure we understand what's going on here so we can look and see if it's a problem elsewhere too)
[20:22] <joao> yeah
[20:22] <joao> looking
[20:23] * wag2 (~wag2@node001.ds.geneseo.edu) has joined #ceph
[20:28] * rustam (~rustam@94.15.91.30) has joined #ceph
[20:35] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[20:37] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) Quit (Read error: Operation timed out)
[20:41] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) has joined #ceph
[20:44] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:50] <nhm> dspano: glad to hear it! It'll probably help random writes too, but not to any extent that it really fixes the underlying bad behavior
[20:51] <nhm> dspano: cuttlefish should help too, and we also have some fixes in place for rbd cache that improve sequential write performance and reduce lag under heavy IO loads.
[20:53] * yohui (~yohui@p5B09DE26.dip0.t-ipconnect.de) has joined #ceph
[20:56] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[20:57] * Yudong (Yudong@hp1.cs.fiu.edu) Quit ()
[21:00] * yohui (~yohui@p5B09DE26.dip0.t-ipconnect.de) Quit (Quit: irc2go)
[21:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[21:06] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:06] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:16] * portante|ltp (~user@66.187.233.207) Quit (Ping timeout: 480 seconds)
[21:17] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[21:23] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[21:34] * dwt (~dwt@128-107-239-233.cisco.com) has joined #ceph
[21:42] * sakib (~sakib@109.251.213.132) has joined #ceph
[21:45] * danieagle (~Daniel@186.214.77.43) has joined #ceph
[21:48] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[21:50] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[21:50] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:56] * dwt (~dwt@128-107-239-233.cisco.com) Quit (Read error: Connection reset by peer)
[21:58] <wido> joao: gregaf: Are you also seeing the mon issues with single mons?
[21:58] <wido> I upgraded another dev cluster to the next branch, it has a single mon, it's very slow and most of the time even unresponsive
[21:59] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[21:59] <nhm> wido: how many PGs?
[22:00] <wido> nhm: I think about 7k
[22:00] <wido> cluster is down, can't check
[22:00] <nhm> wido: ok, so pretty small
[22:00] <wido> nhm: Yes
[22:00] <nhm> wido: I've had some problems with unresponsiveness and slowness with like 100K PGs on 1 mon.
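A quick way to confirm the PG count once the cluster responds again; the pool name 'rbd' below is just an example, and if the per-pool query isn't available on a given version, ceph -s alone shows the total in the pgmap line:

    ceph -s                        # pgmap summary includes the total PG count
    ceph osd pool get rbd pg_num   # PG count for a single pool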
[22:06] <gregaf> wido: I don't think we have observed issues with 1 mon, no
[22:07] <gregaf> what version precisely are you on? bunch of fixes and changes going in lately
[22:07] <wido> gregaf: I just updated the issue (#4851), but it's from the next branch as of about 8 hours ago
[22:08] <gregaf> hmm, k
[22:08] <wido> btw, isn't it Labor day over there?
[22:09] <wido> the mon isn't responding on the admin socket either
[22:09] <gregaf> no, that's in September, and always a Monday...
[22:11] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:11] <wido> Ah, it's Labour Day on May 1st in Europe :)
[22:12] <wido> Anyway, it's getting kind of weird with the monitors in the next branch
[22:13] <gregaf> been weird for a while :/
[22:17] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:18] <wido> sage: Do you want debug mon 20 or also ms?
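A sketch of the usual way to capture the logging being asked for: raise monitor and messenger debugging in ceph.conf, restart the mon, then query its admin socket (the socket path and mon id 'a' are assumptions, adjust to the cluster):

    [mon]
        debug mon = 20
        debug ms = 1

    ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok mon_status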
[22:18] <nhm> wido: btw, ZFS seems to work fine so long as you don't use SA attributes. It's slow though without them.
[22:19] <wido> nhm: Indeed. Didn't report back on that. With SA attributes it gets weird. Removing objects, for example, doesn't work
[22:20] * dwt (~dwt@wsip-70-166-104-226.ph.ph.cox.net) has joined #ceph
[22:20] * gmason (~gmason@35.9.32.192) Quit (Quit: Computer has gone to sleep.)
[22:20] <nhm> wido: I didn't look too closely, but deepscrub was failing due to some xattr problem.
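The SA attribute mode in question is a per-dataset ZFS-on-Linux property; a hedged example, using a hypothetical dataset name:

    zfs set xattr=sa tank/ceph-osd0   # xattrs stored in the dnode: fast, but the mode that misbehaved above
    zfs set xattr=on tank/ceph-osd0   # directory-based xattrs: slower, but the mode reported to work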
[22:21] * dontalton2 (~dwt@128-107-239-234.cisco.com) has joined #ceph
[22:24] <nhm> ok, IBM is kind of hardcore: http://techland.time.com/2013/05/01/tiny-toon-ibm-makes-a-movie-out-of-atoms/
[22:27] * wag2 (~wag2@node001.ds.geneseo.edu) Quit (Ping timeout: 480 seconds)
[22:28] * dwt (~dwt@wsip-70-166-104-226.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[22:31] * noob22 (~cjh@2620:0:1cfe:28:9cf8:21a5:b78d:b5ed) has left #ceph
[22:31] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Read error: Connection reset by peer)
[22:33] * jtang2 (~jtang@79.97.135.214) has joined #ceph
[22:33] * jtang1 (~jtang@79.97.135.214) Quit (Read error: Connection reset by peer)
[22:33] * rustam (~rustam@94.15.91.30) has joined #ceph
[22:34] * Cube (~Cube@12.248.40.138) has joined #ceph
[22:34] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[22:36] * sakib (~sakib@109.251.213.132) Quit (Quit: leaving)
[22:36] <athrift> nhm: I am really looking forward to your cuttlefish performance blog post :)
[22:38] <nhm> athrift: me too! who knows what I'll write? ;)
[22:38] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[22:38] * brady_ (~brady@rrcs-64-183-4-86.west.biz.rr.com) has joined #ceph
[22:38] <nhm> athrift: Might be a bit late though!
[22:39] <athrift> nhm: better late than never
[22:44] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[22:44] * jtang2 (~jtang@79.97.135.214) Quit (Read error: Connection reset by peer)
[22:44] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[22:50] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[23:01] * mikedawson_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[23:01] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Remote host closed the connection)
[23:04] * eschnou (~eschnou@252.94-201-80.adsl-dyn.isp.belgacom.be) Quit (Read error: Operation timed out)
[23:05] <TiCPU> on 2 different kernels, 3.8 from Debian Wheezy and 3.5 from Ubuntu 12.04 LTS, rbd map <someimage> == kernel panic. Kernel 3.9 has this fixed; I hope it will get backported.
[23:07] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:07] * mikedawson_ is now known as mikedawson
[23:08] * Wolff_John (~jwolff@ftp.monarch-beverage.com) Quit (Quit: ChatZilla 0.9.90 [Firefox 20.0.1/20130409194949])
[23:11] * jtang1 (~jtang@79.97.135.214) Quit (Quit: Leaving.)
[23:16] <joshd> TiCPU: do you have the backtrace?
[23:17] <joshd> TiCPU: do you have newer crush tunables enabled? support for those was new in kernel 3.9
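One way to check whether newer CRUSH tunables are in play before mapping an image from an older kernel. The first three commands are standard; the last one (reverting to legacy tunables, which triggers data movement) may depend on the Ceph version:

    ceph osd getcrushmap -o /tmp/crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
    grep tunable /tmp/crushmap.txt        # any "tunable" lines mean non-legacy tunables are set
    ceph osd crush tunables legacy        # revert so older kernel clients can map images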
[23:18] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[23:18] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[23:20] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[23:20] * john_barbee_ (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[23:21] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:22] <sagewk> gregaf: dumper looks ok. can we make a teuthology test for it so it doesn't break again?
[23:22] <gregaf> haha, was just adding a ticket for that
[23:23] <gregaf> I've got your reviewed-by to merge into next?
[23:23] <sagewk> yeah
[23:23] <gregaf> sweet
[23:25] <gregaf> …wow, our null commit takes longer to build than it used to
[23:25] <gregaf> a ton of the libraries depend on the git hash now, and we probably just need the end-product executables to do so :/
[23:26] <sagewk> sjusthm: wip_4813 isn't in next yet right?
[23:26] <sjusthm> sagewk: it is
[23:26] <sjusthm> seems to have caused 4884
[23:26] <sagewk> oh i see it
[23:27] <gregaf> noooo, what'd I miss?
[23:27] <gregaf> davidz, is it possible you've bookmarked the FS project's New issue page?
[23:29] * niklas_ (~niklas@2001:7c0:409:8001::32:115) has joined #ceph
[23:29] * niklas (~niklas@2001:7c0:409:8001::32:115) Quit (Read error: Connection reset by peer)
[23:31] <sjusthm> gregaf: I think I know what happened, it's a flaw in the way I handled start_split in load_pgs
[23:31] * niklas_ (~niklas@2001:7c0:409:8001::32:115) Quit (Read error: Connection reset by peer)
[23:32] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Remote host closed the connection)
[23:32] <sagewk> sjusthm, gregaf: http://tracker.ceph.com/issues/4851 .. how do you run a leveldb fsck manually?
[23:33] <sagewk> wido: ^
[23:33] <sjusthm> you give it the paranoid start up option
[23:33] <dmick> gregaf: you should fix that makefile :-P
[23:33] <davidz> I always seem to go there to find a "New Issue" yab
[23:33] <davidz> tab
[23:33] <sjusthm> which we don't currently do
[23:34] * niklas (~niklas@2001:7c0:409:8001::32:115) has joined #ceph
[23:34] <sjusthm> there doesn't appear to be a standalone option currently
[23:35] <davidz> I don't see "New Issues" on the Home page or "My page", so I always click on "fs" under latest projects, which seems like the best choice to file a bug (not teuthology, rbd, rgw or Performance).
[23:36] <sjusthm> yeah, it's special like that
[23:36] <sjusthm> I usually get to it from one of the bugs under "My Page"
[23:38] * portante is now known as portante|afk
[23:39] <davidz> So when you say "it" you mean I should be filing under project "ceph"?
[23:40] <sjusthm> yeah
[23:45] * john_barbee_ (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:45] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[23:46] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:49] * niklas_ (~niklas@2001:7c0:409:8001::32:115) has joined #ceph
[23:50] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[23:50] * dwt (~dwt@128-107-239-233.cisco.com) has joined #ceph
[23:51] * dontalton2 (~dwt@128-107-239-234.cisco.com) Quit (Quit: Leaving)
[23:51] * niklas (~niklas@2001:7c0:409:8001::32:115) Quit (Ping timeout: 480 seconds)
[23:53] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[23:54] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:56] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit ()
[23:57] <PerlStalker> Is it possible to mirror the ceph debian repo?
[23:58] <dmick> There is an EU mirror run by Wido, so yes
[23:58] <dmick> From ceph.com/docs: For the European users there is also a mirror in the Netherlands at http://eu.ceph.com/
[23:59] <PerlStalker> I'm looking at that page.
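A sketch of one way to mirror the repository locally; the rsync module name, repo path, and the local paths/hostname are assumptions, so check what the mirror actually exposes before relying on them:

    rsync -avz --delete eu.ceph.com::ceph /srv/mirror/ceph
    # then point clients at the local copy
    echo "deb http://mirror.example.com/ceph/debian/ wheezy main" \
        > /etc/apt/sources.list.d/ceph.list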

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.