#ceph IRC Log


IRC Log for 2013-02-09

Timestamps are in GMT/BST.

[0:02] <loicd> dmick: :-) it's in build-depends though Build-Depends: ... libboost-program-options-dev
[0:02] * leseb (~leseb@bea13-1-82-228-104-16.fbx.proxad.net) has joined #ceph
[0:02] <dmick> yeah. it just didn't get installed on the gitbuilder
[0:03] <dmick> we have a "gitbuilder creation" mechanism that wasn't updated when the new lib dependency was added
[0:03] <dmick> (repo autobuild-ceph)
[0:03] <loicd> checking boost/program_options/option.hpp usability... yes checking boost/program_options/option.hpp presence... yes checking for boost/program_options/option.hpp... yes
[0:04] <loicd> oh :-)
[0:04] * wschulze1 (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[0:05] <ShaunR> dmick: so if I were to change the replicas to 3, would I get better performance over 2?
[0:05] <ShaunR> I should, right? since it has 3 locations to pull from instead of 2
[0:06] <dmick> no, not the way ceph works
[0:06] <dmick> reads always come from primary (if it's up). writes go to all replicas before client can continue
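
The replica count being discussed here is a per-pool setting; a minimal sketch of inspecting and changing it (the pool name "rbd" is only an example):

    ceph osd pool get rbd size        # current number of replicas for the pool
    ceph osd pool set rbd size 3      # ask for 3 copies of every object in this pool
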
[0:07] <dmick> and, loicd: program_options headers are in libboost-dev, but the library is in libboost-program-options-dev
[0:07] <dmick> go figure
[0:07] <loicd> :-D
[0:08] <ShaunR> Wait what... I thought one of the goals here is to allow for increased performance (IOPS specifically)
[0:10] <loicd> I thought about extending the checks done by --enable-coverage to detect commands such as mkfs.btrfs, mkfs.ext4, etc., in order to run more extensive tests that depend on these tools when running make check-coverage
[0:11] <loicd> I would very much like to have a simple way ( i.e. ./configure --enable-coverage + make check-coverage ) to maximize the code coverage.
[0:12] <loicd> For instance by running src/test_filestore which is not run by default by make check or check-coverage
[0:12] <loicd> although it increases the coverage of src/os/FileStore.cc significantly
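
A minimal sketch of the workflow loicd is describing, assuming a source checkout and the configure/make targets named above:

    ./autogen.sh
    ./configure --enable-coverage
    make
    make check             # the default, faster test suite
    make check-coverage    # coverage-instrumented run
    ./src/test_filestore   # extra FileStore coverage; not run by default and may need a scratch directory
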
[0:14] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[0:14] <gregaf1> it's not anything you could really call a unit test, though
[0:14] <gregaf1> I believe the rest of make check can at least claim that appellation
[0:15] <dmick> ShaunR: not for any single thread. Aggregate bandwidth for many clients/threads is increased because the storage is distributed across the cluster's compute/IO resources
[0:15] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit ()
[0:16] <dmick> but any single request is necessarily slower because the redundancy benefit isn't free
[0:17] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[0:18] <ShaunR> dmick: qemu using rbd to attach the disk, is that considered a single thread?
[0:19] <dmick> it's likely that the OS image on that qemu machine is going to issue multiple simultaneous requests to the block device
[0:19] <dmick> so, no
[0:20] <dmick> but the blocks are probably pretty localized, so you won't get as much distribution across the cluster as you'd want for performance increase
[0:20] <dmick> now, 100 VMs...
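
For context, attaching an RBD image directly to a qemu guest looks roughly like this (image and pool names are examples, and the exact flags depend on the qemu build):

    qemu-system-x86_64 -m 1024 \
        -drive format=raw,file=rbd:rbd/vm1-disk,cache=writeback,if=virtio
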
[0:21] <loicd> gregaf1: yes. I'm not sure what the right way is to approach unit tests that have a dependency on external tools. If a unit test makes use of mkfs.btrfs instead of relying on a mockup, can it still be called a unit test...
[0:21] <gregaf1> probably not
[0:22] <gregaf1> at that point we write teuthology jobs that spin them up
[0:22] <loicd> Writing unit tests for FileStore::_detect_fs with mockups seems fairly pointless. Maybe it's a borderline case.
[0:22] <gregaf1> often with our "workunits" infrastructure, which is just writing a shell script, dropping it in ceph/qa/suites/workunits, and then adding a yaml to ceph-qa-suite
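
A hypothetical workunit along the lines gregaf1 describes is just an executable shell script whose exit status decides pass/fail; for example (the rados bench commands are purely illustrative):

    #!/bin/sh -ex
    # exercise the cluster a bit; any non-zero exit marks the job as failed
    rados -p rbd bench 30 write --no-cleanup
    rados -p rbd bench 30 seq

The yaml side of it is then a small fragment in ceph-qa-suite that points the workunit task at this script.
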
[0:23] <ShaunR> dmick: yes, I'm talking about 500+ VMs
[0:25] <dmick> So yes, the cluster should help distribute the load more evenly for those VMs, but adding replication won't help; adding more OSDs and making sure PGs distribute across them is what you want
[0:25] <ShaunR> what we build now is basically 4-disk hardware RAID10 arrays, dual quad core, 64GB RAM... we can roughly get 20 VMs on something like this before we see an IO bottleneck.
[0:25] <ShaunR> We are left with a ton of extra disk space...
[0:26] <ShaunR> the idea here is to stop doing that, build out simple hosts without the RAID card and disks, and instead build out a large Ceph cluster that all the VMs would use.
[0:26] <ShaunR> hoping that we could increase IO easily by adding "shelves"
[0:26] <ShaunR> or in this case OSDs I guess :)
[0:27] <loicd> gregaf1: are there teuthology "workunits" designed to maximize coverage? My motivation is to be able to run tests that cover as much code as possible as fast as possible. I would like to make it as simple/fast as possible to run tests showing that a patch did not break any unit test / integration test. Maybe running all of ceph-qa-suite is what I need. I should try to see how long it takes.
[0:27] <dmick> ShaunR: it's a reasonable plan
[0:28] <gregaf1> loicd: I don't think we segment them by time allotment much once we make the teuthology/unit test decision
[0:29] <gregaf1> but you should ask joshd or somebody more involved in setting that up
[0:29] <loicd> The proposed chain_xattr.cc unit tests are somewhat sensitive to the underlying file system https://github.com/ceph/ceph/pull/40/files but they won't break, only skip.
[0:29] <gregaf1> ShaunR: dmick: as long as you remember that replication increases the IOPS required of the disks, so as you increase replication you'll need more of them to support the same number of users
[0:29] <gregaf1> of course, the tradeoff there is you get to amortize a lot more as well, so you see fewer local hotspot nodes
[0:30] * leseb (~leseb@bea13-1-82-228-104-16.fbx.proxad.net) Quit (Remote host closed the connection)
[0:30] <joshd> loicd: there are a small number of non-unit tests that don't require e.g. a running cluster. these aren't really grouped together anywhere in ceph-qa-suite though
[0:30] <dmick> gregaf1: yeah, I was specifically talking about adding more OSDs but not increasing replication
[0:31] <joshd> loicd: so there's no real fast collection other than make check
[0:32] <loicd> joshd ok :-)
[0:33] <loicd> gregaf1: thanks !
[0:34] <loicd> joshd gregaf1 would you advise me to keep going with unit tests or to spend more time on teuthology ? My goal is both to learn the code base and improve code coverage.
[0:35] <loicd> I feel like I should improve tests for FileStore.cc in teuthology rather than with make check-coverage / check
[0:36] <loicd> But that testing chain_xattr.cc is better done with make check
[0:36] <loicd> I'm not sure though, all this is still a bit fuzzy for me ;-)
[0:36] <joshd> loicd: for the FileStore stuff, yeah, I think it makes sense to do as a script run by teuthology
[0:37] <ShaunR> gregaf1: I was figuring the more replicas you had, the more read performance you would get, seeing as there are 3 separate paths to the same data
[0:37] <gregaf1> ShaunR: no, because then you're diluting your effective cache
[0:38] <joshd> I'm not sure about the chain_xattrs stuff, I haven't looked at it too closely
[0:38] <gregaf1> and if you wanted to make IO go through faster you'd basically have to issue the request to each node and then take the one which answered fastest
[0:38] <gregaf1> which is wildly inefficient even by distributed FS standards ;)
[0:38] <gregaf1> (although there are systems which do so when they're specifically latency-focused)
[0:39] <joshd> loicd: I'd actually advise you to do more unit tests, or at least not full-scale system tests
[0:39] <ShaunR> gregaf1: so with a 500 VM cluster attached to a ceph cluster with 2 replicas, if I started seeing an IO limitation across all VMs I could simply add more OSDs to increase IO?
[0:39] <gregaf1> yeah
[0:40] <loicd> joshd I will follow your advice, thanks :-)
[0:40] <ShaunR> ok, that's the answer I was hoping for... otherwise I would have just wasted days of time :)
[0:40] <joshd> loicd: if ever a class could use unit testing (and documentation), it's bufferlist. but that's quite a tedious task
[0:41] <dmick> ShaunR: I *think* that's what I was saying :)
[0:41] <joshd> loicd: getting to know the filestore would certainly be useful, but it's not nearly as unit-testable
[0:42] <ShaunR> gregaf1: also, will the cluster automatically start moving data to the new OSD's to balance?
[0:43] <gregaf1> assuming you add them to the CRUSH map, yes
[0:44] <ShaunR> I just don't want to add a new OSD and have only new data see the improvement; like I said, I want the whole cluster to increase in performance
[0:44] * rturk is now known as rturk-away
[0:45] <gregaf1> yeah, it rebalances, it's not a new-data-only thing
[0:45] * rturk-away is now known as rturk
[0:47] <ShaunR> gregaf1: Are you part of the ceph project?
[0:47] <gregaf1> yeah
[0:47] <gregaf1> sorry, in a call now, I have to drop this for now :)
[0:48] <ShaunR> ok, i tried to do some research on this, was hoping you might have some more insight.....
[0:48] <ShaunR> bahh... ok
[0:48] <ShaunR> thanks for the time so far!
[0:48] <loicd> I'll add to https://github.com/ceph/ceph/blob/master/src/test/bufferlist.cc in order to improve the coverage of http://dachary.org/wp-uploads/2013/01/ceph/common/buffer.cc.gcov.html . That's what you're referring to, right joshd?
[0:50] <ShaunR> dmick: I'm trying to add my other server to the first now... but mkcephfs is not happy
[0:50] <ShaunR> looks like because the first server was already done
[0:52] <ShaunR> ahh, n/m, think I just found the right docs
[0:53] * rturk is now known as rturk-away
[0:53] <dmick> yeah, mkcephfs isn't well-suited to incremental bringup, but I assume you found http://ceph.com/docs/master/rados/operations/add-or-rm-osds/
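
The add-or-rm-osds procedure dmick links boils down to roughly the following on the new host; ids, hostnames and paths here are examples, the exact CRUSH subcommand varies by release, and the linked doc is the authoritative reference:

    ceph osd create                                # allocates the next osd id, e.g. 1
    mkdir -p /var/lib/ceph/osd/ceph-1
    ceph-osd -i 1 --mkfs --mkkey                   # initialize the data dir and generate a key
    ceph auth add osd.1 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-1/keyring
    ceph osd crush add osd.1 1.0 root=default host=storage2   # put it in the CRUSH map so PGs map to it
    service ceph start osd.1
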
[0:54] <ShaunR> yep
[0:54] <ShaunR> but of course i'm already having issues :)
[0:55] <dmick> growing pains
[0:55] <dmick> and there are lots of steps there
[0:55] <dmick> what's up?
[0:57] <ShaunR> 2013-02-08 08:02:46.576195 7f79b2e9a700 0 -- :/7238 >> 208.67.183.132:6789/0 pipe(0x23a14b0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
[0:57] <ShaunR> 2
[0:57] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Remote host closed the connection)
[0:59] <joshd> loicd: yeah, that'd be awesome. I'm always paranoid about it since it's so complex, old, and lacking tests
[0:59] <loicd> :-)
[0:59] <dmick> ShaunR: that's not very much information to go on. does ceph osd tree look right?
[1:01] <ShaunR> dmick: not sure; honestly I'm not sure whether that output is wrong, for the most part.
[1:01] <ShaunR> the .fault is what threw me
[1:02] <dmick> "that output" meaning the output from ceph osd tree?
[1:02] <dmick> or the log message?
[1:04] <ShaunR> no, what I pasted above
[1:04] <dmick> yes, me neither. That's why I said it's not very much information to go on, and asked about the output of ceph osd tree
[1:06] <ShaunR> http://pastebin.ca/2311976
[1:06] <ShaunR> I only have osd.0 and osd.1 though
[1:07] <ShaunR> not sure where those others are coming from
[1:10] <dmick> ok, so, you do have osd.2 and osd.3 somehow mentioned
[1:10] <dmick> and 1, 2, and 3 are not properly added to the crush map
[1:11] <dmick> *and* 0 is down
[1:11] <dmick> so yeah, there are problems
[1:13] <ShaunR> lol
[1:13] <ShaunR> maybe i should start over..
[1:13] <dmick> did you follow the process on add-or-rm-osds?
[1:13] <ShaunR> in this setup I've got two machines, trying to use both machines as a mon, mds, and osd
[1:16] <ShaunR> a bit confused how osd.0 could be down but 1 up
[1:20] <ShaunR> ok, i just decided to wipe and start over..
[1:20] <ShaunR> now i see this though... HEALTH_WARN 576 pgs stuck inactive; 576 pgs stuck unclean
[1:20] <ShaunR> osd tree looks better though
[1:20] <ShaunR> http://pastebin.ca/2311982
[1:21] <ShaunR> hmm, now it's saying HEALTH_OK
[1:21] <dmick> oh, ok. so it got to the point of being peered
[1:21] <ShaunR> must have taken a sec to initialize or something
[1:22] <ShaunR> hmm, is it normal for the other hosts to be running the conf/keyring out of /tmp?
[1:23] * rturk-away is now known as rturk
[1:24] <dmick> if you mean what I think you mean, no
[1:24] <ShaunR> storage2 configs are in /tmp/
[1:24] <ShaunR> ex: root 3963 0.4 0.0 1258816 10056 ? Ssl 16:20 0:01 /usr/bin/ceph-mon -i b --pid-file /var/run/ceph/mon.b.pid -c /tmp/ceph.conf.7999
[1:26] <dmick> mkcephfs copies them there temporarily, but should also put them in /etc/ceph
[1:27] <ShaunR> doesn't look like it did
[1:28] <dmick> and you didn't run it with --no-copy-conf?
[1:28] <ShaunR> nope
[1:28] <dmick> I have no explanation
[1:28] <ShaunR> mkcephfs -a -c /etc/ceph/ceph.conf -k ceph.keyring
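
For a two-machine setup like the one described above (mon, mds and osd on both hosts), the mkcephfs-era /etc/ceph/ceph.conf is roughly of this shape; hostnames, addresses and section ids are placeholders, the auth option name varies slightly by release, and note that two monitors give no quorum tolerance (an odd number is usually preferred):

    [global]
        auth supported = cephx
    [mon.a]
        host = storage1
        mon addr = 10.0.0.1:6789
    [mon.b]
        host = storage2
        mon addr = 10.0.0.2:6789
    [mds.a]
        host = storage1
    [mds.b]
        host = storage2
    [osd.0]
        host = storage1
    [osd.1]
        host = storage2
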
[1:28] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) has joined #ceph
[1:29] <ShaunR> ahh, it's the init script doing it
[1:30] * yehuda_hm (~yehuda@2602:306:330b:a40:d99b:e6d5:b855:167e) has joined #ceph
[1:30] <ShaunR> it scp's over a copy of the config into /tmp and appends its pid
[1:34] * ircolle (~ircolle@c-67-172-132-164.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[1:39] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[1:53] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:56] * rturk is now known as rturk-away
[1:57] * xiaoxi (~xiaoxiche@134.134.139.74) Quit (Remote host closed the connection)
[1:57] * jlogan2 (~Thunderbi@72.5.59.176) has joined #ceph
[1:57] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[1:57] * rturk-away is now known as rturk
[2:00] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[2:02] * jlogan1 (~Thunderbi@2600:c00:3010:1:64a7:2a7d:3bc:c1b2) Quit (Ping timeout: 480 seconds)
[2:07] * rtek (~sjaak@empfindlichkeit.nl) Quit (Ping timeout: 480 seconds)
[2:08] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[2:10] * joshd (~joshd@38.122.20.226) Quit (Quit: Leaving.)
[2:12] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[2:16] * rtek (~sjaak@empfindlichkeit.nl) has joined #ceph
[2:16] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[2:18] <ShaunR> with rbd is it ideal to just use the rbd pool?
[2:18] <ShaunR> or should i create my own
[2:18] * rturk is now known as rturk-away
[2:24] <dmick> ShaunR: ah, for starting remotes using the local conf, I see
[2:25] <dmick> and for the rbd pool: if you're messing around with the cmdline, it saves you from having to type -p <pool> all the time
[2:25] <dmick> but otherwise there's nothing magic
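
A quick sketch of the difference (sizes are in MB and the names are examples):

    rbd create --size 10240 vm1-disk        # lands in the default 'rbd' pool
    rbd ls
    ceph osd pool create vms 128            # or make a dedicated pool (128 PGs is only an example)
    rbd create --size 10240 -p vms vm1-disk
    rbd ls -p vms
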
[2:25] <ShaunR> i'm trying to understand placement groups right now.
[2:26] <ShaunR> It kind of sounds like I really need to know how many OSDs I'm going to have
[2:27] <ShaunR> that throws a big wrench in the mix when it comes to scaling out a cluster as needed.
[2:27] <dmick> well, "a hundred or so per OSD" is only a guideline; you can run with more or less
[2:27] <dmick> and
[2:27] <ShaunR> I may not be fully understanding this though either... may need to read it a few more times
[2:28] <dmick> soon one will be able to dynamically increase the number of PGs per pool; it's just not ready for prime time quite yet, but imminent
[2:28] <dmick> so that should help for "I started tiny, now I need to be huge"
[2:28] <ShaunR> ok, so it sounds like i am understanding this right.
[2:28] <dmick> but the thing to remember is that the cluster *works* no matter how many PGs there are
[2:28] <dmick> it just may not distribute as evenly as you'd like with a small number
[2:29] <ShaunR> sounds like if I wanted the cluster to grow I would need to add OSDs and then create a new pool?
[2:29] <ShaunR> until they get to the point where PGs can be changed in a pool
[2:30] <dmick> if your pg count was prohibitively low to start, yes
[2:30] <dmick> I'd say aim for a reasonable eventual first plateau
[2:30] <dmick> if you then double or triple in size, then it's time to worry
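
The "hundred or so per OSD" guideline translates into something like this when creating a pool (the numbers are purely illustrative):

    # rough rule of thumb: pg_num ≈ (number of OSDs * 100) / replicas, rounded up to a power of two
    # e.g. 12 OSDs, 2 replicas  ->  600  ->  1024
    ceph osd pool create vms 1024 1024
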
[2:30] <dmick> but I have to run.
[2:30] <dmick> gl with all this
[2:30] * dmick is now known as dmick_away
[2:31] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[2:31] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) Quit (Quit: Leaving.)
[2:32] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) has joined #ceph
[2:38] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[2:39] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[2:44] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) Quit (Quit: Leaving.)
[2:46] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[2:50] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[2:59] * ScOut3R (~ScOut3R@540079A1.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[3:00] * Cube (~Cube@12.248.40.138) Quit (Quit: Leaving.)
[3:05] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:13] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[3:13] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[3:16] * danieagle (~Daniel@177.97.251.41) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[3:21] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[3:29] * chutzpah (~chutz@199.21.234.7) Quit (Quit: Leaving)
[3:45] * yoshi_ (~yoshi@lg-corp.netflix.com) has joined #ceph
[3:45] * yoshi (~yoshi@lg-corp.netflix.com) Quit (Read error: Connection reset by peer)
[3:51] * Cube (~Cube@66-87-66-206.pools.spcsdns.net) has joined #ceph
[3:53] * yoshi_ (~yoshi@lg-corp.netflix.com) Quit (Ping timeout: 480 seconds)
[3:53] * Cube (~Cube@66-87-66-206.pools.spcsdns.net) Quit ()
[4:00] * LeaChim (~LeaChim@b0faa140.bb.sky.com) Quit (Ping timeout: 480 seconds)
[4:04] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[4:08] * jlogan2 (~Thunderbi@72.5.59.176) Quit (Read error: Connection reset by peer)
[4:08] * jlogan1 (~Thunderbi@72.5.59.176) has joined #ceph
[4:09] * Cube (~Cube@66-87-66-206.pools.spcsdns.net) has joined #ceph
[4:10] * Cube1 (~Cube@66-87-66-206.pools.spcsdns.net) has joined #ceph
[4:10] * Cube (~Cube@66-87-66-206.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[4:16] * Cube (~Cube@66-87-66-206.pools.spcsdns.net) has joined #ceph
[4:16] * Cube1 (~Cube@66-87-66-206.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[4:17] * Cube1 (~Cube@66-87-66-206.pools.spcsdns.net) has joined #ceph
[4:17] * Cube (~Cube@66-87-66-206.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[4:26] * Cube1 (~Cube@66-87-66-206.pools.spcsdns.net) Quit (Ping timeout: 480 seconds)
[4:29] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[4:32] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit ()
[4:34] * noob2 (~noob2@ext.cscinfo.com) Quit (Quit: Leaving.)
[4:50] * phillipp (~phil@p5B3AFC73.dip.t-dialin.net) has joined #ceph
[4:56] * phillipp1 (~phil@p5B3AF5AB.dip.t-dialin.net) Quit (Ping timeout: 480 seconds)
[5:23] * gregaf1 (~Adium@2607:f298:a:607:d1d0:87ef:2023:ecab) Quit (Quit: Leaving.)
[5:26] * Q310 (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) Quit (Read error: Connection reset by peer)
[5:26] * Qten (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[5:28] * lurbs_ (user@uber.geek.nz) has joined #ceph
[5:28] * lurbs (user@uber.geek.nz) Quit (Read error: Connection reset by peer)
[5:33] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[5:40] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[6:06] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.89 [Firefox 18.0.2/20130201065344])
[6:15] * jlogan1 (~Thunderbi@72.5.59.176) Quit (Ping timeout: 480 seconds)
[6:22] * nhm_ (~nh@184-97-130-55.mpls.qwest.net) has joined #ceph
[6:27] * nhm (~nh@184-97-251-146.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[6:38] * jlogan1 (~Thunderbi@2600:c00:3010:1:5946:68bd:a804:3cb2) has joined #ceph
[7:11] * The_Bishop (~bishop@2001:470:50b6:0:edfe:497e:7390:6a2b) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[7:40] * sage (~sage@76.89.177.113) Quit (Ping timeout: 480 seconds)
[7:50] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[7:51] * sage (~sage@76.89.177.113) has joined #ceph
[8:02] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[8:20] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[8:44] * jlogan1 (~Thunderbi@2600:c00:3010:1:5946:68bd:a804:3cb2) Quit (Ping timeout: 480 seconds)
[9:28] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:28] * dignus (~dignus@bastion.jkit.nl) has joined #ceph
[9:32] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) Quit (Quit: It's a dud! It's a dud! It's a du...)
[10:00] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Remote host closed the connection)
[10:05] * wer (~wer@wer.youfarted.net) Quit (Remote host closed the connection)
[10:07] * ScOut3R (~scout3r@2E6B53AC.dsl.pool.telekom.hu) has joined #ceph
[10:28] * topro (~topro@host-62-245-142-50.customer.m-online.net) Quit (Quit: Konversation terminated!)
[10:30] * ScOut3R (~scout3r@2E6B53AC.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[10:35] * lightspeed (~lightspee@81.187.0.153) Quit (Ping timeout: 480 seconds)
[10:37] * KindOne (KindOne@h185.237.22.98.dynamic.ip.windstream.net) Quit (Quit: //TODO: Add Quit Message.)
[10:39] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[10:43] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[10:44] * topro (~topro@host-62-245-142-50.customer.m-online.net) has joined #ceph
[10:59] * KindOne (KindOne@h185.237.22.98.dynamic.ip.windstream.net) has joined #ceph
[11:04] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[11:06] * sleinen1 (~Adium@2001:620:0:25:e1bc:4a58:f22a:9b61) has joined #ceph
[11:12] * loicd (~loic@magenta.dachary.org) has joined #ceph
[11:12] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[11:20] * sleinen (~Adium@2001:620:0:25:60eb:68c0:23bd:b758) has joined #ceph
[11:27] * sleinen1 (~Adium@2001:620:0:25:e1bc:4a58:f22a:9b61) Quit (Ping timeout: 480 seconds)
[11:43] * leseb (~leseb@bea13-1-82-228-104-16.fbx.proxad.net) has joined #ceph
[12:03] * leseb (~leseb@bea13-1-82-228-104-16.fbx.proxad.net) Quit (Remote host closed the connection)
[12:14] * calebamiles (~caleb@c-107-3-1-145.hsd1.vt.comcast.net) Quit (Ping timeout: 480 seconds)
[12:28] * sleinen (~Adium@2001:620:0:25:60eb:68c0:23bd:b758) Quit (Quit: Leaving.)
[12:50] <Kioob> how can I verify where performance problems come from? I can't achieve more than 40MB/s on sequential reads, with the data in the memory cache on the OSD
[12:50] <Kioob> (and a 10Gbps network)
[12:51] <Kioob> I'm benchmarking a rados block device, mapped through the kernel module
[12:51] <Kioob> like this: dd if=/dev/rbd/hdd3copies/courier-hdd of=/dev/null bs=8M count=1024
[13:00] * LeaChim (~LeaChim@b0faa140.bb.sky.com) has joined #ceph
[13:06] <Kioob> how... readahead
[13:07] * leseb (~leseb@bea13-1-82-228-104-16.fbx.proxad.net) has joined #ceph
[13:07] <Kioob> after some tests, on a Xen Dom0 I had 135MB/s, and 67MB/s from a Xen DomU
[13:09] <Kioob> but if I change the readahead from 128KB to 4MB in the DomU, the read throughput jumps to 200MB/s
[13:10] * loicd (~loic@magenta.dachary.org) Quit (Ping timeout: 480 seconds)
[13:27] <Kioob> so, with readahead at 128KB (the default) I get 65MB/s, with 256KB I get 140MB/s, and with 512KB and greater I get between 180 and 210 MB/s
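
The readahead values quoted here are set on the block device; a hedged example with blockdev, using the device path from the dd test above (blockdev expresses readahead in 512-byte sectors):

    blockdev --getra /dev/rbd/hdd3copies/courier-hdd        # current readahead, in 512-byte sectors
    blockdev --setra 1024 /dev/rbd/hdd3copies/courier-hdd   # 1024 sectors = 512 KB
    blockdev --setra 8192 /dev/rbd/hdd3copies/courier-hdd   # 8192 sectors = 4 MB
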
[13:27] <Kioob> I like that !
[13:36] * lightspeed (~lightspee@81.187.0.153) has joined #ceph
[13:38] * leseb (~leseb@bea13-1-82-228-104-16.fbx.proxad.net) Quit (Remote host closed the connection)
[14:23] * loicd (~loic@ram94-1-81-57-198-59.fbx.proxad.net) has joined #ceph
[14:30] * Kioob (~kioob@luuna.daevel.fr) Quit (Quit: Leaving.)
[14:43] * jdarcy (~quassel@66.187.233.206) Quit (Ping timeout: 480 seconds)
[14:50] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[14:56] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[14:57] * Kioob (~kioob@luuna.daevel.fr) has joined #ceph
[15:01] <madkiss> could somebody refresh my memory with regard to deep mounts?
[15:01] <madkiss> do they work? are they expected to?
[15:02] * ScOut3R (~scout3r@2E6B53AC.dsl.pool.telekom.hu) has joined #ceph
[15:02] * BillK (~BillK@124-169-193-2.dyn.iinet.net.au) Quit (Quit: Leaving)
[15:07] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[15:09] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[15:24] <Kioob> So... just by tuning readahead, in a Xen dom0 I jumped to 345MB/s (data in memory on the OSDs) or 180MB/s (data not in memory on the OSDs)... because each 4MB chunk uses a different OSD
[15:25] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[15:25] <nhm_> Kioob: yes, I was just talking to Xiaoxi about readahead. It definitely provides a big boost.
[15:25] <Kioob> yes :)
[15:25] <nhm_> Kioob: wanted to mention that before you left earlier.
[15:25] <nhm_> Kioob: he went from about 1.2GB/s aggregate to around 3GB/s aggregate, I believe.
[15:26] <nhm_> just by going from 128k to 512k readahead.
[15:26] * sleinen1 (~Adium@2001:620:0:25:2c04:cba2:ea98:7f2e) has joined #ceph
[15:26] <Kioob> 3GB/s... wow
[15:27] <Kioob> but.... 3GB/s... with what sort of network ??
[15:27] <Kioob> Infiniband ?
[15:27] <nhm_> Kioob: 10GbE
[15:27] <nhm_> Kioob: He has I think 6 nodes
[15:27] <Kioob> oh, 3Gb/s then ?
[15:27] <Kioob> bits, not bytes ?
[15:28] <nhm_> Kioob: no, bytes, but that was aggregate reads across all his VMs
[15:28] <Kioob> oh ok
[15:28] <Kioob> great !
[15:28] <nhm_> Kioob: I can do 2.7GB/s writes from localhost on our test node (1 node) now. :)
[15:28] <Kioob> so yes, I was thinking to keep 512k per VM too
[15:29] <nhm_> Kioob: I've got bonded 10GbE to it and that maxes out at around 2GB/s.
[15:29] <Kioob> I like that !
[15:30] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[15:31] <Kioob> I suppose that one of my main problems is that Xen doesn't report the block device information to the DomU
[15:31] <Kioob> optimal_io_size for example stays at 0
[15:33] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[15:35] * noob2 (~noob2@pool-71-244-111-36.phlapa.fios.verizon.net) has joined #ceph
[15:35] * noob2 (~noob2@pool-71-244-111-36.phlapa.fios.verizon.net) has left #ceph
[15:39] <Kioob> for now I still have very poor write throughput (30MB/s), I have to fix that
[15:40] * ScOut3R (~scout3r@2E6B53AC.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[15:46] <Kioob> oh, in kernel 3.7.6: « xfs: Fix possible use-after-free with AIO »
[15:47] <Kioob> and « xfs: fix periodic log flushing », about metadata loss
[15:48] <Kioob> I need to upgrade all my OSDs :/
[15:49] * sleinen1 (~Adium@2001:620:0:25:2c04:cba2:ea98:7f2e) Quit (Quit: Leaving.)
[15:52] * danieagle (~Daniel@177.97.251.41) has joined #ceph
[16:03] * xiaoxi (~xiaoxiche@134.134.139.72) has joined #ceph
[16:07] * Kioob (~kioob@luuna.daevel.fr) Quit (Remote host closed the connection)
[16:08] * Kioob (~kioob@luuna.daevel.fr) has joined #ceph
[16:09] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[16:09] * xiaoxi (~xiaoxiche@134.134.139.72) Quit (Remote host closed the connection)
[16:16] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[16:20] * sleinen1 (~Adium@2001:620:0:26:3cbf:d4ff:d9b2:fb41) has joined #ceph
[16:27] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[16:39] * calebamiles (~caleb@c-107-3-1-145.hsd1.vt.comcast.net) has joined #ceph
[16:41] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[16:51] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[17:05] * sleinen1 (~Adium@2001:620:0:26:3cbf:d4ff:d9b2:fb41) Quit (Quit: Leaving.)
[17:05] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[17:05] * sleinen1 (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[17:05] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Read error: Connection reset by peer)
[17:06] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[17:07] * sleinen (~Adium@2001:620:0:26:1182:9bf6:5f6:bd4c) has joined #ceph
[17:08] * mdxi_ (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Quit: leaving)
[17:09] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) has joined #ceph
[17:13] * sleinen1 (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[17:14] * xiaoxi (~xiaoxiche@134.134.139.74) has joined #ceph
[17:14] <xiaoxi> Hi everyone~
[17:14] <xiaoxi> It's Chinese New Year today~ Happy New Year~
[17:15] * sleinen (~Adium@2001:620:0:26:1182:9bf6:5f6:bd4c) Quit (Quit: Leaving.)
[17:15] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[17:16] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[17:16] * xiaoxi (~xiaoxiche@134.134.139.74) Quit ()
[17:18] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[17:18] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit ()
[17:20] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[17:21] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[17:22] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:23] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[17:32] <loicd> happy new year :-D
[17:32] <loicd> Anyone willing to review https://github.com/ceph/ceph/pull/41/files ?
[17:37] <Kioob> nhm_: I confirm that my write performance problem also came from Xen
[17:37] <Kioob> I achieve between 120 and 220 MB/s out of Xen, and 40MB/s inside Xen
[17:39] <Kioob> (on the same RBD device)
[17:40] * yoshi (~yoshi@96.24.74.71) has joined #ceph
[17:44] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[17:50] <Kioob> I suppose it's because of max_segments and max_segments_size
[17:50] * scalability-junk (~stp@188-193-201-35-dynip.superkabel.de) has joined #ceph
[17:50] <Kioob> in Xen I have a maximum of 44KB of data sent... versus 512MB outside of Xen
[17:51] <Kioob> I thought this Xen bug had been fixed...
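
The request-size limits being suspected here can be read from sysfs inside the DomU; the device name xvda is only an example:

    cat /sys/block/xvda/queue/max_segments
    cat /sys/block/xvda/queue/max_segment_size
    cat /sys/block/xvda/queue/max_sectors_kb
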
[17:55] * yoshi (~yoshi@96.24.74.71) Quit (Remote host closed the connection)
[18:00] * loicd (~loic@ram94-1-81-57-198-59.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[18:01] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) Quit (Remote host closed the connection)
[18:02] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) has joined #ceph
[18:07] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[18:08] * nz_monkey (~nz_monkey@222.47.255.123.static.snap.net.nz) Quit (Ping timeout: 480 seconds)
[18:09] * miroslav (~miroslav@pool-98-114-229-250.phlapa.fios.verizon.net) has left #ceph
[18:38] * leseb (~leseb@90.84.144.82) has joined #ceph
[18:40] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[18:43] * yoshi_ (~yoshi@168.sub-70-199-81.myvzw.com) has joined #ceph
[18:50] * yoshi (~yoshi@96.24.74.71) has joined #ceph
[18:57] * leseb_ (~leseb@90.84.146.233) has joined #ceph
[18:57] * yoshi_ (~yoshi@168.sub-70-199-81.myvzw.com) Quit (Ping timeout: 480 seconds)
[18:58] * nz_monkey (~nz_monkey@222.47.255.123.static.snap.net.nz) has joined #ceph
[19:00] * leseb (~leseb@90.84.144.82) Quit (Ping timeout: 480 seconds)
[19:00] * jtang1 (~jtang@79.97.135.214) Quit (Quit: Leaving.)
[19:01] * yoshi (~yoshi@96.24.74.71) Quit (Remote host closed the connection)
[19:09] * terje_ (~joey@97-118-121-147.hlrn.qwest.net) has joined #ceph
[19:10] * terje (~joey@71-218-6-247.hlrn.qwest.net) Quit (Ping timeout: 480 seconds)
[19:14] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[19:15] <madkiss> is there any actual trick to make MySQL on CephFS perform?
[19:15] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[19:15] <madkiss> I can literally see my script insert the lines into the database
[19:34] * leseb_ (~leseb@90.84.146.233) Quit (Ping timeout: 480 seconds)
[19:37] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[19:39] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[20:00] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[20:03] <Kioob> madkiss: I'm not sure having MySQL on a network FS is a good idea. You should maybe try RBD, no ?
[20:12] <madkiss> Kioob: I do not plan to use this in production anyway, I am just building a PoC and I was too lazy to set up Galera.
[20:13] <madkiss> I was just wondering about the fact that it is soooo extraordinarily slow
[20:13] <Kioob> ok. I think MySQL uses DirectIO
[20:13] <madkiss> I did an OpenStack-Installation that has nova-compute controlled by pacemaker and can thus do automatic VM recovery by now
[20:14] <madkiss> and I wanted to avoid DRBD at all costs, because this PoC is a 3-node PoC :)
[20:14] <Kioob> well... I use RBD for VM, not CephFS
[20:15] <madkiss> I was using CephFS for the shared stuff from MySQL/RabbitMQ and for my OpenStack /var/lib/nova/instances folder
[20:15] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[20:16] <madkiss> I am sort of abusing Ceph as standard shared storage, if you want to put it like that; NFS would probably do just as well. I just thought "Why use another solution if we have a superior one running already?"
[20:17] <Kioob> yes, but if you have a running Ceph cluster, you can easily use RBD too
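
A minimal sketch of what Kioob is suggesting, i.e. putting the MySQL data directory on a kernel-mapped RBD image instead of CephFS (the image name, size, filesystem and device path are examples; the mapping may also appear as /dev/rbd0):

    modprobe rbd
    rbd create --size 20480 mysql-data            # 20 GB image in the default 'rbd' pool
    rbd map mysql-data                            # kernel module exposes it, e.g. as /dev/rbd/rbd/mysql-data
    mkfs.xfs /dev/rbd/rbd/mysql-data
    mount /dev/rbd/rbd/mysql-data /var/lib/mysql
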
[20:17] <madkiss> ya. i could boot my VMs from volumes, that's correct.
[20:25] * scalability-junk (~stp@188-193-201-35-dynip.superkabel.de) Quit (Quit: Leaving)
[20:36] * leseb (~leseb@90.84.144.120) has joined #ceph
[20:37] <ShaunR> [root@storage1 ~]# rbd list
[20:37] <ShaunR> 2013-02-09 03:43:10.555815 7f60567c0700 0 cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
[20:37] <ShaunR> 2013-02-09 03:43:10.555826 7f60567c0700 0 -- 208.82.119.132:0/1013336 >> 208.82.119.132:6801/9052 pipe(0x17582b0 sd=4 :49178 s=1 pgs=0 cs=0 l=1).failed verifying authorize reply
[20:37] <ShaunR> any ideas, worked last night, woke up this morning with this error
[20:42] <ShaunR> n/m, somehow the time on this server got all out of whack
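
cephx authentication is sensitive to clock skew, which is why a wrong clock produces the decrypt error pasted above; a hedged fix is simply to resync time on the node (the NTP server name is an example):

    ntpq -p                      # check whether ntpd is actually syncing
    ntpdate -u pool.ntp.org      # or force a one-off resync, then keep ntpd running
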
[20:44] * leseb (~leseb@90.84.144.120) Quit (Read error: Connection reset by peer)
[20:44] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[21:08] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:09] * loicd (~loic@magenta.dachary.org) Quit ()
[21:21] * ScOut3R (~scout3r@2E6B53AC.dsl.pool.telekom.hu) has joined #ceph
[21:26] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:29] * loicd (~loic@magenta.dachary.org) Quit ()
[21:31] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:34] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Ping timeout: 480 seconds)
[21:35] * loicd (~loic@magenta.dachary.org) Quit ()
[21:44] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:44] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[21:45] * loicd (~loic@magenta.dachary.org) Quit ()
[21:55] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:56] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[21:57] * loicd (~loic@magenta.dachary.org) Quit ()
[22:03] * danieagle (~Daniel@177.97.251.41) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[22:09] * Kioob (~kioob@luuna.daevel.fr) Quit (Remote host closed the connection)
[22:12] * ScOut3R (~scout3r@2E6B53AC.dsl.pool.telekom.hu) Quit (Remote host closed the connection)
[22:22] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[22:40] * asadpanda (~asadpanda@67.231.236.80) Quit (Quit: ZNC - http://znc.in)
[22:43] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) has joined #ceph
[22:47] * ScOut3R (~ScOut3R@2E6B53AC.dsl.pool.telekom.hu) has joined #ceph
[22:48] <CrashHD> hello
[22:48] <CrashHD> how goes it
[22:50] * Vjarjadian (~IceChat77@5ad6d005.bb.sky.com) has joined #ceph
[22:54] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:56] * loicd (~loic@magenta.dachary.org) Quit ()
[22:57] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:58] * loicd (~loic@magenta.dachary.org) Quit ()
[22:58] <CrashHD> what kind of latencies should you expect with ceph?
[22:58] <CrashHD> put it up against Isilon? NetApp? etc?
[23:00] <Robe> reads hit one server, writes are two servers deep
[23:02] <Robe> see http://ceph.com/papers/weil-rados-pdsw07.pdf
[23:02] <CrashHD> I'll take a look
[23:02] <CrashHD> thanks
[23:04] <CrashHD> any support for things like read caching to ssd?
[23:04] <CrashHD> or just ssd write journalling?
[23:05] <Robe> for the block device layer there's read caching on the client side
[23:05] <Robe> other than that it's up to the storage implementation on the OSD side
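
The client-side read caching Robe mentions for the block device layer is the RBD cache used by librbd clients such as qemu (it does not apply to the kernel rbd mapping); a small ceph.conf sketch, with the size shown only as an example:

    [client]
        rbd cache = true
        rbd cache size = 33554432     # bytes; per-image cache on the librbd client
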
[23:05] <CrashHD> interesting
[23:06] <Vjarjadian> and with Ceph you can tailor the hardware to your needs...
[23:06] <CrashHD> I don't believe any of the standard filesystems I've seen tested with ceph support block caching
[23:06] <CrashHD> xfs, ext and btrfs
[23:06] <CrashHD> am I wrong on that?
[23:06] <Robe> bcache/flashcache is available on linux
[23:07] <CrashHD> very nice
[23:08] <CrashHD> much traction for ceph against vmware?
[23:08] <CrashHD> s/against/with
[23:08] <Robe> more with openstack
[23:08] <Robe> not too sure about vmware
[23:09] <Vjarjadian> Proxmox has a page on using ceph with their hypervisor
[23:20] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[23:20] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) has joined #ceph
[23:27] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[23:28] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[23:29] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) has joined #ceph
[23:30] <ShaunR> wahoo.... got my 2-server ceph cluster up, KVM server vm1 using rbd!
[23:32] * aliguori (~anthony@cpe-70-112-157-151.austin.res.rr.com) Quit (Remote host closed the connection)
[23:41] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[23:41] * CrashHD (CrashHD@c-24-10-14-95.hsd1.ca.comcast.net) has joined #ceph
[23:42] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: This computer has gone to sleep)
[23:48] * sagewk (~sage@38.122.20.226) Quit (Ping timeout: 480 seconds)
[23:48] * dmick_away (~dmick@2607:f298:a:607:1a03:73ff:fedd:c856) Quit (Ping timeout: 480 seconds)
[23:50] * sjust (~sam@38.122.20.226) Quit (Ping timeout: 480 seconds)
[23:50] * yehudasa (~yehudasa@2607:f298:a:607:91cf:597a:4e62:5836) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.