#ceph IRC Log

IRC Log for 2012-09-25

Timestamps are in GMT/BST.

[0:03] * calebamiles (~caleb@pool-71-241-142-116.burl.east.myfairpoint.net) has joined #ceph
[0:06] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[0:09] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[0:14] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[0:16] <elder> sagelap1, I have only just begun really looking at this but I think the problem reported by Christian Huang on the ML *might* be the same kunmap() issue just fixed.
[0:16] <elder> The suspicion is based on looking at the disassembled kernel object file (taken from my own machine's /lib/modules directory!) and a small pattern of instructions nearby that may correlate to kmap() calls.
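(A note on the mechanics elder describes: pulling the module out of /lib/modules and disassembling it can be done with objdump. The module paths below are assumptions that depend on the running kernel; this is only a sketch of the procedure, not what elder actually ran.)

    # disassemble the rbd module shipped with the running kernel (path is an assumption)
    objdump -d /lib/modules/$(uname -r)/kernel/drivers/block/rbd.ko | less
    # the messenger code lives in libceph.ko
    objdump -d /lib/modules/$(uname -r)/kernel/net/ceph/libceph.ko | less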
[0:17] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[0:18] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has joined #ceph
[0:27] * fengh (~avati@c-98-207-206-65.hsd1.ca.comcast.net) has joined #ceph
[0:29] * sagelap1 (~sage@38.122.20.226) Quit (Ping timeout: 480 seconds)
[0:30] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) has joined #ceph
[0:32] * ajm (~ajm@adam.gs) has left #ceph
[0:40] * sagelap (~sage@2607:f298:a:607:c685:8ff:fe59:d486) has joined #ceph
[0:54] * allsystemsarego (~allsystem@188.27.164.159) Quit (Quit: Leaving)
[0:58] * fengh (~avati@c-98-207-206-65.hsd1.ca.comcast.net) Quit (Quit: leaving)
[0:59] * maelfius (~mdrnstm@adsl-99-16-51-31.dsl.lsan03.sbcglobal.net) has joined #ceph
[1:06] * slang (~slang@2607:f298:a:607:5cbf:67fd:ead2:6f0f) Quit (Ping timeout: 480 seconds)
[1:06] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[1:06] * calebamiles (~caleb@pool-71-241-142-116.burl.east.myfairpoint.net) Quit (Ping timeout: 480 seconds)
[1:06] * slang (~slang@38.122.20.226) has joined #ceph
[1:09] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[1:12] * BManojlovic (~steki@195.13.166.253) Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:16] <sagewk> elder: can have him try a kernel with that patch applied, i guess...
[1:17] <sagewk> elder: a bit odd that it would reliably trigger from a reconnect, though. :/
[1:17] * calebamiles (~caleb@pool-71-161-214-234.burl.east.myfairpoint.net) has joined #ceph
[1:17] <dmick> man, they'll let anyone in here
[1:20] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) has left #ceph
[1:21] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) has joined #ceph
[1:25] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[1:25] <sagewk> spamaps: where do i look to get a status update on the freeze exception?
[1:26] <SpamapS> sagewk: it was just accepted 2 hours ago actually
[1:27] <SpamapS> sagewk: https://launchpad.net/ubuntu/+source/ceph shows 0.48.2
[1:28] <SpamapS> sagewk: built successfully on all but arm
[1:31] <sagewk> spamaps: sweet. are those builds in progress, or failures?
[1:32] <SpamapS> sagewk: in progress
[1:32] <SpamapS> sagewk: armhf is almost done
[1:33] <SpamapS> sagewk: https://launchpad.net/ubuntu/+source/ceph/0.48.2-0ubuntu1
[1:33] <sagewk> spamaps: yay, thanks!
[1:35] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:43] * chutzpah (~chutz@100.42.98.5) has joined #ceph
[1:53] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:55] * loicd (~loic@magenta.dachary.org) has joined #ceph
[2:02] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) Quit (Remote host closed the connection)
[2:10] * slang (~slang@38.122.20.226) Quit (Ping timeout: 480 seconds)
[2:10] * maelfius (~mdrnstm@adsl-99-16-51-31.dsl.lsan03.sbcglobal.net) Quit (Quit: Leaving.)
[2:15] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: zzzzzzzzzzzzzzzzzzzz)
[2:19] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has left #ceph
[2:22] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has joined #ceph
[2:44] * cblack101 (c0373624@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[2:54] * Tv_ (~tv@2607:f298:a:607:912:1bb:3a6b:cca2) Quit (Read error: Operation timed out)
[3:00] * sagelap (~sage@2607:f298:a:607:c685:8ff:fe59:d486) Quit (Ping timeout: 480 seconds)
[3:00] * sagelap (~sage@155.sub-70-197-139.myvzw.com) has joined #ceph
[3:01] * sagelap (~sage@155.sub-70-197-139.myvzw.com) Quit ()
[3:02] * sagelap (~sage@155.sub-70-197-139.myvzw.com) has joined #ceph
[3:10] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[3:13] <sagelap> joshd: wip-rbd-coverity looks good!
[3:14] <joshd> that was fast
[3:23] <dmick> convinced sagelap is reviewing code at red lights
[3:28] * calebamiles (~caleb@pool-71-161-214-234.burl.east.myfairpoint.net) Quit (Quit: Leaving.)
[3:40] <sagelap> joshd: the rados list from precise also appears to be broken.. but i can't think of anything that uses it...
[3:40] <sagelap> except 'rados ls'
[3:42] <joshd> sagelap: yeah, I don't think that's used anywhere else
[3:49] * lofejndif (~lsqavnbok@04ZAAFMJR.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[3:50] * sagelap (~sage@155.sub-70-197-139.myvzw.com) Quit (Ping timeout: 480 seconds)
[3:53] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (Quit: Leaving.)
[4:11] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[4:33] * chutzpah (~chutz@100.42.98.5) Quit (Quit: Leaving)
[4:41] * Ryan_Lane (~Adium@216.38.130.162) Quit (Quit: Leaving.)
[4:54] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[4:58] * glowell (~Adium@38.122.20.226) Quit (Quit: Leaving.)
[5:06] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[5:06] <elder> sagewk, the reason I had a hunch it could be related to the kunmap() thing is that there was a distinctive pattern of a couple of instructions just ahead of the crash, involving PAGE_OFFSET (0xffff880000000000).
[5:18] <elder> Looking at the commits between 3.5.4 and 3.6-rc7, I see this one: (43643528) rbd: Clear ceph_msg->bio_iter for retransmitted message
[5:19] <elder> So this could be just hitting that bug. We really need to provide our bugfixes to the -stable branch...
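(For reference, the kind of scan elder describes can be done against a linux-stable checkout, where both tags exist; the paths are the usual locations of the rbd, libceph and cephfs code.)

    # list ceph-related commits between the two kernels (linux-stable checkout assumed)
    git log --oneline v3.5.4..v3.6-rc7 -- drivers/block/rbd.c net/ceph/ fs/ceph/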
[5:33] * dmick (~dmick@2607:f298:a:607:b46e:6310:b0cc:f34) Quit (Quit: Leaving.)
[5:38] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[6:01] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has left #ceph
[6:07] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[6:07] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[6:09] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) Quit (Ping timeout: 480 seconds)
[6:23] * MikeMcClurg1 (~mike@93-137-178-252.adsl.net.t-com.hr) has joined #ceph
[6:23] * MikeMcClurg (~mike@93-137-178-252.adsl.net.t-com.hr) Quit (Read error: Connection reset by peer)
[6:34] * glowell (~Adium@68.170.71.123) has joined #ceph
[6:35] * alexxy (~alexxy@79.173.81.171) Quit (Read error: No route to host)
[6:36] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[7:06] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Read error: Operation timed out)
[7:10] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[7:14] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) has joined #ceph
[7:32] * MikeMcClurg1 (~mike@93-137-178-252.adsl.net.t-com.hr) Quit (Quit: Leaving.)
[7:32] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[7:41] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[7:43] * loicd (~loic@82.235.173.177) has joined #ceph
[7:53] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[8:14] * loicd (~loic@82.235.173.177) Quit (Quit: Leaving.)
[8:19] * gaveen (~gaveen@112.135.153.236) has joined #ceph
[8:23] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Read error: Connection reset by peer)
[8:23] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[8:30] * mistur (~yoann@kewl.mistur.org) has joined #ceph
[8:41] * MikeMcClurg (~mike@93-137-178-252.adsl.net.t-com.hr) has joined #ceph
[8:48] * gaveen (~gaveen@112.135.153.236) Quit (Quit: Leaving)
[8:49] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[9:02] * gaveen (~gaveen@112.135.153.236) has joined #ceph
[9:07] * loicd (~loic@194.201-14-84.ripe.coltfrance.com) has joined #ceph
[9:07] * MikeMcClurg (~mike@93-137-178-252.adsl.net.t-com.hr) Quit (Quit: Leaving.)
[9:14] * loicd1 (~loic@90.84.144.220) has joined #ceph
[9:17] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has joined #ceph
[9:17] * loicd (~loic@194.201-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[9:24] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:32] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[9:34] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has left #ceph
[9:34] * BManojlovic (~steki@87.110.183.173) has joined #ceph
[9:50] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[10:10] * jjgalvez (~jjgalvez@32.173.23.137) has joined #ceph
[10:11] * jjgalvez (~jjgalvez@32.173.23.137) has left #ceph
[10:12] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:03] * Cube (~Adium@12.248.40.138) Quit (Quit: Leaving.)
[11:20] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[11:35] * loicd1 is now known as loicd
[11:41] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[11:49] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[11:50] * mjosu001 (~mosu001@en-439-0331-001.esc.auckland.ac.nz) has joined #ceph
[11:53] <mjosu001> Hi everyone, I'm having trouble getting my ceph system back up and running after a major power failure at my university's data centre
[11:54] <mjosu001> I have two servers with MDS/MON on each of them
[11:54] <mjosu001> and two servers with 6 OSDs each on them
[11:55] <mjosu001> I was replicating between the two storage servers, so there should be one replica on OSDs 0-5 and one on OSDs 6-11
[11:55] <mjosu001> After the power cut I had to get the OSDs back using mkfs.btrfs and re-mount them
[11:55] <mjosu001> All the pgs got stuck stale, but by using ceph pg create_pg_force I got the pgs back
[11:56] <mjosu001> However I can't get both MDS processes running at once (this wasn't a problem previously)
[11:57] <mjosu001> After a little bit (1 minute or so) one of the ceph-mds processes will crash and ceph will say that the MDS is laggy or unresponsive
[11:57] <mjosu001> I can restart ceph-mds, but it just goes down again after about a minute
[11:57] <mjosu001> I have noticed one of my OSDs has failed (each OSD is on a disk and the disk has failed), but I thought the MDSs should keep running anyway?
[11:57] <mjosu001> Any ideas how I can troubleshoot this?
[11:58] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[12:00] <mjosu001> Maybe I should mkcephfs again, but I would prefer not to lose the data if there is a better way...?
[12:03] <andreask> hmm ... you ran mkfs.btrfs? .... hopefully only on one server?
[12:03] * EmilienM (~EmilienM@55.67.197.77.rev.sfr.net) has left #ceph
[12:04] <mjosu001> No all the OSDs had bad superblocks...
[12:04] <andreask> you wiped all data on all osds?
[12:05] <mjosu001> That could be, it is an experimental system, so not a big deal
[12:06] <andreask> I see
[12:06] <mjosu001> I've had problems getting OSDs up and running before (bad superblock I think)
[12:06] <mjosu001> and Sage suggested mkfs.btrfs, then mount, then start ceph-osd
[12:07] <mjosu001> This seems to fix the problem and the data is still there...
[12:07] <andreask> as long as there is one replica of each pg on that reformatted osd, yes
[12:08] <andreask> you should use xfs
[12:08] <mjosu001> Oh, is that the preferred FS now?
[12:09] <mjosu001> Even if the data is gone, one of the MDSs still crashes so I can't mount to see what is in my system...
[12:10] <mjosu001> I mean mount the cephfs
[12:10] <wido> It's not that XFS is preferred; btrfs still has the best features, but it's not as robust as XFS is
[12:10] <wido> mjosu001: What does ceph -s say? Is the cluster degraded?
[12:11] <mjosu001> health HEALTH_WARN mds 1 is laggy
[12:11] <mjosu001> monmap e1: 2 mons at {0=10.19.99.123:6789/0,1=10.19.99.124:6789/0}, election epoch 2, quorum 0,1 0,1
[12:11] <mjosu001> osdmap e1041: 12 osds: 11 up, 11 in
[12:11] <mjosu001> pgmap v30406: 2304 pgs: 2304 active+clean; 0 bytes data, 22101 MB used, 19632 GB / 20478 GB avail
[12:11] <mjosu001> mdsmap e17002: 1/1/1 up {0=1=up:replay(laggy or crashed)}
[12:11] <mjosu001> When I restart mds1 it says HEALTH_OK for about a minute then the mds crashes again...
[12:12] <mjosu001> There are only 11 OSDs up as one disk has failed due to the power outage I think
[12:12] <wido> There is currently no data on the cluster, 0 bytes used. Your mkfs.btrfs wiped all the data
[12:12] <wido> however, the MDS should create a new filesystem and still run happily
[12:12] <mjosu001> OK, that is not a problem, I'm only testing performance and have copies of the data elsewhere
[12:12] <wido> Did you try and up the logging for the MDS and see what comes out?
[12:13] <mjosu001> Yes, but I'm struggling to see what is happening
[12:13] <mjosu001> Hold on I'll get my ceph.conf
[12:14] <mjosu001> [global]
[12:14] <mjosu001> pid file = /var/run/ceph/$name.pid
[12:14] <mjosu001> logger dir = /cephlog
[12:14] <mjosu001> log dir = /cephlog
[12:14] <mjosu001> user = root
[12:14] <mjosu001> [mon]
[12:14] <mjosu001> mon data = /data/mon$id
[12:14] <mjosu001> ; debug ms = 1
[12:14] <mjosu001> ; debug mon = 20
[12:14] <mjosu001> ; debug paxos = 20
[12:14] <mjosu001> [mon.0]
[12:14] <mjosu001> host = ss3
[12:14] <mjosu001> mon addr = 10.19.99.123:6789
[12:14] <mjosu001> [mon.1]
[12:14] <mjosu001> host = ss4
[12:14] <mjosu001> mon addr = 10.19.99.124:6789
[12:14] <mjosu001> [mds]
[12:14] <mjosu001> debug ms = 1 ; message traffic
[12:14] <mjosu001> debug mds = 20 ; mds
[12:14] <mjosu001> ; debug mds balancer = 20 ; load balancing
[12:14] <mjosu001> ; debug mds log = 20 ; mds journaling
[12:14] <mjosu001> ; debug mds_migrator = 20 ; metadata migration
[12:14] <mjosu001> ; debug monc = 20 ; monitor interaction, startup
[12:14] <mjosu001> [mds.0]
[12:15] <mjosu001> host = ss3
[12:15] <mjosu001> [mds.1]
[12:15] <mjosu001> host = ss4
[12:15] <mjosu001> [osd]
[12:15] <mjosu001> ; osd journal = /data/osd$id/journal
[12:15] <mjosu001> osd journal size = 2000
[12:15] <mjosu001> filestore journal writeahead = true
[12:15] <mjosu001> ; osd data = /data/osd$id
[12:15] <mjosu001> ; debug ms = 1 ; message traffic
[12:15] <mjosu001> ; debug osd = 20
[12:15] <mjosu001> ; debug filestore = 20 ; local object storage
[12:15] <mjosu001> ; debug journal = 20 ; local journaling
[12:15] <mjosu001> ; debug monc = 20 ; monitor interaction, startup
[12:15] <mjosu001> [osd.0]
[12:15] <mjosu001> host = ss1
[12:15] <mjosu001> osd data = /data/osd.11
[12:15] <mjosu001> osd journal = /data/osd.11/journal
[12:15] <mjosu001> btrfs devs = /dev/sdb
[12:15] <mjosu001> The other OSDs are basically the same as osd.0
[12:16] <wido> Why are you setting the osd data directory again under osd.0?
[12:16] <wido> Same goes for journal
[12:16] <wido> they inherit that from the general osd section
[12:17] <mjosu001> That is a good question
[12:17] <mjosu001> I didn't set up this conf file, I have inherited the system from my postgrad student who went to work for Samsung!
[12:17] <mjosu001> I don't see any mds.1.log on ss4 in /cephlog ...
[12:18] <wido> A couple of other notes: user is not required under "global"
[12:18] <mjosu001> But ss3 has lots of log files...
[12:18] <mjosu001> OK, thanks
[12:18] <wido> logger dir doesn't exist as a directive
[12:19] <wido> btrfs devs will become deprecated in the near future, I recommend mounting all the drives in your fstab and removing the btrfs devs lines
[12:19] <mjosu001> Oh really? We started a couple of years ago, so things have changed and I probably haven't kept up...
[12:19] <wido> So a [osd.X] section only has to contain a "host" directive
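(Sketching out what wido is suggesting here, with made-up device names, mount points and hostnames: the data and journal paths are inherited from the [osd] section so only "host" remains per OSD, and the disk is mounted from fstab instead of via "btrfs devs".)

    # /etc/fstab -- mount each osd disk directly instead of relying on "btrfs devs"
    /dev/sdb   /data/osd.0   btrfs   defaults,noatime   0 0

    # ceph.conf -- shared settings under [osd], per-osd sections reduced to "host"
    [osd]
        osd data = /data/$name
        osd journal = /data/$name/journal
        osd journal size = 2000
    [osd.0]
        host = ss1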
[12:21] <wido> Now, your MDS: I'm not that familiar with the MDS. It is still under development and hasn't gotten that much attention lately
[12:21] <wido> most of the focus went to rados and rbd
[12:22] <mjosu001> OK, I basically learned Linux and Ceph via Google, so I will look into how to set up the fstab
[12:22] <mjosu001> Is CephFS still the best way to use ceph for a file store?
[12:22] <mjosu001> We have also played a little with OpenStack...
[12:23] <wido> If you used OpenStack you'll be using RBD, that is much more stable
[12:24] <mjosu001> So OpenStack's Swift links to RBD? Then we use cloudfuse to get an FS?
[12:24] <wido> No, not Swift, but the hypervisor, (Nova?) uses RBD
[12:24] <wido> there is some docs about this on ceph.com/docs/
[12:25] <mjosu001> Oh ok, I can look there, but Nova is for compute and running VMs (I think, just getting into this), so how does this work as a distributed FS?
[12:25] <mjosu001> Or should I just go look at ceph.com/docs...?
[12:26] <wido> You won't get a distributed FS, but your disk images will run from RBD
[12:27] <wido> I recommend you learn the terminology, RADOS, RBD and CephFS
[12:27] <wido> 3 different aspects from Ceph
[12:27] <wido> I got to go afk
[12:27] <mjosu001> Yeah, object store, block store and file store
[12:28] <mjosu001> Thanks wido
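(For what it's worth, the way a disk image ends up "running from RBD" under the hypervisor, as wido describes, is as a network disk in the guest definition; the pool, image name and monitor address below are placeholders, and the exact syntax should be checked against the libvirt/qemu docs.)

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='rbd' name='rbd/myvm-disk'>
        <host name='10.19.99.123' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>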
[12:30] <mjosu001> Can anyone else help me get my MDS and hence CephFS back up and running?
[12:42] * loicd (~loic@90.84.144.220) Quit (Quit: Leaving.)
[13:17] <mjosu001> OK, used mkcephfs and all seems to be working again (without any data retained)
[13:17] * mjosu001 (~mosu001@en-439-0331-001.esc.auckland.ac.nz) Quit (Quit: Leaving)
[13:28] * gaveen (~gaveen@112.135.153.236) Quit (Remote host closed the connection)
[13:38] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) has joined #ceph
[13:39] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) Quit ()
[13:39] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) has joined #ceph
[13:45] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:52] * loicd (~loic@178.20.50.225) has joined #ceph
[13:52] * gaveen (~gaveen@112.135.133.215) has joined #ceph
[14:49] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[14:53] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) has joined #ceph
[14:55] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[15:05] * Solver (~robert@atlas.opentrend.net) Quit (Remote host closed the connection)
[15:10] * Solver (~robert@atlas.opentrend.net) has joined #ceph
[15:35] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (Quit: leaving)
[15:35] * masterpe (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[15:37] * loicd (~loic@178.20.50.225) Quit (Ping timeout: 480 seconds)
[15:44] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[15:46] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[15:53] * cblack101 (86868b48@ircip3.mibbit.com) has joined #ceph
[15:58] * slang (~slang@38.122.20.226) has joined #ceph
[16:03] * loicd (~loic@90.84.146.238) has joined #ceph
[16:04] * loicd (~loic@90.84.146.238) Quit ()
[16:05] * loicd (~loic@90.84.146.238) has joined #ceph
[16:08] * BManojlovic (~steki@87.110.183.173) Quit (Quit: Ja odoh a vi sta 'ocete...)
[16:28] * andret (~andre@pcandre.nine.ch) Quit (Quit: Verlassend)
[16:29] * andret (~andre@pcandre.nine.ch) has joined #ceph
[16:47] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) Quit (Quit: Leaving)
[16:48] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has joined #ceph
[16:51] <scheuk> what is the best way to find ceph bottlenecks and performance tune using rbds for qcow2 images?
[16:52] <scheuk> We are seeing that random writes cause large I/O wait times within the VM itself
[16:52] <scheuk> We have 15 openstack compute nodes, and currently 6 storage nodes for ceph-osd
[16:54] <scheuk> with a total of 6 8-disk RAID 0 arrays (1 on each storage node), plus each storage node has a 2GB journal on a separate RAID 0 device
[16:59] * sagelap (~sage@76.89.177.113) has joined #ceph
[17:06] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[17:07] <elder> sagelap, there are several messenger and rbd bugs that have been fixed that are present in 3.5.4 and earlier. Many of the things you and I have fixed in recent months.
[17:07] <elder> I am going to try to update the stable code with anything that's been fixed since last time I looked. This time I'm going to get them to the -stable people.
[17:08] * loicd (~loic@90.84.146.238) Quit (Ping timeout: 480 seconds)
[17:08] * loicd (~loic@90.84.146.238) has joined #ceph
[17:08] <elder> I hope there's no pushback. It was a bit of a pain to apply a few of the fixes, because a few of them really were derived from heavily refactored code.
[17:08] <elder> So I had to add a few things that were not real bug fixes. (But finding the bugs wasn't really feasible until the refactoring occurred.)
[17:12] * loicd1 (~loic@90.84.146.238) has joined #ceph
[17:12] * loicd (~loic@90.84.146.238) Quit ()
[17:13] * sagelap (~sage@76.89.177.113) Quit (Ping timeout: 480 seconds)
[17:18] * glowell (~Adium@68.170.71.123) Quit (Quit: Leaving.)
[17:20] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[17:21] <nhm> scheuk: Hi, how big is the random IO?
[17:22] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[17:23] <nhm> scheuk: Also, a single OSD with a large raid0 behind it often performs worse than individual OSDs for each disk.
[17:23] <scheuk> nhm: 14KiB to 25KiB is the average size
[17:23] <nhm> scheuk: I'm actually in the middle of publishing a report on it.
[17:23] <scheuk> 62 to 100 I/O Sec
[17:23] <scheuk> that's just one RBD
[17:23] <scheuk> times that roughly by 15 :)
[17:28] <scheuk> the backend OSD IO average req size for writes is 6KiB at about 200/sec
[17:29] <scheuk> and times that by 6
[17:30] <nhm> scheuk: Ok. I just went back and looked at some of my numbers. On a single node doing 4k IOs (writes to new files), switching from a RAID0 to JBOD improved performance anywhere from 50% to about 300% depending on the controller and backend filesystem.
[17:30] <sage1> joshd: when you have a minute can you look at wip-2525?
[17:30] <nhm> RBD will be doing rewrites, so the numbers might be different.
[17:31] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:32] <joao> oh finally
[17:32] <joao> sage1, it's alive!
[17:32] <joao> and by that I mean that a single-paxos monitor is now able to use a converted store (albeit converted offline)
[17:33] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[17:33] * loicd1 (~loic@90.84.146.238) Quit (Ping timeout: 480 seconds)
[17:33] * glowell (~Adium@38.122.20.226) has joined #ceph
[17:33] * sagelap (~sage@14.sub-70-197-150.myvzw.com) has joined #ceph
[17:36] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[17:38] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:43] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) Quit (Quit: Ex-Chat)
[17:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:50] * Cube (~Adium@12.248.40.138) has joined #ceph
[17:50] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[17:51] * sbohrer (~sbohrer@173.227.92.65) has joined #ceph
[17:52] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:52] <joao> sagewk, let me know when you're around
[17:54] * Tv_ (~tv@38.122.20.226) has joined #ceph
[17:57] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[18:07] * sagelap (~sage@14.sub-70-197-150.myvzw.com) Quit (Ping timeout: 480 seconds)
[18:07] <scheuk> nhm: what did you have the journal
[18:08] <scheuk> what did you use for the journal?
[18:13] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[18:13] * Cube (~Adium@12.248.40.138) Quit (Ping timeout: 480 seconds)
[18:13] * Cube (~Adium@12.248.40.138) has joined #ceph
[18:21] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[18:21] <nhm> scheuk: sorry, got pulled away on something
[18:22] <nhm> scheuk: for the tests I was doing, journals were on separate SSD disks.
[18:23] <nhm> scheuk: in the 6 OSD (1 disk per) mode, 3 journals per SSD, 2 SSDs. In the RAID0 mode, 1 journal on 2 SSDs in a RAID0.
[18:23] <jmlowe> I'm trying to mkcephfs with zfs backed osds, has anybody done this successfully?
[18:23] <scheuk> nhm: ok
[18:24] <nhm> jmlowe: neat! I've wanted to. Some other guys on the channel were going to try a while back but I don't know if they actually did it.
[18:24] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has left #ceph
[18:24] <jmlowe> well
[18:24] <nhm> jmlowe: what problems are you having?
[18:24] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has joined #ceph
[18:24] <jmlowe> 2012-09-25 12:22:47.546692 7f350f56f780 -1 OSD::mkfs: FileStore::mkfs failed with error -22
[18:24] <jmlowe> 2012-09-25 12:22:47.546727 7f350f56f780 -1 ** ERROR: error creating empty object store in /data/osd.0: (22) Invalid argument
[18:25] <jmlowe> hopefully there is something obvious I'm missing
[18:26] <nhm> hrm
[18:26] <jmlowe> wait, I think I've lost my xattrs
[18:26] * sagelap (~sage@14.sub-70-197-150.myvzw.com) has joined #ceph
[18:27] <nhm> does it say anything interesting before the first line you posted?
[18:28] * jlogan1 (~Thunderbi@72.5.59.176) has joined #ceph
[18:28] <jmlowe> pushing conf and monmap to gwboss1:/tmp/mkfs.ceph.10744
[18:29] <jmlowe> before that looks like standard monmap generation
[18:30] <scheuk> set novice off
[18:31] <nhm> jmlowe: this might be helpful: http://www.spinics.net/lists/ceph-devel/msg08381.html
[18:32] <jmlowe> what section does filestore xattr use omap = true go in?
[18:33] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has left #ceph
[18:33] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has joined #ceph
[18:34] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:34] <nhm> jmlowe: osd or global should work
[18:34] * sagelap (~sage@14.sub-70-197-150.myvzw.com) Quit (Ping timeout: 480 seconds)
[18:37] <jmlowe> http://pastebin.com/9qvMVMBN
[18:37] <jmlowe> a little strace
[18:42] <jmlowe> found it: journal dio = false
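(Putting the two settings from this exchange together, the [osd] section for ZFS-backed OSDs would look roughly like this; whether anything else is needed for ZFS is not established here.)

    [osd]
        filestore xattr use omap = true   ; keep long xattrs in the object map instead of the fs
        journal dio = false               ; zfsonlinux has no O_DIRECT, so disable direct IO on the journal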
[18:43] <jmlowe> nhm: what do you typically use to benchmark?
[18:45] * sagelap (~sage@2600:1013:b00f:1b07:c685:8ff:fe59:d486) has joined #ceph
[18:47] <nhm> jmlowe: oh, was your journal on tmpfs or something?
[18:47] <jmlowe> no
[18:47] <nhm> jmlowe: for benchmarking, I use a mix of things depending on what we are testing.
[18:47] <jmlowe> osd journal = /data/$name/journal
[18:47] <jmlowe> osd data = /data/$name
[18:48] <nhm> jmlowe: interesting. I didn't know zfs on linux didn't support direct IO.
[18:48] <nhm> jmlowe: anyway, is this for rbd, rgw, cephfs, etc?
[18:49] <jmlowe> rbd is my eventual target, I'd also be interested in how cephfs is doing in general
[18:50] <nhm> jmlowe: for rbd, I'd use fio, and if you want to test metadata operations, fileop from iozone3.
[18:50] <nhm> jmlowe: for cephfs, IOR is useful if you don't mind installing openmpi.
[18:51] <nhm> apparently mdtest is used often for multi-client metadata tests, but it also needs mpi.
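(A minimal fio run of the kind nhm mentions, against a mapped rbd device; the device path, sizes and runtime are placeholders, and the run will overwrite whatever is on the device.)

    fio --name=rbd-randwrite --filename=/dev/rbd0 --rw=randwrite --bs=4k \
        --size=1G --direct=1 --ioengine=libaio --iodepth=16 \
        --runtime=60 --time_based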
[18:57] * jmcdice (~root@135.13.255.151) has joined #ceph
[18:58] <jmcdice> Does anyone have an example of an fstab entry that will mount a ceph file system using the fuse client?
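(One form of fstab entry that has been used for the fuse client, assuming the keyring and ceph.conf are in their default locations; treat the exact option syntax as an assumption and check it against the ceph-fuse documentation.)

    # /etc/fstab -- mount CephFS via ceph-fuse through mount.fuse (syntax is an assumption)
    id=admin   /mnt/ceph   fuse.ceph   defaults,_netdev   0 0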
[19:00] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[19:01] * dmick (~dmick@2607:f298:a:607:71b4:1e55:8139:f272) has joined #ceph
[19:02] * ChanServ sets mode +o dmick
[19:04] * chutzpah (~chutz@100.42.98.5) has joined #ceph
[19:04] * hk135 (~root@89.30.48.254) has joined #ceph
[19:09] * sagelap (~sage@2600:1013:b00f:1b07:c685:8ff:fe59:d486) Quit (Ping timeout: 480 seconds)
[19:22] * jjgalvez (~jjgalvez@12.248.40.138) has joined #ceph
[19:25] * Ryan_Lane (~Adium@216.38.130.162) has joined #ceph
[19:31] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[19:32] <cblack101> I've been looking at my ceph.log and found this particular string on a regular basis, any ideas what this means other than there's something wrong with osd 19? osd.19 <ip address>:6809/129826 681 : [WRN] 6 slow requests, 6 included below; oldest blocked for > 30.713282 secs
[19:34] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[19:35] * tren (~Adium@184.69.73.122) has joined #ceph
[19:35] <tren> wow, far more people than I was expecting in here...
[19:36] <dmick> hi cblack101
[19:36] <tren> I had sent a message to the mailing list about a mds server being stuck in clientreplay mode. Was wondering if anyone is around/available to help troubleshoot this?
[19:38] <joshd> tren: gregaf can probably help, but he's running late this morning
[19:39] <tren> thanks joshd!
[19:39] <scheuk> nhm: so if I were to use a single OSD per disk (8 per storage node), then put the 8 journal files on a 7 disk raid 0, that would perform a lot better than what we have? Also we are running 0.48.2
[19:40] <tren> scheuk: we tried that. 12 osd's with a portion carved out to a raid10 group. We found the journals on the md1 caused contention with the osd volumes. In the end we put the journals in the same directory as the osd
[19:41] <scheuk> tren: the 7 disks are separate from the 8 disks :)
[19:42] <scheuk> tren: servers have 16 disks total
[19:42] <jmlowe> nhm: well, it's not quite there yet, all of the osds die shortly after startup
[19:43] <tren> scheuk: ah gotcha. seems like a waste of drive bays though. maybe just get a few ssd's instead?
[19:45] <scheuk> tren: yeah that would take a while, what would be a good setup for 15 disks (the 16th is for the OS)
[19:45] <scheuk> I think they are 600GB 10Ks
[19:45] <scheuk> plus they are behind raid controllers
[19:45] * phantomcircuit (~phantomci@173-45-240-7.static.cloud-ips.com) has joined #ceph
[19:47] <tren> scheuk: you run a risk though with 7 drives in raid0 handling journals for 8 osd volumes. If you lose any 1 drive, you'll lose all your journals. Depending on your backing file system that could be very bad
[19:47] <scheuk> we are using XFS
[19:48] <tren> With anything other than btrfs (AFAIK) losing the journal means you lose the osd
[19:48] <tren> so you'd lose 8 journals at the same time.
[19:48] <scheuk> yeah
[19:48] <scheuk> not very smart
[19:49] <scheuk> I could do some 2 disk raid0s
[19:49] <tren> maybe try raid10?
[19:50] <scheuk> yeah that would make sense
[19:50] <scheuk> 6 disk raid10
[19:51] <tren> *nods* run 9 osd's and 6 disk raid10 for journals
[19:51] <scheuk> tren: right now we have an 8 disk raid0 for the OSD storage, and a 7 disk raid0 for the journal
[19:52] <tren> scheuk: living dangerously ;)
[19:52] <scheuk> and we are having high IO wait times in our VMs that are on RBD volumes
[19:53] <scheuk> tren: what is your use case for your ceph cluster?
[19:53] <cblack101> scheuk: are the VMs on RBD volumes presented through Openstack as /dev/vdX?
[19:53] <tren> scheuk: we're mostly interested in it for the file system, and for virtualization. but for now we're focused on cephfs the most
[19:54] <tren> scheuk: we have 16 osd servers each presenting 12x2tb disks
[19:54] <joshd> scheuk: are you using rbd caching?
[19:55] <iggy> that's a pretty sizeable deployment (compared to what I've heard in here)
[19:56] <tren> iggy: yeah, they're new dell servers we just installed last month
[19:57] <scheuk> no, right now we have the rbd mounted as /var/lib/nova/instances
[19:57] <scheuk> on the host level
[19:57] <scheuk> and the vms are using qcow2 images
[19:57] <stan_theman> it would be nice if the init scripts checked if all boxes were running the same version of ceph
[19:58] <scheuk> in the future we will switch to rbd backed VMs
[19:58] <iggy> stan...theman....
[19:58] * guilhemfr (~guilhem@tui75-3-88-168-236-26.fbx.proxad.net) Quit (Remote host closed the connection)
[19:58] <scheuk> in the interim we are using the rbd as "ephemeral" storage
[19:59] <stan_theman> iggy: :D
[20:01] <iggy> stan_theman: the init scripts are losing favor mostly... integrating with chef/puppet/etc. seems to be getting more attention these days
[20:01] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:c8eb:ed74:6675:2a79) has joined #ceph
[20:02] <iggy> and there are legitimate reasons to be running different versions of things (rolling upgrades, etc.)
[20:02] <stan_theman> yeah, that makes sense. it turned into one of those "oh, this ceph error really means that this other thing is happening"
[20:03] <nhm> scheuk: what controller?
[20:03] <iggy> yeah, discerning ceph errors seems to be a little known art
[20:04] <scheuk> nhm: HP Smart Array
[20:04] <nhm> scheuk: anyway, you might try a 1 drive raid0 on each drive with 2 partitions, 1 for data and one for journal.
[20:05] <nhm> scheuk: ah, don't know much about that one. rebranded lsi of some type?
[20:05] <tren> scheuk: which model of hp smart array?
[20:06] <scheuk> nhm: I believe they are created by HP, the servers are HP ProLiant DL385s
[20:06] <nhm> jmlowe: hrm, that sucks. Any idea what's wrong?
[20:07] <nhm> looks like it could be a smart array 6i, P400, P410i, P600, etc.
[20:07] <joao> omg
[20:07] <joao> this coverity thingy is awesome
[20:08] <nhm> joao: it sounded neat on the phone. I've logged in but haven't really used it yet.
[20:08] <tren> coverity?
[20:09] <joao> static analyzer it seems
[20:09] <joao> I've used static analyzers before, but none looked so good
[20:11] * The_Bishop (~bishop@2001:470:50b6:0:2515:55f8:ef09:e742) has joined #ceph
[20:15] <dmick> joao: nice interface, but a lot of false positives in my experience
[20:15] <dmick> which is very limited
[20:15] <tren> gregaf: can you please send me a ping when you're around?
[20:15] <gregaf> tren: I'm busy now, but I'll get back to you later or respond to your mailing list message :)
[20:16] <tren> gregaf: oh! thank you very much!
[20:16] <tren> :)
[20:17] * aliguori (~anthony@32.97.110.59) has joined #ceph
[20:17] <tren> gregaf: I'm leaving it in clientreplay until I hear from you in case you need more info :)
[20:18] <dmick> cblack101: to answer your earlier question about those log msgs: it means that OSD is taking a while to service some requests, yes, which could be indicative of some sort of slowness problem
[20:19] * maelfius (~mdrnstm@66.209.104.107) has joined #ceph
[20:19] <joao> dmick, I can live with false positives; wouldn't be so good if your experience involved a lot of false negatives though ;)
[20:20] * sagelap (~sage@124.sub-166-250-35.myvzw.com) has joined #ceph
[20:20] <cblack101> dmick: Thanks, I'm going to run the same IOmeter test in an Openstack VM (mounted rbd as /dev/sdc) as a host with the same size rbd mounted in /dev/rbd0 and see if there is an immediate difference, this should be interesting
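(The host-side steps for the comparison cblack101 describes usually look something like the following; pool and image names are placeholders, and 262144 is the image size in MB, i.e. 256GB.)

    rbd create --size 262144 rbd/bench-img    # 256GB image in the default "rbd" pool
    rbd map rbd/bench-img                     # kernel client, typically appears as /dev/rbd0
    mkfs.xfs /dev/rbd0
    mount /dev/rbd0 /mnt/bench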
[20:26] <scheuk> nhm: correct, we have 2, a P410 and P410i
[20:26] <dmick> joao: sadly false negatives are much harder to find...
[20:26] <dmick> cblack101: so this was at least under heavy I/O load? That's good
[20:29] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[20:30] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:33] * gaveen (~gaveen@112.135.133.215) Quit (Remote host closed the connection)
[20:35] * sagelap1 (~sage@224.sub-166-250-32.myvzw.com) has joined #ceph
[20:38] * sagelap (~sage@124.sub-166-250-35.myvzw.com) Quit (Ping timeout: 480 seconds)
[20:38] * aliguori (~anthony@32.97.110.59) Quit (Ping timeout: 480 seconds)
[20:40] <cblack101> dmick: not terribly heavy, but very random and generated from a number of windows server VMs across multiple hosts with the extra 256GB volume in openstack mounted to the vm as /dev/sdc (shows up as just another disk in Windows).
[20:40] <jmlowe> nhm: nope last line before the crash is this 2012-09-25 13:40:13.965688 7f452dce0700 0 -- 149.165.228.10:6802/937 >> 149.165.228.10:6808/2078 pipe(0x182dc00 sd=33 pgs=1 cs=1 l=0).fault with nothing to send, going t
[20:40] <jmlowe> o standby
[20:54] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[20:57] * danieagle (~Daniel@186.214.57.4) has joined #ceph
[20:58] <nhm> jmlowe: could you send a bug report in with a description of what you are doing and the stacktrace and stuff?
[20:58] <nhm> jmlowe: it'd be good to capture this. Unfortunately we are all pretty swamped right now. :/
[21:00] <nhm> scheuk: interesting, it's a PMC Sierra chip. I haven't tested one of those yet.
[21:00] <nhm> scheuk: I should try to get one.
[21:01] <scheuk> nhm: just order an HP proliant server :)
[21:01] <scheuk> all of the DL3XX have a smart array embedded on the motherboard
[21:02] <nhm> scheuk: Chances would be better if we got a support contract for a system using them. ;)
[21:03] <scheuk> indeed :)
[21:23] <joshd> sage1 sagelap1 sagewk: wip-2525 looks good to me. I'm not convinced that's the only thing that's not thread-safe, but it takes care of the osdmap
[21:24] <sagelap1> joshd: yeah, hard to prove a negative. i did a superficial skim of the file and it looked like everything else had the Mutex::Locker's in place, but it was quick.
[21:25] <joshd> parallel testing would help convince me
[21:25] <joshd> we've got some tests using the framework colin wrote in test/system that do a little of that
[21:26] <joshd> it'd be good to get those into the regression suite
[21:31] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[21:32] * aliguori (~anthony@32.97.110.59) has joined #ceph
[21:47] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (Quit: Leaving.)
[21:51] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has joined #ceph
[21:54] * pentabular (~sean@adsl-70-231-141-17.dsl.snfc21.sbcglobal.net) has joined #ceph
[21:54] * pentabular is now known as Guest8199
[21:54] * Guest8199 is now known as pentabular
[22:01] <scheuk> nhm: when do you think you'll have your paper published about jbod vs raid?
[22:02] * sagelap (~sage@72.sub-166-250-32.myvzw.com) has joined #ceph
[22:04] <joshd> sagelap: could you look at a couple more coverity fixes in wip-rbd-coverity, plus the watch version bug fix in wip-watch-header-race?
[22:06] <nhm> scheuk: I'm hoping this week, but other things are pulling me in other directions. I'll probably try to write up a more basic version in a blog post.
[22:07] <scheuk> nhm: it will be an interesting reed
[22:07] <scheuk> ead
[22:07] <scheuk> read
[22:07] <scheuk> <- can't type today :)
[22:08] <nhm> scheuk: the long and short of it though is that on fresh filesystems, btrfs and jbod seem to be a good combination. At some point in the future we'll look at tests over time to see how various filesystems age.
[22:08] <nhm> scheuk: and yes, I can't type either it seems. ;)
[22:08] <scheuk> cool :)
[22:08] <scheuk> I have a journal tuning question
[22:09] * sagelap1 (~sage@224.sub-166-250-32.myvzw.com) Quit (Ping timeout: 480 seconds)
[22:09] <scheuk> I have been noticing that our journal disks are running around 40% utilization in terms of performance
[22:09] <scheuk> while our OSD disks are running more around 80-90%
[22:10] <scheuk> we have a 2GB journal file
[22:10] <nhm> scheuk: how are you measuring utilization?
[22:10] <scheuk> using the munin stats
[22:10] <joao> sagelap, still there?
[22:10] <joao> any sage* around? :p
[22:10] <scheuk> Utilization of the device. If the time spent for I/O is close to 1000msec for a given second, the device is nearly 100% saturated.
[22:11] * sagelap (~sage@72.sub-166-250-32.myvzw.com) Quit (Ping timeout: 480 seconds)
[22:12] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Quit: leaving)
[22:12] <nhm> scheuk: Ok, the likelihood there is that you have more seeks on the data disk than on the journal, so the journal doesn't get hit as hard.
[22:13] <nhm> scheuk: because the journal will only fill up so far ahead of the data disk, it sits idle waiting for the data disk to complete its writes.
[22:14] <scheuk> ok, so tuning the journal won't help our write latency
[22:15] <scheuk> it's more of a need to add more disk, or get rid of the raid
[22:15] <nhm> scheuk: One thing you might try (Note I haven't tried this) is just sticking two OSDs on the raid instead of 1.
[22:16] <nhm> As crazy as that sounds, with really small IOs it may help.
[22:16] <joshd> making the journal larger will let you absorb writes longer before needing to wait for the data disks if you're worried about bursts of writes
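(Concretely, the knob joshd is referring to is just the journal size setting; 10000 MB below is an arbitrary example, they currently use 2000, and the journal device or file needs to be at least that big.)

    [osd]
        osd journal size = 10000   ; journal size in MB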
[22:17] <scheuk> I did try tuning up the OSD threads and filesystem threads
[22:17] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[22:17] <nhm> scheuk: joshd raises a good point. It won't help with sustained traffic, but it can help with the spikes at least.
[22:17] <scheuk> ok
[22:21] * sagelap (~sage@4.sub-166-250-35.myvzw.com) has joined #ceph
[22:22] <scheuk> I would also guess, once we start using rbd backed VMs and enable rbd caching, that will help a lot as well :)
[22:23] <joshd> yeah, it should
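(Enabling the cache joshd and scheuk are talking about is a client-side ceph.conf option on the compute nodes; a minimal sketch, with the cache size left at a guessed default.)

    [client]
        rbd cache = true
        rbd cache size = 33554432   ; 32MB; assumed default, adjust as needed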
[22:25] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:31] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[22:34] <gregaf> tren: you available? I'm looking through your logs and the code...
[22:35] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[22:48] * BManojlovic (~steki@195.13.166.253) has joined #ceph
[22:52] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[23:13] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[23:16] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Remote host closed the connection)
[23:19] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[23:37] * danieagle (~Daniel@186.214.57.4) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[23:39] * stass (stas@ssh.deglitch.com) has joined #ceph
[23:39] * ssedov (stas@ssh.deglitch.com) Quit (Read error: Connection reset by peer)
[23:56] <SpamapS> FYI, not sure if this is a known bug https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1056462
[23:57] <SpamapS> don't see it on tracker.newdream.net
[23:58] <dmick> it is mentioned in http://tracker.newdream.net/issues/3002
[23:58] <Tv_> 4e40a7849442d633c9801639fbdaf27a97f39888
[23:58] <Tv_> SpamapS: sorry, that ^ was for you
[23:58] <dmick> but I can't say I'm thrilled with the resolution
[23:59] <dmick> that's a better resolution Tv_

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.