#ceph IRC Log

IRC Log for 2012-04-16

Timestamps are in GMT/BST.

[0:54] * Qten1 (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[1:19] * danieagle (~Daniel@177.43.213.15) has joined #ceph
[1:43] * darkfader (~floh@188.40.175.2) Quit (Ping timeout: 480 seconds)
[1:43] * BManojlovic (~steki@212.200.243.246) Quit (Ping timeout: 480 seconds)
[2:00] * loicd (~loic@99-7-168-244.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[2:05] * lofejndif (~lsqavnbok@09GAAEX9R.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[2:38] * yoshi (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:50] * loicd1 (~loic@99-7-168-244.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[3:55] * loicd (~loic@99-7-168-244.lightspeed.sntcca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[4:13] * jantje (jan@paranoid.nl) has joined #ceph
[4:18] * jantje_ (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[4:59] * sagelap (~sage@12.199.7.82) has joined #ceph
[5:15] * sagelap (~sage@12.199.7.82) Quit (Ping timeout: 480 seconds)
[5:21] * joao (~JL@89-181-153-140.net.novis.pt) Quit (Ping timeout: 480 seconds)
[5:25] * sagelap (~sage@12.199.7.82) has joined #ceph
[5:33] * sagelap (~sage@12.199.7.82) Quit (Ping timeout: 480 seconds)
[5:36] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) Quit (Remote host closed the connection)
[5:36] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) has joined #ceph
[5:44] * sagelap (~sage@ace.ops.newdream.net) has joined #ceph
[6:05] * danieagle (~Daniel@177.43.213.15) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[6:47] * f4m8_ is now known as f4m8
[7:09] * sagelap (~sage@ace.ops.newdream.net) Quit (Quit: Leaving.)
[9:25] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[10:14] * darkfader (~floh@188.40.175.2) has joined #ceph
[10:19] * loicd1 is now known as loicd
[10:20] * pmjdebruijn (~pascal@62.133.201.16) has joined #ceph
[10:20] <pmjdebruijn> hi
[10:20] <pmjdebruijn> I just noticed that ceph 0.45 has a dependency on Python 2.6.6
[10:20] <pmjdebruijn> which isn't available on Ubuntu Lucid
[10:20] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[11:05] * df_ (davidf@dog.thdo.woaf.net) Quit (Ping timeout: 480 seconds)
[11:13] * df_ (davidf@dog.thdo.woaf.net) has joined #ceph
[11:32] * yoshi (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:35] * josi (~josi@28IAADZFA.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:37] <josi> hello
[11:41] * josi (~josi@28IAADZFA.tor-irc.dnsbl.oftc.net) has left #ceph
[11:48] <deam> hi, and bye :-)
[12:03] * nhm (~nh@68.168.168.19) Quit (Ping timeout: 480 seconds)
[12:12] * nhm (~nh@68.168.168.19) has joined #ceph
[12:21] * joao (~JL@89-181-153-140.net.novis.pt) has joined #ceph
[12:39] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[12:52] * BManojlovic (~steki@212.200.243.246) has joined #ceph
[13:35] * Madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[13:35] <Madkiss> hi there!
[13:36] <Madkiss> I've got a short question. I'm trying to figure out how to add a new OSD to my rados/ceph cluster, and I found http://ceph.newdream.net/docs/master/ops/manage/grow/osd/#adding-a-new-osd-to-the-cluster
[13:36] <Madkiss> I am just not sure which of these commands has to be executed on which host
[13:36] <Madkiss> (note: I'm using cephx)
[13:42] <todin> Madkiss: I always used the old docs http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction
[13:43] * gregorg_taf (~Greg@78.155.152.6) Quit (Quit: Quitte)
[13:43] <Madkiss> todin: well, I think my question is the very same for that howto. Step 5 reads: "ceph auth add osd.4 osd 'allow *' mon 'allow rwx' -i /path/to/osd/keyring"
[13:43] <Madkiss> Is that on one of the existing nodes or on the new one?
[13:47] <todin> Madkiss: as far as I know on a monitor node
[13:47] <Madkiss> but that means I would have to copy /etc/ceph/osd.ID.keyring over to exactly that monitor node, too, correct?
[13:48] <Madkiss> do I have to do the add command on one of the existing monitor nodes only, or do I have to do it on every existing mon?
[13:49] <todin> Madkiss: sorry, I don't know; you could wait here (the channel gets busy in a few hours) or ask on the mailing list
[13:50] <Madkiss> hm.
[13:56] * df_ (davidf@dog.thdo.woaf.net) Quit (Remote host closed the connection)
[14:02] * df_ (davidf@dog.thdo.woaf.net) has joined #ceph
[14:23] <Madkiss> hm.
[14:23] <Madkiss> todin: when adding a new OSD and using the default crushmap, will it get added to that crushmap automatically, too? Or will I have to adapt the crushmap every time?
[14:31] <todin> Madkiss: you have to update the crushmap every time
[14:42] * BManojlovic (~steki@212.200.243.246) Quit (Remote host closed the connection)
[14:47] <Madkiss> thanks
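(A rough sketch of the manual crushmap round-trip todin is describing, using the stock ceph/crushtool commands; the file names are arbitrary:)

    # dump the current crushmap and decompile it to editable text
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # add the new osd/host entries to crushmap.txt, then recompile and inject it
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new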
[14:52] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:13] * mtk (~mtk@ool-44c35967.dyn.optonline.net) Quit (Remote host closed the connection)
[15:24] * BManojlovic (~steki@212.200.243.246) has joined #ceph
[15:41] * f4m8 is now known as f4m8_
[16:33] * __jt__ (~james@jamestaylor.org) has joined #ceph
[16:39] <deam> is it possible to setup multiple rados gateways?
[16:39] <deam> and will that improve the performance if I load balance requests?
[16:43] <wonko_be> Madkiss: the ceph auth add ... should be done on any machine which has access to the monitor cluster and is allowed to do this manipulation
[16:43] <wonko_be> in short, do it on the machine where you ran mkcephfs
[16:43] <wonko_be> the keyring should be copied over
[16:45] <wonko_be> if you want to do it on any other machine, make sure it has some info about the monitors in your ceph.conf, and copy the client.admin.keyring key from a node with access (or extract the key, and re-add it to another keyring)
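(A minimal sketch of what wonko_be describes, run from a node that already holds the client.admin keyring; the hostname and OSD id here are hypothetical:)

    # copy the new OSD's keyring over to the admin/monitor node
    scp new-osd-host:/etc/ceph/osd.4.keyring /etc/ceph/osd.4.keyring
    # register the key with the monitors (the step 5 command from the wiki)
    ceph auth add osd.4 osd 'allow *' mon 'allow rwx' -i /etc/ceph/osd.4.keyring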
[17:01] <nhm> deam: yes, that is what dreamhost is going to be doing. I haven't set it up yet myself though.
[17:07] <nhm> brrr, it's cold today
[17:18] <joao> here it feels like spring has arrived :D
[17:19] <nhm> joao: it was like 25-26C a month ago, today it was snowing.
[17:20] <joao> 26C is what I usually call summer
[17:22] <nhm> joao: that's a nice summer day here. We have kind of extreme weather. -20 to -30 in the winter and like 35 in the summer.
[17:22] <nhm> though not every day.
[17:22] <joao> it sure should keep things interesting :p
[17:23] <joao> we get > 30C, nearly 40C occasionally during the summer, but I'm glad the weather is pretty stable around here
[17:24] <joao> in bad, bad winters we get 0C, and the weather in Europe must be really screwed up for that to happen
[17:24] <joao> by "we" I mean "in Lisbon", because up north it can get uglier
[17:25] <joao> man, I'm breaking the journal replay
[17:25] <nhm> joao: Very rarely does it get up to 40c here. Usually 36-37c is about the highest it gets. It's really humid here in the summers though.
[17:33] * loicd (~loic@99-7-168-244.lightspeed.sntcca.sbcglobal.net) Quit (Quit: Leaving.)
[17:36] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:39] * gregaf (~Adium@aon.hq.newdream.net) Quit (Quit: Leaving.)
[17:42] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[18:02] <joao> gregaf, around?
[18:03] <gregaf> hi joao
[18:03] <joao> good morning :)
[18:03] <joao> sam and sage went to the openstack thingy?
[18:04] <gregaf> no, I think they're both in the office this week
[18:04] <joao> ah cool :)
[18:04] <joao> thanks
[18:04] <gregaf> yep
[18:21] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[18:21] <chaos_> hello
[18:21] * perplexed (~ncampbell@216.113.168.141) has joined #ceph
[18:25] <The_Bishop> I don't get it. I did "ceph osd out 0", but the device still fills up :(
[18:29] <chaos_> I've got some performance issues with a ceph OSD. A week ago we added two new OSD servers, so now we have 2x 2.5TB (the old OSDs) + 2x 5.3TB (the new ones). Everything was fine for two days, but then one of the OSDs reported that it didn't receive heartbeats from three other OSDs. I restarted the faulty OSD and the heartbeats were fine after the restart. Since then, commit latency has gone from <2s to 14-15 seconds for the new OSD servers, read latency from ~45ms to 402ms and writ
[18:30] <chaos_> The only worrying thing I see in the logs is "2012-04-16 13:30:35.216739 7f6c9b96d700 log [WRN] : old request osd_op(client.25490.0:10177 1000000f0f4.00000000 [write 0~4194304] 0.f73a6f76 snapc 1=[]) v3 received at 2012-04-16 13:30:04.683783 currently waiting for sub ops
[18:35] <gregaf> The_Bishop: you marked an OSD out and it's still in the cluster? or are other OSDs filling up?
[18:35] * Tv_ (~tv@aon.hq.newdream.net) has joined #ceph
[18:35] <gregaf> chaos_: which OSD stopped receiving heartbeats?
[18:36] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:36] <chaos_> new one
[18:36] <chaos_> receiving and sending
[18:36] <gregaf> sounds like maybe you have a slow node, have you tested each of them independently?
[18:36] <chaos_> but it was 5 days ago
[18:36] <chaos_> gregaf, how?
[18:37] <gregaf> well, for a very basic test you can run "ceph osd tell \* bench" and then watch "ceph -w" to see the results from each OSD
[18:37] <chaos_> ok
[18:37] <chaos_> now i've lower traffic, everyone went home, so i can try
[18:38] <gregaf> that'll just write 1 GB to the journal+store of each OSD and tell you how long it took
[18:38] <chaos_> great ;-)
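(For reference, the bench invocations gregaf mentions; the single-OSD form with id 0 is just an example:)

    # tell every OSD to run its local 1 GB write benchmark
    ceph osd tell \* bench
    # or target one OSD, e.g. osd.0
    ceph osd tell 0 bench
    # watch the cluster log for each OSD's result line
    ceph -w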
[18:39] <chaos_> old osd daemons are pretty the same
[18:39] <chaos_> [INF] bench: wrote 1024 MB in blocks of 4096 KB in 33.329332 sec at 31461 KB/sec
[18:39] <gregaf> that warning line you were seeing in the logs is an indication that your OSDs have requests that have taken >30 seconds to service, and the 30 second mark is being passed while they're waiting for one of the replicas to apply the op
[18:39] <chaos_> waiting for new ones
[18:39] <gregaf> well that's 30MB/s right there, so if the new ones are taking longer…
[18:40] <chaos_> hmm still waiting.. maybe as you say
[18:40] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[18:40] <chaos_> waiting.. ehhh
[18:40] <chaos_> so slow? impossible;p
[18:41] <gregaf> maybe they did it so fast you weren't watching in time?
[18:41] <chaos_> nooo they are slower
[18:41] <gregaf> :(
[18:41] <chaos_> [INF] bench: wrote 1024 MB in blocks of 4096 KB in 180.330053 sec at 5814 KB/sec
[18:42] <chaos_> maybe hetzner cut my link down.. because after adding the osds lots of data migrated between nodes.. ~2TB
[18:42] <chaos_> I'll check this again
[18:42] <gregaf> hetzner?
[18:42] <chaos_> hetzner.de
[18:43] <gregaf> your provider, I assume?
[18:43] <chaos_> they have crappy support and they do "things", for example they cut your link down without notice
[18:43] <chaos_> yea.. my client has a small cluster there
[18:43] <gregaf> so "bench" writes to both the journal and the filestore, which means that it's two streams, and if they're on the same disk that can be a bit much, but any modern drive should be able to sustain ~50MB/s *2 (or better)
[18:44] <gregaf> but this sounds like Not My Problem ;)
[18:44] <gregaf> are they on actual physical machines?
[18:44] <chaos_> yes
[18:44] <chaos_> 4 separate servers
[18:45] * loicd1 (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[18:45] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Quit: Leaving.)
[18:48] <chaos_> gregaf, it isn't a "link" issue, I get 80-100MB/s between any of the servers
[18:48] <gregaf> chaos_: oh, yeah, bench is entirely local except for the initiating command – it generates the data to write on the node
[18:48] <chaos_> oh
[18:49] <chaos_> interesting
[18:49] <gregaf> sorry, missed your reference to link speeds
[18:49] <chaos_> maybe it's something with raid array
[18:49] <gregaf> you should probably try out some more standard benchmarks and then go yell at Hetzner or whoever supplied the hardware
[18:50] <chaos_> hmm, maybe it's btrfs related? because I've got a 3.2 kernel on osd.0 and 1, and 3.0 on osd.2 and 3
[18:51] <gregaf> I wouldn't expect it to do *that* badly, but you could check
[18:56] <chaos_> raid array looks fine, at least from performance point of view
[18:56] <chaos_> gregaf, is there any other way to tell why these nodes are so slow?
[18:57] <gregaf> you're getting good performance out of the RAID but bad performance out of OSD bench?
[18:57] <gregaf> well, try a filesystem test then
[18:57] <chaos_> hm?
[18:57] <nhm> chaos_: you might want to use a utility like sar or collectl to see what the writes look like during the bench tests.
[18:57] <gregaf> all bench does is write 1GB in 4MB chunks, in the same way as incoming requests do
[18:57] <gregaf> (so in parallel if you're using btrfs, or journal and then filestore for everything else)
[18:58] <gregaf> either your journal or your filestore is hideously slow, and it's either the RAID array or the FS
[18:58] <gregaf> *shrug*
[18:59] <gregaf> if it turns out that a generic filesystem test (dbench, iozone, whatever) does fine, then it actually might be a btrfs fragmentation problem in the Ceph directories
[18:59] <gregaf> but I wouldn't have expected it to get that bad
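(One way to run the kind of generic test gregaf suggests is a plain dd write into the OSD's data directory, roughly mirroring osd bench's 1 GB in 4 MB chunks, while watching the disk as nhm suggested; the path is hypothetical:)

    # sequential 1 GB direct write, synced at the end
    dd if=/dev/zero of=/data/osd.2/ddtest bs=4M count=256 oflag=direct conv=fdatasync
    rm /data/osd.2/ddtest
    # in another terminal, watch per-disk throughput while it runs
    sar -d 1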
[18:59] * loicd1 (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Quit: Leaving.)
[19:00] <chaos_> thanks nhm I'll look into this
[19:00] <gregaf> you could test out that possibility by copying the Ceph directories (to the same disk, but letting the filesystem rewrite them out nicely) and seeing if performance improves
[19:01] <chaos_> is it safe
[19:01] <chaos_> ?
[19:01] <gregaf> well, if you shut down the daemon it is ;)
[19:02] <chaos_> i don't have so much space;p
[19:02] <gregaf> do it progressively?
[19:03] <gregaf> I mean, I would try out standard benchmark tools first
[19:03] <gregaf> it's just a possibility
[19:03] <chaos_> ok
[19:04] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[19:04] <The_Bishop> btrfs can do online defrag
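(If fragmentation is the culprit, the online defrag The_Bishop mentions can be run file by file over the OSD data directory; the path below is hypothetical:)

    # btrfs online defrag works per file, so walk the OSD data dir with find
    find /data/osd.2 -type f -exec btrfs filesystem defragment {} \;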
[19:05] <chaos_> I think first I'll upgrade the kernels to the same version, to have the same btrfs everywhere
[19:06] <nhm> chaos_: probably a good idea if only to eliminate variables.
[19:06] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:08] * loicd1 (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:08] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Read error: Connection reset by peer)
[19:09] * loicd1 (~loic@204-16-154-194-static.ipnetworksinc.net) Quit ()
[19:12] <NaioN> is there an easy way to move an rbd from one pool to another?
[19:12] <NaioN> i want to create a new pool and move some rbd images to the new pool
[19:14] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:15] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit ()
[19:15] <joao> erm
[19:15] <joao> I think I just got onto the wrong meeting
[19:15] * joao blushes
[19:17] <nhm> joao: yeah, I did the same thing
[19:17] <chaos_> gregaf, it could be a fragmentation issue, btrfs-transacti and btrfs-endio-wri use quite a lot of cpu, and I've found that that happens when the filesystem is fragmented
[19:17] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:18] <joao> nhm, yeah... not sure if they changed rooms or something
[19:19] * loicd1 (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:19] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Read error: Connection reset by peer)
[19:20] * gregphone (~gregphone@aon.hq.newdream.net) has joined #ceph
[19:20] <gregphone> try The Fortress of Solitude, guys
[19:20] <joao> gregphone, I'll need the URL
[19:20] <gregphone> somebody took our space
[19:21] <joao> yeah, we've crashed that party :p
[19:21] <gregphone> you should be able to search for it in the UI?
[19:21] <joao> gregphone, I don't have access to it
[19:22] <joao> I guess that's one of the downsides of not having a panel account :)
[19:22] <gregphone> I don't know how to use it at all, maybe nhm can help you
[19:23] <joao> nhm, any chance of getting me that url? :)
[19:23] <nhm> joao: trying to figure out what it is, one sec
[19:23] <joao> thanks
[19:25] <nhm> can't seem to find the url, sorry. :/
[19:25] <joao> np
[19:26] <nhm> might be because I don't have access to control the meeting
[19:26] <joao> please let the others know why I'm missing the standup then
[19:26] * chutzpah (~chutz@216.174.109.254) has joined #ceph
[19:27] * gregphone (~gregphone@aon.hq.newdream.net) has left #ceph
[19:29] * loicd1 (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Read error: Connection reset by peer)
[19:33] <gregaf> NaioN: unfortunately I think you need to do the rbd moves manually :(
[19:37] * cattelan_away is now known as cattelan
[19:38] <NaioN> gregaf: ok, thx, just checking :)
[19:38] <NaioN> thought so, but if there would be an easier way, it would be nice
[19:38] <gregaf> NaioN: yeah – you should file a bug report
[19:38] <NaioN> well feature request :)
[19:38] <gregaf> it's still going to take a while but we could have a command do it
[19:39] <dmick> Using two image names with pool names doesn't work, then? I was just looking at the syntax
[19:39] <gregaf> (or else Josh will say "hey, we do that already!")
[19:39] <gregaf> dmick: I don't think you can specify pool name as part of the image name in the rbd tool – am I mistaken?
[19:39] <dmick> manpage claims you can
[19:39] <gregaf> if you can, then there is a move command that would probably do it
[19:39] <gregaf> ah
[19:39] <NaioN> nope just image name
[19:39] <NaioN> it's move/rename :)
[19:39] <dmick> In addition to using the --pool and the --snap options, the image name
[19:39] <dmick> can include both the pool name and the snapshot name. The image name
[19:39] <dmick> format is as follows:
[19:39] <dmick> [pool/]image-name[@snap]
[19:39] <dmick> Thus an image name that contains a slash character ('/') requires specifying the pool name explicitly.
[19:40] <gregaf> NaioN: well try it with the pool name included and see
[19:40] <NaioN> yeah will do
[19:40] <NaioN> dmick: thx!
[19:40] <dmick> I don't know whether it *works*, but it *claims* it does
[19:41] <dmick> I get to librbd.cc:rename and then realize it's going to be a long journey through the code
[19:41] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:42] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit ()
[19:42] * gregphone (~gregphone@aon.hq.newdream.net) has joined #ceph
[19:43] * gregphone (~gregphone@aon.hq.newdream.net) Quit ()
[19:50] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[19:50] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[19:53] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[19:56] <NaioN> dmick: gregaf: hmmm doesn't seem to work
[19:56] <NaioN> just renames the rbd in the same pool
[19:57] <dmick> so ignores the pool name? or errors?
[19:57] <gregaf> bummer
[19:57] <NaioN> ignores
[19:57] <gregaf> yep, make a feature request in the tracker!
[19:57] <dmick> sadface
[19:57] <NaioN> i shall!
[19:57] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Quit: Leaving.)
[19:58] <dmick> Does copy work to a different pool as a workaround?...
[20:01] <NaioN> lets try
[20:03] <NaioN> hmmm it seems doing something
[20:03] <NaioN> but it gives no feedback and it's sloooooowwwwww
[20:03] <NaioN> oh just had to wait :)
[20:03] <NaioN> 2% at the moment
[20:03] <dmick> does it appear to be showing up in the other pool?
[20:04] <NaioN> yeps
[20:04] <dmick> (you might not be able to tell until it's done...oh good)
[20:04] <gregaf> ah, there we go, hurray!
[20:04] <gregaf> that didn't seem like the kind of thing that Josh would forget
[20:04] <dmick> ok, so at least you have a config path, albeit perhaps not as fast as you'd like. cool.
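(So the workaround that emerged here is copy-then-delete rather than a true cross-pool move; a sketch with made-up pool and image names:)

    # copy the image into the target pool (slow: all the data is rewritten)
    rbd cp oldpool/myimage newpool/myimage
    # once the copy is verified, drop the original
    rbd rm oldpool/myimage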
[20:04] <NaioN> well i'll try a lot in parallel
[20:10] * wido is now known as widodh
[20:10] * widodh is now known as wido
[20:10] * wido (~wido@rockbox.widodh.nl) Quit (Quit: leaving)
[20:10] * wido (~wido@rockbox.widodh.nl) has joined #ceph
[20:12] <NaioN> doing two now and it looks like both are going steady :)
[20:14] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[20:21] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[20:22] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Quit: Leaving.)
[20:29] <dmick> gregaf, I think what happened is that rename ignores the poolname, but copy works (albeit more slowly); from your redmine update I think that's not what you're thinking?
[20:30] <gregaf> oh, maybe – I admit I didn't pay much attention once it worked
[20:30] <gregaf> feel free to update! ;)
[20:31] <dmick> ok. thought maybe you knew something I didn't. NaioN, that's correct, right? rename/mv does not work across pools because poolname is ignored?
[20:33] * sagelap (~sage@12.199.7.82) has joined #ceph
[20:34] <sagelap> gregaf: wouldn't it be sufficient to unconditionally encode in create_initial(), and then only conditionally thereafter?
[20:35] <sagelap> i think the weak link here is that we're using config and config observers to adjust this, but it's really cluster (not config) state.
[20:36] <NaioN> dmick: yes that's what I observe
[20:36] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[20:37] <NaioN> I did: rbd mv <pool1>/<oldimagename> <pool2>/<newimagename>
[20:37] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[20:37] <NaioN> in this case it just renames the imagename
[20:38] <NaioN> then i did: rbd mv <pool1>/<imagename> <pool2>/<imagename> and it complained the name already existed
[20:38] <NaioN> so I concluded it ignores the poolname
[20:38] <NaioN> and I got a crash from rbd with --pool and --dest-pool
[20:39] <sagelap> i think it should do -EXDEV in that case
[20:39] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Read error: Operation timed out)
[20:39] <sagelap> (i'm not really a fan of mv silently/transparently doing a copy + delete when an efficient rename isn't possible)
[20:41] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[20:52] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Quit: Leaving.)
[20:55] <NaioN> gregaf: I saw you updated the feature request
[20:56] <NaioN> but how could you do it by updating pointers? the data has to move to different directories, but will it land in the same pg_num in a different pool?
[20:56] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Remote host closed the connection)
[20:56] <NaioN> if so, then you can implement a very fast move
[21:05] * BManojlovic (~steki@212.200.243.246) Quit (Ping timeout: 480 seconds)
[21:13] * sagelap1 (~sage@12.199.7.82) has joined #ceph
[21:13] * sagelap (~sage@12.199.7.82) Quit (Read error: Connection reset by peer)
[21:14] * BManojlovic (~steki@212.200.243.246) has joined #ceph
[21:21] * sagelap1 (~sage@12.199.7.82) Quit (Ping timeout: 480 seconds)
[21:22] <dmick> fwiw I agree with sage; if mv doesn't really efficiently rename I think it should fail or be removed
[21:24] <perplexed> Just catching up on this morning's thread. Regarding bench being entirely local: is that the case? It seemed that you could target a particular OSD instance in the command (e.g. "ceph osd tell 0 bench" would direct it toward osd.0). If that's true, then you're presumably able to ensure you're heading to an OSD that's on a remote server (?)
[21:25] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[21:27] * sjust (~sam@aon.hq.newdream.net) Quit (Quit: Leaving.)
[21:27] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[21:30] <gregaf> perplexed: when you direct the bench to a specific OSD you're sending a command to trigger an entirely local process
[21:30] <gregaf> osd bench is separate from rados bench, though ;)
[21:31] * alexxy (~alexxy@79.173.81.171) Quit (Remote host closed the connection)
[21:31] <perplexed> Ah, I see. Thx.
[21:32] <perplexed> Yes, it seemed as though rados bench was more useful as it would distribute the traffic across all OSDs as you'd expect (local and remote).
[21:34] <gregaf> perplexed: osd bench is intended to be a filesystem+disk test, to make sure the OSDs are behaving
[21:34] <gregaf> as we saw above where somebody's cluster wasn't performing and it turned out the OSD could only maintain 5MB/s of writes
[21:34] <gregaf> so, useful, but for completely different things!
[21:35] <perplexed> Good to know
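(For comparison, the cluster-wide test gregaf contrasts with osd bench is rados bench, which pushes objects through the normal client-to-OSD path; the pool name and runtime here are arbitrary:)

    # 60-second write benchmark against the 'data' pool
    rados -p data bench 60 write
    # (seq and rand read modes are also available once objects have been written)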
[21:35] * vikasap (~vikasap@64.209.89.103) has joined #ceph
[21:36] * perplexed (~ncampbell@216.113.168.141) has left #ceph
[21:38] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[22:12] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[22:13] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit ()
[22:29] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:30] * al_ (quassel@niel.cx) Quit (Remote host closed the connection)
[22:33] * lofejndif (~lsqavnbok@othal.net) has joined #ceph
[22:33] * lofejndif (~lsqavnbok@othal.net) Quit (autokilled: This host may be infected. Mail support@oftc.net with questions. BOPM (2012-04-16 20:33:19))
[22:35] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[22:36] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[22:40] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) has joined #ceph
[22:40] * cattelan is now known as cattelan_away
[22:43] * lofejndif (~lsqavnbok@1RDAAAYAD.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:54] * adjohn (~adjohn@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[22:56] * cattelan_away is now known as cattelan
[23:00] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[23:03] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit ()
[23:03] * sagelap (~sage@12.199.7.82) has joined #ceph
[23:06] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[23:16] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) Quit (Read error: Connection reset by peer)
[23:22] <todin> how far does qemu or libvirt support rbd snapshots?
[23:23] <gregaf> todin: I'm afraid right now the only one keeping track of that is Josh, and he's out at OpenStack – I'd recommend emailing the list and CCing him (or I'll make sure he sees it)
[23:24] <todin> gregaf: ok, I saw that he did patches for qemu in jan/feb, and as far as I could see they are upstream, but somehow it does not work
[23:25] <todin> I will mail the error to the ml
[23:26] * adjohn (~adjohn@204-16-154-194-static.ipnetworksinc.net) Quit (Quit: adjohn)
[23:26] <todin> btw, the rbd cache has been working stably for the last few days ;-)
[23:26] <nhm> todin: good news. :D
[23:27] <todin> nhm: yep, and the performance gain is extreme
[23:27] <nhm> todin: What kind of difference are you seeing?
[23:29] <todin> nhm: a big increase in small writes: with the writeback window the vm could do around 1000 iops of 4k writes, with the rbd cache it's above 10k iops
[23:29] * rturk (~textual@12.199.7.82) has joined #ceph
[23:29] <gregaf> buffering does wonderful things ;)
[23:29] * loicd (~loic@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[23:29] <todin> yep
[23:29] <todin> the only thing I did not test is live migration with the cache
[23:29] <nhm> todin: excellent
[23:31] <todin> the next one I want to test is discard/trim
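(For anyone following along, the rbd cache todin is testing is typically enabled either in ceph.conf on the client side or per disk in the qemu rbd device string; a sketch, with the pool and image names made up:)

    # ceph.conf on the client
    [client]
        rbd cache = true

    # or inline in the qemu drive spec
    -drive format=rbd,file=rbd:mypool/myimage:rbd_cache=true,cache=writeback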
[23:45] * rturk (~textual@12.199.7.82) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[23:51] * adjohn (~adjohn@204-16-154-194-static.ipnetworksinc.net) has joined #ceph
[23:59] * sagelap (~sage@12.199.7.82) Quit (Quit: Leaving.)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.