#ceph IRC Log


IRC Log for 2010-08-18

Timestamps are in GMT/BST.

[1:14] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) Quit (Ping timeout: 480 seconds)
[1:43] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) has joined #ceph
[3:01] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[3:13] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[4:07] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[4:10] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit ()
[4:58] * sagelap (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[5:15] * sagelap (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit (Read error: Operation timed out)
[6:04] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) Quit (Remote host closed the connection)
[6:58] * f4m8_ is now known as f4m8
[8:32] * mtg (~mtg@vollkornmail.dbk-nb.de) has joined #ceph
[8:47] * allsystemsarego (~allsystem@ has joined #ceph
[9:06] * cowbar (edf0c0fd96@dagron.dreamhost.com) Quit (Remote host closed the connection)
[9:06] * cowbar (cfa287e760@dagron.dreamhost.com) has joined #ceph
[9:52] * Osso (osso@AMontsouris-755-1-31-136.w90-46.abo.wanadoo.fr) Quit (Quit: Osso)
[10:16] * Yoric (~David@ has joined #ceph
[12:18] <f4m8> OT: Does anyone here attend froscon (http://www.froscon.de/index.php?id=15&L=1&no_cache=1) this weekend?
[15:17] * neale_ (~neale@pool-173-71-192-200.clppva.fios.verizon.net) has joined #ceph
[15:37] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) has joined #ceph
[16:27] * mtg (~mtg@vollkornmail.dbk-nb.de) Quit (Quit: Leaving)
[16:47] * f4m8 is now known as f4m8_
[17:19] <wido> hi
[17:23] <wido> i'm playing around with qemu-kvm and seeing some weird behaviour. When running a VM with an ISO attached, where the ISO is on a Ceph filesystem, i see ceph/msgr 2 and 3 taking about 20% to 30% CPU
[17:23] <wido> while the ISO is not being used inside the VM
[17:23] <wido> it's only attached
[17:24] <wido> detaching the ISO resolves this, the CPU usage goes back to normal
[17:24] <wido> before i create an issue, what info is needed?
[17:51] * gregphone (~gregphone@ has joined #ceph
[18:14] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) Quit (Ping timeout: 480 seconds)
[18:49] <sagewk> hmm.. can you turn on librados debugging on the qemu process? 'debug ms = 1' and 'log file = /path/to/file' ....
[18:50] * gregphone (~gregphone@ Quit (Ping timeout: 480 seconds)
[18:50] <wido> debug librados too?
[18:50] <wido> debug librados = 20
[18:51] <wido> btw, the guest is running qemu-kvm with rbd
[18:51] <sagewk> probably not needed
[18:51] <wido> it's only the ISO of Ubuntu which is attached
[18:51] <sagewk> i just want to see what i/o it's doing
[18:51] <wido> ok. Right now i'm installing a second VM, will do the debug after that is finished
[18:52] <sagewk> ok cool
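For context, the debug settings sagewk suggests would go into the ceph.conf read by the qemu process. A minimal sketch; the [client] section name, file name, and log path are assumptions, not from the log:

```shell
# Sketch only: append the debug options sagewk mentions to a ceph.conf.
# The [client] section and the log path are placeholders.
cat >> ceph.conf <<'EOF'
[client]
    debug ms = 1                        ; log every messenger send/receive
    log file = /var/log/ceph/qemu.log   ; where the client writes its log
EOF
```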
[18:54] * Yoric (~David@ Quit (Quit: Yoric)
[19:10] * deksai (~chris@dsl093-003-018.det1.dsl.speakeasy.net) has joined #ceph
[19:20] * Osso (osso@AMontsouris-755-1-31-136.w90-46.abo.wanadoo.fr) has joined #ceph
[19:55] <wido> sagewk: can't reproduce it anymore, seems that after an unmount and remount of the Ceph filesystem this afternoon the problem went away
[19:57] <gregaf> did you actually do any file accesses on the ISO?
[19:57] <wido> no, it wasn't even mounted
[19:57] <wido> so it might be just coincidence
[19:58] <wido> but i'm 100% sure there were no I/O's to the Ceph filesystem at that point
[19:58] <gregaf> my initial WAG was that maybe there was an issue with the connection and the messengers were spinning around trying to do error recovery
[20:00] <wido> it's all connected to one switch and while this was going on, i was SSH'ing on the machines without any lag
[20:00] <wido> have to say, the unmount took a really long time
[20:00] <gregaf> I meant a Ceph issue, not an interface issue ;)
[20:01] <wido> ah, ok :)
[20:02] * MarkN (~nathan@ Quit (Ping timeout: 480 seconds)
[20:05] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[20:23] <wido> i'm playing around with qemu-kvm (rbd), but it seems setting cache=writeback doesn't make a difference. Anyone tested this?
[20:23] <wido> with cache off or on writeback i'm still getting about 74MB/sec write in a VM (virtio)
[20:23] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Remote host closed the connection)
[20:24] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[21:44] <sagewk> i don't think the writeback behavior will change something like throughput.... it'll affect sync latency
[22:01] <wido> oh, in other environments i saw a difference in those speeds too
[22:02] <sagewk> hmm
[22:02] <wido> tried tools like postmark, they didn't show any difference between writeback and "none"
[22:02] <sagewk> i'm not actually sure what those options do. it may be that they control the normal storage drivers (qcow2 or whatever) and we aren't changing our behavior
[22:05] <wido> that might be. i would personally like some writeback features, just to speed up some small writes
[22:06] <wido> http://www.linux-kvm.org/page/Tuning_KVM << "Storage"
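The cache= option under discussion is passed on qemu's -drive line. A hypothetical invocation, with the pool name, image name, memory size, and ISO path all placeholders:

```shell
# Hypothetical qemu-kvm invocation; rbd pool/image and ISO path are placeholders.
# cache=writeback is parsed by qemu's generic block layer, so whether the rbd
# backend actually changes behaviour is exactly the open question above.
qemu-kvm -m 1024 \
    -drive file=rbd:rbd/vm1,if=virtio,cache=writeback \
    -cdrom ubuntu-10.04-server-amd64.iso
```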
[22:06] <sagewk> in general, the vm's fs will take care of that, won't it? the only time it'll block on backend storage is when you do an fsync or something
[22:07] <wido> yes, that is true
[22:07] <wido> i have two VM's online right now via qemu-kvm, running pretty smooth
[22:07] <wido> but for example a Ubuntu install is pretty slow, especially when packages are unpacked, since those involve a lot of small writes
[22:08] <sagewk> is it doing lots of fsyncs or something?
[22:08] <wido> i think so, not really sure
[22:08] <wido> but since the journal writes are DSYNC too, you are hitting the raw disk performance
[22:08] <wido> no caching or so in the whole path
[22:09] <sagewk> right. well there can't be if you want "safe" to really mean "safe".
[22:10] <sagewk> you get a lower bound on latency because all writes are replicated and go all the way to disk
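The latency floor sagewk describes is easy to see with synchronous writes. A sketch using a temporary local file as a stand-in for a file on the Ceph mount:

```shell
# Stand-in for a file on the Ceph mount (a real test would target the mount).
# oflag=dsync makes every 4 KB write synchronous, so on Ceph each one would pay
# a full round trip: replicate, reach every replica's journal on disk, return.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=4k count=100 oflag=dsync 2>/dev/null
```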
[22:10] <wido> yes, here a SSD for journaling will really speed things up
[22:10] <sagewk> OTOH, those writes are to a sequential journal (regardless of how "seeky" they are) so you usually will get consistent latencies
[22:10] <sagewk> yes yes yes. :)
[22:11] <wido> right now i have stress and bonnie++ running for the night, i'll leave them on for about 12 hours
[22:11] <wido> see how everything holds up
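The overnight run wido describes would look roughly like this; the mount point and all options are guesses, not from the log:

```shell
# Guessed commands for the overnight run; /mnt/ceph and the options are
# assumptions. stress exercises CPU/IO/disk load; bonnie++ benchmarks the fs.
stress --cpu 2 --io 2 --hdd 1 --timeout 12h &
bonnie++ -d /mnt/ceph -u nobody
```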
[22:12] <sagewk> i've thought about trying to make the journaling code try to time the disk rotation and leave appropriately sized gaps between writes to lower the latency.. instead of consistently getting a latency of 1 rotation it'd get something very short.
[22:12] <sagewk> cool
[22:12] <wido> that would be cool, but i would recommend a SSD for now
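Pointing the OSD journal at an SSD is a small ceph.conf change. A sketch, with the device path and size as placeholders:

```shell
# Sketch: point the osd journal at an SSD partition in ceph.conf.
# The /dev/ssd1 path and the size value are placeholders.
cat >> ceph.conf <<'EOF'
[osd]
    osd journal = /dev/ssd1   ; raw SSD partition used as the journal
    osd journal size = 1000   ; journal size in MB (for file-backed journals)
EOF
```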
[22:12] <sagewk> :)
[22:13] <wido> i'm going afk, i'll let you know how it all held up
[22:13] <wido> ttyl
[22:16] <sagewk> 'night
[22:46] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[23:00] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.