#ceph IRC Log


IRC Log for 2010-08-25

Timestamps are in GMT/BST.

[0:19] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has left #ceph
[0:37] <gregaf> todinini: what version of ceph are you running?
[0:38] <gregaf> the current unstable just crashes when I try blogbench on it ;)
[1:05] <gregaf> todinini: what settings are you using for blogbench?
[1:06] <gregaf> I just did a run on my dev machine with the defaults and the cmds process is reporting 202MB real and 293MB virtual memory afterwards (and that's the peak I saw)
[1:44] * gregphone (~gregphone@ has joined #ceph
[2:05] * gregphone (~gregphone@ Quit (Quit: Rooms • iPhone IRC Client • http://www.roomsapp.mobi)
[2:05] * gregphone (~gregphone@ has joined #ceph
[2:45] * gregphone (~gregphone@ Quit (Quit: Rooms • iPhone IRC Client • http://www.roomsapp.mobi)
[3:12] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[3:12] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[3:40] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[3:46] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[3:57] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[3:58] * tjikkun (~tjikkun@195-240-122-237.ip.telfort.nl) has joined #ceph
[5:26] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit (Remote host closed the connection)
[5:26] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[6:44] * f4m8_ is now known as f4m8
[8:11] * tjikkun (~tjikkun@195-240-122-237.ip.telfort.nl) Quit (Ping timeout: 480 seconds)
[8:42] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[8:57] * Osso (osso@AMontsouris-755-1-5-251.w86-212.abo.wanadoo.fr) has joined #ceph
[9:31] * Osso (osso@AMontsouris-755-1-5-251.w86-212.abo.wanadoo.fr) Quit (Quit: Osso)
[9:31] * Osso (osso@AMontsouris-755-1-5-251.w86-212.abo.wanadoo.fr) has joined #ceph
[10:07] * allsystemsarego (~allsystem@ has joined #ceph
[10:13] * Yoric (~David@ has joined #ceph
[10:17] <todinini> gregaf: I use ceph version 0.22~rc (cdb8a98601ca85ddc345eae519c8e8fc25de253f)
[10:18] <todinini> gregaf: I use the default option, I just use blogbench -d /ceph
[10:18] <todinini> gregaf: which version do you run on your dev machine?
[13:54] * atg (~atg@please.dont.hacktheinter.net) Quit (Quit: No Ping reply in 180 seconds.)
[13:55] * atg (~atg@please.dont.hacktheinter.net) has joined #ceph
[13:59] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit (synthon.oftc.net weber.oftc.net)
[13:59] * yehudasa (~yehudasa@ip-66-33-206-8.dreamhost.com) Quit (synthon.oftc.net weber.oftc.net)
[14:02] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[14:02] * yehudasa (~yehudasa@ip-66-33-206-8.dreamhost.com) has joined #ceph
[14:07] * iggy (~iggy@theiggy.com) Quit (Remote host closed the connection)
[14:13] * iggy (~iggy@theiggy.com) has joined #ceph
[14:25] * Guest684 (quasselcor@bas11-montreal02-1128531598.dsl.bell.ca) Quit (synthon.oftc.net charm.oftc.net)
[14:25] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (synthon.oftc.net charm.oftc.net)
[14:25] * revstray (~rev@blue-labs.net) Quit (synthon.oftc.net charm.oftc.net)
[14:25] * pruby (~tim@leibniz.catalyst.net.nz) Quit (synthon.oftc.net charm.oftc.net)
[14:31] * Guest684 (quasselcor@bas11-montreal02-1128531598.dsl.bell.ca) has joined #ceph
[14:31] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[14:31] * revstray (~rev@blue-labs.net) has joined #ceph
[14:31] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[14:46] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Ping timeout: 480 seconds)
[14:47] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[14:54] * revstray_ (~rev@ has joined #ceph
[14:56] * revstray (~rev@blue-labs.net) Quit (Read error: Connection reset by peer)
[15:47] * f4m8 is now known as f4m8_
[17:51] * gregphone (~gregphone@ has joined #ceph
[18:39] * Yoric (~David@ Quit (Quit: Yoric)
[18:49] * gregphone (~gregphone@ Quit (Ping timeout: 480 seconds)
[19:03] * NoahWatkins (~jayhawk@waterdance.cse.ucsc.edu) Quit (Quit: leaving)
[19:03] * NoahWatkins (~jayhawk@waterdance.cse.ucsc.edu) has joined #ceph
[19:06] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[20:00] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) has joined #ceph
[20:48] <sagewk> wido: there?
[21:18] <wido> sagewk: yes
[21:21] <kblin_> say, what's the best way to get a simple test setup to work? a source build or installing from the packages?
[21:22] <wido> kblin_: packages are pretty simple
[21:22] <kblin_> yeah, I've built and installed those, but the "quick test setup
[21:22] <wido> i would recommend that for just the simple tests, when you want to do more severe testing, building from source is recommended due to the high rate of development
[21:23] <kblin_> " page on the wiki describes the source build
[21:23] <wido> there is a "vstart.sh" script which gives you a cluster in a few minutes, never used it though
[21:23] * kblin_ is now known as kblin
[21:33] <kblin> I guess I still have a more basic problem, it seems like the ubuntu 10.10 kernel can't do user_xattr for btrfs root file systems
[21:34] <wido> can you actually boot from btrfs yet with 10.10?
[21:35] <kblin> /boot is an ext2 :)
[21:39] <wido> ah, ok :) Well, with 10.04 you even have to make some initrd adjustments to mount a multiple-device fs on boot
[21:42] <kblin> ok, given that this is just a VM I can nuke whenever I feel like it, what are you folks using to get decent btrfs support?
[21:42] <kblin> ceph seems to need that, after all
[21:44] <wido> i'm having my / on ext4 and only the Ceph data on btrfs
[21:45] <wido> using backported 2.6.35 kernels from Ubuntu's kernel-ppa, running on 10.04
[21:45] <kblin> but if I can't enable xattr on / I'm not sure xattr on /data would be different
[21:47] <kblin> hm, looks like there's a kernel update, let's try that
[21:53] <wido> why do you need xattr for Ceph? that's only needed when using ext4
[21:55] <kblin> I'm just saying what I saw on the wiki
[21:56] <kblin> "You'll need xattr support."
[21:58] <wido> oh, i've never used it. Not sure it is needed.
[22:00] <kblin> ok
[22:01] <gregaf> you need xattrs on the data store device
[22:01] <gregaf> you might not for the journals
[22:03] <wido> do you also need xattr with btrfs? Since the init script doesn't use it as mount option when you specify "btrfs devs"
[22:03] <gregaf> it might be on btrfs by default? not sure
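Editorial note: the open question above (whether user xattrs work on a given data directory) can be checked empirically. The following is a minimal sketch, not part of the original conversation. It assumes Linux and Python 3, where `os.setxattr` and `os.getxattr` are available; the function name `supports_user_xattr` and the attribute name `user.xattr_probe` are hypothetical.

```python
import os
import tempfile


def supports_user_xattr(directory):
    """Return True if a user.* xattr can be set and read back on a file in `directory`.

    Returns False when the attribute cannot be set, or when this platform
    does not expose os.setxattr at all (in that case the result only means
    "could not verify", not "unsupported").
    """
    if not hasattr(os, "setxattr"):
        return False
    # Create a scratch file on the filesystem being probed.
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.setxattr(path, "user.xattr_probe", b"1")
        return os.getxattr(path, "user.xattr_probe") == b"1"
    except OSError:
        # Typically ENOTSUP/EOPNOTSUPP when user xattrs are not enabled.
        return False
    finally:
        os.unlink(path)


if __name__ == "__main__":
    # Probe the system temp directory; point this at the OSD data
    # directory (hypothetical path such as /data) to test that filesystem.
    print(supports_user_xattr(tempfile.gettempdir()))
```

Pointing the probe at the intended 'osd data' directory, rather than at /, answers the question for the filesystem that actually matters.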
[22:06] * kblin builds ceph again to be able to try the vstart script
[22:07] <kblin> should have used ccache
[22:10] * MarkN (~nathan@ Quit (Ping timeout: 480 seconds)
[22:12] <todinini> if I use the rbd there is no locking if I mount it on different nodes?
[22:13] <todinini> only the ceph fs has locking via the osds?
[22:14] * MarkN (~nathan@ has joined #ceph
[22:15] <gregaf> todinini: no locking at all
[22:15] <gregaf> you really don't want to mount an image on more than one node at once unless you've got some complicated locking stuff running in the guest
[22:15] <kblin> wait, the btrfs devs needs a separate partition?
[22:16] <wido> gregaf: using gfs or ocfs2 should be possible, right?
[22:16] <sagewk> kblin: no. if you specify btrfs devs then mkcephfs will format and the init script will mount for you. if you don't, it won't do any of that, and will just look in the 'osd data' dir
[22:16] <sagewk> wido: yeah
[22:17] <wido> sagewk: you asked me if i was there? well, here i am
[22:19] <sagewk> yeah, i'm wondering where the non-ceph copy of the static stuff is so i can compare files?
[22:19] <kblin> sagewk: ah, ok
[22:20] <kblin> I'll try and remember that for my next test VM and try it
[22:20] <wido> sagewk: it's on the logger in /srv/ceph/iso
[22:20] <wido> i haven't got all the static data local, i fetched a few ISOs to compare it
[22:20] <wido> as you can see, i've mirrored Ubuntu, Debian and kernel.org
[22:21] <wido> all the Ubuntu ISOs are on our local mirror at: http://nl3.releases.ubuntu.com/
[22:21] <sagewk> k thanks
[22:29] <todinini> gregaf: I wanted to mount it on a different node ro, to make the backup of the guest there
[22:30] <gregaf> ah, yeah, readonly works just fine, but your backup might not be consistent if the guest is doing writes during the backup
[22:31] <todinini> gregaf: yep, but we cannot shutdown the guest.
[22:32] <gregaf> there are various ways to freeze IO, though, if you have that kind of access to them
[22:33] <todinini> gregaf: nah, we are an isp and the guests are under the control of our customers, but we want to make backups for them.
[22:33] <wido> todinini: snapshot them?
[22:35] <todinini> wido: we want a backup outside of the ceph cluster, just in case something goes wrong
[22:35] <sagewk> todinini: snapshot the rbd image, mount the snapshot ro somewhere else, and back that up.
[22:35] <gregaf> a naive snapshot will have the same issue with fs consistency as just copying it
[22:36] <sagewk> you might also try the export function of the rbd tool
[22:36] * eternaleye (~eternaley@ has joined #ceph
[22:36] <wido> what gregaf is pointing out is the cache at the fs, which will not be flushed to the RBD image
[22:36] <todinini> what kind of targets format does it support?
[22:36] <wido> you would have to invoke sync before the backup
[22:36] <sagewk> on the kvm host, you should use virsh (assuming you're using libvirt) to create the snapshot.
[22:37] <wido> xfs for example has xfs_freeze and xfs_unfreeze, so you can create a consistent backup
[22:37] <wido> i've got some Ubuntu packages for libvirt and qemu-kvm with RBD support
[22:37] <sagewk> ideally, you would xfs_freeze on the guest so that the snapshot is consistent, but failing that, you get consistency equivalent to a crash.
[22:37] <wido> i've patched the Ubuntu 10.04 packages with RBD support
[22:37] <gregaf> I guess it wouldn't be any different than just losing power, though, so any issues could probably be fixed just by an fsck
[22:38] <gregaf> and if you're doing backups of unmanaged servers that's really more than they have a right to ask for ;)
[22:38] <wido> gregaf: most of them will, i always keep the rule that data written the last 30 seconds is not safe
[22:38] <todinini> sagewk: the problem is we dont have the cpu time on the kvm hosts, on the backup host we use dedicated compression and encryption cards
[22:38] <gregaf> snapshots are pretty much free to make, though
[22:38] * eternale1e (eternaleye@bach.exherbo.org) Quit (Quit: leaving)
[22:38] <gregaf> so make the snapshot on the host and then mount it on your backup machine
[22:39] <gregaf> it's probably less expensive for your kvm host than a 4MB write is
[22:39] <todinini> gregaf: that sounds good, the time window for changes in the fs is much shorter that way
[22:40] <gregaf> yeah
[22:40] <gregaf> and you can delete the snapshot once done if you like
[22:40] <todinini> I will give that a try tomorrow
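Editorial note: the backup flow suggested above (snapshot the image, read the snapshot elsewhere, delete it afterwards) can be sketched as a short command plan. This is an illustration only, not part of the original conversation. It uses the `rbd export` route sagewk mentions instead of a read-only mount. The `pool/image@snap` syntax follows the current rbd command-line tool and is an assumption here; the 0.22-era tool discussed in this log may have used different arguments. The function name, pool, image, snapshot, and destination path are all hypothetical. The sketch only builds the command lines and does not run them.

```python
def snapshot_backup_commands(pool, image, snap, dest):
    """Build the rbd command lines for a snapshot-based backup.

    Assumed modern rbd CLI syntax (pool/image@snap); nothing is executed.
    """
    spec = f"{pool}/{image}@{snap}"
    return [
        # 1. Take a point-in-time snapshot (crash-consistent unless the
        #    guest filesystem was frozen or synced first).
        ["rbd", "snap", "create", spec],
        # 2. Export the snapshot to a file outside the cluster.
        ["rbd", "export", spec, dest],
        # 3. Remove the snapshot once the backup is complete.
        ["rbd", "snap", "rm", spec],
    ]


if __name__ == "__main__":
    # Hypothetical names, printed for review instead of being executed.
    for cmd in snapshot_backup_commands(
        "rbd", "guest01", "backup-2010-08-25", "/backup/guest01.img"
    ):
        print(" ".join(cmd))
```

Keeping the plan as data makes it easy to review or log before handing each entry to something like `subprocess.run` on the backup host.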
[22:41] <wido> about rbd, i still have some issues with it, like i reported in #377. Although the RBD class should be loaded correctly, it doesn't seem to do so
[22:41] <wido> i still have to run cclass -a and ceph class list
[22:41] <todinini> me too
[22:41] <wido> to propagate it through the cluster
[22:42] <wido> todinini: Ubuntu 10.04?
[22:42] <todinini> wido: yep
[22:42] <wido> todinini: http://pcx.apt-get.eu/ubuntu/dists/lucid/unofficial/binary-amd64/Packages
[22:43] <wido> deb http://pcx.apt-get.eu/ubuntu lucid unofficial
[22:43] <wido> patched libvirt and qemu-kvm packages with RBD
[22:44] <todinini> cool, thanks, did you build them?
[22:44] <wido> yes
[22:44] <kblin> meh, cmon dumps core on me
[22:44] <wido> if you need the key: http://www.apt-get.eu/pcextreme_archive_key.asc
[22:45] <kblin> bet I'm still doing something wrong
[22:45] <wido> kblin: it should not dump its core
[22:45] <wido> what are you trying?
[22:45] <todinini> wido: I had problems building them, the rules file wasn't correct, do you know a nice how-to?
[22:47] <wido> todinini: i don't have a nice how-to, just some trial and error to get it working. Didn't touch the rules, just added some patches
[22:47] <kblin> wido: ./cmon -i 0 -c ./ceph.conf
[22:48] <todinini> so you got a source package, and put your patches in the patches folder?
[22:48] <wido> todinini: yes, and modified the changelog to up the version number
[22:48] <wido> kblin: it then dumps? without doing anything?
[22:48] <kblin> terminate called after throwing an instance of 'std::logic_error'
[22:48] <kblin> what(): basic_string::_S_construct NULL not valid
[22:49] <kblin> is the output
[22:49] <sagewk> do you have a core file?
[22:49] <sagewk> a backtrace from gdb should be enough to find the problem
[22:49] <wido> hmm, i guess the devs need the core, the binary and your logs to find the bug, since it should not crash :)
[22:50] <kblin> sagewk: hang on, had to change ulimit settings
[22:50] <kblin> there we go
[22:51] <kblin> hm, maybe that filesystem is corrupt
[22:52] <kblin> git fetch is dying as well
[22:52] <kblin> I can upload the core file if you think that'll help, but I'll go do a fsck first
[22:53] <todinini> gregaf: I retested the mds with blogbench and now it works fine, I don't know what went wrong before
[22:54] <sagewk> kblin: better if you can gdb /usr/bin/cmon core and pastebin the result of 'bt'
[22:54] <gregaf> todinini: same version of the executable?
[22:54] <gregaf> I tried it on my local machine and wasn't getting too much memory usage out of it, although it did vary quite a bit across runs
[22:55] <todinini> gregaf: yep, but I think the mds wasn't restarted, so the old binary was still running in memory, but I cannot verify that now
[22:55] <kblin> sagewk: sure thing
[22:55] <gregaf> ah, that might be it, I didn't try without tcmalloc (or anything older than that commit you gave me)
[22:55] <kblin> sagewk: will take a few minutes, just fetched the latest git and rebuilding that to see if it still happens
[22:56] <todinini> gregaf: I just wanted to give you feedback
[22:56] <gregaf> I appreciate it
[22:56] <kblin> old checkout was a couple of days old, you never know
[23:04] <wido> is there a mechanism for checking if an OSD is still consistent? For example, when removing a pg directory, it's not re-created again
[23:05] <wido> or is that still for milestone 1.0? The OSD fsc?
[23:05] <wido> fsck
[23:08] <kblin> sagewk: http://pastey.net/140032
[23:08] <kblin> sagewk: afraid I just got a SIGWIFE, though, be back tomorrow
[23:09] <sagewk> looks like you didn't run mkcephfs?
[23:09] <sagewk> or have a bad mon data dir configured
[23:09] <sagewk> it failed to read the $mon_data/magic file
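Editorial note: the diagnosis above (cmon crashing because it could not read the `magic` file in the mon data directory) suggests a simple pre-flight check before starting the monitor. The following is a minimal sketch, not part of the original conversation. Only the `magic` file name comes from the log; the function name `mon_data_problems` and the example directory are hypothetical, and the check says nothing about whether the file's contents are valid.

```python
import os
import tempfile


def mon_data_problems(mon_data):
    """Return a list of human-readable problems found in a mon data dir.

    An empty list means the directory exists and contains a readable
    `magic` file; the contents of the file are not inspected.
    """
    problems = []
    if not os.path.isdir(mon_data):
        problems.append(f"{mon_data}: not a directory (was mkcephfs run?)")
        return problems
    magic = os.path.join(mon_data, "magic")
    if not os.path.isfile(magic):
        problems.append(f"{magic}: missing (was mkcephfs run?)")
    elif not os.access(magic, os.R_OK):
        problems.append(f"{magic}: not readable by this user")
    return problems


if __name__ == "__main__":
    # Demonstrate on a scratch directory standing in for the mon data dir.
    with tempfile.TemporaryDirectory() as scratch:
        print(mon_data_problems(scratch))  # reports the missing magic file
        with open(os.path.join(scratch, "magic"), "w") as fh:
            fh.write("placeholder\n")
        print(mon_data_problems(scratch))  # no problems reported
```

Running a check like this against the configured mon data path distinguishes the two causes sagewk lists: a cluster that was never initialized and a path that points at the wrong directory.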
[23:21] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[23:22] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[23:39] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.