#ceph IRC Log

IRC Log for 2011-03-07

Timestamps are in GMT/BST.

[0:26] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[0:33] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[0:34] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[0:43] * allsystemsarego (~allsystem@188.27.166.127) Quit (Quit: Leaving)
[1:03] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[1:51] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[2:08] * lxo (~aoliva@201.82.54.5) Quit (Read error: Connection reset by peer)
[2:22] * lxo (~aoliva@201.82.54.5) has joined #ceph
[9:03] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[9:25] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[9:37] * allsystemsarego (~allsystem@188.25.130.175) has joined #ceph
[10:15] * Yoric (~David@213.144.210.93) has joined #ceph
[10:31] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[10:31] * Yoric (~David@213.144.210.93) has joined #ceph
[10:40] * `gregorg` (~Greg@78.155.152.6) Quit (Quit: Quitte)
[10:53] * pombreda (~Administr@92.132.139.9) has joined #ceph
[10:56] * verwilst (~verwilst@router.begen1.office.netnoc.eu) has joined #ceph
[11:41] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[11:47] * Yoric_ (~David@213.144.210.93) has joined #ceph
[11:47] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[11:47] * Yoric_ is now known as Yoric
[12:10] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[12:14] * Yoric (~David@80.70.32.140) has joined #ceph
[12:26] * morse (~morse@supercomputing.univpm.it) Quit (Quit: Bye, see you soon)
[12:26] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[13:02] * pombreda (~Administr@92.132.139.9) Quit (Quit: Leaving.)
[13:05] * Yoric (~David@80.70.32.140) Quit (Quit: Yoric)
[13:06] * darktim (~andre@public-wlan.nine.ch) has joined #ceph
[13:30] * Yoric (~David@213.144.210.93) has joined #ceph
[13:44] * gregorg (~Greg@78.155.152.6) has joined #ceph
[14:14] * darktim (~andre@public-wlan.nine.ch) Quit (Remote host closed the connection)
[15:35] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[16:10] * Yoric_ (~David@80.70.32.140) has joined #ceph
[16:12] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[16:14] * Yoric (~David@213.144.210.93) Quit (Ping timeout: 480 seconds)
[16:14] * Yoric_ is now known as Yoric
[16:49] * Yoric_ (~David@213.144.210.93) has joined #ceph
[16:50] * Yoric (~David@80.70.32.140) Quit (Ping timeout: 480 seconds)
[16:50] * Yoric_ is now known as Yoric
[16:55] * Yoric_ (~David@213.144.210.93) has joined #ceph
[16:55] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[16:55] * Yoric_ is now known as Yoric
[16:57] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[17:27] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[17:39] * ana_ (~quassel@175.96.21.95.dynamic.jazztel.es) has joined #ceph
[17:47] <ana_> greagaf I tried tcmalloc instead
[17:48] <ana_> it was a little bit better... but still I think it consumed too much memory
[17:48] <ana_> before it was 4GB now it is like 3.5GB :(
[17:48] * Yoric_ (~David@213.144.210.93) has joined #ceph
[17:48] <ana_> oops, gregaf
[17:49] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[17:49] * Yoric_ is now known as Yoric
[17:51] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[17:51] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:52] * raso (~raso@debian-multimedia.org) Quit (Ping timeout: 480 seconds)
[17:55] * greglap (~Adium@166.205.138.215) has joined #ceph
[17:57] * Yoric (~David@213.144.210.93) Quit (Ping timeout: 480 seconds)
[18:04] <ana_> greglap, I tried tcmalloc instead
[18:04] <ana_> of ptmalloc
[18:05] <ana_> it was a little bit better, but still too much memory I think, around 3.5 GB
[18:06] <greglap> yeah, that is more than I'd expect
[18:06] <greglap> but still, that fits everything into main memory, right?
[18:06] <ana_> nope
[18:07] <ana_> it swaps
[18:07] <greglap> ah, bummer
[18:07] <ana_> before it used around 1GB of swap
[18:07] <ana_> no it is like half of that
[18:07] <ana_> yep
[18:07] <greglap> hmm, which would indicate it's actually using that memory, too
[18:08] <greglap> ugh, I just spent a long time looking at memory use and I'm not too eager to dive back into it
[18:08] <ana_> *now it is like half of that
[18:08] <ana_> hehe
[18:08] <ana_> so there's nothing I can do about that, right?
[18:08] <greglap> how large are the directories you're untarring?
[18:09] <ana_> not too large, just a linux source
[18:09] <ana_> kernel source
[18:09] <greglap> yeah, I don't think there's anything too big in there
[18:10] * Yoric (~David@213.144.210.93) has joined #ceph
[18:11] <greglap> my best off-the-cuff guess, though, is that there are simply too many inodes/dentries to fit into the amount of memory you've given your MDS
[18:12] * verwilst (~verwilst@router.begen1.office.netnoc.eu) Quit (Quit: Ex-Chat)
[18:12] <ana_> can I set up the amount of memory given to an MDS?
[18:12] <ana_> like in the config file or something like that?
[18:13] <greglap> well I meant physical memory in this case
[18:13] <greglap> there are also settings for how many inodes it's willing to cache
[18:14] <greglap> that's a configurable — mds cache size
[18:14] <greglap> it defaults to 100k
[18:14] <greglap> each dentry takes 1 or 2k
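The setting greglap names can be sketched as a ceph.conf fragment. This is only an illustration: the `[mds]` section name follows the usual ceph.conf layout, and 300000 is an arbitrary example value, not a recommendation from the chat.

```ini
; Sketch only: raise the MDS cache limit from its 100k default.
; At 1-2 KB per dentry (per the discussion above), 300k entries
; is very roughly 300-600 MB of cache memory.
[mds]
    mds cache size = 300000
```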
[18:14] <ana_> because in the config file I specify the path for the mon and mds
[18:14] <ana_> but nothing about the mds
[18:14] <ana_> sorry
[18:14] <greglap> but if the directories are too large then it might be that it's reading a directory off disk, making the change, and then immediately pushing them back out to disk again
[18:15] <ana_> for the mon and OSD
[18:15] <ana_> not mds
[18:15] <greglap> yeah, the MDS doesn't use any local disk storage
[18:16] <greglap> if you have your mds debugging up a little bit (at least 7) you can look in your mds log for a line that starts "trim max="
[18:16] <greglap> and that will tell you how many dentries it wants to keep in-memory and how many are currently there
[18:16] <greglap> if it's consistently a lot above the max then you've got bad behavior being caused by a collision between the workload's necessary cache size and the actual size of the cache
[18:21] <ana_> ohh okay. so I may try debugging mds a little bit more. thanks!
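The log check greglap describes can be sketched as a grep. The sample line below is hypothetical (the chat only says the line starts with "trim max="), and the real log path as well as the debug option name (`debug mds` in ceph.conf) are assumptions based on standard Ceph conventions.

```shell
# Hypothetical mds log line matching the "trim max=" wording above;
# on a real cluster you would grep the actual log, e.g.:
#   grep 'trim max=' /var/log/ceph/mds.*.log
line='mds0.cache trim max=100000  cur=183421'
# Pull out the configured maximum and the current dentry count;
# cur consistently far above max suggests the cache limit is too
# small for the workload, as greglap describes.
echo "$line" | grep -o 'max=[0-9]*'
echo "$line" | grep -o 'cur=[0-9]*'
```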
[18:33] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:37] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[18:40] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[18:40] * greglap (~Adium@166.205.138.215) Quit (Quit: Leaving.)
[18:58] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:00] * cmccabe (~cmccabe@208.80.64.121) has joined #ceph
[19:00] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[19:10] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has left #ceph
[19:10] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:58] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[20:00] <joshd> wido: you there?
[20:12] <Tv> cephbooter nfsroot is full again
[20:13] <wido> joshd: here
[20:14] <joshd> I'm trying to reproduce the librbd/qemu issue, just wanted to make sure you were running the librbd branch of our qemu-kvm repo
[20:15] <Tv> anyone using sepia or cosd, your run just borked
[20:17] <Tv> 20G ceph-peon64
[20:17] <Tv> 5.8G sepia
[20:18] <Tv> 14G ceph-peon64/c
[20:18] <Tv> 7.4G ceph-peon64/c/ceph6/iozone.sh
[20:18] <Tv> that'd be it
[20:19] <Tv> -rw-r----- 1 root root 7.4G 2011-03-03 13:02 ceph-peon64/c/ceph6/iozone.sh/.nfs00000000000fc002000000f6
[20:19] <Tv> is that a silly rename leftover?
[20:19] <Tv> 4 days old -> removing
[20:20] <Tv> # find ceph-peon64/c -name .nfs\* -mtime +1 -print0|xargs -0 du -csh --|grep total
[20:20] <Tv> 3.8G total
[20:20] <Tv> -> cleaning them all
[20:21] <Tv> and that got us 11GB of space
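Tv's cleanup above can be condensed into one find invocation. The path is specific to this cluster (a placeholder variable is used here), and `-delete` replaces the du/xargs measuring step with actual removal; NFS clients create these `.nfs*` files when an open file is unlinked ("silly rename"), and they linger if the client dies.

```shell
# Remove NFS silly-rename leftovers (.nfs*) untouched for more than
# a day. NFSROOT is a placeholder for the export root (the chat used
# ceph-peon64/c); -mtime +1 matches files modified over 24h ago.
NFSROOT=${NFSROOT:-ceph-peon64/c}
find "$NFSROOT" -name '.nfs*' -mtime +1 -print -delete
```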
[20:22] <wido> joshd: Yes, I'm running the librbd branch of your Qemu-KVM
[20:22] <wido> joshd: a686726803722a98428bad97c01cc08229f21bb8
[20:22] <wido> My librbd is from "next" with commit 46b01f4a78725642366eefe0658b368f959f45c8
[20:23] <wido> joshd: If you want access to my cluster, let me know
[20:23] <joshd> wido: thanks, those are the right versions
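The commit IDs wido pastes can be read straight out of a checkout; a minimal sketch, assuming you are inside the relevant working tree (the qemu-kvm or ceph clone):

```shell
# Print the branch and exact commit of the current checkout, so it
# can be compared against a known-good ID (e.g. a686726... above).
git rev-parse --abbrev-ref HEAD   # branch name, e.g. "librbd" or "next"
git rev-parse HEAD                # full 40-hex commit id
```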
[20:24] <wido> sagewk: fyi, the scrub didn't find any broken PG's
[20:24] <sagewk> wido: cool.
[20:25] <sagewk> tv: silly rename is still broken upstream
[20:25] <wido> sagewk: I'm trying to understand what went wrong: is it that when I added the 4 new OSDs, the CRUSH map changed so fast that the cluster thought the PGs should be on osd1, but they were actually on osd2?
[20:25] * Juul (~Juul@static.88-198-13-205.clients.your-server.de) has joined #ceph
[20:25] <Tv> sagewk: this is nfsroot not ceph
[20:26] <sagewk> yeah it's broken on nfs. (ceph doesn't sillyrename :)
[20:26] <Tv> ahh
[20:26] <Tv> i thought you meant the re-export case
[20:27] <Tv> sagewk: anyway, if we weren't building the new sepia-autotest boxes to not be nfsroot, i'd be bitching more about that disk filling up
[20:27] <sagewk> wido: it's a problem in the recovery code. it wasn't looking for replicas on the nodes that had them. i'm working on making my fix a bit more robust now.
[20:29] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[20:44] <Tv> sagewk: fyi there's now a fedora13 vm on ceph-kvm1, only behind NAT for now so use virt-manager to get on it, i haven't done any further work on it yet but anyone should be able to go in and install build-dependencies etc.. usual root password, same on the shared user account
[20:45] <sagewk> k
[20:45] <Tv> 64-bit, i can do 32 too if wanted
[22:15] * Juul (~Juul@static.88-198-13-205.clients.your-server.de) Quit (Quit: Leaving)
[22:35] * ana_ (~quassel@175.96.21.95.dynamic.jazztel.es) Quit (Remote host closed the connection)
[23:56] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.