#ceph IRC Log


IRC Log for 2011-10-26

Timestamps are in GMT/BST.

[0:06] <stingray> okay
[0:06] <stingray> so it compiled and I deployed it
[0:06] <stingray> let's see if it'll work
[0:09] <stingray> yay, it no longer hangs there
[0:09] <stingray> now to why mds doesn't replay
[0:18] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[0:24] * cp (~cp@ Quit (Quit: cp)
[0:26] * cp (~cp@ has joined #ceph
[0:29] <stingray> now I have 18 crashed+peering, 4 crashed+down+peering that doesn't go away
[0:29] * stingray gonna fix it tomorrow
[0:59] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[0:59] * gregaf1 (~Adium@aon.hq.newdream.net) Quit (Read error: Connection reset by peer)
[1:10] <sagewk> hmm anyone remember Alexandre Oliva's nick?
[1:12] <joshd> sagewk: lxo
[1:15] * fronlius (~Adium@f054098210.adsl.alicedsl.de) Quit (Quit: Leaving.)
[1:26] * adjohn (~adjohn@ Quit (Remote host closed the connection)
[1:26] * adjohn (~adjohn@ has joined #ceph
[1:35] * adjohn is now known as Guest14756
[1:35] * adjohn (~adjohn@ has joined #ceph
[1:38] * Guest14756 (~adjohn@ Quit (Ping timeout: 480 seconds)
[1:44] <sagewk> joshd: thanks
[1:44] <sagewk> lxo: had a chance to look at #1435? http://tracker.newdream.net/issues/1435
[1:54] * cp (~cp@ Quit (Quit: cp)
[2:04] * adjohn (~adjohn@ Quit (Read error: Connection reset by peer)
[2:06] * adjohn (~adjohn@ has joined #ceph
[2:18] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[2:20] <lxo> sagewk, sorry, no, but I'm getting the impression that any mds take-over can reset the layout for a directory, at least for a ceph.ko mount. but I'm trying to recover from another ceph error right now
[2:47] * maswan (maswan@kennedy.acc.umu.se) Quit (Ping timeout: 480 seconds)
[2:48] * maswan (~maswan@kennedy.acc.umu.se) has joined #ceph
[3:13] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) Quit (Quit: This computer has gone to sleep)
[3:28] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[3:52] * adjohn (~adjohn@ Quit (Quit: adjohn)
[3:58] * cp (~cp@ has joined #ceph
[3:58] * cp (~cp@ Quit ()
[4:54] * tserong (~tserong@124-168-227-136.dyn.iinet.net.au) has joined #ceph
[5:21] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) has joined #ceph
[6:05] * sandeen_ (~sandeen@sandeen.net) Quit (Quit: This computer has gone to sleep)
[6:16] * sandeen_ (~sandeen@sandeen.net) has joined #ceph
[6:39] * sandeen_ (~sandeen@sandeen.net) Quit (Quit: This computer has gone to sleep)
[6:55] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[6:59] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) has joined #ceph
[7:55] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[8:17] <NaioN> stingray: could you send me the patch, I would like to give it a try also...
[8:20] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) has joined #ceph
[8:28] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[8:36] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) has joined #ceph
[9:14] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[10:03] * fronlius (~Adium@f054109039.adsl.alicedsl.de) has joined #ceph
[10:04] * fronlius (~Adium@f054109039.adsl.alicedsl.de) Quit ()
[10:21] * fronlius (~Adium@testing78.jimdo-server.com) has joined #ceph
[11:12] <stingray> NaioN:
[11:14] <stingray> sagewk: http://pastebin.com/tqmVhzKy
[11:21] * fronlius1 (~Adium@testing78.jimdo-server.com) has joined #ceph
[11:21] * fronlius (~Adium@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[11:23] * fronlius (~Adium@p578b21b6.dip0.t-ipconnect.de) has joined #ceph
[11:26] * fronlius1 (~Adium@testing78.jimdo-server.com) Quit (Read error: No route to host)
[11:26] * fronlius2 (~Adium@testing78.jimdo-server.com) has joined #ceph
[11:33] * fronlius (~Adium@p578b21b6.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[11:52] <psomas> stingray: this is for debugging only, right? because i think the code assumes that it'll find the right entry in the log, so if you reach the end and you haven't found it, you should probably do an assert(0) or sth, or else it'll return an incorrect entry to the caller
[11:55] * FoxMURDER (~fox@ip-89-176-11-254.net.upcbroadband.cz) has joined #ceph
[12:00] <NaioN> psomas: what happens if you reach the beginning or end of the log?
[12:02] <psomas> well, i'm not sure about the 'semantics'...but from what i understand, the caller wants to find an entry in the log, if you reach the beginning/end of the log, the entry is not there(?) for some reason, and if you stop the while loop when you reach the bounds of the log, you return that entry to the caller, which shouldn't be correct
[12:02] <psomas> i think the caller is build_incr_scrub_map or sth
[12:02] <NaioN> Yes I agree with that
[12:03] <NaioN> but what happens in the current code if you don't find the entry?
[12:04] <psomas> as i said i'm not familiar with the code, so i really don't know, but i guess (wild guess) you're going to have a problem with build_incr_scrub_map or sth, it's not what you originally wanted
[12:05] <psomas> void PG::build_inc_scrub_map(ScrubMap &map, eversion_t v)
[12:06] <psomas> build a summary of pg content changed starting after v
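
The point psomas raises above can be sketched in a few lines. This is a minimal illustration, not the patch from the pastebin (which is not reproduced in this log) and not Ceph's C++ code: `LogEntry`, `entries_after`, and the integer `version` field are hypothetical stand-ins for the PG log and `eversion_t`. It assumes the log is ordered oldest to newest and shows a bounded search that fails loudly when the version is missing, rather than returning whatever entry sits at the boundary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogEntry:
    version: int  # hypothetical stand-in for Ceph's eversion_t
    op: str


def entries_after(log: List[LogEntry], v: int) -> List[LogEntry]:
    """Return the entries that follow version v in a log ordered oldest to newest."""
    i = len(log) - 1
    # Walk backwards from the newest entry, never stepping past the start of the log.
    while i >= 0 and log[i].version != v:
        i -= 1
    if i < 0:
        # Reaching the bound means v is not in the log. Fail loudly instead of
        # handing the caller the entry that happens to sit at the boundary.
        raise LookupError(f"version {v} not found in log")
    return log[i + 1:]
```

Raising here plays the role of the assert(0) suggested above: the caller either gets the entries after v or learns that v was never found.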
[12:08] <bugoff> 26
[13:17] <NaioN> stingray:
[13:17] <NaioN> psomas:
[13:17] <NaioN> http://tracker.newdream.net/issues/1624
[13:17] <NaioN> seems that sage got him too
[13:55] * tserong (~tserong@124-168-227-136.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[14:04] * tserong (~tserong@58-6-103-bcast.dyn.iinet.net.au) has joined #ceph
[14:30] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:36] <stingray> NaioN: the description is "fixed in previous patch"
[15:36] <stingray> if only I can find at least the current patch
[15:36] <stingray> 2b3bdea9f7bcf9e9f8d4328f62d82ff43e996b3a
[15:36] <stingray> ah
[15:50] <stingray> I fail to understand the link but I'll accept that
[15:55] * sandeen_ (~sandeen@sandeen.net) has joined #ceph
[16:17] * gregorg_taf (~Greg@ Quit (Ping timeout: 480 seconds)
[16:34] * gregorg (~Greg@ has joined #ceph
[16:35] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[16:36] * ognatortcele (~ognatortc@ has joined #ceph
[16:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[16:49] <NaioN> stingray: I think the previous patch solved it
[16:49] <NaioN> We are now compiling the 0.37 with that patch and give it a try
[16:50] * gregorg (~Greg@ Quit (Quit: Quitte)
[16:56] <stingray> NaioN: which "previous patch" ?
[16:56] <NaioN> commit:2b3bdea9f7bcf9e9f8d4328f62d82ff43e996b3a
[16:56] <NaioN> going to chalk these up to the infinite loop fixed in that previous patch.
[16:56] <NaioN> from the bug page
[16:56] <NaioN> http://tracker.newdream.net/issues/1624
[17:04] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) has joined #ceph
[17:08] <stingray> this commit I just built with
[17:08] <stingray> and the system is deploying it
[17:08] <stingray> now
[17:12] * adjohn (~adjohn@70-36-139-78.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[17:24] <ajm> does something like
[17:24] <ajm> LD_PRELOAD=/usr/lib64/libtcmalloc.so.0.0.0 CEPH_HEAP_PROFILER_INIT=1 ceph-osd ...
[17:24] <ajm> seem sane to do memory profiling? i'm not seeing any output from that in /var/log/ceph/...
[17:37] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:05] * adjohn (~adjohn@ has joined #ceph
[18:06] * adjohn (~adjohn@ Quit (Remote host closed the connection)
[18:06] * adjohn (~adjohn@ has joined #ceph
[18:20] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[18:23] * Tv (~Tv|work@aon.hq.newdream.net) has joined #ceph
[18:29] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[18:33] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[18:41] * bchrisman (~Adium@ has joined #ceph
[18:45] * tserong (~tserong@58-6-103-bcast.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[18:48] <gregaf> ajm: if you built with tcmalloc (that's the default) you shouldn't need to LD_PRELOAD it
[18:48] <gregaf> if you didn't build with tcmalloc I don't think any of Ceph's hooks for it are going to work since they won't be compiled in
[18:50] * adjohn is now known as Guest14852
[18:50] * adjohn (~adjohn@ has joined #ceph
[18:54] <ajm> gregaf: didn't have tcmalloc installed at the time, i'll rebuild, their docs suggested it'd just use tcmalloc like that (which I guess it does) it just doesn't have the CEPH stuff then
[18:54] * Guest14852 (~adjohn@ Quit (Ping timeout: 480 seconds)
[18:54] * tserong (~tserong@58-6-131-7.dyn.iinet.net.au) has joined #ceph
[18:54] <gregaf> ajm: so yeah, you can just use it that way, but I don't remember the configuration bits you need to have it go on start-up
[18:54] <gregaf> (there are some, just need to look them up!)
[18:55] <gregaf> Ceph selectively compiles its control bits based on whether it's being built with tcmalloc or not so things like CEPH_ env variables and the control commands won't work
[18:56] <ajm> makes sense
[18:57] <ajm> oh wonderful, gentoo ebuild specifies --without-tcmalloc without a use flag to enable it
[18:57] <gregaf> ajm: looks like you want to use LD_PRELOAD=/usr/lib64/libtcmalloc.so.0.0.0 HEAPPROFILE=/path/to/dump/to ceph-osd …
[18:58] <ajm> is there a real benefit from using tcmalloc generally ?
[18:58] <gregaf> and then it'll use the default HEAP_PROFILE_ALLOCATION_INTERVAL and HEAP_PROFILE_INUSE_INTERVAL of 1 GB of allocations and 100 MB of higher in-use heap
[18:59] <gregaf> ajm: in our testing it results in lower peak memory use
[18:59] <gregaf> it seems to handle Ceph's highly-threaded allocations and deallocations better and produce less memory fragmentation
[19:00] <ajm> ok, I'm patching ebuilds, perhaps I'll submit them
[19:00] <gregaf> the .debs and such that we produce all turn it on by default :)
[19:00] <ajm> but... i'd have to use debian then :D
[19:00] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:01] <gregaf> but it's not enabled for Red Hat derivatives since they don't include the google-perftools due to issues with some of the other pieces under x64
[19:01] <gregaf> dunno where the gentoo stuff came from or why it's turned off there
[19:07] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[19:10] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[19:15] <ajm> gregaf: "Starting tracking the heap"
[19:15] <ajm> looks good
[19:22] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[19:25] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[19:27] <ajm> gregaf: how frequently should this be dumping data into HEAPPROFILE ?
[19:40] <gregaf> ajm: it's triggered based on memory allocations
[19:41] <gregaf> the defaults (configurable with those env vars I mentioned) are that it dumps after every gigabyte of allocations or whenever the in-use heap increases by 100MB
[19:45] * fronlius2 (~Adium@testing78.jimdo-server.com) Quit (Quit: Leaving.)
[19:45] <ajm> hrmmmmm
[19:45] <ajm> did my memory issue disappear with tcmalloc
[19:45] <gregaf> it might have — that's why we switched the defaults… ;)
[19:45] <ajm> well damn
[19:47] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[19:49] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[19:51] <ajm> interesting, ceph-osd is looping over the same directory repeatedly (if I look at what it's doing with strace)
[19:56] <ajm> http://pastebin.com/R3spnhBf
[19:56] <ajm> it's just looping over the same pg
[19:59] <gregaf> mmm, log bound mismatches!
[19:59] <ajm> i see a bunch of those, in other pg's too but with no ill effects
[20:00] <gregaf> sagewk: joshd: what are the implications of log bound mismatches again?
[20:05] * phil__ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[20:06] <joshd> gregaf: I think we concluded that the last time it happened it was due to ignoring errors from the underlying fs, which was fixed in b5c606230f7e002115d3b86e64dac9dbb4ffedef, that wasn't in 0.37
[20:06] * phil__ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[20:06] <gregaf> joshd: okay
[20:07] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) has joined #ceph
[20:07] <gregaf> I don't really know what the actual error message means or what external behavior it could manifest (large memory consumption?), is why I ask
[20:07] <ajm> well, with tcmalloc that issue went away
[20:08] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) Quit ()
[20:08] <ajm> i think the memory usage was due to it looping over this pg
[20:08] * fronlius (~Adium@f054109039.adsl.alicedsl.de) has joined #ceph
[20:08] <ajm> so tcmalloc fixed the memory issue, but not the underlying problem
[20:08] <gregaf> I wonder if that was the problem for everybody who complained about memory use; come to think of it, I think the rest might all have been on Red Hat derivatives
[20:08] <gregaf> that's good to know
[20:09] <gregaf> ajm: try that patch Josh mentioned and see if it fixes it?
[20:09] <ajm> sure
[20:10] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[20:10] <joshd> ajm: if you're not using btrfs, you might need 3e92aace21ecc766f14ac5a2c6377570988f1a3b too
[20:10] <ajm> i am, on this node
[20:10] <ajm> will anything horrible happen if i just use the latest git?
[20:12] <gregaf> oh, maybe ;)
[20:13] <gregaf> actually you probably shouldn't
[20:13] <ajm> "maybe" was enough ;)
[20:13] <gregaf> sagewk made some changes to encoding and I'm not sure how tested they are at playing nicely with old stuff
[20:15] <sagewk> ajm, gregaf: there were several bugs that were fixed 2 days ago, i forget exactly which ones were leading to the log bound mismatches
[20:15] <joshd> stingray: the problem you're hitting is still a bug - http://tracker.newdream.net/issues/1530
[20:15] <ajm> sagewk: any idea if the log bound mismatch is the underlying issue though?
[20:16] <sagewk> ajm: i was pretty careful about making the encoding changes backwards compatible over the wire, so you should be ok. you just won't be able to go back to an old version once you roll forward and the new encodings get written to disk
[20:16] <joshd> stingray: just ran into it on latest master
[20:17] <sagewk> ajm: oh, that's not a bug there, actually.. it's reading the log during startup and because debug is cranked up and the log is half-read you see those messages.
[20:17] <ajm> oh ok
[20:17] <ajm> so any idea why it won't startup then :)
[20:18] <sagewk> it may just be taking a really long time because of the quantity of debug output. if you set debug osd = 10 you won't get the per-entry messages and it'll go(/eat ram) faster
[20:19] <ajm> it never starts up even without debug on
[20:19] <sagewk> yehudasa, tv: any idea why sheng qiu would get EACCES on a file that is already open and previously written to?
[20:19] <ajm> if I strace the ceph process it's looping over reading the same pg from disk
[20:19] <Tv> sagewk: all i can think of was selinux, apparmor, etc, but i still don't see why it'd fail in the middle
[20:19] <sagewk> ajm: from that log snippet it's making progress, though...
[20:20] <yehudasa> sagewk: might be that the underlying block device returns that for some reason
[20:20] <ajm> sagewk: it seems to reread the same files many many times, that's normal ?
[20:20] <sagewk> ajm: oh, it should only read each file once.
[20:20] <ajm> well the same directories, open(), getdents(), close()
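
One way to confirm the pattern ajm describes is to count how often each path is opened in a captured strace log. The sketch below is a hypothetical helper, not a Ceph tool: the function name `repeated_opens` and the regular expression are assumptions, and it expects one syscall per line in strace's usual `open("path", flags) = fd` or `openat(AT_FDCWD, "path", flags) = fd` form. It counts failed opens as well, which is acceptable for spotting a loop.

```python
import re
from collections import Counter

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...) lines from strace.
OPEN_RE = re.compile(r'\bopen(?:at)?\([^"]*"([^"]+)"')


def repeated_opens(lines, threshold=2):
    """Return {path: count} for every path opened at least `threshold` times."""
    counts = Counter()
    for line in lines:
        match = OPEN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return {path: n for path, n in counts.items() if n >= threshold}
```

If sagewk's expectation holds and each file and directory is read once at startup, the result should be empty or nearly so; a PG directory with a large count would point at the loop.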
[20:21] <sagewk> ajm: ditto for dirs. can you post a full log that captures it?
[20:21] <ajm> sure
[20:22] <Tv> sagewk: i see mentions of NFS being able to cause that... :-/
[20:22] <yehudasa> Tv, sagewk: he did say that the file is on separate device from the osd
[20:23] <yehudasa> we can ask for more details, might shed a better light
[20:24] <sagewk> replied
[20:25] * fronlius (~Adium@f054109039.adsl.alicedsl.de) Quit (Quit: Leaving.)
[20:30] * nwatkins (~nwatkins@kyoto.soe.ucsc.edu) has joined #ceph
[20:36] * nwatkins (~nwatkins@kyoto.soe.ucsc.edu) has left #ceph
[20:37] * nwatkins (~nwatkins@kyoto.soe.ucsc.edu) has joined #ceph
[20:39] <nwatkins> Is anyone else seeing poor response time on the wiki? I am also seeing many MySQL connection errors.
[20:39] <Tv> nwatkins: the database behind it has been problematic; it usually comes back quick
[20:39] * adjohn (~adjohn@ Quit (Quit: adjohn)
[20:40] <nwatkins> Ok thanks
[20:41] <psomas> ajm: about the gentoo ebuild, https://bugs.gentoo.org/show_bug.cgi?id=351032
[20:42] <ajm> interesting, i rolled an ebuild to use tcmalloc without issue
[20:42] <Tv> the ceph bug linked there seems to point at PEBKAC
[20:43] <gregaf> I think it was a distro bug, actually
[20:43] <Tv> gregaf: like, libceph version not matching ceph package?
[20:44] <gregaf> libgoogle-perftools version not matching the perftools.h
[20:44] <Tv> oh odd
[20:44] <psomas> ajm: you can reopen the bug if you want, and maybe attach the ebuild with tcmalloc
[20:44] <ajm> yeah i'll think
[20:54] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) Quit (Quit: Ex-Chat)
[20:55] * fronlius (~Adium@g231174031.adsl.alicedsl.de) has joined #ceph
[20:59] * fronlius1 (~Adium@f054178104.adsl.alicedsl.de) has joined #ceph
[21:00] * adjohn (~adjohn@ has joined #ceph
[21:04] * fronlius (~Adium@g231174031.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[21:07] * conner (~conner@leo.tuc.noao.edu) Quit (Ping timeout: 480 seconds)
[21:10] * fronlius (~Adium@f054178104.adsl.alicedsl.de) has joined #ceph
[21:12] * adjohn (~adjohn@ Quit (Read error: Connection reset by peer)
[21:12] * fronlius (~Adium@f054178104.adsl.alicedsl.de) Quit (Remote host closed the connection)
[21:13] <stingray> joshd: :(
[21:13] * adjohn (~adjohn@ has joined #ceph
[21:15] * adjohn is now known as Guest14865
[21:15] * Guest14865 (~adjohn@ Quit (Read error: Connection reset by peer)
[21:15] * adjohn (~adjohn@ has joined #ceph
[21:16] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[21:17] * fronlius1 (~Adium@f054178104.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[21:22] <ajm> sagewk: http://adam.gs/osd.2.log.bz2
[22:01] * ognatortcele (~ognatortc@ Quit (Quit: ognatortcele)
[22:13] * tserong (~tserong@58-6-131-7.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[22:22] * tserong (~tserong@124-168-225-205.dyn.iinet.net.au) has joined #ceph
[22:52] * sandeen_ (~sandeen@sandeen.net) Quit (Quit: This computer has gone to sleep)
[22:59] * adjohn is now known as Guest14875
[22:59] * Guest14875 (~adjohn@ Quit (Read error: Connection reset by peer)
[22:59] * adjohn (~adjohn@ has joined #ceph
[23:10] <gregaf> nwatkins: what version of stuff were you using when you saw those unit tests breaking?
[23:14] * cp (~cp@dhcp184-48-60-82.whsj.sjc.wayport.net) has joined #ceph
[23:38] * tserong (~tserong@124-168-225-205.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[23:47] * tserong (~tserong@58-6-101-93.dyn.iinet.net.au) has joined #ceph
[23:48] * cp (~cp@dhcp184-48-60-82.whsj.sjc.wayport.net) Quit (Quit: cp)
[23:55] <sagewk> ajm: i see the problem :)
[23:55] * adjohn (~adjohn@ Quit (Ping timeout: 480 seconds)
[23:56] <sagewk> ajm: http://fpaste.org/e5pV/
[23:56] <sagewk> ajm: you just upgraded, right? i think that's what's going on.
[23:59] * adjohn (~adjohn@ has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.