[0:05] <johnl> build of unstable is failing
[0:05] <johnl> cosd.cc: In function 'int main(int, const char**)':
[0:05] <johnl> cosd.cc:65: error: 'IsHeapProfilerRunning' was not declared in this scope
[0:05] <johnl> worth filing bug reports about unstable not building?
[0:06] <sagewk> what os are you on?
[0:06] <johnl> Linux. Ubuntu Lucid i386
[0:06] <johnl> unstable bf784cdb4f60
[0:07] <sagewk> iirc this is a library versioning issue or something. gregaf do you remember?
[0:07] <johnl> a google-perftools thing?
[0:07] <sagewk> yeah
[0:07] <johnl> too old a version perhaps?
[0:09] <johnl> it's 0.98. quite old.
[0:09] <johnl> I'll backport a newer version and rebuild
[0:10] <johnl> ta.
[0:22] <gregaf> johnl: as I recall this problem occurs because google-perftools changed the prototype for that function from "bool IsHeapProfilerRunning();" to "int IsHeapProfilerRunning();"
[0:22] <gregaf> in April or May
[0:23] <gregaf> and for some reason some systems have a header from after that change and a library from before that change
[0:24] <johnl> hrm right. well, latest package will likely sort that for me.
[0:24] <johnl> ta
[0:26] <gregaf> now that i think of it the other person who reported this was running some version of Ubuntu as well
[0:30] <johnl> heh
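One way to investigate the header/library mismatch gregaf describes is to ask the installed shared library whether it exports the function at all. The sketch below is a minimal, hedged example: the helper name `library_exports` is hypothetical, and it assumes the symbol is exported unmangled (C linkage) and that the profiler lives in a library the loader knows as `tcmalloc`. Neither assumption is confirmed in the discussion above, and the check says nothing about the return type.

```python
import ctypes
import ctypes.util


def library_exports(lib_name, symbol):
    """Return True/False if the shared library exports `symbol`.

    Returns None when the library cannot be located or loaded.
    `lib_name` is the short name without the "lib" prefix, e.g. "tcmalloc"
    (an assumed name, not taken from the log).
    """
    path = ctypes.util.find_library(lib_name)
    if path is None:
        return None  # library not installed or not on the loader path
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None  # found a name, but the loader could not open it
    # ctypes looks symbols up lazily; a missing symbol raises AttributeError,
    # which hasattr() reports as False.
    return hasattr(lib, symbol)
```

Calling `library_exports("tcmalloc", "IsHeapProfilerRunning")` on the build machine would then report whether the installed library provides the symbol; comparing that with the declaration in the installed header shows whether the two come from different versions.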
[2:35] * greglap (~Adium@ has joined #ceph
[4:41] * lidongyang (~lidongyan@ has joined #ceph
[9:50] <failboat> sagewk: not yet
[9:50] <failboat> sagewk: I managed to crash anchorserver again
[9:50] <failboat> can't do it in synthetic workload though
[9:50] <failboat> only rsync on my homedir :(
[10:22] <jantje_> maybe you can do a strace of your rsync
[10:22] <jantje_> (I'm not sure if that would be of any help)
[10:58] * johnl (~johnl@cpc3-brad19-2-0-cust563.barn.cable.virginmedia.com) has joined #ceph
[11:24] <failboat> if only I could find a place to store that gigantic trace
[11:24] <failboat> anyway
[11:24] <failboat> I'll continue
[12:13] * lidongyang (~lidongyan@ Quit (Remote host closed the connection)
[12:27] * lidongyang (~lidongyan@ has joined #ceph
[17:18] <greglap> failboat: the AnchorServer is part of how Ceph implements hard links, so any synthetic test you come up with will probably need to use those
[17:19] <greglap> if that helps you
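Following greglap's hint, a synthetic workload that exercises hard links could look like the sketch below. It is an illustration only: the function name, directory layout, file counts, and the choice to place each link in a different directory are all assumptions, and nothing here is known to reproduce failboat's crash.

```python
import os
import shutil
import tempfile


def hardlink_workload(root, n_dirs=4, n_files=8):
    """Build a small tree under `root` in which every file has a second
    hard link located in a different directory. Returns the link paths."""
    dirs = []
    for d in range(n_dirs):
        path = os.path.join(root, f"dir{d}")
        os.makedirs(path)
        dirs.append(path)

    links = []
    for d, path in enumerate(dirs):
        for f in range(n_files):
            src = os.path.join(path, f"file{f}")
            with open(src, "w") as fh:
                fh.write("x" * 1024)
            # Put the link in the next directory (an arbitrary choice).
            dst = os.path.join(dirs[(d + 1) % n_dirs], f"link{d}_{f}")
            os.link(src, dst)
            links.append(dst)
    return links
```

Pointing `root` at a directory on the Ceph mount and then removing the tree with `shutil.rmtree(root)` gives a create-then-delete pattern with hard links in it.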
[18:04] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:37] <sagewk> failboat: if you can reproduce the (workload leading up to the) crash with the mds logging enabled (debug mds = 20 and debug ms = 1) that log should also be sufficient.
[18:57] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:34] <wido> hi
[19:34] <cmccabe> wido: hi
[19:34] <wido> After the recent crashes my OSD's keep crashing with new pg errors
[19:34] <wido> A lot of them
[19:35] <cmccabe> wido: is this the latest unstable?
[19:35] <wido> In the past I had situations where only I would run into
[19:35] <wido> cmccabe: yesterday's
[19:35] <cmccabe> v0.25~rc?
[19:35] <wido> I would only run into these, since I have an "old" FS, a few weeks old
[19:36] <wido> cmccabe: No, "ceph version 0.24~rc (commit:463d624d38d2c5444cc9aa6a2c8e6d3fbcca65fd)"
[19:36] <wido> I'll upgrade to the latest right now, see what that does
[19:36] <cmccabe> wido: great
[19:36] <cmccabe> wido: yeah, we're stabilizing 0.25~rc now. A bunch of known bugs were fixed
[19:37] <wido> But, I had some things like these in the past, my cluster would keep crashing and you guys concluded that no new cluster would run into these issues
[19:37] <sagewk> wido: you should use the 'rc' branch for the time being
[19:37] <wido> sagewk: ok, i'll switch
[19:38] <wido> and see what that does
[19:42] <wido> sagewk: the rc branch is still at 0.23-1, correct?
[19:47] <sagewk> rc should be 0.24~1
[19:48] <sagewk> er, 0.24~rc .. i.e., not quite 0.24
[20:01] <wido> ah, it's the debian changelog which is outdated
[20:02] <wido> for example, all my OSDs (except for one) crash directly after startup: http://pastebin.com/DTwsx9Sk
[20:03] <wido> Before I start creating issues, is this (If you can see it that fast) due to my very damaged fs / old fs?
[20:03] <sagewk> do you have a gdb backtrace?
[20:03] <sagewk> i was just working with that code, probably my fault
[20:04] <wido> sagewk: yes: http://pastebin.com/qcmyaDQX
[20:05] <sagewk> http://fpaste.org/Amv4/ should fix it
[20:05] <sagewk> btw you're running in trailing journal mode, which is unusual.. is that on purpose?
[20:07] <wido> sagewk: No, that is not on purpose at all. btrfs journal or OSD journal?
[20:10] <sagewk> osd journal
[20:10] <sagewk> which node is this?
[20:11] <wido> on node01
[20:20] <sagewk> wido: oh, i see the problem
[20:21] <wido> I'm just wondering, is it worth the time hunting these issues down? Or is this just another corner case nobody will run into? I'll happily create an issue for everything I find, no problem
[20:26] <sagewk> this one is real :)
[20:27] <sagewk> in general, adding issues is good. the recovery stuff you're hitting i wouldn't tho since it's likely fallout from instability earlier in this cycle
[20:27] <wido> Yes, I get it. Well, take your time checking this one out, I just wanted to check the btrfs issue tomorrow and for that I need a working cluster
[20:30] <sagewk> ok. well i'll push a fix for this in a few minutes in any case
[20:30] <sagewk> once that's verified i suggest wiping and then looking for the btrfs bug
[20:31] <sagewk> no need for that patch from before.. the async snap creation is disabled by default now until the ioctl interface is finalized
[20:31] <wido> btw, something else, how "big" is IPv6 in the US? I've got an issue with Brocade/Foundry about IPv6 and it seems to me that the US is not really hurrying to implement IPv6
[20:31] <wido> sagewk: Ok, i'll try to recreate the issue without your patch, just see if I get some warnings
[20:31] <gregaf> IPv6? what's that?
[20:32] <wido> gregaf: Thanks, that explains it :)
[20:32] <gregaf> ;)
[20:32] <johnl> IPv4 will never catch on. I'm still on IPX.
[20:32] <cmccabe> I think the IPv4 address space is going to run out this year
[20:32] <cmccabe> er, early next year
[20:33] <cmccabe> I saw an article that said 6 months, tops
[20:33] <johnl> the IPX address space is still going strong.
[20:33] <cmccabe> haha
[20:33] <cmccabe> I remember it was an option for warcraft 2
[20:33] <cmccabe> and also I think Lotus Notes might have supported it at one point?
[20:34] <gregaf> IPX was huge, it was the only way to play games online for a while
[20:34] <gregaf> or maybe just multiplayer, period
[20:34] <wido> indeed, I remember IPX :-)
[20:35] <wido> No, but here in Europe the IPv4 space will run out somewhere next year
[20:35] <cmccabe> some people believe that ISPs will just start NATing everyone
[20:35] <gregaf> I don't think the US is in any better shape in terms of available addresses
[20:35] <wido> In a few weeks we will be running dual-stack, but Brocade/Foundry has a real bug in their router platform
[20:35] <cmccabe> that would be kind of horrible I think... we'd have to tunnel absolutely everything over port 80 probably
[20:35] <gregaf> but the projections keep getting pushed back as IP holders start more aggressively gating and reclaiming unused ones and stuff
[20:36] <wido> and they don't seem to be willing to fix it
[20:36] <gregaf> and there are still tons of issues with IPv6 adoption
[20:36] <wido> gregaf: yes, that's true :) Like I'm seeing right now
[20:36] <cmccabe> well, the "running out" is in terms of companies buying new blocks of addresses
[20:36] <cmccabe> it doesn't mean that companies that already have them don't have some headroom
[20:38] <wido> ok, but it answers my question, the problems are the same. Brocade is just my problem right now
[20:43] <gregaf> Ars Technica had a pretty good overview of the state of IPv6 and its issues recently if you were looking for more background
[20:43] <gregaf> (at least, it seemed good to me: http://arstechnica.com/business/news/2010/09/there-is-no-plan-b-why-the-ipv4-to-ipv6-transition-will-be-ugly.ars/4 )
[20:44] <wido> gregaf: Yes, I saw that one :)
[20:44] <wido> nice indeed
[22:06] <johnl> hey sagewk, sorry if I missed your response to this, but in bug #621 you mention commit:307404231ecb09fdd2f6dd6e50677e746bba4236 but that isn't available in the git repository. you pushed?
[22:08] <gregaf> johnl: Sage is at lunch but I think the commit got renamed or something
[22:08] <gregaf> cbb562089c788e5eeb8cbee7a2be5de0b40d84b4 is pushed and I'm pretty sure that's the one he meant
[22:08] <gregaf> commit cbb562089c788e5eeb8cbee7a2be5de0b40d84b4
[22:08] <gregaf> Author: Sage Weil <sage@newdream.net>
[22:08] <gregaf> Date:   Wed Dec 1 09:51:27 2010 -0800
[22:08] <gregaf>    rbd: use MIN instead of min()
[22:08] <gregaf>    Not even sure where min() was coming from, but it seems to be missing on
[22:08] <gregaf>    i386 lucid.:
[22:10] <johnl> ah yeah
[22:11] <wido> yehudasa: If you are playing with the RGW sometime, this might be fun to try: http://ceph.newdream.net/wiki/RADOS_Gateway#Accelerating_the_gateway_with_Varnish
[22:11] <johnl> gregaf: that's not on unstable branch though. what is unstable branch exactly?
[22:12] <johnl> I'm looking for a branch to get the latest fixes (so I can test when my bugs are fixed :)
[22:13] <gregaf> unstable is our main dev branch
[22:14] <gregaf> most bug fixes go into the testing branch (they all should but sometimes they get mixed up or we think a bug got introduced in unstable but it was actually older)
[22:15] <failboat> sagewk: it usually crashes when I try to rm -rf the tree
[22:16] <failboat> which I previously created by rsyncing (which succeeded)
[22:17] <johnl> gregaf: can't see it on the testing branch either.
[22:17] <johnl> it's in the repo but I can't for the life of me find what branch it's on!
[22:18] <wido> sagewk: I'm going afk, if you have a fix for the crash I'm seeing, mail me or post it here, i'll read it tomorrow before doing the mkcephfs for the btrfs test
[22:18] <wido> Let me know which backtrace I should NOT be seeing, so that I know if it's fixed
[22:19] <gregaf> johnl: oh, he only put it into the rc branch
[22:19] <johnl> ah!
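For the "which branch is this commit on?" question, `git branch -r --contains <commit>` lists the remote-tracking branches whose history includes a commit. The wrapper below is a small sketch around that command; the function names and the `repo` parameter are hypothetical, and it assumes a local clone with up-to-date remote refs.

```python
import subprocess


def contains_command(commit):
    """Build the git command that lists remote branches containing `commit`."""
    return ["git", "branch", "-r", "--contains", commit]


def branches_containing(commit, repo="."):
    """Run the command inside `repo` and return the branch names as a list."""
    out = subprocess.run(
        contains_command(commit),
        cwd=repo,
        check=True,           # raise if git exits non-zero (e.g. unknown commit)
        capture_output=True,
        text=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]
```

Running `branches_containing("cbb562089c788e5eeb8cbee7a2be5de0b40d84b4")` in a clone of the repository would be expected to list the rc branch, given gregaf's answer; the exact output depends on the state of the remote.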
[22:20] <johnl> bunch of other commits on there not on unstable or testing too!
[22:20] <johnl> hard for an outsider to follow!
[22:21] <johnl> could you merge it all to unstable? or is there a reason it's separate?
[22:21] <gregaf> yeah, we're trying to firm up our release practices but we haven't fully established them yet
[22:21] <gregaf> my guess is you'll want to follow either testing or rc, but I'm not entirely clear on the rc branch myself so I'll let Sage sort it out with you
[22:25] <johnl> ok ta.
[22:26] <johnl> I'm rigging up an automated build atm
[22:26] <johnl> so would be good to know of one good branch to do that from
[22:28] <johnl> you working on ceph full time greg?
[22:28] <gregaf> yep!
[22:28] <johnl> sweet. you and sage? or more?
[22:30] <gregaf> Sage and Yehuda, I came in summer last year, we added another full-time over the summer (cmccabe) and we have a couple guys who are split about 50/50 between Ceph and other company products (sjust and joshd)
[22:31] <johnl> wow! ace.
[22:31] <gregaf> (and hello, I hope you enjoyed your IM alerts to the three of you :P)
[22:31] <cmccabe> gregaf: for some reason, my IM only alerts when someone starts the line with cmccabe:
[22:32] <gregaf> well that's a lame alert system
[22:32] <cmccabe> gregaf: yeah, maybe the later versions of pidgin are better or something
[22:32] <gregaf> it's a configurable option in some clients I've used, maybe you should check your prefs
[22:33] <wido> you guys are using pidgin?
[22:34] <wido> I'm using plain old irssi on a console running in a screen
[22:34] <cmccabe> wido: that has some advantages, but I was never able to get screen's notification system to work reliably
[22:34] <gregaf> I think we're all using some multi-protocol client since we have a company jabber server
[22:35] <cmccabe> wido: the whole monitor-for-activity thing sort of worked, but sometimes seemed to miss events
[22:35] <gregaf> I'm actually on Adium since I use a Mac desktop/laptop
[22:35] <cmccabe> also as gregaf said, we usually use a multi-protocol client just for convenience
[22:35] <cmccabe> although there is a text version of pidgin (finch), it has some quirks
[22:35] <wido> I'm using pidgin, but only for my own Jabber
[22:35] <wido> I switch places a lot during the day, office, home, other office, etc
[22:36] <wido> log on to my own server and "screen -x"
[22:36] <wido> and i'm back in the IRC channel
[22:36] <cmccabe> I run most applications inside screen, but pidgin is one exception
[22:37] <wido> Yeah, my notification right now is a yellow line, so I have to check it myself every now and then.
[22:37] <cmccabe> do you use screen's monitor-for-activity?
[22:37] <wido> uh no, I simply have a terminal open somewhere on my Ubuntu desktop, where I have the IRC channel
[22:37] <cmccabe> k
[22:38] <wido> but, I'm really going afk now! My message to sagewk has gone up a lot of lines, could one of you point him to it?
[22:38] <wido> He was working on a fix, which I had to try before cleaning my cluster for another test
[22:39] <cmccabe> ok
[22:39] <wido> tnx! ttly
[22:39] <wido> ttyl
[22:39] <cmccabe> bye
[22:40] <gregaf> johnl: sagewk says following the rc branch would be ideal
[22:41] <gregaf> everything goes into there before it goes out in any release
[22:46] <johnl> right. I'll use that. ta
[23:01] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[23:14] * johnl builds ceph ubuntu lucid packages...
[23:15] * ajnelson (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
