#ceph IRC Log


IRC Log for 2011-06-22

Timestamps are in GMT/BST.

[0:16] <cmccabe> jmlowe: yeah
[0:26] <jmlowe> I obsess over the roadmap far too much
[0:29] <jim> does anyone in here do the ceph hosting hdfs thing mentioned in the wiki ??
[0:30] <cmccabe> jim: I am familiar with some code that allows you to use ceph instead of HDFS
[0:30] <cmccabe> jim: is that what you mean?
[1:03] * sugoruyo (~george@athedsl-408632.home.otenet.gr) Quit (Quit: sugoruyo)
[1:04] * sugoruyo (~george@athedsl-408632.home.otenet.gr) has joined #ceph
[1:06] * sugoruyo (~george@athedsl-408632.home.otenet.gr) Quit ()
[2:04] * Ormod (~valtha@ohmu.fi) Quit (Remote host closed the connection)
[2:04] * Ormod (~valtha@ohmu.fi) has joined #ceph
[2:05] * maswan (maswan@kennedy.acc.umu.se) Quit (Remote host closed the connection)
[2:05] * maswan (maswan@kennedy.acc.umu.se) has joined #ceph
[2:05] * Yulya__ (~Yu1ya_@ip-95-220-242-20.bb.netbynet.ru) has joined #ceph
[2:07] * Yulya_ (~Yu1ya_@ip-95-220-242-20.bb.netbynet.ru) Quit (Ping timeout: 480 seconds)
[2:10] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[2:29] * cmccabe (~cmccabe@ has left #ceph
[2:52] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[3:01] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[3:14] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[3:36] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[4:12] * jmlowe (~Adium@ Quit (Ping timeout: 480 seconds)
[4:17] * jmlowe (~Adium@ has joined #ceph
[4:43] * yoshi_ (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[4:50] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Ping timeout: 480 seconds)
[5:02] * yoshi_ (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[5:50] * votz (~votz@dhcp0020.grt.resnet.group.upenn.edu) Quit (Quit: Leaving)
[6:25] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[7:27] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[9:02] * jmlowe (~Adium@ has left #ceph
[9:38] * lxo (~aoliva@ Quit (Ping timeout: 480 seconds)
[9:41] * hijacker_ (~hijacker@ Quit (Ping timeout: 480 seconds)
[9:43] * lxo (~aoliva@83TAABZQQ.tor-irc.dnsbl.oftc.net) has joined #ceph
[10:09] * jbd (~jbd@ks305592.kimsufi.com) has joined #ceph
[10:11] * sugoruyo (~george@athedsl-408632.home.otenet.gr) has joined #ceph
[10:31] * lxo (~aoliva@83TAABZQQ.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[10:33] * lxo (~aoliva@19NAABZH6.tor-irc.dnsbl.oftc.net) has joined #ceph
[10:42] * lxo (~aoliva@19NAABZH6.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[10:44] * lxo (~aoliva@83TAABZSE.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:03] * allsystemsarego (~allsystem@ has joined #ceph
[11:12] * lxo (~aoliva@83TAABZSE.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[11:21] * lxo (~aoliva@09GAAE2BO.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:29] * lxo (~aoliva@09GAAE2BO.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[11:29] * lxo (~aoliva@9YYAABL7Z.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:19] * lxo (~aoliva@9YYAABL7Z.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[12:20] * lxo (~aoliva@09GAAE2C1.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:32] * MarkN (~nathan@ Quit (Ping timeout: 480 seconds)
[12:33] * MarkN (~nathan@ has joined #ceph
[12:44] * lxo (~aoliva@09GAAE2C1.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[12:47] * lxo (~aoliva@9YYAABL9I.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:53] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[12:54] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[12:54] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[12:56] * lxo (~aoliva@9YYAABL9I.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[13:10] * lxo (~aoliva@19NAABZKD.tor-irc.dnsbl.oftc.net) has joined #ceph
[13:30] * lxo (~aoliva@19NAABZKD.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[13:38] * lxo (~aoliva@83TAABZVG.tor-irc.dnsbl.oftc.net) has joined #ceph
[13:50] * lxo (~aoliva@83TAABZVG.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[13:51] * lxo (~aoliva@19NAABZK4.tor-irc.dnsbl.oftc.net) has joined #ceph
[14:07] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[14:08] * lxo (~aoliva@19NAABZK4.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[14:09] * darktim (~andre@dhcp-181.nine.ch) has joined #ceph
[14:09] * lxo (~aoliva@19NAABZLJ.tor-irc.dnsbl.oftc.net) has joined #ceph
[14:32] * darktim (~andre@dhcp-181.nine.ch) Quit (Ping timeout: 480 seconds)
[14:37] * lxo (~aoliva@19NAABZLJ.tor-irc.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[14:37] * lxo (~aoliva@09GAAE2GM.tor-irc.dnsbl.oftc.net) has joined #ceph
[14:58] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) Quit (Remote host closed the connection)
[15:43] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[15:45] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[16:18] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Operation timed out)
[16:28] * lxo (~aoliva@09GAAE2GM.tor-irc.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[16:35] * lxo (~aoliva@19NAABZPV.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:43] * aliguori (~anthony@ has joined #ceph
[17:47] * Yulya__ (~Yu1ya_@ip-95-220-242-20.bb.netbynet.ru) Quit (Ping timeout: 480 seconds)
[17:52] * greglap (~Adium@ has joined #ceph
[17:57] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:07] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[18:11] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[18:11] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[18:39] * greglap (~Adium@ Quit (Quit: Leaving.)
[18:44] * swendel (~swendel@ has joined #ceph
[18:44] * swendel (~swendel@ Quit ()
[18:45] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[18:48] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:54] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[18:57] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:04] * cmccabe (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) has joined #ceph
[19:11] * lxo (~aoliva@19NAABZPV.tor-irc.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[19:14] * lxo (~aoliva@83TAABZ6G.tor-irc.dnsbl.oftc.net) has joined #ceph
[19:26] * cmccabe1 (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) has joined #ceph
[19:26] * cmccabe (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[19:43] <stingray> yehudasa: yay, it works!
[19:43] <stingray> seems to work, at least
[19:43] <yehudasa> stingray: great! let us know if there are any issues with actually running the vm
[19:44] <stingray> yep
[19:44] <stingray> I'll try the vm now
[19:52] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[19:55] * aliguori (~anthony@ has joined #ceph
[20:02] * lxo (~aoliva@83TAABZ6G.tor-irc.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[20:02] <stingray> 2011-06-22 22:02:17.082509 7f3adc1b39e0 librados: client.admin authentication error Operation not permitted
[20:02] <stingray> WHYYYYYYYYYYYY
[20:03] <cmccabe1> stingray: what commit are you on?
[20:03] * lxo (~aoliva@19NAABZWX.tor-irc.dnsbl.oftc.net) has joined #ceph
[20:03] <stingray> e214420518914b2856e88e27456cd107db3f6166
[20:03] <stingray> it's in qemu only
[20:03] <stingray> I think I broke it myself
[20:04] <yehudasa> cmccabe1: he's using a recent stable librbd
[20:04] <cmccabe1> it looks like that branched off a while ago
[20:05] <cmccabe1> looks like these library changes drop build time from 4 minutes to 3:30 minutes for me
[20:05] <cmccabe1> I mean building from scratch
[20:11] <cmccabe1> tv: make check seems to be segfaulting on gitbuilder/
[20:11] <Tv> lovely
[20:11] <cmccabe1> tv: I can't reproduce it locally
[20:11] <Tv> oom
[20:12] <Tv> or at least, it was
[20:12] <Tv> no that's older
[20:12] <Tv> lots of cauthtool and monmaptool segfaults, nothing else looks odd
[20:13] <cmccabe1> I think that cauthtool and monmaptool stuff is from the rgw branch
[20:13] <cmccabe1> well, maybe not. Tests are failing on that branch, but not because of that.
[20:15] <Tv> let's see if the segfaults reproduce for me
[20:15] <Tv> sagewk: 96ef8a67bcf4a4a43f0a5c38224314abdd88a12c looks very very wrong
[20:15] <Tv> sagewk: that's what ${shlibs:Depends} is all about
[20:16] <sagewk> oh right
[20:16] <cmccabe1> looking at the master branch at least, the cauthtool and monmaptool clitests complete fine?
[20:17] <sagewk> thanks
[20:17] <Tv> cmccabe1: yeah the segfaults are older
[20:17] <cmccabe1> tv: are we doing make distclean every time or just make clean?
[20:18] <Tv> cmccabe1: git clean
[20:18] <cmccabe1> that should be as good as make distclean
[20:18] <Tv> better
[20:18] <cmccabe1> assuming you're using force and all that
[20:20] <cmccabe1> I am going to have to log in and see why the unit tests are failing
[20:20] <cmccabe1> how do I do that
[20:20] <Tv> the same test hangs for me
[20:20] <cmccabe1> which test is that?
[20:20] <Tv> the signal is probably gitbuilder's time limit kicking in
[20:20] <Tv> [ RUN ] SignalApi.SimpleInstallAndTest
[20:24] <cmccabe1> I can't see any problem with that test
[20:25] <Tv> well it's not the gitbuilder, there's some environment difference between us
[20:25] <Tv> it seems racy
[20:25] <Tv> worked on one run, failed on another
[20:25] <Tv> probably timing of the signal delivery
[20:26] <cmccabe1> it blocks all signals, sends a signal while all signals are blocked, then calls sigsuspend
[20:27] <Tv> and it's hanging on sigsuspend every now and then
[20:27] <cmccabe1> do you know it's hanging on sigsuspend, or just a guess
[20:28] <cmccabe1> I guess that is pretty much the only long-blocking call in that test, so it's at least a good guess
[20:28] <Tv> i'm looking at it in gdb
[20:28] <cmccabe1> k
[20:29] <cmccabe1> hmm
[20:29] <cmccabe1> perhaps the signal is being delivered to the wrong thread
[20:30] <cmccabe1> try changing the kill to pthread_kill(pthread_self(), SIGUSR1);
[20:32] <cmccabe1> yeah, I'm afraid that is what it is
[20:32] <cmccabe1> the ceph context thread is not going to block SIGUSR1 (why should it?)
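
Editor's sketch of the pattern discussed above: block the signal in the calling thread, send it to that thread specifically, then wait for it. This is a minimal Python illustration, not the Ceph test itself. The function name is hypothetical, and `signal.sigwait` stands in for the `sigsuspend` call mentioned in the log because Python's standard library does not expose `sigsuspend`. The point it shows is the one cmccabe1 raises: a process-directed `kill` may be delivered to any thread that has not blocked the signal, whereas `pthread_kill` targets one thread.

```python
import signal
import threading


def send_and_wait_for_sigusr1():
    """Block SIGUSR1, signal only this thread, then wait for the signal."""
    sig = signal.SIGUSR1
    # Block the signal in the calling thread so it stays pending
    # instead of triggering its default action.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        # Thread-directed delivery: the signal is queued for this thread,
        # so no other thread (one that never blocked it) can consume it.
        signal.pthread_kill(threading.get_ident(), sig)
        was_pending = sig in signal.sigpending()
        # Assumption: sigwait replaces the sigsuspend call from the log.
        received = signal.sigwait({sig})
        return was_pending, received
    finally:
        # Restore whatever mask the thread had before.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Calling `send_and_wait_for_sigusr1()` returns a pair: whether the signal was pending before the wait, and the signal number that the wait consumed.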
[20:34] <stingray> yehudasa: I/O error, dev vda, sector xxxxx...
[20:34] <stingray> 0xffffu
[20:35] <yehudasa> stingray: what was the scenario?
[20:35] <stingray> I just imported the image from disk to rbd and started vm
[20:35] <stingray> on boot it panics
[20:35] <stingray> I'll try installing it on rbd now
[20:41] <cmccabe1> tv: fixed by d42da230fec8f4d40d610fdae2be859d3cf2be47
[20:47] <yehudasa> stingray: does it fail also when you import the image via qemu-img?
[20:50] <stingray> yehudasa: yep.
[20:50] <stingray> I am installing now
[20:50] <stingray> directly to rbd
[20:51] <stingray> before your fixes it was usually installing normally but after reboot the data was weirdly inconsistent
[20:51] <stingray> we'll see how it goes now in 5 minutes
[20:57] <stingray> I/O error, dev vda, sector xxxxxx
[20:58] <stingray> something doesn't work, definitely
[20:58] <stingray> I'll try rolling back your async stuff
[21:00] <stingray> hm
[21:00] <stingray> your stuff is for writing
[21:01] <Tv> cmccabe1: 1000 runs without failure
[21:02] <yehudasa> stingray: my aio stuff <--- the rbd-async-convert branch?
[21:02] <stingray> yeah
[21:02] <stingray> that shouldn't affect whatever I'm doing
[21:03] <stingray> the faulty sector numbers are consistent across reboots, and when I did dd if=/dev/vda of=/dev/null bs=1M qemu-kvm crashed instantly
[21:03] <stingray> ... also stable
[21:04] <stingray> qemu-img convert -f rbd -O raw rbd:rbd/test0 foo == segv
[21:04] <yehudasa> hmm.. I'll test that now locally
[21:06] <stingray> http://pastebin.com/tbuq0rXp
[21:06] <stingray> qemu-img convert with debug rbd = 20
[21:07] <stingray> bt
[21:07] <stingray> bah
[21:07] <stingray> wrong window
[21:08] <stingray> http://pastebin.com/0q8yGK8c <- bt full
[21:22] <cmccabe1> tv: great
[21:25] <stingray> eh nice
[21:25] <stingray> qemu-img convert raw -> rbd - works
[21:25] <stingray> rbd -> raw - sigsegv
[21:27] <yehudasa> stingray: yeah.. there's the same bug that we had in the sync version of the sparse_read, now in the async read
[21:27] <stingray> yehudasa: hmm.
[21:27] <stingray> will it affect qemu-kvm?
[21:27] <yehudasa> stingray: possibly
[21:28] <stingray> yehudasa: I am not using your stuff directly, I cherrypicked everything rbd-related to 0.14.0 and rolled it on top of fedora's 0.14.0. but it applied clean
[21:29] <yehudasa> well.. that specific bug is real and I'm working on a fix right now
[21:29] <stingray> aha, qemu-img convert raw -> rbd, rbd export output matches source
[21:29] <stingray> yehudasa: well, that was just FYI
[21:29] <yehudasa> nice.. so we're close
[21:35] <yehudasa> stingray: can you test the following patch: http://pastebin.com/BSeAgjPM
[21:35] * lxo (~aoliva@19NAABZWX.tor-irc.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[21:36] * lxo (~aoliva@19NAABZZM.tor-irc.dnsbl.oftc.net) has joined #ceph
[21:38] <stingray> yehudasa: yes I can but it'll take some time
[21:40] <Tv> joshd: do you consider wip_rbd done? as in not wip anymore?
[21:40] <stingray> doesn't apply to e214420518914b2856e88e27456cd107db3f6166 though
[21:40] <Tv> joshd: in teuthology.git that is
[21:40] <Tv> ooh INFO:orchestra.run.out:error adding secret to kernel client. : No such device
[21:40] <Tv> that does not look good..
[21:40] <joshd> Tv: yes
[21:41] <joshd> Tv: that error means it's falling back to the old secret= option
[21:41] <Tv> joshd: what's the non-ascii junk?
[21:42] <yehudasa> stingray: should apply cleanly, probably just whitespace/tabs issue
[21:42] <Tv> joshd: oh wow it printfs the payload, no wonder
[21:42] <joshd> Tv: looks like the decoded key
[21:43] <stingray> yehudasa: b34e195a46e8fc6eba0099b517685a205ce86061 <- ?
[21:43] <Tv> and it actually uses errno after a printf call
[21:43] <Tv> fail
[21:43] <Tv> i'll fix it
[21:44] <yehudasa> stingray: git fetch, and git cherry-pick 30c47566d17d4f200f12a71b6d1c8295b76a0943
[21:46] <stingray> I applied -l, building/deploying now
[21:47] <yehudasa> stingray: that's just a librbd fix, so you just need the client side
[21:51] <stingray> yep
[21:57] <stingray> http://twitpic.com/5f42rx
[21:57] <stingray> bah
[21:57] <stingray> wrong window again
[21:58] <yehudasa> stingray: I won't ask
[21:59] <Tv> network to part of sepia is down again, ticket filed with NOC as #10301
[21:59] <stingray> Tv: that particular part of the network doesn't seem to like you
[22:11] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[22:14] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) Quit (Remote host closed the connection)
[22:15] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[22:31] <stingray> yehudasa: no more segfaults, qemu-img: error while reading
[22:31] <stingray> instead
[22:32] <yehudasa> stingray: anything else on the log?
[22:32] <stingray> http://pastebin.com/QaTxRMGr
[22:32] <stingray> this is roughly last screen
[22:33] <stingray> http://pastebin.com/HP2pYRfe entire log
[22:34] <yehudasa> stingray: what was the command that you were running?
[22:36] <stingray> qemu-img convert -f rbd -O raw rbd:rbd/w1 foo
[22:37] <yehudasa> oh, ok, I see it now
[22:39] <stingray> I'll go home
[22:39] <stingray> ping if anything is needed
[22:39] <yehudasa> stingray: ok, thanks.. I'll try to fix it
[22:52] <cmccabe1> tv: looks like there's some kind of ccache problem on gitbuilder
[22:52] <cmccabe1> tv: ccache: failed to create /nonexistent/.ccache (No such file or directory)
[22:55] <Tv> hmm
[22:57] <Tv> the /nonexistent is $HOME for the user running it
[22:58] <Tv> what i don't understand is what does running radosgw_admin have to do with ccache
[22:58] <Tv> unless there's some libtool magic happening behind my back
[22:59] <cmccabe1> it looks like it's trying to build it, not run it
[22:59] <cmccabe1> the order of operations is running clitests, building unit tests, running unit tests
[22:59] <Tv> that output is from the cli test runner
[23:00] <cmccabe1> oh, yeah
[23:01] <cmccabe1> it's running radosgw_admin... supposedly
[23:01] <Tv> well, libtool makes radosgw_admin a shell script
[23:01] <Tv> with automagic relink behavior in some cases
[23:02] <cmccabe1> yes, that's new
[23:02] <Tv> when does it trigger?
[23:03] <cmccabe1> "if necessary"?
[23:03] <Tv> { file=`ls -1dt "$progdir/$program" "$progdir/../$program" 2>/dev/null | /bin/sed 1q`; \
[23:03] <Tv> test "X$file" != "X$progdir/$program"; }; then
[23:03] <Tv> so mtime comparison
[23:03] <Tv> and if they have the same mtime, it's effectively random
[23:03] <Tv> well that's crap
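
Editor's sketch of the check in the wrapper fragment pasted above, under stated assumptions. It mimics `ls -1dt A B | sed 1q` followed by the string comparison: the wrapper relinks unless `$progdir/$program` is the first entry when the two paths are listed newest first. The function name is hypothetical. Tie handling is a property of `ls` that this sketch does not reproduce; here a tie keeps the already-built binary first.

```python
import os


def wrapper_would_relink(progdir, program):
    """Return True if the mtime check would ask for a relink."""
    built = os.path.join(progdir, program)
    candidates = [built, os.path.join(progdir, os.pardir, program)]
    # `ls` silently skips missing paths (stderr goes to /dev/null).
    existing = [p for p in candidates if os.path.lexists(p)]
    # Newest first, like `ls -t`. sorted() is stable, so equal mtimes keep
    # `built` first; real `ls` applies its own tie-breaking instead.
    newest_first = sorted(
        existing, key=lambda p: os.lstat(p).st_mtime_ns, reverse=True
    )
    first = newest_first[0] if newest_first else ""
    # Mirrors: test "X$file" != "X$progdir/$program"
    return first != built
```

Because the decision rests only on which file is newer, running a wrapped binary can trigger a link step, which is why a test run ended up invoking the compiler cache.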
[23:05] <cmccabe1> I wish there were a way to disable this behavior
[23:06] <cmccabe1> I know there's some platforms where it might be necessary, but I'm pretty sure those platforms burned down a long time ago
[23:06] <Tv> what i don't really get is why the ccache doesn't just work
[23:06] <Tv> even when i don't want it to relink all the time, i don't see why it wouldn't work
[23:07] <sagewk> CCACHE_DIR is set by build-ceph.sh during compile, but not for unit tests
[23:07] <Tv> sagewk: it's exported in the shell script, it's set from there on
[23:07] <sagewk> ?
[23:07] <sagewk> ah
[23:13] <Tv> http://www.gnu.org/software/libtool/manual/libtool.html#Linking-executables
[23:13] <Tv> no resolution there though
[23:15] <Tv> hah google for libtool +ccache relink finds the ceph gitbuilder
[23:15] <cmccabe1> it's funny that they name their example "hell"
[23:15] <cmccabe1> you can't make this stuff up
[23:15] <cmccabe1> .libs/hell
[23:17] <Tv> http://sourceware.org/autobook/autobook/autobook_85.html
[23:17] <cmccabe1> so could we use make install DESTDIR=/tmp/dir
[23:17] <Tv> see --enable-fast-install
[23:17] <Tv> funky workaround, perhaps
[23:18] <Tv> cmccabe1: yeah but my head gets dizzy from thinking of "make check" calling "make install"
[23:18] <cmccabe1> it's definitely not ideal
[23:19] <cmccabe1> I don't think requiring mysterious configure flags for make check to work is great either...
[23:19] <Tv> yeah
[23:20] <cmccabe1> I don't think I completely understand the problem yet
[23:20] <cmccabe1> so libtool has these weird auto-relinking semantics
[23:20] <cmccabe1> and distcc has some kind of mtime-based cache
[23:20] <Tv> not mtime
[23:21] <cmccabe1> md5?
[23:21] <Tv> distcc does md5 comparison
[23:21] <Tv> libtool executable wrappers do the ugly ls -t|head -1 comparison thingie
[23:21] <Tv> which is like mtime comparison except it may relink even if mtimes match
[23:22] <cmccabe1> I'm still not seeing the problem yet
[23:22] <Tv> i think i'm going to focus on "why didn't it see CCACHE_DIR in env"
[23:22] <cmccabe1> it's just that make check didn't have that CCACHE_DIR set?
[23:22] <cmccabe1> and formerly, obviously, make check never needed CCACHE_DIR set.
[23:22] <Tv> well even then the relink was just plain old wrong
[23:22] <cmccabe1> in what sense
[23:23] <cmccabe1> it was unnecessary, or actually used the wrong versions?
[23:23] <Tv> unnecessary, might cause funny things
[23:23] <Tv> like, now you ran part of your test with one binary, part with another
[23:23] <Tv> harder to reason about
[23:23] <cmccabe1> I do agree that the relink is annoying
[23:24] <cmccabe1> but as long as nothing changed between make and make check, I can't see it actually being wrong
[23:24] <Tv> barring compiler bugs, random bit flips, etc
[23:24] <Tv> just less certainty about things
[23:25] <Tv> in fact
[23:25] <Tv> relinking *after* we run the tests now sounds very bad
[23:25] <cmccabe1> I don't think you're going to ever fix that, except by moving to CMake
[23:25] <cmccabe1> which doesn't have any of this foolishness
[23:26] <cmccabe1> unless there's some way for us to make --enable-fast-install the default
[23:27] <cmccabe1> which I guess might be possible in configure.ac
[23:27] <Tv> well, that's the default
[23:28] <cmccabe1> oh, I meant disable
[23:28] <Tv> it doesn't need to be in configure.ac, only in the gitbuilder calls
[23:28] <Tv> but i'm debugging the missing CCACHE_DIR angle..
[23:29] * verwilst (~verwilst@dD57672E7.access.telenet.be) has joined #ceph
[23:29] <cmccabe1> tv: every weird option that only gitbuilder uses makes it harder and harder to actually reproduce what is going on there
[23:29] <Tv> yup
[23:29] <cmccabe1> tv: that's why I was advocating that you just run do_autogen.sh, but that's a whole other argument I guess
[23:30] <Tv> non-standard wrappers on top of automake will be ignored by 99.99% of your users, i don't see that route as very good
[23:30] <Tv> and do_autogen.sh has a bazillion branches, it just makes the variability problem worse
[23:31] <Tv> figure out the good settings, put those in the autoconf proper
[23:31] <Tv> CCACHE_DIR=/srv/autobuild-ceph/gitbuilder.git/build/../../ccache
[23:31] <Tv> $#@$@
[23:31] <Tv> it does get ccache_dir
[23:31] <cmccabe1> the good settings for developers probably aren't what you want in production
[23:32] <cmccabe1> at the very least you need production vs. debug
[23:32] <cmccabe1> I mean, the binary sizes go from ~20 MB to ~2
[23:32] <Tv> and those would be two separate gitbuilders, and we need *both*
[23:32] <cmccabe1> there is no way a one-size-fits-all set of configure defaults will work
[23:32] <Tv> so it just doesn't come to "oh just call do_autogen.sh", either way
[23:33] <cmccabe1> at the very least, we could have two scripts, both with no options
[23:33] <cmccabe1> that just encode what both gitbuilders are doing
[23:33] <Tv> that's not enough
[23:33] <cmccabe1> ?
[23:34] <Tv> sorry, you saying "both" means you don't know what you're talking about
[23:34] <cmccabe1> so one gitbuilder runs one script, the other runs another
[23:34] <Tv> there's more than two
[23:34] <cmccabe1> no options on either
[23:35] <cmccabe1> I think you are confusing unrelated things
[23:35] <cmccabe1> the first is how many configurations we should build and test
[23:35] <cmccabe1> the second is whether developers should be able to easily figure out what settings those build machines are using
[23:36] <cmccabe1> I mean it's kind of silly to say "I won't use do_autogen.sh because it supports more than one configuration" and "I want to support a bazillion gitbuilder configurations"
[23:37] <Tv> wrong axis, but you're not helping and are distracting, so i'm not having this conversation now
[23:37] <Tv> we have a red build
[23:38] <cmccabe1> I wonder if someone is clearing env in make check?
[23:38] <Tv> way, way ahead of you
[23:42] <Tv> there we go, that's the culprit
[23:43] <Tv> commit 5a0bc6b78f2e40ec9255a1ea49f77ef9ea4690a6
[23:43] <Tv> Author: Tommi Virtanen <tv@hq.newdream.net>
[23:43] <Tv> Date: 2011-01-14 16:39:38 -0800
[23:43] <Tv> Sanitize environment before running clitests.
[23:43] <Tv>
[23:43] <Tv> This avoids CEPH_KEYRING etc from slipping in.
[23:43] <Tv> harrrumph
[23:43] <Tv> ok so explicitly pass CCACHE_DIR through, i guess
[23:45] <Tv> darn bash
[23:45] <Tv> hard to write a case that works for both CCACHE_DIR set and unset
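
Editor's sketch of the idea Tv describes: start from an empty environment and pass `CCACHE_DIR` through only when it is actually set. The real fix lives in the shell-based test runner and is not shown in the log; the Python helper below, including its name and parameters, is hypothetical and only illustrates handling both the set and unset cases.

```python
def sanitized_env(environ, passthrough=("CCACHE_DIR",)):
    """Build a clean environment that keeps only allowed, set variables."""
    # A variable that is unset in `environ` is simply left out, so the
    # child process sees it as unset rather than as an empty string.
    return {name: environ[name] for name in passthrough if name in environ}


# With CCACHE_DIR set, it survives; CEPH_KEYRING is dropped either way.
with_cache = sanitized_env({"CCACHE_DIR": "/tmp/ccache", "CEPH_KEYRING": "k"})
without_cache = sanitized_env({"CEPH_KEYRING": "k"})
```

The resulting dictionary can be handed to `subprocess.run(..., env=...)` so that the child sees only the variables that were allowed through.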
[23:58] * verwilst (~verwilst@dD57672E7.access.telenet.be) Quit (Quit: Ex-Chat)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.