#ceph IRC Log


IRC Log for 2011-01-07

Timestamps are in GMT/BST.

[0:01] <jantje> if i create like 250 directories
[0:02] <jantje> and put 40MB files in each directory
[0:02] <jantje> I get the same
[0:02] <jantje> (in the newly created directory)
[0:05] <jantje> so now i'm sure it has nothing to do with our source tree :P
[0:18] * wido (~wido@fubar.widodh.nl) Quit (Remote host closed the connection)
[0:18] <jantje> it also happens when there is a large file
[0:18] <jantje> just a single large file in the directory
[0:21] * wido (~wido@fubar.widodh.nl) has joined #ceph
[0:32] <jantje> [root@client0 testdir2]# ls -alh
[0:32] <jantje> total 4.2G
[0:32] <jantje> drwxr-xr-x 1 root root 1.8G Jan 6 18:21 .
[0:32] <jantje> drwxr-xr-x 1 root root 13G Jan 6 18:09 ..
[0:32] <jantje> -rw-r--r-- 1 root root 4.2G Jan 6 23:26 file
[0:32] <jantje> shouldn't be '.' report 4.2G as well?
[0:32] <jantje> s/report/reporting
[0:36] <jantje> I think I got it, if the dirstat size is >4GB
[0:36] <jantje> then I get this
[0:37] <jantje> if I do a dirstat while still writing
[0:37] * `gregorg` (~Greg@78.155.152.6) has joined #ceph
[0:37] * gregorg_taf (~Greg@78.155.152.6) Quit (Read error: Connection reset by peer)
[0:37] <jantje> the size gets 'sticky' and is never updated (?)
[0:38] <jantje> so when doing ls -al While creating a file, and the 'dir size' is still < 2GB => OK
[0:38] <jantje> if you know what I mean ...
[0:38] <jantje> now it would be just great if I could end this monologue :-)
[0:39] <gregaf> jantje: just a sec, I need to finish being a moron with our repository and I'll take a look ;)
[0:43] <gregaf> if anybody did a checkout of unstable in the last 40 minutes, it was temporarily busted so just got readjusted, make sure to update your local copies before you try and do any merges
[0:44] <jantje> gregaf: it's ok, i have to go to sleep anyway
[0:44] <jantje> but it would be nice to be able to continue my testing tomorrow
[0:44] <jantje> i have to give my superior some clues if ceph is suitable for our build environment
[0:45] <jantje> well, somewhere next week
[0:45] <jantje> anyway
[0:46] <jantje> i think ceph is great :)
[0:47] <jantje> (a file move also updates the dir size)
[0:47] <gregaf> jantje: how long did you leave it to update?
[0:48] <gregaf> recursive file accounting does lazy updating, I think it can take a few minutes sometimes?
[0:48] <jantje> yea, was thinking that too
[0:48] <jantje> didn't check
[0:49] <gregaf> but if you're running in a 32-bit environment then it looks like it's reporting sizes that are more than 32 bits, which certainly could be breaking something somewhere...
[0:49] <jantje> anyway, i think you should be able to reproduce: 32bit client, 64bit server and dir size >4GB
[0:49] <jantje> yea
[0:49] <jantje> but it's funny because I didn't notice any so far
[0:51] <gregaf> I wasn't following along too well with the previous discussion, your compile is having difficulty finding libraries under this scenario?
[0:51] <jantje> gcc -I/path/to/dir
[0:52] <jantje> gives
[0:52] <jantje> cc1.orig: error: /mnt/ceph/testdir2: Value too large for defined data type
[0:52] <gregaf> k
[0:52] <gregaf> gotta go now, be back later
[0:52] <jantje> strace is at 23:58 < jantje> http://www.fpaste.org/aons/
[0:52] <jantje> i'm going to bed
[0:52] <jantje> nite !
[0:52] <jantje> and thanks!
[0:53] <cmccabe2> nite jantje
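Editor's note: the "Value too large for defined data type" message above is the usual text for EOVERFLOW, which a stat() call built without large-file support returns when a size does not fit in a signed 32-bit off_t. The sketch below only models that range check; it is an assumption, not something confirmed in the log, that cc1 hits exactly this path on the Ceph directory. The function name stat_size_for_abi and the sizes (taken from the ls output above) are illustrative.

```python
import errno
import os


def stat_size_for_abi(size_bytes, off_t_bits=32):
    """Return size_bytes if it fits in a signed off_t of the given width.

    Raises OSError(EOVERFLOW) otherwise, mimicking what a stat() call
    without large-file support does. This is a model, not a real stat().
    """
    max_off_t = 2 ** (off_t_bits - 1) - 1
    if size_bytes > max_off_t:
        raise OSError(errno.EOVERFLOW, os.strerror(errno.EOVERFLOW))
    return size_bytes


if __name__ == "__main__":
    GIB = 2 ** 30
    for label, size in [("1.8G", int(1.8 * GIB)), ("4.2G", int(4.2 * GIB))]:
        try:
            stat_size_for_abi(size)
            print(label, "fits in a 32-bit off_t")
        except OSError as exc:
            print(label, "fails:", exc.strerror)
```

This matches jantje's observation that a directory size below 2GB is fine: the signed 32-bit limit is 2 GiB minus one byte. Passing off_t_bits=64 models a build with -D_FILE_OFFSET_BITS=64, where the same size is accepted.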
[1:35] <breed> has anyone seen this error? g++ -DHAVE_CONFIG_H -I. -I /home/breed/cryptopp -I /home/breed/libedit -Wall -D__CEPH__ -D_FILE_OFFSET_BITS=64 -D_REENTRANT -D_THREAD_SAFE -rdynamic -g -O2 -MT ceph-ceph.o -MD -MP -MF .deps/ceph-ceph.Tpo -c -o ceph-ceph.o `test -f 'tools/ceph.cc' || echo './'`tools/ceph.cc
[1:35] <breed> tools/ceph.cc:59: error: expected initializer before ‘*’ token
[1:35] <breed> it is this line: static Tokenizer *tok;
[1:35] <breed> i'm not sure where that class comes from
[1:36] <gregaf> breed: what version are you using?
[1:37] <breed> ceph-0.24. it's the latest right?
[1:37] <gregaf> yeah, it's the latest release
[1:38] <breed> i just noticed that i'm using libedit 0.3 and there is a 2.11 on the download site. is Tokenizer in there?
[1:38] <gregaf> I haven't seen that error though and I'd only expect to find build errors in the source
[1:39] <gregaf> *in the source repo
[1:40] <gregaf> cmccabe: any ideas?
[1:40] <breed> ah it is libedit
[1:40] <breed> 2.11
[1:41] <breed> sorry my bad. your question about the version number prompted me to look at the download web page
[1:41] <gregaf> np :)
[1:41] <breed> i'll try it again with the correct libedit :)
[1:42] <cmccabe2> back
[1:42] <gregaf> that's probably something that should get put in the package requirements, but packaging isn't exactly my strong suit
[1:43] <cmccabe2> it needs to go in 3 places: automake, rpm, deb
[1:43] <Tv|work> bwahaha src/crush/test.c won't even compile
[1:44] <cmccabe2> yeah, see, we just check for a header file now.
[1:44] <cmccabe2> that's not going to cut it
[1:44] <Tv|work> anyone have a nice little very functional (data-oriented) set of functions/classes i could play with?
[1:44] <Tv|work> i want to plug in an actual unit test framework
[1:44] <cmccabe2> you could encode some data and then verify that you could decode it
[1:45] <cmccabe2> with one of the ::encode() and ::decode() methods
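Editor's note: the round-trip idea cmccabe2 describes (encode some data, decode it, check you get the same thing back) can be sketched as follows. This is Python's struct module standing in for Ceph's ::encode()/::decode(); the wire layout and the function names here are hypothetical and are not Ceph's encoding.

```python
import struct
import unittest

# Hypothetical layout: little-endian u64 value, u32 string length, UTF-8 bytes.
_HEADER = struct.Struct("<QI")


def encode(value, name):
    """Serialize an (int, str) pair into bytes."""
    raw = name.encode("utf-8")
    return _HEADER.pack(value, len(raw)) + raw


def decode(buf):
    """Parse bytes produced by encode() back into an (int, str) pair."""
    value, length = _HEADER.unpack_from(buf, 0)
    raw = buf[_HEADER.size:_HEADER.size + length]
    if len(raw) != length:
        raise ValueError("truncated buffer")
    return value, raw.decode("utf-8")


class RoundTripSimple(unittest.TestCase):
    def test_round_trip(self):
        original = (4509715660, "testdir2")
        self.assertEqual(decode(encode(*original)), original)

    def test_truncated_buffer_is_rejected(self):
        with self.assertRaises(ValueError):
            decode(encode(1, "abcdef")[:-2])
```

Saved as a module, the tests can be run with python -m unittest. A value above 2^32 is used on purpose so that a decoder that silently truncates to 32 bits would fail the comparison.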
[1:45] <Tv|work> cmccabe2: yeah.. i was slightly scared by relink_command in src/testencoding ;)
[1:46] <Tv|work> just libtool triggering old nightmares
[1:46] <cmccabe2> tv: libtool is used for pretty much everything in ceph
[1:46] <Tv|work> yeah i know, just made me go look at crush first in case that was simpler
[1:47] <cmccabe2> tv: I know what you mean-- sort of-- but at the same time, it's "simple" from the perspective of Makefile.am
[1:47] <cmccabe2> testencoding_SOURCES = test/TestEncoding.cc
[1:47] <cmccabe2> testencoding_LDADD = libceph.la libcrush.la -lpthread -lm -lcrypto
[1:47] <cmccabe2> bin_PROGRAMS += testencoding
[1:48] <Tv|work> cmccabe2: i tend to describe that as "the monster is hidden by the closet it's in"
[1:48] <cmccabe2> tv: there are some things I don't like about libtool, but that's a separate discussion... and probably irrelevant to the question of unit test libraries
[1:49] <breed> the libedit that you download from the ceph download page seems to have a messed up makefile. does anyone know how to compile it?
[1:49] <cmccabe2> tv: I really like CMake. I just think it does everything better than automake/libtool
[1:49] <cmccabe2> tv: but again, that's a separate issue.
[1:51] <cmccabe2> breed: just out of curiosity, what distro are you on
[1:51] <breed> redhat 5.3
[1:51] <cmccabe2> ic
[1:51] * MK_FG (~MK_FG@188.226.51.71) has joined #ceph
[1:52] <breed> it was cake to compile on my ubuntu machine, but i've spent all day now on this redhat machine.
[1:52] <breed> tragically redhat is what we use on our clusters.
[1:53] <cmccabe2> breed: yeah, I dunno how to compile libedit. what's messed up about its makefile
[1:53] <cmccabe2> breed: is it using automake ? :)
[1:53] <breed> i wish (ok not really)
[1:53] <breed> it has a prebuilt makefile, but make on linux doesn't seem to parse it correctly
[1:54] <breed> it seems to be setup for bsd
[1:54] * MK_FG (~MK_FG@188.226.51.71) Quit ()
[1:55] <cmccabe2> might be best to find a source RPM for fedora
[1:56] * MK_FG (~MK_FG@188.226.51.71) has joined #ceph
[1:57] <gregaf> I'm trying to remember who else has gotten this working on RHEL
[1:57] <cmccabe2> if you find the SRPM for fedora and pull out the source, you almost certainly will be able to get it to compile on red hat
[1:57] <breed> yeah i think i'll grab the source from the debian package since that seems to work
[1:57] <breed> what do you guys usually run?
[1:58] <gregaf> we're all on debian here
[1:59] <bchrisman> I got it working on rhel6
[1:59] <bchrisman> But abandoned CentOS 5.4…
[2:00] <bchrisman> There were some other environmental reasons for that.. (our particular tools which will need to be on there)
[2:00] <bchrisman> but something was going back all the way to needing a glibc upgrade.. which basically means new OS…
[2:02] <gregaf> that's odd
[2:03] <gregaf> I'm quite certain people have gotten it working on RHEL5.5, but maybe 5.4 is too far back
[2:06] <bchrisman> ahh.. looking back.. I had some issues with btrfs/btrfsprogs under 5.4… wanted to run ceph on btrfs
[2:08] <bchrisman> path of least resistance was rhel6
[2:08] <breed> i don't have to run on btrfs do i?
[2:08] * Tv|work (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[2:09] <gregaf> nope!
[2:09] <gregaf> it's just the filesystem of first choice, there are enough hooks into it that we can wring out some more performance and things like snapshots are more space-efficient
[2:10] * sentinel_e86 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[2:10] <breed> ah cool ok.
[2:11] <breed> arg it looks like libedit uses pmake
[2:17] <gregaf> all right folks, I'm off for the night though I'll log in from the train :)
[2:20] <cmccabe2> nite
[2:34] * greglap (~Adium@166.205.139.122) has joined #ceph
[3:37] * greglap (~Adium@166.205.139.122) Quit (Read error: Connection reset by peer)
[3:38] * bchrisman (~Adium@70-35-37-146.static.wiline.com) Quit (Quit: Leaving.)
[3:47] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[4:44] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) has joined #ceph
[4:52] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[4:56] * breed (~breed@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Ping timeout: 480 seconds)
[5:21] * breed (~breed@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[5:45] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) has joined #ceph
[6:33] * ijuz_ (~ijuz@p4FFF585D.dip.t-dialin.net) Quit (Ping timeout: 480 seconds)
[6:36] * breed (~breed@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Read error: Connection reset by peer)
[6:42] * ijuz_ (~ijuz@p4FFF63D3.dip.t-dialin.net) has joined #ceph
[6:49] * f4m8_ is now known as f4m8
[6:57] * breed (~breed@68-186-58-50.dhcp.mdfd.or.charter.com) has joined #ceph
[7:16] * breed (~breed@68-186-58-50.dhcp.mdfd.or.charter.com) Quit (Ping timeout: 480 seconds)
[7:25] * MarkN (~nathan@59.167.240.178) Quit (Quit: Leaving.)
[7:27] * breed (~breed@nat-dip6.cfw-a-gci.corp.yahoo.com) has joined #ceph
[7:55] * breed (~breed@nat-dip6.cfw-a-gci.corp.yahoo.com) Quit (Ping timeout: 480 seconds)
[8:06] * cmccabe2 (~cmccabe@adsl-76-202-117-29.dsl.pltn13.sbcglobal.net) Quit (Quit: Leaving.)
[8:51] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[8:52] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) has joined #ceph
[8:58] * allsystemsarego (~allsystem@188.27.165.135) has joined #ceph
[9:37] * jiqiren (~jiqiren@c-67-188-179-41.hsd1.ca.comcast.net) has joined #ceph
[9:41] * jiqiren (~jiqiren@c-67-188-179-41.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[9:56] * DJLee (82d8d198@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[10:06] * Yoric (~David@213.144.210.93) has joined #ceph
[10:26] <stingray> fffuuu
[10:27] <stingray> rage thread go
[10:57] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[10:57] * Yoric (~David@213.144.210.93) has joined #ceph
[11:31] <jantje> gregaf: the lazy dir size update ... I don't think it's happening, the dir size is still wrong after 8hours
[12:40] * verwilst (~verwilst@router.begen1.office.netnoc.eu) has joined #ceph
[13:08] <jantje> hi verwilst
[13:08] <jantje> looks like netlog is investigating ceph? ;-")
[13:14] <verwilst> jantje, how's that? :)
[13:15] <verwilst> it's actually detached from netlog :) it's personal shizzle :) but maybe some of the knowledge winds up at netlog, who knows ;)
[13:26] <verwilst> jantje, are you a ceph dev maybe?
[13:38] <jantje> No, just someone who's testing it
[13:39] <jantje> and hopefully be able to deploy it to create a distributed build environment
[13:40] * stingray misread jantje's "deploy" as "destroy"
[13:40] <stingray> ETOOLITTLESLEEP
[13:40] <jantje> i'm doing my best to destroy it!
[13:41] <stingray> I can tell you a few tricks
[13:41] <stingray> 1. hot-add 3 or more osds in rapid succession
[13:42] <stingray> 2. create a zillion of hardlinks, both within the same directory and between directories
[13:42] <jantje> i'm more into basic functionality :-)
[13:42] <stingray> the second one I have failed to produce a proper isolated test case for
[13:42] <jantje> stingray: do you have 64bit servers and a 32bit client by any chance ?
[13:42] <stingray> all my synthetic tests are good
[13:43] <stingray> the only thing that fails is my backup directory, which accumulated stuff from past 18 years
[13:43] <stingray> no, everything I have here is 64bit
[13:44] <jantje> k
[14:44] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[14:44] * Yoric (~David@213.144.210.93) has joined #ceph
[15:06] <verwilst> i'm preparing a few kvm boxes to set up my first ceph environment ;)
[15:10] <ijuz_> jantje: why don't you run a 32bit linux on a 64bit kernel?
[15:11] <ijuz_> s/linux/userland/
[15:19] * allsystemsarego_ (~allsystem@188.25.132.202) has joined #ceph
[15:26] * allsystemsarego (~allsystem@188.27.165.135) Quit (Ping timeout: 480 seconds)
[15:27] * allsystemsarego_ (~allsystem@188.25.132.202) Quit (Read error: Operation timed out)
[15:28] <stingray> why do you want 32-bit clients
[15:28] <stingray> 32 bit clients are ancient
[15:28] <stingray> funny thing, btw
[15:28] <stingray> after adding osd with weight 0.000 it starts moving stuff around, to new device, even with weight 0.000
[15:43] <ijuz_> i understood that it is about some "enterprise ready" compiler
[15:57] <stingray> ijuz_: that takes an xml config file?
[15:58] <stingray> discards it and drives into walls
[15:58] <ijuz_> no remote idea, i don't know such high-tech stuff
[16:00] * Yoric_ (~David@213.144.210.93) has joined #ceph
[16:00] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[16:00] * Yoric_ is now known as Yoric
[16:01] * Yoric (~David@213.144.210.93) Quit ()
[16:01] * Yoric (~David@213.144.210.93) has joined #ceph
[16:23] * zoobab (zoobab@vic.ffii.org) has joined #ceph
[16:23] <zoobab> hi
[16:23] <zoobab> trying ceph on a 2.6.32 openvz kernel
[16:23] <zoobab> how stable is it?
[16:24] <jantje> ijuz_: all our compiler boxes run 32bit os / 32bit kernel, and they like it to stay that way
[16:24] * Yoric (~David@213.144.210.93) has left #ceph
[16:27] <stingray> jantje: too bad :)
[16:33] <jantje> one step at a time :-)
[16:34] <jantje> now I really need sage to wake up .. hehe :)
[16:35] <ijuz_> jantje: so you are using Ceph for production build boxes?
[16:36] <jantje> no
[16:36] <jantje> just testing
[16:47] <jantje> ijuz_: actually, it might be worth a try to use a 64bit kernel
[16:47] <jantje> do i need any additional userland software?
[16:48] <ijuz_> i don't know, sorry, i just know that it is possible
[16:48] <ijuz_> another way might be a chroot
[17:40] <sage> jantje: i'm awake :)
[17:44] * verwilst (~verwilst@router.begen1.office.netnoc.eu) Quit (Quit: Ex-Chat)
[17:50] <sage> wido; can you remind me what your phprados.git url is?
[17:55] * Yoric (~David@213.144.210.93) has joined #ceph
[17:56] * greglap (~Adium@166.205.137.141) has joined #ceph
[17:57] <sage> wido: nm, found it.
[18:01] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:13] * Tv|work (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:38] <stingray> braaaaaaaaaaaaaaaains
[18:43] * greglap (~Adium@166.205.137.141) Quit (Quit: Leaving.)
[18:43] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[18:58] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[18:58] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:03] <stingray> 2011-01-07 21:02:50.871268 7f1d89b6d700 mds0.journaler try_read_entry got 0 len entry at offset 797176418
[19:03] <stingray> 2011-01-07 21:02:50.871334 7f1d89b6d700 mds0.log _replay journaler got error -22, aborting
[19:03] <stingray> 2011-01-07 21:02:50.871347 7f1d89b6d700 mds0.6 boot_start encountered an error, failing
[19:03] <stingray> 2011-01-07 21:02:50.871358 7f1d89b6d700 mds0.6 suicide. wanted up:replay, now down:dne
[19:03] <stingray> 2011-01-07 21:02:50.872294 7f1d8d57e740 stopped.
[19:03] <stingray> now what
[19:03] <stingray> ?
[19:07] <gregaf> stingray: what version and how did you make that happen?
[19:10] <stingray> ceph version 0.24 (commit:ba3a28c1642aad8a3ef8f5cc9e56446d506b963d)
[19:10] <stingray> I was copying files using rsync
[19:10] <stingray> at some point, copy froze
[19:10] <stingray> I restarted the client but it wouldn't mount
[19:11] <stingray> so, I figured, I'll restart mdses
[19:11] <stingray> haha lol now I have broken cluster
[19:12] <gregaf> hmm, where'd you get it from?
[19:12] <stingray> "it" ?
[19:12] <gregaf> I can't find that commit in my repository or in the mailing list archives
[19:13] <stingray> agh
[19:13] <stingray> sure
[19:13] <stingray> 180a4176035521940390f4ce24ee3eb7aa290632
[19:13] <gregaf> oh, right, vanilla .24
[19:13] <gregaf> that's better :)
[19:15] <gregaf> so your MDS journal got corrupted somehow; I've seen this error before but only in a local development branch
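Editor's note: the log lines stingray pasted show replay stopping on a zero-length entry and returning error -22, which is -EINVAL on Linux. A minimal sketch of that kind of failure, assuming a simple length-prefixed journal, is below. The 4-byte length prefix and the function names are hypothetical; this is not the MDS journaler's real on-disk format.

```python
import errno
import struct

# Hypothetical framing: each entry is a 4-byte little-endian length + payload.
_LEN = struct.Struct("<I")


def append_entry(journal, payload):
    """Append one length-prefixed entry to a bytearray journal."""
    journal.extend(_LEN.pack(len(payload)))
    journal.extend(payload)


def replay(journal):
    """Walk the journal and return (entries, error).

    error is 0 when the whole journal parsed cleanly, or -errno.EINVAL when
    a zero-length or truncated entry is found before the end.
    """
    entries = []
    offset = 0
    while offset < len(journal):
        if offset + _LEN.size > len(journal):
            return entries, -errno.EINVAL
        (length,) = _LEN.unpack_from(journal, offset)
        start = offset + _LEN.size
        if length == 0 or start + length > len(journal):
            return entries, -errno.EINVAL
        entries.append(bytes(journal[start:start + length]))
        offset = start + length
    return entries, 0


if __name__ == "__main__":
    journal = bytearray()
    append_entry(journal, b"mkdir testdir2")
    journal.extend(b"\x00" * 8)  # simulate a zeroed region in the journal
    append_entry(journal, b"create file")
    entries, err = replay(journal)
    print(len(entries), "entries replayed, error", err)
```

The point of the sketch is that entries after the zeroed region are unreachable to a strict reader even though they may be intact, which is why stingray's question about truncating the journal at the bad offset comes up.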
[19:15] <wido> yehudasa: you there?
[19:16] <stingray> so, how to uncorrupt it?
[19:16] <gregaf> well that depends on how it got corrupted, I'm trying to remember what was happening when I saw it, but I don't think it's relevant here
[19:17] <stingray> you know, all those weird bugs that I constantly encounter are very easy to trigger - you need a huge archives, like, music, or software, and user home dirs
[19:17] <stingray> gregaf: so if on journaler error, I just stop processing and truncate journal
[19:17] <stingray> recreating it from scratch
[19:17] <stingray> truncate journal, how does it work
[19:18] <gregaf> sorry, I'm trying to carry on a couple different conversations here, one moment
[19:21] <yehudasa> wido: yes
[19:21] <gregaf> stingray: do you have mds logs I can look at?
[19:21] <gregaf> I want to figure out where this came from
[19:22] <stingray> only after it got corrupted, I enabled debug and tried to run it
[19:23] <gregaf> that's fine, I can still get a few clues out of it
[19:25] <stingray> bzipping
[19:25] <gregaf> cool, thanks
[19:26] <yehudasa> iggy: as you may remember, we've come a long way
[19:26] <yehudasa> wrong window!
[19:26] <wido> yehudasa: about the librados versioning, I just updated the issue. Any plans to implement the version in the C++ librados?
[19:27] <yehudasa> wido: I'm not sure.. was thinking about just including the C header there
[19:27] <wido> Oh, that would be fine too, I'd just like to use it in phprados
[19:28] <yehudasa> I don't want to have two copies of that code, and I'm not sure about having a 3rd common .h file
[19:30] <wido> I get it, that wouldn't be nice. Having to bump the version in two places, that will surely be forgotten in the future
[19:31] <cmccabe> wido: also, using extern C, including the C header should work fine
[19:32] <wido> cmccabe: Yes, it would. But that would make the C++ code depending on the C header
[19:32] <wido> And there is a rados_version() method in librados.h, isn't available yet in the C++ version
[19:32] <wido> Not a problem for now, but it would be nice if it could be there at some point
[19:33] <yehudasa> wido: I can add a rados->version() method now
[19:33] <wido> yehudasa: If that could be done without too many troubles, it would be nice
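Editor's note: wido's use case is reading the librados version from a language binding. A hedged sketch of doing that against the C API from Python with ctypes is below. It assumes the C function has the signature void rados_version(int *major, int *minor, int *extra); check that against the librados.h you have installed. The wrapper name librados_version is hypothetical, and it returns None when the library cannot be found or loaded.

```python
import ctypes
import ctypes.util


def librados_version():
    """Return (major, minor, extra) from the C librados, or None if unavailable."""
    path = ctypes.util.find_library("rados")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        fn = lib.rados_version
    except (OSError, AttributeError):
        return None
    # Assumed signature: void rados_version(int *major, int *minor, int *extra)
    fn.argtypes = [ctypes.POINTER(ctypes.c_int)] * 3
    fn.restype = None
    major, minor, extra = ctypes.c_int(), ctypes.c_int(), ctypes.c_int()
    fn(ctypes.byref(major), ctypes.byref(minor), ctypes.byref(extra))
    return major.value, minor.value, extra.value


if __name__ == "__main__":
    version = librados_version()
    if version is None:
        print("librados not found")
    else:
        print("librados version %d.%d.%d" % version)
```

Going through the C entry point keeps a single source for the version number, which is the concern yehudasa and wido raise about bumping it in two places.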
[19:36] <wido> yehudasa: I've got a VM with Qemu-RBD only right now, it has been up for two weeks :)
[19:36] <wido> doing some small I/O's, not a really busy machine. But it keeps working
[19:43] <Tv|work> [ OK ] Encoding.RoundTripSimple (0 ms)
[19:43] <Tv|work> ...
[19:43] <Tv|work> [==========] 1 test from 1 test case ran. (0 ms total)
[19:43] <Tv|work> [ PASSED ] 1 test.
[19:43] <Tv|work> whee unit test framework
[19:44] <yehudasa> wido: cool!
[19:44] <cmccabe> :)
[19:47] <yehudasa> wido: I pushed something. There's rados->version() and also I included librados.h in .hpp so you should get the version defines
[19:58] <wido> yehudasa: tnx!
[20:01] * ElectricBill (~bill@smtpv2.cosi.net) has joined #ceph
[20:16] <stingray> this is full of fail\
[20:17] <gregaf> sorry man
[20:18] <gregaf> it's in development and the only way to find this stuff is for people to try it out
[20:19] <stingray> I'm not frustrated about the cluster :)
[20:19] <stingray> well, not just it
[20:22] <sagewk> meeting!
[20:40] * ajnelson (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[20:58] <gregaf> stingray: can you run the dumpjournal program and post up the resulting file?
[20:59] <gregaf> I was hoping the logs would reveal an obvious bug but they aren't so we're going to need to look at this in more detail
[21:01] <stingray> gregaf: sure. Where's this dumpjournal ?
[21:01] <gregaf> are you building from source?
[21:01] <stingray> yeah, but using srpms and koji
[21:01] <stingray> I'll compile it
[21:02] <gregaf> wait, never mind
[21:02] <gregaf> I forgot I moved it into the cmds binary
[21:03] <gregaf> so the command is, let me check...
[21:05] <gregaf> cmds -i 0 --dump-journal
[21:05] <gregaf> it should read the journal and produce a file mds.journal.dump in the directory you run it from
[21:05] <stingray> hmm. in which version?
[21:05] <gregaf> v0.24
[21:05] <gregaf> has it, anyway, I just ran it
[21:06] <gregaf> apparently it's crashing too but that seems to be on shutdown
[21:06] <stingray> ah yep I was looking into src/mds/* not in src/cmds.cc
[21:06] <gregaf> heh
[21:06] <gregaf> just the normal cmds binary :)
[21:07] <stingray> journal is 4194304~794223255
[21:07] <stingray> ./include/Context.h: In function 'bool C_Gather::sub_finish(void*, int, int)':
[21:07] <stingray> ./include/Context.h:123: FAILED assert(waitfor.count(num))
[21:09] <gregaf> that's the output from cmds -i 0 --dump-journal?
[21:09] <stingray> (and no file created)
[21:09] <stingray> yeah
[21:09] <stingray> let me gdb this
[21:11] <gregaf> that's....odd
[21:12] <stingray> yeah
[21:13] <stingray> 1: cmds() [0x4b5659]
[21:13] <stingray> 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x5de) [0x69454e]
[21:13] <stingray> 3: (Dumper::ms_dispatch(Message*)+0x41) [0x4b4401]
[21:13] <stingray> 4: (SimpleMessenger::dispatch_entry()+0x621) [0x4a5e21]
[21:13] <stingray> 5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x49a3cc]
[21:13] <stingray> damn
[21:14] <stingray> ok, I'll crawl home
[21:14] <stingray> will try to beat this thing into submission later on weekend
[21:15] <gregaf> okay
[21:15] <gregaf> I'll see if I can reproduce this journal dump problem here and figure out if it's related to your mds journal problems
[22:13] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[22:36] * ken_barber (~kbarber@93-97-221-206.zone5.bethere.co.uk) has joined #ceph
[22:39] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[22:48] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[22:58] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[23:25] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[23:28] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[23:34] * NoahWatkins (~NoahWatki@soenat3.cse.ucsc.edu) has joined #ceph
[23:39] <NoahWatkins> I just tossed Ceph on Ubuntu 10.10 using pre-compiled packages from apt-get. The version is being reported as "ceph version 0.21 (090436f5)". I was expecting 0.24, and indeed a few weeks ago things looked correct with 0.23.1.
[23:42] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[23:46] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit ()
[23:51] <ajnelson> sagewk: Where does dout(n) for the Synthetic Client go?
[23:53] <gregaf> ajnelson: what do you mean where does it go?
[23:54] <gregaf> if you give it a log file on startup it'll go into that file
[23:54] <gregaf> although the synthetic client is old and crufty, I'm not sure if it even works at this point?
[23:54] <ajnelson> It somewhat works.
[23:55] <ajnelson> And there are dout(n) statements in there,
[23:55] <ajnelson> but grepping for something that should have shown up in the logs, through dev/, log/ and out/ (following the "Simple Test Setup" wiki page) didn't give me anything.
[23:55] <gregaf> did you pass it a log file location to use, or a debug level to print out at?
[23:56] <gregaf> by default it will only log dout level 0, and that will print to either standard out or standard error (not sure which)
[23:56] <ajnelson> I just used the debug level with `./vstart.sh -d -n -l`, and I thought the -d flag bumped up the debug level to max.
[23:56] <gregaf> if you want higher log levels or to log to a file you need to tell it to
[23:56] <gregaf> ah, nope
[23:56] <ajnelson> Oh.
[23:57] <gregaf> vstart sets the debug level on the server daemons it starts via the command-line flags, to near-max (but not all the way)
[23:57] <gregaf> it doesn't set global prefs or anything though
[23:57] <ajnelson> Ooooh.
[23:57] <gregaf> so I think the synthetic client uses the normal client flag
[23:57] <gregaf> --debug_client x
[23:57] <gregaf> where x is the debug level
[23:57] <ajnelson> Oh, ok.
[23:57] <gregaf> and you'll want to provide a log file
[23:57] <gregaf> --log-file=out/synclient.log
[23:58] <ajnelson> Ok, I'll try that...
[23:58] <gregaf> :)
[23:59] <cmccabe> ajnelson: synthetic client sends its output to stderr
[23:59] <gregaf> cmccabe just changed the dout system a bit, I don't know exactly how
[23:59] <cmccabe> ajnelson: at least in the unstable branch. Doesn't that program always run in the foreground?
[23:59] <gregaf> but those changes will only apply if you're on a current unstable
[23:59] <ajnelson> cmccabe: Aye, that program runs in the foreground.
[23:59] <ajnelson> I'm not on current unstable, I branched from v0.24.

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.