#ceph IRC Log


IRC Log for 2010-11-08

Timestamps are in GMT/BST.

[0:35] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[1:05] * MarkN (~nathan@ Quit (Ping timeout: 480 seconds)
[1:13] * MarkN (~nathan@ has joined #ceph
[1:29] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[1:37] * p-static (~ravi@ has joined #ceph
[1:38] <p-static> hey, is it just me, or is the sample config file broken?
[1:38] <p-static> it specifies the osd journal as a file, and doesn't specify "osd journal size", so nothing works
[2:05] * terang (~me@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)
[2:22] * terang (~me@ip-66-33-206-8.dreamhost.com) has joined #ceph
[3:52] * Jiaju (~jjzhang@ Quit (Remote host closed the connection)
[6:32] * Jiaju (~jjzhang@ has joined #ceph
[7:14] * terang (~me@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[7:53] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) has joined #ceph
[7:56] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[8:15] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[8:22] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[9:03] * andret (~andre@pcandre.nine.ch) Quit (Remote host closed the connection)
[9:05] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[9:05] * andret (~andre@pcandre.nine.ch) has joined #ceph
[10:14] * Yoric (~David@ has joined #ceph
[11:19] * f4m8_ is now known as f4m8
[11:38] * allsystemsarego (~allsystem@ has joined #ceph
[12:27] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[14:30] <jantje_> Hi
[14:31] <jantje_> I have another MON core dump
[14:31] * jantje_ is going to look for a kick-start-gdb guide :-)
[15:36] * MarkN (~nathan@ Quit (Ping timeout: 480 seconds)
[15:39] * MarkN (~nathan@ has joined #ceph
[15:52] <jantje_> I have 1 mon, 1 mds and 2 osd's on each server
[15:52] <jantje_> should I be able to mount the ceph cluster using the second or third mds ?
[15:53] <jantje_> I can only mount using the first one, or are the other MDSs refusing connections until they detect the 'master' MDS is down?
[15:59] <sagewk> jantje_: you always mount against the monitor ips. the load balancing and handoff between mds's is handled internally
[16:01] <jantje_> oh i see, but I also have mon's on the other servers
[16:02] <sagewk> jantje_ is the mon crash reproducible? would be nice to have the full mon logs, 'debug mon = 20'
[16:05] <jantje_> somehow it just started after trying some more
[16:06] <jantje_> but I was doing a cvs checkout from our source tree, and I think all my osd's just died, i'll have to take a closer look
[16:07] <sagewk> btw you might want to switch to the 'rc' branch (soon to be v0.23). unstable just got a bunch of untested code merged in.
[16:10] <jantje_> #0 0x0000000000575de6 in ceph::buffer::ptr::release (this=0x0, bl=...) at ./include/buffer.h:428
[16:10] <jantje_> #1 ~ptr (this=0x0, bl=...) at ./include/buffer.h:387
[16:10] <jantje_> #2 FileJournal::do_write (this=0x0, bl=...) at os/FileJournal.cc:620
[16:10] <jantje_> (something from last week...)
[16:10] <jantje_> not so relevant I guess
[16:17] <sagewk> was it during osd shutdown or something? i wouldn't expect this=NULL unless it's a thread teardown problem
[16:18] <jantje_> can't tell, I even might have used the wrong binary with gdb
[16:33] <jantje_> you don't know by any chance how to send the sysrq-b key combo to a serial console connected to a terminal server?
[16:33] <jantje_> :)
[16:33] <sagewk> it's the break character
[16:33] <sagewk> not sure what the ascii code for it is. on our terminal servers it's 'control-z b'
[16:36] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[16:38] * greglap (~Adium@cpe-76-90-74-194.socal.res.rr.com) Quit (Quit: Leaving.)
[16:53] * greglap (~Adium@ has joined #ceph
[17:17] * Yoric (~David@ Quit (Quit: Yoric)
[17:50] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[18:22] * cmccabe (~cmccabe@dsl081-243-128.sfo1.dsl.speakeasy.net) has joined #ceph
[18:34] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:35] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:50] * greglap (~Adium@ Quit (Ping timeout: 480 seconds)
[18:53] <jantje_> sagewk: I reproduced it by .. euhm .. doing nothing :-)
[18:53] <jantje_> I got the logs, uploading them right now
[18:54] <jantje_> http://jan.sin.khk.be/bug.tar.gz
[18:55] <sagewk> is this the mon or filestore crash?
[19:05] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:14] <wido> I'm seeing a btrfs bug with 2.6.37, when I run cosd -i 8 -c /etc/ceph/ceph.conf --mkfs --mkjournal --monmap monmap, the machine gets a kernel panic
[19:14] <wido> got remote syslog set-up, but nothing gets logged
[19:14] <wido> anyone seen this before?
[19:17] <wido> i've got something else too, when running "ceph -w" I suddenly get old loglines being printed on my screen, I guess a few thousand
[19:18] <cmccabe> loglines that you've seen before you mean?
[19:18] <wido> yes, old logs from this afternoon
[19:19] <wido> but they get a recent timestamp
[19:19] <yehudasa> wido: 2.6.37? you mean rc1?
[19:19] <wido> yes, rc1 indeed
[19:20] <wido> hmm, I can simply invoke it by doing: touch foo; sync; rm foo
[19:20] <yehudasa> there were a few btrfs issues that we've seen but the fixes were supposed to get pushed to -rc1
[19:20] <wido> tried a fresh mkfs, even overwrote the partition with zeros, didn't help either
[19:20] <wido> pretty weird, same kernel is running fine on other machines
[19:21] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[19:44] <jantje_> sagewk: MON
[19:47] * jantje_ afk
[19:59] * Meths_ (rift@ has joined #ceph
[20:06] * Meths (rift@ Quit (Ping timeout: 480 seconds)
[20:06] * Meths_ is now known as Meths
[20:09] * Meths_ (rift@ has joined #ceph
[20:11] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[20:14] * Meths__ (rift@ has joined #ceph
[20:16] * Meths (rift@ Quit (Ping timeout: 480 seconds)
[20:17] * Meths_ (rift@ Quit (Ping timeout: 480 seconds)
[20:19] * Meths__ is now known as Meths
[21:03] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[21:03] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)
[22:09] * allsystemsarego (~allsystem@ has joined #ceph
[22:09] * darkfader (~floh@host-82-135-62-109.customer.m-online.net) Quit (Read error: Connection reset by peer)
[22:11] * darkfader (~floh@host-82-135-62-109.customer.m-online.net) has joined #ceph
[23:45] <jantje_> sagewk: I hope the logs could be of any help
[23:50] <jantje_> I have to run some benchmarks tomorrow
[23:50] <jantje_> probably best not to use the unstable branch
[23:50] <jantje_> so rc branch, right?
[23:51] <jantje_> unless there are some cool things in the unstable branch that can speed up things significantly?
[23:57] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[23:57] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.