#ceph IRC Log


IRC Log for 2011-06-15

Timestamps are in GMT/BST.

[0:00] <Tv> i think i was just careful as it was security-relevant
[0:00] <Tv> though re-installing will change host keys etc etc :(
[0:00] * alexxy (~alexxy@79.173.81.171) Quit (Remote host closed the connection)
[0:01] <sagewk> ignoring missing keys is pbly ok here
[0:01] <sagewk> whatever user runs the automated jobs can have a .ssh/config that tolerates changed keys
[0:01] <sagewk> did you manually fiddle /etc/fuse.conf on your nodes? the ones i grabbed don't have user_allow_other
[0:04] <Tv> that means they have just base install, not "part of sepia" customizations
[0:04] <Tv> that part is not automated :(
[0:04] <Tv> give me hostnames and i'll run it
[0:05] <sagewk> - ubuntu@sepia75.ceph.dreamhost.com
[0:05] <sagewk> - ubuntu@sepia77.ceph.dreamhost.com
[0:05] <sagewk> - ubuntu@sepia79.ceph.dreamhost.com
[0:05] <sagewk> is that a script in teuthology.git?
[0:06] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[0:06] <Tv> nope
[0:06] <Tv> sorry several conversations at once
[0:08] <Tv> somebody is using 75
[0:08] <Tv> it has ceph debs installed on it
[0:08] <Tv> in a broken state
[0:09] <Tv> same thing with 77
[0:09] <Tv> and 79
[0:09] <Tv> not touching those with a long stick
[0:11] <joshd> Tv: I think those are leftover from sjust's larger scale OSD testing
[0:11] <sjust> yup, those were me
[0:11] <Tv> well those boxes are broken as far as i'm concerned
[0:11] <Tv> and shouldn't have been unlocked, then
[0:12] <sjust> ah, I didn't unlock 'em
[0:13] <Tv> so how did sage get 'em?
[0:13] <sagewk> they were unlocked :)
[0:13] <Tv> do we have an unlocking poltergeist?
[0:14] <joshd> I unlocked them when sam told me he didn't need them anymore (I didn't realize they'd been customized)
[0:15] <sjust> fixing them now
[0:34] * alexxy (~alexxy@79.173.81.171) Quit (Remote host closed the connection)
[0:37] * verwilst (~verwilst@dD5769271.access.telenet.be) Quit (Quit: Ex-Chat)
[0:45] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) Quit (Remote host closed the connection)
[0:46] <lx0> minor nit: mount cephsrv:/ /mnt/point gives me "modprobe: command not found", presumably because mount takes /sbin (where modprobe is) out of PATH
[0:47] <lx0> I suppose hard-coding /sbin/modprobe in mount.ceph is not the way to go, but adding "|| /sbin/modprobe ceph" doesn't sound great either. any other ideas?
[0:49] * wilfrid (5138106c@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[0:51] <lx0> mount succeeds anyway, for mount() gets the module loaded on its own
[0:52] <lx0> so another alternative would be to redirect the modprobe output to the bit bucket
[0:53] <Tv> lx0: oh huh, that's interesting
[0:54] <Tv> lx0: could you please file a bug so i don't forget that? today is too full of interruptions (and i kinda want to have your contact info on the bug)
[0:54] <lx0> or perhaps only trying modprobe if the first call to mount fails, and then retry if modprobe succeeds
[0:55] <lx0> I was aiming at posting a patch, but sure
[0:55] <cmccabe> running shell commands from a binary run by superuser is a big no-no
[0:58] <cmccabe> I guess using fork + execl would make it slightly better, but the real question is whether we need to do that at all
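The ideas floated above (run modprobe without a shell, discard its output, and only try it if the first mount attempt fails) can be sketched roughly as follows. This is a minimal illustration, not the actual mount.ceph code: `run_quiet`, `mount_with_modprobe_fallback`, and `fake_mount` are hypothetical names, and `fake_mount` merely stands in for the real mount(2) call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path arg` via fork + execl (no shell involved), sending the child's
 * stdout and stderr to /dev/null. Returns the child's exit status,
 * or -1 if the child could not be started or did not exit normally. */
static int run_quiet(const char *path, const char *arg)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: redirect output to the bit bucket, then exec. */
        int devnull = open("/dev/null", O_WRONLY);
        if (devnull >= 0) {
            dup2(devnull, STDOUT_FILENO);
            dup2(devnull, STDERR_FILENO);
            if (devnull > STDERR_FILENO)
                close(devnull);
        }
        execl(path, path, arg, (char *)NULL);
        _exit(127); /* exec failed, e.g. the binary does not exist */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Callback type for one mount attempt; returns 0 on success. */
typedef int (*mount_fn)(void *ctx);

/* Try to mount first. Only if that fails, load the module and retry once.
 * If modprobe itself fails, report the original mount error. */
static int mount_with_modprobe_fallback(mount_fn try_mount, void *ctx,
                                        const char *modprobe_path,
                                        const char *module)
{
    int rc = try_mount(ctx);
    if (rc == 0)
        return 0;
    if (run_quiet(modprobe_path, module) != 0)
        return rc;
    return try_mount(ctx);
}

/* Hypothetical stand-in for the real mount call, for illustration only:
 * fails on the first attempt and succeeds on later ones. */
static int fake_mount(void *ctx)
{
    int *attempts = ctx;
    return (*attempts)++ == 0 ? -1 : 0;
}
```

A caller would pass an absolute path such as /sbin/modprobe, which sidesteps the PATH problem lx0 describes. Whether the helper should invoke modprobe at all remains the open question cmccabe raises, since mount() reportedly gets the module loaded on its own.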
[1:00] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[1:14] * allsystemsarego (~allsystem@188.25.130.212) Quit (Quit: Leaving)
[2:13] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Read error: Operation timed out)
[2:21] * arken420 (~mike@75-138-193-69.static.snfr.nc.charter.com) has joined #ceph
[2:25] * arken420 (~mike@75-138-193-69.static.snfr.nc.charter.com) Quit (Quit: TinyIRC 1.1)
[2:28] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[2:31] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[2:35] * arken420 (~mike@75-138-193-69.static.snfr.nc.charter.com) has joined #ceph
[2:39] * arken420 (~mike@75-138-193-69.static.snfr.nc.charter.com) Quit ()
[2:39] * bchrisman (~Adium@70-35-37-146.static.wiline.com) Quit (Quit: Leaving.)
[2:57] * cmccabe (~cmccabe@208.80.64.174) has left #ceph
[3:18] <lx0> Tv, sorry got a phone call and forgot to mention I opened http://tracker.newdream.net/issues/1188
[3:58] * yoshi (~yoshi@p24092-ipngn1301marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[5:52] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) Quit (Quit: Leaving)
[5:52] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) has joined #ceph
[6:17] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[6:34] * sage1 (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[6:34] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit (Read error: Connection reset by peer)
[6:35] * sage1 (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has left #ceph
[6:36] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[6:37] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) Quit ()
[6:38] * sage (~sage@dsl092-035-022.lax1.dsl.speakeasy.net) has joined #ceph
[7:05] * jantje (~jan@paranoid.nl) Quit (Read error: Connection reset by peer)
[7:05] * jantje (~jan@paranoid.nl) has joined #ceph
[7:23] * johnl (~johnl@johnl.ipq.co) Quit (Remote host closed the connection)
[7:23] * johnl (~johnl@johnl.ipq.co) has joined #ceph
[8:55] * alexxy[home] (~alexxy@79.173.81.171) has joined #ceph
[9:00] * alexxy (~alexxy@79.173.81.171) Quit (Ping timeout: 480 seconds)
[9:39] * allsystemsarego (~allsystem@188.25.130.212) has joined #ceph
[10:27] * mr_fribble (591e7c93@ircip3.mibbit.com) has joined #ceph
[10:31] * bhem (~bhem@tor1.digineo.de) has joined #ceph
[10:38] * fred_ (~fred@80-219-183-100.dclient.hispeed.ch) has joined #ceph
[10:39] * bhem (~bhem@82VAAB1SJ.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[11:09] * bhem (~bhem@1GLAAB82E.tor-irc.dnsbl.oftc.net) has joined #ceph
[11:45] * yoshi (~yoshi@p24092-ipngn1301marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:54] * allsystemsarego (~allsystem@188.25.130.212) Quit (Ping timeout: 480 seconds)
[12:02] * allsystemsarego (~allsystem@188.27.164.204) has joined #ceph
[12:08] * allsystemsarego (~allsystem@188.27.164.204) Quit (Quit: Leaving)
[12:08] * alexxy[home] (~alexxy@79.173.81.171) Quit (Ping timeout: 480 seconds)
[12:08] * allsystemsarego (~allsystem@188.27.164.204) has joined #ceph
[12:13] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[12:20] * mr_fribble (591e7c93@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[12:27] * alexxy[home] (~alexxy@79.173.81.171) has joined #ceph
[12:32] * alexxy (~alexxy@79.173.81.171) Quit (Ping timeout: 480 seconds)
[13:22] * mib_ksojc6 (591e7c93@ircip1.mibbit.com) has joined #ceph
[13:25] * arken420 (~mike@75-138-193-69.static.snfr.nc.charter.com) has joined #ceph
[13:43] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) has joined #ceph
[13:55] * macana (~ml.macana@159.226.41.129) has joined #ceph
[14:48] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[14:49] * alexxy[home] (~alexxy@79.173.81.171) Quit (Ping timeout: 480 seconds)
[14:50] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[14:51] * Yulya_ (~Yu1ya_@ip-95-220-189-27.bb.netbynet.ru) Quit (Ping timeout: 480 seconds)
[16:31] * pombreda (~Administr@109.128.207.189) has joined #ceph
[16:51] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) Quit (Remote host closed the connection)
[16:56] * mtk (~mtk@ool-182c8e6c.dyn.optonline.net) has joined #ceph
[17:27] * bhem (~bhem@1GLAAB82E.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[17:46] * yoshi (~yoshi@KD027091032046.ppp-bb.dion.ne.jp) Quit (Remote host closed the connection)
[17:53] * greglap (~Adium@166.205.140.142) has joined #ceph
[18:00] * Yulya_ (~Yu1ya_@ip-95-220-186-126.bb.netbynet.ru) has joined #ceph
[18:02] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:18] <greglap> slang: did you get help with your crushmap thing yet?
[18:19] <slang> greglap: not yet no
[18:19] <slang> greglap: everything seems fine if I set the crushmap with ceph osd setcrushmap
[18:20] <slang> greglap: with mkcephfs --crushmap though, it seems to make mounting impossible
[18:20] <greglap> okay, I'm not an expert with crushmaps but I'll give this a go :)
[18:20] <greglap> can you post your crushmap somewhere?
[18:21] <greglap> and are you sure that setcrushmap is actually setting it?
[18:21] <greglap> it's possible that eg your crushmap is malformed so when you set it up with it initially you don't have a metadata pool, or proper rules for it
[18:23] <slang> greglap: when I do getcrushmap after setcrushmap, it looks like it's been set
[18:23] <slang> greglap: one sec and I can post the crushmaps
[18:31] <slang> http://pastebin.com/raw.php?i=JSe3LYNY
[18:31] <slang> greglap: that's the original that gets created by default
[18:31] <slang> http://pastebin.com/raw.php?i=U5F6VdfP
[18:32] <slang> greglap: that's the one I was trying to set with mkcephfs --crushmap, and am setting with ceph osd setcrushmap
[18:33] <greglap> okay, the only thing I see is silly, but I bet it's because you have pool root id -11
[18:33] <greglap> maybe sagewk can check it too?
[18:34] <greglap> it might need the rule ids to be sequential
[18:34] <sagewk> hmm that shouldn't be it
[18:36] <greglap> it seems that this works(? apparently) if he uses setcrushmap, but doesn't work if he sets the crushmap during mkcephfs
[18:40] <sagewk> need to just test it i think, nothing comes to mind
[18:40] * greglap (~Adium@166.205.140.142) Quit (Read error: Connection reset by peer)
[18:45] * Yulya_ (~Yu1ya_@ip-95-220-186-126.bb.netbynet.ru) Quit (Ping timeout: 480 seconds)
[18:56] <bchrisman> no standup (sitdown) fer me today...
[18:59] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:00] * cmccabe (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) has joined #ceph
[19:13] * mib_ksojc6 (591e7c93@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[19:15] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[19:31] <sagewk> standup
[19:51] * pombreda (~Administr@109.128.207.189) Quit (Quit: Leaving.)
[19:51] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:52] <slang> greglap: sagewk: I just tried it again with mkcephfs --crushmap and it worked fine
[19:53] <slang> greglap: sagewk: not sure what was happening before -- sorry for the false alarm
[19:53] <gregaf> cool, glad it worked out for you!
[19:55] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit ()
[20:03] <Tv> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x371) [0x5fcf61]
[20:03] <Tv> 10: (SimpleMessenger::Pipe::accept()+0x2a97) [0x613a27]
[20:03] <Tv> anyone interested in a cluster that just had that happen on one osd
[20:03] <Tv> 2011-06-15 11:00:30.772196 7f16b2a74700 osd1 3 pg[0.0( empty n=0 ec=2 les/c 2/2 3/3/3) [0,1] r=1 stray] state<Start>: transitioning to Stray
[20:04] <Tv> that might be relevant, don't see much else special in the logs
[20:05] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[20:11] <gregaf> Tv: do you have line numbers for that assert?
[20:11] <Tv> that's all it spewed out
[20:11] <Tv> sorry moved on already
[20:12] <gregaf> well I don't have time right now but if you saved a core I can probably look at it later, or maybe sjust wants to
[20:12] <sjust> hmm
[20:12] <Tv> no cores yet with teuthology, haven't done that work yet :(
[20:14] * stefanha (~stefanha@yuzuki.vmsplice.net) has joined #ceph
[20:16] * fred_ (~fred@80-219-183-100.dclient.hispeed.ch) Quit (Quit: Leaving)
[20:22] * fred_ (~fred@80-219-183-100.dclient.hispeed.ch) has joined #ceph
[20:46] <Tv> INFO:teuthology.task.ceph.mds.0.err:./include/xlist.h: In function 'xlist<T>::~xlist() [with T = Capability*]', in thread '0x7f659db08700'
[20:46] <Tv> INFO:teuthology.task.ceph.mds.0.err:./include/xlist.h: 63: FAILED assert(_size == 0)
[20:46] <Tv> this seems to be easy today
[21:13] * alexxy (~alexxy@79.173.81.171) Quit (Remote host closed the connection)
[21:21] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[21:22] * alexxy (~alexxy@79.173.81.171) Quit (Remote host closed the connection)
[21:27] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[22:16] * verwilst (~verwilst@dD5769416.access.telenet.be) has joined #ceph
[22:18] * fred_ (~fred@80-219-183-100.dclient.hispeed.ch) Quit (Quit: Leaving)
[22:42] <Tv> joshd: i want to cherry-pick teuthology commit 11dce556756f006d2e1babd982548c145a87fc1f from your wip_rbd branch
[22:43] <Tv> joshd: if that's ok, just make sure you take that into consideration when rebasing / before pushing
[22:44] <joshd> Tv: that's fine
[23:04] * allsystemsarego (~allsystem@188.27.164.204) Quit (Quit: Leaving)
[23:32] * dr_bibble (5138106c@ircip2.mibbit.com) has joined #ceph
[23:33] <Tv> this again:
[23:33] <Tv> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x371) [0x5fcf61]
[23:33] <Tv> 10: (SimpleMessenger::Pipe::accept()+0x2a97) [0x613a27]
[23:33] <Tv> you have ~2 minutes to claim live debugging rights to this cluster ;)
[23:33] <Tv> sjust: ?
[23:33] <sjust> Tv:

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.