#ceph IRC Log


IRC Log for 2012-02-01

Timestamps are in GMT/BST.

[0:02] <sjust> lxo: all osds are running the same ceph version, right?
[0:02] <sjust> lxo: (I think it would have crashed otherwise, but worth checking)
[0:04] <lxo> yeah, everything on 0.41
[0:09] * Mike (~Mike@awlaptop1.esc.auckland.ac.nz) has joined #ceph
[0:10] * Mike is now known as Guest1164
[0:11] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[0:11] <Guest1164> Hi all, I have a couple of new-b questions
[0:11] <Guest1164> 1) What is the easiest way to identify a failed OSD?
[0:11] <Guest1164> 2) Is gceph available on OpenSuSE
[0:13] <gregaf> Guest1164: 2) you don't want gceph; it's useless :)
[0:13] <gregaf> 1) not sure what you're asking — you mean you see that you have a down OSD in ceph -s and want to identify it?
[0:13] <gregaf> the command "ceph osd dump" will tell you the OSDs and their states, including up/down and in/out
[0:14] <Guest1164> Yep, have 12 OSDs, but only 10 up and 10 in when I run ceph -s
[0:14] <Guest1164> OK, great
[0:14] <Guest1164> And you think geceph is not worth the effort?
[0:14] <Guest1164> Sorry, gceph
[0:14] <gregaf> it doesn't expose any more information than ceph does, and it doesn't illustrate it in a useful way
[0:15] <gregaf> I'm not sure it's even in the repository any more; I thought it was dead and gone
[0:15] <Guest1164> That might be why I had a hard time getting it :)
[0:15] <Guest1164> Ok, thanks I'll go for ceph osd dump
[0:17] <BManojlovic> and gceph crash a lot :(
[0:18] <BManojlovic> you can use it on opensuse it is compiled on OBS just search for it
[0:18] <BManojlovic> latest version
[0:18] <sjust> lxo: that's pretty odd, do you have a log from an osd with osd debugging on?
[0:19] <Guest1164> I don't have a log sorry
[0:19] <Guest1164> The set up is two servers with 6 disks each
[0:20] <Guest1164> When testing using iozone, one of my colleagues made ceph hang
[0:20] <Guest1164> From what I can see one disk from each server has gone down
[0:21] <Guest1164> I'm thinking of trying to repair the disks and then add them back into the ceph fs
[0:21] <Guest1164> Any advice welcome and appreciated!
[0:21] <sjust> Guest1164: what is the output of ceph -s?
[0:23] <Guest1164> 2012-02-01 12:17:48.181632 pg v24486: 2376 pgs: 1 inactive, 2254 active+clean, 63 peering, 58 down+peering; 35470 MB data, 93200 MB used, 18362 GB / 18617 GB avail
[0:23] <Guest1164> 2012-02-01 12:17:48.187251 mds e20: 1/1/1 up {0=0=up:active}, 5 up:standby
[0:23] <Guest1164> 2012-02-01 12:17:48.187310 osd e398: 12 osds: 10 up, 10 in
[0:23] <Guest1164> 2012-02-01 12:17:48.187434 log 2012-01-31 14:32:21.203538 osd2 6118 : [INF] 0.28 scrub ok
[0:23] <Guest1164> 2012-02-01 12:17:48.187569 mon e1: 2 mons at {0=,1=}
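
The `ceph -s` lines pasted above already show how many OSDs are missing and how many placement groups are not yet active+clean. A minimal Python sketch of that arithmetic follows. It assumes the `pg` and `osd` summary lines keep exactly the shape shown in this paste (other Ceph versions may format them differently), and the function names are hypothetical helpers, not part of any Ceph tool.

```python
import re

# Status lines copied from the `ceph -s` paste above (timestamps dropped).
PG_LINE = ("pg v24486: 2376 pgs: 1 inactive, 2254 active+clean, "
           "63 peering, 58 down+peering; 35470 MB data, 93200 MB used, "
           "18362 GB / 18617 GB avail")
OSD_LINE = "osd e398: 12 osds: 10 up, 10 in"


def parse_osd_line(line):
    """Return (total, up, in) from an 'osd eN: T osds: U up, I in' line."""
    m = re.search(r"(\d+) osds: (\d+) up, (\d+) in", line)
    if m is None:
        raise ValueError("unrecognised osd line: %r" % line)
    total, up, in_ = (int(g) for g in m.groups())
    return total, up, in_


def parse_pg_line(line):
    """Return (total_pgs, {state: count}) from a 'pg vN: T pgs: ...' line."""
    m = re.search(r"(\d+) pgs: ([^;]+)", line)
    if m is None:
        raise ValueError("unrecognised pg line: %r" % line)
    total = int(m.group(1))
    states = {}
    for part in m.group(2).split(","):
        count, state = part.split()
        states[state] = int(count)
    return total, states


total, up, in_ = parse_osd_line(OSD_LINE)
print("osds down:", total - up)                      # osds down: 2

pg_total, states = parse_pg_line(PG_LINE)
print("pgs not active+clean:",
      pg_total - states.get("active+clean", 0))      # pgs not active+clean: 122
```

The per-state counts in the paste add up to the reported total of 2376, so the 122 placement groups that are inactive, peering, or down+peering account for everything that is not active+clean.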
[0:23] * ceph (~hylick@ has left #ceph
[0:23] <sjust> do you know how the two down osds died? In particular, there should be a backtrace at the end of their logs
[0:24] <Guest1164> We have thrashed them quite a bit with testing, so we may have worn them out
[0:24] <Guest1164> Sorry, I am a new-b, where do I find those logs?
[0:24] <sjust> usually /var/log/ceph, perhaps?
[0:24] <sjust> it's configurable
[0:25] <Guest1164> OK, on the server with the OSDs or on the server with the client?
[0:25] <Guest1164> We have mostly used defaults so probably /var/log/ceph
[0:25] <sjust> on the server with the osds
[0:27] <Guest1164> OK, I can see the log files for osd3 (one of the down OSDs) - I'll tail it
[0:27] <Guest1164> Sorry, be back in a minute
[0:27] <gregaf> Guest1164: btw, you seem to have created 6 MDSes, but only set one of them active
[0:27] <sjust> ok
[0:27] <gregaf> that isn't hurting anything, but you probably only need one and a standby :)
[0:27] <Tv|work> i give the gift of hardware ;)
[0:28] <Tv|work> 3 lucky winners so far, rest tomorrow
[0:28] <sjust> w00t
[0:28] <Tv|work> and now, the best part about coming to work at zero-dark-thirty: leaving early
[0:30] <lxo> sjust, no debugging logs, sorry
[0:30] <sjust> lxo: would it be possible to kill an osd, add debugging, and restart it?
[0:32] <Guest1164> ss1:~ # tail /var/log/ceph/osd.3.log
[0:32] <Guest1164> 12: (OSD::_share_map_outgoing(entity_inst_t const&)+0x280) [0x563e90]
[0:32] <Guest1164> 13: (OSD::do_queries(std::map<int, std::map<pg_t, PG::Query, std::less<pg_t>, std::allocator<std::pair<pg_t const, PG::Query> > >, std::less<int>, std::allocator<std::pair<int const, std::map<pg_t, PG::Query, std::less<pg_t>, std::allocator<std::pair<pg_t const, PG::Query> > > > > >&)+0x4bf) [0x564a1f]
[0:32] <Guest1164> 14: (OSD::activate_map(ObjectStore::Transaction&, std::list<Context*, std::allocator<Context*> >&)+0x20a) [0x56dc4a]
[0:33] <Guest1164> 15: (OSD::handle_osd_map(MOSDMap*)+0x3283) [0x57a833]
[0:33] <Guest1164> 16: (OSD::_dispatch(Message*)+0x31b) [0x57bdfb]
[0:33] <Guest1164> 17: (OSD::ms_dispatch(Message*)+0xec) [0x57cfec]
[0:33] <Guest1164> 18: (SimpleMessenger::dispatch_entry()+0x8b3) [0x5e4f83]
[0:33] <Guest1164> 19: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4c003c]
[0:33] <Guest1164> 20: (()+0x6a4f) [0x7fef047f2a4f]
[0:33] <Guest1164> 21: (clone()+0x6d) [0x7fef0301791d]
[0:33] <sjust> Guest1164: could you pastebin the last 200 lines?
[0:33] <lxo> sjust, sure! any preference between one of the preloaded osds and one of the nearly-empty ones?
[0:33] <lxo> what debug level?
[0:34] <sjust> lxo: hmm, osd0 would be fine, debug osd = 20, debug ms = 20
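
The two debug settings sjust asks for can be written out as a ceph.conf fragment. The Python sketch below renders one with the standard `configparser` module. It assumes the settings belong in an OSD section of ceph.conf, in the same `key = value` style as the commented-out debug lines in the `[mds]` section pasted later in this log. The section name `osd.2` and the helper name are illustrative choices only.

```python
import configparser
import io

# Debug levels suggested in the conversation; level 20 is very verbose.
DEBUG_SETTINGS = {"debug osd": "20", "debug ms": "20"}


def render_debug_section(section, settings=DEBUG_SETTINGS):
    """Render one ceph.conf-style section holding the given settings."""
    config = configparser.ConfigParser()
    config[section] = settings
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


# Assumption: scope the settings to a single daemon, here osd.2.
fragment = render_debug_section("osd.2")
print(fragment)
```

Scoping the section to one daemon keeps the log volume limited to the OSD being investigated; using `osd` as the section name instead would apply the levels to every OSD that reads the file.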
[0:35] <lxo> I'll make it osd2, for it has more /var disk space
[0:35] <sjust> lxo: sounds good
[0:35] <sjust> lxo: shouldn't really matter whether it's a preloaded osd or not
[0:36] <sjust> lxo: actually, could you do it to one of each?
[0:36] <sjust> that would be ideal
[0:36] <lxo> 'k
[0:36] * Tv|work (~Tv|work@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:41] * adjohn is now known as Guest1166
[0:41] * Guest1166 (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Read error: Connection reset by peer)
[0:41] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[0:56] <Guest1164> The MDS section of my ceph.conf is
[0:56] <Guest1164> [mds]
[0:56] <Guest1164> ; debug ms = 1 ; message traffic
[0:56] <Guest1164> ; debug mds = 20 ; mds
[0:56] <Guest1164> ; debug mds balancer = 20 ; load balancing
[0:56] <Guest1164> ; debug mds log = 20 ; mds journaling
[0:56] <Guest1164> ; debug mds_migrator = 20 ; metadata migration
[0:56] <Guest1164> ; debug monc = 20 ; monitor interaction, startup
[0:56] <Guest1164> [mds.0]
[0:56] <Guest1164> host = ss3
[0:56] <Guest1164> [mds.1]
[0:56] <Guest1164> host = ss4
[0:56] <Guest1164> [osd]
[0:56] <Guest1164> so not sure why 6 MDS?
[0:56] <Guest1164> Just getting the osd.3.log 200 lines now
[1:00] * joao (~joao@89-181-157-105.net.novis.pt) Quit (Quit: joao)
[1:04] <Guest1164> 2011-11-15 07:06:30.783374 7fee7ceb8710 -- >> pipe(0x1aae6280 sd=965 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel (new one lossy=1)
[1:04] <Guest1164> 2011-11-15 07:07:30.881116 7fee7ccb6710 -- >> pipe(0x1aae6000 sd=966 pgs=0 cs=0 l=0).accept peer addr is really (socket is
[1:04] <Guest1164> 2011-11-15 07:07:30.881166 7fee7ccb6710 -- >> pipe(0x1aae6000 sd=966 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel (new one lossy=1)
[1:04] <Guest1164> 2011-11-15 07:08:30.978695 7fee7cab4710 -- >> pipe(0x28445c80 sd=967 pgs=0 cs=0 l=0).accept peer addr is really (socket is
[1:04] <Guest1164> 2011-11-15 07:08:30.978754 7fee7cab4710 -- >> pipe(0x28445c80 sd=967 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel (new one lossy=1)
[1:04] <Guest1164> 2011-11-15 07:09:31.076389 7fee7c8b2710 -- >> pipe(0x28445a00 sd=968 pgs=0 cs=0 l=0).accept peer addr is really (socket is
[1:04] <Guest1164> 2011-11-15 07:09:31.076438 7fee7c8b2710 -- >> pipe(0x28445a00 sd=968 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel (new one lossy=1)
[1:04] <Guest1164> 2011-11-15 07:10:31.174101 7fee7c6b0710 -- >> pipe(0x28445780 sd=969 pgs=0 cs=0 l=0).accept peer addr is really (socket is
[1:05] <Guest1164> 2011-11-15 07:10:31.174150 7fee7c6b0710 -- >> pipe(0x28445780 sd=969 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel (new one lossy=1)
[1:05] <Guest1164> 2011-11-15 07:11:31.271786 7fee7c4ae710 -- >> pipe(0x28445500 sd=970 pgs=0 cs=0 l=0).accept peer addr is really (socket is
[1:05] * BManojlovic (~steki@ Quit (Remote host closed the connection)
[1:05] <Guest1164> Sorry, is there a better way than putting all 200 lines into chat?
[1:05] <sjust> yeah, email it to sam.just@dreamhost.com
[1:05] <gregaf> Guest1164: my guess is your startup scripts are starting a bunch of daemons for some reason
[1:06] <sjust> or pastebin.com
[1:07] <Guest1164> The 200 lines are here
[1:07] <Guest1164> http://pastebin.com/wn4jjpAD
[1:09] <Guest1164> Here is my start script
[1:09] <Guest1164> #!/bin/bash
[1:09] <Guest1164> ssh root@ss3 cmon -i 0 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss4 cmon -i 1 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss3 cmds -i 0 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss4 cmds -i 1 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss1 cosd -i 0 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss1 cosd -i 1 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss1 cosd -i 2 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss1 cosd -i 3 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss1 cosd -i 4 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss1 cosd -i 5 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss2 cosd -i 6 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss2 cosd -i 7 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss2 cosd -i 8 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss2 cosd -i 9 -c /etc/ceph/ceph.conf
[1:09] <sjust> Guest1164: the first problem is that you appear to be running ceph from 6 months ago
[1:09] <Guest1164> ssh root@ss2 cosd -i 10 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> ssh root@ss2 cosd -i 11 -c /etc/ceph/ceph.conf
[1:09] <Guest1164> #ceph osd pool set data size 1
[1:09] <Guest1164> #ceph osd pool set metadata size 1
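
The start script above repeats one ssh line per daemon. As a rough sketch, the same sixteen commands can be generated from a small table of which daemon ids run on which host. The layout below is copied from the script; the names `LAYOUT` and `start_commands` are hypothetical, and the code only builds command strings. It does not run them or check whether a daemon is already running.

```python
CONF = "/etc/ceph/ceph.conf"

# Layout taken from the start script above: daemon binary -> host -> ids.
LAYOUT = [
    ("cmon", {"ss3": [0], "ss4": [1]}),
    ("cmds", {"ss3": [0], "ss4": [1]}),
    ("cosd", {"ss1": range(0, 6), "ss2": range(6, 12)}),
]


def start_commands(layout=LAYOUT, conf=CONF):
    """Build the ssh command lines in the same order as the script."""
    commands = []
    for daemon, hosts in layout:
        for host, ids in hosts.items():
            for daemon_id in ids:
                commands.append(
                    f"ssh root@{host} {daemon} -i {daemon_id} -c {conf}")
    return commands


for command in start_commands():
    print(command)
```

Keeping the layout in one table makes it harder to skip or duplicate an OSD id when a disk is added or moved between hosts.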
[1:10] <Guest1164> And my stop script
[1:10] <Guest1164> #!/bin/bash
[1:10] <Guest1164> umount /media/ceph
[1:10] <Guest1164> sleep 3
[1:10] <Guest1164> ssh root@ss4 /root/ogg/ceph/src/stop.sh all
[1:10] <Guest1164> I inherited these
[1:10] <Guest1164> Yes, the crash was from v0.34 I think
[1:10] <Guest1164> I have since updated and restarted ceph, but the disks are still down
[1:10] <sjust> have you restarted those osds?
[1:11] <Guest1164> I think it is very possible they may have failed
[1:11] <gregaf> ah, you don't have anything shutting down your old MDS daemons…so every time you run that you'll just be starting new ones that go to standby… :)
[1:11] <lxo> sjust, interesting... recovery has completed, and we still have only one pg doing backfill. however, the degraded counts seem to have been corrected
[1:11] <Guest1164> No, I haven't tried that yet as I've just learnt how to find out which osds are down
[1:12] <lxo> I'm thinking of restarting the other preloaded osd while keeping logging enabled on the other two, before sending them to you
[1:12] <sjust> lxo: could you pastebin your pg dump again?
[1:12] <sjust> :)
[1:12] <sjust> Guest1164: could you pastebin your ceph.conf?
[1:12] <Guest1164> Does the ceph src stop.sh all not stop mds?
[1:12] <Guest1164> Yep, just a mo
[1:13] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[1:14] <Guest1164> Here you are
[1:14] <Guest1164> http://pastebin.com/qGi7Mrr9
[1:15] <Guest1164> Just grabbing a coffee, back in 2
[1:16] <sjust> Guest1164: it seems to think that osd.3 is still running
[1:17] <Guest1164> Hmmm... that is the ceph.conf from the client server, should I look at the OSD server?
[1:17] <Guest1164> I haven't changed the ceph.conf since before the OSD crash
[1:18] <lxo> sjust, pastebinned into your inbox ;-)
[1:18] <Guest1164> I'm guessing I should take the down ones out, maybe fsck them? And then add them back in?
[1:18] <sjust> lxo: works for me
[1:19] <sjust> Guest1164: one sec, I mean that the ceph-osd process is bailing because it thinks it is already running
[1:20] <sjust> lxo: if I could get those two logs gzipped, that would help
[1:21] <Guest1164> OK, after the initial OSD failure (some months ago) all I have done is stop ceph, update the version and start ceph again. I haven't touched the down OSDs because I wasn't sure if they had really failed or it was a bug from ceph v0.34
[1:22] <Guest1164> Although it seems I haven't stopped the ceph MDS! :o
[1:22] <Guest1164> Maybe I could get some advice about what to do from here please?
[1:23] <Guest1164> I'd like to get the two OSDs back into ceph (unless the disks really are dead) and then run the iozone test again
[1:25] <lxo> FTR, http://pastebin.com/9rZaUgsh now has pgdump2; the earlier pgdump is now http://pastebin.com/8Gk8fdg1
[1:25] <lxo> sjust, I'll get the logs and send them to you momentarily. any objections to xz -9 rather than gzip?
[1:25] <Guest1164> I'd also like to make sure that the mds and mon daemons stop when I run my stop.sh script
[1:26] <lxo> FWIW, restarting osd0 and osd1 didn't make any difference, it's just back as it was before restarting them
[1:26] <gregaf> Guest1164: ah, the stop.sh script in the source dir is really only for developers — it only kills local processes!
[1:27] <Guest1164> OK, but if I run this on all the servers in the ceph "cluster" then I should stop ceph?
[1:27] <gregaf> yeah
[1:27] <Guest1164> At the moment it is only running on one server...
[1:28] <Guest1164> Is there a better way of doing this?
[1:28] <gregaf> I thought you had two servers
[1:28] <Guest1164> I mean I'm only running the ceph src stop.sh on one server
[1:28] <gregaf> yeah, so you're only killing those daemons :)
[1:28] <sjust> lxo: sounds fine
[1:29] <Guest1164> There are actually 5 servers, 2 for OSDs, 2 for MDS/MONs and 1 for the client
[1:29] * yoshi (~yoshi@p11133-ipngn3402marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:29] <Guest1164> No "one command to stop them all"?
[1:29] <gregaf> I think you will do better with "init-ceph -a stop"
[1:29] <gregaf> that does things like ssh to other machines and shut down their daemons
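
Since the source-tree stop.sh only kills local processes, the manual alternative discussed above is to run it on every server. A minimal sketch of that loop in Python is shown here. It assumes the script sits at the same path on each host as in the pasted stop script, which the log does not confirm, and it only builds the argument lists rather than executing anything. `init-ceph -a stop` remains the route gregaf suggests.

```python
# Path from the stop script pasted earlier in this log.
STOP_SCRIPT = "/root/ogg/ceph/src/stop.sh"

# OSD hosts (ss1, ss2) and MDS/MON hosts (ss3, ss4) named in the start script.
HOSTS = ["ss1", "ss2", "ss3", "ss4"]


def stop_commands(hosts=HOSTS, script=STOP_SCRIPT):
    """Build one ssh argument list per host; nothing is executed here."""
    return [["ssh", f"root@{host}", script, "all"] for host in hosts]


for argv in stop_commands():
    print(" ".join(argv))
```

Each list could be handed to `subprocess.run` once the path has been checked on every host.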
[1:30] <Guest1164> OK, thanks that is great advice
[1:30] <gregaf> and I think there's some better stuff coming down the pipeline but I don't keep up with it
[1:36] <lxo> woah, 180MB (uncompressed) of logs for the tiny osd3, 3G+ for osd2
[1:37] <lxo> now I remember why I had to disable logging ;-)
[1:39] <sjust> lxo: yeah, improving that is on the todo list
[1:43] <lxo> sjust, want me to burn a CD and mail it to you? :-) set up a torrent? something else?
[1:44] <sjust> lxo: how big are they compressed?
[1:44] <lxo> still compressing the larger one; the smaller one shrunk to 17MB (10%) with gzip, xz is still going
[1:45] <joshd> Guest1164: it looks like your osd.3 ran out of file descriptors (possibly due to a connection handling bug, or a low ulimit) - restarting it should be fine
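
The socket descriptor numbers in the osd.3 log lines pasted earlier (sd=965 through sd=970) are close to 1024, which is often the default soft limit on open files on Linux. The Python sketch below shows one way to check the limit and the current descriptor count for a process. It assumes a Unix system for the `resource` module and a Linux-style /proc for the count; the helper names are hypothetical.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open descriptors via /proc (Linux only); None if unavailable."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


soft, hard = fd_limits()
print("soft limit:", soft, "hard limit:", hard)
print("descriptors open now:", open_fd_count())
```

The values reported are those of the process running the script, so checking a daemon means passing its pid to `open_fd_count` from an account allowed to read that /proc entry.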
[1:46] <Guest1164> Ok, thanks, just look at the ceph wiki under recover OSDs? Sorry, I really am a new-b
[1:46] <lxo> so the other will be 350MB or so with gzip, still not sure with xz
[1:47] <sjust> lxo: actually, make a bug and try to attach them to the bug
[1:47] <sjust> at least the smaller one should fit
[1:47] <lxo> 300M used to be too much for the bug tracking system
[1:47] <joshd> Guest1164: just run 'ssh root@ss1 cosd -i 3 -c /etc/ceph/ceph.conf'
[1:47] <lxo> ok
[1:48] <Guest1164> ok, thanks that is great
[1:48] <Guest1164> You folks have been brilliant, gotta go to a meeting now...
[1:48] <joshd> Guest1164: the data stored there should be fine, so there's no need to wipe it
[1:49] * Guest1164 (~Mike@awlaptop1.esc.auckland.ac.nz) Quit (Quit: Leaving)
[1:50] <sjust> lxo: hang on, I'll set you up an account on a ps
[1:52] <gregaf> sjust: there's a cephdrop bucket on ceph.newdream.net, ask Sage about it
[1:57] <sjust> lxo: if you email a public key, you can log into ceph.newdream.net as cephdrop
[2:02] <lxo> sjust, emailed
[2:03] <sjust> lxo: cephdrop@ceph.newdream.net should now work
[2:04] <lxo> wow, xz brings it down to 3.4%
[2:04] <lxo> (for the smaller file)
[2:04] <sjust> lxo: yeah, it should be exceedingly compressible
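
The figures quoted above (17 MB from 180 MB with gzip is roughly 9.4%, and 3.4% with xz) fit the repetitive nature of debug logs. A small Python sketch comparing the two formats on synthetic data follows. It repeats a single messenger line from the osd.3 paste, which is far more repetitive than a real log, so the ratios it prints will be much lower than lxo's. It also uses the default xz preset rather than -9 to keep memory use modest.

```python
import gzip
import lzma

# One messenger line from the osd.3 log pasted earlier, repeated to imitate
# a large and highly repetitive debug log (an unrealistic best case).
LINE = ("2011-11-15 07:06:30.783374 7fee7ceb8710 -- >> pipe(0x1aae6280 "
        "sd=965 pgs=0 cs=0 l=1).accept replacing existing (lossy) channel "
        "(new one lossy=1)\n")
raw = (LINE * 20000).encode("ascii")


def percent_of_original(compressed, original=raw):
    """Compressed size as a percentage of the original size."""
    return 100.0 * len(compressed) / len(original)


gz = gzip.compress(raw, compresslevel=9)
xz = lzma.compress(raw)  # default preset, .xz container

print("original bytes:", len(raw))
print("gzip: %.3f%% of original" % percent_of_original(gz))
print("xz:   %.3f%% of original" % percent_of_original(xz))
```

Real logs carry varying timestamps, addresses and counters, so measuring on a sample of the actual file gives a better estimate than this synthetic input.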
[2:04] <lxo> want me to wait for the larger one to complete the xz, or to upload the gzip
[2:04] <sjust> lxo: gzip should be fine
[2:05] <lxo> any naming conventions?
[2:05] <sjust> nope, maybe lxo_ in front of each one?
[2:08] <lxo> lxo-to-sjust-osd.3.log.xz is already there; the other will take a while
[2:08] <sjust> ok
[2:09] <lxo> one hour or so
[2:09] <sjust> ok
[2:19] <sjust> lxo: I'll take a look in the morning
[2:21] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Remote host closed the connection)
[2:21] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[2:37] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[2:58] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[3:05] <lxo> sjust, there seems to be something special about this one pg that gets backfill: even after restarting multiple osds, it's still the only one that gets this treatment
[3:08] <lxo> it appears to be the only one that's missing an object in one of the preloaded osds
[3:08] <lxo> that happens to be its primary
[3:09] <lxo> presumably the other osd recovered faster, got that pg active, the mds created or updated an object there, and then the other osd came back up
[3:11] <lxo> lemme try to bring that osd down and update the filesystem to see whether it picks up more pgs for backfill
[3:12] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[3:22] * adjohn is now known as Guest1172
[3:22] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[3:23] * Guest1172 (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Read error: Operation timed out)
[3:23] * jantje_ (~jan@paranoid.nl) has joined #ceph
[3:29] * jantje (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[3:38] <lxo> nope, that didn't help
[3:47] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[3:52] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Quit: adjohn)
[3:56] * chutzpah (~chutz@ Quit (Quit: Leaving)
[3:57] * jantje (~jan@paranoid.nl) has joined #ceph
[3:59] * lollercaust (~paper@212.Red-83-55-54.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[3:59] <lxo> sjust, upload to cephdrop completed
[4:03] * jantje_ (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[4:20] <elder> sage--or anybody else--know what's going on with gitbuilder?
[4:21] <elder> It's been stuck for a number of hours, "wip-crush" pending state.
[6:58] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[7:01] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:15] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[10:17] * joao (~joao@89-181-157-105.net.novis.pt) has joined #ceph
[10:21] * yoshi (~yoshi@p11133-ipngn3402marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[10:42] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[11:07] * gregorg (~Greg@ has joined #ceph
[11:43] * morse_ (~morse@supercomputing.univpm.it) has joined #ceph
[11:46] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[11:52] * morse_ (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[12:27] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[12:27] * peritus (~andreas@h-150-131.a163.priv.bahnhof.se) Quit (Ping timeout: 480 seconds)
[12:32] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:49] * Anticimex (anticimex@netforce.csbnet.se) Quit (Remote host closed the connection)
[12:57] * Anticimex (anticimex@netforce.csbnet.se) has joined #ceph
[13:54] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[14:38] * Habitual (~JJones@dynamic-acs-24-101-150-38.zoominternet.net) has joined #ceph
[14:43] * Habitual waves
[15:17] * joao waves back
[15:19] <Habitual> :)
[15:43] <Habitual> Never thought I'd be perusing someone's thesis today, but yet here I am.
[15:45] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[16:27] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[17:46] * ameen (~ameen@unstoppable.gigeservers.net) Quit (Ping timeout: 480 seconds)
[17:49] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:55] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[17:55] <- *jmlowe* quick question, is there a problem with rbd 0.41 connecting to a 0.40 cluster?
[17:57] <jmlowe> quick question, is there a problem with rbd 0.41 connecting to a 0.40 cluster?
[18:08] <jmlowe> well, it looks like it is the case, 0.41 rbd won't talk to a 0.40 cluster, don't suppose there is any way to sweet talk somebody into putting the 0.40 packages back into the debian repo?
[18:09] <jmlowe> I'm guessing librados 0.40 won't talk to a 0.41 cluster?
[18:14] * Tv|work (~Tv|work@aon.hq.newdream.net) has joined #ceph
[18:20] <gregaf> jmlowe: huh, I would have thought they would be fine; the client stuff didn't change that I'm aware of...
[18:21] <jmlowe> rbd ls worked before apt-get upgrade, times out following upgrade
[18:21] <jmlowe> 2012-02-01 12:21:12.622622 7f27d9a28780 monclient(hunting): authenticate timed out after 30
[18:21] <jmlowe> 2012-02-01 12:21:12.622971 7f27d9a28780 librados: client.admin authentication error (110) Connection timed out
[18:21] <jmlowe> error: couldn't connect to the cluster!
[18:27] * lx0 is now known as lxo
[18:32] <gregaf> hmm, did you make sure the keys and config are right? they might have gotten overwritten or something
[18:57] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[19:05] * bchrisman (~Adium@ has joined #ceph
[19:11] <jmlowe> not using xauth, works with no keys, diff between a working and not working client's config says they are the same
[19:12] * chutzpah (~chutz@ has joined #ceph
[19:12] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:26] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[19:29] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[19:37] <gregaf> jmlowe: sorry, stepped away — there might have been a message encoding change that caused that then, although I didn't think the clients had any
[19:37] <gregaf> you could check the monitor log and see what it says for the incoming client connection, but otherwise it looks like you're right :(
[19:38] * ameen (~ameen@unstoppable.gigeservers.net) has joined #ceph
[19:41] <jmlowe> logs are empty
[19:42] <jmlowe> I've got a number of vm's running with librados, I'm thinking my best option is to suspend them all, do the upgrade, then migrate them onto a host with an upgraded ceph and resume
[19:43] <Tv|work> alright sam sage alex yehuda josh greg dan have all been handed a static 10 machine allocation
[19:43] <Tv|work> let me know if that's not enough to keep you happy for a while
[19:44] <sage> awesome thanks!
[19:44] <Tv|work> i'm going back to digging around the reimaging world
[19:44] <gregaf> jmlowe: I guess, yeah…sorry :(
[19:45] <sage> would it be worth using the old locker on the new machines so that we can run the nightlies there?
[19:45] <Tv|work> sage: no network connectivity :(
[19:45] <sage> or will that make it harder to transition to the new locking stuff?
[19:45] <gregaf> we're doing upgrades this sprint or next that ought to prevent encoding from causing trouble in the future! :)
[19:45] <sage> oh yeah
[19:45] <sage> blarg
[19:45] <Tv|work> sage: i'd install another db on the side or something, but it won't work
[19:45] <Tv|work> sage: that's why i want to bring up the new hypervisors soon, to move that stuff over
[19:46] <sage> sounds good
[19:46] <Tv|work> i mean, even if the networking came up, that would make sense for the separation of the pools
[19:46] <gregaf> sage: you see that fsdevel thread about end-to-end data verification?
[19:46] <jmlowe> gregaf: I saw that in the roadmap, looking forward to it
[19:46] <sage> gregaf: yeah
[19:46] <gregaf> I'm keeping an eye on it but you know the kernel just a little better than I do ;)
[19:48] <sage> yeah it all looks reasonable. we would need to attach info to the page to make it work. i think martin's point about how to handle concurrent pi and non-pi access is the tricky bit
[19:53] <sjust> sage: done reviewing the backfill stuff, looks fine
[19:53] <sage> sjust: thanks
[19:56] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Remote host closed the connection)
[19:56] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[19:57] <gregaf> sage: well presumably once we implement end-to-end protection you either turn it on or you don't, but once it's on there's no non-pi access, just clients who discard the information
[19:59] <sage> gregaf: no, it's a new application interface: instead of providing data to write, you provide data + integrity info, and that goes up and down the stack. if the page cache has cached data + integrity data, and an app writes data without providing integrity, what happens?
[19:59] <sage> i think that's what martin is getting at
[20:00] <gregaf> oh, I see — I was assuming that our clients would generate that information
[20:00] <sage> we could do that too.. even if the app doesn't explicitly check/provide it we could generate and verify it from client -> osd -> btrfs
[20:01] <sage> that would still be useful. presumably the fs could generate the integrity info if not provided (since other layers are capable of verifying it that would follow).
[20:01] <sage> not sure if it's a crc32c or pluggable or what
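
The idea of generating integrity information at the client and verifying it at each layer (client -> osd -> btrfs) can be illustrated with a toy example. The Python sketch below is not how Ceph or the kernel implements it. It uses `zlib.crc32`, which is the ordinary CRC-32 and not the crc32c mentioned above, and the layer names and function names are placeholders.

```python
import zlib

# Layer names follow the path mentioned in the conversation.
LAYERS = ["client", "osd", "btrfs"]


def protect(data):
    """Pair the data with a CRC-32 checksum (standing in for crc32c)."""
    return data, zlib.crc32(data) & 0xFFFFFFFF


def verify(data, checksum):
    """Return True if the data still matches its checksum."""
    return (zlib.crc32(data) & 0xFFFFFFFF) == checksum


def pass_through_layers(data, checksum, layers=LAYERS):
    """Re-verify at every layer; raise at the first one that sees damage."""
    for name in layers:
        if not verify(data, checksum):
            raise ValueError(f"integrity check failed at {name}")
    return data


payload, crc = protect(b"object contents")
pass_through_layers(payload, crc)            # passes at every layer
print(verify(b"object c0ntents", crc))       # False: corruption is detected
```

When the application does not supply a checksum, the first layer can call `protect` itself, which corresponds to the fallback sage describes where the filesystem generates the integrity info.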
[20:01] <gregaf> I can see a use for the application providing it too, just hadn't thought about it!
[20:02] <sage> i think the first step would be to make the page cache deal with it intelligently so that ceph-osd can provide/verify. then extend it over the wire.. then do the same on the ceph client.
[20:03] <sage> unless we went o_direct on ceph-osd. i don't think thats a good idea tho :)
[20:04] <sage> anyway, you're all set on the osd tracking? let me know when it's ready to look at again
[20:09] * Habitual (~JJones@dynamic-acs-24-101-150-38.zoominternet.net) Quit (Quit: Don't bring a knife to a gunfight.)
[20:13] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) Quit (Ping timeout: 480 seconds)
[20:20] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) has joined #ceph
[20:23] <sjust> lxo: you there?
[20:24] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[20:25] <jmlowe> anybody know off the top of their heads the virsh syntax for attach-disk to use with a rbd device?
[20:42] <lxo> sjust, I am now
[20:49] * adjohn is now known as Guest1245
[20:49] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[20:50] * Guest1245 (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Read error: Connection reset by peer)
[21:04] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[21:26] <sjust> lxo: when there were just two osds up, what did ceph -s say?
[21:26] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Quit: adjohn)
[21:28] * vodka (~paper@212.Red-83-55-54.dynamicIP.rima-tde.net) has joined #ceph
[21:29] * tjikkun_ (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[21:34] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Ping timeout: 480 seconds)
[21:35] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[21:37] <lxo> sjust, before I first added the other osds, still in 0.40, all pgs were active+clean. after bringing up other osds (even after kicking them down and out again) they were active+clean+degraded
[21:38] * izdubar (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[21:38] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[21:39] * amichel (~amichel@ has joined #ceph
[21:41] <lxo> after upgrading to 0.41, I had to rollback one of the osds to an earlier state, and because of this some other osds wouldn't recover. so I reinitialized the other osds, rolled back both preloaded osds to a snapshot taken before the upgrade, told the mons that the other osds were lost, and then, with both preloaded osds running by themselves, all pgs got to active+clean
[21:42] * vodka (~paper@212.Red-83-55-54.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[21:43] <lxo> finally, I brought the other osds up and in, and it started replicating data to them, keeping most pgs in active state, rather than active+*+backfill
[21:43] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph
[21:43] <lxo> only 3 pgs got the special backfill treatment AFAICT
[21:44] <lxo> one thing that might have played a role in this mess is that, when the osds were rolled back to an earlier state in which only two osds had ever been seen, the mon's state was not rolled back
[21:45] <lxo> another is that those snapshots were taken still with 0.40, so if backfill relies on any information created by 0.41+ osds to work, it wouldn't be available
[21:48] <amichel> lxo: what distro are you running on? Just curious, not knowing any of your history in channel and whatnot :D
[21:55] <lxo> amichel, I run ceph on a cluster running BLAG
[21:57] <amichel> Man, a distro I've never heard of. Well played. :D
[21:58] <amichel> Philosophical decision to run BLAG or does it offer some specific niceties for running ceph?
[21:58] * BManojlovic (~steki@ has joined #ceph
[22:00] <lxo> amichel, philosophical and practical. I work on GCC at Red Hat, so it helps my work to run a distro that's close to Fedora and Red Hat Enterprise [GNU/]Linux, but I'd rather not run a distro I wouldn't be comfortable recommending to others, and I'd rather not recommend distros that contain non-Free Software
[22:01] <amichel> That makes sense.
[22:01] <amichel> We're an all RHEL shop in general around here, but the free v non-free debate isn't really a topic for us, we need the vendor support
[22:01] <Anticimex> lxo: ha! busted ;)
[22:02] <Anticimex> lxo: i have a question for some real gcc-clue :)
[22:02] <lxo> amichel, you could help requesting Red Hat to offer a 100% Free distro. internal debates often end up in “there's no demand from customers” :-)
[22:03] <lxo> Anticimex, :-)
[22:04] <amichel> Well, it's not really the support FOR Linux I meant, it's more the "my software product is supported on these specific distros"
[22:04] <lxo> amichel, because, really, the absence of lock-in (another name for freedom) is one of the strongest reasons for businesses to adopt Free Software, and any piece of non-Free Software is a symptom of lock-in
[22:04] <amichel> I don't think I've ever filed a support ticket with Redhat that wasn't something to do with RHN that I can't control
[22:05] <sjust> lxo: were the osd stores for the non-preloaded osds nuked before they were reintroduced?
[22:05] <lxo> amichel, I see, you develop products for your customers, and it runs on Red Hat's enterprise distro
[22:06] <amichel> No :D I'm not being very clear. I work at a Big University and we deploy all sorts of other vendor's software on Linux
[22:06] <lxo> sjust, yup. ceph-osd --mkfs --mkjournal on all of them, then ceph osd lost for the two preloaded osds to recover fully
[22:06] <amichel> Oracle, BMC, etc etc
[22:06] <lxo> amichel, aah, I think I got it now ;-)
[22:07] <Anticimex> lxo: i'm studying intel optimization reference manual and software manuals, and on p181 in the former, http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf , there's a intel optimization for sandy bridge utilizing the double load ports it has
[22:08] <Anticimex> but i can't get gcc to do this code, not even close
[22:08] <Anticimex> not the first example i've seen of gcc not using sse/avx regs when you could think it should
[22:08] <amichel> But I'm building a ceph cluster to create a low-cost high-capacity low-tier of storage for our customers and so I'm trying to suss out what the right software combo is to make it manageable and maintainable
[22:08] <Anticimex> known issue? :)
[22:08] <lxo> Anticimex, I haven't kept track of x86 latest developments; I've been working mostly on debug info generation for optimized programs for the past several years
[22:08] <Anticimex> "gcc version 4.6.2 (Debian 4.6.2-12)"
[22:08] <Anticimex> ok
[22:09] <Anticimex> lxo: thanks anyway! lucky you ;)
[22:09] <lxo> Anticimex, I vaguely recall seeing something about Sandy Bridge stuff making 4.7 or so, but I may be way off
[22:09] <Anticimex> i've seen some of it coming in
[22:09] <Anticimex> already. but that was general AVX instruction set support
[22:09] <Anticimex> i see now that means not necessarily utilizing it for code generation
[22:09] <Anticimex> lxo: i guess i'll have to turn to the gcc ml
[22:10] <lxo> gcc-help is generally helpful to users
[22:10] <Anticimex> great
[22:10] <Anticimex> thanks :)
[22:10] <lxo> is that x86 32 or 64?
[22:10] <Anticimex> 64
[22:11] <lxo> you may have to use an -mcpu or -march or somesuch flag to tell GCC that these instructions are to be used. also, possibly enable vector optimizations, that aren't enabled by default
[22:11] <Anticimex> oh, i've tried that
[22:11] <Anticimex> i tried a few loop things, i think i tried vector too
[22:12] <Anticimex> thanks for your help, i'm satisfied. :) need to get dirty and test more (and ask gcc-help if necessary)
[22:12] <lxo> oh well... sorry for my -ENOCLUE ;-) good luck on the list
[22:12] <Anticimex> i haven't tested enough to warrant more of your time :-)
[22:12] <lxo> IIRC there's a GCC channel for users on freenode (the one on OFTC is for development *of*, not *with* GCC)
[22:14] <lxo> biaw
[22:23] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[22:25] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[22:27] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) has joined #ceph
[22:42] <Anticimex> lxo: see, I shouldn't have asked you before doing more testing
[22:42] <Anticimex> stupid me missing return of sum variable in that example, allowing gcc to just optimize the whole code away completely due to it not doing anything.
[22:42] <Anticimex> i see xmm-regs and vpadd
[22:43] <Anticimex> (before i got nothing :-)) -- now to check if it's actually doing sandy bridge load port alternation
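
(Editor's note: a minimal sketch of the pitfall Anticimex describes above. It is not the code from the Intel manual or from his test. The function name `sum_array` and the use of `uint32_t` elements are assumptions made for illustration. The point is only that the accumulated value has to escape the function, here through `return`, or the compiler is free to delete the whole loop as dead code.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction loop: adds up n unsigned 32-bit integers.
 * Unsigned arithmetic is used so that wraparound is well defined. */
uint32_t sum_array(const uint32_t *data, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += data[i];

    /* Returning the result makes the loop observable. If this value were
     * dropped, the loop would have no visible effect and an optimizing
     * compiler could remove it entirely. */
    return sum;
}
```

(To look at the generated code, one option is to compile with `gcc -std=c99 -S` plus the optimization flags mentioned later in this log, `-O3 -funroll-loops -march=corei7-avx`, and read the resulting `.s` file. Whether vector instructions appear depends on the compiler version and target.)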
[22:48] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[22:49] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[22:50] * BManojlovic (~steki@ Quit (Remote host closed the connection)
[22:54] * BManojlovic (~steki@ has joined #ceph
[23:08] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:23] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Ping timeout: 480 seconds)
[23:32] * adjohn (~adjohn@ma30536d0.tmodns.net) has joined #ceph
[23:33] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) has joined #ceph
[23:34] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) has left #ceph
[23:34] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) has joined #ceph
[23:35] <elder> Anybody know whether teuthology test "blogbench" is supposed to pass?
[23:37] <gregaf> elder: did last night:
[23:37] <gregaf> 10090: collection:basic btrfs:with-btrfs.yaml clusters:fixed-3.yaml tasks:cfuse_workunit_suites_blogbench.yaml
[23:37] <gregaf> 10102: collection:basic btrfs:with-btrfs.yaml clusters:fixed-3.yaml tasks:kclient_workunit_suites_blogbench.yaml
[23:38] <joao> ah
[23:38] <joao> I had to google to discover what a teuthology was
[23:38] <joao> a department that relates to cephalopods
[23:38] <joao> clever :p
[23:42] <Anticimex> lxo: and look at that: i'd forgotten unroll-loops was *not* part of -O3
[23:42] <Anticimex> with -O3 + -funroll-loops, -march=corei7-avx, all is fine, i guess. not identical code, but close enough. :)
[23:45] <Anticimex> ha, or actually, the double load-port thing *is* missing.
[23:49] <elder> Wait! It wasn't blogbench. Might have been suites/bonnie.sh (and I'll keep looking to see if it was something else.)
[23:50] <elder> Yes, looks like bonnie failed for me. Any word on that?
[23:50] <elder> gregaf,
[23:53] * andresambrois (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[23:54] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) Quit (Quit: Ex-Chat)
[23:54] <joshd> elder: bonnie on the kernel client isn't in the qa suite
[23:55] <nhm> elder: I think bonnie is in the autotest stuff still. I was just looking at that yesterday.
[23:57] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Ping timeout: 480 seconds)
[23:59] <sage> anyone not like the look of the recent wip-encoding patches, before i go too crazy?

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.