#ceph IRC Log

IRC Log for 2012-02-21

Timestamps are in GMT/BST.

[0:06] * fronlius (~fronlius@e176053020.adsl.alicedsl.de) Quit (Quit: fronlius)
[0:13] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) has joined #ceph
[1:04] * izdubar (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[1:08] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[1:09] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[1:11] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Operation timed out)
[1:12] * yoshi (~yoshi@p8031-ipngn2701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:18] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[1:19] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[1:20] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[1:24] * joao (~joao@89-181-147-200.net.novis.pt) has joined #ceph
[1:26] * joao (~joao@89-181-147-200.net.novis.pt) Quit ()
[1:28] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[1:29] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[1:35] * The_Bishop (~bishop@e179009099.adsl.alicedsl.de) has joined #ceph
[1:41] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[1:43] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[1:44] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[1:49] * BManojlovic (~steki@212.200.243.83) Quit (Remote host closed the connection)
[2:21] * CAPT_Fahd (~Guest6595@9KCAAD8M5.tor-irc.dnsbl.oftc.net) has joined #ceph
[2:21] * CAPT_Fahd (~Guest6595@9KCAAD8M5.tor-irc.dnsbl.oftc.net) has left #ceph
[3:07] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:22] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[3:55] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[4:08] * The_Bishop (~bishop@e179009099.adsl.alicedsl.de) Quit (Remote host closed the connection)
[4:11] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[4:36] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[8:12] * adjohn (~adjohn@50-0-92-115.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[9:04] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph
[9:25] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:38] * yoshi (~yoshi@p8031-ipngn2701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[9:56] * NaioN (~stefan@andor.naion.nl) Quit (Remote host closed the connection)
[10:08] * fronlius (~fronlius@e176053020.adsl.alicedsl.de) has joined #ceph
[10:09] * fronlius (~fronlius@e176053020.adsl.alicedsl.de) Quit ()
[10:44] * NaioN (~stefan@andor.naion.nl) has joined #ceph
[10:56] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[10:56] * Olivier_bzh (~langella@xunil.moulon.inra.fr) has joined #ceph
[11:32] * joao (~joao@89.181.147.200) has joined #ceph
[11:46] <Olivier_bzh> Hi everybody, I've just seen that ceph v0.42 was available
[11:47] <Olivier_bzh> I'm currently using 0.41, and there is a strong warning in the release notes :
[11:47] <Olivier_bzh> not backwards-compatible (for 0.42)
[11:49] <Olivier_bzh> does this mean that all files have to be moved to a safe place, and that I should start up a new clean ceph cluster with 0.42?
[11:49] <Olivier_bzh> and move my data back afterwards?
[11:49] <Olivier_bzh> or is there a way to upgrade with the data in place?
[11:50] <NaioN> If I'm correct you can upgrade with data in place, but you can't go back if 0.42 doesn't suit you
[11:50] <Olivier_bzh> ok, thank you very much !
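
For reference, an in-place upgrade of a small cluster of that era would look roughly like the following. This is only a sketch, assuming Debian packages and the stock sysvinit script; service handling and package names vary per install, and a full backup should come first (as nhm advises later in the log).

    /etc/init.d/ceph -a stop                  # stop mon/mds/osd daemons cluster-wide (-a = all hosts in ceph.conf)
    apt-get update && apt-get install ceph    # pull in the 0.42 packages (repeat on each node)
    /etc/init.d/ceph -a start                 # restart the daemons on the new version
    ceph -s                                   # watch until all PGs report active+clean again
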
[12:24] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[12:40] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) Quit (Quit: Ex-Chat)
[13:29] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) has joined #ceph
[14:01] * mtk (~mtk@ool-44c35967.dyn.optonline.net) has joined #ceph
[14:40] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[14:47] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[15:24] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[15:41] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[16:23] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[16:34] <nhm> Olivier_bzh: just make sure to back up your data if it is important.
[16:58] <Olivier_bzh> Ok, that is wise indeed: I will plan a complete backup before proceeding
[16:59] <Olivier_bzh> I've been warned that the ceph file system is experimental, but I am pleased by the good shape of the project...
[17:00] <Olivier_bzh> It's working fine for me, with very good performance
[17:02] <Olivier_bzh> thanks nhm
[17:25] <nhm> Olivier_bzh: glad to hear that it's going well! If you don't mind me asking, what kind of performance are you seeing and what is your setup?
[17:26] <Olivier_bzh> I have a little cluster:
[17:26] <Olivier_bzh> 3 servers, with 8 OSDs of 3 TB each
[17:27] <Olivier_bzh> it is connected to 2 servers that need good bandwidth to perform scientific computations
[17:28] <Olivier_bzh> each server has a 1 Gb/s ethernet card and mounts ceph with the kernel module
[17:29] <Olivier_bzh> I've made some benchmarks with a simple "time cp" command
[17:29] <Olivier_bzh> and it achieves 500 to 600 Mbit/s for writing to ceph
[17:29] <Olivier_bzh> a little bit less for reading
[17:30] <Olivier_bzh> this does not decrease when the 2 nodes access ceph concurrently
[17:30] <Olivier_bzh> So I am glad ;-)
[17:31] <Olivier_bzh> does this sound good to you too?
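
The measurement Olivier_bzh describes can be reproduced with something like the following sketch; the mount point /mnt/ceph and the 4 GiB file size are assumptions, and the figures quoted above come from his own runs, not from this script.

    dd if=/dev/zero of=/tmp/testfile bs=1M count=4096            # create 4 GiB of test data locally
    time sh -c 'cp /tmp/testfile /mnt/ceph/testfile && sync'     # write test (sync so buffered data is counted)
    echo 3 > /proc/sys/vm/drop_caches                            # drop the page cache so the read is not served locally
    time cp /mnt/ceph/testfile /tmp/readback                     # read test
    # throughput in Mbit/s ~= (file size in bytes * 8) / (elapsed seconds * 10^6)
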
[17:33] <nhm> Olivier_bzh: On March 1st, I start working as a performance engineer on Ceph, so my job is to make sure it's as fast as possible. :)
[17:34] <Olivier_bzh> good to know you ;-)
[17:34] <nhm> Olivier_bzh: hah! We'll see. ;)
[17:35] <nhm> Olivier_bzh: right now I'm trying to get a good feel for what kind of performance users are seeing on different hardware setups.
[17:35] <nhm> Btw, is that with btrfs or xfs?
[17:35] <Olivier_bzh> So, you have a happy user, really
[17:35] <Olivier_bzh> btrfs
[17:36] <nhm> Olivier_bzh: That's great. I'm sure all the developers will be happy to hear it!
[17:36] <Olivier_bzh> and I've set up NIC bonding with two 1 Gb/s ethernet cards on each ceph node
[17:36] <Olivier_bzh> these are DELL R515 (sorry for the commercial)
[17:36] <Olivier_bzh> dual-processor AMD, 16 GB of RAM
[17:37] <nhm> Good deal. I think Dream Host has some R515s they use for testing.
[17:37] * bchrisman1 (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:38] <Olivier_bzh> I'm so happy that I've recently shared my little knowledge of ceph with my colleagues (INRA in France)
[17:38] <Olivier_bzh> in a seminar, and they are interested too
[17:39] <nhm> Excellent. I'm just finishing up at a supercomputing institute right now.
[17:43] <Olivier_bzh> Sorry nhm, I have to leave now, but if you have any questions, no problem, I will answer tomorrow
[17:43] <Olivier_bzh> thanks again, bye all
[17:43] <nhm> Olivier_bzh: ok, nice to meet you!
[17:44] <nhm> Olivier_bzh: good luck with the upgrade!
[17:44] <Olivier_bzh> Thanks (I will try not to make mistakes ;-))
[18:24] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[18:27] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[18:28] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[18:32] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:46] * sagewk (~sage@aon.hq.newdream.net) has joined #ceph
[18:49] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[18:50] * aliguori (~anthony@32.97.110.59) has joined #ceph
[18:52] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:06] * chutzpah (~chutz@216.174.109.254) has joined #ceph
[19:07] * MarkDude (~MT@64.134.223.71) has joined #ceph
[19:28] * BManojlovic (~steki@212.200.243.83) has joined #ceph
[19:29] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[19:55] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Ping timeout: 480 seconds)
[20:03] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:04] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[20:09] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[20:17] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[20:51] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Remote host closed the connection)
[21:41] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[21:48] * MarkDude (~MT@64.134.223.71) Quit (Quit: Leaving)
[21:48] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[22:04] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[22:04] * yehudasa_ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[22:05] * dmick1 (~dmick@aon.hq.newdream.net) has joined #ceph
[22:05] * joshd1 (~joshd@aon.hq.newdream.net) has joined #ceph
[22:06] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[22:09] * sagewk1 (~sage@aon.hq.newdream.net) has joined #ceph
[22:10] * sjust2 (~sam@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[22:10] * gregaf2 (~Adium@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[22:11] * joshd (~joshd@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[22:11] * dmick (~dmick@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[22:11] * yehudasa (~yehudasa@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[22:12] * sagewk (~sage@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[22:35] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[22:52] <SpamapS> Anybody know if Ted Ts'o's large xattr support is necessary to use ext4 for Ceph, or is it an optimization?
[22:52] <nhm> SpamapS: I don't know for sure, but I think xattr size limitations are a problem for Ceph.
[22:53] <SpamapS> This seems to suggest that it could cause crashes: http://ceph.newdream.net/docs/latest/dev/filestore-filesystem-compat/
[22:54] <dmick1> work is underway to lessen the load on xattrs
[22:54] * dmick1 is now known as dmick
[22:54] <SpamapS> dmick: specifically trying to decide what to do about the CEPH we ship in Ubuntu 12.04
[22:54] <SpamapS> I think we'll just recommend XFS
[23:00] * aliguori_ (~anthony@32.97.110.65) has joined #ceph
[23:00] <nhm> SpamapS: If you do any testing on XFS let me know. I'd like to know if people are seeing a lot of extent fragmentation.
[23:01] * aliguori (~anthony@32.97.110.59) Quit (Ping timeout: 480 seconds)
[23:09] * fronlius (~fronlius@e182095028.adsl.alicedsl.de) has joined #ceph
[23:14] * aliguori_ (~anthony@32.97.110.65) Quit (Ping timeout: 480 seconds)
[23:14] <gregaf> SpamapS: dmick1: late reply, but yes, xattrs are a problem for Ceph on ext4, no, the large xattr patches don't actually fix it :( (unless there's a separate set I haven't seen?)
[23:15] <gregaf> we may be able to use our integrated levelDB in the future instead, but we haven't gotten to testing that out yet and it's probably not going to be ready for 12.04
[23:15] <dmick> gregaf knows better than I, certainly
[23:17] <gregaf> if Ted has a set separate from the Lustre ones that might work, but I know at least one set of the large xattr patches is still size-limited and doesn't overflow into the extra space; it just uses the extra space for actual large xattrs
[23:18] <dmick> yehudasa_ was experimenting recently, I know; don't know if he has more-recent information
[23:18] <gregaf> yeah, I'm just reporting what I remember from him :)
[23:18] <dmick> (I was trying to get his alert to go off :) )
[23:19] <yehudasa_> dmick: got that, reading log
[23:19] <pulsar> are there any known problems with ceph 0.41 / debian squeeze / xfs? I am waiting for a 39 node system to come up as "active" after initializing the FS from scratch for... about 4 hours now
[23:19] <yehudasa_> the biggest problem with the lustre ext4 patch is that it was created assuming there's a small number of xattrs
[23:20] <gregaf> pulsar: that's a lot longer than it should have taken, what does ceph -s report?
[23:20] <pulsar> "mds e3: 1/1/1 up {0=1=up:creating}" ... still waiting
[23:20] <gregaf> what's the rest say?
[23:20] <yehudasa_> so it fixes the case of writing one big xattr, but it fails with writing a lot of small xattrs
[23:20] <pulsar> root@node-1 ~ # ceph -s
[23:20] <pulsar> 2012-02-21 23:20:39.449178 pg v1315: 7920 pgs: 6730 active+clean, 1190 peering; 7837 bytes data, 39743 GB used, 63602 GB / 100 TB avail
[23:20] <pulsar> 2012-02-21 23:20:39.467933 mds e3: 1/1/1 up {0=1=up:creating}
[23:20] <pulsar> 2012-02-21 23:20:39.467983 osd e62: 38 osds: 37 up, 37 in
[23:20] <pulsar> 2012-02-21 23:20:39.468296 log 2012-02-21 21:39:50.566157 osd.2 10.0.3.3:6800/31813 249 : [INF] 2.1p2 scrub ok
[23:20] <pulsar> 2012-02-21 23:20:39.468381 mon e1: 1 mons at {1=10.0.3.1:6789/0}
[23:21] <gregaf> okay, so it's got 1190 PGs which aren't currently active and are still peering
[23:21] <gregaf> not being active, they're not taking any reads or writes
[23:21] <pulsar> PG stands for?
[23:21] <gregaf> and that's probably what is blocking the MDS from activating
[23:21] <gregaf> PG is "placement group"
[23:21] <pulsar> ah, ic.
[23:22] <pulsar> one node died on me during the first boot
[23:22] <gregaf> objects are hashed into placement groups for purposes of metadata aggregation, tracking, etc
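
As an aside, the object-to-PG-to-OSD mapping gregaf describes can be inspected from the command line. A minimal illustration, assuming a pool named "data" (a default pool of that era) and an arbitrary object name; the exact output format varies between versions:

    ceph osd map data some-object    # prints the PG that object hashes to and the OSDs that PG maps to
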
[23:22] <pulsar> could be related?
[23:22] <gregaf> I wouldn't expect it to, but it's possible
[23:22] <pulsar> 2012-02-21 21:26:46.789657 7fee5f26a700 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fee5825c700' had suicide timed out after 180
[23:22] <pulsar> common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fee5f26a700 time 2012-02-21 21:26:46.789686
[23:22] <pulsar> common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
[23:22] <pulsar> that kind of borkness
[23:22] <yehudasa_> SpamapS: sagewk1: I think recommending xfs for now is a good choice, sage?
[23:23] <sagewk1> spamaps, yehudasa_: yeah
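
The ext4 limitation yehudasa_ describes above (many small xattrs rather than one big one) can be probed directly with setfattr. A rough sketch, assuming test filesystems mounted at hypothetical paths /mnt/ext4 and /mnt/xfs:

    for fs in /mnt/ext4 /mnt/xfs; do
        f=$fs/xattr-probe; touch "$f"; i=0
        # keep adding small user xattrs until the filesystem refuses
        while setfattr -n "user.test.$i" -v "0123456789abcdef" "$f" 2>/dev/null; do
            i=$((i+1)); [ $i -ge 1000 ] && break
        done
        echo "$fs: stored $i small xattrs"
    done

On ext4 without the large-xattr work, the xattrs have to fit in the inode plus a single extra block, which is the ceiling Ceph's per-object metadata runs into; XFS keeps xattrs in a separate attribute fork and does not hit that per-inode limit, which is why it is recommended above.
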
[23:23] <pulsar> i could boot up the cluster with half of the nodes with no pain
[23:23] <gregaf> pulsar: so that's indicating that a thread tried to write to the local filesystem and the fs never returned control back :/
[23:23] <pulsar> gregaf: like local I/O blocking issue or something?
[23:23] <pulsar> let me check the metrics....
[23:24] <gregaf> yeah
[23:24] <pulsar> harddisk were ok last time i checked
[23:24] <gregaf> if a lot of your nodes are running close to that slow it could just be that progress is taking a really long time and nothing else is wrong...
[23:24] <gregaf> though it's unlikely that that's the case
[23:24] <pulsar> disk I/Os look good
[23:25] <pulsar> and the machine was in the cluster before, back then i used 20 nodes instead of 40
[23:25] <pulsar> so.. nope. disks are fine
[23:25] * aliguori_ (~anthony@32.97.110.64) has joined #ceph
[23:26] <gregaf> well, the number of OSD maps looks reasonable so I'm not sure what's likely to be the trouble…sagewk, joshd, any ideas why peering would take hours on first startup?
[23:26] <gregaf> this would be pretty easy to diagnose on version 0.42, we added a bunch of extra visibility stuff
[23:26] <pulsar> the number of peering nodes does not get any lower
[23:26] <gregaf> you mean peering PGs?
[23:26] <gregaf> hrm
[23:26] <pulsar> hmm... i could push an update to the cluster if there are any debian packages available
[23:27] <pulsar> yep
[23:27] <pulsar> sticks to 1190
[23:27] <SpamapS> yehudasa_: thanks
[23:28] <pulsar> 2012-02-21 23:25:17.249369 7f0591169700 mds.0.1 ms_handle_reset on 10.0.3.36:6800/20713
[23:28] <pulsar> 2012-02-21 23:25:17.250591 7f0591169700 mds.0.1 ms_handle_connect on 10.0.3.36:6800/20713
[23:28] <pulsar> i get that a lot
[23:28] <gregaf> I'm not aware of any incorrect placement issues or bugs that would cause that to happen, but something bizarre is obviously going on
[23:29] <pulsar> running ceph as root, ulimits are not likely to be an issue?
[23:29] <gregaf> yeah, that's just due to the MDS waiting on IO requests and the OSD not responding to them
[23:29] <pulsar> i am used to "bizarre"
[23:29] <pulsar> ....
[23:29] <pulsar> a LOT
[23:29] <gregaf> no, it's not a ulimit issue (I don't know if we've ever seen them and they couldn't manifest like this)
[23:30] <gregaf> let me go poke a few people, see if they have ideas
[23:30] <pulsar> load / activity is non existent on all the nodes
[23:30] <pulsar> oh, that would be great! thanks
[23:35] <gregaf> pulsar: all right, nothing obvious so sjust is going to walk you through crush map placement checks and some other stuff :)
[23:35] <pulsar> glad for any help i can get
[23:35] <sjust> pulsar: one sec
[23:36] <pulsar> i could fire up teamviewer or patch you through into a screen session?
[23:36] <sjust> ceph osd getmap -o /tmp/tmpmap; osdmaptool --test-map-pg <pgid> /tmp/tmpmap
[23:36] <sjust> where pgid is the pgid of one of the pgs stuck in peering
[23:37] * aliguori_ (~anthony@32.97.110.64) Quit (Ping timeout: 480 seconds)
[23:37] <sjust> you can get a list of pg statuses using 'ceph pg dump'
[23:37] <pulsar> how to .. .yea :)
[23:37] <sjust> that first command should give you the mapping for the pg in question
[23:37] <sjust> let me know what the mapping is
[23:38] <sjust> so for pg 1.0, the command would be 'ceph osd getmap -o /tmp/tmpmap; osdmaptool --test-map-pg 1.0 /tmp/tmpmap'
[23:39] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[23:39] * grape (~grape@216.24.166.226) Quit (Quit: leaving)
[23:45] <pulsar> ok, here is the pgdump
[23:45] <pulsar> http://paul.vc/tmp/pg_dump.txt
[23:46] <sjust> pulsar: what happened to osd12?
[23:46] * aliguori_ (~anthony@32.97.110.59) has joined #ceph
[23:46] <pulsar> see above, died due to a timeout when trying to boot up the cluster for the first time
[23:47] <pulsar> 2012-02-21 14:53:12.204872 7effc77a9700 osd.12 24 OSD::ms_handle_reset() s=0xb8ed480
[23:47] <pulsar> 2012-02-21 14:53:13.647943 7effbc64c700 -- 10.0.3.13:6802/6653 >> 10.0.3.2:0/21866 pipe(0x27c8500 sd=53 pgs=9 cs=1 l=0).fault initiating reconnect
[23:47] <pulsar> 2012-02-21 14:53:13.647988 7effbb73d700 -- 10.0.3.13:0/6654 >> 10.0.3.2:6802/21865 pipe(0x288e500 sd=58 pgs=3 cs=1 l=0).fault with nothing to send, going to standby
[23:47] <pulsar> 2012-02-21 14:53:15.598623 7effbfa80700 -- 10.0.3.13:6802/6653 >> 10.0.3.12:0/26183 pipe(0x2521c80 sd=33 pgs=3 cs=1 l=0).fault initiating reconnect
[23:47] <pulsar> 2012-02-21 14:53:15.598688 7effbf77d700 -- 10.0.3.13:0/6654 >> 10.0.3.12:6802/26182 pipe(0x2862500 sd=32 pgs=2 cs=1 l=0).fault with nothing to send, going to standby
[23:47] <pulsar> *** Caught signal (Terminated) ** in thread 7effd3846780. Shutting down.
[23:48] <pulsar> i tried to restart it, but then it died due to timeout issues as stated above
[23:49] <sjust> pulsar: ah, right
[23:49] * fronlius (~fronlius@e182095028.adsl.alicedsl.de) Quit (Quit: fronlius)
[23:50] <sjust> pulsar: ok, looks like osd12 was likely the first to boot and died between moving the pg out of creating and replicating it to any peers
[23:50] <sjust> *moving these pgs
[23:51] <pulsar> seems this happens all the time to me
[23:51] <pulsar> tried to reformat the cluster a couple of times already
[23:51] <sjust> pulsar: well, I think osd12 has a bad disk
[23:51] <pulsar> and as soon as I use more than 20 nodes it fails
[23:51] <sjust> is there anything in osd12's dmesg/
[23:51] <pulsar> just checked. but let me have a look again
[23:51] <sjust> ?
[23:52] <pulsar> nope
[23:52] <pulsar> all clean
[23:52] <pulsar> and the disk is ok
[23:52] <pulsar> i had it filled up before
[23:52] <pulsar> furthermore hdfs is running next to it with 0 issues
[23:52] <sjust> ah
[23:53] <pulsar> no kernel log messages either
[23:53] <pulsar> metrics do also look healthy
[23:53] <sjust> hmm
[23:53] <pulsar> the node was perfectly fine and working when running the cluster with 19 nodes instead of 39
[23:55] <sjust> well, let's try this, we should be able to bring the cluster up sans osd12 by marking those pgs creating once more
[23:55] <pulsar> ok. should I blacklist that node by removing it from ceph.conf?
[23:55] <sjust> pulsar: not yet, one moment
[23:56] <sjust> pulsar: for each <pgid> currently marked peering, we need to call 'ceph pg force_create_pg <pgid>'
[23:56] <pulsar> that node could have other issues though. i see a lot of "osd.12 19 OSD::ms_handle_reset() s=0x40236c0" errors in the logfiles
[23:57] <sjust> pulsar: basically, those pgs all got remapped to new osds, but there aren't any current copies to kick start peering
[23:58] <pulsar> ic, so how would i get a list of all pgids?
[23:58] <sjust> one sec
[23:58] <pulsar> glancing over that dump as we speak
[23:59] <sjust> grep peering pg_dump.txt | sed 's/\t.*//'
[23:59] <sjust> that'll give you a list of all of the pgids
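
Putting sjust's two commands together, the recovery step would look roughly like this loop (a sketch based only on the commands above; double-check the extracted list before running it, since force_create_pg recreates those PGs empty):

    for pgid in $(grep peering pg_dump.txt | sed 's/\t.*//'); do
        ceph pg force_create_pg "$pgid"
    done
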

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.