#ceph IRC Log


IRC Log for 2010-07-26

Timestamps are in GMT/BST.

[0:08] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[2:27] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[2:29] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit ()
[6:46] * eternaleye (~quassel@184-76-53-210.war.clearwire-wmx.net) Quit (Ping timeout: 480 seconds)
[7:32] * MarkN (~nathan@ has left #ceph
[7:32] * MarkN (~nathan@ has joined #ceph
[9:49] * allsystemsarego (~allsystem@ has joined #ceph
[10:08] * Jiaju (~jjzhang@ has joined #ceph
[11:13] * allsystemsarego_ (~allsystem@ has joined #ceph
[11:16] * allsystemsarego (~allsystem@ Quit (Ping timeout: 480 seconds)
[14:55] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) has joined #ceph
[14:55] * ghaskins_mobile (~ghaskins_@ has joined #ceph
[16:44] * sakib (~sakib@gate.usic.ukma.kiev.ua) has joined #ceph
[17:06] * sakib (~sakib@gate.usic.ukma.kiev.ua) Quit (Quit: leaving)
[18:06] * eternaleye (~quassel@184-76-53-210.war.clearwire-wmx.net) has joined #ceph
[18:42] * Osso (osso@AMontsouris-755-1-7-241.w86-212.abo.wanadoo.fr) has joined #ceph
[18:45] * ghaskins_mobile (~ghaskins_@ Quit (Quit: This computer has gone to sleep)
[18:53] * yehudasa_bb (~yehudasa@ has joined #ceph
[19:01] * yehudasa_bb (~yehudasa@ Quit (Ping timeout: 480 seconds)
[19:02] * ghaskins_mobile (~ghaskins_@ has joined #ceph
[19:12] * ghaskins_mobile (~ghaskins_@ Quit (Quit: This computer has gone to sleep)
[19:22] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has left #ceph
[19:23] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[20:11] <todinini> hi, has anyone done a cluster expansion? I am stuck, the new osd doesn't show up in the map
[20:18] <gregaf> todinini: what did you do to expand it?
[20:19] <todinini> gregaf: I used this wiki how-to http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction
[20:20] <todinini> but the new osd is not in the map from ceph osd dump -o -
[20:20] <gregaf> is it running?
[20:20] <todinini> yes
[20:21] <gregaf> what's your initial conf file look like, and what does the map output?
[20:22] <todinini> http://tuxadero.com/multistorage/ceph.conf node10 is the new one
[20:23] <gregaf> what does the ceph osd dump -o - output?
[20:23] <todinini> that's the map http://pastebin.com/sqbB4Nvv
[20:28] <todinini> I don't understand why node10 is not showing up
[20:28] <wido> todinini: did you increase maxosd?
[20:29] <todinini> wido: yes from 10 to 11
[20:29] <wido> ok, are you using cephx?
[20:29] <wido> for auth
[20:29] <todinini> wido: no
[20:29] <todinini> but the auth list command does list keys for the other nodes
[20:29] <todinini> but I never configured it
[20:31] <todinini> on the new node there is some cosd activity, but I do not know what cosd does
[20:33] <gregaf> what does ceph -s show?
[20:34] <todinini> 10.07.26_20:34:01.726501 pg v9708: 2640 pgs: 2640 active+clean; 7623 MB data, 15493 MB used, 896 GB / 911 GB avail
[20:34] <todinini> 10.07.26_20:34:01.742157 mds e29: 1/1/1 up {0=up:active}, 1 up:standby
[20:34] <todinini> 10.07.26_20:34:01.742229 osd e78: 4 osds: 4 up, 4 in
[20:34] <todinini> 10.07.26_20:34:01.742343 log 10.07.26_20:21:35.807102 mon0 22 : [INF] osd9 boot
[20:34] <todinini> 10.07.26_20:34:01.742518 mon e1: 2 mons at
[20:35] <wido> todinini: what's your new crushmap
[20:35] <wido> did you create one?
[20:35] <wido> and was the new osd formatted?
[20:36] <wido> yehudasa: you there?
[20:37] <todinini> wido: I did not create a new crushmap, because in the tutorial the osd should show up before that; yes, the new osd was formatted
[20:37] <wido> ok, that's true
[20:37] <wido> what's the id of the new osd?
[20:37] <gregaf> it looks like the new OSD isn't connecting to the monitor at all
[20:37] <todinini> the new osd is 10
[20:38] <gregaf> how'd you start up the OSD precisely?
[20:38] <gregaf> and do you have a log file?
[20:38] <todinini> the osd10 is writing to its btrfs partition
[20:38] <wido> todinini: can you confirm that maxosd is at 11? ceph osd getmaxosd
[20:38] <todinini> gregaf: /etc/init.d/ceph start
[20:38] <gregaf> it's in the map, wido
[20:38] <wido> did i miss the map?
[20:39] <gregaf> pastebin up above
[20:39] <todinini> 10.07.26_20:39:07.602927 mon0 -> 'max_osd = 11 in epoch 78' (0)
[20:39] <gregaf> you ran that on node10?
[20:39] <wido> yes, you are right :)
[20:39] <wido> missed it
[20:39] <todinini> gregaf: the start command? yes on node10
[20:40] <gregaf> where's your ceph.conf located?
[20:40] <gregaf> and do you have a log file for the osd? (look in /var/log/ceph)
[20:40] <todinini> /etc/ceph/ceph.conf
[20:40] <gregaf> if you can pastebin the log or something that ought to tell us why it's not talking to the monitor
[20:41] <todinini> osd10 log http://pastebin.com/n0euRBCu
[20:41] <wido> could you up the debug level for the osd?
[20:42] <wido> debug osd and debug ms i think?
[20:42] <todinini> wido: for all osds or just number 10?
[20:43] <wido> only for the new one
[20:45] <gregaf> that's the whole log, todinini? ending in "10.07.26_18:52:58.684968 7f66da2f1720 filestore(/data/osd10) parse meta -> 0.0p0_0 = 1"?
[20:45] <todinini> gregaf: yep
[20:46] <todinini> new log level ms 15 http://pastebin.com/WfEFSYuh
[20:52] <gregaf> okay, it is talking to the monitor
[20:53] <gregaf> not enough info there, though — add in debug_osd 20 (the log will get large)
[20:57] <todinini> http://pastebin.com/CqZxHc3x
[21:02] <gregaf> hey, you found a bug!
[21:02] <todinini> gregaf: really?!
[21:02] <gregaf> it's a big shock, I know ;)
[21:03] <todinini> so what's wrong?
[21:03] <gregaf> the OSD needs to get a full OSDMap history, and it isn't getting one because of a protocol issue
[21:03] <todinini> where in the logs could I see that?
[21:04] <gregaf> well, we figured it out from "cur 0 < newest 78" and the related lines
[21:04] <gregaf> the issue is that the OSD is sending a "subscribe" message to the monitor, and in that message it's trying to say "give me map epoch 0"
[21:05] <gregaf> but 0 in the protocol is understood to mean the most recent map
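The ambiguity gregaf describes can be sketched roughly as follows. This is an illustration only, not Ceph's actual code: both function names and the `want_latest_only` flag are hypothetical, and the sketch assumes real map epochs start at 1.

```python
# Hypothetical sketch, not Ceph code: shows why "epoch 0" is ambiguous
# when 0 doubles as the "send me the most recent map" sentinel.

def maps_to_send_ambiguous(requested_epoch, newest_epoch):
    """Monitor-side view where 0 is understood as 'most recent map'."""
    if requested_epoch == 0:
        # A brand-new OSD that has no maps yet also sends 0 here,
        # so it only receives the newest map instead of the full history.
        return [newest_epoch]
    return list(range(requested_epoch, newest_epoch + 1))


def maps_to_send_explicit(start_epoch, newest_epoch, want_latest_only=False):
    """One possible way out: carry the 'latest only' intent separately."""
    if want_latest_only:
        return [newest_epoch]
    first = max(start_epoch, 1)  # assumption: epochs are numbered from 1
    return list(range(first, newest_epoch + 1))
```

With `newest_epoch = 78` as in the log, the first function hands a fresh OSD a single map, while the second hands it epochs 1 through 78.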
[21:05] <todinini> I see, that's bad, will you file a bug for it?
[21:05] <gregaf> yeah, we will
[21:06] <todinini> ok, so I have to wait for this feature,
[21:08] <gregaf> unfortunately :/
[21:08] <gregaf> I imagine it won't be long, though, that's not a good issue to leave waiting
[21:08] <todinini> gregaf: no prob, can I subscribe to this bug somewhere, so that I get notified when it's fixed and can test again?
[21:11] <gregaf> todinini: http://tracker.newdream.net/issues/308
[21:11] <gregaf> err,
[21:11] <gregaf> http://tracker.newdream.net/issues/308
[21:13] <todinini> can I watch that issue somewhere to get an email notification if the status changes?
[21:13] <wido> todinini: register at the tracker
[21:13] <gregaf> well, if you register a user you can become a "watcher" and get bug updates
[21:13] <wido> and "Watch" the issue
[21:14] <gregaf> I don't think there's a way just by putting in your email
[21:16] <todinini> ok, have an account now, and watching it
[21:20] <sagewk> todinini: can you turn up your monitor debug level (debug mon = 20, debug ms = 1 in [mon] section), restart the monitor, and then restart the new cosd?
[21:20] <todinini> yep, give me a sec
[21:20] <sagewk> assuming it doesn't come up that time, can you post the monitor log somewhere?
[21:20] <sagewk> also, can you post the full (new) osd log somewhere?
[21:21] <sagewk> (btw, you can do debug ms = 1 in the osd log, no need for anything higher than that for the messaging layer)
[21:21] <sagewk> thanks
[21:24] <sagewk> btw, the wiki article got screwed up, someone replaced the 'format the osd' section with spam. reverted it.. that may be part of your problem?
[21:25] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[21:26] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[21:26] <todinini> sagewk: the mon0 log http://pastebin.com/6UA9fcaU
[21:26] <todinini> do you need the osd10 log as well?
[21:27] <sagewk> yeah
[21:28] <sagewk> lunch! back soon
[21:30] <todinini> osd10 log after restart http://pastebin.com/4xxJ1fYj
[21:43] <todinini> mon log http://pastebin.com/uy3LFQZv
[21:45] <todinini> mon log grep '' http://pastebin.com/NwcE8gtm
[22:23] <sagewk> todinini: can you post the complete osd10 log somewhere? it's the beginning bits i'm wondering about
[22:28] <todinini> sagewk: are the first 500 lines enough? http://pastebin.com/BTPzPSmp
[22:29] <sagewk> can you less the mon log, find where the osd_boot message is received (search for osd_boot(osd10 v0)), and pastebin that section of the mon log?
[22:29] <todinini> http://tuxadero.com/multistorage/osd.10.log
[22:31] <todinini> http://tuxadero.com/multistorage/mon.0.log
[22:31] <sagewk> 10.07.26_21:40:07.271004 7fd084ae0710 mon0(leader).osd e78 preprocess_boot on fsid 80b227e0-cdbf-5106-5257-0d13036039be != 7da9cc7d-707d-4c75-a6e6-f8252768317e
[22:31] <sagewk> re-run cosd --mkfs.. make sure you use the correct monmap
[22:31] <sagewk> see http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction#Format_the_OSD
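The monitor log line sagewk quotes shows a boot request being refused because two fsids differ. A minimal sketch of that kind of check, assuming nothing about the real monitor code (the function name `fsids_match` is hypothetical):

```python
import uuid

# Hypothetical sketch, not the monitor's real preprocess_boot logic:
# a booting daemon is only accepted if its fsid equals the cluster's.

def fsids_match(fsid_a, fsid_b):
    """Compare two fsid strings as UUIDs, ignoring letter case."""
    return uuid.UUID(fsid_a) == uuid.UUID(fsid_b)


# The two values from the log line above do not match, which is why
# osd10 never made it into the map until it was reformatted.
print(fsids_match("80b227e0-cdbf-5106-5257-0d13036039be",
                  "7da9cc7d-707d-4c75-a6e6-f8252768317e"))
```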
[22:35] <wido> hi
[22:35] <wido> noticed something weird with cephx this weekend, not really sure what it was
[22:36] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[22:36] <wido> last friday i upgraded my cluster to the latest unstable, this of course involved a restart of my osd's. Somehow they then didn't want to authenticate with each other anymore
[22:36] <wido> had to restart the osd's a few times (in random order) to get everything working again.
[22:37] <wido> they were communicating with the mon, this was saying all the OSD's were up, but mounting the fs for example hung with mount error
[22:37] <wido> 5
[22:37] <sagewk> hmm, interesting. the key they use to authenticate is a rotating key..
[22:38] <todinini> sagewk: should this command come back to the shell at some point? ceph mon getmap -o /tmp/monmap ?
[22:39] <wido> sagewk: ok, i found it pretty weird too
[22:40] <todinini> ahh, mon wasn't running
[22:41] <todinini> now the osd is in the map, the first part of the formatting wasn't in the wiki a few hours ago?
[22:42] <gregaf> that's correct, apparently it got deleted by a spambot and not replaced
[22:43] <todinini> ohh, then i apologize for the trouble
[22:43] <sagewk> no worries, we found a related bug with the protocol
[22:43] <gregaf> not your fault, and there are still some protocol bugs we need to fix
[22:45] <wido> is yehudasa coming online today?
[22:52] <yehudasa> yehudasa is online
[22:52] <yehudasa> wido: i'm here
[22:53] <wido> ah, great
[22:54] <wido> i found something weird today with the gateway. Some states seem to get cached
[22:54] <yehudasa> what do you mean?
[22:54] <wido> when testing with the If-Modified-Since header, as soon as i got a 304 Not Modified, the gateway kept returning that, with every request i did
[22:55] <wido> a simple HEAD request on that file (without any extra headers) would return a 304
[22:55] <wido> checking the logs, the gateway is indeed returning a 304
[22:55] <yehudasa> that's probably what the gateway got from the osd
[22:56] <yehudasa> the gateway itself is stateless
[22:56] <wido> but why would it return a 304? Since mod_ptr isn't set, it should never get in that state
[22:56] <wido> added some lines for debugging, but it never gets inside that if
[22:57] <yehudasa> hmm.. maybe it doesn't clean up something
[22:57] <wido> i think so, that there is a pointer, which isn't cleared
[22:57] <wido> haven't been able to find it though
[22:58] * allsystemsarego_ (~allsystem@ Quit (Quit: Leaving)
[23:01] <yehudasa> so you get it on what operations?
[23:03] <wido> when i got into a situation where the gateway should return 304, so i did a request for if-modified-since, which is after the last-modified of that file
[23:03] <wido> then you get a 304, but if you then do a HEAD or GET on that file, it will also return 304
[23:03] <yehudasa> on that specific file only?
[23:03] <wido> no, on every file
[23:04] <yehudasa> and what code isn't reached?
[23:04] <wido> triggering a 404 then fixes it
[23:04] <wido> if (mod_ptr) {
[23:04] <wido> in rgw_rados.cc, i've got a cout << in there
[23:04] <yehudasa> hmm.. probably that error structure isn't cleaned up
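For reference, the behaviour wido expects from a stateless handler can be sketched as below. This is not the gateway's code from rgw_rados.cc; the function name `status_for_request` is hypothetical and only the standard If-Modified-Since rule is shown: a 304 is possible only when the header is present on that particular request.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sketch, not the gateway's implementation: the status is
# derived from this request's inputs only, so nothing can leak over
# from an earlier request.

def status_for_request(last_modified, if_modified_since=None):
    """Return 304 only if the header is set and the object is unchanged."""
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304
    return 200


mtime = datetime(2010, 7, 26, 12, 0, 0, tzinfo=timezone.utc)
# A conditional request dated after the object's mtime, then a plain one.
print(status_for_request(mtime, "Mon, 26 Jul 2010 23:00:00 GMT"))
print(status_for_request(mtime))
```

If a plain request still yields 304 after a conditional one, some per-request state (such as the error structure yehudasa mentions) is presumably surviving between requests.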
[23:05] <wido> i've just uploaded a patch in #302 btw
[23:05] <yehudasa> yeah, thanks.. I'll look at it
[23:06] <wido> btw, since the gateway is a part of Ceph, could there be a config section in ceph.conf for the gateway? for example, set the debugging
[23:06] <wido> and what might be useful is a config setting so that the gateway adds a header with the hostname of the machine it is running on: X-RADOS-Server: myhostname
[23:06] <wido> useful in large clusters, to see which server served the request
[23:07] <yehudasa> you can add as many sections as you want.. however, at the moment that gateway debugging isn't really integrated with the ceph debugging
[23:07] <yehudasa> you can have the ceph client debugging though
[23:09] <yehudasa> iirc, there's some variable that you can set that would affect the log name, and can also be used for the section name on the config file
[23:10] <wido> but right now it isn't reading the config, is it? but the env is a good idea, you can then configure this from Apache
[23:10] <yehudasa> it actually does read the config
[23:10] <wido> because right now the gateway is just dumping a lot of data in the error log, which is not always desired
[23:10] <yehudasa> the error log is one thing and the ceph log is another
[23:10] <yehudasa> you just don't see the ceph log output
[23:10] <wido> yes, that is true
[23:11] <yehudasa> from what I see, at the moment, you could use '[librados]' section on the ceph.conf file
[23:11] <yehudasa> we might be able to make it configurable
[23:12] <wido> but i can configure what i want? no need to edit config.cc?
[23:12] <wido> i thought all the entries had to be summed up in there
[23:12] <yehudasa> only for handling the default values
[23:13] <wido> ok, i'll think about it. Some configuration of the gateway would be useful imho
[23:14] <yehudasa> config.cc defines the g_conf structure and all its fields, and updates the fields according to the defaults and to the ceph.conf
[23:14] <yehudasa> but you can add config entries of your own, and read them by yourself
[23:15] <wido> ok
[23:15] <yehudasa> but keep in mind that you'll have to include certain code out of ceph that is not exported in librados
[23:15] <wido> one more thing i found: when you shut down Apache, the gateway keeps running, that isn't normal, is it?
[23:15] <wido> yes, so environment might be even better
[23:15] <wido> since we are using CGI
[23:15] <yehudasa> that's related to the apache fcgi interface
[23:16] <yehudasa> I'm not sure whether it is an actual problem, or some apache configuration
[23:16] <wido> well, processes start to stack up at some point
[23:17] <wido> i also configured Apache to kill a fcgi process after 1800 seconds and respawn it, just to prevent memory leaks, doesn't work either
[23:17] <wido> seems to be the same, Apache can't shut down the process
[23:19] <wido> btw, you are using fcgiapp.h, which is official FastCGI, why use fcgid and not fastcgi?
[23:20] <yehudasa> that's what the default debian apache installation uses
[23:21] <yehudasa> ok, the shutdown problem is solvable
[23:22] <yehudasa> need to add a signal handler to catch SIGUSR1, it'll need to signal the main loop to end at its convenience
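The approach yehudasa outlines (the handler only sets a flag, and the main loop notices it between requests) can be sketched like this. It is a Unix-only illustration in Python rather than the gateway's C++; `serve`, `handle_sigusr1`, and `max_requests` are hypothetical names, and the callback stands in for accepting and handling one FastCGI request.

```python
import os
import signal

# Hypothetical sketch of a flag-based graceful shutdown, not gateway code.
shutdown_requested = False


def handle_sigusr1(signum, frame):
    """Only record the request; the main loop decides when to stop."""
    global shutdown_requested
    shutdown_requested = True


def serve(handle_one_request, max_requests=None):
    """Handle requests until SIGUSR1 arrives; return how many were served."""
    signal.signal(signal.SIGUSR1, handle_sigusr1)
    served = 0
    while not shutdown_requested:
        if max_requests is not None and served >= max_requests:
            break
        handle_one_request()  # an in-flight request always finishes
        served += 1
    return served
```

Because the handler never exits the process itself, a request that is being served when the signal arrives still completes before the loop ends.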
[23:24] <wido> ok, tomorrow i'll make a new branch and make it more configurable, control the output
[23:25] <wido> we could make a strict mode, where it exactly follows Amazon, for example, the partial content
[23:25] <wido> and a regular mode, where it is a bit more loose, but still is S3 compatible
[23:34] <wido> i'm going afk, ttyl
[23:35] <yehudasa> thanks, ttyl
[23:49] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.