#ceph IRC Log


IRC Log for 2012-11-27

Timestamps are in GMT/BST.

[0:00] <lurbs> dmick: "For non-btrfs file systems we rely on the journal for basic consistency", according to Sage.
[0:00] <dmick> yes, but
[0:00] <dmick> that means for the transactions that are not yet flushed from the journal
[0:00] <dmick> if it's committed to stable storage, the journal is irrelevant
[0:00] <via> the rebuild is going 4x as fast with tmpfs vs journal on same disk as osd
[0:00] <lurbs> How do you rebuild an OSD that's had a failed journal without starting it over?
[0:01] <via> well, when i just did this switch to tmpfs i totally just --mkjournal'd
[0:01] <via> and didn't lose the osd contents
[0:01] * wer (~wer@wer.youfarted.net) Quit (Remote host closed the connection)
[0:01] <dmick> lurbs: if the device is completely gone, --mkjournal should get it going, and then when it joins the cluster, it learns what it has vs. what all the other OSDs have
[0:02] <dmick> if it's not completely gone, I suppose it could be in some kind of inconsistent state that might require some repair
[0:02] <lurbs> Right. Go mkjournal and a 'ceph osd repair $osd' is likely to recover it reasonably well?
[0:02] <lurbs> s/Go/So/
[0:03] * wer (~wer@wer.youfarted.net) has joined #ceph
[0:04] <dmick> not clear on osd repair, but the osd will start recovery if it needs to anyway
[0:05] * MK_FG (~MK_FG@ Quit (Ping timeout: 480 seconds)
[0:07] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[0:12] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Remote host closed the connection)
[0:12] <dmick> gregaf just corrected me; apparently journal death is indeed "the OSD is unreliable and must be rebuilt from scratch"
[0:12] <dmick> because the OSD can't be known to be consistent if the journal state is unknown
[0:13] <dmick> (is that a reasonable way to put it gregaf?)
[0:13] <gregaf> sounds good to me, yeah :)
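To make the conclusion above concrete, here is a rough sketch of re-initializing a dead-journal OSD so it rebuilds from its peers. This is a hypothetical outline, not verified against any particular release; the osd id (0), the init commands, and the default paths are all assumptions:

```shell
# Hypothetical sketch: wipes osd.0's local state, then lets it backfill.
service ceph stop osd.0            # take the affected OSD down
ceph-osd -i 0 --mkfs --mkjournal   # recreate the data dir and journal
service ceph start osd.0           # rejoin the cluster; recovery repopulates it
```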
[0:13] <rweeks> Journal Death is my new metal band
[0:14] <dmick> Creeping Journal Death
[0:15] <jtang1> heh metal heads?
[0:15] * vata (~vata@ Quit (Quit: Leaving.)
[0:16] <dmick> guilty as charged (but dammit it ain't right...there's someone else controlling me)
[0:16] <rweeks> dmick is definitely a metalhead. I just enjoy it from time to time
[0:17] <via> oh, so crap
[0:17] <via> so even if i went with a shared ssd, that causes me to increase chances of failure
[0:17] <via> and i'll just have to live with slow journals on the same disk as the osd
[0:18] <lurbs> dmick: http://uber.geek.nz/heavymetal.jpg
[0:18] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[0:18] <rweeks> hehe Voivod
[0:20] * maxiz (~pfliu@ Quit (Remote host closed the connection)
[0:21] <dmick> "skid row" is a deadly thing?....stretch...
[0:21] * tnt (~tnt@55.188-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[0:22] <rweeks> if you listen to them, yes
[0:22] * rweeks grins
[0:22] <joao> it's missing GWAR
[0:22] <dmick> <bristle>
[0:22] * joao sets mode -o joao
[0:23] <dmick> Monkey Business is about as heavy a tune as you'll ever run across rweeks, but I'll say no more about it here :-P
[0:23] <rweeks> I counter with 18 And Life
[0:23] <dmick> I knew you would
[0:23] <rweeks> having said that, yes, they had some decent tracks
[0:23] <rweeks> not sure where GWAR fits in there, joao
[0:24] <joao> nor do I, but is missing nonetheless :(
[0:24] <dmick> imaginary probably
[0:24] <dmick> but thanks for the diagram lurbs; fun times
[0:24] <joao> also, Moonspell would fit right into the "occult" section or something
[0:25] * BManojlovic (~steki@ Quit (Ping timeout: 480 seconds)
[0:25] <dmick> Yeah, but they're from Portugal, so no one cares :)
[0:25] <joao> oh, you have no idea how true that is...
[0:26] <joao> not even Portuguese care
[0:26] <rweeks> also Ragnarok on there is in the wrong spot
[0:26] <joao> and by that I mostly mean me :p
[0:35] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[0:36] * loicd (~loic@magenta.dachary.org) has joined #ceph
[0:36] * sagelap2 (~sage@ has joined #ceph
[0:39] * sagelap1 (~sage@61.sub-70-197-128.myvzw.com) Quit (Ping timeout: 480 seconds)
[0:41] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[0:45] * illuminatis (~illuminat@89-76-193-235.dynamic.chello.pl) Quit (Ping timeout: 480 seconds)
[0:46] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[0:47] * Lennie`away is now known as leen
[0:47] * drokita (~drokita@ Quit (Quit: Leaving.)
[0:47] * leen is now known as Lennie
[0:49] * Lennie is now known as Lennie`away
[0:56] * PerlStalker (~PerlStalk@ Quit (Quit: rcirc on GNU Emacs 24.2.1)
[1:04] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[1:07] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[1:11] <dmick> hey sagewk
[1:23] <elder> Headed out for some dinner.
[1:23] <elder> (I am, not sagewk)
[1:23] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:24] * weber (~he@ Quit (Quit: weber)
[1:29] * jlogan1 (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) Quit (Ping timeout: 480 seconds)
[1:33] <AaronSchulz> yehudasa: do GET/HEADs against containers with .r:* for read ACLs still need authentication?
[1:34] * AaronSchulz swears this worked the other day
[1:34] <yehudasa> AaronSchulz: GET/HEAD on the container or on the objects?
[1:34] <AaronSchulz> oh, just an object
[1:34] <AaronSchulz> I don't have .rlistings set either
[1:35] <yehudasa> don't think that .rlistings is implemented
[1:35] <yehudasa> .r:* should work
[1:35] <AaronSchulz> so listings always need auth?
[1:35] <yehudasa> are you sure it's set?
[1:36] <yehudasa> well, listing requires a different acl
[1:36] <yehudasa> which you cannot set through the swift api
[1:37] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[1:37] <AaronSchulz> well we don't want public listings anyway so that's fine
[1:37] <yehudasa> I would have probably implemented .rlistings, had that been documented anywhere when I implemented it
[1:37] <yehudasa> it was hard enough locating the .r:* magic
[1:38] * AaronSchulz just set .r:* again and it still doesn't work
[1:38] * AaronSchulz heads the container
[1:39] <yehudasa> was it working on the same version before?
[1:40] * AaronSchulz doesn't recall upgrading
[1:42] <yehudasa> what version are you running? just checked that and it worked
[1:50] <AaronSchulz> X-Container-Read: .r:*
[1:51] <AaronSchulz> ok, version, lets see
[1:51] <AaronSchulz> ceph version 0.48.2argonaut (commit:3e02b2fad88c2a95d9c0c86878f10d1beb780bfe)
[1:52] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[1:53] <yehudasa> AaronSchulz: does your rgw log say something like 'Getting permissions id=...' for the request that fails?
[1:54] <AaronSchulz> radosgw.log is empty
[1:55] <yehudasa> hmm.. well, do you have 'log file = ...' under the relevant section in your ceph.conf?
[1:56] <AaronSchulz> log file = /var/log/ceph/radosgw.log
[1:56] <AaronSchulz> under [client.radosgw.gateway]
[1:58] <yehudasa> debug rgw = 20?
[1:58] <AaronSchulz> also there already
[2:03] <yehudasa> AaronSchulz: try running 'radosgw-admin policy --bucket=<container>'
[2:04] <AaronSchulz> what does that do?
[2:04] <yehudasa> dumps the bucket policy
[2:04] * maxiz (~pfliu@ has joined #ceph
[2:05] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[2:05] <yehudasa> ah
[2:05] <AaronSchulz> <AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>mediawiki</ID><DisplayName>MediaWiki</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>mediawiki</ID><DisplayName>MediaWiki</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
[2:05] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[2:05] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[2:05] <yehudasa> but it shows the S3 policy, does not include the swift specific headers
[2:06] <AaronSchulz> :(
[2:06] * doug (doug@breakout.horph.com) has left #ceph
[2:08] <yehudasa> make sure that the radosgw has the correct log file opened: ls /proc/`pgrep`/fd
[2:08] <yehudasa> make sure that the radosgw has the correct log file opened: ls /proc/`pgrep radosgw`/fd
[2:10] * plut0 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) has joined #ceph
[2:10] <AaronSchulz> yehudasa: you mean ls -l ?
[2:10] <yehudasa> yeah
[2:10] <AaronSchulz> I see l-wx------ 1 root root 64 Nov 26 17:08 0 -> /var/log/ceph/radosgw.log.1 (deleted)
[2:11] <AaronSchulz> the others are pipes or sockets
[2:11] <yehudasa> AaronSchulz: you have log rotation
[2:11] <yehudasa> killall -HUP radosgw might solve it
[2:12] <AaronSchulz> wow, the log has stuff again
[2:14] <AaronSchulz> 2012-11-26 17:12:10.920837 7f1181fab700 2 req 38340:0.000196:swift:GET /swift/v1/aaron-devwiki-wmf-local-public/0/07/Monthly_Metrics_Meeting_September_4%2C_2012.ogv:get_obj:authorizing
[2:14] <AaronSchulz> 2012-11-26 17:12:10.920841 7f1181fab700 10 failed to authorize request
[2:14] <AaronSchulz> yehudasa: well I see the 401s in the log now, but nothing terribly interesting (lots of HTTP header and server info)
[2:15] <yehudasa> try accessing the object directly, without the /swift/v1 prefix
[2:16] <AaronSchulz> that seems to work
[2:19] <AaronSchulz> yehudasa: now the metadata headers are prefixed with x-amz-meta-*
[2:22] <AaronSchulz> seems like more s3 speak
[2:24] <yehudasa> AaronSchulz: right, when you don't prefix it with /swift/v1 it assumes it's S3
[2:25] <yehudasa> is that a problem?
[2:26] <yehudasa> the issue is that we don't deal with anonymous access with the swift protocol handling
[2:26] <yehudasa> I can open an issue for that
[2:27] <AaronSchulz> it would be nice, though it's ok if the metadata headers are a little funny I guess (since they are only used internally)
[2:27] * The_Bishop (~bishop@p4FCDEF25.dip.t-dialin.net) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[2:29] <AaronSchulz> yehudasa: does rgw have a white-list of allowed custom headers?
[2:29] <AaronSchulz> I swear my X-Content-Duration one doesn't stick
[2:30] <yehudasa> AaronSchulz: there's a set of headers that we keep, X-Content-Duration is not one of them
[2:31] <AaronSchulz> in swift you can redefine the list
[2:31] <AaronSchulz> is there any analogous ceph setting?
[2:32] <AaronSchulz> Content-Disposition doesn't set either
[2:32] <yehudasa> well, it works in a later version
[2:32] <yehudasa> bobtail will have it
[2:33] <yehudasa> there's a set of headers defined in bobtail
[2:34] <yehudasa> static struct rgw_http_attr rgw_to_http_attr_list[] = {
[2:34] <yehudasa> { RGW_ATTR_CONTENT_TYPE, "Content-Type"},
[2:34] <yehudasa> { RGW_ATTR_CONTENT_LANG, "Content-Language"},
[2:34] <yehudasa> { RGW_ATTR_EXPIRES, "Expires"},
[2:34] <yehudasa> { RGW_ATTR_CACHE_CONTROL, "Cache-Control"},
[2:34] <yehudasa> { RGW_ATTR_CONTENT_DISP, "Content-Disposition"},
[2:34] <yehudasa> { RGW_ATTR_CONTENT_ENC, "Content-Encoding"},
[2:34] <yehudasa> { NULL, NULL},
[2:34] <yehudasa> };
[2:35] <yehudasa> I guess we can make that more flexible
[2:36] <AaronSchulz> yehudasa: in swift you have to remember the defaults to define those plus the new ones...it would be nice to just define the custom ones
[2:36] <AaronSchulz> (e.g. not have to re-enumerate Content-Type and such)
[2:36] <yehudasa> yeah
[2:37] * sagelap (~sage@115.sub-70-197-146.myvzw.com) has joined #ceph
[2:37] <yehudasa> feature #3535
[2:39] <AaronSchulz> yehudasa: don't forget "public swift API" too :)
[2:39] * sagelap2 (~sage@ Quit (Ping timeout: 480 seconds)
[2:39] <yehudasa> that's #3534
[2:39] <AaronSchulz> heh
[2:41] * AaronSchulz wonders what went wrong with the logging earlier
[2:45] * sagelap (~sage@115.sub-70-197-146.myvzw.com) Quit (Ping timeout: 480 seconds)
[2:46] <yehudasa> AaronSchulz: log rotation needs to send a HUP signal to the gateway after it's done its thing
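The stale, deleted log fd seen earlier is the classic symptom of rotation happening without signaling the daemon. A minimal logrotate stanza sketch, assuming the process is named radosgw and the log path from the ceph.conf quoted above; adjust to taste:

```
/var/log/ceph/radosgw.log {
    weekly
    rotate 4
    compress
    missingok
    postrotate
        killall -q -HUP radosgw || true
    endscript
}
```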
[2:55] * AaronSchulz actually goes home now
[2:57] * sagelap (~sage@124.sub-70-197-142.myvzw.com) has joined #ceph
[3:02] <via> are there any recommended mount options for xfs for the osd? is noatime okay?
[3:06] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[3:07] <dmick> noatime is key, yes
[3:07] <via> for performance?
[3:07] <dmick> yes
[3:07] <via> any other important ones?
[3:10] <dmick> user_xattr
[3:10] <via> even on xfs?
[3:11] <dmick> oh sorry, no
[3:12] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Remote host closed the connection)
[3:12] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[3:18] * sagelap (~sage@124.sub-70-197-142.myvzw.com) Quit (Ping timeout: 480 seconds)
[3:22] * plut0 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) has left #ceph
[3:23] <via> what about barrier=0?
[3:24] <via> or is that drive dependent
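For reference, the advice so far as an /etc/fstab sketch. The device and mount point are placeholders, and barrier=0 is deliberately left out: it is only safe with a non-volatile write cache, which is exactly why it's drive/controller dependent:

```
# sketch only: XFS-backed OSD data directory mounted with noatime
/dev/sdb1  /var/lib/ceph/osd/ceph-0  xfs  noatime  0  0
```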
[3:36] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[3:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[3:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[3:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[3:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[4:08] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:15] * adjohn (~adjohn@ Quit (Quit: adjohn)
[4:26] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[4:27] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[4:30] * deepsa (~deepsa@ has joined #ceph
[4:36] * wubo (80f42605@ircip4.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[4:55] * benner (~benner@ Quit (Read error: Connection reset by peer)
[4:56] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[4:56] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[4:56] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[5:00] * kbad (~kbad@malicious.dreamhost.com) Quit (Remote host closed the connection)
[5:00] * kbad (~kbad@malicious.dreamhost.com) has joined #ceph
[5:00] * liiwi (liiwi@idle.fi) Quit (Remote host closed the connection)
[5:00] * liiwi (liiwi@idle.fi) has joined #ceph
[5:00] * joey__ (~joey@71-218-31-90.hlrn.qwest.net) Quit (Remote host closed the connection)
[5:00] * joey__ (~joey@71-218-31-90.hlrn.qwest.net) has joined #ceph
[5:01] * rino (~rino@ Quit (Ping timeout: 480 seconds)
[5:03] * joao (~JL@ Quit (Ping timeout: 480 seconds)
[5:03] * yanzheng (~zhyan@ has joined #ceph
[5:05] * benner (~benner@ has joined #ceph
[5:12] * joao (~JL@ has joined #ceph
[5:12] * ChanServ sets mode +o joao
[5:34] * jlogan1 (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) has joined #ceph
[5:38] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[5:38] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[6:07] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) has joined #ceph
[6:22] * n-other (~c32fff42@2600:3c00::2:2424) has joined #ceph
[6:23] <n-other> hello
[6:24] <n-other> I have a problem with apps showing negative file sizes for big files residing on ceph
[6:25] <n-other> stat64("/home/data1/bigfile.dat", {st_mode=S_IFREG|0644, st_size=2336962885, ...}) = 0 write(1, "-1958004411\n", 12-1958004411
[6:26] <n-other> this is the trace from php
[6:26] <n-other> however the stat program uses lstat64 and seems to get it right
[6:26] <n-other> has anyone seen this kind of behaviour?
[6:28] * miroslav1 (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) has joined #ceph
[6:30] <elder> I think something about your kernel is using 32-bit file offsets.
[6:30] <elder> n-other,
[6:31] <elder> That size, 2336962885, is in hexadecimal 0x8B4B3945. That has the high bit set, meaning if it's treated as a signed 32-bit value it becomes negative. In fact, as a 32-bit negative value, it is -1958004411.
[6:32] <n-other> is it kernel-related or libc related?
[6:32] <elder> So the size of the file is in fact 2336962885 bytes, but your kernel, or your libraries, or something, is only capable of working with files that are (2^32 - 1) bytes or less.
[6:32] <elder> I'm not sure, and I don't think I'll be able to help you track that down at the moment.
[6:33] <elder> I assume you are running with a 32-bit kernel though, right?
[6:33] * kermin (ca037809@ircip2.mibbit.com) has joined #ceph
[6:33] <n-other> ok, this is clear. just wanted to check if someone had seen this particular issue already
[6:33] <n-other> right
[6:34] <n-other> 3.2.0
[6:34] <n-other> kernel
[6:34] <n-other> I have a feeling that the problem is somewhere around not using #define __USE_LARGEFILE64
[6:34] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[6:35] <n-other> in the code in php
[6:35] <elder> Could be.
[6:35] <elder> Good luck, I'm off to bed.
[6:35] <n-other> thanks )
[6:36] <n-other> will put some details here for the descendants )
[6:39] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[6:39] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[6:44] <n-other> nope it doesn't. even using all those 64-bit related defines still gives negative filesize
[6:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[6:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[6:46] <dmick> n-other: you rebuilt the php interpreter?
[6:47] <n-other> nope
[6:47] <n-other> just made a simple c program
[6:48] <dmick> pastebin it and its compilation command somewhere?
[6:48] <n-other> http://pastebin.com/0kYBW80N
[6:48] <n-other> here is it
[6:48] <n-other> size is -1958004411
[6:48] <dmick> %d can't print a 64-bit value
[6:49] <dmick> try %ld or maybe %lld
[6:49] <n-other> compilation command is gcc stat_test.c )
[6:49] <n-other> yep
[6:49] <n-other> already checking it
[6:50] <n-other> it definitely doesn't work for long
[6:50] <n-other> but for long long it does
[6:51] <n-other> so %lld produces correct values
[6:51] <n-other> let's dig in what's used inside php )
[6:51] <dmick> so I think you want to give some flag to gcc to make it compile 64-bit
[6:52] <dmick> like maybe --64
[6:52] <dmick> and then I'm pretty sure that's an "LP64" model (longs and pointers are 64-bit)
[6:52] <dmick> which means %ld would be right
[6:52] <n-other> not sure if gcc will accept it for 32b arch
[6:52] <n-other> let me check...
[6:52] <dmick> oh it's a 32-bit machine? never mind
[6:52] <n-other> aha )
[6:53] <dmick> so there are printf formats to allow this to be portable
[6:53] <dmick> I can never remember their names
[6:55] <dmick> /usr/include/inttypes.h
[6:55] <dmick> PRI*
[6:56] <dmick> so for example
[6:56] <dmick> printf("%" PRId64 "\n", val64)
[6:56] <dmick> will print with either %ld or %lld as appropriate for the compilation flags in effect
[7:01] <n-other> ok, I guess it will work, but it really matters how it's done inside php
[7:01] <n-other> looking through 5.3.19 release...
[7:01] <dmick> sure
[7:01] <n-other> looks rubbish )
[7:01] <n-other> kernel code is much more easy to understand )
[7:01] <dmick> I would not be shocked if it doesn't work right on 32-bit arch
[7:02] <dmick> not to say it shouldn't
[7:02] <dmick> just that I would not be shocked
[7:02] <n-other> heh
[7:02] <n-other> hm...
[7:02] <n-other> i might rewrite it in perl... old school )
[7:02] <n-other> let me check if Larry made it right way ))
[7:05] * rino (~rino@ has joined #ceph
[7:07] <n-other> yep, it works!
[7:07] <dmick> not surprising either. The strace showed the right value so it was just a question of userland interpretation of the bits
[7:07] <dmick> but good
[7:13] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[7:14] * deepsa (~deepsa@ has joined #ceph
[7:14] <n-other> argh
[7:14] <n-other> they are f&%$ing kidding
[7:15] <n-other> php.net does not accept bug report without a patch
[7:15] <n-other> 8-()
[7:15] <tore_> haha
[7:15] <dmick> wow
[7:16] <n-other> heh
[7:16] <tore_> I like that. only report things you've fixed
[7:16] <n-other> I had to double-submit it until it accepted )
[7:17] <n-other> https://bugs.php.net/bug.php?id=63618
[7:17] * n-other (~c32fff42@2600:3c00::2:2424) Quit (Quit: TheGrebs.com CGI:IRC)
[7:17] * n-other (~c32fff42@2600:3c00::2:2424) has joined #ceph
[7:17] <n-other> yep
[7:18] <n-other> make your users do the job )
[7:18] <tore_> nice. just keeping banging on it til it works. Same thing used to work for my grandmother's television
[7:18] <n-other> ))
[7:19] <n-other> right now I am in position whatever to rewrite the whole stuff in perl or make a patch, recompile php and submit it )
[7:19] <n-other> or...
[7:19] <tore_> nobody wants to watch static on soap operas.
[7:21] <n-other> oops....lunch time...
[7:24] <dmick> n-other:
[7:24] <dmick> note:
[7:24] <dmick> http://php.net/manual/en/function.filesize.php
[7:24] <dmick> return type int
[7:24] <dmick> Note: Because PHP's integer type is signed and many platforms use 32bit integers, some filesystem functions may return unexpected results for files which are larger than 2GB.
[7:26] <dmick> all kinds of headstands to work around this in the commentary there. <shudder>
[7:36] * dmick (~dmick@2607:f298:a:607:19e4:51dc:d444:6009) Quit (Quit: Leaving.)
[7:36] * jlogan1 (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) Quit (Ping timeout: 480 seconds)
[7:36] * miroslav1 (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[7:40] <tore_> damn ubuntu 10.04 LTS... why does libapache2-mod-fastcgi only show up in multiverse?!?
[7:43] <n-other> dmick, hmmm I really missed that. nevertheless this is a bug
[7:43] <n-other> however the workaround is simple enough
[7:44] <n-other> stat -s "%s" $filename
[7:45] * f4m8_ is now known as f4m8
[7:47] <n-other> I meant stat -c
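n-other's workaround sidesteps PHP's signed int entirely by shelling out to stat(1). A quick way to check it on a Linux box with GNU coreutils (the temp file path is arbitrary):

```shell
# Create a sparse file just over 2^31 bytes and read back its size;
# stat -c %s prints the full 64-bit value, not a wrapped negative one.
f=$(mktemp)
truncate -s 2336962885 "$f"
stat -c %s "$f"    # prints 2336962885
rm -f "$f"
```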
[7:56] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[7:58] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[8:01] * jlogan (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) has joined #ceph
[8:02] * n-other (~c32fff42@2600:3c00::2:2424) Quit (Quit: TheGrebs.com CGI:IRC (EOF))
[8:02] * tnt (~tnt@55.188-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:14] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:15] * yanzheng (~zhyan@ has joined #ceph
[8:59] * tpeb (~tdesaules@ has joined #ceph
[9:01] * nosebleedkt (~kostas@kotama.dataways.gr) has joined #ceph
[9:03] * BManojlovic (~steki@ has joined #ceph
[9:18] * tnt (~tnt@55.188-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:21] * Leseb (~Leseb@stoneit.xs4all.nl) has joined #ceph
[9:21] <nosebleedkt> hi all
[9:21] * Leseb_ (~Leseb@ has joined #ceph
[9:22] * jtang1 (~jtang@ has joined #ceph
[9:25] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:29] * Leseb (~Leseb@stoneit.xs4all.nl) Quit (Ping timeout: 480 seconds)
[9:29] * Leseb_ is now known as Leseb
[9:32] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:37] <tore_> anyone know of any work going on with a windows client to mount rbd on windows-based servers? I have a coworker annoying me with this particular question and google isn't yielding anything useful
[9:37] <tore_> he wants to avoid using cygwin etc
[9:39] * deepsa_ (~deepsa@ has joined #ceph
[9:40] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:41] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[9:41] * deepsa_ is now known as deepsa
[9:42] * tpeb (~tdesaules@ Quit (Remote host closed the connection)
[9:45] <nosebleedkt> joao, does cephx work with openldap?
[9:48] * ScOut3R (~ScOut3R@ has joined #ceph
[9:52] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[9:55] * loicd (~loic@ has joined #ceph
[9:57] * psomas (~psomas@inferno.cc.ece.ntua.gr) Quit (Ping timeout: 480 seconds)
[9:59] * Kioob`Taff1 (~plug-oliv@local.plusdinfo.com) has joined #ceph
[10:00] * deepsa_ (~deepsa@ has joined #ceph
[10:01] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[10:01] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[10:01] * deepsa_ is now known as deepsa
[10:08] * jlogan (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) Quit (Ping timeout: 480 seconds)
[10:14] * illuminatis (~illuminat@0001adba.user.oftc.net) has joined #ceph
[10:15] * illuminatis (~illuminat@0001adba.user.oftc.net) Quit ()
[10:17] * illuminatis (~illuminat@0001adba.user.oftc.net) has joined #ceph
[10:31] * jlogan (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) has joined #ceph
[10:39] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Quit: Leaving.)
[10:44] * maxiz (~pfliu@ Quit (Quit: Ex-Chat)
[10:46] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[10:55] * iltisanni (d4d3c928@ircip3.mibbit.com) has joined #ceph
[11:01] * loicd (~loic@ has joined #ceph
[11:34] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) has joined #ceph
[11:41] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[11:43] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[11:45] * tziOm (~bjornar@ has joined #ceph
[12:01] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[12:06] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Quit: slang)
[12:31] <joao> is anyone able to access the tracker?
[12:35] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[12:35] <ScOut3R> it times out for me
[12:36] <joao> ScOut3R, thanks
[12:36] <tnt> yup here too ... Apache responds but the backend doesn't seem to say anything.
[12:38] * jlogan (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) Quit (Ping timeout: 480 seconds)
[12:42] * Norman (53a31f10@ircip1.mibbit.com) has joined #ceph
[12:46] * jtang1 (~jtang@2001:770:10:500:a5a3:41c0:558f:935e) has joined #ceph
[12:46] <Norman> hey guys! I got a question, when having 2 nodes with both 36 drives. Is it better to have an OSD for each drive or to pool drives with RAID so you have less OSD daemons?
[12:48] <jtang> Norman: that could depend on the distro that you have
[12:48] <jtang> we found that on RHEL6 (due to kernel and libc versions) one big OSD was the only solution
[12:48] <tnt> really ? why ?
[12:48] <tnt> ah yes, the syncfs ...
[12:48] <jtang> and on ubuntu 12.x and newer i think it's fine to have as many osd's as you want
[12:48] * maxiz (~pfliu@ has joined #ceph
[12:48] <jtang> tnt: yea syncfs was the limiting factor
[12:49] <Norman> jtang we will be running Ubuntu, does it matter for performance ?
[12:50] <tnt> several osd processes seem to be faster according to the tests IIRC.
[12:50] <tnt> However with 36 drives that's a lot of OSD processes.
[12:50] <tnt> maybe a hybrid approach ?
[12:51] * mdawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[12:52] <tnt> Here I allocate 1GB of RAM per OSD process so if you launch 36 that's a lot of ram :p
[12:53] <Norman> Hmm I'm looking for the best setup when running the cluster with only two nodes, also the recommended number of PG's, number of MON daemons to run and of course the OSD's; both nodes would have 36 drives :)
[12:53] <Norman> tnt: yes that would be quite a lot of RAM :)
[12:53] <tnt> 2 nodes is really not ideal because for the mon process you need an odd number ...
[12:54] <jtang> Norman: dunno, we've only two nodes in our small testbed
[12:55] <jtang> each node has 45disks, but we ended up doing one big osd on each
[12:55] <jtang> and we havent tested with ubuntu
[12:55] <tnt> 45 disks wow. What do you use to actually connect that much btw ?
[12:55] <jtang> gigabit :P
[12:56] <jtang> we have two backblaze pods which we're testing ceph on
[12:56] <Norman> jtang: but with one big OSD you have them running in RAID 5/6 ?
[12:56] <jtang> *sigh* in hindsight, they werent that hot of an idea
[12:56] <jtang> Norman: we are running btrfs in raid10
[12:56] <jtang> we dont have hardware raid at all, just a jbod controller
[12:57] * mdawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.89 [Firefox 17.0/20121119183901])
[12:57] <tnt> jtang: huh ? you mean like iscsi ?
[12:57] <jtang> tnt: no
[12:57] <jtang> no iscsi at all
[13:05] <Norman> hmm it's a pity there is so little to be found from people running setups with more disks and how they configured them :(
[13:14] * kermin (ca037809@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[13:15] * match (~mrichar1@pcw3047.see.ed.ac.uk) has joined #ceph
[13:19] <Norman> with a 2 node setup do you need to configure anything like PG's in order to make sure that when one node goes down, the system is still running?
[13:19] <match> Norman: You ideally need an odd number of mons to allow quorum decisions
[13:21] <match> Norman: by default replicas of an object will be spread across the osds, but you can 'tweak' this by using crush maps to ensure things like building separation etc
[13:22] <Norman> match: does it make sense to run a 3rd monitor on a non-storage node ?
[13:27] <match> Norman: yep - there's some info on the hardware requirements of each daemon here: http://ceph.com/docs/master/install/hardware-recommendations/ (though the ceph site is down for me atm)
[13:27] <Norman> match: yea here also
[13:28] <match> Norman: google cache works for me: http://webcache.googleusercontent.com/search?q=cache:36nfP-MTNX4J:ceph.com/docs/master/install/hardware-recommendations/&hl=en&tbo=d&gl=uk&strip=1
[13:33] * gucki (~smuxi@80-218-125-247.dclient.hispeed.ch) has joined #ceph
[13:34] <jtang> Norman: we just left the defaults with the pg's
[13:34] <jtang> we did change the crushmap to put data on nodes instead of osd's as the minimum unit size/placement
[13:34] <Norman> jtang, when u unplug one machine, is the storage still running? :)
[13:34] <jtang> yea
[13:34] <jtang> ;)
[13:34] <jtang> as we have a replica count of 2
[13:35] <jtang> recovery is a bit of a nightmare right now as we have 65t per pod
[13:35] <jtang> so restoring a completely failed replica isn't pleasant
[13:35] <jtang> as we just hve two machines
[13:35] <jtang> and two osd's
[13:35] <Norman> jtang: you mean the whole system hangs, cause of all the I/O ? :)
[13:35] <jtang> Norman: heh yea
[13:35] <jtang> i think oyu can limit the rate of recovery
[13:36] <jtang> by weighting the osd's or throttling the number of operations per sec
[13:36] <jtang> we just havent looked at it much
[13:36] <jtang> it'd be nice if osds that get re-added back into the system would just automatically start off with 50% of their marked weight
[13:37] <jtang> and slowly bump it back up automatically to 100% of the specified weight
[13:41] <jtang> re-visiting irods right now
[13:41] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Read error: Connection reset by peer)
[13:42] <jtang> it's times like this that i appreciate the likes of ceph and gpfs for their high-availability and redundancy setups
[13:42] <jtang> there's less stuff to go wrong with configuring things
[13:43] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[13:43] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[13:44] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[14:00] <Norman> If you take an SSD for the Journal, I guess you would need to have that SSD in RAID-1, no? Otherwise you would still have a single point of failure?
[14:04] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[14:05] <firaxis> hi
[14:06] <firaxis> anyone tried to use xen with rbd ?
[14:07] <firaxis> I`m getting io timeouts if I try to use rbd image via rbd:/ or via kernel module and phy:/
[14:07] <tnt> firaxis: I use it.
[14:08] <tnt> what do you mean io timeout ?
[14:08] <firaxis> I try it on debian wheezy with kernel 3.6, xen 4.1 and ceph 0.54
[14:09] <tnt> I use it on wheezy, with 3.6.4 (compiling 3.6.8 now to update), 4.1.3 and ceph 0.48.2 here.
[14:09] <firaxis> any disk activity stops when I try to use disk
[14:09] <tnt> do you have the OSD on the same xen dom0 ?
[14:10] <firaxis> yes
[14:10] <tnt> yeah, that causes issues. I have a workaround, wait a sec.
[14:13] <tnt> firaxis: http://pastebin.com/MtLjLqt0 Apply this patch to /etc/xen/scripts/vif-bridge
[14:13] <tnt> then in the vif config of the domU running the osd, add the 'accel=off' option
[14:14] <tnt> this disables some of the TX TCP offload accelerations ... causing slightly higher cpu usage, but it makes it work.
[14:14] <tnt> (you need ethtool installed of course)
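For reference, ethtool's argument order puts the interface before the offload flags; a hedged sketch of what tnt's vif-bridge patch effectively runs (the vif name is a placeholder for the OSD domU's interface, and this only applies on a Xen dom0, so it is not runnable elsewhere):

```shell
# Disable TX offload acceleration on the domU's virtual interface.
# "vif1.0" is a placeholder; substitute the vif of the domU running the OSD.
ethtool -K vif1.0 tx off
# Confirm the setting took effect:
ethtool -k vif1.0 | grep -i tx-checksumming
```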
[14:15] <firaxis> many thanks
[14:16] <tnt> firaxis: also if running xen, you might be hit with the 'clock advancing magically by 50 minutes' bug ... depending on your hw.
[14:22] <nosebleedkt> hi, the osd directories contain a folder called "current"
[14:22] <nosebleedkt> which has multiple files
[14:22] <nosebleedkt> what is that directory about ?
[14:23] <tnt> ... that's where the data is ... duh.
[14:26] <nosebleedkt> ah
[14:26] <nosebleedkt> really
[14:26] <nosebleedkt> the objects?
[14:28] <tnt> in some form yes ... but don't expect to just read them as file, ceph is free to do whatever magic they want there
[14:29] <nosebleedkt> yeah sure
[14:30] <nosebleedkt> tnt, also, I have two disks and a rep factor of 3
[14:30] <nosebleedkt> how is it going to share the data?
[14:30] <tnt> I think that just won't work.
[14:30] <jtang> hmm trinity just released the book of kells on the ipad
[14:30] <jtang> seems the cat is out of the bag now
[14:31] <jtang> we're gonna get hit hard on serving out the content in the next few weeks
[14:31] <nosebleedkt> tnt, whats the relation between the number of disks and the number of rep factor?
[14:32] <tnt> well, rep size tries to replicate the same data across that many different OSDs ... if you have 2 OSDs you can't have 3 replicas on 3 different OSDs ...
[14:33] <nosebleedkt> so the replica number must be a multiple of the number of OSD ?
[14:34] <tnt> no
[14:34] <tnt> num replica <= num osd ... that's it.
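The constraint tnt states (num replicas <= num OSDs) is controlled per pool; a hedged example of lowering a pool's replica count, using commands from Ceph of this era against a running cluster (the pool name "rbd" is illustrative):

```shell
# Set the replication size of a pool to match the number of OSDs available.
ceph osd pool set rbd size 2
# Inspect the pool's replication settings:
ceph osd dump | grep 'rep size'
```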
[14:39] <firaxis> tnt: if I have OSDs on dom0 domain, and adding rbd devices on dom0, than "xm block-attach 3 phy:/dev/rbd2 xvdb w"
[14:39] <tnt> OSDs inside the dom0 will just not work, at all.
[14:40] <nosebleedkt> tnt, now my strange question. I have an RBD device of 100mb. I dd it to fill it with data. When 100mb are reached then dd stops.
[14:40] <nosebleedkt> Then that partition get remounted read-only
[14:41] <nosebleedkt> why is that?
[14:42] <tnt> no idea ... but how do you know it gets remounted as RO ?
[14:44] <nosebleedkt> just type mount
[14:44] <nosebleedkt> and it appears ro
[14:44] <nosebleedkt> and of course i cannot write to it
[14:45] <tnt> what does dmesg say ?
[14:45] <tnt> and what FS do you use on it ?
[14:49] <nosebleedkt> OSD FS is ext4
[14:50] <nosebleedkt> RBD FS also ext4
[14:50] <tnt> And I assume that the available space on your OSD is a lot bigger than 100Mb ?
[14:50] <nosebleedkt> yeah
[14:50] <nosebleedkt> got 3 osd of 1GB each
[14:51] <tnt> and what does dmesg say? it should say _why_ it remounted the RBD FS read-only.
[14:51] <nosebleedkt> yes, must do it again, wait :D
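Worth noting here: ext4's default mount behavior is errors=remount-ro, so any I/O error on the RBD device flips the filesystem read-only. Hedged diagnostics for this situation (device and mountpoint names are placeholders, so this only runs on the affected host):

```shell
# Look for EXT4-fs error messages explaining the remount:
dmesg | tail -n 30
# Check the filesystem's configured error policy:
tune2fs -l /dev/rbd0 | grep -i 'errors behavior'
# After fixing the underlying issue, remount read-write:
mount -o remount,rw /mnt/rbd
```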
[14:54] * deepsa (~deepsa@ Quit (Quit: Computer has gone to sleep.)
[15:00] <firaxis> tnt, sorry to bother you, but on dom0 rbd block devices work fine and i can`t see any errors in OSD logs. rbd devices added via loopback, and any actions with rbd go normally. it fails only with domU
[15:01] <firaxis> also I tried to add it via rbd:/ with same result
[15:02] <tnt> firaxis: your OSD are running in the dom0 or domU ?
[15:02] <firaxis> dom0
[15:03] <tnt> well, that's not a supported config at all and I never ran with that config.
[15:04] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[15:05] <tnt> you can get all sort of issues when the RBD client and OSD run under the same kernel.
[15:05] <tnt> When you encapsulate the OSD under another domU, you can avoid those.
[15:07] <tnt> you can try ethtool -K XXX tx off on the dom0 (XXX being an interface on the dom0 ... not sure which one would need to be 'fixed', try them all) but even if that works, it probably won't be very stable.
[15:08] * The_Bishop (~bishop@2001:470:50b6:0:7929:224d:a562:7402) has joined #ceph
[15:08] <firaxis> I tried ethtool -K tx off XXX, but it can`t change settings on loopback.
[15:09] <tnt> huh ... loopback ?
[15:09] <tnt> ceph shouldn't use the loopback address
[15:10] <firaxis> also, KVM works fine with RBD and OSD on same node but via librbd
[15:10] <tnt> yes, those issues only appear when the client is within the kernel.
[15:26] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[15:27] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[15:32] * drokita (~drokita@ has joined #ceph
[15:46] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[15:46] <nosebleedkt> root@cluster:~# ceph health
[15:46] <nosebleedkt> HEALTH_WARN 576 pgs stale; 576 pgs stuck stale
[15:46] <nosebleedkt> this doesnt change to OK
[15:46] <nosebleedkt> never :(
[15:47] <nosebleedkt> can someone help ?
[15:48] <tnt> well if you have rep size set to 3 and only 2 OSD ... it will never be OK.
[15:48] <nosebleedkt> no i dont
[15:49] <nosebleedkt> i have 3 osd
[15:49] <slang> nosebleedkt: can you post the output of: ceph pg dump_stuck stale
[15:49] <nosebleedkt> but i dont know what is the factor?
[15:49] <slang> nosebleedkt: to pastebin or something similar?
[15:49] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) has left #ceph
[15:50] <nosebleedkt> how do i set the rep factor?
[15:50] <nosebleedkt> in my pool
[15:51] <nosebleedkt> slang, http://pastebin.com/0z6K4x8k
[15:53] <slang> nosebleedkt: ceph -s
[15:53] <slang> ?
[15:54] <nosebleedkt> root@cluster:~# ceph -s
[15:54] <nosebleedkt> health HEALTH_WARN 576 pgs stale; 576 pgs stuck stale
[15:54] <nosebleedkt> monmap e1: 1 mons at {a=}, election epoch 0, quorum 0 a
[15:54] <nosebleedkt> osdmap e63: 6 osds: 3 up, 3 in
[15:54] <nosebleedkt> pgmap v867: 576 pgs: 576 stale+active+clean; 1252 MB data, 401 MB used, 2622 MB / 3023 MB avail
[15:54] <nosebleedkt> mdsmap e23: 1/1/1 up {0=a=up:replay}
[15:54] <slang> ceph osd dump -o -
[15:55] <slang> nosebleedkt: all your pgs are stale, looks like a few osds can't reach the monitor for some reason
[15:56] <nosebleedkt> osd.0 up in weight 1 up_from 56 up_thru 12 down_at 55 last_clean_interval [50,54) lost_at 49 exists,up b38fed66-3869-446c-a082-48dac2a66811
[15:56] <nosebleedkt> osd.1 up in weight 1 up_from 58 up_thru 12 down_at 57 last_clean_interval [51,54) lost_at 50 exists,up 3648ef16-cf9f-497c-9a21-b75f2f64a109
[15:56] <nosebleedkt> osd.2 up in weight 1 up_from 60 up_thru 12 down_at 59 last_clean_interval [51,56) lost_at 50 exists,up 457c1d6a-ea0a-4bef-8da8-66c6d11a7d0e
[15:56] <nosebleedkt> osd.3 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) :/0 :/0 :/0 exists,new
[15:56] <nosebleedkt> osd.4 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) :/0 :/0 :/0 exists,new
[15:56] <nosebleedkt> osd.5 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) :/0 :/0 :/0 exists,new
[15:56] <nosebleedkt> osd 3,4,5
[15:56] <nosebleedkt> sorry for the spam
[15:57] <slang> nosebleedkt: looks like osds 3,4, and 5 were never started?
[15:57] <nosebleedkt> slang, just removed them
[15:57] <nosebleedkt> using ceph osd rm 3,4,5
[15:57] <nosebleedkt> blacklist expires 2012-11-27 17:10:22.634703
[15:57] <nosebleedkt> blacklist expires 2012-11-27 17:21:04.072404
[15:58] <nosebleedkt> i also have this
[15:58] <tnt> did you wait for the pg to be redistributed between each rm ?
[15:58] <nosebleedkt> i dont think
[15:58] <nosebleedkt> then i restarted ceph daemons
[15:59] <nosebleedkt> now i have these
[15:59] <nosebleedkt> http://pastebin.com/yAnVDZ8x
[16:02] * deepsa (~deepsa@ has joined #ceph
[16:04] * PerlStalker (~PerlStalk@ has joined #ceph
[16:07] <nosebleedkt> so ?
[16:07] <slang> nosebleedkt: did you remove the osds from the crushmap before removing them?
[16:07] <nosebleedkt> no
[16:08] <slang> nosebleedkt: you could try that, I'm not sure it will work now though
[16:08] <slang> nosebleedkt: ceph osd crush remove osd.[3,4,5]
[16:08] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[16:09] <nosebleedkt> root@cluster:~# ceph osd crush remove osd.3
[16:09] <nosebleedkt> removed item id 0 name 'osd.3' from crush map
[16:09] <slang> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-an-osd-manual
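For the record, the linked docs describe a removal sequence along these lines, sketched here for osd.3 (repeat per OSD; service names and paths vary by distro, and this of course only runs against an actual cluster):

```shell
ceph osd out 3                  # mark it out so data can drain (if it's up)
service ceph stop osd.3         # stop the daemon on its host
ceph osd crush remove osd.3     # remove the OSD from the CRUSH map
ceph auth del osd.3             # delete its authentication key
ceph osd rm 3                   # remove the OSD from the cluster
```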
[16:11] <nosebleedkt> still the same
[16:11] <slang> nosebleedkt: you removed all 3?
[16:11] <nosebleedkt> yes
[16:12] <nosebleedkt> root@cluster:~# ceph -s
[16:12] <nosebleedkt> health HEALTH_WARN 576 pgs stale; 576 pgs stuck stale
[16:12] <nosebleedkt> monmap e1: 1 mons at {a=}, election epoch 0, quorum 0 a
[16:12] <nosebleedkt> osdmap e86: 3 osds: 3 up, 3 in
[16:12] <nosebleedkt> pgmap v908: 576 pgs: 576 stale+active+clean; 1252 MB data, 402 MB used, 2621 MB / 3023 MB avail
[16:12] <nosebleedkt> mdsmap e29: 1/1/1 up {0=a=up:replay}
[16:15] <slang> you might try to force remap one of the pgs: ceph pg force_create_pg 1.5e
[16:17] <nosebleedkt> i dont know
[16:17] <nosebleedkt> nothing happens
[16:17] <nosebleedkt> i think i will delete everything
[16:18] <nosebleedkt> its just a test system
[16:19] <slang> nosebleedkt: that's up to you
[16:20] <slang> nosebleedkt: I would also try: ceph osd out [1,2,3]
[16:20] <slang> errr...not that
[16:20] * loicd (~loic@magenta.dachary.org) has joined #ceph
[16:20] <nosebleedkt> did that too
[16:20] <slang> nosebleedkt: I would also try: ceph osd out [3,4,5]
[16:21] <slang> after the force_create_pg, the number of stale pgs didn't decrease by 1?
[16:21] <nosebleedkt> slang, do u know if cephx can work with openldap or radius?
[16:21] <nosebleedkt> I think it said degraded 1.
[16:22] <slang> ah ok
[16:22] <slang> nosebleedkt: so then you just need to force_create_pg for each of the stale pgs...
[16:22] <nosebleedkt> now i removed everything :d
[16:22] <slang> oh
[16:22] <nosebleedkt> i will init every osd from the beginning
[16:23] <nosebleedkt> can you answer my question about cephx?
[16:23] <tnt> no it can't
[16:23] <nosebleedkt> so cephx has its own database for passwords
[16:25] <tnt> yes
[16:25] <nosebleedkt> tnt, if a monitor goes down,
[16:25] <nosebleedkt> and another is taking charge
[16:25] <nosebleedkt> how does the client knows the ip of the new in charge?
[16:26] <tnt> the client knows the ip of all the mons
[16:26] <nosebleedkt> those are set in ceph.conf ?
[16:26] <tnt> yup
[16:26] <nosebleedkt> ok
[16:26] <nosebleedkt> so it tries to speak with mon.0 and if it doesnt reply
[16:26] <nosebleedkt> it switches to mon.b?
[16:27] <nosebleedkt> mon.1
[16:27] <tnt> I think so
[16:28] <tnt> I don't know the exact procedure ... but it works, it will connect so I guess it just tries them all.
[16:28] <tnt> until one answers (and possibly redirects it to the right master if it's not him)
[16:28] <nosebleedkt> ok
[16:28] <nosebleedkt> thank you
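A hedged ceph.conf sketch of what tnt describes: clients read the full monitor list from their local config, so they can try each mon in turn if one is down. Hostnames and addresses below are placeholders ( is a documentation address range):

```ini
; Placeholder monitor list; clients learn all mon addresses from here.
[mon.a]
    host = mon-a
    mon addr =
[mon.b]
    host = mon-b
    mon addr =
[mon.c]
    host = mon-c
    mon addr =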
[16:29] * nosebleedkt (~kostas@kotama.dataways.gr) Quit (Quit: Leaving)
[16:33] * deepsa_ (~deepsa@ has joined #ceph
[16:34] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[16:34] * deepsa_ is now known as deepsa
[16:38] * illuminatis (~illuminat@0001adba.user.oftc.net) Quit (Quit: WeeChat
[16:40] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[16:42] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[17:08] * illuminatis (~illuminat@0001adba.user.oftc.net) has joined #ceph
[17:09] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:10] * jlogan (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) has joined #ceph
[17:14] * wubo (80f42605@ircip1.mibbit.com) has joined #ceph
[17:18] <nhm> good morning #ceph
[17:19] <joao> morning nhm :)
[17:20] <jtang> sup!
[17:20] <jtang> nhm: i seemed to have missed meeting you at sc12
[17:20] <jtang> ;)
[17:21] <nhm> jtang: yeah, sorry about that, I was off running around a lot.
[17:21] * tnt (~tnt@55.188-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[17:21] * deepsa (~deepsa@ Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[17:21] <nhm> jtang: I heard from Roger and Miroslav that you stopped by though and they were impressed with what you guys are doing!
[17:22] * drokita (~drokita@ has left #ceph
[17:23] <jtang> we're experimenting
[17:23] <jtang> you wouldn't be impressed if you had to deal with them backblaze pods
[17:24] <jtang> yea it was good to bounce ideas off people in real life
[17:24] <jtang> :)
[17:24] <nhm> jtang: interesting, are you using the stock pod designs?
[17:24] <jtang> i found it amusing that IBM are going down a route similar to what ceph is doing
[17:25] <jtang> with its de-clustered raid which only available on AIX and power6/7 based systems
[17:25] <jtang> *sigh*
[17:25] <jtang> nhm: we have two stock pods, fully loaded up with 45x3tb disks in each one
[17:25] <nhm> jtang: What problems are you running into?
[17:26] <jtang> they are topheavy on disks, the controllers suck
[17:26] <jtang> we use RHEL6 (SL6 really)
[17:26] <nhm> I was wondering about those controllers.
[17:26] <jtang> so we're running into kernel version problems, old glibc (lack of syncfs)
[17:26] <jtang> i guess if we werent stubborn and just used ubuntu then it would be less of a problem
[17:27] <jtang> but we like SL6 cause its stable and doesnt change for ages, and there is vendor packages which just work for RHEL6 and its clones
[17:27] <jtang> nhm: we got the recommended controllers that backblaze themselves use
[17:27] <nhm> yes, ubuntu would work better. So long as you are on a newer kernel, I think the newer releases of ceph may not require glibc 2.14+ for syncfs.
[17:27] <jtang> the problem is that we need the port density, so it restricts what we can get
[17:28] <wubo> heh, here's a fun new log message from ceph -w
[17:28] <wubo> 2012-11-27 11:27:33.814916 mon.0 [INF] pgmap v31785: 1536 pgs: 497 active, 1033 active+clean, 6 peering; 239 GB data, 717 GB used, 16041 GB / 16758 GB avail; -2/184798 degraded (-0.001%)
[17:28] <wubo> i have negative degradation... i guess that makes things super-awesome :-)
[17:28] <jtang> nhm: we ran into problems with ubuntu's kernel not recognising the jbod cards
[17:28] <jtang> :)
[17:28] <nhm> jtang: the highpoint rocket 2720SGL does pretty well in JBOD mode in my testing.
[17:28] <wubo> ceph version 0.48.2argonaut (commit:3e02b2fad88c2a95d9c0c86878f10d1beb780bfe)
[17:28] <jtang> so either which way we have to install custom kernels and glibc's
[17:28] <nhm> jtang: it's more expensive, but you get 8 ports.
[17:29] <jtang> nhm: we need 15 ports per card
[17:29] <nhm> jtang: yeah, thats $$$.
[17:30] <nhm> jtang: there's an LSI card that does it for real. Some marvel cards look like they do but just have an on-board expander.
[17:30] <jtang> well with the current setup that we have with the pods its working alright
[17:30] <jtang> we're doing raid10 in btrfs so we just have one OSD on each pod running SL6
[17:30] <jtang> performance is better than what it was before
[17:30] <jtang> it hasnt crashed so far
[17:31] <nhm> jtang: wow, interesting!
[17:31] <jtang> its just a pain when a disk fails and when we need to do a rebuild of one of the pod's osd's
[17:31] <nhm> jtang: so far the best performance I've seen has been from 1-disk-per-osd configurations.
[17:31] <jtang> it takes ages to push data back onto a fresh osd
[17:31] <nhm> jtang: did you guys try multiple smaller raid10s?
[17:32] <jtang> nhm: no, we didnt cause we cant run more than one osd per machine in SL6
[17:32] <jtang> :P
[17:32] <jtang> the lack of syncfs doesnt help things at all
[17:32] * ScOut3R (~ScOut3R@ Quit (Remote host closed the connection)
[17:32] <jtang> nhm: we think we'll have 65tb of usable space in the current setup that we have
[17:32] <jtang> thats a far cry from the 130t that we originally planned
[17:32] <nhm> jtang: We definitely need to get you guys that patch. :)
[17:33] <jtang> its been an interesting experiment either which way
[17:33] <jtang> we're still experimenting and testing with data that we can afford to lose
[17:34] <jtang> i.e. it's turned into a dumping ground for laptop backups, virtual tape drives for amanda, that kind of stuff
[17:34] <jtang> i'm not sure what else the guys are using the system for right now
[17:34] * Norman (53a31f10@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[17:35] <jtang> we've another project that we're evaluating ceph for
[17:36] <jtang> that might generate some patches if we decide to go with ceph
[17:36] <jtang> so far im pretty convinced that ceph is a good fit if we can do some development work on the cephfs side, but recently ive been reading up on librados
[17:36] <jtang> and it looks pretty neat
[17:38] * guigouz1 (~guigouz@201-87-100-166.static-corp.ajato.com.br) has joined #ceph
[17:42] * sagelap (~sage@251.sub-70-197-128.myvzw.com) has joined #ceph
[17:49] <jtang> *cool* i have heating in my office
[17:49] * jtang discovered that the heating was off
[17:51] * illuminatis (~illuminat@0001adba.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:05] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Quit: Ex-Chat)
[18:07] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[18:13] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[18:20] * match (~mrichar1@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[18:21] * sagelap1 (~sage@2607:f298:a:607:7463:bf6a:b3fa:74f8) has joined #ceph
[18:24] * sagelap (~sage@251.sub-70-197-128.myvzw.com) Quit (Ping timeout: 480 seconds)
[18:26] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[18:27] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[18:33] * aliguori (~anthony@ has joined #ceph
[18:39] * ScOut3R (~scout3r@54004264.dsl.pool.telekom.hu) has joined #ceph
[18:39] * nwatkins (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[18:44] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:50] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[18:56] <jtang> hrm
[18:57] <jtang> had an interesting chat with some inktanks in a conf call just now
[18:57] <rweeks> I was supposed to be there, sorry, jtang.
[18:57] * rweeks was in traffic instead.
[18:57] <jtang> they've cleared up a few things for us alright
[18:57] <jtang> heh, it was good
[19:00] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:03] * calebamiles (~caleb@c-24-128-194-192.hsd1.vt.comcast.net) Quit (Ping timeout: 480 seconds)
[19:05] * calebamiles (~caleb@c-24-128-194-192.hsd1.vt.comcast.net) has joined #ceph
[19:08] * nwatkins (~Adium@soenat3.cse.ucsc.edu) has left #ceph
[19:10] * jlogan2 (~Thunderbi@ has joined #ceph
[19:12] * jlogan (~Thunderbi@2600:c00:3010:1:c411:8052:9a4c:99a) Quit (Ping timeout: 480 seconds)
[19:14] * Leseb (~Leseb@ Quit (Ping timeout: 480 seconds)
[19:17] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (Quit: Leaving.)
[19:18] * chutzpah (~chutz@ has joined #ceph
[19:29] * Ryan_Lane (~Adium@ has joined #ceph
[19:30] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[19:33] * jtang1 (~jtang@2001:770:10:500:a5a3:41c0:558f:935e) Quit (Ping timeout: 480 seconds)
[19:33] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[19:33] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:35] * illuminatis (~illuminat@0001adba.user.oftc.net) has joined #ceph
[19:40] * phil (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[19:40] * loic (~loic@ has joined #ceph
[19:40] <loic> Thanks, I just won 1300 euros on the site jackpotparis com at the slot machines with 25 euros free
[19:40] * phil is now known as Guest122
[19:41] * houkouonchi-work (~linux@ has joined #ceph
[19:42] * loic (~loic@ Quit (autokilled: Mail support@oftc.net with questions (2012-11-27 18:42:34))
[19:49] <nhm> jtang: glad to hear it went well!
[19:50] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[19:51] * Guest122 (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[19:55] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Quit: Leaving)
[19:59] * guigouz1 (~guigouz@201-87-100-166.static-corp.ajato.com.br) Quit (Quit: Computer has gone to sleep.)
[20:07] * sagelap1 (~sage@2607:f298:a:607:7463:bf6a:b3fa:74f8) Quit (Ping timeout: 480 seconds)
[20:12] * jtang1 (~jtang@ has joined #ceph
[20:18] * sagelap (~sage@2607:f298:a:607:f51a:d51:bb88:da91) has joined #ceph
[20:44] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (Remote host closed the connection)
[20:46] * nwatkins (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[20:49] <tnt> Is it possible to do rolling upgrade to bobtail when it comes out ?
[20:49] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has joined #ceph
[20:52] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[21:06] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[21:13] * The_Bishop (~bishop@2001:470:50b6:0:7929:224d:a562:7402) Quit (Quit: Who the hell is this Peer? When I catch him, I'll reset his connection!)
[21:18] * ScOut3R (~scout3r@54004264.dsl.pool.telekom.hu) Quit (Quit: Lost terminal)
[21:19] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[21:21] * jlogan2 (~Thunderbi@ Quit (Read error: Connection reset by peer)
[21:22] * jlogan (~Thunderbi@ has joined #ceph
[21:22] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[21:26] * sagelap (~sage@2607:f298:a:607:f51a:d51:bb88:da91) Quit (Ping timeout: 480 seconds)
[21:29] * jjgalvez (~jjgalvez@ has joined #ceph
[21:36] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[21:37] * sagelap (~sage@2607:f298:a:607:f51a:d51:bb88:da91) has joined #ceph
[21:39] * sagelap1 (~sage@ has joined #ceph
[21:39] * sagelap (~sage@2607:f298:a:607:f51a:d51:bb88:da91) Quit (Read error: Connection reset by peer)
[21:46] * The_Bishop (~bishop@2001:470:50b6:0:85ca:6278:8125:d9e9) has joined #ceph
[21:48] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[21:49] * sagelap1 (~sage@ Quit (Ping timeout: 480 seconds)
[21:49] * dmick (~dmick@2607:f298:a:607:9971:8e07:6a30:2cb8) has joined #ceph
[21:52] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[22:04] * nwatkins (~Adium@soenat3.cse.ucsc.edu) Quit (Quit: Leaving.)
[22:07] * vata (~vata@ has joined #ceph
[22:08] * jtang1 (~jtang@ has joined #ceph
[22:12] <dmick> hey sagewk
[22:13] <dmick> paging sagewk
[22:13] <dmick> and yet again sagewk
[22:13] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:13] <sagewk> bah can't make pidgin raise window without stealing focus :(
[22:14] <dmick> might be the wm getting in the way
[22:17] <rweeks> pidgin is the worst IRC client
[22:18] * sagelap (~sage@2607:f298:a:607:7463:bf6a:b3fa:74f8) has joined #ceph
[22:18] <dmick> rweeks: but it unifies all chat sessions very well
[22:18] <rweeks> yes
[22:19] <rweeks> however I much prefer to use an IRC client that actually understands IRC commands
[22:19] <rweeks> you could do what nwl does and run everything in CLI with finch and irssi
[22:21] * aliguori (~anthony@ has joined #ceph
[22:22] <nhm> I just use irssi for irc and whatever the gnome thing is for chat.
[22:23] <nhm> the gnome thing sucks though because someone decided to make the new IM notification share the new email notification.
[22:23] <nhm> Since I pretty much constantly have new email, I miss when I have new IMs.
[22:26] <joao> I have no idea what has happened with vim, it started using tabs instead of spaces all of a sudden
[22:26] <joao> this is really annoying
[22:27] <nhm> I wonder if it's time for me to switch to cinnamon finally.
[22:30] <lurbs> It requires surprisingly few packages pulled in from Mint over the top of the standard Ubuntu ones.
[22:31] <nhm> lurbs: I've been limping along on fallback mode for a while. I even ported an old theme from gnome2 to make it mimic my old desktop.
[22:31] <nhm> It mostly works, but it's pretty clear this isn't going to be sustainable.
[22:39] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[22:41] * gucki (~smuxi@80-218-125-247.dclient.hispeed.ch) Quit (Remote host closed the connection)
[22:42] <sjust> joao: you might be missing the vim magic header at the top of a new file
[22:43] <joshd> sjust: no, it's python
[22:47] <dmick> python can have modelines, but we tend not to
[22:47] <dmick> I have ~/.vim/syntax/{c,python}.vim files
[22:47] <dmick> that help me
[22:48] <dmick> and also this in .vimrc:
[22:48] <dmick> au FileType python setlocal ts=8 expandtab sw=4 softtabstop=4
[22:48] <dmick> joao^ maybe that will help you
[22:48] <joao> dmick, thanks
[22:48] <joao> the annoying thing is that I set ts=8 sw=2 smarttab on vim and it was still using tabs instead of spaces
[22:49] <joao> maybe I need the 'expandtab' too
[22:49] <dmick> yes
[22:49] <dmick> that's the "always spaces" thing
[22:50] <joao> oh yeah, you're right
[22:51] <joao> dmick, any chance you could make your vim files available to me? :)
[22:51] <joao> I should scrap my old .vimrc as it probably is the source of all my problems
[22:52] <dmick> they're more about syntax highlighting. It shows me tabs and trailing spaces so I can see what's up. The trailing spaces thing is a little annoying
[22:52] <dmick> but sure
[22:52] <joao> thanks
[22:55] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[23:00] * calebamiles (~caleb@c-24-128-194-192.hsd1.vt.comcast.net) Quit (Remote host closed the connection)
[23:09] * calebamiles (~caleb@c-24-128-194-192.hsd1.vt.comcast.net) has joined #ceph
[23:22] * plut0 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) has joined #ceph
[23:27] * chutzpah (~chutz@ Quit (Quit: Leaving)
[23:28] * jlogan (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[23:28] * jlogan (~Thunderbi@ has joined #ceph
[23:29] * vata (~vata@ Quit (Quit: Leaving.)
[23:30] * miroslav (~miroslav@adsl-67-124-149-139.dsl.pltn13.pacbell.net) has joined #ceph
[23:32] <buck> Has anyone encountered issues with the gcov builds not playing well with LD_LIBRARY_PATH?
[23:33] <buck> specifically, I'm seeing an error that is the same as one that occurs if a .so is not in my LD_LIBRARY_PATH, but the path is correct.
[23:33] <buck> same call works on other builds. I was just wondering if the profiling requires some other method of setting LD_LIBRARY_PATH
[23:34] <joshd> it shouldn't matter, and I haven't seen problems like that before
[23:39] * sbadia (~seb@yasaw.net) Quit (Quit: WeeChat 0.3.8)
[23:39] <buck> joshd: thanks. seemed unlikely but since I'm getting a bunch of profiling junk at the end as well, seemed worth asking about.
[23:40] * yehudasa_ (~yehudasa@ has joined #ceph
[23:46] <plut0> rweeks: hey
[23:46] <rweeks> 'allo
[23:47] <plut0> rweeks: got that email for me? :)
[23:47] <rweeks> nothing to date, no...
[23:48] <plut0> nothing to share yet?
[23:48] * aliguori (~anthony@ has joined #ceph
[23:48] * jtang1 (~jtang@ Quit (Quit: Leaving.)
[23:49] <rweeks> not since we chatted last, no
[23:49] <plut0> ok
[23:50] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[23:51] * yehudasa_ (~yehudasa@ Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.