#ceph IRC Log

IRC Log for 2010-07-27

Timestamps are in GMT/BST.

[1:19] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[1:23] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit ()
[1:52] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[1:54] * tjikkun (~tjikkun@195-240-122-237.ip.telfort.nl) has joined #ceph
[1:56] * paunchy (~jreitz@dsl253-098-218.sfo1.dsl.speakeasy.net) has joined #ceph
[2:28] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[2:50] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[3:22] * Osso (osso@AMontsouris-755-1-7-241.w86-212.abo.wanadoo.fr) Quit (Quit: Osso)
[4:09] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[4:21] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[5:27] * pcish (7a742213@ircip1.mibbit.com) has joined #ceph
[5:57] <MarkN> when I try to update the unstable branch or try to do a git clone i get a connection reset by peer error, anyone else experiencing this ?
[6:43] * tjikkun (~tjikkun@195-240-122-237.ip.telfort.nl) Quit (Ping timeout: 480 seconds)
[7:07] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) Quit (Quit: Leaving.)
[8:36] * mtg (~mtg@vollkornmail.dbk-nb.de) has joined #ceph
[8:47] * eternaleye (~quassel@184-76-53-210.war.clearwire-wmx.net) Quit (Ping timeout: 480 seconds)
[8:50] <wido> MarkN: try to switch to HTTP for git
[9:00] <MarkN> wido: thanks, will try that at work tomorrow
[9:00] <wido> MarkN: git remote set-url origin http://ceph.newdream.net/git/ceph.git
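
For anyone hitting the same connection resets, switching an existing clone to the HTTP mirror wido points to is a quick change plus a re-pull; a minimal sketch, assuming the remote is named "origin" (git's default, check with "git remote -v"):

    # point the existing clone at the HTTP mirror instead of the git:// protocol
    git remote set-url origin http://ceph.newdream.net/git/ceph.git
    # confirm the new URL is in place
    git remote -v
    # update the unstable branch over HTTP
    git pull origin unstable
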
[9:40] * kblin (~kai@mikropc7.biotech.uni-tuebingen.de) has joined #ceph
[9:40] <kblin> morning folks
[9:42] <wido> morning
[9:55] * allsystemsarego (~allsystem@188.27.164.91) has joined #ceph
[12:48] * Osso (osso@AMontsouris-755-1-7-241.w86-212.abo.wanadoo.fr) has joined #ceph
[14:04] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) has joined #ceph
[15:01] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) has joined #ceph
[15:32] * mtg (~mtg@vollkornmail.dbk-nb.de) Quit (Quit: Verlassend)
[16:59] <todinini> I triggered an assertion in buffer.h after cleanly rebooting all osds at once, here is the log of one osd http://pastebin.com/YTMQ9fx7
[17:53] <wido> todinini: could you upload the corefile, binary and logfile somewhere?
[17:53] <wido> that's what sagewk needs to backtrace it
[17:53] <wido> he needs your /usr/bin/cosd
[17:53] <wido> and even your debug symbols in /usr/lib/debug/usr/bin/cosd
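
Putting wido's list together, a rough sketch of how such a crash could be inspected locally with gdb, assuming a core file was dumped and the detached debug symbols live under the path above (the core file path below is a placeholder):

    # open the crashed daemon together with its core dump (placeholder path)
    gdb /usr/bin/cosd /path/to/core
    # point gdb at the detached debug symbols for cosd
    (gdb) set debug-file-directory /usr/lib/debug
    # print a full backtrace for every thread
    (gdb) thread apply all bt full
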
[18:07] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[18:13] * eternaleye (~quassel@184-76-53-210.war.clearwire-wmx.net) has joined #ceph
[20:40] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) Quit (Quit: Leaving.)
[20:40] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) has joined #ceph
[21:19] <wido> yehudasa: just tried your SIGUSR1 handler, my gateway is not shutting down yet
[21:21] <yehudasa> wido: I pushed a fix to the unstable branch
[21:22] <yehudasa> there's some issue with it not breaking out of the FCGX_Accept(), so I just added a timer to exit after 5 seconds
[21:23] <yehudasa> but anyway, even if you have some processes left, they shouldn't stack up, as the next time you get a request the old processes will die
[21:23] <wido> ok, i'll see what it does
[21:24] <wido> btw, did some performance tests today on a small set of files, got 350 req/sec out of it, whereas a regular Apache server does ~1000 req/sec on the same file (with local disk)
[21:24] <wido> but, that's where the local fs cache kicks in i think
[21:24] <wido> since serving the file over NFS gave me the same performance
[21:25] <yehudasa> when you access rados, there's no local caching involved, unlike NFS
[21:25] <wido> yes, i know
[21:26] <wido> so it's a pretty good performance then
[21:26] <yehudasa> yeah
[21:26] <todinini> hmm, now almost all of my osds are dying within minutes and I get a kernel trace http://pastebin.com/quGeh4nw
[21:26] <wido> i think people should use a caching proxy like Varnish or just use mod_cache_disk/mem
[21:26] <wido> todinini: what are you doing?
[21:27] <wido> your machine seems to go OOM
[21:27] <todinini> wido: it's not OOM, it doesn't touch the swap
[21:28] <todinini> root@node6:~# free -m total used free shared buffers cached
[21:28] <todinini> Mem: 982 465 517 0 82 310
[21:28] <todinini> -/+ buffers/cache: 71 910
[21:28] <todinini> Swap: 5130 0 5130
[21:29] <wido> "kswapd0: page allocation failure"
[21:30] <todinini> but you see there is plenty of memory still unused
[21:30] <wido> uhm, the devs might know this better. Maybe the OSD is trying to allocate a lot, which is not available?
[21:31] <wido> yehudasa: the gateway shuts down nicely right now, it doesn't even wait five seconds, but goes down directly
[21:31] <todinini> maybe, but until yesterday, the cluster was running stable, and yesterday I did this expansion thing
[21:35] <wido> hmm, there is not a lot of data on your cluster
[21:40] <todinini> a few G, we are still at the beginning of our tests
[21:44] <yehudasa> wido: cool
[21:53] <wido> yehudasa: just placed Varnish in front of the gateway, 2000 req/sec
[21:53] <wido> that is a really cool proxy btw, if you ever want to do something with HTTP caching, check out Varnish
[21:53] <wido> lightweight, really fast and advanced
[22:00] <yehudasa> sounds great.. will look at it
[22:02] <wido> and for dreamhost it could be something too, we've had a lot of HTTP floods (slowloris) and other DDoSes where Apache would go down within seconds, while Varnish kept the service online. The problem with Apache is that it opens a slot and then waits for the request to come in, which fills up all your slots. Varnish handles this much better, so it protects your Apache
[22:03] <wido> one question about the logging, Ceph uses "dout(n)" for the logging, this seems to be a stream, but i can't find where it is implemented. This might be useful for the gateway too
[22:04] <gregaf> dout is a big multilayered macro
[22:05] <yehudasa> there's no single place where dout is defined
[22:05] <yehudasa> there is the generic dout though
[22:06] <yehudasa> that is generic_dout()
[22:07] <wido> i see, in debug.h
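
For context on the levels being discussed: the number passed to dout(n) is compared against the per-subsystem debug level set in ceph.conf, and the message is only emitted when n is at or below that level. A rough excerpt of turning up OSD and messenger logging (option names as used by the 0.2x releases; the values shown are illustrative):

    ; ceph.conf excerpt: dout(n) messages appear only when n <= the matching debug level
    [global]
        debug ms = 1         ; messenger traffic
    [osd]
        debug osd = 20       ; very verbose cosd logging
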
[22:45] <wido> yehudasa: one more code cleanup (i think) i found
[22:46] <wido> rgw_rest.cc line 271, why the iteration? future support of more attributes?
[22:51] <wido> i'm going afk, ttyl
[23:06] * AnthonyOIT1 (~anthony@dhcp-v025-008.mobile.uci.edu) has joined #ceph
[23:07] <AnthonyOIT1> Hello, I have a question about ceph's OSD storage.
[23:07] <gregaf> AnthonyOIT1: shoot :)
[23:08] <AnthonyOIT1> Okay thanks :)
[23:08] <AnthonyOIT1> so i have two nodes where each holds 4 SCSI disks
[23:08] <AnthonyOIT1> one node holds 4 36gb scsi disk
[23:08] <AnthonyOIT1> and the other node holds 4 73 gb scsi disk
[23:09] <AnthonyOIT1> i read that ceph auto balances the osds so that each disk would use the same amount of space
[23:09] <AnthonyOIT1> and i wrote a 300GB.bin using dd
[23:09] <AnthonyOIT1> the first node which holds the 36gb scsi disk shows 100% storage used while the other node shows 50%
[23:10] <AnthonyOIT1> so i was wondering if every OSD needs to be the same size?
[23:10] <gregaf> ah
[23:10] <gregaf> you need to modify the OSD weights in the CRUSH map if you want to use up all your storage in uneven disk situations
[23:10] <gregaf> let me find the wiki link
[23:11] <AnthonyOIT1> okay thank you
[23:12] <gregaf> http://ceph.newdream.net/wiki/Custom_data_placement_with_CRUSH
[23:12] <gregaf> that tells you how to modify the CRUSH map, and the part you'll want to modify is the weight field
[23:13] <AnthonyOIT1> ahh
[23:13] <AnthonyOIT1> i see.
[23:13] <gregaf> which I think you can just adjust by getting it out of your ceph instance and adjusting the weights and reimporting
[23:13] <gregaf> of course this can impact performance since the OSD with larger disks will then be handling a larger portion of the data and throughput
[23:14] <gregaf> but it'll keep them using about the same percentage of their disk space
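
A rough sketch of the export / edit / reimport cycle gregaf describes, using crushtool; the exact subcommands follow the documented workflow and may differ slightly between versions, and the bucket excerpt is purely illustrative:

    # pull the compiled CRUSH map out of the cluster and decompile it to text
    ceph osd getcrushmap -o crushmap
    crushtool -d crushmap -o crushmap.txt

    # edit crushmap.txt: give the OSDs backed by the bigger 73GB disks a larger weight,
    # for example (names and values are illustrative):
    #   item osd4 weight 2.000    # was 1.000

    # recompile the edited map and load it back into the cluster
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new
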
[23:14] <AnthonyOIT1> i see
[23:14] <AnthonyOIT1> wait
[23:14] <AnthonyOIT1> so let's say
[23:14] <AnthonyOIT1> i replicate the same situation with the 300GB.bin file
[23:16] <AnthonyOIT1> so it won't distribute evenly as with the default weight of 1 because it's writing more files to the larger hard disks?
[23:16] <gregaf> yeah
[23:16] <AnthonyOIT1> i see.
[23:16] <AnthonyOIT1> i thought at first that it would roll over to the other node once the first one filled up
[23:17] <gregaf> I think that's possible in the future, but it's not something available right now for a variety of reasons
[23:18] <AnthonyOIT1> i see
[23:18] <AnthonyOIT1> thanks a lot gregaf
[23:18] <gregaf> and really doing that wouldn't improve your overall aggregate throughput, you'd just have 2x until the smaller node ran out and then 1x, instead of 2/3x or whatever all the way through
[23:21] <AnthonyOIT1> that's true.
[23:21] <AnthonyOIT1> sorry to bother
[23:21] <AnthonyOIT1> -.-
[23:21] <gregaf> no bother at all
[23:21] <gregaf> let us know if you have any other questions
[23:21] <AnthonyOIT1> hehe okay thanks a lot!
[23:22] <AnthonyOIT1> i'm fairly new to linux and network filesystems overall.
[23:22] <AnthonyOIT1> just a second year undergraduate
[23:22] <AnthonyOIT1> -.-
[23:22] <gregaf> what are you looking at Ceph for?
[23:24] <AnthonyOIT1> well, i'm working under my school's IT department and we're testing what ceph seems to be aiming for in terms of performance, reliability, and scalability
[23:25] <AnthonyOIT1> i'm in the process of running benchmarks for performance results
[23:25] <AnthonyOIT1> oh yeah
[23:25] <gregaf> ah
[23:25] <AnthonyOIT1> does ceph have its own benchmark system?
[23:25] <gregaf> a few, nothing terribly serious
[23:26] <gregaf> the rados tool can do a basic write and read
[23:26] <AnthonyOIT1> i read that there was a command "ceph osd tell 0 bench"
[23:26] <AnthonyOIT1> or something like that
[23:26] <gregaf> yes, that's one
[23:26] <gregaf> that'll have each OSD bench its disk bandwidth and report back
[23:26] <AnthonyOIT1> root@ceph-master:~# ceph osd tell 0 bench
[23:26] <AnthonyOIT1> 10.07.27 14:25:18.907739 7f69a8550710 monclient(hunting): found mon0
[23:26] <AnthonyOIT1> 10.07.27 14:25:18.908158 mon <- [osd,tell,0,bench]
[23:26] <AnthonyOIT1> 10.07.27 14:25:18.919283 mon0 -> 'ok' (0)
[23:27] <AnthonyOIT1> i received that
[23:27] <AnthonyOIT1> but there's no report coming back
[23:27] <gregaf> "rados -p pool bench 60 write" will write to the OSD cluster for 60 seconds from your client, without the filesystem layer
[23:27] <gregaf> oh, it just reports in the log, sorry
[23:28] <AnthonyOIT1> ahh
[23:28] <AnthonyOIT1> i see.
[23:28] * deksai (~chris@71-13-57-82.dhcp.bycy.mi.charter.com) Quit (Quit: Leaving.)
[23:28] <gregaf> you can see it go by on your client if you have ceph -w running, too
[23:28] <AnthonyOIT1> yeah i had that running as well
[23:28] <gregaf> it should have shown up in there then, I think
[23:28] <AnthonyOIT1> but i didn't see the ceph osd tell 0 bench report still
[23:31] <gregaf> did you wait long enough to check for it?
[23:31] <gregaf> Sage says it writes a gig, so leave enough time for that
[23:32] * ghaskins_mobile (~ghaskins_@66-189-114-103.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[23:33] <sagewk> it's "bench [blocksize] [total bytes to write]", defaulting to 4MB and 1GB
[23:35] <AnthonyOIT1> hmm
[23:35] <AnthonyOIT1> well
[23:35] <AnthonyOIT1> i don't see any active disks(LED lights)
[23:36] <AnthonyOIT1> when i ran it
[23:38] <sagewk> you did 'ceph osd tell 0 bench' ?
[23:38] <sagewk> and were running ceph -w at the time?
[23:39] <AnthonyOIT1> yes
[23:40] * allsystemsarego (~allsystem@188.27.164.91) Quit (Quit: Leaving)
[23:41] <sagewk> hmm, and osd0 is up and running?
[23:41] <AnthonyOIT1> ah, i think something's wrong with RADOS.
[23:42] <AnthonyOIT1> i ran ./testrados and got a bunch of errors.
[23:44] <gregaf> does ./ceph -s actually show any OSDs up/in?
[23:44] <AnthonyOIT1> yes
[23:45] <gregaf> what's the output?
[23:45] <AnthonyOIT1> root@ceph-master:/usr/local/src/ceph-0.20.2/src# ceph -s
[23:45] <AnthonyOIT1> 10.07.27 14:44:45.061843 7f63f74b8710 monclient(hunting): found mon0
[23:45] <AnthonyOIT1> 10.07.27 14:44:45.066912 pg v14499: 2120 pgs: 2120 active+clean; 116 GB data, 230 GB used, 123 GB / 370 GB avail
[23:45] <AnthonyOIT1> 10.07.27 14:44:45.072729 mds e94: 1/1/1 up, 7 up:standby(laggy or crashed), 1 up:standby, 1 up:active
[23:45] <AnthonyOIT1> 10.07.27 14:44:45.072782 osd e176: 8 osds: 8 up, 8 in
[23:46] <AnthonyOIT1> 10.07.27 14:44:45.072843 log 10.07.27 13:45:16.055855 mon0 192.168.1.1:6666/0 49 : [INF] osd7 192.168.1.4:6806/1463 boot
[23:46] <AnthonyOIT1> 10.07.27 14:44:45.072942 mon e1: 1 mons at 192.168.1.1:6666/0
[23:46] <AnthonyOIT1> here's the results i got from ./testrados
[23:46] <AnthonyOIT1> root@ceph-master:/usr/local/src/ceph-0.20.2/src# ceph -s
[23:46] <AnthonyOIT1> 10.07.27 14:44:45.061843 7f63f74b8710 monclient(hunting): found mon0
[23:46] <AnthonyOIT1> 10.07.27 14:44:45.066912 pg v14499: 2120 pgs: 2120 active+clean; 116 GB data, 230 GB used, 123 GB / 370 GB avail
[23:46] <AnthonyOIT1> 10.07.27 14:44:45.072729 mds e94: 1/1/1 up, 7 up:standby(laggy or crashed), 1 up:standby, 1 up:active
[23:46] <AnthonyOIT1> 10.07.27 14:44:45.072782 osd e176: 8 osds: 8 up, 8 in
[23:47] <AnthonyOIT1> 10.07.27 14:44:45.072843 log 10.07.27 13:45:16.055855 mon0 192.168.1.1:6666/0 49 : [INF] osd7 192.168.1.4:6806/1463 boot
[23:47] <AnthonyOIT1> 10.07.27 14:44:45.072942 mon e1: 1 mons at 192.168.1.1:6666/0
[23:47] <AnthonyOIT1> oops
[23:47] <AnthonyOIT1> root@ceph-master:/usr/local/src/ceph-0.20.2/src# ./testrados
[23:47] <AnthonyOIT1> 10.07.27 14:44:10.963220 7f3f60f0d710 monclient(hunting): found mon0
[23:47] <AnthonyOIT1> rados_create_pool = -17
[23:47] <AnthonyOIT1> rados_open_pool = 0, pool = 0x1c41810
[23:47] <AnthonyOIT1> rados_snap_create snap1 = 0
[23:47] <AnthonyOIT1> rados_snap_list got snap 9 snap1
[23:47] <AnthonyOIT1> rados_snap_lookup snap1 got 9, result 0
[23:47] <AnthonyOIT1> rados_snap_remove snap1 = 0
[23:47] <AnthonyOIT1> rados_write = 26
[23:47] <AnthonyOIT1> rados_read = 26
[23:47] <AnthonyOIT1> rados_setxattr attr1=bar = 3
[23:47] <AnthonyOIT1> rados_getxattr attr1 = 3
[23:47] <AnthonyOIT1> rados_stat size = 26 mtime = 1280267054 = 0
[23:47] <AnthonyOIT1> exec result=Tue Jul 27 14:44:13 2010
[23:47] <AnthonyOIT1> read result=Tue Jul 27 14:44:13 2010
[23:47] <AnthonyOIT1> size=26
[23:47] <AnthonyOIT1> common/Mutex.h: In function 'void Mutex::Lock(bool)':
[23:47] <AnthonyOIT1> common/Mutex.h:97: FAILED assert(r == 0)
[23:47] <AnthonyOIT1> 1: (RadosClient::aio_write(RadosClient::PoolCtx&, object_t, long, ceph::buffer::list const&, unsigned long, RadosClient::AioCompletion*)+0xb0) [0x7f3f62668020]
[23:47] <AnthonyOIT1> 2: (rados_aio_write()+0xeb) [0x7f3f6266859b]
[23:47] <AnthonyOIT1> 3: (main()+0x40f) [0x4015ef]
[23:47] <AnthonyOIT1> 4: (__libc_start_main()+0xfd) [0x7f3f61a78c4d]
[23:48] <AnthonyOIT1> 5: /usr/local/src/ceph-0.20.2/src/.libs/lt-testrados() [0x401119]
[23:48] <AnthonyOIT1> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[23:48] <AnthonyOIT1> common/Mutex.h: In function 'void Mutex::Lock(bool)':
[23:48] <AnthonyOIT1> common/Mutex.h:97: FAILED assert(r == 0)
[23:48] <AnthonyOIT1> 1: (RadosClient::aio_write(RadosClient::PoolCtx&, object_t, long, ceph::buffer::list const&, unsigned long, RadosClient::AioCompletion*)+0xb0) [0x7f3f62668020]
[23:48] <AnthonyOIT1> 2: (rados_aio_write()+0xeb) [0x7f3f6266859b]
[23:48] <AnthonyOIT1> 3: (main()+0x40f) [0x4015ef]
[23:48] <AnthonyOIT1> 4: (__libc_start_main()+0xfd) [0x7f3f61a78c4d]
[23:48] <AnthonyOIT1> 5: /usr/local/src/ceph-0.20.2/src/.libs/lt-testrados() [0x401119]
[23:48] <AnthonyOIT1> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[23:48] <AnthonyOIT1> terminate called after throwing an instance of 'ceph::FailedAssertion*'
[23:48] <AnthonyOIT1> Aborted
[23:54] <sagewk> we're not seeing that error. can you take a look at the unstable from git? or if you need a prerelease tarball, http://ceph.newdream.net/testing/0.21/ceph-0.21.tar.gz (we're about to release v0.21)
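
For reference, both routes sagewk offers look roughly like this; the build step assumes the autotools setup the 0.2x tarballs shipped with, so adjust configure options as needed:

    # option 1: move an existing source checkout to the unstable branch
    git fetch origin
    git checkout unstable

    # option 2: grab the 0.21 prerelease tarball and rebuild
    wget http://ceph.newdream.net/testing/0.21/ceph-0.21.tar.gz
    tar xzf ceph-0.21.tar.gz
    cd ceph-0.21
    ./configure && make
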

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.