#ceph IRC Log


IRC Log for 2010-10-22

Timestamps are in GMT/BST.

[0:09] * ajnelson (~ajnelson@soenat3.cse.ucsc.edu) Quit (Quit: ajnelson)
[0:11] * sentinel_e86 (~sentinel_@ Quit (Quit: sh** happened)
[0:12] * sentinel_e86 (~sentinel_@ has joined #ceph
[0:24] * sentinel_e86 (~sentinel_@ Quit (Ping timeout: 480 seconds)
[0:26] <jantje> gregaf: do you need me to reproduce it with more logging ?
[0:27] <gregaf> jantje: no, pretty sure we've found the root cause, just trying to come up with an appropriate fix now
[0:27] <jantje> ok, great :)
[0:27] <jantje> make sure to update the debian packages, makes it easier for me to update :P
[0:28] <gregaf> there's what looks to be a long-standing bug in the messaging connection protocol that was exposed by an error in our newly-applied use of timeouts
[0:28] <jantje> (actually, I should get the git tree ...)
[0:29] <jantje> It's really great to see how quickly you guys resolve those issues
[0:30] <gregaf> :)
[0:30] <gregaf> the timeout bug was an easy patch, but solving the protocol error is taking a bit longer :(
[0:30] <jantje> :-)
[0:34] <jantje> i'm going to sleep now
[0:35] <jantje> let me know how things went
[0:37] <jantje> (is it possible for different log levels to go to different files? for example, i want a debug file to look at when things go very wrong, but that debug file would be too big to look at the statuses)
[0:38] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[0:43] * johnl (~johnl@cpc3-brad19-2-0-cust563.barn.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[0:48] * ajnelson (~ajnelson@soenat3.cse.ucsc.edu) has joined #ceph
[0:50] <sagewk> jantje: there's a longstanding bug open to log to syslog.. that would do the trick
[0:52] <jantje> Oh, ok
[0:52] <jantje> never mind then :-)
[1:28] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[1:40] * fzylogic (~fzylogic@dsl081-243-128.sfo1.dsl.speakeasy.net) Quit (Quit: DreamHost Web Hosting http://www.dreamhost.com)
[1:47] * ajnelson (~ajnelson@soenat3.cse.ucsc.edu) Quit (Quit: ajnelson)
[1:56] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[2:05] * jiqi (~Bryan@odin000963151.ndc.nasa.gov) Quit (Quit: Leaving)
[2:14] * greglap (~Adium@ has joined #ceph
[3:04] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[3:14] * greglap (~Adium@ Quit (Read error: Connection reset by peer)
[3:24] * sjust (~sam@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[3:52] * phreak- (~phreak@gangsta.nl) Quit (Remote host closed the connection)
[3:52] * LW_ (~jkreger@rrcs-98-101-117-50.midsouth.biz.rr.com) has joined #ceph
[3:54] * LW (~jkreger@rrcs-98-101-117-50.midsouth.biz.rr.com) Quit (Ping timeout: 480 seconds)
[3:58] * Guest206 (~user@ has joined #ceph
[4:00] * phreak- (~phreak@gangsta.nl) has joined #ceph
[4:20] * Guest206 (~user@ Quit (Quit: leaving)
[4:28] * greglap (~Adium@cpe-76-90-74-194.socal.res.rr.com) has joined #ceph
[6:03] * cmccabe5 (~cmccabe@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:05] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:05] * yehudasa (~yehudasa@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:06] * deksai (~deksai@96-35-100-192.dhcp.bycy.mi.charter.com) Quit (Ping timeout: 480 seconds)
[6:07] * terang (~me@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)
[6:09] * gregaf1 (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:09] * terang (~me@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:12] * terang (~me@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)
[6:13] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[6:20] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:20] * terang (~me@ip-66-33-206-8.dreamhost.com) has joined #ceph
[6:29] * Guest3272 (quasselcor@bas11-montreal02-1128536392.dsl.bell.ca) Quit (Quit: No Ping reply in 180 seconds.)
[6:29] * bbigras (quasselcor@bas11-montreal02-1128536392.dsl.bell.ca) has joined #ceph
[6:30] * bbigras is now known as Guest218
[6:35] * yehudasa (~yehudasa@ip-66-33-206-8.dreamhost.com) has joined #ceph
[7:55] * cmccabe5 (~cmccabe@ip-66-33-206-8.dreamhost.com) has joined #ceph
[7:58] * terang (~me@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[8:12] * MarkN (~nathan@ Quit (Quit: Leaving.)
[8:31] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) has joined #ceph
[8:31] * allsystemsarego (~allsystem@ has joined #ceph
[8:33] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) Quit ()
[8:34] * terang (~me@pool-173-55-24-140.lsanca.fios.verizon.net) has joined #ceph
[8:59] * Yoric (~David@ has joined #ceph
[9:21] <jantje> morning
[9:24] <terang> Morning
[9:39] <hijacker> morning
[10:58] * Yoric (~David@ Quit (Quit: Yoric)
[11:07] * johnl (~johnl@cpc3-brad19-2-0-cust563.barn.cable.virginmedia.com) has joined #ceph
[11:46] * Yoric (~David@ has joined #ceph
[12:41] * cmccabe5 (~cmccabe@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[12:46] * yehudasa (~yehudasa@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[12:48] * yehudasa (~yehudasa@ip-66-33-206-8.dreamhost.com) has joined #ceph
[13:09] * Yoric (~David@ Quit (Quit: Yoric)
[13:35] * Yoric (~David@ has joined #ceph
[14:33] * cmccabe5 (~cmccabe@ip-66-33-206-8.dreamhost.com) has joined #ceph
[16:03] * Yoric (~David@ Quit (Quit: Yoric)
[16:06] * Yoric (~David@ has joined #ceph
[16:21] <jantje> Hmm
[16:21] <jantje> I compiled the git tree
[16:21] <jantje> (And I hope I switched correctly to unstable by doing git checkout unstable ?)
[16:22] <jantje> oh no
[16:22] * jantje slaps himself
[17:03] * gregorg_taf (~Greg@ Quit (Quit: Quitte)
[17:16] * deksai (~deksai@96-35-100-192.dhcp.bycy.mi.charter.com) has joined #ceph
[17:40] * greglap (~Adium@cpe-76-90-74-194.socal.res.rr.com) Quit (Quit: Leaving.)
[17:50] * greglap (~Adium@ has joined #ceph
[17:56] <greglap> jantje: everything okay over there? :)
[17:56] <greglap> the immediate cause of your crash problem has (I hope) been fixed in testing and unstable at this point
[17:59] <jantje> greglap: yea, :)
[18:41] * greglap (~Adium@ Quit (Read error: Connection reset by peer)
[18:46] * deksai (~deksai@96-35-100-192.dhcp.bycy.mi.charter.com) Quit (Quit: Leaving)
[18:48] * alexxy (~alexxy@ has joined #ceph
[18:50] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:57] * Yoric (~David@ Quit (Quit: Yoric)
[18:58] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:59] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:02] * cmccabe5 (~cmccabe@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving)
[19:02] * cmccabe (~cmccabe@adsl-76-200-191-1.dsl.pltn13.sbcglobal.net) has joined #ceph
[19:11] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:11] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Read error: Connection reset by peer)
[21:06] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[21:19] * gregorg (~Greg@ has joined #ceph
[21:25] * gregorg (~Greg@ Quit (Quit: Quitte)
[21:48] * michael-ndn (~michael-n@ has joined #ceph
[21:53] * cvaske (~cvaske@dsl-63-249-119-156.static.cruzio.com) has joined #ceph
[21:54] * ooiioo (~ooiioo@ has joined #ceph
[21:59] <cvaske> I'd like to convince my group to try out Ceph for a new filesystem before going to GPFS, can anyone comment on the reliability and performance of single MDS installs?
[22:15] <sagewk> cvaske: hi!
[22:15] <sagewk> oh you sent email too
[22:15] <cvaske> hi sage!
[22:16] <sagewk> the single mds configuration is pretty stable, but i'm not going to make any data loss promises just yet. you shouldn't store any data you can't afford to lose
[22:17] <cvaske> fortunately most of our original data is downloaded from other repositories, but downloading is a significant time cost
[22:18] <cvaske> so we may be a good test case
[22:18] <sagewk> as far as mds performance goes, remember it only handles metadata ops, not file ops (so lookup/rename/unlink/create but not read/write). last time i checked i was seeing ~5-10k updates, ~15k/sec lookups, something along those lines. haven't benchmarked recently on more modern hardware.
[22:18] <sagewk> yeah
[22:19] <sagewk> read throughput will mostly depend on the hardware and how readahead is configured.
[22:20] <cvaske> ok, i think a single MDS would work well then
[22:21] <cvaske> most of our files are huge, and read mostly in order, so I'm guessing that means a large readahead would work fine?
[22:22] <sagewk> yeah
[22:23] <cvaske> btw, when you say "pretty stable" how many problems are reported? or is it that no problems are reported and no code changes are made?
[22:23] <sagewk> the default is like 512k, which often isn't large enough. we haven't tuned any of this yet.
[22:24] <sagewk> it means relatively few problems are reported, and they are mostly in the failure/recovery area, or the unstable code branch
[22:25] <sagewk> what time frame are you looking at?
[22:25] <cvaske> probably within a few weeks, though it's hard to say for sure
[22:27] <cvaske> I'll bring it up with our cluster guy and see how he feels about it. He says he's been keeping his eye on Ceph, so he might want to try it too. Thanks Sage!
[22:28] <sagewk> cool. np! be sure to send him our way if he has any questions
[22:28] <cvaske> will do!
[22:31] * ajnelson (~ajnelson@kresge-37-206.resnet.ucsc.edu) has joined #ceph
[22:56] * cvaske (~cvaske@dsl-63-249-119-156.static.cruzio.com) Quit (Quit: This computer has gone to sleep)
[23:18] <jantje> it's really getting to a wider audience, which is kind of sweeet :)
[23:21] <sagewk> yeah, although maybe not as wide as you think... charlie is a friend from grad school who happened to see my talk at ucsc on tuesday.
[23:26] <terang> maybe it's time to go on a lecturing tour :)
[23:32] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[23:45] * ajnelson (~ajnelson@kresge-37-206.resnet.ucsc.edu) Quit (Quit: ajnelson)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.