#ceph IRC Log


IRC Log for 2011-02-04

Timestamps are in GMT/BST.

[0:11] * jantje (~jan@paranoid.nl) has joined #ceph
[0:33] <Tv|work> anyone at the office really good with mysql and/or django?
[0:33] <Tv|work> "Access denied; you need the SUPER privilege for this operation" grumble grumble
[0:39] <Tv|work> hur dur DEFINER=.. SQL SECURITY DEFINER
[0:39] <Tv|work> i hate sql
[0:42] <gregaf> Tv|work: ask Kyle who's next to you, maybe
[0:43] <gregaf> I think all our panel guys are pretty good at sql stuff and I noticed him talking about django a few days ago
[0:43] <Tv|work> i know what's happening, now.. dh hosting doesn't give you SUPER access to mysql
[0:43] <Tv|work> i can work around this
[0:43] <gregaf> okay then
[0:43] <gregaf> <— is not good at sql stuff or django
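The SUPER-privilege error Tv|work hit usually comes from `DEFINER=...` clauses in a mysqldump file; a common workaround on hosts that don't grant SUPER is to strip them before importing. A sketch (the exact patterns depend on how mysqldump formatted the dump, so check them against your file):

```shell
# strip_definers: filter a mysqldump stream so it can be restored by a user
# without the SUPER privilege. Removes DEFINER=... clauses and downgrades
# SQL SECURITY DEFINER to INVOKER. Patterns are a sketch, not a guaranteed
# recipe; verify against your actual dump.
strip_definers() {
    sed -e 's/DEFINER=[^ ]* //g' \
        -e 's/SQL SECURITY DEFINER/SQL SECURITY INVOKER/g'
}

# Typical use (host/user/db names are placeholders):
#   mysqldump --routines mydb | strip_definers | mysql -h host -u user -p mydb
```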
[1:28] * polarix (~polarix@189-54-103-155-nd.cpe.vivax.com.br) has joined #ceph
[1:32] * polarix (~polarix@189-54-103-155-nd.cpe.vivax.com.br) Quit (Quit: Leaving)
[1:58] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[2:02] * bencherian (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph
[2:09] * bcherian (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[2:11] * bcherian (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph
[2:17] * jantje (~jan@paranoid.nl) Quit (Read error: Connection reset by peer)
[2:17] * jantje (~jan@paranoid.nl) has joined #ceph
[2:18] * bencherian (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Read error: Operation timed out)
[2:18] * Juul (~Juul@static.88-198-13-205.clients.your-server.de) Quit (Quit: Leaving)
[2:36] * bcherian (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[3:27] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[4:41] * baldben (~bencheria@cpe-76-173-232-163.socal.res.rr.com) has joined #ceph
[7:03] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[7:31] * cmccabe (~cmccabe@c-24-23-253-6.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:25] * gregorg_taf (~Greg@ Quit (Quit: Quitte)
[8:25] * gregorg (~Greg@ has joined #ceph
[8:53] * MK_FG (~MK_FG@ Quit (Quit: o//)
[8:54] * MK_FG (~MK_FG@ has joined #ceph
[10:01] * hijacker_ (~hijacker@ has joined #ceph
[10:01] * hijacker (~hijacker@ Quit (Read error: Connection reset by peer)
[10:49] * verwilst (~verwilst@router.begen1.office.netnoc.eu) has joined #ceph
[11:16] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[12:13] * DJLee (82d8d198@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[12:22] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Read error: No route to host)
[13:08] * Yoric (~David@ has joined #ceph
[15:35] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[17:09] * greglap (~Adium@ has joined #ceph
[17:17] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) has joined #ceph
[17:20] <greglap> Anticimex: you around?
[17:24] * baldben (~bencheria@cpe-76-173-232-163.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[17:52] * greglap1 (~Adium@ has joined #ceph
[17:52] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:59] * greglap (~Adium@ Quit (Ping timeout: 480 seconds)
[18:08] * verwilst (~verwilst@router.begen1.office.netnoc.eu) Quit (Quit: Ex-Chat)
[18:20] <wido> greglap1: do you have some RBD images running?
[18:23] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:24] <greglap1> wido: me personally? or us in the office?
[18:24] <johnl> hey, I have a stuck kernel client mount. any advice on debugging/bug reporting?
[18:24] * julienhuang (~julienhua@pasteur.dedibox.netavenir.com) Quit (Quit: julienhuang)
[18:24] <Anticimex> greglap1: i am, but not available for debugging atm
[18:24] <Anticimex> @work (non-ceph work)
[18:25] <greglap1> Anticimex: okay, it's just I looked through those logs and it sure looks like the ceph client is writing data for about 1.5 minutes and then goes idle — it keeps exchanging messages but they're all heartbeat-type stuff
[18:26] * baldben (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:26] <greglap1> johnl: is the mount stuck mounting, or unmounting, or...?
[18:26] <johnl> mounted, was writing lots of little files.
[18:26] <johnl> worked for hours and hours and then stopped. write process in D state. no load on cluster
[18:26] <johnl> the client is from the standard 2.6.37 tree.
[18:27] <greglap1> what's the cluster status?
[18:27] <greglap1> all OSDs and MDSes are up?
[18:28] <johnl> all the osds were still up
[18:28] <johnl> pretty certain all the mons were still up, as was the mds process
[18:29] <johnl> I restarted all ceph cluster processes to try and get the client to kinda wake up
[18:29] <greglap1> heh
[18:29] <johnl> so the cluster is now definitely good
[18:29] <johnl> client is still stuck though.
[18:30] <greglap1> there's probably a request that's hanging or got lost somehow that the client is waiting on
[18:30] <greglap1> do you have any logging enabled?
[18:30] <johnl> the mdsc file in /sys/kernel/debug/id shows 3 entries
[18:31] <johnl> unmounting is denied.
[18:31] <johnl> unfortunately not got logging on no :/
[18:31] <greglap1> not on the MDS either?
[18:31] <johnl> no :(
[18:31] <greglap1> bummer
[18:31] <johnl> was benchmarking, so didn't want to slow it down
[18:31] <greglap1> what are the entries in /sys/kernel/debug?
[18:31] <johnl> 820580 mds0 lookup #1000007bf67/m6fd (bigfiles4/m6fd)
[18:31] <johnl> 820582 mds0 readdir #1
[18:31] <johnl> 820584 (no request) getattr #1
[18:32] <johnl> http://pastebin.com/NFL7mQfj
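The files johnl is reading live under the kernel client's debugfs tree; a small helper to dump all in-flight MDS and OSD requests might look like this (paths follow the log's `/sys/kernel/debug` location; requires root and debugfs mounted):

```shell
# show_ceph_requests: print the in-flight MDS (mdsc) and OSD (osdc) request
# lists that the ceph kernel client exposes under debugfs. Each mounted
# client instance gets its own subdirectory.
show_ceph_requests() {
    root=${1:-/sys/kernel/debug/ceph}
    for f in "$root"/*/mdsc "$root"/*/osdc; do
        [ -e "$f" ] || continue   # skip unmatched globs
        echo "== $f =="
        cat "$f"
    done
}
```

An empty mdsc/osdc means nothing is pending on that client; long-lived entries like the three above are the requests a stuck mount is blocked on.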
[18:32] <Anticimex> greglap1: any clue why the sync doesn't terminate?
[18:32] <Anticimex> i got more fail yesterday when turning up the cluster as well, which i will have to wait a few days before i can investigate further
[18:33] <Anticimex> eg, on more than one box, with multiple of everything
[18:33] <johnl> dmesg output on the client: http://pastebin.com/FVtnfCbV
[18:34] <johnl> tbh, I'd be happy with a way to kick the client so it continues/disconnects/reconnects
[18:34] <greglap1> mmm, I'm trying to think if there's any useful way to get info out of the MDS
[18:35] <greglap1> I guess if you gdb attach and "info threads all" we can check if there's anything obviously broken there
[18:35] <greglap1> but I think we'd need logs to figure it out
[18:35] <johnl> I've unfortunately restarted the mds since :/
[18:35] <johnl> was hoping the client would notice and reconnect
[18:35] <greglap1> oh, right
[18:36] <johnl> even tried to lazy unmount and mount a new one - seems to use the same connection
[18:36] <greglap1> I think it did reconnect
[18:36] <greglap1> those are the "mds0 reconnect start" etc lines
[18:36] <johnl> I think the mount was still working after that point
[18:36] <greglap1> the problem is the client is waiting for a request to finish on the MDS and for some reason the MDS isn't finishing them
[18:37] <greglap1> and the client can't continue its work without those, ugh
[18:37] <johnl> there were a couple of hours of "libceph: tid 405965 timed out on osd1, will reset osd" messages and then it stopped
[18:37] <greglap1> a couple hours worth?
[18:37] <greglap1> the OSD was up that whole time?
[18:39] <johnl> there are 4 OSDs
[18:39] <johnl> looks like connections to 3 of them were timing out quite regularly
[18:39] <greglap1> connections timing out is normal if there's no traffic
[18:39] <greglap1> if there's a specific tid timing out that's unusual
[18:40] <johnl> every minute for 2 hours
[18:40] <greglap1> that tid timed out every minute for 2 hours?
[18:40] <johnl> same set of tids
[18:40] <greglap1> weird
[18:40] <johnl> lemme pastebin
[18:40] <greglap1> hrmm
[18:40] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[18:41] <johnl> http://pastebin.com/V7y45Ku8
[18:41] <greglap1> sorry, I'm at the train station and gotta run, but put it up and somebody'll get it later
[18:41] <johnl> k thanks bye!
[18:41] <greglap1> sjust should be in soon, poke him
[18:41] <greglap1> got a meeting for part of the day starting in 20 minutes, later
[18:41] * greglap1 (~Adium@ Quit (Quit: Leaving.)
[18:48] <sagewk> johnl: we were seeing similar messages due to bad socket error handling.. the latest ceph-client.git master branch has the fix. any chance you can try that?
[18:48] <sagewk> was this a reproducible thing, or is this the first time you've seen it?
[19:21] * Yoric (~David@ Quit (Quit: Yoric)
[19:47] * uwe (~uwe@mb.uwe.gd) has joined #ceph
[19:52] <uwe> i guess this question has been asked a few times already (and it's in the FAQ) - can I use ceph for a production mail backend? (I just hope that the information in the FAQ is outdated)
[19:52] <darkfader> oooh
[19:53] <darkfader> i'd not do that yet, unless the users will accept the downtime for a restore. normally you wouldn't need it, but ...
[19:54] <uwe> :(
[19:54] <darkfader> i think it is safe to assume it'll be stable sometime this year ;)
[19:54] <uwe> how often do these things happen?
[19:55] <Tv|work> "Allocating 'testvm-001.im 3% | | 766 MB 262:02 ETA"
[19:55] <darkfader> uwe: normally never
[19:55] <Tv|work> i want that COW support in kvm :(
[19:55] <uwe> darkfader: sure :) but how many times was it reported in the last month? just to give me an estimate how bad things are - and if you say it will be ready this year it shouldn't happen too often
[19:55] <Tv|work> uwe: ceph is not production quality yet, but we are working towards that..
[19:56] <Tv|work> hey, we recommend btrfs, that's not production quality either, and it takes about a year for something like btrfs to stabilize (based on lessons from ext{2,3,4})
[19:56] <uwe> later this year is just a few months too late, I need a system to hold about 100T of mail data for an archive and one with around 10-15T for the "live system" - and ceph really has just everything I need :)
[19:57] <darkfader> but thats not stuff you put on something experimental
[19:57] <darkfader> unless you wanna get a new job ;)
[19:57] <uwe> :)
[19:58] <Tv|work> oh wow that clone i pasted earlier jumped from 200 minute ETA to done.
[19:58] <Tv|work> something's funky about that estimator
[19:58] <uwe> there aren't many options around, we tested lustre a lot but I wont expect any support/development in the near future
[19:58] <darkfader> i got the same problem, ceph would be my best bet for VM storage, but i rather stay careful. i'll put stuff like the distro mirrors and stuff like that on it
[19:59] <Tv|work> uwe: the only real way to do it with current stuff is sharding and managing that sharding layer
[19:59] <uwe> still, thanks for the answers - helped a lot in the decision
[20:00] <uwe> sharding? you mean like having a database which points me to the right server or something like that?
[20:00] <darkfader> uwe: ceph will be much better for getting iops than stuff like lustre could do
[20:00] * Tv|work notes IMAP is not distributable for a single account anyway
[20:00] <darkfader> maybe just grab a solaris box with l2arc and replicate to a second one
[20:00] <Tv|work> you end up with some kind of sharding, or heavy locking, anyway
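The sharding Tv|work describes is essentially a stable mapping from mailbox to backend server. A minimal sketch (the checksum choice and shard count are illustrative assumptions, not a recommendation; a real layer also needs a rebalancing/migration story):

```shell
# mail_shard: map a mailbox/user name to one of N backend servers using a
# stable checksum, so the same user always lands on the same shard.
# cksum is used here only because it is POSIX and deterministic.
mail_shard() {
    user=$1
    nshards=${2:-4}
    sum=$(printf '%s' "$user" | cksum | cut -d' ' -f1)
    echo $(( sum % nshards ))
}

# Example: route delivery for "alice" to one of 4 stores:
#   server="mailstore$(mail_shard alice 4)"
```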
[20:00] <darkfader> use that for 2 years and then jump to ceph
[20:02] <uwe> I need access from at least 2 servers, to build a text index on one and handle the users on the other (or preferably a few more of course)
[20:02] <uwe> and that can't wait 2 years, it needs to be in production around Q2/Q3
[20:03] <darkfader> yeah well, first use what's available and then switch to what's great :)
[20:03] <uwe> :)
[20:03] <uwe> let's see if that's still an option in 2 years :)
[20:06] <darkfader> you mean you wont be getting the downtime for migrating then because all disks will be humming?
[20:08] <uwe> the downtime is not the biggest problem, the problem will be getting time for developing another concept and testing resources
[20:08] <uwe> time/money :)
[20:11] <darkfader> maybe you can argue by saving some time/money now... good point anyway. think about how long you'll keep running the initial setup
[20:13] * cclien (~cclien@ec2-175-41-146-71.ap-southeast-1.compute.amazonaws.com) Quit (Remote host closed the connection)
[20:14] * cclien (~cclien@ec2-175-41-146-71.ap-southeast-1.compute.amazonaws.com) has joined #ceph
[20:34] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[20:34] <gregaf> johnl: looked at your last pastebin
[20:43] <bchrisman> is there a /proc or other file I can echo something into to turn on ceph kernel client debugging?
[20:44] <bchrisman> I'm trying to look at dout output in the code.
[20:46] <wido> bchrisman: there is some stuff in /sys I think
[20:48] <wido> gregaf: No, I didn't mean you personally. But I've been using the RBD VMs, and the performance is pretty low. A sequential read is OK, but small or random writes are still slow
[20:49] <wido> And since you took a look at my iostat numbers I was wondering if you saw the same
[20:49] <bchrisman> wonder if I didn't compile my kernel correctly.. I see ceph stuff in /sys/module only.
[20:52] * baldben (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[20:54] <wido> bchrisman: I'm not sure, haven't touched the fs lately
[20:56] <bchrisman> cool.. I'll ask on ML and then put up a page on the wiki
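On kernels built with CONFIG_DYNAMIC_DEBUG, the kernel client's dout/pr_debug messages can usually be switched on through debugfs rather than /proc. A sketch (module names assume the 2.6.37-era ceph/libceph split seen earlier in this log; output lands in dmesg/syslog):

```shell
# enable_ceph_kdebug: enable pr_debug/dout output for the ceph kernel client
# via the kernel's dynamic debug interface. Requires root, debugfs mounted,
# and CONFIG_DYNAMIC_DEBUG; messages then appear in the kernel log.
enable_ceph_kdebug() {
    ctl=${1:-/sys/kernel/debug/dynamic_debug/control}
    echo 'module ceph +p'    > "$ctl"   # the filesystem client
    echo 'module libceph +p' > "$ctl"   # the shared messenger/OSD layer
}
```

Use `-p` instead of `+p` in the same writes to turn the messages back off.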
[20:59] <wido> joshd: I've got some inconsistent pgs again! But this time with logs
[21:04] * gregorg_taf (~Greg@ has joined #ceph
[21:04] * gregorg (~Greg@ Quit (Read error: Connection reset by peer)
[21:04] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[21:22] <gregaf> johnl: is your cluster still in that state where the client is hanging on those requests, or have you done something with it?
[21:22] <gregaf> osd requests timing out repeatedly like that is not something we've seen before and is probably evidence of a new/different issue
[21:24] <sjust> wido: cool, where can I find them?
[21:29] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[21:34] <wido> sjust: Can you send me your public key?
[21:34] <wido> I'll upload them to my logger machine
[21:35] <sjust> wido: email?
[21:35] <wido> wido@widodh.nl
[21:37] <sjust> wido: sent
[21:39] <wido> sjust: great. Hope the greylisting doesn't cause any problems
[21:39] <wido> sjust: Nope, got it. You should be able to log on to root@logger.ceph.widodh.nl
[21:40] <sjust> thanks!
[21:40] <wido> I'm uploading the files right now to /srv/ceph/issues/osd_inconsistent_pg
[21:40] <wido> osd.0.log.gz is a capture of my syslog (I'm logging to syslog)
[21:40] <wido> there should also be a txt file with my output from 'ceph -s' and 'ceph -w'. Should all be there in about 5 min
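For reference, wido's setup (OSD output to syslog, verbose enough to debug the inconsistent pgs) corresponds to a ceph.conf fragment along these lines. A sketch with option names from this era of ceph; verify against your build's documentation:

```ini
; send daemon logging to syslog instead of log files, with verbose OSD
; debugging (high debug levels are costly; enable only while reproducing)
[osd]
    log to syslog = true
    debug osd = 20
    debug filestore = 20
```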
[21:40] <sjust> ok
[21:43] * baldben (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph
[21:53] <wido> sjust: You got the logs?
[21:54] <sjust> yeah, about to start looking them over
[21:54] <wido> Great, i'll be afk in a minute
[21:54] <sjust> ok
[21:54] <wido> sjust: From logger you could do "ssh noisy"
[21:55] <wido> that is the machine which is having those issues, just saw that another pg became inconsistent
[21:55] <sjust> ok
[21:55] <wido> Feel free to log in, it's a test cluster :)
[21:56] <sjust> ok :)
[22:51] * baldben (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[23:05] * cmccabe (~cmccabe@ has joined #ceph
[23:08] * baldben (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph
[23:13] * bcherian (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph
[23:14] * uwe (~uwe@mb.uwe.gd) Quit (Quit: sleep)
[23:20] * baldben (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[23:20] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) has joined #ceph
[23:21] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[23:37] * bcherian (~bencheria@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[23:47] * prometheanfire (~mthode@mx1.mthode.org) has joined #ceph
[23:47] * bcherian (~bencheria@ip-66-33-206-8.dreamhost.com) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.