#ceph IRC Log


IRC Log for 2012-06-05

Timestamps are in GMT/BST.

[0:00] * s[X] (~sX]@ppp59-167-154-113.static.internode.on.net) has joined #ceph
[0:02] * jmlowe1 (~Adium@140-182-210-166.dhcp-bl.indiana.edu) has joined #ceph
[0:02] * jmlowe (~Adium@140-182-210-166.dhcp-bl.indiana.edu) Quit (Read error: Connection reset by peer)
[0:10] * lofejndif (~lsqavnbok@09GAAFSAV.tor-irc.dnsbl.oftc.net) has joined #ceph
[0:20] * BManojlovic (~steki@ Quit (Quit: I'm off, you all do whatever you want...)
[0:40] * jmlowe1 (~Adium@140-182-210-166.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[1:25] * lofejndif (~lsqavnbok@09GAAFSAV.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[1:33] * Tv_ (~tv@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:57] * RupS (~rups@panoramix.m0z.net) Quit (Ping timeout: 480 seconds)
[2:04] * The_Bishop (~bishop@p5DC11807.dip.t-dialin.net) Quit (Quit: Who the hell is this Peer? If I catch him, I'll reset his connection!)
[2:23] * The_Bishop (~bishop@cable-86-56-102-91.cust.telecolumbus.net) has joined #ceph
[2:36] * yoshi (~yoshi@p37158-ipngn3901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:42] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[2:44] <jmlowe> ok, I seem to be stuck with some pg's in active+clean+replay
[2:44] <jmlowe> any tips?
[2:46] <jmlowe> nm, I killed the offending osd and restarted, recovery completed
[2:49] * aliguori (~anthony@ has joined #ceph
[2:55] <gregaf> jmlowe: I think that's a fixed bug, but not certain and nobody who would know is at their desk right now... sorry
[3:04] * adjohn (~adjohn@ Quit (Quit: adjohn)
[3:12] * andresambrois (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[3:59] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[4:06] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[4:06] * chutzpah (~chutz@ Quit (Quit: Leaving)
[4:57] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has joined #ceph
[5:07] * joao (~JL@ Quit (Remote host closed the connection)
[5:11] * cattelan is now known as cattelan_away
[5:27] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[5:47] <Qten> has anyone considered trying to run ceph in a virtual machine under kvm in prod to allow us to share the resources of the server for compute?
[6:00] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[6:04] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[7:00] <iggy> Qten: it's been used mostly for testing... virtualizing i/o heavy stuff isn't generally considered a good idea
[7:42] <Qten> iggy: is that mainly a squeeze every last bit of performance issue tho?
[7:47] <liiwi> virtualization can only bring issues in between when you have machines as raw storage containers
[9:11] * s[X] (~sX]@ppp59-167-154-113.static.internode.on.net) Quit (Remote host closed the connection)
[9:27] * BManojlovic (~steki@smile.zis.co.rs) has joined #ceph
[9:53] * adjohn (~adjohn@50-0-133-101.dsl.static.sonic.net) has joined #ceph
[10:18] * adjohn (~adjohn@50-0-133-101.dsl.static.sonic.net) Quit (Quit: adjohn)
[10:43] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[10:51] * ao (~ao@ has joined #ceph
[10:52] * deam_ (~deam@dhcp-077-249-088-048.chello.nl) Quit (Quit: leaving)
[11:18] <Qu310> liiwi: true i suppose just a thought trying to squeeze the budget :)
[11:36] * aliguori (~anthony@ has joined #ceph
[12:04] * yoshi (~yoshi@p37158-ipngn3901marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[12:18] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[12:27] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[12:33] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[13:06] * aliguori (~anthony@ has joined #ceph
[13:22] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) has joined #ceph
[13:25] * s[X]_ (~sX]@ppp59-167-157-96.static.internode.on.net) Quit (Remote host closed the connection)
[13:59] <nhm> good morning #ceph
[14:00] <jmlowe> morning
[14:01] <liiwi> good afternoon
[14:02] * aliguori (~anthony@ Quit (Ping timeout: 480 seconds)
[14:11] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has joined #ceph
[14:17] <nhm> ooh, looks like I made ext4 fall over last night.
[14:18] <darkfader> i bet you wrote same data to it!
[14:19] <darkfader> seriously, you guys could all start a career in filesystem stress testing even if CephFS never got to the stress testing stage :)
[14:22] <nhm> darkfader: Let's hope we can keep working on ceph. ;)
[14:23] <nhm> ok, afk for a while
[14:29] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has left #ceph
[14:29] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has joined #ceph
[14:32] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has left #ceph
[14:32] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has joined #ceph
[15:02] * Theuni (~Theuni@ has joined #ceph
[15:03] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[15:06] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[15:07] * lofejndif (~lsqavnbok@09GAAFTB4.tor-irc.dnsbl.oftc.net) has joined #ceph
[15:09] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[15:10] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[15:27] * lofejndif (~lsqavnbok@09GAAFTB4.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[15:58] * Theuni (~Theuni@ Quit (Quit: Leaving.)
[16:01] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[16:21] * brambles (brambles@ Quit (Remote host closed the connection)
[16:22] * brambles (brambles@ has joined #ceph
[16:42] * ao (~ao@ Quit (Quit: Leaving)
[17:07] * BManojlovic (~steki@smile.zis.co.rs) Quit (Quit: I'm off, you all do whatever you want...)
[17:19] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:25] * joao (~JL@ has joined #ceph
[17:25] <joao> hello people of #ceph
[17:30] * bchrisman1 (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:32] <nhm> morning joao
[17:33] <joao> it does feel like it somehow, but it's nearly 5pm here :x
[17:33] <nhm> joao: hehe
[17:33] <nhm> joao: Must be tough trying to get your sleep schedule worked out!
[17:33] <joao> I'm with a huge internal clock skew
[17:34] <joao> it was so much easier going west... it's amazing
[17:34] <nhm> joao: maybe you should have just kept going the other way
[17:34] <joao> lol
[17:34] <joao> I'm sure things don't work that way :p
[17:35] <joao> on the up side of being back home, is that it's hotter than it was in lA
[17:35] <joao> *LA
[17:35] <nhm> joao: hah, I wish it were cooler here!
[17:35] <joao> the down side is that it's hotter than it was in LA
[17:36] <nhm> I like it right around 22c.
[17:36] <nhm> 18-20c if I'm biking.
[17:37] <joao> yeah, it's ~28C here
[17:41] <elder> nhm, you should be biking. It's 19 C.
[17:42] <nhm> elder: I should be biking.
[17:42] <nhm> elder: Actually, I just had a dumpster delivered, so I should be hauling crap.
[17:42] <joao> elder, he said he likes 18-20C when he's biking, not that he would bike with those temperatures :p
[17:43] <elder> It's a good crap hauling temperature too.
[17:44] <elder> Back later, off to look at code (on paper).
[17:45] <joao> are you implying you printed ceph on paper?
[17:46] <joao> I am now imagining a whole filer containing the daily patches
[18:04] * Tv_ (~tv@aon.hq.newdream.net) has joined #ceph
[18:09] <iggy> yeah, that'd be a ton of paper
[18:13] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[18:14] <nhm> interesting, ext4 is really good at small write performance with ceph when the number of OSDs per node is low. Then it drops way below xfs and btrfs when the number of OSDs per node is high.
[18:15] <Tv_> nhm: but with each osd on a separate filesystem? that's odd
[18:15] <nhm> Tv_: yeah. At 5 OSDs per node ext4 was doing great. At 10, it's sucking it up.
[18:15] <Tv_> nhm: perhaps it just bottlenecks at the IO controller at that point, or something.. submitting too many small operations?
[18:15] <Tv_> where as the others might be batching them up
[18:16] * yehudasa (~yehudasa@aon.hq.newdream.net) Quit (Remote host closed the connection)
[18:16] <Tv_> my naive understanding is ext4 is more likely to "pass the app behaviour through" to the hardware, where as i understand xfs was explicitly written to "manage" the reads and writes
[18:16] <nhm> Tv_: Yeah, not sure. EXT4 blew up the last time I did this test though, so maybe something is breaking.
[18:17] <Tv_> nhm: kernel bug?
[18:18] <nhm> Tv_: yeah, something to do with jbd2_journal_dirty_metadata.
[18:19] <nhm> Tv_: Caused ext4 to get into a bad state and write out bad xattr blocks to ext4 on the boot drive even.
[18:20] <nhm> not sure if it was a coincidence or a result of having so many ext4 OSDs writing at once.
[18:24] * yehudasa (~yehudasa@aon.hq.newdream.net) has joined #ceph
[18:24] <sagewk> nhm: if you have the full stack trace, you should send it to the ext4 list
[18:25] <nhm> sagewk: Ok, will do. I wasn't sure if it was something maybe we already knew about.
[18:26] <joao> sagewk, I assigned this to me: http://tracker.newdream.net/issues/2497
[18:40] <nhm> Tv_: ext4 performance issues seem to be related to the nodes becoming extremely busy with ext4.
[18:40] <nhm> Tv_: haven't looked at what's going on, but during heavy writes the OSD nodes aren't responsive.
[18:41] <nhm> hrm, lots of cpu time spent waiting.
[18:44] <sagewk> joao: ok. we need this on the master branch, btw, can't wait for the new mon stuff
[18:45] <joao> sure thing
[18:50] * lofejndif (~lsqavnbok@82VAAEAGI.tor-irc.dnsbl.oftc.net) has joined #ceph
[18:50] <sagewk> elder: what would be your response if someone said "hey, let's backport the latest rbd code to 2.6.33 (rhel6?)?"
[18:53] * bchrisman (~Adium@ has joined #ceph
[18:54] <gregaf> Qten: Qu310: if you're just using the userspace stuff you can always run the ceph daemons and the compute stuff on the same node without virtualization in the way anywhere
[18:57] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[18:57] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[19:00] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:02] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[19:03] <joao> brb; fixing bugs by rebooting into newly installed packages (hurray!)
[19:04] * joao (~JL@ Quit (Remote host closed the connection)
[19:07] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[19:14] * cattelan_away is now known as cattelan_away_away
[19:19] * lofejndif (~lsqavnbok@82VAAEAGI.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[19:31] <elder> joao yes I have a strong preference for paper printouts when reviewing code. But only a source file at a time, generally.
[19:32] <elder> nhm is the ext exhausting memory or something? Or CPU? Have you looked at that?
[19:33] <elder> sagewk, I'd say I'd have to try building it once to see what breaks. Then I'd have a better idea of whether it would be very hard.
[19:33] <nhm> elder: lots of cpu wait time
[19:33] <nhm> elder: haven't looked closer at it yet, trying to get some more tests setup to run and once those are going I'll have more time to look.
[19:33] <elder> You mean blocked? What do you mean by CPU wait?
[19:35] <Tv_> vmstat column 2 i guess
[19:36] <Tv_> a better metric for that is queue depth, but i forget how to see that
[19:36] <nhm> elder: waiting on IO.
[19:36] <nhm> Tv_: collectl will report that I think when I go back and replay the run.
[19:36] <elder> OK. I presume all the OSD's are distinct spindles so they aren't conflicting in that way.
[19:37] <Tv_> the controller might suck though
[19:37] <nhm> elder: 10 drives, 10 OSDs, 1 disk per OSD/Journal pair.
[19:38] <nhm> Tv_: could be. I noticed it couldn't handle lots of IOs without WB cache on (which is now on for everything).
[19:39] <elder> Do the OSD and journal have an interaction with each other? (Sorry, I'm just not very up on that side of things.)
[19:44] <nhm> elder: Beyond simply competing for resources, I'm not sure.
[19:45] <nhm> elder: certainly in some fashion, but I don't yet know the behavior changes for each FS.
[19:45] <nhm> IE I imagine we do things slightly differently with btrfs vs ext4/xfs.
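
Editor's note: the "CPU wait" in the exchange above is time the CPUs sit idle while IO is outstanding. A minimal sketch of how that share could be measured on Linux follows, assuming two snapshots of /proc/stat taken a few seconds apart. The function names are hypothetical and this is not the tooling used in the channel (nhm mentions collectl); it relies only on the documented field order of the aggregate "cpu" line: user, nice, system, idle, iowait, irq, softirq, steal.

```python
def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly 'cpu'; per-core lines are 'cpu0', 'cpu1', ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [b - a for a, b in zip(before, after)]
    if len(deltas) < 5:
        raise ValueError("cpu line has no iowait field")
    # Sum only the first eight fields: guest time is already counted in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Index 4 is iowait (user=0, nice=1, system=2, idle=3, iowait=4).
    return 100.0 * deltas[4] / total
```

In practice the two arguments would be the contents of /proc/stat read at the start and end of a sampling interval during a heavy write run; a persistently high value would be consistent with nhm's observation that the OSD nodes are waiting on IO rather than computing.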
[19:46] * joao (~JL@ has joined #ceph
[19:48] * monrad-51468 (~mmk@domitian.tdx.dk) Quit (Quit: bla)
[20:03] * monrad-51468 (~mmk@domitian.tdx.dk) has joined #ceph
[20:10] * monrad-51468 (~mmk@domitian.tdx.dk) has left #ceph
[20:11] <wido_> hey
[20:11] <wido_> who is in charge of ceph.com? Ross?
[20:13] <dmick> wido_: yes
[20:15] <wido_> dmick: Thanks! I lost his card though. ross@ceph.com? or ross@inktank.com?
[20:15] <dmick> ross.turk@inktank.com will work
[20:17] <wido_> great
[20:29] * BManojlovic (~steki@ has joined #ceph
[20:54] * tjikkun (~tjikkun@82-169-255-84.ip.telfort.nl) Quit (Remote host closed the connection)
[20:57] * CristianDM (~CristianD@host93.190-227-49.telecom.net.ar) has joined #ceph
[20:57] <CristianDM> Hi.
[20:57] * adjohn (~adjohn@ has joined #ceph
[21:09] * tjikkun (~tjikkun@82-169-255-84.ip.telfort.nl) has joined #ceph
[21:13] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[22:14] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:15] <gregaf> elder: so you just want the make output from somebody checking out 2.6.33 and cherry-picking all the rbd patches?
[22:23] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Read error: Connection reset by peer)
[22:23] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[22:32] * gregorg (~Greg@ Quit (Ping timeout: 480 seconds)
[23:03] * Ryan_Lane (~Adium@dslb-188-106-098-138.pools.arcor-ip.net) has joined #ceph
[23:11] * gregorg (~Greg@ has joined #ceph
[23:26] * rturk (~rturk@aon.hq.newdream.net) has joined #ceph
[23:32] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[23:33] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) has joined #ceph
[23:35] <elder> gregaf, is it frightening?
[23:35] <gregaf> elder: haven't tried, just wanted to make sure I knew what you meant :)
[23:36] <gregaf> Sage thinks even that will be a little bit of work since part of the rbd kernel integration was separating libceph from the cephfs code
[23:36] <gregaf> I'm happy to do it, but not today
[23:37] * szaydel (~szaydel@c-67-169-107-121.hsd1.ca.comcast.net) Quit ()
[23:37] <elder> Oh, well yes, I guess that's about what I'd want to try out. I've done this sort of thing before and sometimes the way to go about it is to start with what's there and have a patch series that you fix, one at a time, that brings it up-to-date with current code.
[23:37] <gregaf> (2.6.33 is interesting because... RHEL6! *sigh*)
[23:38] <gregaf> I know that cephfs is unfeasible, but I'm hoping that the block interfaces have changed less over that time
[23:43] <Tv_> gregaf: uh, rhel+patch is almost as unlikely to happen as rhel+new kernel
[23:43] <gregaf> module?
[23:43] <Tv_> compiled separately?
[23:44] <gregaf> sure?
[23:44] <Tv_> unngh
[23:44] <Tv_> make sure you say that when you ask ;)
[23:44] <gregaf> <-- has no idea about this, thus asking about it last week :p
[23:44] <Tv_> we don't have any build support for that
[23:44] <Tv_> usually that would mean a separate source distribution
[23:44] <Tv_> i don't know if you could take a linux.git and compile for *another* build of the kernel using the whole source tree
[23:45] <Tv_> when i last looked, i think the makefiles looked different for the different cases
[23:45] <Tv_> but it's more work than "make it work with 2.6.33"
[23:45] <gregaf> I thought when you set up the build that you could choose to build modules as part of either the integrated kernel or as a module that works with it
[23:45] <gregaf> Y/N/M
[23:45] <Tv_> also, rhel isn't upstream 2.6.33 either
[23:45] <gregaf> yes, I realize that
[23:46] <Tv_> gregaf: that doesn't mean you could load that into *another* build of the same source
[23:46] <gregaf> ah
[23:46] <Tv_> well it kinda does, but you have to be careful
[23:46] <Tv_> and so does the person who built that kernel
[23:46] <Tv_> and carefulness costs time
[23:46] <gregaf> I really don't know much about this; I assume it's possible since people distribute closed-source kernel modules
[23:46] <Tv_> hence, this is way more than "backport to 2.6.33"
[23:47] <Tv_> it's not about whether it's possible, it's just the amount of work related to it
[23:47] <gregaf> it's not a zero amount of work, but customers and users are interested so it would be nice if we could at least say "we looked into it and it will cost $50k, would you like to provide those resources?"
[23:47] <Tv_> i mean, rhel is an interesting target, but you need to state the whole problem when you ask people for how much work it'd be
[23:48] <Tv_> something like: 1. backport to 2.6.33; 2. sideport on top of rhel kernel; 3. build as a standalone module
[23:48] <gregaf> keep in mind this conversation got filtered through a couple people and wasn't all on irc :)
[23:48] <Tv_> #3 would need a little experiment to see whether it's likely to work
[23:49] <Tv_> i've forgotten the details but modversioning used to be a fight.. something a hash of the abi that has to match etc
[23:49] <Tv_> *something like
[23:49] <Tv_> also, build environments for all this, qa

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.