#ceph IRC Log


IRC Log for 2012-04-05

Timestamps are in GMT/BST.

[0:04] * TheNim24 (~wircer@ has joined #ceph
[0:04] * TheNim24 (~wircer@ has left #ceph
[0:06] * perplexed_ (~ncampbell@ has joined #ceph
[0:06] * perplexed (~ncampbell@ Quit (Read error: Connection reset by peer)
[0:06] * perplexed_ is now known as perplexed
[0:11] <nhm> yehudasa: does object size have an effect on object distribution?
[0:13] <yehudasa> nhm: it may affect the number of operations we do in order to upload it
[0:13] <yehudasa> since we read/write it in chunks
[0:14] * LarsFronius (~LarsFroni@g231136036.adsl.alicedsl.de) Quit (Quit: LarsFronius)
[0:27] <perplexed> Odd... I just noticed that the import looks to be failing to some degree. Here are the steps I used: 1) Store each image file to a testpool1 2) rados export testpool1 to ./testpool1dir 3) rados import ./testpool1dir to testpool2. When I do a df on the cluster I see that testpool2 is populated with the right number of objects, but the size is 1KB... not the 218MB I'd expect to see. Looks like the import isn't actually happening.
[0:29] <perplexed> The actual commands: "rados export --workers 16 --create testpool1 ./testpool1dir" (works... the new testpool1dir directory is filled with the exported files of the right sizes per file)
[0:31] <yehudasa> perplexed: yeah, I think I see that too, probably rados import is broken
[0:32] <yehudasa> I'll try to figure out why
[0:32] <perplexed> "rados import --workers 16 ./testpool1dir testpool2 --debug-objecter 10 --log-to-stderr 2>&1 | grep op_submit >> testpool2importlog.txt" (shows the ~16,000 objects in the testpool2 pool (rados df), but also indicates the KB for the pool is 1KB)
[0:33] <perplexed> thx. No wonder my write rates looked so good :)
[0:34] <perplexed> ceph 0.43 btw
[0:35] <yehudasa> yeah, I see that on the latest too
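Editor's sketch: the symptom above (right object count, about 1KB of data) can be cross-checked against the export directory before trusting an import. The helper below only sums file sizes on local disk; it assumes, as perplexed describes, that the export directory holds one regular file per object. The function name `exported_bytes` is hypothetical and nothing here talks to Ceph; the totals would still have to be compared by hand with what `rados df` reports for the target pool.

```python
import os


def exported_bytes(export_dir):
    """Return (file_count, total_bytes) for all regular files under export_dir.

    Assumption: each exported object is stored as one regular file somewhere
    below export_dir, so the totals approximate what an import should write.
    """
    count = 0
    total = 0
    for root, _dirs, files in os.walk(export_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                count += 1
                total += os.path.getsize(path)
    return count, total
```

Calling `exported_bytes("./testpool1dir")` gives a local count and byte total; a pool that shows a matching object count but only 1KB of data after the import would confirm that the payloads were not written.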
[0:47] * perplexed (~ncampbell@ Quit (Quit: perplexed)
[0:59] * jluis (~JL@89-181-151-120.net.novis.pt) has joined #ceph
[1:04] * perplexed (~ncampbell@c-76-21-85-168.hsd1.ca.comcast.net) has joined #ceph
[1:05] * joao (~JL@89-181-151-120.net.novis.pt) Quit (Ping timeout: 480 seconds)
[1:25] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:34] * jluis is now known as joao
[1:44] * perplexed (~ncampbell@c-76-21-85-168.hsd1.ca.comcast.net) Quit (Quit: perplexed)
[2:04] * lofejndif (~lsqavnbok@1RDAAAI18.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[2:10] * rturk (~rturk@aon.hq.newdream.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[2:26] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[2:34] <Qten1> anyone know of any doco on how to get the RBD/RADOS working with swift?
[2:35] * yoshi (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:39] <iggy> Qten1: i think i saw something on the mailing list (link to a blog post or somesuch)
[2:40] <gregaf1> they don't really work with swift so much as replace it... what are you trying to do?
[2:48] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:50] <Qten1> rados looks great just looking for a stable platform to use it with
[2:50] <Qten1> i imagine ceph cant be far off being stable either
[2:52] <Qten1> i'm looking to implement a storage solution and i've been using moosefs however the lack of community/support is a bit of a turn off
[2:53] <Qten1> also single meatadata server issue etc,
[2:53] <Qten1> meta even!
[2:56] * joao (~JL@89-181-151-120.net.novis.pt) Quit (Remote host closed the connection)
[3:07] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[3:36] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[4:28] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[4:32] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[5:00] * chutzpah (~chutz@ Quit (Quit: Leaving)
[6:21] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[6:39] * f4m8_ is now known as f4m8
[7:08] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[7:22] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[7:41] * imjustmatthew (~imjustmat@pool-71-176-237-208.rcmdva.fios.verizon.net) has joined #ceph
[8:22] * loicd (~loic@magenta.dachary.org) Quit (Ping timeout: 480 seconds)
[8:48] <Qten1> does ceph know if a chunk is corrupt or not?
[8:52] * yoshi_ (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[8:52] * yoshi (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Read error: Connection reset by peer)
[9:16] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:24] * loicd (~loic@ has joined #ceph
[9:25] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[9:27] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[9:29] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit ()
[9:35] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[9:54] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[10:01] * MarkN (~nathan@ Quit (Quit: Leaving.)
[10:04] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[10:06] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Quit: Lost terminal)
[10:06] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[10:53] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[10:55] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit ()
[11:09] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[11:09] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:29] * yoshi_ (~yoshi@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:34] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[11:46] * adjohn (~adjohn@p1062-ipngn1901marunouchi.tokyo.ocn.ne.jp) Quit (Quit: adjohn)
[12:04] * mtk (~mtk@ool-44c35967.dyn.optonline.net) Quit (Remote host closed the connection)
[12:12] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Quit: LarsFronius)
[12:13] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[12:37] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[12:38] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[12:45] * pruby (~tim@leibniz.catalyst.net.nz) has joined #ceph
[12:48] * pruby (~tim@leibniz.catalyst.net.nz) Quit (Remote host closed the connection)
[12:56] * gregorg (~Greg@ Quit (Ping timeout: 480 seconds)
[12:59] * gregorg (~Greg@ has joined #ceph
[13:19] * iggy (~iggy@theiggy.com) Quit (Ping timeout: 480 seconds)
[13:22] * iggy (~iggy@theiggy.com) has joined #ceph
[13:25] <nhm> good morning all
[13:30] * iggy (~iggy@theiggy.com) Quit (Ping timeout: 480 seconds)
[13:35] * iggy (~iggy@theiggy.com) has joined #ceph
[13:56] * nutz (~nutz@cafe.noova.de) has joined #ceph
[13:56] <nutz> hi all
[13:58] <nutz> i have a question about the inner workings of Ceph: let's say i have a 40MB binary file (e.g. a .ppt) and it is synced, i.e. in Location A and B. now i change a word or two, thereby changing the binary-file obviously. Will Ceph resync 40MB now? or does it use some kind of Deltas?
[14:10] <nhm> nutz: FYI most of the developers probably won't be around for a few hours.
[14:11] <nhm> nutz: it's like 5am out on the west coast. :)
[14:24] <nutz> ;) that's alright.
[14:25] <nutz> thanks for the response though
[14:30] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[14:30] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[14:51] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[14:51] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[15:05] * mtk (~mtk@ool-44c35967.dyn.optonline.net) has joined #ceph
[15:06] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[15:47] * f4m8 is now known as f4m8_
[15:50] * lofejndif (~lsqavnbok@82VAACT8T.tor-irc.dnsbl.oftc.net) has joined #ceph
[15:56] * lofejndif (~lsqavnbok@82VAACT8T.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[15:59] * lofejndif (~lsqavnbok@09GAAEM9P.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:03] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[16:12] * lofejndif (~lsqavnbok@09GAAEM9P.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[16:15] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Read error: Operation timed out)
[16:16] * lofejndif (~lsqavnbok@09GAAENAB.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:19] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[16:36] * adjohn (~adjohn@s24.GtokyoFL16.vectant.ne.jp) has joined #ceph
[16:53] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[16:54] * lofejndif (~lsqavnbok@09GAAENAB.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[17:09] * adjohn (~adjohn@s24.GtokyoFL16.vectant.ne.jp) Quit (Quit: adjohn)
[17:36] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:37] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Read error: Operation timed out)
[17:48] * oliver1 (~oliver@p4FECF9D6.dip.t-dialin.net) has joined #ceph
[17:52] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[17:55] * lofejndif (~lsqavnbok@1RDAAAJ1Y.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:56] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[18:01] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[18:08] * gregphone (~gregphone@66-87-65-176.pools.spcsdns.net) has joined #ceph
[18:09] <gregphone> nutz: all writes are synchronously replicated as written
[18:10] <gregphone> so it'll only transmit the changes over the wire
[18:10] <gregphone> but it's not like Ceph needs to calculate deltas -- only the changed data will be flushed back out from the computer you're editing on
[18:10] <yehudasa> nutz: ceph doesn't send deltas, if you rewrite the entire file, it will all be sent again
[18:14] <nutz> gregphone: i'm not sure i entirely understood your answer. does it contradict yehudasa's ?
[18:14] <NaioN> no
[18:14] <NaioN> say you have that file of 40MB
[18:14] <nutz> so what does usually happen if i hit the save-button? wont it write the whole file again?
[18:14] <NaioN> and you write a new part ranging from byte 1000-2000
[18:15] <nutz> okay - so if i did: echo foo >> 40MBfile, that would only write the new "foo"?
[18:15] <NaioN> then those 1000 bytes will get sent to the first OSD, that OSD will send those 1000 bytes to the second OSD (with replication 2)
[18:15] <yehudasa> usually with msoffice applications it'll write a new complete temporary file, then move it atomically to the original location
[18:15] <nutz> right. so short answer, for MS Office: you'll get a new 40MB sync
[18:15] <yehudasa> nutz: in the second case only 'foo' will be sent
[18:16] <nutz> right
[18:16] <NaioN> well this is behavior of MS Office
[18:16] <Tv|work> fwiw a lot of newer apps treat their files as databases, and overwrite-in-place
[18:16] <yehudasa> nutz: for ms office there's no easy solution without using some content addressable storage, or real ugly hacks (been there, done that)
[18:16] <NaioN> for Ceph it looks like the whole file gets rewritten
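Editor's sketch: the two save patterns contrasted above can be shown with plain local file I/O. This is only an illustration of what an application hands to the filesystem, not of Ceph or Samba internals; the function names are hypothetical. Appending passes along just the new bytes, while the write-temp-then-rename pattern passes the whole file every time, which is why the storage layer underneath sees a full rewrite.

```python
import os
import tempfile


def append_in_place(path, data):
    """Append data to an existing file; return the number of bytes written."""
    with open(path, "ab") as f:
        return f.write(data)


def save_via_temp_and_rename(path, new_contents):
    """Write the complete contents to a temp file, then rename it over path.

    Returns the number of bytes written, which is always the full file size.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            written = f.write(new_contents)
            f.flush()
            os.fsync(f.fileno())  # make sure the data is on disk before renaming
        os.replace(tmp_path, path)  # atomic swap on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return written
```

With a 40MB file, `append_in_place(path, b"foo\n")` writes 4 bytes, whereas `save_via_temp_and_rename` writes all 40MB even if only a word changed. The rename keeps the old file intact until the new one is complete, which is the safety property NaioN mentions later in the log.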
[18:17] <nhm> Tv|work: how long do you think before you have time to get a caching proxy server setup on the test cluster?
[18:17] <nutz> yeah - i figured MS Office is going to continue to give me aggressive butt rashes...
[18:17] <Tv|work> nhm: not working on it at all
[18:18] <Tv|work> nhm: so any time estimates would be bogus
[18:18] <NaioN> nutz: well it's a safety thing
[18:19] <nhm> Tv|work: Ok. I think I better stop downloading collectl from sourceforge. I increased his downloads by about 3000% over the last couple of days. ;)
[18:19] <nutz> NaioN: one way of putting it
[18:20] <NaioN> if you write a new file and at the end you just rename you have less chance of corruption
[18:20] <nutz> yeah i know.. it's just. MS office is so frustrating
[18:22] <nutz> oh one more question though: the scenario i have in mind would be: ceph replication and samba on top, so windows computers would access via CIFS. that would still mean samba will do the "write new file, rename" thing right?
[18:23] <yehudasa> nutz: yes
[18:23] <Tv|work> neither samba nor ceph change what the application does
[18:23] <nutz> right
[18:23] <nutz> okay
[18:23] <nutz> thanks everybody :)
[18:23] <yehudasa> nutz: note that it wouldn't matter anyway.
[18:24] <yehudasa> even if it would have overwritten the same object in some unsafe fushion, it would probably overwrite the entire file
[18:24] <yehudasa> fushion <= fashion
[18:25] <gregphone> did Microsoft abandon their append-log change system? I know that's how it worked at one point
[18:25] <gregphone> anyway sorry for the confusion, I had the wrong context in mind :)
[18:25] <yehudasa> gregphone: have no idea
[18:26] <nutz> again: thanks everyone :)
[18:26] <yehudasa> at a previous project I worked on, we did have a hack for msproject files
[18:26] <Tv|work> perhaps that went away with the xml-based formats? no idea, really.. but even that thing would rewrite the whole file every now and then
[18:27] <Tv|work> if you care about bandwidth, nothing in the samba+ceph stack does magic for you
[18:27] <nutz> yeah.. that's kind of a last resort - putting another layer in between for delta-syncing capabilities
[18:27] <yehudasa> msproject <= msoffice
[18:30] <yehudasa> we mostly only sent deltas and for msoffice files we used a hack to do cross-file delta calculations. Also we never synced the temporary files to the server.
[18:30] <yehudasa> it worked quite well, but was a real hack that was tailored to these environments
[18:31] <yehudasa> wouldn't work with ceph, but if you're using cifs, you can always put some network optimization appliance on top of it
[18:31] <yehudasa> sadly the one we developed is no longer on the market
[18:51] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Read error: Operation timed out)
[18:52] * gregphone (~gregphone@66-87-65-176.pools.spcsdns.net) Quit (Ping timeout: 480 seconds)
[18:54] * bchrisman (~Adium@ has joined #ceph
[18:55] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[19:03] * danieagle (~Daniel@ has joined #ceph
[19:03] * oliver1 (~oliver@p4FECF9D6.dip.t-dialin.net) has left #ceph
[19:04] * chutzpah (~chutz@ has joined #ceph
[19:11] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[19:12] * cattelan_away (~cattelan@c-66-41-26-220.hsd1.mn.comcast.net) has joined #ceph
[19:13] * lofejndif (~lsqavnbok@1RDAAAJ1Y.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[19:13] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[19:21] * joshd (~joshd@aon.hq.newdream.net) Quit (Read error: Connection reset by peer)
[19:35] * Oliver (~oliver1@ip-37-24-160-195.unitymediagroup.de) has joined #ceph
[19:37] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:38] <Tv|work> sagewk: i'm actually not sure what email you meant
[19:38] <nutz> yehudasa: sounds like you really put effort into it
[19:39] <nutz> alright - i gotta run. thanks again boys and girls. take care
[19:39] * nutz (~nutz@cafe.noova.de) Quit (Quit: leaving)
[19:41] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Quit: LarsFronius)
[19:44] <sagewk> the default paths thread.
[19:45] <sagewk> people seem happy with /var/lib/ceph/$type/$id. the question is are we okay with that, and making multicluster people do something uglier if they need it.
[19:45] <sagewk> well, that's my question. :)
[19:47] <yehudasa> sagewk: we certainly need to add $cluster to the config args, whether to set new defaults or not is a different issue
[19:47] <sagewk> yeah
[19:49] * Qten (~Q@ppp59-167-157-24.static.internode.on.net) Quit (Ping timeout: 480 seconds)
[19:49] <Tv|work> sagewk: for some reason i'm not seeing it..
[19:49] <sagewk> if the chef (or whatever) stuff uses something like ceph-conf or ceph-osd --show-config foo to determine where to mount the device, then it doesn't matter.. the default will be fine for most, and multicluster people can include $cluster in there as needed. and that's probably best anyway so that we don't have paths hardcoded in the cookbooks/etc.
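Editor's sketch: the path layout under discussion can be pictured as a template with variables filled in per daemon. The snippet below uses Python's `string.Template`, whose `$name` syntax happens to resemble the `/var/lib/ceph/$type/$id` notation; it is not how Ceph itself expands its config variables. The multicluster layout shown is a hypothetical placement of `$cluster`, since the log does not say where it would go in the path.

```python
from string import Template

# Default layout as quoted in the discussion.
DEFAULT_LAYOUT = Template("/var/lib/ceph/$type/$id")

# Hypothetical layout for multicluster setups; the position of $cluster
# is an assumption made for illustration only.
MULTICLUSTER_LAYOUT = Template("/var/lib/ceph/$cluster/$type/$id")


def data_path(layout, **variables):
    """Fill in a layout template; raises KeyError if a variable is missing."""
    return layout.substitute(**variables)


print(data_path(DEFAULT_LAYOUT, type="osd", id="0"))
print(data_path(MULTICLUSTER_LAYOUT, cluster="backup", type="mon", id="a"))
```

The point sagewk makes is that tooling should ask the daemon or config for the resolved path rather than hardcode either layout, so a site can switch templates without editing its cookbooks.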
[19:49] <sagewk> there were 5-10 emails in the last 12 hrs.. maybe vger dropped you from the list? that happens to me periodically
[19:49] <Tv|work> sagewk: crap
[19:50] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:51] * Qten (Qten@ppp59-167-157-24.static.internode.on.net) has joined #ceph
[19:52] <gregaf1> aww crap, I got dropped too
[19:52] <Tv|work> i received email yesterday, not today
[19:52] <Tv|work> +1 on the ceph authx
[19:53] <yehudasa> cephx
[19:53] <Tv|work> heh
[19:54] <sagewk> re: cephx, we should put a quick doc in there about what it does and does not protect.
[19:54] * loicd (~loic@ Quit (Quit: Leaving.)
[19:54] <Tv|work> yes
[20:00] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:05] * joao (~JL@ has joined #ceph
[20:05] <gregaf1> just caught up to the cephx email thread, and I think if we're going to enable it by default we'll need a LOT of docs about it
[20:06] <gregaf1> obviously the reporting on this is one-sided, but I've seen a lot of people who have trouble configuring their cluster because their keyring etc management is all wrong
[20:06] <gregaf1> and I don't know of anybody who is running it who didn't need walkthroughs from a dev to help them figure out how to use it
[20:07] <iggy> agreed
[20:07] <iggy> especially the stuff about key names seems to catch people a lot
[20:07] <Tv|work> gregaf1: worst case, they specify "ceph auth = none"...
[20:07] <gregaf1> okay, but nobody's going to do that without being told...
[20:08] <gregaf1> what I'm saying is that we're going to see a higher support load and more people are going to have a negative initial experience
[20:08] <gregaf1> unless our documentation suddenly becomes very clear
[20:08] <Tv|work> i think the way forward on that side is chef.. but yeah
[20:08] * LarsFronius (~LarsFroni@g231137242.adsl.alicedsl.de) has joined #ceph
[20:09] <gregaf1> when we have working chef scripts and can say "the supported way to use Ceph is with this tarball of scripts" that will be an answer
[20:10] <nhm> gregaf1: there are a number of sites out there using puppet or cfengine or god knows what too though.
[20:19] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:24] * BManojlovic (~steki@ has joined #ceph
[20:37] <wonko_be> I would opt to keep the authentication part disabled for now for "tutorial walkthrough" ... just as gregaf1 says "more people are going to have a negative initial experience"
[20:37] <wonko_be> certainly as long as the docs aren't up to date
[20:37] <wonko_be> as a first time user, you just want to have everything working asap
[20:37] <wonko_be> and store that first file/block on your ceph cluster
[20:39] <wonko_be> and to make matters worse, i don't believe in the whole authentication thing anyhow, as I intend to run the ceph network completely separate - the only authentication would be for clients accessing the cluster, not the osd's authenticating to the mon's,...
[21:16] <nhm> wonko_be: That's my feeling too.
[21:20] <Tv|work> wonko_be: my only real complaint with that is
[21:20] <Tv|work> the docs being out of date is a question of attitude
[21:20] <Tv|work> and letting things be, will let things be
[21:21] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[21:51] * joao (~JL@ Quit (Ping timeout: 480 seconds)
[22:03] * lofejndif (~lsqavnbok@04ZAACH0A.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:10] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:10] * lofejndif (~lsqavnbok@04ZAACH0A.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[22:13] * lofejndif (~lsqavnbok@28IAADR6Y.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:37] * kloo (~kloo@a82-92-246-211.adsl.xs4all.nl) has joined #ceph
[22:38] <kloo> hi.
[22:38] <kloo> ceph rocks.
[22:38] <sjust> good to hear :)
[22:38] <kloo> except that right now, my ceph-osd processes take an increasing amount of cpu.
[22:38] <yehudasa> ah, the carrot and the stick
[22:38] <kloo> they intermittently run at 100%.
[22:39] <kloo> the cpu bound threads are running hobject_t::encode whenever i take a peek with gdb.
[22:40] <yehudasa> kloo: what's the workload that you're running?
[22:40] <kloo> more like a carrot offered in supplication. :)
[22:40] <kloo> i have two nodes, two OSDs each.
[22:41] <kloo> it's been running for some weeks.
[22:41] <kloo> when i restart a ceph-osd it spends minutes in this cpu-bound state before it's 'up'.
[22:41] <kloo> after that it's intermittent.
[22:42] <sjust> after it starts up, it goes through peering which is fairly cpu intensive
[22:42] <kloo> it looks like this happens when it does PG-related hobject_t::encode stuff.
[22:42] <sjust> were there pgs in backfill?
[22:42] <sjust> actually, what version are you running?
[22:42] <kloo> 0.44.1.
[22:43] <kloo> i've had pgs in backfill but right now all is active+clean and the cpu behaviour still happens.
[22:43] <sjust> kloo: oh, it's probably scrub
[22:43] <kloo> no it says the load is too high to scrub.
[22:43] <sjust> oh...
[22:43] <sjust> hmm
[22:44] <sjust> so no pgs have been in scrub state?
[22:44] <kloo> the osd processes really stall, it's not really usable from the client (cephfs).
[22:44] <sjust> ok, can you post the backtrace from gdb at one of these points?
[22:44] <kloo> it won't scrub now, it scrubbed fine before.
[22:44] * loicd (~loic@magenta.dachary.org) Quit (Ping timeout: 480 seconds)
[22:44] <kloo> i don't have numbers but it appears to be increasingly slow.
[22:45] * yehudasa (~yehudasa@aon.hq.newdream.net) has left #ceph
[22:45] * yehudasa (~yehudasa@aon.hq.newdream.net) has joined #ceph
[22:45] <sjust> the backtrace should give me an idea of what's going on
[22:46] <kloo> yep, grabbing one!
[22:50] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:51] * loicd (~loic@magenta.dachary.org) Quit ()
[22:56] <sagewk> yehudasa: can you look at wip-cluster?
[22:56] <yehudasa> sagewk: sure
[22:59] * rturk (~rturk@aon.hq.newdream.net) has joined #ceph
[23:03] * LarsFronius (~LarsFroni@g231137242.adsl.alicedsl.de) Quit (Quit: LarsFronius)
[23:14] <yehudasa> sagewk: looks good, though it's not completely clear how the two uuid patches are relevant (other than cleanup?)
[23:16] <sagewk> oh, i guess they're not, but they'll be needed for the same chef stuff this will support
[23:35] * asadpanda (~asadpanda@ has joined #ceph
[23:44] <kloo> i think i was focused on the wrong thread, before.
[23:45] <kloo> the high cpu caused me to attach gdb, which caused some peering.
[23:46] <kloo> but with all pgs active+clean and a quiet cluster, the cpu spikes continue to happen.
[23:46] <kloo> it looks to be this: http://pastebin.com/BDgKpkni
[23:47] <kloo> the SyncThread wakes up on a schedule?
[23:49] <gregaf1> kloo: is that a consistent backtrace, or does it change some?
[23:49] <gregaf1> and yes, the SyncThread wakes up on a regular schedule
[23:49] <gregaf1> ...err, actually, it might just be a separate thread for running syncs, I forget
[23:50] <kloo> what i've seen so far is that it's always in DBObjectMap::check (when it's CPU-bound and i attach gdb).
[23:50] <kloo> and it's doing leveldb stuff but details vary a bit..
[23:51] <kloo> the leveldb use is recent, right?
[23:55] <sjust> kloo: yeah, that's my bad
[23:55] <sjust> it's debug code which should be disabled
[23:56] <kloo> the entire check()?
[23:56] <sjust> it's performing a consistency check of the leveldb structures
[23:56] <sjust> yes
[23:56] <kloo> aha.
[23:56] <kloo> has it been removed post-0.44.1?
[23:56] <sjust> no
[23:57] <kloo> could i remove the call to it?
[23:57] <sjust> yeah, but one sec
[23:57] <kloo> i use ceph for some real work but back up all the time..
[23:57] <yehudasa> we'll probably fix that and push 0.44.2
[23:59] <kloo> that'd be great yehudasa, more carrots!
[23:59] <yehudasa> heh

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.