#ceph IRC Log


IRC Log for 2012-09-18

Timestamps are in GMT/BST.

[0:00] <joshd> the osd might have become overloaded and killed itself due to its disk taking too long, for example
[0:00] <joshd> it's best to check the log first
[0:02] <cblack101> http://mibpaste.com/ph6qbb - grep of the log is in this paste, looks like lots of entries...
[0:02] <cblack101> good news is it still maintained 30k IOPS (4k block writes) even after the failure
[0:04] <cblack101> perhaps I was pounding on it a bit too hard to start off... :-D
[0:04] <Leseb> hi guys
[0:05] <Leseb> am I the only one who can't import a file to a specific pool? It always points to the pool "rbd"
[0:05] <joshd> cblack101: so the end of osd.27's log (in /var/log/ceph/ceph-osd.27.log) will tell us why it failed... but yeah, 30k IOPS is very nice
[0:06] <joshd> Leseb: just fixed that bug - previous versions of 'rbd import' ignore --pool and only obey --dest-pool
[0:09] <Leseb> joshd: like so? rbd --dest-pool nova import
[0:09] <Leseb> ?
[0:09] <joshd> yeah
[0:09] <Leseb> hum, doesn't seem to work either
[0:09] <Leseb> It returns: error opening pool rbd: (2) No such file or directory
[0:09] <Leseb> same as the -p or --pool option
[0:10] <joshd> what about rbd import file nova/image?
[0:10] <dmick> cblack101: ah, that's the osd dump output; down_at is throwing the grep off
[0:11] <joshd> the fix is in the stable branch, so it'll be in 0.48.2
[0:11] <dmick> osd.27 is the culprit
[0:11] <joshd> I've got to run now though - see you tomorrow
[0:11] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[0:11] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (Quit: Leaving.)
[0:11] <dmick> go check out his logs (likely in /var/log/ceph/osd.27.log or something like it, if you didn't set them in the ceph.conf)
[0:12] <dmick> Leseb: I'll look at the source quick
[0:12] * slang (~slang@2607:f298:a:607:95d0:c72c:d47:7351) has joined #ceph
[0:12] <Leseb> dmick: thanks! :)
[0:13] <cblack101> @dmick looks like 27 is a zero byte file: -rw-r--r-- 1 root root 0 Sep 17 06:29 osd.27.log
[0:13] <cephalobot> cblack101: Error: "dmick" is not a valid command.
[0:13] <cblack101> dmick: looks like 27 is a zero byte file: -rw-r--r-- 1 root root 0 Sep 17 06:29 osd.27.log
[0:13] <dmick> ! that's odd. It had started originally, right?
[0:14] <Tv_> cephalobot: but if "dmick" *was* a valid command, what would it do?
[0:14] <cephalobot> Tv_: Error: "but" is not a valid command.
[0:14] <dmick> :q
[0:14] <dmick> oops
[0:14] <elder> You mean :wq
[0:14] <cblack101> the only place I see a non zero byte file is on the host that osd.27 sits on
[0:14] <dmick> no writes, nobody gets hurt
[0:14] <dmick> cblack101: ? that's the only place that should have a log at all
[0:15] <dmick> but yeah, that's the one you want to peruse
[0:17] <slang> Tv_: if this were a mud you could make "dmick" do just about anything..
[0:17] <dmick> Tv_: complain
[0:18] <nhmlap_> slang: Hey, do you know Ravi Madduri?
[0:18] <dmick> cblack101: so anything interesting in that log file on the osd.27 host?
[0:19] <slang> nhmlap_: yep - used to work with him on Globus
[0:19] <cblack101> dmick: yes, unfortunately nothing... looking at the gz files
[0:19] <nhmlap_> slang: I used to be the lead for the Minnesota Supercomputing Institute's Globus/Grid computing effort.
[0:20] <slang> nhmlap_: I'm sorry
[0:20] <nhmlap_> slang: That's what I was going to say to you!
[0:20] <nhmlap_> slang: I see we are going to get along just fine. ;)
[0:21] <cblack101> dmick: interesting, nothing in there except last saturday, health is OK now from ceph -s
[0:21] <cblack101> osdmap e58: 48 osds: 47 up, 47 in - looks like we still have one down now
[0:22] <dmick> yeah
[0:22] <cblack101> ok, will look at this more tomorrow before we start on the OpenStack stuff, and go check for red lights
[0:22] <nhmlap_> slang: anyway, Ravi and I ended up knowing each other through cagrid work.
[0:23] <cblack101> what line in that log paste tipped you off to osd.7?
[0:23] <nhmlap_> slang: Which is this crazy thing that the NIH built on top of globus.
[0:24] <dmick> osd.27
[0:24] <dmick> was the one marked 'down'
[0:24] <dmick> ceph osd dump | grep down | grep -v down_at
[0:25] <cblack101> got it "down out weight 0" is the indicator then
[0:26] <cblack101> dmick: have a good afternoon, ttyl! :-) & thanks again!
[0:28] <Leseb> dmick: still busy?
[0:30] <dmick> cblack101: down and out and weight 0 are all different
[0:30] <dmick> but yeah, sorta
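
(For reference, a sketch of the check dmick describes; the output format is approximate for an argonaut-era cluster, and osd.27 is just the example from this conversation.)

    ceph osd dump | grep down | grep -v down_at
    # a failed osd shows up roughly as:
    #   osd.27 down out weight 0 ...
    # "down" means the daemon is not running/heartbeating, "out" means it has been
    # removed from data placement, and "weight" is its CRUSH weight; an osd can be
    # down but still "in", so the three are independent, as dmick notes.
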
[0:30] <dmick> Leseb: sorry, queued up :)
[0:30] <Leseb> dmick: np np :)
[0:32] * cblack101 (c0373725@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[0:32] * slang (~slang@2607:f298:a:607:95d0:c72c:d47:7351) Quit (Quit: Leaving.)
[0:32] * mrjack_ (mrjack@office.smart-weblications.net) Quit (Ping timeout: 480 seconds)
[0:32] * mrjack (mrjack@office.smart-weblications.net) Quit (Ping timeout: 480 seconds)
[0:33] * slang (~slang@38.122.20.226) has joined #ceph
[0:33] <slang> nhmlap_: cool - yeah Ravi is still doing cabig stuff afaik
[0:34] <nhmlap_> slang: Really? The rumors were that the whole thing was being defunded. There was some kind of scathing report that came out about cabig a year or two ago.
[0:34] <slang> oh
[0:34] <slang> nhmlap_: I guess you know more than me
[0:35] <dmick> Leseb: workaround:
[0:35] <nhmlap_> slang: not really, that's about all I know. I moved on to our HPC group after our grant finished up.
[0:35] <dmick> specify pool/image as dest
[0:36] <dmick> but for me, --dest-pool also works
[0:37] <nhmlap_> slang: I was mostly interested in integrating our clusters for proteomics and bioinformatics analyses. Ultimately I ended up getting sucked into performance analysis on our clusters and then lustre performance work and eventually maintenance.
[0:37] <slang> nhmlap_: oof
[0:37] <slang> nhmlap_: now I'm really sorry :-)
[0:38] <Leseb> dmick: it wants to open the rbd pool: error opening pool rbd: (2) No such file or directory
[0:38] <Leseb> dmick: this pool doesn't exist; if I create it, the import will go into the rbd pool :/
[0:39] <nhmlap_> slang: Hrm, I think there is some kind of life lesson here I need to evaluate after the fact. ;)
[0:40] <dmick> Leseb: rbd -v?
[0:40] <Leseb> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
[0:40] <nhmlap_> In any event, nice to see a fellow HPC guy around. :)
[0:40] <dmick> oh I misunderstood; let me get that version
[0:41] <slang> nhmlap_: seems like you made the right choice
[0:42] <nhmlap_> slang: It's not even a question in my mind. :)
[0:43] <nhmlap_> slang: Ravi told me about some of the politics out there. Inktank is mostly free of such things.
[0:43] <nhmlap_> slang: or at the very least, I don't see them since I work from my basement.
[0:45] <rturk> yeah, not many politics yet :)
[0:45] <dmick> Leseb: and just to be clear, rbd import <image> poolname/objectname doesn't work either?
[0:46] <Leseb> dmick: nop
[0:46] <dmick> heh, apparently I can't read code then. Actually building
[0:47] <nhmlap_> rturk: I think the most amazing thing is that people are motivated and not consumed by bitterness.
[0:47] <Leseb> dmick: just leave it, I'll re-ask tomorrow (time to go to bed for me)
[0:47] <dmick> it's nice having a goal that seems less than lining some yacht-owner's pockets
[0:47] <dmick> Leseb: sorry. It's certainly better in later releases :)
[0:47] <Leseb> dmick: thanks for the help, cheers!
[0:48] <dmick> but I can't believe there's not a workaround in argonaut. I'll post here if I find it
[0:48] <nhmlap_> rturk: btw, you can tell Ilya he can use that quote for recruiting if he wants.
[0:48] <Leseb> dmick: thank you :)
[0:48] <dmick> s/seems less/seems less evil/
[0:52] * danieagle (~Daniel@177.43.213.15) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[0:55] * BManojlovic (~steki@212.200.243.39) Quit (Quit: Ja odoh a vi sta 'ocete...)
[0:59] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[1:01] <Tv_> alright, i need to talk my way through this presentation to see how it times & rhymes -- i'll go do that at home, to avoid making noise for an hour.. and i'm at SUSECon for the rest of the week. see you next week! email if you need me.
[1:02] * Tv_ (~tv@2607:f298:a:607:391b:b457:8e5c:c6ea) Quit (Quit: Tv_)
[1:05] * lofejndif (~lsqavnbok@28IAAHP79.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[1:05] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) has joined #ceph
[1:13] * sagelap (~sage@12.130.118.47) has joined #ceph
[1:14] <rturk> nhmlap_: I will. That's fantastic! Inktank: We're Not Consumed By Bitterness
[1:14] <nhmlap_> rturk: It's a fairly amazing selling point.
[1:16] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[1:18] <rturk> it actually is!
[1:18] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[1:18] <rturk> Inktank: Infinitely Scalable Positive Energy
[1:18] <rturk> :D
[1:22] * jjgalvez (~jjgalvez@12.248.40.138) Quit (Ping timeout: 480 seconds)
[1:24] <nhmlap_> rturk: lol, I'm not sure about that. Let's just stick with not unrecoverably bitter. ;)
[1:26] <dmick> Inktank: it's like a good balsamic for your greens
[1:28] <nhmlap_> Inktank: If we were coffee we wouldn't be the burnt stuff that sat out all night.
[1:29] * jjgalvez (~jjgalvez@12.248.40.138) has joined #ceph
[1:32] * jlogan1 (~Thunderbi@2600:c00:3010:1:8934:ad93:2153:3a19) Quit (Ping timeout: 480 seconds)
[1:32] * slang (~slang@38.122.20.226) has joined #ceph
[1:37] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[1:37] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) has joined #ceph
[1:43] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[1:45] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) has joined #ceph
[1:45] * gohko (~gohko@natter.interq.or.jp) Quit (Quit: Leaving...)
[1:47] <rturk> lol
[1:53] <dmick> Leseb: It Works For Me. I'm mystified.
[1:53] <dmick> both --dest-pool and pool/image work just as expected
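
(For reference, a sketch of the rbd import forms discussed above; the file, pool, and image names are placeholders, and behavior differs between 0.48.1 and the fixed 0.48.2.)

    rbd import disk.img --dest-pool nova     # joshd's suggestion; honored once the fix lands
    rbd import disk.img nova/disk            # pool/image as the destination; dmick's workaround
    rbd -p nova import disk.img              # --pool, which the buggy 0.48.1 import ignores
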
[1:54] * amatter (~amatter@209.63.136.130) Quit ()
[1:58] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[1:58] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) has joined #ceph
[2:05] * gohko (~gohko@natter.interq.or.jp) has joined #ceph
[2:11] * jjgalvez (~jjgalvez@12.248.40.138) Quit (Quit: Leaving.)
[2:17] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) Quit (Read error: Operation timed out)
[2:26] * sagelap (~sage@12.130.118.47) Quit (Quit: Leaving.)
[2:37] * sagelap (~sage@12.130.118.47) has joined #ceph
[2:49] * tryggvil_ (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[2:51] * liiwi (liiwi@idle.fi) Quit (Remote host closed the connection)
[2:54] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Ping timeout: 480 seconds)
[2:54] * tryggvil_ is now known as tryggvil
[3:05] <gregaf> mikeryan: wow, that is a particularly unhelpful log :(
[3:05] <gregaf> still, I suspect it's an MDS issue, not an OSD issue
[3:06] * liiwi (liiwi@idle.fi) has joined #ceph
[3:06] <gregaf> if you wanted to be particularly conscientious, you could match up the stars appropriately in order to look through the core dump and check where the bad pointer got introduced
[3:08] * tryggvil_ (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[3:10] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Ping timeout: 480 seconds)
[3:10] * tryggvil_ is now known as tryggvil
[3:25] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[3:30] * yoshi_ (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[3:34] * tryggvil_ (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[3:34] * sagelap (~sage@12.130.118.47) Quit (Ping timeout: 480 seconds)
[3:34] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Ping timeout: 480 seconds)
[3:34] * tryggvil_ is now known as tryggvil
[3:36] * Deuns (~kvirc@office.resolvtelecom.fr) Quit (Read error: Connection reset by peer)
[3:37] * Deuns|2 (~kvirc@office.resolvtelecom.fr) has joined #ceph
[3:41] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[3:43] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[3:46] * maelfius (~mdrnstm@66.209.104.107) Quit (Quit: Leaving.)
[3:47] * slang (~slang@38.122.20.226) has joined #ceph
[3:55] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[4:02] * yehudasa_ (~yehudasa@static-66-14-234-139.bdsl.verizon.net) has joined #ceph
[4:04] * sagelap (~sage@12.130.118.47) has joined #ceph
[4:42] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[4:45] * loicd (~loic@82.235.173.177) has joined #ceph
[5:03] * sagelap (~sage@12.130.118.47) Quit (Ping timeout: 480 seconds)
[5:09] * yehudasa_ (~yehudasa@static-66-14-234-139.bdsl.verizon.net) Quit (Ping timeout: 480 seconds)
[5:13] * Cube (~Adium@12.248.40.138) Quit (Quit: Leaving.)
[5:27] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) has joined #ceph
[5:37] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[5:52] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[6:14] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[6:17] * sjustlaptop (~sam@66-214-139-112.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:19] * dmick is now known as dmick_away
[6:22] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[6:29] * chihjen (~chatzilla@122-116-65-7.HINET-IP.hinet.net) has joined #ceph
[7:15] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[7:29] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[7:34] * chihjen (~chatzilla@122-116-65-7.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[7:36] * chihjen (~chatzilla@122-116-65-7.HINET-IP.hinet.net) has joined #ceph
[7:45] * sage (~sage@76.89.177.113) has joined #ceph
[7:50] * sage (~sage@76.89.177.113) Quit (Remote host closed the connection)
[7:56] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[7:59] * gregaf1 (~Adium@2607:f298:a:607:2978:9a00:bcd5:7c60) has joined #ceph
[8:01] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (Remote host closed the connection)
[8:01] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[8:02] * rturk (~rturk@ps94005.dreamhost.com) Quit (Ping timeout: 480 seconds)
[8:03] * rturk (~rturk@ps94005.dreamhost.com) has joined #ceph
[8:05] * EmilienM (~EmilienM@ADijon-654-1-133-33.w90-56.abo.wanadoo.fr) has joined #ceph
[8:05] * gregaf (~Adium@2607:f298:a:607:fcc2:81f2:4b68:23fb) Quit (Ping timeout: 480 seconds)
[8:11] * EmilienM_ (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) has joined #ceph
[8:12] * EmilienM_ (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) has left #ceph
[8:13] * EmilienM (~EmilienM@ADijon-654-1-133-33.w90-56.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[8:13] * loicd (~loic@82.235.173.177) Quit (Quit: Leaving.)
[8:17] * EmilienM (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) has joined #ceph
[9:05] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[9:17] * chihjen_ (~chatzilla@122-116-65-7.HINET-IP.hinet.net) has joined #ceph
[9:23] * chihjen (~chatzilla@122-116-65-7.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[9:25] * loicd (~loic@178.20.50.225) has joined #ceph
[9:26] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[9:28] * Deuns|2 (~kvirc@office.resolvtelecom.fr) Quit (Quit: KVIrc 4.0.0 Insomnia http://www.kvirc.net/)
[9:28] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[9:36] * deepsa_ (~deepsa@122.172.11.100) has joined #ceph
[9:37] * deepsa (~deepsa@122.172.161.188) Quit (Ping timeout: 480 seconds)
[9:37] * deepsa_ is now known as deepsa
[9:43] * chihjen (~chatzilla@122-116-65-7.HINET-IP.hinet.net) has joined #ceph
[9:48] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[9:48] * chihjen_ (~chatzilla@122-116-65-7.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[9:50] * deepsa (~deepsa@122.172.11.100) Quit (Ping timeout: 480 seconds)
[9:50] * deepsa (~deepsa@122.172.11.100) has joined #ceph
[10:13] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[10:20] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[10:21] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[10:25] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit ()
[10:28] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[10:33] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[10:50] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[11:04] * mgalkiewicz (~mgalkiewi@staticline58611.toya.net.pl) has joined #ceph
[11:10] * grant (~grant@60-240-78-43.static.tpgi.com.au) has joined #ceph
[11:11] <grant> Hi all. I have an issue at present - where after reboots an OSD is not automatically brought up. I can fix with "service ceph -a start osd20".
[11:11] <grant> The only obvious logging I can find is: ** ERROR: error converting store /ceph-data/osd.20: (16) Device or resource busy
[11:11] <grant> Any tips?
[11:13] <grant> This is 0.48.1 on Ubuntu 12.04.1 - 3 MDS / 3 MON / 22 OSD setup.
[11:24] * yoshi_ (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:29] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[11:34] * chihjen (~chatzilla@122-116-65-7.HINET-IP.hinet.net) Quit (Ping timeout: 480 seconds)
[11:39] * The_Bishop_ (~bishop@e179022038.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[12:01] * chihjen (~chatzilla@111.80.250.128) has joined #ceph
[12:28] * chihjen (~chatzilla@111.80.250.128) Quit (Read error: Connection reset by peer)
[12:49] * lofejndif (~lsqavnbok@659AAANLQ.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:56] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[12:57] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:02] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[13:43] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Ping timeout: 480 seconds)
[13:49] * mrjack (mrjack@office.smart-weblications.net) has joined #ceph
[13:52] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:54] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[14:05] * mrjack_ (mrjack@office.smart-weblications.net) has joined #ceph
[14:07] * EmilienM (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) Quit (Remote host closed the connection)
[14:08] * EmilienM (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) has joined #ceph
[14:12] * steki-BLAH (~steki@8-168-222-85.adsl.verat.net) has joined #ceph
[14:16] * BManojlovic (~steki@91.195.39.5) Quit (Ping timeout: 480 seconds)
[14:16] * fc (~fc@83.167.43.235) Quit (Quit: Lost terminal)
[14:28] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: zzzzzzzzzzzzzzzzzzzz)
[14:57] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[15:01] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[15:06] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[15:10] * steki-BLAH (~steki@8-168-222-85.adsl.verat.net) Quit (Ping timeout: 480 seconds)
[15:17] * scuttlemonkey (~scuttlemo@173-14-58-198-Michigan.hfc.comcastbusiness.net) has joined #ceph
[15:32] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:41] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[15:43] * gregorg (~Greg@78.155.152.6) Quit (Quit: Quitte)
[15:44] * lofejndif (~lsqavnbok@659AAANLQ.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[16:06] * cblack101 (c0373729@ircip1.mibbit.com) has joined #ceph
[16:11] <cblack101> Newbie Linux question: osd.27 in my cluster is down, the osd.27.log is empty, last one with content was from Saturday, went to the box unmounted the drive, ran an xfs_repair on /dev/sdf which comes back with 'fatal error -- Input/output error' , and /var/log/syslog reports a kernel error IO error, dev sdf, sector 0.... Is my hard drive dead?
[16:15] <darkfader> probably, have a look at dmesg and io_err_cnt in /sys/block/sdf/some-thing-there
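
(A sketch of the checks darkfader suggests, plus a SMART query; the device name follows the example above, and exact sysfs paths vary by kernel.)

    dmesg | grep -i sdf                    # look for I/O or SATA link errors
    cat /sys/block/sdf/device/ioerr_cnt    # cumulative I/O error count (hex), where present
    smartctl -a /dev/sdf                   # SMART health summary (needs smartmontools)
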
[16:23] <joao> sagewk, ping me whenever you're around :)
[16:34] * EmilienM (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) has left #ceph
[16:37] * EmilienM (~EmilienM@ADijon-654-1-15-175.w109-217.abo.wanadoo.fr) has joined #ceph
[16:43] * mgalkiewicz (~mgalkiewi@staticline58611.toya.net.pl) Quit (Remote host closed the connection)
[16:43] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) Quit (Remote host closed the connection)
[16:54] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Ping timeout: 480 seconds)
[17:03] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[17:11] * jlogan1 (~Thunderbi@72.5.59.176) has joined #ceph
[17:15] * slang (~slang@2607:f298:a:607:ed04:8e8e:461c:f499) has joined #ceph
[17:16] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:24] * lofejndif (~lsqavnbok@09GAAAY8P.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:25] * aliguori (~anthony@32.97.110.59) has joined #ceph
[17:26] * allsystemsarego (~allsystem@188.25.131.159) has joined #ceph
[17:29] * glowell (~Adium@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:31] * three18ti (~three18ti@68.71.155.21) Quit (Quit: Leaving)
[17:45] * slang (~slang@2607:f298:a:607:ed04:8e8e:461c:f499) Quit (Read error: Connection reset by peer)
[17:45] * slang (~slang@2607:f298:a:607:ed04:8e8e:461c:f499) has joined #ceph
[17:48] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:54] * loicd (~loic@178.20.50.225) Quit (Ping timeout: 480 seconds)
[17:56] * stass (stas@ssh.deglitch.com) Quit (Read error: Connection reset by peer)
[18:03] * sagelap (~sage@220.sub-70-197-144.myvzw.com) has joined #ceph
[18:08] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) has joined #ceph
[18:15] * slang (~slang@2607:f298:a:607:ed04:8e8e:461c:f499) Quit (Quit: Leaving.)
[18:16] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) Quit (Quit: Leaving.)
[18:23] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) has joined #ceph
[18:23] * stass (stas@ssh.deglitch.com) has joined #ceph
[18:25] * stass (stas@ssh.deglitch.com) Quit (Remote host closed the connection)
[18:25] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:26] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[18:33] * slang (~slang@38.122.20.226) has joined #ceph
[18:37] * stass (stas@ssh.deglitch.com) has joined #ceph
[18:39] <slang> redmine seems rather slow for me
[18:39] <slang> anyone else having difficulty?
[18:40] <nhmlap_> slang: yeah, problems here too.
[18:40] <nhmlap_> slang: was just going to say something about it
[18:40] * sagelap (~sage@220.sub-70-197-144.myvzw.com) Quit (Ping timeout: 480 seconds)
[18:48] * amatter (~amatter@209.63.136.130) has joined #ceph
[18:54] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[18:55] * BManojlovic (~steki@212.200.243.39) has joined #ceph
[19:02] <elder> joshd, did you reschedule an rbd meeting for today?
[19:04] <joshd> not yet
[19:07] <elder> OK. Just was scanning my e-mail and didn't see anything so I wanted to check.
[19:08] * Cube (~Adium@12.248.40.138) has joined #ceph
[19:11] * lofejndif (~lsqavnbok@09GAAAY8P.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[19:13] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[19:18] * BManojlovic (~steki@212.200.243.39) Quit (Remote host closed the connection)
[19:22] * Ryan_Lane (~Adium@216.38.130.162) has joined #ceph
[19:29] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[19:39] * Ryan_Lane (~Adium@216.38.130.162) Quit (Ping timeout: 480 seconds)
[19:40] * slang (~slang@38.122.20.226) has joined #ceph
[19:47] <nhmlap_> hrm, new Panasas hybrid SSD storage is being announced.
[19:52] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[19:52] * slang (~slang@38.122.20.226) has joined #ceph
[19:55] * maelfius (~mdrnstm@66.209.104.107) has joined #ceph
[19:58] <iggy> right now I want to beat Panasas about the head and shoulders severely
[19:59] <nhmlap_> iggy: The last person I spoke to who had recently purchased a (very large!) Panasas system had very similar sentiments. Our experience at MSI was mostly positive, but it was on a much older system.
[20:00] <iggy> I work for an oil and gas company. Since I've started, it's been nothing but trouble. And they basically undersell it by 15%
[20:00] <iggy> they recommend not using over 85% of the available space
[20:00] * BManojlovic (~steki@212.200.243.39) has joined #ceph
[20:01] <iggy> if they have that much overhead, they need to build it in, not expect a bunch of halfwit users to not fill up the filesystem
[20:01] <iggy> which they did last week, and it's been running like absolute crap since
[20:01] <iggy> takes about 30 secs for an ls to return
[20:02] <nhmlap_> iggy: hrm, I don't remember much about how their hardware looks under the hood. Sounds like a fragmentation issue once the drives are close to full.
[20:03] * sagelap (~sage@38.122.20.226) has joined #ceph
[20:03] <elder> sagewk, I couldn't hear you well on the call. Were you talking about the lock inversion warning in tests?
[20:03] <elder> sagelap, I'll say what I just said again...
[20:03] <elder> I couldn't hear you well on the call. Were you talking about the lock inversion warning in tests?
[20:03] <iggy> nhmlap_: and I think it constantly tries rebalancing in the background... oh, and we have to replace blades like 2-3 times per month, so that probably doesn't help
[20:03] <nhmlap_> iggy: though ls should be metadata.
[20:04] <nhmlap_> iggy: yikes! our panasas was much more stable than that. Sounds like they've taken a turn for the worse.
[20:04] <elder> sagelap, that is, "INFO: possible circular locking dependency detected"
[20:04] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) Quit (Quit: Leaving.)
[20:04] * Ryan_Lane (~Adium@216.38.130.162) has joined #ceph
[20:04] <joshd> elder: yes, that's what he was talking about
[20:04] <elder> OK.
[20:05] <elder> Sorry I wasn't noticing that. My filter for teuthology runs stopped working and I haven't been looking at them lately.
[20:05] <joshd> I filed a bug for it: http://www.tracker.newdream.net/issues/3151
[20:05] <iggy> nhmlap_: I'll just say this... I would never suggest anyone use Panasas for anything after the experience I've had here... In theory, it sounds good, but reality != theory
[20:05] <iggy> nhmlap_: if you don't mind me asking, how big was the system?
[20:06] <nhmlap_> iggy: ours was from like 2006 or 2007 and around 70TB.
[20:07] <iggy> so probably a rack full or so... we have 2 racks, and the hardware failure rate seems pretty high for the amount of kit we have
[20:07] <nhmlap_> Yeah, I want to say it was about a rack.
[20:07] <nhmlap_> maybe a bit less.
[20:08] <nhmlap_> We rarely had failures on it. Compared to our lustre storage it seemed positively stable.
[20:08] <iggy> they want to try some DDN system with a few different filesystems toward the end of the year... one of which is Lustre... I really hope I get fired before that
[20:08] <iggy> my boss wants me heading it up
[20:08] <nhmlap_> iggy: gpfs and lustre have their issues. GPFS may be the better bet, but I've never actually maintained it.
[20:09] <iggy> that is one of the others
[20:09] <nhmlap_> I'd only use lustre for scratch.
[20:09] <iggy> I forget what the 3rd was
[20:09] <nhmlap_> Was it also from DDN?
[20:10] <iggy> the hardware is all DDN, they are just going to redo the filesystem on site for testing
[20:11] <nhmlap_> Ok. I know DDN does lustre and gpfs. Not sure what else they support.
[20:11] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) has joined #ceph
[20:11] <nhmlap_> iggy: Firmware upgrades can be a bit painful on the DDN chassis. Otherwise they seem relatively solid from a hardware perspective from what I recall.
[20:12] <iggy> I think it's funny that the powers that be believe sales people more than me, so I've stopped really caring to try to push alternatives
[20:13] <iggy> I was trying to get them to look at ceph for a while, but since the guy taking them to lunch didn't know what it was, they nack'ed it
[20:14] <iggy> roughly the same time, I started looking for a new job :/
[20:14] <nhmlap_> iggy: DDN's boxes are all about their custom high performance controllers. As far as I can tell, Ceph runs best on $200 LSI SAS cards.
[20:15] <iggy> yeah, just need to get some hardware companies to start pushing it and you guys will be rolling in non-startup $s
[20:16] <iggy> that'll probably get easier after the recent announcements
[20:16] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Read error: Connection reset by peer)
[20:16] <nhmlap_> iggy: Do you mind if I ask what other vendors besides DDN your management might listen to?
[20:17] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) Quit (Remote host closed the connection)
[20:17] <iggy> if I knew, I'd probably be pushing them... Right now we have Panasas (sucks all around) and Isilon (performance sucks)
[20:18] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) has joined #ceph
[20:18] <gregaf1> Isilon? But I thought that was their whole thing
[20:18] * maelfius (~mdrnstm@66.209.104.107) has left #ceph
[20:18] <gregaf1> at least for large-file stuff
[20:19] <nhmlap_> gregaf1: what was their whole thing?
[20:19] <gregaf1> extreme performance via add-on caching servers and such
[20:19] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[20:19] <iggy> They were using a local (Houston) "solution provider" for years that (from the people I've met there) I'm pretty sure were just googling and picking the first result when we asked them for stuff
[20:19] <gregaf1> really awesome automatic tiering to move hot files into SSDs, etc
[20:20] <nhmlap_> gregaf1: oh interesting. We never really considered them for HPC stuff which is usually aimed at large IOs. I admit I don't know much about their solutions.
[20:20] <gregaf1> I could be wrong, all I know about them was from one Ars Technica feature or something
[20:20] <gregaf1> and I don't think anybody in HPC could afford a large enough system from them
[20:20] <gregaf1> but I could be wrong
[20:20] <iggy> well, we have 7 "shelves" (nothing with SSD afaik) and a NDMP offloader (that doesn't get used) and people mount it via NFS (don't know if there is a direct mount option like panasas has)
[20:23] <nhmlap_> iggy: I will say that DDN+Lustre seems to have a presence in Oil/Gas already, so hopefully DDN would have some experience with similar deployments. :)
[20:23] <nhmlap_> iggy: maybe it won't be that bad. ;)
[20:24] <iggy> our Isilon is only 76T, It gets about 85M/s streaming writes over 10GEth :/
[20:24] <nhmlap_> iggy: Make sure that they live demo the failover working properly before committing to buy.
[20:25] <iggy> rgr :)
[20:25] <iggy> but as I said, I hope I'm gone before then
[20:25] <nhmlap_> iggy: hehe, well good luck. We are looking for good people. ;)
[20:26] <nhmlap_> And apparently our recruiter might use my quote "We aren't consumed by bitterness" as a benefit of working here!
[20:27] <iggy> I saw that yesterday :)
[20:27] <sjust> grant: if you are there, that usually means that another ceph-osd process is still running on that store
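
(A sketch of how to confirm sjust's diagnosis; the osd id and data path are taken from grant's messages above, and the exact service syntax may differ per distro.)

    ps aux | grep '[c]eph-osd'       # is a ceph-osd for that store still running?
    fuser -vm /ceph-data/osd.20      # what is holding the osd data directory?
    # if an old daemon is found, stop it cleanly before starting the osd again, e.g.:
    service ceph stop osd.20 && service ceph start osd.20
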
[20:29] * Ryan_Lane1 (~Adium@166.250.35.220) has joined #ceph
[20:33] * cblack101 (c0373729@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[20:34] * Ryan_Lane (~Adium@216.38.130.162) Quit (Ping timeout: 480 seconds)
[20:39] * glowell (~Adium@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[20:46] * nhm (~nh@67-220-20-222.usiwireless.com) Quit (Ping timeout: 480 seconds)
[20:47] * nhmhome (~nh@67-220-20-222.usiwireless.com) Quit (Ping timeout: 480 seconds)
[20:48] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[20:52] * nhorman_ (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:56] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Ping timeout: 480 seconds)
[21:05] * slang (~slang@38.122.20.226) has joined #ceph
[21:20] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[21:23] * dmick_away is now known as dmick
[21:24] * helloadam (~adam@office.netops.me) has joined #ceph
[21:37] * lofejndif (~lsqavnbok@09GAAAZGM.tor-irc.dnsbl.oftc.net) has joined #ceph
[21:44] * pentabular (~sean@adsl-70-231-141-128.dsl.snfc21.sbcglobal.net) has joined #ceph
[21:47] * Ryan_Lane (~Adium@216.38.130.162) has joined #ceph
[21:47] * Ryan_Lane1 (~Adium@166.250.35.220) Quit (Read error: Connection reset by peer)
[21:49] * Ryan_Lane1 (~Adium@216.38.130.162) has joined #ceph
[21:49] * Ryan_Lane (~Adium@216.38.130.162) Quit (Read error: Connection reset by peer)
[21:54] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) Quit (Quit: Leaving.)
[21:56] * sagelap1 (~sage@38.122.20.226) has joined #ceph
[21:56] * sagelap (~sage@38.122.20.226) Quit (Read error: Connection reset by peer)
[22:05] <helloadam> Howdy cephers!
[22:05] <helloadam> Few questions
[22:09] <helloadam> We are currently using gluster for our HPC setup here on campus (UCI) and we are finding that small writes and any sort of traversals are suuuuuuper slow.
[22:10] <helloadam> We were able to speed up small writes by creating a local loopback mount via NFS, which supports client-side caching where the gluster client does not. But we still have issues with traversal and the like
[22:10] <helloadam> We were wondering how other users use CEPH and if it will meet our needs here on campus
[22:11] <helloadam> Soo.. does anyone have experience with small writes (or small-file writes) with ceph? How well does it perform?
[22:12] <sjust> helloadam: what ceph client are you using?
[22:14] <helloadam> none, the issue I was talking about was gluster. Just exploring other distributed file systems. I have been keeping my eye on ceph and it seems somewhat production-ready without the need for a full-on staff (a la dreamhost)
[22:14] <sjust> ah
[22:15] <gregaf1> helloadam: well, there aren't too many people running cephfs in production right now; Inktank (aka us) doesn't consider it stable yet
[22:17] <gregaf1> that said, Ceph clients support coherent caching, so small file access isn't going to perform as well as large files do in benchmarks, but it should be better than gluster (with its lack of cache and synchronous metadata queries)
[22:18] <gregaf1> maybe lxo can talk about it more?
[22:18] <gregaf1> or check out the ceph-devel list archives and maybe ask there; several people have recently come out of the woodwork and admitted to running it that we didn't know about :)
[22:19] <helloadam> so cephfs is what each client runs to access whatever is on the OSDs, right? So if cephfs is not stable, how do people get access to the data? NFS?
[22:19] <gregaf1> cephfs is the whole posix layer
[22:19] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:20] <gregaf1> as opposed to RADOS, which is the object store that CephFS is built on — you can also access RADOS via RBD (virtual block device) or the RadosGW (S3- and swift-compatible object store)
[22:20] <slang> helloadam: note the difference between cephfs and libcephfs
[22:21] <exec> gregaf1: btw, I have one question for you. In http://ceph.com/docs/master/start/quick-start/ the docs use the `hostname` output, but ceph_common.sh actually uses hostname=`hostname | cut -d . -f 1`. That breaks the 5-minute setup for new people
[22:22] <exec> helloadam: most people use rbd via the kernel driver or qemu-rbd in virtual machines. the second option works for me.
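
(For reference, a sketch of the two access paths exec mentions; pool and image names are placeholders.)

    # kernel rbd driver: map an image as a local block device (shows up as /dev/rbd0)
    rbd map rbd/myimage
    # qemu/kvm: attach the image through librbd, no kernel mapping needed
    qemu-system-x86_64 -drive file=rbd:rbd/myimage,if=virtio ...
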
[22:23] <gregaf1> exec: hmm, our docs guy isn't in here right now, but I'll pass it on to him
[22:23] * John (~john@astound-64-85-239-164.ca.astound.net) has joined #ceph
[22:24] <exec> gregaf1: I suggest using `hostname -s` in both places (wiki and ceph_common.sh); the cut -d version looks ugly )
[22:24] <exec> gregaf1: thanks )
[22:24] <gregaf1> well, for the script you should talk to sagewk for now ;)
[22:24] <exec> sagewk: ping )
[22:24] <sagewk> exec: that does sound nicer.
[22:25] <exec> sagewk: `hostname -s` vs `hostname | cut -d . -f 1`
[22:25] <sagewk> yeah
[22:25] <sagewk> is that semantically different or just prettier?
[22:26] <John> Where should I update the docs? Quickstart?
[22:26] <exec> sagewk: 'hostname -s' uses no additional external calls. and looks prettier. at least for me )
[22:27] <exec> John: yup. http://ceph.com/docs/master/start/quick-start/
[22:27] <sagewk> mkcephfs is about to be deprecated anyway.. i just don't want to risk breaking something for an aesthetic change
[22:28] <exec> from docs: Execute hostname on the command line to retrieve the name of your host......
[22:28] <exec> hostname != `hostname | cut -d . -f 1` in the common bash script
[22:28] <sagewk> yeah, ok
[22:29] <sagewk> john: let's make it 'hostname -s' there.. and i'll update mkcephfs
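
(A quick illustration of the difference under discussion; the hostname is hypothetical.)

    $ hostname          # may return the FQDN, which is what the quick-start picks up
    node1.example.com
    $ hostname -s       # short name, same result as `hostname | cut -d . -f 1`
    node1
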
[22:30] <exec> sagewk: what is going to replace mkcephfs?
[22:30] <exec> I'm using it as separate calls for osd/mon init
[22:31] <sagewk> ceph-deploy, http://github.com/ceph/ceph-deploy
[22:31] * scuttlemonkey (~scuttlemo@173-14-58-198-Michigan.hfc.comcastbusiness.net) Quit (Quit: zzzzzzzzzzzzzzzzzzzz)
[22:31] <sagewk> i think that url works
[22:31] * nhorman_ (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:32] <exec> yup
[22:32] <exec> sagewk: anyway, I'm using fabric for faster deployment/cluster changes )
[22:33] <sagewk> cool
[22:33] <sagewk> if nothing else, ceph-deploy will be an easy reference for doing such integrations
[22:34] <John> ok. Doing it now
[22:35] <nhmlap_> exec: I started playing with fabric a bit when I started at Inktank. How do you like it?
[22:35] <exec> sagewk: as I see, ceph-deploy supports only debian-based distros
[22:35] <sagewk> gregaf: new wip-mon pushed. see f22614f3b297edd1be101f9c728187d119c80f5a for the interesting bits
[22:35] <sagewk> exec: currently, yeah.
[22:36] <exec> nhmlap_: why not? )
[22:36] <nhmlap_> exec: why not what?
[22:36] <exec> nhmlap_: I like fabric. why not? )
[22:37] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[22:37] <nhmlap_> exec: I think I'm confused. I just wanted to know if you liked it. :)
[22:37] <exec> sagewk: any plans to make rpm repo?
[22:37] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[22:38] <sagewk> next dev release will have centos RPMs, hopefully fedora too.
[22:38] <exec> nhmlap_: yup, I like it. however my fabfile is so simple
[22:38] <sagewk> sles, opensuse next on the list
[22:38] <exec> sagewk: what about the argonaut release?
[22:38] <sagewk> maybe. we did fix the spec file, but not sure we'll build RPMs.
[22:39] * allsystemsarego (~allsystem@188.25.131.159) Quit (Remote host closed the connection)
[22:40] <exec> I have recompiled argonaut rpms for centos using fedora's version; however, it's better to have precompiled packages from the vendor )
[22:41] <nhmlap_> exec: I ended up just using pdsh with some simple scripts I've got since it's a bit less python specific.
[22:41] <exec> nhmlap_: sorry for confusing.
[22:41] <nhmlap_> exec: no worries
[22:41] * sjustlaptop (~sam@2607:f298:a:607:c685:8ff:fe0d:a9d5) has joined #ceph
[22:43] * pentabular flops around like a fish out of water
[22:43] <pentabular> would love to see some fabfiles
[22:43] <exec> nhmlap_: it depends on your task. fabric vs 'for x in $hosts; do .... done' is always a subject for debate )
[22:44] <sagewk> sjust: new wip-tp pushed
[22:45] <exec> pentabular: my own is very simple and not really usable by anyone else. I'm not ready to share it, sorry
[22:45] <pentabular> :) most understood
[22:46] * pentabular gets tossed back in
[22:47] * slang (~slang@38.122.20.226) Quit (Quit: Leaving.)
[22:49] * slang (~slang@38.122.20.226) has joined #ceph
[22:49] <Leseb> sagewk: is ceph-deploy ready to use?
[22:49] <sagewk> still a work in progress, but it's in a reasonably working state
[22:50] <Leseb> cool thanks I'll have a look though :)
[22:50] <sagewk> please play with it, but don't be upset if tv changes stuff around
[22:51] <Leseb> okok :)
[22:57] <lxo> helloadam, I've been running ceph with mixed success so far, and I have indeed observed several super-slow operations, at least until I tuned up the size of the journals! I found that many ops were slowing down because of full journals, so I bumped them up, and the cluster feels much faster now!
[22:58] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[23:00] <exec> lxo: what size of journal and osd topology do you use?
[23:01] <nhmlap_> lxo: good deal. I've been using 10G journals for testing here.
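
(For context, the journal size is a per-osd setting in ceph.conf, given in megabytes; a minimal sketch matching the 10G figure mentioned above.)

    [osd]
        osd journal size = 10240    # 10 GB journal; larger journals absorb bursts of writes
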
[23:03] <sjustlaptop> lxo: did you also need to adjust the filestore throttle tunables?
[23:04] * exec going to sleep. bye.
[23:06] <nhmlap_> good night!
[23:09] * amatter_ (~amatter@209.63.136.130) has joined #ceph
[23:13] * grant (~grant@60-240-78-43.static.tpgi.com.au) Quit (Remote host closed the connection)
[23:15] * amatter (~amatter@209.63.136.130) Quit (Ping timeout: 480 seconds)
[23:29] * aliguori (~anthony@32.97.110.59) Quit (Remote host closed the connection)
[23:34] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[23:36] <lxo> sjustlaptop, I don't even know what tunables you're talking about, so I guess not
[23:37] <sjustlaptop> lxo: cool
[23:37] <lxo> exec, I have 3 servers, each with a few disks, and I run one osd per disk, with a crushmap that avoids placement of replicas on the same server
[23:40] <lxo> I've been using about 1% of the volume size as the journal size. I have disks of various sizes, so the crushmap also arranges for data to be distributed somewhat evenly according to disk size
[23:43] <lxo> the “somewhat” is because the distribution is not quite as proportional to the crush weights as I'd hoped
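
(A minimal sketch of the kind of rule lxo describes, in the decompiled crushmap syntax of the time; bucket and rule names are illustrative. Choosing leaves of type host keeps replicas on different servers, and per-osd weights steer how much data each disk gets.)

    rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host    # one replica per host, never two on one server
        step emit
    }
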
[23:45] * sjustlaptop (~sam@2607:f298:a:607:c685:8ff:fe0d:a9d5) Quit (Ping timeout: 480 seconds)
[23:46] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[23:46] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:50] * John (~john@astound-64-85-239-164.ca.astound.net) Quit (Ping timeout: 480 seconds)
[23:55] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) has joined #ceph
[23:59] * John (~john@astound-64-85-239-164.ca.astound.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.