#ceph IRC Log


IRC Log for 2013-07-03

Timestamps are in GMT/BST.

[0:05] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:09] * zapotah (~zapotah@dsl-hkibrasgw2-50dfdb-234.dhcp.inet.fi) has joined #ceph
[0:10] * zapotah_ (~zapotah@dsl-hkibrasgw2-50dfdb-234.dhcp.inet.fi) Quit (Ping timeout: 480 seconds)
[0:12] * Tv (~tv@pool-108-13-115-92.lsanca.fios.verizon.net) has joined #ceph
[0:12] <Tv> nice name drop: http://redmonk.com/jgovernor/2013/07/02/its-the-data-stupid-whats-next-in-cloud-and-apps-ibm-softlayer-joyent-riak-10gen/
[0:13] <rturk> Tv: ya, and inktank isn't even a client :)
[0:13] <nwl> *cough*
[0:17] * aliguori (~anthony@ Quit (Quit: Ex-Chat)
[0:20] * themgt (~themgt@201-223-189-134.baf.movistar.cl) has joined #ceph
[0:21] * themgt (~themgt@201-223-189-134.baf.movistar.cl) Quit ()
[0:23] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Quit: Leaving.)
[0:25] * portante|afk is now known as portante
[0:25] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) has joined #ceph
[0:26] * mschiff (~mschiff@ Quit (Remote host closed the connection)
[0:39] <paravoid> sagewk: re: #5460, you did see that I attached debug-osd 30/debug-ms 10 when I first submitted, right?
[0:40] <paravoid> i.e. you do need debug-ms 20 instead and from a marked-down osd as well
[0:40] <sagewk> oh, nope. was that before or after the first fix?
[0:41] <paravoid> before
[0:41] * LPG|2 (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[0:41] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[0:41] <paravoid> log output was the same before/after though
[0:41] <sagewk> k, i'll look. thanks
[0:41] <paravoid> non-debug output that is
[0:53] * markbby (~Adium@ Quit (Quit: Leaving.)
[0:53] * markbby (~Adium@ has joined #ceph
[0:54] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[0:56] <grepory> The documentation specifies that in the mon section of ceph.conf, the value of host should be the short name. Is this a requirement? Can it simply be the fqdn?
[0:56] <grepory> s/the mon/a mon/
[0:56] * portante is now known as portante|afk
[0:57] * ghartz (~ghartz@ill67-1-82-231-212-191.fbx.proxad.net) Quit (Read error: Connection reset by peer)
[1:17] * John (~john@astound-64-85-225-33.ca.astound.net) Quit (Read error: Connection reset by peer)
[1:21] * nwat (~Adium@eduroam-251-132.ucsc.edu) has left #ceph
[1:23] * Gamekiller77 (~oftc-webi@128-107-239-235.cisco.com) has joined #ceph
[1:23] <Gamekiller77> Hello Ceph team, wanted to ask some simple questions about object-based storage that make my head hurt when i think about them
[1:25] <Gamekiller77> with traditional storage you have your raw capacity, then your usable. With ceph i know my raw, that's simple, but with 3 replicas is usable really raw/3 or is there more math to it?
[1:26] <gregaf> pretty much raw/3, yep
[1:26] <Gamekiller77> figured
[1:26] <Gamekiller77> just wanted to validate my math
[1:26] <gregaf> there's a bit extra that gets used up by OSD journals and things if those are sharing storage *shrug*
[1:26] <lurbs> gregaf: Arguably worse, if you take into account failed nodes, etc.
[1:26] <Gamekiller77> for sure
[1:26] <Gamekiller77> do ops over look now
[1:27] <Gamekiller77> so trying to see how much storage i'm going to put into each OSD node
[1:27] <Gamekiller77> also thinking about mixing some SSD and SATA
[1:27] <Gamekiller77> for pools
[1:27] <Gamekiller77> today, then seeing the road map, hoping auto-tiering will be something for the future
[1:28] <gregaf> lurbs: well, yeah, you should account for failures I guess, but you should do that with anything :p
[1:28] <Gamekiller77> are some of the Inktank guys in here too
[1:28] <lurbs> http://uber.geek.nz/graph-rawratio.png
[1:28] <lurbs> I got bored one day and graphed it.
[1:28] <Gamekiller77> was wondering if they offer paid for support services
[1:29] <dmick> http://www.inktank.com/support-services/
[1:29] <lurbs> Also: http://uber.geek.nz/graph-nearfullratio.png
[1:30] <lurbs> That's total node failures, BTW. Didn't bother doing single disk. It assumes the same capacity per node, too.
[1:30] <gregaf> haha, nice
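The capacity math discussed above (usable is roughly raw divided by the replica count, before journal overhead and failure headroom) can be sketched as a quick shell calculation; the 100 TB raw figure is purely illustrative:

```shell
# Usable ~= raw / replicas; journals and failure headroom eat a bit more.
raw_gb=102400                     # 100 TB raw, illustrative
replicas=3
usable_gb=$(( raw_gb / replicas ))
echo "${usable_gb} GB usable"     # prints "34133 GB usable"
```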
[1:33] <Gamekiller77> last question: know of any tools or scripts freely out there to do load testing on something like Ceph?
[1:34] * jluis is now known as joao
[1:34] <rturk> Gamekiller77: there is rados bench, which is built in
[1:35] <Gamekiller77> rturk: thanks for that, did not know this
[1:35] * John (~john@astound-64-85-225-33.ca.astound.net) has joined #ceph
[1:35] <rturk> http://ceph.com/docs/next/man/8/rados/?highlight=bench
[1:36] <dmick> Gamekiller77: various stuff in the qa directory too using rados load-gen
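The rados bench tool mentioned above is run against a pool; a usage sketch ("testpool" is a placeholder pool name, durations and flags are examples):

```shell
# Write benchmark for 60 seconds, keeping the objects so reads can follow:
rados bench -p testpool 60 write --no-cleanup
# Sequential read benchmark against the objects written above:
rados bench -p testpool 60 seq
```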
[1:37] <Gamekiller77> ok, my team asked me to find this for go-live
[1:38] <Gamekiller77> this is great info love the IRC community
[1:40] <rturk> :)
[1:41] <Gamekiller77> i will be back here more as soon as my OSD come online
[1:41] <Gamekiller77> waiting for my NIC to come in
[1:42] <rturk> Gamekiller77: cool!
[1:42] <Gamekiller77> i know it's overkill but i have 10gb cards
[1:42] <lurbs> Not overkill at all. I'd say necessary. :)
[1:43] <rturk> depends on how many OSDs/NIC, I guess
[1:43] <Gamekiller77> doing twin 10gb for private and public with nexus 5k switches
[1:43] <Gamekiller77> so total 40gb
[1:43] <Gamekiller77> but 20 for cluster private
[1:45] <lurbs> Does the Nexus gear support link aggregation for a single 802.3ad connection to multiple switches?
[1:48] <Gamekiller77> yah btw i work for cisco
[1:48] <Gamekiller77> the nexus has vPC
[1:48] <Gamekiller77> so we can have one nic from one card and one nic from the other card play in the same port channel
[1:48] <Gamekiller77> so it will be a bonded 20gb link
[1:49] * rturk is now known as rturk-away
[1:49] * tnt (~tnt@ Quit (Ping timeout: 480 seconds)
[1:51] <Gamekiller77> on my dev setup it will be on a simple Nexus 5548, then my production setup will sit on some nice big nexus 7000
[1:52] <lurbs> Nice. Us mere mortals have to beg. :)
[1:52] <sagewk> paravoid: can you generate a new log with the 'last' branch, or current next? i can see from the log that it is doing the thing that we fixed, but i don't see any other obvious problem
[1:53] <Gamekiller77> lurbs: it's the benefit of working for the cisco IT innovation group, we get nice toys but we have limits too
[1:54] * markbby (~Adium@ Quit (Remote host closed the connection)
[1:54] * markbby (~Adium@ has joined #ceph
[1:54] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[1:56] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[1:58] * Gamekiller77 (~oftc-webi@128-107-239-235.cisco.com) Quit ()
[1:58] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[1:59] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[2:01] * andrei (~andrei@host86-155-31-94.range86-155.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[2:09] * mikedawson_ (~chatzilla@c-68-58-243-29.hsd1.sc.comcast.net) Quit (Ping timeout: 480 seconds)
[2:09] * alram (~alram@ Quit (Quit: leaving)
[2:19] * Cube (~Cube@ Quit (Quit: Leaving.)
[2:27] * fridudad (~oftc-webi@fw-office.allied-internet.ag) Quit (Remote host closed the connection)
[2:32] <tchmnkyz> hey guys, is there an easy way to roll back the ceph version on deb/ubuntu?
[2:32] * markbby (~Adium@ Quit (Remote host closed the connection)
[2:32] <tchmnkyz> 61.4 is killing me right now
[2:32] <tchmnkyz> it keeps dropping OSD's
[2:32] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[2:33] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[2:39] * sagelap (~sage@2600:1012:b01b:d220:59ef:e19:c22c:ec80) has joined #ceph
[2:46] <dmick> tchmnkyz: rollback is touchy; in general we don't support it (on-disk formats change, etc.)
[2:46] <dmick> but, are your bugs known/investigated?
[2:46] <tchmnkyz> i dont know
[2:46] <tchmnkyz> i just upgraded monday
[2:46] <tchmnkyz> had a major issue last night
[2:46] <tchmnkyz> and now it is happening again
[2:47] <tchmnkyz> it seems to drop 2 OSD from my cluster for no reason
[2:47] <tchmnkyz> and it was fine on 61.3
[2:47] <tchmnkyz> 61.4 is when all of the problems started
[2:51] <tchmnkyz> dmick: i wish like hell i could afford the support contract
[2:52] <tchmnkyz> i just cant afford it yet
[2:52] <tchmnkyz> this project does not make enough to sustain that kind of yearly costs
[2:52] <dmick> tchmnkyz: yeah. well there is probably evidence of what's going on in the logs etc.
[2:53] <tchmnkyz> the problem is i have 13 OSD
[2:53] <tchmnkyz> and ceph is not really telling me which one is dropping
[2:54] <tchmnkyz> seeing a lot of this in the logs:
[2:54] <tchmnkyz> 2013-07-02 19:42:28.871571 7f28ac8d2700 0 -- >> pipe(0x16b45280 sd=162 :47339 s=2 pgs=5885 cs=1543 l=0).fault, initiating reconnect
[2:54] * LeaChim (~LeaChim@ Quit (Ping timeout: 480 seconds)
[2:57] <tchmnkyz> and ton of this dmick
[2:57] <tchmnkyz> 2013-07-02 19:56:11.655476 7f44f655c700 0 log [WRN] : map e20302 wrongly marked me down
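For the "wrongly marked me down" flapping described above, a few commands can help narrow down which OSDs are affected (a sketch; output formats vary by release, and the log path assumes the default Debian/Ubuntu layout):

```shell
# Names the specific down/out OSDs and the PG states they affect:
ceph health detail
# Shows up/down state per OSD, grouped by host:
ceph osd tree
# Find which OSDs logged the wrongly-marked-down warning:
grep "wrongly marked me down" /var/log/ceph/ceph-osd.*.log
```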
[3:17] * sagelap (~sage@2600:1012:b01b:d220:59ef:e19:c22c:ec80) has left #ceph
[3:22] <Psi-jack> Welp. I think I've decided, finally. This weekend will be the day I start migrating my Arch Ceph systems into CentOS 6.4 based. Fun fun fun..
[3:27] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[3:27] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[3:28] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Read error: Operation timed out)
[3:33] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[3:39] * markbby (~Adium@ has joined #ceph
[3:40] * mikedawson (~chatzilla@c-68-58-243-29.hsd1.sc.comcast.net) has joined #ceph
[3:44] * buck1 (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[3:44] * portante|afk is now known as portante
[3:52] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[3:53] * nwat (~Adium@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:53] * mikedawson (~chatzilla@c-68-58-243-29.hsd1.sc.comcast.net) Quit (Ping timeout: 480 seconds)
[3:53] * nwat (~Adium@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[3:53] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[3:54] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[3:55] * DarkAce-Z (~BillyMays@ has joined #ceph
[3:58] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[3:59] * DarkAceZ (~BillyMays@ Quit (Ping timeout: 480 seconds)
[4:04] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[4:04] * markbby (~Adium@ Quit (Remote host closed the connection)
[4:17] * portante is now known as portante|afk
[4:26] * houkouonchi-home (~linux@pool-108-38-63-48.lsanca.fios.verizon.net) Quit (Remote host closed the connection)
[4:26] * oddomatik (~Adium@ has left #ceph
[4:27] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:31] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[4:31] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[4:37] * DarkAce-Z is now known as DarkAceZ
[4:39] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[4:44] * rongze (~zhu@173-252-252-212.genericreverse.com) has joined #ceph
[4:51] * haomaiwang (~haomaiwan@ Quit (Read error: Connection reset by peer)
[4:54] * haomaiwang (~haomaiwan@ has joined #ceph
[5:00] * fireD1 (~fireD@93-142-245-29.adsl.net.t-com.hr) has joined #ceph
[5:05] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[5:06] * fireD (~fireD@93-139-180-220.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:11] * haomaiwang (~haomaiwan@ Quit (Remote host closed the connection)
[5:11] * haomaiwang (~haomaiwan@ has joined #ceph
[5:35] * horsey (~horsey@ has joined #ceph
[5:40] * Tv (~tv@pool-108-13-115-92.lsanca.fios.verizon.net) Quit (Read error: Operation timed out)
[5:40] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[5:54] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[5:57] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has left #ceph
[5:57] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Read error: Operation timed out)
[5:58] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[6:00] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[6:06] * julian (~julianwa@ has joined #ceph
[6:24] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[6:24] * ChanServ sets mode +v andreask
[6:25] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has left #ceph
[6:39] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:51] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:53] * haomaiwang (~haomaiwan@ Quit (Ping timeout: 480 seconds)
[7:04] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[7:07] * madkiss (~madkiss@115-239.197-178.cust.bluewin.ch) has joined #ceph
[7:12] * zapotah_ (~zapotah@dsl-hkibrasgw2-50dfdb-234.dhcp.inet.fi) has joined #ceph
[7:13] * zapotah (~zapotah@dsl-hkibrasgw2-50dfdb-234.dhcp.inet.fi) Quit (Ping timeout: 480 seconds)
[7:16] * AfC (~andrew@2001:44b8:31cb:d400:bd30:1c22:a4b9:1e8b) has joined #ceph
[7:17] * madkiss (~madkiss@115-239.197-178.cust.bluewin.ch) Quit (Quit: Leaving.)
[7:21] * AfC (~andrew@2001:44b8:31cb:d400:bd30:1c22:a4b9:1e8b) Quit (Quit: Leaving.)
[7:22] * AfC (~andrew@2001:44b8:31cb:d400:bd30:1c22:a4b9:1e8b) has joined #ceph
[7:30] * xmltok (~xmltok@pool101.bizrate.com) Quit (Read error: Connection reset by peer)
[7:45] * AfC (~andrew@2001:44b8:31cb:d400:bd30:1c22:a4b9:1e8b) Quit (Quit: Leaving.)
[7:45] <BillK> are there any problems going back from 61.4 to 61.3? - 61.4 is too unreliable ... just lost two more OSDs that can't be recovered until I reboot.
[7:47] * madkiss (~madkiss@ has joined #ceph
[7:58] * haomaiwang (~haomaiwan@ has joined #ceph
[8:10] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[8:11] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[8:12] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[8:35] * zapotah_ (~zapotah@dsl-hkibrasgw2-50dfdb-234.dhcp.inet.fi) Quit (Ping timeout: 480 seconds)
[8:38] * Machske2 (~bram@d5152D8A3.static.telenet.be) has joined #ceph
[8:40] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[8:40] * Vjarjadian (~IceChat77@ Quit (Quit: A fine is a tax for doing wrong. A tax is a fine for doing well)
[8:40] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit ()
[8:56] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[8:58] * mxmln3 (~maximilia@ Quit (Read error: Connection reset by peer)
[8:59] * mxmln (~maximilia@ has joined #ceph
[9:09] * bergerx_ (~bekir@ has joined #ceph
[9:09] * BManojlovic (~steki@ has joined #ceph
[9:16] * sleinen (~Adium@2001:620:0:25:9174:3e77:a814:67fb) has joined #ceph
[9:20] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[9:30] * haomaiwa_ (~haomaiwan@ has joined #ceph
[9:36] * rongze (~zhu@173-252-252-212.genericreverse.com) Quit (Ping timeout: 480 seconds)
[9:36] * iii8 (~Miranda@ Quit (Read error: Connection reset by peer)
[9:37] * haomaiwang (~haomaiwan@ Quit (Ping timeout: 480 seconds)
[9:43] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[9:53] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:55] * LeaChim (~LeaChim@ has joined #ceph
[10:00] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:01] * AfC (~andrew@2001:44b8:31cb:d400:f182:1216:1454:7dfe) has joined #ceph
[10:05] <tnt> Huh ... that's strange: health HEALTH_WARN recovery recovering 0 o/s, 171KB/s but "pgmap v464583: 1600 pgs: 1600 active+clean; 2048 MB data, 2860 MB used, 99489 MB / 102350 MB avail; recovering 0 o/s, 171KB/s"
[10:07] * Dennis3 (~Dennis@bzq-82-80-183-50.red.bezeqint.net) has joined #ceph
[10:07] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[10:07] <Dennis3> Hey
[10:08] <Dennis3> I have a question about Ceph. After installing Ceph on a few nodes, which package do I need to install on the client node in order to mount the drive locally?
[10:24] <wogri_risc> also ceph.
[10:25] * julian (~julianwa@ Quit (Ping timeout: 480 seconds)
[10:25] <wogri_risc> Dennis3: depends on what you mean by 'mount' - do you want to mount cephfs or RBD volumes?
[10:26] <wogri_risc> anyway, you will need the basic ceph packages.
[10:29] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has left #ceph
[10:30] * fridudad (~oftc-webi@fw-office.allied-internet.ag) has joined #ceph
[10:30] <john_> ceph-common should work.
[10:31] * ScOut3R (~ScOut3R@dsl51B6D44C.pool.t-online.hu) has joined #ceph
[10:32] * leseb1 (~Adium@2a04:2500:0:d00:dd6:c775:3621:8cd3) has joined #ceph
[10:35] <Dennis3> I tried installing ceph-common on ubuntu
[10:35] <Dennis3> It didn't give me the mount.ceph command
[10:36] <Dennis3> I prefer CentOS but when I try using Ceph-Deploy I get a message that I have no modules installed, couldn't find a solution for it so I installed Ubuntu servers instead.
[10:36] <Dennis3> wogri_risc, I honestly do not know the difference between cephfs and an RBD volume
[10:41] * zhangjf_zz2 (~zjfhappy@ has joined #ceph
[10:42] * ScOut3R (~ScOut3R@dsl51B6D44C.pool.t-online.hu) Quit (Ping timeout: 480 seconds)
[10:48] * tziOm (~bjornar@ has joined #ceph
[10:55] * haomaiwa_ (~haomaiwan@ Quit (Remote host closed the connection)
[10:57] * haomaiwang (~haomaiwan@li565-182.members.linode.com) has joined #ceph
[11:00] * mikedawson (~chatzilla@c-68-58-243-29.hsd1.sc.comcast.net) has joined #ceph
[11:01] * psomas (~psomas@inferno.cc.ece.ntua.gr) has joined #ceph
[11:03] * julian (~julianwa@ has joined #ceph
[11:06] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[11:07] <wogri> Dennis3: rbd is like USB disks for your servers. cephfs is a shared and distributed filesystem.
[11:07] * madkiss (~madkiss@ Quit (Quit: Leaving.)
[11:07] <wogri> you have to format the "usb disks" yourself, and it's considered production quality, whereas the distributed filesystem is in a rather experimental state.
[11:13] <Dennis3> I see. Thank you for the explanation. So if I want to mount an RBD volume on a client OS, what should I do?
[11:18] <wogri> first of all you need to decide how you want to mount the rbd volume. KVM+RBD play very nicely together; in that case you need nothing special, except for maybe a recent KVM version.
[11:19] <wogri> if you want to mount the RBD volume directly on your host, it is recommended to run the latest linux kernel (3.10) - as it understands all of the latest and greatest features of ceph.
[11:19] <wogri> then you need ceph-common, as john_ stated, and an RBD mapping (man rbd)
[11:20] <wogri> with that mapping you get a physical device, like /dev/rbd/pool/yourvolume
[11:20] <wogri> and you do whatever you want with it.
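The steps wogri describes might look like this on a client; "myvolume" is an example image name in the default "rbd" pool, and the mkfs/mount targets are illustrative:

```shell
# Create a 10 GB image in the default "rbd" pool:
rbd create myvolume --size 10240
# Map it on this host (needs the rbd kernel module loaded):
rbd map myvolume
# The mapping appears as a block device like /dev/rbd/<pool>/<image>:
mkfs.ext4 /dev/rbd/rbd/myvolume
mount /dev/rbd/rbd/myvolume /mnt
```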
[11:22] * haomaiwang (~haomaiwan@li565-182.members.linode.com) Quit (Remote host closed the connection)
[11:23] * haomaiwang (~haomaiwan@ has joined #ceph
[11:32] * mschiff (~mschiff@tmo-111-22.customers.d1-online.com) has joined #ceph
[11:40] * TiCPU (~jeromepou@190-130.cgocable.ca) Quit (Quit: Ex-Chat)
[11:48] <tnt> Does anyone know if stable pg merging support is scheduled for any time soon ?
[11:49] <Dennis3> I see, thank you wogri, I understand now
[11:49] <Dennis3> Thanks for the help
[11:52] * horsey (~horsey@ Quit (Quit: Lost terminal)
[11:57] * haomaiwang (~haomaiwan@ Quit (Remote host closed the connection)
[12:05] * haomaiwang (~haomaiwan@ has joined #ceph
[12:09] * madkiss (~madkiss@ has joined #ceph
[12:12] * Dennis3 (~Dennis@bzq-82-80-183-50.red.bezeqint.net) Quit (Quit: Leaving)
[12:14] * stacker666 (~stacker66@ Quit (Ping timeout: 480 seconds)
[12:20] * mschiff (~mschiff@tmo-111-22.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[12:26] * stacker666 (~stacker66@94.pool85-61-185.dynamic.orange.es) has joined #ceph
[12:28] * markbby (~Adium@ has joined #ceph
[12:33] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) has joined #ceph
[12:40] * markbby (~Adium@ Quit (Remote host closed the connection)
[12:42] <Machske2> Can anyone tell me if there is a specific reason to manage your rbd images in different pools ? Or that it's bad practice to keep all your rbd images in the rbd pool ? Performance wise maybe ?
[12:42] <tnt> Machske2: you can set different replication levels in different pools.
[12:42] <tnt> Also if you have several different types of OSD (like a bunch with 7.2k drives and a bunch with 15k drives or SSD), you can control placement per pool
[12:43] <Machske2> well that's something I did not consider
[12:43] <Machske2> but very interesting
[12:43] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[12:43] <wogri_risc> be aware that too many pools are not very good for ceph - performance-wise.
[12:45] * haomaiwa_ (~haomaiwan@ has joined #ceph
[12:46] <Machske2> what is "too many", probably in relation to your cluster size?
[12:46] <Machske2> for example in a 10 node setup, are 10 pools too many?
[12:47] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[12:49] * haomaiwang (~haomaiwan@ Quit (Ping timeout: 480 seconds)
[12:52] <wogri_risc> well, too many is very subjective. ceph cares about the number of placement groups. a pool occupies a definable number of placement groups.
[12:52] <wogri_risc> if those placement groups grow very big, ceph is using more cpu, memory and so on.
[12:52] <wogri_risc> I create pools per "failure domain"
[12:52] <wogri_risc> so we have a pool that has to be available all the time. we set the repl size to three.
[12:53] <wogri_risc> we have another pool, that is less important. repl size = 2
[12:53] <wogri_risc> you get my point?
[12:53] <wogri_risc> you can also play with crushmaps to create pools that are very fast (SSD only), or pools that only reside in a specific rack. it really depends on your usage.
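The per-failure-domain pool setup wogri_risc describes can be sketched with the standard pool commands; pool names and PG counts here are examples, not recommendations:

```shell
# Pool for data that must stay available: 3 replicas.
ceph osd pool create critical 256      # 256 placement groups
ceph osd pool set critical size 3
# Pool for less important data: 2 replicas is enough.
ceph osd pool create scratch 128
ceph osd pool set scratch size 2
```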
[13:03] * portante|afk is now known as portante
[13:05] * masterpe (~masterpe@2a01:670:400::43) Quit (Ping timeout: 480 seconds)
[13:14] * ScOut3R (~ScOut3R@gprsc2b0e2c7.pool.t-umts.hu) has joined #ceph
[13:18] <madkiss> is anybody aware of a smart solution to provide FTP- or CIFS-access to RADOS? I was wondering whether there is some sort of FTP-S3-Gateway that would allow to use the Rados gateway.
[13:19] <joelio> madkiss: why would you want to do that? It could be done, mind.
[13:20] <joelio> but if you need to use s3, use an s3 client?
[13:20] <madkiss> joelio: let's assume for a moment that S3, which would be the really correct way to do this, is not available
[13:20] <madkiss> in fact, I need an FTP front-end for RADOS; the most obvious way would be rbd+fs+ftp, but that doesn't make a smart impression
[13:22] <joelio> why would s3 not be available if you are using a RAODS gw?
[13:22] <joelio> I just don't follow I'm afraid
[13:23] <joelio> if you say you want clients only to use FTP, then I understand more, but there are no FTP-S3 gateways afaik.
[13:23] <brother> Using FTP for something that isn't filesystem like seems strange
[13:23] * ScOut3R (~ScOut3R@gprsc2b0e2c7.pool.t-umts.hu) Quit (Ping timeout: 480 seconds)
[13:24] <joelio> yea, put sftp on top of cephfs or better rbd imho
[13:24] <joelio> don't quite follow how this isn't smart. It's like saying you want an FTP interface to your RAID array
[13:26] <joelio> or just use cyberduck or some s3 client straight to RADOS :)
[13:26] <madkiss> well obviously the not-so-smart part of this is that it creates a bottleneck where radosgw wouldn't.
[13:27] <joelio> it depends how you design it
[13:27] <joelio> you may only have one RADOS gw, and that may be your bottleneck, just like in an FTP gw
[13:27] <joelio> add more and load balance and it's not
[13:27] <joelio> just like doing cephfs/rbd etc.
[13:28] <madkiss> if I have a Ceph cluster, and there is an RBD that comes out of it, and on top of that, I have a file system and an FTP server, how would that not be a bottleneck? :)
[13:29] <madkiss> the requirement here is that the actual application that's feeding data to the setup can do CIFS or FTP, but no S3, at the moment.
[13:29] <brother> madkiss: What problem are you trying to solve?
[13:29] <madkiss> brother: uploading files to Ceph via CIFS or FTP. and I was looking for a smart way to do that than the obvious RBD-backed solution.
[13:31] <brother> I am not sure what you mean by "uploading files" if you want to bypass an actual filesystem layer
[13:32] <madkiss> hu?
[13:33] <madkiss> I need to store files in Ceph, and I need to do it via FTP or S3, so I was wondering whether there is something that translates between S3 and FTP so that on the Ceph side, I can actually use the RADOS Gateway. If there is no such thing, it will all boil down to RBD anyway.
[13:33] * infinitytrapdoor (~infinityt@ has joined #ceph
[13:34] * portante is now known as portante|afk
[13:34] <joelio> madkiss: if you have a ceph cluster and one radosgw, and let's suppose that did ftp translation magically, you would still need an FTP gateway, right? As you can't talk to s3 directly. Surely that's a bottleneck
[13:35] <madkiss> ouch. of course, you're right.
[13:35] <darkfaded> he could have 200 ftp gateways behind 10 bigip clusters ;)
[13:35] <joelio> you have bottlenecks based on design, not on the type of technology
[13:35] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) Quit (Read error: Connection reset by peer)
[13:38] <madkiss> joelio: well, sort of. 10 RADOS Gateways with FTP-backends and a load-balancer would still allow distributing the traffic over 10 rados gateways. with the standard RBD-backed solution, all traffic actually terminates at one destination (simply because there is no smart way to have more than one RBD-head in such a solution)
[13:38] <joelio> but what you're describing to me is the requirement to access a clustered file system
[13:39] <joelio> I don't know about your app requirements
[13:39] <joelio> whether they all need to write into the same datastore
[13:39] <joelio> or can write to independent (therefore individual) dbs
[13:40] <madkiss> the clustered file system would be one possible way to go, yeah. one that I would actually want to avoid under any circumstance
[13:41] <joelio> does the app *need* to be all in one big datastore? if not split up the apps to write to different rbd backed stores
[13:42] <joelio> much like when you do a query on a search engine.. map/reduce
[13:43] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[13:44] <madkiss> yeah.
[13:44] <madkiss> i'll see if that can work.
[13:44] <tchmnkyz> dmick: seems things got stable when i rebooted to the latest ubuntu kernel
[13:45] <tchmnkyz> went from 3.5.0-30 to 3.5.0-34
[13:47] * zhangjf_zz2 (~zjfhappy@ Quit (Quit: 离开)
[13:47] * portante|afk is now known as portante
[13:50] * mschiff (~mschiff@tmo-111-22.customers.d1-online.com) has joined #ceph
[14:00] <tchmnkyz> ok, personal op question: i have a few nodes in my cluster that have 2 OSDs in the same hardware, came from a hardware limitation on my raid controller. right now i use a single Intel 520 SSD for the OS/journal for my ceph cluster. should i move the journal to its own SSD and separate it from the OS SSD, or is it not that big of a deal? also, what about the OSDs where i have 2 OSDs on the same box, should i use 2 SSD in that for the journal one sep SSD fo
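One way to point each OSD's journal at a dedicated SSD is a per-OSD journal path in ceph.conf; this is only a sketch with illustrative device names, not a statement about the Intel 520 setup above:

```
; ceph.conf fragment: per-OSD journals on dedicated SSD partitions
; (partition labels below are hypothetical)
[osd.0]
    osd journal = /dev/disk/by-partlabel/journal-osd0
[osd.1]
    osd journal = /dev/disk/by-partlabel/journal-osd1
```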
[14:05] * haomaiwang (~haomaiwan@ has joined #ceph
[14:12] * haomaiwa_ (~haomaiwan@ Quit (Ping timeout: 480 seconds)
[14:18] * portante is now known as portante|afk
[14:20] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:25] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:28] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[14:31] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[14:33] * deadsimple (~infinityt@ has joined #ceph
[14:33] * scuttlemonkey (~scuttlemo@24-180-198-92.dhcp.aldl.mi.charter.com) has joined #ceph
[14:33] * ChanServ sets mode +o scuttlemonkey
[14:35] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[14:35] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) has joined #ceph
[14:37] * infinitytrapdoor (~infinityt@ Quit (Read error: Operation timed out)
[14:41] * yeled (~yeled@spodder.com) Quit (Quit: meh..)
[14:43] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) Quit (Ping timeout: 480 seconds)
[14:44] * yeled (~yeled@spodder.com) has joined #ceph
[14:48] * markbby (~Adium@ has joined #ceph
[14:50] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[14:50] * ChanServ sets mode +v andreask
[14:50] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has left #ceph
[14:53] * stacker666 (~stacker66@94.pool85-61-185.dynamic.orange.es) Quit (Ping timeout: 480 seconds)
[14:58] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) has joined #ceph
[14:58] * rongze (~zhu@173-252-252-212.genericreverse.com) has joined #ceph
[14:59] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[15:00] * scuttlemonkey (~scuttlemo@24-180-198-92.dhcp.aldl.mi.charter.com) Quit (Ping timeout: 480 seconds)
[15:00] * jebba (~aleph@2601:1:a300:8f:f2de:f1ff:fe69:6672) Quit (Quit: Leaving.)
[15:02] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[15:05] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[15:05] * deadsimple (~infinityt@ Quit (Read error: Connection reset by peer)
[15:05] * AfC (~andrew@2001:44b8:31cb:d400:f182:1216:1454:7dfe) Quit (Quit: Leaving.)
[15:05] * infinitytrapdoor (~infinityt@ has joined #ceph
[15:06] * AfC (~andrew@2001:44b8:31cb:d400:cd9a:8606:e302:ca1) has joined #ceph
[15:12] * AfC (~andrew@2001:44b8:31cb:d400:cd9a:8606:e302:ca1) Quit (Quit: Leaving.)
[15:20] * deadsimple (~infinityt@ has joined #ceph
[15:26] * jebba (~aleph@70-90-113-25-co.denver.hfc.comcastbusiness.net) has joined #ceph
[15:26] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Remote host closed the connection)
[15:27] * infinitytrapdoor (~infinityt@ Quit (Ping timeout: 480 seconds)
[15:34] * bram__ (~bram@d5152D8A3.static.telenet.be) has joined #ceph
[15:37] * john_barbee_ (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:38] * Machske2 (~bram@d5152D8A3.static.telenet.be) Quit (Ping timeout: 480 seconds)
[15:39] * Machske2 (~bram@d5152D8A3.static.telenet.be) has joined #ceph
[15:41] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[15:41] * john_barbee_ is now known as john_barbee
[15:42] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[15:42] * bram__ (~bram@d5152D8A3.static.telenet.be) Quit (Ping timeout: 480 seconds)
[15:43] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[15:47] * illya (~illya_hav@9-158-135-95.pool.ukrtel.net) has joined #ceph
[15:48] * portante|afk is now known as portante
[15:48] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[15:49] * madkiss1 (~madkiss@ has joined #ceph
[15:49] <illya> hi
[15:49] * madkiss (~madkiss@ Quit (Read error: No route to host)
[15:50] <illya> I have a cluster without an MDS
[15:50] <smiley> morning
[15:51] <illya> I'm trying to start one
[15:51] <illya> and getting next
[15:51] <illya> http://pastebin.com/yY2Emz60
[15:52] <illya> should I do any changes to config or any changes to auth to fix this ?
[15:53] * haomaiwang (~haomaiwan@ Quit (Remote host closed the connection)
[15:54] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 21.0/20130511120803])
[15:55] <joao> illya, looks like you're missing a keyring
[15:55] <joao> illya, have you created a key for that mds?
[15:56] <illya> I did nothing specific :(
[15:56] <illya> I deployed my cluster with https://github.com/ceph/ceph-cookbooks
[15:56] <illya> and there is a notice there that MDS is not finished
[15:57] * redeemed (~quassel@static-71-170-33-24.dllstx.fios.verizon.net) has joined #ceph
[15:57] <joao> illya, I'm not familiar enough with the cookbooks to help you with that
[15:57] <joao> but you should be able to add it manually though
[15:57] <illya> yes and I'm trying to do this
[15:58] <joao> http://ceph.com/docs/master/rados/deployment/ceph-deploy-mds/
[15:58] * scuttlemonkey (~scuttlemo@24-180-198-92.dhcp.aldl.mi.charter.com) has joined #ceph
[15:58] * ChanServ sets mode +o scuttlemonkey
[16:00] * bram__ (~bram@d5152D8A3.static.telenet.be) has joined #ceph
[16:06] * Machske2 (~bram@d5152D8A3.static.telenet.be) Quit (Ping timeout: 480 seconds)
[16:07] * KindTwo (KindOne@ has joined #ceph
[16:08] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[16:08] * KindTwo is now known as KindOne
[16:13] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) Quit (Read error: Operation timed out)
[16:14] * sel (~sel@ has joined #ceph
[16:17] * haomaiwa_ (~haomaiwan@ has joined #ceph
[16:18] * mjeanson (~mjeanson@00012705.user.oftc.net) Quit (Remote host closed the connection)
[16:19] * Dennis3 (~Dennis@ has joined #ceph
[16:20] * mjeanson (~mjeanson@bell.multivax.ca) has joined #ceph
[16:20] <illya> joao: solved it by next
[16:20] <illya> added to config
[16:20] <illya> [mds]
[16:20] <illya> mds data = /var/lib/ceph/mds/mds.$id
[16:20] <illya> keyring = /var/lib/ceph/mds/mds.$id/mds.$id.keyring
[16:21] <illya> ceph auth get-or-create mds.0 mds 'allow ' osd 'allow *' mon 'allow rwx' > /var/lib/ceph/mds/mds.0/mds.0.keyring
[16:21] <illya> ceph-mds -i 0
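illya's ceph.conf addition above, written out as a fragment (paths exactly as pasted; note the mds data directory has to exist before the keyring redirect and `ceph-mds -i 0` will succeed):

```ini
[mds]
    ; per-daemon data dir and keyring, as added by illya
    mds data = /var/lib/ceph/mds/mds.$id
    keyring = /var/lib/ceph/mds/mds.$id/mds.$id.keyring
```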
[16:21] * PerlStalker (~PerlStalk@ has joined #ceph
[16:21] <illya> thx
[16:27] <illya> any keyring should be generated for cephfs ?
[16:30] <illya> I'm getting next
[16:30] <illya> 2013-07-03 14:30:16.924758 7fe46ad9e700 0 cephx server client.admin: unexpected key: req.key=559b1a39288d591c expected_key=ae36a9decde86e6c
[16:31] <illya> for command
[16:31] <illya> mount -t ceph /opt/ceph -vv -o name=admin,secret=AQBvZc1RUETnHxAAsAukDiFFnuFCKpqmuMhcNA==
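As pasted, the mount command has no device argument; the kernel client normally expects a monitor address and a CephFS path before the mount point. A sketch with a hypothetical monitor address (`secretfile=` keeps the key out of the process list):

```
# is a hypothetical monitor; 6789 is the default mon port
mount -t ceph /opt/ceph -vv \
    -o name=admin,secretfile=/etc/ceph/admin.secret
```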
[16:31] * haomaiwa_ (~haomaiwan@ Quit (Remote host closed the connection)
[16:32] * haomaiwang (~haomaiwan@ has joined #ceph
[16:35] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[16:36] * sel (~sel@ Quit (Quit: Leaving)
[16:42] * leseb1 (~Adium@2a04:2500:0:d00:dd6:c775:3621:8cd3) Quit (Ping timeout: 480 seconds)
[17:00] * madkiss1 (~madkiss@ Quit (Quit: Leaving.)
[17:01] * bram__ (~bram@d5152D8A3.static.telenet.be) Quit (Read error: Connection reset by peer)
[17:01] * Machske2 (~bram@d5152D8A3.static.telenet.be) has joined #ceph
[17:03] * deadsimple (~infinityt@ Quit (Read error: Operation timed out)
[17:04] <tnt> Argh ... I have the same problem that I had on a test cluster on my prod cluster.
[17:04] <tnt> http://pastebin.com/HCN2b3bn
[17:05] <tnt> It says "recovering" ... but there is nothing to recover from , all PGs are active+clean
[17:06] * Dennis3 (~Dennis@ Quit (Quit: Leaving)
[17:06] <tnt> mmm, it seems to have resolved itself, but still, why did it happen at all.
[17:08] * leseb (~Adium@2a04:2500:0:d00:907d:34da:b3c7:fac9) has joined #ceph
[17:11] * leseb (~Adium@2a04:2500:0:d00:907d:34da:b3c7:fac9) Quit ()
[17:11] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[17:14] * mschiff (~mschiff@tmo-111-22.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[17:20] * mschiff (~mschiff@tmo-111-22.customers.d1-online.com) has joined #ceph
[17:20] * joao (~JL@ Quit (Read error: Connection reset by peer)
[17:20] * joao (~JL@ has joined #ceph
[17:20] * ChanServ sets mode +o joao
[17:24] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[17:28] * nwat (~nwatkins@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[17:30] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[17:31] * bergerx_ (~bekir@ Quit (Quit: Leaving.)
[17:31] * rongze (~zhu@173-252-252-212.genericreverse.com) Quit (Ping timeout: 480 seconds)
[17:32] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[17:42] * joshd1 (~joshd@2602:306:c5db:310:6996:4df7:648d:7b25) has joined #ceph
[17:42] * nhm (~nhm@184-97-193-106.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[17:42] * Machske2 (~bram@d5152D8A3.static.telenet.be) Quit (Quit: Leaving)
[17:43] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:46] <infernix> if I want to decommission an entire host, should I 'ceph osd out X' one at a time, or just all the OSDs on that host at once?
[17:51] <infernix> logic tells me that I should do them all at once so that ceph won't needlessly relocate data to the same host
[17:51] * infernix tries
[17:53] <infernix> that doesn't make ceph happy
[17:55] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:55] * tnt (~tnt@ has joined #ceph
[17:55] <infernix> mon.0 [INF] pgmap v8637602: 3128 pgs: 1937 active+clean, 686 active+remapped+wait_backfill, 1 active+degraded+wait_backfill, 1 active+recovery_wait, 352 active+remapped+backfilling, 122 active+degraded+remapped+wait_backfill, 1 active+recovery_wait+remapped, 28 active+degraded+remapped+backfilling; 19449 GB data, 39035 GB used, 125 TB / 163 TB avail; 1959716/11670416 degraded (16.792%)
[17:56] <infernix> that looks OK to me now, no more incompletes
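infernix's approach above (marking all of a host's OSDs out in one pass, so data relocates only once) can be sketched as a dry run that just prints the commands rather than running them; the OSD ids are hypothetical:

```shell
# Dry run: print the commands that would mark every OSD on one host
# out in a single pass. OSD ids 24-35 are hypothetical examples.
host_osds="24 25 26 27 28 29 30 31 32 33 34 35"
for id in $host_osds; do
    echo "ceph osd out $id"
done
```

Dropping the `echo` would run the real commands, after which backfill status can be watched with `ceph -w` as in the paste above.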
[18:07] * nwat (~nwatkins@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[18:11] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[18:19] * mikedawson_ (~chatzilla@c-68-58-243-29.hsd1.sc.comcast.net) has joined #ceph
[18:19] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[18:22] * mikedawson (~chatzilla@c-68-58-243-29.hsd1.sc.comcast.net) Quit (Ping timeout: 480 seconds)
[18:23] * mikedawson_ is now known as mikedawson
[18:23] * madkiss (~madkiss@136-239.197-178.cust.bluewin.ch) has joined #ceph
[18:29] * sleinen (~Adium@2001:620:0:25:9174:3e77:a814:67fb) Quit (Quit: Leaving.)
[18:32] * infinitytrapdoor (~infinityt@ip-109-41-97-196.web.vodafone.de) has joined #ceph
[18:33] * rturk-away is now known as rturk
[18:39] * nwat (~nwatkins@eduroam-226-128.ucsc.edu) has joined #ceph
[18:40] * illya (~illya_hav@9-158-135-95.pool.ukrtel.net) has left #ceph
[18:40] * infinitytrapdoor (~infinityt@ip-109-41-97-196.web.vodafone.de) Quit (Ping timeout: 480 seconds)
[18:43] * mschiff (~mschiff@tmo-111-22.customers.d1-online.com) Quit (Remote host closed the connection)
[18:43] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[18:44] * unit3 (~Unit3@ has joined #ceph
[18:46] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[18:46] * ChanServ sets mode +v andreask
[18:48] <smiley> Anyone know how I set a bucket to 'public' so that anything I place in that bucket later will also be public?
[18:48] * portante is now known as portante|afk
[18:51] <smiley> I don't have an issue setting buckets and objects as public using dragon disk...but it does not seem to apply to new objects I place in any of the folders
[18:51] <grepory> is drbd still the multi-datacenter solution?
[18:54] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[19:00] * haomaiwa_ (~haomaiwan@ has joined #ceph
[19:00] * haomaiwang (~haomaiwan@ Quit (Read error: Connection reset by peer)
[19:08] * sjusthm (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[19:15] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[19:16] <grepory> or is multi-datacenter in the roadmap?
[19:17] * mschiff (~mschiff@ has joined #ceph
[19:17] <andreask> grepory: http://goo.gl/Ti0TE
[19:17] <grepory> andreask: thank you
[19:17] <andreask> yw
[19:18] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:19] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) has joined #ceph
[19:19] <smiley> http://www.youtube.com/watch?v=DH42dB6cbu8#at=1225
[19:26] * mschiff (~mschiff@ Quit (Remote host closed the connection)
[19:34] * haomaiwa_ (~haomaiwan@ Quit (Remote host closed the connection)
[19:35] * rturk is now known as rturk-away
[19:36] * sleinen (~Adium@2001:620:0:26:945a:9715:3b19:a704) has joined #ceph
[19:45] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[19:46] * mschiff (~mschiff@ has joined #ceph
[19:54] <unit3> Hey all. Just getting started with ceph, installed on my first node using ceph-deploy, seemed to go ok. Trying to add a monitor to my second node, and it just goes to stack trace hell. Suggestions on how to proceed? http://pastebin.com/jKcYtUKn
[19:56] * mschiff_ (~mschiff@ has joined #ceph
[19:56] * mschiff (~mschiff@ Quit (Read error: Connection reset by peer)
[19:56] <andreask> hmm ... did you use the correct hostname?
[19:57] <unit3> Yeah, see pastebin, host resolves and I can ssh to it fine.
[19:57] <unit3> key based auth is setup, as is sudo, ready to go.
[19:57] <andreask> what is the result of uname -n?
[19:58] <unit3> "c2"
[19:58] <andreask> hm
[19:58] <unit3> Boxes are configured identically using saltstack.
[19:58] <unit3> so I'm very surprised it worked for the first node and not the second.
[19:58] <unit3> baffled, even.
[19:58] <unit3> and the python stack dump isn't exactly informative. ;)
[19:59] * markbby (~Adium@ Quit (Quit: Leaving.)
[19:59] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Read error: Connection reset by peer)
[19:59] <andreask> well ... this is strange: sudo: no tty present and no askpass program specified
[19:59] <unit3> yeah. I don't get it. I can ssh to the box and use sudo, it works fine.
[20:00] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[20:00] <unit3> so I'm not sure how ceph is calling ssh differently.
[20:01] * markbby (~Adium@ has joined #ceph
[20:01] <unit3> ceph-deploy -v doesn't actually tell me what it's calling, either. It just tells me it's "Deploying mon, cluster ceph hosts c2", which is correct.
[20:03] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[20:03] <andreask> strange
[20:03] * nwat (~nwatkins@eduroam-226-128.ucsc.edu) Quit (Read error: Connection reset by peer)
[20:03] * nwat (~nwatkins@eduroam-226-128.ucsc.edu) has joined #ceph
[20:05] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[20:05] <unit3> yeah.
[20:06] <andreask> you did "ceph deploy" and "ceph new" and that worked?
[20:07] <unit3> Didn't do ceph-deploy, since I just installed the packages from the PPA on all systems using salt. I did ceph new on the first system, it was fine. Didn't do new again before trying to add the second system because I'm just trying to add to the existing cluster, not create a new cluster.
[20:07] <andreask> meant ceph-deploy install ... ceph-deploy new
[20:08] * sleinen (~Adium@2001:620:0:26:945a:9715:3b19:a704) Quit (Quit: Leaving.)
[20:08] <andreask> hmm
[20:08] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[20:08] <unit3> yeah, that's what I figured you meant. ;)
[20:08] <unit3> so yeah, packages exist on target host. source host is running mon, mds, osd.
[20:09] * markbby (~Adium@ Quit (Quit: Leaving.)
[20:11] <andreask> configuration is already there?
[20:11] * markbby (~Adium@ has joined #ceph
[20:11] <unit3> nope. ceph-deploy config push fails with the same stack trace and sudo tty missing error.
[20:13] <unit3> also, to be clear, this is 0.61.4-1raring packages.
[20:15] * portante|afk is now known as portante
[20:16] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[20:30] <unit3> ceph-deploy doesn't expect passwordless sudo, does it?
[20:32] <dmick> unit3: http://ceph.com/docs/master/rados/deployment/preflight-checklist/#create-a-user
[20:32] <dmick> a: yes, absolutely. no place to enter a password
[20:32] <dmick> indeed, no tty
[20:34] <unit3> uhhhhh ok. so no using ceph-deploy in any production situation then, since that's super insecure. I guess I'll just have to figure out all the manual configuration. Too bad, ceph-deploy seemed like it'd be handy.
[20:35] * fireD1 (~fireD@93-142-245-29.adsl.net.t-com.hr) has left #ceph
[20:35] * haomaiwang (~haomaiwan@ has joined #ceph
[20:35] * fireD1 (~fireD@93-142-245-29.adsl.net.t-com.hr) has joined #ceph
[20:35] <dmick> http://ceph.com/docs/master/rados/deployment/ intro paragraph
[20:36] * markbby1 (~Adium@ has joined #ceph
[20:36] * markbby (~Adium@ Quit (Remote host closed the connection)
[20:37] * houkouonchi-work (~linux@ Quit (Remote host closed the connection)
[20:37] <fireD1> are procudo your buddies? tell them to check their wordpress etc., so they don't get suspended, given that they're spamming on a large scale
[20:37] <fireD1> /home/procudoh/public_html/slobodni-programi.hr/wp/wp-admin/maint/index.php: hex.acDj9cqd.0.UNOFFICIAL FOUND
[20:37] <fireD1> /home/procudoh/public_html/slobodni-programi.hr/wp/wp-content/themes/twentyten/ini.php: PHP.Trojan.Spambot FOUND
[20:37] <fireD1> /home/procudoh/public_html/slobodni-programi.hr/wp/wp-content/themes/twentyten/images/index.php: hex.acDj9cqd.0.UNOFFICIAL FOUND
[20:37] <fireD1> /home/procudoh/public_html/slobodni-programi.hr/wp/wp-includes/SimplePie/Content/index.php: hex.acDj9cqd.0.UNOFFICIAL FOUND
[20:37] <unit3> Yeah. I guess I just expected it to do things with sensible defaults without too much trouble. Supporting sudo password prompts so it's not completely insecure didn't seem insane to me. ;)
[20:38] <unit3> Ahh well! just means I need to dig into the config of the services sooner than I thought. ;)
[20:38] <fireD1> wrong window ... sorry ;-)
[20:40] * doubleg (~doubleg@ Quit (Quit: Lost terminal)
[20:43] * haomaiwang (~haomaiwan@ Quit (Ping timeout: 480 seconds)
[20:49] * houkouonchi-work (~linux@ has joined #ceph
[20:59] <saaby> joao: here?
[21:01] * unit3 (~Unit3@ has left #ceph
[21:09] * John (~john@astound-64-85-225-33.ca.astound.net) Quit (Remote host closed the connection)
[21:13] <joao> saaby, am now
[21:19] <Machske> a tuning question about mds: the default inode cache is set at 100,000. I've been experiencing a serious slowdown in cephfs for two weeks; rbd and object performance are still ok. Cephfs seems to use 1,100,000 inodes atm. Would it help to increase the inode cache setting for the mds's?
[21:20] <gregaf> yes
[21:20] <saaby> joao: we are having problems...
[21:20] <saaby> just added 12 new servers to the cluster today..
[21:21] <saaby> adding them to the cluster went fine, but adding them to the production crushmap made all hell break loose
[21:21] <Machske> gregaf: would you go higher than the current inode count with some reserve, or just increase it up to 500,000 for example?
[21:21] <saaby> osd's are getting marked down, mons are unresponsive.. looks like what we saw earlier, before the fixes you backported
[21:21] <sagewk> use sse4.2 instructions for crc32c: https://github.com/ceph/ceph/pull/393 anyone want to review?
[21:21] <saaby> and then a few minutes ago 2/3 of all OSD's segfaulted.. :(
[21:22] <gregaf> Machske: it depends on your usage pattern, but in general I'd set it as high as your server can provide the memory for
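The change gregaf suggests is a one-line ceph.conf entry; 500,000 is Machske's example figure, not a recommendation, and (as gregaf notes) each cached inode historically cost around 1 KB, so the cache should be sized against the MDS host's available RAM:

```ini
[mds]
    ; default is 100000 inodes; roughly 1 KB each historically,
    ; so 500000 is on the order of 500 MB of cache
    mds cache size = 500000
```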
[21:23] <Machske> gregaf: any idea how much memory it would eat? for example if it were set to 500,000?
[21:23] <joao> saaby, can you pastebin the osds' stack traces?
[21:23] <gregaf> word of warning though, you're probably running into an inode leak (though you might be able to reset it by restarting your clients)
[21:23] <gregaf> once upon a time each inode was 1KB, but I dunno if that's still an up-to-date number or not
[21:23] <Machske> hmm sounds scary
[21:24] <Machske> so stopping all clients, unmount it, restart mds's and remount everything ?
[21:24] <saaby> joao: yes
[21:25] <gregaf> Machske: I'd try that if you can, yes
[21:26] <joao> also, saaby, some mon logs would be appreciated, just to check what's going on there
[21:26] <Machske> what is the relation of an inode on ceph fs -> 1 inode points virtually to an object ? How are cephfs inodes to be compared to normal inodes in a classic filesystem ?
[21:27] <Machske> any way to prevent inode leaks btw :)
[21:28] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[21:28] <gregaf> Ceph has an inode for each file in the system doing what they do in a standard fs; except it's basically constant size regardless of the size of the file because the file->object mapping is deterministic (so we don't need to store extents or blocks)
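The deterministic file-to-object mapping gregaf describes can be sketched roughly as follows — a simplification that assumes the default 4 MB object size and the usual `<inode-hex>.<objno-hex>` RADOS object naming, and ignores custom striping layouts:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # default CephFS object size (4 MiB)

def object_for_offset(ino, offset, object_size=OBJECT_SIZE):
    """Name of the RADOS object holding byte `offset` of inode `ino`.

    Because the mapping is a pure function of (ino, offset), the MDS
    never needs to store per-file extents or block lists.
    """
    objno = offset // object_size
    return "%x.%08x" % (ino, objno)

# Byte 0 and byte 5 MB of the same file land in consecutive objects.
print(object_for_offset(0x10000003dae, 0))            # 10000003dae.00000000
print(object_for_offset(0x10000003dae, 5 * 1024**2))  # 10000003dae.00000001
```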
[21:29] <Machske> hmm ok then it's definitely an inode leak :s, because it's only storing 324 files and folders
[21:29] <saaby> joao: ok, will send them. we are a bit stressed to restart everything, so please bear with us
[21:29] <Machske> though some of them are big files
[21:29] <saaby> nyerup: can you help with logs and segfault traces?
[21:29] * sleinen1 (~Adium@2001:620:0:26:900d:648b:7abb:433f) has joined #ceph
[21:30] <joao> saaby, sure, take your time :)
[21:30] <gregaf> there are two (we think) things causing the leak that are known right now — one is we aren't putting enough pressure from the MDS to the clients to drop inodes which aren't in use, so if your clients have a lot of memory and aren't using a lot of inodes they can keep them around in-memory for basically forever; the second is there appears to be an issue with deleted files not getting released that somebody will need to track down
[21:31] <saaby> joao: we are securing osd logfiles now, hopefully we have everything.. they completely filled up their logspace, so we have to delete logs to restart osds.
[21:31] <gregaf> Machske: at least the first, and maybe the second, can be fixed by restarting the clients (it doesn't need to be all at once)
[21:31] <Machske> ah ok
[21:31] <saaby> but there is probably a chance that we dont actually have the traces I think :(
[21:31] <Machske> so client by client is ok
[21:31] <Machske> I'll try that
[21:31] <gregaf> Machske: are you cycling through a lot of files? or using snapshots? because I haven't seen a number that high without a very large file count as well
[21:33] * markbby1 (~Adium@ Quit (Quit: Leaving.)
[21:33] <Machske> well, I'm using it to store some virtual disk images for windows environments, I know that's not the best way, but anyhow. Windows machines are, via xen, just reading and writing to it
[21:33] <Machske> no snapshots are being used
[21:34] <Machske> so the file count does not increase very often
[21:34] <Machske> but avg file size would be around 80GB
[21:34] <Machske> In total only 6 clients are connected, but 3 of them are up for about 180 days
[21:35] <gregaf> huh, that's fairly odd then
[21:35] <gregaf> where did you get the node count from?
[21:35] * haomaiwang (~haomaiwan@ has joined #ceph
[21:35] <gregaf> and I don't know much about Xen, but might it be doing some other file storage?
[21:35] <Machske> netxen-25303:~# df -i /cloudstore
[21:35] <Machske> Filesystem Inodes IUsed IFree IUse% Mounted on
[21:35] <Machske> ceph-fuse 1129091 - - - /cloudstore
[21:35] <Machske> netxen-25303:~#
[21:36] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[21:36] <nyerup> joao: Hey.
[21:37] <nyerup> joao: I've got a core dump and a tarball of /var/log/ceph/ on one of the data nodes with crashed OSDs.
[21:37] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[21:37] <Machske> files are just opened for read/write by a qemu proc so that Xen can use it as a virtual disk
[21:37] <nyerup> I'm just figuring out a way to provide these to you quickly. As in, not through my home DSL.
[21:38] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[21:38] * ChanServ sets mode +v andreask
[21:38] <joao> nyerup, cephdrop them?
[21:40] <gregaf> Machske: ah, I think that's actually reporting object count rather than inode count
[21:40] <nyerup> joao: Wat? :)
[21:40] <nyerup> joao: Oh. Thanks, got your PM.
[21:41] <joao> :)
[21:41] <Machske> gregaf: :)
[21:41] <Machske> that would explain it
[21:41] <saaby> nyerup: can you also send the logs from c16mon1 to joao?
[21:41] <gregaf> sagewk: it looks like we report the number of objects in the filesystem instead of a file count (for Client::statfs); any thoughts on that?
[21:42] <sagewk> hmm
[21:42] <Machske> because if you multiply that with 4MB, that would be about the amount of data on it
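The arithmetic behind Machske's check: 1,129,091 objects (the "IUsed" figure from df -i above) at the default 4 MiB each works out to roughly 4.3 TiB, which is object count rather than file count territory:

```python
objects = 1129091          # "IUsed" reported by df -i on the ceph-fuse mount
object_size = 4 * 1024**2  # default 4 MiB CephFS object size

total_bytes = objects * object_size
print("%.2f TiB" % (total_bytes / 1024**4))  # 4.31 TiB
```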
[21:42] <sagewk> yeah, that's not ideal.
[21:42] <sagewk> can take the rstat file count from the root inode instead
[21:42] * haomaiwang (~haomaiwan@ Quit (Read error: Operation timed out)
[21:42] <gregaf> Machske: you should be able to get some info out of the MDS that might provide a clue about why the performance is degrading, by using the admin socket
[21:42] <sagewk> or root of the mount..
[21:43] <paravoid> sagewk: can I downgrade from 0.65 to 0.64?
[21:43] <paravoid> sagewk: for #5460
[21:43] <paravoid> cluster isn't very happy with those OSDs out
[21:43] <gregaf> sagewk: yeah, that'd be my preference
[21:43] <gregaf> are we doing the same thing for the kclient, do you know?
[21:43] <sagewk> looking
[21:43] <sagewk> (also at your bug right now)
[21:43] <sagewk> really need to reproduce.. very strange.
[21:43] <Machske> gregaf: I'll have to dive into the docs to see how to do that :)
[21:44] <gregaf> sagewk: I suspect we are since it's filling in all the info from the monitor statfs message
[21:44] <nyerup> saaby: Sure thing.
[21:44] <Machske> but anyway, don't need to tune up the mds inode cache parameter then
[21:44] <nyerup> joao: OSD logs are up. OSD coredump uploading (4G).
[21:44] <sagewk> paravoid: osds can downgrade
[21:44] <paravoid> sagewk: ok, thanks
[21:45] <nyerup> saaby: f11mon1 was the one that got evicted – but you wanted logs from c16mon1, right?
[21:45] <joao> nyerup, transatlantic uploads can be slow :)
[21:45] <saaby> nyerup: yeah, but maybe both would be good?
[21:46] <joao> nyerup, saaby, if possible, get them both, just in case
[21:46] <saaby> the mons are being almost unresponsive from time to time..
[21:46] <joao> specially if one of them is the leader
[21:46] <saaby> c16 is the leader
[21:46] <joao> cool
[21:46] <saaby> f11 is getting evicted from the quorum from time to time
[21:46] <nyerup> joao, saaby: Coming right up.
[21:46] <nyerup> saaby: Okay.
[21:48] <gregaf> Machske: http://ceph.com/docs/master/dev/perf_counters/?highlight=admin%20socket#access, though that example is for an OSD and you'll want the MDS :)
[21:50] <loicd> what configuration option of radosgw defines the size of a librados object ? i.e. if a 1GB object is uploaded to radosgw, it will be sliced into N smaller objects of size XXX and I'm looking for the option that defines this XXX ;-)
[21:50] <nyerup> joao: Everything's up now: logs from the data node, the evicted mon, the leader mon, and a core dump from an OSD.
[21:51] <joao> nyerup, cool, thanks!
[21:51] <gregaf> loicd: I believe that's "rgw obj stripe size"
[21:51] <gregaf> just from looking through config_opts.h :)
[21:52] <loicd> gregaf: thanks :-)
[21:52] * erwan_taf is working for making something similar to http://www.spinics.net/lists/fio/msg02140.html for ceph benchmarking
[21:53] <erwan_taf> I wonder if anyone has interest in it
[21:53] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[21:54] <saaby> joao: this all looks a bit familiar.. we added 12 new servers earlier today, but it turned out that one of them had a network problem, so we shut down all osd's on that one, leaving them down,in. As far as I remember, having osd's down for an extended period on a busy cluster was a problem somehow. we hit that earlier too.
[21:54] <saaby> it looks as if performance has just degraded ever since we started having those osd's down.
[21:55] <joao> hmm, I'm not familiar enough with osd issues to know that unfortunately; sjust, any idea?
[21:56] <gregaf> erwan_taf: looks nifty, but how are you thinking you'll adjust it for ceph benching?
[21:56] <sagewk> paravoid: i have a patch for you to test, though... ? :)
[21:56] <erwan_taf> the rados benchmarking offer some outputs
[21:57] <erwan_taf> I'm thinking about runnin benchmarking with various settings
[21:57] <erwan_taf> and plot the result this way
[21:57] <gregaf> ah, coolio
[21:57] <Machske> gregaf: thx, seems to work, now trying to find what those numbers mean :)
[21:58] <erwan_taf> gregaf: thinking about showing scalability this way
[21:58] <gregaf> we don't have a lot that are super-helpful for the MDS :(, but I'd find the cache size one (forget what exactly it is), and things related to that
[21:59] * rturk-away is now known as rturk
[22:00] * joao (~JL@ has left #ceph
[22:00] * joao (~JL@ has joined #ceph
[22:00] * ChanServ sets mode +o joao
[22:02] <erwan_taf> I'll send my work on the ml once it will be ready
[22:03] <sagewk> nhm: around?
[22:03] <sagewk> paravoid: repushed (hopeful) fix to paravoid-test branch
[22:07] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[22:07] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:09] <saaby> joao: I am seeing very slow peering when the osd's get back up
[22:10] <paravoid> sagewk: fwiw, I'm pretty sure I haven't downgraded
[22:10] <paravoid> so if that's the hypothesis, it's wrong :)
[22:10] <paravoid> (waiting for gitbuilder)
[22:11] <saaby> joao: last time we talked about that being related to: http://tracker.ceph.com/issues/5232
[22:11] <saaby> could that still be the case?
[22:11] <sagewk> yeah, i wonder if there is a way that the new-format encoded osdmap got into the mon tho, which could maybe also explain it
[22:11] <sagewk> anyway, this will disambiguate
[22:12] <joao> saaby, I expect sagewk or sjust to know better than me
[22:12] <saaby> ok
[22:13] <Machske> gregaf: http://jsonviewer.stack.hu/ , nice easy json viewer :)
[22:14] <sjustlaptop> sagewk, joao: review on wip-5497?
[22:14] <sjustlaptop> should fix the feature nonsense
[22:14] <joao> sjustlaptop, looking
[22:14] <sjustlaptop> thanks
[22:17] <sagewk> sjustlaptop: looks good
[22:17] * kyle_ (~kyle@ Quit (Quit: Leaving)
[22:17] <sjustlaptop> thanks
[22:17] <joao> sjustlaptop, agree with sagewk
[22:17] <sjustlaptop> it'll need to go back to cuttlefish for upgrades to work right
[22:18] <sagewk> there is a mon intenral protocol switch for the cli/api changes anyway, so it shouldn't matter
[22:19] <sjustlaptop> ok
[22:20] <sjustlaptop> it does matter to the osd though
[22:20] <Machske> gregaf: parameter mds_data points to a location on the local fs: but that folder doesn't contain any data ?
[22:20] <gregaf> Machske: yeah, the only thing that goes there is the MDS' keyring (usually); it doesn't have any other local data
[22:21] <Machske> :)
[22:21] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[22:21] * kyle (~kyle@ has joined #ceph
[22:21] * kyle is now known as Guest2013
[22:23] * markbby (~Adium@ has joined #ceph
[22:34] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[22:34] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[22:35] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Read error: Connection reset by peer)
[22:37] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[22:37] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit ()
[22:37] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[22:40] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[22:41] * ScOut3R (~ScOut3R@dsl51B6D44C.pool.t-online.hu) has joined #ceph
[22:43] * vata (~vata@2607:fad8:4:6:3505:9b64:879e:bf93) Quit (Ping timeout: 480 seconds)
[22:48] <paravoid> sagewk: nope, same error
[22:48] <paravoid> let me debug-ms/osd
[22:48] <sagewk> k thanks
[22:48] <paravoid> 20/20, right?
[22:50] <sagewk> yeah
[22:50] <sagewk> also, can you include 'ceph osd getmap -o /tmp/osdmap'
[22:50] <sagewk> (the encoded map from the mon)
[22:53] <paravoid> sagewk: 5460-070302-ceph-osd.0.log.bz2 & 5460-osdmap
[22:53] <sagewk> thanks
[22:54] <paravoid> thank you
[22:54] <paravoid> so, I don't mind much just biting the bullet and upgrading everything with downtime
[22:55] <paravoid> the only reason I haven't done so is to help you debug this -- if you think it's something cluster-specific/not worth of your time just say so :)
[22:56] <sagewk> yeah, definitely want to nail this one down.. thanks for your patience!
[23:03] * grepory1 (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[23:03] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Read error: Connection reset by peer)
[23:04] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[23:04] * Qu310 (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) Quit (Read error: Connection reset by peer)
[23:05] * Qu310 (~Qten@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[23:05] <saaby> joao: just a quick update; we managed to stabilize the cluster again, and getting all osd's back up.
[23:06] <saaby> did you have time to look into the logs nyerup sent you?
[23:06] * sleinen1 (~Adium@2001:620:0:26:900d:648b:7abb:433f) Quit (Quit: Leaving.)
[23:06] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[23:08] * LeaChim (~LeaChim@ Quit (Read error: Connection reset by peer)
[23:08] * markbby (~Adium@ Quit (Ping timeout: 480 seconds)
[23:09] <sagewk> paravoid: aha, i see it.
[23:09] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[23:09] * LeaChim (~LeaChim@ has joined #ceph
[23:12] <sagewk> paravoid: in this case the bug was triggered earlier and the fix won't change things, so go ahead and upgrade everything. i've found the root cause
[23:12] <sagewk> paravoid: thanks again for the help!
[23:12] * Tamil (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[23:13] <paravoid> sagewk: if you're committing a fix I'll deploy that instead
[23:14] <sagewk> the fix won't change things for your cluster... the bug put a garbage value for the addr for the non-upgraded osds, and fixing it now won't change that.. only upgrading the rest will (or doing some workaround for this particular case)
[23:14] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[23:15] <sagewk> hmm, actually, i guess it'll prevent any new garbage entries. pushing to wip-5460, and i'll give you a list of the osds that need to be restarted before things are clean
[23:15] * mtanski (~mtanski@ has joined #ceph
[23:15] <sagewk> paravoid: 29 31 51 64 103 108 123
[23:16] <mtanski> Can one of you guys help me debug an issue with MDS
[23:16] <mtanski> I'm running into the same bug as this existing one: http://tracker.ceph.com/issues/5036
[23:16] <paravoid> sagewk: to confirm, these still run 0.61.3
[23:16] <mtanski> I've added relevant logs from MDS but I can't tell what to look for (when chasing the hang)
[23:16] <sagewk> paravoid: and actually just restarting those osds should do it
[23:17] <sagewk> yeah
[23:17] <sagewk> restarting them after the mons are upgraded to the new wip-5460.
[23:17] * brambles (lechuck@s0.barwen.ch) Quit (Ping timeout: 480 seconds)
[23:20] <paravoid> wait, so, 1) upgrade mons to wip-5460, 2) restart the list of osds you gave me, still at 0.61.3, 3) upgrade osds 0-24 (the 0.65 ones) to wip-5460 and start them 4) wait for it to settle, then proceed with the rest of the upgrade
[23:22] <gregaf> mtanski: what version of ceph? we've all fixed a bunch of things like this the last several months
[23:22] <joao> saaby, no, stepped away for dinner in the meantime
[23:22] <mtanski> 0.61.4
[23:22] <joao> saaby, glad to hear the cluster is stable again! :)
[23:22] <gregaf> damn
[23:24] <gregaf> mtanski: do you have full logs at that debugging level?
[23:24] <gregaf> basically we want to see how the inode got into that mix->sync state
[23:24] <gregaf> and haven't gotten logs of it happening
[23:25] <mtanski> no, i only turned on debugging in mds after we hit this issue
[23:26] <gregaf> so you saw it somewhere, then enabled debugging, and tried the ls again?
[23:27] * BillK (~BillK-OFT@124-169-221-120.dyn.iinet.net.au) has joined #ceph
[23:27] <gregaf> mtanski: can you grep for 10000003dae in the log and attach the output to http://tracker.ceph.com/issues/2019
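The grep gregaf asks for might look roughly like this. The log path is an assumption (a real cluster would use something like /var/log/ceph/ceph-mds.<id>.log), and the sample log here is synthetic so the command can be shown end to end.

```shell
# Pull every line mentioning the stuck inode (10000003dae) out of an MDS log
# so the result can be attached to the tracker issue. The path and log lines
# below are stand-ins for illustration only.
log=/tmp/mds.sample.log
printf '%s\n' \
  'mds.0 locker: inode 10000003dae filelock state mix->sync' \
  'mds.0 server: unrelated request' > "$log"
grep 10000003dae "$log" > /tmp/inode-10000003dae.log
cat /tmp/inode-10000003dae.log
```

The resulting file contains only the lines touching that inode, which is what the developers want attached to the bug.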
[23:27] * vata (~vata@2607:fad8:4:6:ac61:1176:88ff:d7d1) has joined #ceph
[23:27] <mtanski> 1) saw the issue (on a node) 2) replicated it on 3 other nodes 3) bounced mds 4) retried, still not fixed 5) turned on debugging
[23:28] <gregaf> hrm, I think usually restarting the MDS made it go away
[23:28] <gregaf> or else the replay logging just wasn't enough to show us how it got there but it still got stuck, which seems unlikely
[23:29] <saaby> joao: ok
[23:29] <mtanski> gracefully restarted too
[23:30] <saaby> joao: I just tried something now (probably shouldn't have..) - restarting a few osd's. - that immediately stalled the mons again, stalling all I/O to the cluster and stalling all mon commands e.g. "ceph osd tree".
[23:31] <saaby> is that relevant to track this down?
[23:31] * jebba (~aleph@70-90-113-25-co.denver.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[23:31] <mtanski> Updated the bug: http://tracker.ceph.com/issues/2019
[23:32] <saaby> so, now 3-4 minutes later, the mons are behaving again, and I/O continues.
[23:32] <gregaf> so the inode's first appearance is in that state :(
[23:32] <saaby> gotta' admit that I'm a bit worried about this..
[23:33] <saaby> btw. there is quite a bit of internal I/O on the cluster now, because of backfilling those new 144 OSDs.
[23:33] <saaby> that is probably related..
[23:38] <mtanski> Yeah, not sure how to get my client unstuck or even delete the file
[23:40] * dpippenger1 (~riven@tenant.pas.idealab.com) has joined #ceph
[23:40] <gregaf> mtanski: it's a hang on the internal per-file locks; that state should go away if you get rid of the clients that have touched that file and restart the MDS — so kill the clients, wait 5 minutes, and restarting the MDS I think should do it (not the most pleasant solution, I know)
[23:40] <gregaf> actually, wait, let me think that through again
[23:41] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Quit: Leaving.)
[23:42] <saaby> joao: I just noticed one more thing, after the backfilling of the new nodes started monitor stores are growing fast..! :(
[23:42] <sagewk> davidz: can you look at the 2 patches in wip-5460?
[23:43] <sagewk> paravoid: just 1 and 2 should make the osd failures stop. if not, ceph osd getmap -o /tmp/foo and send me that again so i can see what i missed
[23:43] <sagewk> then you can upgrade the rest at will
[23:43] * jebba (~aleph@2601:1:a300:8f:f2de:f1ff:fe69:6672) has joined #ceph
[23:45] <davidz> sagewk: ok
[23:45] <gregaf> mtanski: what are your clients? ceph-fuse or kclient?
[23:45] <mtanski> kernel clients
[23:46] <sagewk> hrm, it was in 0.61.4
[23:46] <sagewk> are the clients definitely running 0.61.4 ceph-fuse?
[23:46] <sagewk> ah
[23:46] <paravoid> okay, fingers crossed that next/wip-5460 works :)
[23:47] <gregaf> I thought we'd fixed this recently :) but it wasn't listed under the keywords I was using because it was a client revoke issue
[23:47] <sagewk> what kernel?
[23:47] * ShaunR (~ShaunR@staff.ndchost.com) has joined #ceph
[23:49] * rturk is now known as rturk-away
[23:49] <mtanski> gregaf: 3.10 vanilla
[23:51] * rturk-away is now known as rturk
[23:52] <sagewk> mtanski: osds are healthy? no 'slow request' warnings?
[23:52] <davidz> sagewk: wip-5460 looks good
[23:52] <sagewk> single mds?
[23:52] <mtanski> Single MDS and healthy OSDs
[23:52] <sagewk> davidz: great. hopefully paravoid's cluster agrees :)
[23:53] * rturk is now known as rturk-away
[23:54] <mtanski> I was actually going to deploy a second MDS today but I ran into this problem when changing our app's dev environment to use the existing ceph cluster
[23:55] <gregaf> sagewk: mtanski: haha, Zheng is apparently awake and watching the tracker; he thinks "libceph: call r_unsafe_callback when unsafe reply is received" should handle it
[23:55] <gregaf> that I think is not in 3.10, although I'm not familiar enough with the kernel caps handling to know if that's plausible or not
[23:56] * BManojlovic (~steki@fo-d- has joined #ceph
[23:56] <sagewk> it's the top patch of testing-next. hasn't been through a clean kernel qa run yet bc it was breaking rbd; about to test the fix there.
[23:57] <mtanski> Yeah, I struggled with understanding caps in the kernel as well when working on fscache
[23:57] <mtanski> But the clients that are having the hang are vanilla (non-fscache compiled) clients
[23:59] <mtanski> I see the processes that are trying to use the fs get stuck in the kernel waiting on io: ceph_mdsc_do_request
[23:59] <mtanski> at least that's what /proc/PID/stack tells me
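Checking a hung process's kernel stack, as mtanski describes, looks roughly like this. The PID would be whatever task is blocked on the ceph mount; here the current shell's PID is used just so the path resolves, and reading /proc/PID/stack may require root.

```shell
# Inspect where a process is blocked in the kernel. A kclient hang like the
# one above shows ceph_mdsc_do_request near the top of this stack trace.
pid=$$                      # substitute the PID of the stuck process
if [ -e "/proc/$pid/stack" ]; then
    echo "stack file present for pid $pid"
    cat "/proc/$pid/stack" 2>/dev/null | head -n 5
fi
```

If `ceph_mdsc_do_request` appears, the process is waiting on an MDS reply, which matches the symptom in issue 2019.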

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.