#ceph IRC Log

IRC Log for 2013-06-05

Timestamps are in GMT/BST.

[0:02] * mattbenjamin (~matt@aa2.linuxbox.com) Quit (Quit: Leaving.)
[0:08] * drokita1 (~drokita@199.255.228.128) Quit (Ping timeout: 480 seconds)
[0:15] * redeemed (~redeemed@static-71-170-33-24.dllstx.fios.verizon.net) Quit (Quit: bia)
[0:19] * Tamil (~tamil@38.122.20.226) Quit (Quit: Leaving.)
[0:20] * Tamil (~tamil@38.122.20.226) has joined #ceph
[0:28] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[0:31] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[0:32] * aliguori (~anthony@32.97.110.51) Quit (Remote host closed the connection)
[0:34] <phantomcircuit> Gugge-47527, on top of? no but im running ceph on top of zfs
[0:35] * loicd (~loic@brln-4d0cdd77.pool.mediaWays.net) Quit (Quit: Leaving.)
[0:40] * tziOm (~bjornar@ti0099a340-dhcp0745.bb.online.no) Quit (Remote host closed the connection)
[0:41] * portante (~user@66.187.233.206) Quit (Ping timeout: 480 seconds)
[0:44] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has left #ceph
[0:53] * mschiff (~mschiff@81.92.22.210) Quit (Remote host closed the connection)
[0:56] * dosaboy (~dosaboy@eth3.bismuth.canonical.com) Quit (Quit: leaving)
[0:58] * portante (~user@c-24-63-226-65.hsd1.ma.comcast.net) has joined #ceph
[0:58] * sagelap1 (~sage@76.89.177.113) has joined #ceph
[1:05] * sagelap (~sage@2600:1012:b00d:2a9c:a5b8:d69b:857f:ca26) Quit (Ping timeout: 480 seconds)
[1:07] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has joined #ceph
[1:07] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[1:12] * rturk is now known as rturk-away
[1:17] * LeaChim (~LeaChim@2.122.119.234) Quit (Ping timeout: 480 seconds)
[1:23] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) Quit (Read error: Connection reset by peer)
[1:23] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[1:33] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[1:40] * tnt (~tnt@91.176.13.220) Quit (Ping timeout: 480 seconds)
[1:45] <cjh_> anyone know what the largest ceph cluster created so far is?
[2:00] <iggy> cjh_: if I was to guess, I'd say DreamHost's
[2:05] * mattbenjamin (~matt@76-206-42-105.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[2:18] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[2:22] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has left #ceph
[2:28] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[2:42] <cjh_> iggy: how many hosts are running at dreamhost?
[2:43] <iggy> no clue, you just asked which was the largest
[2:43] <iggy> I think there might be some info out there about the deployment
[2:46] <cjh_> ok i'll dig around
[2:46] <cjh_> i'm wondering if they've gotten into the thousands yet
[2:46] <cjh_> or if it's still hundreds
[3:06] * The_Bishop (~bishop@2001:470:50b6:0:ad7c:586a:dca1:5125) has joined #ceph
[3:12] * nlopes (~nlopes@a95-92-0-12.cpe.netcabo.pt) Quit (Read error: Connection reset by peer)
[3:15] * nlopes (~nlopes@a95-92-0-12.cpe.netcabo.pt) has joined #ceph
[3:28] * Tamil (~tamil@38.122.20.226) Quit (Quit: Leaving.)
[3:28] * mattbenjamin (~matt@76-206-42-105.lightspeed.livnmi.sbcglobal.net) Quit (Quit: Leaving.)
[3:39] * rongze1 (~zhu@117.79.232.187) Quit (Ping timeout: 480 seconds)
[3:47] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[3:51] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[3:52] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) Quit (Quit: Leaving.)
[3:55] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Read error: Connection reset by peer)
[3:55] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[4:07] <TiCPU|Home> I'm currently trying some workload on RBD backed Qemus and found out MariaDB is painfully slow, everything looks idle but progress is very slow, is there a way to see if RBD cache is in use, and maybe some stats?
[4:09] <TiCPU|Home> ceph -w shows... less than 1MB/s and never over 200op/s
[4:09] <TiCPU|Home> CPU idle and disk working a bit, less than 50% in iostat for every OSD
[4:11] * rongze (~zhu@117.79.232.200) has joined #ceph
[4:18] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[4:19] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[4:22] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[4:24] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[4:27] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[4:32] * Cube (~Cube@c-38-80-203-93.rw.zetabroadband.com) has joined #ceph
[4:37] * rongze (~zhu@117.79.232.200) Quit (Ping timeout: 480 seconds)
[4:39] * rongze (~zhu@117.79.232.139) has joined #ceph
[4:44] <TiCPU|Home> it seems after debugging the issue that mariadb/innodb were issuing one flush per transaction for about 512 to 1024 bytes, which caused a 2ms delay per insert.. is that normal? disabling flush per transaction and instead using flush every second worked around the problem
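
For reference, the per-transaction flush TiCPU|Home describes is governed by InnoDB's innodb_flush_log_at_trx_commit setting, and relaxing it to a once-per-second flush is the usual workaround on high-latency storage. A minimal sketch, assuming a stock MariaDB/MySQL install (trade-off: up to roughly a second of committed transactions can be lost if the server crashes):

    # Show the current setting (1 = flush the InnoDB log to disk on every commit).
    mysql -e "SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';"

    # Relax it at runtime: 2 = write the log at commit, flush to disk about once per second.
    mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2;"

    # To persist it across restarts, set the same value under [mysqld] in my.cnf:
    #   innodb_flush_log_at_trx_commit = 2
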
[4:46] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[4:53] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Quit: Leaving.)
[4:53] * mattbenjamin (~matt@76-206-42-105.lightspeed.livnmi.sbcglobal.net) has joined #ceph
[5:00] <TiCPU|Home> it's just like rbd_cache wasn't working and writes take time to return
[5:01] * san (~san@81.17.168.194) Quit (Quit: Ex-Chat)
[5:01] * Vanony (~vovo@i59F7A7A1.versanet.de) has joined #ceph
[5:05] * TiCPU|Home (jerome@p4.i.ticpu.net) Quit (resistance.oftc.net charm.oftc.net)
[5:05] * DLange (~DLange@dlange.user.oftc.net) Quit (resistance.oftc.net charm.oftc.net)
[5:05] * __jt__ (~james@rhyolite.bx.mathcs.emory.edu) Quit (resistance.oftc.net charm.oftc.net)
[5:05] * sage (~sage@76.89.177.113) Quit (resistance.oftc.net charm.oftc.net)
[5:05] * coredumb (~coredumb@xxx.coredumb.net) Quit (resistance.oftc.net charm.oftc.net)
[5:05] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (resistance.oftc.net charm.oftc.net)
[5:05] * __jt__ (~james@rhyolite.bx.mathcs.emory.edu) has joined #ceph
[5:05] * DLange (~DLange@dlange.user.oftc.net) has joined #ceph
[5:05] * TiCPU|Home (jerome@p4.i.ticpu.net) has joined #ceph
[5:06] * coredumb (~coredumb@xxx.coredumb.net) has joined #ceph
[5:06] * mattch (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[5:06] * sage (~sage@76.89.177.113) has joined #ceph
[5:08] * Vanony_ (~vovo@i59F79922.versanet.de) Quit (Ping timeout: 482 seconds)
[5:13] * nlopes_ (~nlopes@a89-153-95-87.cpe.netcabo.pt) has joined #ceph
[5:15] * nlopes (~nlopes@a95-92-0-12.cpe.netcabo.pt) Quit (Ping timeout: 480 seconds)
[5:16] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[5:31] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[5:31] * mattbenjamin (~matt@76-206-42-105.lightspeed.livnmi.sbcglobal.net) Quit (Quit: Leaving.)
[5:33] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[5:40] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[5:48] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has joined #ceph
[5:49] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[5:49] * tkensiski (~tkensiski@c-98-234-160-131.hsd1.ca.comcast.net) has left #ceph
[5:52] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[5:53] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[5:59] * Vanony_ (~vovo@i59F7A9B1.versanet.de) has joined #ceph
[6:06] * Vanony (~vovo@i59F7A7A1.versanet.de) Quit (Ping timeout: 480 seconds)
[6:16] * The_Bishop (~bishop@2001:470:50b6:0:ad7c:586a:dca1:5125) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[6:17] * yehuda_hm (~yehuda@2602:306:330b:1410:78aa:5cbb:d4b7:2aa9) Quit (Ping timeout: 480 seconds)
[6:18] * yehuda_hm (~yehuda@2602:306:330b:1410:baac:6fff:fec5:2aad) has joined #ceph
[6:52] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[7:17] <iggy> TiCPU|Home: have you tried other workloads?
[7:22] * iggy (~iggy@theiggy.com) Quit (Remote host closed the connection)
[7:22] * iggy_ (~iggy@theiggy.com) Quit (Remote host closed the connection)
[7:22] * rongze (~zhu@117.79.232.139) Quit (Quit: Leaving.)
[7:28] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[7:39] * Machske (~Bram@d5152D87C.static.telenet.be) Quit ()
[7:45] * Meths (rift@2.27.72.232) Quit (Read error: Operation timed out)
[7:51] * tnt (~tnt@91.176.13.220) has joined #ceph
[7:52] * Meths (rift@2.27.72.232) has joined #ceph
[7:58] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[8:00] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[8:06] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[8:19] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[8:23] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[8:26] * codice (~toodles@75-140-71-24.dhcp.lnbh.ca.charter.com) Quit (Quit: leaving)
[8:29] * dcasier (~dcasier@131.35.132.79.rev.sfr.net) has joined #ceph
[8:29] * Machske (~Bram@d5152D8A3.static.telenet.be) has joined #ceph
[8:38] * barryo (~borourke@cumberdale.ph.ed.ac.uk) has left #ceph
[8:40] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: The early bird may get the worm, but the second mouse gets the cheese)
[8:47] * Machske (~Bram@d5152D8A3.static.telenet.be) Quit (Remote host closed the connection)
[8:49] * rongze (~zhu@106.120.176.116) has joined #ceph
[8:49] * Machske (~Bram@d5152D8A3.static.telenet.be) has joined #ceph
[8:53] * loicd (~loic@brln-4d0cdd77.pool.mediaWays.net) has joined #ceph
[8:56] * rongze1 (~zhu@117.79.232.222) has joined #ceph
[9:00] * rongze (~zhu@106.120.176.116) Quit (Ping timeout: 480 seconds)
[9:03] * MK_FG (~MK_FG@00018720.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:15] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:15] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:15] * loicd (~loic@brln-4d0cdd77.pool.mediaWays.net) Quit (Quit: Leaving.)
[9:24] * Vanony_ is now known as Vanony
[9:27] * rahmu (~rahmu@ip-147.net-81-220-131.standre.rev.numericable.fr) has joined #ceph
[9:27] <Vanony> Hi. Is there an ETA when the .deb packages for 0.63 will be available? According to http://ceph.com/releases/v0-63-released/ "You can get v0.63 from the usual places". But in http://ceph.com/debian-cuttlefish/pool/main/c/ceph/ the latest packages are 0.61.2
[9:30] * tnt (~tnt@91.176.13.220) Quit (Ping timeout: 480 seconds)
[9:31] * loicd (~loic@dslb-088-073-122-180.pools.arcor-ip.net) has joined #ceph
[9:41] * Cube (~Cube@c-38-80-203-93.rw.zetabroadband.com) Quit (Quit: Leaving.)
[9:45] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[9:46] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:48] * codice (~toodles@75-140-71-24.dhcp.lnbh.ca.charter.com) has joined #ceph
[9:50] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[9:57] * san (~san@81.17.168.194) has joined #ceph
[9:59] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) has joined #ceph
[10:03] * leseb (~Adium@83.167.43.235) has joined #ceph
[10:04] * bergerx_ (~bekir@78.188.204.182) has joined #ceph
[10:08] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[10:09] * glejeune (~greg@83.167.43.235) has joined #ceph
[10:12] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:13] * loicd (~loic@dslb-088-073-122-180.pools.arcor-ip.net) Quit (Quit: Leaving.)
[10:16] * ShaunR- (~ShaunR@staff.ndchost.com) Quit (Ping timeout: 480 seconds)
[10:18] * maximilian (~maximilia@212.79.49.65) Quit (Remote host closed the connection)
[10:22] * loicd (~loic@brln-4d0cdd77.pool.mediaWays.net) has joined #ceph
[10:25] * glejeune (~greg@83.167.43.235) Quit (Quit: leaving)
[10:29] * sha (~kvirc@81.17.168.194) has joined #ceph
[10:38] * LeaChim (~LeaChim@2.122.119.234) has joined #ceph
[10:41] * athrift (~nz_monkey@203.86.205.13) Quit (Ping timeout: 480 seconds)
[10:45] * itamar_ (~itamar@82.166.185.149) has joined #ceph
[10:45] <itamar_> Hi,
[10:45] <itamar_> I'm having some rgw issues..
[10:45] * fmarchand (~fmarchand@90.84.146.243) has joined #ceph
[10:45] <fmarchand> Hi !
[10:46] <itamar_> hey
[10:46] <fmarchand> I need some help !
[10:46] <fmarchand> :)
[10:46] <itamar_> :) mee too..
[10:46] <fmarchand> I installed a ceph cluster
[10:46] <itamar_> OK
[10:47] * athrift (~nz_monkey@203.86.205.13) has joined #ceph
[10:47] <fmarchand> and when I do "sudo ceph stop" I have no error ... and ceph processes are still running ...
[10:47] <absynth> sudo service ceph stop ?
[10:48] <fmarchand> yes this command ... sorry typo
[10:48] <fmarchand> what do you have when you do that ?
[10:51] * athrift_ (~nz_monkey@203.86.205.13) has joined #ceph
[10:51] * athrift (~nz_monkey@203.86.205.13) Quit (Read error: No route to host)
[10:55] * fmarchand2 (~fmarchand@90.84.146.243) has joined #ceph
[10:55] * fmarchand (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[10:56] * fmarchand (~fmarchand@90.84.146.243) has joined #ceph
[10:56] * fmarchand2 (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[10:56] <fmarchand> joao : hi ! how are you ?
[11:00] * fmarchand (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[11:09] <itamar_> is there a configuration reference for a radosgw setup on rhel?
[11:09] <itamar_> I'm struggling with setting up a new machine with 0.61.2
[11:15] <itamar_> I get "2013-06-05 05:15:18.024908 7f0552567700 0 <cls> cls/rgw/cls_rgw.cc:1030: gc_iterate_entries end_key=1_01370423718.024903000"
[11:15] <itamar_> in the osd logs
[11:15] <itamar_> and 500 reply when trying to access the radosgw
[11:15] <itamar_> any idea?
[11:16] <itamar_> the cephx key is correct as far as I can see..
[11:17] <tnt> that log entry doesn't look bad AFAICT, just the gc doing its job.
[11:29] * loicd (~loic@brln-4d0cdd77.pool.mediaWays.net) Quit (Quit: Leaving.)
[11:32] * sha (~kvirc@81.17.168.194) Quit (Read error: Connection reset by peer)
[12:03] * loicd (~loic@p5797AA45.dip0.t-ipconnect.de) has joined #ceph
[12:11] * dcasier (~dcasier@131.35.132.79.rev.sfr.net) Quit (Ping timeout: 480 seconds)
[12:13] * dcasier (~dcasier@131.35.132.79.rev.sfr.net) has joined #ceph
[12:17] <itamar_> thanks tnt
[12:22] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) has joined #ceph
[12:23] * rongze (~zhu@211.155.113.212) has joined #ceph
[12:30] * rongze1 (~zhu@117.79.232.222) Quit (Ping timeout: 480 seconds)
[12:31] * rahmu (~rahmu@ip-147.net-81-220-131.standre.rev.numericable.fr) Quit (Remote host closed the connection)
[12:37] * Volture (~quassel@office.meganet.ru) Quit (Remote host closed the connection)
[12:41] * Volture (~quassel@office.meganet.ru) has joined #ceph
[12:44] * syed_ (~chatzilla@180.151.28.160) has joined #ceph
[12:50] * loicd (~loic@p5797AA45.dip0.t-ipconnect.de) Quit (Quit: Leaving.)
[13:11] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[13:14] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:22] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:36] * diegows (~diegows@190.190.2.126) has joined #ceph
[13:38] * capri (~capri@212.218.127.222) Quit (Quit: Verlassend)
[13:46] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:53] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[13:55] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[14:22] * loicd (~loic@p5797AA45.dip0.t-ipconnect.de) has joined #ceph
[14:24] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:39] * ScOut3R_ (~ScOut3R@212.96.47.215) has joined #ceph
[14:40] * ScOut3R__ (~ScOut3R@212.96.47.215) has joined #ceph
[14:45] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:46] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[14:47] * ScOut3R_ (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[14:57] * fmarchand (~fmarchand@90.84.146.243) has joined #ceph
[14:57] <fmarchand> Hi !
[14:58] <fmarchand> I installed a cuttlefish ... it was working well ...
[14:58] <fmarchand> I can't stop daemons using "sudo service ceph ..."
[14:58] <fmarchand> is that normal ?
[14:59] <fmarchand> processes are still running after the command if I try to stup
[14:59] <fmarchand> stop them
[14:59] <fmarchand> anybody ?
[15:02] <imjustmatthew> fmarchand: most of the developers are on US-Pacific time, you might get better responses later
[15:02] * fmarchand (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[15:02] * todin (tuxadero@kudu.in-berlin.de) Quit (Remote host closed the connection)
[15:02] * todin (tuxadero@kudu.in-berlin.de) has joined #ceph
[15:03] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Remote host closed the connection)
[15:04] <imjustmatthew> Vanony: Packages of the development release series are at http://ceph.com/debian-testing/ see http://ceph.com/docs/master/install/debian/#development-release-packages
[15:05] <syed_> fmarchand: you are on ubuntu ?
[15:06] <Vanony> imjustmatthew: great, thanks!
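
For anyone following along, pointing apt at that testing repository is a one-line sources change; a sketch, where the Debian codename (wheezy) is an assumption about the target system and the release key is imported per the install docs linked above:

    # Add the Ceph development-release repository.
    echo "deb http://ceph.com/debian-testing/ wheezy main" | \
        sudo tee /etc/apt/sources.list.d/ceph-testing.list

    # After importing the Ceph release key (see the install docs), pull in the packages:
    sudo apt-get update && sudo apt-get install ceph
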
[15:06] * chutz (~chutz@rygel.linuxfreak.ca) Quit (Ping timeout: 480 seconds)
[15:06] * Ludo_ (~Ludo@falbala.zoxx.net) Quit (Remote host closed the connection)
[15:07] * Ludo_ (~Ludo@falbala.zoxx.net) has joined #ceph
[15:12] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[15:13] * dcasier (~dcasier@131.35.132.79.rev.sfr.net) Quit (Read error: Connection reset by peer)
[15:15] * fmarchand (~fmarchand@90.84.146.243) has joined #ceph
[15:15] * fmarchand (~fmarchand@90.84.146.243) has left #ceph
[15:15] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[15:17] * fmarchand (~fmarchand@90.84.146.243) has joined #ceph
[15:17] <fmarchand> re
[15:17] <fmarchand> so nobody for my process pb ?
[15:18] <tnt> nope, works fine here.
[15:19] * dosaboy (~dosaboy@eth3.bismuth.canonical.com) has joined #ceph
[15:19] * fmarchand (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[15:19] * fmarchand (~fmarchand@90.84.146.243) has joined #ceph
[15:19] <fmarchand> I'm sure I'm missing something
[15:20] <fmarchand> when I do "service ceph stop" .... I still have all processes running ...
[15:20] <fmarchand> is it normal ?
[15:20] * portante (~user@c-24-63-226-65.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[15:21] <fmarchand> I installed cuttlefish with ceph-deploy
[15:21] <fmarchand> what happens when you run "service ceph stop" ?
[15:22] * fmarchand2 (~fmarchand@90.84.146.243) has joined #ceph
[15:22] * fmarchand (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[15:22] <fmarchand2> tnt : ?
[15:25] <syed_> fmarchand2: you can manually kill all the processes and then restart all the services
[15:25] * fmarchand2 (~fmarchand@90.84.146.243) Quit (Read error: Connection reset by peer)
[15:26] * fmarchand (~fmarchand@a.clients.kiwiirc.com) has joined #ceph
[15:26] <fmarchand> re
[15:27] <fmarchand> I think my connection was bad
[15:27] * pressureman (~pressurem@62.217.45.26) has joined #ceph
[15:27] <fmarchand> tnt : did I miss your answer ?
[15:27] <tnt> fmarchand: it should work and stop the process and it works for me, so something is wrong on your end.
[15:27] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[15:27] <fmarchand> how did you install your osd and mon ?
[15:27] * loicd (~loic@p5797AA45.dip0.t-ipconnect.de) Quit (Quit: Leaving.)
[15:27] <tnt> my guess is that the hostname doesn't match between what 'uname -n' returns and your ceph.conf file
[15:27] <fmarchand> old school ?
[15:28] <tnt> I installed them about 1 year ago ...
[15:28] <tnt> so yeah ceph-deploy didn't exist back then.
[15:28] * eternaleye (~eternaley@2002:3284:29cb::1) Quit (Ping timeout: 480 seconds)
[15:28] * eternaleye (~eternaley@2002:3284:29cb::1) has joined #ceph
[15:29] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[15:29] <fmarchand> that's really weird ... if I kill -15 ceph processes ... they restart automatically !
[15:29] <pressureman> what happens to a ceph cluster if power fails to the entire cluster, then the nodes come back online in a random order?
[15:29] <fmarchand> So I can't stop them at all
[15:30] <pressureman> (theoretical question)
[15:31] <tnt> pressureman: I think it depends on how fast/slowly they come up.
[15:31] <tnt> pressureman: what's sure is that the osd / mds will just sit around doing nothing until at least a majority of the mons are up.
[15:32] <pressureman> right... so it basically comes down to quorum being achieved first
[15:32] <tnt> if some OSD are really too slow to boot, it might start re-replicating to others.
[15:32] <tnt> pressureman: yes, before any node will do anything, it will require quorum
[15:33] <pressureman> and once quorum is achieved, the elected monitor dictates which pgs are "current" ?
[15:34] <fmarchand> tnt : what can I do to solve it ? I mean I don't know what to do .... is that ceph-deploy which has a bug ?
[15:34] <fmarchand> tnt : I'm sure it's not a bug anyway
[15:34] <tnt> ceph-deploy plays no role past the install AFAIK.
[15:35] <tnt> pastebin your ceph.conf
[15:35] <tnt> also, pastebin the result of 'uname -n'
[15:36] * ScOut3R__ (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[15:37] <fmarchand> tnt : http://pastebin.com/g6hj0FHv
[15:37] <imjustmatthew> pressureman: Yes, that's my understanding. sagewk would have a more detailed explanation if you need more. Like tnt said, the boot process will be messy if the cluster is under load since it may try to replicate pgs. It might be worth having your applications wait for manual start in a "cold boot" scenario.
[15:37] <imjustmatthew> ^ client applications
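
A quick way to watch for that quorum after a cold boot is the standard monitor status commands; a small sketch (the mon id 'a' and the admin socket path are illustrative defaults):

    # Overall cluster state, including how many monitors are in quorum:
    ceph -s

    # Monitor-only view: which mons are in quorum and which one won the election:
    ceph quorum_status

    # Or ask a specific monitor directly through its admin socket:
    ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok mon_status
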
[15:38] <tnt> fmarchand: there is no ceph process defined there ( no osd / no mons )
[15:38] <pressureman> obviously it would be a scenario i hope would never happen... but just curious anyway
[15:38] <fmarchand> I know it says it should take the default values
[15:38] <fmarchand> I tried ...
[15:39] <tnt> there is no such thing as "default" values for the daemon definitions ...
[15:39] <tnt> (AFAIK)
[15:40] <fmarchand> mmm oki I'm gonna try again
[15:40] <fmarchand> with device definition
[15:40] <tnt> not device. Daemon. you need some [mon.x] and [osd.x] entries for every daemon that needs to run on that host.
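
For context: with the classic sysvinit setup, `service ceph stop` only acts on daemons that have a section in ceph.conf whose `host` matches `uname -n` on that machine, which is what tnt is describing. A rough sketch of the entries involved (the hostname, daemon ids and the upstart alternative are assumptions, since fmarchand's layout isn't shown):

    # Short hostname the init script matches against:
    uname -n                      # e.g. "node1" (hypothetical)

    # ceph.conf then needs one section per daemon running on this host, e.g.:
    #   [mon.a]
    #       host = node1
    #   [osd.0]
    #       host = node1

    # With matching sections, the sysvinit script can find and stop the daemons:
    sudo service ceph stop

    # Note: clusters deployed with ceph-deploy on Ubuntu are typically driven by
    # upstart instead, in which case something like this is needed:
    #   sudo stop ceph-all
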
[15:43] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[15:47] <fmarchand> yes but for osd.x you can specify device .... it's what I meant
[15:48] * redeemed (~redeemed@static-71-170-33-24.dllstx.fios.verizon.net) has joined #ceph
[15:48] * leseb (~Adium@83.167.43.235) has joined #ceph
[15:49] * iggy (~iggy@theiggy.com) has joined #ceph
[15:49] <fmarchand> I can't kill current daemons
[15:50] <tnt> just add them in the ceph.conf manually
[15:51] <fmarchand> I did it
[15:54] <fmarchand> I'm gonna install it the old way
[15:58] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[15:59] * portante (~user@66.187.233.206) has joined #ceph
[16:00] <TiCPU> is there any way to get statistics from rbd_cache ?
[16:04] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[16:05] * leseb (~Adium@83.167.43.235) has joined #ceph
[16:06] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has left #ceph
[16:09] * syed_ (~chatzilla@180.151.28.160) Quit (Quit: ChatZilla 0.9.90 [Firefox 19.0.2/20130307122351])
[16:12] * drokita (~drokita@199.255.228.128) has joined #ceph
[16:12] <Machske> I've got a bobtail cluster running and want to setup a seperate server to function as a radosgw gateway. The package on that server installs v0.61.2. should that be compatible with a bobtail cluster ?
[16:15] <LeaChim> Has anyone had any success at using CORS with radosgw? Everything I try to set the configuration is returning 403.
[16:16] <redeemed> machske, did you do this with ceph-deploy? i believe ceph-deploy will deploy with bobtail if you tell it to. ex: ceph-deploy install --stable bobtail HOST HOST HOST
[16:18] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[16:23] <jtang> has anyone tested .63 on centos6/sl6 recently?
[16:23] <jtang> in particualar the radosgw bits and pieces?
[16:23] <Machske> redeemed: No, I did not, was a manual install/config setup. But it seems to work. Really need to have a look at ceph-deploy. Hard to change habits :)
[16:27] <jtang> hrm... seems i missed cuttlefish which might have the sysv init scripts etc...
[16:33] * kyle__ (~kyle@216.183.64.10) Quit (Ping timeout: 480 seconds)
[16:33] * kyle__ (~kyle@216.183.64.10) has joined #ceph
[16:34] * andrei (~andrei@host86-155-31-94.range86-155.btcentralplus.com) has joined #ceph
[16:34] * dcasier (~dcasier@131.35.132.79.rev.sfr.net) has joined #ceph
[16:35] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[16:35] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[16:38] * loicd (~loic@brln-4db819d8.pool.mediaWays.net) has joined #ceph
[16:41] <andrei> hello guys
[16:42] <andrei> i have been doing some performance/stress testing last night
[16:42] <andrei> basically, i ran 4 vms which were running on the same kvm host server
[16:42] <andrei> using fio i've done 4 jobs each doing random reads/writes in 200GB files
[16:43] <andrei> i left it running overnight and have discovered in the morning that the tests were hanging
[16:43] <andrei> most of them at around 1%
[16:44] <andrei> i've checked ceph -s to discover HEALTH_WARN 58 pgs peering; 58 pgs stuck inactive; 58 pgs stuck unclean; 1 mons down, quorum 1,2 a,b
[16:44] <andrei> that is on 0.61.2 ceph
[16:46] <andrei> so, i was wondering what caused one of the mon servers to die
[16:46] <andrei> and how come it caused the whole test to freeze
[16:47] <andrei> how do I go about determining these things?
[16:47] <absynth> did you check dmesg on the mon?
[16:47] <absynth> maybe it died with the recent memleak issues
[16:48] <andrei> i will check the logs now
[16:48] <fmarchand> I really don't understand ... who has used ceph-deploy to install a cluster ??
[16:48] <absynth> still, a dead monitor should not stall I/O
[16:49] <andrei> i can see that all of my vms have frozen
[16:49] <andrei> fmarchand: i've tried to use it many times, but it has never worked for me!
[16:49] <andrei> absynth: yeah, that's true
[16:49] <nhm> andrei: there are some mon bugs that we are getting ironed out.
[16:49] <nhm> andrei: I think some of the first round of patches are in 0.61.3
[16:49] <fmarchand> andrei:so I'm not crazy ... it does not really work ?
[16:50] <andrei> fmarchand: nope, you are not crazy!
[16:50] <andrei> it didn't work for me when it was setting up multiple monitors
[16:50] <andrei> it just hanged on the key generation part
[16:51] <fmarchand> when I install through ceph-deploy I can't stop or restart mon osd daemons
[16:52] <andrei> nhm: would one broken mon cause 4 vm hangs?
[16:52] <andrei> i don't believe it should
[16:53] <iggy> absynth: how many mons did you have?
[16:53] <nhm> andrei: on the surface it may just look like 1 broken mon, but in reality we were seeing situations where the mons were hanging for hours at a time causing all kinds of subsequent problems.
[16:54] <andrei> nhm: just checking now and it looks like that ceph has crashed
[16:54] <andrei> i can't list rbd file for instance
[16:54] <nhm> andrei: at least part of this was due to underlying leveldb behavior with compaction that we ended up having to work around.
[16:54] <andrei> when I run "rbd -p Primary-kvm-centos-1 ls -l" it just hangs
[16:54] <nhm> andrei: does ceph health also hang?
[16:55] <andrei> nope
[16:55] <andrei> let me double check actually
[16:55] <andrei> nope
[16:55] <andrei> works
[16:55] <pressureman> is there an ETA for 0.61.3? i'm about to install a new cuttlefish cluster, and would rather not do it with leaky software
[16:55] <absynth> iggy: what? why is that relevant? :)
[16:55] <andrei> it times out with one mon, but after a few seconds gives me the detail
[16:56] <iggy> too many a's....
[16:56] <absynth> ah, i see
[16:56] * mattbenjamin (~matt@aa2.linuxbox.com) has joined #ceph
[16:56] <andrei> i can see that 0.63 is out
[16:56] <andrei> is this the latest stable release?
[16:56] <iggy> andrei: how many mon's were you running?
[16:56] <andrei> i can't see it in the repos though
[16:56] <andrei> iggy: i had 3 mons
[16:57] <andrei> and i've check health status before running the tests
[16:57] <andrei> all was okay
[16:57] <nhm> andrei: can you bring the down mon back up?
[16:57] <andrei> plus I ran the same test twice before, but using 20GB files instead of 200GB
[16:57] <andrei> nhm: what do you mean?
[16:57] <andrei> this is a PoC, i can do pretty much anything I want
[16:58] <iggy> did anything else die? (i.e. OSDs?)
[16:58] <absynth> install gluster then :D
[16:58] <absynth> *ducks&runs*
[16:58] <nhm> andrei: I mean, if you try to restart the mon that's down, does it come back up?
[16:58] <andrei> nope, it crashes after a few seconds
[16:58] <andrei> leaving a bunch of debugging info in the logs
[16:59] <pressureman> PoC? yeah, our current system is a piece of crap too
[16:59] <nhm> andrei: Mind pasting that in pastie.org or something?
[16:59] <andrei> one sec
[16:59] <iggy> pressureman: I think sage was saying today... yesterday
[17:00] <pressureman> iggy, what about 0.63? are there debs for it yet?
[17:00] <sage> there was a network hiccup during the build last night.. will push it out as soon as glowell is up and gets it sorted out
[17:00] <iggy> that... nfc
[17:00] <pressureman> awesome
[17:01] <sage> andrei: would love to see the dump in the logs
[17:01] <nhm> andrei: I'm wondering if it could be similar (though possibly triggered in a different way) to this: http://tracker.ceph.com/issues/5246
[17:01] <absynth> sage: RE our current ticket, we just started a scrub a half hour ago. nothing happened yet.
[17:01] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Remote host closed the connection)
[17:01] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[17:02] <sage> absynth: did it previous start right away?
[17:02] * jgallard (~jgallard@gw-aql-129.aql.fr) has joined #ceph
[17:02] * mattbenjamin (~matt@aa2.linuxbox.com) Quit (Read error: Connection reset by peer)
[17:03] <absynth> nope, took some time
[17:03] * leseb (~Adium@83.167.43.235) has joined #ceph
[17:03] <absynth> not very predictable if i remember correctly
[17:03] <andrei> here is the log output
[17:03] <absynth> triggered oliver to get his butt over here
[17:03] <andrei> http://fpaste.org/16718/37044461/
[17:04] * oliver1 (~oliver@p4FFFED96.dip0.t-ipconnect.de) has joined #ceph
[17:04] <absynth> that was quick
[17:04] <oliver1> True.
[17:04] <absynth> oliver1: the last time we had the memleak issue, did it appear right away? no, correct?
[17:04] <absynth> it appeared only after some time
[17:04] <oliver1> absynth: after "a while", let's say 30 minutes at least...
[17:06] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[17:08] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:14] <sage> andrei: looks like a leveldb bug or a bad disk/fs. if the other mons are healthy you can blow it away and re-add it.
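
Sage's "blow it away and re-add it" corresponds roughly to the documented procedure for replacing a failed monitor; a sketch, where the mon id 'c' and the data path match the defaults (and the path andrei mentions later), while the keyring location is a placeholder, so check the mon add/remove docs before running this against a real cluster:

    # From a healthy monitor, drop the broken one out of the monmap:
    ceph mon remove c

    # On the broken node, move the old (possibly corrupt) store aside:
    sudo mv /var/lib/ceph/mon/ceph-c /var/lib/ceph/mon/ceph-c.broken

    # Rebuild the monitor from the current monmap and the mon keyring:
    ceph mon getmap -o /tmp/monmap
    sudo ceph-mon -i c --mkfs --monmap /tmp/monmap --keyring /path/to/mon-keyring

    # Register it with the cluster again and start it:
    ceph mon add c <mon-ip>:6789
    sudo service ceph start mon.c     # or the upstart equivalent
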
[17:16] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) Quit (Remote host closed the connection)
[17:18] <oliver1> sage: I tried to scrub dedicated osd.2, so injected max-scrubs to a couple of paired osd.s from "ceph pg dump". Everything calm so far except couple of slow-reqs... Q: If I inject all OSDs with max-scrub 1 and one other OSD starts eating up memory, could I inject the debugging on the fly and collect logs?
[17:18] <sage> yeah, i think that will still be useful
[17:20] <oliver1> sage: One osd on dope, 28GB RSS
[17:21] <oliver1> sage: could you please help me with debug inject syntax... I would then inject on this particular OSD and then try to stop/restart this guy...
[17:22] <oliver1> sage: 33GiB
[17:22] <sage> ceph osd tell 123 injectargs '--debug-osd 20 --debug-ms 1'
[17:23] * jlogan1 (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[17:24] <oliver1> sage: sure... 40GiB, I will wait not very much longer, something else worth a shot?
[17:24] <sage> nope, hopefully that will have some clues.
[17:28] * ivan` (~ivan`@000130ca.user.oftc.net) Quit (Ping timeout: 480 seconds)
[17:29] * ivan` (~ivan`@000130ca.user.oftc.net) has joined #ceph
[17:29] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[17:32] <PerlStalker> What's a good place to send a feature request?
[17:32] * Machske (~Bram@d5152D8A3.static.telenet.be) Quit ()
[17:33] <loicd> PerlStalker: I would send it to the mailing list first.
[17:33] <loicd> http://vger.kernel.org/vger-lists.html#ceph-devel
[17:34] * iggy_ (~iggy@theiggy.com) has joined #ceph
[17:35] <loicd> the description of the list "This is the mailing list for CEPH filesystem discussion." should probably be updated to something more generic and focused on development like "CEPH development discussion" since it is not exclusively about the filesystem.
[17:35] <PerlStalker> loicd: Thanks
[17:36] <oliver1> sage: you still have access? it's on fcmsnode7, 17.log.bz2, 16megs. Or attach to ticket?
[17:36] <sage> attach to zendesk ticket please
[17:37] <oliver1> sage: nP, thnx for now.
[17:38] <sage> thanks!
[17:39] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:44] * mattch (~mattch@pcw3047.see.ed.ac.uk) has left #ceph
[17:44] * mattch (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[17:48] <pressureman> i originally installed ceph 0.62 specifically from the cuttlefish repo, but i see that 0.63 is already in the ceph.com/wheezy repos... does this mean that 0.63 is the current "stable" version?
[17:49] <pressureman> like, should i replace http://eu.ceph.com/debian-cuttlefish/ with http://eu.ceph.com/debian-last/ in my apt sources?
[17:50] * tnt (~tnt@91.176.13.220) has joined #ceph
[17:50] <pressureman> scratch that, i'm running 0.61.2, not 0.62
[17:53] <andrei> sage: thanks
[17:53] <andrei> guys, is it safe to remove this folder on a dead mon? /var/lib/ceph/mon/ceph-c/store.db
[17:54] <andrei> it generated around 7gb of space
[17:55] * tkensiski (~tkensiski@126.sub-70-197-4.myvzw.com) has joined #ceph
[17:55] * leseb (~Adium@83.167.43.235) Quit (Quit: Leaving.)
[17:55] * tkensiski (~tkensiski@126.sub-70-197-4.myvzw.com) has left #ceph
[17:56] <andrei> pressureman: I am having issues with 0.61.2
[17:56] <andrei> which caused my ceph cluster to become unavailable
[17:56] * leseb (~Adium@83.167.43.235) has joined #ceph
[17:56] <andrei> sage: do you know if 0.63 is a stable release? ready for production?
[17:56] <sage> 0.61.x is what you want
[17:56] <sage> .3 will be out in a few hours
[17:57] <andrei> sage: would it address the mon crash that i've experienced overnight?
[17:57] <andrei> the one i've mentioned earlier?
[17:57] <pressureman> is 0.63 not already lurking in the package repos? or is that not a final version?
[17:57] <sage> no.. that looks like a leveldb or fs/disk issue
[17:58] * joshd1 (~jdurgin@2602:306:c5db:310:21b4:f1a4:b7a8:80fa) has joined #ceph
[17:58] <sage> it isn't a long-term stable release
[17:58] <andrei> have these issues been addressed yet, do you know?
[17:59] <andrei> sage: do you know if it is safe to remove /var/lib/ceph/mon/ceph-c/store.db - a folder on the broken mon server?
[18:00] <pressureman> so, 0.61.3 will be LTS? with memory leaks fixed? ;-)
[18:02] * leseb (~Adium@83.167.43.235) Quit (Read error: Operation timed out)
[18:07] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[18:08] * BillK (~BillK@124-148-124-185.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[18:08] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[18:08] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[18:09] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[18:21] * oliver1 (~oliver@p4FFFED96.dip0.t-ipconnect.de) has left #ceph
[18:33] <topro> is it normal that I still see a lot of ceph-server-to-ceph-server communication on the public (client) interface using ports 6803 6802 6801 6789, though most of the ceph intra-server communication is using the cluster interface?
[18:42] * mschiff (~mschiff@81.92.22.210) has joined #ceph
[18:44] * pressureman (~pressurem@62.217.45.26) Quit (Quit: Ex-Chat)
[18:52] <gregaf> topro: the cluster interface is only used by the OSDs to communicate with each other, but they talk to the monitors (and the monitors talk to each other) on the public interface
[18:52] <gregaf> and in new enough code (don't remember if it's released or not) the OSDs are heartbeating on both interfaces
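
For reference, the split gregaf describes is controlled by two options in ceph.conf; a minimal sketch with made-up subnets:

    [global]
        # client <-> daemon and monitor traffic stays on the public network
        public network  = 192.168.1.0/24
        # OSD <-> OSD replication and recovery traffic uses the cluster network
        cluster network = 10.0.0.0/24
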
[18:54] <paravoid> so
[18:54] <paravoid> 0.56.6 -> cuttlefish
[18:54] <paravoid> first mon upgraded, crashing with http://p.defau.lt/?IZqAkhGEsx80I6wXpHaXjA
[18:55] <gregaf> that's for joao ^
[18:56] <gregaf> paravoid: that looks like a branch build, not a release, right?
[18:56] <paravoid> yes
[18:56] <paravoid> cuttlefish tip
[18:57] <joao> that's triggered by one of the more recent patches (that introduced perfcounter support on leveldb-related stuff)
[18:57] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: Pull the pin and count to what?)
[18:57] <joao> paravoid, can you pastebin a bit more of that log, ideally with debug mon = 20?
[18:57] <paravoid> sure
[18:58] <joao> thanks
[18:59] <paravoid> http://p.defau.lt/?37YmxDyHHJEfFjZ1AlkEwQ
[18:59] <joao> hmm
[19:00] <joao> off the top of my head, perfcounters haven't been initialized when we get there
[19:00] <joao> I'll create a bug and take a look
[19:00] <paravoid> I can do the bug
[19:00] <paravoid> no worries
[19:01] <joao> cool, thanks
[19:02] * portante (~user@66.187.233.206) Quit (Ping timeout: 480 seconds)
[19:02] <andrei> sage: would 0.61.3 be available to ubuntu and rpm repos?
[19:03] * tkensiski (~tkensiski@209.66.64.134) has joined #ceph
[19:03] * tkensiski (~tkensiski@209.66.64.134) has left #ceph
[19:05] <paravoid> joao: http://tracker.ceph.com/issues/5255
[19:05] <joao> paravoid, ty
[19:05] <paravoid> sage: you asked for more QA didn't you? :-)
[19:06] <gregaf> sage is out today (or is supposed to be), fyi folks
[19:07] <paravoid> nah it was just a joke
[19:09] <sage> gregaf: can you track that down?
[19:09] <andrei> does anyone know if the leveldb bug has been fixed in the 0.63 release?
[19:10] <sage> i pushed something yesterday that added an if(logger) in ~LevelDBStore that may be the culprit.. may not have put it in the cuttlefish branch
[19:10] * jgallard (~jgallard@gw-aql-129.aql.fr) Quit (Quit: Leaving)
[19:11] <gregaf> I think joao's on it
[19:11] * itamar_ (~itamar@82.166.185.149) Quit (Quit: Leaving)
[19:11] <joao> sage, it's there
[19:11] <andrei> sage: is this bug easily reproducible? I've got it after a few hours of stress testing with large files
[19:11] <andrei> was wondering if that's something i need to worry about in production?
[19:11] <joao> if (logger)
[19:11] <joao> cct->get_perfcounters_collection()->remove(logger);
[19:11] <joao> sage, ^
[19:12] <joao> oh, I mean, it's on the cuttlefish branch
[19:12] <gregaf> joao: and the bug is that we don't have a perfcounters collection set up yet, right?
[19:12] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[19:12] <andrei> how do i go about removing the mon which has fallen over? I thought that ceph doesn't like having 2 monitors?
[19:12] <andrei> if I remove the broken one, would it not kill the cluster?
[19:12] <joao> gregaf, afaict, it's the logger that is missing from the collections, yes
[19:13] * DarkAceZ (~BillyMays@50.107.53.195) Quit (Ping timeout: 480 seconds)
[19:13] <gregaf> ah, got it
[19:14] <gregaf> you fixing this up or should I do it?
[19:16] <joao> gregaf, if you got the solution, go ahead; otherwise I'm looking into it :)
[19:16] * Tamil (~tamil@38.122.20.226) has joined #ceph
[19:16] <gregaf> I haven't even opened up the files, just making sure you were getting to it soon :)
[19:17] <joao> eh, already looking into the sources :)
[19:17] <paravoid> I'm running off git tip anyway, so I can immediately test when it's commited/built by gitbuilder
[19:22] * dpippenger (~riven@cpe-75-85-17-224.socal.res.rr.com) has joined #ceph
[19:24] <joao> paravoid, when was your branch built?
[19:24] <paravoid> it's from gitbuilder, it has the sha1
[19:24] <joao> ah right
[19:24] <joao> I never remember the traces have a sha1
[19:24] <joao> :\
[19:24] <paravoid> 8544ea7 test_librbd: use correct type for varargs snap test
[19:24] <joao> oh
[19:24] <joao> okay
[19:25] <joao> that's fixed then
[19:25] <paravoid> i.e. cuttlefish
[19:25] <joao> your crash that is
[19:25] <paravoid> er?
[19:25] <joao> the fix that sage mentioned above went to cuttlefish after that commit
[19:25] <joao> ce67c58db7d3e259ef5a8222ef2ebb1febbf7362
[19:26] <paravoid> er
[19:26] <paravoid> cuttlefish doesn't point to that
[19:26] <joao> oh
[19:26] <joao> right
[19:26] <joao> I grabbed next
[19:26] <joao> -_-
[19:26] <joao> gregaf, seems like that patch needs to be backported to cuttlefish
[19:27] <gregaf> coolio, you should review the patch and do that then :P
[19:27] <gregaf> and figure out why it wasn't already, and chastise whoever was responsible
[19:27] <gregaf> ;)
[19:28] <joao> eh, my first backport ever? this is going to be fun! :p
[19:31] * DarkAceZ (~BillyMays@50.107.53.195) has joined #ceph
[19:39] * diegows (~diegows@200.68.116.185) has joined #ceph
[19:40] <joao> paravoid, pushed the backport
[19:41] <paravoid> thanks
[19:41] <joao> (to cuttlefish this time, for reals)
[19:41] <paravoid> nope
[19:41] <joao> nope?
[19:42] <gregaf> it just pushed; it'll need some time to build :)
[19:42] <joao> ah
[19:42] <paravoid> ah, it's there now
[19:42] <gregaf> alternatively, wrong window
[19:54] * rektide (~rektide@192.73.236.68) Quit (Remote host closed the connection)
[19:54] * bergerx_ (~bekir@78.188.204.182) Quit (Quit: Leaving.)
[20:02] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:04] * KindTwo (~KindOne@h241.44.28.71.dynamic.ip.windstream.net) has joined #ceph
[20:06] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:06] * KindTwo is now known as KindOne
[20:15] * The_Bishop (~bishop@2001:470:50b6:0:6473:bc33:4957:72d) has joined #ceph
[20:16] <paravoid> so, I upgraded, it doesn't crash anymore
[20:17] <paravoid> upgraded the second mon, now it complains
[20:17] <paravoid> 2013-06-05 18:17:43.214888 7f6250115700 0 cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
[20:20] <paravoid> hm, I managed to restore it after a bit of a downtime
[20:28] <paravoid> ok, another bug
[20:33] <cjh_> so is there another point release of cuttlefish coming soon? :)
[20:34] <andrei> when is the new release coming to debian/centos repos?
[20:34] <andrei> any time estimates?
[20:34] <paravoid> degraded pgs, slow requests, ...
[20:36] * partner (joonas@ajaton.net) Quit (Read error: Operation timed out)
[20:41] * partner (joonas@ajaton.net) has joined #ceph
[20:44] * rturk-away is now known as rturk
[20:49] * tziOm (~bjornar@ti0099a340-dhcp0745.bb.online.no) has joined #ceph
[20:55] * partner (joonas@ajaton.net) Quit (Ping timeout: 480 seconds)
[20:56] * b1tbkt (~b1tbkt@24-217-196-119.dhcp.stls.mo.charter.com) has joined #ceph
[20:58] * Machske (~Bram@d5152D87C.static.telenet.be) has joined #ceph
[20:59] * eschnou (~eschnou@220.177-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:09] * eschnou (~eschnou@220.177-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[21:13] * partner (joonas@ajaton.net) has joined #ceph
[21:13] * alram (~alram@209.19.107.248) has joined #ceph
[21:14] * alram (~alram@209.19.107.248) Quit ()
[21:20] * mattbenjamin (~matt@aa2.linuxbox.com) has joined #ceph
[21:21] * LeaChim (~LeaChim@2.122.119.234) Quit (Ping timeout: 480 seconds)
[21:23] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[21:30] * LeaChim (~LeaChim@2.122.119.234) has joined #ceph
[21:38] * eschnou (~eschnou@220.177-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:43] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) Quit (Read error: Connection reset by peer)
[22:00] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:00] * dcasier (~dcasier@131.35.132.79.rev.sfr.net) Quit (Read error: Connection reset by peer)
[22:03] * KevinPerks (~Adium@cpe-066-026-239-136.triad.res.rr.com) has joined #ceph
[22:05] * ScOut3R (~ScOut3R@54024172.dsl.pool.telekom.hu) has joined #ceph
[22:10] * mattbenjamin (~matt@aa2.linuxbox.com) Quit (Quit: Leaving.)
[22:11] * eschnou (~eschnou@220.177-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[22:15] * andreask (~andreask@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[22:15] * ChanServ sets mode +v andreask
[22:15] <paravoid> quiet today
[22:18] * mattbenjamin (~matt@aa2.linuxbox.com) has joined #ceph
[22:19] * drokita (~drokita@199.255.228.128) has left #ceph
[22:22] * eschnou (~eschnou@220.177-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[22:39] * alexxy[home] (~alexxy@79.173.81.171) has joined #ceph
[22:42] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (Read error: Connection reset by peer)
[22:52] <paravoid> joao: I'm around if you want more data or interactive troubleshooting
[22:53] <joao> paravoid, thanks; I'll poke you if need be :)
[22:55] * jfriedly (~jfriedly@50-0-250-146.dedicated.static.sonic.net) has joined #ceph
[22:57] <jfriedly> Hello! I'm experimenting with a Ceph cluster and I had a quick question.
[22:58] <jfriedly> I read through the RADOS paper (http://ceph.com/papers/weil-rados-pdsw07.pdf) and it sounds like if I add an OSD, it should send some kind of message to the monitor informing it that a new OSD is present. But the docs on adding an OSD say that we have to update the ceph.conf on each host. Is that feature just not implemented yet?
[22:59] <TiCPU> jfriedly, you have to add it to ceph.conf and add it to the monitor's database using ceph osd add, ceph.conf is never updated automatically.
[22:59] * tziOm (~bjornar@ti0099a340-dhcp0745.bb.online.no) Quit (Remote host closed the connection)
[22:59] <paravoid> there are ways to operate the cluster without adding every osd to ceph.conf.
[22:59] <TiCPU> (maybe with ceph-deploy, but I never used it)
[22:59] <joao> jfriedly, the monitors do not rely on ceph.conf
[22:59] <joao> err
[22:59] <joao> wrt the osds
[23:00] <joao> those infos are primarily for the osds themselves
[23:00] <TiCPU> each OSD reads its ceph.conf to configure itself, but monitors don't have to be aware of it in ceph.conf
[23:00] <jfriedly> Ok, thank you
[23:00] <jfriedly> Will OSDs reconfigure themselves if you overwrite their ceph.conf while they're running?
[23:01] <joao> no
[23:03] * ScOut3R (~ScOut3R@54024172.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[23:06] <jfriedly> So it sounds like ceph.conf is only used for initializing OSDs with both info on how to contact the monitors and their peer OSDs, but after the cluster is running, new OSDs can be added and the old OSDs will learn about it from the monitor(s)?
[23:06] <sjustlaptop> jfriedly: the ceph.conf is really only used for mon addresses and config options
[23:06] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[23:07] <joao> paravoid, do you have a larger chunk of the log on #5256?
[23:07] <paravoid> yes
[23:07] <joao> mind pointing me to it? :)
[23:07] <paravoid> I was about to :)
[23:07] <joao> cool thx
[23:08] <paravoid> in fact let me cephdrop it all
[23:09] <joao> that works too
[23:09] <jfriedly> sjustlaptop: Ok thanks. Would the whole thing work if we didn't specify any OSDs in our ceph.conf at all then?
[23:10] <paravoid> joao: 5256.log.gz
[23:10] <sjustlaptop> jfriedly: there are ways to configure it that way I think
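
As a rough illustration of one such way, a new OSD can be registered with the monitors at runtime without ever being listed in ceph.conf; a sketch of the manual steps, where the returned osd id (12), the data path, and the host/weight in the CRUSH step are assumptions, and the exact crush syntax varies a bit between releases:

    # Ask the monitors to allocate a new OSD id (printed on stdout):
    ceph osd create

    # Prepare the data directory and key for the new id (say it returned 12):
    sudo mkdir -p /var/lib/ceph/osd/ceph-12
    sudo ceph-osd -i 12 --mkfs --mkkey
    sudo ceph auth add osd.12 osd 'allow *' mon 'allow rwx' \
        -i /var/lib/ceph/osd/ceph-12/keyring

    # Place it in the CRUSH map and start the daemon; existing OSDs learn about
    # it from the new OSD map the monitors publish, not from ceph.conf:
    ceph osd crush set osd.12 1.0 root=default host=node3
    sudo ceph-osd -i 12
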
[23:10] <joao> ty
[23:10] <jfriedly> Alright, I think I just need to read more docs then. Thanks for the help guys
[23:22] * dosaboy (~dosaboy@eth3.bismuth.canonical.com) Quit (Quit: leaving)
[23:23] * fmarchand (~fmarchand@a.clients.kiwiirc.com) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[23:30] * portante (~user@66.187.233.206) has joined #ceph
[23:34] * andrei (~andrei@host86-155-31-94.range86-155.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[23:36] <cjh_> with rbd diff would it be possible to break it up into multiple pieces before sending it?
[23:48] * mattbenjamin (~matt@aa2.linuxbox.com) Quit (Quit: Leaving.)
[23:49] * eschnou (~eschnou@220.177-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[23:54] * diegows (~diegows@200.68.116.185) Quit (Ping timeout: 480 seconds)
[23:54] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (Read error: Connection reset by peer)
[23:54] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.