#ceph IRC Log


IRC Log for 2012-12-04

Timestamps are in GMT/BST.

[0:01] * dshea (~dshea@masamune.med.harvard.edu) Quit (Quit: Leaving)
[0:01] * drokita (~drokita@ Quit (Read error: Connection reset by peer)
[0:01] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[0:01] * loicd (~loic@2a01:e35:2eba:db10:dd29:5e92:673f:162b) has joined #ceph
[0:03] * cblack101 (c0373628@ircip2.mibbit.com) has joined #ceph
[0:04] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has left #ceph
[0:04] <cblack101> Quick question... I'm building a new cluster of Ubuntu 12.10, what testing path should I use in the /etc/apt/sources.list (deb http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/testing precise main was the old one for precise)
[0:04] <cblack101> I want to run the testing version
[0:05] <dmick> cblack101: we don't have quantal builds yet, so it's precise or nothing. I think those packages still work OK
[0:05] <cblack101> cool, ty!
[0:05] <dmick> do let us know if that's not the case :)
[0:17] * calebamiles1 (~caleb@c-98-197-128-251.hsd1.tx.comcast.net) has joined #ceph
[0:21] * calebamiles (~caleb@c-98-197-128-251.hsd1.tx.comcast.net) Quit (Ping timeout: 480 seconds)
[0:21] * jlogan (~Thunderbi@2600:c00:3010:1:e12a:776f:2a6d:8a8) has joined #ceph
[0:24] * calebamiles1 (~caleb@c-98-197-128-251.hsd1.tx.comcast.net) Quit (Quit: Leaving.)
[0:25] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[0:27] * cblack101 (c0373628@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[0:31] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[0:32] * mooperd (~andrew@dslb-188-103-067-049.pools.arcor-ip.net) Quit (Ping timeout: 480 seconds)
[0:32] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) has joined #ceph
[0:42] * maxiz (~pfliu@ Quit (Ping timeout: 480 seconds)
[0:43] * densone (~densone@74-92-51-22-NewEngland.hfc.comcastbusiness.net) has joined #ceph
[0:45] <densone> I've always looked for the Ceph IRC Channel on Freenode.
[0:45] <densone> Now I know why it's always empty :)
[0:45] <rweeks> correct, sir.
[0:45] <rweeks> or madam.
[0:45] <densone> Sir.
[0:46] <rweeks> details here: http://ceph.com/resources/mailing-list-irc/
[0:46] <densone> Are there any good performance docs laying around for Ceph S3 Rados?
[0:47] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Remote host closed the connection)
[0:50] <terje> hey guys, I'm wondering if anyone has seen this before from ceph-fuse:
[0:50] <terje> 2012-12-03 16:48:43.701104 7ffca0dfa700 0 -- >> pipe(0x1ecc330 sd=0 :33625 pgs=0 cs=0 l=0).connect claims to be not - wrong node!
[0:51] <joshd> terje: yeah, sage fixed some bugs in the osd recently that resulted in those warnings
[0:51] <rweeks> I'm not sure if we have performance numbers for the rados gateway
[0:52] <terje> can they be ignored>
[0:52] <terje> ?
[0:52] <terje> I ask because that ceph-fuse on that node is using 100% of one of my cpu's
[0:53] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[0:53] * tnt (~tnt@251.163-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[0:53] <joshd> try attaching to it in gdb (gdb -p $PID) and getting a backtrace (bt)
[0:54] <densone> rweeks: Thanks.
[0:54] <joshd> I don't think that bug would cause 100% cpu usage, but maybe indirectly
[0:55] <densone> I might do some testing. I have 5 36TB Servers laying around. Might do some playing.
[0:55] <terje> I'll try and unmount it, then re-mount it and see if that helps.
[0:55] * miroslav1 (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) has joined #ceph
[0:56] * densone (~densone@74-92-51-22-NewEngland.hfc.comcastbusiness.net) Quit (Quit: densone)
[0:57] * plut0 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) has joined #ceph
[0:58] <plut0> hi
[0:58] * PerlStalker (~PerlStalk@ Quit (Quit: ...)
[1:00] * calebamiles (~caleb@ has joined #ceph
[1:00] * benpol (~benp@garage.reed.edu) has joined #ceph
[1:00] <terje> restarting ceph-fuse seems to have cleared up that issue.
[1:01] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Ping timeout: 480 seconds)
[1:03] * sjustlaptop (~sam@ has joined #ceph
[1:07] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[1:08] * BManojlovic (~steki@242-174-222-85.adsl.verat.net) Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:26] * maxiz (~pfliu@ has joined #ceph
[1:27] * densone (~densone@74-92-51-22-NewEngland.hfc.comcastbusiness.net) has joined #ceph
[1:29] * Cube (~Cube@ has joined #ceph
[1:29] * sjustlaptop (~sam@ Quit (Ping timeout: 480 seconds)
[1:31] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[1:50] * mooperd_ (~andrew@dslb-094-223-143-175.pools.arcor-ip.net) has joined #ceph
[1:52] * jlogan1 (~Thunderbi@2600:c00:3010:1:742d:702d:9c2f:e934) has joined #ceph
[1:52] * mooperd_ (~andrew@dslb-094-223-143-175.pools.arcor-ip.net) Quit ()
[1:53] * jlogan2 (~Thunderbi@ has joined #ceph
[1:56] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[1:57] * jlogan (~Thunderbi@2600:c00:3010:1:e12a:776f:2a6d:8a8) Quit (Ping timeout: 480 seconds)
[2:00] * jlogan1 (~Thunderbi@2600:c00:3010:1:742d:702d:9c2f:e934) Quit (Ping timeout: 480 seconds)
[2:01] <densone> Say I have some fairly large servers. Is there any performance benefit to running more than one odd?
[2:02] <densone> osd*
[2:02] <densone> Per server.
[2:02] <rweeks> we recommend you run one OSD per disk
[2:02] <rweeks> that way your server isn't a single failure domain.
[2:03] <densone> ok. So skip any form of Soft or Hard Raid.
[2:03] <densone> I like that.
[2:03] <rweeks> that's the design, yeah
[2:04] <iggy> densone: when you have 1 gigantic osd, it takes that much longer to rebuild on failure
[2:04] <rweeks> let Ceph take care of the replicas at the object level.
[2:04] * yoshi (~yoshi@p4105-ipngn4301marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:04] <densone> Awesome.
[2:04] <densone> So I am still reading the docs which are fantastic btw.
[2:05] <densone> This might be the wrong question, but I am assuming replica management happens at the CRUSH Map?
[2:05] <rweeks> here: http://ceph.com/docs/master/rados/operations/data-placement/
[2:06] <densone> Why thank you.
[2:08] <densone> rweeks: Are you an Inktank person?
[2:08] <rweeks> I am!
[2:08] <densone> Cool.
[2:08] <densone> Been following your recent investments
[2:08] <densone> Congrats
[2:08] <rweeks> thanks. most of the developers in here also work for inktank
[2:08] <densone> I've been following Ceph since like 2007ish
[2:09] <densone> Or since it was like 0.1
[2:09] <densone> Forever ago.
[2:09] <densone> But back in that day I was more interested in storing data in a SAN like way.
[2:09] <densone> Now I focus on object storage.
[2:10] <rweeks> cool
[2:10] <densone> What kind of test hardware do you use at Inktank?
[2:10] <rweeks> I came from a Big Storage Vendor(tm)
[2:10] <rweeks> I think SAN is … well let's just say I'm not a fan.
[2:10] <densone> Yeah. It's ugly.
[2:10] <rweeks> nhm could probably answer that better than I can
[2:11] <densone> I was looking for a way to ditch EMC CLARiiON NFS volumes when I looked at Ceph back in the day.
[2:11] <rweeks> pff
[2:11] <rweeks> emc nfs
[2:11] <rweeks> that's… hah
[2:11] <densone> yar. I was paying some absurd cost.
[2:11] <densone> $2 per gigabyte
[2:11] * benpol (~benp@garage.reed.edu) has left #ceph
[2:12] <rweeks> yep
[2:12] <iggy> that's panasas pricing
[2:13] <densone> I could build an 800TB Ceph object storage cluster pretty cheap.
[2:13] <densone> < 1 penny per GB amortized over 3 years
[2:13] <rweeks> that's why I'm here
[2:13] <rweeks> well, that, and the architecture is something I believe in
[2:14] <iggy> i'm starting a new job tomorrow... already thinking of ways to use ceph
[2:14] <rweeks> good to hear, from both of you
[2:15] <densone> Yeah.
[2:15] <densone> Network attached may have advantages
[2:15] <densone> but they are getting fewer
[2:15] <rweeks> well, and you can use ceph as a block device if you need it
[2:16] * loicd (~loic@2a01:e35:2eba:db10:dd29:5e92:673f:162b) Quit (Quit: Leaving.)
[2:17] <densone> true.
[2:18] <rweeks> we've got someone in here who's using LIO to re-export RBD blocks over FC
[2:18] <rweeks> for VMware
[2:19] <densone> So how cumbersome is it managing 1000's of osd's?
[2:20] <rweeks> well that's the thing - if you get ceph set up properly with a crush map that takes into account your cluster location, you shouldn't have to manage them.
[2:24] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[2:31] <iggy> and i think a lot of people are using their normal config mgmt systems to do some of it
[2:35] <densone> I am assuming each ceph.conf has to list each disk on each server as an osd.
[2:35] <densone> Which can be easily automated.
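A minimal sketch of the automation densone describes, assuming the ceph.conf layout of that era with one `[osd.N]` section per disk. The host names, device paths, and the `INVENTORY` structure are hypothetical, and the `devs` key is an assumption that should be checked against the documentation for the Ceph version in use.

```python
# Hypothetical inventory: host name -> data disks on that host.
INVENTORY = {
    "store01": ["/dev/sdb", "/dev/sdc"],
    "store02": ["/dev/sdb", "/dev/sdc"],
}


def render_osd_sections(inventory):
    """Return ceph.conf text with one [osd.N] section per disk."""
    lines = []
    osd_id = 0
    # Sort hosts so OSD ids are assigned the same way on every run.
    for host in sorted(inventory):
        for disk in inventory[host]:
            lines.append(f"[osd.{osd_id}]")
            lines.append(f"    host = {host}")
            # 'devs' is an assumed key name; verify it for your release.
            lines.append(f"    devs = {disk}")
            lines.append("")
            osd_id += 1
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_osd_sections(INVENTORY))
```

Ids here are assigned purely by position, so adding a host that sorts earlier would renumber every later OSD. Anything beyond a first experiment would probably want to pin ids explicitly, for example through the config management systems iggy mentions.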
[2:49] * calebamiles (~caleb@ Quit (Ping timeout: 480 seconds)
[2:58] * mooperd (~andrew@dslb-094-223-143-175.pools.arcor-ip.net) has joined #ceph
[3:19] * mooperd (~andrew@dslb-094-223-143-175.pools.arcor-ip.net) Quit (Quit: mooperd)
[3:39] * plut01 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) has joined #ceph
[3:39] * plut0 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) Quit (Read error: Connection reset by peer)
[3:43] * plut01 (~cory@pool-96-236-43-69.albyny.fios.verizon.net) Quit (Read error: Connection reset by peer)
[3:46] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[3:47] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Quit: Leaving)
[3:51] * densone (~densone@74-92-51-22-NewEngland.hfc.comcastbusiness.net) Quit (Quit: densone)
[4:00] * jlogan2 (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[4:09] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[4:35] * deepsa (~deepsa@ has joined #ceph
[4:40] * jlogan1 (~Thunderbi@2600:c00:3010:1:7db5:bf2b:27d1:c794) has joined #ceph
[4:42] * calebamiles (~caleb@c-98-197-128-251.hsd1.tx.comcast.net) has joined #ceph
[4:43] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[4:47] * Guest424 (~ubuntu@bl5-55-230.dsl.telepac.pt) has joined #ceph
[4:47] <Guest424> hello
[4:47] <Guest424> this is my first experience with xubuntu
[4:48] <Guest424> but i have a problem....i dont listen to music
[4:48] * Guest424 (~ubuntu@bl5-55-230.dsl.telepac.pt) Quit ()
[4:49] * densone (~densone@c-67-189-240-25.hsd1.ma.comcast.net) has joined #ceph
[5:04] * chutzpah (~chutz@ Quit (Quit: Leaving)
[5:11] * miroslav1 (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[5:29] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Read error: Connection reset by peer)
[5:50] * deepsa (~deepsa@ Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[6:22] * The_Bishop (~bishop@f052103079.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[6:25] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.89 [Firefox 17.0/20121119183901])
[6:25] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) Quit (Quit: Leaving.)
[6:26] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) has joined #ceph
[6:33] * The_Bishop (~bishop@f052103079.adsl.alicedsl.de) has joined #ceph
[6:34] * drokita (~drokita@24-107-180-86.dhcp.stls.mo.charter.com) Quit (Ping timeout: 480 seconds)
[6:42] * jlogan1 (~Thunderbi@2600:c00:3010:1:7db5:bf2b:27d1:c794) Quit (Ping timeout: 480 seconds)
[6:53] * boll (~boll@00012a62.user.oftc.net) Quit (Quit: boll)
[6:54] * calebamiles1 (~caleb@c-98-197-128-251.hsd1.tx.comcast.net) has joined #ceph
[7:01] * calebamiles (~caleb@c-98-197-128-251.hsd1.tx.comcast.net) Quit (Ping timeout: 480 seconds)
[7:02] * dmick (~dmick@2607:f298:a:607:7992:e07e:4bb4:6742) Quit (Quit: Leaving.)
[7:17] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[7:51] * tnt (~tnt@251.163-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:07] * loicd (~loic@2a01:e35:2eba:db10:dd29:5e92:673f:162b) has joined #ceph
[8:13] * loicd (~loic@2a01:e35:2eba:db10:dd29:5e92:673f:162b) Quit (Quit: Leaving.)
[8:13] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:14] * loicd (~loic@magenta.dachary.org) Quit ()
[8:15] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[8:19] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[8:34] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[8:35] * boll (~boll@00012a62.user.oftc.net) has joined #ceph
[8:59] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[9:05] * densone (~densone@c-67-189-240-25.hsd1.ma.comcast.net) Quit (Quit: densone)
[9:05] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:08] * densone (~densone@c-67-189-240-25.hsd1.ma.comcast.net) has joined #ceph
[9:15] * BManojlovic (~steki@ has joined #ceph
[9:16] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[9:18] * nosebleedkt (~kostas@kotama.dataways.gr) has joined #ceph
[9:18] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:25] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[9:28] * densone (~densone@c-67-189-240-25.hsd1.ma.comcast.net) Quit (Quit: densone)
[9:33] * xiaoxi (~xiaoxiche@jfdmzpr06-ext.jf.intel.com) has joined #ceph
[9:33] * Psi-Jack_ (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[9:34] <xiaoxi> hi, what does the "FULL_WAIT" status stand for in FileJournal.cc?
[9:35] <agh> hello, does someone have a great Zabbix Ceph template ?
[9:40] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:40] * Psi-Jack_ is now known as Psi-jack
[9:42] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[9:46] * tnt (~tnt@251.163-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:58] <ctrl> agh: Hey! Did you look here: http://ceph.com/w/index.php?title=Image:Zabbix_ceph_templates.xml&redirect=no ?
[9:59] <agh> ctrl: hey! no i did not! Thanks a lot
[10:03] * loicd (~loic@ has joined #ceph
[10:04] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[10:07] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[10:13] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[10:18] * Leseb (~Leseb@ has joined #ceph
[10:18] * ScOut3R (~ScOut3R@ has joined #ceph
[10:19] * fr0st (~matt@mail.base3.com.au) Quit (Read error: Connection reset by peer)
[10:26] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[10:29] * loicd (~loic@ has joined #ceph
[10:38] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:40] * nosebleedkt_ (~kostas@ has joined #ceph
[10:48] * nosebleedkt (~kostas@kotama.dataways.gr) Quit (Ping timeout: 480 seconds)
[10:48] * Kioob`Taff1 (~plug-oliv@local.plusdinfo.com) has joined #ceph
[10:49] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) has joined #ceph
[11:28] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[11:38] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) has joined #ceph
[11:41] <nosebleedkt_> hi everybody
[11:48] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) Quit (Quit: mooperd)
[11:49] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[11:49] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) has joined #ceph
[11:53] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) has left #ceph
[11:55] <jtangwk> morning
[12:00] * maxiz (~pfliu@ Quit (Ping timeout: 480 seconds)
[12:05] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) Quit (Quit: mooperd)
[12:07] * ghbizness (~ghbizness@host-208-68-233-254.biznesshosting.net) Quit (Read error: Connection reset by peer)
[12:07] * ghbizness (~ghbizness@host-208-68-233-254.biznesshosting.net) has joined #ceph
[12:07] * yoshi (~yoshi@p4105-ipngn4301marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[12:23] * match (~mrichar1@pcw3047.see.ed.ac.uk) has joined #ceph
[12:29] * tziOm (~bjornar@ has joined #ceph
[12:50] * mooperd (~andrew@ has joined #ceph
[12:50] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[12:51] * mooperd (~andrew@ Quit (Remote host closed the connection)
[12:52] * mooperd (~andrew@ has joined #ceph
[12:58] * joao (~JL@ Quit (Read error: Connection reset by peer)
[13:27] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[13:28] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[13:52] <nosebleedkt_> I use 'rbd snap create rbd/foo@mysnap'
[13:52] <nosebleedkt_> to create a snapshot of my foo image
[13:52] <nosebleedkt_> which is a rados block device
[13:53] <nosebleedkt_> Now what's the point of using the snapshot?
[13:53] <nosebleedkt_> and how can i possibly use it ?
[13:53] <nosebleedkt_> why should I need a snapshot?
[13:54] <nosebleedkt_> ping tnt
[14:01] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[14:01] * ChanServ sets mode +o elder
[14:02] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[14:10] * loicd (~loic@ has joined #ceph
[14:13] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[14:13] * BManojlovic (~steki@ has joined #ceph
[14:17] * joao (~JL@ has joined #ceph
[14:17] * ChanServ sets mode +o joao
[14:52] <jefferai> any news on if the issue(s) causing slow queries to build up on an osd have been found/solved for bobtail?
[14:53] * l3akage (~l3akage@martinpoppen.de) has joined #ceph
[14:55] <Robe> cool
[14:55] <Robe> 0.55 has been stamped?
[14:59] * guigouz (~guigouz@ has joined #ceph
[15:04] * Psi-Jack_ (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[15:10] <tnt> so it would seem. It is however, not bobtail. This has been delayed to 0.56
[15:11] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Ping timeout: 480 seconds)
[15:11] * Psi-Jack_ is now known as Psi-jack
[15:27] * drokita (~drokita@ has joined #ceph
[15:28] * The_Bishop_ (~bishop@e179015003.adsl.alicedsl.de) has joined #ceph
[15:34] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[15:35] * The_Bishop (~bishop@f052103079.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[15:36] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[15:56] * loicd (~loic@magenta.dachary.org) has joined #ceph
[16:05] * PerlStalker (~rbsmith@ has joined #ceph
[16:07] * boll is now known as Guest471
[16:07] * boll (~boll@00012a62.user.oftc.net) has joined #ceph
[16:07] * boll (~boll@00012a62.user.oftc.net) Quit ()
[16:07] * nosebleedkt_ (~kostas@ Quit (Quit: Leaving)
[16:13] * Guest471 (~boll@00012a62.user.oftc.net) Quit (Ping timeout: 480 seconds)
[16:21] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[16:22] * PerlStalker (~rbsmith@ has left #ceph
[16:24] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[16:25] * PerlStalker (~PerlStalk@ has joined #ceph
[16:30] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Remote host closed the connection)
[16:31] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Quit: tryggvil)
[16:38] * cdblack (c0373727@ircip4.mibbit.com) has joined #ceph
[16:38] <cdblack> Morning all!
[16:44] * jlogan (~Thunderbi@2600:c00:3010:1:7db5:bf2b:27d1:c794) has joined #ceph
[16:50] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[16:51] <jmlowe> Anybody on that can poke the debian-testing repo?
[16:52] <jmlowe> I see the 0.55 packages but the http://ceph.com/debian-testing/dists/quantal/main/binary-amd64/Packages only has 0.54
[16:53] * densone (~densone@c-67-189-240-25.hsd1.ma.comcast.net) has joined #ceph
[16:54] * Kioob`Taff1 (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[16:57] * aliguori (~anthony@ has joined #ceph
[17:07] * gaveen (~gaveen@ has joined #ceph
[17:08] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:11] <cdblack> Hey all, I'm trying to install testing version on Ubuntu 11.10, have the /etc/apt/sources.list.d/ceph.list configured with deb http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/testing precise main... But am getting: public key is not available: NO_PUBKEY 6EAEAE2203C3951A
[17:11] <cdblack> Any way I can force this update?
[17:11] <cdblack> The following packages have been kept back:
[17:12] <cdblack> ceph ceph-common libcephfs1 librados2 librbd1
[17:12] <cdblack> *Ubuntu 12.10 not 11.10
[17:12] <jmlowe> you will need to add the key
[17:13] <jmlowe> let me see if I can't find the instructions
[17:13] <cdblack> I did this step already: wget -q -O- https://raw.github.com/ceph/ceph/master/keys/release.asc | sudo apt-key add -
[17:13] * aliguori (~anthony@ Quit (Read error: Connection reset by peer)
[17:15] <cdblack> Thanks for the assist jm
[17:17] * jtangwk1 (~Adium@2001:770:10:500:f840:e792:e137:7a31) has joined #ceph
[17:17] * jtangwk (~Adium@2001:770:10:500:24d9:d548:19a2:6ea1) Quit (Read error: Connection reset by peer)
[17:29] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:30] * aliguori (~anthony@ has joined #ceph
[17:31] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[17:31] <jmlowe> distracted for a few there
[17:35] * rweeks (~rweeks@c-24-4-66-108.hsd1.ca.comcast.net) has joined #ceph
[17:38] <cdblack> no worries man
[17:38] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[17:39] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:39] <jmlowe> I have this one
[17:39] <jmlowe> pub 4096R/17ED316D 2012-05-20
[17:39] <jmlowe> Key fingerprint = 7F6C 9F23 6D17 0493 FCF4 04F2 7EBF DD5D 17ED 316D
[17:39] <jmlowe> uid Ceph Release Key <sage@newdream.net>
[17:40] <jmlowe> and this one
[17:40] <jmlowe> pub 1024D/03C3951A 2011-02-08 [expires: 2013-02-07]
[17:40] <jmlowe> Key fingerprint = FCC5 CB2E D8E6 F6FB 79D5 B331 6EAE AE22 03C3 951A
[17:40] <jmlowe> uid Ceph automated package build (Ceph automated package build) <sage@newdream.net>
[17:40] <jmlowe> sub 4096g/2E457B51 2011-02-08 [expires: 2013-02-07]
[17:40] <jmlowe> apt-key fingerprint
[17:40] <jmlowe> or apt-key finger
[17:40] * jtangwk1 (~Adium@2001:770:10:500:f840:e792:e137:7a31) Quit (Remote host closed the connection)
[17:41] * loicd (~loic@magenta.dachary.org) Quit ()
[17:41] <cdblack> apt-key finger produces: pub 4096R/17ED316D 2012-05-20
[17:41] * jtangwk (~Adium@2001:770:10:500:f840:e792:e137:7a31) has joined #ceph
[17:41] <cdblack> Key fingerprint = 7F6C 9F23 6D17 0493 FCF4 04F2 7EBF DD5D 17ED 316D
[17:41] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[17:41] <cdblack> uid Ceph Release Key <sage@newdream.net>
[17:42] <cdblack> how do I add that other one? (951A)
[17:42] <jmlowe> so I guess the question is where did I get 03C3951A
[17:42] <cdblack> can I just append this to a txt file somewhere?
[17:44] * drokita (~drokita@ Quit (Read error: Connection reset by peer)
[17:44] * fc is now known as fc__
[17:44] * drokita (~drokita@ has joined #ceph
[17:44] <jmlowe> how about adding this one https://raw.github.com/ceph/ceph/master/keys/autobuild.asc
[17:47] <cdblack> Key is in (951A) after: wget -q -O- https://raw.github.com/ceph/ceph/master/keys/autobuild.asc | sudo apt-key add -
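The reason the second key fixes the error can be checked by hand: the ID in apt's NO_PUBKEY message is the last 16 hex digits of the key's fingerprint. A small sketch of that comparison, using the error text and the two fingerprints pasted above; the function names and the `release`/`autobuild` labels (taken from the `.asc` file names) are only illustrative.

```python
import re


def missing_key_id(apt_error):
    """Extract the key ID from an apt 'NO_PUBKEY <id>' message, or None."""
    match = re.search(r"NO_PUBKEY\s+([0-9A-Fa-f]{16})", apt_error)
    return match.group(1).upper() if match else None


def long_key_id(fingerprint):
    """Return the last 16 hex digits of a fingerprint (the 64-bit key ID)."""
    hex_digits = re.sub(r"\s+", "", fingerprint).upper()
    return hex_digits[-16:]


# Fingerprints as pasted in the channel.
FINGERPRINTS = {
    "release": "7F6C 9F23 6D17 0493 FCF4 04F2 7EBF DD5D 17ED 316D",
    "autobuild": "FCC5 CB2E D8E6 F6FB 79D5 B331 6EAE AE22 03C3 951A",
}

APT_ERROR = "public key is not available: NO_PUBKEY 6EAEAE2203C3951A"

if __name__ == "__main__":
    wanted = missing_key_id(APT_ERROR)
    for name, fingerprint in FINGERPRINTS.items():
        status = "match" if long_key_id(fingerprint) == wanted else "no match"
        print(f"{name}: {status}")
```

Only the autobuild fingerprint ends in 6EAE AE22 03C3 951A, which is consistent with the gitbuilder packages being signed by the automated build key rather than the release key.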
[17:48] * fmarchand (~fmarchand@85-168-75-42.rev.numericable.fr) has joined #ceph
[17:48] <fmarchand> hi !
[17:48] <cdblack> apt-get update && apt-get upgrade -y yields: The following packages have been kept back: ceph ceph-common libcephfs1 librados2 librbd1 0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.
[17:48] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[17:48] <cdblack> How do I force the upgrade from .48 to current testing version (.54 I think)
[17:49] <cdblack> Error is gone now
[17:49] <rweeks> .55 was released this morning
[17:49] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:49] <cdblack> SWEET!!!!
[17:49] <fmarchand> I have a question .... cdblack : how do I upgrade ceph ?
[17:49] <rweeks> but I am not sure there are packages for it yet
[17:49] <rweeks> Note: .55 is not the long-term release, though.
[17:49] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) has joined #ceph
[17:49] <rweeks> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/11173
[17:49] <rweeks> See Sage's announcement here
[17:50] <jmlowe> dist-upgrade
[17:50] <cdblack> sorry, still a n00b
[17:50] <jmlowe> upgrade is safe, dist-upgrade can break things
[17:51] <jmlowe> kernel updates and major ceph revisions take dist-upgrade
[17:51] <fmarchand> 0.55 is a dev release or a release ?
[17:51] <jmlowe> not that dist-upgrade is dangerous, it just carries more weight than upgrade
[17:52] <joao> fmarchand, dev release
[17:52] * joao sets mode -o joao
[17:52] <cdblack> ok, that still yields: 5 not upgraded
[17:52] <cdblack> ceph ceph-common libcephfs1 librados2 librbd1
[17:53] <fmarchand> oki oh hi joao :)
[17:53] <joao> hello :)
[17:53] <fmarchand> I would like to change my osd to rados ...
[17:54] <rweeks> If you read Sage's announcement, we're planning on 0.56 to be the release, and 0.55 is a dev/testing release
[17:54] <fmarchand> my mds crashes every day ... for no reason ... I mean no understandable reason
[17:54] <jtang> so are most of the more significant cephfs stabilisations going to hit 0.56?
[17:54] <fmarchand> And I feel that rados is a better option than cephfs for prod
[17:54] <jtang> i didnt notice much cephfs stuff in 0.55
[17:55] <via> i too have huge problems with mds crashes
[17:55] <via> but since i've been told its not considered stable, i shouldn't expect anything else
[17:55] <fmarchand> via : and you solved it ?
[17:55] <via> no
[17:55] <fmarchand> oh oki
[17:55] <fmarchand> that's why I'm gonna changed to rados
[17:55] <fmarchand> change
[17:55] <jtang> btw the rbd driver crashes in a seemingly random way on 3.6.8-1.el6.elrepo.x86_64
[17:56] <Robe> any announcements to read on the bobtail delay?
[17:56] <jtang> linux kernel 3.6.8-1.el6.elrepo.x86_64 that is
[17:56] <Robe> +are there
[17:56] <jtang> do not 3.6.8-1.el6.elrepo.x86_64
[17:57] <jtang> i mean to say 3.0.52
[17:58] * densone (~densone@c-67-189-240-25.hsd1.ma.comcast.net) Quit (Quit: densone)
[17:59] <jtang> so from some minor testing, 3.0.52 with the rbd driver isn't a good idea on RHEL6 based distros (kernel was taken from elrepo)
[18:00] * tnt (~tnt@207.171-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:02] <jmlowe> I don't think the debian-testing repo was updated for 0.55
[18:04] * ScOut3R (~ScOut3R@ Quit (Remote host closed the connection)
[18:04] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[18:11] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[18:11] <cdblack> Installing testing version on Ubuntu 12.10: get these dependency errors: ceph : Depends: libboost-thread1.46.1 (>= 1.46.1-1) but it is not installable and Depends: libgoogle-perftools0 but it is not installable
[18:11] <cdblack> Any idea how to resolve?
[18:12] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit ()
[18:13] * boll (~boll@00012a62.user.oftc.net) has joined #ceph
[18:14] <jtang> hey nhm, have you seen this http://www.newmexicoconsortium.org/probe
[18:15] <jtang> i cant remember if non-academics can apply for time or not
[18:15] <jtang> but it might be a good idea to check if you can apply for time, you could potentially have the capability to set up a 1000node ceph cluster for testing
[18:15] <jtang> at least for a short amount of time
[18:16] <jtang> or set up varying ratios of osd's to clients
[18:16] * fmarchand (~fmarchand@85-168-75-42.rev.numericable.fr) Quit (Ping timeout: 480 seconds)
[18:16] * benpol (~benp@garage.reed.edu) has joined #ceph
[18:17] <cdblack> On the 12.10 dependency thing: libboost-thread1.49.0 and libgoogle-perftools4 are installed from dpkg -l | grep boost/google
[18:20] * match (~mrichar1@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[18:20] * Leseb (~Leseb@ Quit (Quit: Leseb)
[18:21] <benpol> I see the 0.55 Debian packages have been uploaded, but the repo metadata in http://ceph.com/debian-testing/dists/squeeze/ hasn't been refreshed yet.
[18:23] * benpol is looking forward to trying out 0.55 :)
[18:24] <jmlowe> benpol: I'm having the same trouble with quantal
[18:25] * gaveen (~gaveen@ Quit (Quit: Leaving)
[18:25] * gaveen (~gaveen@ has joined #ceph
[18:26] * rweeks (~rweeks@c-24-4-66-108.hsd1.ca.comcast.net) Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[18:30] <via> looks like someone broke the EL initscript for 0.55
[18:30] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) has joined #ceph
[18:31] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) Quit ()
[18:33] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) has joined #ceph
[18:36] <via> holy crap, even after fixing the init script, osd's and mds's now crash on start
[18:36] * via wonders if this was tested at all on el6
[18:36] * Cube (~Cube@ has joined #ceph
[18:40] * glowell (~glowell@c-98-210-224-250.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:44] * drokita (~drokita@ Quit (Quit: Leaving.)
[18:45] <yehudasa> via: what kind of crash are you seeing?
[18:45] <yehudasa> elder: can you update the title?
[18:46] <via> well, the init script has a very basic syntax error in it, and the osd's are just dumping 'aborted' to the log
[18:46] <via> i'm trying with gdb
[18:47] * drokita (~drokita@ has joined #ceph
[18:48] <via> yehudasa: http://pastebin.com/acwCP45P
[18:48] <via> let me know if i should install debuginfos for those other things
[18:49] <via> also, in the initscript, the whole fs handling code calls btrfs and modprobe with invalid commandline options
[18:49] <via> seems like it was written purely for ubuntu with no regard for el
[18:49] <yehudasa> via: did you have cephx turned on beforehand?
[18:50] <via> yes
[18:50] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) has joined #ceph
[18:50] <via> none of the keys changed, so i assume i don't really have to change anything for that, right?
[18:55] <yehudasa> via: can you have that reproduced with 'debug ms = 20' and 'debug auth = 20'?
[18:57] <via> i can, also, mds produces this output which i can also do with increased debug if you'd like: http://pastebin.com/2MXDcK3U
[18:58] * ChanServ sets mode +o joao
[18:58] * joao changes topic to 'v0.55 has been released -- http://goo.gl/r6OG1'
[18:58] <via> uh...so running ceph-osd again with those produced even less output
[18:58] <via> even with -d
[18:59] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:00] <yehudasa> via: my guess is that if you turn off cephx in the conf it's going to work now
[19:01] <via> by turn off do you mean actually set it to not cephx or just comment it out?
[19:01] <via> i commented it out and the same happens
[19:02] <yehudasa> via: set it to not use cephx explicitly
[19:03] <yehudasa> via: what's the conf option that you're using?
[19:03] <via> auth supported = cephx
[19:03] <via> so = none?
[19:04] <via> it aborted still
[19:06] <via> backtrace is the same
[19:06] <yehudasa> auth cluster required = false
[19:06] <via> ok
[19:06] <yehudasa> I mean
[19:06] <yehudasa> auth cluster required = none
[19:06] <yehudasa> auth service required = none
[19:07] <via> the last one?
[19:07] <yehudasa> last two
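A rough sketch of the ceph.conf change being discussed, written with Python's standard `configparser`. The starting file is hypothetical apart from the `auth supported = cephx` line via quoted. Setting the old option to `none` mirrors what via had already tried, and the two new option names are the ones yehudasa gives above. Whether the old option should stay at all is not settled in the log.

```python
import configparser
import io

# Hypothetical starting ceph.conf; only the auth line comes from the log.
OLD_CONF = """\
[global]
auth supported = cephx
"""


def disable_cephx(conf_text):
    """Return conf_text with authentication set to 'none' in [global]."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        parser.add_section("global")
    # Mirror via's earlier attempt if the old combined option is present.
    if parser.has_option("global", "auth supported"):
        parser.set("global", "auth supported", "none")
    # The two options yehudasa names explicitly.
    parser.set("global", "auth cluster required", "none")
    parser.set("global", "auth service required", "none")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(disable_cephx(OLD_CONF))
```

`configparser` drops comments when it writes the file back, so on a real ceph.conf editing the lines by hand is probably the simpler route. The sketch only shows which keys end up set.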
[19:07] * rweeks (~rweeks@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[19:09] <via> okay, it doesn't abort, it exits gracefully now
[19:09] * fmarchand (~fmarchand@85-168-75-42.rev.numericable.fr) has joined #ceph
[19:10] <yehudasa> via: is there anything interesting in the logs?
[19:10] <via> i'm getting it
[19:10] <jmlowe> anybody on that can update the debian-testing repo metadata to point to 0.55 instead of 0.54?
[19:10] <via> how come when debugging is turned on and -d is used, it outputs less to the console but more to the log
[19:10] <yehudasa> via: are you mixing -d and -f?
[19:11] * chutzpah (~chutz@ has joined #ceph
[19:11] <via> maybe
[19:11] <via> -d says foreground logs to stderr
[19:11] <via> so i don't think so
[19:11] <via> unless the manpage is wrong
[19:11] * The_Bishop_ (~bishop@e179015003.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[19:12] <yehudasa> via: not sure, I think -d used to be 'daemon' which would explain the behavior
[19:14] <yehudasa> ah, looking at the code, -d means daemonize=false, log_file="", pid_file="", log_to_stderr=true, err_to_stderr=true, log_to_syslog=false
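What yehudasa reads out of the code can be restated as a set of overrides applied on top of whatever the config already says. The sketch below is illustrative only: the setting names and values are the ones quoted above, while the `apply_d_flag` helper and the sample base config are hypothetical.

```python
# Overrides implied by -d, as quoted in the conversation above.
D_FLAG_OVERRIDES = {
    "daemonize": False,
    "log_file": "",
    "pid_file": "",
    "log_to_stderr": True,
    "err_to_stderr": True,
    "log_to_syslog": False,
}


def apply_d_flag(config):
    """Return a copy of `config` with the -d overrides applied (hypothetical helper)."""
    merged = dict(config)
    merged.update(D_FLAG_OVERRIDES)
    return merged


if __name__ == "__main__":
    # The base values here are made up for the example.
    base = {"daemonize": True, "log_file": "/var/log/ceph/osd.0.log"}
    for key, value in sorted(apply_d_flag(base).items()):
        print("{} = {!r}".format(key, value))
```

Under this reading, -d empties `log_file`, so output would be expected on stderr rather than in the log file.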
[19:14] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[19:17] <via> so... -d is what i'm supposed to be using, right?
[19:17] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[19:18] <yehudasa> via: if you want to get log dumped to your console
[19:19] <via> well, it does not
[19:19] * via tries some things
[19:20] <via> just to confirm, in ceph.conf under [osd], i have debug osd = 20
[19:20] <via> debug auth = 20
[19:23] <via> there's actually not anything of value in the log either
[19:23] <via> it just exits gracefully with no output
[19:25] <yehudasa> via: how are you running the osd?
[19:25] <via> ceph-osd -d -i 0
[19:27] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[19:28] <yehudasa> well, it might be that -d doesn't work as intended, you can remove it and should have logs in your log file
[19:30] <via> okay, it looks like it might be an auth error, but the mon probably wasn't restarted since i disabled cephx, let me try that
[19:34] <via> it doesn't look like it's dying, but now ceph -s doesn't work because it's trying to use cephx
[19:35] <fmarchand> via : disable it in your ceph.conf
[19:35] <via> it is
[19:35] <via> that said, ceph -s works on another node, coincidentally where i have not disabled cephx
[19:35] <fmarchand> oops sorry I did not read everything
[19:36] <via> but yeah, none of the daemons are dying
[19:38] <via> so where from here, re-enable cephx?
[19:43] * fmarchand (~fmarchand@85-168-75-42.rev.numericable.fr) Quit (Ping timeout: 480 seconds)
[19:44] * fmarchand (~fmarchand@ has joined #ceph
[19:44] <via> re-enabling it results in osd death again
[19:44] <via> and i still can't ceph -s
[19:47] * mooperd (~andrew@ Quit (Quit: mooperd)
[19:53] <via> well, since i'm gathering the auth config syntax has changed, what is the correct way to enable cephx?
[19:54] <yehudasa> via: try adding 'auth_client_required = none' to your ceph.conf
[19:55] <yehudasa> via: http://ceph.com/docs/master/rados/operations/authentication/
[19:55] <via> well, since i'm trying to re-enable cephx i assume that should be = cephx?
[19:57] <via> i mean, everything works fine without cephx, but i can't keep it like that
[20:04] * dmick (~dmick@2607:f298:a:607:88ef:4fcc:fc5a:8b23) has joined #ceph
[20:06] * Ryan_Lane (~Adium@ has joined #ceph
[20:08] <yehudasa> via: try to go through the config instructions in the docs
[20:09] <via> the docs suggest i should just remove all the auth lines to enable cephx
[20:10] <yehudasa> via: you may need to make sure that the keys are in place
[20:11] <via> keep in mind this was a fully functioning cluster with 0.54 this morning
[20:11] <via> with cephx
[20:14] <via> to verify the keys do i have to disable cephx and restart things?
[20:14] <via> since none of the ceph auth commands will work without auth
[20:17] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Quit: Leaving)
[20:17] <via> after doing so, my client.auth key matches what's in my keyring
[20:17] <via> er, client.admin
[20:17] <via> just like it was this morning when auth worked fine
[20:22] <fmarchand> I upgraded ceph but it didn't take the 0.55. instead it took the 0.54.1 ... maybe I missed something ...
[20:23] <fmarchand> I added the dev release repo ...
[20:23] <via> yehudasa: see a problem here? http://pastebin.com/AN7utY11
[20:24] <jmlowe> fmarchand: they haven't updated the repo metadata to point to the new packages
[20:25] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[20:25] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit ()
[20:25] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[20:26] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[20:27] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) has joined #ceph
[20:27] <mikedawson> When I bounce nodes, the Mon restarts properly, but the OSDs don't automatically start. Ubuntu 12.10 with 0.54. OSD logs don't show anything after the shutdown. Any ideas?
[20:28] <yehudasa> mikedawson: bounce nodes?
[20:28] <dmick> reboot, I assume you mean mikedawson. How did you install? mkcephfs, or ceph-deploy?
[20:28] <mikedawson> reboot after installing new kernel
[20:29] <mikedawson> mkcephfs
[20:29] <yehudasa> via: looks right, but what are you experiencing?
[20:30] <mikedawson> service ceph start works as expected. It notes the mon is already started, then starts up the OSDs
[20:30] <via> if i proceed to re-enable cephx: unable to authenticate as client.admin
[20:33] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[20:35] <fmarchand> jmlowe oki
[20:35] * densone (~densone@70-88-47-52-ct-ne.hfc.comcastbusiness.net) has joined #ceph
[20:36] * glowell (~glowell@c-98-234-186-68.hsd1.ca.comcast.net) has joined #ceph
[20:36] <fmarchand> So If I want to update I need to use the tar.gz file ?
[20:36] * The_Bishop (~bishop@2001:470:50b6:0:714b:9a8d:ae17:c726) has joined #ceph
[20:37] <mikedawson> I have the ceph cluster network running over a NIC that is a part of an Open vSwitch setup that may take some time to come up, but I'm not seeing any logs about the OSDs trying to start and failing to communicate on the cluster network
[20:38] <jmlowe> fmarchand: I guess you could grab the individual packages from http://ceph.com/debian-testing/pool/main/c/ceph/ and dpkg -i
[20:39] <jmlowe> I'd personally rather wait for the repo to be updated
[20:39] <yehudasa> glowell: apparently deb repository still points to the 0.54
[20:40] <glowell> Looking into it now. I see folks have a few issues.
[20:40] <yehudasa> via: if you enable cephx, does the cluster come up?
[20:41] <via> everything dies due to lack of auth
[20:41] <via> all the keys for all daemons match
[20:42] <jmlowe> glowell: just need to update the metadata for example this still has 0.51-1quantal http://ceph.com/debian-testing/dists/quantal/main/binary-amd64/Packages
[20:42] <via> as of right now, the only difference from this morning when it was working is the upgrade to 0.55 and the removal of ceph auth = authx (or whatever it was that was now deprecated)
[20:42] <yehudasa> via: what does your ceph.conf look like?
[20:43] <jmlowe> glowell: make that 0.54-1quantal
[20:43] <via> yehudasa: http://pastebin.com/cMFdYDsz
[20:44] <glowell> jmlowe: Thanks for the pointer. I'm currently trying to find the root cause. Something went wrong with a build script.
[20:44] <yehudasa> via: where are your keys located?
[20:45] <via> in each daemon's location on disk, e.g. /data/osd.0/keyring /data/mds.alpha/keyring, etc
[20:45] <via> and the client.admin is /etc/ceph/ceph.keyring
[20:46] * loicd (~loic@jem75-2-82-233-234-24.fbx.proxad.net) Quit (Quit: Leaving.)
[20:47] <yehudasa> via: can you provide a log with 'debug auth = 20'?
[20:47] <fmarchand> thx jmlowe :) I will wait
[20:47] <yehudasa> .. and also 'debug ms = 1'
[20:47] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[20:48] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[20:50] <via> yehudasa: http://pastebin.com/APTvHBZV
[20:51] <via> so, that said, all the daemons are still running somehow
[20:54] <yehudasa> via: seems to me that the client tries auth none
[20:54] <via> even though it speaks of client.admin?
[20:56] <yehudasa> oh, not sure anymore
[20:56] <yehudasa> it says proto 0, but that's unknown, and not none
[20:56] <yehudasa> in any case, the service doesn't agree with it on the protocol
[20:59] <via> the ceph binary is definitely from the .55 package
[21:02] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[21:02] * gaveen (~gaveen@ Quit (Remote host closed the connection)
[21:03] * ircolle (~ian@c-67-172-132-164.hsd1.co.comcast.net) Quit (Quit: ircolle)
[21:04] * BManojlovic (~steki@242-174-222-85.adsl.verat.net) has joined #ceph
[21:07] <via> yehudasa: any other ideas? at this point basically my entire cluster is unusable
[21:16] * Cube (~Cube@ Quit (Quit: Leaving.)
[21:16] <yehudasa> via: can you provide a log of the mon/osd side with the same debug options when you try to start the cluster with cephx?
[21:17] * guigouz (~guigouz@ Quit (Quit: Computer has gone to sleep.)
[21:23] <yehudasa> via: what version did you upgrade from?
[21:26] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[21:27] <glowell> 0.55 update. There is a checksum problem with a new ceph package that is preventing the repo from being built correctly. Looking into this now.
[21:30] <via> yehudasa: yeah, it'll be a few minutes. was running 0.54 this morning, now to 0.55
[21:32] * ircolle (~ian@c-67-172-132-164.hsd1.co.comcast.net) has joined #ceph
[21:35] <wer> I am having to whittle my way around creating scripts to generate some sort of valid naming conventions for ceph osd's.... I thought maybe osd.0c through osd.0z would be acceptable... They are not. But it sure would map the process to the drives well in my case.
[21:36] * Cube (~Cube@ has joined #ceph
[21:36] <dmick> wer: yes, they need to be numeric, and should be contiguous
[21:38] <wer> dmick: thanks. It is kind of annoying me :) I could be wrong, but mounting drive /dev/sdc to /var/lib/ceph/osd/ceph-c and calling it osd.0c, or on another host osd.1c would make a lot of sense. But perhaps I am missing the idea behind the naming conventions.
[21:39] <wer> Cause then I would know which osd to manage should I lose a drive or something.
[21:39] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) Quit (Quit: mooperd)
[21:40] <wer> so dmick is there any problem starting with [osd.099] as long as it is contiguous?
[21:41] * Cube1 (~Cube@14.sub-174-254-83.myvzw.com) has joined #ceph
[21:41] <wer> Cause it was easier just to convert the ascii values to numbers :) I mean, am I being too lazy here?
[21:43] <dmick> wer: the numbers are actually used as indices in the code; it's not intended that the names be user-friendly
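Since OSD names have to be numeric and should be contiguous, one way to keep the drive-to-OSD mapping wer wants is to hold it in a lookup table outside Ceph. The following is a rough sketch under that assumption; the function names, host names, and device paths are all hypothetical.

```python
import re

# An OSD name is "osd." followed by digits only. Whether Ceph accepts leading
# zeros (e.g. osd.099) is not settled in the conversation; this pattern only
# checks that the suffix is numeric.
OSD_NAME = re.compile(r"^osd\.(\d+)$")


def is_valid_osd_name(name):
    """Return True if `name` looks like a numeric OSD name."""
    return OSD_NAME.match(name) is not None


def assign_osd_ids(drives_by_host, start=0):
    """Map (host, device) pairs to contiguous numeric OSD names.

    Hosts and devices are visited in sorted order so the assignment is
    reproducible between runs.
    """
    mapping = {}
    next_id = start
    for host in sorted(drives_by_host):
        for device in sorted(drives_by_host[host]):
            mapping[(host, device)] = "osd.{}".format(next_id)
            next_id += 1
    return mapping


if __name__ == "__main__":
    # Hypothetical layout: two hosts, drives starting at /dev/sdc.
    layout = {"host0": ["/dev/sdc", "/dev/sdd"], "host1": ["/dev/sdc"]}
    for (host, device), osd in sorted(assign_osd_ids(layout).items()):
        print(host, device, osd)
```

Keeping the table in a file next to ceph.conf would let an operator look up which drive backs a given OSD without encoding the drive letter in the name.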
[21:44] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) has joined #ceph
[21:45] <via> yehudasa: so mon.0.log is empty, nothing gets logged to it. and the osd log is massive and grows so fast it nearly locks up my ssh session tailing it, but the osd stays on now, it's just ceph -s that doesn't work
[21:45] * jtang1 (~jtang@ has joined #ceph
[21:45] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[21:45] <yehudasa> via: can you provide the osd log?
[21:46] <lurbs> Looks to me like the 0.55-1 packages in the debian-testing repository require Quantal to install on. Is it safe to assume packages for Precise are forthcoming?
[21:46] <via> how much of it?
[21:46] <via> it's already almost 100 MB
[21:46] <dmick> lurbs: that would be very surprising
[21:47] <yehudasa> via: set 'debug ms = 1'
[21:47] <wer> dmick: ok. Then I will have to fix it. ty.
[21:47] <yehudasa> 'debug auth = 20'
[21:47] <yehudasa> I don't need to see everything if it starts repeating
[21:48] <dmick> but I am surprised to see quantal in dists/
[21:48] <via> yehudasa: okay, although it's growing just as fast. keep in mind the osd's are not dying
[21:48] <via> i'm not sure there's a problem there
[21:49] <dmick> (argh stupid apache indexing)
[21:49] <AaronSchulz> yehudasa: how is the 's' in your name pronounced?
[21:50] <yehudasa> hmm.. s as in stamp
[21:50] <via> yehudasa: http://pastebin.com/dcNDsNVu
[21:51] <AaronSchulz> ok, I was talking to an inktank guy that pronounced your name like "yehuda"
[21:52] <yehudasa> AaronSchulz: first name is yehuda, there's no 's' in there
[21:53] <yehudasa> via: turn 'debug osd = 0', restart.. interested in auth interaction
[21:53] <AaronSchulz> ahh, then that makes sense :)
[21:53] <lurbs> dmick: The existing ones are just versioned with 0.55-1, not 0.55-1precise, and don't show up in the precise repo yet.
[21:53] <dmick> yeah, looking now
[21:53] <dmick> glowell mentioned some autobuild errors; this might be another manifestation
[21:53] <lurbs> But they definitely require things like libboost-threads >= precise version.
[21:54] <lurbs> Er, >, not >=.
[21:54] <dmick> trying to remember how to view that info from a deb
[21:55] <lurbs> dmick: dpkg -e $package $path
[21:55] <lurbs> Will extract the control files.
[21:55] <dmick> oh right
[21:57] <dmick> -I was what I was remembering
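The two dpkg invocations mentioned here (`-e` to extract the control files, `-I` to print package info) could be wrapped as below to check what a .deb depends on. This is a sketch only: the helper names and the example package filename are hypothetical, and `show_depends` needs dpkg on the PATH and a real package file to run.

```python
import subprocess


def info_command(package_path):
    """Command line for `dpkg -I`, which prints the package's control info."""
    return ["dpkg", "-I", package_path]


def extract_control_command(package_path, target_dir):
    """Command line for `dpkg -e`, which extracts control files into target_dir."""
    return ["dpkg", "-e", package_path, target_dir]


def show_depends(package_path):
    """Run `dpkg -I` and return only the dependency lines of its output."""
    result = subprocess.run(
        info_command(package_path),
        check=True,
        capture_output=True,
        text=True,
    )
    wanted = ("Depends:", "Pre-Depends:")
    return [
        line.strip()
        for line in result.stdout.splitlines()
        if line.strip().startswith(wanted)
    ]


if __name__ == "__main__":
    # Hypothetical filename; substitute the package actually downloaded.
    print(" ".join(info_command("ceph_0.55-1_amd64.deb")))
```

Reading the `Depends:` line this way is how one would confirm the libboost version requirement lurbs describes.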
[21:58] <via> yehudasa: okay, i'll have to get back to you in a bit
[22:04] <dmick> lurbs: we're looking into it. the 55 files are not quite ready for consumption just yet
[22:10] <lurbs> Sweet, thanks.
[22:11] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) has joined #ceph
[22:18] * Cube1 (~Cube@14.sub-174-254-83.myvzw.com) Quit (Quit: Leaving.)
[22:19] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:30] * densone (~densone@70-88-47-52-ct-ne.hfc.comcastbusiness.net) Quit (Quit: densone)
[22:31] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[22:51] * rlr219 (43c87e04@ircip4.mibbit.com) has joined #ceph
[22:53] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[22:54] <rlr219> I have a windows 2K8 VM running on a KVM host. The VM image is in qcow2 format. I run the qemu-img converter to convert the VM to RBD and the VM starts to boot then BSODs. Has anyone else seen this happen? Is there a config change needed of some kind? I am using the virtio drivers for the VM.
[22:55] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[22:57] <joshd> rlr219: maybe due to this bug fixed in 0.55? http://tracker.newdream.net/issues/3521
[22:59] <rlr219> joshd: thanks.
[23:15] * densone (~densone@74-92-51-22-NewEngland.hfc.comcastbusiness.net) has joined #ceph
[23:20] * rino (~rino@ Quit (Quit: ircII EPIC4-2.10.1 -- Are we there yet?)
[23:21] * fmarchand (~fmarchand@ Quit (Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org)
[23:22] <todin> hi, is it somehow possible to get a beta account at the dreamcompute beta?
[23:24] * jtang1 raises an eyebrow
[23:24] <jtang1> dreamcompute?
[23:24] <todin> jtang1: http://dreamhost.com/cloud/dreamcompute/
[23:25] <yehudasa> todin: that's up to dreamhost
[23:26] <todin> yehudasa: hmm, I signed up for a beta account right after the ceph day, until today I got no response
[23:26] <yehudasa> todin: not sure where it stands, but probably once it's ready for more users you'll get invited
[23:27] <todin> yehudasa: ok, so I will wait
[23:30] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:30] * jstrunk (~quassel@ Quit (Remote host closed the connection)
[23:36] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[23:42] * BManojlovic (~steki@242-174-222-85.adsl.verat.net) Quit (Quit: Ja odoh a vi sta 'ocete...)
[23:43] <via> yehudasa: http://pastebin.com/GusXNzXB
[23:43] <via> before configuring to investigate the osd's more i just want to make sure you understand that the osd's are working
[23:44] * ircolle (~ian@c-67-172-132-164.hsd1.co.comcast.net) Quit (Quit: ircolle)
[23:45] <yehudasa> via: the fact that the processes are up doesn't mean that they're working
[23:45] <yehudasa> from what I can see they fail to connect to the monitors too
[23:46] * ircolle (~ian@c-67-172-132-164.hsd1.co.comcast.net) has joined #ceph
[23:49] * mooperd (~andrew@dslb-178-012-145-248.pools.arcor-ip.net) Quit (Quit: mooperd)
[23:50] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.