#ceph IRC Log

IRC Log for 2012-10-21

Timestamps are in GMT/BST.

[0:06] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[0:08] * Kioob (~kioob@82.67.37.138) has joined #ceph
[0:11] * danieagle (~Daniel@186.214.76.251) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[0:25] * BManojlovic (~steki@bojanka.net) has joined #ceph
[1:06] * danieagle (~Daniel@186.214.76.251) has joined #ceph
[1:10] * gaveen (~gaveen@112.134.112.140) Quit (Remote host closed the connection)
[1:11] * lofejndif (~lsqavnbok@04ZAAAYTD.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[1:21] * Kioob (~kioob@82.67.37.138) Quit (Ping timeout: 480 seconds)
[1:21] * andret (~andre@pcandre.nine.ch) Quit (Ping timeout: 480 seconds)
[1:22] * andret (~andre@pcandre.nine.ch) has joined #ceph
[1:25] * nhm_ (~nh@184-97-251-146.mpls.qwest.net) has joined #ceph
[1:25] * joshd (~joshd@rrcs-74-62-34-205.west.biz.rr.com) has joined #ceph
[1:25] * nhm (~nh@174-20-101-163.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[1:26] <joshd> jks: that shouldn't be necessary unless you just upgraded from a really old version. what platform and versions are you using?
[1:29] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[1:32] * BManojlovic (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[1:33] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[1:33] * sjustlaptop (~sam@71-80-181-44.dhcp.gldl.ca.charter.com) has joined #ceph
[1:45] <buck> teuthology query. If i'm running tests on my localhost, can I specify that my local host act as the lock_server?
[1:48] <joshd> buck: you can obviate the lock server stuff entirely by setting check_locks: false at the top level of your test yaml
[1:49] <joshd> buck: err, that's check-locks: false
[1:51] <buck> the test yaml file is the one where I'm specifying roles: targets: tasks: , yeah?
[1:56] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[1:59] <joshd> right
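For reference, a minimal sketch of such a test yaml, with hypothetical roles and tasks; the check-locks option joshd describes sits at the top level:

    check-locks: false
    roles:
    - [mon.a, osd.0, osd.1, client.0]
    tasks:
    - ceph:
    - interactive: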
[2:02] * joshd (~joshd@rrcs-74-62-34-205.west.biz.rr.com) Quit (Quit: Leaving.)
[2:08] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[2:08] * LarsFronius (~LarsFroni@frnk-590d327d.pool.mediaWays.net) Quit (Quit: LarsFronius)
[2:19] * scalability-junk (~stp@188-193-208-44-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[2:35] * scalability-junk (~stp@188-193-208-44-dynip.superkabel.de) has joined #ceph
[2:51] * scalability-junk (~stp@188-193-208-44-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[3:00] * The_Bishop__ (~bishop@e179012241.adsl.alicedsl.de) Quit (Remote host closed the connection)
[3:15] <buck> one more teuthology question. The long key that goes in the yaml file. That's the intended user's .ssh/id_rsa file, correct?
[3:15] <buck> er i mean id_rsa.pub
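For reference, a targets stanza pairs a user@host with a long key string; as far as I can tell teuthology expects the target machine's SSH host key there (what ssh-keyscan prints) rather than the user's id_rsa.pub, so a hypothetical entry might look like:

    targets:
      ubuntu@testnode.example.com: ssh-rsa AAAAB3NzaC1yc2E...   # output of: ssh-keyscan -t rsa testnode.example.com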
[3:25] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[3:25] * sjustlaptop (~sam@71-80-181-44.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[3:32] * MK_FG (~MK_FG@188.226.51.71) Quit (Quit: o//)
[3:34] * MK_FG (~MK_FG@188.226.51.71) has joined #ceph
[3:42] * nwatkins_ (~nwatkins@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:43] * joshd (~joshd@rrcs-74-62-34-205.west.biz.rr.com) has joined #ceph
[3:55] * joshd (~joshd@rrcs-74-62-34-205.west.biz.rr.com) Quit (Quit: Leaving.)
[4:03] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[4:06] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[4:07] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[4:07] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[4:08] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[4:08] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[4:09] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit ()
[4:09] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit ()
[4:43] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[4:47] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:34] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[5:38] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:57] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[7:19] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[8:01] * Administrator (~chatzilla@118.195.65.95) has joined #ceph
[8:02] * Administrator is now known as long
[8:04] <long> hi,all
[8:51] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:01] <NaioN> hi
[9:01] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:03] <long> does ceph prefer btrfs?
[9:14] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[9:16] <NaioN> well btrfs isn't considered stable at the moment for the purpose of ceph
[9:17] <NaioN> ceph uses the more "advanced" and "new" features of btrfs
[9:17] <NaioN> at the moment they recommend using xfs in production environments
[9:19] * lxo (~aoliva@lxo.user.oftc.net) Quit (Read error: Connection reset by peer)
[9:20] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:27] * nwatkins_ (~nwatkins@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: leaving)
[9:28] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:30] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:35] <jks> josh, it's a Fedora 16 machine that was upgraded from the original ceph 0.39 to ceph 0.52 (the RPMs available on the web site now)
[9:47] <long> now libvirt and qemu both support ceph rbd. if a VM has an rbd disk, how are read/write operations implemented? i'm not clear on the difference between mounting and rbd
[9:49] <long> does qemu ask the MDS and then read/write with the OSDs?
[9:51] * grifferz_ (~andy@specialbrew.392abl.bitfolk.com) has joined #ceph
[9:51] * grifferz (~andy@specialbrew.392abl.bitfolk.com) Quit (Read error: Connection reset by peer)
[10:11] * Kioob (~kioob@luuna.daevel.fr) has joined #ceph
[10:14] * BManojlovic (~steki@bojanka.net) has joined #ceph
[10:43] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[10:45] * loicd (~loic@magenta.dachary.org) has joined #ceph
[10:49] * BManojlovic (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[10:58] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[11:09] * BManojlovic (~steki@nat.fit.cvut.cz) has joined #ceph
[11:13] * steki-BLAH (~steki@bojanka.net) has joined #ceph
[11:17] * BManojlovic (~steki@nat.fit.cvut.cz) Quit (Ping timeout: 480 seconds)
[11:35] <NaioN> long: with rbd you don't use a MDS
[11:36] <NaioN> qemu uses librbd directly and you can use an RBD as a virtual disk, instead of mapping an RBD device and pointing qemu to the correct block device (/dev/rbdX)
[11:37] <NaioN> in both cases no mounting is needed, but with the first option you don't even have kernel involvement and you have more options (e.g. for caching)
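As an illustration of the two approaches NaioN describes (pool and image names are hypothetical, rest of the qemu command line elided):

    # qemu talking to librbd directly, no kernel rbd module involved:
    qemu-system-x86_64 -drive format=rbd,file=rbd:rbd/vm-disk-1,cache=writeback ...
    # versus mapping the image through the kernel client and handing qemu the block device:
    rbd map rbd/vm-disk-1          # shows up as e.g. /dev/rbd0
    qemu-system-x86_64 -drive format=raw,file=/dev/rbd0 ...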
[11:47] <long> could you tell me where, or in which function of librbd, the read/write is executed?
[11:48] <NaioN> I'm not into the code :)
[11:50] * Kioob (~kioob@luuna.daevel.fr) Quit (Quit: Leaving.)
[11:51] <long> what are you using ceph for?
[11:59] <NaioN> presenting rbd's to a backup server
[12:00] <NaioN> the rbd's get re-exported with rsync daemon or with LIO/iscsi
[12:06] <long> how is that?
[12:06] <long> its stability, performance?
[12:07] * steki-BLAH (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[12:07] <long> i use rbd as VM disks in virtualization
[12:11] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[12:14] <NaioN> iscsi to vmware or windows
[12:15] <long> similar
[12:27] * Dulcika (~Dulcika@xaudmzsmtp3.pdm.es) has joined #ceph
[12:27] * Dulcika (~Dulcika@xaudmzsmtp3.pdm.es) Quit (autokilled: This host triggered network flood protection. please mail support@oftc.net if you feel this is in error, quoting this message. (2012-10-21 10:27:57))
[12:50] * Yann (~Yann@did75-15-88-160-187-237.fbx.proxad.net) has joined #ceph
[12:50] <Yann> hi !
[12:50] * Yann is now known as Guest2592
[12:51] * Guest2592 is now known as kYann
[12:53] <kYann> i'm having trouble with ceph not getting back to a nominal state. The peering process makes the OSDs load up, so they time out :/
[12:53] <kYann> Do you know a way to "throttle" operations so the load doesn't make the osds time out
[12:53] <kYann> the load is on the disk
[12:54] <kYann> they are 100% busy
[13:03] * steki-BLAH (~steki@nat.fit.cvut.cz) has joined #ceph
[13:07] * BManojlovic (~steki@bojanka.net) has joined #ceph
[13:09] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[13:11] * steki-BLAH (~steki@nat.fit.cvut.cz) Quit (Ping timeout: 480 seconds)
[13:18] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[13:25] * BManojlovic (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[13:28] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[13:28] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[13:28] * Leseb_ is now known as Leseb
[13:30] * BManojlovic (~steki@bojanka.net) has joined #ceph
[13:46] * lofejndif (~lsqavnbok@9KCAACICG.tor-irc.dnsbl.oftc.net) has joined #ceph
[13:51] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[13:58] * danieagle (~Daniel@186.214.76.251) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[14:03] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[14:07] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:10] * BManojlovic (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[14:15] * LarsFronius (~LarsFroni@frnk-590d0137.pool.mediaWays.net) has joined #ceph
[14:28] * LarsFronius (~LarsFroni@frnk-590d0137.pool.mediaWays.net) Quit (Quit: LarsFronius)
[14:37] * LarsFronius (~LarsFroni@frnk-590d0137.pool.mediaWays.net) has joined #ceph
[15:05] <NaioN> kYann: yeah there is a way to throttle, I assume the cluster isn't in a healthy state? So some data has to be moved?
[15:05] <NaioN> have to search for the option
[15:05] <kYann> i think it is moving stuff
[15:05] <kYann> but all the osds could be up
[15:05] <kYann> it's just that they load up and time out (abort signal)
[15:05] <kYann> thanks
[15:06] <kYann> we use the stable release
[15:06] <NaioN> kYann: osd recovery max active = 1
[15:06] <kYann> i'm gonna test that
[15:06] <NaioN> By default it is set to 5, by tuning this down you'll put less stress on the OSDs
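That setting can go in the [osd] section of ceph.conf (with a restart), or be injected into running daemons; a sketch, with the injectargs syntax being somewhat version-dependent:

    [osd]
        osd recovery max active = 1

    ceph osd tell 0 injectargs '--osd_recovery_max_active 1'    # repeat per running osd id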
[15:08] <NaioN> kYann: what does ceph -s say?
[15:09] <NaioN> especially the line with the pgs
[15:10] <kYann> health HEALTH_WARN 1549 pgs down; 11991 pgs peering; 5921 pgs stale; 11991 pgs stuck inactive; 5921 pgs stuck stale; 11992 pgs stuck unclean
[15:10] <kYann> monmap e6: 1 mons at {d=178.33.20.18:6789/0}, election epoch 0, quorum 0 d
[15:10] <kYann> osdmap e65719: 10 osds: 10 up, 10 in
[15:10] <kYann> pgmap v2047299: 11992 pgs: 1 stale+active, 6043 peering, 4365 stale+peering, 1549 stale+down+peering, 28 remapped+peering, 6 stale+remapped+peering; 342 GB data, 790 GB used, 12170 GB / 12961 GB avail
[15:10] <kYann> mdsmap e41465: 1/1/1 up {0=a=up:replay}
[15:10] * guilhem_ (~guilhem@2a01:e35:2e13:acd0:9c4f:8e55:417f:443b) has joined #ceph
[15:10] * LarsFronius (~LarsFroni@frnk-590d0137.pool.mediaWays.net) Quit (Quit: LarsFronius)
[15:11] * BManojlovic (~steki@nat.fit.cvut.cz) has joined #ceph
[15:12] <NaioN> yeah some are remapped, so they have to be moved
[15:12] <NaioN> did you expand the cluster?
[15:12] <NaioN> well all osds are up and in, that's good
[15:13] <kYann> nop, we don't know what happened
[15:13] <kYann> certainly some load
[15:13] <kYann> the osds started timing out
[15:14] <kYann> and since then we can't get it back online
[15:14] <NaioN> hmmm then some osds were out for a while
[15:16] * steki-BLAH (~steki@bojanka.net) has joined #ceph
[15:17] <kYann> we set max active to 1 but it's still loading
[15:17] <kYann> disks are at 100% busy
[15:17] <NaioN> all?
[15:17] <kYann> yes
[15:17] <kYann> we have 5 disks, 5 osds
[15:18] <NaioN> ?
[15:18] <NaioN> your ceph -s says 10
[15:18] <kYann> the journal is not on an ssd but on the same disk, and there's only 2 GB of RAM... we don't control our hardware spec; we used to run mogilefs on that hardware and ceph seemed to work so far
[15:19] <kYann> yes but 2 hosts
[15:19] <kYann> we set the param only on one to test
[15:19] <NaioN> oh ok so 5 osds per server and 2 servers
[15:19] <NaioN> with an osd per disk
[15:19] * BManojlovic (~steki@nat.fit.cvut.cz) Quit (Ping timeout: 480 seconds)
[15:19] <kYann> yes
[15:20] <NaioN> did you restart the cluster with that option in the osd section?
[15:20] <kYann> we restarted the osd with that option in the osd section
[15:21] <kYann> i checked with --admin-daemon the param is set
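The admin-socket check kYann mentions looks roughly like this, assuming the default socket path for osd.0:

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep recovery_max_active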
[15:24] <NaioN> hmmm well I haven't used the parameter myself, but as far as I can tell it limits the recovery threads to 1 thread...
[15:24] <NaioN> I found it on the mailing list (Wido den Hollander / 09/03/2012)
[15:24] <kYann> does recovery include doing peering?
[15:25] <kYann> i do have more active pg than I had in the last 10 hours, so there is hope
[15:25] <NaioN> http://marc.info/?l=ceph-devel&m=134665991821916&w=2
[15:26] <NaioN> no, recovery is the process of moving data from one osd to another
[15:28] <kYann> i think my cluster is getting better
[15:29] <kYann> more and more pg active
[15:32] <kYann> :/ osd are still getting killed because of timeout
[15:37] <kYann> NaioN, do you think it is a good idea to set rep size to 1 for all pools, will it stop copying files? then maybe I could set rep size back to 2 one pool at a time
[15:37] <NaioN> it will not necessarily stop the moving
[15:38] <NaioN> it depends on the crushmap
[15:38] <NaioN> if there are pg's on a "wrong" osd they still have to move
[15:38] <kYann> crushmap is min_size 1 max_size 10
[15:38] <kYann> and chooseleaf is on host
[15:39] <NaioN> that's not what i mean
[15:39] <NaioN> if you lower the replication to 1 (so no replication)
[15:39] <NaioN> it won't stop all the moving
[15:40] <NaioN> the pg's that now reside on a remapped osd will get moved to the correct osd
[15:40] <kYann> ok
[15:41] <NaioN> the pgs are pseudo-randomly placed on the osds, so when something happens to the cluster they get remapped to different osds
[15:41] <NaioN> when you fix the problem, the pg's will be moved back
[15:42] <NaioN> but it depends on what happened
[15:42] <NaioN> that's e.g. the difference between out and down
[15:43] <kYann> the load is certainly high because of the remapping :/, and because the load is high the osds time out... it's a vicious circle :/
[15:44] <NaioN> well yeah I've also seen that before and it took a long time to recover
[15:44] <NaioN> you could also try to increase the time-out time of an osd
[15:44] <NaioN> maybe that's enough for your cluster to recover
[15:45] <kYann> we already doubled the default value, but maybe we should try a higher value
[15:46] <long> how do I change the replication count?
[15:47] <long> i will test killing one in two of my osds
[15:48] <NaioN> ceph osd pool set {pool-name} {field} {value}
[15:48] <NaioN> with field=size
[15:48] <NaioN> and value=<the number of copies>
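For example, to drop a pool named rbd to a single copy and then back to two:

    ceph osd pool set rbd size 1
    ceph osd pool set rbd size 2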
[15:51] <long> ok,
[15:51] <NaioN> kYann: which option did you use?
[15:51] <long> you seem like a ceph expert
[15:52] <NaioN> hehe certainly nor :)
[15:52] <NaioN> not
[15:52] <kYann> filestore op thread suicide timeout = 360
[15:52] <kYann> filestore op thread timeout = 180
[15:52] <NaioN> I'm a user
[15:52] <NaioN> kYann: look here: http://ceph.com/docs/master/config-cluster/osd-config-ref/
[15:53] <NaioN> maybe there are some other options that influence the timeout
[15:53] <NaioN> kYann: osd default notify timeout
[15:54] <NaioN> I don't think those options work kYann
[15:54] <NaioN> they are for the filestore (osd -> disk)
[15:54] <NaioN> not osd -> cluster
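For the osd -> cluster side NaioN is pointing at, the usual knob is the heartbeat grace period (default around 20 seconds); a hedged sketch:

    [osd]
        osd heartbeat grace = 60    ; how long before an unresponsive osd gets reported down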
[15:55] <kYann> ok i'm trying notify timeout
[15:57] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[16:13] <kYann> 11992 pgs: 66 active, 914 active+clean... it is slowly getting better
[16:17] <NaioN> does the timeout value do anything?
[16:18] <kYann> I don't see any osd.X boot messages in ceph -w
[16:18] <kYann> so I think it helped
[16:18] <NaioN> nice
[16:18] <kYann> load is still very high on disk
[16:19] <NaioN> no it doesn't do anything about the load
[16:19] <NaioN> seems the option doesn't work
[16:19] <NaioN> well you have to ask one of the developers to know for sure
[16:19] <kYann> yes
[16:19] * joao (~JL@89.181.150.224) has joined #ceph
[16:28] * BManojlovic (~steki@nat.fit.cvut.cz) has joined #ceph
[16:31] * steki-BLAH (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[16:34] * lofejndif (~lsqavnbok@9KCAACICG.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[17:05] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[17:05] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[17:05] * Leseb_ is now known as Leseb
[17:05] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[17:17] * gaveen (~gaveen@112.134.112.201) has joined #ceph
[17:23] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[17:37] * psomas (~psomas@inferno.cc.ece.ntua.gr) Quit (Quit: Lost terminal)
[17:39] * BManojlovic (~steki@nat.fit.cvut.cz) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:42] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[17:42] * gucki (~smuxi@HSI-KBW-082-212-034-021.hsi.kabelbw.de) has joined #ceph
[17:42] <gucki> hi there
[17:43] <gucki> i just tried to install ceph using ubuntu quantal and followed the quick install guide
[17:43] <gucki> install seems to work fine, but "ceph health" returns an error
[17:43] <gucki> ceph health
[17:43] <gucki> HEALTH_ERR 384 pgs stuck inactive; 384 pgs stuck unclean; no osds
[17:44] <gucki> here's my config: http://pastie.org/5093820
[17:44] <gucki> any hints would be great :-=
[17:47] <imjustmatthew> gucki: did you start any OSDs?
[17:48] <gucki> imjustmatthew: I just did "service ceph start" as stated
[17:48] <gucki> it seems it just started the mds
[17:48] <gucki> sorry, the mon
[17:48] <gucki> but nothing else?
[17:49] <gucki> "/etc/init.d/ceph start osd" does nothing :(
[17:49] <imjustmatthew> look at the output of 'ceph status' it sounds like none of the OSDs started
[17:49] <gucki> http://pastie.org/5093847
[17:49] <gucki> how can i start them, or better said..why are they not started?
[17:50] <gucki> imjustmatthew: ok sorry, i'm stupid ;)
[17:51] <imjustmatthew> haha, yeah, I've made some dumb ones too
[17:51] <gucki> imjustmatthew: i just double checked the config... seems like i placed ip addresses where hostnames should go... probably it matters... hold on, i'll try
[17:53] <gucki> imjustmatthew: ok, it's working now :)
[17:54] <gucki> so ceph seems to not like ip addresses, it prefers hostnames :)
[17:54] <gucki> thanks :-)
[17:55] <imjustmatthew> awesome, good luck
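For reference, the symptom gucki hit usually comes from the host entries in ceph.conf, which the init script matches against the local hostname to decide which daemons to start; a minimal sketch with hypothetical names:

    [mon.a]
        host = node1                    ; short hostname, not an IP
        mon addr = 192.168.0.10:6789    ; addresses are fine here
    [osd.0]
        host = node1
    [osd.1]
        host = node2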
[17:56] * long (~chatzilla@118.195.65.95) Quit (Quit: ChatZilla 0.9.89 [Firefox 15.0.1/20120905151427])
[18:08] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[18:08] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:18] * lofejndif (~lsqavnbok@9KCAACIIF.tor-irc.dnsbl.oftc.net) has joined #ceph
[18:21] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[18:26] <imjustmatthew> Is there a straightforward way to remove all the CephFS data from a cluster and start over without damaging the RBD data? Perhaps something along the lines of deleting the data and metadata pools?
[18:29] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[18:31] <sage> imjustmatthew: yes.. delete the data and metadata pools. when you are ready to create a new fs, there is a 'ceph mds newfs ...' command
[18:32] <imjustmatthew> sage: perfect, thanks.
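A rough sketch of that sequence; pool IDs are placeholders and the exact arguments and confirmation flags differ between releases, so check ceph osd dump and the docs for your version:

    ceph osd pool delete data
    ceph osd pool delete metadata
    ceph osd pool create metadata 256
    ceph osd pool create data 256
    ceph mds newfs <metadata-pool-id> <data-pool-id>    # pool IDs from 'ceph osd dump'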
[18:39] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[18:56] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[18:59] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[18:59] * BManojlovic (~steki@bojanka.net) has joined #ceph
[19:01] * LarsFronius (~LarsFroni@frnk-590d0137.pool.mediaWays.net) has joined #ceph
[19:03] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:03] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[19:03] * Leseb_ is now known as Leseb
[19:03] <kYann> NaioN, we updated to 0.53 and now load on disk is ok
[19:06] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:06] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[19:06] * Leseb_ is now known as Leseb
[19:07] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:07] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[19:07] * Leseb_ is now known as Leseb
[19:08] <lightspeed> I just updated from 0.52 to 0.53, and my OSDs are no longer starting successfully - the logs suggest a problem opening the journal:
[19:08] <lightspeed> 2012-10-21 17:50:46.094802 72dba9182780 10 journal open header.fsid = 31a77d5a-2e54-4e10-aad1-068c3665867e
[19:08] <lightspeed> 2012-10-21 17:50:46.094809 72dba9182780 2 journal open journal size 5368709120 > current 1073741824
[19:08] <lightspeed> 2012-10-21 17:50:46.094814 72dba9182780 3 journal journal_replay open failed with Invalid argument
[19:08] <lightspeed> 2012-10-21 17:50:46.094842 72dba9182780 -1 filestore(/var/lib/ceph/osd/ceph-1) mount failed to open journal /dev/vg0/osd.1.jnl: (22) Invalid argument
[19:09] <lightspeed> I put the full osd log here (with debug filestore and debug journal set to 20): http://pastebin.com/C7b13VPV
[19:10] <lightspeed> my journals are LVM volumes (on an SSD)
[19:11] <lightspeed> I tried recreating the journal and it made no difference (with ceph-osd --mkjournal)
[19:11] <lightspeed> has anyone else seen this?
[19:13] <kYann> just updated from 0.48 to 0.53 and everything went fine
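For a mismatch like lightspeed's ("journal size 5368709120 > current 1073741824"), a hedged first check is whether the journal LV is really the size the OSD expects:

    blockdev --getsize64 /dev/vg0/osd.1.jnl    # actual size of the journal device in bytes
    lvs --units b vg0                          # what LVM reports for the volume group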
[19:18] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[19:20] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:25] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[19:34] * nwatkins1 (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[19:39] * gaveen (~gaveen@112.134.112.201) Quit (Quit: Leaving)
[19:42] * allsystemsarego (~allsystem@188.25.130.148) has joined #ceph
[19:47] * loicd (~loic@90.84.144.224) has joined #ceph
[19:49] <NaioN> kYann: nice, but I want to wait till the next stable release; we're on 0.48.2 now
[19:52] * LarsFronius (~LarsFroni@frnk-590d0137.pool.mediaWays.net) Quit (Quit: LarsFronius)
[20:05] <gucki> i'm currently playing with ceph stable and found that "rbd rm foo" is really slow. even when the image has just been created and nothing written to it. is this a known issue?
[20:06] * steki-BLAH (~steki@bojanka.net) has joined #ceph
[20:07] <NaioN> gucki: yes it is
[20:07] <NaioN> but I'm not sure why it is; you should ask one of the developers
[20:08] <NaioN> it depends on the size of the rbd
[20:09] * BManojlovic (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[20:11] <imjustmatthew> sage: is it worth submitting patches to document those MDS commands or are they still too unstable?
[20:14] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[20:42] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[20:42] * BManojlovic (~steki@bojanka.net) has joined #ceph
[20:43] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit (Quit: Ex-Chat)
[20:46] * steki-BLAH (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[20:47] * allsystemsarego (~allsystem@188.25.130.148) Quit (Quit: Leaving)
[20:48] * loicd (~loic@90.84.144.224) Quit (Ping timeout: 480 seconds)
[21:00] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:11] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[21:18] * The_Bishop (~bishop@e179004000.adsl.alicedsl.de) has joined #ceph
[21:25] <buck> I'm hitting a BadHostKeyException when trying to run the teuthology examples. I've generated ssh keys and can ssh into the one host without a password. Are there any common gotchas that I should be on the watch for when trying to sort this out?
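BadHostKeyException is paramiko's way of saying the host key it was given for the target doesn't match what the host actually presents; a hedged way to refresh things (hostname is illustrative):

    ssh-keygen -R testnode.example.com              # drop any stale entry from ~/.ssh/known_hosts
    ssh-keyscan -t rsa testnode.example.com         # print the current host key, e.g. for the targets: entry in the yaml
    ssh ubuntu@testnode.example.com true            # connect once so the fresh key gets recorded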
[21:27] * cowbell (~sean@70.231.129.172) has joined #ceph
[21:27] * cowbell (~sean@70.231.129.172) has left #ceph
[21:28] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[21:32] * andret (~andre@pcandre.nine.ch) Quit (Ping timeout: 480 seconds)
[21:36] * steki-BLAH (~steki@bojanka.net) has joined #ceph
[21:36] * The_Bishop (~bishop@e179004000.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[21:40] * BManojlovic (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[21:41] * The_Bishop (~bishop@e179004000.adsl.alicedsl.de) has joined #ceph
[22:15] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[22:15] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:22] * steki-BLAH (~steki@bojanka.net) Quit (Ping timeout: 480 seconds)
[22:26] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[22:26] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:28] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[22:33] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[22:35] * steki-BLAH (~steki@bojanka.net) has joined #ceph
[22:36] * danieagle (~Daniel@186.214.76.251) has joined #ceph
[22:37] * gucki (~smuxi@HSI-KBW-082-212-034-021.hsi.kabelbw.de) Quit (Remote host closed the connection)
[22:42] * nwatkins2 (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[22:42] * nwatkins1 (~Adium@soenat3.cse.ucsc.edu) Quit (Read error: Connection reset by peer)
[22:47] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[22:55] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) has joined #ceph
[23:11] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[23:17] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[23:30] * verwilst (~verwilst@dD5769628.access.telenet.be) has joined #ceph
[23:33] * verwilst (~verwilst@dD5769628.access.telenet.be) Quit ()
[23:34] * steki-BLAH (~steki@bojanka.net) Quit (Quit: Ja odoh a vi sta 'ocete...)
[23:37] * danieagle (~Daniel@186.214.76.251) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.