#ceph IRC Log


IRC Log for 2013-05-08

Timestamps are in GMT/BST.

[0:02] <cjh_> dmick: i believe it is.
[0:02] <cjh_> when i umount and then ls in the dir my file is gone which is good
[0:03] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[0:03] * kylem (~kyle@216.183.64.10) has joined #ceph
[0:04] * drokita1 (~drokita@199.255.228.128) Quit (Ping timeout: 480 seconds)
[0:04] <dmick> did it show the preexisting images?
[0:05] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) has joined #ceph
[0:07] * kyle_ (~kyle@216.183.64.10) Quit (Ping timeout: 480 seconds)
[0:11] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Read error: Operation timed out)
[0:16] * yasu` (~yasu`@dhcp-59-157.cse.ucsc.edu) Quit (Remote host closed the connection)
[0:18] * loicd (~loic@2a01:e35:2eba:db10:dd35:eadd:f15f:fa75) Quit (Quit: Leaving.)
[0:21] <cjh_> dmick: it does not :-/
[0:22] <cjh_> oh maybe my put command didn't do what i thought
[0:22] <cjh_> 1 sec
[0:24] <sagewk> mikedawson: http://tracker.ceph.com/issues/4895
[0:25] * gaveen (~gaveen@175.157.230.31) Quit (Remote host closed the connection)
[0:27] * jtang1 (~jtang@79.97.135.214) Quit (Quit: Leaving.)
[0:27] <cjh_> dmick: if i run something like rados -p bench put myobject kernel-3.83.rpm it returns but doesn't show up with rbd ls --pool bench or the rbd-fuse
[0:28] <cjh_> i'm using the new cuttlefish
[0:30] <cjh_> dmick: it shows up with everything if i do rbd -p bench create myimage --size 1024
[0:31] <cjh_> so i think it was just me not understanding how things work
[0:33] * john_barbee_ (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[0:34] * tnt (~tnt@91.177.240.165) Quit (Ping timeout: 480 seconds)
[0:40] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[0:41] * loicd (~loic@magenta.dachary.org) has joined #ceph
[0:42] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has left #ceph
[0:43] <sagewk> cjh_: that puts a rados object, not an rbd image (which is lots of rados objects)
[0:43] <sagewk> rbd import ...
[0:43] <sagewk> or rados -p bench ls -
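(A minimal sketch of the distinction sagewk is making, assuming a pool named "bench" and an arbitrary local file: "rados put" stores one raw object, while "rbd import" creates an image that is striped over many objects.)

    # raw RADOS object: shows up in "rados ls", not in "rbd ls"
    rados -p bench put myobject some-local-file.rpm
    rados -p bench ls

    # RBD image: shows up in "rbd ls" and is backed by many RADOS objects
    rbd import some-local-file.rpm bench/myimage
    rbd -p bench ls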
[0:44] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) has joined #ceph
[0:45] * yanzheng (~zhyan@134.134.139.70) has joined #ceph
[0:47] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[0:47] * Kioob (~kioob@2a01:e35:2432:58a0:21e:8cff:fe07:45b6) has joined #ceph
[0:49] <dmick> cjh_: yeah, rbd images only. Sorry, I'd stepped away
[0:49] * rustam (~rustam@94.15.91.30) has joined #ceph
[0:51] <jmlowe> umm, I'm in trouble, I think all of my osd's crashed
[0:51] <jmlowe> yeah, I'm in deep trouble, my osd's won't start
[0:54] <sjust> jmlowe: what happened?
[0:54] <jmlowe> tried to pg split
[0:55] <sjust> what is the back trace?
[0:55] <jmlowe> ceph version 0.61 (237f3f1e8d8c3b85666529860285dcdffdeda4c5)
[0:55] <jmlowe> 1: /usr/bin/ceph-osd() [0x83fac0]
[0:55] <jmlowe> 2: (()+0xfcb0) [0x7f8004457cb0]
[0:55] <jmlowe> 3: (gsignal()+0x35) [0x7f8002bc4425]
[0:55] <jmlowe> 4: (abort()+0x17b) [0x7f8002bc7b8b]
[0:55] <jmlowe> 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f80034bfe2d]
[0:55] <jmlowe> 6: (()+0x5ef26) [0x7f80034bdf26]
[0:55] <jmlowe> 7: (()+0x5ef53) [0x7f80034bdf53]
[0:55] <jmlowe> 8: (()+0x5f17e) [0x7f80034be17e]
[0:55] <jmlowe> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x43d) [0x8eb77d]
[0:55] <jmlowe> 10: (OSD::load_pgs()+0x2254) [0x6eb0f4]
[0:55] <jmlowe> 11: (OSD::init()+0xc6e) [0x6ece9e]
[0:55] <jmlowe> 12: (main()+0x2148) [0x621108]
[0:55] <jmlowe> 13: (__libc_start_main()+0xed) [0x7f8002baf76d]
[0:55] <jmlowe> 14: /usr/bin/ceph-osd() [0x6235e9]
[0:55] <jmlowe> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[0:55] <jmlowe> mon's seem stuck
[0:55] <jmlowe> won't update status
[0:56] <jmlowe> health HEALTH_WARN 2 pgs peering; 264 pgs stuck inactive; 264 pgs stuck unclean; mds cluster is degraded
[0:56] <jmlowe> monmap e11: 3 mons at {alpha=149.165.228.10:6789/0,delta=149.165.228.133:6789/0,epsilon=149.165.228.134:6789/0}, election epoch 700, quorum 0,1,2 alpha,delta,epsilon
[0:56] <jmlowe> osdmap e9307: 18 osds: 18 up, 18 in
[0:56] <jmlowe> pgmap v7363057: 2700 pgs: 2 inactive, 260 creating, 2436 active+clean, 2 peering; 4165 GB data, 8203 GB used, 74671 GB / 82874 GB avail
[0:56] <jmlowe> mdsmap e2471: 1/1/1 up {0=alpha=up:replay}
[0:56] <jmlowe> 2013-05-07 18:52:58.543214 mon.0 [INF] pgmap v7363057: 2700 pgs: 2 inactive, 260 creating, 2436 active+clean, 2 peering; 4165 GB data, 8203 GB used, 74671 GB / 82874 GB avail
[0:56] <sjust> can you restart with osd, filestore logging at 20?
[0:57] <jmlowe> http://pastebin.com/BThkjj9p
[0:57] <loicd> "Factor reusable components out of PG/ReplicatedPG and have PG/ReplicatedPG and ErasureCodedPG share only those components and a common PG interface rather than placing those components in a common base class." in http://pad.ceph.com/p/Erasure_encoding_as_a_storage_backend kept me thinking.
[0:57] <sjust> jmlowe: need filestore/osd logging
[0:58] * rturk is now known as rturk-away
[0:58] <loicd> I think I see the advantages, although some of the common components are beyond me for now ;-)
[0:59] * jtang1 (~jtang@79.97.135.214) Quit (Quit: Leaving.)
[0:59] <gregaf> (that had the filestore logging on, but osd is still 0/5)
[0:59] * sjustlaptop (~sam@38.122.20.226) has joined #ceph
[1:00] <jmlowe> -7> 2013-05-07 18:49:15.102951 7f859c23c780 2 journal read_entry 350928896 : seq 7220489 930 bytes
[1:00] <jmlowe> -6> 2013-05-07 18:49:15.102957 7f859c23c780 2 journal read_entry 350932992 : seq 7220490 774 bytes
[1:00] <jmlowe> -5> 2013-05-07 18:49:15.102963 7f859c23c780 2 journal No further valid entries found, journal is most likely valid
[1:00] <jmlowe> -4> 2013-05-07 18:49:15.102967 7f859c23c780 2 journal No further valid entries found, journal is most likely valid
[1:00] <jmlowe> -3> 2013-05-07 18:49:15.102968 7f859c23c780 3 journal journal_replay: end of journal, done.
[1:00] <jmlowe> -2> 2013-05-07 18:49:15.102990 7f859c23c780 1 journal _open /data/osd.17/journal fd 27: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
[1:00] <jmlowe> -1> 2013-05-07 18:49:15.103189 7f859c23c780 2 osd.17 0 boot
[1:00] <jmlowe> 0> 2013-05-07 18:49:20.008684 7f859c23c780 -1 osd/OSD.cc: In function 'void OSD::load_pgs()' thread 7f859c23c780 time 2013-05-07 18:49:20.007517
[1:00] <jmlowe> osd/OSD.cc: 1779: FAILED assert(i->second.empty())
[1:00] <jmlowe> pastebin is not doing well right now
[1:01] <jmlowe> sjust: did that give you enough to work on?
[1:02] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[1:03] <loicd> I think both approach ( creating a base class or moving common components outside of PG/ReplicatedPG ) require that a clean PG API is defined. I.e. that OSD no longer references the internals of PG/ReplicatedPG but relies on an interface that can be common to ReplicatedPG or ErasureEncodedPG.
[1:03] <cjh_> sagewk: thanks. i think i had that mixed up in my head
[1:03] <cjh_> dmick: thanks
[1:04] * brady (~brady@rrcs-64-183-4-86.west.biz.rr.com) Quit (Quit: Konversation terminated!)
[1:04] <loicd> Rather : I can't see how the peering state machine could be moved out of PG without defining the PG API ...
[1:05] * loicd now better understands what sjust wrote :-)
[1:05] * gmason (~gmason@hpcc-fw.net.msu.edu) Quit (Ping timeout: 480 seconds)
[1:08] * sjustlaptop (~sam@38.122.20.226) Quit (Ping timeout: 480 seconds)
[1:08] <jmlowe> sjust: need more?
[1:08] <andreask> how can the new per-pool quota feature be used in cuttlefish?
[1:09] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[1:11] * sjustlaptop (~sam@38.122.20.226) has joined #ceph
[1:11] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) Quit (Ping timeout: 480 seconds)
[1:12] <jmlowe> http://pastebin.com/kKUYtwei
[1:12] <sjustlaptop> jmlowe: can you pastebin the contents of the current directory on that osd?
[1:13] <sjustlaptop> loicd: yeah, the pg state machine should be the last thing, I think
[1:13] <jmlowe> http://pastebin.com/kRhKE6vq
[1:15] * lofejndif (~lsqavnbok@04ZAAAFSQ.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[1:15] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Quit: Leaving.)
[1:17] * sjustlaptop1 (~sam@38.122.20.226) has joined #ceph
[1:17] * sjustlaptop (~sam@38.122.20.226) Quit (Read error: Connection reset by peer)
[1:17] <sjustlaptop1> is there anything in:
[1:17] <sjustlaptop1> 2.333_1a7
[1:17] <sjustlaptop1> ?
[1:17] <sjustlaptop1> jmlowe: ^
[1:18] <jmlowe> no
[1:18] <sjustlaptop1> that's weird
[1:18] <sjustlaptop1> 2.36a_1a7?
[1:18] <jmlowe> no
[1:18] <sjustlaptop1> 2.b1_head?
[1:19] <sjustlaptop1> oops
[1:19] <sjustlaptop1> 2.36d_1ae
[1:19] <sjustlaptop1> ?
[1:19] <jmlowe> no
[1:19] <sjustlaptop1> those should all have been removed in (I think) the 59 to 60 upgrade
[1:19] <sjustlaptop1> did you go from 56.4 to cuttlefish?
[1:19] <sjustlaptop1> or from 60 to cuttlefish?
[1:19] <jmlowe> went from 56.4.6 to cuttlefish
[1:19] <sjustlaptop1> ok
[1:20] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[1:20] <jmlowe> is there a fix?
[1:21] * jtang1 (~jtang@79.97.135.214) Quit (Quit: Leaving.)
[1:21] <jmlowe> make that 0.56.6 to 0.61
[1:22] * danieagle (~Daniel@186.214.56.43) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[1:22] * yanzheng (~zhyan@134.134.139.70) Quit (Remote host closed the connection)
[1:22] <sjustlaptop1> jmlowe: this is an easy fix, if it's the only problem
[1:22] <sjustlaptop1> trying to understand how it happened
[1:22] <jmlowe> whew
[1:23] <jmlowe> what do I do?
[1:24] <jmlowe> also I still had some 0.56.6 clients active if that makes a difference
[1:24] <sjustlaptop1> jmlowe: I don't think so
[1:24] <jmlowe> last thing before meltdown 'ceph osd pool set rbd pg_num 900'
[1:24] <sjustlaptop1> can you verify that all directories with <pgid>.<hex>
[1:24] <sjustlaptop1> where <hex> is not head
[1:24] <sjustlaptop1> sorry
[1:24] <sjustlaptop1> <pgid>_<hex>
[1:24] <sjustlaptop1> where <hex> is not head
[1:24] <sjustlaptop1> can you verify that those are empty?
[1:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:25] * jlogan2 (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[1:26] <jmlowe> I think so, check me, I'm a little flustered http://pastebin.com/feBNC7EM
[1:27] <sjustlaptop1> ok, go ahead and remove those empty directories
[1:27] <sjustlaptop1> NOT
[1:27] <sjustlaptop1> omap or meta
[1:27] <jmlowe> empty pg dirs only
[1:27] <sjustlaptop1> right
[1:28] <sjustlaptop1> and only those without _head
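(A hedged sketch of the cleanup sjustlaptop1 describes, assuming the OSD data path seen earlier in the log, e.g. /data/osd.17/current; only empty <pgid>_<hex> directories not ending in _head are candidates, and omap/meta are left untouched.)

    cd /data/osd.17/current
    # list stray snap-collection dirs: <pgid>_<hex> where <hex> is not "head"
    find . -maxdepth 1 -type d -name '*_*' ! -name '*_head'
    # remove only the empty ones (rmdir refuses to delete non-empty directories)
    find . -maxdepth 1 -type d -name '*_*' ! -name '*_head' -empty -exec rmdir {} +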
[1:28] * Tamil (~tamil@38.122.20.226) Quit (Quit: Leaving.)
[1:29] <jmlowe> ok, done on one osd
[1:29] <jmlowe> try it?
[1:29] * rturk-away is now known as rturk
[1:29] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[1:30] <sjustlaptop1> yeah
[1:30] <sjustlaptop1> ok, I see one bug
[1:30] <sjustlaptop1> it doesn't explain how your cluster crashed, though
[1:30] <sjustlaptop1> oh
[1:31] <jmlowe> http://pastebin.com/ncvfFN7m
[1:31] <sjustlaptop1> yeah, I think I know what happened
[1:31] <jmlowe> died, but differently
[1:32] <sjustlaptop1> one sec, going to need to put together a branch
[1:32] <jmlowe> ok, should I clean up all the other osd's in a similar fashion?
[1:33] <sjustlaptop1> one sec
[1:33] <sjustlaptop1> let's get that one up first
[1:36] <sjustlaptop1> jmlowe: fwiw, I believe I know what happened, and it should be fully recoverable
[1:37] <jmlowe> recoverable is what I'm most concerned about
[1:47] * LeaChim (~LeaChim@2.222.208.16) Quit (Ping timeout: 480 seconds)
[1:51] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[1:52] <jmlowe> sjustlaptop1: how are we doing?
[1:52] <sjustlaptop1> putting together patches
[1:52] <jmlowe> looking at a point release?
[1:58] * Cube (~Cube@96-41-69-24.dhcp.mtpk.ca.charter.com) has joined #ceph
[2:04] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[2:16] <Fetch> how much better targeted is btrfs for OSDs than XFS?
[2:17] <sjustlaptop1> xfs is our current recommendation
[2:17] <Fetch> I've seen some crazy magical problems, and I'm curious if I should try to use btrfs in the future
[2:17] <Fetch> is it? ok cool
[2:17] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[2:17] <sjustlaptop1> btrfs ideally should be better, but xfs is currently much more stable
[2:18] <Fetch> I'll keep on keeping on with it, then :) was getting deletion failures, partial deletion failures, format 2 creation failures...I thought maybe I just had a retarded config
[2:18] <buck> just a note, I'm seeing the behavior described in bug #4924 on a precise install
[2:18] <Fetch> (things seem to be resolved for me moment *crosses fingers*)
[2:18] <buck> I'
[2:18] <buck> er i'll update the bug
[2:23] * rturk is now known as rturk-away
[2:25] <jmlowe> whatever you do, don't use btrfs with a linux kernel < 3.8; you will lose data
[2:25] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[2:27] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[2:31] <jmlowe> sjustlaptop1: how are we doing?
[2:33] <mikedawson> Any Inktank guys out there who can kick gitbuilder into build wip-mon-trace for gitbuilder-ceph-deb-raring-amd64?
[2:39] * john_barbee_ (~jbarbee@c-98-226-73-253.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.90 [Firefox 21.0/20130430204233])
[2:40] * markbby (~Adium@168.94.245.2) has joined #ceph
[2:42] <cjh_> dmick: do you know how to raise the ulimit on the ceph-mon process? i'm getting a failed 8192 ulimit error
[2:45] <sjust> jmlowe: ok, got a review, try wip_split_upgrade on the one OSD we have been working with so far
[2:46] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[2:47] * Tamil (~tamil@38.122.20.226) has joined #ceph
[2:48] <mikedawson> sjust: can you make gitbuilder build a wip-mon-trace for raring?
[2:51] <sjust> mikedawson: didn't see it in the git builder, actually
[2:51] * sjustlaptop1 (~sam@38.122.20.226) Quit (Ping timeout: 480 seconds)
[2:51] <jmlowe> sjust: um, not familiar with the procedure for working with the wip stuff
[2:52] <mikedawson> sjust: sage added it for me, but it didn't build for anything past precise for some reason
[2:52] <sjust> oh, I think you change the ceph line in your ceph apt sources
[2:53] <mikedawson> sjust: ok
[2:53] <sjust> jmlowe: it's at the bottom of the http://ceph.com/docs/master/install/debian/#installing-packages page
[2:54] <mikedawson> jmlowe: add something like "deb http://gitbuilder.ceph.com/ceph-deb-raring-x86_64-basic/ref/wip_split_upgrade raring main" in /etc/apt/sources.list.d/ceph.list
[2:54] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[2:55] <sjust> jmlowe: actually, are you on centos6?
[2:55] <jmlowe> precise for my ceph cluster
[2:55] <sjust> ok, then the instructions I linked should work
[2:56] <sjust> wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc' | sudo apt-key add -
[2:56] <sjust> then add
[2:56] <jmlowe> did I say precise, I meant quantal, anyway I'm getting there
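(Pulling the steps above together, a sketch of installing a gitbuilder wip branch on quantal; the repo URL follows the pattern that appears later in the log, and the codename and file paths are assumptions.)

    wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc' | sudo apt-key add -
    echo 'deb http://gitbuilder.ceph.com/ceph-deb-quantal-x86_64-basic/ref/wip_split_upgrade quantal main' \
        | sudo tee /etc/apt/sources.list.d/ceph.list
    sudo apt-get update && sudo apt-get install ceph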
[2:56] * tkensiski (~tkensiski@217.sub-70-197-8.myvzw.com) has joined #ceph
[2:56] <sjust> ok
[2:56] * tkensiski (~tkensiski@217.sub-70-197-8.myvzw.com) Quit (Read error: Connection reset by peer)
[2:57] <sjust> jmlowe: I need to head home, I'll be back online in an hour
[2:57] <jmlowe> ok, you think I can just start up with this and everything will fix itself?
[2:57] <jmlowe> http://gitbuilder.ceph.com/ceph-deb-quantal-x86_64-basic/ref/wip_split_upgrade/dists/quantal/main/binary-amd64/Packages 404 Not Found
[3:00] <sjust> jmlowe: oops, the gitbuilder is behind
[3:00] <jmlowe> is it a wait or a poke operation?
[3:00] <sjust> http://ceph.com/gitbuilder.cgi
[3:00] <sjust> has the current status
[3:00] <sjust> http://gitbuilder.sepia.ceph.com/gitbuilder-quantal-deb-amd64/
[3:00] <sjust> is the right one
[3:01] <sjust> and you are waiting for wip_split_upgrade to show up
[3:01] <sjust> as yellow
[3:01] <sjust> I'll be back later
[3:01] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (Quit: Leaving.)
[3:01] * noahmehl (~noahmehl@wsip-98-173-51-204.sd.sd.cox.net) has joined #ceph
[3:01] <jmlowe> ok, thanks for all the help
[3:02] * markbby (~Adium@168.94.245.2) Quit (Ping timeout: 480 seconds)
[3:02] <mikedawson> jmlowe: https://github.com/ceph/ceph/compare/master...wip_split_upgrade
[3:03] <jmlowe> I saw some of that diff in the issue tracker
[3:03] <dmick> cjh_: yes, there's actually a .conf file setting for that
[3:03] * sagelap (~sage@2600:1012:b008:1ecf:ecbd:8495:bfcd:7a58) has joined #ceph
[3:05] <dmick> assuming you mean "number of open files". the variable is 'max open files'
[3:07] <dmick> also assuming you are starting daemons with /etc/init.d/ceph; sadly we don't have a solution in hand yet for upstart, I think
[3:07] <dmick> http://tracker.ceph.com/issues/3965
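(A sketch of the ceph.conf setting dmick refers to; the value is a placeholder, and it only takes effect when the daemons are started through the sysvinit script, as noted above.)

    [global]
        max open files = 131072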
[3:08] <mikedawson> dmick: when Sage says "start ceph-mon while generating a trace", do you know what that means, exactly? http://tracker.ceph.com/issues/4895
[3:09] <mikedawson> dmick: perhaps the wip-mon-trace branch does the work? gdb? something else?
[3:13] <sagelap> sorry, just a sec
[3:14] <dmick> looking, sorry
[3:14] <sagelap> --mon-debug-dum-transactions
[3:14] <sagelap> --mon-debug-dump-transactions
[3:14] <sagelap> and it'll write a file to /var/log/ceph/something.tdump
[3:14] <sagelap> similar to what we did last week
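(A hedged sketch of what sagelap describes; the option comes from the wip-mon-trace branch, and the monitor id and exact output filename are assumptions.)

    # restart the monitor with transaction dumping enabled
    ceph-mon -i a --mon-debug-dump-transactions
    # the trace should appear as /var/log/ceph/<something>.tdump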
[3:14] <dmick> oh that's even a better way to get an answer from sage :)
[3:15] <mikedawson> sagelap: you'll have it soon
[3:15] <sagelap> thanks
[3:16] <sagelap> the non-compaction will be easier to reproduce than the hangs we were seeing before :)
[3:17] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[3:17] <mikedawson> sagelap: am I good with Precise builds for Raring, or could you kick gitbuilder to compile for Raring?
[3:17] <sagelap> dmick is kicking it now
[3:17] <sagelap> but if the precise package installs that should be fine too
[3:17] <sagelap> paravoid: ping!
[3:18] <jmlowe> while he is at it any way to push wip-split-upgrade through for quantal?
[3:19] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[3:20] <cjh_> dmick: ok :)
[3:20] <sagelap> sadly not really :(
[3:21] <sagelap> jmlowe: actually it looks like wip_split_upgrade is built (_'s not -'s)
[3:23] * Tamil (~tamil@38.122.20.226) Quit (Quit: Leaving.)
[3:25] <mikedawson> sagelap: dependency hell trying to get the precise packages installed... Guess I'll wait for the Raring packages from gitbuilder
[3:29] <sagelap> gitbuilder vm needed a kick in the pants.. catching up now
[3:35] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[3:35] * loicd (~loic@magenta.dachary.org) has joined #ceph
[3:38] <dmick> mikedawson: should be there now
[3:38] <mikedawson> thanks dmick!
[3:39] <dmick> I may be lying; it should be there but I don't see the output dir
[3:39] <dmick> digging...
[3:40] <mikedawson> same here
[3:40] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[3:42] * sagelap (~sage@2600:1012:b008:1ecf:ecbd:8495:bfcd:7a58) Quit (Ping timeout: 480 seconds)
[3:43] <Fetch> I'm seeing Glance hang on a image-create, the hang is in rados_connect in the shared lib
[3:44] <Fetch> what steps should I go about to debug what's causing the hang?
[3:44] <dmick> yeah, sorry, we'd artificially marked all builds done because it needed rebooting and would otherwise bury itself recovering. Specifically marked that one as needed, and it's building now
[3:44] <dmick> Fetch: ceph -s for monitor health first, probably
[3:45] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[3:45] * loicd (~loic@magenta.dachary.org) has joined #ceph
[3:45] <jmlowe> dmick: I'm dying for ceph-deb-quantal-x86_64-basic/ref/wip_split_upgrade
[3:46] <dmick> well, how much you got? (slaps hand with wrench)
[3:46] <dmick> (jk. looking)
[3:46] <Fetch> dmick: health was ok, but it turns out the hang is specific to the user (I just reproduced with rbd). So, I'll go muck with my permissions
[3:47] <mikedawson> dmick: jmlowe certainly has a bigger problem than I do, I'd gladly wait in line behind him
[3:48] <dmick> mikedawson: you're off in the waiting area waiting for the fry clerk; I'm already ringing up jmlowe
[3:48] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[3:48] <Fetch> and...PEBCAK (for the logs): I copy pasted the perms from the ceph.com rbd openstack guide, but my ceph user is different in my glance config
[3:51] <jmlowe> dmick: I pay in beer and only in person
[3:52] <Fetch> so, related: rbd was hanging on creating an image because I created a user with no caps
[3:52] <Fetch> Is that a bug? Is that a known bug?
[3:53] * diegows (~diegows@190.190.2.126) Quit (Read error: Operation timed out)
[3:57] <dmick> it seems buggish, at least, and I don't know of one, but please search and file if you wish
[3:57] <dmick> seems like it ought to fail
[3:57] <dmick> not hang
[3:57] * markbby (~Adium@168.94.245.1) has joined #ceph
[3:58] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[3:58] <sjustlaptop> jmlowe: how is it going?
[3:58] <jmlowe> still waiting for the build
[3:59] <sjustlaptop> ah, it's 4th in line
[3:59] <sjustlaptop> you should probably upgrade that one osd, verify that it starts
[3:59] <sjustlaptop> then another osd, verify that it starts (the branch should take care of cleaning out the old snap dirs)
[3:59] <sjustlaptop> and then you should be good to go
[3:59] <Fetch> anyone on with redmine perms? I tried to register without an openid url, would like it deleted so I can try again if possible
[4:00] <Fetch> disregard
[4:00] <dmick> jmlowe/sjust: I'll mark the other builds as passed and then unmark them
[4:00] <sjustlaptop> oh, cool
[4:04] <dmick> sjustlaptop: you name your branches with _ just to be annoying, right? :)
[4:05] <sjustlaptop> dmick: I just like to keep people on their toes
[4:05] <dmick> ok, wip_split_upgrade will start after master finishes
[4:05] <sjustlaptop> cool
[4:05] <dmick> if you're *reeeally* desperate, the tarball is done
[4:17] * markbby (~Adium@168.94.245.1) Quit (Ping timeout: 480 seconds)
[4:34] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) Quit (Quit: Leaving.)
[4:34] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[4:36] * rustam (~rustam@94.15.91.30) has joined #ceph
[4:37] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[4:42] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[4:46] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[4:52] <dmick> gah. finally building that branch; master took forever
[4:57] <jmlowe> too much to hope that it won't take as long as master?
[5:03] <dmick> oh you can hope, it can't hurt :)
[5:19] * glowell (~glowell@38.122.20.226) Quit (Quit: Leaving.)
[5:28] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[5:36] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[5:40] <dmick> jmlowe: building pkgs
[5:46] <sage> jmlowe: looks like it's built
[5:47] <dmick> watching the log; not quite, it seems
[5:47] <sage> what dmick said :)
[5:48] <dmick> seems to *really* be going slow during this I/O part for reasons I don't understand
[5:55] * athrift (~nz_monkey@222.47.255.123.static.snap.net.nz) has joined #ceph
[5:57] * yehuda_hm (~yehuda@2602:306:330b:1410:5ce0:683c:121a:f749) Quit (Ping timeout: 480 seconds)
[5:59] <jmlowe> installing ...
[6:00] <dmick> yep, can confirm it's done now
[6:00] <jmlowe> sjust: you around?
[6:00] <sjustlaptop> yep
[6:01] <jmlowe> ok, shall I start one up?
[6:01] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[6:01] <sjustlaptop> yeah, the one you were working with before
[6:02] <mikedawson> dmick: can you do your magic for ceph-deb-raring-x86_64-basic/ref/wip-mon-trace ?
[6:03] <jmlowe> -9> 2013-05-08 00:02:19.296147 7f9b8fbe3780 2 journal read_entry 272211968 : seq 23041069 713 bytes
[6:03] <jmlowe> -8> 2013-05-08 00:02:19.296152 7f9b8fbe3780 2 journal read_entry 272216064 : seq 23041070 713 bytes
[6:03] <jmlowe> -7> 2013-05-08 00:02:19.296158 7f9b8fbe3780 2 journal read_entry 272220160 : seq 23041071 713 bytes
[6:03] <jmlowe> -6> 2013-05-08 00:02:19.296163 7f9b8fbe3780 2 journal read_entry 272224256 : seq 23041072 713 bytes
[6:03] <jmlowe> -5> 2013-05-08 00:02:19.296170 7f9b8fbe3780 2 journal read_entry 272228352 : seq 22973280 809 bytes
[6:03] <jmlowe> -4> 2013-05-08 00:02:19.296180 7f9b8fbe3780 2 journal read_entry 272228352 : seq 22973280 809 bytes
[6:03] <jmlowe> -3> 2013-05-08 00:02:19.296182 7f9b8fbe3780 3 journal journal_replay: end of journal, done.
[6:03] <jmlowe> -2> 2013-05-08 00:02:19.296220 7f9b8fbe3780 1 journal _open /data/osd.3/journal fd 28: 1048576000 bytes, block size 4096 bytes, directio = 1, aio = 0
[6:03] <jmlowe> -1> 2013-05-08 00:02:19.296439 7f9b8fbe3780 2 osd.3 0 boot
[6:03] <jmlowe> 0> 2013-05-08 00:02:20.198867 7f9b8fbe3780 -1 osd/PG.h: In function 'void PG::unlock()' thread 7f9b8fbe3780 time 2013-05-08 00:02:20.197406
[6:03] <jmlowe> osd/PG.h: 417: FAILED assert(!dirty_info)
[6:03] <dmick> mikedawson: looking
[6:03] <sjustlaptop> crap
[6:04] <sjustlaptop> one sec
[6:04] <sjustlaptop> will have patch shortly
[6:04] <sage> just forgot the write_info()?
[6:05] <sjustlaptop> yeah
[6:05] <sjustlaptop> d'oh
[6:05] <sjustlaptop> jmlowe: sorry, now for a rebuild
[6:05] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[6:05] <sjustlaptop> I assumed there was a write_info later on in load_pgs()
[6:06] <sjustlaptop> any chance we can allocate it more memory/io?
[6:06] <dmick> I wish I knew what the bottleneck was. it was painfully slow in the "objcopy" section which shouldn't be memory-limited
[6:06] <dmick> maybe the swraid is sick; I'm a n00b with that
[6:07] <dmick> mikedawson: raring wip-mon-trace is building
[6:07] <dmick> senta01:/proc/mdstat shows no sign of damage that I can see
[6:08] <dmick> ??
[6:08] <sage> i want to switch most of these over to lxc so they can actually share the whole machines resources
[6:08] <dmick> yeah
[6:08] <dmick> but then the base install has to be the target install, right?...not that it's a problem
[6:08] * yehuda_hm (~yehuda@2602:306:330b:1410:5ce0:683c:121a:f749) has joined #ceph
[6:09] <dmick> I can pause the precise-notcmalloc one, I guess
[6:10] <sage> lxc even works with cloud-init. only real restriction for us is the arch... can't do i386 most likely
[6:10] <dmick> I thought it was superjails and didn't run a different version. huh.
[6:11] <jmlowe> does anybody still use i386?
[6:11] <sage> probably not, but we still need to catch compilation errors and such
[6:12] <jmlowe> yeah, it's good form, no doubt
[6:12] <sage> dmick: same kernel, different namespaces for all kernel services
[6:13] <dmick> sage: right. so we'd need n vmhosts running different releases
[6:13] <jmlowe> <- ran openvz containers for several years
[6:13] <dmick> which we can have.
[6:16] <dmick> something weird is going on; virt-manager keeps dropping connection to senta01 too
[6:21] <mikedawson> dmick: installing now. thanks!
[6:22] <dmick> it finished??
[6:22] <mikedawson> seems like it
[6:23] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[6:23] <dmick> so it does
[6:24] <dmick> damn, 11 minutes
[6:24] <jmlowe> sjustlaptop: build failed
[6:25] <sjustlaptop> dmick: any idea?
[6:25] <sjustlaptop> hasn't failed anywhere else
[6:26] <dmick> um, no
[6:26] <dmick> CXXLD libglobal.la
[6:26] <dmick> libtool: link: cannot find the library `libglobal.la' or unhandled argument `libglobal.la'
[6:26] <dmick> ???
[6:26] <sjustlaptop> yeah, saw that
[6:26] <sjustlaptop> should I just force rebuild?
[6:26] <sjustlaptop> is it out of space?
[6:27] <dmick> no
[6:27] <dmick> I...guess?
[6:28] <sjustlaptop> I guess it's rebuilding
[6:28] <dmick> yeah
[6:30] <nigwil> anyone tried out ownCloud (sort-of-Dropbox/Webdav UI to Swift) against the RADOSgw? http://owncloud.org/
[6:31] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[6:33] <dmick> nigwil: it is reported:
[6:33] <dmick> Owncloud 4.5.x actually works fine "out of the box" with DreamObjects as secondary storage using Swift as the protocol.  The configuration is a little bit complex but not too bad.
[6:33] <dmick> and since DreamObjects is implemented with radosgw, that's encouraging
[6:38] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[6:38] * loicd (~loic@magenta.dachary.org) has joined #ceph
[6:41] <nigwil> thanks dmick, sounds great. Given the roadmap for Swift (long-term) is the Ceph community going to be able to track (keep up with) the Swift API developments (and required functionality)?
[6:41] * yehuda_hm (~yehuda@2602:306:330b:1410:5ce0:683c:121a:f749) Quit (Ping timeout: 480 seconds)
[6:41] <nigwil> It would seem that ideally the Swift (service) API should be able to front different object-stores (pluggable)
[6:42] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[6:43] <dmick> jmlowe: done
[6:44] <dmick> and apparently failed again. wth.
[6:45] <dmick> sjustlaptop: jmlowe: let me add CPUs and mem to this thing. hold on.
[6:46] <sjustlaptop> k
[6:46] <sjustlaptop> I've got the situation reproduced locally
[6:46] <sjustlaptop> trying to verify the fix
[6:46] <dmick> oh? how did your changes cause that?
[6:46] <dmick> well finish
[6:47] <sjustlaptop> I think it's fine as is
[6:47] <sjustlaptop> just trying to minimize test cycles
[6:47] <sjustlaptop> if it finishes building, go for it
[6:47] <dmick> oh you mean the original bug you're trying to fix. I thought you meant the build problem
[6:47] <sjustlaptop> no, not sure how to reproduce that
[6:48] <sjustlaptop> is there a cache or something you can clear?
[6:48] <sjustlaptop> that's bizarre
[6:48] <dmick> agreed
[6:49] <dmick> running autoconf at a speed that seems more rational now. maybe it was just a horked kernel somehow
[6:52] <sjustlaptop> jmlowe: I was able to reproduce the situation and recover with that branch
[6:52] <jmlowe> ok, that is a bit of relief
[6:53] <sjustlaptop> jmlowe: basically, you have stray empty directories
[6:53] <sjustlaptop> and load_pgs() is zealous about checking for that sort of thing
[6:54] * rustam (~rustam@94.15.91.30) has joined #ceph
[6:55] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[7:00] * __jt__ (~james@rhyolite.bx.mathcs.emory.edu) Quit (Read error: Operation timed out)
[7:00] * __jt__ (~james@rhyolite.bx.mathcs.emory.edu) has joined #ceph
[7:00] <dmick> *(&@#$
[7:00] <dmick> same build error.
[7:01] <sjustlaptop> can you do a git clean -fdx on the git checkout?
[7:01] <dmick> yeah; you should be able to get there too fwiw
[7:01] <sjustlaptop> how?
[7:01] <dmick> if your key isn't installed, bounce through teuthology
[7:01] <sjustlaptop> what is the machine name?
[7:01] * dpippenger (~riven@cpe-76-166-221-185.socal.res.rr.com) has joined #ceph
[7:02] <dmick> gitbuilder-quantal-deb-amd64'
[7:02] <dmick> (no ')
[7:02] <sjustlaptop> I was about to ask, that would be *diaboloical*
[7:02] <sjustlaptop> **diabolical*
[7:02] <dmick> making /var/tmp/ceph and building with V=1
[7:02] <dmick> I can't understand what it's even complaining about
[7:02] <jmlowe> BOFH
[7:03] <sjustlaptop> what is the rest of the dns?
[7:03] <sjustlaptop> apparently the obvious
[7:03] <dmick> yes
[7:03] <dmick> I'd be really surprised if the normal rebuild didn't do that
[7:04] <dmick> or the equivalent
[7:07] <sjustlaptop> where does the build happen?
[7:07] <dmick> /srv/autobuild-ceph/gitbuilder.git/build
[7:09] <dmick> ubuntu@gitbuilder-quantal-deb-amd64:/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-0.61-4-ge99daaf/src
[7:09] <dmick> ls .libs/libglobal*
[7:09] <dmick> .libs/libglobal.a .libs/libglobal_la-global_init.o
[7:09] <dmick> .libs/libglobal.la .libs/libglobal_la-pidfile.o
[7:09] <dmick> .libs/libglobal_la-global_context.o .libs/libglobal_la-signal_handler.o
[7:09] <dmick> ...
[7:09] <sjustlaptop> it failed again?
[7:09] <sjustlaptop> hang on
[7:09] <dmick> no, that's from the last run
[7:10] <sjustlaptop> oh
[7:10] <dmick> My own make did not make the cors. Maybe that's make check?
[7:10] <sjustlaptop> looks like it failed
[7:11] <sjustlaptop> make check also worked on my machine
[7:11] <dmick> oh yeah, it failed in the gitbuilder dir, I just don't know why. claims it doesn't have libglobal.la, has it
[7:12] <sjustlaptop> did you run git clean -fdx last time?
[7:12] <dmick> no
[7:12] <dmick> I'm trying to reproduce with a clean dir
[7:12] <dmick> in /var/tmp/ceph
[7:13] <dmick> the build script does
[7:13] <dmick> git submodule foreach 'git clean -fdx && git reset --hard'
[7:13] <dmick> rm -rf ceph-object-corpus
[7:13] <dmick> rm -rf src/leveldb
[7:13] <dmick> rm -rf src/libs3
[7:13] <dmick> git submodule init
[7:13] <dmick> git submodule update
[7:13] <dmick> git clean -fdx
[7:13] <dmick> (build.sh)
[7:13] <sjustlaptop> oh
[7:14] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[7:14] <dmick> + export CCACHE_DIR=/srv/autobuild-ceph/gitbuilder.git/build/../../ccache + command -v ccache + [ ! -e /srv/autobuild-ceph/gitbuilder.git/build/../../ccache ] + set -- CC=ccache gcc CXX=ccache g++
[7:14] <dmick> hm
[7:15] <dmick> that dir is currently empty
[7:15] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[7:16] <sjustlaptop> maybe just disable ccache?
[7:17] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[7:21] <dmick> doesn't look like the actual actual compile was using it anyway
[7:21] <dmick> my build did build ceph_test_cors, just fine
[7:21] <dmick> I'm utterly mystified.
[7:22] <sjustlaptop> maybe just blast the git tree and recreate?
[7:22] <dmick> I don't suppose it could hurt
[7:25] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[7:31] * lx0 is now known as lxo
[7:34] <dmick> sigh, ok, recreated and finally have it attempting again
[7:34] <sjustlaptop> dmick: ok, thanks
[7:35] <dmick> http://weknowmemes.com/wp-content/uploads/2011/12/i-have-no-idea-what-im-doing.jpg
[7:37] <sjustlaptop> me neither man, me neither
[7:45] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[7:46] <dmick> guh.
[7:46] <dmick> same error.
[7:46] <dmick> I'm out of ideas and alertness.
[7:49] <sjustlaptop> yeah...
[7:49] <sjustlaptop> jmlowe: ok if we continue in the morning?
[7:53] * lightspeed (~lightspee@81.187.0.153) Quit (Ping timeout: 480 seconds)
[7:56] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[7:59] * tnt (~tnt@91.177.240.165) has joined #ceph
[8:01] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[8:04] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[8:04] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:11] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[8:14] <jmlowe> ok, it's 2:00 am here, I'm spent
[8:17] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[8:19] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[8:19] * jtang1 (~jtang@79.97.135.214) Quit (Ping timeout: 480 seconds)
[8:22] * madkiss (~madkiss@2001:6f8:12c3:f00f:870:2526:b8b5:c53e) has joined #ceph
[8:32] * madkiss (~madkiss@2001:6f8:12c3:f00f:870:2526:b8b5:c53e) Quit (Ping timeout: 480 seconds)
[8:36] * madkiss (~madkiss@2001:6f8:12c3:f00f:d413:cd8:6fc1:6e62) has joined #ceph
[8:39] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: Light travels faster then sound, which is why some people appear bright, until you hear them speak)
[8:40] * Cube (~Cube@96-41-69-24.dhcp.mtpk.ca.charter.com) Quit (Quit: Leaving.)
[8:40] * Cube (~Cube@96-41-69-24.dhcp.mtpk.ca.charter.com) has joined #ceph
[8:48] * davidzlap (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[8:52] <fridad> t
[8:52] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) has joined #ceph
[8:55] <paravoid> sage: pong
[9:01] * jtang1 (~jtang@79.97.135.214) has joined #ceph
[9:01] * ScOut3R (~ScOut3R@212.96.47.215) has joined #ceph
[9:19] * hank (~Adium@p2003006F8E07940075CC9DCFE925C382.dip0.t-ipconnect.de) has joined #ceph
[9:19] * hank (~Adium@p2003006F8E07940075CC9DCFE925C382.dip0.t-ipconnect.de) Quit ()
[9:20] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[9:20] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) has joined #ceph
[9:21] * tnt (~tnt@91.177.240.165) Quit (Ping timeout: 480 seconds)
[9:22] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[9:23] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:23] * hverbeek (~Adium@p2003006F8E07940075CC9DCFE925C382.dip0.t-ipconnect.de) has joined #ceph
[9:23] <hverbeek> Good morning from Germany
[9:25] <hverbeek> I have a quick question regarding cuttlefish: According to http://ceph.com/docs/master/install/os-recommendations/#platforms, debian 6.0 squeeze is supported. However, the 'ceph' package depends on 'cryptsetup-bin' which is only available from debian 7.0 wheezy onwards.
[9:30] * loicd (~loic@magenta.dachary.org) Quit (Ping timeout: 480 seconds)
[9:30] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[9:31] <wogri_risc> good morning hverbeek, what prevents you from using wheezy? it's considered stable now.
[9:32] * fridad (~fridad@b.clients.kiwiirc.com) Quit (Quit: http://www.kiwiirc.com/ - A hand crafted IRC client)
[9:36] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[9:37] * fidadu (~oftc-webi@fw-office.allied-internet.ag) has joined #ceph
[9:38] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[9:38] <hverbeek> I have only a small number of machines, they are hosting KVM vms. The management environment (proxmox) is not yet supported on wheezy. At some point, I guess wheezy will come, but for the moment I'm stuck with that.
[9:39] * jtang1 (~jtang@79.97.135.214) Quit (Quit: Leaving.)
[9:40] <hverbeek> I wonder though if I'm doing something wrong here. If squeeze is supported, how did the QA team install it, without the cryptsetup-bin package?
[9:41] <wogri_risc> maybe they didn't.
[9:41] <fidadu> hverbeek: you know that proxmox 3.0rc1 was released today based on wheezy and cuttlefish?
[9:42] <hverbeek> wheee! no! I looked at the forum like… 30 minutes ago, must have missed it
[9:43] <fidadu> hverbeek: was posted to mailinglist 13 minutes ago
[9:43] <fidadu> hverbeek: pve-devel
[9:43] <fidadu> hverbeek: not sure whether the forum is updated
[9:47] <fghaas> hverbeek: still, for those who want to stick to oldstable for a little while and would like to try out cuttlefish, that dependency sounds like a packaging bug
[9:48] <fghaas> you might want to post that to the ceph-devel list, or file an issue in the redmine tracker
[9:48] * jjgalvez1 (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[9:48] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[9:49] <wogri_risc> you could apt-get download it and then run dpkg with --ignore-depends
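(A sketch of wogri_risc's workaround; the package filename is a placeholder, and overriding a dependency like this is at your own risk.)

    apt-get download ceph
    sudo dpkg --ignore-depends=cryptsetup-bin -i ceph_*.deb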
[9:50] <tnt> Speaking of packaging bug, I think the 'ceph' package should depend on ceph-common (= ${binary:Version}) and not just ceph-common (i.e. force the same version).
[9:51] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Remote host closed the connection)
[9:53] <hverbeek> i'll do that, thx.
[9:59] * LeaChim (~LeaChim@2.222.208.16) has joined #ceph
[10:01] * dxd828 (~dxd828@195.191.107.205) has joined #ceph
[10:18] * capri (~capri@212.218.127.222) has left #ceph
[10:23] * hverbeek (~Adium@p2003006F8E07940075CC9DCFE925C382.dip0.t-ipconnect.de) Quit (Quit: Leaving.)
[10:24] * lightspeed (~lightspee@81.187.0.153) has joined #ceph
[10:33] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[10:37] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Quit: wogri_risc)
[10:39] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[10:46] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[10:48] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[10:49] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) has joined #ceph
[10:54] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[10:55] * tziOm (~bjornar@194.19.106.242) has joined #ceph
[10:59] * hverbeek (~Adium@ip-109-42-0-62.web.vodafone.de) has joined #ceph
[11:03] * mohits (~mohit@zccy01cs103.houston.hp.com) has joined #ceph
[11:20] * hverbeek (~Adium@ip-109-42-0-62.web.vodafone.de) Quit (Quit: Leaving.)
[11:23] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[11:23] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Read error: Connection reset by peer)
[11:27] * Cube (~Cube@96-41-69-24.dhcp.mtpk.ca.charter.com) Quit (Quit: Leaving.)
[11:45] * fabioFVZ (~fabiofvz@213.187.20.119) has joined #ceph
[11:54] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[11:58] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[12:05] <tnt> mikedawson: the mon store still seems to grow; it's much slower than I originally thought (about 20 MB in a day), but still, that cluster has a single 2G RBD image on it and the store doubled its size in a day.
[12:07] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[12:08] <tnt> mikedawson: although now that I look at it, it might just be the log messages ... are those removed after some time?
[12:10] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Ping timeout: 480 seconds)
[12:10] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[12:14] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) has joined #ceph
[12:16] * wogri_risc (~wogri_ris@ro.risc.uni-linz.ac.at) Quit (Remote host closed the connection)
[12:16] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[12:24] * Ian_M_Porter (~chatzilla@cpc15-thor5-2-0-cust102.14-2.cable.virginmedia.com) has joined #ceph
[12:25] * Ian_M_Porter is now known as mip
[12:25] * Jakdaw (~chris@ion.jakdaw.org) has joined #ceph
[12:25] * mip (~chatzilla@cpc15-thor5-2-0-cust102.14-2.cable.virginmedia.com) Quit ()
[12:26] * mip (~chatzilla@cpc15-thor5-2-0-cust102.14-2.cable.virginmedia.com) has joined #ceph
[12:27] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 482 seconds)
[12:29] <Jakdaw> What's the best way to use RBD for KVM root devices? Using the kernel RBD support on the VM host and exposing the block device to the VM seems ugly from a management pov. KVM's own RBD support seems very naive at present (eg the VM disappears altogether for tens of seconds when a monitor disappears, whether it's using the block device or not) and I've not yet come across anyone who's gone through the process of writing an initrd that will configure the guest
[12:29] <Jakdaw> kernel to map the RBD device and use that as root...
[12:29] <jmlowe> qemu rbd driver is the way to go
[12:30] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[12:30] <tnt> what do you mean the VM disappears ?
[12:30] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[12:31] <tnt> the qemu rbd driver would definitely be the way to go
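(A minimal sketch of what jmlowe and tnt recommend: letting qemu's built-in rbd driver talk to the cluster instead of mapping a kernel RBD device on the host; pool, image name and cache mode are assumptions.)

    qemu-system-x86_64 -m 1024 \
        -drive file=rbd:rbd/vm-disk:conf=/etc/ceph/ceph.conf,format=raw,if=virtio,cache=writeback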
[12:31] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[12:31] <Jakdaw> jmlowe: really? is it under active development?
[12:31] <tnt> yes
[12:31] <tnt> there have been some patch improving flush support recently for eg.
[12:31] <Jakdaw> tnt: at present if I kill a monitor then my VM is not scheduled at all (ie doesn't respond to ping) for a few seconds
[12:31] <jmlowe> I have 79 vm's running with it right now
[12:31] <Jakdaw> tnt: it's using an RBD device as root, but isn't doing any IO
[12:32] <Jakdaw> this is with qemu-kvm-1.2.0
[12:32] <tnt> well ... you might not be doing any IO but the kernel could be ... just paging stuff in/out.
[12:34] <Jakdaw> it's repeatable and the VM is totally idle
[12:34] <mip> Hi i have a question on how ceph returns a read on a large file (say foo) that has been striped across multiple ceph objects/object sets (and assume stored across multiple PGs). How does ceph know how to reconstitute the file back into foo on a read? Is there metadata held within the individual ceph objects to enable this?
[12:34] <tnt> but as long as you specified several monitors that shouldn't matter much.
[12:34] <Jakdaw> in any case - it still sucks - just because a block device isn't responding I still want the rest of my VM to be scheduled
[12:35] <tnt> you should try the latest one.
[12:36] <tnt> it has async flush instead of synchronously waiting in the main qemu thread.
[12:36] <Jakdaw> have qemu-kvm been merged into qemu now?
[12:37] <Jakdaw> so latest is now qemu-1.4.1 ?
[12:39] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[12:43] * Jakdaw switches to that and tries again
[12:45] * jjgalvez1 (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) Quit (Quit: Leaving.)
[12:48] <matt_> Jakdaw, you need the GIT version of qemu for the async rbd patch
[12:48] <matt_> which is 1.5rc0 at the moment I think
[12:49] <Jakdaw> ah awesome thanks - matt_ any idea when that's likely to be released?
[12:50] <matt_> I'm not sure to be honest, might be a few weeks depending on how many bugs needs to be fixed in the release candidate
[12:51] * DarkAce-Z (~BillyMays@50.107.54.92) has joined #ceph
[12:53] <fidadu> Jakdaw: but you also need a recent version of ceph 0.56.6 or 0.61
[12:54] <Jakdaw> I'm on bobtail at the moment so I think that should be ok
[12:54] * DarkAceZ (~BillyMays@50.107.54.92) Quit (Ping timeout: 480 seconds)
[12:55] <fidadu> Jakdaw: yes if you have a version > 0.56.4
[12:55] <fidadu> Jakdaw: async rbd flush support was added with 0.56.5
[12:55] <Jakdaw> 0.56.6 :)
[12:56] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[12:58] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[12:58] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[12:59] * diegows (~diegows@190.190.2.126) has joined #ceph
[13:06] <Jakdaw> actually qemu 1.4.1 seems to be a huge improvement anyway
[13:08] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[13:10] <fidadu> Jakdaw: yes, you can also cherry-pick the rbd async patch to qemu 1.4.1; it works fine
[13:11] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Quit: killed (ChanServ (Quit Message Spam is off topic.)))
[13:11] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[13:18] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[13:20] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[13:21] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[13:22] * Jakdaw (~chris@ion.jakdaw.org) Quit (Read error: Operation timed out)
[13:23] <matt_> fidadu, How did you get the async patch into 1.4.1? I tried moving rbd.c but I just got compile errors :/
[13:23] <LeaChim> Hi, I'm having some problems with ceph OSDs seeming to hang, blocking requests
[13:24] <LeaChim> For example: 2013-05-08 12:21:31.372025 osd.3 [WRN] slow request 240.204091 seconds old, received at 2013-05-08 12:17:31.167890: osd_op(client.4251.0:646290 gc.25 [call lock.lock] 4.51baec8a) v4 currently waiting for subops from [0,2,1]
[13:24] <fidadu> matt_: no, moving rbd.c does not work; just cherry-pick the relevant async commit
[13:25] <tnt> LeaChim: what version ?
[13:27] <LeaChim> 0.56.4, running on ubuntu precise.
[13:28] <fidadu> LeaChim: maybe try to update to 0.56.6 not sure if it will fix your problem but i would try that first
[13:31] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Quit: killed (ChanServ (Quit Message Spam is off topic.)))
[13:31] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[13:33] <LeaChim> I'd prefer to find out what's causing the problem first, I don't see anything in the changelog that would seem related. And I can't just upgrade the production cluster at the drop of a hat.
[13:33] <tnt> LeaChim: is there anything weird in the dmesg of that OSD ?
[13:34] * leseb (~Adium@AMarseille-651-1-203-27.w92-153.abo.wanadoo.fr) has joined #ceph
[13:35] <fidadu> LeaChim: maybe rados bench osd 3 and see if the write speed is ok
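(fidadu presumably means the per-OSD benchmark; a hedged sketch of the bobtail-era syntax, whose result shows up in the cluster log as in the output quoted below.)

    ceph osd tell 3 bench
    # watch for the result with: ceph -w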
[13:35] <andreask> LeaChim: can also be an overloaded osd ... or broken disk or filesystem problem
[13:38] * leseb (~Adium@AMarseille-651-1-203-27.w92-153.abo.wanadoo.fr) Quit ()
[13:42] * leseb (~Adium@AMarseille-651-1-203-27.w92-153.abo.wanadoo.fr) has joined #ceph
[13:42] <LeaChim> Don't see anything weird in dmesg
[13:43] <LeaChim> 2013-05-08 12:42:22.958448 osd.3 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 52.725448 sec at 19887 KB/sec
[13:43] <LeaChim> The OSDs should hardly be overloaded, there's barely anything happening
[13:45] <tnt> I'm wondering if osd.3 is the issue or if the issue is another OSD for that PG and osd.3 is waiting on its peer.
[13:45] <LeaChim> There's also a slow request in the list on 2, waiting for subops from [3,1,0]
[13:45] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[13:46] <tnt> what's your cluster size ?
[13:46] <tnt> is 4.51baec8a the pgid ??? it looks huge
[13:46] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[13:47] <LeaChim> There's only 4 nodes, 128 PGs per pool
[13:51] <tnt> what's the replication you requested ?
[13:51] <LeaChim> 4, at present it should result in one copy on every node.
[14:00] <LeaChim> Hmm, some messages in the osd log like: 2013-05-08 12:49:43.177176 7fa5d44fc700 0 -- 10.249.32.101:6801/7868 >> 10.248.32.11:6801/21100 pipe(0x4f7c000 sd=31 :6801 s=2 pgs=15033 cs=357 l=0).fault with nothing to send, going to standby
[14:01] * fabioFVZ (~fabiofvz@213.187.20.119) Quit (Remote host closed the connection)
[14:01] * fabioFVZ (~fabiofvz@213.187.20.119) has joined #ceph
[14:02] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[14:03] <andreask> LeaChim: not unusual in an idle cluster .... nothing to do --> going to standby
[14:04] * JaksLap (~chris@2a03:9600:1:1:7987:c8ed:2619:1a3d) has joined #ceph
[14:05] <andreask> LeaChim: IIRC after 15min of inactivity
[14:06] <tnt> anyone upgraded to cuttlefish and using radosgw ? when i do "radosgw-admin bucket list" it shows nothing ... but listing them using the 's3' web api works fine.
[14:10] <LeaChim> Hmmm. I think this might have been the result of a weird network issue. The 2 OSDs that seemed to have problems were both listening on a link local address on the box, instead of the OSPF ip which is the main interface.
[14:12] <LeaChim> Anyone know how I can force it to bind to a specific IP?
[14:12] * jeffv (~jeffv@2607:fad0:32:a02:d932:1024:4074:3426) has joined #ceph
[14:13] <LeaChim> or force it to bind to 0.0.0.0, like most of them seem to be doing.
[14:14] * hflai (~hflai@alumni.cs.nctu.edu.tw) Quit (Remote host closed the connection)
[14:15] * schlitzer|work (~schlitzer@109.75.189.45) has joined #ceph
[14:15] <tnt> mm, mine just listen on 0.0.0.0 ... (except for the mon). I'm not even sure how you got them to listen on a specific ip.
[14:15] <schlitzer|work> hey folks
[14:18] <LeaChim> I didn't ask them to, the ip they bound to isn't referenced in any config file :/
[14:18] <LeaChim> Nor is it the one the hostname resolves to
[14:18] * leseb (~Adium@AMarseille-651-1-203-27.w92-153.abo.wanadoo.fr) Quit (Quit: Leaving.)
[14:18] * mohits (~mohit@zccy01cs103.houston.hp.com) Quit (Read error: Connection reset by peer)
[14:18] <schlitzer|work> i was just wondering if ceph (osd) would build on an arm cpu.... and if so if it would make sense to look for "micro servers" with two gigabit interfaces + one or two sata ports to build up osds on
[14:18] <schlitzer|work> *hmmm*
[14:19] <fghaas> schlitzer|work: people have been running ceph on raspberry pi, so yes, arm should work
[14:20] <fghaas> (iirc, that is)
[14:20] <schlitzer|work> well, i guess this is something that i gonna try
[14:20] <fghaas> do that, and report back :)
[14:21] <schlitzer|work> hmm, do rasp pi have a sata port?
[14:21] <schlitzer|work> guess not
[14:21] <schlitzer|work> i will first take a look at what board i could use for that
[14:21] <schlitzer|work> rasp pi also has only 10mbit as far as i know
[14:22] <fghaas> http://irclogs.ceph.widodh.nl/index.php?date=2012-07-30
[14:22] <fghaas> might provide some insight
[14:22] <schlitzer|work> thank you
[14:22] <andreask> LeaChim: well you can specify a "cluster" and a "public" network
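(A sketch of the ceph.conf options andreask means; the subnets are placeholders matching the addresses quoted above, and daemons bind to addresses within these ranges.)

    [global]
        public network  = 10.248.0.0/16
        cluster network = 10.249.0.0/16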
[14:25] <Azrael> hey folks
[14:25] <Azrael> anybody experiencing crashes of OSD's with cuttlefish?
[14:25] <Azrael> we are, repeatedly
[14:26] <tnt> Azrael: both my test clusters seem find. What kind of crash ?
[14:27] <darkfader> schlitzer|work: one person on here had osd's on atom boxes and said they didn't really cope after an osd crash
[14:27] <darkfader> i'd guess arm is the same plus memory pressure
[14:27] <schlitzer|work> ok
[14:28] <schlitzer|work> it was just a quick idea that come into my head
[14:28] <Azrael> tnt: http://pastebin.com/Bt3yWvfV
[14:28] <darkfader> schlitzer|work: i'd basically kill for someone to build a 3.5" board with fc ports on one end and two 1ge out front and ARM cpu
[14:28] <darkfader> recycle bazillions of old fc shelfs
[14:29] <Azrael> tnt: the crash seems to occur only when we stop other osd's
[14:29] <darkfader> you'd need to see how bad it really handles during recovery
[14:29] <Azrael> example... we have 6 nodes each housing 12 osd's
[14:29] <Azrael> we stopped all osd's on our node 'data3'
[14:29] <Azrael> (simulating a host crash / maintenance)
[14:30] <schlitzer|work> darkfader, this is something i would like to test. but rasp pi has only one 100mbit ethernet port, and no sata at all
[14:30] <Azrael> then 1 osd on data4 and 1 osd on data5 crashed
[14:30] <Azrael> we had this same problem yesterday
[14:30] <Azrael> it's repeatable
[14:30] <darkfader> schlitzer|work: look for something called odroid
[14:30] <schlitzer|work> odroid has also no sata as far as i know
[14:30] <tnt> Azrael: it crashed during the re-replication ?
[14:31] <darkfader> oh, wow. sorry
[14:31] <jerker> Has anyone here been useing FluidFS (from Dell) and can compare it with performance from CephFS?
[14:32] <Azrael> tnt: yes. also btw we had 'ceph osd set noout' flag.
[14:32] <tnt> schlitzer|work: i.MX53 Quick Start Board has SATA
[14:33] <Azrael> tnt: osd.38 (node data4) and osd.54 (node data5) crashed while osd's osd.12 thru osd.23 were stopped on node data3
[14:33] <tnt> Azrael: huh ... it shouldn't have redistributed at all with the noout flag. I'll try to set it on my test cluster and kill an osd, see what happens.
[14:34] <Azrael> ok
[14:34] <Azrael> we have a pg 8.40 that uses osd.26, osd.36, osd.54..... there's something about this pg that's funny
[14:34] <Azrael> sorry osd.38
[14:34] <tnt> but my test clusters are much smaller ... so not sure it'll do anything.
[14:34] <schlitzer|work> tnt, hrmtz to expensive
[14:34] <Azrael> this same pg gave us grief yesterday which sage helped us resolve
[14:34] <schlitzer|work> something under 50€ would be great
[14:35] <schlitzer|work> do not want to spend so much money just to satisfy my private interest :-D
[14:35] * dosaboy_ (~dosaboy@host86-163-9-160.range86-163.btcentralplus.com) has joined #ceph
[14:35] * mohits (~mohit@122.179.86.1) has joined #ceph
[14:36] <nyerup> tnt, Azrael: (I'm a colleague) Yesterday we attributed the problem to the fact, that we were upgrading from testing (0.60) to Cuttlefish, but today - with all nodes on Cuttlefish - the OSD crashed again, apparently blaming a corrupt PG.
[14:37] <jerker> http://www.slideshare.net/LarryCover/scaleout-storage-on-intel-architecture-based-platforms-characterizing-and-tuning-practices (One of the few places I found FluidFS and Ceph mentioned in the same document)
[14:37] <nyerup> By the way, it's the exact same PG that got corrupted yesterday, which was eventually reinjected after the upgrade.
[14:37] <jerker> from 12 april 2013
[14:37] * nigly (~tra26@tux64-13.cs.drexel.edu) Quit (Ping timeout: 480 seconds)
[14:38] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[14:38] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[14:40] * dosaboy__ (~dosaboy@host86-161-206-107.range86-161.btcentralplus.com) has joined #ceph
[14:41] * dosaboy (~dosaboy@host86-161-203-141.range86-161.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[14:41] * jtang1 (~jtang@sgenomics.org) has joined #ceph
[14:42] * jtang1 (~jtang@sgenomics.org) Quit (Remote host closed the connection)
[14:43] * dosaboy_ (~dosaboy@host86-163-9-160.range86-163.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[14:45] <jerker> http://www.inktank.com/dell/white-paper/ Gah, why do I have to register to read a white paper????
[14:46] <tnt> to track you ...
[14:47] * capri_on (~capri@212.218.127.222) has joined #ceph
[14:49] <jerker> Talked to some local Dell sales guys yesterday, but wasn't aware that Dell and Inktank seem to have at least something published going on so I didn't ask them about it.
[14:50] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[15:01] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[15:01] * fidadu (~oftc-webi@fw-office.allied-internet.ag) Quit (Remote host closed the connection)
[15:02] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:06] * markbby (~Adium@168.94.245.3) has joined #ceph
[15:13] <Azrael> hi yehuda_hm
[15:14] <Azrael> are you able to assist with a reproducible osd daemon crash?
[15:16] <Azrael> (on cuttlefish)
[15:16] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[15:17] * DarkAce-Z is now known as DarkAceZ
[15:19] * aliguori (~anthony@66.187.233.207) has joined #ceph
[15:21] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[15:21] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[15:25] <mikedawson> Azrael: can you paste the log somewhere?
[15:26] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[15:29] <Azrael> mikedawson: http://pastebin.com/Bt3yWvfV
[15:31] <mikedawson> Azrael: I was just about to post that I saw someone else with this issue yesterday.... until I realized it was you :-)
[15:32] <mikedawson> Azrael: I would enter a bug at http://tracker.ceph.com/projects/ceph/issues/new
[15:39] <Azrael> mikedawson: hehe
[15:39] <Azrael> mikedawson: ok yeah we will do that
[15:41] <mikedawson> Azrael: if you think http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg10174.html is the same issue, you might want to mention that too
[15:42] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[15:49] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[15:52] * gmason (~gmason@hpcc-fw.net.msu.edu) has joined #ceph
[15:53] * diegows (~diegows@190.190.2.126) has joined #ceph
[16:00] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[16:03] * barryo (~borourke@cumberdale.ph.ed.ac.uk) Quit (Quit: Leaving.)
[16:07] * xevwork (~xevious@6cb32e01.cst.lightpath.net) has joined #ceph
[16:08] <xevwork> Is the length of time required to clone a snapshot in Ceph dependent on the size of the dataset? Or, is it like ZFS where it's virtually instant?
[16:09] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[16:14] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[16:15] * schlitzer|work (~schlitzer@109.75.189.45) Quit (Quit: Leaving)
[16:17] <wido> xevwork: RBD?
[16:17] <xevwork> Yes
[16:17] <wido> Instant, no copying is done
[16:17] <xevwork> That's what I thought.
[16:17] <xevwork> Our CEO is crazy.
[16:17] <xevwork> Thanks.
[16:18] <wido> xevwork: Keep in mind, that is only if you use RBD cloning
[16:18] <wido> If you use qemu-img it will do a full copy
[16:18] <wido> so you have to use the 'rbd' tool
[16:18] <xevwork> Sure.
[16:18] <xevwork> Much like zfs where you use the 'zfs' tool.
[16:19] <wido> Yes, correct
[16:19] * portante|afk is now known as portante
[16:27] * noahmehl (~noahmehl@wsip-98-173-51-204.sd.sd.cox.net) Quit (Quit: noahmehl)
[16:28] * noahmehl (~noahmehl@wsip-98-173-51-204.sd.sd.cox.net) has joined #ceph
[16:30] * barryo (~borourke@cumberdale.ph.ed.ac.uk) has joined #ceph
[16:30] <elder> wido, not rbd cloning, just rbd snapshotting.
[16:30] <elder> "rbd snap create imagename@snapname"
[16:30] * tkensiski (~tkensiski@2600:1010:b017:40a4:ccd1:1f82:4faf:17c) has joined #ceph
[16:30] * tkensiski (~tkensiski@2600:1010:b017:40a4:ccd1:1f82:4faf:17c) has left #ceph
[16:31] <tnt> does cloning copy the data immediately or "on-demand" (cow)
[16:31] <elder> rbd cloning goes beyond just that, creating a writable image that's copy-on-write.
[16:36] * noahmehl (~noahmehl@wsip-98-173-51-204.sd.sd.cox.net) Quit (Ping timeout: 480 seconds)
[16:36] <wido> elder: Am I confused here? But I mean the layering feature
[16:36] <wido> where you create a child image from a parent
[16:38] <tnt> I think what he meant is that there are two distinct features : snapshotting (creating a r/o image of the current state of an image) and cloning (create a r/w image from another and using copy-on-write to avoid duplicating data).
[16:38] <tnt> the original question refers to "clone a snapshot", which is not very clear.
[16:41] <wido> tnt: I think that he means creating a r/w image from a r/o snapshot
[16:41] <wido> That goes instantly
[16:41] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[16:42] <tnt> anybody with cuttlefish using radosgw here ?
[16:42] <pioto> a few osd-related questions... unlike a 'raid 1' mirror, where i think reads can happen from either disk... with replica size > 1, all reads are still going to hit the primary osd first, and only hit secondary/tertiary after failure?
[16:43] <pioto> so, you want a higher pg_num for each pool, to help balance out those reads in a different way
[16:43] <tnt> yes
[16:44] <tnt> rbd also has a brand new stripev2 feature to stripe across multiple osds inside a 4MB block.
[16:44] <tnt> (or so I hear, I never tested it yet)
[16:45] <pioto> well. i think i remember reading that striping was more useful for larger objects?
[16:45] <pioto> but, oh, i guess you're talking about striping at the rbd layer, not at the rados layer
[16:45] <tnt> yes, I was thinking about rbd.
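A hedged sketch of the two knobs discussed above, with 'bench' as a placeholder pool name; the stripe flags are assumed from the rbd tool of this era and only apply to format-2 images:
    # more placement groups spread a pool's objects (and their primary roles) over more OSDs;
    # pg_num can only be increased, and pgp_num should follow it
    ceph osd pool set bench pg_num 1024
    ceph osd pool set bench pgp_num 1024
    # the STRIPINGV2 feature: stripe a format-2 image across several objects at a time
    rbd create mystriped --size 10240 --format 2 --stripe-unit 65536 --stripe-count 8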
[16:46] <elder> Sorry wido, wasn't looking for a few minutes.
[16:46] <elder> tnt provided an accurate assessment I think.
[16:47] <wido> I think the reference to zfs clone got me :)
[16:47] <elder> A snapshot is a read-only persistent view of an rbd image at a given point in time. It can be created instantly, no copying is done.
[16:47] <elder> An rbd clone is a distinct thing--a writable image, based on a snapshot, which is also created instantly. Reads are satisfied from the "parent" rbd image, writes are done copy-on-write.
[16:48] <pioto> a zfs clone is also "instant"
[16:48] <pioto> and yes, it mostly matches my mental model for rbd snapshots/clones
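A short sketch of the two operations elder contrasts above, using placeholder pool/image names (cloning assumes a format-2 image and a protected parent snapshot):
    # snapshot: instant, read-only, point-in-time view of an image
    rbd snap create rbd/myimage@snap1
    # a clone needs its parent snapshot protected first
    rbd snap protect rbd/myimage@snap1
    # clone: instant, writable child; reads fall through to the parent, writes are copy-on-write
    rbd clone rbd/myimage@snap1 rbd/mychild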
[16:50] <paravoid> I have a couple of OSDs leaking memory
[16:50] <paravoid> 15G of resident memory right now
[16:50] <paravoid> I had more, but OOM killed them
[16:50] <paravoid> anything meaningful I can get from them before restarting?
[16:50] <paravoid> I'm guessing debug osd 20 might not be of much help
[16:51] <tnt> I've had an OSD memory leak since argonaut ... still unsolved.
[16:51] * ScOut3R (~ScOut3R@212.96.47.215) Quit (Ping timeout: 480 seconds)
[16:52] <paravoid> MALLOC: + 15129030656 (14428.2 MB) Bytes in page heap freelist
[16:53] <tnt> do you have a graph of memory usage to see how/when it grows ?
[16:53] * portante (~user@66.187.233.206) Quit (Quit: upgrading)
[16:54] <tnt> meh, just noticed http://tracker.ceph.com/issues/3883 about that leak was marked as "won't fix" ...
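For reference, the tcmalloc numbers paravoid quotes can be pulled (and the page-heap freelist handed back to the OS) through the admin socket; a sketch, assuming osd.4 as in the log:
    # print the allocator summary for one OSD (the source of the MALLOC: lines above)
    ceph tell osd.4 heap stats
    # release freed pages held in tcmalloc's page heap back to the kernel;
    # this often shrinks RSS without restarting the daemon
    ceph tell osd.4 heap release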
[16:57] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:03] * tziOm (~bjornar@194.19.106.242) Quit (Remote host closed the connection)
[17:03] * ScOut3R (~ScOut3R@C2B0E4C1.dialup.pool.telekom.hu) has joined #ceph
[17:04] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[17:05] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:05] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[17:07] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[17:10] * imjustmatthew (~imjustmat@c-24-127-107-51.hsd1.va.comcast.net) has joined #ceph
[17:10] * eschnou (~eschnou@85.234.217.115.static.edpnet.net) Quit (Remote host closed the connection)
[17:15] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) Quit (Ping timeout: 480 seconds)
[17:17] * ScOut3R (~ScOut3R@C2B0E4C1.dialup.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[17:18] <imjustmatthew> Does the ceph.com/debian-testing repo not have 0.61?
[17:19] * berant (~blemmenes@vpn-main.ussignal.co) has joined #ceph
[17:20] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[17:22] * yehuda_hm (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[17:24] * gaveen (~gaveen@175.157.180.157) has joined #ceph
[17:26] * glowell (~glowell@38.122.20.226) has joined #ceph
[17:28] <mikedawson> sage: ping
[17:28] <sage> ping
[17:28] <jmlowe> I kept sjust and dmick up late last night, either one up yet?
[17:29] <sage> don't think so :)
[17:29] <mikedawson> sage: mikedawson-ceph-mon.a-tdump.tar.bz2 is out there
[17:29] <sage> on cephdrop?
[17:29] * tkensiski (~tkensiski@209.66.64.134) has joined #ceph
[17:30] * tkensiski (~tkensiski@209.66.64.134) has left #ceph
[17:30] <mikedawson> sage: Yeah. The mons had grown to 42GB each. This morning a and b crashed. I ended up stopping everything, starting the mons first without OSDs. They compacted to ~200MB and have been steady for 2 hours despite my increased workload
[17:31] <sage> not growing any more?
[17:31] <Azrael> sage: we were able to reproduce the osd crashes. and i have a core dump.
[17:31] <sage> azrael: which version?
[17:32] <Azrael> sage: cuttlefish
[17:32] <mikedawson> sage: also matt_ is at ~20GB and he's on Precise whereas I'm Raring. I know leveldb versions was once a theory.
[17:32] <sage> k
[17:32] <sage> i'll take a look at the dump shortly!
[17:33] <mikedawson> sage: still growing and compacting, but the compact is keeping up
[17:33] <sage> azrael: can you open a ticket in the tracker and include the backtrace and any logs?
[17:33] <sage> k
[17:33] <Azrael> circumstances were us restarting 12 osd's at once. osds on other nodes would crash. those osd's shared the same pg as one of the 12 osd's that were restarted (not surprising, since we have 1024, 1024, and 16000 pg's for our three pools respectively)
[17:33] <Azrael> in addition, we had 'noout' flag set
[17:33] <Azrael> repl size 3 with min_size 1
[17:33] <Azrael> what was interesting is
[17:33] <Azrael> we set min_size to 2 and then bam no more crashing
[17:34] <sage> interesting! ok. any logs or backtraces will help.
[17:34] <Azrael> sure thing
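The workaround Azrael describes is a one-liner per pool; a sketch with 'data' as a placeholder pool name:
    # require at least two live replicas before a PG will serve I/O
    ceph osd pool set data min_size 2
    # verify
    ceph osd pool get data min_size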
[17:35] <saras> sage: is there a build of ceph that will run on a RASPBERRY-PI? a very current one?
[17:36] <sage> debian-cuttlefish is 0.61.. not sure about debian-testing
[17:36] <sage> saras: no idea :)
[17:36] <tnt> I seem to have some 'damaged' rgw bucket that radosgw-admin doesn't want to delete, where should I go destroy them ?
[17:40] <saras> sage: will your debian build run on arm
[17:41] <saras> arm11
[17:42] <mikedawson> glowell: Are you guys still planning on building/packaging Qemu with Josh's patch? http://tracker.ceph.com/issues/4834
[17:45] <glowell> Yes. The past week was focused on getting cuttlefish out the door, but now that's done, I'll be working on qemu packaging again.
[17:45] <mikedawson> glowell: thank you!
[17:47] * xevwork (~xevious@6cb32e01.cst.lightpath.net) has left #ceph
[17:53] <saras> <-- sad panda. i think ceph just broke my heart. it was a short love story. no arm... maybe we can still be friends
[17:55] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:59] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:01] <paravoid> sage: hey
[18:03] * sagelap (~sage@2600:1012:b012:10bc:ecbd:8495:bfcd:7a58) has joined #ceph
[18:05] <saras> saras: ceph 0.47.2-1 how old is this
[18:05] * mohits (~mohit@122.179.86.1) Quit (Ping timeout: 480 seconds)
[18:06] <saras> lol now talking to my self
[18:06] <paravoid> sagelap: hey
[18:06] <paravoid> you pinged yesterday?
[18:08] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[18:08] <saras> sage: ceph 0.47.2-1 -- how big an issue would learning ceph on a version that old be? put another way: if i build salt management around that version, how badly will it break when i move to a modern version?
[18:10] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) has joined #ceph
[18:13] * fabioFVZ (~fabiofvz@213.187.20.119) Quit (Quit: see you)
[18:16] * tnt (~tnt@91.177.240.165) has joined #ceph
[18:16] * aliguori (~anthony@66.187.233.207) Quit (Remote host closed the connection)
[18:17] * alram (~alram@38.122.20.226) has joined #ceph
[18:19] * dwt (~dwt@128-107-239-233.cisco.com) has joined #ceph
[18:19] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) Quit (Ping timeout: 480 seconds)
[18:21] * mohits (~mohit@122.172.182.235) has joined #ceph
[18:23] * rustam (~rustam@94.15.91.30) has joined #ceph
[18:25] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[18:32] <sagewk> paravoid: yeah, can you take a look at wip-suppress?
[18:37] <paravoid> I will
[18:37] <paravoid> I have a bit more important issues right now :)
[18:39] * fghaas (~florian@91-119-65-118.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[18:40] <sagewk> thanks ;)
[18:41] * dwt (~dwt@128-107-239-233.cisco.com) Quit (Ping timeout: 480 seconds)
[18:42] * loicd (~loic@90.84.144.25) has joined #ceph
[18:44] * brady (~brady@rrcs-64-183-4-86.west.biz.rr.com) has joined #ceph
[18:50] * rturk-away is now known as rturk
[18:51] * Tamil (~tamil@38.122.20.226) has joined #ceph
[18:54] <sagewk> hi
[18:55] <sagewk> saras: 0.47 is pre-argonaut.. so, about 1 year old.
[18:55] <sagewk> ancient! :)
[18:56] <saras> sagewk: i know; that's the newest version i could find that should run on the PI
[18:57] <saras> so how big an issue would it be to move the management stuff from that to .61
[18:57] <saras> or do you have newer armhf builds
[18:58] <saras> that is the build that debian has
[18:58] <saras> 0.47.2-1
[18:58] * sagelap (~sage@2600:1012:b012:10bc:ecbd:8495:bfcd:7a58) Quit (Ping timeout: 480 seconds)
[18:59] * iggy_ (~iggy@theiggy.com) has joined #ceph
[19:02] * rturk is now known as rturk-away
[19:02] <sagewk> if you build it yourself i suspect it will just work
[19:03] <sagewk> but we don't do arm builds (yet!)
[19:03] <saras> what the normal build time look like
[19:04] * Ifur (~osm@hornbill.csc.warwick.ac.uk) has joined #ceph
[19:05] * rturk-away is now known as rturk
[19:06] <saras> sagewk: is there much change from .43 to .47
[19:06] <sagewk> on amd64 it's ~10 minutes
[19:06] <sagewk> tons
[19:07] <sagewk> cross-compile, if you can :)
[19:07] <saras> oh well, going to try to compile it anyway
[19:08] <saras> wish me luck
[19:08] <sagewk> good luck!
[19:08] <sagewk> let us know how long it takes :)
[19:08] <cjh_> was the ceph osd scrub command removed from cuttlefish? it says unknown command scrub
[19:09] * gaveen (~gaveen@175.157.180.157) Quit (Quit: Leaving)
[19:10] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) has joined #ceph
[19:11] * sagewk (~sage@2607:f298:a:607:c54c:c777:f195:239a) Quit (Remote host closed the connection)
[19:13] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[19:15] * sjusthm (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[19:16] * portante (~user@66.187.233.206) has joined #ceph
[19:17] <Ifur> i see that suse has gone on board with ceph on its cloud platform, but how is cephfs on suse? i'm thinking of building a test setup for cephfs with SLES in an HPC environment with libsdp for added performance.
[19:18] * sagewk (~sage@2607:f298:a:607:10e3:380:4dd2:33c) has joined #ceph
[19:18] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: On the other hand, you have different fingers.)
[19:19] * nhm (~nhm@65-128-150-185.mpls.qwest.net) has joined #ceph
[19:20] <berant> Hi. can anyone give any recommendations on how to determine why my data pool thinks it has 10x more data than it actually *should* (even including replication)
[19:20] <nwl> ifur: SuSE are using us with their OPenstack solution but it doesn't use CephFS
[19:21] <nwl> ifur: cephfs on suse is the same as on other platforms, usable but in need of more testing and QA
[19:21] * loicd (~loic@90.84.144.25) Quit (Quit: Leaving.)
[19:22] <Ifur> I guess a more prudent question to ask is whether it should be easy to get the kernel drivers working on SLES or if I'd have to beat it into submission
[19:22] <scuttlemonkey> berant: how are you coming to that conclusion?
[19:24] <berant> scuttlemonkey: 'ceph df' shows 30TB used when there is only 18Tb physical. as well as the 'ceph -w' output
[19:24] <berant> ls -lh within cephfs root only shows ~3.5TB
[19:24] <tnt> berant: do you have several OSDs on the same filesystem ?
[19:25] <berant> tnt: each has its own physical drive
[19:25] <scuttlemonkey> ^^
[19:26] <scuttlemonkey> whoops, wrong window
[19:26] <scuttlemonkey> berant: hrm, that seems strange
[19:26] <scuttlemonkey> can you pastebin the ceph -s output?
[19:27] * saras (~kvirc@74-61-8-52.war.clearwire-wmx.net) Quit (Ping timeout: 480 seconds)
[19:27] <berant> scuttlemonkey: sure, here it is: http://pastebin.com/CBBSaqxJ
[19:27] <jmlowe> sage: added comment to 4927, wip_split_upgrade finally built for quantal and I was able to restart my osd's early this morning
[19:29] <berant> scuttlemonkey: I sent an email to the mailing list earlier in the week that had further output titled "Cluster unable to finish balancing"
[19:29] <scuttlemonkey> berant: the overall space reporting appears correct...but the 31607 GB Data is odd, lemme see where that pulls from
[19:29] <berant> scuttlemonkey: thanks
[19:29] <berant> all was well up until a drive failure, and then two OSD crashes during that recovery
[19:33] * sagewk (~sage@2607:f298:a:607:10e3:380:4dd2:33c) Quit (Quit: Leaving.)
[19:36] <sagelap> jmlowe: excellent, thanks for testing! we'll make an 0.61.1 shortly
[19:37] <cjh_> why is the tcp rcvbuf disabled by default?
[19:38] <cjh_> i would think you would want to match that to what ethtool -g interface says is your buffer right?
[19:38] <sjusthm> jmlowe: good to hear it worked
[19:38] <sagelap> because we're not sure that it won't have unexpected effects. all we know is that in certain cases it definitely helps
[19:38] <cjh_> ok
[19:38] <cjh_> maybe since i have 10Gb nics all around it would help?
[19:38] <sagelap> (fixing the size and disabling the kernel's autotuning helps, that is)
[19:38] <sagelap> most likely
[19:39] <cjh_> ok i'll give that a shot
[19:39] <sagelap> let us know. we probably do need to enable it by default...
[19:39] <cjh_> i set my osd op threads to 64 and that helped peg more drives in my jbod but i think i can squeeze more out of it
[19:39] <cjh_> i'm getting about 2GB/s throughput for the entire cluster when i add up all the rados bench output
[19:40] <cjh_> my theoretical limit is 7.5GB/s when i take 10Gb/3 for my 3 replica count and multiply by my hosts
[19:40] * LeaChim (~LeaChim@2.222.208.16) Quit (Ping timeout: 480 seconds)
[19:41] <sagelap> how many drives per host?
[19:41] <cjh_> 12 3TB drives per host
[19:41] <cjh_> 20 hosts
[19:42] <cjh_> the hosts are pretty powerful. i'm only at 50% cpu usage and about 50% disk usage from what i can see
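A rough sanity check of that ceiling, assuming 20 hosts with one 10GbE link each (~1.25GB/s) and 3x replication: every client byte is written three times across the cluster, so aggregate client write bandwidth tops out near 20 × 1.25 / 3 ≈ 8.3GB/s before journal and protocol overhead, which is consistent with a ~7.5GB/s working figure.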
[19:44] * jjgalvez (~jjgalvez@cpe-76-175-30-67.socal.res.rr.com) has joined #ceph
[19:45] * noahmehl (~noahmehl@mobile-198-228-211-058.mycingular.net) has joined #ceph
[19:46] <scuttlemonkey> berant: can you pastebin the ceph df as well?
[19:46] * dwt (~dwt@128-107-239-234.cisco.com) has joined #ceph
[19:47] * sagewk (~sage@2607:f298:a:607:10e3:380:4dd2:33c) has joined #ceph
[19:48] <berant> scuttlemonkey: Here you go: http://pastebin.com/ZMWtfGge
[19:48] <paravoid> sagewk: so, today's incident: OSDs memory leaking (1), OOM killing them, peering taking an hour to finish with slow requests all over the place (2), two pgs were left stuck unclean (plain active) with osd out/in fixing it (3)
[19:49] * LeaChim (~LeaChim@94.15.192.184) has joined #ceph
[19:49] <nhm> cjh_: I remember we were talking about your cluster a little while back. Did I make some kind of recommendation? :)
[19:49] <paravoid> peering takes a long time in general, I can't even restart an OSD without affecting cluster operations for minutes
[19:49] <cjh_> nhm: i believe so. i'm still tuning it
[19:50] <cjh_> i upgraded to cuttlefish and saw a small improvement in performance
[19:50] <cjh_> btrfs made a huge difference but i don't know if i'm ready to use it for real yet
[19:50] <paravoid> and I have two 2013-05-08 14:51:49.696411 osd.4 [INF] MALLOC: + 15129030656 (14428.2 MB) Bytes in page heap freelist
[19:50] <paravoid> 2013-05-08 14:53:38.211083 osd.113 [INF] MALLOC: + 13321252864 (12704.1 MB) Bytes in page heap freelist
[19:50] <paravoid> two OSDs with leaked memory right now
[19:51] <sagewk> hmm! ok that's not good
[19:51] <sjusthm> paravoid: version?
[19:51] <paravoid> bobtail
[19:51] <paravoid> mons/radosgw are 0.56.6, the rest are 0.56.4 with the exceptions of a few OSDs that restarted because of the leak
[19:52] <paravoid> I can't really restart them to upgrade without killing the cluster, so... :)
[19:52] <nhm> cjh_: yeah, btrfs benchmarks really nicely. :)
[19:52] * Havre (~Havre@2a01:e35:8a2c:b230:94b:4c37:1cf6:c28d) Quit ()
[19:52] <cjh_> nhm: it's incredible what a difference it made. +30% at least
[19:53] <nhm> cjh_: Yeah, it does that. :)
[19:53] <nhm> cjh_: though you may find it degrades over time.
[19:53] <cjh_> yeah i'm seeing that
[19:53] <nhm> cjh_: don't have any recent data on how bad it gets.
[19:53] <cjh_> it was 2.5GB/s last week. now it's 2GB/s this week. it's weird how it does that
[19:55] <nhm> cjh_: xfs write performance should have improved quite a bit with cuttlefish if you were upgrading from bobtail.
[19:56] <cjh_> ok cool. i was hoping that was the case
[19:56] * sagelap (~sage@2607:f298:a:607:ea03:9aff:febc:4c23) Quit (Quit: Leaving.)
[19:56] * eegiks (~quassel@2a01:e35:8a2c:b230:998e:383e:ebba:8709) has joined #ceph
[19:56] <cjh_> sagewk: adding the tcp rcvbuf into the config I picked up 1GB/s on the cluster
[19:56] <sagewk> nice. what did you set it to?
[19:56] <cjh_> 4096
[19:56] <nhm> cjh_: interesting!
[19:56] <cjh_> same as my ethtool says
[19:56] <sagewk> try setting it to 64 or 128k
[19:56] <cjh_> ok lets see
[19:57] <nhm> cjh_: was that 1GB/s for reads or writes and from the client perspective or total aggregate throughput?
[19:57] <cjh_> nhm: total aggregate
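A sketch of the setting being tested here, assuming the option name used in this era ("ms tcp rcvbuf", in bytes) and that it belongs in the global section; the value itself is something to benchmark rather than copy:
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [global]
            ms tcp rcvbuf = 65536
    EOF
    # restart the daemons for the new buffer size to take effect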
[20:01] <jmlowe> sjusthm: there is something slightly unusual, I seem to be doing some recovery as part of regular scrubbing " recovering 15E o/s, 15EB/s"
[20:02] <sjusthm> that's probably not real
[20:02] <jmlowe> sjusthm: is that just some cleanup?
[20:02] <sjusthm> no, I mean there probably isn't any actual recovery happening
[20:02] <sjusthm> more likely something funky with the stats reporting
[20:05] <jmlowe> sjusthm: my data is dropping, I didn't get all of the osd's started at once and it did some rebalancing before they all came up, I'm nearing the amount of data that I had before all of this
[20:05] <sjusthm> oh, that's good then
[20:05] <jmlowe> recovery possibly related to removing extraneous replicas?
[20:06] <sjusthm> possibly
[20:06] <sjusthm> I don't remember clearly what gets counted for that
[20:06] <paravoid> so, anything I should do to collect info about that memleak before I restart those osds?
[20:07] * drokita (~drokita@199.255.228.128) has joined #ceph
[20:10] <cjh_> sagewk: looks like 128K on the rcvbuf reduced the throughput
[20:10] <sagewk> weird.. i would have expected the opposite
[20:10] <cjh_> yeah i know
[20:10] <cjh_> the hosts say they're getting about 86MB/s vs 96-101MB/s before
[20:11] <cjh_> are there any levers i can pull to crank the cpu's harder?
[20:11] <cjh_> i know they can do more work than they're doing
[20:11] <sagewk> osd op threads = something bigger than the default of 2
[20:11] <cjh_> i have it set to 64
[20:12] <cjh_> and disk threads to 16
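The thread settings cjh_ mentions live in the OSD section of ceph.conf; a sketch of the placement (the defaults noted are the documented ones for this era, treat them as assumptions):
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd]
            osd op threads = 64     ; default 2
            osd disk threads = 16   ; default 1
    EOF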
[20:13] <cjh_> looks like from the logs that aio is disabled on my btrfs drives
[20:14] * hverbeek (~Adium@p2003006F8E07940089F9E6809194703C.dip0.t-ipconnect.de) has joined #ceph
[20:14] * hverbeek (~Adium@p2003006F8E07940089F9E6809194703C.dip0.t-ipconnect.de) Quit ()
[20:15] * fidadud (~oftc-webi@p4FC2C84B.dip0.t-ipconnect.de) has joined #ceph
[20:15] <cjh_> does aio mode go under the general sections header?
[20:15] <cjh_> or is there a journal header also?
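As far as I can tell there is no separate journal section; journal settings are plain OSD options, so a hedged sketch of the likely placement (whether aio helps on a btrfs-backed journal is a separate question):
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd]
            ; journal options go under [osd] (or a specific [osd.N]); there is no [journal] header
            journal aio = true
    EOF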
[20:17] <scuttlemonkey> berant: ok, the consensus is that the data number is actually reporting object size
[20:17] * dxd828_ (~dxd828@host-2-97-78-18.as13285.net) has joined #ceph
[20:17] <scuttlemonkey> and the objects may be sparse
[20:18] <berant> wouldn't cephfs 'ls -lh' report closer to that number then?
[20:18] <scuttlemonkey> I filed a doc bug for John to expound on how those numbers are generated a bit
[20:18] <scuttlemonkey> http://tracker.ceph.com/issues/4948
[20:19] <berant> hmmm, so it's somewhat of a red-herring and not related to my issue of recovery/rebalancing not finishing?
[20:20] <scuttlemonkey> I'm a little out of my depth trying to read a non-interpreted language
[20:20] <scuttlemonkey> but I can share what sage said
[20:20] <scuttlemonkey> sagewk objects only consume space that isn't a hole/zeros
[20:20] <scuttlemonkey> sagewk and the "data NN" value is the sum of sizes, not non-zero allocated bytes
[20:20] <scuttlemonkey> sagewk it's the same for file systems, except they report bytes allocated and not the sum of file sizes
[20:21] <scuttlemonkey> yeah, my guess is the rebalancing issue is separate
[20:21] <berant> I'm puzzled as to why my drives are so full when I don't have that much data. Prior to my original drive failing I was doing a size 3 on both data and rbd and had plenty of space, now using size 2 I have 12% to rebalance and my drives are averaging around 70% full
[20:21] <berant> you're out of your depth, well I'm screwed ;)
[20:22] <scuttlemonkey> haha
[20:22] <scuttlemonkey> keep in mind I'm just a community monkey :P
[20:22] <scuttlemonkey> not a dev
[20:24] <berant> well I appreciate all your help!
[20:24] <scuttlemonkey> sure, for a better look it might be worth poking joao with a stick when he has time
[20:25] <berant> do you know if there is a way to stop balancing etc while still keeping OSDs 'up/in' so I could connect a client to cephfs and copy the data out?
[20:25] <berant> then I could nuke the data pool
[20:25] <berant> (presuming the data in cephfs is actually still fine)
[20:25] <janos> scuttlemonkey: your efforts are still appreciated
[20:26] <scuttlemonkey> berant: that I don't know...haven't ever tried
[20:26] <scuttlemonkey> janos: thanks :)
[20:26] <berant> janos: agreed
[20:27] <berant> I'm tempted to attempt to wait it out and see if the rebalancing can complete without all the drives filling, just don't get why it would need so much extra space to rebalance
[20:28] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[20:30] <paravoid> sjustlaptop: tips on identifying the memory leak?
[20:30] <sjustlaptop> paravoid: if you can get the heap profiler working, that would be a good start
[20:31] <paravoid> I did get it somewhat working
[20:31] <sjustlaptop> paravoid: actually, can you describe the events leading up to the leak?
[20:31] <paravoid> nothing in particular
[20:31] <paravoid> it just happened
[20:31] <sjustlaptop> I have a cause in mind, but it would require lots of pg movement
[20:31] <paravoid> I have two OSDs in this case
[20:32] <sjustlaptop> but if you didn't see that, then it must be something else
[20:32] <paravoid> so, there are indications in the logs that 2/3 of the cluster was marked down again
[20:32] <sjustlaptop> hmm
[20:32] <sjustlaptop> and only two nodes are seeing the leak?
[20:32] <paravoid> but I think there was a memory leak/OOM before that
[20:32] <paravoid> that led up to this
[20:32] <paravoid> still going through the details
[20:32] <paravoid> but now, yes, only two OSDs have a leak
[20:32] <sjustlaptop> do you think you can reproduce it?
[20:33] <paravoid> 12G & 14G of ram
[20:33] <Fetch> I'm unable to attach a nova client (openstack) to a cinder volume using rbd backend. Volume is created without problem, attach is giving error libvirtError: internal error unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-virtio-disk1' could not be initialized . Anybody in channel seen this with Ceph + OpenStack ?
[20:33] <paravoid> but I don't think they're leaking anymore
[20:33] <paravoid> they've been fairly stable for hours
[20:34] * jeffv (~jeffv@2607:fad0:32:a02:d932:1024:4074:3426) Quit (Quit: Leaving.)
[20:34] <paravoid> sjustlaptop: https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&h=ms-be1001.eqiad.wmnet&m=cpu_report&s=by+name&mc=2&g=mem_report&c=Ceph+eqiad
[20:35] <sjustlaptop> 75cb55b4e2791ce299cb05fe6ab224b86145a5b6
[20:35] <sjustlaptop> is the bug I am thinking of
[20:35] <paravoid> so yeah, definitely before all the osd out
[20:35] <paravoid> (that was around ~12:00)
[20:35] * jeffv (~jeffv@2607:fad0:32:a02:f5ab:6bed:610e:8166) has joined #ceph
[20:35] <sjustlaptop> there were no mark downs prior to that?
[20:36] <paravoid> no
[20:36] <sjustlaptop> ok
[20:36] <sjustlaptop> do you have a pdf of the heap dump?
[20:36] <paravoid> no
[20:36] <paravoid> so I have a very limited understanding of the heap profiler
[20:36] <sjustlaptop> one sec
[20:36] <paravoid> will it dump pre-existing leaks?
[20:37] <paravoid> it's not leaking anymore and the profiler wasn't started before
[20:37] <sjustlaptop> oh, that won't work then
[20:37] <sjustlaptop> hmm
[20:37] <paravoid> thought so...
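For next time: the profiler only records allocations made while it is running, so it has to be armed before the leak grows. A sketch, assuming osd.4 as above; the dump location and the pprof invocation are assumptions based on the default log directory and google-perftools:
    ceph tell osd.4 heap start_profiler
    # ...wait for RSS to climb...
    ceph tell osd.4 heap dump
    ceph tell osd.4 heap stop_profiler
    # dumps land next to the daemon logs, e.g. /var/log/ceph/osd.4.profile.0001.heap,
    # and can be rendered with google-perftools:
    pprof --pdf /usr/bin/ceph-osd /var/log/ceph/osd.4.profile.0001.heap > osd.4-heap.pdf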
[20:37] * Oliver1 (~oliver1@ip-178-201-147-182.unitymediagroup.de) has joined #ceph
[20:38] <sjustlaptop> can I see your ceph.log for the period covered by that graph?
[20:40] <sjustlaptop> paravoid: are all pgs active+clean?
[20:41] <paravoid> they are
[20:41] <nyerup> Azrael: Where do we have the dump and the logs? :)
[20:41] <paravoid> and ceph.log just has the pgmap
[20:41] <paravoid> 2013-05-08 10:00:01.028948 7f8ccdac5700 0 log [INF] : pgmap v6820976: 16760 pgs: 16737 active+clean, 23 active+clean+scrubbing+deep; 42187 GB data, 130 TB used, 131 TB / 261 TB avail; 0B/s rd, 1383KB/s wr, 138op/s
[20:41] <paravoid> fairly idle
[20:45] * JaksLap (~chris@2a03:9600:1:1:7987:c8ed:2619:1a3d) Quit (Ping timeout: 480 seconds)
[20:52] <fidadud> paravoid: huhu 261TB avail - how many drives?
[20:53] <kylem> hello all. I'm having trouble with my MDS servers. I have two of them and they both crash almost instantly after I start them. sometimes the second one will stay alive for a while. but seems to crash again if i start the first one. I just want one with a hot backup. Here is my conf: http://pastebin.com/aGA90vn9 .
[20:53] <paravoid> 12 boxes * 12 spindles * 2TB
[20:53] <kylem> also i'm running 0.61
[20:54] <fidadud> paravoid: ah ok 3,5" we're at 2,5" only so stuck at 1TB per drive
[20:56] <kylem> also here is the last error i see in mds0's log
[20:56] <kylem> 0> 2013-05-08 11:50:20.936769 7f39c9fc6700 -1 *** Caught signal (Aborted) **
[20:56] <kylem> in thread 7f39c9fc6700
[21:00] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Read error: Operation timed out)
[21:00] * noahmehl (~noahmehl@mobile-198-228-211-058.mycingular.net) Quit (Read error: Connection reset by peer)
[21:00] * BMDan (~BMDan@74.121.199.170) has joined #ceph
[21:01] * saras (~kvirc@74-61-8-52.war.clearwire-wmx.net) has joined #ceph
[21:01] <saras> http://paste.ubuntu.com/5645584/ this is how far i've got so far
[21:04] <BMDan> I have a rogue piece of configuration sneaking in somewhere, I think. It's not in my ceph.conf anywhere, but I'm getting this on osd startups:
[21:04] <BMDan> df: `/var/lib/ceph/osd/ceph-1/': No such file or directory
[21:04] <BMDan> df: no file systems processed
[21:05] <BMDan> Which is true; that directory isn't in use anywhere. The OSD data directory is defined to be /opt/data/$name. And, indeed, the OSDs actually run like champs; they're just noisy on startup.
[21:07] <sjustlaptop> saras: distro?
[21:08] <sjustlaptop> saras: oops, it's graphviz
[21:08] <sjustlaptop> not dot
[21:14] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) has joined #ceph
[21:15] * yehuda_ (~yehuda@2602:306:330b:1410:8595:8c8e:abfc:6738) has joined #ceph
[21:15] <saras> debian
[21:15] <saras> wheezes
[21:15] <saras> armhf
[21:16] * dpippenger (~riven@cpe-76-166-221-185.socal.res.rr.com) Quit (Quit: Leaving.)
[21:17] <saras> sjustlaptop: it's armhf that the fun bit
[21:17] <sjustlaptop> you are on arm?
[21:18] <nhm> arm should be fun
[21:19] <saras> sjustlaptop: yes the box building on is arm
[21:19] <sjustlaptop> ah, that should be entertaining
[21:22] * drokita (~drokita@199.255.228.128) Quit (Ping timeout: 480 seconds)
[21:23] * diegows (~diegows@200.68.116.185) has joined #ceph
[21:24] * Tamil (~tamil@38.122.20.226) Quit (Quit: Leaving.)
[21:25] * Tamil (~tamil@38.122.20.226) has joined #ceph
[21:28] * netmass (~netmass@69.199.86.242) has joined #ceph
[21:28] <BMDan> Okay, tracked down the problem. It's actually in the init scripts.
[21:28] * rturk is now known as rturk-away
[21:28] <BMDan> Anyone in particular I should pester about that, or just post a bug as normal?
[21:29] <dmick> BMDan: that path is the default path
[21:29] <dmick> presumably the init scripts aren't checking the .conf before trying?
[21:30] <BMDan> Correct; line 313 of /etc/init.d/ceph
[21:30] <saras> http://paste.ubuntu.com/5645677/ more fun
[21:30] <BMDan> Surrounded by get_conf's that DO properly check it, amusingly enough.
[21:30] * jfchevrette (~jfchevret@modemcable208.144-177-173.mc.videotron.ca) has joined #ceph
[21:30] <BMDan> saras: That's just you missing the autoconf tools.
[21:31] <saras> kool
[21:33] <netmass> <Not sure of IRC protocol>... do I just butt in and ask a question?
[21:34] <darkfader> you did it just right. wait for a moment of silence then hi and ask
[21:34] <BMDan> netmass: Ideally, you send us each $1 by PayPal first.
[21:34] <netmass> Will you take more?
[21:34] <BMDan> No, any more would be gaudy.
[21:34] <netmass> Ahh.. :)
[21:36] <netmass> I had a test cluster working a few months ago and dropped it. I just restarted creating a brand new test cluster on VMs and have hit a roadblock with the (new) ceph-deploy. Looks like the step ceph-deploy mon create seems to work in that the monitor starts. However, the gatherkeys step fails. Looks like the keys are not being properly created.
[21:36] <netmass> Before going further... I thought I would ask if this is a well-known problem or whether I screwed up early on in the process.
[21:37] <dmick> BMDan: not my init.d/ceph :) which version you got?
[21:37] <netmass> Cuttlefish running on ubuntu 12.04 with upgraded kernel (3.9)
[21:37] <dmick> netmass: what error are you getting
[21:38] <netmass> Unable to find /etc/ceph/ceph.client.admin.keyring on ['ceph01']
[21:38] <BMDan> @dmick: line 313 is defaultweight=`df /var/lib/ceph/osd/ceph-$id/ | tail -1 | awk '{ d= $2/1073741824 ; r = sprintf("%.2f", d); print r }'`
[21:38] <cephalobot> BMDan: Error: "dmick:" is not a valid command.
[21:39] <BMDan> dmick: ii ceph 0.61-1precise distributed storage and file system
[21:40] * drokita (~drokita@199.255.228.128) has joined #ceph
[21:40] <dmick> BMDan: ah, yes
[21:40] <saras> http://paste.ubuntu.com/5645699/ how worried should i be
[21:40] <dmick> if you don't mind filing a bug that would be great
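A hedged sketch of the kind of fix the bug report asks for: look the OSD data path up from the config (falling back to the packaged default) before running df, instead of hard-coding /var/lib/ceph. The helper invocation and variable names are assumptions in the init script's style, not the actual patch:
    # roughly where /etc/init.d/ceph line 313 computes defaultweight
    osd_data=$(ceph-conf -c "$conf" --name "osd.$id" "osd data" 2>/dev/null)
    [ -n "$osd_data" ] || osd_data="/var/lib/ceph/osd/${cluster:-ceph}-$id"
    defaultweight=$(df "$osd_data" | tail -1 | awk '{ d = $2/1073741824; printf "%.2f\n", d }')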
[21:40] <dmick> netmass: hm
[21:41] <dmick> netmass: and you did the mon create on ceph01, and no errors?
[21:43] <netmass> dmick: I added the -v flag when creating and it did not return any errors. The monitor is running on ceph01. However, there are no keys in /etc/ceph nor in /var/lib/ceph/bootstrap*. There is a keyring in /var/lib/ceph/mon/ceph-ceph01.
[21:44] <netmass> dmick: hm? (checked IRC slang and didn't find it...)
[21:44] <pioto> so, i have a test cluster i've upgraded from, i think, 0.60 to 0.61... and now, every osd i've restarted is failing to come back up. and the cluster eventually marks it 'out'
[21:44] <pioto> any suggestions on where to start looking?
[21:45] <pioto> i also found that my mons wouldn't talk to each other until they were all upgraded and restarted
[21:45] <dmick> netmass: just me muttering confusedly to myself
[21:46] <dmick> netmass: ah, sorry: ceph-deploy admin <host> prepares it to be a cluster administrator
[21:47] <netmass> Yes... tried that. However, it gives error: RuntimeError: ceph.client.admin.keyring not found
[21:48] <netmass> BTW... if it makes a difference, I am running ceph-deploy on a standalone "management host" that does not have ceph installed. Just ceph-deploy... is this a no-no? I tried with ceph installed and it didn't appear to make a difference.
[21:50] * drokita (~drokita@199.255.228.128) Quit (Quit: Leaving.)
[21:51] <dmick> I am, just this moment, having trouble figuring out what creates that keyring
[21:52] <dmick> it should be ok not to have ceph on the deploy-master-host
[21:53] * imjustmatthew (~imjustmat@c-24-127-107-51.hsd1.va.comcast.net) Quit (Remote host closed the connection)
[21:54] * eschnou (~eschnou@54.120-201-80.adsl-dyn.isp.belgacom.be) has joined #ceph
[21:55] * drokita (~drokita@199.255.228.128) has joined #ceph
[21:56] <dmick> ok. so the client.admin key is created by the ceph-create-keys process, triggered from upstart/sysvinit
[21:56] <dmick> do you have one of those still running on the mon host
[21:56] <saras> i have libuuid1, not libuuid; this is an issue in the ./configure setup. where should i make the change?
[21:56] <netmass> dmick: no... but I have also not "restarted" the mon host. Should I do that now?
[21:57] <dmick> no, it should happen based on the mon startup
[21:57] <dmick> what happens if you sudo start ceph-create-keys on the mon host
[21:59] <netmass> dmick: OK... got a clue... 'find / -name "*ceph-create-keys*" -print' showed this file: /var/crash/_usr_sbin_ceph-create-keys.0.crash
[22:00] <dmick> is that the XML thingie that has a pile of info about the crash?...either way, yes, sounds interesting
[22:00] <dmick> you could try running it directly too
[22:01] * mohits (~mohit@122.172.182.235) Quit (Read error: Connection reset by peer)
[22:01] <dmick> /usr/sbin/ceph-create-keys --cluster="<cluster>" -i "<id>" where 'cluster' defaults to ceph if you didn't change it and 'id' defaults to 'hostname'. As root, on the mon host
[22:01] * mohits (~mohit@122.172.182.235) has joined #ceph
[22:03] <netmass> dmick: Here are the two files of interest: http://paste.ubuntu.com/5645754/plain/ and http://paste.ubuntu.com/5645762/plain/
[22:03] <dmick> hm. I may have a LP account
[22:04] * drokita (~drokita@199.255.228.128) Quit (Ping timeout: 480 seconds)
[22:04] <dmick> ok, got it.
[22:05] * aliguori (~anthony@12.151.150.4) has joined #ceph
[22:05] * bergerx_ (~bekir@78.188.101.175) has left #ceph
[22:05] <netmass> dmick: OK... running your command as root seemed to do the trick. The ceph.client.admin.keyring was created and I was then able to run the "gatherkeys" operation from the management host and it said it got the key.
[22:06] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[22:06] <dmick> looks like, for whatever reason, when ceph-create-keys ran, the mon wasn't ready
[22:06] <dmick> had you done the sudo start ceph-create-keys?
[22:07] * dpippenger (~riven@206-169-78-213.static.twtelecom.net) has joined #ceph
[22:08] <dmick> pioto: that doesn't sound right
[22:08] <netmass> dmick: Checking directions here: http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/
[22:09] <netmass> dmick: I did not run that command manually
[22:09] <dmick> right; I was asking you to here to help diagnose the problem
[22:09] <dmick> but, eh. This seems like a race, maybe
[22:09] <dmick> not certain
[22:09] * doubleg (~doubleg@69.167.130.11) has joined #ceph
[22:10] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:10] <dmick> ceph-create-keys is supposed to keep retrying if it can't (yet) talk to the mon
[22:10] <saras> dmick: libuuid is used somewhere in the configure file
[22:11] <dmick> netmass: but this was different; it talked to the mon, but got back a bad response
[22:12] <dmick> netmass: I'm going to file a bug in case we see this again
[22:13] * fidadud (~oftc-webi@p4FC2C84B.dip0.t-ipconnect.de) Quit (Quit: Page closed)
[22:13] <BMDan> dmick: http://tracker.ceph.com/issues/4951 filed
[22:13] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[22:14] <dmick> netmass: http://tracker.ceph.com/issues/4952
[22:14] <dmick> BMDan: tnx
[22:15] <BMDan> o7
[22:17] <dmick> netmass: Sage says we may have found that and fixed it in the cuttlefish patch branch
[22:17] <dmick> 30ffca77df006a244044604074779af538721f14' ceph: return error code when failing to get result from admin socket
[22:18] <netmass> dmick: Awesome... thanks for looking into this. I think the manual fix has me so I can move forward!
[22:18] <sagewk> mikedawson: ping
[22:18] <mikedawson> yessir
[22:18] <dmick> netmass: if you could possibly reproduce and test that would be cool, but I know that might be difficult
[22:18] <sagewk> finally untarred this huge tar.bz2 :)
[22:18] <sagewk> and it looks like it was 35G before and 35G after?
[22:19] <mikedawson> ha. I didn't let it run too long... getting pretty tight on partition space
[22:19] <netmass> dmick... I can actually try pretty easily. I did a snapshot on all VMs just before starting the process. I will revert them and follow the steps exactly like before. It has happened at least twice... I'll have to let you know...
[22:20] <sagewk> it's small now though, right?
[22:20] <saras> what is libtoolize
[22:20] <mikedawson> sagewk: after my miracle compact this am, I am steady at ~285MB and compact is keeping up. It seems that the compact gets less effective under some unknown condition
[22:20] <sagewk> the ideal smoking gun is start small + trace -> end big
[22:22] <sagewk> i'm not sure that the start big + trace will be very helpful :(
[22:23] * BMDan is intrigued... what is this "compact" of which you speak?
[22:23] <BMDan> I am not familiar with this protocol feature, and a brief Googling doesn't enlighten me.
[22:23] <mikedawson> sagewk: if you look back on cephdrop, I had another from last week. mikedawson-ceph-mon.a-tdump.tar.bz2
[22:24] <buck> I just installed from next (with ceph-deploy) and I'm still seeing the issue from 4924 (gatherkeys fails after mon create). The build was 0.61-113-g61354b2-1precise
[22:24] * dxd828_ (~dxd828@host-2-97-78-18.as13285.net) Quit (Quit: Computer has gone to sleep.)
[22:24] <sagewk> that was before we fixed the background thread problem, though, which breaks compaction
[22:24] <mikedawson> sagewk: also, matt_ was at 20GB and still growing, so perhaps he could lend a hand
[22:24] <mikedawson> ahh
[22:24] <jmlowe> BMDan: leveldb compact
[22:24] <sagewk> buck: is there an asok file in /var/run/ceph?
[22:24] <dmick> netmass: if you're not aware, you can get packages from http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/cuttlefish/
[22:25] <buck> sagewk: yes
[22:25] <sagewk> what happens if you run ceph-create-keys -n `hostname` ?
[22:25] <dmick> saras: part of libtool, which tries to abstract shared/nonshared library compilation
[22:25] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[22:25] <saras> thanks very much
[22:25] <mikedawson> sagewk: we're going to add more load, so maybe we can trigger it again. If we can, I'll get you a better sample
[22:26] <dmick> BMDan: leveldb compaction, we're messing around with that code. mikedawson has a problem cluster :)
[22:26] <saras> it was mia on my pi
[22:26] <sagewk> mikedawson: that'd be great. again, small initial copy + trace is what we're after..
[22:26] <mikedawson> dmick: I have a cluster doing real work :-)
[22:26] <sagewk> thanks!
[22:26] <dmick> I didn't say it wasn't... :)
[22:26] * Cube (~Cube@12.248.40.138) has joined #ceph
[22:27] <dmick> saras: getting Ceph built on a pi will be awesome if you make it
[22:27] <saras> i am trying
[22:27] <dmick> have you seen the Build Prerequisites section of README? That will save you some time
[22:27] <BMDan> Hmmm, I see. FWIW, I have spent, at this point, probably a man-year debugging object stores and their garbage collection/compaction routines.
[22:27] <saras> i did; there's a lot missing for the pi
[22:28] <dmick> ok. but it does point out that you have to have libtool
[22:28] <BMDan> It invariably comes down to shipping multigigabyte files around until someone manages to identify the minimal case.
[22:28] <BMDan> Just saying, might as well embrace the madness. ;)
[22:28] <dmick> BMDan: welcome to our hell :)
[22:29] * fridad (~oftc-webi@p4FC2C84B.dip0.t-ipconnect.de) has joined #ceph
[22:29] <BMDan> dmick: Au contraire! I have 96 TB in this cluster and it stores multi-gigabyte objects. LevelDB would have to get far more inefficient before it's within a couple orders of magnitude of me caring. And thus it REMAINS your hell. :P
[22:29] <saras> hum
[22:29] * dxd828_ (~dxd828@host-2-97-78-18.as13285.net) has joined #ceph
[22:30] <BMDan> More seriously, if I had insight, I'd offer it. But the reality is that the only insight I have is to encourage you to just bite the bullet and ship the big files around, rather than spending days devising a minimal case.
[22:31] <BMDan> The fundamental issue being that, in the latter case, you're blocking on a single individual making the cognitive leap to identify the problem, whereas in the former case, everyone can try to replicate the issue.
[22:31] <BMDan> </soapbox>
[22:33] <buck> sagewk: I think I'm seeing some DNS oddity. The asok created was for the hostname on our fast network whereas I'd specified the control network DNS name when I did my ceph-deploy mon create
[22:33] * berant (~blemmenes@vpn-main.ussignal.co) Quit (Ping timeout: 480 seconds)
[22:34] <sagewk> ah.. yeah
[22:34] * eschnou (~eschnou@54.120-201-80.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[22:34] <sagewk> you can do ceph-deploy mon create logicalname:dnsname
[22:34] <sagewk> instead of having it infer the logicalname from the first component of the fqdn
[22:34] <buck> sagewk: rad, I'll try that
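A sketch of the name:dnsname form sagewk describes, with placeholder hostnames; the part before the colon becomes the monitor's logical name, the part after is the address ceph-deploy actually connects to:
    ceph-deploy new mon-a:mon-a.control.example.com
    ceph-deploy mon create mon-a:mon-a.control.example.com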
[22:39] <jmlowe> sjustlaptop: I may not be out of the woods yet
[22:39] <jmlowe> sjustlaptop: I have an inconsistent pg
[22:41] <saras> dmick: you're right. when i copied that long line at the bottom of the readme it did not work; i only got some of them. weird
[22:41] <jmlowe> can this exist? active+clean+inconsistent
[22:44] * markbby1 (~Adium@168.94.245.3) has joined #ceph
[22:44] * markbby (~Adium@168.94.245.3) Quit (Remote host closed the connection)
[22:52] <sagewk> jmlowe: yeah..
[22:54] <jmlowe> any suggestions? I'm guessing it was truncated
[22:56] * BMDan (~BMDan@74.121.199.170) Quit (Quit: Leaving.)
[22:56] <sagewk> ceph.log will have the scrub error that it found..
[22:56] <sagewk> sorry, wasn't following your convo with sjustlaptop
[22:56] <jmlowe> scrub 2.17a b5a7577a/rb.0.2f9a.2ae8944a.000000000ea0/head//2 on disk size (1155072) does not match object info size (4194304)
[22:57] <jmlowe> part of a rbd device, so the 4MB size is probably correct
[22:58] <netmass> dmick: I was able to reproduce this. Please see: http://paste.ubuntu.com/5645924/plain/ for the exact keystrokes I used.
[23:01] <netmass> dmick: I also have the same "crash" files on the monitor host.
[23:02] <tnt> a pastebin where you need to be logged into launchpad to see it ... seriously ...
[23:02] <buck> sagewk: using the logical:dnsname approach for 'ceph-deploy new' did the trick. Do you think that should be necessary or should I file a ticket to try to get ceph-deploy to handle this a little more gracefully?
[23:02] <sagewk> hmm.. file a ticket. it could error out if the hostname doesn't appear to match
[23:03] <dmick> netmass: ah, right, but you have the original cuttlefish packages, not the fixed ones
[23:03] <sagewk> jmlowe: you can repair it by going in and truncating the files to the right size. not sure why that happened, though..
[23:03] <netmass> Agreed... was just trying to reproduce as you asked. Let me grab the latest and see what happens. Be right back.
[23:03] * fridad (~oftc-webi@p4FC2C84B.dip0.t-ipconnect.de) Quit (Remote host closed the connection)
[23:04] <dmick> http://ceph.com/docs/master/install/debian/#development-testing-packages with "cuttlefish" as the BRANCH
[23:04] <jmlowe> I'd rather copy the big one onto the small one than truncate the big one
[23:04] <dmick> thanks for the help
[23:05] * mohits (~mohit@122.172.182.235) Quit (Read error: Connection reset by peer)
[23:06] * mohits (~mohit@122.172.182.235) has joined #ceph
[23:11] <buck> sagewk: filed #4953 and I'll watch it in case assistance is needed to reproduce. Thanks for the help.
[23:14] <saras> anyone smart about debian? i need to ask a stupid question: how do i add a repo
[23:15] <sagewk> saras: go for it
[23:16] <saras> sagewk: my pi is using the special repo; how do i add the normal repo so i can get leveldb in
[23:16] <saras> leveldb is the only thing that i am missing
[23:16] <saras> still
[23:16] <sagewk> squeeze or wheezy?
[23:16] <saras> wheezy
[23:16] <netmass> dmick: OK... to check that I did this correctly... once installed, a version check shows "ceph version 0.61 (237f3f1e8d8c3b85666529860285dcdffdeda4c5)". Is that the testing version?
[23:17] <sagewk> should be able to pull that directly from debian's repos
[23:17] <saras> it should
[23:17] <dmick> netmass: no, that's the cuttlefish release version
[23:17] <saras> deb http://mirrordirector.raspbian.org/raspbian/ wheezy main contrib non-free rpi
[23:17] <netmass> K... let me try again. Sorry...
[23:18] <dmick> don't forget apt-get update
[23:18] <saras> dmick: i just did
[23:18] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[23:18] <saras> that is only thing in sorce.list file
[23:19] <sagewk> hmm, maybe that package isn't built for arm...
[23:19] <sagewk> dunno offhand!
[23:21] <saras> do you know how to install a .deb from the command line
[23:21] <saras> i found a download for the file it self
[23:22] <dmick> saras: actually I meant for netmass. As for yours: yeah, I'm not sure about the hierarchy of Debian vs. pi-specific package repos
[23:22] <dmick> if you have a .deb, you can install it with dpkg -i
[23:22] <jmlowe> sagewk: ok, my problem is more complicated, the files for the object are identical on the primary and secondary
[23:22] <sagewk> both the wrong size?
[23:22] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:23] <saras> dmick: thanks
[23:23] <sagewk> truncating them both to the right size will make ceph happy. but again, not sure how they got that way... is the problem you hit earlier today to blame?
[23:23] <jmlowe> matching size and md5 sums so I'd need to interpret this better "on disk size (1155072) does not match object info size (4194304)"
[23:23] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[23:23] <netmass> dmick: OK... running "ceph version 0.61-4-g4848fac (4848fac24224879bcdc4fcf202d6ab689d8d990f)" on the monitor node.
[23:24] <sagewk> ceph's internal metadata says the object is 4MB, but the file doesn't match
[23:25] <dmick> netmass: better version, yes
[23:26] <jmlowe> makes me feel slightly better, ceph repair is the thing to do?
[23:27] <netmass> dmick: Same problem... but... this log now... http://paste.ubuntu.com/5645981/plain/
[23:27] <sagewk> hmm i don't think repair will fix it in this case... but you can 'truncate <filename> 4194304' on all replicas
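A cautious sketch of that repair, using the PG and object names from the scrub error above; the paths are illustrative (the real on-disk file name carries extra hash suffixes), and it is prudent to have the OSD stopped or the cluster quiet while touching files directly:
    # on each OSD holding pg 2.17a, find the object's backing file
    find /var/lib/ceph/osd/ceph-*/current/2.17a_head/ -name 'rb.0.2f9a.2ae8944a.000000000ea0*'
    # pad it (sparsely) out to the size recorded in the object info
    truncate -s 4194304 '<path found above>'
    # then re-scrub the PG so the inconsistent flag can clear
    ceph pg deep-scrub 2.17a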
[23:27] <jmlowe> I've also got this going on "recovering 15E o/s, 15EB/s"
[23:27] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[23:28] <sjusthm> jmlowe: when do you think it last scrubbed?
[23:28] <saras> dmick: thanks very much
[23:28] <dmick> saras: np. netmass: well that's different :)
[23:29] <jmlowe> before yesterday, as soon as it popped up I started scrubbing everything
[23:29] <dmick> netmass: is the mon host 10.10.101.71 as advertised?
[23:30] <netmass> yes... static IP and all...
[23:30] <sjustlaptop> jmlowe: odd, there shouldn't have been anything in the upgrade to cause or reveal such an inconsistency
[23:30] <netmass> (and the mon is running)
[23:35] * dosaboy__ (~dosaboy@host86-161-206-107.range86-161.btcentralplus.com) Quit (Remote host closed the connection)
[23:36] <dmick> netmass: paste /var/lib/ceph/mon/ceph-ceph01/keyring?
[23:37] <netmass> [mon.]
[23:37] <netmass> key = AQBWwIpRAAAAABAA70ioE/PjMHfnn/J4OJpz1w==
[23:37] <netmass> caps mon = "allow *"
[23:37] * dosaboy (~dosaboy@host86-161-206-107.range86-161.btcentralplus.com) has joined #ceph
[23:38] <dmick> and if you run ceph --name=mon. --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring -s?
[23:38] <jmlowe> let me rephrase, the regular scrubbing schedule didn't turn it up before today, so far so good on scrubbing 1/2 of my osd's save the one pg
[23:38] <netmass> ceph --name=mon. --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring -s
[23:38] <netmass> health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
[23:38] <netmass> monmap e1: 1 mons at {ceph01=10.10.101.71:6789/0}, election epoch 1, quorum 0 ceph01
[23:38] <netmass> osdmap e1: 0 osds: 0 up, 0 in
[23:38] <netmass> pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
[23:38] <netmass> mdsmap e1: 0/0/1 up
[23:40] <jmlowe> I think the one half is done, instructed the other half to scrub
[23:40] <dmick> makes no sense. you can run with the same auth as ceph-create-keys is failing with
[23:40] <joshd> Fetch: how are you trying to attach, and are you sure you're using a qemu binary with rbd support?
[23:40] <dmick> ceph-create-keys -i ceph01 by itself does what?
[23:41] <netmass> Just FYI... I have rerun your original "create keys" command manually... it worked again... and I was able to "ceph-deploy -v gatherkeys ceph01" successfully. Prior to the run, there were no keys in /etc/ceph
[23:42] <netmass> The only oddity that I am aware of is that I am running latest kernel V3.9-raring
[23:43] <joshd> Fetch: it could be selinux denying access to /etc/ceph/ceph.conf
[23:43] <dmick> by "my original create keys command" you mean "ceph-create-keys -i ceph01" ?
[23:43] <netmass> dmick: yes
[23:45] * tnt (~tnt@91.177.240.165) Quit (Ping timeout: 480 seconds)
[23:46] <dmick> somehow ceph-create-keys couldn't get to /etc/ceph/ceph.conf, and so gave up
[23:47] <dmick> because it thus couldn't find any monitors to talk to and interpreted that failure as EPERM or EACCES (which seems a bit strange as well)
[23:47] <sagewk> dmick: review wip-4952?
[23:48] <dmick> sagewk, does http://paste.ubuntu.com/5645981/plain/ make any sense to you? (and will do)
[23:49] <buck> I have an OSD node with 3x 7200 Sata drives and 1 SSD. Is it a terrible idea to use one SSD as the journal for all 3 OSDs, knowing that that SSD failing takes all 3 OSDs in-flight writes down with it?
[23:49] <sagewk> strange
[23:49] <dmick> wip-4952 is good as far as it goes, but I'd still like to try/except the json.loads(), and log out if it raises
[23:50] <sagewk> buck: not if it's all on the same host.. a host failure would also take them down
[23:50] <sagewk> crush should just separate replicas across hosts
[23:50] * rustam (~rustam@94.15.91.30) has joined #ceph
[23:51] <buck> sagewk: cool, thanks
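A sketch of the layout buck describes: one SSD split into three partitions, each serving as the journal for one spinning-disk OSD on the host (device names are examples):
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [osd.0]
            osd journal = /dev/sdd1
    [osd.1]
            osd journal = /dev/sdd2
    [osd.2]
            osd journal = /dev/sdd3
    EOF
As sagewk notes, with CRUSH placing replicas on different hosts, losing the shared journal SSD is no worse than losing the host itself.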
[23:51] <sagewk> dmick: and retry? or exit?
[23:51] <dmick> yes :)
[23:51] <dmick> I dunno. retry is fairly cheap I guess
[23:51] <dmick> but the point is to be able to see what went wrong
[23:51] * BillK (~BillK@58-7-104-61.dyn.iinet.net.au) has joined #ceph
[23:52] <sagewk> dmick: repushed
[23:52] * rustam (~rustam@94.15.91.30) Quit (Remote host closed the connection)
[23:52] * saras (~kvirc@74-61-8-52.war.clearwire-wmx.net) Quit (Quit: KVIrc 4.1.3 Equilibrium http://www.kvirc.net/)
[23:53] <dmick> sagewk: lgtm
[23:55] <sagewk> yehudasa, sjustlaptop: should that omapgetvals fix go to cuttlefish?
[23:56] <sjustlaptop> sagewk: no big reason not to, I suppose
[23:56] <sjustlaptop> should only be triggered if an object class doesn't check for existence
[23:56] <cjh_> nhm: you still around?
[23:58] <pioto> sagewk: omapgetvals fix... i think maybe i hit that bug?
[23:59] <pioto> a failed assert... lemme see
[23:59] * newbie (~kvirc@74-61-8-52.war.clearwire-wmx.net) has joined #ceph
[23:59] * newbie is now known as saras
[23:59] <pioto> osd/PG.cc: 2909: FAILED assert(values.size() == 2)
[23:59] <pioto> is that the bug you're talking about, or something unrelated?

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.