#ceph IRC Log


IRC Log for 2012-08-10

Timestamps are in GMT/BST.

[0:29] * cpglsn (~ac@host96-174-dynamic.8-87-r.retail.telecomitalia.it) Quit (Quit: cpglsn)
[1:07] * tnt (~tnt@45.124-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[1:26] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[2:01] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[2:17] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) Quit (Read error: No route to host)
[2:17] * Cube (~Adium@ Quit (Quit: Leaving.)
[2:30] * ninkotech_ (~duplo@ Quit (Quit: Konversation terminated!)
[2:32] * Tv_ (~tv@ Quit (Quit: Tv_)
[2:43] * ninkotech (~duplo@ has joined #ceph
[2:44] * lofejndif (~lsqavnbok@19NAABMG4.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[2:48] * maelfius (~Adium@ Quit (Quit: Leaving.)
[3:07] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (Quit: Leaving.)
[3:14] * adjohn (~adjohn@ Quit (Quit: adjohn)
[3:15] * chutzpah (~chutz@ Quit (Quit: Leaving)
[3:19] <jefferai> nhm: alexxy: For zfs testing should I use stable packages or development release (not development testing) packages?
[3:20] <jefferai> eventually I'd want to probably be running stable so it'd be interesting to know if it works on there, but I'm not sure how fast any fixes/features would get pulled back there from development release
[3:23] <dmick> do you mean Ceph packages?
[3:24] <jefferai> Yes
[3:24] <jefferai> from here: http://ceph.com/docs/master/install/debian/
[3:24] <dmick> stable is intended to only get critical fixes
[3:24] <jefferai> Sure
[3:24] <jefferai> but I mean, I wouldn't be in production for a couple of months
[3:24] <jefferai> I'm not sure how fast stable iterates
[3:24] <jefferai> or how stable development/release is
[3:25] <jefferai> so if what's in development/release is going to be in stable in two months, I might as well test against that currently
[3:25] <dmick> it's relatively rare, I'd say, for the master branch to be broken
[3:26] <dmick> and stable doesn't really iterate; it's as-needed, and the features having to do with optimizing btrfs (and thus zfs) probably won't be patched there at least in the near term
[3:26] <jefferai> alternately if zfs would need a little bit of extra work to make it work well (snapshot support for instance) and that won't be in stable for a long time, but release is generally relatively stable for production use, I might want to consider that
[3:26] <dmick> so I'd say you're better off with master
[3:26] <jefferai> I see
[3:26] <jefferai> okay
[3:26] <jefferai> and master is usually stable enough to actually run in production, it sounds like
[3:27] <dmick> we have a wide range of regression/functionality tests running nightly and breakages are addressed reasonably quickly
[3:27] <jefferai> Yeah, I know
[3:27] <jefferai> sometimes people develop trending opinions over time :-)
[3:27] <dmick> I mean, this is always a judgement call, but if it were me I'd be using the development master
[3:27] <jefferai> I see
[3:27] <jefferai> okay
[3:28] <jefferai> so that's, basically, the development/release repository
[3:28] <jefferai> as opposed to development/testing
[3:28] <dmick> ah, I see the terminology on that now
[3:28] <dmick> I'd even consider development/testing
[3:29] <dmick> but the development/release is a good middle ground; those tend to be cut about every 3 weeks
[3:29] <jefferai> Sounds good
[3:29] <jefferai> if things aren't working well and people want to try out fixes I'll switch to building from git
[3:29] <jefferai> but eventually those things would I guess make it into development/release
[3:30] <jefferai> so that would be a decent long-term choice
[3:30] <dmick> if I'm not mistaken we actually publish on-demand builds too
[3:30] <dmick> so you needn't even build, necessarily, to get bleeding edge, except for your own changes
[3:31] <jefferai> ah, okay
[3:31] <dmick> I forget what you said about which distro you're using?
[3:31] <jefferai> I didn't :-)
[3:31] <jefferai> debian wheezy
[3:31] <dmick> we're much more likely to have usable Ubuntu Precise packages at this point in time
[3:31] <jefferai> Ah
[3:31] * jefferai isn't an Ubuntu fan
[3:32] <dmick> but wheezy is built regularly too
[3:32] <jefferai> but I know why you'd be targeting them
[3:32] <jefferai> openstack and all
[3:32] <dmick> we tend to test mostly on ubuntu
[3:32] <dmick> http://gitbuilder.ceph.com/ceph-deb-wheezy-x86_64-basic/
[3:32] <dmick> last build: one minute ago
[3:32] <jefferai> hah
[3:32] <jefferai> okay
[3:32] <jefferai> so you do CI, and then post up the packages
[3:32] <dmick> yeah
[3:32] <jefferai> very nice
[3:33] <jefferai> okay, looks like I'm all set to test tomorrow...thanks!
[3:33] <dmick> it's a philosophy. :) and sure.
[3:33] <jefferai> yeah, it's one that we use too, actually, but you don't see it much in the wild
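For readers following along, the repository setup being discussed looks roughly like the following on Debian wheezy. This is a sketch: the key URL and the `debian-testing` suite name are assumptions based on the ceph.com/docs/master/install/debian/ page referenced above, and may have changed since.

```shell
# Add the Ceph release key, then the development ("release"/"testing")
# package repository, then install. Suite names here are assumptions;
# check the install page linked in the conversation above.
wget -q -O- 'https://ceph.com/keys/release.asc' | sudo apt-key add -
echo "deb http://ceph.com/debian-testing/ $(lsb_release -sc) main" \
    | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt-get update && sudo apt-get install ceph
```

For true bleeding edge without building locally, the gitbuilder URL dmick posts below serves per-commit wheezy packages.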
[3:37] * asadpanda (~asadpanda@2001:470:c09d:0:20c:29ff:fe4e:a66) has joined #ceph
[3:38] * dpemmons (~dpemmons@ has joined #ceph
[3:42] * NashTrash (~Adium@99-5-94-133.lightspeed.nsvltn.sbcglobal.net) has joined #ceph
[3:42] * NashTrash (~Adium@99-5-94-133.lightspeed.nsvltn.sbcglobal.net) Quit ()
[3:46] <dpemmons> is it possible to tell MDS about blocks added through librados such that they become accessible as files?
[3:48] * maelfius (~Adium@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[4:29] * tjpatter (~tjpatter@c-68-62-88-148.hsd1.mi.comcast.net) Quit (Quit: tjpatter)
[5:18] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[5:21] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit ()
[5:22] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[6:12] * deepsa (~deepsa@ has joined #ceph
[6:13] * glowell (~Adium@ has joined #ceph
[6:13] * glowell1 (~Adium@ Quit (Read error: Connection reset by peer)
[6:19] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[6:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:48] * dmick is now known as dmick_away
[7:24] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[7:29] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit ()
[8:38] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[8:44] * tnt (~tnt@45.124-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[8:54] * pmjdebruijn (~pmjdebrui@overlord.pcode.nl) Quit (Quit: leaving)
[9:03] * NaioN (~stefan@andor.naion.nl) Quit (Remote host closed the connection)
[9:09] * BManojlovic (~steki@ has joined #ceph
[9:10] * tnt (~tnt@45.124-67-87.adsl-dyn.isp.belgacom.be) Quit (Read error: Operation timed out)
[9:16] * NaioN (~stefan@andor.naion.nl) has joined #ceph
[9:23] * tnt (~tnt@office.intopix.com) has joined #ceph
[9:41] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[9:42] <NaioN> is there any way to throttle the backfill process?
[9:43] <NaioN> at the moment the cluster is really slow because of a lot of pgs in backfill mode (after adding new disks/osds)
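NaioN's throttling question goes unanswered in the channel; a hedged sketch of the usual knob at the time follows. Treat the option names as assumptions to verify against your release's documentation: `osd recovery max active` existed in this era, while backfill-specific throttles such as `osd max backfills` landed in later releases.

```shell
# Lower recovery concurrency on a running osd (repeat per osd id):
ceph osd tell 0 injectargs '--osd-recovery-max-active 1'

# Or persistently, in ceph.conf under [osd]:
#   osd recovery max active = 1
```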
[9:56] * Leseb (~Leseb@ has joined #ceph
[10:27] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[10:29] * JamesD_ (~chatzilla@innov-64-25.bath.ac.uk) has joined #ceph
[10:30] <JamesD_> Hello!
[10:31] <JamesD_> I'm an IRC newbie...
[10:32] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[10:32] <JamesD_> I'm doing some work with librados but seem to have uncovered some memory leaks. I'm following the example code on ceph.com but can anyone help me validate the use of the API?
[10:33] <JamesD_> I'm using 0.48argonaut-1precise on Ubuntu 12.04
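As a reference point for that kind of leak hunt, the librados C calls pair up as shown below. This is a minimal sketch, not the example from ceph.com; it needs a reachable cluster and a pool named `data` (an assumption), so it is illustrative rather than a turnkey test. Missing either teardown call will show up as a leak under valgrind.

```c
#include <rados/librados.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t io;

    if (rados_create(&cluster, NULL) < 0)   /* pairs with rados_shutdown() */
        return 1;
    rados_conf_read_file(cluster, NULL);    /* default /etc/ceph/ceph.conf */
    if (rados_connect(cluster) < 0) {
        rados_shutdown(cluster);
        return 1;
    }
    if (rados_ioctx_create(cluster, "data", &io) == 0) {
        /* ... reads/writes against the pool ... */
        rados_ioctx_destroy(io);            /* pairs with rados_ioctx_create() */
    }
    rados_shutdown(cluster);                /* frees what rados_create allocated */
    return 0;
}
```

As joshd notes later in the log, some allocations from the configuration system are expected to remain live at exit and are not real leaks.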
[10:37] * EmilienM (~EmilienM@ Quit (Quit: Leaving...)
[10:38] * rz (~root@ Quit (Ping timeout: 480 seconds)
[10:44] * rz (~root@ns1.waib.com) has joined #ceph
[10:54] * EmilienM (~EmilienM@ has joined #ceph
[10:57] * fiddyspence (~fiddyspen@94-192-234-112.zone6.bethere.co.uk) has joined #ceph
[11:23] <alexxy> jefferai: better to use latest rc for zfs
[11:58] * EmilienM (~EmilienM@ Quit (Read error: Connection reset by peer)
[11:58] * EmilienM (~EmilienM@ has joined #ceph
[12:17] * fiddyspence (~fiddyspen@94-192-234-112.zone6.bethere.co.uk) Quit (Quit: Leaving.)
[12:51] * EmilienM (~EmilienM@ Quit (Ping timeout: 480 seconds)
[12:56] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:05] * kermie (ca037809@ircip3.mibbit.com) has joined #ceph
[13:06] <kermie> hello there
[13:06] <kermie> an1 alive?
[13:25] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[13:29] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[13:30] * EmilienM (~EmilienM@ has joined #ceph
[13:30] <NaioN> nope
[13:32] <kermie> hi NaioN
[13:32] <NaioN> hello
[13:33] <kermie> can u guide me abt installing ceph?
[13:33] <kermie> i did some progress bt stuck in middle
[13:33] <NaioN> maybe, what's the problem?
[13:34] <alexxy> how many mons/mds should have ceph?
[13:34] <alexxy> in a system with ~100 osd
[13:35] <kermie> actually m installin ceph on virtual machine
[13:35] <kermie> de will clone it
[13:35] <kermie> let me tell u exact problem
[13:35] <NaioN> alexxy: i think a minimum of 3
[13:35] <alexxy> 3 mon?
[13:35] <alexxy> or 3 mds?
[13:35] <NaioN> yeps
[13:35] <NaioN> mds depends of usage, do you use cephfs?
[13:36] <alexxy> i plan to
[13:36] <alexxy> for tests
[13:37] <NaioN> our cluster has 3 dedicated servers with mon (and mds, although we don't use it) and 4 disk nodes with 24 disks each (96 in total)
[13:37] <alexxy> am i righth that there should be odd number of mds/mons
[13:37] <NaioN> yes it should be odd (for the mons)
[13:38] <NaioN> if you have two and one dies the other doesn't know for sure if the other died, or that it has lost connection to the cluster
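The "odd number of mons" advice follows from simple majority arithmetic, sketched here:

```python
def quorum_size(n_mons: int) -> int:
    """Monitors form a quorum only when a strict majority is alive."""
    return n_mons // 2 + 1

def survives_failures(n_mons: int, failed: int) -> bool:
    """True if the surviving monitors can still form a quorum."""
    return n_mons - failed >= quorum_size(n_mons)

# With 2 mons, losing one loses quorum; 3 mons tolerate one failure,
# and 4 mons *still* only tolerate one -- hence odd counts.
for n in (2, 3, 4, 5):
    tolerated = max(f for f in range(n) if survives_failures(n, f))
    print(n, "mons tolerate", tolerated, "failure(s)")
```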
[13:38] <kermie> @ NaioN
[13:38] <cephalobot`> kermie: Error: "NaioN" is not a valid command.
[13:38] <kermie> hemant@hemant-virtual-machine:~/rpmbuild/SOURCES$ sudo yum install rpm-build rpmdevtools Setting up Install Process No package rpm-build available. No package rpmdevtools available. Nothing to do
[13:39] <NaioN> kermie: I'm not familiar with rpm sources packages and the build process
[13:39] <alexxy> NaioN: what about mds?
[13:39] <alexxy> NaioN: did you use ceph fs?
[13:39] <NaioN> we use deps
[13:40] <NaioN> alexxy: just played a little with it
[13:40] <NaioN> but it isn't stable enough for our load
[13:40] <NaioN> by default only one mds is active and all the other mds'es go in standby mode
[13:41] <NaioN> you can make more mds'es active with a ceph command and they will split (dynamicly) the directory tree between them
[13:42] <NaioN> so if you have a lot of metadata activity it's good to have more active mds'es
[13:42] <kermie> @NaioN : can u tell wat exactly needed to be installed to hv ceph
[13:42] <cephalobot`> kermie: Error: "NaioN" is not a valid command.
[13:42] <kermie> i gone thru official documentation
[13:42] <NaioN> you can also define mds'es as hot-standby, in that mode all the metadata actions are also replicated to the hot-standby mds
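The "ceph command" NaioN mentions for activating more MDSes was, in this era, roughly the following; treat the exact syntax as an assumption and check your release's docs:

```shell
# Allow two active MDS daemons; the directory tree is then split
# (dynamically) between them. Remaining daemons stay in standby.
ceph mds set_max_mds 2
```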
[13:44] <NaioN> kermie: why do you want to build ceph yourself and why don't you just use the provided packages?
[13:45] * deepsa (~deepsa@ has joined #ceph
[13:47] <NaioN> kermie: I tested it with three vm's all ubuntu
[13:48] <kermie> NaioN : i didnt got u
[13:48] <NaioN> just apt-get the packages and follow the instructions on ceph.com
[13:48] * phil- (~user@ has joined #ceph
[13:48] <NaioN> you could use mkcephfs or prepare the disks yourself
[13:48] <NaioN> what distro do you want to run on the vm?
[13:49] <kermie> distro?? m kind of newbei
[13:49] * deepsa_ (~deepsa@ has joined #ceph
[13:49] <NaioN> which linux distro do you have installed on the vm you want to test with?
[13:50] <kermie> m using UBUNTU 12.04
[13:50] <kermie> with kernel 3.3.7
[13:50] <NaioN> ok
[13:50] <NaioN> kermie: http://ceph.com/docs/master/install/debian/
[13:51] <kermie> n do i need to install other rpm n other stuffs like chef open stack
[13:51] <NaioN> which ceph release do you want: stable or development... just pick the one you want
[13:51] <NaioN> no
[13:52] <kermie> i tried with ceph 0.47
[13:52] <NaioN> the stable one is 0.48 at the moment
[13:52] <kermie> i wanted to log read n write transactions actually
[13:52] <kermie> ok
[13:52] <kermie> so shud i go for 0.48
[13:52] <NaioN> yes
[13:53] <kermie> can i insert my code into it
[13:53] <NaioN> but you need at least a couple of vm's for proper testing
[13:53] <kermie> i want to log the rd/wt transactions
[13:53] <kermie> yeah
[13:53] <kermie> i hv 2 pcs n i can hv extra 2 vms
[13:53] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[13:53] * deepsa_ is now known as deepsa
[13:53] * dabeowulf (dabeowulf@free.blinkenshell.org) Quit (Ping timeout: 480 seconds)
[13:54] <NaioN> I used 3 vm's for ceph (osd+mon+mds on the same vm) and two client vm's (to mount cephfs and/or rbds)
[13:54] <kermie> d link u sent is enough to install ceph
[13:55] <kermie> no need of chef n openstack stuffs?
[13:55] <NaioN> nope
[13:55] <NaioN> openstack only if you want to use ceph in combination with openstack
[13:58] <NaioN> chef is a tool to make the configuration and installation of clusters easier
[13:58] <NaioN> with a small cluster i think you have more work with configuring chef than with configuring the cluster by hand
[13:59] <NaioN> kermie: http://ceph.com/docs/master/config-cluster/
[13:59] <NaioN> there you can chose how you want to configure your cluster
[13:59] <NaioN> I used mkcephfs
[14:00] * dabeowulf (dabeowulf@free.blinkenshell.org) has joined #ceph
[14:02] <kermie> so i just need to install from 1st link u send
[14:03] <kermie> n den need to configure it
[14:03] <NaioN> yes
[14:03] <kermie> using 2nd link
[14:03] <NaioN> yes
[14:03] <kermie> thanx a lot
[14:03] <kermie> i will try this
[14:03] <kermie> n if it works for me u deserve a treat from me
[14:03] <kermie> :)
[14:03] <NaioN> k thx :)
[14:06] <kermie> actually i faced prob while installing rpm
[14:06] <kermie> what exactly rpm is needed here?
[14:06] <kermie> *why
[14:07] <NaioN> on ubuntu you need to use the deps...
[14:07] <NaioN> http://ceph.com/docs/master/install/debian/
[14:08] <kermie> i executed these cmds
[14:08] <kermie> no probs with that
[14:08] <kermie> rpm -i ceph-*.rpm
[14:08] <NaioN> euhmmmm
[14:08] <kermie> this cmd was nt executing givin some error
[14:08] <NaioN> apt-get install ceph
[14:08] <NaioN> you mean
[14:09] <kermie> http://ceph.com/docs/master/install/rpm/
[14:09] <kermie> on this link 2nd cmd
[14:09] <kermie> do i need this ?
[14:09] <NaioN> no
[14:09] <kermie> ok
[14:09] <NaioN> just one of those 4
[14:10] <kermie> http://ceph.com/docs/master/install/debian/ && http://ceph.com/docs/master/config-cluster/ = ceph workin on machine
[14:10] <NaioN> so if you use debian/ubuntu only the things on that page ofcourse
[14:10] <NaioN> the rpm is for redhat or other rpm distro's
[14:10] <NaioN> kermie: yes
[14:10] <kermie> accha
[14:11] <NaioN> and with the config-cluster you also need to pick ONE configuration type
[14:11] <NaioN> so deploy with mkcephfs or with Chef, but I would recommend mkcephfs
[14:12] * cpglsn (~ac@host96-174-dynamic.8-87-r.retail.telecomitalia.it) has joined #ceph
[14:19] <kermie> ok sir
[14:31] * kermie (ca037809@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[14:36] * nhm (~nh@184-97-251-210.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[14:45] <alexxy> can someone share ceph.conf
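alexxy's request goes unanswered in the channel. A minimal mkcephfs-style ceph.conf of this era looked roughly like the sketch below; hostnames, addresses, and paths are placeholders, and the exact option set varies by release:

```
[global]
    auth supported = cephx

[mon]
    mon data = /var/lib/ceph/mon/$name

[mon.a]
    host = node1
    mon addr =

[mds.a]
    host = node1

[osd]
    osd data = /var/lib/ceph/osd/$name
    osd journal = /var/lib/ceph/osd/$name/journal
    osd journal size = 1000

[osd.0]
    host = node1
[osd.1]
    host = node2
```

Deployed with `mkcephfs -a -c /etc/ceph/ceph.conf`, the approach NaioN recommends above.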
[14:57] * JamesD_ (~chatzilla@innov-64-25.bath.ac.uk) Quit (Read error: Operation timed out)
[15:05] <alexxy> NaioN: is there some how to about adding removing osd?
[15:10] <NaioN> alexxy: in the docs :)
[15:10] <NaioN> it's easy
[15:10] <alexxy> where?
[15:10] <alexxy> last time i checked there were wiki =)
[15:10] <NaioN> http://ceph.com/docs/master/ops/manage/grow/osd/
[15:11] * JamesD (~chatzilla@innov-64-25.bath.ac.uk) has joined #ceph
[15:11] <NaioN> yeah the wiki was a bit cryptic sometimes :)
[15:11] <NaioN> but the docs on ceph.com are a lot better
[15:11] <NaioN> they put a great deal of work in it and it looks very good
[15:14] <NaioN> We had just added a new node with 24 disks, no problem. Only it takes some time before everything is settled again
[15:23] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[15:29] * denken (~denken@dione.pixelchaos.net) has joined #ceph
[15:29] * t0rn (~ssullivan@ has joined #ceph
[15:29] * tjpatter (~tjpatter@ has joined #ceph
[16:09] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[16:10] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[16:15] * jluis (~JL@ has joined #ceph
[16:17] * tnt (~tnt@office.intopix.com) Quit (Ping timeout: 480 seconds)
[16:20] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[16:20] * joao (~JL@ Quit (Ping timeout: 480 seconds)
[16:22] * tnt (~tnt@45.124-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[16:58] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:17] * nhm (~nh@184-97-251-210.mpls.qwest.net) has joined #ceph
[17:17] <tjpatter> On average, how long should a 'ceph health' command take to execute?
[17:19] <elder> Not long
[17:19] <elder> At least in my experience, but I haven't been dealing with unhealthy clusters...
[17:20] <tjpatter> Hmm... We just set up a brand new cluster, ran mkcephfs, started the daemons. /etc/init.d/ceph status shows all daemons running, but 'ceph health' is hanging forever.
[17:21] <elder> I'm really not the right person to advise you on this. You should probably wait to hear from someone else with a bit more insight about it. jluis?
[17:22] <jluis> tjpatter, it shouldn't take much time
[17:22] <jluis> if it is taking too much time, usually the blame lies on authentication
[17:23] <tjpatter> We actually commented out the auth supported = cephx line...
[17:23] <tjpatter> This is a prototype cluster
[17:23] <jluis> yeah... I'm not sure if it is going to work out without cephx
[17:23] <tjpatter> I am seeing lots of traffic in /var/log/ceph/mon*.log showing that the nodes are talking to each other.
[17:24] <jluis> okay, try --debug-auth 20 and --debug-monc 20
[17:24] <jluis> on 'ceph'
[17:25] * dilemma (~dilemma@2607:fad0:32:a02:21b:21ff:feb7:82c2) has joined #ceph
[17:25] <jluis> if it is auth related, you'll probably notice some 'unable to authenticate client' messages, or something similar (can't recall the specific message)
[17:25] <tjpatter> We're not seeing those types of messages with the debug enabled.
[17:25] * lofejndif (~lsqavnbok@659AAB7WY.tor-irc.dnsbl.oftc.net) has joined #ceph
[17:26] <jluis> try adding --debug-ms 10
[17:26] <jluis> you should see some kind of messages, I suppose
[17:26] <jluis> at least message passing
[17:28] <tjpatter> Lots of messages, but nothing that indicates any kind of auth error
[17:28] <tjpatter> Should we start fresh with cephx auth enabled?
[17:28] <jluis> you should probably confirm this with someone else, but I think that cephx is assumed as enabled by default
[17:30] <jluis> tjpatter, in the monitor logs, do you see some monitors messages bearing a "leader" and the remaining monitors as "peon" ?
[17:30] * dabeowulf (dabeowulf@free.blinkenshell.org) Quit (Ping timeout: 480 seconds)
[17:30] <tjpatter> grep'ing for "leader" or "peon" in logs come back empty
[17:31] <jluis> so the monitors may be talking to each other but are not forming quorum; maybe running the monitors with --debug-auth 20 would make those auth errors pop up
[17:32] <jluis> if there is no formed quorum, the ceph tool won't be able to talk to the existing monitors
[17:33] <jluis> if you are willing though, it might be easier to just recreate the cluster with cephx enabled
[17:33] * dabeowulf (dabeowulf@free.blinkenshell.org) has joined #ceph
[17:34] <jluis> aside from that, I'm not sure why that would happen, but somebody else might have a better clue
[17:34] <tjpatter> I appreciate your help jluis!
[17:34] <jluis> you're most welcome ;)
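Collected in one place, the debugging steps jluis suggests above look like this (flag values taken straight from the conversation):

```shell
# Run the client with auth, monclient, and messenger debugging raised:
ceph --debug-auth 20 --debug-monc 20 --debug-ms 10 health

# On each monitor host, check whether a quorum ever formed:
grep -E 'leader|peon' /var/log/ceph/mon*.log
```

Empty grep output, as tjpatter saw, suggests the monitors never formed a quorum, so the ceph tool has nothing to talk to.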
[17:35] * jluis is now known as joao
[17:41] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:23] <tjpatter> Trying to start our cluster again from scratch, but we keep failing during mkcephfs command now. We are getting a 'umount' error.
[18:24] <tjpatter> mkcephfs is printing out the usage for the 'umount' command. Any thoughts?
[18:26] * maelfius (~Adium@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[18:28] * Leseb (~Leseb@ Quit (Ping timeout: 480 seconds)
[18:34] * JamesD (~chatzilla@innov-64-25.bath.ac.uk) Quit (Ping timeout: 480 seconds)
[18:41] <gregaf> dpemmons: right now there's no way to view raw RADOS objects via the filesystem
[18:42] <tjpatter> We figured out our problem... We had a bad config.
[18:42] * EmilienM (~EmilienM@ Quit (Quit: Leaving...)
[18:43] * EmilienM (~EmilienM@ede67-1-81-56-23-241.fbx.proxad.net) has joined #ceph
[18:44] * chutzpah (~chutz@ has joined #ceph
[18:44] * gregaf (~Adium@ Quit (Quit: Leaving.)
[18:47] * fc (~fc@ Quit (Quit: gone !)
[18:49] * bchrisman (~Adium@ has joined #ceph
[18:49] <tnt> I know you can't run rbd kernel on the same machine you run an osd, but what about running rbd in the xen dom0 and have a domU running an osd ?
[18:50] * lofejndif (~lsqavnbok@659AAB7WY.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[18:57] * cpglsn (~ac@host96-174-dynamic.8-87-r.retail.telecomitalia.it) Quit (Quit: cpglsn)
[19:02] * phil- (~user@ Quit (Remote host closed the connection)
[19:03] * dmick_away is now known as dmick
[19:03] * Cube (~Adium@ has joined #ceph
[19:05] * adjohn (~adjohn@ has joined #ceph
[19:06] * sagelap (~sage@wsip-70-167-159-70.oc.oc.cox.net) has joined #ceph
[19:07] <sagelap> yehudasa: can you put any of the rgw fixes that should go into 0.48.1 into stable-next?
[19:14] * gregaf (~Adium@ has joined #ceph
[19:22] <dmick> tnt: don't know if it's been tried, but it seems possible that that would avoid the direct deadlock issues
[19:23] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[19:24] * maelfius (~Adium@ has joined #ceph
[19:26] * Tamil (~Adium@2607:f298:a:607:c82a:d741:d8f8:c8e5) has joined #ceph
[19:28] <yehudasa> sagelap: yeah
[19:34] <tnt> dmick: what were the symptoms ?
[19:38] * Tamil1 (~Adium@ has joined #ceph
[19:38] * Tamil (~Adium@2607:f298:a:607:c82a:d741:d8f8:c8e5) Quit (Read error: Connection reset by peer)
[19:42] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (Quit: Leaving.)
[19:43] <yehudasa> sagelap: pushed fixed to 2877, 2879, 2878, 2869, 2841
[19:47] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:51] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[20:00] * JJ (~JJ@ has joined #ceph
[20:01] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:11] <dmick> tnt: I've never actually experienced it, but the idea is like http://h10025.www1.hp.com/ewfrf/wc/document?cc=us&lc=en&dlc=en&docname=c02073470j
[20:11] * bshah (~bshah@sproxy2.fna.fujitsu.com) Quit (Remote host closed the connection)
[20:11] <dmick> basically, if you need to do I/O-for-paging to complete outstanding I/O-for-nonpaging, you can get into a deadlock
[20:12] <tnt> is the link the right one ?
[20:13] <dmick> sorry, I added a j
[20:13] <dmick> http://h10025.www1.hp.com/ewfrf/wc/document?cc=us&lc=en&dlc=en&docname=c02073470
[20:13] * sagelap1 (~sage@145.sub-166-250-67.myvzw.com) has joined #ceph
[20:13] <tnt> ah ok thanks.
[20:15] * Ryan_Lane (~Adium@ has joined #ceph
[20:18] * sagelap (~sage@wsip-70-167-159-70.oc.oc.cox.net) Quit (Ping timeout: 480 seconds)
[20:20] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:25] * dilemma (~dilemma@2607:fad0:32:a02:21b:21ff:feb7:82c2) Quit (Quit: Leaving)
[20:28] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:45] * JamesD (~chatzilla@cpc2-hawk2-0-0-cust96.aztw.cable.virginmedia.com) has joined #ceph
[20:45] * joao (~JL@ Quit (Quit: Leaving)
[20:48] * BManojlovic (~steki@ has joined #ceph
[20:57] * tnt_ (~tnt@17.127-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[20:58] * tnt (~tnt@45.124-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[21:02] <maelfius> so, there is no way to allocate an OSD ID of an arbitrary value? it looks like ceph osd create cannot handle any arguments (i am wondering if I'm missing anything)
[21:02] <maelfius> this is back to the previous issue of having arbitrary OSD IDs that aren't sequential
[21:09] <maelfius> ah nvm looks like this might be a limitation of 0.48?
[21:13] * JJ (~JJ@ Quit (Ping timeout: 480 seconds)
[21:13] * JJ (~JJ@ has joined #ceph
[21:17] * adjohn (~adjohn@ Quit (Quit: adjohn)
[21:20] <joshd> maelfius: they're really intended to be sequential for now, since there's no indirection between what you call them and what crush/other internal things call them
[21:22] * bshah (~bshah@sproxy2.fna.fujitsu.com) has joined #ceph
[21:34] * lofejndif (~lsqavnbok@83TAAH026.tor-irc.dnsbl.oftc.net) has joined #ceph
[21:39] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[21:43] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[21:48] <joshd> JamesD: what kinds of leaks were you seeing?
[21:48] <joshd> there are some expected ones, like from the configuration system, but no one's created a valgrind ignore file for them
[21:57] * Kioob (~kioob@luuna.daevel.fr) Quit (Quit: Leaving.)
[21:58] * elder (~elder@2607:f298:a:607:457f:7231:f46f:bba9) Quit (Quit: Leaving)
[22:07] * Tamil1 (~Adium@ Quit (Quit: Leaving.)
[22:10] * glowell1 (~Adium@ has joined #ceph
[22:11] * Tamil (~Adium@ has joined #ceph
[22:12] * lofejndif (~lsqavnbok@83TAAH026.tor-irc.dnsbl.oftc.net) Quit (Ping timeout: 480 seconds)
[22:13] * lofejndif (~lsqavnbok@9YYAAIMH9.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:14] * glowell (~Adium@ Quit (Ping timeout: 480 seconds)
[22:16] * Tamil (~Adium@ has left #ceph
[22:18] * glowell1 (~Adium@ Quit (Ping timeout: 480 seconds)
[22:22] * glowell (~Adium@2607:f298:a:697:45d9:17f9:fefe:ccbe) has joined #ceph
[22:27] * glowell1 (~Adium@ has joined #ceph
[22:29] * glowell (~Adium@2607:f298:a:697:45d9:17f9:fefe:ccbe) Quit (Read error: Operation timed out)
[22:36] * lofejndif (~lsqavnbok@9YYAAIMH9.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[22:36] <maelfius> joshd: oh, well i mean, the issue was that i was trying to spec out an organizational system that would identify node/disk combinations based on the OSD id. from what I see the ceph osd create ends up being sequential, which prevents that. this becomes a bit more relevant with a more complex crushmap being autogenerated (which will be relevant for the usecase we were looking at here). I can create levels of indirection in my config management system to hel
[22:36] <maelfius> or am I completely missing an important point
[22:37] <maelfius> (e.g. osd 101 would be node 1, disk 1, osd.205 would be node 2, disk 5)
[22:40] <maelfius> I'll plan on adding indirection on my end if the plan i outlined would end up causing issue(s)
[22:40] <joshd> I think that'd be easiest for now
[22:41] <joshd> generally ceph assumes the osd ids are not sparse, which is why osd create adds them sequentially
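The indirection maelfius plans (Ceph keeps its dense sequential ids; the config-management layer remembers what each id means) can be sketched as a two-way table. The node/disk encoding is just his example scheme from above, and the class name here is hypothetical:

```python
# Hypothetical mapping layer: Ceph hands out dense sequential ids
# (0, 1, 2, ...) while ops tooling tracks which (node, disk) slot
# each id lives on -- the indirection joshd recommends.
class OsdMap:
    def __init__(self):
        self.by_id = {}      # osd id -> (node, disk)
        self.by_slot = {}    # (node, disk) -> osd id

    def create(self, node: int, disk: int) -> int:
        """Mimic 'ceph osd create': allocate the next sequential id."""
        osd_id = len(self.by_id)
        self.by_id[osd_id] = (node, disk)
        self.by_slot[(node, disk)] = osd_id
        return osd_id

m = OsdMap()
m.create(1, 1)   # osd.0 lives on node 1, disk 1
m.create(2, 5)   # osd.1 lives on node 2, disk 5
print(m.by_slot[(2, 5)])
```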
[22:48] * Cube (~Adium@ Quit (Ping timeout: 480 seconds)
[22:48] * Cube (~Adium@ has joined #ceph
[22:51] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[23:02] <dpemmons> gregaf: is it a planned feature?
[23:04] <gregaf> dpemmons: not at this time... it's not impossible but I'm not sure what the point would be
[23:04] <gregaf> what's your use case?
[23:08] <dpemmons> developing a cloud-based video editor. some of our tools (ffmpeg) are file-based. some we're writing right now and could use object storage directly
[23:13] <gregaf> hmm
[23:13] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[23:14] <gregaf> right now all the objects used in the filesystem are named deterministically based on the inode and what portion of the file it is
[23:14] <gregaf> we've talked once or twice about having a top-level /rados directory that let you list or access objects directly, although the directory would be *huge*
[23:15] <dpemmons> yeah that doesn't seem like a great solution
[23:16] <gregaf> giving an existing block a new filesystem name and location... I don't have enough of the filesystem code in my head right now to know if that could be done reasonably
[23:16] <gregaf> my inclination is "no" since I think the clients and the MDS independently calculate stuff based on the inode
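The deterministic naming gregaf describes can be illustrated with a sketch. The `<inode-hex>.<index>` pattern below matches how CephFS data objects are commonly named, but treat the exact format and the 4 MB default object size as assumptions:

```python
OBJECT_SIZE = 4 * 1024 * 1024   # assumed default stripe/object size (4 MB)

def data_object_name(inode: int, offset: int,
                     object_size: int = OBJECT_SIZE) -> str:
    """Name of the RADOS object holding byte `offset` of file `inode`."""
    return "%x.%08x" % (inode, offset // object_size)

print(data_object_name(0x10000000000, 0))            # first object of the file
print(data_object_name(0x10000000000, 9 * 1024**2))  # object covering offset 9 MB
```

Because both clients and the MDS can recompute these names from the inode alone, there is no per-object name table to update, which is why grafting an arbitrary pre-existing RADOS object into the tree is awkward.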
[23:16] <dpemmons> existing block or ordered list of blocks
[23:18] <dmick> but there's no reason you can't use rados as both an object store and a filesystem, right? Isn't that what you're after?
[23:18] <gregaf> but honestly once you've already mounted the filesystem I don't think that you'd get much out of using an object interface instead... once you have a file handle the metadata overhead is pretty limited
[23:18] <gregaf> dmick: yeah, but you don't want to copy the data around just so your different tools can access it
[23:18] <dmick> oh on the same data. i see.
[23:19] <dpemmons> gregaf: actually perhaps that's the question I didn't ask - how much overhead does the fs layer actually introduce?
[23:19] <gregaf> the main obstacle at the moment is that it's significantly less stable than the object store ;)
[23:20] <gregaf> ignoring that, fairly little
[23:21] <gregaf> to keep things consistent there's a lot of data flow when traversing the hierarchy, but that's amortized if you don't have people changing it much
[23:21] <dpemmons> hm. we're in dev right now - beta in a couple months, so stability will soon be an issue but isn't right now
[23:21] * Cube (~Adium@ Quit (Ping timeout: 480 seconds)
[23:23] <gregaf> once you have a file handle, (assuming there's only one client accessing the file at a time) then the only overhead compared to direct rados access is that the client does an (asynchronous) roundtrip with the MDS every 5-10 seconds (? maybe less frequently; depends) on the current file statistics, and if it grows the file by a lot it needs to get the "max_size" changed by the MDS
[23:23] <gregaf> but generally speaking that's all done before it's limiting anything, so out of the data path
[23:24] <gregaf> and it has caching and prefetching which vanilla librados doesn't do, but for ffmpeg like applications you're going to want that anyway
[23:25] <gregaf> basically if you're expecting to need the FS for some of your tools, and most of your filesystem expense is in the data access (not metadata changes), then there's little benefit to using librados over the FS.
[23:25] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:27] <dpemmons> right - eventually we can/could use rados directly to cache more intelligently than a vanilla fs layer would, but that's not an optimization we need to worry about right now
[23:27] <dpemmons> how unstable is the fs layer?
[23:28] <gregaf> depends on what you're doing with it ;)
[23:28] <dpemmons> ok - what about it is unstable? :)
[23:28] <gregaf> some code paths
[23:28] <gregaf> sorry to be so imprecise, but...
[23:28] <gregaf> some people try their workload on it and hit a bug after an hour; others I hear about occasionally have been running backups to it for 6 months without issue
[23:29] <gregaf> there are a few known bugs we haven't tracked down yet as we focus our development and stability efforts on RADOS
[23:29] <gregaf> and probably other bugs that need to be shaken out
[23:30] <dpemmons> how do the bugs you've seen manifest? data loss?
[23:30] <gregaf> data inaccessibility, which is moderately better but not a lot unless you can dig through the rados objects to pull it out yourself
[23:33] <gregaf> the big one is that some people have a workload which manages to... oh, I don't remember it properly any more, but they corrupt the hierarchy's metadata and the MDS hits an assert to prevent it from breaking anything else
[23:33] <dpemmons> hmm. we may be better off writing an io layer for ffmpeg to use rados directly then
[23:33] <gregaf> :)
[23:35] * adjohn (~adjohn@ has joined #ceph
[23:36] <dpemmons> which introduces some interesting possibilities actually
[23:37] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[23:38] <dpemmons> will have to think / research for a bit and come back with more questions :)
[23:52] * sjust (~sam@ Quit (Remote host closed the connection)
[23:53] * sjust (~sam@ has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.