#ceph IRC Log


IRC Log for 2011-05-03

Timestamps are in GMT/BST.

[0:18] * MarkN (~nathan@ has joined #ceph
[0:18] <cmccabe> these xml namespaces are really causing me problems
[0:19] <cmccabe> rgw doesn't seem to put its XML in a namespace, but Amazon does
[0:19] <cmccabe> actually, maybe that is an RGW bug?
[0:26] * MarkN (~nathan@ Quit (Ping timeout: 480 seconds)
[0:26] <Tv> that definitely sounds like a bug
[0:27] <cmccabe> here's an example of an amazon acl, straight from the server:
[0:27] <Tv> it seems s3-tests have nothing about acl xmls
[0:28] <Tv> that would be a worthy addition
[0:28] <cmccabe> <?xml version="1.0" encoding="UTF-8"?>
[0:28] <cmccabe> <AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>52270b...</ID><DisplayName>sageweil</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>522....</ID><DisplayName>sageweil</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
[0:28] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[0:35] <cmccabe> I'm having a little more trouble getting ACLs from RGW
[0:36] <greglap> better ask yehudasa
[0:38] <cmccabe> it's more a practical question of setting them up
[0:38] <cmccabe> I guess I can create an acl with an s3 tool
[0:38] <Tv> even the boilerplate acls should give you some xml
[0:38] <Tv> the current tests just deal with json
[0:38] <cmccabe> even the canned ones you mean?
[0:38] <Tv> yeah
[0:38] <Tv> it's monday, i can't remember complex words like "canned" ;)
[0:39] <cmccabe> it seems like I can pretty easily create an object with no ACL at all on rgw
[0:39] <Tv> relevant: http://www.flickr.com/photos/tv42/5681178111/
[0:39] <cmccabe> mocking fridges are the worst
[0:39] <Tv> cmccabe: i expect there to be lots of gotchas & bugs left in rgw acl handling; s3-tests and unit tests will help
[0:40] <cmccabe> yeah, I will make an issue to add some ACL tests to that
[0:40] <cmccabe> I haven't actually run s3-tests yet so there will be some setup time
[0:40] <cmccabe> hopefully minimal
[0:43] <cmccabe> yehudasa: so with regard to the whole user_id / display name decoupling
[0:43] <cmccabe> yehudasa: that is completely done, right?
[0:44] <cmccabe> I think one thing that may confuse sysadmins is that user_id for amazon is something like "52270b97f8c77f92327bf7ff49fae...", and for us it's like "cmccabe"
[0:45] <yehudasa> cmccabe: you mean uid - access-key decoupling?
[0:45] <cmccabe> yehudasa: yes
[0:45] <yehudasa> yes, it's done
[0:45] <yehudasa> with the caveat that atm you can have only one access key per user
[0:46] <yehudasa> but there is a uid and there is access key and they're distinct
[0:48] <cmccabe> yehudasa: tv seems to be saying that everything should have an ACL
[0:48] <cmccabe> yehudasa: but just using a simple old s3 put creates an RGW object with nada
[0:49] <yehudasa> cmccabe: what do you mean nada?
[0:50] <cmccabe> no acl
[0:51] <Tv> cmccabe: that might just mean "default acl"
[0:51] <Tv> as in, why store it if it's the default
[0:51] <yehudasa> cmccabe: 'simple old s3 put' you mean you're using the s3 tool?
[0:51] <cmccabe> using boto_tool.py
[0:52] <yehudasa> I really don't remember if we explicitly store the default acl
[0:53] <cmccabe> I'm trying to reproduce this, having mixed success
[0:53] <cmccabe> but the missing namespace is definitely a problem
[0:54] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[0:54] <cmccabe> I can fix it in just a moment
[0:55] * LordTerminus (~Terminus@ip-66-33-206-8.dreamhost.com) has joined #ceph
[0:55] <cmccabe> so anyway, this came in:
[0:55] <cmccabe> <AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>cmccabe</ID>....
[0:55] <cmccabe> and this went out:
[0:55] <cmccabe> <AccessControlPolicy><Owner><ID>cmccabe</ID>....
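A minimal sketch of why the missing xmlns matters to clients (not from the log; the documents below are trimmed versions of the ACLs pasted above). Namespace-aware XML parsers compare tag names *with* their namespace, so `<ID>` inside the S3 namespace is a different element from a bare `<ID>`:

```python
# Sketch: a strict, namespace-aware S3 client fails to find the Owner ID
# in rgw's un-namespaced ACL reply, while Amazon's reply parses fine.
import xml.etree.ElementTree as ET

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"

amazon = ('<AccessControlPolicy xmlns="%s">'
          '<Owner><ID>cmccabe</ID></Owner></AccessControlPolicy>' % S3_NS)
rgw = ('<AccessControlPolicy>'
       '<Owner><ID>cmccabe</ID></Owner></AccessControlPolicy>')

def owner_id(doc):
    # Look up the owner ID qualified by the S3 namespace, the way a
    # namespace-aware client would.
    root = ET.fromstring(doc)
    elem = root.find("./{%s}Owner/{%s}ID" % (S3_NS, S3_NS))
    return elem.text if elem is not None else None

print(owner_id(amazon))  # cmccabe
print(owner_id(rgw))     # None -- bare <Owner> doesn't match the qualified path
```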
[0:55] * yehuda_wk (~quassel@ip-66-33-206-8.dreamhost.com) has joined #ceph
[0:59] * sjust (~sam@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[0:59] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[0:59] * gregaf1 (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[0:59] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[0:59] * MarkN (~nathan@ has joined #ceph
[1:00] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[1:00] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[1:01] * greglap (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[1:01] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[1:01] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[1:02] * Terminus (~Terminus@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[1:02] * yehudasa (~quassel@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[1:03] * MarkN (~nathan@ has left #ceph
[1:06] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) has joined #ceph
[1:06] <trollface> huh
[1:07] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[1:08] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[1:28] <cmccabe> yehudasa: so groups and email users cannot own objects
[1:28] <cmccabe> yehudasa: only canonical users can do that
[1:30] <Tv> cmccabe: is the email thing just a lookup mechanism to find the right user, or are they really *any* email address?
[1:30] <Tv> i mean, what's it gonna do, email me a link?
[1:30] <Tv> i think it's just a way of coordinating across organizations
[1:31] <cmccabe> tv: all I know is that there are "canonical" users and "email" users
[1:31] <cmccabe> tv: seems to be like a separate user namespace
[1:31] <Tv> well the acls know what kind of a pointer to user they have
[1:31] <Tv> so if you set "allow email=jdoe@example.com", it stays stored like that
[1:32] <cmccabe> tv: basically
[1:32] <cmccabe> tv: I was just commenting that email users can't own objects
[1:32] <Tv> it's still the same user object at the end, i think
[1:32] <cmccabe> tv: which is sort of weird and irregular
[1:32] <cmccabe> tv: but whatever
[1:32] <Tv> so "email users" should be able to own objects
[1:32] <Tv> as they're just users
[1:33] <cmccabe> tv: I'm afraid not
[1:33] <cmccabe> tv: the xml is not structured in a way that allows user_id to take a type argument
[1:33] <cmccabe> tv: sorry, owner_id
[1:33] <Tv> so to change ownership, you need to know their user_id; to let them access things, email is enough
[1:34] <Tv> not pretty ;)
[1:34] <cmccabe> tv: I think you can still grant an email user FULL_CONTROL
[1:34] <cmccabe> tv: they just can't be Owner
[1:34] <cmccabe> tv: I don't know if Owner has any special powers or if it's just administrativa
[1:34] <Tv> can you actually give your buckets/objects away like that?
[1:35] <Tv> doesn't that mean i can cause you to get a huge invoice
[1:35] <Tv> i think Owner can always set the acl
[1:35] <Tv> but i have no hard facts on that one
[1:35] <cmccabe> tv: seems reasonable
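The asymmetry being discussed can be sketched in XML terms (the grant below follows the S3 schema; the email address is a made-up placeholder): a `<Grantee>` carries an `xsi:type` attribute (`CanonicalUser`, `AmazonCustomerByEmail`, `Group`), but `<Owner>` has no type slot at all, only `ID`/`DisplayName`, which is why an email user can be granted FULL_CONTROL yet can't be the Owner.

```python
# Sketch: parsing an S3-style Grant whose grantee is identified by
# email rather than by canonical user ID.
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

grant = """<Grant>
  <Grantee xmlns:xsi="%s" xsi:type="AmazonCustomerByEmail">
    <EmailAddress>jdoe@example.com</EmailAddress>
  </Grantee>
  <Permission>FULL_CONTROL</Permission>
</Grant>""" % XSI

root = ET.fromstring(grant)
grantee = root.find("Grantee")
# The type of the grantee lives in the xsi:type attribute...
print(grantee.get("{%s}type" % XSI))   # AmazonCustomerByEmail
# ...while an <Owner> element has nowhere to express such a type.
print(root.findtext("Permission"))     # FULL_CONTROL
```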
[1:38] <Tv> this is why s3-tests rock
[1:38] <greglap1> I think the point of the email thing is so you can give people access before they actually have an S3 account
[1:39] <Tv> you can explore what aws behavior is, and then run the same code against rgw
[1:39] <greglap1> just so long as it's the same email they use for their Amazon account
[1:39] <cmccabe> tv: I have been exploring aws behavior a lot, and running the same code against rgw
[1:39] <Tv> just need more coverage
[1:39] <cmccabe> tv: there is also an automated obsync test
[1:39] <cmccabe> tv: although it doesn't do acls at the moment
[1:40] <cmccabe> tv: but it would be nice to test some of this stuff in there
[1:40] <Tv> yeah just doing it via s3-tests rather than ad hoc means we get organizational knowledge & regression testing
[1:40] * greglap1 (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[1:40] <cmccabe> tv: test-obsync.py is not really ad-hoc
[1:40] <Tv> i see nothing in aws docs about being able / not being able to change owner of buckets and objects
[1:40] <Tv> it wouldn't surprise me if that was not doable
[1:41] <Tv> it really would be hard to do right; i think if you had that feature, i could make you owe money to amazon
[1:41] <cmccabe> tv: that is a pretty good point
[1:41] <Tv> it's the classic unix fs quota attack
[1:42] <cmccabe> tv: I thought chown was no longer accessible by non-root
[1:42] <Tv> hence "classic" ;)
[1:42] <cmccabe> tv: ACLs are kind of an annoying system
[1:43] <cmccabe> tv: I much prefer capabilities
[1:43] <cmccabe> tv: I guess S3 wanted to keep it simple, though
[1:43] <cmccabe> tv: or the S3 designers
[1:43] <Tv> for certain values of simple, yeah
[1:43] <Tv> it's more about users understanding it, though
[1:43] <cmccabe> tv: yeah
[1:43] <Tv> gmail is doing interesting work
[1:44] <Tv> using their 2-factor auth means you get use-specific passwords for every google service
[1:44] <Tv> my mail notifier has different pass then my actual mail client
[1:44] <Tv> now if they'd be locked into the operations they need...
[1:44] <Tv> that's caps via passwords, right there
[1:45] <cmccabe> tv: true, but the capabilities are probably somewhat overlapping
[1:45] <cmccabe> tv: like the linux root capabilities
[1:45] <cmccabe> tv: where if you have some of them, you can "collect" the others :)
[1:45] <Tv> well my mail notifer shouldn't be able to write to my mail, or look at my calendar, etc
[1:45] <cmccabe> tv: yeah, but your mail client can probably do the work of the mail notifier
[1:45] <cmccabe> tv: but I'm just nitpicking
[1:45] <Tv> that's more because -- especially the current linux caps -- are poorly assembled on top of a core that's not built for them
[1:45] <cmccabe> tv: it is nice that they're trying
[1:46] <cmccabe> tv: well, also, when dealing with hardware, sometimes the hardware wasn't built with your nice abstractions in mind
[1:46] <cmccabe> tv: like I think PCI at least could DMA to anywhere
[1:47] <Tv> old pci yes, new pci no, iirc
[1:47] <Tv> but device drivers are trusted code anyway
[1:47] <cmccabe> tv: or more obviously, if you have the ability to write to the root partition, you can usually get the rest of the caps after a reboot
[1:47] <Tv> it's not like my ls would talk to the pci bus
[1:47] <cmccabe> tv: I was talking root capabilities
[1:47] <Tv> oh autotest..
[1:48] <Tv> i could kludge this, but then i'd lose the concept of iterations within a test
[1:53] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Read error: Operation timed out)
[1:56] * yehuda_wk (~quassel@ip-66-33-206-8.dreamhost.com) Quit (Remote host closed the connection)
[2:00] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[2:03] <bchrisman> I recall something changing with regards to filesystem key... has the mount call changed as in: http://ceph.newdream.net/wiki/Mounting_the_file_system
[2:04] <bchrisman> getting: error adding secret to kernel client.admin : No such device when mounting
[2:06] <bchrisman> strace on the command has: [pid 4219] add_key(0x404023, 0x2398220, 0x7fffe50fee20, 0x22, 0xfffffffc) = -1 ENODEV (No such device)
[2:08] <bchrisman> ahh my bad... no mountpoint was there... odd error message but..
[2:10] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[2:14] <cmccabe> it's strange that amazon says "A bucket is owned by the AWS account (identified by AWS Access Key ID) that created it."
[2:14] <cmccabe> but then later on we're talking about user_id and such
[2:16] <bchrisman> feh.. still having problems with mountpoint there:error adding secret to kernel client.admin : No such devicemount error 5 = Input/output error
[2:16] <bchrisman> is there something required for the kernel key registration/lookup stuff that's been added recently?
[2:17] <bchrisman> or should it be all transparent from an admin perspective?
[2:17] <cmccabe> bchrisman: tv reworked the key management recently for mount.ceph
[2:17] <joshd> bchrisman: the kernel key registration stuff should have been backwards compatible, I think
[2:17] <cmccabe> bchrisman: I believe it was supposed to be backwards compatible
[2:18] <cmccabe> bchrisman: but maybe we ripped out the "pass in the key on the command line" silliness?
[2:19] <bchrisman> hmm
[2:19] <bchrisman> do we still generate the key with mkcephfs?
[2:20] <cmccabe> I'm pretty sure
[2:20] <joshd> bchrisman: it falls back to the old secret= option if it gets ENODEV or ENOSYS from the kernel key api
[2:20] <cmccabe> it's all in the calls to cauthtool
[2:22] <bchrisman> is cauthtool calling add_key then?
[2:23] <joshd> bchrisman: no, mount.ceph is calling add key
[2:23] <joshd> bchrisman: try putting your secret in a file and passing it to mount via the secretfile=<filename> option
[2:23] <cmccabe> bchrisman: cauthtool is all in userspace
[2:23] <cmccabe> bchrisman: it doesn't talk to the kernel really at all about security
[2:24] <bchrisman> yeah.. that's what I thought.
[2:24] <cmccabe> bchrisman: it is used to generate keys and do things with them. yehuda wrote it.
[2:25] <bchrisman> yeah.. checking with key in file rather than mount cmdline
[2:27] <bchrisman> so yeah.. did mkcephfs -v -a -c cephconf --mkbtrfs.. then cauthtool -l /etc/ceph/keyring.bin, took the key from that and placed it in /etc/ceph/filesystemkey, then mount -t ceph /scale -o name=admin,secretfile=/etc/ceph/filesystemkey gets error adding secret to kernel client.admin : No such devicemount error 5 = Input/output error
[2:28] <bchrisman> not sure what's hosed up there...
[2:29] <bchrisman> (where 'cephconf' is actually /etc/ceph/ceph.conf)
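The secretfile flow joshd suggests can be sketched as follows (paths and the key value are placeholders, not from the log). Keeping the secret in a mode-600 file also avoids exposing it on the mount command line, where it would be visible in ps or strace:

```shell
# Sketch: prepare a secretfile for mount.ceph. Key and paths are
# placeholders for illustration only.
set -e
KEYFILE=/tmp/filesystemkey            # would be e.g. /etc/ceph/filesystemkey
printf '%s' 'AQDP9uJNoGbIGRAAkfBTvvTc5zHCFqOMN3TxFA==' > "$KEYFILE"
chmod 600 "$KEYFILE"                  # secret readable only by its owner

# The actual mount (not run here) also needs a monitor address before
# the mount point, e.g.:
#   mount -t ceph mon-host:/ /scale -o name=admin,secretfile=$KEYFILE
ls -l "$KEYFILE"
```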
[2:33] <joshd> bchrisman: trying to reproduce here
[2:39] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[2:43] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[2:50] <joshd> bchrisman: can't reproduce in my test vm, but I'm not sure what's wrong with your setup
[2:51] <bchrisman> cool enough.. thanks for testing that.. I'll deal with it in the morning.
[2:51] <bchrisman> something must be mucked up.
[2:54] * bchrisman (~Adium@70-35-37-146.static.wiline.com) Quit (Quit: Leaving.)
[2:55] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[3:33] * sagelap (~sage@ Quit (Ping timeout: 480 seconds)
[3:43] * cmccabe (~cmccabe@ has left #ceph
[3:50] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[4:28] * greglap (~Adium@cpe-76-170-84-245.socal.res.rr.com) has joined #ceph
[5:00] * djlee1 (~dlee064@des152.esc.auckland.ac.nz) has joined #ceph
[5:03] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[5:06] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[5:15] * djlee (~dlee064@des152.esc.auckland.ac.nz) has joined #ceph
[5:18] * djlee1 (~dlee064@des152.esc.auckland.ac.nz) Quit (Read error: Operation timed out)
[5:46] * cephuser1 (~cephuser1@173-24-225-53.client.mchsi.com) has joined #ceph
[5:46] * cephuser1 (~cephuser1@173-24-225-53.client.mchsi.com) Quit ()
[5:47] * cephuser1 (~cephuser1@173-24-225-53.client.mchsi.com) has joined #ceph
[6:28] * cephuser1 (~cephuser1@173-24-225-53.client.mchsi.com) Quit (Quit: Leaving)
[6:28] * cephuser1 (~cephuser1@173-24-225-53.client.mchsi.com) has joined #ceph
[6:29] * cephuser1 (~cephuser1@173-24-225-53.client.mchsi.com) Quit ()
[6:47] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Read error: Operation timed out)
[6:51] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[8:43] * darktim (~andre@ticket1.nine.ch) has joined #ceph
[9:03] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[9:07] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) Quit ()
[9:09] <chraible> hi @all
[9:11] <chraible> i got this "error" message: [WRN] message from mon2 was stamped 12.271440s in the future, clocks not synchronized but i have synchronized my clocks 1 min befor with the same ntp-server ....
[9:12] <chraible> can i ignore this message or what should I do?
[9:13] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[9:15] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[9:43] <chraible> i have a new question.... is it possible that I cant create an kvm / qcow2 image on a ceph fs that runs with libvirt / kvm?
[9:57] * neurodrone_ (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[9:57] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Read error: Connection reset by peer)
[9:57] * neurodrone_ is now known as neurodrone
[10:38] * allsystemsarego (~allsystem@ has joined #ceph
[10:39] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: zzZZZZzz)
[11:19] <Yulya_the_drama_queen> chraible: i'v heard about some problem with glusterfs and VM images
[11:19] <Yulya_the_drama_queen> maybe workaround for glusterfs could help you
[11:21] <Yulya_the_drama_queen> http://www.gluster.com/community/documentation/index.php/GlusterFS_and_Xen
[12:49] <Yulya_the_drama_queen> hm
[12:49] <Yulya_the_drama_queen> guys
[12:50] <Yulya_the_drama_queen> how can i debug ceph on client?
[12:51] <Yulya_the_drama_queen> i see one process hungs on fflush call for an file on ceph
[12:54] <Yulya_the_drama_queen> process in Sl state
[12:54] <Yulya_the_drama_queen> and there is no helpfull messages in dmesg
[12:56] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[13:18] <chraible> so the next error / question....
[13:19] <chraible> after 1 or 2 hours ceph is running I can't connect do a "ls -la" on the file system....
[13:20] <chraible> in /var/log/messages there are the following timeouts http://pastebin.com/dnwVRf5F
[13:22] <chraible> now i got this kernel error message. http://pastebin.com/UmrCRuhq
[13:32] * wonko_be (bernard@november.openminds.be) has joined #ceph
[13:32] * wonko_be_ (bernard@november.openminds.be) Quit (Read error: Connection reset by peer)
[13:47] <trollface> chraible: cn you check if your metadata servers are running?
[13:49] <chraible> cpeh -w shows following: mds e188: 1/1/1 up {0=up:active}, 2 up:standby
[13:51] <chraible> this shows for me that one is running...
[13:51] <trollface> hmm
[13:52] <trollface> you can try enabling debugging on an active mds
[13:52] <trollface> or you can try stopping active mds and see if/how this thing will recover
[13:54] <chraible> how can I do this? (stopping one mds)
[13:55] <chraible> if someone need more informations I wrote an email to the mailing list... http://marc.info/?l=ceph-devel&m=130442345004413&w=2
[13:55] <chraible> (sorry for my bad english) ^^
[13:58] * mrokooie (~mrokooie@ has joined #ceph
[13:59] <mrokooie> Any body who used ceph in large scale?
[13:59] <Yulya_the_drama_queen> hm
[13:59] <Yulya_the_drama_queen> strange
[14:00] <Yulya_the_drama_queen> i found file wich cannot be touched from any node
[14:01] <Yulya_the_drama_queen> and what now i can do to debug this?
[14:11] * darktim (~andre@ticket1.nine.ch) Quit (Remote host closed the connection)
[14:13] * darktim (~andre@ticket1.nine.ch) has joined #ceph
[14:13] <trollface> greglap, sagewk if I run out of space can I just truncate /data/mon${id}/log* ?
[14:16] * mrokooie (~mrokooie@ Quit (Ping timeout: 480 seconds)
[14:34] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[16:24] <josef> sagewk: well that took alot longer than it should have
[16:24] <josef> i'll be sending a patch shortly
[17:16] <josef> sagewk: just posted "Btrfs: fix how we do space reservation for truncate", that should fix it
[17:36] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[17:42] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[17:50] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) Quit (Remote host closed the connection)
[18:25] * sagewk (~sage@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:37] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[18:48] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[18:55] * cmccabe (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) has joined #ceph
[19:03] <sagewk> 10:15 for standup
[19:11] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[19:12] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) has joined #ceph
[19:35] <Tv> gregaf1/greglap: i have a few mds questions if you have time
[19:35] <greglap> sure
[19:35] <Tv> greglap: background: the "How to reduce active mds number" email on the list
[19:35] <Tv> see my response to it for what i think is going on
[19:35] <Tv> basically
[19:36] <Tv> 1) i can vstart 3 mds'es, i do ceph mds stop 0, it just stays in stopping
[19:36] <Tv> 2) i can "ceph mds set_max_mds 2", and that doesn't seem to change anything; the stop runs into 1) above, and killing & restarting the mds in question just puts it back to "active"
[19:37] <greglap> hmm, stopping might be broken... it's not part of any of my regular testing and I don't know that I've ever used it
[19:37] <greglap> changing the max_mds I don't think can or is presently supposed to do anything by itself in terms of shutting down MDSes
[19:38] <Tv> so is there a way to make one go from active to standby?
[19:38] <greglap> it just adjusts a variable in the monitor that configures how many MDSes the MDSMonitor will try to keep in active mode (ie, move from standby to active)
[19:38] <greglap> nope
[19:38] <Tv> yeah i see that logic, and i see how it'll activate new ones as needed
[19:38] <greglap> stopping is supposed to shut down the mds
[19:38] <Tv> hmm
[19:39] <greglap> but that doesn't put it back into the standby pool (if you wanted it, you could just start up a new daemon on that node)
[19:39] <greglap> it looks like the problem is that maybe stopping is just broken
[19:40] <Tv> greglap: confirming one last round: stopping is *supposed to* make it potentially go back to standby, if it worked right?
[19:40] <greglap> I wouldn't worry about changing max_mds not stopping stuff though, it doesn't have the information required to choose the right MDS to shut down
[19:40] <Tv> oh yeah i'm fine with human choosing what to kill
[19:40] <greglap> stopping is *supposed to* export all the data the MDS is caching and has in its journal and then shut down the daemon
[19:40] <greglap> active->standby is not a possible transition for any daemon to make
[19:41] <Tv> i guess my question is, would it upon restart then re-evaluate its need to be active
[19:41] <greglap> once you stop it the daemon is dead, if another daemon got started then it would be like any other daemon starting
[19:42] <greglap> ie, it tells the monitor it's an alive daemon and gives the monitor a few preferences regarding standby-replay
[19:42] <greglap> and then it's in standby or maybe it goes into standby-replay (or if there are no active MDSes it can go active, of course)
[19:44] <Tv> yeah ok that should clear things up
[19:45] * trollface usually uses service ceph stop mds followed by service ceph start mds to make mds go to standby
[19:45] <greglap> the daemon does restart itself in the boot state sometimes, maybe that's confusing you? But it does it by calling exec and it's a whole new process
[19:45] <Tv> trollface: the stop works for you?
[19:46] <trollface> Tv: service ceph stop mds AKA /etc/init.d/ceph stop mds
[19:46] <Tv> trollface: oh sorry misread your service stop as ceph mds stop
[19:46] <greglap> that shuts down the daemons but is distinct from ceph mds stop 0 (or whatever the syntax is)
[19:46] <trollface> so, not a monitor command to send mds0 stop command, just plain kill -TERM cmds
[19:46] <Tv> trollface: that doesn't make it go to standby for me
[19:47] <trollface> Tv: it goes down, other one picks up the lead
[19:47] <trollface> then I start the one I just stopped and it enters the pool as a standby one
[19:47] <greglap> trollface: we're discussing trying to reduce the number of active daemons here, though
[19:47] <trollface> ah
[19:47] <Tv> yeah that explains it
[19:48] <trollface> last time I tried makeing more than one active, it resulted in massive clusterfsck. This was probably the first time I was extremelly annoying on this channel
[19:48] <greglap> which takes a bit more work because you need to flush out an entire mds log and make sure all the directories are owned by other MDSes etc
[19:48] <trollface> however I've lost count since
[19:49] <trollface> greglap: can I just truncate monitor logs, btw? They eat lots of space
[19:49] <Tv> trollface: i've seen plenty of bugs on write-heavy workloads with multiple mds'es, yes i'd suggest you have just one mds active if you don't like frustration
[19:49] <greglap> yeah, multi-MDS systems aren't stable although I've been working on it lately and they're much better now on master when used with the fuse client (kclient still isn't there yet :( )
[19:50] <greglap> trollface: depends on the logs...debug logs, always, if it's the log monitor logs I think you can but not sure, you'll need to ask Sage
[19:50] <greglap> if it's logs of the other monitor histories, probably not, though it depends on the specific system in question
[19:51] <Tv> http://tracker.newdream.net/issues/1048
[19:52] <trollface> btw, is anyone here using ceph as a backend for qemu-kvm images ?
[19:52] <trollface> (as in qemu-rbd)
[19:53] <Tv> trollface: joshd has worked on that a lot
[19:53] <joshd> trollface: what do you want to know?
[19:55] <trollface> joshd: my friends tried to do that, but they somehow failed and trying to do some massive rube goldberg-like thingy that includes drbd and ocfs2
[19:55] <trollface> I am now trying to extract something useful out of them and then will try to run a couple of windows machines myself
[19:56] <trollface> aha, so they say rbd pastition was unusable after installation, something like kernel panic after install
[19:56] <trollface> part
[19:59] <joshd> without more info on their setup it's hard to say what went wrong (maybe osds died)?
[19:59] <joshd> but rbd has certainly worked for me
[20:00] <trollface> was it with 0.14 or the old branch?
[20:02] <greglap> trollface: 0.14? that's a bit old, isn't it?
[20:02] <joshd> haven't tried with an official release in a while, since we recently created librbd and changed our qemu driver to use that, but it hasn't been accepted upstream yet
[20:02] <joshd> greglap: that's a qemu version
[20:02] <greglap> ah, okay
[20:02] <trollface> qemu 0.14 mainline that has rbd in it
[20:02] <joshd> if you're running a recent version of ceph (>= 0.25 or so) and you want to test rbd, I'd suggest trying the for-qemu branch of git://ceph.newdream.net/git/qemu-kvm.git
[20:04] <cmccabe> tv: so check out http://aws.amazon.com/iam/
[20:04] <trollface> okay
[20:04] <trollface> I'll need to apply it on top of fedoras though :)
[20:04] * verwilst (~verwilst@dD576FAAE.access.telenet.be) has joined #ceph
[20:04] <cmccabe> tv: you will "Receive a single bill for the activity of all Users within your AWS Account"
[20:04] <cmccabe> tv: so that indicates to me that a single AWS account (access key) can have multiple users
[20:04] <Tv> cmccabe: yeah. IAM is like a company having a hierarchical set of users, all billed together
[20:04] <Tv> cmccabe: each access key still belongs to a single user in that hierarchy
[20:05] <cmccabe> tv: so what is the point of having access keys separate from users
[20:05] <joshd> trollface: shouldn't be too hard to apply, rbd is pretty self-contained
[20:05] <cmccabe> tv: if it's always a 1->1 mapping
[20:05] <Tv> cmccabe: the whole point of IAM is to avoid needing to share access keys across multiple users
[20:05] <Tv> cmccabe: one user can have multiple access keys
[20:05] <Tv> cmccabe: you can rotate them if you e.g. realize you pastebinned your secret
[20:06] <cmccabe> tv: I guess any system needs something like that
[20:06] <Tv> any good system, yes ;)
[20:06] <cmccabe> tv: the ability to change your "password"
[20:07] <Tv> yeah, and because this is meant for serious use, you want more than one to be valid at the same time, so you can do a rolling update
[20:07] <Tv> hence, multiple access keys per user
[20:40] * greglap (~Adium@cpe-76-170-84-245.socal.res.rr.com) Quit (Quit: Leaving.)
[20:45] * Yulya_th1_drama_queen (~Yulya@ip-95-220-180-110.bb.netbynet.ru) has joined #ceph
[20:49] * Yulya_the_drama_queen (~Yulya@ip-95-220-189-119.bb.netbynet.ru) Quit (Ping timeout: 480 seconds)
[21:08] * midnightmagic (~midnightm@S0106000102ec26fe.gv.shawcable.net) Quit (Ping timeout: 480 seconds)
[21:14] * ghaskins_mobile (~ghaskins_@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: This computer has gone to sleep)
[21:28] * gunther (57b32646@ircip2.mibbit.com) has joined #ceph
[23:01] * gunther (57b32646@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[23:10] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.