#ceph IRC Log


IRC Log for 2012-01-12

Timestamps are in GMT/BST.

[0:04] * adjohn is now known as Guest23761
[0:04] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[0:04] * adjohn (~adjohn@208.90.214.43) Quit (Remote host closed the connection)
[0:04] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[0:05] * Guest23761 (~adjohn@208.90.214.43) Quit (Ping timeout: 480 seconds)
[0:19] * fronlius (~fronlius@d217151.adsl.hansenet.de) Quit (Quit: fronlius)
[0:26] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[0:27] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[0:30] * alexk (~alexk@cadlab.kiev.ua) Quit (Read error: Connection reset by peer)
[0:30] * alexk (~alexk@cadlab.kiev.ua) has joined #ceph
[0:52] * edwardw is now known as edwardw`away
[0:52] * edwardw`away (~edward@ec2-50-19-100-56.compute-1.amazonaws.com) Quit (Remote host closed the connection)
[0:54] * edwardw`away (~edward@ec2-50-19-100-56.compute-1.amazonaws.com) has joined #ceph
[0:54] * edwardw`away is now known as edwardw
[0:55] * mtk (~mtk@ool-44c35967.dyn.optonline.net) Quit (Remote host closed the connection)
[0:57] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[1:10] * edwardw is now known as edwardw`away
[2:09] * adjohn (~adjohn@208.90.214.43) Quit (Quit: adjohn)
[2:14] * Tv (~Tv|work@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[2:15] * alexk (~alexk@cadlab.kiev.ua) Quit ()
[2:29] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:29] * spadaccio (~spadaccio@213-155-151-233.customer.teliacarrier.com) Quit (Quit: WeeChat 0.3.7-dev)
[2:33] * bchrisman (~Adium@108.60.121.114) Quit (Quit: Leaving.)
[3:01] * adjohn (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) has joined #ceph
[3:04] * adjohn is now known as Guest23777
[3:04] * Guest23777 (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) Quit (Read error: Connection reset by peer)
[3:04] * adjohn (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) has joined #ceph
[3:40] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:20] * adjohn (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[4:34] * sjustlaptop (~sam@24-205-39-1.dhcp.gldl.ca.charter.com) has joined #ceph
[4:38] * sjustlaptop (~sam@24-205-39-1.dhcp.gldl.ca.charter.com) Quit ()
[4:54] * voidah (~voidah@pwel.org) has joined #ceph
[4:55] <voidah> hi everybody
[4:57] <voidah> first time ceph user here
[4:57] <voidah> while I have found answers to almost all my questions/problems online, I have one problem that I can't figure out myself
[4:58] <voidah> I have a single ceph node, with osd, mds and mon on it
[4:58] <voidah> but I can't mount it
[4:58] <voidah> (22:50:45) voider@myhost $ sudo mount -t ceph 192.168.1.221:6789:/ ceph -vv
[4:58] <voidah> mount: 192.168.1.221:6789:/: can't read superblock
[4:59] <voidah> on the wiki, it's written that it might be because the mds is not up
[4:59] <voidah> (22:50:45) voider@myhost $ sudo mount -t ceph 192.168.1.221:6789:/ ceph -vv
[4:59] <voidah> sorry for the last line
[4:59] <voidah> [root@ceph1 ~]# ceph -s | grep mds
[4:59] <voidah> 2012-01-11 22:54:49.353122 mds e3: 1/1/1 up {0=0=up:creating}
[4:59] <voidah> is the "creating" status normal, and is it supposed to become "active" soon?
[5:00] <voidah> it's an 8gb partition, if that matters
[5:00] <voidah> which is relatively small
[5:03] <voidah> complete ceph -s output:
[5:03] <voidah> 2012-01-11 23:02:57.976976 pg v6: 192 pgs: 192 creating; 0 KB data, 164 KB used, 7344 MB / 8191 MB avail
[5:03] <voidah> 2012-01-11 23:02:57.977557 mds e3: 1/1/1 up {0=0=up:creating}
[5:03] <voidah> 2012-01-11 23:02:57.977886 osd e3: 1 osds: 1 up, 1 in
[5:03] <voidah> 2012-01-11 23:02:57.978185 log 2012-01-11 22:50:49.318884 mon.0 192.168.1.221:6789/0 4 : [INF] osd.0 192.168.1.221:6800/920 boot
[5:03] <voidah> 2012-01-11 23:02:57.978518 mon e1: 1 mons at {0=192.168.1.221:6789/0}
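The wiki hint mentioned above points at the mds not yet being up: the status shows the mds still in up:creating and all 192 pgs creating, and the mount keeps failing until the mds reaches up:active. A rough sketch of how one might wait for that before mounting, reusing the monitor address and mount point from the session above:

    # sketch: poll cluster status until the mds reports up:active, then mount
    while ! ceph -s | grep -q 'up:active'; do
        sleep 5
    done
    sudo mount -t ceph 192.168.1.221:6789:/ ceph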
[5:08] * adjohn (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) has joined #ceph
[5:39] * elder (~elder@aon.hq.newdream.net) Quit (Quit: Leaving)
[5:50] * adjohn is now known as Guest23790
[5:50] * Guest23790 (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) Quit (Read error: Connection reset by peer)
[5:50] * adjohn (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) has joined #ceph
[5:51] * adjohn (~adjohn@70-36-197-80.dsl.dynamic.sonic.net) Quit ()
[5:52] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[5:57] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[6:09] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[7:04] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[7:24] <voidah> ok, got it to work :)
[8:03] * raso (~raso@91.121.10.104) has joined #ceph
[8:03] * raso (~raso@91.121.10.104) Quit ()
[8:04] * raso (~raso@debian-multimedia.org) has joined #ceph
[8:08] * scode (~scode@pollux.scode.org) has joined #ceph
[8:08] <scode> Is anyone aware of an explanation for the math in crush_make_straw_bucket, when massaging the weights?
[8:09] <scode> The crush paper didn't really go into that, and while what it's doing vaguely makes sense to me in very broad strokes, if there is any explanation for it I'd love to read it.
[8:12] <scode> I'm sorry, I mean crush_calc_straw.
[8:44] <NaioN> does somebody know when the 0.40 will be tagged?
[8:44] * voidah (~voidah@pwel.org) Quit (Ping timeout: 480 seconds)
[8:46] * fghaas (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) has joined #ceph
[9:38] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:44] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[10:12] * xarthisius (~xarth@hum.astri.uni.torun.pl) has joined #ceph
[10:13] * fghaas1 (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) has joined #ceph
[10:17] * fghaas (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[10:31] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[10:41] * gregorg (~Greg@78.155.152.6) has joined #ceph
[11:19] * spadaccio (~spadaccio@213-155-151-233.customer.teliacarrier.com) has joined #ceph
[13:14] * fghaas1 (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[13:23] * BManojlovic (~steki@93-87-148-183.dynamic.isp.telekom.rs) has joined #ceph
[13:24] * fghaas (~florian@213.162.68.74) has joined #ceph
[13:35] * fghaas (~florian@213.162.68.74) Quit (Ping timeout: 480 seconds)
[13:49] * fghaas (~florian@85-127-92-127.dynamic.xdsl-line.inode.at) has joined #ceph
[14:02] * lollercaust (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) has joined #ceph
[14:27] * alexk (~alexk@cadlab.kiev.ua) has joined #ceph
[14:37] <alexk> hi guys. I would like to repeat my questions which I asked yesterday (not sure if sage or greg are online yet) - does anyone know if RADOS/libradosgw is stable enough for production use? any use case references would be highly appreciated.
[14:50] * ssedov (stas@ssh.deglitch.com) has joined #ceph
[14:51] * stass (stas@ssh.deglitch.com) Quit (Read error: Connection reset by peer)
[15:26] * fghaas (~florian@85-127-92-127.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[15:45] * pserik (~Serge@eduroam-61-166.uni-paderborn.de) has joined #ceph
[15:46] <pserik> Hey all….
[15:47] <pserik> I have a question about the consistency model followed by Cep 0.36
[15:47] <pserik> can it be configured? if yes… what is the default one?
[15:47] <pserik> Ceph 0.36
[15:49] <pserik> I meant the semantics followed when writing back the modified content to the cache…
[15:49] <pserik> #
[15:50] <pserik> from the cache to the storage backend
[16:11] * fghaas (~florian@85-127-92-127.dynamic.xdsl-line.inode.at) has joined #ceph
[16:23] * BManojlovic (~steki@93-87-148-183.dynamic.isp.telekom.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[16:23] * edwardw`away is now known as edwardw
[16:31] <dwm_> alexk: I suspect the answer is: try it and see. Your mileage may vary, batteries not included. If it breaks, you get to keep the pieces.
[16:33] * mtk (~mtk@ool-44c35967.dyn.optonline.net) has joined #ceph
[16:44] * thafreak (~thafreak@dynamic-acs-24-144-210-108.zoominternet.net) has joined #ceph
[16:54] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[16:59] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[17:00] <nhorman> hey all, anyone got a sec to answer a question regarding cephx authentication?
[17:06] <alexk> dwm_: thanks. I think this means NO to me :) we do not have time to play games, I need a working solution. I will be watching ceph development and hopefully will get back to review it again some time later
[17:06] <nhorman> specifically, has anyone hit a case where an strace shows a key getting added to the user keyring, but request_key failing to find the same description during the mount operation
[17:07] * alexk (~alexk@cadlab.kiev.ua) Quit ()
[17:30] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[17:42] * Tv (~Tv|work@aon.hq.newdream.net) has joined #ceph
[17:44] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:45] * nhorman_ (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[17:51] * pserik (~Serge@eduroam-61-166.uni-paderborn.de) Quit (Quit: pserik)
[17:51] * nhorman_ (~nhorman@nat-pool-rdu.redhat.com) Quit (Read error: Connection reset by peer)
[17:51] * nhorman_ (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[17:52] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Ping timeout: 480 seconds)
[18:00] * adjohn is now known as Guest23845
[18:00] * Guest23845 (~adjohn@208.90.214.43) Quit (Read error: Connection reset by peer)
[18:00] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[18:15] * edwardw is now known as edwardw`away
[18:21] * adjohn (~adjohn@208.90.214.43) Quit (Remote host closed the connection)
[18:21] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[18:34] * voidah (~voidah@pwel.org) has joined #ceph
[18:40] <thafreak> So....how stable is ceph these days?
[18:40] <thafreak> I heard someone talking about using it with qemu for disk images...
[18:41] <jmlowe> I've been doing that
[18:41] <thafreak> yeah? And it's been stable?
[18:41] <iggy> rbd... you don't even use the ceph filesystem layer
[18:41] <iggy> it talks directly to the object store
[18:42] <jmlowe> ceph (rbd) itself is great, I kept having problems with the backing btrfs
[18:42] <thafreak> what distro are you guys running your object store on?
[18:42] <jmlowe> I use ubuntu 11.10
[18:42] <thafreak> so the object store uses a fs underneath then...similar to how glusterfs works I guess?
[18:42] <jmlowe> I have some deb packages I could send your way with rbd enabled and patches for async
[18:43] <iggy> the object store can use filesystems other than btrfs, but btrfs is required for snapshots and other features
[18:44] <jmlowe> I can untar a 2.6.38 kernel inside my vm in about 30 seconds, to give you an idea of how it performs
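For context on the qemu/rbd setup being discussed: with rbd support compiled into qemu (as in jmlowe's patched packages), a guest disk can be attached straight from the object store via librbd. A minimal sketch, with a placeholder image name in the default "rbd" pool:

    # sketch: create an rbd image and attach it to a guest via librbd
    # "vm1-disk" is a placeholder image name
    rbd create vm1-disk --size 10240          # size in MB
    qemu-system-x86_64 -m 1024 \
        -drive file=rbd:rbd/vm1-disk,if=virtio,cache=none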
[18:44] <thafreak> would it make sense to use VM's to set up a test environment? Or is it something that really needs bare metal?
[18:44] <sjust> thafreak: you would probably want bare metal for performance testing, but for the basics vms should be fine
[18:44] <jmlowe> you could set up your ceph in a vm, I shudder to think of the performance
[18:44] <iggy> for testing it'll probably work, but yeah, it really wants real hardware for anything other than basic poking around
[18:45] <thafreak> yeah, I wouldn't be doing performance stuff yet...i guess just seeing how involved it is to setup the basics...
[18:46] <thafreak> so has anyone done something where like you have say 2-4 vm hosts, and each has a few phys disks, and they replicate vm images between themselves?
[18:46] <thafreak> or do most people have the image stores on separate hardware than the vm hosts that need access to the disk images?
[18:47] <jmlowe> I have 2 x hp dl180g6 with a p800 connecting a msa60 with 12x1tb
[18:47] <iggy> for rbd, I think that would work, but for the FS, they suggest you don't put osd's in the machines using the fs
[18:47] <jmlowe> then 4x hp dl160g6 to run the vm's on, everything has 10GigE
[18:47] <iggy> you can run into memory deadlocks mounting ceph on the osds
[18:48] <iggy> you should verify with someone that knows better though
[18:48] <iggy> that's just the way I understand it
[18:48] <jmlowe> agreed, when something goes wrong you really want your vm's and storage hosting separate
[18:48] <jmlowe> based on my personal experience
[18:50] <iggy> yeah, I'd think you wouldn't want the contention of VMs on the machines running the OSDs
[18:51] <iggy> without copious testing anyway... I know a company who makes a product that specifically is setup that way
[18:52] * lollercaust (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[18:55] <thafreak> what kind of hardware do osd's need, if they're dedicated to just that?
[18:55] <thafreak> is it high cpu demand or does it need a lot of memory, or is it mostly I/O bound?
[18:56] <jmlowe> i/o and extra memory will never hurt
[18:56] <iggy> I want to say there was something in the wiki about this at one point
[18:58] <Tv> there is
[18:59] <Tv> http://ceph.newdream.net/wiki/Designing_a_cluster
[18:59] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:59] <thafreak> ha, yep I just found that
[18:59] <thafreak> thanks
[18:59] <Tv> that hasn't migrated to the new docs because i wish to give it more meat before it moves, and not just copy-paste
[18:59] <thafreak> so, cpu might not be too important...fastest io and lots of ram...
[19:00] <Tv> thafreak: cpu is important but most servers these days have more cpu than other resources
[19:00] <Tv> thafreak: don't use Atom cpus...
[19:00] <iggy> for just rbd, you take the mds out of the picture, so that's one less thing using cpu
[19:02] <Tv> fwiw we just got a pile of 1*6-core CPU+16GB RAM+100GB SSD+8*1TB HDD+10GigE boxes for testing
[19:02] <Tv> i can't say yet if that's a perfect balance or not
[19:03] <Tv> but that was our guess at what'll work ;)
[19:03] <iggy> that's for testing or an actual deployment?
[19:03] <Tv> testing
[19:04] <thafreak> geez...
[19:04] <gregaf> iggy: don't need btrfs for snapshots…it's more efficient but not required!
[19:04] <Tv> thafreak: that's a small fraction of the real test setup..
[19:04] <thafreak> I was just looking at the one vendor...disk drives are through the roof...stupid flooding!
[19:05] <iggy> gregaf: good to know
[19:05] <gregaf> nhorman_: you'll have to give us more details about what you're doing with cephx that's breaking
[19:05] <thafreak> everyone knows you should have redundancy right...i mean what were HD manufacturers thinking having everything in thailand...
[19:06] <iggy> any input on my supposition that memory deadlocks aren't a problem if using rbd vs actually mounting ceph?
[19:07] <Tv> iggy: rbd.ko will still manage a pool of pages to flush out, i think that's subject to the same issues
[19:08] <iggy> what about librbd (with qemu/kvm)?
[19:08] <Tv> iggy: that should be immune
[19:08] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:08] <gregaf> userspace memory is never a problem because the kernel can page it to local disk ;)
[19:08] <Tv> iggy: that's just userspace vs userspace, so the kernel gets to mediate better; same with ceph-fuse
[19:08] <Tv> (we think)
[19:09] <iggy> that's what I was thinking... just had to specify that it was librbd I was thinking about
[19:13] * fghaas (~florian@85-127-92-127.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[19:22] * nhorman_ is now known as nhorman
[19:22] <nhorman> gregaf, heres the story
[19:23] <nhorman> I've got cephx authentication enabled, using the default keyring location /etc/ceph/cephx (on a fresh Fedora 16 install)
[19:24] <nhorman> If, with cephx enabled, I try to mount the volume with -0 name=admin,secret=<secret>
[19:24] <nhorman> the mount command exits with mount error=1 Permission Denied
[19:24] * josh (~josh@50-46-198-131.evrt.wa.frontiernet.net) has joined #ceph
[19:25] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[19:25] <nhorman> I've enabled some dynamic debug and found that the failure is the result of the request_key operation in the kernel failing to find the "ceph" key in the user keyring
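A quick way to see whether the key really landed in a keyring the kernel can find is the keyctl tool from keyutils (a sketch; it assumes the mount was just attempted from the same shell):

    # sketch: inspect the keyrings after the failed mount attempt
    keyctl show        # tree of keyrings/keys visible to this session
    keyctl list @u     # contents of the user keyring
    # a successful mount.ceph run should have added a key of type "ceph"
    # whose description matches the name passed to mount (e.g. client.admin)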
[19:25] <josh> Hi
[19:25] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Read error: Connection reset by peer)
[19:25] <josh> I'm trying to setup ceph for the first time, but I have a problem with btrfs
[19:26] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[19:26] <josh> /sbin/mkcephfs -a --mkbtrfs -c mycluster.conf -k mycluster.keyring
[19:26] <nhorman> if I strace the mount command though, I see that the add_key call for the ceph key succeeded
[19:26] <josh> adding device /dev/sdc id 2
[19:26] <josh> adding device /dev/sdd id 3
[19:26] <josh> adding device /dev/sde id 4
[19:26] <josh> fs created label (null) on /dev/sdb
[19:26] <josh> nodesize 4096 leafsize 4096 sectorsize 4096 size 7.28TB
[19:26] <josh> Btrfs Btrfs v0.19
[19:26] <josh> Scanning for Btrfs filesystems
[19:26] <josh> ** WARNING: Ceph is still under development. Any feedback can be directed **
[19:26] <josh> ** at ceph-devel@vger.kernel.org or http://ceph.newdream.net/. **
[19:26] <josh> 2012-01-12 10:19:19.749510 7f5e0e2d2760 filestore(/srv/ceph/osd.0) FileStore::mkfs: failed to chmod /srv/ceph/osd.0/current to 0755: error 30: Read-only file system
[19:26] <josh> 2012-01-12 10:19:19.749557 7f5e0e2d2760 OSD::mkfs: FileStore::mkfs failed with error -30
[19:26] <nhorman> so I'm trying to do some stap debugging to nail it down further, but I'm finding stap is currently broken on i686 F16
[19:26] <josh> 2012-01-12 10:19:19.749590 7f5e0e2d2760 ** ERROR: error creating empty object store in /srv/ceph/osd.0: error 30: Read-only file system
[19:26] <josh> failed: '/sbin/mkcephfs -d /tmp/mkcephfs.mOIyIxpa8j --init-daemon osd.0'
[19:28] <nhorman> josh, did you let mkcephfs mount /srv/ceph/osd.0 for you, or did you do it manually?
[19:29] <Tv> nhorman: -O? capital?
[19:29] <nhorman> Tv, sorry, no, lower case -o
[19:29] <Tv> nhorman: oh your line actually looks like -0, dash zero..
[19:29] <josh> I let it mount it for me
[19:29] <Tv> nhorman: yeah -o is the right thing
[19:29] <josh> I also tried to mount it by hand.
[19:30] <josh> and with out --mkbtrfs
[19:30] <nhorman> Tv, my bad, yeah, the options are right, the key is correct (identical to /etc/ceph/keyring)
[19:30] <josh> The current directory is RO, but the rest of the FS is still R/W
[19:30] <nhorman> Tv, and I see the key get added, its just that the ceph kernel module can't find it via request_key
[19:30] <josh> This is on a stock centos 6.2 host.
[19:30] <nhorman> josh, the "current" directory, as in the one under /srv/ceph/osd.0/, is RO?
[19:31] <josh> [root@c01-sef ~]# df -h
[19:31] <josh> Filesystem Size Used Avail Use% Mounted on
[19:31] <josh> /dev/sda3 7.9G 6.1G 1.5G 82% /
[19:31] <josh> tmpfs 12G 0 12G 0% /dev/shm
[19:31] <josh> /dev/sda1 97M 44M 49M 47% /boot
[19:31] <josh> /dev/sda5 1.8T 199M 1.7T 1% /workplace
[19:31] <josh> /dev/sdb 7.3T 64K 7.3T 1% /srv/ceph/osd.0
[19:31] <josh> [root@c01-sef ~]# touch /srv/ceph/osd.0/T1
[19:31] <josh> [root@c01-sef ~]# touch /srv/ceph/osd.0/current/T1
[19:31] <josh> touch: cannot touch `/srv/ceph/osd.0/current/T1': Read-only file system
[19:33] <nhorman> josh, thats odd, I presume current doesn't have another disk mounted on it, does it?
[19:33] <josh> No, df is all filesystems
[19:34] <nhorman> josh, what does ls -l /srv/ceph/osd.0 report?
[19:34] <josh> [root@c01-sef ~]# ls -l /srv/ceph/osd.0/
[19:34] <josh> total 0
[19:34] <josh> drwx------ 1 root root 0 Jan 12 10:19 current
[19:34] <josh> -rw-r--r-- 1 root root 8 Jan 12 10:19 fsid
[19:34] <josh> -rw-r--r-- 1 root root 4 Jan 12 10:19 store_version
[19:34] <josh> -rw-r--r-- 1 root root 0 Jan 12 10:31 T1
[19:34] <josh> [root@c01-sef ~]#
[19:36] <nhorman> weird, mkcephfs created the current directory, right?
[19:36] <josh> Yes.
[19:36] <josh> I unmounted everything
[19:36] <josh> cleared the partition on the drives
[19:36] <josh> and then let mkcephfs do it's think
[19:36] <josh> think -> thing
[19:37] <josh> [root@c01-sef ~]# umount /srv/ceph/osd.0/
[19:37] <josh> [root@c01-sef ~]# for i in {b..l} ; do parted -s -- /dev/sd$i rm 1 ; done
[19:37] <josh> Error: /dev/sdb: unrecognised disk label
[19:37] <josh> Error: /dev/sdc: unrecognised disk label
[19:37] <josh> Error: /dev/sdd: unrecognised disk label
[19:37] <nhorman> weird, seems like a bug to me. You might try changing the permissions on current to 0755, and rerunning mkcephfs without the --mkbtrfs option as a workaround
[19:37] <josh> Error: /dev/sde: unrecognised disk label
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <josh> Error: Partition doesn't exist.
[19:37] <nhorman> um, thats not good
[19:37] <josh> [root@c01-sef ~]# /sbin/mkcephfs -a --mkbtrfs -c mycluster.conf -k mycluster.keyring
[19:37] <josh> temp dir is /tmp/mkcephfs.LA9c7O1PET
[19:37] <josh> preparing monmap in /tmp/mkcephfs.LA9c7O1PET/monmap
[19:38] <josh> /usr/bin/monmaptool --create --clobber --add a 10.10.64.150:6789 --add b 10.10.64.151:6789 --add c 10.10.64.152:6789 --print /tmp/mkcephfs.LA9c7O1PET/monmap
[19:38] <josh> /usr/bin/monmaptool: monmap file /tmp/mkcephfs.LA9c7O1PET/monmap
[19:38] <josh> /usr/bin/monmaptool: generated fsid 59996ae0-6bca-4146-8b3e-6597bc13ad01
[19:38] <josh> epoch 0
[19:38] <josh> fsid 59996ae0-6bca-4146-8b3e-6597bc13ad01
[19:38] <josh> last_changed 2012-01-12 10:37:37.998435
[19:38] <josh> created 2012-01-12 10:37:37.998435
[19:38] <josh> 0: 10.10.64.150:6789/0 mon.a
[19:38] <josh> 1: 10.10.64.151:6789/0 mon.b
[19:38] <josh> 2: 10.10.64.152:6789/0 mon.c
[19:38] <josh> /usr/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.LA9c7O1PET/monmap (3 monitors)
[19:38] <josh> === osd.0 ===
[19:38] <josh> umount: /srv/ceph/osd.0: not mounted
[19:38] <josh> umount: /dev/sdb: not mounted
[19:38] <josh> umount: /dev/sdc: not mounted
[19:38] <josh> umount: /dev/sdd: not mounted
[19:38] <josh> umount: /dev/sde: not mounted
[19:38] <josh> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
[19:38] <josh> WARNING! - see http://btrfs.wiki.kernel.org before using
[19:38] <josh> adding device /dev/sdc id 2
[19:38] <josh> adding device /dev/sdd id 3
[19:38] <josh> adding device /dev/sde id 4
[19:38] <josh> fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 7.28TB
[19:39] <josh> Btrfs Btrfs v0.19
[19:39] <josh> Scanning for Btrfs filesystems
[19:39] <josh> ** WARNING: Ceph is still under development. Any feedback can be directed **
[19:39] <josh> ** at ceph-devel@vger.kernel.org or http://ceph.newdream.net/. **
[19:39] <josh> 2012-01-12 10:37:38.526970 7feca2636760 filestore(/srv/ceph/osd.0) FileStore::mkfs: failed to chmod /srv/ceph/osd.0/current to 0755: error 30: Read-only file system
[19:39] <josh> 2012-01-12 10:37:38.527029 7feca2636760 OSD::mkfs: FileStore::mkfs failed with error -30
[19:39] <josh> 2012-01-12 10:37:38.527079 7feca2636760 ** ERROR: error creating empty object store in /srv/ceph/osd.0: error 30: Read-only file system
[19:39] <josh> failed: '/sbin/mkcephfs -d /tmp/mkcephfs.LA9c7O1PET --init-daemon osd.0'
[19:39] <josh> rm -rf /tmp/mkcephfs.LA9c7O1PET
[19:39] <josh> Once /usr/bin/ceph-osd seems to touch the directory it goes RO. I cannot get it back without the reformat
[19:39] <josh> [root@c01-sef ~]# umount /srv/ceph/osd.0
[19:39] <josh> [root@c01-sef ~]# mount /dev/sdb /srv/ceph/osd.0
[19:39] <josh> [root@c01-sef ~]# chmod 0755 /srv/ceph/osd.0/current
[19:39] <josh> chmod: changing permissions of `/srv/ceph/osd.0/current': Read-only file system
[19:39] <Tv> josh: anything in dmesg?
[19:40] <Tv> josh: if ceph-osd triggered an fs bug, that would explain why it goes ro
[19:40] <josh> device fsid 954042bbec6f8978-5f9afde4999a5cb1 devid 3 transid 16 /dev/sdd
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 2 transid 4 /dev/sdc
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 3 transid 4 /dev/sdd
[19:40] <josh> device fsid 954042bbec6f8978-5f9afde4999a5cb1 devid 4 transid 16 /dev/sde
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 1 transid 4 /dev/sdb
[19:40] <nhorman> wait a sec, you're not creating any partitions on these disks. That shouldn't be a problem, but I've always created a single partition on them prior to formatting, under the impression that it just makes things a bit more sane.
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 3 transid 4 /dev/sdd
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 2 transid 4 /dev/sdc
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 4 transid 4 /dev/sde
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 4 transid 7 /dev/sde
[19:40] * voidah (~voidah@pwel.org) Quit (Ping timeout: 480 seconds)
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 1 transid 7 /dev/sdb
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 3 transid 7 /dev/sdd
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 2 transid 7 /dev/sdc
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 1 transid 7 /dev/sdb
[19:40] <josh> device fsid a84ebf1e735a7079-57e10b5395abe49a devid 1 transid 13 /dev/sdb
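Note that the dmesg lines above show two different btrfs fsids being detected on the same devices, i.e. signatures from an earlier mkfs run are still being scanned. One possible cleanup step before re-running mkcephfs is to clear the old signatures (a sketch; destructive to anything on those disks):

    # sketch: wipe stale filesystem signatures from the osd devices
    # WARNING: destroys whatever is on /dev/sdb..sde
    for dev in /dev/sd{b,c,d,e}; do
        wipefs -a "$dev" 2>/dev/null || dd if=/dev/zero of="$dev" bs=1M count=16
    done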
[19:41] <nhorman> josh, can you try creating a single primary partition on each drive and specifying those as block devices instead
[19:41] <nhorman> i.e. btrfs devs /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
[19:41] <andreask> josh: ever heard of www.pastie.org?
[19:42] <nhorman> I know it shouldn't matter, but I wonder if there's something in the mkcephfs script that expects a partition suffix on the device names
[19:42] <josh> I have heard of pastie, but not used it recently.
[19:42] <josh> should I be pasting some stuff there instead of IRC?
[19:43] <josh> <script src='http://pastie.org/3173797.js'></script>
[19:45] <josh> I forgot to umount before creating a partition. Back in a few once the reboot is done.
[19:48] <nhorman> josh, copy that, btw, you can also use pastebin or fpaste.org. the latter is particularly nice because it's the default for the fpaste command line tool
[19:53] <nhorman> Tv, any thoughts regarding my cephx woes?
[19:53] <gregaf> he had to step out; sjust should be able to help you
[19:54] <nhorman> gregaf, thank you. sjust, any thoughts>'
[19:54] <nhorman> ?
[19:54] <sjust> nhorman: sorry, getting caught up on your conversation so far, one moment
[19:54] <gregaf> best I can suggest is to make sure that the key you're looking for is in fact in the file it's checking AND on the monitor
[19:54] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:55] <josh> No change with sd*1
[19:55] <josh> http://www.pastie.org/3173797
[19:56] <nhorman> gregaf, thank you, but I've not gotten to the monitor yet. It's the request_key call that's looking up the key in the client that's failing, despite an strace showing that the mount command's add_key call succeeded
[19:56] <gregaf> nhorman: err, let me go through what I know
[19:56] <nhorman> gregaf, copy that, thank you
[19:56] <gregaf> okay, so you're trying to use the client.admin key in your mount, right?
[19:56] <josh> I used Centos 6.2 and the ceph-0.39.tar.gz tarball to build this.
[19:57] <gregaf> nhorman: so let's make sure that key actually exists in the file you're accessing
[19:57] <gregaf> ceph-authtool can do that: "ceph-authtool /etc/ceph/cephx -l"
[19:57] <gregaf> will list all the keys and capabilities
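For reference, the listing from ceph-authtool usually looks roughly like the following (secret elided); the caps lines may be absent if, as gregaf notes below, capabilities were only set on the monitor:

    ceph-authtool /etc/ceph/cephx -l
    # expected output, roughly:
    # [client.admin]
    #         key = <base64 secret>
    #         caps mds = "allow"
    #         caps mon = "allow *"
    #         caps osd = "allow *"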
[19:58] <josh> Centos was the first distro choice, matches our other hosts, but is there a better distro to try?
[19:58] <gregaf> (my guess is it's not actually there)
[19:58] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[19:59] <gregaf> josh: debian/ubuntu are the easiest roads with Ceph (we develop on debian), but it should be fine on CentOS
[19:59] <nhorman> gregaf, http://fpaste.org/ccb6/
[19:59] <josh> I'll make an ext4 FS on a different disk... and try that out.
[20:00] <nhorman> josh, FWIW, I'm using F16, and with the exception of cephx, it all just works
[20:00] <josh> I wonder how that kernel compares to 6.2
[20:00] <nhorman> josh, just out of curiosity, those drives aren't some sort of shared back end storage are they?
[20:01] <nhorman> drives drop to RO when multiple initiators access them over iscsi or FCoE and they're not set up for that
[20:01] <josh> No, just 4 2T drives. Supermicro hardware.
[20:01] <gregaf> nhorman: umm, that's the entire output?
[20:01] * bchrisman (~Adium@108.60.121.114) Quit (Quit: Leaving.)
[20:01] <gregaf> I would expect it to have some caps as well (although strictly speaking it's not required IF they're specified on the monitor)
[20:01] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[20:01] <josh> if the whole mount went RO then I would feel better. It's weird that it's just a directory on the mount.
[20:02] <nhorman> gregaf, yes, thats the resultant keyring from the mkcephfs command
[20:02] <nhorman> I can add caps if need be easily enough, but, as noted, the mount command hasn't contacted the server when this failure occurs
[20:03] <gregaf> yeah
[20:04] <gregaf> I'm a little confused
[20:04] <josh> I tried 1 drive with btrfs, instead of 4, and that made no difference.
[20:04] <gregaf> you said it was failing to find the "ceph" key in the file, did you mean it was looking for that key?
[20:05] <nhorman> gregaf, no, sorry to be confusing. I didn't mean to use the word file. Heres a more detailed series of what I've noted:
[20:05] <joshd> gregaf: he's talking about the kernel keyring stuff
[20:05] <gregaf> joshd: yeah…do you know more about this than I do?
[20:05] <nhorman> joshd, yes, thats exactly it
[20:05] <gregaf> because I'm hella confused
[20:05] <nhorman> gregaf, what I'm seeing is that I run this command:
[20:05] <joshd> nhorman: there should be an error in syslog if the kernel can't read the key from the keyring
[20:06] * nhorman_ (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[20:06] <joshd> gregaf: yeah, I wrote the mount/rbd map handling of it
[20:06] <nhorman_> grr, sorry, OFTC just disconnected me
[20:06] * jojy (~jvarghese@108.60.121.114) has joined #ceph
[20:06] <nhorman_> anywho, I run this command:
[20:06] <nhorman_> mount -t ceph server:/ /mnt/ceph -o name=admin,secret=<secret key value>
[20:07] <nhorman_> in the strace I see that the add_key syscall succeeds, telling me that I've added the key with the description "ceph"
[20:07] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Read error: Operation timed out)
[20:07] * nhorman_ is now known as nhorman
[20:07] <nhorman> but then I also see a syslog message saying:
[20:07] <josh> with ext4 I don't have the RO issue, but my ssh key now asks for a passphrase. How can I tell what it was set to?
[20:08] <nhorman> mount error=1 : operation not permitted
[20:09] <nhorman> if I enable some extra dynamic debug I can see that get_secret in ceph_common is failing. Specifically it's failing because request_key is returning -ENOKEY and generating the ceph: Mount failed due to key not found
[20:10] * jmlowe (~Adium@129-79-134-204.dhcp-bl.indiana.edu) has joined #ceph
[20:11] <joshd> nhorman: is the key name that's set the same as the one that's not found?
[20:12] <josh> ok, put my old private key back in place. This executed correctly
[20:13] <nhorman> joshd, looking at it I don't think it is. The strace doesn't print out the key description, but the mount.ceph source has the key name hardcoded as "ceph", whereas the request_key operation takes the name that's passed as -o name=<name> and prepends "client."
[20:13] <nhorman> but that doesn't seem like it would have ever worked, so I'm guessing I'm misreading something.
[20:14] <joshd> nhorman: ceph is the key type, the name should be client.whatever
[20:15] <joshd> but if you're seeing "ceph: Mount failed due to key not found: ceph" I think that's the problem
[20:15] <nhorman> joshd, I agree, that certainly makes sense.
[20:15] * nhorman checks the mount.ceph source to see what description is being added
[20:15] * voidah (~voidah@pwel.org) has joined #ceph
[20:17] <josh> When should ceph -s start to respond?
[20:17] <josh> osd.*.log has scrub messages now
[20:17] <nhorman> joshd, looks like mount.ceph tries to prepend client. to the key name prior to calling get_secret_option, I wonder if that's not working
[20:19] <joshd> nhorman: if you trace the mount with gdb you can see the exact options being passed - they should include key=client.blah
[20:19] <nhorman> joshd, doing that now
[20:26] <nhorman> joshd, hmm, no love yet, right before calling add_key:
[20:26] <nhorman> (gdb) print key_name
[20:26] <nhorman> $2 = 0x8050b28 "client.admin"
[20:27] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[20:28] <nhorman> joshd, so it looks like the key_name is correct in mount, up in user space, but I still see this in /var/log/messages
[20:28] <joshd> nhorman: what's the return value of parse_options?
[20:28] <nhorman> Jan 12 14:27:15 shamino kernel: [20862.906001] libceph: ceph: Mount failed due to key not found: client.admin
[20:28] <nhorman> 1 sec, I'll tell you
[20:29] <joshd> nevermind, key= is set correctly
[20:29] <nhorman> set args shamino.rdu.redhat.com:/ /mnt/ceph -o name=admin,secret=AQDorw1PqJunKBAA3SvlD/DavpwNp31NoyerYw==
[20:30] <nhorman> oops, wrong clipboard
[20:30] <nhorman> joshd, (gdb) print popts
[20:30] <nhorman> $3 = 0x8050aa0 "name=admin,key=client.admin"
[20:30] <joshd> yeah, that's fine
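To summarize the flow being traced here (one reading of it, not authoritative): mount.ceph only stores the secret in the kernel keyring and passes the key description to the kernel, so both sides have to agree on the key type and name.

    # userspace (mount.ceph), as seen in the strace/gdb output above:
    #   add_key("ceph", "client.admin", <secret>)   # key stored in a keyring
    #   mount(2) with options "name=admin,key=client.admin"
    # kernel (libceph):
    #   request_key("ceph", "client.admin")         # must find that same key
    # the "Mount failed due to key not found" message means this lookup
    # returned ENOKEY, e.g. because the key sits in a keyring the kernel-side
    # lookup cannot see or read (the permission theory discussed next)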
[20:30] <nhorman> joshd, possible permission problem maybe?
[20:30] <joshd> seems to be so
[20:30] <nhorman> joshd, ugh
[20:30] <joshd> rbd.ko can't find the key in its keyrings
[20:32] <joshd> maybe because it doesn't have permissions for the user keyring
[20:32] <nhorman> that sounds like a reasonable theory
[20:32] <nhorman> I've got to jet, but I'll work down that path and get back up with you here later/tomorrow
[20:32] <nhorman> thanks for your help~!
[20:32] <joshd> you're welcome!
[20:33] <nhorman> talk to you all soon!
[20:33] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Quit: Leaving)
[20:43] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:47] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Remote host closed the connection)
[20:48] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:52] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:01] * edwardw`away is now known as edwardw
[21:14] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[21:28] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[21:37] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (Quit: Ex-Chat)
[21:37] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[22:01] * The_Bishop (~bishop@e179024117.adsl.alicedsl.de) has joined #ceph
[22:05] * jmlowe (~Adium@129-79-134-204.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[22:17] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:23] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[22:29] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[22:30] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[22:30] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Remote host closed the connection)
[22:30] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[22:45] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[23:41] <Tv> gregaf, sjust: http://staubman.com/blog/?p=67
[23:43] <sjust> :(
[23:44] <Tv> but clearly that is how a cephalopod would get in the cloud marketplace
[23:44] <Tv> *ba-dum tisch*
[23:45] <sjust> :( :(
[23:47] * elder (~elder@aon.hq.newdream.net) has joined #ceph
[23:57] <Tv> woo good news! i finally have credentials to the new servers!

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.