#ceph IRC Log


IRC Log for 2012-01-06

Timestamps are in GMT/BST.

[0:27] * BManojlovic (~steki@212.200.243.100) Quit (Remote host closed the connection)
[1:00] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[1:15] * MarkDude (~MT@c-67-170-237-59.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[1:31] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:40] * jmlowe (~Adium@mobile-166-147-096-188.mycingular.net) has joined #ceph
[2:36] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[2:37] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:05] * jojy (~jvarghese@108.60.121.114) Quit (Quit: jojy)
[3:16] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:25] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[3:41] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[5:22] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Ping timeout: 480 seconds)
[5:58] * MarkDude (~MT@c-67-170-237-59.hsd1.ca.comcast.net) has joined #ceph
[6:52] * aa (~aa@r190-132-64-34.dialup.mobile.ancel.net.uy) has joined #ceph
[7:18] * aneesh (~aneesh@122.248.163.4) has joined #ceph
[7:20] * aa (~aa@r190-132-64-34.dialup.mobile.ancel.net.uy) Quit (Remote host closed the connection)
[9:03] * verwilst (~verwilst@dD576F795.access.telenet.be) has joined #ceph
[9:19] <wido> sagewk: I'll give it a try today
[9:35] * BManojlovic (~steki@93-87-148-183.dynamic.isp.telekom.rs) has joined #ceph
[9:44] <guido> Hi, what does it mean if ceph -s is reporting a large number of pgs as "active+clean", but a smaller number (8) as just active? (And presumably not clean)
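[Aside] A minimal shell sketch for digging into this kind of question, assuming the ceph CLI of that era; output columns vary by version. A PG that is "active" but not "clean" is generally serving I/O while replication or recovery has not yet finished:
    # summarize cluster health and PG states
    ceph -s
    # dump per-PG details and filter out the fully healthy ones
    ceph pg dump | grep -v 'active+clean'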
[10:07] * fghaas (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) has joined #ceph
[10:22] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[10:36] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[10:36] * fghaas (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) Quit (Ping timeout: 480 seconds)
[10:58] * fronlius_ (~fronlius@testing78.jimdo-server.com) has joined #ceph
[10:58] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[10:58] * fronlius_ is now known as fronlius
[11:05] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[11:10] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[11:23] * MarkDude (~MT@c-67-170-237-59.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[13:11] * johnl (~johnl@2a02:1348:14c:1720:24:19ff:fef0:5c82) Quit (Quit: leaving)
[13:11] * johnl (~johnl@2a02:1348:14c:1720:24:19ff:fef0:5c82) has joined #ceph
[13:17] * johnl (~johnl@2a02:1348:14c:1720:24:19ff:fef0:5c82) Quit (Quit: leaving)
[13:17] * johnl (~johnl@2a02:1348:14c:1720:24:19ff:fef0:5c82) has joined #ceph
[13:19] * johnl (~johnl@2a02:1348:14c:1720:24:19ff:fef0:5c82) Quit ()
[13:19] * johnl (~johnl@2a02:1348:14c:1720:24:19ff:fef0:5c82) has joined #ceph
[13:27] * verwilst (~verwilst@dD576F795.access.telenet.be) Quit (Quit: Ex-Chat)
[13:28] * verwilst (~verwilst@dD576F795.access.telenet.be) has joined #ceph
[14:17] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) has joined #ceph
[14:22] * BManojlovic (~steki@93-87-148-183.dynamic.isp.telekom.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[15:07] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:17] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[15:51] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) has joined #ceph
[16:23] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[16:45] * fronlius_ (~fronlius@testing78.jimdo-server.com) has joined #ceph
[16:45] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[16:45] * fronlius_ is now known as fronlius
[17:04] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:08] * MarkDude (~MT@c-67-170-237-59.hsd1.ca.comcast.net) has joined #ceph
[17:11] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:12] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[17:12] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit ()
[17:13] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[17:44] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[17:45] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:34] * jojy (~jvarghese@108.60.121.114) has joined #ceph
[18:42] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:53] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[18:59] * jojy (~jvarghese@108.60.121.114) Quit (Quit: jojy)
[19:01] * jojy (~jvarghese@108.60.121.114) has joined #ceph
[19:19] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[19:25] * jmlowe (~Adium@mobile-166-147-096-188.mycingular.net) Quit (Ping timeout: 480 seconds)
[19:29] <sagewk> wido: thanks
[19:32] * fghaas (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) has joined #ceph
[19:40] * MarkDude (~MT@c-67-170-237-59.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[19:41] * jmlowe (~Adium@129-79-134-204.dhcp-bl.indiana.edu) has joined #ceph
[19:43] <nhm> joshd: your explanation in 1490 seems to make total sense. I'm bumming around in the osd code now to get an idea of how the server-side code is implemented. Any pointers on where I should look?
[19:43] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:44] * fghaas (~florian@85-127-93-41.dynamic.xdsl-line.inode.at) Quit (Quit: Leaving.)
[19:46] <joshd> nhm: not sure yet really - it could be in any of several layers
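[Aside] For orientation, a hedged sketch of the layers in the ceph.git tree of that era; paths are as I recall them, so verify against an actual checkout:
    ls src/osd/   # OSD daemon and PG logic (OSD.cc, PG.cc, ReplicatedPG.cc)
    ls src/os/    # object store backends (FileStore.cc, ObjectStore.h)
    ls src/msg/   # the messenger / networking layer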
[19:48] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[19:54] * BManojlovic (~steki@212.200.243.100) has joined #ceph
[19:54] <nhm> joshd: ok, let me know if you want a second (albeit noob) pair of eyes on it...
[19:57] <gregaf> nhm: if you're looking for something to do, check out http://ceph.newdream.net/wiki/Projects
[19:58] <gregaf> the wiki is slowly dying but that list is still about right
[19:59] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving)
[19:59] <elder> Question: Do directories have layouts?
[20:01] <nhm> greg: ok, I'll take a look. I'm mostly just trying to get a basic understanding of the code right now (sadly in the little free time I have).
[20:02] <gregaf> nhm: coolio — any reason in particular you're poking around?
[20:03] <gregaf> elder: yes, although they're different than the file layouts; directory layouts are defaults for their contained files :)
[20:03] <elder> But would it be meaningful to query a directory's layout?
[20:04] <gregaf> yes
[20:05] <gregaf> file layouts control how a file's data is laid out across the cluster; you can read them at any time, but you can only write them while the file doesn't yet have data
[20:05] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) has joined #ceph
[20:05] <elder> OK. Thanks. Currently that is not possible via the virtual xattr interface for a directory.
[20:05] <gregaf> directory layouts set the layout for all new files created underneath them
[20:05] <gregaf> ah, yeah
[20:05] <gregaf> right now you need to use the cephfs tool, which uses some ioctls
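[Aside] A hedged sketch of the cephfs tool usage gregaf mentions, with placeholder mount paths; the flags are as I recall them for that era, so check the tool's built-in help:
    # view the layout on a file or directory
    cephfs /mnt/ceph/somedir show_layout
    # set the default layout on a directory (stripe unit, stripe count, object size, pool)
    cephfs /mnt/ceph/somedir set_layout -u 4194304 -c 1 -s 4194304 -p data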
[20:06] <elder> Would it be possible to rename the ceph vxattr "ceph.layout" to be "ceph.file.layout"?
[20:06] <gregaf> probably?
[20:06] <gregaf> I don't think I implemented that interface; do you have any thoughts sagewk?
[20:07] <elder> At some point that sort of thing has to be set in stone, just wondering if we're there yet.
[20:07] <gregaf> there are unlikely to be any users right now
[20:07] <wido> sagewk: Haven't found the time yet, but I'll recompile now. I'll cherry-pick, however, since my cluster is running code from about 4 weeks ago
[20:07] <wido> that could lead to connection problems
[20:08] <sagewk> wido: thanks
[20:08] <sagewk> elder: yeah we can rename...
[20:08] <sagewk> elder: although dir layouts and file layouts are mutually exclusive
[20:09] <sagewk> (at least currently we don't have a concept of a directory layout separate from the policy)
[20:18] <elder> All the more reason to make the name more specific (i.e., ceph.dir.layout and ceph.file.layout)
[20:19] <sagewk> elder: yeah
[20:20] <sagewk> elder: go ahead and make the change, sooner is better!
[20:20] <sagewk> we should expose the policy as ... ceph.dir.layout_policy maybe?
[20:20] <sagewk> i think in the kclient it's the same field in the inode
[20:21] <elder> I don't know, I'll leave that up to you. I'm just going by the (in)consistency I am finding in fs/ceph/xattr.c
[20:21] <gregaf> it's the same field everywhere; that's why they're exposed as the same thing
[20:21] <gregaf> they actually were separate server-side but got merged into one since the file layout was never used on directories
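[Aside] A hedged sketch of how reading these layouts looks through the virtual xattr interface, using the names proposed above (at the time of this log only ceph.layout existed; the paths are placeholders):
    # read a file's layout via its virtual xattr
    getfattr -n ceph.file.layout /mnt/ceph/somefile
    # read a directory's default layout
    getfattr -n ceph.dir.layout /mnt/ceph/somedir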
[20:21] <wido> I'm getting a "Floating point exception" with some test code, am I missing something? http://pastebin.com/1962RF7L
[20:22] * fronlius (~fronlius@f054099149.adsl.alicedsl.de) has joined #ceph
[20:22] <sagewk> gregaf: actually i never merged that branch i think...
[20:22] <elder> I will leave the old name in place as an alias for the time being. Not sure what deprecation procedures you guys follow.
[20:22] <sagewk> wido: i think you need bfbeae68c045de76ede86ca4f72d2a760a19c84b
[20:23] <wido> sagewk: Figured so, I'm still running 0.39
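[Aside] A minimal sketch of the cherry-pick wido describes, assuming a checkout of ceph.git on the branch currently deployed; the autotools build was the norm at the time:
    git cherry-pick bfbeae68c045de76ede86ca4f72d2a760a19c84b
    ./autogen.sh && ./configure && make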
[20:23] <elder> Another question. What do you envision might be a writable virtual xattr?
[20:23] <sagewk> elder: setting the file or dir layout via the xattr instead of the ioctl
[20:24] <elder> OK, that's what I sort of thought.
[20:24] <elder> Currently though, if a vxattr gets written its value gets saved along with all the other non-virtual ones.
[20:25] <elder> I would expect it would be better to implement the change for real (like the ioctl), and then keep using the existing virtual callback to get the updated value.
[20:26] <sagewk> elder: ah.. we should disable vxattr writes i think. none of them are actually implemented...
[20:26] <elder> Can you think of a reasonable virtual xattr for which it makes more sense to actually save the value as if it were not a virtual xattr? It seemed strange to me as it is written now.
[20:26] <elder> Correct.
[20:26] <elder> None are writable right now.
[20:26] <elder> But there is a concept of whether they are writable, which is what raised the question.
[20:27] <sagewk> ah. yeah, it's only partially implemented, that's why..
[20:27] <sagewk> i don't think the writeable xattr should modify the xattr directly, but should trigger some server request (e.g. setlayout).
[20:27] <elder> Yes, I completely agree.
[20:28] <elder> And that's why I thought the saving of a written value into the "normal" xattrs was odd.
[20:28] <jmlowe> So has anybody here had trouble doing snapshots with rbd?
[20:28] <sagewk> yeah, that's wrong. but it can't happen because no writeable xattrs are defined yet, right? or am i misremembering?
[20:28] <elder> (Even though, at the moment, there is no vxattr that is writable.)
[20:28] <sagewk> gotcha.
[20:29] <jmlowe> I've tried it twice now and had deadlocked osds shortly afterwards, ceph 0.39 btw
[20:29] <elder> There are a few bits in there that implement it the wrong way. I will create a patch to un-do that behavior.
[20:29] <elder> Once we want to enable a writable one we can do it "right."
[20:30] <sagewk> sounds good
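[Aside] For context, a hedged sketch of how writable layout vxattrs eventually surfaced in later ceph releases (not available in the code under discussion here); individual fields are set through setfattr and translated into a server-side layout change, with placeholder file and pool names:
    # only works on a file with no data yet
    setfattr -n ceph.file.layout.stripe_count -v 2 somefile
    # set the default pool for new files under a directory
    setfattr -n ceph.dir.layout.pool -v data somedir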
[20:30] <joshd> jmlowe: that's certainly not expected - our nightly tests do rbd snapshots with no problems
[20:31] <joshd> jmlowe: what clients are running while you do this, and what are they doing?
[20:34] <jmlowe> most recently, I'm running kvm vms against rbd and I wanted to resize, so I took down the vm, took a snapshot, ran rbd map on the vm host, modified the partition table for /dev/rbd0, unmapped, started the vm, and noticed 3/12 osds were deadlocked
[20:35] <jmlowe> first time was similar but I don't recall the exact order, didn't seem important until I had trouble the second time a month or two later
[20:36] <jmlowe> vm host is ubuntu 11.10 with 3.0.0-14-server kernel
[20:38] <jmlowe> probably not important but qemu 0.15.1 with patch to use async librados
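[Aside] A hedged sketch of the manual resize flow jmlowe describes, with placeholder image names; rbd map/unmap drive the kernel rbd client from the command line, and the CLI of that era also offered rbd resize for growing the image itself:
    rbd map myimage          # expose the image as /dev/rbd0 on the VM host
    fdisk /dev/rbd0          # adjust the partition table by hand
    rbd unmap /dev/rbd0
    # alternatively, grow the image directly (size in MB on old releases)
    rbd resize --size 20480 myimage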
[20:38] <joshd> what are the deadlocked osds doing? can you strace -p PID one?
[20:39] * MarkDude (~MT@64.134.237.13) has joined #ceph
[20:39] <joshd> or is it just some objects that aren't available?
[20:40] <jmlowe> it's a bit late now, I don't dare risk it again, I just spent several hours last night putting things back together after making a new fs
[20:41] <jmlowe> by deadlocked I mean they are in uninterruptible wait
[20:42] <joshd> that sounds like an underlying filesystem problem then
[20:42] <joshd> i.e. btrfs
[20:43] <joshd> you might have some errors in your syslog
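[Aside] A hedged sketch of checking for the hung-task symptoms being discussed; this is standard Linux tooling, nothing ceph-specific, and PID is a placeholder:
    dmesg | grep -iE 'hung task|btrfs'                # blocked-task warnings or btrfs errors
    ps -eo pid,stat,wchan:30,cmd | grep ceph-osd      # 'D' in STAT = uninterruptible sleep
    cat /proc/PID/stack                               # kernel stack of a stuck thread, if available
    strace -p PID                                     # per joshd's suggestion above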
[20:44] <jmlowe> that makes some sense, I eventually lost everything due to losing the underlying btrfs, maybe it was sick to start with and I didn't know it and the snapshot operations pushed it over the edge
[20:44] <joshd> I'd suggest upgrading the kernel if possible
[20:44] <jmlowe> mainline?
[20:44] <jmlowe> we are up to 3.2 these days?
[20:45] <joshd> it was just released
[20:46] <joshd> fwiw we test on our kernel repo (https://github.com/NewDreamNetwork/ceph-client)
[20:47] <jmlowe> is that guy at oracle ever going to release a fsck that can fix something?
[20:49] <jmlowe> oh, does that "remove rollback due to bugs" change mean you can't revert to an earlier snapshot for the time being?
[20:50] <joshd> you can with the rbd command line tool, just not using the kernel's sysfs interface
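[Aside] A hedged sketch of rolling back with the rbd CLI rather than the kernel's sysfs interface; the pool/image@snap form follows later documentation, and older releases spelled it with --snap=<name>:
    rbd snap ls rbd/myimage                  # list snapshots of the image
    rbd snap rollback rbd/myimage@mysnap     # roll the image back to that snapshot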
[20:50] <jmlowe> ah, ok
[20:50] <jmlowe> it's been a long week
[20:52] <jmlowe> so in the future I should make very sure the btrfs filesystems backing ceph are healthy before attempting snapshots
[20:53] <joshd> a lot of things may fail if btrfs isn't happy
[20:55] <darkfaded> one of the little points where you notice linux has no fs management api
[20:59] <jmlowe> well I need to go clean the metaphorical egg off my face and finish up a tiling job that took a back seat to the day job, thanks for the advice and thanks for the hand holding yesterday
[21:00] <joshd> you're welcome
[21:00] * jmlowe (~Adium@129-79-134-204.dhcp-bl.indiana.edu) has left #ceph
[21:47] * izdubar (~MT@sjc-static-208.57.178.16.mpowercom.net) has joined #ceph
[21:53] * MarkDude (~MT@64.134.237.13) Quit (Ping timeout: 480 seconds)
[22:19] * vodka (~paper@41.Red-88-15-116.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[23:44] * izdubar (~MT@sjc-static-208.57.178.16.mpowercom.net) Quit (Read error: Connection reset by peer)
[23:54] * verwilst (~verwilst@dD576F795.access.telenet.be) Quit (Quit: Ex-Chat)
[23:58] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.