#ceph IRC Log

Index

IRC Log for 2011-10-18

Timestamps are in GMT/BST.

[0:02] * slang (~slang@chml01.drwholdings.com) Quit (Ping timeout: 480 seconds)
[0:33] * adjohn is now known as Guest13850
[0:33] * Guest13850 (~adjohn@50.0.103.34) Quit (Read error: Connection reset by peer)
[0:34] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[0:34] <jmlowe> there isn't an up-to-date ubuntu ppa for ceph, is there?
[0:39] <joshd> jmlowe: http://ceph.newdream.net/docs/latest/ops/install/mkcephfs/#debian-ubuntu or http://ceph.newdream.net/docs/latest/ops/autobuilt/, depending on how recent you want
[0:39] <joshd> the latter should be considered unstable
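
For reference, the Debian/Ubuntu route in the first link amounts to adding the Ceph package repository and installing from it. A minimal sketch, assuming the ceph.newdream.net repo layout of the time (the repo line and release codename are illustrative; check the linked page for the current ones):

    # add the Ceph repo for your release (codename illustrative)
    echo "deb http://ceph.newdream.net/debian/ oneiric main" | sudo tee /etc/apt/sources.list.d/ceph.list
    sudo apt-get update
    sudo apt-get install ceph
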
[0:40] * adjohn (~adjohn@50.0.103.34) Quit (Read error: No route to host)
[0:40] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[0:41] <jmlowe> I was planning on making a ppa with qemu-kvm-0.15.1+rbd with the 7a3f5fe commit patched in, but I can't seem to figure out an easy way to make launchpad build against librbd 0.36
[0:48] <joshd> that would be useful - I know nothing about launchpad though
[0:51] <jmlowe> apparently they only accept source packages and they build everything themselves, but if your deps aren't in a ppa or an official ubuntu repository then you are out of luck
[0:51] * adjohn (~adjohn@50.0.103.34) Quit (Read error: Connection reset by peer)
[0:51] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[0:55] <jmlowe> I guess if feature 1618 needs a patch I'll create a package for that as well
[0:58] <joshd> that'd be great
[1:01] * conner (~conner@leo.tuc.noao.edu) Quit (Ping timeout: 480 seconds)
[1:01] * lxo (~aoliva@lxo.user.oftc.net) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * gohko (~gohko@natter.interq.or.jp) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * failbaitr (~innerheig@62.212.76.29) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * dweazle (~dweazle@dev.tilaa.nl) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * f4m8_ (~f4m8@lug-owl.de) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * wonko_be (bernard@november.openminds.be) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * nms (martin@sexyba.be) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * Anticimex (anticimex@netforce.csbnet.se) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) Quit (reticulum.oftc.net kilo.oftc.net)
[1:01] * dweazle (~dweazle@dev.tilaa.nl) has joined #ceph
[1:01] * f4m8 (~f4m8@lug-owl.de) has joined #ceph
[1:01] * Anticimex (anticimex@netforce.csbnet.se) has joined #ceph
[1:01] * nms (martin@sexyba.be) has joined #ceph
[1:01] * wonko_be (bernard@november.openminds.be) has joined #ceph
[1:01] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) has joined #ceph
[1:02] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[1:02] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[1:02] * gohko (~gohko@natter.interq.or.jp) has joined #ceph
[1:06] * failbaitr (~innerheig@62.212.76.29) has joined #ceph
[1:09] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[1:25] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[1:29] * cyb0org (~cyb0org@89-73-22-110.dynamic.chello.pl) has joined #ceph
[1:30] * cyb0org (~cyb0org@89-73-22-110.dynamic.chello.pl) Quit ()
[1:32] * cyb0org (~cyb0org@89-73-22-110.dynamic.chello.pl) has joined #ceph
[1:37] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) has joined #ceph
[1:46] <cyb0org> Hi. Is there any way to restrict a client's access to a specified dir only? For example, client.X would have /ceph/osd/user/X/ set as its mount root, with no ability to go up in the hierarchy
[1:47] * Tv (~Tv|work@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:52] <joshd> cyb0org: I don't think that's possible currently, but there's a feature request for it: http://tracker.newdream.net/issues/1237
[1:57] <cyb0org> thanks. I have to admit I haven't searched the bug tracker for it
[1:57] * MarkN (~nathan@142.208.70.115.static.exetel.com.au) has left #ceph
[2:13] * cyb0org (~cyb0org@89-73-22-110.dynamic.chello.pl) Quit (Quit: Ceph will rule ;))
[2:16] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Remote host closed the connection)
[2:20] * adjohn (~adjohn@50.0.103.34) Quit (Remote host closed the connection)
[2:21] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[2:33] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:59] * bchrisman (~Adium@108.60.121.114) Quit (Quit: Leaving.)
[3:12] * cp (~cp@206.15.24.21) Quit (Quit: cp)
[3:14] * jojy (~jojyvargh@108.60.121.114) Quit (Quit: jojy)
[3:36] * adjohn is now known as Guest13883
[3:36] * adjohn (~adjohn@50.0.103.34) has joined #ceph
[3:37] * adjohn (~adjohn@50.0.103.34) Quit ()
[3:43] * Guest13883 (~adjohn@50.0.103.34) Quit (Ping timeout: 480 seconds)
[4:02] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:12] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[4:13] * jojy (~jojyvargh@75-54-231-2.lightspeed.sntcca.sbcglobal.net) Quit ()
[4:47] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[4:55] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Ping timeout: 480 seconds)
[5:08] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) Quit (Quit: This computer has gone to sleep)
[6:14] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[6:31] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[6:42] * f4m8 is now known as f4m8_
[6:43] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[7:12] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[7:18] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[8:26] * yehuda_hm (~yehuda@99-48-179-68.lightspeed.irvnca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[8:33] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[8:35] * fronlius (~Adium@f054110235.adsl.alicedsl.de) has joined #ceph
[8:53] <NaioN> I had the problem where the mds gets killed by the OOM killer
[8:53] <NaioN> I noticed this option: OPTION(mds_mem_max, 0, OPT_INT, 1048576)
[8:53] <NaioN> Is that number the maximum number of PAGES that the MDS allocs?
[8:54] <NaioN> Because then it would make sense if I have a MDS with only 2G of ram and the option is set to 1048576 * 4k = 4G of mem...
[9:12] * fronlius1 (~Adium@testing78.jimdo-server.com) has joined #ceph
[9:12] * fronlius1 (~Adium@testing78.jimdo-server.com) Quit ()
[9:14] * fronlius1 (~Adium@f054110235.adsl.alicedsl.de) has joined #ceph
[9:14] * fronlius (~Adium@f054110235.adsl.alicedsl.de) Quit (Read error: Connection reset by peer)
[9:26] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[9:35] <chaos__> is there any way to list active mons?
[9:35] <chaos__> ceph mon list doesn't work
[9:36] <chaos__> and i cannot find documentation for 'ceph mon'
[9:44] <chaos__> well, i could parse ceph -s, but it's a crappy solution
[10:04] * jmlowe (~Adium@mobile-166-137-141-206.mycingular.net) has joined #ceph
[11:40] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[14:39] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) has joined #ceph
[15:09] * n0de (~ilyanabut@c-24-127-204-190.hsd1.fl.comcast.net) Quit (Quit: This computer has gone to sleep)
[15:10] * huangjun (~root@113.106.102.8) has joined #ceph
[15:17] * huangjun (~root@113.106.102.8) Quit (Quit: leaving)
[16:05] <wido> chaos__: Not really at the moment; there has been some talk about a monitoring library
[16:05] <wido> but currently ceph -s is your friend
[16:16] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) has joined #ceph
[16:22] * adjohn (~adjohn@50-0-92-177.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[16:26] * jmlowe (~Adium@mobile-166-137-141-206.mycingular.net) Quit (Quit: Leaving.)
[16:46] * Nightdog (~karl@190.84-48-62.nextgentel.com) has joined #ceph
[16:50] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[16:59] <chaos__> wido, thanks.. so parsing ceph -s here i come :/
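
A rough sketch of that workaround in Python, assuming ceph -s output of the era contained a monitor-map line like "mon e1: 3 mons at {a=10.0.0.1:6789/0,b=10.0.0.2:6789/0}"; it scrapes human-readable output, so treat it as fragile, not a stable interface:

    import re
    import subprocess

    def list_mons():
        # run 'ceph -s' and pull the "... mons at {name=addr,...}" line apart
        out = subprocess.check_output(["ceph", "-s"]).decode()
        m = re.search(r"mons at \{([^}]*)\}", out)
        if not m or not m.group(1):
            return {}
        # split the comma-separated "name=addr" pairs into a dict
        return dict(pair.split("=", 1) for pair in m.group(1).split(","))

    if __name__ == "__main__":
        print(list_mons())
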
[17:48] * tserong (~tserong@58-6-103-205.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[17:48] * bencherian (~bencheria@cpe-76-173-232-163.socal.res.rr.com) has joined #ceph
[17:54] * Tv (~Tv|work@aon.hq.newdream.net) has joined #ceph
[17:58] * tserong (~tserong@58-6-129-199.dyn.iinet.net.au) has joined #ceph
[18:42] * jojy (~jojyvargh@108.60.121.114) has joined #ceph
[18:43] * bencherian (~bencheria@cpe-76-173-232-163.socal.res.rr.com) Quit (Quit: bencherian)
[18:43] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:01] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[19:09] <joshd> NaioN: mds_mem_max actually isn't used right now - but it was the max size of the mds' cache in KB. now the cache is controlled by mds_cache_size, which is a number of inodes
[19:10] <joshd> NaioN: decreasing mds_cache_size might help your memory consumption problem
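
A sketch of what that tuning looks like in ceph.conf, assuming the commonly cited default of 100000 inodes (note that by joshd's description the old mds_mem_max was in KB, so NaioN's 1048576 would have meant 1 GB, not 1048576 pages):

    [mds]
        ; cache at most this many inodes (default believed to be 100000)
        mds cache size = 50000
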
[19:14] * adjohn (~adjohn@68.65.169.180) has joined #ceph
[19:17] * adjohn (~adjohn@68.65.169.180) Quit ()
[19:17] * adjohn (~adjohn@68.65.169.213) has joined #ceph
[19:18] * adjohn (~adjohn@68.65.169.213) Quit ()
[19:21] <df__> today's fun: log 2011-10-18 17:20:04.226375 mds0 172.29.190.30:6804/6911 12 : [ERR] rfiles underflow -1 on [inode 1000010d089 [...2,head] /andrea-tsm-intra/i_he/slideediting/ auth v11730 pv11750 ap=1+0 f(v0 m2011-10-18 17:20:03.453891 1=0+1) n(v31 rc2011-10-18 17:20:03.453891 b-52 1=-1+2) (inest lock->sync w=1 dirty) (ifile excl) (iversion lock) caps={6438=pAsLsXsFsx/-@10},l=6438 | dirtyscattered lock dirfrag caps dirty authpin 0x6f6d8a0]
[19:25] * jmlowe (~Adium@129-79-134-52.dhcp-bl.indiana.edu) has joined #ceph
[19:26] <df__> and: $ ls -lhd andrea-tsm-intra/i_he/
[19:26] <df__> drwxrwsr-x 1 andrea rd 16E Oct 18 17:21 andrea-tsm-intra/i_he/
[19:29] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[19:32] <Tv> 16E?
[19:32] <df__> that'd be wrong
[19:32] <Tv> oh so that's on cephfs and the subtree size trackng failed?
[19:32] <df__> yes
[19:32] <Tv> as in, exabytes
[19:33] <df__> yes
[19:33] <df__> on a 52T cluster
[19:33] <Tv> yeah i see where that log entry is created
[19:33] <Tv> oopsies
[19:34] <Tv> df__: as usual, we'd love a bug report with plenty of information
[19:35] <df__> i suspect this might be related to a load of mds "mismatch between child accounted_rstats and my rstats!" errors and the like
[19:36] <df__> is there any method to cause the mds to sort itself out?
[19:36] <Tv> df__: sounds probable.. sadly i don't know much about that accounting, and of our best mds developers one is sick and another at a conference
[19:38] <Tv> http://tracker.newdream.net/issues/879 looks relevant
[19:38] <Tv> that commit is in master these days
[19:39] <Tv> oh it's not anything you could kick into action i guess
[19:52] * aliguori (~anthony@32.97.110.59) has joined #ceph
[19:56] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:01] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:07] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[20:07] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:27] <NaioN> joshd: thx will try that!
[20:28] <NaioN> how does mds_cache_size translate to MB/GB?
[20:29] <NaioN> how much space does a cached inode use?
[20:32] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[20:33] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:33] <wido> gregaf: http://idle3-tools.sourceforge.net/
[20:33] <wido> that was what I needed :-) It disables the power saving functions of the WD20EARS
[20:34] <wido> Most of my disks had around 90k starts/stops
[20:34] <wido> So, I hope no more dead disks in my cluster
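
For anyone else with parking-happy WD Green drives: a sketch of checking the damage with smartmontools and then disabling the idle3 timer with idle3ctl from the linked idle3-tools project (device path illustrative; the drive typically needs a power cycle before the new setting takes effect):

    # attribute 193 (Load_Cycle_Count) shows the head-parking count wido mentions
    smartctl -A /dev/sdb | grep -i load_cycle
    # read the current idle3 timer, then disable it
    idle3ctl -g /dev/sdb
    idle3ctl -d /dev/sdb
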
[20:35] <joshd> NaioN: I'm not sure, unfortunately - gregaf and sagewk would know, but they're sick and at a conference today
[20:35] * bencherian (~bencheria@aon.hq.newdream.net) has joined #ceph
[20:35] <Tv> wido: greg's sick today, i don't expect him to be active on irc
[20:37] <wido> Tv: Oh, ok :) Well, might be useful for others as well
[20:37] <jmlowe> joshd: any suggestions for what I should do with the qemu 0.15.1 package I built for oneiric, would you guys be interested in hosting it?
[20:37] <wido> who have WDs crashing all the time
[20:40] <NaioN> joshd: no problem, i'll experiment a little with the values
[20:41] <bugoff> /30/7
[20:46] <joshd> jmlowe: probably? I'm not sure how our debian repos are setup - maybe wait for sage to get back tomorrow
[20:47] <jmlowe> joshd: I'd hate for anybody else to have to learn how to repackage with debian like I just did
[20:51] <Tv> jmlowe: it shouldn't go in the apt repo we use for ceph itself, but we might put it somewhere else.. though for that, it really should be a git repo, with a gitbuilder, etc
[20:51] <Tv> actually, i'd expect ubuntu people to be interested in having a ppa of a newer kvm
[20:52] <jmlowe> I went down the launchpad path, gave up when I realized I'd need to maintain my own ceph ppa in order to build against librbd
[20:52] <Tv> two months ago, i dropped the kvm 0.14 debian diff into a 0.15 source tree, it built and worked just fine after ripping out 5 patches that were already upstream at that point
[20:54] <jmlowe> mine has the 7a3f5fe commit, VMs are actually usable with it, rbd wasn't viable for me without it
[20:54] <jmlowe> kernel untar went from 13min down to 34sec
[20:55] <jmlowe> I'd get lynched if I made my users wait more than 20 minutes for yum update to run
[20:56] <Tv> jmlowe: yeah i didn't really benchmark, just ran an ubuntu desktop with some animation in a web browser and live migrated that back and forth -- it was definitely good enough for light web browsing
[20:59] <jmlowe> it'll be a lot less work when the async patch hits a release, would just be a matter of running the script that rebases a package on a new upstream version and ripping out all of the patches
[21:00] * bencherian (~bencheria@aon.hq.newdream.net) Quit (Quit: bencherian)
[21:02] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) has joined #ceph
[21:04] * bencherian (~bencheria@aon.hq.newdream.net) has joined #ceph
[21:07] * bencherian (~bencheria@aon.hq.newdream.net) Quit ()
[21:08] * bencherian (~bencheria@aon.hq.newdream.net) has joined #ceph
[21:21] * sandeen_ (~sandeen@sandeen.net) has joined #ceph
[21:46] * votz (~votz@pool-108-52-121-23.phlapa.fios.verizon.net) has joined #ceph
[21:58] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Read error: Connection reset by peer)
[21:58] * Nightdog (~karl@190.84-48-62.nextgentel.com) has joined #ceph
[22:10] <sandeen_> sage had suggested using the "streamtest" binary to try to reproduce the issue with extended attributes on ext4 (if anyone is familiar with that)
[22:11] <sandeen_> but it doesn't seem to be doing any of the threaded stuff that we saw causing problems. Does anyone know if I need to set up a ceph.conf or something to make it behave this way?
[22:11] <sandeen_> I'm sadly ceph-illiterate, sorry.
[22:13] <Tv> sjust: hey you talked about ^ that earlier today
[22:13] <Tv> sjust: any more clue on how to repro outside of ceph
[22:19] <yehudasa> sandeen_: from what I understand looking at this test, it should be doing the flusher handoff
[22:20] <yehudasa> sandeen_: unless you have configured 'filestore flusher = false'
[22:20] <yehudasa> sandeen_: but sjust will be around in a few minutes, I think he'll know better
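
The option yehudasa mentions goes in the OSD section of ceph.conf; a minimal fragment of the non-default setting he describes:

    [osd]
        ; disable the filestore flusher thread handoff
        filestore flusher = false
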
[22:21] * fronlius1 (~Adium@f054110235.adsl.alicedsl.de) Quit (Quit: Leaving.)
[22:24] <sandeen_> yehudasa, I dropped the sample ceph.conf into /etc/ceph but made no changes; I see no reference to "flush" in there ...
[22:24] <sandeen_> but maybe I need a bit more configuration to make this thing do the right thing :)
[22:31] * sagelap2 (~sage@soenat3.cse.ucsc.edu) has joined #ceph
[22:36] <sandeen_> sage! :)
[22:36] * sandeen_ is having trouble making streamtest DTRT :(
[22:39] <sagelap2> problems reproducing the problem, or generating any fs workload at all?
[22:40] <sandeen_> the latter, I guess, but I am woefully uninformed
[22:40] <sandeen_> guessing I need a proper ceph.conf
[22:40] <sagelap2> shouldn't for streamtest..
[22:40] <sandeen_> hm
[22:40] <sandeen_> I see no threads whatsoever
[22:40] <sandeen_> I see one big file created
[22:40] <sandeen_> I see 2 xattrs set, and that's it.
[22:41] <sagelap2> the process should have several threads, though
[22:41] <sandeen_> well, I do see 10 child streamtests yes.
[22:42] <sagelap2> let me give it a spin
[22:42] <sagelap2> does strace show it doing the same close-in-different-thread thing?
[22:42] <sandeen_> well *cough* I forgot to trace children
[22:42] <sandeen_> checking now
[22:42] <sagelap2> hehe
[22:43] <sandeen_> I don't see any of the <... blah resumed> in the strace
[22:44] <sandeen_> hm but I do see the sync_file_ranges etc in various threads
[22:44] <sandeen_> ok, let me look more. I guess I expected more files to be created ...
[22:44] <sagelap2> streamtest was a simple tool to test latency of small streaming writes.
[22:44] <sagelap2> could be easily modified to write to many objects instead of appending to one
[22:44] <sandeen_> sync_file_range(0x9, 0x300000, 0x100000, 0x2) = 0
[22:44] <sandeen_> close(9) = 0
[22:44] <sandeen_> ok, so I do see that in one thread.
[22:45] * bencherian (~bencheria@aon.hq.newdream.net) Quit (Quit: bencherian)
[22:46] <sandeen_> ok that may be what it takes then
[22:46] * sandeen_ tries
[22:47] <sagelap2> the <blah resumed> stuff just means that two threads are making those two syscalls in parallel, right?
[22:48] <sandeen_> well, I think so
[22:48] <sandeen_> ok I'm writing tons of files now ;)
[22:49] <sandeen_> let's see if that does it
[22:49] <sandeen_> sorry, just finding my way around this ceph-y thing.
[22:49] <sandeen_> nope, clean.
[22:50] <sandeen_> If a system call is being executed and meanwhile another one is being
[22:50] <sandeen_> called from a different thread/process then strace will try to preserve
[22:50] <sandeen_> the order of those events and mark the ongoing call as being unfinished.
[22:50] <sandeen_> When the call returns it will be marked as resumed.
[22:51] * sandeen_ puts it in a loop
[22:51] <sagelap2> could it be that multiple threads are interacting with the fs but not necessarily the same file? like not taking the allocation mutex or something?
[22:53] <sagelap2> http://fpaste.org/OPOV/ will get multiple threads involved
[22:54] <sagelap2> submitting io
[22:58] <sandeen_> ok
[22:58] * sandeen_ tries
[22:58] * bencherian (~bencheria@aon.hq.newdream.net) has joined #ceph
[23:00] <sandeen_> http://fpaste.org/jkea/ is what I have now, including making a new file on each pass through the loop
[23:04] <darkfader> i just read ikea :(
[23:04] <sandeen_> :)
[23:06] <darkfader> btw i found a presentation on ceph by german grau data... i think it's the best so far: http://www.google.de/url?sa=t&source=web&cd=1&ved=0CCUQFjAA&url=http%3A%2F%2Fgraudata.com%2Fsites%2Fdefault%2Ffiles%2Ffile%2FScalable-OpenSource-Storage.pdf&ei=2umdTouBGq7T4QSB47SbCQ&usg=AFQjCNEBwlSF39PDT_y6HtkVFj6iQY7YJw
[23:06] <darkfader> arghl
[23:06] <darkfader> nvm i always fall for that
[23:06] <darkfader> http://graudata.com/sites/default/files/file/Scalable-OpenSource-Storage.pdf
[23:06] <sandeen_> heh, that is the one thing I hate about google. how -do- you get the link behind it?
[23:06] <darkfader> click it
[23:06] <darkfader> copy after it loaded
[23:06] <sandeen_> heh
[23:06] <sandeen_> yeah
[23:07] <darkfader> or maybe python + urllib :)
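
darkfader's "python + urllib" quip would in fact work: Google's redirect links carry the real destination, percent-encoded, in the url= query parameter. A small sketch:

    from urllib.parse import parse_qs, urlparse

    def unwrap_google_redirect(link):
        # the real target is in the 'url' query parameter; parse_qs percent-decodes it
        qs = parse_qs(urlparse(link).query)
        return qs["url"][0] if "url" in qs else link

    print(unwrap_google_redirect(
        "http://www.google.de/url?sa=t&url=http%3A%2F%2Fexample.com%2Fpaper.pdf"))
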
[23:10] <sandeen_> sagelap2, it seems to only set xattrs on directories now?
[23:16] <df__> sandeen_, turn javascript off ;)
[23:17] * jmlowe (~Adium@129-79-134-52.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[23:29] <df__> hmm, just upgraded to 0.37, still seeing OSDs reporting: "heartbeat_map is_healthy 'OSD::disk_tp thread 0x7f8a89de1700' had timed out after 60"
[23:31] <sagelap2> sandeen_: you sure? it should be setting an xattr on every file it creates
[23:32] <sagelap2> df__: usually that means the fs is hung?
[23:33] <df__> it is running on ext4, i'm not seeing any kernel log errors, local filesystem seems fine, and ceph seems to get over it
[23:34] <sagelap2> df__: might just be that it is slow under load, and some operation took >60 seconds but did complete
[23:34] <df__> it is then eventually followed by:
[23:34] <df__> Oct 18 21:33:34 vc-fs1 osd.0[4913]: 7f8a8e7ec700 osd.0 3109 OSD::ms_handle_reset()
[23:34] <df__> Oct 18 21:33:34 vc-fs1 osd.0[4913]: 7f8a8e7ec700 osd.0 3109 OSD::ms_handle_reset() s=0x66d4c60
[23:35] <sandeen_> sagelap2, I probably modified it wrong ;)
[23:35] <sandeen_> sagelap2, but even unmodified there is no attr on the big file
[23:36] <df__> sagelap2, i'm not seeing much disk activity though
[23:36] <sandeen_> [root@inode ~]# getfattr -d /mnt/test/streamtest/current/meta/streamtest__0_A546F122
[23:36] <sandeen_> [root@inode ~]#
[23:36] <sagelap2> oooh.. right. hold on.
[23:37] <sandeen_> hehe
[23:37] <sandeen_> sorry, I'm a babe in the woods here :(
[23:38] <sagelap2> i forgot that we're beneath the layer that normally sets attrs all over everything
[23:38] <sagelap2> add something like t->setattr(coll_t(), hobject_t(poid), "foo", (char*)&pos, sizeof(pos));
[23:38] <sagelap2> before the queue_transaction
[23:41] <sandeen_> ok. which is name & which is value ...?
[23:41] <sandeen_> I will need one xattr large enough to go outside the inode
[23:42] <sagelap2> name, valbuf, valbuflen
[23:42] <sandeen_> ok
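
Putting sage's two hints together, the added line before queue_transaction might look like the following sketch; the attribute name and 512-byte value are illustrative, the size chosen so the xattr cannot fit in ext4's in-inode space, which is what sandeen_ needs to reproduce:

    // hypothetical value, sized to spill outside the inode's inline xattr area
    static char bigval[512];
    memset(bigval, 'x', sizeof(bigval));
    // argument order per sage: name, valbuf, valbuflen
    t->setattr(coll_t(), hobject_t(poid), "big_attr", bigval, sizeof(bigval));
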
[23:58] * bencherian (~bencheria@aon.hq.newdream.net) Quit (Quit: bencherian)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.