#ceph IRC Log


IRC Log for 2012-12-15

Timestamps are in GMT/BST.

[0:00] <joao> yeah, crappy resolution
[0:00] <gregaf1> anyway, is there something more to see besides the total memory consumption going way up?
[0:01] <joao> naa
[0:01] <joao> only how high it goes
[0:01] <joao> I killed the workloadgen, hoping it would go down
[0:01] * drokita (~drokita@ Quit (Quit: Leaving.)
[0:02] <joao> but the last refresh I got from htop, before the ssh connection went down, marked mon.a's VIRT at 10GB
[0:02] <gregaf1> yeah
[0:03] <joao> and RES at 4.4GB
[0:04] <joao> looks like mon.a was OOM-killed
[0:07] <dmick> NOTICE: uplink change done, didn't even lose ssh/vpn connections
[0:08] * ee_cc (~edoardoca@dhcp-077-249-189-246.chello.nl) has joined #ceph
[0:08] <ee_cc> still a bit frustrated with java client support
[0:09] <ee_cc> can anyone point to a simple - braindead - tutorial for real n00bs?
[0:12] <gregaf1> ee_cc: what are you interested in?
[0:12] <ee_cc> java client
[0:12] <gregaf1> there's not much documentation at this point though, and most of the people holding it in their brain are out for the rest of the year
[0:13] <gregaf1> well, there are Java bindings for libcephfs
[0:13] <gregaf1> that would be a client
[0:13] <ee_cc> trying to use radosgw s3 and swift modes
[0:13] <ee_cc> with little success...
[0:13] <gregaf1> oh, yeah, we don't supply Java bindings for those
[0:14] <gregaf1> you'll be talking to the maintainers of those packages for that
[0:14] <gregaf1> it's a whole ecosystem and we're just a consumer ;)
[0:14] <ee_cc> but libcephfs uses native jni, doesn't it?
[0:15] <gregaf1> the Java ones, yes, but that's for the filesystem client, not for the S3 stuff
[0:15] <gregaf1> it's totally unrelated
[0:15] <gregaf1> you just want to find some generic S3 Java client that lets you specify an endpoint
[0:15] <ee_cc> actually no, I don't really care about s3
[0:15] <gregaf1> then why do you want "to use radosgw s3 and swift modes"?
[0:16] <ee_cc> I'd like to access ceph through java
[0:16] <ee_cc> and from the docs I had the feeling that was made possible through these bindings
[0:16] <gregaf1> okay, Ceph provides a lot of different kinds of access; *we* provide Java bindings for the (not quite production-ready!) filesystem
[0:17] * l0nk (~alex@ has joined #ceph
[0:17] <ee_cc> although I wouldn't mind ceph native stuff
[0:17] <gregaf1> but you could also use RESTful object storage, and that wouldn't be accessed through the filesystem, but through anything that speaks S3
[0:17] <gregaf1> for that, there are a bazillion libraries already and you'd just point them at your rados gateway
[0:17] <gregaf1> there's nothing right now that exports native RADOS or RBD via Java
[0:18] <gregaf1> as I look at it the Ceph bindings are fairly well-documented, so if you're looking at them and they don't help I think you're trying to use them for the wrong thing :)
[0:18] <ee_cc> ah yes, getting the rados gw was a bit of a trouble
[0:18] <ee_cc> didn't really work :(
[0:19] <ee_cc> I found the native ceph java bindings but these were java wrappers around the c lib right?
[0:20] <gregaf1> yes
[0:20] <ee_cc> ah, let me ask one more thing:
[0:21] * calebamiles (~caleb@c-107-3-1-145.hsd1.vt.comcast.net) has joined #ceph
[0:21] <ee_cc> the filesystem.. client side.. from what I can tell, I shouldn't really care much about it right?
[0:22] <ee_cc> it's like a posix fs impl built on top of ceph
[0:22] <gregaf1> that depends on what you want to use it for; it provides a POSIX filesystem on top of RADOS
[0:22] <gregaf1> if you want a filesystem, you should care a lot
[0:22] <gregaf1> if you want an object store or a block device, you shouldn't care at all
[0:23] <ee_cc> right, but if I want to implement my own model from my webapp
[0:23] <ee_cc> I can do without
[0:23] * l0nk (~alex@ Quit (Quit: Leaving.)
[0:23] <ee_cc> and that's where I get lost...
[0:24] <gregaf1> it sounds like, if you want anything in Ceph, you want the RADOS gateway and a third party S3 library
[0:24] <ee_cc> but ok: let me ask another question. Java doesn't seem to be the most popular lang in the ceph group
[0:24] <ee_cc> so which is leading, python?
[0:25] <sjust> ee_cc: you mean other than C++?
[0:25] <ee_cc> I'm just interested in key/blob+metadata
[0:25] <ee_cc> to be honest http and rest are overhead :)
[0:26] <gregaf1> what kind of interface do you want?
[0:26] <ee_cc> yep, so the closest to the wire client is C++?
[0:26] <gregaf1> it sounds like maybe you're expecting SQL or something, and Ceph does not provide anything like that
[0:27] <sjust> ee_cc: there is a library (librados) for accessing the raw object store
[0:27] <sjust> we have C++, C, python librados bindings at the moment
[0:27] <sjust> I think someone once did an erlang binding?
[0:27] <ee_cc> ah, right… so C, C++ and py
[0:27] <sjust> there's a php binding floating around as well
[0:28] <ee_cc> dang.. no java ;)
[0:28] <ee_cc> ok, that's why I kept hitting a wall today… ;)
[0:28] <gregaf1> https://github.com/ceph/phprados
[0:28] <sjust> not yet, but there's a C binding, so it's just a matter of using jni to build a java binding
[0:28] <sjust> on top of the C library
[0:29] <ee_cc> yeah, I think I came across it but I was on a windows machine
[0:29] <joshd> not sure if this is current, but there are old java bindings for librados: https://github.com/noahdesu/java-rados
[0:30] <ee_cc> https://github.com/noahdesu/java-rados
[0:30] <ee_cc> yeah that one
[0:30] <ee_cc> ok… I know enough...
[0:30] <sjust> anyway, those are all ways of accessing librados, which gets you key->binary_blob mappings
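The key->binary_blob model sjust describes, plus the metadata ee_cc asked about, might look roughly like the sketch below through the Python librados bindings. It assumes an ioctx object exposing `write_full`, `read`, `set_xattr` and `get_xattr`; `FakeIoctx`, `put_blob` and `get_blob` are hypothetical names introduced here so the example runs without a cluster, and the object key and xattr name are made up.

```python
# Sketch only: FakeIoctx is a hypothetical in-memory stand-in so this runs
# without a cluster. With the Python librados bindings you would instead pass
# an ioctx obtained from a connected rados.Rados handle via open_ioctx(pool).


class FakeIoctx:
    """Mimics the few ioctx methods used below (assumed subset)."""

    def __init__(self):
        self._data = {}
        self._xattrs = {}

    def write_full(self, key, data):
        # Replace the whole object with `data`.
        self._data[key] = bytes(data)

    def read(self, key, length=8192, offset=0):
        return self._data[key][offset:offset + length]

    def set_xattr(self, key, name, value):
        self._xattrs.setdefault(key, {})[name] = bytes(value)

    def get_xattr(self, key, name):
        return self._xattrs[key][name]


def put_blob(ioctx, key, blob, metadata=None):
    """Store a blob under `key` and attach metadata as object xattrs."""
    ioctx.write_full(key, blob)
    for name, value in (metadata or {}).items():
        ioctx.set_xattr(key, name, value.encode("utf-8"))


def get_blob(ioctx, key, length, metadata_names=()):
    """Read back up to `length` bytes plus the requested xattrs."""
    blob = ioctx.read(key, length)
    meta = {n: ioctx.get_xattr(key, n).decode("utf-8") for n in metadata_names}
    return blob, meta


ioctx = FakeIoctx()
put_blob(ioctx, "avatar-42", b"\x89PNG...", {"content-type": "image/png"})
blob, meta = get_blob(ioctx, "avatar-42", 1024, ["content-type"])
print(blob, meta)
```

Keeping the helpers dependent only on the ioctx argument means the same two functions could be pointed at a real pool or at a test double, with no HTTP or REST layer in between.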
[0:31] <ee_cc> I'll try compiling the java rados
[0:31] <ee_cc> would be nice to have a pure java impl
[0:31] <sjust> no, it probably would not
[0:31] <sjust> you'd have to duplicate the network protocol, which would be a lot of work for little actual benefit
[0:33] <ee_cc> hmm, but a native java impl is portable
[0:33] <ee_cc> like jdbc4
[0:33] <ee_cc> yeah it's duplication but there are advantages
[0:33] <ee_cc> m
[0:33] <ee_cc> I guess it will happen when it all becomes more stable...
[0:35] <ee_cc> ok… thanks all, I'll give java-rados a try
[0:35] <ee_cc> cheerz
[0:35] * jlogan1 (~Thunderbi@2600:c00:3010:1:5dfe:284a:edf3:5b27) Quit (Ping timeout: 480 seconds)
[0:36] <elder> dmick, you asked about those patches I re-posted for review earlier. I just realized they're available in ceph-client/wip-repost.
[0:37] <elder> If you're not interested in that, let me know because I'm trying to delete old branches.
[0:47] <dmick> I will try really hard to look soon; I need to start a big rbd test run soon and while that's cooking I can spend some time
[0:47] * The_Bishop (~bishop@e179010011.adsl.alicedsl.de) has joined #ceph
[0:48] <elder> dmick, don't worry about it. I just was looping back, since we discussed it earlier.
[0:49] * ee_cc (~edoardoca@dhcp-077-249-189-246.chello.nl) Quit (Quit: ee_cc)
[0:50] <dmick> not worried, but I want to help
[1:10] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:10] * loicd (~loic@2a01:e35:2eba:db10:75c1:4d16:7115:2261) has joined #ceph
[1:12] * gregaf1 (~Adium@2607:f298:a:607:918d:d4e3:2387:5a6e) Quit (Quit: Leaving.)
[1:30] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has joined #ceph
[1:39] * Psi-Jack_ (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[1:39] <dmick> joshd, sagewk: wip-rbd-striping
[1:39] <dmick> starting full test run now; new tests are good
[1:41] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Read error: Operation timed out)
[1:41] * Psi-Jack_ is now known as Psi-jack
[1:42] * LeaChim (~LeaChim@5ad684ae.bb.sky.com) Quit (Remote host closed the connection)
[1:47] * andreask (~andreas@h081217068225.dyn.cm.kabsi.at) has left #ceph
[1:47] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[1:50] <Psi-jack> Blah.. Almost ready to finally start setting up my ceph cluster.. Just re-allocating disks to another server that'll be persistent during the migration phase. :)
[1:52] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[1:55] * dpippenger (~riven@ Quit (Remote host closed the connection)
[2:00] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[2:07] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[2:09] * sagelap1 (~sage@112.sub-70-197-146.myvzw.com) has joined #ceph
[2:09] * sagelap (~sage@2607:f298:a:607:f5f5:ee4f:6791:8406) Quit (Ping timeout: 480 seconds)
[2:10] * PerlStalker (~PerlStalk@ Quit (Quit: I'm going home)
[2:13] * loicd (~loic@2a01:e35:2eba:db10:75c1:4d16:7115:2261) Quit (Quit: Leaving.)
[2:13] * loicd (~loic@magenta.dachary.org) has joined #ceph
[2:19] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[2:23] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[2:23] * loicd (~loic@magenta.dachary.org) has joined #ceph
[2:27] * calebmiles (~caleb@65-183-137-95-dhcp.burlingtontelecom.net) has joined #ceph
[2:30] * Cube (~Cube@ Quit (Quit: Leaving.)
[2:37] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[2:38] * loicd (~loic@magenta.dachary.org) has joined #ceph
[2:40] * maxiz (~pfliu@ has joined #ceph
[2:50] * sagelap1 (~sage@112.sub-70-197-146.myvzw.com) Quit (Ping timeout: 480 seconds)
[3:03] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[3:03] * loicd (~loic@magenta.dachary.org) has joined #ceph
[3:14] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:14] * yasu` (~yasu`@dhcp-59-227.cse.ucsc.edu) Quit (Remote host closed the connection)
[3:17] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[3:18] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[3:18] * loicd (~loic@magenta.dachary.org) has joined #ceph
[3:24] * Cube1 (~Cube@ has joined #ceph
[3:32] <elder> That "udevadm settle" is great. 1017 iterations completed in 300 seconds
[3:34] * Cube (~Cube@ has joined #ceph
[3:34] * Cube1 (~Cube@ Quit (Read error: Connection reset by peer)
[3:39] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[3:44] <dmick> elder: sweet
[3:51] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:09] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) has joined #ceph
[4:20] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[4:21] * wschulze (~wschulze@cpe-98-14-23-162.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:25] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: This computer has gone to sleep)
[5:05] * Psi-jack (~psi-jack@yggdrasil.hostdruids.com) has joined #ceph
[5:13] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[5:45] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[5:45] * ChanServ sets mode +o scuttlemonkey
[5:52] * The_Bishop (~bishop@e179010011.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[5:52] * Cube (~Cube@184-225-40-86.pools.spcsdns.net) has joined #ceph
[5:58] * maxiz (~pfliu@ Quit (Read error: Operation timed out)
[5:59] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[6:13] * calebamiles (~caleb@c-107-3-1-145.hsd1.vt.comcast.net) has left #ceph
[6:14] * jjgalvez1 (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[6:29] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[6:31] * Ryan_Lane (~Adium@ Quit (Quit: Leaving.)
[6:40] * davidz1 (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[6:49] * Cube (~Cube@184-225-40-86.pools.spcsdns.net) Quit (Quit: Leaving.)
[6:57] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Quit: ChatZilla 0.9.89 [Firefox 17.0.1/20121128204232])
[7:04] <Psi-jack> Alrighty..
[7:04] <Psi-jack> All servers are reconfigured and XFS all mounted in their proper places... Here goes something. :)
[7:27] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[7:32] * ez (~ez@cpe-76-170-118-93.socal.res.rr.com) has joined #ceph
[7:33] * ez (~ez@cpe-76-170-118-93.socal.res.rr.com) Quit ()
[7:45] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[7:48] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[7:49] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[7:53] <Psi-jack> Okay...
[7:53] <Psi-jack> So, I have it now..
[8:01] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[8:08] * Cube (~Cube@184-225-40-86.pools.spcsdns.net) has joined #ceph
[8:10] <Psi-jack> okay, how was it that I do the bench test with all my osd's?
[8:12] * Cube (~Cube@184-225-40-86.pools.spcsdns.net) Quit ()
[8:14] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:25] * dmick (~dmick@2607:f298:a:607:6ced:4a04:7d67:fd53) Quit (Quit: Leaving.)
[8:35] <Psi-jack> Sweet.
[8:35] <Psi-jack> This is... MUCH better than I anticipated, so far. :D
[9:04] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[9:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[9:10] * Cube (~Cube@184-225-40-86.pools.spcsdns.net) has joined #ceph
[9:18] * Cube1 (~Cube@ has joined #ceph
[9:20] * Cube1 (~Cube@ has left #ceph
[9:21] * Cube (~Cube@184-225-40-86.pools.spcsdns.net) Quit (Ping timeout: 480 seconds)
[9:31] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:40] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[10:09] * tryggvil_ (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil_)
[10:10] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) has joined #ceph
[10:12] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit ()
[11:33] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[12:39] * LeaChim (~LeaChim@5ad684ae.bb.sky.com) has joined #ceph
[12:46] * Machske (~bram@d5152D87C.static.telenet.be) Quit (Ping timeout: 480 seconds)
[12:48] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[12:57] * snaff (~z@81-86-160-226.dsl.pipex.com) has joined #ceph
[13:04] <snaff> is it safe to run kvm's rbd driver on an osd?
[13:19] <Psi-jack> Why would you do such a thing?
[13:19] * maxiz (~pfliu@ has joined #ceph
[13:21] <snaff> to use a server as both storage and for running kvm machines
[13:23] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[13:23] * loicd (~loic@magenta.dachary.org) has joined #ceph
[13:24] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[13:26] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:28] <Psi-jack> That's a really bad idea.
[13:29] <snaff> for the same reason as using the kernel rbd driver on an osd?
[13:46] <Psi-jack> No, because Ceph is supposed to be for distributed storage networks, and by putting extra stuff on the server, you will only degrade its performance and stability.
[13:53] <snaff> point taken, it's best to keep things separate if you can afford it
[13:56] * jtangwk (~Adium@2001:770:10:500:8176:da07:8c8c:d8fd) Quit (Read error: Connection reset by peer)
[13:57] * jtangwk (~Adium@2001:770:10:500:f4e7:a53d:453:895d) has joined #ceph
[14:00] <jksM> running a virtual machine with qemu-kvm with Ceph storage, I find that the virtual machine sometimes hangs with the following error: kernel:BUG: soft lockup - CPU#0 stuck for 138s!
[14:01] <jksM> anyone seen this before? - any ideas on how to debug it?
[14:06] <jksM> it caused two of my ceph-mds's to stop running and one of the ceph-mon's stopped as well
[14:07] <jksM> the logs show that the ceph-mon and ceph-mds processes were stopped by the oom-killer
[14:07] <Psi-jack> snaff: If you're not even doing a distributed storage network, there's very little point to even using Ceph at all, frankly.
[14:07] <Robe> jksM: what's the complete backtrace?
[14:07] <joao> jksM, can you drop the logs somewhere for us to grab them?
[14:07] <jksM> on the remaining ceph-mon that is still running, that process is taking up almost 10 GB of RAM... that can't be right, can it?
[14:08] <jksM> Robe, how do I get the complete backtrace?
[14:08] <joao> jksM, seen it happening, tracing down the causes
[14:08] <joao> it's been rare
[14:08] <joao> having your logs would be great
[14:09] * yanzheng (~zhyan@ has joined #ceph
[14:09] <jksM> I'll try to compile something together... but I don't see much information in the logs... besides the call trace from the oom-killer log (?)
[14:09] <Robe> dmesg output is fine
[14:09] <jksM> super, I'll put that together
[14:11] <joao> jksM, what log levels were you running the monitors with?
[14:14] * ScOut3R (~scout3r@54007924.dsl.pool.telekom.hu) has joined #ceph
[14:15] * ScOut3R (~scout3r@54007924.dsl.pool.telekom.hu) Quit ()
[14:15] * ScOut3R (~scout3r@54007924.dsl.pool.telekom.hu) has joined #ceph
[14:16] <Psi-jack> Ahh, joao! :)
[14:16] <joao> hi Psi-jack
[14:16] <joao> how's your ceph experience going on?
[14:16] <Psi-jack> joao: Got my ceph cluster rocking now, slowly been converting my qcow2 disks to ceph-rbd's via the means of a vm bringing in the qcow2 and rbd to transfer to.
[14:17] <Psi-jack> Holy crap is this nice.... Is how it's going so far. :)
[14:17] <joao> great
[14:17] <joao> let us know if you run into trouble ;)
[14:17] <Psi-jack> Beyond the tedious side-effect of conversion, I've seen major speed benefits in my new cluster doing it this way. :)
[14:17] <Psi-jack> Oh, believe me... I will! :)
[14:20] <Psi-jack> I've already got most of my VM's converted over. Just a couple left, then I need to see about getting my web servers on cephfs for their actual /var/www shared mount, within Ubuntu 10.04 (currently)
[14:21] <Psi-jack> I'll be getting my Zabbix up to poll "ceph health" as well, to determine health status and report on that.
[14:22] <joao> might want to look into the admin sockets as well, if things with the monitors go sideways
[14:23] * Psi-jack nods.
[14:23] <joao> 'ceph health' only works if you have a quorum
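A monitoring check along the lines Psi-jack mentions could be sketched as follows. It assumes the first word of `ceph health` output is one of HEALTH_OK, HEALTH_WARN or HEALTH_ERR; the numeric codes, the 10-second timeout and the function names are arbitrary choices for illustration. The timeout is there because, as joao points out, the command only answers when the monitors have a quorum.

```python
import subprocess

# Numeric codes are an arbitrary choice for a monitoring item,
# not a Ceph convention.
STATUS_CODES = {"HEALTH_OK": 0, "HEALTH_WARN": 1, "HEALTH_ERR": 2}
UNKNOWN = 3


def parse_health(output):
    """Map the first word of `ceph health` output to a numeric code."""
    words = output.split()
    if not words:
        return UNKNOWN
    return STATUS_CODES.get(words[0], UNKNOWN)


def poll_health(timeout_seconds=10):
    """Run `ceph health`, treating a hang, a missing binary or a
    non-zero exit (for example, no quorum) as UNKNOWN."""
    try:
        result = subprocess.run(
            ["ceph", "health"],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        return UNKNOWN
    if result.returncode != 0:
        return UNKNOWN
    return parse_health(result.stdout)


print(parse_health("HEALTH_OK"))
print(parse_health("HEALTH_WARN 1 pgs degraded"))  # hypothetical detail text
```

Reporting UNKNOWN as its own value lets the monitoring side distinguish "cluster unhealthy" from "could not ask the cluster", which is the case where the admin sockets become the next thing to check.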
[14:23] <Psi-jack> heh
[14:23] <Psi-jack> I've been up all night working on this, so yeah.. Brain's not fully here to comprehend 100% of everything. ;)
[14:23] <joao> where are you located?
[14:24] <Psi-jack> Florida :)
[14:24] <joao> okay, "all night" makes sense now ;)
[14:24] <Psi-jack> Been awake for 26.5 hours now. :)
[14:28] <Psi-jack> Will need some help later on with CRUSH mapping. I haven't even touched that yet.
[14:28] <Psi-jack> But, I need sleep. I got 3 more VM's to convert, that can wait till after rest. heh
[14:37] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[14:38] <jksM> joao, I haven't changed anything from the defaults, so I guess the default log level
[14:39] <jksM> joao, I had just installed the cluster and was doing a test with a single qemu-kvm guest running on top of that storage
[14:39] <jksM> so was a bit disappointed when the whole thing crashed ;-)
[14:40] <jksM> (lunch for now, but will send a pastebin with the logs afterwards)
[14:40] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[14:41] * maxiz (~pfliu@ Quit (Ping timeout: 480 seconds)
[14:46] * yanzheng (~zhyan@ has joined #ceph
[14:50] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[15:13] <jksM> Robe, joao: here's the log contents: http://pastebin.com/5SbE0KLi
[15:14] <jksM> any suggestions on how I get the ceph cluster back to a working state again?
[15:15] <jksM> 3 servers.. on server 1 I have ceph-mon using 9,3 GB of RAM and a ceph-mds process. On server 2 nothing survived. On server 3 ceph-mon is running using approx. 900 MB of RAM
[15:15] <jksM> (before the crash, ceph-mon and ceph-mds were running on all three servers)
[15:16] <jksM> should I restart the processes on server 2 and 3 and then afterwards restart server 1 to reclaim memory, or would it be better to take everything down at the same time?
[15:35] <jksM> did a restart of ceph on each server individually... it seems to be recovering, but I'm seeing this message I haven't seen before: heartbeat_map is_healthy 'OSD::op_tp thread 0x7f0f0f7f6700' had timed out after 30
[15:37] <jksM> ah, now it is showing HEALTH_OK as status and my qemu-kvm guest responds again... impressive!
[15:47] * yanzheng (~zhyan@ Quit (Remote host closed the connection)
[16:20] <snaff> looks like qemu/kvm-rbd and fuse is okay on an osd, only kernel clients are not http://irclogs.ceph.widodh.nl/index.php?date=2011-11-22
[16:29] <jksM> hmm, well looked like it was responsive again, but not really... I see stuff like: [WRN] 5 slow requests, 4 included below; oldest blocked for > 3679.089257 secs
[16:37] * The_Bishop (~bishop@e179016104.adsl.alicedsl.de) has joined #ceph
[16:41] <snaff> i've seen those warnings on my test system when i'm messing about, dunno what they mean but they've always gone away on their own
[16:44] <jksM> I've seen them before with like 8 seconds and 10 seconds... but never 4000 seconds like now :-|
[17:10] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has joined #ceph
[17:17] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[17:17] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:18] * ScOut3R (~scout3r@54007924.dsl.pool.telekom.hu) Quit (Quit: leaving)
[17:18] * ScOut3R (~scout3r@54007924.dsl.pool.telekom.hu) has joined #ceph
[17:29] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) Quit (Ping timeout: 480 seconds)
[18:04] * danieagle (~Daniel@ has joined #ceph
[18:05] <paravoid> Sage Weil resolved a ticket I submitted, #3616, but I can see no commit referencing that
[18:05] <sage> which one?
[18:05] <paravoid> oh hi! :)
[18:05] <paravoid> #3616
[18:05] <paravoid> I'm wondering if it was actually resolved or if you still want me to get some debug_ms logs
[18:05] <sage> those were all related to the wip_watch branch that was just merged into next yesterday
[18:06] <sage> if our theory for the cause is correct, it's resolved.
[18:06] <paravoid> oh
[18:06] <paravoid> great
[18:06] <paravoid> I'm seeing it quite often, I'm surprised no one else has hit it before
[18:06] <sage> this is a cluster running radosgw, right?
[18:07] <paravoid> yes
[18:07] <sage> we're seeing it a lot too. i'm not exactly sure what changed to make it surface tho
[18:07] <paravoid> aha
[18:07] <paravoid> well, that's good ;)
[18:09] * _ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[18:10] <paravoid> thanks!
[18:11] <_ron-slc> Hello. Have a question on CephFS in 0.48.2 Argonaut. I'm using the kernel-client mount, mounting is fine, and I see files on all clients.
[18:11] <_ron-slc> Client #1 can see, and perform md5sum on all files with success.
[18:12] <_ron-slc> Client #2 and #3, get "operation not permitted" on all md5sum's . Except One file of 10 is properly MD5sum'd.
[18:13] <_ron-slc> And all md5sum tests are being done as root, and mounted by root.
[18:16] <_ron-slc> also, remounting with the ceph-fuse, all md5sum operations are a success
[18:18] <_ron-slc> kernel is 3.5.0-19 (ubuntu quantal/12.10), is there a minimum kernel version recommendation for ceph fs?
[18:42] * Meths (~meths@ Quit (Quit: )
[18:58] * Meths (~meths@ has joined #ceph
[19:25] <joao> jksM, still around?
[19:48] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[20:14] * scalability-junk (~stp@188-193-202-99-dynip.superkabel.de) has joined #ceph
[20:18] * jluis (~JL@ has joined #ceph
[20:24] * joao (~JL@ Quit (Ping timeout: 480 seconds)
[20:26] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[20:54] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[21:06] * ScOut3R (~scout3r@54007924.dsl.pool.telekom.hu) Quit (Quit: Lost terminal)
[21:16] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) Quit (Quit: Leaving.)
[21:23] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[21:33] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:40] * Psi-Jack_ (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[21:47] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:47] * Psi-Jack_ is now known as Psi-jack
[22:08] * loicd (~loic@magenta.dachary.org) has joined #ceph
[22:32] * wer (~wer@wer.youfarted.net) has joined #ceph
[22:42] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[22:52] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[22:54] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[22:56] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[22:57] * _ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[23:11] <jksM> joao, I am now
[23:11] <jksM> my slow request doesn't seem to go away: slow request 27778.880205 seconds old, received at 2012-12-15 15:27:56.891437: osd_sub_op(client.4528.0:19602219 0.fe 3807b5fe/rb.0.11b7.4a933baa.00000008629e/head//0 [] v 53'185888 snapset=0=[]:[] snapc=0=[]) v7 currently started
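To watch whether a slow request like this one keeps ageing, the age and operation type can be pulled out of the log line. The sketch below assumes the line format matches the one jksM pasted; the regular expression, the field names and the 300-second threshold are illustrative choices, not anything defined by Ceph.

```python
import re
from datetime import datetime

# Pattern derived from the single log line quoted above; other Ceph
# versions may word the message differently.
SLOW_REQUEST = re.compile(
    r"slow request (?P<age>\d+(?:\.\d+)?) seconds old, "
    r"received at (?P<received>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<op>\w+)\(.* currently (?P<state>.+)$"
)

# Hypothetical cut-off separating a brief stall from a stuck request.
STUCK_AFTER_SECONDS = 300.0


def parse_slow_request(line):
    """Return the age, receive time, op type and state, or None if no match."""
    match = SLOW_REQUEST.search(line.strip())
    if match is None:
        return None
    return {
        "age_seconds": float(match.group("age")),
        "received": datetime.strptime(
            match.group("received"), "%Y-%m-%d %H:%M:%S.%f"
        ),
        "op": match.group("op"),
        "state": match.group("state"),
    }


def is_stuck(info, threshold=STUCK_AFTER_SECONDS):
    """True when the request has been pending longer than the threshold."""
    return info["age_seconds"] > threshold


line = (
    "slow request 27778.880205 seconds old, received at "
    "2012-12-15 15:27:56.891437: osd_sub_op(client.4528.0:19602219 0.fe "
    "3807b5fe/rb.0.11b7.4a933baa.00000008629e/head//0 [] v 53'185888 "
    "snapset=0=[]:[] snapc=0=[]) v7 currently started"
)
info = parse_slow_request(line)
print(info["op"], info["age_seconds"], is_stuck(info))
```

With a threshold in the minutes range, the 8 to 10 second warnings mentioned earlier would be ignored while a request pending for hours would be flagged.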
[23:14] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[23:16] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[23:23] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Read error: Connection reset by peer)
[23:23] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[23:23] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[23:24] * loicd (~loic@magenta.dachary.org) has joined #ceph
[23:34] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) has joined #ceph
[23:35] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[23:44] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[23:45] <Psi-jack> Ta Da! All production VM's have been converted from qcow2 disks to Ceph-RBD images.
[23:46] * aliguori (~anthony@cpe-70-113-5-4.austin.res.rr.com) Quit (Remote host closed the connection)
[23:51] <Psi-jack> So.
[23:51] <Psi-jack> When I do "rados df" under clones, if that's showing 0, that means, what? That there are no replicas?

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.