#ceph IRC Log


IRC Log for 2011-01-15

Timestamps are in GMT/BST.

[0:07] <Phil1> gregaf: packages built. Installing now
[0:08] <Phil1> install fails, since it tries to run the init script to stop and start various processes, sees a config file and believes it's meaningful, and fails because those processes aren't actually configured
[0:10] <Phil1> moving /etc/ceph/ceph.conf out of the way let the packages' post-install scripts succeed
[0:10] <gregaf> yeah, that's what Sage suggests
[0:10] <Tv|work> yehudasa: i found the cauthtool issue
[0:10] <Tv|work> yehudasa: it was easier than i thought
[0:10] * gnp421 (~hutchint@c-75-71-83-44.hsd1.co.comcast.net) has joined #ceph
[0:10] <Phil1> ugh, mkcephfs isn't idempotent when it fails
[0:11] <Phil1> ok, after /etc/init.d/ceph stop, it succeeded
[0:12] <Tv|work> yehudasa: i seem to have very limited access to that bug -- i can't change the status?
[0:12] <gregaf> Phil1: mkcephfs breaks if you try to re-run it?
[0:12] <gregaf> (re-run after a failure, I mean)
[0:13] <Phil1> not directly, but the failures I encountered earlier led to mkcephfs failing
[0:13] <sagewk> mkcephfs fails if daemons from a prior instance are running.. (cosd can't do its mkfs step)...
[0:13] <Phil1> when I installed the new packages, there was an /etc/ceph/ceph.conf in place, so the init script actually started stuff up on one node
[0:13] <Phil1> and mkcephfs failed because files it wanted to touch were busy
[0:13] * alexxy[home] (~alexxy@ has joined #ceph
[0:14] <sagewk> yeah.
[0:14] <sagewk> it assumes nothing is running
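
Note: since mkcephfs assumes that no daemons from an earlier instance are running, a pre-flight check along the following lines could catch the situation Phil1 hit. This is only a sketch. The function names are hypothetical, the daemon list (cmon, cmds, cosd) is an assumption beyond the cosd named in the log, and `ps -eo comm=` assumes a procps-style ps.

```shell
#!/bin/sh
# Hypothetical helper: reads one process name per line on stdin and
# prints each Ceph daemon name it finds, once.
# The daemon list below is an assumption; adjust it for your install.
running_ceph_daemons() {
    grep -x -E 'cmon|cmds|cosd' | sort -u
}

# Hypothetical pre-flight check: returns non-zero if any daemon is found.
preflight() {
    found=$(ps -eo comm= | running_ceph_daemons)
    if [ -n "$found" ]; then
        echo "daemons still running, stop them first: $found" >&2
        return 1
    fi
    return 0
}
```

A wrapper script could call `preflight` and only invoke mkcephfs, with its usual arguments, when the check returns zero.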
[0:14] * alexxy (~alexxy@ Quit (Ping timeout: 480 seconds)
[0:15] <Phil1> ok, I have a mounted ceph filesystem
[0:18] <Phil1> now to actually get something resembling performance out of it
[0:35] <Phil1> ugh, I'm seeing various operations hang - dmesg is showing a backtrace
[0:36] <Phil1> ah, the backtrace is the kernel noticing the task getting nowhere
[0:39] <Phil1> hrm, it looks like it's possible for umount to block indefinitely, rather than timing out somehow
[0:39] * gregorg_taf (~Greg@ has joined #ceph
[0:39] * `gregorg` (~Greg@ Quit (Read error: Connection reset by peer)
[0:40] <yehudasa> tv: you're missing permissions on the tracker, sage is fixing that
[0:40] <yehudasa> tv: you should be able to modify the tracker now
[0:41] <Phil1> ugh, hung kernel task - kill -9 isn't even working
[0:42] <sagewk> you need to umount -f the first time around.. once you try a non -f umount you're screwed. there is no graceful way around that, unfortunately
[0:43] <sagewk> there is a longstanding feature request for a 'soft' mode that adds timeouts all over the place.
[0:44] <Phil1> ah
[0:44] <Phil1> these are play machines, so I just rebooted
[0:45] <Phil1> and it's about quitting time for me
[0:45] <Phil1> Thanks for the help so far
[0:45] <Phil1> I'm certainly interested to see where this goes
[0:45] * Phil1 (~phil@thrift.cs.ILLINOIS.edu) has left #ceph
[1:40] <ajnelson> Does somebody have a copy of v0.24.1 compiled? I'd like to double-check something with cfuse.
[1:41] <ajnelson> I run these commands in the src/ directory:
[1:41] <ajnelson> ./vstart.sh -d -n -l #Runs fine
[1:41] <ajnelson> ./cfuse -m /mnt/ceph
[1:42] <ajnelson> First line returned: "server name not found: (Success)"
[1:42] <ajnelson> Then: "*** Caught signal (ABRT) ***" followed by the Ceph git version, a stack trace, and then "segmentation fault"
[1:44] <cmccabe> ajnelson: vstart is something... very brittle
[1:44] <sagewk> do you mind opening an issue in the tracker? that shouldn't happen
[1:44] <cmccabe> hmm, actually that looks like a cfuse problem though
[1:44] <ajnelson> Stack trace:
[1:44] <ajnelson> alex@ubuntu:~/ceph-current/src$ sudo ./cfuse -m /mnt/ceph
[1:44] <ajnelson> [sudo] password for alex:
[1:44] <ajnelson> server name not found: (Success)
[1:44] <ajnelson> *** Caught signal (ABRT) ***
[1:44] <ajnelson> ceph version 0.24.1 (commit:630565f3ac96c3f4f5eb2d54fc76086695ab18cc)
[1:44] <ajnelson> 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x747865]
[1:44] <ajnelson> 2: (sigabrt_handler(int)+0x49) [0x7588f5]
[1:44] <ajnelson> 3: (()+0xfb40) [0x7f196500ab40]
[1:44] <ajnelson> 4: (()+0x82e52) [0x7f1963e9fe52]
[1:44] <ajnelson> 5: (parse_ip_port_vec(char const*, std::vector<entity_addr_t, std::allocator<entity_addr_t> >&)+0x4d) [0x758495]
[1:44] <ajnelson> 6: (MonClient::build_initial_monmap()+0x3eb) [0x748333]
[1:44] <ajnelson> 7: (main()+0x109) [0x5d74d2]
[1:44] <ajnelson> 8: (__libc_start_main()+0xfe) [0x7f1963e3bd8e]
[1:44] <ajnelson> 9: ./cfuse() [0x5d7269]
[1:44] <ajnelson> Segmentation fault
[1:44] <ajnelson> alex@ubuntu:~/ceph-current/src$
[1:45] <ajnelson> I will paste that again in the tracker.
[1:46] <sagewk> thanks
[1:53] <ajnelson> http://tracker.newdream.net/issues/712
[2:06] <yehudasa> ajnelson: you have a typo
[2:06] <yehudasa> it should be: sudo ./cfuse -m /mnt/ceph
[2:07] * DJL (82d8d198@ircip2.mibbit.com) Quit (Ping timeout: 480 seconds)
[2:08] <yehudasa> ajnelson: sage says that it should be: sudo ./cfuse -m /mnt/ceph (without the :/)
[2:10] <ajnelson> yehudasa: Thanks for the quick response. Trying that out...
[2:11] <ajnelson> yehudasa: Yup, needs to have nothing right after the port, just got a segfault with colon-slash.
[2:11] <yehudasa> yeah.. I pushed a fix for that segfault
[2:11] <ajnelson> Great! Thank you!
[2:12] <ajnelson> I checked the wiki, no mentions of cfuse have the /. Not sure where I pulled that from, but it was chugging along fine in a test script I had.
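
Note: the exchange above comes down to the `-m` argument ending at the port, with no trailing `:/`. A test script that builds the argument could strip that suffix defensively, as sketched below. The function name is hypothetical, and the address in the usage comment is a documentation placeholder, not the one ajnelson used.

```shell
#!/bin/sh
# Hypothetical helper: drop a trailing ":/" from a monitor address so the
# value handed to cfuse -m ends at the port.
# Uses POSIX suffix removal; an address without the suffix is unchanged.
mon_addr_for_cfuse() {
    printf '%s\n' "${1%:/}"
}

# Example with a placeholder address (assumption, not from the log):
#   sudo ./cfuse -m "$(mon_addr_for_cfuse '192.0.2.1:6789:/')" /mnt/ceph
```

This only handles the one malformed suffix discussed in the log; it does not otherwise validate the address.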
[2:22] <ajnelson> Have a good night, all, and thanks for the help!
[2:22] * ajnelson (~Adium@dhcp-63-189.cse.ucsc.edu) Quit (Quit: Leaving.)
[2:34] * fzylogic (~fzylogic@ Quit (Quit: DreamHost Web Hosting http://www.dreamhost.com)
[2:45] * dallas (~dallas@ Quit (Quit: dallas)
[2:47] * Tv|work (~Tv|work@ip-66-33-206-8.dreamhost.com) Quit (Ping timeout: 480 seconds)
[2:51] * gnp421 (~hutchint@c-75-71-83-44.hsd1.co.comcast.net) Quit (Quit: Leaving)
[2:52] * cmccabe (~cmccabe@ has left #ceph
[3:29] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) Quit (Remote host closed the connection)
[3:29] * bchrisman (~Adium@c-24-130-226-22.hsd1.ca.comcast.net) has joined #ceph
[3:33] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[5:02] * ghaskins (~ghaskins@66-189-113-47.dhcp.oxfr.ma.charter.com) Quit (Quit: Leaving)
[5:36] * darkfader (~floh@host-93-104-226-28.customer.m-online.net) Quit (Quit: "somalguggen")
[5:43] * darkfader (~floh@ has joined #ceph
[6:37] * ijuz__ (~ijuz@p4FFF7EAA.dip.t-dialin.net) Quit (Ping timeout: 480 seconds)
[6:46] * ijuz__ (~ijuz@p4FFF7CAD.dip.t-dialin.net) has joined #ceph
[7:42] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) has joined #ceph
[9:43] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[9:52] * tjikkun (~tjikkun@195-240-122-237.ip.telfort.nl) has joined #ceph
[10:02] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[10:28] * Yoric_ (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[10:32] * Yoric__ (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[10:34] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[10:34] * Yoric__ is now known as Yoric
[10:38] * Yoric_ (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[11:48] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) has joined #ceph
[12:17] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[12:17] * greglap (~Adium@cpe-76-90-239-202.socal.res.rr.com) has joined #ceph
[13:37] <stingray> greglap:
[13:37] <stingray> just saw your msg
[13:37] <stingray> recompiling and redeploying stuff now
[16:08] <jantje> sagewk: I think it was ~2m15 sec to write 18600 files or 726MB
[16:08] <jantje> and my build (random workload) is also twice as slow, I still have to check if it's consistent, so don't worry about it
[16:09] <jantje> I like to complain :-)
[16:34] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Read error: No route to host)
[16:35] * Anticimex (anticimex@netforce.csbnet.se) Quit (Server closed connection)
[16:35] * Anticimex (anticimex@netforce.csbnet.se) has joined #ceph
[16:37] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[16:40] * zoobab (zoobab@vic.ffii.org) Quit (Server closed connection)
[16:40] * zoobab (zoobab@vic.ffii.org) has joined #ceph
[16:57] * Mark23 (~mark@ has left #ceph
[17:04] * alexxy[home] (~alexxy@ Quit (Remote host closed the connection)
[17:11] * MichielM (~michiel@mike.dwaas.org) Quit (Server closed connection)
[17:11] * MichielM (~michiel@mike.dwaas.org) has joined #ceph
[17:18] * alexxy (~alexxy@ has joined #ceph
[17:20] * alexxy (~alexxy@ Quit (Remote host closed the connection)
[17:40] * gregorg_taf (~Greg@ Quit (Server closed connection)
[17:40] * gregorg_taf (~Greg@ has joined #ceph
[17:45] * ijuz__ (~ijuz@p4FFF7CAD.dip.t-dialin.net) Quit (Server closed connection)
[17:45] * ijuz__ (~ijuz@p4FFF7CAD.dip.t-dialin.net) has joined #ceph
[18:15] * allsystemsarego (~allsystem@ has joined #ceph
[18:35] * alexxy (~alexxy@ has joined #ceph
[21:02] * wido (~wido@fubar.widodh.nl) Quit (Remote host closed the connection)
[21:06] * wido (~wido@fubar.widodh.nl) has joined #ceph
[23:15] * ajnelson (~Adium@dhcp-63-189.cse.ucsc.edu) has joined #ceph
[23:30] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[23:42] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.