#ceph IRC Log

IRC Log for 2012-02-24

Timestamps are in GMT/BST.

[0:00] <darkfader> I need to BRB, the cat exploded
[0:00] <darkfader> sorry.
[0:02] * dhansen (~dave@static-50-53-34-95.bvtn.or.frontiernet.net) has left #ceph
[0:03] <Tv|work> darkfader: journal needs to be a file, not a directory
[0:03] <Tv|work> darkfader: it can be a block device, or just a regular file
[0:04] <Tv|work> darkfader: ceph-osd --mkfs (and thus mkcephfs) will create the file for you, but you had a directory in the way
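[For reference: a minimal [osd] stanza with the journal kept as a plain file rather than a directory; the paths are illustrative, mirroring the /data layout that appears later in this log:]

    [osd]
        osd data = /data/$name
        osd journal = /data/$name/journal   ; a regular file (or block device), never a directory
        osd journal size = 128              ; megabytes; ceph-osd --mkfs creates the file if it is absent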
[0:04] <darkfader> ah sorry
[0:04] <darkfader> i tripped over that last time too i think
[0:04] * BManojlovic (~steki@212.200.243.83) has joined #ceph
[0:07] <darkfader> same thing for /data/mon0 too?
[0:09] <darkfader> ok that was a yes
[0:09] <darkfader> everything worked now
[0:11] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[0:12] <darkfader> osd and mds didn't come up but that's not for today
[0:12] <darkfader> thanks!
[0:31] * BManojlovic (~steki@212.200.243.83) Quit (Remote host closed the connection)
[0:38] * fronlius (~fronlius@f054184172.adsl.alicedsl.de) Quit (Quit: fronlius)
[0:41] * lofejndif (~lsqavnbok@207.Red-88-19-214.staticIP.rima-tde.net) Quit (Quit: Leaving)
[0:58] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Remote host closed the connection)
[1:19] * joao (~joao@89.181.147.200) Quit (Quit: joao)
[1:28] * tnt_ (~tnt@80.63-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[2:28] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[2:34] * Tv|work (~Tv__@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[3:59] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[5:10] * guido (~guido@mx1.hannover.ccc.de) Quit (Read error: No route to host)
[5:10] * chutzpah (~chutz@216.174.109.254) Quit (Quit: Leaving)
[5:10] * nyeates (~nyeates@pool-173-59-237-128.bltmmd.fios.verizon.net) has joined #ceph
[5:14] * guido (~guido@mx1.hannover.ccc.de) has joined #ceph
[5:25] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Read error: Connection reset by peer)
[5:52] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[6:26] * cattelan is now known as cattelan_away
[7:00] * nyeates (~nyeates@pool-173-59-237-128.bltmmd.fios.verizon.net) Quit (Quit: Zzzzzz)
[7:56] * tnt_ (~tnt@80.63-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[9:06] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph
[9:12] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:14] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:26] * tnt_ (~tnt@80.63-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[9:35] * tnt_ (~tnt@212-166-48-236.win.be) has joined #ceph
[9:47] * fronlius (~fronlius@f054184172.adsl.alicedsl.de) has joined #ceph
[9:49] * fronlius (~fronlius@f054184172.adsl.alicedsl.de) Quit ()
[9:53] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[10:04] * sage (~sage@cpe-76-94-40-34.socal.res.rr.com) has joined #ceph
[10:05] * joao (~joao@89.181.147.200) has joined #ceph
[10:39] <darkfader> gceph is quite ok actually. it just lacks a more visual overall status display. but really, it seems ok :)
[10:39] <darkfader> still need to fix the osd
[10:39] <darkfader> meh.
[10:43] * joao (~joao@89.181.147.200) Quit (Quit: joao)
[10:44] * joao (~JL@ace.ops.newdream.net) has joined #ceph
[10:51] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[10:56] * joao (~JL@ace.ops.newdream.net) Quit (Quit: Leaving)
[10:56] * joao (~JL@ace.ops.newdream.net) has joined #ceph
[11:21] <darkfader> can one of you give me a one-line definition of RADOS?
[11:22] <darkfader> I'm stuck at "thats what makes it work"
[11:41] <darkfader> can i still use ceph without cephx auth? because i'll only have 1-2 hours to walk them through it
[11:56] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[12:11] <darkfader> also, did one of you try building ceph on fc17?
[12:11] <darkfader> or 16
[12:12] <darkfader> yes, i meant 16
[12:20] * joao (~JL@ace.ops.newdream.net) Quit (Quit: Leaving)
[12:20] * joao (~JL@ace.ops.newdream.net) has joined #ceph
[12:39] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[12:58] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[13:33] * stxShadow (~jens@jump.filoo.de) has joined #ceph
[13:34] <stxShadow> darkfader: you can use ceph without cephx
[13:35] <stxShadow> but be sure to close port 6800 on the mon server for external traffic
[13:43] <darkfader> stxShadow: what happens otherwise?
[13:43] <darkfader> and what is "external"? != 127.0.0.1 or "internet"?
[13:48] <stxShadow> -> internet
[13:48] <stxShadow> try to telnet to port 6800 ;)
[13:48] <stxShadow> and you will see
[13:49] <darkfader> i'm doing a quick excursion to ceph in a class
[13:49] <darkfader> this is not a setup connected to anywhere outside at all :)
[13:49] <stxShadow> ah ok ....
[13:50] <stxShadow> then you have no security issue ;)
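[For reference: with ceph of this vintage, cephx can be switched off in ceph.conf; a sketch assuming the 0.4x-era option name, and only sensible on an isolated lab network:]

    [global]
        auth supported = none   ; disables cephx entirely (assumed 0.4x syntax)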
[13:57] <darkfader> different from my laptop last night: all mds/osd are up but gceph segfaults
[13:57] <darkfader> that's a great improvement over "gceph works and osd dies"
[13:59] <darkfader> fsck!
[13:59] <darkfader> i dont have a ceph kernel module in the debian systems
[14:00] <darkfader> will. not. shout.
[14:04] <stxShadow> you have to build a recent kernel
[14:05] <stxShadow> i think > 3.2 is recommended
[14:07] <darkfader> no time for that i'm afraid
[14:07] <darkfader> using fuse client for now
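[For reference: a FUSE mount of this era, assuming the cfuse binary mentioned later in this log accepts the usual -m monitor option; adjust the address and mountpoint:]

    cfuse -m 10.10.1.10:6789 /mnt/ceph   # later releases rename this binary to ceph-fuse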
[14:07] <darkfader> and if possible i'll just ditch debian for fc16
[14:08] <nhm> darkfader: what kernel are you on?
[14:09] <darkfader> nhm: default-ish 2.6.32/5
[14:09] <darkfader> my laptop is on 3.2 :)
[14:09] <nhm> darkfader: ok, just be aware that with that old of a kernel there may be various issues.. :)
[14:10] <darkfader> that's why i'd rather get rid of the whole OS
[14:10] <darkfader> but we have only a small SSD where all the os images go and there's another monster lurking
[14:11] <darkfader> alas, 95MB/s write, 102MB/s read is tolerable :)
[14:11] <darkfader> i'm trying to say i'm glad the network is bottlenecking before fuse
[14:11] <nhm> what kind of configuration for that performance?
[14:12] <darkfader> 7 old quadcore desktops, crap switches and 1 ssd per desktop
[14:12] <darkfader> 128MB osd journal only
[14:13] <darkfader> i'll try two writes next
[14:13] <darkfader> *writers
[14:13] <nhm> cool
[14:13] <stxShadow> hmmm ...... performance is also one of my problems :)
[14:13] <nhm> I'm going to be doing a lot of performance testing soon.
[14:13] <nhm> stxShadow: what are you seeing?
[14:14] <darkfader> nhm: yours is on a little different level hehe
[14:14] <stxShadow> we use local rbd files
[14:14] <stxShadow> 10 Gbit cross connects
[14:14] <stxShadow> 4 x 2 TB sata raid0
[14:14] <stxShadow> and ssd for journaling
[14:14] <stxShadow> per osd
[14:14] <stxShadow> 4 osds
[14:15] <stxShadow> 3 mon
[14:15] <stxShadow> 2 active mds
[14:15] <stxShadow> 1 standby
[14:15] <stxShadow> -> no more than 90 to 100 MB / Sec
[14:15] <nhm> stxShadow: btrfs or xfs?
[14:15] <stxShadow> xfs
[14:15] <stxShadow> btrfs crashed twice
[14:15] <stxShadow> under heavy load
[14:16] <darkfader> i'm glad people start to settle for xfs now
[14:16] <darkfader> we can go with btrfs when ti's done
[14:17] <nhm> When you have a moment, could you run xfs_bmap on some of the underlying data on XFS from one of your throughput tests? I'm curious to see if there is a lot of extent fragmentation.
[14:17] <stxShadow> is this non-destructive ?
[14:19] <nhm> stxShadow: It just prints the block mapping on the file.
[14:19] <nhm> stxShadow: here's the manpage: http://linux.die.net/man/8/xfs_bmap
[14:20] <stxShadow> ok .... one of the pg files is ok ?
[14:21] <stxShadow> root@fcmsnode3:/data/osd3/current/18.5_head/DIR_5/DIR_0# xfs_bmap rb.0.0.00000000001e__head_DA209805
[14:21] <stxShadow> rb.0.0.00000000001e__head_DA209805:
[14:21] <stxShadow> 0: [0..8191]: 15142252560..15142260751
[14:24] <nhm> could you run it with the verbose flag and maybe pastebin or pastie the result?
[14:24] <stxShadow> sure
[14:25] <stxShadow> hmmm .... the result is not much longer
[14:26] <stxShadow> root@fcmsnode3:/data/osd3/current/18.5_head/DIR_5/DIR_0# xfs_bmap -v rb.0.0.00000000001e__head_DA209805
[14:26] <stxShadow> rb.0.0.00000000001e__head_DA209805:
[14:26] <stxShadow> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
[14:26] <stxShadow> 0: [0..8191]: 15142252560..15142260751 7 (109867080..109875271) 8192
[14:26] <nhm> stxShadow: this is what another user was seeing: http://pastebin.com/xABVh19J
[14:27] <nhm> stxShadow: 8 extents for a single 4MB chunk of data.
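[For reference: two coarser ways to gauge XFS fragmentation than per-file xfs_bmap; a sketch assuming the osd filesystem is /dev/sdb1 mounted at /data/osd3:]

    xfs_db -r -c frag /dev/sdb1   # read-only; prints a filesystem-wide fragmentation factor
    xfs_fsr -v /data/osd3         # defragments files in place on the mounted filesystem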
[14:27] <stxShadow> active data size in the whole cluster is around 2 TB
[14:28] <stxShadow> 2012-02-24 14:27:43.866956 pg v1173545: 2062 pgs: 2062 active+clean, 2211 GB data, 4319 GB used, 25474 GB / 29794 GB avail
[14:28] <stxShadow> i will try another 4MB chunk
[14:31] <stxShadow> i've tried 10 files now .... between 2 and 8 extents
[14:32] <nhm> stxShadow: hrm, ok. That's good to know.
[14:34] <stxShadow> nhm: are you the guy who will do the official ceph performance tests ?
[14:34] <nhm> stxShadow: Need to figure out why some of those files are getting fragmented so badly.
[14:34] <darkfader> tadaaaa killed my cfuses :>
[14:35] <stxShadow> what is the definition of "extents" ?
[14:37] <nhm> stxShadow: Well, I imagine there will be lots of people doing performance testing, but my job will primarily focus on performance yes. :)
[14:37] <stxShadow> nice :)
[14:38] <stxShadow> i'm setting up a little test cluster right now ..... to figure out more about our problems
[14:38] <nhm> stxShadow: extents are basically just contiguous regions of space for some file. So if there is a lot of extent fragmentation you end up jumping around to different blocks.
[14:39] <stxShadow> ok ... and that slows everything down
[14:39] <nhm> yep. What I need to figure out is how much of what people are seeing is that vs other potential issues.
[14:43] <nhm> ok, going afk for a little while. Thanks for passing that info along!
[14:43] <stxShadow> np
[15:19] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) has joined #ceph
[15:28] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:46] <stxShadow> 2012-02-24 15:45:44.874994 log 2012-02-24 15:45:54.828151 osd.1 10.0.0.12:6803/2370 215 : [WRN] old request osd_op(client.4202.1:2330 10000000000.0000078d [write 0~4194304 [2@0]] 0.520660ca snapc 1=[]) received at 2012-02-24 15:45:24.717247 currently waiting for sub ops
[15:46] <stxShadow> can someone explain the meaning of that line to me ?
[15:58] * juriskrumins (~juriskrum@217.21.170.21) has joined #ceph
[16:00] <juriskrumins> Hi there. Can anybody help me with a ceph test installation ? I've installed ceph 0.42, set it up, started mds, mon and osd services, but I can't mount the filesystem. I get "mount error 5 = Input/output error" every time I try to mount it.
[16:12] <iggy> juriskrumins: look in dmesg for errors
[16:12] <iggy> and try turning up debugging
[16:13] <iggy> and check cluster health to make sure everything is actually up
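[For reference: the quick checks iggy suggests, assuming a working ceph.conf and admin keyring on the client:]

    dmesg | tail   # kernel-client errors from the failed mount
    ceph -s        # overall cluster status
    ceph health    # terse health summary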
[16:13] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[16:14] <juriskrumins> This is what dmesg shows me
[16:14] <juriskrumins> libceph: client0 fsid 430c054f-8a6a-4b28-9251-6ebe4f787f73
[16:14] <juriskrumins> libceph: mon0 10.10.1.10:6789 session established
[16:15] <juriskrumins> This is what ceph -s shows
[16:15] <juriskrumins> 2012-02-24 10:14:31.753652 pg v267: 192 pgs: 192 creating; 0 bytes data, 6418 MB used, 42061 MB / 51039 MB avail
[16:15] <juriskrumins> 2012-02-24 10:14:31.756342 mds e21: 1/1/1 up {0=rd05=up:creating}, 4 up:standby
[16:15] <juriskrumins> 2012-02-24 10:14:31.756406 osd e92: 5 osds: 5 up, 5 in
[16:15] <juriskrumins> 2012-02-24 10:14:31.757542 log 2012-02-24 10:00:42.705260 mon.2 10.10.1.12:6789/0 37 : [WRN] message from mon.0 was stamped 0.279176s in the future, clocks not synchronized
[16:15] <juriskrumins> 2012-02-24 10:14:31.757677 mon e1: 5 mons at {2=10.10.1.10:6789/0,3=10.10.1.11:6789/0,4=10.10.1.12:6789/0,5=10.10.1.13:6789/0,6=10.10.1.14:6789/0}
[16:15] <juriskrumins> The command mount.ceph 10.10.1.11:6789:/ /mnt/ceph simply waits for a minute and then produces the mentioned error.
[16:27] <iggy> have you tried syncing the clocks?
[16:28] <juriskrumins> Yes. The difference is no more than 0.5-0.6 sec.
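[For reference: the monitors warn even at sub-second skew, so the usual fix is a one-shot correction plus a running ntpd on every mon host; assuming ntp is installed:]

    ntpdate pool.ntp.org   # one-shot sync; then keep ntpd running for continuous correction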
[16:30] <stxShadow> juriskrumins: your mds has to be active, not creating
[16:34] <juriskrumins> Ok. How can I achieve it ? Sorry for that, I'm not familiar with ceph.
[16:36] <stxShadow> normally it should come up active at the first start of the filesystem
[16:38] <stxShadow> maybe you should turn on mds debugging
[16:56] <juriskrumins> These are the messages I have in the mds log:
[16:56] <juriskrumins> 2012-02-24 10:56:01.658979 7f8b11da9700 mds.-1.0 handle_mds_beacon up:standby seq 28 rtt 0.002077
[16:56] <juriskrumins> 2012-02-24 10:56:03.479810 7f8b03fff700 mds.-1.bal get_load no root, no load
[16:56] <juriskrumins> 2012-02-24 10:56:03.479964 7f8b03fff700 mds.-1.bal get_load mdsload<[0,0 0]/[0,0 0], req 0, hr 0, qlen 0, cpu 0.05>
[16:56] <juriskrumins> 2012-02-24 10:56:05.657247 7f8b03fff700 mds.-1.0 beacon_send up:standby seq 29 (currently up:standby)
[16:56] <juriskrumins> 2012-02-24 10:56:05.657377 7f8b03fff700 monclient: _send_mon_message to mon.5 at 10.10.1.13:6789/0
[16:56] <juriskrumins> 2012-02-24 10:56:05.657399 7f8b03fff700 -- 10.10.1.10:6800/2560 --> 10.10.1.13:6789/0 -- mdsbeacon(4703/rd02 up:standby seq 29 v29) v2 -- ?+0 0x7f8af4000920 con 0x28bed60
[16:56] <juriskrumins> 2012-02-24 10:56:05.659335 7f8b11da9700 -- 10.10.1.10:6800/2560 <== mon.3 10.10.1.13:6789/0 39 ==== mdsbeacon(4703/rd02 up:standby seq 29 v29) v2 ==== 106+0+0 (719933415 0 0) 0x7f8b040008c0 con 0x28bed60
[16:58] * BManojlovic (~steki@91.195.39.5) Quit (Ping timeout: 480 seconds)
[16:58] <juriskrumins> By the way I'm not using cephx in my configuration file.
[16:58] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[16:59] <stxShadow> is this cluster freshly initialized ?
[17:01] <juriskrumins> this is centos 6.2 with a 3.2.7 kernel. the cluster is freshly initialized. No data in it. I've experienced some problems with mkcephfs
[17:02] <juriskrumins> i ran it as mkcephfs -a -c <config_file> but for some reason the osds on the mentioned hosts didn't get initialized.
[17:03] <stxShadow> do you use xfs ?
[17:03] <juriskrumins> so after that i cleaned up everything, ran mkcephfs -a -c <cfg_file> and ceph-osd with the --mkfs option on every single host. And only after that I was able to start the ceph-osd daemon.
[17:04] <juriskrumins> No, it's ext4 with the user_xattr mount option, as mentioned in the wiki
[17:05] <juriskrumins> every single host is a KVM machine.
[17:05] <stxShadow> ok .... and you mounted the fs before using mkcephfs ?
[17:06] <juriskrumins> no
[17:06] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[17:06] <juriskrumins> I've used mkcephfs, then I've started mon and mds using startup script from 0.42 sources and then I run ceph-osd --mkfs,
[17:07] <stxShadow> mkcephfs will not mount the directories (only if you use btrfs)
[17:07] <juriskrumins> /dev/vdb is already mounted into /data
[17:07] <stxShadow> what is the output of "ceph osd tree" ?
[17:08] <juriskrumins> [root@rd02 ceph]# ceph osd tree
[17:08] <juriskrumins> 2012-02-24 11:07:51.131693 mon <- [osd,tree]
[17:08] <juriskrumins> 2012-02-24 11:07:51.157339 mon.0 -> 'dumped osdmap tree epoch 93' (0)
[17:08] <juriskrumins> # id weight type name up/down reweight
[17:08] <juriskrumins> -1 0 pool default
[17:08] <juriskrumins> 2 0 osd.2 up 1
[17:08] <juriskrumins> 3 0 osd.3 up 1
[17:08] <juriskrumins> 4 0 osd.4 up 1
[17:08] <juriskrumins> 5 0 osd.5 up 1
[17:08] <juriskrumins> 6 0 osd.6 up 1
[17:09] <stxShadow> hmmm .... i'm not a real expert in this .... but the osd map looks odd
[17:10] <stxShadow> thats mine:
[17:10] <stxShadow> 2012-02-24 17:09:44.031440 mon.2 -> 'dumped osdmap tree epoch 1825' (0)
[17:10] <stxShadow> # id weight type name up/down reweight
[17:10] <stxShadow> -1 3 pool default
[17:10] <stxShadow> -3 3 rack unknownrack
[17:10] <stxShadow> -2 1 host fcmsnode0
[17:10] <stxShadow> 0 1 osd.0 up 1
[17:10] <stxShadow> -6 1 host fcmsnode2
[17:10] <stxShadow> 2 1 osd.2 up 1
[17:10] <stxShadow> -4 1 host fcmsnode3
[17:10] <stxShadow> 3 1 osd.3 up 1
[17:10] <stxShadow> -5 1 host fcmsnode4
[17:10] <stxShadow> 4 1 osd.4 up 1
[17:10] <stxShadow> i think you have to edit the crushmap
[17:10] <stxShadow> for every new osd
[17:11] <stxShadow> http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction
[17:11] <stxShadow> from wiki:
[17:11] <stxShadow> Adding the OSD to the system doesn't necessarily mean the system will put data on it. You need to check (and possibly update) the CRUSH placement map (part of the OSD map).
[17:12] <juriskrumins> Ok, I'll carefully review my configuration, but I'd like to ask: shouldn't mkcephfs -a -c <cfg_file> do all these things, including the osd host configuration ?
[17:13] <stxShadow> yes .... it should ..... but you mentioned errors before
[17:14] * nyeates (~nyeates@pool-173-59-237-128.bltmmd.fios.verizon.net) has joined #ceph
[17:14] <stxShadow> mkcephfs will set up a crushmap according to your ceph.conf
[17:14] <juriskrumins> and the init script that comes with the ceph source code should start up ceph-osd on the osd hosts ?
[17:14] <stxShadow> you have to start them manually after "mkcephfs"
[17:14] <stxShadow> or: /etc/init.d/ceph -a start
[17:15] <stxShadow> on the host you issue the mkcephfs command on
[17:15] <stxShadow> is your master able to ssh passwordless to the others ?
[17:16] <juriskrumins> Yes. This is how I provision my ceph.conf to the other hosts.
[17:17] <stxShadow> ok ;)
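[For reference: mkcephfs -a drives the remote hosts over ssh, so it needs passwordless root logins; the standard key setup, assuming rd02 is the admin host:]

    ssh-keygen -t rsa         # on rd02; accept the defaults, empty passphrase
    ssh-copy-id root@rd03     # repeat for rd04, rd05 and rd06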
[17:17] <stxShadow> maybe you can post the mkcephfs error ?
[17:19] <juriskrumins> just a minute I'll start all over again.
[17:20] <juriskrumins> This is my config file:
[17:20] <juriskrumins> [global]
[17:20] <juriskrumins> max open files = 131072
[17:20] <juriskrumins> log file = /var/log/ceph/$name.log
[17:20] <juriskrumins> pid file = /var/run/ceph/$name.pid
[17:20] <juriskrumins> keyring = /etc/ceph/keyring.admin
[17:20] <juriskrumins> [mon]
[17:20] <juriskrumins> mon data = /data/$name
[17:20] <juriskrumins> [mon.2]
[17:20] <juriskrumins> host = rd02
[17:20] <juriskrumins> mon addr = 10.10.1.10:6789
[17:20] <juriskrumins> [mon.3]
[17:20] <juriskrumins> host = rd03
[17:20] <juriskrumins> mon addr = 10.10.1.11:6789
[17:20] <juriskrumins> [mon.4]
[17:20] <juriskrumins> host = rd04
[17:20] <juriskrumins> mon addr = 10.10.1.12:6789
[17:20] <juriskrumins> [mon.5]
[17:20] <juriskrumins> host = rd05
[17:21] <juriskrumins> mon addr = 10.10.1.13:6789
[17:21] <juriskrumins> [mon.6]
[17:21] <juriskrumins> host = rd06
[17:21] <juriskrumins> mon addr = 10.10.1.14:6789
[17:21] <juriskrumins> [mds]
[17:21] <juriskrumins> debug ms = 1 ; message traffic
[17:21] <juriskrumins> debug mds = 20 ; mds
[17:21] <juriskrumins> debug mds balancer = 20 ; load balancing
[17:21] <juriskrumins> debug mds log = 20 ; mds journaling
[17:21] <juriskrumins> debug mds_migrator = 20 ; metadata migration
[17:21] <juriskrumins> debug monc = 20 ; monitor interaction, startup
[17:21] <juriskrumins> keyring = /etc/ceph/keyring.$name
[17:21] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) has joined #ceph
[17:21] <juriskrumins> [mds.rd02]
[17:21] <juriskrumins> host = rd02
[17:21] <juriskrumins> [mds.rd03]
[17:21] <juriskrumins> host = rd03
[17:21] <juriskrumins> [mds.rd04]
[17:21] <juriskrumins> host = rd04
[17:21] <juriskrumins> [mds.rd05]
[17:21] <juriskrumins> host = rd05
[17:21] <juriskrumins> [mds.rd06]
[17:21] <juriskrumins> host = rd06
[17:21] <juriskrumins> [osd]
[17:21] <juriskrumins> osd data = /data/$name
[17:21] <juriskrumins> osd journal = /data/$name/journal
[17:21] <juriskrumins> osd journal size = 1000 ; journal size, in megabytes
[17:21] <juriskrumins> osd class dir = /opt/ceph-0.42/lib/rados-classes/
[17:21] <juriskrumins> keyring = /etc/ceph/keyring.$name
[17:22] <juriskrumins> [ods.2]
[17:22] <juriskrumins> host = rd02
[17:22] <juriskrumins> [ods.3]
[17:22] <juriskrumins> host = rd03
[17:22] <juriskrumins> [ods.4]
[17:22] <juriskrumins> host = rd04
[17:22] <juriskrumins> [ods.5]
[17:22] <juriskrumins> host = rd05
[17:22] <juriskrumins> [ods.6]
[17:22] <juriskrumins> host = rd06
[17:22] <juriskrumins> the /data directory is empty and all services are down.
[17:23] <juriskrumins> The output from mkcephfs -a -c /etc/ceph/ceph.conf is the following
[17:23] <juriskrumins> [root@rd02 ceph]# mkcephfs -a -c /etc/ceph/ceph.conf
[17:23] <juriskrumins> temp dir is /tmp/mkcephfs.LORoiZlhQq
[17:23] <juriskrumins> preparing monmap in /tmp/mkcephfs.LORoiZlhQq/monmap
[17:23] <juriskrumins> /opt/ceph-0.42/bin/monmaptool --create --clobber --add 2 10.10.1.10:6789 --add 3 10.10.1.11:6789 --add 4 10.10.1.12:6789 --add 5 10.10.1.13:6789 --add 6 10.10.1.14:6789 --print /tmp/mkcephfs.LORoiZlhQq/monmap
[17:23] <juriskrumins> /opt/ceph-0.42/bin/monmaptool: monmap file /tmp/mkcephfs.LORoiZlhQq/monmap
[17:23] <juriskrumins> /opt/ceph-0.42/bin/monmaptool: generated fsid 0b05fc0b-f752-452c-b139-50e6beb7650b
[17:23] <juriskrumins> epoch 0
[17:23] <juriskrumins> fsid 0b05fc0b-f752-452c-b139-50e6beb7650b
[17:23] <juriskrumins> last_changed 2012-02-24 11:21:55.853074
[17:23] <juriskrumins> created 2012-02-24 11:21:55.853074
[17:23] <juriskrumins> 0: 10.10.1.10:6789/0 mon.2
[17:23] <juriskrumins> 1: 10.10.1.11:6789/0 mon.3
[17:23] <juriskrumins> 2: 10.10.1.12:6789/0 mon.4
[17:23] <juriskrumins> 3: 10.10.1.13:6789/0 mon.5
[17:23] <juriskrumins> 4: 10.10.1.14:6789/0 mon.6
[17:23] <juriskrumins> /opt/ceph-0.42/bin/monmaptool: writing epoch 0 to /tmp/mkcephfs.LORoiZlhQq/monmap (5 monitors)
[17:23] <juriskrumins> === mds.rd02 ===
[17:23] <juriskrumins> creating private key for mds.rd02 keyring /etc/ceph/keyring.mds.rd02
[17:23] <juriskrumins> creating /etc/ceph/keyring.mds.rd02
[17:23] <juriskrumins> === mds.rd03 ===
[17:23] <juriskrumins> pushing conf and monmap to rd03:/tmp/mkfs.ceph.3489
[17:23] <juriskrumins> creating private key for mds.rd03 keyring /etc/ceph/keyring.mds.rd03
[17:23] <juriskrumins> creating /etc/ceph/keyring.mds.rd03
[17:23] <juriskrumins> collecting mds.rd03 key
[17:23] <juriskrumins> === mds.rd04 ===
[17:23] <juriskrumins> pushing conf and monmap to rd04:/tmp/mkfs.ceph.3489
[17:23] <juriskrumins> creating private key for mds.rd04 keyring /etc/ceph/keyring.mds.rd04
[17:23] <juriskrumins> creating /etc/ceph/keyring.mds.rd04
[17:23] <juriskrumins> collecting mds.rd04 key
[17:23] <juriskrumins> === mds.rd05 ===
[17:23] <juriskrumins> pushing conf and monmap to rd05:/tmp/mkfs.ceph.3489
[17:24] <juriskrumins> creating private key for mds.rd05 keyring /etc/ceph/keyring.mds.rd05
[17:24] <juriskrumins> creating /etc/ceph/keyring.mds.rd05
[17:24] <juriskrumins> collecting mds.rd05 key
[17:24] <juriskrumins> === mds.rd06 ===
[17:24] <juriskrumins> pushing conf and monmap to rd06:/tmp/mkfs.ceph.3489
[17:24] <juriskrumins> creating private key for mds.rd06 keyring /etc/ceph/keyring.mds.rd06
[17:24] <juriskrumins> creating /etc/ceph/keyring.mds.rd06
[17:24] <juriskrumins> collecting mds.rd06 key
[17:24] <juriskrumins> Building generic osdmap from /tmp/mkcephfs.LORoiZlhQq/conf
[17:24] <juriskrumins> /opt/ceph-0.42/bin/osdmaptool: osdmap file '/tmp/mkcephfs.LORoiZlhQq/osdmap'
[17:24] <juriskrumins> /opt/ceph-0.42/bin/osdmaptool: writing epoch 1 to /tmp/mkcephfs.LORoiZlhQq/osdmap
[17:24] <juriskrumins> Generating admin key at /tmp/mkcephfs.LORoiZlhQ
[17:24] <stxShadow> maybe you should use pastebin
[17:24] <stxShadow> ;)
[17:24] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[17:24] <juriskrumins> sorry for that
[17:25] <stxShadow> that's not the whole log ?
[17:25] <stxShadow> a lot is missing
[17:25] * BManojlovic (~steki@91.195.39.5) Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:25] <stxShadow> if you don't use cephx
[17:25] <juriskrumins> this is the output from mkcephfs
[17:25] <stxShadow> you may delete the keyring config entries
[17:26] <stxShadow> -> <juriskrumins> Generating admin key at /tmp/mkcephfs.LORoiZlhQ
[17:26] <stxShadow> that's the last i see here
[17:26] <stxShadow> the whole osd setup is missing
[17:26] <juriskrumins> Yes. This is the last line I have on screen
[17:27] <juriskrumins> that's the reason why, as I previously mentioned, I ran ceph-osd --mkfs manually on every single osd host.
[17:27] <stxShadow> maybe you should start with fewer mon and mds entries ... -> let's say 3 .... you may add more later
[17:28] <stxShadow> -> the osdmap is your problem
[17:28] <stxShadow> it is not posted to the osds
[17:30] <stxShadow> hmm
[17:30] <stxShadow> you can build one by hand
[17:30] <stxShadow> with :
[17:30] <stxShadow> osdmaptool --createsimple 6 --clobber /tmp/osdmap.junk --export-crush /tmp/crush.new
[17:31] <stxShadow> take a look to that map with
[17:31] <stxShadow> crushtool -d /tmp/crush.new
[17:31] <stxShadow> if everything is ok inject it
[17:31] <stxShadow> ceph osd setcrushmap -i /tmp/crush.new
[17:31] <stxShadow> -> maybe this will work for you
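[For reference: roughly what crushtool -d prints for a flat map like the one above; a sketch from memory of the 0.4x format, not exact output:]

    # devices
    device 0 osd.0
    device 1 osd.1

    # types
    type 0 osd
    type 1 host
    type 2 rack
    type 3 pool

    # buckets
    pool default {
        id -1
        alg straw
        hash 0
        item osd.0 weight 1.000
        item osd.1 weight 1.000
    }

    # rules
    rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 0 type osd
        step emit
    }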
[17:33] <stxShadow> btw .... which version do you use? i had nearly the same problem with 0.41
[17:33] <juriskrumins> 0.42
[17:34] * tnt_ (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[17:35] <juriskrumins> Ok. I have to go now but I'll try your suggestions and will be back with report :). Thanks for your help.
[17:36] <stxShadow> np
[17:37] * juriskrumins (~juriskrum@217.21.170.21) Quit (Quit: Leaving)
[17:39] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:45] * Anticimex (anticimex@netforce.csbnet.se) Quit (Remote host closed the connection)
[17:45] * Anticimex (anticimex@netforce.csbnet.se) has joined #ceph
[17:46] * tnt_ (~tnt@80.63-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[17:50] * Tv|work (~Tv__@aon.hq.newdream.net) has joined #ceph
[17:56] * phil_ (~quassel@chello080109010223.16.14.vie.surfer.at) Quit (Remote host closed the connection)
[17:56] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) Quit (Quit: Ex-Chat)
[17:58] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[17:59] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[18:00] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[18:36] * stxShadow (~jens@jump.filoo.de) Quit (Ping timeout: 480 seconds)
[18:48] * BManojlovic (~steki@212.200.243.83) has joined #ceph
[18:54] * Kioob (~kioob@luuna.daevel.fr) Quit (Quit: Leaving.)
[18:56] * lxo (~aoliva@lxo.user.oftc.net) Quit (Read error: Connection reset by peer)
[18:56] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:57] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[19:01] * chutzpah (~chutz@216.174.109.254) has joined #ceph
[19:12] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:29] <Tv|work> 2285 packets transmitted, 0 received, 100% packet loss, time 2302273ms
[19:29] <Tv|work> there goes my productive Friday
[19:35] <joao> lol
[19:35] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[19:39] <Tv|work> sjust2: works again, at least for now
[19:42] <sjust2> yup
[20:36] * filoo (~jens@ip-88-153-224-220.unitymediagroup.de) has joined #ceph
[20:51] * filoo (~jens@ip-88-153-224-220.unitymediagroup.de) Quit (Ping timeout: 480 seconds)
[21:23] * lofejndif (~lsqavnbok@191.Red-83-34-192.dynamicIP.rima-tde.net) has joined #ceph
[21:36] * nyeates (~nyeates@pool-173-59-237-128.bltmmd.fios.verizon.net) Quit (Quit: Zzzzzz)
[21:38] * filoo (~jens@ip-88-153-224-220.unitymediagroup.de) has joined #ceph
[22:16] * filoo (~jens@ip-88-153-224-220.unitymediagroup.de) Quit (Quit: Verlassend)
[22:21] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[22:55] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) Quit (Quit: Ex-Chat)
[23:05] * cattelan_away is now known as cattelan
[23:35] * joao (~JL@ace.ops.newdream.net) Quit (Quit: Leaving)
[23:42] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.