#ceph IRC Log

IRC Log for 2012-11-12

Timestamps are in GMT/BST.

[0:00] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[0:01] * loicd (~loic@magenta.dachary.org) has joined #ceph
[0:10] * Leseb (~Leseb@5ED17881.cm-7-2b.dynamic.ziggo.nl) Quit (Quit: Leseb)
[0:11] * jantje (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[0:11] * jantje (~jan@paranoid.nl) has joined #ceph
[0:15] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:05] * yoshi (~yoshi@p30106-ipngn4002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[1:20] * Cube (~Cube@12.248.40.138) has joined #ceph
[1:30] * jtang1 (~jtang@wlan-clients-1875.sc12.org) Quit (Quit: Leaving.)
[1:30] * didders_ (~btaylor@142.196.239.240) has joined #ceph
[1:32] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[1:54] * didders_ (~btaylor@142.196.239.240) Quit (Ping timeout: 480 seconds)
[2:04] * jtang1 (~jtang@75-148-97-154-Utah.hfc.comcastbusiness.net) has joined #ceph
[2:05] * jtang1 (~jtang@75-148-97-154-Utah.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[2:05] * jtang1 (~jtang@75-148-97-154-Utah.hfc.comcastbusiness.net) has joined #ceph
[2:19] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[2:19] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[2:34] * wilson (~wilson@CPE001c1025d510-CM001ac317ccea.cpe.net.cable.rogers.com) has joined #ceph
[2:39] * jtang2 (~jtang@75-148-97-154-Utah.hfc.comcastbusiness.net) has joined #ceph
[2:39] * jtang1 (~jtang@75-148-97-154-Utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[2:51] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Ping timeout: 480 seconds)
[2:53] * didders_ (~btaylor@142.196.239.240) has joined #ceph
[2:58] * maxiz (~pfliu@202.108.130.138) has joined #ceph
[3:13] * maxiz (~pfliu@202.108.130.138) Quit (Ping timeout: 480 seconds)
[3:16] * rturk (~rturk@ps94005.dreamhost.com) Quit (Ping timeout: 480 seconds)
[3:16] * cephalobot` (~ceph@ps94005.dreamhost.com) Quit (Ping timeout: 480 seconds)
[3:21] * rturk (~rturk@ps94005.dreamhost.com) has joined #ceph
[3:22] * cephalobot (~ceph@ps94005.dreamhost.com) has joined #ceph
[3:23] * maxiz (~pfliu@202.108.130.138) has joined #ceph
[3:25] <wilson> anyone using zeusram for journaling?
[3:58] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[3:58] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:05] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[4:05] * miroslav1 (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:07] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:07] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[4:09] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[4:09] * miroslav1 (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:09] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[4:09] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:13] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[4:13] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[4:38] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[4:38] * miroslav1 (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:39] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[4:39] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[4:51] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Remote host closed the connection)
[4:51] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[5:07] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[5:07] * miroslav1 (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:15] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[5:15] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:17] * miroslav1 (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:17] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[5:21] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[5:21] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:22] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[5:22] * miroslav1 (~miroslav@64.55.78.243) Quit (Write error: connection closed)
[5:22] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[5:22] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:22] * miroslav1 (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:22] * miroslav (~miroslav@64.55.78.243) has joined #ceph
[5:47] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (Ping timeout: 480 seconds)
[5:48] * miroslav1 (~miroslav@64.55.78.243) has joined #ceph
[5:49] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[5:49] * miroslav (~miroslav@64.55.78.243) Quit (Read error: Connection reset by peer)
[5:50] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[5:55] * Cube (~Cube@12.248.40.138) has joined #ceph
[5:58] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[5:58] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[6:01] * yoshi (~yoshi@p30106-ipngn4002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[6:06] * yoshi (~yoshi@p30106-ipngn4002marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[6:09] * loicd (~loic@magenta.dachary.org) has joined #ceph
[6:10] * loicd (~loic@magenta.dachary.org) Quit ()
[6:11] * yehuda_hm (~yehuda@2602:306:330b:a40:3109:5ab4:9f18:4cd) Quit (Ping timeout: 480 seconds)
[6:12] * miroslav1 (~miroslav@64.55.78.243) Quit (Quit: Leaving.)
[6:30] * jtang2 (~jtang@75-148-97-154-Utah.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[6:34] * loicd (~loic@90.84.144.14) has joined #ceph
[6:34] * didders_ (~btaylor@142.196.239.240) has left #ceph
[7:42] * mib_6loilj (ca037809@ircip3.mibbit.com) has joined #ceph
[7:42] <mib_6loilj> How can we modify size of single Object unit ?
[8:21] * loicd (~loic@90.84.144.14) Quit (Ping timeout: 480 seconds)
[8:47] * gucki (~smuxi@HSI-KBW-082-212-034-021.hsi.kabelbw.de) Quit (Ping timeout: 480 seconds)
[8:47] * gucki (~smuxi@HSI-KBW-082-212-034-021.hsi.kabelbw.de) has joined #ceph
[8:59] * ctrl (~Nrg3tik@78.25.73.250) Quit (Read error: Connection reset by peer)
[9:02] * vagabon (~fbui@au213-1-82-235-205-153.fbx.proxad.net) has joined #ceph
[9:13] * Leseb (~Leseb@193.172.124.196) has joined #ceph
[9:19] * antsygeek (~antsygeek@164.138.27.156) has joined #ceph
[9:26] * Cube (~Cube@12.248.40.138) Quit (Ping timeout: 480 seconds)
[9:27] <mib_6loilj> How can we modify size of single Object unit ?
[9:27] <NaioN> mib_6loilj: what do you mean...
[9:27] <NaioN> are you using rbd's?
[9:31] <Leseb> NaioN: I think he wants to know, how to change the default size of an object? mib_6loilj : am I correct?
[9:31] <NaioN> Leseb: the object doesn't have a default size
[9:32] <Leseb> NaioN: 4MB?
[9:32] <NaioN> that's with rbd's
[9:32] <Leseb> NaioN: oh yes, stripped… :)
[9:33] <NaioN> striped :)
[9:33] <Leseb> even better :)
[9:33] <NaioN> it's an option when you create the rbd
[9:34] <NaioN> --order <bits> the object size in bits, such that the objects are (1 << order) bytes. Default is 22 (4 MB).
[9:34] <NaioN> that's from the rbd command
[9:34] <Leseb> yes yes
[9:35] <NaioN> so the rbd gets split into parts of that size and striped over the osd's
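The arithmetic behind --order can be made concrete with a minimal Python sketch. The function names and the 1000 MB example image are illustrative assumptions, not part of the rbd tool; only the (1 << order) relation and the default of 22 come from the help text quoted above.

```python
def object_size_bytes(order: int) -> int:
    """Object size implied by an rbd --order value: (1 << order) bytes."""
    return 1 << order


def object_count(image_bytes: int, order: int) -> int:
    """Objects needed to hold image_bytes (hypothetical helper)."""
    size = object_size_bytes(order)
    return (image_bytes + size - 1) // size  # ceiling division


MB = 1024 * 1024
print(object_size_bytes(22) // MB)   # default order 22 -> 4 (MB)
print(object_count(1000 * MB, 22))   # 1000 MB image, 4 MB objects -> 250
print(object_count(1000 * MB, 23))   # 1000 MB image, 8 MB objects -> 125
```

Raising the order by one doubles the object size and roughly halves the number of objects an image is split into.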
[9:36] <Leseb> I know
[9:36] <NaioN> but this is rbd specific and i don't know if it's possible to stripe files on the cephfs
[9:36] <Leseb> well AFAIK, the MDS only maintains metadata related to cephfs, the rest is striped over objects as well
[9:45] <antsygeek> hmm, can ceph be used with two hosts? I've read there should always be an odd amount of servers (like 3, 5, or 7)
[9:48] <Leseb> I guess the odd number refers to the monitors
[9:49] <Leseb> you can run a 2 nodes cluster, with _only_ one monitor and X number of OSDs
[9:49] <antsygeek> Leseb: what happens when the monitor dies?
[9:51] <Leseb> well everything collapses, you don't lose data but you can't access it
[9:51] <Leseb> this is why 3 is recommended
[9:51] <antsygeek> but i can't run the monitors on two nodes?
[9:51] <antsygeek> ok
[9:52] <Leseb> antsygeek: no, you can't, you always need an odd number (for a majority vote, 50% is not a majority)
[9:52] <ramsay_za> well you need to have an odd number of monitors in the environment so they don't get deadlocked
[9:53] <antsygeek> okay
[9:53] <antsygeek> thanks guys
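The majority argument above can be sketched in a few lines of Python. This assumes the monitors need a strict majority of their total number to form a quorum, as described in the conversation; the function names are made up for illustration.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of the given number of monitors."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """Monitors that can fail while a majority is still reachable."""
    return monitors - quorum_size(monitors)


for n in range(1, 6):
    print(f"{n} mons: quorum {quorum_size(n)}, "
          f"can lose {tolerated_failures(n)}")
```

Under this model two monitors tolerate no more failures than one, and four no more than three, which is why odd counts such as 3 or 5 are the usual recommendation.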
[9:55] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[9:59] * antsygeek (~antsygeek@164.138.27.156) has left #ceph
[10:03] * loicd (~loic@193.49.201.35) has joined #ceph
[10:42] * maxiz (~pfliu@202.108.130.138) Quit (Quit: Ex-Chat)
[10:46] * Ryan_Lane (~Adium@203.185.194.108) has joined #ceph
[10:58] * tryggvil (~tryggvil@16-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[10:58] * Ryan_Lane (~Adium@203.185.194.108) Quit (Quit: Leaving.)
[10:59] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[11:06] <vagabon> did anybody compile successfully ceph with gcc 4.6.3 ?
[11:10] * Ryan_Lane (~Adium@203.185.194.108) has joined #ceph
[11:14] <jluis> vagabon, yes
[11:15] <jluis> gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
[11:16] <vagabon> jluis: ok thanks
[11:20] * yoshi (~yoshi@p30106-ipngn4002marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:21] * sin (~sinner@78.107.155.77) has joined #ceph
[11:22] <sin> Hello. Can someone help me with little problem about CEPH.
[11:22] <sin> ?
[11:25] <Robe> just ask.
[11:26] <sin> I can't mount ceph storage on a virtual host. It says: mount error 2 no such file or directory
[11:28] <sin> When I mount the same storage on a normal server everything is ok
[11:29] * Ryan_Lane (~Adium@203.185.194.108) Quit (Quit: Leaving.)
[11:30] <sin> CEPH configuration very simple. 1-mon+1-mds+1-osd
[11:31] <sin> mounting by mount.ceph
[11:33] <sin> Any idea? If need more info - just say what kind
[11:34] <sin> Sorry for my English.
[11:35] <sin> Problem only on virtual hosts.
[11:43] <mib_6loilj> mount.ceph mon1_ip:portno:/ /home/username/local_dir
[11:44] <sin> I use more like: mount -t ceph mon1_ip:/
[11:45] <sin> mount -t ceph mon1_ip:/ /localdir -o name=admin,secret=<key>
[11:46] <mib_6loilj> this is also fine
[11:46] <mib_6loilj> i used it earlier
[11:46] <mib_6loilj> bt try new one ... its good
[11:46] <mib_6loilj> :)
[11:47] <sin> port_number is required?
[11:51] <sin> same error
[11:51] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[11:55] <gucki> mh, the bugtracker is still down? :(
[11:58] <sin> I don't think that is a bug. But I can't find the error on my side. I need a fresh point of view.
[11:59] <sin> Also I did not find any mention of the same error.
[12:12] <Cube> tracker fixed.
[12:14] * loicd (~loic@193.49.201.35) Quit (Ping timeout: 480 seconds)
[12:18] <jluis> sin, you probably have replication 2 and only one osd, leading to an unhealthy status, which makes the mds unable to let you mount the fs
[12:18] <jluis> check 'ceph -s' and 'ceph osd dump | grep rep'
[12:24] <sin> pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0 crash_replay_interval 45
[12:24] <sin> pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0
[12:24] <sin> pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0
[12:27] <jluis> yeah, thought so
[12:28] <sin> Sorry. Could you tell me what is wrong with me?
[12:28] <jluis> so, you have one of two options: either don't really care about failure tolerance, are okay with not having replication and stick to only one osd, or you add another osd and everything should normalize
[12:29] <jluis> also, could you please show us the output of 'ceph -s'?
[12:29] <jluis> just to make sure that's all that's going on with the cluster
[12:29] * MikeMcClurg (~mike@62.200.22.2) has joined #ceph
[12:29] <sin> ceph -s
[12:29] <sin> health HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 22/44 degraded (50.000%)
[12:29] <sin> monmap e1: 1 mons at {a=5.9.2.75:6789/0}, election epoch 1, quorum 0 a
[12:29] <sin> osdmap e68: 1 osds: 1 up, 1 in
[12:29] <sin> pgmap v179: 384 pgs: 384 active+degraded; 19599 bytes data, 1060 MB used, 2760 GB / 2761 GB avail; 22/44 degraded (50.000%)
[12:29] <sin> mdsmap e110: 1/1/1 up {0=a=up:active}
[12:30] <sin> Sorry, but does that affect only virtual hosts?
[12:30] <mib_6loilj> see by default
[12:31] <mib_6loilj> replication factor is set to 2
[12:31] <mib_6loilj> for all PG
[12:31] <jluis> sin, what do you mean only virtual hosts?
[12:31] <mib_6loilj> so u have only 1 node (osd ) with you right?
[12:32] <sin> I can`t mount only on virtual hosts, that running under libvirt
[12:32] <sin> qemu
[12:33] <jluis> what other way are you able to mount?
[12:35] <sin> If i mount on another server with ceph client installed, everything is OK
[12:35] <sin> server ral
[12:35] <sin> server real
[12:37] <jluis> got to say that I find that odd, but still, adding a new osd to the cluster or reducing the replication level to 1 should do the trick
[12:38] <jluis> afaik, while you have stuck unclean pgs, the osds shouldn't even let you mount
[12:38] <jluis> but I might be wrong
[12:39] * BManojlovic (~steki@85.222.180.134) has joined #ceph
[12:39] <sin> if you have ceph client i can give a key
[12:39] <sin> ^)
[12:39] <sin> :)
[12:39] <sin> For test
[12:40] <jluis> anyway, if you want to keep the replication level at 2, you should just add another osd and let the osds rebalance, and once you have all pgs active everything should be okay
[12:40] <jluis> or you can reduce your replication level to 1 on all pools and that HEALTH_WARN should go away
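The 50% figure in the 'ceph -s' output above follows from asking for two copies with only one OSD to put them on. A minimal Python sketch of that arithmetic, under a simplified model: each OSD is assumed to hold at most one copy of a given object, and reading "22/44" as 22 objects times 2 copies is an interpretation of the output, not something the log states. The function name is hypothetical.

```python
def degraded_copies(objects: int, pool_size: int, osds_up: int) -> tuple:
    """Return (missing copies, wanted copies) under the simplified model.

    Assumption: one OSD stores at most one copy of an object, so only
    min(pool_size, osds_up) copies of each object can be placed.
    """
    wanted = objects * pool_size
    placed = objects * min(pool_size, osds_up)
    return wanted - placed, wanted


missing, wanted = degraded_copies(objects=22, pool_size=2, osds_up=1)
print(f"{missing}/{wanted} degraded ({100 * missing / wanted:.3f}%)")

# Either fix suggested above brings the missing count to zero in this model.
print(degraded_copies(objects=22, pool_size=2, osds_up=2))  # second OSD added
print(degraded_copies(objects=22, pool_size=1, osds_up=1))  # size reduced to 1
```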
[12:40] <jluis> believe it or not, I don't think I have a single dev machine with a ceph client built
[12:45] * Steki (~steki@85.222.177.164) Quit (Ping timeout: 480 seconds)
[12:46] <sin> If you don`t mind, can you tell me how to reduce replication level.
[12:47] <jluis> 'ceph osd pool set <pool-name> size <replication-factor>'
[12:47] <jluis> for i in data metadata rbd; do ceph osd pool set $i size 1 ; done
[12:47] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[12:48] <sin> Thanks a lot!
[12:48] <jluis> sin, I think I should tell you yet again that a replication factor of 1 does *not* provide any kind of fault tolerance
[12:48] <jluis> if you lose your osd, you'll lose your data
[12:48] <sin> This is just test server
[12:48] <jluis> no way around it, unless you can summon all the lost bits from thin air ;)
[12:49] <jluis> okay then
[12:49] <jluis> you can check if your replication level was adjusted with 'ceph osd dump | grep rep'
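For checking the result by script rather than by eye, here is a minimal Python sketch that pulls the replication size per pool out of text shaped like the 'ceph osd dump | grep rep' lines pasted earlier. The regular expression is written against those pasted lines only; other Ceph versions may format this output differently, and the function name is an assumption.

```python
import re

# Matches lines such as: pool 0 'data' rep size 2 crush_ruleset 0 ...
POOL_LINE = re.compile(r"^pool (\d+) '([^']+)' rep size (\d+)")


def pool_sizes(dump_text: str) -> dict:
    """Map pool name to replication size for every matching line."""
    sizes = {}
    for line in dump_text.splitlines():
        match = POOL_LINE.match(line.strip())
        if match:
            sizes[match.group(2)] = int(match.group(3))
    return sizes


sample = """\
pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 128 pgp_num 128 last_change 1 owner 0
"""

sizes = pool_sizes(sample)
print(sizes)
osd_count = 1  # the single OSD from the cluster discussed above
print([name for name, size in sizes.items() if size > osd_count])
```

The second print lists the pools that ask for more copies than there are OSDs, which for the pasted dump is all three.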
[12:51] <sin> # ceph health
[12:51] <sin> HEALTH_OK
[12:52] <sin> But the error is still there
[12:53] <sin> # mount -t ceph 5.9.2.75:/ /mnt/ -vvv -o name=admin,secret=AQBA8pxQ8MN6NRAAAd5jlbiMOcqShL+HscrqeQ==
[12:53] <sin> mount: fstab path: "/etc/fstab"
[12:53] <sin> mount: mtab path: "/etc/mtab"
[12:53] <sin> mount: lock path: "/etc/mtab~"
[12:53] <sin> mount: temp path: "/etc/mtab.tmp"
[12:53] <sin> mount: UID: 0
[12:53] <sin> mount: eUID: 0
[12:53] <sin> mount: spec: "5.9.2.75:/"
[12:53] <sin> mount: node: "/mnt/"
[12:53] <sin> mount: types: "ceph"
[12:53] <sin> mount: opts: "name=admin,secret=AQBA8pxQ8MN6NRAAAd5jlbiMOcqShL+HscrqeQ=="
[12:53] <sin> mount: external mount: argv[0] = "/usr/sbin/mount.ceph"
[12:53] <sin> mount: external mount: argv[1] = "5.9.2.75:/"
[12:53] <sin> mount: external mount: argv[2] = "/mnt/"
[12:53] <sin> mount: external mount: argv[3] = "-v"
[12:53] <sin> mount: external mount: argv[4] = "-o"
[12:53] <sin> mount: external mount: argv[5] = "rw,name=admin,secret=AQBA8pxQ8MN6NRAAAd5jlbiMOcqShL+HscrqeQ=="
[12:53] <sin> parsing options: rw,name=admin,secret=AQBA8pxQ8MN6NRAAAd5jlbiMOcqShL+HscrqeQ==
[12:53] <sin> mount error 2 = No such file or directory
[12:54] <jluis> this might sound dumb, but does /mnt exist ?
[12:54] <sin> total 3144
[12:54] <sin> drwxr-xr-x 20 root root 4096 Nov 12 13:24 .
[12:54] <sin> drwxr-xr-x 20 root root 4096 Nov 12 13:24 ..
[12:54] <sin> drwxr-xr-x 2 root root 4096 Nov 12 12:01 bin
[12:54] <sin> drwxr-xr-x 3 root root 54 Nov 12 12:01 boot
[12:54] <sin> -rw-r--r-- 1 root root 3151120 Nov 9 20:59 bzImage
[12:54] <sin> drwxr-xr-x 10 root root 15160 Nov 12 14:51 dev
[12:54] <sin> drwxr-xr-x 32 root root 4096 Nov 12 14:51 etc
[12:54] <sin> drwxr-xr-x 2 root root 18 Oct 17 00:54 home
[12:54] <sin> drwxr-xr-x 8 root root 4096 Nov 12 12:01 lib
[12:55] <sin> drwxr-xr-x 2 root root 18 Oct 17 00:54 media
[12:55] <sin> drwxr-xr-x 2 root root 18 Oct 17 00:54 mnt
[12:55] <sin> drwxr-xr-x 2 root root 18 Oct 17 00:54 opt
[12:55] <sin> dr-xr-xr-x 65 root root 0 Nov 12 14:51 proc
[12:55] <sin> -rwxr-xr-x 1 root root 26970 Oct 23 10:42 pxelinux.0
[12:55] <sin> drwxr-xr-x 2 root root 20 Nov 9 16:26 pxelinux.cfg
[12:55] <sin> drwx------ 3 root root 49 Nov 9 21:10 root
[12:55] <sin> drwxr-xr-x 5 root root 160 Nov 12 14:51 run
[12:55] <sin> drwxr-xr-x 2 root root 4096 Nov 12 12:01 sbin
[12:55] <sin> dr-xr-xr-x 11 root root 0 Nov 12 14:51 sys
[12:55] <sin> drwxrwxrwt 4 root root 38 Nov 12 14:51 tmp
[12:55] <sin> drwxr-xr-x 12 root root 147 Nov 9 16:36 usr
[12:55] <sin> drwxr-xr-x 9 root root 102 Oct 17 00:54 var
[12:55] <jluis> you know, there was no need for a full paste out of ls / ;)
[12:55] <jluis> but thanks
[12:55] <sin> Sorry
[12:56] <sin> my mistake
[12:56] <sin> for example i can show mount -vvv on real server
[12:57] <jluis> anyway, not an expert on qemu-related stuff, so bear with me as I'm flying a bit blind here
[12:57] <jluis> are you sure your virtual host can connect to 5.9.2.75?
[12:57] <sin> Yes
[12:57] <jluis> alright
[12:58] <jluis> well, I guess I'm out of answers then
[12:58] <jluis> anyone else has any idea?
[12:59] <jluis> it might even be the most obvious thing ever, but I have no clue
[13:01] <sin> There is an api for qemu that allows mounting a ceph image as storage for virtual hosts. But I don't need it
[13:04] <sin> I just can't understand what the difference is between a virtual host and a real host
[13:13] * yanzheng (~zhyan@134.134.139.74) has joined #ceph
[13:28] <NaioN> mib_6loilj: replication level is per pool, not per pg
[13:28] <NaioN> but the pg's get replicated to the different osd's
[13:29] <NaioN> Leseb: no that's not correct about the mds
[13:29] <NaioN> the mds uses by default two pools, data and metadata
[13:29] <NaioN> in metadata it puts objects with the metadata of the files in cephfs
[13:30] <NaioN> and in the data pool it puts the files as objects
[13:31] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:31] <NaioN> but as i said i don't know if the files in cephfs get striped over the different osd's and if there is a parameter to determine how big the objects are
[13:31] * kees_ (~kees@devvers.tweaknet.net) has joined #ceph
[13:32] <NaioN> but if you have a small file then you also have a small object in the data pool
[13:33] * loicd (~loic@193.49.201.35) has joined #ceph
[13:35] * loicd (~loic@193.49.201.35) Quit ()
[13:36] <mib_6loilj> Can we set the Object size manually ?
[13:37] <mib_6loilj> when I ingest 1000 MB file it gets striped into 1000MB/4MB objects
[13:37] <mib_6loilj> so is it possible to set Object size ?
[13:38] * mib_6loilj (ca037809@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[13:41] * maxiz (~pfliu@111.192.251.7) has joined #ceph
[13:55] <Leseb> NaioN: thanks for the clarification
[13:59] * MikeMcClurg (~mike@62.200.22.2) Quit (Quit: Leaving.)
[14:00] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) has joined #ceph
[14:01] * fmarchand (~fmarchand@212.51.173.12) has joined #ceph
[14:03] <fmarchand> Hello !
[14:04] <fmarchand> I don't know if someone could help me but I have an osd really greedy and I would like to know why it's taking so much RAM ....
[14:09] * fmarchand (~fmarchand@212.51.173.12) Quit (Quit: Leaving)
[14:12] * fmarchand (~fmarchand@212.51.173.12) has joined #ceph
[14:12] * fmarchand (~fmarchand@212.51.173.12) Quit ()
[14:12] * fmarchand (~fmarchand@212.51.173.12) has joined #ceph
[14:23] * sin (~sinner@78.107.155.77) Quit (Remote host closed the connection)
[14:24] <fmarchand> I need some help ... anybody awake ?
[14:25] <slang> fmarchand: good morning
[14:25] <slang> fmarchand: can you describe your setup a bit?
[14:26] <slang> fmarchand: how many osds do you have? how much ram is that one osd using?
[14:31] <fmarchand> oki I have one osd one mon and one mds
[14:31] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[14:32] <fmarchand> I copied a lot of files on the ceph fs ... and now I have an osd daemon taking 2.6 Go
[14:33] <fmarchand> It's a single machine cluster right now ... I will increase the number of osd mon and mds later
[14:33] <slang> fmarchand: ok
[14:33] <fmarchand> it's a 0.48-2 argonaut version
[14:33] <slang> fmarchand: what does ceph -s tell you?
[14:36] * fmarchand (~fmarchand@212.51.173.12) Quit (Quit: Leaving)
[14:37] * fmarchand (~fmarchand@212.51.173.12) has joined #ceph
[14:38] * fmarchand (~fmarchand@212.51.173.12) Quit ()
[14:39] * fmarchand (~fmarchand@212.51.173.12) has joined #ceph
[14:39] <fmarchand> sorry I got a keyboard pb ...
[14:40] * fmarchand_ (~fmarchand@212.51.173.12) has joined #ceph
[14:40] * fmarchand_ (~fmarchand@212.51.173.12) Quit ()
[14:40] <fmarchand> so
[14:41] <NaioN> fmarchand: what do you mean with a lot of ram?
[14:42] <fmarchand> I mean the osd daemon takes 2.6 Go in ram ... in the documentation it says that it wouldn't take as much as the mds thread (which takes 500Mo in ram on my single machine cluster)
[14:43] <NaioN> Go?
[14:43] <NaioN> you mean 2.6Gb
[14:43] <fmarchand> :) sorry yes 2.6Gb
[14:44] <NaioN> could you paste the "top" line or something
[14:44] <fmarchand> yes
[14:44] <fmarchand> ceph -s
[14:44] <fmarchand> health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean; recovery 1578454/3156908 degraded (50.000%)
[14:44] <fmarchand> monmap e1: 1 mons at {a=172.16.2.72:6789/0}, election epoch 0, quorum 0 a
[14:44] <fmarchand> osdmap e20: 1 osds: 1 up, 1 in
[14:44] <fmarchand> pgmap v82193: 192 pgs: 192 active+degraded; 108 GB data, 257 GB used, 697 GB / 1006 GB avail; 1578454/3156908 degraded (50.000%)
[14:44] <fmarchand> mdsmap e23: 1/1/1 up {0=a=up:active}
[14:44] <fmarchand> and ...
[14:44] <fmarchand> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
[14:44] <fmarchand> 14381 root 20 0 4016m 2.6g 1380 S 2 66.8 6:28.26 ceph-osd
[14:44] <fmarchand> 14163 root 20 0 723m 569m 1796 S 2 14.4 6:15.41 ceph-mds
[14:45] <NaioN> that's indeed a lot
[14:45] <fmarchand> I know it's degraded because I have only one osd
[14:45] <NaioN> indeed
[14:45] <fmarchand> but I don't understand why so much
[14:47] <fmarchand> I have a lot of small files in the osd .. could it be a reason for it to be so "greedy" ?
[14:47] <NaioN> yeah that could be
[14:47] * ninkotech (~duplo@89.177.137.231) has joined #ceph
[14:48] <NaioN> i think one of the developers could answer it better
[14:48] <fmarchand> yes .... where are they ? :)
[14:48] <NaioN> sleeping :)
[14:49] <slang> fmarchand: you should probably run with multiple osds
[14:49] <NaioN> minimum should be 3
[14:49] <slang> fmarchand: even if you just want to evaluate ceph on one machine, you can run multiple osds on that machine
[14:50] <fmarchand> but it would distribute the load between the different osd but it would be on the same machine ...
[14:51] <slang> fmarchand: its really just an issue with that osd trying to recover the degraded objects
[14:51] <NaioN> fmarchand: recommended is to have about 1GHz of cpu power per OSD
[14:51] <slang> fmarchand: with 3 osds, the state of your cluster would be more consistent
[14:52] <slang> yes, that's true, but it sounds like fmarchand is just evaluating ceph right now on one machine
[14:52] <slang> so having less cpu power than recommended is ok for that purpose
[14:53] <NaioN> well i also evaluated ceph on one machine, but as you said you can run multiple osds on 1 machine
[14:53] <slang> right
[14:53] <NaioN> and i tested it in vm's
[14:53] <NaioN> so one machine with three mon/mds/osd vm's and 1 or 2 clients
[14:54] <slang> you really only need one mds probably
[14:54] <NaioN> so you could also test the redundancy...
[14:54] <NaioN> yeah true
[14:54] <fmarchand> three mon three mds and three osd : all on the same machine ?
[14:54] <NaioN> as vm's
[14:54] <fmarchand> In my case it's a vm too
[14:55] <NaioN> hehe
[14:55] <slang> fmarchand: I do most of my testing with 3 osds, 3 mons, and 1 mds
[14:55] <slang> fmarchand: on one machine
[14:55] <NaioN> well you could make 3 osds, 1 mon, 1 mds on 1 machine
[14:55] <fmarchand> but 3 osd on the same disk is not a good option ! is it ?
[14:55] <NaioN> and you can evaluate redundancy by shutting down one of the osd daemons
[14:56] <NaioN> fmarchand: not for production
[14:56] <fmarchand> In fact it's a prototype ...
[14:57] <NaioN> hmmm don't think your setup is good enough as a prototype
[14:57] <NaioN> you want something that looks more like an actual setup
[14:58] <fmarchand> If I add 2 mons and 2 osd ?
[14:58] <fmarchand> I'm just afraid to overload the machine with more ceph daemons
[14:58] <slang> fmarchand: what do you mean by prototype?
[14:59] <slang> fmarchand: is it a machine running other services?
[14:59] <NaioN> well you can't do anything serious with this setup, so what do you mean with overload?
[15:00] <fmarchand> This machine will remain as a "master" machine but osd will be added as new osd machine's
[15:00] <fmarchand> This machine will remain as a "master" machine but osd will be added as new osd machines
[15:00] <fmarchand> don't know if I'm clear ...
[15:00] <NaioN> if you want to do anything serious with your setup, like testing your workload, this setup isn't good
[15:00] <fmarchand> :) yes this machine is running a db too ..
[15:02] <NaioN> well i would recommend to make 3 new vm's and load each with mon/mds/osd and give the osd's a different harddisk (virtual)
[15:02] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[15:02] <NaioN> this setup you can use to test the functions, but not the performance
[15:04] <fmarchand> I can't do that right now ... we have only 2 ESX's and ... already too much vm's running :)
[15:05] <fmarchand> But I understand ...
[15:05] <fmarchand> I'm gonna try first to add osd's on the same machine
[15:06] <fmarchand> but ... could it be a memory leak ?
[15:06] <fmarchand> osd thread has a in-memory map of all its files ?
[15:07] <NaioN> can't answer those questions :)
[15:07] <NaioN> you'll have to wait till one of the developers is awake
[15:08] <fmarchand> is there any bell to ring ? :)
[15:08] <NaioN> hehe no :)
[15:08] <NaioN> well not that I'm aware of :)
[15:09] <fmarchand> ... and I will I know there is a developer awake ? I have to repeat my question every hour ?
[15:09] <fmarchand> how will I know
[15:11] <fmarchand> just by curiosity ... what kind of configuration of ceph are you currently dealing with ? I mean 10 osd's and mds's ?
[15:11] <NaioN> well if you're from europe it's best to post your question in the evening or you could mail it to the mailing list
[15:12] <ramsay_za> I'd recommend the mailing lists
[15:12] <fmarchand> yeah ... mailing list is one of my best options you're right
[15:13] <fmarchand> but I still have 2 questions
[15:16] <ramsay_za> well, to answer one: I have 3 ceph clusters, 2 dev one production. dev1: 20 1 tb sata osds 3 mon servers. dev2: 6 300gb sata osds, 5 mons. prod: 40 500gb sata, 3 mons
[15:17] <match> ramsay_za: How many servers are those 40 osds behind in production?
[15:17] <ramsay_za> 5 physical servers
[15:17] <ninkotech> ramsay_za: do you backup whole cluster? or do you trust it?
[15:17] <ramsay_za> 8 drives each
[15:18] <match> ramsay_za: cheers - currently doing dev stuff with ceph, but that's about the setup I'm contemplating here
[15:18] <fmarchand> can you lend me just the dev1 cluster ?
[15:18] <fmarchand> :)
[15:19] <ramsay_za> I don't backup the whole cluster, I trust it enough that I don't feel the need to do that, I do however run backups of the important data as I'm not that brave
[15:19] <wilson> anyone using zeusram for journaling?
[15:19] <ramsay_za> too expensive to use zeusram, only way to justify it would be if your whole env was ssd
[15:20] <wilson> why would you need all SSD to justify it
[15:20] <wilson> seems more justified with slow spindles
[15:20] <fmarchand> Oh jst one question ... I have a weird error trying to delete a folder in ceph fs : "Directory not empty" and It looks empty when I check it ...
[15:20] <fmarchand> I read it was a mds bug ... do you know more about it ?
[15:21] <ninkotech> fmarchand: it might contain deleted file - which someone is still using
[15:21] <ninkotech> (not sure about ceph, but in unix generally this happened to me a lot)
[15:22] <fmarchand> oki
[15:22] <ramsay_za> wilson: journal is used to suck up "loose" io so if your spindles are too slow your journal will fill too quick and then block
[15:23] <wilson> with 8gb of writes into zeusram per OSD, that gives a lot of headroom for spikes, keeping end-user write latency very low, and gives the rotational media time to catch up with the sequential writes
[15:23] <wilson> SSD durability seems that it will fail very quickly as a journal device
[15:24] <wilson> flash based ssd that is
[15:24] <ramsay_za> wilson: so if you have an old ide drive and a zeus card under sustained writes, you will find that the zeus fills very quickly and then blocks io till it has made space, whereas if the journal was a sata drive of the same size as the ide osd you would have a more consistent performance profile
[15:24] <fmarchand> So If I understood well I can add 2 osd's to my machine on the same disk ? therefore I will be able to change the cluster state to "clean" (default replication number is 3 ?)?
[15:25] <ramsay_za> wilson: yeah I agree that ssds will burn too quick, the newer stuff is better. if you have the budget for a zeus card then go for it, it really will help if you have an env with a lot of spikes
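The trade-off discussed above (a fast journal absorbs spikes, but fills and then blocks if the backing disk stays slower than the incoming writes) can be sketched with a toy model in Python. Everything here is an assumption for illustration: the journal starts empty, both rates are constant, and the MB/s figures are invented rather than measured; only the 8 GB journal size comes from the conversation.

```python
import math


def seconds_until_full(journal_mb: float, ingest_mb_s: float,
                       drain_mb_s: float) -> float:
    """Time until an initially empty journal fills up.

    ingest_mb_s: rate clients write into the journal (hypothetical).
    drain_mb_s:  rate the backing disk flushes it out (hypothetical).
    Returns math.inf when the disk keeps up and the journal never fills.
    """
    net = ingest_mb_s - drain_mb_s
    if net <= 0:
        return math.inf
    return journal_mb / net


journal_mb = 8 * 1024  # the 8 GB per OSD mentioned above

print(round(seconds_until_full(journal_mb, 400, 100), 1))  # burst, decent disk
print(round(seconds_until_full(journal_mb, 400, 30), 1))   # burst, slow disk
print(seconds_until_full(journal_mb, 80, 100))             # disk keeps up
```

In this model a short spike is absorbed entirely by the journal, while a sustained write rate above what the spindle can drain only delays the point where writes drop to disk speed.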
[15:26] <ramsay_za> fmarchand: you can have more than one osd per disk, you just need to partition the drive so they have their own space, it's not going to help performance though
[15:29] <ramsay_za> fmarchand: also if that drive fails you will lose 2 replicas of data which could bring the cluster down
[15:31] <fmarchand> oki
[15:31] <fmarchand> it makes sense
[15:33] * drokita (~drokita@199.255.228.10) has joined #ceph
[15:34] * jluis is now known as joao
[15:35] <ramsay_za> match: depends on the size of the final env, I'm building one that will have 144 500GB sata osds all linked by 10Gbit ethernet with 24 drives in each chassis and grow to 720 osds once it's full
[15:38] <match> ramsay_za: I'll probably be stopping at around your current prod. size :)
[15:40] <fmarchand> do you have any advice on adding osd's to an existing cluster ?
[15:40] <ramsay_za> http://www.sebastien-han.fr/blog/2012/06/10/introducing-ceph-to-openstack/ have a scroll through that
[15:40] <joao> fmarchand, maybe this will help: http://ceph.com/docs/master/cluster-ops/add-or-rm-osds/
[15:51] <drokita> How big is the existing cluster?
[15:52] <fmarchand> drokita : question is for me ?
[15:52] <drokita> Yeah
[15:52] <fmarchand> 1 mon/1 osd/1mds all on 1 Tb virtual disk
[15:53] <drokita> Is this a test/eval setup?
[15:53] <drokita> I only ask because we just added some OSDs to a production install of 40 disks
[15:54] <drokita> Your disk will be very busy as it moves the data to the new OSD
[15:57] <fmarchand> It's a machine for a "prototype" we're planning to add more vm's as osd and mds later
[15:57] * PerlStalker (~PerlStalk@perlstalker-1-pt.tunnel.tserv8.dal1.ipv6.he.net) has joined #ceph
[15:58] * ghb (~Adium@173-9-152-101-miami.txt.hfc.comcastbusiness.net) has joined #ceph
[15:59] * ghb (~Adium@173-9-152-101-miami.txt.hfc.comcastbusiness.net) has left #ceph
[16:10] * markl (~mark@tpsit.com) has joined #ceph
[16:15] * yanzheng (~zhyan@134.134.139.74) Quit (Remote host closed the connection)
[16:16] <vagabon> joao: did you use ld v2.22 also ?
[16:16] <joao> I do
[16:16] <vagabon> for the context: I'm trying to build ceph here but it fails due to a ling error :-/
[16:16] <vagabon> s/ling/link
[16:16] <vagabon> I've no idea why
[16:20] <joao> is it with boost-program-options by any chance?
[16:23] <vagabon> joao: what's that ?
[16:24] <vagabon> sorry
[16:24] <vagabon> I don't know what you're talking about
[16:24] <joao> libboost-program-options was a dependency that was recently added to ceph
[16:25] <joao> I hit some linking errors just a couple of days ago because my machine didn't have that installed
[16:25] <joao> just thought that info could be useful for your troubles :)
[16:25] * yanzheng (~zhyan@134.134.139.74) has joined #ceph
[16:26] <vagabon> joao: this is what is missing for example: undefined reference to `cls_cxx_stat(void*, unsigned long*, long*)'
[16:26] <vagabon> BTW I'm trying to build ceph-0.53
[16:27] <vagabon> http://pastebin.com/EbhWfZKi
[16:27] <vagabon> here's the details just in case you want to take a look
[16:28] * ghbizness (~ghbizness@host-208-68-233-254.biznesshosting.net) has joined #ceph
[16:28] <ghbizness> anyone alive that can be of assistance ?
[16:32] <ghbizness> http://pastebin.com/sf0xmcHn
[16:32] <ghbizness> seems like ceph apt-get repos are foobared
[16:33] <ghbizness> i'm having an issue adding an OSD to a cluster and i feel the above may be the root cause as i am seeing anomalies with some ceph commands.
[16:34] <fmarchand> drokita ? still there ?
[16:36] * kees_ (~kees@devvers.tweaknet.net) Quit (Remote host closed the connection)
[16:37] <ghbizness> i guess david is not
[16:43] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[16:48] <fmarchand> sudo ceph osd crush set 1 osd.1 0
[16:48] <fmarchand> (22) Invalid argument
[16:48] <fmarchand> is this normal ?
[16:49] <fmarchand> I want to add an osd and when I add it to the crush map ... this is the result I have ... what am I doing wrong ?
[16:50] <ghbizness> fmarchand, can you strace it
[16:50] <fmarchand> I can't see it in the logs ... ceph.log
[16:51] * fmarchand (~fmarchand@212.51.173.12) Quit (Quit: Leaving)
[16:51] * fmarchand (~fmarchand@212.51.173.12) has joined #ceph
[16:51] <fmarchand> I use 0.48 argonaut
[16:53] <ghbizness> what commands did you run to setup the OSD ?
[16:53] <ghbizness> from start to finish
[16:53] <fmarchand> http://ceph.com/docs/master/cluster-ops/add-or-rm-osds/
[16:54] * match (~mrichar1@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[16:54] <fmarchand> I followed this doc to add it
[16:54] * yanzheng (~zhyan@134.134.139.74) Quit (Quit: Leaving)
[16:54] <ghbizness> k
[16:55] <ghbizness> OS ?
[16:55] <joao> fmarchand, you have to give the command the crush locations
[16:56] <fmarchand> Description: Ubuntu 12.04.1 LTS
[16:56] <fmarchand> Release: 12.04
[16:56] <fmarchand> Codename: precise
[16:56] <fmarchand> crush location ? like a bucket ?
[16:57] <joao> e.g., ceph osd crush set 1 osd.1 1.0 host=defaulthost rack=defaultrack root=default
[16:57] <joao> yes
[16:57] * vata (~vata@208.88.110.46) has joined #ceph
[16:57] <joao> you must change that according to your crushmap of course
[16:57] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[16:58] <fmarchand> I did not configure any rack .... I guess I used all default value
[16:58] <fmarchand> sudo ceph osd crush set 1 osd.1 0 host=myhost worked
[16:59] <fmarchand> thx ghbizness joao
[16:59] <joao> np
[16:59] <ghbizness> np
[16:59] <fmarchand> I put a weight of 0 to limit impact on the machine io
[17:00] <ghbizness> joao, did you see my initial post about the apt-get ceph repos ?
[17:00] <joao> I did
[17:00] <joao> I checked it out, doesn't look like there's a quantal repo
[17:00] <fmarchand> How can I know that data has been migrated after I add the new osd ?
[17:00] <joao> I'm not sure if that's by design or if there's an issue somewhere
[17:01] <joao> better wait for the other devs to make sure
[17:01] <ghbizness> yah , i am not sure either
[17:01] <fmarchand> when I do ceph -w I just have this : osdmap e38: 2 osds: 1 up, 1 in
[17:01] <ghbizness> fmarchand what about your mds and mon processes ?
[17:01] <joao> fmarchand, is your 2nd osd up and running?
[17:02] <fmarchand> I have only one mds and 1 mon
[17:02] <ghbizness> k
[17:03] <fmarchand> how do I know if my second osd is running ?
[17:04] <joao> fmarchand, 'ps xau | grep ceph-osd' would be a good way to check it :p
[17:04] <fmarchand> I don't remember I ran it ...
[17:04] <ghbizness> fmarchand "ceph osd tree"
[17:04] <fmarchand> oh sorry, I thought there was a "ceph" command to do that :)
[17:05] <ghbizness> the tree command will show it to you nicely
[17:05] <joao> ghbizness, if the monitor doesn't acknowledge it being up, it's likely it will show the osd as being down
[17:05] <fmarchand> oh yes :) my osd.1 is down
[17:05] <fmarchand> so that's it
[17:05] <joao> that may not mean that the osd is not running, there could be other reasons for it not being reported as up
[17:06] <fmarchand> thx guys ... just one more question ... if the weight is 0 ... it will work ? it will just be longer ?
[17:06] <fmarchand> No but I don't remember I ran ceph osd start
[17:06] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) Quit (Remote host closed the connection)
[17:06] <fmarchand> I did but osd.1 was not yet in the conf file
[17:07] * jlogan1 (~Thunderbi@2600:c00:3010:1:4990:f1e9:6310:a09f) has joined #ceph
[17:08] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[17:12] * loicd (~loic@193.49.201.35) has joined #ceph
[17:14] <ghbizness> joao, i am having a different issue while trying to add a OSD
[17:14] <ghbizness> ceph-osd -c /etc/ceph/ceph.conf -i 401 --monmap /tmp/monmap
[17:14] <ghbizness> 2012-11-12 11:14:02.567434 7fe27fd0c780 -1 ** ERROR: unable to open OSD superblock on /mnt/ceph/osd.401: (2) No such file or directory
[17:15] <joao> have you --mkfs'd /mnt/ceph/osd.401 ?
[17:15] * yehuda_hm (~yehuda@2602:306:330b:a40:8d7b:6204:315:4be) has joined #ceph
[17:15] <joao> morning yehuda_hm
[17:16] <ghbizness> ceph-osd -c /etc/ceph/ceph.conf -i 401 --mkfs --monmap /tmp/monmap
[17:16] <ghbizness> 2012-11-12 11:15:53.344085 7f9465412780 -1 filestore(/mnt/ceph/osd.401) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
[17:16] <ghbizness> 2012-11-12 11:15:53.676154 7f9465412780 -1 created object store /mnt/ceph/osd.401 journal /var/lib/ceph/osd/ceph-401/journal for osd.401 fsid 51786473-1f21-4693-9fac-dd7bbe31fe70
[17:17] <joao> is that a yes, then?
[17:17] <joao> timestamps tell me that you did that after trying to run the osd
[17:17] <ghbizness> i tried so many ways now :-)
[17:18] <ghbizness> mkfs.btrfs /dev/sdc
[17:18] <ghbizness> mkdir -p /mnt/ceph/osd.401
[17:18] <ghbizness> mount /dev/sdc /mnt/ceph/osd.401/
[17:18] <ghbizness> chmod 755 /mnt/ceph/osd.401
[17:18] <ghbizness> mkdir -p /var/lib/ceph/osd/ceph-401
[17:18] <ghbizness> chmod 755 /var/lib/ceph/osd/ceph-401
[17:18] <ghbizness> ceph mon getmap -o /tmp/monmap
[17:18] <ghbizness> ceph-osd -c /etc/ceph/ceph.conf -i 401 --mkfs --monmap /tmp/monmap
[17:18] <ghbizness> [osd]
[17:18] <ghbizness> osd journal size = 1024
[17:18] <ghbizness> osd data = /mnt/ceph/$name
[17:18] <ghbizness> btrfs options = rw,noatime
[17:18] <joao> right
[17:19] <joao> and running ceph-osd -i 401 won't work after all that?
[17:19] <ghbizness> that is basically what i am running from A to Z with the ceph.conf file pertaining to [OSD]
[17:20] <ghbizness> i didnt go any further cause i got the superblock "no such file or directory"
[17:23] <joao> ghbizness, try 'ceph-osd -i 401 -c /etc/ceph/ceph.conf --debug-osd 20 --debug-filestore 20 -d'
[17:23] <fmarchand> thx guys ... apparently it works
[17:23] <fmarchand> but I don't have enough ram ...
[17:24] <fmarchand> I don't know why but osd.0 was taking 2.7Gb in RAM
[17:25] <fmarchand> when I added the second osd ... the vm crashed .... the memory was all taken
[17:25] <fmarchand> is it normal ?
[17:25] <ghbizness> http://pastebin.com/WwGHArS0 too much to paste :-)
[17:26] <ghbizness> i see the process still running, at this time, do i wait for it to initialize ?
[17:26] <ghbizness> this is a 3TB disk
[17:26] <ghbizness> but i am not seeing much IO going on
[17:26] <joao> what does 'ceph -s' report?
[17:27] <joao> from your list of commands, it doesn't seem like you added the osd to the cluster
[17:27] <ghbizness> nothing about that OSD
[17:27] <joao> ghbizness, have you tried following this? http://ceph.com/docs/master/cluster-ops/add-or-rm-osds/
[17:27] <ghbizness> yah, i dont believe i have added it to the cluster
[17:28] <joao> you're probably missing the 'ceph osd create' and 'ceph auth add ...' and 'ceph osd crush set ...' bits
[17:28] <joao> at least that's what's missing from your list of commands up there ;)
[17:29] <ghbizness> that is true... that brings me to another question...
[17:29] <ghbizness> root@csn4:/mnt/ceph/osd.401# ceph osd create 401
[17:29] <ghbizness> 0
[17:29] <ghbizness> root@csn4:/mnt/ceph/osd.401#
[17:29] <ghbizness> isn't it supposed to come back as 401 and not 0 ?
[17:31] <fmarchand> So my question is : the amount of memory taken by the osd daemon will increase with the size of the osd disk ?
[17:32] <ghbizness> fmarchand, do you know if it is actually using it or just caching memory bits ?
[17:32] <fmarchand> I would like to know that ! :)
[17:33] <joao> ghbizness, it will return the osd id iff you don't specify an id on create
[17:33] <joao> otherwise, it will return 0 as success
[17:33] <joao> I think
[17:33] <joao> well
[17:33] <joao> may be wrong
[17:33] <joao> let me check
[17:33] <ghbizness> root@csn4:/mnt/ceph/osd.401# ceph osd tree
[17:33] <ghbizness> dumped osdmap tree epoch 11
[17:33] <ghbizness> # id weight type name up/down reweight
[17:33] <ghbizness> -1 2 pool default
[17:33] <ghbizness> -3 2 rack unknownrack
[17:33] <ghbizness> -2 1 host csn1
[17:33] <ghbizness> 101 1 osd.101 up 1
[17:33] <ghbizness> -4 1 host csn2
[17:33] <ghbizness> 201 1 osd.201 up 1
[17:33] <ghbizness> 0 0 osd.0 down 0
[17:33] <ghbizness> root@csn4:/mnt/ceph/osd.401#
[17:33] <ghbizness> it added it as osd.0 instead of 401
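For scripting the same check, here is a minimal Python sketch that scans `ceph osd tree` text in the column layout of the paste above and lists the OSDs reported as down. The function name `down_osds` and the `SAMPLE` constant are illustrative and not part of Ceph; the column layout is assumed from this log and may differ in other releases.

```python
def down_osds(tree_text):
    """Return names of OSDs marked 'down' in `ceph osd tree` text output."""
    down = []
    for line in tree_text.splitlines():
        fields = line.split()
        # OSD rows are assumed to look like: <id> <weight> osd.<n> <up|down> <reweight>
        if len(fields) >= 5 and fields[2].startswith("osd.") and fields[3] == "down":
            down.append(fields[2])
    return down


# Sample copied from the paste above (whitespace collapsed as in the log).
SAMPLE = """\
# id weight type name up/down reweight
-1 2 pool default
-3 2 rack unknownrack
-2 1 host csn1
101 1 osd.101 up 1
-4 1 host csn2
201 1 osd.201 up 1
0 0 osd.0 down 0
"""

print(down_osds(SAMPLE))
```

Bucket rows (pool, rack, host) have only four columns here, so the five-column check skips them without needing to know the bucket type names.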
[17:34] <joao> somehow this made sense in my head, but I may be wrong
[17:34] <joao> yeah
[17:34] <joao> I was wrong
[17:34] <joao> lol
[17:34] <ghbizness> :-)
[17:34] <fmarchand> ghbizness: how could I know that and tune cache settings
[17:34] <fmarchand> ?
[17:34] <ghbizness> fmarchand, what are you using for journaling ?
[17:35] <fmarchand> default config
[17:35] <fmarchand> so I guess it's a file in the osd dir
[17:35] <fmarchand> ?
[17:35] <jefferai> joshd: in my cinder config I have rbd_user=volumes and rbd_secret_uuid set, but when I start cinder-volume I get a traceback saying that it couldn't connect to the cluster because of a client.admin initialization error...any ideas?
[17:37] * miroslav (~miroslav@64.55.78.200) has joined #ceph
[17:38] <ghbizness> joao, so basically i can't add OSDs with IDs that i want... only +1 increments
[17:38] <ghbizness> idea is ... OSD.401 = host 4, drive 01
[17:38] <joao> you can add osds with the ids you want, afaik, but it is not advised anyway
[17:39] <joao> because you may end up with gaps in the osdmap, and it can take a toll on performance
[17:39] <ghbizness> i see
[17:39] * loicd (~loic@193.49.201.35) Quit (Ping timeout: 480 seconds)
[17:40] <joao> I would check why that is not returning 401 as expected, but am chasing something else and the distraction isn't that welcome ;)
[17:40] <ghbizness> understood, thanks for your help
[17:40] <ghbizness> may be a bug...
[17:40] <ghbizness> for now.. ill work with it
[17:41] <joao> I'll put it down on my todo list to make sure I check what's going on there
[17:44] * miroslav (~miroslav@64.55.78.200) Quit (Quit: Leaving.)
[17:45] * sagelap (~sage@76.89.177.113) Quit (Ping timeout: 480 seconds)
[17:46] <fmarchand> any idea about my memory ... pb ?
[17:47] <ghbizness> thanks joao, looks good now
[17:47] <ghbizness> osdmap e12: 3 osds: 3 up, 3 in
[17:47] <ghbizness> root@csn4:~# ceph status
[17:47] <ghbizness> health HEALTH_OK
[17:47] <ghbizness> monmap e1: 5 mons at {1=172.21.1.1:6789/0,2=172.21.1.2:6789/0,3=172.21.1.3:6789/0,4=172.21.1.4:6789/0,5=172.21.1.5:6789/0}, election epoch 4, quorum 0,1,2,3,4 1,2,3,4,5
[17:47] <ghbizness> osdmap e12: 3 osds: 3 up, 3 in
[17:47] <ghbizness> pgmap v55700: 38784 pgs: 38784 active+clean; 8730 bytes data, 484 MB used, 8047 GB / 8383 GB avail
[17:47] <ghbizness> mdsmap e8: 1/1/1 up {0=2=up:active}, 4 up:standby
[17:47] <ghbizness> health looks good
[17:47] <ghbizness> and now.. 3 OSDs are in
[17:49] * vagabon (~fbui@au213-1-82-235-205-153.fbx.proxad.net) has left #ceph
[17:49] <fmarchand> ghbizness: lucky you !
[17:51] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[17:51] <ghbizness> fmarchand, still have a LONG way to go
[17:51] <ghbizness> looking to test this for RBD VM storage
[17:51] * loicd (~loic@90.84.146.236) has joined #ceph
[17:53] * sagelap (~sage@9.sub-70-197-142.myvzw.com) has joined #ceph
[17:53] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:57] <fmarchand> do your osds take a lot of ram ?
[18:01] * sagelap (~sage@9.sub-70-197-142.myvzw.com) Quit (Ping timeout: 480 seconds)
[18:02] * stxShadow (~jens@p4FECFEA4.dip.t-dialin.net) has joined #ceph
[18:03] * sagelap (~sage@38.122.20.226) has joined #ceph
[18:05] * tnt (~tnt@212-166-48-236.win.be) Quit (Ping timeout: 480 seconds)
[18:06] * yehudasa (~yehudasa@2607:f298:a:607:81d4:c364:fa9e:b0ec) Quit (Ping timeout: 480 seconds)
[18:09] * nhorman_ (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[18:13] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Ping timeout: 480 seconds)
[18:14] * yehudasa (~yehudasa@2607:f298:a:607:a532:18b2:1f59:f68c) has joined #ceph
[18:15] * tnt (~tnt@18.68-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:16] <fmarchand> I think I know what's happening
[18:17] <tnt> yehudasa: quick question: When doing a multipart upload on an existing key, we get a 500 error. I'm not sure if it's supposed to work or not (doing a simple PUT on an existing key overwrites, but I'm not sure if it's specified for multipart), but in any case 500 seems wrong.
[18:20] * Leseb (~Leseb@193.172.124.196) Quit (Quit: Leseb)
[18:35] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) has joined #ceph
[18:38] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) Quit ()
[18:44] * stxShadow (~jens@p4FECFEA4.dip.t-dialin.net) Quit (Quit: Ex-Chat)
[18:47] * senner (~Wildcard@68-113-232-90.dhcp.stpt.wi.charter.com) has joined #ceph
[18:56] * yehudasa (~yehudasa@2607:f298:a:607:a532:18b2:1f59:f68c) Quit (Ping timeout: 480 seconds)
[18:57] * yehudasa (~yehudasa@2607:f298:a:607:a532:18b2:1f59:f68c) has joined #ceph
[18:58] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[19:03] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) Quit (Quit: Leaving.)
[19:07] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:08] * noob2 (a5a00214@ircip4.mibbit.com) has joined #ceph
[19:09] <noob2> silly question but do my ceph client systems need to have the full ceph.conf file on their system or just the monitor part of it?
[19:09] <tnt> just mon
[19:10] <noob2> awesome :)
[19:10] <noob2> i was hoping that was the answer
[19:10] <sagewk> slang: oh yeah, there's also wip-client-asok
[19:10] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[19:10] <fmarchand> could somebody tell me what this means : osd.0 [WRN] slow request 203.855518 seconds old, received at 2012-11-12 18:06:44.741311: osd_sub_op(osd.1.0:16879 1.10 93de1bd0/1000010ada1.00000000/head//1 [pull] v 10'18975 snapset=0=[]:[] snapc=0=[]) v7 currently queued for pg
[19:10] <tnt> noob2: in general every ceph node (even mon/osd/mds) only need the info about mon and whatever is running locally.
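As a rough illustration of tnt's point, the Python sketch below trims a full ceph.conf down to the `[global]` and monitor sections for use on a client. It assumes the file is plain INI that `configparser` can read; the helper name `client_conf`, the host names, and the address in the sample are hypothetical.

```python
import configparser
import io


def client_conf(full_conf_text):
    """Keep only the [global] and [mon*] sections of a ceph.conf-style file."""
    full = configparser.ConfigParser(interpolation=None)
    full.optionxform = str  # preserve option names exactly as written
    full.read_string(full_conf_text)

    slim = configparser.ConfigParser(interpolation=None)
    slim.optionxform = str
    for section in full.sections():
        if section in ("global", "mon") or section.startswith("mon."):
            slim[section] = dict(full[section])

    out = io.StringIO()
    slim.write(out)
    return out.getvalue()


# Hypothetical full config; hosts and addresses are made up for the example.
FULL = """\
[global]
auth supported = cephx

[mon.a]
host = mon1
mon addr = 192.168.0.10:6789

[osd]
osd journal size = 1024

[osd.0]
host = osd1
"""

print(client_conf(FULL))
```

Whether `[global]` should be copied as well depends on what it contains (for example auth settings); that choice is an assumption of this sketch rather than something stated in the discussion.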
[19:11] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) Quit ()
[19:11] <noob2> interesting
[19:11] <noob2> ok
[19:11] <noob2> that changes things a little for me
[19:11] <noob2> makes it a lot easier :)
[19:16] <buck> I have a libcephfs-java test in wip-java-test that could use a review and pull into master if anyone is available.
[19:17] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[19:22] * benner_ (~benner@193.200.124.63) has joined #ceph
[19:22] * benner (~benner@193.200.124.63) Quit (Read error: Connection reset by peer)
[19:23] * loicd1 (~loic@78.250.182.0) has joined #ceph
[19:29] * loicd (~loic@90.84.146.236) Quit (Ping timeout: 480 seconds)
[19:33] * loicd1 (~loic@78.250.182.0) Quit (Ping timeout: 480 seconds)
[19:34] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[19:42] <sagewk> buck: done, thanks
[19:42] <buck> sagewk: thanks sage.
[19:42] <sagewk> slang: does the symlink stuff look ok?
[19:43] <slang> looks good to me
[19:43] <slang> sagewk: minus the sleep
[19:49] <benpol> I've had a single pg stuck in "active+degraded+scrubbing" all weekend long. Doesn't appear to involve any "unfound" objects (as described here: http://ceph.com/docs/master/cluster-ops/placement-groups/ ). I've restarted all OSDs at different times but the issue persists. Ideas? I'm running 0.53 with btrfs backed OSDs (kernel 3.6.6).
[19:52] <benpol> ceph pg query output here: http://pastebin.com/0npcATAN
[19:54] <gucki> what's wrong with the bugtracker, it's down 90% of the time i try to access it.. :(
[19:55] <gucki> benpol: did you change the tunables? i had that once, but after setting the tunables it never occurred again..
[19:56] <gucki> benpol: almost at the bottom of this page: http://ceph.com/docs/master/cluster-ops/crush-map/
[19:57] <benpol> gucki: Nope haven't fiddled with tunables yet. (looking at the crush-map page now)
[19:57] <gucki> benpol: i'm no dev, but for me it helped (at least it seems like)...
[20:00] * chutzpah (~chutz@199.21.234.7) has joined #ceph
[20:01] * adjohn (~adjohn@69.170.166.146) has joined #ceph
[20:04] <benpol> gucki: as it's a test cluster I'm trying out the tunable stuff described on that page (including --enable-unsafe-tunables). Cluster seems to be processing the change pretty quickly...
[20:06] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[20:14] <sagewk> joshd: hey
[20:14] * loicd (~loic@90.84.146.236) has joined #ceph
[20:16] <benpol> gucki: ...and FWIW the stuck pg seems to have melted away. thanks for the suggestion! #wonderwhyitworks
[20:19] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[20:19] <slang> sagewk: did you see the build errors in wip-client-asok?
[20:21] <slang> sagewk: so you're actually delaying on the client it looks like
[20:22] <slang> sagewk: (or the receiver rather)
[20:22] <slang> sagewk: I think the next thing would be to reorder received messages (from different peers) arriving within that delay window
[20:25] <benpol> ...so is there a version of qemu/kvm that can talk to a cluster with non-legacy tunables enabled?
[20:27] * joshd1 (~jdurgin@2602:306:c5db:310:10b6:35a8:dee6:c0b) has joined #ceph
[20:30] <benpol> (nevermind that last bit, just messed up a cap line)
[20:35] * loicd (~loic@90.84.146.236) Quit (Quit: Leaving.)
[20:36] <jefferai> joshd: running Folsom, even though I have rbd_user=volumes in my /etc/cinder/cinder.conf file, I had to follow instructions here (http://www.sebastien-han.fr/blog/2012/06/10/introducing-ceph-to-openstack/) regarding hard-coding a line (os.environ["CEPH_ARGS"] = "--id volumes") to make cinder-volume start, or else it insisted on trying to act as client.admin -- what am I doing wrong?
[20:37] <jefferai> Or maybe Sébastien Han knows, if he hangs out here :-)
[20:37] <joshd1> jefferai: yeah, that's still required for cinder-volume - you can just set that in the upstart or init script instead
[20:38] <jefferai> huh, ok -- I tried setting it in init and it didn't take, but will try again
[20:39] <jefferai> in the upstart script it execs the su command, inside the command it calls I tried putting CEPH_ARGS='--id volumes' in front of the cinder-volume command but it didn't take
[20:40] <jefferai> I'll play around
[20:40] * noob2 (a5a00214@ircip4.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[20:43] <joshd1> jefferai: last sentence here explains how to set it with upstart: http://ceph.com/docs/master/rbd/rbd-openstack/#setup-ceph-client-authentication
[20:43] <jefferai> joshd: gah, okay -- I was looking in the next section
[20:44] <jefferai> under the Configuring Cinder/Nova-Volume section
[20:44] <jefferai> probably that tidbit of info should be moved :-|
[20:45] * loicd (~loic@93.158.30.16) has joined #ceph
[20:46] <joshd1> yeah
[20:52] * vagabon (~fbui@au213-1-82-235-205-153.fbx.proxad.net) has joined #ceph
[20:54] <vagabon> hi. Just to let you know that I'm getting these 2 errors now when building the ceph rpm packages:
[20:54] <vagabon> RPM build errors:
[20:54] <vagabon> File not found: /home/build/rpmbuild/BUILDROOT/ceph-0.53-6.mbs2.x86_64/usr/share/man/man8/ceph-clsinfo.8.gz
[20:54] <vagabon> File not found: /home/build/rpmbuild/BUILDROOT/ceph-0.53-6.mbs2.x86_64/usr/share/man/man8/librados-config.8.gz
[20:59] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[21:01] * buck (~buck@bender.soe.ucsc.edu) Quit (Quit: Leaving.)
[21:02] * mdrnstm (~mdrnstm@206-169-78-213.static.twtelecom.net) has joined #ceph
[21:05] <sagewk> slang: build error is fixed
[21:05] <sagewk> slang: yeah, that'll happen on its own because the delays are random
[21:06] <sagewk> joshd1: do you know the status of rbd-remove-cleanup?
[21:06] <sagewk> joshd1: and automake-python?
[21:06] <slang> sagewk: right ok
[21:06] <slang> sagewk: reproduced 2683 with delay injection and patches
[21:07] <slang> sagewk: it looks like inject_delay_probability isn't used other than as a boolean
[21:08] <sagewk> oh, right
[21:08] <sagewk> yeah
[21:08] <sagewk> the whole thing needs to be fixed up because of the pending msgr changes in the _qos branch.. i'll fix it then
[21:08] <sagewk> but for now it's useful :)
[21:08] <slang> heh ok
[21:09] <joshd1> sagewk: wip-librbd-remove-cleanup can be merged
[21:10] <joshd1> sagewk: automake-python looks ok maybe, but I'm not sure if it's complete yet
[21:10] <joshd1> sagewk: I hadn't looked at automake-python before
[21:14] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[21:14] * dilemma (~dilemma@2607:fad0:32:a02:1e6f:65ff:feac:7f2a) has joined #ceph
[21:15] <dilemma> Anyone know what an error like this indicates? "filestore(/data/osd0) mount failed to open journal /dev/sda9: (22) Invalid argument"
[21:15] <dilemma> that shows up in an osd log when trying to start an osd that has worked in the past
[21:17] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[21:22] * fmarchand2 (~fmarchand@85-168-75-42.rev.numericable.fr) has joined #ceph
[21:22] <fmarchand2> hello !
[21:25] <fmarchand2> I got a question about the mds cache size config : the default value is 100 000 ... is it Mb ?
[21:27] * Misthafalls (~misthafal@84.245.1.132) has joined #ceph
[21:30] <sagewk> joshd1: is that remove fix related to this? ubuntu@teuthology:/a/teuthology-2012-11-11_19:00:04-regression-master-testing-gcov/13479
[21:31] * allsystemsarego (~allsystem@188.27.167.222) has joined #ceph
[21:33] <jtang> i wonder if anything caught fire in the exhibitor hall at sc12
[21:33] <jtang> there was a fire alarm earlier today
[21:34] <joshd1> sagewk: no, haven't seen that one yet
[21:37] <joshd1> fmarchand2: I think that's inodes, I'm not sure exactly how big they are, but I'd guess on the order of a few kb
[21:39] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:39] <fmarchand2> joshd1 : oh oki it makes sense. Because I don't have 100Gb of RAM ;)
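A back-of-the-envelope Python sketch of that reasoning: if `mds cache size` counts inodes, the memory footprint scales with the per-inode size. The 4096 bytes used below is only a placeholder for joshd1's "a few kb" guess, not a measured value, and `mds_cache_estimate_mib` is a hypothetical helper name.

```python
def mds_cache_estimate_mib(inode_count, bytes_per_inode):
    """Rough cache footprint in MiB for a given inode count."""
    return inode_count * bytes_per_inode / (1024 * 1024)


# 100000 is the default quoted above; 4096 bytes per inode is an assumed
# placeholder for "a few kb", not a measured value.
print(round(mds_cache_estimate_mib(100_000, 4096), 1))
```

Under that assumption the default works out to a few hundred MiB rather than anything near 100 GB.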
[21:39] <sagewk> joshd1: hmm, python binding tests failing on master
[21:40] <sagewk> /a/sage-2012-11-12_12:09:42-rbd-master-testing-basic/13813
[21:40] <fmarchand2> thx joshd1
[21:41] <joshd1> sagewk: there was another change on friday around that area, I'll take a look
[21:43] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[21:44] <sagewk> thanks
[21:57] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[22:02] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[22:04] * vagabon (~fbui@au213-1-82-235-205-153.fbx.proxad.net) has left #ceph
[22:05] * miroslav (~miroslav@64.55.78.192) has joined #ceph
[22:05] * miroslav1 (~miroslav@64.55.78.192) has joined #ceph
[22:05] * miroslav (~miroslav@64.55.78.192) Quit (Write error: connection closed)
[22:09] * dilemma (~dilemma@2607:fad0:32:a02:1e6f:65ff:feac:7f2a) Quit (Quit: Leaving)
[22:12] <joshd1> sagewk: that failure probably has to do with socket failure injection, I'm not getting it normally on master
[22:14] <joshd1> sagewk: it reproduces with test_copy and socket failure injection though
[22:22] * jlogan1 (~Thunderbi@2600:c00:3010:1:4990:f1e9:6310:a09f) Quit (Ping timeout: 480 seconds)
[22:25] * fmarchand2 (~fmarchand@85-168-75-42.rev.numericable.fr) Quit (Ping timeout: 480 seconds)
[22:25] * nhorman_ (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:29] * jlogan1 (~Thunderbi@72.5.59.176) has joined #ceph
[22:34] <jtang> heh some guy is mentioning ceph's way of balancing the metadata tree at a workshop here
[22:46] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[22:49] * mdawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[22:51] <sagewk> joao: when you have a minute, look at wip-3477?
[22:51] <mdawson> setting up my first deployment. Running mkcephfs, I get 2012-11-12 16:47:53.946196 7ff8de9e6780 -1 journal check: ondisk fsid 00000000-0000-0000-0000-000000000000 doesn't match expected 9149a4b8-8a90-45cd-9bd3-2a31dd7608f3, invalid (someone else's?) journal
[22:52] <mdawson> Journal is a 10GB partition on SSD. Do I format the SSD journal partition and mount it? If so, what file system and is there a convention on where to mount the journal?
[22:53] <sagewk> joshd1: the python api test?
[22:53] <sagewk> it just failed for every api test run in the suite..
[22:53] <sagewk> http://fpaste.org/bREf/
[22:54] <sagewk> good news is it's the only failure :). the fsx stuff is all better
[22:55] <NaioN> mdawson: no you don't have to format the journal
[22:55] <NaioN> just point it to the right partition in the config
[22:57] <mdawson> NaioN: thx. looks like I had the OSD devs setting wrong
[22:58] <joshd1> mdawson: note that you'll need to wipe out any existing osd data before re-running mkcephfs
[23:00] <joshd1> sagewk: yeah, but those all have ms failure injection - just different amounts of it
[23:00] <mdawson> joshd1: don't think I got far enough to have any existing osd data the first try. Seems to be doing something now
[23:01] <mdawson> joshd1: it is sitting on "=== osd.0 ===" for several minutes. Is that normal?
[23:03] * noob2 (a5a00214@ircip4.mibbit.com) has joined #ceph
[23:03] <joshd1> no, it shouldn't take that long
[23:04] <joshd1> you've got root passwordless ssh and short hostnames in dns or /etc/hosts?
[23:04] <noob2> i just won approval to build a ceph cluster! yay :D
[23:05] <mdawson> yes, root passwordless ssh and short hostnames are working
[23:05] <mdawson> noob2: Congrats. How big?
[23:06] <noob2> they're saying about 50TB worth to start
[23:06] <noob2> they're super excited
[23:06] <scuttlemonkey> nice noob2, congrats
[23:06] <noob2> thanks
[23:06] <lurbs> Just reading the Ceph mailing list. Are logging and Cephx auth both really that much of an overhead for IOPS?
[23:06] <noob2> you guys made this so awesome it was an easy sell
[23:08] <joshd1> lurbs: I'm not surprised logging is (lots of allocation for string processing), but auth does surprise me
[23:09] <joshd1> mdawson: anything in osd.0's log (/var/log/ceph/ceph-osd.0.log)?
[23:11] <mdawson> joshd1: 2012-11-12 16:56:22.697476 7f8eb85e7780 0 journal _open_block_device: WARNING: configured 'osd journal size' is 10737418240, but /dev/sda6 is only 10240393216 bytes.
[23:11] <mdawson> stupid math
[23:12] <joshd1> mdawson: if you set it to 0, it'll use the whole device (and in the next version, it'll always use the whole device)
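The "stupid math" in the warning can be checked with a short Python sketch. It assumes `osd journal size` is given in megabytes (as in the `osd journal size = 1024` snippet earlier in this log) and that a megabyte here means 1024 * 1024 bytes; the helper names are made up for the example.

```python
MIB = 1024 * 1024


def journal_fits(journal_size_mb, device_bytes):
    """True if an 'osd journal size' given in MB fits on the device."""
    return journal_size_mb * MIB <= device_bytes


def largest_journal_mb(device_bytes):
    """Largest whole-MB journal size that fits on the device."""
    return device_bytes // MIB


DEVICE_BYTES = 10240393216          # /dev/sda6, from the warning above
CONFIGURED_MB = 10737418240 // MIB  # 10240, i.e. "10 GB" expressed in MB

print(journal_fits(CONFIGURED_MB, DEVICE_BYTES))
print(largest_journal_mb(DEVICE_BYTES))
```

The partition turns out to be 9766 MB, so a configured 10240 MB journal cannot fit; setting the size to 0, as joshd1 suggests, sidesteps the calculation entirely.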
[23:13] * Misthafalls (~misthafal@84.245.1.132) Quit (Quit: Nettalk6 - www.ntalk.de)
[23:13] <mdawson> joshd1: thanks!
[23:16] * allsystemsarego (~allsystem@188.27.167.222) Quit (Quit: Leaving)
[23:16] * noob2 (a5a00214@ircip4.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[23:17] <ghbizness> hello all, i am trying to move a host and its osd devices from one rack to another
[23:17] <ghbizness> at the moment, i wish to define racks based on power circuits rather than physical racks
[23:18] <ghbizness> root@csn5:~# ceph osd crush set osd.0 pool=default rack=rack1313 host=csn4
[23:18] <ghbizness> updated item id 0 name 'pool=default' weight 0 at location {host=csn4} to crush map
[23:18] <ghbizness> it says it changed the map...
[23:18] <ghbizness> but looking at the osd tree... it has not
[23:18] <ghbizness> any takers ?
[23:19] <mdawson> joshd1: with "osd journal size = 0", re-running mkcephfs again stalls at osd.0 last log is 2012-11-12 17:14:40.323021 7ff311484780 1 journal _open /dev/sda6 fd 16: 10240393216 bytes, block size 4096 bytes, directio = 1, aio = 0
[23:21] * rweeks (~rweeks@64.55.78.101) has joined #ceph
[23:24] <joshd1> mdawson: can you dd to that partition?
[23:25] <joshd1> mdawson: if that works, the next step is adding 'debug osd = 20', 'debug filestore = 20', and 'debug journal = 20' in the [osd.0] section of your ceph.conf
[23:25] <joshd1> ghbizness: I'm guessing there's some subtlety to ceph osd crush set's behavior there. what's the osd tree show?
[23:26] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[23:26] <ghbizness> root@csn5:~# ceph osd crush set osd.0 pool=default rack=rack1313 host=csn4
[23:26] <ghbizness> updated item id 0 name 'pool=default' weight 0 at location {host=csn4} to crush map
[23:26] <ghbizness> root@csn5:~#
[23:26] <ghbizness> root@csn5:~#
[23:26] <ghbizness> root@csn5:~#
[23:26] <ghbizness> root@csn5:~# ceph osd tree
[23:26] <ghbizness> dumped osdmap tree epoch 63
[23:26] <ghbizness> # id weight type name up/down reweight
[23:26] <ghbizness> -1 9 pool default
[23:26] <ghbizness> -3 4 rack unknownrack
[23:26] <ghbizness> -2 2 host csn1
[23:26] <ghbizness> 101 1 osd.101 up 1
[23:26] <ghbizness> 3 1 osd.3 up 1
[23:26] <ghbizness> -4 2 host csn2
[23:26] <ghbizness> 201 1 osd.201 up 1
[23:26] <ghbizness> 4 1 osd.4 up 1
[23:26] <ghbizness> -6 5 rack unkownrack
[23:26] <ghbizness> -5 1 host csn4
[23:26] <ghbizness> 6 1 osd.6 up 1
[23:26] <ghbizness> 0 0 osd.0 up 1
[23:26] <ghbizness> -7 2 host csn3
[23:26] <ghbizness> 1 1 osd.1 up 1
[23:26] <ghbizness> 5 1 osd.5 up 1
[23:26] <ghbizness> -8 2 host csn5
[23:26] <ghbizness> 2 1 osd.2 up 1
[23:26] <ghbizness> 7 1 osd.7 up 1
[23:26] <ghbizness> root@csn5:~#
[23:26] <ghbizness> ahh ... sorry
[23:26] <ghbizness> 1sec... pastebin
[23:27] <ghbizness> http://pastebin.com/iZkKJ86f
[23:27] <ghbizness> besides the fact that i couldnt spell unknown for some reason ..
[23:27] <lurbs> Some unkown reason?
[23:28] <ghbizness> lol
[23:28] <ghbizness> <~~~ low sleep diet.
[23:30] <ghbizness> i would like to note that i can move an OSD to another node in the crushmap but i can't seem to move a host or its OSDs to another rack
[23:30] <ghbizness> which sounds conceptually wrong to me
[23:31] <joshd1> ghbizness: I think the easiest way to get it working is to edit the crushmap as a text file
[23:31] <ghbizness> yah, that was my next recourse
[23:31] <ghbizness> just wanted to see if i was doing something wrong or whether this is a bug of some sort
[23:32] <ghbizness> that will make it 2 bugs for today
[23:32] <joshd1> it might be a bug, but I don't remember if 'crush set' is supposed to create intermediate nodes or not
[23:32] <ghbizness> understood
[23:33] * mib_y6n7gj (adf63fe5@ircip3.mibbit.com) has joined #ceph
[23:34] <joshd1> sagewk: wip-rbd-copy fixes it
[23:34] <joao> sagewk, was having dinner; looking now
[23:36] * iconz55 (~Brandon@host-173-246-63-229.biznesshosting.net) has joined #ceph
[23:36] <jtang> the docs for changing the crushmap were a little out of date the last time i checked
[23:37] <sagewk> ghbizness: 'osd crush set ...' does create intermediate nodes
[23:37] <jtang> it's like "ceph osd setcrushmap -i mynewmap.new"
[23:37] <jtang> ?
[23:37] <sagewk> joshd1: sweet, thanks
[23:38] * rweeks (~rweeks@64.55.78.101) Quit (Quit: Computer has gone to sleep.)
[23:38] <ghbizness> jtang, yup, i will have to export the map and edit manually
[23:38] <ghbizness> jtang, thanks for the heads up
[23:38] * Cube (~Cube@173-119-9-75.pools.spcsdns.net) has joined #ceph
[23:38] <jtang> i found editing a text file and then setting the map from that was better and more reliable
[23:39] <ghbizness> i see
[23:39] <jtang> ghbizness: you still need to compile the text file into the binary format that setcrushmap expects
[23:39] <ghbizness> of course
[23:39] <jtang> well i made fewer mistakes if i could review the changes first
[23:39] <jtang> :)
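Editor's note: the export, edit, compile, and set cycle that jtang and ghbizness describe can be sketched as below. This is an illustration under stated assumptions, not a transcript of what either of them ran: the base file name "crushmap" and the helper functions run, crush_export, and crush_inject are hypothetical, and DRY_RUN defaults to 1 so the script only prints the commands instead of touching a cluster.

```shell
#!/bin/sh
# Sketch of the CRUSH map round trip discussed above.
# "crushmap" is a hypothetical base file name, not one taken from the log.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fetch the compiled map from the cluster and decompile it to editable text.
crush_export() {
    run ceph osd getcrushmap -o "$1.bin" &&
    run crushtool -d "$1.bin" -o "$1.txt"
}

# Compile the edited text and inject the result back into the cluster.
crush_inject() {
    run crushtool -c "$1.txt" -o "$1.new" &&
    run ceph osd setcrushmap -i "$1.new"
}

crush_export crushmap
# ... review and edit crushmap.txt by hand here ...
crush_inject crushmap
```

Keeping the decompiled text around gives the review step jtang mentions: the old and new text files can be compared before anything is set on the cluster.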
[23:40] <joao> sagewk, all three commits make sense; should I test them somehow?
[23:40] * mib_y6n7gj (adf63fe5@ircip3.mibbit.com) has left #ceph
[23:43] * miroslav1 (~miroslav@64.55.78.192) Quit (Quit: Leaving.)
[23:44] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[23:47] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[23:55] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[23:56] <sagewk> slang: there?
[23:56] * slang waves
[23:56] <sagewk> did you look at wip-client-asok at all?
[23:57] <slang> sagewk: I looked at it, I didn't test it though
[23:58] <sagewk> k

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.