#ceph IRC Log


IRC Log for 2012-10-25

Timestamps are in GMT/BST.

[0:01] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[0:01] * loicd (~loic@magenta.dachary.org) has joined #ceph
[0:05] * slang (~slang@173-163-208-195-westflorida.hfc.comcastbusiness.net) has joined #ceph
[0:06] * nhmlap (~nhm@mb40436d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[0:11] * sagelap1 (~sage@ has joined #ceph
[0:12] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[0:13] * sagelap (~sage@2607:f298:a:607:1d17:77a3:d8f5:26d2) Quit (Ping timeout: 480 seconds)
[0:17] * BManojlovic (~steki@ has joined #ceph
[0:27] * Cube1 (~Cube@ has joined #ceph
[0:29] * Cube (~Cube@2607:f298:a:697:fd91:8ff3:d807:f014) Quit (Read error: Connection reset by peer)
[0:36] * rweeks (~rweeks@ Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[0:37] * nhmlap (~nhm@ has joined #ceph
[0:37] * PerlStalker (~PerlStalk@ Quit (Quit: rcirc on GNU Emacs 24.2.1)
[0:38] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Remote host closed the connection)
[0:38] * aliguori_ (~anthony@ Quit (Remote host closed the connection)
[0:46] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Quit: Leseb)
[1:05] * gregaf1 (~Adium@2607:f298:a:607:4da6:6dc3:f488:9e18) Quit (Quit: Leaving.)
[1:06] * gregaf1 (~Adium@ has joined #ceph
[1:08] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:9de4:7e2:789a:ef55) Quit (Quit: LarsFronius)
[1:08] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:11] * gregaf1 (~Adium@ Quit (Quit: Leaving.)
[1:12] <sagewk> elder there?
[1:13] * LarsFronius (~LarsFroni@95-91-242-164-dynip.superkabel.de) has joined #ceph
[1:21] * nhmlap (~nhm@ Quit (Ping timeout: 480 seconds)
[1:29] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:30] * nhmlap (~nhm@ has joined #ceph
[1:31] * loicd (~loic@magenta.dachary.org) has joined #ceph
[1:32] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[1:37] <buck> It looks like the task() constructor is receiving a list where it expects a dict, and that's causing a few issues. To rectify this, should I a) sort out how to actually pass a dict down or b) make the existing code work with the list ? The latter seems easier but I don't know if there will be repercussions. The code is in teuthology/task/workunit.py
[1:38] <buck> nwatkins1: given your responses to my comments, things seem in order for your 10 patches in wip-java-cephfs. Nice work on breaking them apart; made reviewing them heaps easier.
[1:39] <nwatkins1> buck: thanks! I'm gonna wait to merge now based on sage's shutdown() question on the list until that's resolved.
[1:40] <joshd> buck: a dict in yaml is like workunit:\n client.0: [rbd/copy.sh]
[1:40] * redsharp (~redsharpb@2a01:e35:8b19:ec70:d6be:d9ff:fe13:2f9b) has joined #ceph
[1:41] <joshd> buck: that's a dict mapping workunit to a dict mapping client.0 to a list containing a string
[1:41] <LarsFronius> hey redsharp! ;)
[1:41] <redsharp> hey hey LarsFronius
[1:41] <redsharp> how's it going? since.. a few secs ago :D
[1:42] <redsharp> well alright, where can I find the manual for ceph.conf itself?
[1:42] <redsharp> I'd like to define the path of each osd in it so they don't have to be on /var
[1:43] <joshd> redsharp: http://ceph.com/docs/master/config-cluster/osd-config-ref/
[1:43] <joshd> specifically 'osd data' is the setting you're looking for
[1:44] <dmick> redsharp: and don't forget about osd journal
[1:44] <buck> joshd: the pertinent bit of my YAML looks like this - workunit:
[1:44] <buck> - clients:
[1:44] <buck> - client.0: [libcephfs-java]
[1:44] <buck> - env:
[1:44] <buck> - branch: wip-buck
[1:44] <buck> - BRANCH: wip-buck
[1:44] <buck> so I have dict -> list -> dict -> list
[1:44] <redsharp> oh yeah perfect joshd .. I got lost in all that doc there
[1:44] <joshd> buck: no dash in front of client.0 - a dash means a list entry
[1:45] <buck> joshd: ahh, ok. The examples need updating so I've had to do a bit of trial and error (I'll update the docs once I get totally sorted).
[1:46] <joshd> buck: sounds good. generally the tasks themselves have docstrings that explain their configuration with their 'task' method
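
joshd's point about the dash can be seen in the Python data a YAML parser would hand to a task. The sketch below writes that data out by hand (no YAML library is involved) for joshd's own `client.0: [rbd/copy.sh]` example, with and without a leading dash. The helper `describe` is hypothetical and is not part of teuthology.

```python
# joshd's example, "workunit:\n  client.0: [rbd/copy.sh]", as Python data:
# a dict mapping workunit to a dict mapping client.0 to a list of strings.
without_dash = {"workunit": {"client.0": ["rbd/copy.sh"]}}

# The same text with "- " in front of client.0 turns that line into a
# list entry, so workunit now maps to a list holding a one-key dict.
with_dash = {"workunit": [{"client.0": ["rbd/copy.sh"]}]}


def describe(value):
    """Describe the nesting of the first element at each level."""
    if isinstance(value, dict):
        inner = next(iter(value.values()), None)
        return "dict -> " + describe(inner)
    if isinstance(value, list):
        inner = value[0] if value else None
        return "list -> " + describe(inner)
    return type(value).__name__


print(describe(without_dash))  # dict -> dict -> list -> str
print(describe(with_dash))     # dict -> list -> dict -> list -> str
```

The second shape is the "dict -> list -> dict -> list" nesting buck reports, which is why code that expects a dict receives a list.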
[1:46] <redsharp> ok dmick, I'm not sure I really get the advantage of having the journal somewhere else
[1:47] <redsharp> in fact dmick, the journal will remain configured on /var if I don't explicitly change it, right?
[1:47] <dmick> redsharp: yes, just didn't want you to forget that you can choose both and might want to
[1:48] <dmick> many like to have journals on raw partitions, and/or SSDs. Depends on what you're setting up
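
As a rough illustration of the two settings named above, the sketch below uses Python's standard `configparser` to print a ceph.conf-style fragment that sets `osd data` and `osd journal` for one OSD. The section name `osd.0` and both paths are assumptions made up for the example; they do not come from the discussion.

```python
import configparser
import io

# Hypothetical paths, chosen only for illustration; pick locations that
# match your own disks (for example a raw partition or SSD for the journal).
conf = configparser.ConfigParser()
conf["osd.0"] = {
    "osd data": "/srv/ceph/osd0",
    "osd journal": "/srv/ceph/journal/osd0",
}

# Render the settings in INI form, which is the layout ceph.conf uses.
buf = io.StringIO()
conf.write(buf)
ceph_conf_fragment = buf.getvalue()
print(ceph_conf_fragment)
```

If `osd journal` is left out, the journal stays at its default location, which matches what dmick confirms in the exchange that follows.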
[1:49] <redsharp> yeah ok dmick thank you very much I'm reading about it now and I guess i'll come back later to discuss about it
[1:50] <dmick> sure
[1:50] <dmick> stop in anytime
[1:51] <redsharp> thank you and nighty night all !
[1:57] * redsharp (~redsharpb@2a01:e35:8b19:ec70:d6be:d9ff:fe13:2f9b) Quit (Quit: Leaving)
[1:57] * cdblack (c0373727@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[1:58] * LarsFronius (~LarsFroni@95-91-242-164-dynip.superkabel.de) Quit (Quit: LarsFronius)
[1:59] * noob2 (a5a00214@ircip1.mibbit.com) has joined #ceph
[1:59] <buck> joshd: can I pick your brain once more on this teuthology stuff? I think my config file is still a bit off
[2:00] <joshd> sure
[2:01] <buck> here's my yaml file (as per the workunit.py usage)
[2:01] <buck> tasks:
[2:01] <buck> - ceph:
[2:01] <buck> - kclient: [client.0]
[2:01] <buck> - workunit:
[2:01] <buck> tag: wip-buck
[2:01] <buck> clients:
[2:01] <buck> client.0: [libcephfs-java]
[2:01] <buck> and I'm seeing this error
[2:01] <buck> Invalid task definition: {'tag': 'wip-buck', 'workunit': None, 'clients': {'client.0': ['libcephfs-java']}}
[2:02] <joshd> you probably want branch: wip-buck instead of tag: wip-buck
[2:03] <joshd> and I'm not sure due to non-monospaced fonts, but I think you want two more spaces before each line following '- workunit:'
[2:03] <joshd> whitespace is significant like python
[2:03] <joshd> and it has to be spaces, never tabs
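
The error text shows what the parser saw: `tag` and `clients` ended up as siblings of `workunit` instead of being nested under it. The sketch below writes both shapes as Python data and applies a hypothetical one-key-per-task check. The function `single_key_tasks` is an assumption for illustration and is not teuthology's actual validation code.

```python
def single_key_tasks(tasks):
    """Split task entries into (valid, invalid) by a one-key-per-entry rule.

    Hypothetical check for illustration only.
    """
    valid, invalid = [], []
    for entry in tasks:
        if isinstance(entry, dict) and len(entry) == 1:
            valid.append(entry)
        else:
            invalid.append(entry)
    return valid, invalid


# What the under-indented YAML parses to: "tag" and "clients" sit at the
# same level as "workunit", so workunit itself has no value (None).
flat = [
    {"ceph": None},
    {"kclient": ["client.0"]},
    {"workunit": None, "tag": "wip-buck",
     "clients": {"client.0": ["libcephfs-java"]}},
]

# With the lines under "- workunit:" indented further (spaces, not tabs)
# and "branch" used in place of "tag", as joshd suggests.
nested = [
    {"ceph": None},
    {"kclient": ["client.0"]},
    {"workunit": {"branch": "wip-buck",
                  "clients": {"client.0": ["libcephfs-java"]}}},
]

print(single_key_tasks(flat)[1])    # the three-key workunit entry
print(single_key_tasks(nested)[1])  # []
```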
[2:05] * Cube1 (~Cube@ Quit (Quit: Leaving.)
[2:05] * slang (~slang@173-163-208-195-westflorida.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[2:05] * slang (~slang@173-163-208-195-westflorida.hfc.comcastbusiness.net) has joined #ceph
[2:09] <buck> joshd: nice, that was the ticket. Thanks. Now back to fixing actual bugs
[2:11] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:12] * nhmlap (~nhm@ Quit (Ping timeout: 480 seconds)
[2:14] * slang (~slang@173-163-208-195-westflorida.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[2:14] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Read error: Connection reset by peer)
[2:14] * yoshi_ (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:17] * slang (~slang@173-163-208-195-westflorida.hfc.comcastbusiness.net) has joined #ceph
[2:31] * rlr219 (43c87e04@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[2:35] * sjustlaptop (~sam@mbb0436d0.tmodns.net) has joined #ceph
[2:42] * sjustlaptop (~sam@mbb0436d0.tmodns.net) Quit (Read error: No route to host)
[2:48] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[2:49] * slang (~slang@173-163-208-195-westflorida.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[3:02] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Remote host closed the connection)
[3:10] <buck> Could someone review a 4-line teuthology change in my wip-buck branch?
[3:13] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[3:14] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Read error: Connection reset by peer)
[3:14] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[3:14] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit ()
[3:15] * noob2 (a5a00214@ircip1.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[3:20] * adjohn (~adjohn@ Quit (Quit: adjohn)
[3:27] * buck (~buck@bender.soe.ucsc.edu) has left #ceph
[3:27] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[3:34] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Ping timeout: 480 seconds)
[3:36] * madkiss1 (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[3:41] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[3:43] * Q (~qgrasso@ip-121-0-1-110.static.dsl.onqcomms.net) has joined #ceph
[3:43] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[3:43] * Q is now known as Qten
[3:45] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:57] * nwatkins1 (~Adium@soenat3.cse.ucsc.edu) has left #ceph
[4:10] * kblin_ (~kai@kblin.org) Quit (Ping timeout: 480 seconds)
[4:10] * _are_ (~quassel@h1417489.stratoserver.net) Quit (Ping timeout: 480 seconds)
[4:44] * cattelan (~cattelan@2001:4978:267:0:21c:c0ff:febf:814b) Quit (Ping timeout: 480 seconds)
[4:49] * Kioob (~kioob@luuna.daevel.fr) Quit (Ping timeout: 480 seconds)
[4:59] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) Quit (Quit: Leaving.)
[5:08] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[5:09] * chutzpah (~chutz@ Quit (Quit: Leaving)
[5:13] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Ping timeout: 480 seconds)
[5:40] * renzhi (~renzhi@ Quit (Quit: Leaving)
[5:41] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[5:55] * sagelap1 (~sage@ Quit (Ping timeout: 480 seconds)
[5:57] * sagelap (~sage@133.sub-70-197-140.myvzw.com) has joined #ceph
[5:58] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[6:02] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) has joined #ceph
[6:13] * nhmlap (~nhm@253-231-179-208.static.tierzero.net) has joined #ceph
[6:14] * sagelap (~sage@133.sub-70-197-140.myvzw.com) Quit (Ping timeout: 480 seconds)
[6:20] * sjustlaptop (~sam@68-119-138-53.dhcp.ahvl.nc.charter.com) Quit (Ping timeout: 480 seconds)
[7:24] * dmick is now known as dmick-away
[7:27] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[7:37] * pentabular (~pentabula@ has left #ceph
[7:40] * Kioob (~kioob@luuna.daevel.fr) has joined #ceph
[7:49] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Ping timeout: 480 seconds)
[7:51] * miroslav (~miroslav@173-228-38-131.dsl.dynamic.sonic.net) Quit (Quit: Leaving.)
[7:54] * nhmlap (~nhm@253-231-179-208.static.tierzero.net) Quit (Ping timeout: 480 seconds)
[8:17] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:50] * Kioob (~kioob@luuna.daevel.fr) Quit (Ping timeout: 480 seconds)
[8:56] * mib_97b7xw (51c3cb22@ircip2.mibbit.com) has joined #ceph
[8:56] * mib_97b7xw (51c3cb22@ircip2.mibbit.com) Quit ()
[9:08] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[9:09] * maxim (~pfliu@ has joined #ceph
[9:19] * BManojlovic (~steki@ has joined #ceph
[9:22] * tziOm (~bjornar@ has joined #ceph
[9:23] * pixel (~pixel@ has joined #ceph
[9:26] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:28] <pixel> Hi everybody, I have a problem with ceph: I started a copy process and it never finishes, and I can't kill it from the process list. Has anybody had a similar issue?
[9:34] <madkiss1> copying onto an RBD or something?
[9:35] * madkiss1 is now known as madkiss
[9:40] <pixel> yep
[9:40] * _are_ (~quassel@2a01:238:4325:ca00:f065:c93c:f967:9285) has joined #ceph
[9:41] <pixel> I've mounted ceph storage (w/ 1 osd) on another server and tried to copy a folder w/ data onto it
[9:42] <pixel> this is simple installation with to nodes (1 ceph & 1 client)
[9:42] <pixel> *two nodes
[9:42] * Leseb (~Leseb@ has joined #ceph
[9:48] * MikeMcClurg (~mike@ has joined #ceph
[9:56] * loicd (~loic@207.209-31-46.rdns.acropolistelecom.net) has joined #ceph
[10:05] * loicd (~loic@207.209-31-46.rdns.acropolistelecom.net) Quit (Ping timeout: 480 seconds)
[10:07] * loicd (~loic@207.209-31-46.rdns.acropolistelecom.net) has joined #ceph
[10:09] <todin> morning
[10:25] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:907b:54c8:9e95:2759) has joined #ceph
[10:35] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:907b:54c8:9e95:2759) Quit (Quit: LarsFronius)
[10:38] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[10:59] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:06] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[11:06] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:08] * yoshi_ (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:12] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:12] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[11:12] * LarsFronius_ is now known as LarsFronius
[11:20] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[11:39] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[11:49] * loicd (~loic@207.209-31-46.rdns.acropolistelecom.net) Quit (Quit: Leaving.)
[11:55] * maxim (~pfliu@ Quit (Ping timeout: 480 seconds)
[12:02] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[12:21] * nhmlap (~nhm@253-231-179-208.static.tierzero.net) has joined #ceph
[12:22] <nhmlap> good morning #ceph
[12:23] <liiwi> good afternoon
[12:27] * alan_ (~alan@v.hostex.lt) has joined #ceph
[12:28] <alan_> Hi!
[12:29] <alan_> how can i fix this warning "health HEALTH_WARN 72 pgs stale; 72 pgs stuck stale"
[12:39] * nhmlap (~nhm@253-231-179-208.static.tierzero.net) Quit (Ping timeout: 480 seconds)
[12:45] <joao> alan_, do you by any chance have only one osd?
[12:47] <alan_> i have one server with one mon, one mds and 8 osd
[12:48] <joao> what does ceph -s show?
[12:48] <alan_> health HEALTH_WARN 72 pgs stale; 72 pgs stuck stale
[12:48] <alan_> monmap e1: 1 mons at {a=}, election epoch 0, quorum 0 a
[12:48] <alan_> osdmap e1184: 9 osds: 9 up, 9 in
[12:48] <alan_> pgmap v429413: 1728 pgs: 1656 active+clean, 72 stale+active+clean; 4464 GB data, 8552 GB used, 9953 GB / 18529 GB avail
[12:49] <alan_> mdsmap e253: 1/1/1 up {0=a=up:replay}
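
The pgmap line in this paste can be checked mechanically. The sketch below parses that one line and confirms that the per-state counts add up to the total, with 72 PGs carrying the `stale` flag. It assumes the line format is exactly as pasted here; other Ceph versions may print it differently, and `pg_state_counts` is a hypothetical helper.

```python
import re

# The pgmap line as pasted in the `ceph -s` output above.
pgmap = ("pgmap v429413: 1728 pgs: 1656 active+clean, 72 stale+active+clean; "
         "4464 GB data, 8552 GB used, 9953 GB / 18529 GB avail")


def pg_state_counts(line):
    """Return (total_pgs, {state: count}) parsed from a pgmap summary line."""
    total = int(re.search(r"(\d+) pgs:", line).group(1))
    # Keep only the text between "pgs:" and the first ";".
    states_part = line.split("pgs:", 1)[1].split(";", 1)[0]
    counts = {}
    for item in states_part.split(","):
        count, state = item.split()
        counts[state] = int(count)
    return total, counts


total, counts = pg_state_counts(pgmap)
# A PG state is a "+"-joined set of flags; count those that include "stale".
stale = sum(n for state, n in counts.items() if "stale" in state.split("+"))
print(total, counts, stale)
```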
[12:49] <alan_> now i can not mount cephfs and i don't know why
[12:50] <joao> could you pastebin the result of ceph pg dump ?
[12:53] <alan_> joao: http://pastebin.com/kmpEy7BR
[12:57] <jks> running ceph osds on btrfs - is it recommended to mount with options noatime and nodiratime? - or doesn't it matter on btrfs?
[12:59] <alan_> yes, http://pastebin.com/AbdmhWZY btrfs and noatime/nodiratime present
[13:00] <andreask> I expect noatime also to be beneficial for btrfs mounts
[13:01] <alan_> yes, i mount with "rw,noexec,nodev,noatime,nodiratime"
[13:01] <alan_> on client server
[13:05] <jks> super, it'd be a good way to test taking the osds down one by one to see if everything is working :-)
[13:07] <alan_> jks: emm, stop first osd then start and stop second?
[13:08] <jks> alan, yes, that was what I meant?
[13:08] <jks> I'm thinking that the best way to do is to use "ceph osd out 1" for example to mark osd 1 out
[13:08] <jks> and then stop the osd, remount the filesystem and finally start the osd again
[13:08] <jks> correct?
[13:09] <alan_> i try it, ceph start osd sync, try not for all osd.
[13:09] <alan_> i can do its again
[13:09] <jks> sorry, I'm not following what you say?
[13:10] <alan_> I have already stopped only one osd, but i can do it for all
[13:11] <jks> okay, I'm not really understanding what you mean?
[13:11] <jks> I'm just testing the ceph filesystem I have just setup... and I now figured it would be a good test of the system to take one osd down and test if everything is still functioning as it should... remount the filesystem and bring it back up
[13:11] <joao> alan_, I think that 'ceph pg 2.215 query' should give you the osd that's causing that
[13:12] <joao> but I'm currently dealing with my test setup not working for some reason
[13:12] <joao> and can't even check if that does what I think it does
[13:14] <alan_> joao: this may be the reason that I can not mount the ceph fs?
[13:14] <joao> yes; I think you are unable to use cephfs while you have stale pgs
[13:14] <joao> not sure though
[13:15] <alan_> can you suggest anything?
[13:16] <joao> restarting the osd(s) to which those pgs belong might work
[13:16] * mib_th7k74 (d4d3c928@ircip2.mibbit.com) has joined #ceph
[13:17] * mib_th7k74 (d4d3c928@ircip2.mibbit.com) Quit ()
[13:18] <alan_> joao: I did it, and i did restart server too
[13:19] <joao> what does ceph -s report now?
[13:20] <alan_> joao: http://pastebin.com/W6vDq6Fe
[13:22] <joao> alan_, looks like you now have 244 pg's replaying
[13:22] <joao> maybe waiting a bit and see how that goes
[13:24] <alan_> joao: no, I've already waited ~20h
[13:25] <alan_> I've been trying to solve this problem for 3 days
[13:27] <joao> alan_, I see; I, for one, am unsure why the 72 pgs are stale, but I know that the cluster will need to take some time to finish replaying (not sure how long)
[13:27] <joao> I'm not sure if that will solve the pg stale issue, but I'm sure someone else (maybe sjust) can lend you a hand on that
[13:28] <alan_> ok, can i force mount cephfs with RO?
[13:28] <joao> don't know, but find it very unlikely
[13:28] <joao> the mds is also replaying, probably because the osd is too
[13:29] <joao> and I believe the mds must be active for you to mount cephfs
[13:31] <alan_> joao: one moment, i paste error from mount
[13:34] <alan_> joao: mount error 5 = Input/output error
[13:34] <alan_> and strace shows:
[13:34] <alan_> readlink("/etc/bacula-s1.hostex.lt:6789:", 0x7fff6edd09a0, 4096) = -1 ENOENT (No such file or directory)
[13:39] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[13:39] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Remote host closed the connection)
[13:41] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[13:47] <joao> sorry, going for lunch
[13:47] <joao> bbiab
[13:47] <alan_> hmm, yes! i go too :)
[14:00] <pixel> will io performance be better if I mount cephFS via fuse or via the kernel client?
[14:01] <jks> pixel, I would expect the answer to be kernel
[14:04] <pixel> thx, what are the optimal options for mounting with the kernel driver ( mount -t ceph -o ???? {ip}:6789:/ /mnt/mycephfs)?
[14:10] * mgalkiewicz (~mgalkiewi@staticline-31-183-94-25.toya.net.pl) has joined #ceph
[14:17] <jks> using the RPM version of Ceph I did a /etc/rc.d/init.d/ceph stop osd.2 to stop osd 2.. nothing seemed to happen... where can I find the documentation for the expected syntax?
[14:17] <jks> ceph now complains that all 3 mons are down, even though ceph-mon is still running on all 3 servers
[14:20] * mgalkiewicz (~mgalkiewi@staticline-31-183-94-25.toya.net.pl) Quit (Ping timeout: 480 seconds)
[14:24] * loicd (~loic@ has joined #ceph
[14:29] * mgalkiewicz (~mgalkiewi@staticline-31-183-94-25.toya.net.pl) has joined #ceph
[14:30] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) has joined #ceph
[14:36] <Leseb> hi guys
[14:39] * mgalkiewicz (~mgalkiewi@staticline-31-183-94-25.toya.net.pl) Quit (Ping timeout: 480 seconds)
[14:41] <madkiss> hello Leseb
[14:42] <jks> does anyone know what the slow request warning means? I'm continuously getting warnings that "v4 currently waiting for sub ops" with delays of up to ~240 seconds
[14:42] <Leseb> madkiss: ;)
[14:43] <madkiss> jks: means that your boxen are slow.
[14:43] <jks> madkiss, in what sense slow?
[14:43] <madkiss> probably there's some network-related problem I would guess
[14:44] * noob2 (a5a00214@ircip3.mibbit.com) has joined #ceph
[14:44] <noob2> does anyone know if the precise ceph packages will work in the newest 12.10 ubuntu server?
[14:44] <jks> madkiss, hmm, okay - do you know of something I can lookup in the documentation to help me debug this? (I've been googling for the slow requests and reading everything on the mailing list regarding it)
[14:45] <jks> madkiss, it's a test setup with 3 new servers running osd's and 1 new client using rbd to copy data in... the network is dedicated to this: 3 x gigabit for each osd and 1 x gigabit for the client
[14:45] <jks> ordinary network tests such as ttcp, ping, etc. showed no problems
[14:46] <madkiss> iptables or something?
[14:47] <jks> no iptables
[14:47] <jks> the ceph file system does work, and I can store data and it is saved across the three servers... "ceph status" says health okay
[14:49] * mgalkiewicz (~mgalkiewi@staticline-31-183-94-25.toya.net.pl) has joined #ceph
[14:50] * madkiss scratches brain
[14:51] <jks> I also note that the load average is higher than I expected on the osds... at about 5-6 in load average all the time
[14:51] <jks> they're running the ceph-osd and ceph-mon, but nothing apart from that
[14:59] * long (~chatzilla@ has joined #ceph
[15:00] <wido> Does anybody know if libvirt already has an option for adding "discard_granularity" to a disk drive?
[15:00] <wido> with RBD you could add it to the device line
[15:01] <wido> But with a qcow2 image?
[15:01] <jks> madkiss, happens across all three osd's... the clocks are synchronized with ntp... the slowness varies from about 30 seconds to 240 seconds in the worst case
[15:01] <jks> madkiss, mainly like this:[WRN] slow request 42.519207 seconds old, received at 2012-10-25 14:59:37.670645: osd_op(client.4528.0:4559216 rb.0.11b7.4a933baa.00000008ac86 [write 2363392~8192] 0.57dc4dc9) v4 currently waiting for sub ops
[15:01] <jks> madkiss, but occasionally like this: slow request 33.697542 seconds old, received at 2012-10-25 14:59:47.492655: osd_sub_op(client.4528.0:4559236 0.3 c9a20c03/rb.0.11b7.4a933baa.0000000862b6/head//0 [] v 37'6653 snapset=0=[]:[] snapc=0=[]) v7 currently started
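
When many of these warnings pile up, it can help to pull out the age, the operation type, and the state each request is stuck in. The sketch below does that with a regular expression written against the two log lines pasted here only; the pattern and the helper `parse_slow_request` are assumptions, and other Ceph versions may format the warning differently.

```python
import re

# Pattern derived from the two sample lines below, not from Ceph's source.
SLOW_REQUEST = re.compile(
    r"slow request (?P<age>\d+\.\d+) seconds old, "
    r"received at (?P<received>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<op>\w+)\(.*\) v\d+ currently (?P<state>.+)$"
)

WAITING_LINE = (
    "[WRN] slow request 42.519207 seconds old, received at "
    "2012-10-25 14:59:37.670645: osd_op(client.4528.0:4559216 "
    "rb.0.11b7.4a933baa.00000008ac86 [write 2363392~8192] 0.57dc4dc9) "
    "v4 currently waiting for sub ops"
)
STARTED_LINE = (
    "slow request 33.697542 seconds old, received at "
    "2012-10-25 14:59:47.492655: osd_sub_op(client.4528.0:4559236 0.3 "
    "c9a20c03/rb.0.11b7.4a933baa.0000000862b6/head//0 [] v 37'6653 "
    "snapset=0=[]:[] snapc=0=[]) v7 currently started"
)


def parse_slow_request(line):
    """Return age, timestamp, op type and state, or None if no match."""
    match = SLOW_REQUEST.search(line)
    if match is None:
        return None
    return {
        "age": float(match.group("age")),
        "received": match.group("received"),
        "op": match.group("op"),
        "state": match.group("state"),
    }


for sample in (WAITING_LINE, STARTED_LINE):
    print(parse_slow_request(sample))
```

Grouping the results by `op` and `state` would show whether the delays are mostly primary writes waiting on replicas (`waiting for sub ops`) or replica operations themselves.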
[15:02] <madkiss> jks: i can really only guess here. slow io on the machines?
[15:03] <jks> madkiss, I guess that could be the case, but I'm not sure how to find out if that is the case?
[15:03] <jks> it's new hardware and standard tests like bonnie and hdparm showed the drives to be quite fast
[15:03] * mgalkiewicz (~mgalkiewi@staticline-31-183-94-25.toya.net.pl) Quit (Ping timeout: 480 seconds)
[15:03] <jks> it's just ordinary hard disk drives (no ssds involved)
[15:04] * rlr219 (43c87e04@ircip3.mibbit.com) has joined #ceph
[15:04] <madkiss> jks: anything suspicious in these machines dmesg?
[15:04] <jks> madkiss, not at all, no
[15:04] <jks> only thing that looks odd to me is that the load average is ~4, but the CPU is 90% idle
[15:05] <madkiss> what's causing the load?
[15:05] <jks> I'm not sure if this is related to btrfs, and it is waiting for the btrfs kernel threads or something
[15:05] <madkiss> is this a test system?
[15:05] <jks> madkiss, I'm only using it for testing ceph, yes
[15:05] <madkiss> i would recommend to run it on xfs nevertheless
[15:06] <jks> ah, I thought btrfs was the recommended filesystem when I read the docs
[15:06] <jks> The load seems to be due to ceph-osd and the btrfs kernel threads
[15:07] <jks> I have used mdraid to create a raid upon which I have placed the btrfs file system... perhaps it is a bad idea to mix mdraid and btrfs (?)
[15:07] <madkiss> jks: I think the official recommendation is sort of "If btrfs were ready for production already, go with that. While it's not, go XFS."
[15:08] <jks> ah, okay :-|
[15:08] <jks> I read it as "if you're using a new enough Linux kernel, go with btrfs"... and I'm running 3.6
[15:08] <madkiss> the one published by the holy church of data shred?
[15:09] <jks> sorry? ;-)
[15:09] <jks> ceph.com states "Currently it is recommended that OSDs use btrfs for the underlying storage, but it is not mandatory."
[15:10] <madkiss> maybe we need to ask the inktank-people about that.
[15:10] <madkiss> I've seen a shitload of btrfsens act up
[15:10] <wido> btrfs still has some fragmentation issues
[15:10] <jks> wido, but I just formatted the filesystem and started copying data over
[15:10] <jks> it's not a system that has run for 6 months or something like that
[15:11] <wido> jks: Ah, that changes the story
[15:11] <jks> the same hardware has been in use for some months for testing other stuff without any performance problems
[15:11] <jks> but the harddrives have been reformatted for the ceph test
[15:11] <wido> jks: Checked with iostat if the drive isn't 100% utilized?
[15:12] <jks> *grmbl* looking at the logs now - it stopped giving out that warning 5 minutes ago
[15:12] <wido> jks: Do you have a journal?
[15:13] <jks> wido, what should I look for in iostat?
[15:13] <wido> jks: iostat -xd 10
[15:13] <wido> and then the util % column, last one
[15:13] <jks> wido, I haven't got a separate ssd or other drive for the ceph journal, no
[15:13] * vata (~vata@ has joined #ceph
[15:13] <jks> wido, showing approx 20% to 50% utilization across the drives
[15:14] * alan_ (~alan@v.hostex.lt) Quit (Quit: alan_)
[15:14] <jks> ah, the warning about slow requests is back again now... weird
[15:15] <wido> jks: Ok, 20 ~ 50% isn't that bad
[15:15] <wido> because you are not using a journal your drives can have a high load due to a lot of uncached writes
[15:15] <wido> What is the data you are copying to it? Using RBD or the filesystem?
[15:16] <jks> wido, I'm using rbd and have created an ext4 file system on that - and using rsync to just copy over various files to that file system
[15:17] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[15:18] <jks> I've let iostat run for some time on all osds now... they sometimes spike at 70% utilization
[15:18] <wido> jks: Using kernel RBD or inside a virtual machine?
[15:18] <jks> wido, qemu-kvm with the image on rbd
[15:18] <wido> jks: Are you using the writeback cache from librbd?
[15:19] <jks> wido, I have used rbd_cache=1
[15:20] <wido> jks: Try to take a look at: http://eu.ceph.com/docs/master/config-cluster/rbd-config-ref/
[15:20] <wido> although just rbd_cache=1 should trigger the default values
[15:20] <jks> wido, I was just looking there, and I think the defaults should be OK for me
[15:21] <wido> You can pass all those options with qemu-kvm by replacing the space with an underscore
[15:21] <jks> wido, I've only got one client accessing the rbd, so there shouldn't be any problems with the caching
[15:21] <wido> like you did with rbd_cache=1
[15:22] <jks> wido, I understand that setting a larger cache could possibly increase performance, but I'm not really after that right now... I was just worried by the warnings
[15:22] <wido> jks: Ok, I get it
[15:22] <wido> my point with this was that by not having the cache everything would go directly to the disks without using their write-cache as well
[15:22] <wido> that could lead to high disk utilization and thus slow responding OSDs
[15:23] <jks> yeah, okay - but I do have the cache on
[15:23] <wido> Is your replication set to 2 or 3? And you have 3 OSDs?
[15:23] <jks> replication set to 2, and 3 osds
[15:26] <wido> Ok, you have the performance of 1.5 hdds when writing
[15:26] <wido> you can easily hit 100% util of a hdd
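
wido's "1.5 hdds" figure follows from dividing the number of OSDs by the replication level, since every client write is stored once per replica. The sketch below spells out that arithmetic. It assumes one drive per OSD and an even spread of data, and it ignores journal writes; `aggregate_write_factor` is a hypothetical name.

```python
def aggregate_write_factor(num_osds, replicas):
    """Approximate cluster write capacity in units of one drive.

    Each client write lands on `replicas` OSDs, so the drives' combined
    write capacity is shared out by that factor.
    """
    return num_osds / replicas


# 3 OSDs with replication set to 2, as in the setup described above.
print(aggregate_write_factor(3, 2))  # 1.5
```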
[15:27] <wido> small files you are writing with rsync?
[15:27] <jks> wido, the rsync is taking in data from a 100 Mbps connection... so it shouldn't be writing more than 10 MB/s at the most... and the drives should be able to cope with much more than that
[15:27] <wido> jks: The amount of MB/s doesn't say much
[15:27] <wido> what is the average filesize?
[15:27] <jks> wido, mix of small and large files (I'm basically just rsync'ing the contents of a new CentOS installation)
[15:28] <wido> Small writes include a lot of buffer flushes and disk head movement, that is slow on disks
[15:28] <jks> I can't say the average file size... but I guess it's 1 kB or similar - small file sizes
[15:28] <wido> they can do sustained writes at 100MB/sec, but with small files they drop to something like 3MB/sec pretty easily
[15:28] <jks> wido, but the small files are written inside a virtual machine that should cache that, and then write it out to the ext4 file system in blocks that are then saved to the rbd
[15:29] <jks> wido, so I wasn't expecting small files to impact it as much compared to when using cephfs for example
[15:29] <wido> Not completely, since some write actions get the flag that it shouldn't be cached and it goes directly to the Ceph cluster
[15:29] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) has joined #ceph
[15:29] <wido> with a lot of small files you have a lot of those writes
[15:29] <jks> but shouldn't the disks then show close to 100% utilization?
[15:30] <jks> is it so, that I should expect these [WRN] slow request when the disks are over-utilized?
[15:31] <wido> jks: Yes, you should see something very high, that will result in those WRN messages
[15:32] <wido> but maybe you are writing larger files now since the util is lower
[15:32] <jks> wido, but I'm not seeing those high percentages
[15:32] <jks> no, right now it is actually copying over some rather small files
[15:32] <jks> perhaps I should do a new rsync with one large file only or something like that... just to test it
[15:33] <jks> but the rsync does seem to run relatively fast, and then stop up and nothing happens for a while, then runs fine, stops up, etc.
[15:33] <wido> The "stop", "start" is probably the buffer flush from the cache
[15:34] <jks> okay, yes - sounds reasonable
[15:39] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Remote host closed the connection)
[15:50] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: Leaving)
[15:50] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[15:50] * tziOm (~bjornar@ Quit (Remote host closed the connection)
[15:52] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[15:53] * long (~chatzilla@ Quit (Max SendQ exceeded)
[15:56] * long (~chatzilla@ has joined #ceph
[16:01] * PerlStalker (~PerlStalk@perlstalker-1-pt.tunnel.tserv8.dal1.ipv6.he.net) has joined #ceph
[16:07] * aliguori (~anthony@cpe-70-123-145-75.austin.res.rr.com) Quit (Quit: Ex-Chat)
[16:15] * jpds (~jpds@faun.canonical.com) Quit (Quit: Farewell.)
[16:20] * lofejndif (~lsqavnbok@82VAAHGCV.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:34] * aliguori (~anthony@ has joined #ceph
[16:40] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Remote host closed the connection)
[16:40] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[16:46] * gr3p (~gr3p___@28IAAIMXD.tor-irc.dnsbl.oftc.net) has joined #ceph
[16:53] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[17:03] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:23] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[17:25] * Qten (~qgrasso@ip-121-0-1-110.static.dsl.onqcomms.net) Quit (Ping timeout: 480 seconds)
[17:26] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Quit: Ex-Chat)
[17:29] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[17:30] * MikeMcClurg (~mike@ Quit (Quit: Leaving.)
[17:30] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) has joined #ceph
[17:31] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[17:31] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[17:32] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) Quit ()
[17:34] * cdblack (86868b48@ircip3.mibbit.com) has joined #ceph
[17:35] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[17:38] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[17:38] * LarsFronius_ is now known as LarsFronius
[17:45] * pixel (~pixel@ Quit (Quit: Ухожу я от вас (xchat 2.4.5 или старше))
[17:56] * phl (~user@ has joined #ceph
[17:57] * yehudasa_ (~yehudasa@99-48-179-68.lightspeed.irvnca.sbcglobal.net) has joined #ceph
[18:04] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) has joined #ceph
[18:08] * justinwa1 (~Thunderbi@ has joined #ceph
[18:09] * match (~mrichar1@pcw3047.see.ed.ac.uk) Quit (Quit: Leaving.)
[18:09] * rlr219 (43c87e04@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[18:11] * tryggvil (~tryggvil@194-144-71-75.du.xdsl.is) has joined #ceph
[18:11] * sagewk (~sage@ Quit (Quit: Leaving.)
[18:13] * justinwarner (~Thunderbi@ Quit (Ping timeout: 480 seconds)
[18:16] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) has joined #ceph
[18:21] * sagelap (~sage@2607:f298:a:607:aca0:7479:696a:6818) has joined #ceph
[18:23] * nwatkins1 (~Adium@soenat3.cse.ucsc.edu) has joined #ceph
[18:24] <tziOm> Could you guys add a check for getloadavg in configure?
[18:25] <tziOm> check_cc_snippet getloadavg '#include <stdlib.h>
[18:25] <tziOm> void test() { getloadavg(NULL,0); }'
[18:26] * miroslav (~miroslav@c-98-248-210-170.hsd1.ca.comcast.net) has joined #ceph
[18:27] <tziOm> ..and do the appropriate #ifdef ENABLE_GETLOADAVG
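The request above amounts to detecting whether getloadavg exists and taking a fallback path when the C library lacks it. A minimal Python sketch of the same pattern follows. It assumes a Linux-style /proc/loadavg as the fallback source; the helper names are hypothetical and this is not Ceph's code.

```python
import os


def parse_proc_loadavg(text):
    """Parse the first three fields of a /proc/loadavg-style line."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def load_average():
    """Return (1min, 5min, 15min) load averages, or None if unavailable."""
    try:
        # Preferred path: the libc-backed call, when the platform provides it.
        return os.getloadavg()
    except (AttributeError, OSError):
        pass
    try:
        # Fallback (assumption: Linux procfs is mounted at /proc).
        with open("/proc/loadavg") as f:
            return parse_proc_loadavg(f.read())
    except (OSError, ValueError):
        return None
```

Callers treat None as "load unknown", which plays the role of the compiled-out branch behind the #ifdef.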
[18:32] * tryggvil (~tryggvil@194-144-71-75.du.xdsl.is) Quit (Quit: tryggvil)
[18:36] * Leseb (~Leseb@ Quit (Quit: Leseb)
[18:37] * long (~chatzilla@ Quit (Quit: ChatZilla 0.9.89 [Firefox 16.0.1/20121010144125])
[18:38] * loicd (~loic@ Quit (Ping timeout: 480 seconds)
[18:42] * yehudasa_ (~yehudasa@99-48-179-68.lightspeed.irvnca.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[18:43] * MikeMcClurg (~mike@firewall.ctxuk.citrix.com) Quit (Ping timeout: 480 seconds)
[18:43] <sagelap> out of curiosity, what platform is it not available on?
[18:43] <sagelap> tziom: ^
[18:44] <tziOm> it's not available in uClibc
[18:45] <tziOm> I have now compiled ceph against a uClibc toolchain, but there are a few "problems"
[18:46] <elder> It's snowing here.
[18:47] <elder> Damn.
[18:47] <tziOm> getloadavg, execinfo.h (backtrace), sys_siglist, (elder: In Norway too?), ..let's see..
[18:47] <joao> elder, so it is here, albeit it's liquid snow :(
[18:47] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Quit: LarsFronius)
[18:47] * scalability-junk (~stp@188-193-211-236-dynip.superkabel.de) Quit (Quit: Leaving)
[18:47] * sjustlaptop (~sam@meb0536d0.tmodns.net) has joined #ceph
[18:48] <elder> Umm, joao does that mean rain? It's pretty wet snow here too. First of the year.
[18:48] <joao> yeah, I was trying to be funny
[18:48] <elder> Well I'm getting some frosty rain then.
[18:48] <tziOm> sagelap, and envz and posix_fallocate
[18:48] <joao> it's pouring so much the sound it makes on the windows is extremely annoying
[18:48] <tziOm> elder, In Norway?
[18:49] <elder> No, in Minnesota, USA.
[18:49] <tziOm> ah.. started snowing a bit here today as well, in Oslo, Norway
[18:51] * sjustlaptop (~sam@meb0536d0.tmodns.net) Quit ()
[18:51] * sjustlaptop (~sam@meb0536d0.tmodns.net) has joined #ceph
[18:53] * loicd (~loic@2a01:e35:2eba:db10:120b:a9ff:feb7:cce0) has joined #ceph
[18:56] * gr3p (~gr3p___@28IAAIMXD.tor-irc.dnsbl.oftc.net) Quit (Quit: Leaving)
[18:59] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:01] <noob2> is the ceph.conf config file the same across all the nodes? it wasn't clear in the docs
[19:02] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:04] * sjustlaptop (~sam@meb0536d0.tmodns.net) Quit (Ping timeout: 480 seconds)
[19:05] * chutzpah (~chutz@ has joined #ceph
[19:05] <nwl> noob2: yes (AIUI)
[19:13] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[19:13] <scuttlemonkey> jamespage: are you about and available for a brief brain-picking?
[19:13] <scuttlemonkey> re: ceph charm
[19:14] <noob2> nwl: thanks!
[19:14] * davidz (~Adium@ip68-96-75-123.oc.oc.cox.net) has joined #ceph
[19:15] * loicd (~loic@2a01:e35:2eba:db10:120b:a9ff:feb7:cce0) Quit (Quit: Leaving.)
[19:15] * loicd (~loic@magenta.dachary.org) has joined #ceph
[19:15] <PerlStalker> Does xfs have user_xattr enabled by default?
[19:18] <noob2> PerlStalker: I don't believe it does
[19:18] <noob2> i had to enable it when i was using glusterfs
[19:18] <noob2> it's just a remount I believe
[19:18] <PerlStalker> That's what worries me.
[19:19] <noob2> why's that?
[19:19] <PerlStalker> The mount doesn't work if I put user_xattr in fstab.
[19:19] <noob2> let me see what i have
[19:19] <noob2> 1 sec
[19:19] <PerlStalker> Yes, I can remount and enable it but that's really bad.
[19:21] <noob2> i'm sorry, looks like it's a mkfs option
[19:21] <noob2> mkfs.xfs -i size=512,attr=2 device
[19:21] <PerlStalker> Hmm
[19:22] <noob2> man pages say the default attr=2
[19:22] <noob2> i'm confused now :D
[19:22] <Fruit> yes it's enabled by default and attr=2 means something else
[19:22] <PerlStalker> :-)
[19:22] <PerlStalker> I was, too.
[19:22] * rlr219 (43c87e04@ircip2.mibbit.com) has joined #ceph
[19:23] <noob2> are you using rhel or ubuntu?
[19:23] <PerlStalker> ubuntu
[19:23] <Fruit> PerlStalker: on xfs user attributes are enabled by default
[19:24] <Fruit> PerlStalker: I don't think you can even disable them
[19:24] <PerlStalker> Fruit: great
[19:24] <noob2> Fruit: yeah i'm seeing that as the default also
[19:24] <noob2> you mount with user_xattr
[19:24] <PerlStalker> I'm trying to make sure that the xfs mount is set up the way it needs to be for ceph
[19:24] <noob2> PerlStalker: what error were you getting with user_xattr?
[19:24] <Fruit> attr=2 means version 2 attributes, which is simply a more efficient way of storing them (and in fact is the default as well)
[19:25] <Fruit> no user_xattr flags are required
[19:25] <noob2> nice
[19:25] <Fruit> just mount and go.
[19:25] <PerlStalker> noob2: I was getting 'unknown mount option [user_xattr]'
[19:25] <PerlStalker> I'm okay with the idea of not needing it.
[19:26] <noob2> yeah
[19:26] <Fruit> sane defaults are a precious thing :)
[19:26] <PerlStalker> Indeed
[19:26] <noob2> i'm sure ceph would complain if the filesystem was missing what it needed
[19:26] <PerlStalker> One would hope.
[19:26] <noob2> hehe
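Rather than relying on mount options, one way to confirm the point above is to probe the mounted filesystem directly. A minimal sketch, assuming Linux and Python 3 (os.setxattr and os.getxattr are Linux-only); the function name and the `user.probe` attribute are made up for illustration:

```python
import os
import tempfile


def supports_user_xattr(directory):
    """Return True if a user.* xattr can be set and read back in `directory`."""
    if not hasattr(os, "setxattr"):
        return False  # the xattr calls are not available on this platform
    # The temporary file is removed automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        try:
            os.setxattr(tmp.name, "user.probe", b"1")
            return os.getxattr(tmp.name, "user.probe") == b"1"
        except OSError:
            # Typically "operation not supported" or a permissions problem.
            return False
```

Pointing it at the intended OSD data directory should return True on a default XFS mount, if the behaviour described in the discussion holds.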
[19:27] <nwatkins1> sagelap: should the messenger nonce be refreshed for a remount, or use one for all mounts from instances of the same ceph_mount_info?
[19:31] <tziOm> sagelap, what are your plans for zero-configuration booting of osd nodes?
[19:36] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Remote host closed the connection)
[19:38] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Quit: Leaving)
[19:39] <rlr219> mikeryan: you here?
[19:44] <mikeryan> rlr219: yep
[19:44] * adjohn (~adjohn@ has joined #ceph
[19:44] <mikeryan> so we tracked the problem pretty deep into the OSD, but we're still not sure *why* it's happening
[19:44] <mikeryan> it could be FS corruption, but it could also be an OSD bug
[19:44] <mikeryan> the latter is looking somewhat likely
[19:45] <mikeryan> http://tracker.newdream.net/issues/3408
[19:45] <mikeryan> we opened that ticket to address it, and i believe sjust will take a look when he has a moment of spare time
[19:46] <rlr219> ok.
[19:46] <mikeryan> sorry we couldn't track it down further :(
[19:47] <mikeryan> your extremely verbose logs were very helpful!
[19:47] <rlr219> correct me if I am wrong, but if there was file system corruption on one osd, wouldn't the osd recreate it from a good copy off of another?
[19:48] <rlr219> thanks, I had to perform a logrotate to ensure to get the whole day! ;-)
[19:48] <mikeryan> this was actually due to a missing snapset, which is slightly different from an ordinary file
[19:49] <mikeryan> it's stored as an attribute on the head object that has snapshots
[19:49] <mikeryan> in practice the OSD should just warn you that the filestore is inconsistent
[19:50] <mikeryan> we do suspect that it's a bug though, yes
[19:50] <rlr219> ok. so the snapset was gone (I am assuming because the size was zero). i am guessing then that that snapset can't be recovered from another OSD....
[19:51] * nhmlap (~nhm@ has joined #ceph
[19:52] <mikeryan> rlr219: let me double check how the snapset is stored, it might be possible
[19:54] <rlr219> I would think that if that is possible, then that would be a way to make that filestore consistent, as an inconsistent FS is basically corrupted data.
[19:59] <mikeryan> right, the OSD should *not* crash just due to a corrupt fs
[19:59] * fzylogic (~fzylogic@ has joined #ceph
[20:01] <rlr219> did either of you get a chance to look at the suicide timeout error?
[20:02] <dmick-away> noob2: the conf file doesn't have to be the same across all nodes. It's designed so it *can* be, but it's not required.
[20:02] * dmick-away is now known as dmick
[20:05] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:3868:8cb2:c9e3:b38e) has joined #ceph
[20:06] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[20:06] <noob2> thanks. i'll keep that in mind
[20:08] <mikeryan> rlr219: i wasn't looking at that one, sjust was
[20:08] <mikeryan> not sure how far he got into that, but he's not in today
[20:12] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) has joined #ceph
[20:13] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[20:13] * Kioob (~kioob@luuna.daevel.fr) has joined #ceph
[20:14] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[20:17] <rlr219> mikeryan: ok thanks. please keep me posted.
[20:18] * rlr219 (43c87e04@ircip2.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[20:19] <dmick> elder: I've attempted to invite you to Google Hangout
[20:20] <noob2> i think i hit a little snag with mkcephfs
[20:20] * Cube1 (~Cube@2607:f298:a:697:20f5:750e:4dc9:c91c) has joined #ceph
[20:20] <noob2> it looks like it set up my osd box ok but the monitor box is messed up
[20:20] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[20:20] <noob2> it tried to remove my /var/lib/ceph/mon/ceph-a
[20:21] <noob2> then stopped
[20:22] <noob2> http://mibpaste.com/b7svKo
[20:22] <noob2> here's what i see
[20:35] <noob2> i think btrfs is doing something weird. i'm going to try xfs
[20:37] * noob2 (a5a00214@ircip3.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[20:37] * noob2 (a5a00214@ircip2.mibbit.com) has joined #ceph
[20:43] * sjustlaptop (~sam@ has joined #ceph
[20:45] <dmick> noob2: it just looks like you were trying to initialize a filesystem, but there was already something mounted there
[20:45] <dmick> could that be the case?
[20:46] <noob2> possibly. i umounted the filesystem and it seems to work now
[20:46] <noob2> no idea why
[20:47] <noob2> i thought the monitor node required a separate filesystem mounted at the point it was writing logs at
[20:47] <noob2> so i did that
[20:47] <noob2> when i umount and blow away the osd, mkcephfs seems to work and i can fire up both nodes ok
[20:47] <noob2> leave it to noobies to find the silly problems :D
[20:50] <noob2> does it normally start in a degraded state ?
[20:50] <noob2> i see HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean ceph> status health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
[20:50] <dmick> how many osds did you configure?
[20:50] <noob2> just 1
[20:50] <noob2> figured i'd start small
[20:51] <dmick> default replication is 2, so unless you reset that, it's complaining that it can't reach the replication you want
[20:51] <noob2> ah
[20:51] <dmick> it'll run that way, but it's consistently unhealthy
[20:51] <noob2> ok
[20:51] <noob2> guess it's time to find out how easy it is to add another osd :)
[20:51] <dmick> and of course a single failure will hurt, but for a test cluster that doesn't matter much
[20:51] <dmick> but yes, by all means :)
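The arithmetic behind the HEALTH_WARN above can be sketched as follows. This is a deliberately simplified model: it only compares the pool's replica count with the number of usable OSDs and ignores CRUSH placement rules. The function names are hypothetical.

```python
def can_place_all_replicas(pool_size, osds_up_and_in):
    """True if there are at least as many usable OSDs as requested replicas."""
    return osds_up_and_in >= pool_size


def missing_replicas_per_pg(pool_size, osds_up_and_in):
    """How many replicas each placement group is short of (0 if none)."""
    return max(0, pool_size - osds_up_and_in)


# Replication of 2 with a single OSD: every PG is one replica short.
print(can_place_all_replicas(2, 1), missing_replicas_per_pg(2, 1))
```

With one OSD every placement group falls short by one copy, which is consistent with all 192 PGs being reported as degraded.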
[20:51] <noob2> i gotta say this is the coolest damn thing ever
[20:52] <noob2> telling EMC to shove it will be really satisfying
[20:52] <mikeryan> open source ftw
[20:52] <noob2> hell yeah :)
[20:54] <rturk> :)
[20:54] * LarsFronius_ (~LarsFroni@95-91-242-161-dynip.superkabel.de) has joined #ceph
[20:54] <rturk> we should put that on a t-shirt
[20:55] <mikeryan> i would rock the hell out of that shirt
[20:56] * rlr219 (43c87e04@ircip4.mibbit.com) has joined #ceph
[20:57] <rlr219> mikeryan: got your email. looking at it now.
[21:01] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:3868:8cb2:c9e3:b38e) Quit (Ping timeout: 480 seconds)
[21:01] * LarsFronius_ is now known as LarsFronius
[21:04] <benpol> libvirt with ceph rbd: is it reasonable to assume that using librados is "better"/"faster"/"more featureful" than using the kernel's rbd client?
[21:05] <joshd> benpol: yes, librbd has client-side caching and supports discard and cloning, which haven't made it into the kernel client yet
[21:06] <joshd> benpol: with qemu, it avoids an extra context switch as well
[21:08] * BManojlovic (~steki@ has joined #ceph
[21:09] <noob2> are the ceph-{osd-number}'s unique to the whole cluster or just the machine you're working on?
[21:10] <joshd> they're unique cluster wide. you should just use them sequentially though, don't try to give them extra meaning
[21:11] <noob2> ok i figured as much
[21:11] <noob2> never hurts to ask
[21:11] <joshd> yeah, no problem, some people think about it the opposite way so it doesn't hurt to clarify either :)
[21:12] <noob2> yeah i was thinking each server might have osd-[0-11]
[21:12] <noob2> then i thought well that might mess up things cluster wide
[21:15] <benpol> joshd: thanks, I suspected as much but it's good to understand more!
[21:15] * benpol wanders off to lunch
[21:17] * deepsa_ (~deepsa@ has joined #ceph
[21:18] <PerlStalker> joshd: Why should the osd IDs be sequential? What difference does it make?
[21:20] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[21:20] * deepsa_ is now known as deepsa
[21:20] * scuttlemonkey_ (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[21:21] * sjustlaptop (~sam@ Quit (Ping timeout: 480 seconds)
[21:21] * Kioob (~kioob@luuna.daevel.fr) Quit (Ping timeout: 480 seconds)
[21:24] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Ping timeout: 480 seconds)
[21:28] <joshd> PerlStalker: they don't have to be, but they're indexes into an array, so it'll waste some memory if you have osd 1000000
[21:28] <PerlStalker> Fair enough.
[21:29] <PerlStalker> If I renumber my osds to make them sequential will that free up the memory or am I hosed by my ignorant choice?
[21:33] <joshd> PerlStalker: it probably doesn't make that much of a difference, but you can add low number ones, remove the high numbered ones, and set max_osd in the osdmap down to the highest you actually have
[21:33] * Kioob (~kioob@luuna.daevel.fr) has joined #ceph
[21:33] <PerlStalker> Gotcha
[21:34] <PerlStalker> I'm still in testing so it's not that much of a hassle to renumber.
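A rough sketch of why sparse OSD ids waste memory when they are used as array indexes, as described above. It assumes the table is sized as highest id plus one; the real osdmap layout and the per-slot cost are not modelled, and the function names are hypothetical.

```python
def osd_table_slots(osd_ids):
    """Slots needed if OSD ids index directly into an array (max id + 1)."""
    return max(osd_ids) + 1 if osd_ids else 0


def wasted_slots(osd_ids):
    """Slots allocated but not backed by an actual OSD."""
    return osd_table_slots(osd_ids) - len(set(osd_ids))


print(wasted_slots([0, 1, 2]))        # sequential ids: nothing wasted
print(wasted_slots([0, 1, 1000000]))  # one huge id: almost all slots unused
```

Renumbering to low ids and then lowering max_osd, as suggested in the discussion, corresponds to shrinking the table back to the first case.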
[21:39] <noob2> is there always a crush map created?
[21:39] <noob2> by default
[21:39] <noob2> i'm adding a new osd to my 1 osd cluster :)
[21:40] <joshd> noob2: you always have a crushmap, and by default (created by mkcephfs) it will separate replicas across hosts
[21:41] <noob2> so i need to add the new osd to the crush map regardless then?
[21:42] <joshd> yeah, http://ceph.com/docs/master/cluster-ops/add-or-rm-osds/ is the best reference for this
[21:42] <noob2> yeah i'm following the manual way
[21:42] <noob2> not sure what name or weight to give it
[21:43] <joshd> name is osd.N, where N is the id, weight can be based on the capacity of the device
[21:43] <noob2> ok
[21:43] <joshd> 'ceph osd tree' will show you the current crushmap visually
[21:44] <noob2> ok
[21:44] <joshd> I think the default is everything gets weight 1
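One way to base weights on device capacity, as suggested above, is to scale them linearly with size. The sketch below assumes a convention of weight 1.0 per terabyte; that ratio is an illustrative assumption rather than something stated in this log, and the function name is hypothetical.

```python
def crush_weights(capacity_tb_by_osd, tb_per_weight_unit=1.0):
    """Map each OSD name to a weight proportional to its capacity in TB."""
    return {
        name: round(tb / tb_per_weight_unit, 2)
        for name, tb in capacity_tb_by_osd.items()
    }


# A 2 TB device gets twice the weight of a 1 TB device.
print(crush_weights({"osd.0": 1.0, "osd.1": 2.0}))
```

Only the ratio between weights matters for placement, so any consistent unit works as long as every OSD uses the same one.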
[21:45] <noob2> looks like it auto added it
[21:45] <noob2> i see osd.1 down
[21:45] <noob2> so that's good
[21:46] <noob2> i started it up and it says up and in
[21:46] <noob2> guess i'll just wait for it to replicate and see how it goes :D
[21:46] <joshd> sounds good
[21:47] <noob2> oh, it looks like my new osd is in a new pool
[21:47] <noob2> http://mibpaste.com/nK9yjI
[21:48] <joshd> it's not been placed anywhere in the hierarchy
[21:48] <noob2> hmm ok
[21:49] <joshd> that's what the 'ceph osd crush set 0 osd.0 1.0 pool=default rack=unknownrack host=blah' does
[21:49] <noob2> ok
[21:49] <noob2> i just got an error when i tried it
[21:49] <noob2> i didn't put pool=default rack=unknown
[21:50] <noob2> i'll shut it down and try to move it
[21:50] <joshd> no need to shut it down even
[21:52] <noob2> do i just follow that get crush, decompile, edit, recompile procedure?
[21:52] <noob2> sorry for the thousand questions. today is ceph day 1
[21:52] <joshd> you can do that too
[21:52] * grifferz_ is now known as grifferz
[21:52] <noob2> is there an easier way?
[21:52] <joshd> 'ceph osd crush set' is basically a shortcut for that
[21:52] <noob2> nice
[21:53] <joshd> you only need to decompile, edit, recompile, etc when you want to change the actual placement rules, not when you're just adding or removing osds
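For scripting the shortcut above, a small helper can assemble the command line in the form quoted earlier in this discussion ('ceph osd crush set 0 osd.0 1.0 pool=default rack=unknownrack host=blah'). The helper name is hypothetical, the argument order is copied from that quote and may differ in other Ceph releases, and the sketch only builds the argument list without running anything.

```python
def crush_set_command(osd_id, weight, **location):
    """Build the argv for 'ceph osd crush set' in the form quoted in the log."""
    name = "osd.%d" % osd_id
    cmd = ["ceph", "osd", "crush", "set", str(osd_id), name, str(float(weight))]
    # Location pairs such as pool=..., rack=..., host=... keep the call order.
    cmd += ["%s=%s" % (key, value) for key, value in location.items()]
    return cmd


print(" ".join(crush_set_command(0, 1.0, pool="default",
                                 rack="unknownrack", host="blah")))
```

The resulting list could be passed to subprocess.run on a host with the ceph CLI and suitable credentials.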
[21:54] <noob2> awesome
[21:54] <noob2> scrub away
[22:03] <noob2> looks like it finished scrubbing but didn't entirely fix the degraded state
[22:03] <noob2> i'll have to take a look tomorrow
[22:05] <lxo> shouldn't there be some rate limiting for attempts to establish connections with other parties in the cluster? I'm seeing CPU spikes and nf_conntrack table overflows when some component is down but other ceph components try to connect to it like crazy, e.g., when a majority of the mons is down
[22:05] <lxo> with 0.53
[22:09] <joshd> lxo: yes, I'm surprised that isn't being rate limited
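The kind of rate limiting asked about above is commonly done with capped exponential backoff between reconnect attempts. The sketch below is generic and not Ceph's messenger code; the base delay, growth factor and cap are illustrative values.

```python
def backoff_delays(base=0.2, factor=2.0, cap=15.0, attempts=8):
    """Return the wait (in seconds) before each successive reconnect attempt."""
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))  # never wait longer than the cap
        delay *= factor                 # grow the wait after each failure
    return delays


print(backoff_delays(base=1.0, factor=2.0, cap=10.0, attempts=5))
```

A real client would usually add random jitter to each delay so that many peers do not retry in lockstep against the same unreachable monitor.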
[22:17] * kavonr (~rnovak@ns.indyramp.com) has joined #ceph
[22:20] * sagewk (~sage@ has joined #ceph
[22:20] * ChanServ sets mode +o sagewk
[22:20] * sagewk (~sage@ has left #ceph
[22:20] * sagewk (~sage@ has joined #ceph
[22:20] * ChanServ sets mode +o sagewk
[22:21] <sagewk> sigh... much easier to debug rbd caching when it is enabled
[22:21] <dmick> heh
[22:21] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Ping timeout: 480 seconds)
[22:21] <elder> It doesn't have a problem when it's off.
[22:22] <dmick> problem solved!
[22:28] * Cube1 (~Cube@2607:f298:a:697:20f5:750e:4dc9:c91c) Quit (Read error: Connection reset by peer)
[22:28] * Cube1 (~Cube@2607:f298:a:697:20f5:750e:4dc9:c91c) has joined #ceph
[22:30] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) has joined #ceph
[22:39] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:41] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[22:47] * dmick (~dmick@2607:f298:a:607:5c63:ce2a:d9d9:8516) Quit (Ping timeout: 480 seconds)
[22:52] * dmick (~dmick@ has joined #ceph
[23:02] * madkiss (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Quit: Leaving.)
[23:06] * Robe (robe@amd.co.at) has joined #ceph
[23:06] <Robe> omghi2u2
[23:11] * ChanServ sets mode +o dmick
[23:11] <dmick> hello Robe
[23:25] * fzylogic (~fzylogic@ Quit (Quit: fzylogic)
[23:25] * fzylogic (~fzylogic@ has joined #ceph
[23:31] * Kioob (~kioob@luuna.daevel.fr) Quit (Ping timeout: 480 seconds)
[23:39] * rweeks (~rweeks@ has joined #ceph
[23:49] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[23:50] * tziOm (~bjornar@ti0099a340-dhcp0778.bb.online.no) Quit (Remote host closed the connection)
[23:51] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[23:51] * loicd (~loic@magenta.dachary.org) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.