#ceph IRC Log


IRC Log for 2013-07-24

Timestamps are in GMT/BST.

[0:02] * aliguori (~anthony@ Quit (Remote host closed the connection)
[0:04] * bandrus (~Adium@ has joined #ceph
[0:08] * allsystemsarego (~allsystem@ Quit (Quit: Leaving)
[0:14] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Ping timeout: 480 seconds)
[0:24] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[0:24] * SvenPHX (~scarter@wsip-174-79-34-244.ph.ph.cox.net) has left #ceph
[0:25] * mschiff (~mschiff@ Quit (Remote host closed the connection)
[0:25] <sjust> jmlowe1: yeah, looking
[0:27] * drokita (~drokita@ Quit (Ping timeout: 480 seconds)
[0:28] <sjust> jmlowe1: ok, I know what happened
[0:30] <sjust> the attribute is wrong, we need to adjust the value
[0:31] <sjust> 0000000: 0109 0000 0000 0000 0002 0000 0002 0000
[0:31] <sjust> should be that
[0:32] <sjust> actually
[0:32] <sjust> one sec
[0:32] <sjust> 0000000: 01dc 0000 0000 0000 0002 0000 0002 0000 00
[0:32] <sjust> that
[0:32] <sjust> oops
[0:32] <sjust> 0000000: 01dc 0000 0000 0000 0002 0000 0002 0000
[0:33] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[0:33] <sjust> if you do that and re-move all of the objects back into DIR_7 again you should be good
[0:33] <sjust> jmlowe1: ^
[0:33] <sjust> that is, adjust the attr value to:
[0:34] <sjust> 0000000: 01dc 0000 0000 0000 0002 0000 0002 0000
[0:34] <sjust> 0000010: 00
[0:36] <Kioob> $ ceph-deploy osd prepare tessa:/dev/sda4
[0:36] <Kioob> ...
[0:36] <Kioob> OSError: [Errno 2] No such file or directory: '/sys/block/sda4'
[0:36] <Kioob> ceph-deploy doesn't allow use of partitions, only full disk is allowed ?
[0:39] <lurbs> Kioob: I'm no expert on ceph-deploy, but the examples for ceph-deploy refer to just 'sdb', with no partition. I believe that one gets created.
[0:40] <Kioob> ok... so I need to manually deploy that cluster. Great
[0:40] * diegows (~diegows@host63.186-108-72.telecom.net.ar) Quit (Ping timeout: 480 seconds)
[0:40] <dmick> you can use partitions; if you do, however, you need to specify a journal, maybe?
[0:40] <dmick> do the docs really not talk about disk vs. partition vs dir?
[0:40] <Kioob> not ceph-deploy docs
[0:42] * bwesemann_ (~bwesemann@2001:1b30:0:6:74bc:6499:312e:8e83) Quit (Remote host closed the connection)
[0:42] <lurbs> dmick: http://ceph.com/docs/next/rados/deployment/ceph-deploy-osd isn't hugely obvious when it comes to full devices vs partitions, for OSDs and journals, no.
[0:42] * bwesemann_ (~bwesemann@2001:1b30:0:6:edc3:28ca:34e8:47d7) has joined #ceph
[0:42] * lautriv (~lautriv@f050081157.adsl.alicedsl.de) Quit (Read error: Connection reset by peer)
[0:43] <Kioob> and here, I just want to use a partition for OSD, I don't want to put the journal on a dedicated partition
[0:43] <dmick> yes, but you have to specify where the journal goes
[0:44] <Kioob> it's not mandatory :S
[0:44] * lautriv (~lautriv@f050081157.adsl.alicedsl.de) has joined #ceph
[0:44] <Kioob> you have to specify that when you want a dedicated storage
[0:44] <dmick> what's not mandatory? the journal has to go somewhere
[0:44] <Kioob> I never need to specify the path before
[0:45] <dmick> before as in "with other invocations of ceph-deploy"?
[0:45] <Kioob> no, other cluster, managed without "ceph-deploy"
[0:46] <Kioob> but so, I will search the default journal path in the doc
[0:47] <dmick> yeah. something was defaulting a path for the journal.
[0:47] <Kioob> ( /var/lib/ceph/osd/$cluster-$id/journal )
[0:47] <Kioob> so same problem, ceph-deploy tries to access /sys/block/sda4, which of course doesn't exist
[0:47] <dmick> sure. ceph-deploy isn't assuming you already have one and that it's set up the way you want (it may be a mounted partition, or a symlink to a different place)
[0:48] <Kioob> (from the doc, the journal path is not mandatory on the "ceph-deploy" command line...)
[0:50] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[0:52] <dmick> I'm pretty sure you can either:
[0:52] <dmick> 1) give a block device to osd prepare, in which case it will allocate partitions for data and journal, or
[0:53] <dmick> 2) give two partitions, or two paths, or a partition and a path
[0:53] <Kioob> I tried the second solution, it doesn't work
[0:54] <dmick> yes it would be nice if the docs explained this: http://tracker.ceph.com/issues/5116
[0:54] <dmick> Kioob: so, as always, exactly what you tried and exactly what it did would be a big help in figuring out what went wrong
[0:54] <Kioob> http://pastebin.com/9YGpi3Hv
[0:54] <dmick> k
[0:55] <Kioob> the problem is that ceph-deploy assumes that the second parameter is a full disk, and so is present in /sys/block/; but that's wrong
[0:55] <dmick> sagewk: ^ do you remember if this is a known bug?
[0:56] <dmick> Kioob: it's assuming the first parm is a full disk by design, but in the second form (the pastebin) that looks wrong to me
[0:56] <dmick> (by design if there's only one)
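A minimal sketch of why the lookup in the traceback fails: on Linux, whole disks have an entry directly under /sys/block, while partitions sit one level down, inside their parent disk's directory (for example /sys/block/sda/sda4). The helper names below are hypothetical and this is not how ceph-deploy itself resolves devices; the fake tree only exists so the example can run without real hardware.

```python
import os
import tempfile


def sysfs_path(dev_name, sys_block="/sys/block"):
    """Return the sysfs directory for a whole disk or a partition, or None."""
    # Whole disks appear directly under /sys/block.
    direct = os.path.join(sys_block, dev_name)
    if os.path.isdir(direct):
        return direct
    # Partitions appear one level down, inside their parent disk's directory.
    for disk in sorted(os.listdir(sys_block)):
        nested = os.path.join(sys_block, disk, dev_name)
        if os.path.isdir(nested):
            return nested
    return None


def make_fake_sysfs():
    """Build a throwaway tree shaped like /sys/block containing sda/sda4."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "sda", "sda4"))
    return root


if __name__ == "__main__":
    root = make_fake_sysfs()
    print(sysfs_path("sda", root))   # found directly under the root
    print(sysfs_path("sda4", root))  # found under sda/, not at the top level
```

A lookup that only checks the top level, as the OSError above suggests, will miss sda4 even though the partition exists.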
[0:56] <jmlowe1> sjust: I'm a little fuzzy on setting the xattr, got a suggested method?
[0:57] <sjust> you'll have to dump the attr to a file and hex edit it and reset it
[0:57] <sjust> attr
[0:57] <sjust> is the utility
[0:57] * sprachgenerator (~sprachgen@ Quit (Quit: sprachgenerator)
[0:57] <Kioob> For me a block device is a block device, I don't see where the "by design" say that reserving 4GB at start of the disk is a problem
[0:57] <jmlowe1> ok, I just need a file with those bytes in hex?
[0:58] <sjust> I would test out the attr utility, never used it myself
[0:58] <sjust> but I think so
[0:58] <sjust> sorry, not in hex
[0:58] <sjust> a file containing the actual binary contents
[0:58] <sjust> and then use emacs or something to edit it in hex mode
[0:58] <jmlowe1> right, but it's dumped in hex
[0:58] <sjust> I think attr might be able to dump it raw
[0:59] <sjust> GET The -g attrname option tells attr to search the named object and print (to stdout)
[0:59] <sjust> the value associated with that attribute name. With the -q flag, stdout will be
[0:59] <sjust> exactly and only the value of the attribute, suitable for storage directly into a
[0:59] <sjust> file or processing via a piped command.
[0:59] <dmick> Kioob: if you specify one device, it's assumed to be a full device so that the journal can be a partition next to the data
[0:59] <dmick> if you specify two things, it shouldn't matter whether they're full disk or not, because it doesn't have to split it
[1:00] <Kioob> ok, I understand
[1:00] <jmlowe1> here we go, home turf: "python-xattr - module for manipulating filesystem extended attributes"
[1:00] * markbby (~Adium@ Quit (Quit: Leaving.)
[1:00] <Kioob> but I tried both syntaxes ; without success
[1:00] <dmick> Yes. I know that.
[1:01] <dmick> that's why I said "shouldn't matter" and "sagewk: ^ do you remember if this is a known bug?"
[1:01] <Kioob> oh ok :)
[1:03] * devoid (~devoid@ Quit (Quit: Leaving.)
[1:05] <Kioob> So, I will retry tomorrow, with or without "ceph-deploy". Go to bed, thanks for your help
[1:05] <sagewk> dmick: i think not a known bug. this is a journal file?
[1:06] <dmick> Kioob is trying to osd prepare two partitions (data/journal)
[1:06] <dmick> and for some reason it's trying to find the whole-device named by the first one
[1:07] <dmick> Kioob: some time before bed yet
[1:09] <sagewk> k, open an urgent bug?
[1:09] <dmick> so that is correct usage, then, and ought to work. OK.
[1:09] * indeed (~indeed@ Quit (Remote host closed the connection)
[1:10] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Quit: Leaving.)
[1:13] <dmick> Kioob, sagewk: http://tracker.ceph.com/issues/5734
[1:13] <phantomcircuit> im trying to remove 2 osds
[1:14] <phantomcircuit> i have them set to out but still up
[1:14] <phantomcircuit> 2013-07-23 23:12:22.188508 mon.0 [INF] pgmap v6837450: 576 pgs: 513 active+clean, 4 active+recovery_wait, 56 active+remapped, 1 active+clean+scrubbing+deep, 2 active+recovering; 725 GB data, 2798 GB used, 6647 GB / 9445 GB avail; 15/384264 degraded (0.004%)
[1:14] <phantomcircuit> those seem to be stuck in recovering
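A rough sketch for pulling the per-state PG counts out of a pgmap summary line like the one pasted above, which makes it easier to see how many PGs are not yet clean. The function names are hypothetical, and the regular expression only assumes the "N pgs: count state, ...;" shape visible in this log.

```python
import re

SAMPLE = (
    "2013-07-23 23:12:22.188508 mon.0 [INF] pgmap v6837450: 576 pgs: "
    "513 active+clean, 4 active+recovery_wait, 56 active+remapped, "
    "1 active+clean+scrubbing+deep, 2 active+recovering; 725 GB data, "
    "2798 GB used, 6647 GB / 9445 GB avail; 15/384264 degraded (0.004%)"
)


def parse_pg_states(line):
    """Return (total_pgs, {state: count}) from a pgmap summary line."""
    match = re.search(r"(\d+) pgs: ([^;]+);", line)
    if match is None:
        raise ValueError("no pg summary found")
    total = int(match.group(1))
    states = {}
    for item in match.group(2).split(","):
        count, state = item.strip().split(" ", 1)
        states[state] = int(count)
    return total, states


def not_clean(states):
    """Count PGs whose state string does not include 'clean'."""
    return sum(n for state, n in states.items() if "clean" not in state)


if __name__ == "__main__":
    total, states = parse_pg_states(SAMPLE)
    print(total, not_clean(states))
```

Comparing the not-clean count between two samples taken a few minutes apart is a simple way to tell whether recovery is progressing or stuck.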
[1:24] <jmlowe1> sjust: still haven't figured out how to set it
[1:28] <sjust> one sec
[1:28] <jmlowe1> may have cracked it, redirects seem to work if you are careful
[1:29] <sjust> should work with -q
[1:30] <sjust> attr -q -g <attr> <file>
[1:30] <sjust> and redirect to a file
[1:30] <sjust> hexdump will verify file contents
[1:31] <jmlowe1> ~/tmp# ~/listxattrs .
[1:31] <jmlowe1> test
[1:31] <jmlowe1> 0000000: 01dc 0000 0000 0000 0002 0000 0002 0000 ................
[1:31] <jmlowe1> 0000010: 00 .
[1:32] <jmlowe1> looks right to me
[1:32] <sjust> what is listxattrs?
[1:32] <jmlowe1> cat ~/listxattrs
[1:32] <jmlowe1> #!/bin/bash
[1:32] <jmlowe1> path=$1
[1:32] <jmlowe1> for a in $(attr -lq $path); do echo $a; attr -qg $a $path | xxd; done
[1:32] <jmlowe1> somebody here gave me that, dmick maybe
[1:32] <sjust> looks reasonable
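For anyone who prefers Python to the attr/xxd pipeline, here is a rough equivalent of the listxattrs script above. It assumes Linux, where os.listxattr and os.getxattr are available. Note that these calls return the full attribute name including its namespace (such as "user."), whereas attr -l prints names without that prefix. The hexdump layout imitates the dumps in this log and is not guaranteed to match every xxd version.

```python
import os


def hexdump(data, width=16):
    """Format bytes as offset, 2-byte hex groups, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hexed = chunk.hex()
        groups = " ".join(hexed[i:i + 4] for i in range(0, len(hexed), 4))
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # 8 groups of 4 hex digits plus 7 separators is 39 columns.
        lines.append("%07x: %-39s  %s" % (offset, groups, text))
    return "\n".join(lines)


def list_xattrs(path):
    """Print every extended attribute on path with a hex dump (Linux only)."""
    for name in os.listxattr(path):
        print(name)
        print(hexdump(os.getxattr(path, name)))


if __name__ == "__main__":
    value = bytes.fromhex("01dc0000000000000002000000020000" "00")
    print(hexdump(value))
```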
[1:33] <jmlowe1> here goes
[1:33] <dmick> $ ls ~/bin/listxattrs
[1:33] <dmick> /home/dmick/bin/listxattrs
[1:33] <dmick> maybe
[1:35] * dpippenger1 (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[1:36] <sjust> jmlowe1: wait, made a mistake
[1:36] <jmlowe1> yeah, the objects moved back
[1:36] <sjust> hmm, that actually should not have happened
[1:36] <sjust> what is the attr now?
[1:36] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Read error: Connection reset by peer)
[1:37] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[1:37] <jmlowe1> hmm
[1:38] * dpippenger1 (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit ()
[1:38] * dpippenger1 (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[1:38] <jmlowe1> root@gwioss1:/data/osd.14/current/2.37d_head/DIR_D/DIR_7# ~/listxattrs .
[1:38] <jmlowe1> cephos.phash.contents
[1:38] <jmlowe1> 0000000: 01dc 0000 0000 0000 0002 0000 0002 0000 ................
[1:38] <jmlowe1> 0000010: 00 .
[1:39] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[1:39] <sjust> ok, one more adjustment to that
[1:40] <sjust> 0000000: 01dc 0000 0000 0000 0003 0000 0002 0000
[1:40] <sjust> 0000010: 00
[1:40] * indeed (~indeed@ has joined #ceph
[1:40] <sjust> sorry, that 2->3 is the number of subdirs under DIR_D/DIR_7
[1:40] <sjust> what are the attrs on the top pg directory?
[1:41] <sjust> I think there's an in progress op tag
[1:41] <jmlowe1> root@gwioss1:/data/osd.14/current/2.37d_head# ~/listxattrs .
[1:41] <jmlowe1> cephos.collection_version
[1:41] <jmlowe1> 0000000: 0300 0000 ....
[1:41] <jmlowe1> cephos.phash.contents
[1:41] <jmlowe1> 0000000: 0100 0000 0000 0000 0001 0000 0000 0000 ................
[1:41] <jmlowe1> 0000010: 00 .
[1:41] <jmlowe1> cephos.seq
[1:41] <jmlowe1> 0000000: 0101 1000 0000 e0ad d700 0000 0000 0000 ................
[1:41] <jmlowe1> 0000010: 0000 0100 0000 00 .......
[1:41] <jmlowe1> ceph.info
[1:41] <jmlowe1> 0000000: 07 .
[1:41] <jmlowe1> cephos.phash.in_progress_op
[1:41] <jmlowe1> 0000000: 0101 0000 0002 0000 0001 0000 0044 0100 .............D..
[1:41] <jmlowe1> 0000010: 0000 37 ..7
[1:41] <sjust> ahah
[1:41] <sjust> that one
[1:41] <sjust> cephos.phash.in_progress_op
[1:42] <sjust> you need to remove that attribute, move the files again, and adjust the DIR_D/DIR_7 attr to what I pasted a bit ago
[1:42] <jmlowe1> with the 3 for subdir count
[1:42] <sjust> correct
[1:42] <sjust> there are 3 directories under DIR_D/DIR_7, right?
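A hedged sketch of how the 17-byte cephos.phash.contents value in the dumps above could be decoded and patched without a hex editor. The layout is an assumption inferred from this log: one version byte, a 64-bit count, a 32-bit subdir count, and a 32-bit level, all little-endian. Only the subdir field is confirmed by sjust's "2->3" remark; the names "objs" and "hash_level" are guesses.

```python
import struct

# Assumed layout (17 bytes, little-endian, no padding):
# version (u8), objs (u64), subdirs (u32), hash_level (u32).
LAYOUT = struct.Struct("<BQII")


def from_dump(hex_text):
    """Turn hex digits copied from an xxd-style dump into bytes."""
    return bytes.fromhex(hex_text.replace(" ", ""))


def decode(value):
    """Unpack the attribute value into named fields."""
    version, objs, subdirs, hash_level = LAYOUT.unpack(value)
    return {"version": version, "objs": objs,
            "subdirs": subdirs, "hash_level": hash_level}


def with_subdirs(value, subdirs):
    """Return a copy of the value with only the subdir count changed."""
    version, objs, _old, hash_level = LAYOUT.unpack(value)
    return LAYOUT.pack(version, objs, subdirs, hash_level)


if __name__ == "__main__":
    old = from_dump("01dc 0000 0000 0000 0002 0000 0002 0000 00")
    print(decode(old))
    print(with_subdirs(old, 3).hex())
```

The patched bytes could then be written to a file and loaded back with the attr utility, as discussed above; check the result with a hex dump before restarting the OSD.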
[1:42] * indeed (~indeed@ Quit (Read error: Connection reset by peer)
[1:43] * indeed (~indeed@ has joined #ceph
[1:46] <jmlowe1> well, I think we have a winner
[1:46] <sjust> ah, cool
[1:46] <sjust> thanks for your help, I believe I have the bug
[1:46] <jmlowe1> no, thank you, bacon saved
[1:46] * Cerales (~danielbry@router-york.lninfra.net) Quit (Quit: Lost terminal)
[1:47] <jmlowe1> deep scrubbing that pg now for good measure
[1:47] <sjust> yup
[1:47] <jmlowe1> went active+clean after just scrub
[1:47] <sjust> you got a crash at an inopportune moment and caught the HashIndex code with it's pants down
[1:47] <sjust> *its
[1:47] <bandrus> later man
[1:47] <bandrus> oops
[1:48] <jmlowe1> 2013-07-23 19:47:55.330383 osd.14 [INF] 2.37d deep-scrub ok
[1:49] <jmlowe1> I'm thinking there may also be a bug with trim/discard from qemu-rbd that caused the initial crash, or is that the one you were referring to
[1:49] <sjust> well, this probably didn't cause the crash, this was just a side-effect
[1:50] <jmlowe1> I've only managed to crash an osd twice in the past 6 months with trim/discard, so it may be kind of hard to track down
[1:50] <sagewk> sjust: was this a power loss/kernel crash, or ceph-osd crash?
[1:50] <jmlowe1> ceph-osd crash
[1:54] * indeed (~indeed@ Quit (Remote host closed the connection)
[1:54] <loicd> sjust: thanks for the review at https://github.com/dachary/ceph/commit/a5bde96db5ea52e56fbe94accb3538aee4729aa3#commitcomment-3698477
[1:56] * BManojlovic (~steki@fo-d- Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:58] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[1:58] * andrei (~andrei@host109-151-35-94.range109-151.btcentralplus.com) Quit (Read error: Operation timed out)
[1:59] <sjust> loicd: sure, still need to get to the other 1
[2:04] * portante|afk is now known as portante
[2:06] * Guest820 (~zack@65-36-76-12.dyn.grandenetworks.net) Quit (Ping timeout: 480 seconds)
[2:06] <lautriv> i see many have issues with ceph-deploy.....did anyone manage to create a cluster with it ?
[2:12] <lurbs> Yeah, I just did. Doesn't much like journals on LVs, rather than partitions, though.
[2:17] <lautriv> also doesn't like md-raids ;)
[2:25] * yanzheng (~zhyan@ has joined #ceph
[2:30] * sagelap (~sage@2600:1012:b028:6e0d:2516:2daf:20cf:33ed) has joined #ceph
[2:31] * huangjun (~kvirc@ has joined #ceph
[2:33] * AfC (~andrew@gateway.syd.operationaldynamics.com) has joined #ceph
[2:33] * sagelap1 (~sage@ Quit (Ping timeout: 480 seconds)
[2:33] * LeaChim (~LeaChim@0540adc6.skybroadband.com) Quit (Ping timeout: 480 seconds)
[2:36] * huangjun|2 (~kvirc@ has joined #ceph
[2:36] * huangjun|2 (~kvirc@ Quit (Read error: Connection reset by peer)
[2:42] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[2:55] * huangjun (~kvirc@ has joined #ceph
[2:56] * huangjun|2 (~kvirc@ has joined #ceph
[2:56] * huangjun (~kvirc@ Quit ()
[3:00] * yy-nm (~chatzilla@ has joined #ceph
[3:01] * huangjun (~kvirc@ has joined #ceph
[3:07] <lautriv> lurbs, still wondering how you achieved that, each invocation of ceph-deploy invents new error-messages o.O
[3:07] * huangjun|2 (~kvirc@ Quit (Ping timeout: 480 seconds)
[3:08] <lautriv> bedtime, tomorrow last chance
[3:12] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[3:15] * indeed (~indeed@c-71-198-23-110.hsd1.ca.comcast.net) has joined #ceph
[3:22] * julian (~julianwa@ has joined #ceph
[3:26] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) has joined #ceph
[3:35] * Cube1 (~Cube@ has joined #ceph
[3:35] * Cube1 (~Cube@ Quit ()
[3:37] * AfC (~andrew@gateway.syd.operationaldynamics.com) Quit (Quit: Leaving.)
[3:40] * jluis (~joao@ has joined #ceph
[3:41] * Cube (~Cube@ Quit (Ping timeout: 480 seconds)
[3:43] * bandrus (~Adium@ Quit (Quit: Leaving.)
[3:46] * joao (~joao@89-181-156-100.net.novis.pt) Quit (Ping timeout: 480 seconds)
[3:47] * markbby (~Adium@ has joined #ceph
[4:04] <phantomcircuit> create-or-move updating item id 2 name 'osd.2' weight 2.6 at location {host=localhost,root=default} to crush map
[4:04] <phantomcircuit> hostname isn't localhost
[4:04] <phantomcircuit> and im even supplying --hostname
[4:05] <dmick> that refers to crush bucket names
[4:06] <phantomcircuit> dmick, the osd should be in a bucket with the same name as the actual hostname
[4:06] <dmick> what was the command that caused this output?
[4:06] <phantomcircuit> /usr/lib/ceph/ceph_init.sh --hostname ns222224 start osd.3
[4:08] <dmick> ceph_init.sh??
[4:08] * markbby (~Adium@ Quit (Remote host closed the connection)
[4:08] <phantomcircuit> dmick, it's what /etc/init.d/ceph start calls in gentoo
[4:09] <phantomcircuit> im pretty sure it's just the normal rc init script
[4:09] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[4:09] <phantomcircuit> which isn't compatible with the gentoo openrc system
[4:09] <dmick> did someone else package it for gentoo and change names?...
[4:09] <phantomcircuit> hmm?
[4:10] <phantomcircuit> not sure what you're asking
[4:10] <dmick> our source base, including packaging, contains no mention of ceph_init
[4:10] <dmick> so I'm confused as to where that came from
[4:10] <phantomcircuit> dmick, im pretty sure it's the init script which would normally be in init.d
[4:10] <dmick> maybe it is the same as what other distros would call /etc/init.d/ceph, but I don't know how it got there
[4:11] * dmick is scared
[4:11] <dmick> but anyway
[4:11] <phantomcircuit> it's a shim to work with gentoo
[4:11] <dmick> ...which someone, somewhere, created
[4:11] <dmick> but I don't think it was Inktank.
[4:11] <phantomcircuit> no the file im running is the /etc/init.d/ceph script
[4:11] <phantomcircuit> the /etc/init.d/ceph on gentoo is just a shim
[4:11] <phantomcircuit> that make sense?
[4:12] <dmick> I could believe that is the case; I just don't know who did it or how
[4:12] <dmick> but no matter. assuming indeed you're running the code I know as /etc/init.d/ceph
[4:13] <phantomcircuit> dmick, yeah i am running the code you know as /etc/init.d/ceph
[4:14] <dmick> then yeah, I don't know how it got through that with localhost
[4:16] <dmick> looks to me like it should be looking up osd.3 in ceph.conf and using that unless /var/lib/ceph/osd/ceph-3/sysvinit exists (does it?)
[4:17] <phantomcircuit> it does not
[4:17] <phantomcircuit> and osd.3 in /etc/ceph/ceph.conf has host = ns222224
[4:17] <phantomcircuit> wait
[4:17] <phantomcircuit> that should be osd host
[4:17] <phantomcircuit> >.>
[4:18] <phantomcircuit> carry on
[4:18] <dmick> I don't think so?...
[4:18] <dmick> localhost is the default for host if not mentioned, but it's not osd host
[4:18] <dmick> at this point I'd be executing that start command with sh -x
[4:19] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[4:25] <phantomcircuit> dmick, yeah it's ignoring the --hostname parameter and calling hostname -s
[4:25] <phantomcircuit> i fixed the actual hostname and now it works
[4:25] <phantomcircuit> but i'd say ignoring --hostname is a bug
[4:27] * Cube (~Cube@66-87-67-203.pools.spcsdns.net) has joined #ceph
[4:27] <dmick> I mean, I see hostname -s in ceph_common.sh, but that's before processing args
[4:27] <dmick> and it sets hostname, and then the arg processing should be resetting it
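The behaviour dmick describes (detect the short hostname first, then let --hostname override it during argument processing) can be sketched as follows. This is only an illustration of the intended ordering in Python, not the actual shell logic in the init script; resolve_hostname and its detect parameter are hypothetical names.

```python
import argparse
import socket


def resolve_hostname(argv, detect=socket.gethostname):
    """Use the detected short hostname unless --hostname is given."""
    # Step 1: default to the short form of the detected name,
    # roughly what `hostname -s` would print.
    default = detect().split(".")[0]
    # Step 2: argument processing replaces the default if the flag is set.
    parser = argparse.ArgumentParser()
    parser.add_argument("--hostname", default=default)
    args, _rest = parser.parse_known_args(argv)
    return args.hostname


if __name__ == "__main__":
    print(resolve_hostname(["--hostname", "ns222224", "start", "osd.3"]))
```

If the override is skipped or applied before the default is set, the result falls back to whatever the machine reports, which would match the symptom seen here.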
[4:28] * Cube (~Cube@66-87-67-203.pools.spcsdns.net) Quit ()
[4:31] * john_barbee (~jbarbee@173-16-234-208.client.mchsi.com) Quit (Quit: ChatZilla [Firefox 22.0/20130618035212])
[4:32] * bandrus (~Adium@66-87-67-203.pools.spcsdns.net) has joined #ceph
[4:34] * bandrus (~Adium@66-87-67-203.pools.spcsdns.net) Quit ()
[4:34] <sjustlaptop> jmlowe1: the osd is running ok?
[4:39] <huangjun> can i get the clients that mounted on ceph by cmd?
[4:43] * julian (~julianwa@ Quit (Quit: afk)
[4:56] <dmick> huangjun: I don't believe so, no; I don't think the cluster has any kind of session with a client
[4:56] <huangjun> just like "ceph get clients" and so
[4:57] <huangjun> the mds and client will have msgs and mon also have msg with client,
[4:59] <huangjun> the reason i want to get the clients is if i restart the cluster and some client are still writing to the cluster,then the client will die
[5:01] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) has joined #ceph
[5:09] <lxo> gregaf1, isn't it a bug that renaming an enclosing dir doesn't update enclosed files' parent xattrs?
[5:18] <erice> the tree for ceph-deploy install --dev=wip-cuttlefish-ceph-disk HOST seems broken
[5:18] <erice> Failed to fetch http://gitbuilder.ceph.com/ceph-deb-raring-x86_64-basic/ref/wip-cuttlefish-ceph-disk/dists/raring/main/binary-amd64/Packages
[5:19] * indeed (~indeed@c-71-198-23-110.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[5:22] * zackc (~zack@65-36-76-12.dyn.grandenetworks.net) has joined #ceph
[5:23] * zackc is now known as Guest866
[5:24] <dmick> erice: yeah, seems to be
[5:25] <erice> Thanks for confirming. I hoped it was not my setup
[5:26] <dmick> don't really understand why. it was built and uploaded successfully according to the logs
[5:38] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[5:41] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[5:45] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[6:06] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) has joined #ceph
[6:06] * ChanServ sets mode +o scuttlemonkey
[6:13] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[6:17] * matt__ (~matt@220-245-1-152.static.tpgi.com.au) has joined #ceph
[6:17] * Fetch (fetch@gimel.cepheid.org) Quit (Remote host closed the connection)
[6:23] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has left #ceph
[6:23] * Fetch_ (fetch@gimel.cepheid.org) has joined #ceph
[6:24] * morse (~morse@supercomputing.univpm.it) Quit (Ping timeout: 480 seconds)
[6:29] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[6:39] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[6:45] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) Quit (Quit: erice)
[6:55] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[6:59] * sagelap1 (~sage@ has joined #ceph
[7:03] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[7:06] * huangjun (~kvirc@ has joined #ceph
[7:07] * sagelap (~sage@2600:1012:b028:6e0d:2516:2daf:20cf:33ed) Quit (Ping timeout: 480 seconds)
[7:09] * huangjun|2 (~kvirc@ has joined #ceph
[7:09] * huangjun (~kvirc@ Quit (Read error: Connection reset by peer)
[7:15] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[7:16] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[7:16] * gregaf (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit ()
[7:20] * huangjun (~kvirc@ has joined #ceph
[7:21] * huangjun|3 (~kvirc@ has joined #ceph
[7:27] * huangjun|2 (~kvirc@ Quit (Ping timeout: 480 seconds)
[7:28] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) has joined #ceph
[7:28] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[7:34] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) Quit (Read error: Operation timed out)
[7:42] * huangjun (~kvirc@ has joined #ceph
[7:49] * huangjun|3 (~kvirc@ Quit (Ping timeout: 480 seconds)
[8:02] * ismell_ (~ismell@host-24-56-171-198.beyondbb.com) has joined #ceph
[8:04] * ismell (~ismell@host-24-56-171-198.beyondbb.com) Quit (Ping timeout: 480 seconds)
[8:06] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[8:06] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[8:08] * huangjun|2 (~kvirc@ has joined #ceph
[8:08] * huangjun (~kvirc@ Quit (Read error: Connection reset by peer)
[8:16] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[8:18] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[8:18] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit ()
[8:22] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Quit: Page closed)
[8:28] * huangjun (~kvirc@ has joined #ceph
[8:34] * huangjun|3 (~kvirc@ has joined #ceph
[8:34] * huangjun (~kvirc@ Quit (Read error: Connection reset by peer)
[8:35] * huangjun|2 (~kvirc@ Quit (Ping timeout: 480 seconds)
[8:43] <phantomcircuit> 2013-07-24 06:38:21.176203 mon.0 [INF] pgmap v6842543: 576 pgs: 365 active+remapped, 211 active+degraded; 725 GB data, 1545 GB used, 9202 GB / 10747 GB avail; 35884/384288 degraded (9.338%)
[8:43] <phantomcircuit> that hasn't changed yet
[8:43] <phantomcircuit> for a while
[8:44] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) Quit (Quit: erice)
[8:48] <huangjun|3> how many osds and are they in the same host?
[8:48] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[8:56] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:59] * huangjun (~kvirc@ has joined #ceph
[9:02] <phantomcircuit> huangjun|3, 4 total 2 per host, 2 hosts
[9:02] <phantomcircuit> im trying to downscale to 1 host and set them out on one host
[9:03] * mschiff (~mschiff@p4FD7FEF6.dip0.t-ipconnect.de) has joined #ceph
[9:03] <huangjun> so "ceph osd tree" will show you part of crush ruleset
[9:03] <phantomcircuit> huangjun, i changed the crush rules to one per osd
[9:03] <phantomcircuit> i assume that's where you're going with this
[9:05] <huangjun> i needs 5 mins to migrate data,
[9:06] * huangjun|3 (~kvirc@ Quit (Ping timeout: 480 seconds)
[9:14] * matt__ (~matt@220-245-1-152.static.tpgi.com.au) Quit (Ping timeout: 480 seconds)
[9:18] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[9:20] * huangjun|2 (~kvirc@ has joined #ceph
[9:23] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[9:24] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:24] * BManojlovic (~steki@fo-d- has joined #ceph
[9:26] * sleinen (~Adium@user-23-11.vpn.switch.ch) has joined #ceph
[9:26] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:27] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[9:28] * odyssey4me (~odyssey4m@ has joined #ceph
[9:29] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[9:33] * bergerx_ (~bekir@ has joined #ceph
[9:35] * fridudad (~oftc-webi@fw-office.allied-internet.ag) has joined #ceph
[9:35] <fridudad> Anybody here who can give me the data for cephdrop?
[9:41] <phantomcircuit> 2013-07-24 07:41:18.106387 mon.0 [INF] pgmap v6842816: 576 pgs: 1 stale+active+clean+scrubbing+deep, 540 stale+active+clean, 4 stale+active+degraded+wait_backfill, 1 stale+active+recovery_wait, 16 stale+active+remapped, 3 stale+active+degraded+backfilling, 11 active+degraded+remapped; 725 GB data, 1514 GB used, 9266 GB / 10780 GB avail; 15EB/s rd, 15Eop/s; 17512/384288 degraded (4.557%); recovering 15E o/s, 15EB/s
[9:41] <phantomcircuit> lol
[9:49] * huangjun (~kvirc@ has joined #ceph
[9:50] * huangjun|3 (~kvirc@ has joined #ceph
[9:53] * huangjun|4 (~kvirc@ has joined #ceph
[9:53] * huangjun|3 (~kvirc@ Quit (Read error: Connection reset by peer)
[9:54] * leseb1 (~Adium@2a04:2500:0:d00:851c:5528:69c2:1e4f) has joined #ceph
[9:57] * huangjun|2 (~kvirc@ Quit (Ping timeout: 480 seconds)
[9:57] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[9:58] * huangjun|4 (~kvirc@ Quit ()
[10:12] * capri (~capri@ Quit (Ping timeout: 480 seconds)
[10:22] * fireD (~fireD@93-142-246-152.adsl.net.t-com.hr) has joined #ceph
[10:26] * LeaChim (~LeaChim@0540adc6.skybroadband.com) has joined #ceph
[10:47] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[10:47] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[10:47] * Guest866 (~zack@65-36-76-12.dyn.grandenetworks.net) Quit (Ping timeout: 480 seconds)
[11:01] * jcsp (~john@82-71-55-202.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[11:01] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[11:01] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[11:17] * huangjun (~kvirc@ has joined #ceph
[11:19] <huangjun> has this bug been fixed? http://tracker.ceph.com/issues/4280
[11:22] <absynth> doesn't look like it, no
[11:23] <absynth> it's blocked by an older bug, 3819, which needs to be fixed first
[11:26] <yanzheng> don't use snapshots, that feature is still unstable
[11:27] <huangjun> uhh, we use fuse client and then make snapshot, then rollback the snapshot, result in this problem
[11:35] * JM (~oftc-webi@ has joined #ceph
[11:36] * huangjun (~kvirc@ Quit (Ping timeout: 480 seconds)
[11:39] * yanzheng (~zhyan@ Quit (Quit: Leaving)
[11:55] * huangjun (~kvirc@ has joined #ceph
[12:00] * yy-nm (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130618035212])
[12:08] * LeaChim (~LeaChim@0540adc6.skybroadband.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * fireD (~fireD@93-142-246-152.adsl.net.t-com.hr) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * odyssey4me (~odyssey4m@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * sagelap1 (~sage@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * soren (~soren@hydrogen.linux2go.dk) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * X3NQ (~X3NQ@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * markl (~mark@tpsit.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * wrencsok (~wrencsok@wsip-174-79-34-244.ph.ph.cox.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * WarrenTheAardvarkUsui (~WarrenUsu@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * s2r2 (uid322@id-322.ealing.irccloud.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * AaronSchulz (~chatzilla@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * guppy (~quassel@guppy.xxx) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * yehudasa (~yehudasa@2607:f298:a:607:e1d9:eaeb:4726:2fb6) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * tserong (~tserong@124-168-227-28.dyn.iinet.net.au) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * janos (~janos@static-71-176-211-4.rcmdva.fios.verizon.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * jeroenmoors (~quassel@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * jnq (~jon@0001b7cc.user.oftc.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * mjeanson (~mjeanson@00012705.user.oftc.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * rturk-away (~rturk@ds2390.dreamservers.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * maswan (maswan@kennedy.acc.umu.se) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * terje- (~root@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * sbadia (~sbadia@yasaw.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * portante (~portante@nat-pool-bos-t.redhat.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * lmb (lmb@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * cclien_ (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * Kdecherf (~kdecherf@shaolan.kdecherf.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * beardo (~sma310@beardo.cc.lehigh.edu) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * jf-jenni (~jf-jenni@stallman.cse.ohio-state.edu) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * Azrael (~azrael@terra.negativeblue.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * Jakdaw (~chris@puma-mxisp.mxtelecom.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * nigwil (~idontknow@ Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * cjh_ (~cjh@ps123903.dreamhost.com) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * lupine (~lupine@lupine.me.uk) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * ivan` (~ivan`@000130ca.user.oftc.net) Quit (synthon.oftc.net graviton.oftc.net)
[12:08] * chutz (~chutz@rygel.linuxfreak.ca) Quit (synthon.oftc.net graviton.oftc.net)
[12:09] * LeaChim (~LeaChim@0540adc6.skybroadband.com) has joined #ceph
[12:09] * fireD (~fireD@93-142-246-152.adsl.net.t-com.hr) has joined #ceph
[12:09] * odyssey4me (~odyssey4m@ has joined #ceph
[12:09] * sagelap1 (~sage@ has joined #ceph
[12:09] * soren (~soren@hydrogen.linux2go.dk) has joined #ceph
[12:09] * X3NQ (~X3NQ@ has joined #ceph
[12:09] * markl (~mark@tpsit.com) has joined #ceph
[12:09] * wrencsok (~wrencsok@wsip-174-79-34-244.ph.ph.cox.net) has joined #ceph
[12:09] * WarrenTheAardvarkUsui (~WarrenUsu@ has joined #ceph
[12:09] * s2r2 (uid322@id-322.ealing.irccloud.com) has joined #ceph
[12:09] * AaronSchulz (~chatzilla@ has joined #ceph
[12:09] * guppy (~quassel@guppy.xxx) has joined #ceph
[12:09] * yehudasa (~yehudasa@2607:f298:a:607:e1d9:eaeb:4726:2fb6) has joined #ceph
[12:09] * tserong (~tserong@124-168-227-28.dyn.iinet.net.au) has joined #ceph
[12:09] * janos (~janos@static-71-176-211-4.rcmdva.fios.verizon.net) has joined #ceph
[12:09] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has joined #ceph
[12:09] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[12:09] * jeroenmoors (~quassel@ has joined #ceph
[12:09] * jnq (~jon@0001b7cc.user.oftc.net) has joined #ceph
[12:09] * mjeanson (~mjeanson@00012705.user.oftc.net) has joined #ceph
[12:09] * rturk-away (~rturk@ds2390.dreamservers.com) has joined #ceph
[12:09] * maswan (maswan@kennedy.acc.umu.se) has joined #ceph
[12:09] * terje- (~root@ has joined #ceph
[12:09] * sbadia (~sbadia@yasaw.net) has joined #ceph
[12:09] * portante (~portante@nat-pool-bos-t.redhat.com) has joined #ceph
[12:09] * lmb (lmb@ has joined #ceph
[12:09] * cclien_ (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) has joined #ceph
[12:09] * lupine (~lupine@lupine.me.uk) has joined #ceph
[12:09] * cjh_ (~cjh@ps123903.dreamhost.com) has joined #ceph
[12:09] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) has joined #ceph
[12:09] * nigwil (~idontknow@ has joined #ceph
[12:09] * Jakdaw (~chris@puma-mxisp.mxtelecom.com) has joined #ceph
[12:09] * Azrael (~azrael@terra.negativeblue.com) has joined #ceph
[12:09] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) has joined #ceph
[12:09] * jf-jenni (~jf-jenni@stallman.cse.ohio-state.edu) has joined #ceph
[12:09] * ivan` (~ivan`@000130ca.user.oftc.net) has joined #ceph
[12:09] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[12:09] * beardo (~sma310@beardo.cc.lehigh.edu) has joined #ceph
[12:09] * Kdecherf (~kdecherf@shaolan.kdecherf.com) has joined #ceph
[12:09] * ChanServ sets mode -o rturk-away
[12:09] * ChanServ sets mode +o sagewk
[12:20] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:20] * KindTwo (KindOne@h248.44.28.71.dynamic.ip.windstream.net) has joined #ceph
[12:20] <lautriv> second monitor gives me -> http://pastebin.com/XjPKqSa3 right after creation ? keyring,done,sysvinit and store.db is present for this node and ceph-create-key runs forever (twice b/c i retried local start)
[12:21] * KindTwo is now known as KindOne
[12:26] <mattch> Straw poll: For people doing virtualisation on top of ceph, what virtualisation/cloud management systems are you using? Personally, I'm investigating opennebula, since I'm looking for 'permanent' vm instances with dedicated mac/ip/hardware allocation, rather than short-lived cloud 'clones' but I'm interested to know what other people are up to...
[12:29] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[12:30] * tserong (~tserong@124-168-227-28.dyn.iinet.net.au) Quit (Quit: Leaving)
[12:30] * dobber (~dobber@ has joined #ceph
[12:35] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) Quit (Ping timeout: 480 seconds)
[12:41] <huangjun> i'd like to add two mds on one host machine, how can i achieve it by using ceph-deploy ?
[12:44] <lautriv> huangjun, why would you have 2 mds on the very same machine ?
[12:45] <huangjun> i just like to test the ceph-deploy, if the host already have one mds daemon, then we use "ceph-deploy mds create hostname" it will return ok, but still just one mds daemon
[12:47] <lautriv> huangjun, the box has just one hostname.
[12:48] <huangjun> yes, i also think this is the point, so if we really want to create two mds on one box, any tips?
[12:49] <huangjun> or if i set two hostname in /etc/hosts and point to the same machine ip address
[12:51] <lautriv> bad idea. get another IP then. however it's rather pointless to have 2 mds on the same machine.
[12:52] <huangjun> thanks,just to test the ceph-deploy,
[12:53] <lautriv> huangjun, define another IP or use an additional NIC
[12:54] <huangjun> yes, we'll try this
[12:56] * huangjun (~kvirc@ Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[12:56] <lautriv> ok, another one : if i have a mon and gathered keys, should the next ceph-deploy mon create not provision that keys to the next mon ?
[13:11] * jluis is now known as joao
[13:12] <joao> Azrael, around?
[13:18] * syed_ (~chatzilla@ has joined #ceph
[13:18] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:22] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[13:38] <Azrael> joao: yup
[13:40] * nagyz (~zoltan@2001:620:20:1001:1586:3791:51e4:de7f) has joined #ceph
[13:40] <nagyz> hi
[13:41] <nagyz> after rebooting an osd which has been deployed using ceph-deploy, nothing came online by default; then I used ceph-deploy activate which again did not start the osd daemons.
[13:41] <nagyz> if I do /etc/init.d/ceph start, it mounts the osds, but I don't see any daemons running
[13:42] <nagyz> any ideas?
[13:43] <lautriv> nagyz, had a look on your ceph.conf ? i observed ceph-deploy missing a lot.
[13:43] <nagyz> the osds are not explicitly listed there, but I was under the impression that it should be able to pick them up even without this
[13:43] <nagyz> is this not the case?
[13:44] <lautriv> mine does not even provision keys to another monitor.
[13:48] <nagyz> isn't ceph-deploy the recommended deployment tool? :-)
[13:48] <nagyz> ok, so, I should list the osds manually. let's see if that helps
[13:49] * yanzheng (~zhyan@ has joined #ceph
[13:51] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:54] <lautriv> nagyz, it is, but so far i have not managed to get it working.
[13:55] * mschiff_ (~mschiff@p4FD7FEF6.dip0.t-ipconnect.de) has joined #ceph
[13:55] * mschiff (~mschiff@p4FD7FEF6.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[14:03] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:07] * jerrad (~jerrad@dhcp-63-251-67-70.acs.internap.com) has joined #ceph
[14:07] * markbby (~Adium@ has joined #ceph
[14:08] <jerrad> hey I gotta couple newb questions regarding ceph for my implementation needs, anyone got some time?
[14:11] * syed_ (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130627172038])
[14:14] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) has joined #ceph
[14:24] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Quit: Leaving.)
[14:25] * syed_ (~chatzilla@ has joined #ceph
[14:25] * MK_FG (~MK_FG@00018720.user.oftc.net) Quit (Remote host closed the connection)
[14:26] * MK_FG (~MK_FG@00018720.user.oftc.net) has joined #ceph
[14:26] * mschiff (~mschiff@p4FD7FEF6.dip0.t-ipconnect.de) has joined #ceph
[14:26] * mschiff_ (~mschiff@p4FD7FEF6.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[14:29] <ofu_> jerrad: meta-question?
[14:33] * mxmln_ (~mxmln@ has joined #ceph
[14:33] <janos> jerrad: what is being suggested is that instead of asking about asking, just ask the actual questions. that will determine if there is someone around with the knowledge to answer
[14:35] * yanzheng (~zhyan@ Quit (Ping timeout: 480 seconds)
[14:36] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[14:38] <jerrad> yeah that was a meta question, layering/wrapper if you will
[14:38] * diegows (~diegows@ has joined #ceph
[14:39] <jerrad> ok, can I use all 3 file system types at once with ceph - object block and fs
[14:40] <janos> yep
[14:40] <jerrad> the file type is dependent on the node, and the monitor sorts it all out?
[14:40] <janos> i'd say more dependent on how to access it
[14:41] * zackc (~zack@65-36-76-12.dyn.grandenetworks.net) has joined #ceph
[14:41] <janos> if you mount rbd it's rbd
[14:41] <janos> access/create
[14:41] * zackc is now known as Guest918
[14:45] * DarkAceZ (~BillyMays@ Quit (Read error: Connection reset by peer)
[14:45] * DarkAceZ (~BillyMays@ has joined #ceph
[14:48] * dobber (~dobber@ Quit (Remote host closed the connection)
[14:49] * Guest918 (~zack@65-36-76-12.dyn.grandenetworks.net) Quit (Ping timeout: 480 seconds)
[14:51] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[14:52] <jerrad> with ceph I can make a file system structure on the storage machine, as opposed to swift containers that have no subdirectory control/structure
[14:52] <jerrad> ?
[14:57] <syed_> jerrad: depends on how you want to use ceph.
[14:57] <janos> you mean like make block devices and then format them?
[14:57] <jerrad> well, i have some legacy stuff that uses nginx to access my file system directly on the server
[14:58] <jerrad> so i need a file directory storage system
[14:58] <janos> that can be done in a few ways i'd think
[14:59] <janos> one way is block device, format and mount it
[14:59] <janos> as if it were like any other harddrive
[14:59] <janos> nginx would see it as such
[14:59] <janos> just another part of the filesystem mounted
[15:00] <jerrad> ok and a client would be able to ftp through the REST api and do all the normal directory stuff?
[15:00] <jerrad> i guess thats how that works
[15:04] <jerrad> a block is a fixed size? so they would be limited on the amount of data that can be stored?
[15:05] <janos> think of a block device as a harddrive
[15:06] <janos> and how you interact with those
[15:06] <janos> except ceph block devices are thinly provisioned
[15:06] <janos> which is nice
[15:06] <janos> so you can say it's 10TB in size
[15:06] <janos> even though you don't have that much space
[15:06] <janos> but can expand the hardware underneath to accommodate as reality creeps in
[15:07] <janos> but defined as 10TB, your OS that's mounting it sees it as 10TB, but maybe you're only using a few hundred MB on it currently
[15:08] <janos> ceph could be a good choice for what you're doing, but it sounds like you may wish to investigate some fundamentals first
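The thin provisioning janos describes above can be illustrated with a small toy model. This is only a sketch, not how Ceph or RBD is implemented: the `ThinVolume` class, its method names, and the 4 MiB block size are all assumptions made for illustration. The point is that the advertised size is just a number, and backing space is consumed only for blocks that have actually been written.

```python
class ThinVolume:
    """Toy thin-provisioned volume (hypothetical, for illustration only)."""

    def __init__(self, virtual_size, block_size=4 * 1024 * 1024):
        self.virtual_size = virtual_size  # size the client sees
        self.block_size = block_size
        self.blocks = {}  # block index -> bytearray, only for written blocks

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.virtual_size:
            raise ValueError("write outside the advertised volume size")
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, self.block_size)
            chunk = data[pos:pos + self.block_size - start]
            # Space is allocated the first time a block is touched.
            block = self.blocks.setdefault(index, bytearray(self.block_size))
            block[start:start + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset, length):
        if offset < 0 or offset + length > self.virtual_size:
            raise ValueError("read outside the advertised volume size")
        out = bytearray()
        pos = 0
        while pos < length:
            index, start = divmod(offset + pos, self.block_size)
            n = min(self.block_size - start, length - pos)
            block = self.blocks.get(index)
            # Blocks that were never written read back as zeros.
            out += block[start:start + n] if block is not None else bytes(n)
            pos += n
        return bytes(out)

    def allocated_bytes(self):
        return len(self.blocks) * self.block_size


# A volume advertised as 10 TiB that stores only what is written to it.
vol = ThinVolume(10 * 1024 ** 4)
vol.write(4 * 1024 * 1024 - 2, b"abcd")  # straddles two blocks
print(vol.virtual_size, vol.allocated_bytes())
```

The write in the usage lines crosses a block boundary, so two blocks are allocated while the rest of the advertised space costs nothing.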
[15:10] <jerrad> when reprovisioning does it need to be reformatted, can this provisioning be done through an api call or would i have to write some type of script to automate this functionality on the machine running ceph
[15:10] <jerrad> i agree my fundamentals are shaky, but thats why i sent that metadata earlier
[15:10] <janos> depends on what you mean by reprovisioning
[15:11] <janos> adding in real hardware to fulfill that hypothetical 10TB example?
[15:11] <jerrad> no
[15:11] * jks (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[15:11] <jerrad> just expanding on the same disk
[15:11] * jksM (~jks@3e6b5724.rev.stofanet.dk) Quit (Read error: Connection reset by peer)
[15:12] <janos> if you define a block device's size and then later want to grow it, you'd need to let the filesystem know - just like any harddrive
[15:12] <janos> which is why for these i define them everly large
[15:12] <janos> since thinly provisioned
[15:12] <janos> everly/overly
[15:13] <jerrad> well yeah, but an OS will want to reformat with any drive size changes. which would be a problem if I have to create a large file size for every client
[15:14] <jerrad> and then having to copy data to change size, and paste it back when reprovisioned
[15:15] <janos> no reformatting necessary
[15:15] <janos> for example, if using ext4, you could use resize2fs
[15:16] <janos> which i think can even be done while mounted
[15:19] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:22] <jerrad> well mounting poses an availability issue and blocks pose a resizing issue (additional work).. would a file system node allow me to get around these limitations?
[15:23] <jerrad> similar to my legacy stuff now where they use the whole hard drive and the OS quarantines off sections for individual users
[15:23] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[15:23] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:26] <nagyz> has anyone seen issues like this?
[15:26] <nagyz> Jul 20 22:07:53 signina-osd kernel: [ 1014.112014] EXT4-fs error (device sdh): mb_free_blocks:1301: group 142, block 4654966:freeing already freed block (bit 1910)
[15:26] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[15:26] * PerlStalker (~PerlStalk@ has joined #ceph
[15:26] <nagyz> this is on a dedicated OSD VM (the disks are a pci-e card which is passed thru to the VM using vt-d)
[15:26] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:29] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[15:29] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:29] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[15:30] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[15:30] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit ()
[15:37] <jerrad> can block storage be segregated for individual users
[15:37] <jerrad> ?
[15:38] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[15:44] * huangjun (~kvirc@ has joined #ceph
[15:48] * syed_ (~chatzilla@ Quit (Quit: ChatZilla [Firefox 22.0/20130627172038])
[15:53] * madkiss1 (~madkiss@2001:6f8:12c3:f00f:d08c:2d46:f42:a958) Quit (Ping timeout: 480 seconds)
[15:57] <huangjun> a question about object size, default object size is 4M in ceph, why choose 4M?
[16:03] * markl (~mark@tpsit.com) Quit (Read error: Connection reset by peer)
[16:04] * tchmnkyz (~jeremy@0001638b.user.oftc.net) has joined #ceph
[16:08] <Gugge-47527> huangjun: objects have the size of their content
[16:08] <Gugge-47527> rbd uses 4MB objects by default
[16:09] <Gugge-47527> I use 32MB for my rbd images though, to cut down on the amount of objects needed for big images
[16:09] <huangjun> so if our application mainly small io, can i adjust the object to small size,like 1MB?
[16:09] <Gugge-47527> dont worry about it :)
[16:10] <huangjun> can i set the object size for a pool?
[16:10] <Gugge-47527> just use default, unless you have a problem with it
[16:10] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) has joined #ceph
[16:10] <Gugge-47527> huangjun: a pool can contain objects of all sizes
[16:11] <Gugge-47527> when you create an rbd you choose the object size for that rbd, default is 4MB
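To make the trade-off Gugge-47527 mentions concrete, here is a minimal arithmetic sketch. It assumes an image is simply cut into consecutive fixed-size objects (no other striping parameters); the helper names `object_count` and `locate` are hypothetical and are not part of any Ceph API.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def object_count(image_size, object_size=4 * MiB):
    """Number of objects needed if the whole image were written."""
    return -(-image_size // object_size)  # integer ceiling division


def locate(offset, object_size=4 * MiB):
    """Map a byte offset in the image to (object index, offset in object)."""
    return divmod(offset, object_size)


# A 100 GiB image with 4 MiB objects versus 32 MiB objects.
print(object_count(100 * GiB), object_count(100 * GiB, 32 * MiB))
print(locate(10 * MiB))
```

Larger objects mean fewer objects per image, at the cost of coarser granularity for each object.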
[16:12] <huangjun> what about kernel client, from ceph doc it's supposed to be 4MB, and can not dynamically set the object size
[16:12] <Gugge-47527> what kernel client?
[16:12] <Gugge-47527> the rbd kernel client?
[16:12] <huangjun> fs
[16:12] <Gugge-47527> i have no idea, i dont use cephfs
[16:13] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[16:13] <Gugge-47527> but if the docs say "use 4MB" i would use 4MB :P
[16:14] * grepory (~Adium@c-69-181-42-170.hsd1.ca.comcast.net) Quit ()
[16:14] <huangjun> i really want to know why is 4MB, but not other size,
[16:17] <Gugge-47527> i assume its to get better performance on bigger files
[16:17] <Gugge-47527> and that is why it is split over multiple objects
[16:17] <Gugge-47527> but its just a guess :)
[16:19] * zackc (~zack@65-36-76-12.dyn.grandenetworks.net) has joined #ceph
[16:19] * zackc is now known as Guest926
[16:23] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[16:26] * leseb2 (~Adium@2a04:2500:0:d00:54ce:9aec:3c85:ff3c) has joined #ceph
[16:27] * bergerx_ (~bekir@ Quit (Quit: Leaving.)
[16:27] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[16:30] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[16:33] * leseb1 (~Adium@2a04:2500:0:d00:851c:5528:69c2:1e4f) Quit (Ping timeout: 480 seconds)
[16:37] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[16:50] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[16:50] * lupine (~lupine@lupine.me.uk) Quit (Read error: Connection reset by peer)
[16:55] * dosaboy_ is now known as dosaboy
[16:59] * madkiss (~madkiss@chello080108036100.31.11.vie.surfer.at) has joined #ceph
[17:00] <huangjun> is the cephfs tool dedicated to the ceph kernel client?
[17:01] <huangjun> i use ceph-fuse and use "cephfs /mnt/1.big map",it shows "error getting layout: function not implemented"
[17:04] * sagelap (~sage@2600:1012:b003:36eb:2516:2daf:20cf:33ed) has joined #ceph
[17:05] * sprachgenerator (~sprachgen@ has joined #ceph
[17:07] * markit (~marco@ has joined #ceph
[17:10] * sagelap1 (~sage@ Quit (Ping timeout: 480 seconds)
[17:12] * diegows (~diegows@ Quit (Ping timeout: 480 seconds)
[17:13] * sagelap1 (~sage@2600:1012:b02b:5cfc:582f:e4c0:48d1:630f) has joined #ceph
[17:14] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[17:16] * devoid (~devoid@ has joined #ceph
[17:17] * sagelap (~sage@2600:1012:b003:36eb:2516:2daf:20cf:33ed) Quit (Ping timeout: 480 seconds)
[17:17] * devoid (~devoid@ has left #ceph
[17:18] <Azrael> sagelap1: hiya
[17:19] * jskinner (~jskinner@ has joined #ceph
[17:21] <sagelap1> azrael: hi
[17:21] <sagelap1> looks like joao identified the issue
[17:21] * lupine (~lupine@lupine.me.uk) has joined #ceph
[17:23] * sagelap1 is now known as sagelap
[17:25] <Azrael> sagelap: ok
[17:25] <Azrael> sagelap: we were waiting for a sign-off from you
[17:26] <Azrael> sagelap: will you pull the patch into cuttlefish?
[17:26] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[17:27] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) has joined #ceph
[17:27] * ChanServ sets mode +o scuttlemonkey
[17:30] <huangjun> mount ceph-fuse on /mnt and write files, then use "cephfs /mnt/mid.1 map" get the "error getting layout:function not implemented", does ceph-fuse not support the ioctl()?
[17:43] * jlogan (~Thunderbi@2600:c00:3010:1:1::40) has joined #ceph
[17:43] * leseb2 (~Adium@2a04:2500:0:d00:54ce:9aec:3c85:ff3c) Quit (Quit: Leaving.)
[17:43] * nagyz (~zoltan@2001:620:20:1001:1586:3791:51e4:de7f) Quit (Quit: Leaving)
[17:43] * sleinen (~Adium@user-23-11.vpn.switch.ch) Quit (Quit: Leaving.)
[17:43] * sleinen (~Adium@ has joined #ceph
[17:44] * huangjun (~kvirc@ Quit (Read error: Connection reset by peer)
[17:44] * sagelap (~sage@2600:1012:b02b:5cfc:582f:e4c0:48d1:630f) Quit (Read error: No route to host)
[17:45] * joshd1 (~jdurgin@2602:306:c5db:310:a812:d036:4e52:c0d2) has joined #ceph
[17:46] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[17:48] * scuttlemonkey (~scuttlemo@75-150-32-73-Oregon.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[17:51] * sleinen (~Adium@ Quit (Ping timeout: 480 seconds)
[17:54] <lautriv> if i have a mon and gathered keys, should the next ceph-deploy mon create not provision that keys to the next mon ?
[18:01] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[18:02] * sagelap (~sage@2600:1012:b02b:5cfc:2516:2daf:20cf:33ed) has joined #ceph
[18:13] <jerrad> ok, does the ceph FS node also use the REST api, since it is built on top of the OSD?
[18:31] * mxmln_ (~mxmln@ Quit (Quit: mxmln_)
[18:32] <joelio> lautriv: what kind of keys, remember ceph authx isn't just about one key, there is a hierarchy. ceph-deploy *will not* add admin keys on the mon step, that's the ceph-deploy admin command's job
[18:32] * markit (~marco@ Quit (Quit: Konversation terminated!)
[18:34] * sleinen (~Adium@2001:620:0:25:a4ae:9e3a:33e2:57d5) has joined #ceph
[18:34] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[18:35] <lautriv> joelio, the monitor keys from "new hostname" to "mon create hostname" after gatherkeys
[18:37] * sagelap1 (~sage@ has joined #ceph
[18:38] <joelio> lautriv: yes, it'll create the keys, if you've not added a mon using the 'new' command then you will need to add a relevant 'public network' statement to the ceph.conf (as mentioned in docs)
[18:39] <joelio> do you have an issue?
[18:39] <lautriv> joelio, i do a "new" for the first mon and then gatherkeys which appear in the local dir of the admin-node but if i do a mon create (second node) it doesn't transfer those keys nor does it add the node in config
[18:40] <joelio> it won't add the mon to the config, have you read the docs?
[18:41] <joelio> gatherkeys is just for admin usage iirc.. what is the actual issue you have with authentication on that additional mon?
[18:41] <lautriv> joelio, i do like described here -> http://ceph.com/docs/master/start/quick-ceph-deploy/
[18:42] <joelio> http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/
[18:42] <joelio> Note: When adding a monitor on a host that was not in hosts initially defined with the ceph-deploy new command, a public network statement needs to be added to the ceph.conf file.
[18:42] <joelio> lautriv: I don't see anywhere in that quick start where you add a mon after the initial creation, am I missing something?
[18:44] <joelio> afaik too there is no keyring for mons.. you tell the osds that they can see mons, but not the other way around?
[18:44] <joelio> hence why you probably don't see keys
[18:44] <lautriv> joelio, right after install ceph " add a monitor" and no word about "screw on config"
[18:44] <joelio> again, what's the issue
[18:44] * sagelap (~sage@2600:1012:b02b:5cfc:2516:2daf:20cf:33ed) Quit (Ping timeout: 480 seconds)
[18:44] <joelio> lautriv: that's a quick start demo guide
[18:44] <joelio> not the full install docs
[18:45] <joelio> lautriv: there is no mention of adding additional mons after the initial creation, as I said. You're doing things outside of the quick start and then complaining that the quick start guide isn't right
[18:46] <lautriv> joelio, i have no idea where you read but my page shows exactly this.
[18:47] <joelio> no, seriously.. I see 'Add a monitor' note the singular - not a plural
[18:47] <joelio> there are no subsequent 'Add another mon' in that quick start guide.. why not take a look at the full documentation instead
[18:48] <joelio> You may also note that 'Add Ceph OSD Daemons' is plural..
[18:49] <joelio> so, my point again.. you're doing things for a single node quick start install with *1* mon - if you want more, please read the full docs
[18:49] * erice (~erice@c-98-245-48-79.hsd1.co.comcast.net) has joined #ceph
[18:49] <lautriv> joelio, i started reading at "Getting started" like it is meant.
[18:49] <joelio> not just the quick start
[18:49] <joelio> http://ceph.com/docs/master/install/
[18:50] <joelio> http://ceph.com/docs/master/rados/deployment/
[18:50] <joelio> read those
[18:51] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[18:51] * sjusthm (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) has joined #ceph
[19:01] * chamings (~jchaming@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[19:05] * madkiss (~madkiss@chello080108036100.31.11.vie.surfer.at) Quit (Quit: Leaving.)
[19:07] * mschiff (~mschiff@p4FD7FEF6.dip0.t-ipconnect.de) Quit (Remote host closed the connection)
[19:10] <chamings> have a kvm guest with an rbd mounted, ext3 filesystem. kjournald is throwing a 'hung for more than 120s' error when writing more than a relatively small amount to the volume. cluster looks healthy and is not particularly under load. any thoughts?
[19:12] <joelio> chamings: do you have rbd caching enabled, what version of kernel in the vm?
[19:12] <joelio> i.e. are you flushing writes properly
[19:13] <jmlowe1> which version of qemu are you using?
[19:14] * yehudasa__ (~yehudasa@2602:306:330b:1410:ed9b:1527:92bf:3254) Quit (Ping timeout: 480 seconds)
[19:16] <chamings> rbd write caching is NOT enabled currently, although that's something worth trying. kernel inside the VM is 3.2.0-48. qemu-kvm 1.4.2
[19:16] <joelio> ok, looks pretty recent, precise I take it?
[19:17] <joelio> although I'd assume that to be ext4 tbh
[19:17] <chamings> joelio: yes, precise
[19:17] <chamings> the filesystem was created by hand
[19:18] <chamings> i.e. wasn't part of the OS build
[19:18] <jmlowe1> qemu 1.4.2 has patches to allow qemu to resume execution after a flush from inside the vm, so that's good
[19:19] <jmlowe1> which qemu driver, virtio?
[19:19] <chamings> yup
[19:19] <jmlowe1> for my precise vm's I use the raring kernel
[19:19] <jmlowe1> and virtio-scsi
[19:21] <jmlowe1> linux-image-generic-lts-raring
[19:21] <chamings> thx will look into that
[19:21] <jmlowe1> you use libvirt?
[19:21] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:21] <chamings> yes, it's openstack
[19:22] <jmlowe1> I don't know enough about openstack, can you tweak the libvirt xml it generates?
[19:22] <chamings> well
[19:23] <chamings> for testing i can destroy and redefine the VM with hand tweaks
[19:23] <jmlowe1> my relevant xml:
[19:23] <chamings> if anything is a real bonus we can do the work to get it into place from OS
[19:23] <jmlowe1> <disk type='network' device='disk'>
[19:23] <jmlowe1> <driver name='qemu' type='raw' cache='writeback'/>
[19:23] <jmlowe1> <auth username='admin'>
[19:23] <jmlowe1> <secret type='ceph' uuid='3d5a5fe8-af8a-8739-f90f-ac54f21e1711'/>
[19:23] <jmlowe1> </auth>
[19:23] <jmlowe1> <source protocol='rbd' name='rbd/gwbackup1'/>
[19:23] <jmlowe1> <target dev='sda' bus='scsi'/>
[19:23] <jmlowe1> <alias name='scsi0-0-0-0'/>
[19:23] <jmlowe1> <address type='drive' controller='0' bus='0' target='0' unit='0'/>
[19:23] <jmlowe1> </disk>
[19:23] <jmlowe1> <controller type='scsi' index='0' model='virtio-scsi'>
[19:23] <jmlowe1> <alias name='scsi0'/>
[19:23] <jmlowe1> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
[19:23] <jmlowe1> </controller>
[19:24] <chamings> so the virtio-scsi thing, that's new to me
[19:25] <chamings> you're saying the host needs the raring kernel, or the guest?
[19:25] <jmlowe1> discard and rescan scsi bus
[19:25] * grepory1 (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[19:26] * joshd1 (~jdurgin@2602:306:c5db:310:a812:d036:4e52:c0d2) Quit (Quit: Leaving.)
[19:26] <jmlowe1> I don't think the virtio-scsi driver for the guest is in the mainline kernel until after 3.4, so a quantal or raring kernel for the guest is nice
[19:26] <jmlowe1> when in doubt 'find /lib/modules -name virtio_scsi.ko'
[19:27] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Read error: Operation timed out)
[19:27] <chamings> thank you - lots of directions to try out. i'll start with writeback and go from there.
[19:27] <jmlowe1> if fstab uses uuid, then it is a direct drop in replacement virtio-scsi for virtio-blk
[19:27] <jmlowe1> that will get you the most bang for the buck
[19:28] <jmlowe1> writeback that is
[19:29] <chamings> kk
[19:29] <jmlowe1> I found rbd backed vm's unusable without caching
[19:41] * hari (~hari@sealip01.ericsson.net) has joined #ceph
[19:42] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[19:46] * sagelap1 (~sage@ Quit (Ping timeout: 480 seconds)
[19:46] * sagelap (~sage@ has joined #ceph
[19:47] * hari (~hari@sealip01.ericsson.net) has left #ceph
[19:49] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[19:53] * mschiff (~mschiff@ has joined #ceph
[20:03] * Machske (~Bram@d5152D87C.static.telenet.be) has joined #ceph
[20:05] * alram (~alram@ has joined #ceph
[20:06] * fets (~stef@ylaen.iguana.be) has joined #ceph
[20:07] * danieagle (~Daniel@ has joined #ceph
[20:09] * lautriv (~lautriv@f050081157.adsl.alicedsl.de) has left #ceph
[20:09] * odyssey4me (~odyssey4m@ Quit (Ping timeout: 480 seconds)
[20:09] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[20:09] <chamings> jmlowe1: initial stabs look like writeback is a big win
[20:10] * yehudasa__ (~yehudasa@2607:f298:a:607:ea03:9aff:fe98:e8ff) has joined #ceph
[20:13] <chamings> so setting that trades some reliability away
[20:14] <chamings> tell me this: if the guest is killed in this circumstance, would the host continue destaging uncommitted writes?
[20:23] * devoid (~devoid@ has joined #ceph
[20:24] * devoid (~devoid@ has left #ceph
[20:32] <jmlowe1> chamings: joshd is the guy who knows the qemu stuff best
[20:33] <jmlowe1> chamings: I wouldn't want to comment on something that important
[20:36] <joshd> chamings: no the host won't continue doing the writes - they're handled by librbd as part of the qemu process. it's just like a well-behaved hard disk cache - it respects flushes from the guest, so filesystems won't get corrupted any more than with a usual power failure if the guest is killed
[20:38] <jmlowe1> under what circumstances would qemu issue flushes?
[20:40] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[20:43] * ScOut3R (~ScOut3R@54013158.dsl.pool.telekom.hu) has joined #ceph
[20:43] <joshd> whenever the guest asks for them (from the fs issuing them for its own correctness)
[20:43] <joshd> or e.g. from a user-initiated fsync
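The behaviour joshd describes (writes held in the cache, made durable on a guest flush, lost if the process holding the cache dies) can be sketched with a toy model. This is not librbd's implementation; the `WritebackCache` class and its methods are hypothetical names used only to show the semantics.

```python
class WritebackCache:
    """Toy writeback cache (hypothetical, for illustration only)."""

    def __init__(self):
        self.backing = {}  # durable store, stands in for the cluster
        self.dirty = {}    # acknowledged writes that are not yet durable

    def write(self, block, data):
        # Acknowledged immediately; only held in the cache for now.
        self.dirty[block] = data

    def read(self, block):
        if block in self.dirty:
            return self.dirty[block]
        return self.backing.get(block)

    def flush(self):
        # What happens when the guest issues a flush (e.g. after an fsync).
        self.backing.update(self.dirty)
        self.dirty.clear()

    def crash(self):
        # The process holding the cache is killed: unflushed writes vanish.
        self.dirty.clear()


cache = WritebackCache()
cache.write(0, b"journal commit")
cache.flush()                  # guest asked for durability
cache.write(1, b"later data")  # never flushed
cache.crash()
print(cache.read(0), cache.read(1))
```

Anything written before the last flush survives the crash, which is the same guarantee a filesystem gets from a well-behaved disk cache on power failure.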
[20:44] <joelio> chamings: you should be using virtio native, not scsi if you can (not scanned up the log completely, so may have missed something)
[20:44] <joelio> ie. get vda devices
[20:44] * waxzce (~waxzce@2a01:e34:ee97:c5c0:159d:634a:2220:6934) Quit (Remote host closed the connection)
[20:45] <joshd> qemu will also flush the cache so live migration works
[20:47] * joelio reads and realised it's already been recommended :)
[20:47] * madkiss (~madkiss@2001:6f8:12c3:f00f:949b:5194:34a1:f781) has joined #ceph
[20:48] * mxmln_ (~mxmln@ has joined #ceph
[20:52] <joelio> jmlowe1: just wondering, does virtio-scsi give you more than native blk?
[20:53] <joelio> or is it just more generically compatible with userland stuff?
[20:56] <paravoid> virtio-scsi is new and supposedly better
[20:56] <paravoid> native virtio is presented as one PCI device per disk
[20:56] <paravoid> which doesn't scale much
[20:57] <paravoid> scsi is also a nice abstraction layer that gets you some features for free
[20:57] * madkiss (~madkiss@2001:6f8:12c3:f00f:949b:5194:34a1:f781) Quit (Ping timeout: 480 seconds)
[21:00] * ScOut3R (~ScOut3R@54013158.dsl.pool.telekom.hu) Quit (Ping timeout: 480 seconds)
[21:00] * madkiss (~madkiss@2001:6f8:12c3:f00f:390c:5c2b:7291:22c4) has joined #ceph
[21:01] * al (d@niel.cx) Quit (Ping timeout: 480 seconds)
[21:02] <jmlowe1> hotplug disks and discard/trim
[21:02] <jmlowe1> more specifically you can rescan the scsi bus for changes
[21:02] <jmlowe1> you can also set the timeout on the block device
[21:05] <janos> is the best way to force less usage onto an OSD that seems to be getting too much love - to lower the crush weight?
[21:06] * al (d@niel.cx) has joined #ceph
[21:09] <Gugge-47527> janos: add more hardware :)
[21:10] <dmick> janos: that's the intent of the crush weights, yeah
[21:10] <dmick> at least if you mean "love" == "amount of data"
[21:11] <janos> haha yes
[21:11] * ishkbabob (~c7a82cc0@webuser.thegrebs.com) has joined #ceph
[21:11] <janos> attention. hot steamy data, etc
[21:12] <dmick> data is a many-splintered thing
[21:13] <janos> reweight in process
[21:14] <janos> one tripped 85% and got warnings about getting full
[21:15] <janos> that host has 3 osd's in it. 65%, 75%, and 85% full osd's on it
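The effect of lowering a weight can be shown with a toy weighted placement function. This is explicitly not the CRUSH algorithm: it is a simple weighted hashing scheme, and `pick_osd`, `distribution`, and the OSD names are assumptions chosen for illustration. It only demonstrates the general idea dmick confirms, that an item with a lower weight is chosen for a proportionally smaller share of the data.

```python
import hashlib
import math
from collections import Counter


def pick_osd(obj_name, weights):
    """Deterministically pick one OSD for an object, biased by weight."""
    best, best_score = None, -math.inf
    for osd, weight in weights.items():
        if weight <= 0:
            continue  # a zero weight receives no data
        digest = hashlib.sha256(f"{obj_name}:{osd}".encode()).digest()
        # Map the hash to a pseudo-random draw u in (0, 1].
        u = (int.from_bytes(digest[:8], "big") + 1) / (2 ** 64 + 1)
        score = math.log(u) / weight  # higher weight gives a better score
        if score > best_score:
            best, best_score = osd, score
    return best


def distribution(weights, n=20000):
    """Count how many of n synthetic objects land on each OSD."""
    return Counter(pick_osd(f"obj-{i}", weights) for i in range(n))


before = distribution({"osd.0": 1.0, "osd.1": 1.0, "osd.2": 1.0})
after = distribution({"osd.0": 1.0, "osd.1": 1.0, "osd.2": 0.5})
print(before["osd.2"], after["osd.2"])
```

Halving the weight of the hypothetical `osd.2` moves part of its share to the other two, which is the intent of reweighting an over-full OSD.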
[21:15] <ishkbabob> hi Ceph devs, i was wondering how I might get involved in better RPM packaging for Fedora. I ask because I tried to install ceph-deploy (which requires python 2.6) and this isn't a standard part of FC17
[21:16] <alfredodeza> hi ishkbabob
[21:17] <alfredodeza> yes, ceph-deploy needs at the very least Python 2.6
[21:17] <alfredodeza> is there a way to have Python 2.6 for FC17 ?
[21:17] <ishkbabob> sure, but FC17 comes with Python 2.7
[21:17] <alfredodeza> oh ceph-deploy is compatible with Python 2.7
[21:17] <ishkbabob> i figured it probably was, it's just a dependency problem
[21:17] <ishkbabob> if i could get the spec file, I could fix it
[21:18] <alfredodeza> ah, but of course
[21:18] <alfredodeza> ishkbabob: this is the repo: https://github.com/ceph/ceph-deploy/
[21:18] <ishkbabob> ahhhh, THERE it is
[21:18] <ishkbabob> cool
[21:18] <alfredodeza> this is the rpm script: https://github.com/ceph/ceph-deploy/blob/master/scripts/build-rpm.sh
[21:19] <ishkbabob> very cool, thank you!
[21:20] <alfredodeza> no problem
[21:21] <alfredodeza> glowell: do you know why the RPM would require Python 2.6 ?
[21:22] <ishkbabob> *just guessing here* but you usually have to specify a major Python version for FC, might just be what was stable when the project started
[21:22] <glowell> If it's not coded in the spec file, then it was added as part of the build, probably because 2.6 was the default python on the build platform.
[21:23] <ishkbabob> yeah, i bet you are building FC rpms on an SL/Cent machine
[21:24] <ishkbabob> yeah, the spec file has
[21:24] <ishkbabob> BuildRequires: python-devel
[21:26] <ishkbabob> or rather
[21:26] <ishkbabob> BuildRequires: python >= %{pyver} Requires: python >= %{pyver}
[21:28] <alfredodeza> maybe we should be more specific about it and say Python >= 2.6
[21:28] <alfredodeza> ?
[21:37] * markl (~mark@tpsit.com) has joined #ceph
[21:41] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[21:44] <ishkbabob> well i was thinking something along the lines of if %0!$(uname -a | grep fc), then Requires: python >= 2.7
[21:44] <ishkbabob> or something
[21:45] <ishkbabob> kinda forgot how to write spec file conditionals, but it will come to me
[21:47] <alfredodeza> but why make it specific to Fedora if we are OK with 2.6 or 2.7 ?
[21:48] <ishkbabob> Python 2.6 doesn't exist in the Fedora repos (for FC17 onward I believe)
[21:48] <ishkbabob> and I don't think that SL/Cent/Redhat have Python 2.7 in the repos (not sure though)
[21:52] <ishkbabob> yeah, just checked, SL/Cent/Redhat current version is Python 2.6.6
[21:52] <ishkbabob> i checked on an SL6 box
[21:58] * Vjarjadian (~IceChat77@ has joined #ceph
[22:01] <ishkbabob> alfredodeza does it make sense? also I clearly would want to use this conditional [%if 0%{?fc#}]
[22:01] <alfredodeza> I understand
[22:03] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) has joined #ceph
[22:03] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[22:03] <ishkbabob> ok cool, i'll review the rules for committing to your git repo. Should I submit a bug ticket?
[22:16] <alfredodeza> a pull request I think is good
[22:20] <dmick> but if we say >= 2.6 we cover all the bases on all the distros, right?...
[22:20] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:20] * yehudasa (~yehudasa@2607:f298:a:607:e1d9:eaeb:4726:2fb6) Quit (Ping timeout: 480 seconds)
[22:21] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) Quit (Ping timeout: 480 seconds)
[22:21] * joshd (~joshd@2607:f298:a:607:e4ca:af6f:1bf7:6830) Quit (Ping timeout: 480 seconds)
[22:21] * sagewk (~sage@2607:f298:a:607:219:b9ff:fe40:55fe) Quit (Ping timeout: 480 seconds)
[22:27] <ishkbabob> dmick, you don't cover all bases because Fedora doesn't come with python 2.6
[22:28] <dmick> (12:17:29 PM) ishkbabob: sure, but FC17 comes with Python 2.7
[22:28] * jskinner (~jskinner@ Quit (Remote host closed the connection)
[22:28] <dmick> 2.7 >= 2.6, yah?
[22:28] <ishkbabob> oh now i see what you're getting at
[22:29] <ishkbabob> i thought that the major version was being specified in the actual NAME of the package
[22:29] <ishkbabob> like
[22:29] <ishkbabob> package = python-2.6 or some such
[22:29] <ishkbabob> but indeed it is not
[22:29] <ishkbabob> so yes, you are correct i believe
[22:29] <dmick> surely there must be a way to use version inequalities in spec; there certainly is in deb. ah, good
[22:29] <ishkbabob> thanks :)
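An illustrative sketch (not part of the original conversation) of dmick's point: a single `>= 2.6` requirement is satisfied both by the 2.6.6 found on the SL6 box and by the 2.7 that FC17 ships. The helper names are hypothetical, and the plain numeric comparison below is a simplification for dotted versions only; it is not RPM's own version comparison algorithm.

```python
# Sketch only: compare dotted numeric versions as integer tuples.

def parse_version(text):
    """Turn '2.6.6' into (2, 6, 6)."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum="2.6"):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Versions quoted in the log: SL6 has 2.6.6, FC17 has 2.7.
    for distro, version in (("SL6", "2.6.6"), ("FC17", "2.7")):
        print(distro, version, satisfies_minimum(version))
```

Because the requirement is expressed as an inequality on the version rather than baked into the package name, no Fedora-specific conditional is needed in the spec file.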
[22:29] <ishkbabob> i wish i was using debian/ubuntu :(
[22:30] <ishkbabob> but alas, my company's chosen distro is Fedora
[22:31] <dmick> you're in a large set. I'm so happy not to be there
[22:33] <ishkbabob> i honestly don't care about most of the differences, but best thing about debian/ubuntu to me is the community package management
[22:33] * joshd (~joshd@ has joined #ceph
[22:34] <dmick> we're constantly forced to avoid features because they're not there yet on RH
[22:34] <janos> fedora keeps up though
[22:34] <dmick> configuration management is hard.
[22:35] * yehudasa (~yehudasa@ has joined #ceph
[22:37] * sagewk (~sage@ has joined #ceph
[22:37] <sagewk> yehudasa: can you look at wip-5742?
[22:39] * sjust (~sam@ has joined #ceph
[22:47] * markbby (~Adium@ Quit (Quit: Leaving.)
[22:51] <yehudasa> sagewk: sure
[22:52] <sagewk> tnx
[22:53] <paravoid> v0.67-rc1 -> v0.67-rc1
[22:53] <paravoid> \o/
[22:53] <yehudasa> sagewk: looks food
[22:53] <yehudasa> good
[22:54] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) has joined #ceph
[22:54] <paravoid> someone's hungry
[22:54] <paravoid> :-)
[22:54] <yehudasa> heh.. not really
[22:54] <yehudasa> just had lunch
[22:55] <sagewk> paravoid: it has that mon issue, so we're doing an -rc2 shortly and not announcing it. :/
[22:55] <sagewk> not announcing -rc1 that is
[22:55] <sagewk> yehudasa: thanks
[22:55] <paravoid> I was looking at git logs trying to check if that's the case
[22:55] <sagewk> latest next does not, so you can also just use that ;)
[22:56] <paravoid> I think I'll wait for a release for a change :-)
[22:58] * ishkbabob (~c7a82cc0@webuser.thegrebs.com) Quit (Quit: TheGrebs.com CGI:IRC (Ping timeout))
[22:59] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[23:08] * sleinen (~Adium@2001:620:0:25:a4ae:9e3a:33e2:57d5) Quit (Quit: Leaving.)
[23:09] * jjgalvez1 (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[23:15] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[23:18] <sagewk> dmick: wip-ceph-disk
[23:19] <sagewk> loicd: around?
[23:20] * mxmln_ (~mxmln@ Quit (Quit: mxmln_)
[23:20] <loicd> sagewk: yes.
[23:21] <sagewk> yehudasa too:
[23:21] * yanzheng (~zhyan@jfdmzpr02-ext.jf.intel.com) Quit (Ping timeout: 480 seconds)
[23:21] <sagewk> thinking it would be good to have a session for cds on s3/swift api gap. where are we at, what are priorities, what are easy vs hard things to work on
[23:22] <loicd> +1
[23:22] <sagewk> any interest in putting together a discussion/blueprint?
[23:22] <sagewk> :)
[23:23] <loicd> absolutely
[23:23] <loicd> ccourtaut: are you around ?
[23:24] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[23:25] <loicd> sagewk: I'd like to propose compiling a list of features that S3 has and radosgw is missing. Each of them could link to a ticket in http://tracker.ceph.com/ so that a volunteer can pick it up and work on it
[23:25] <lxo> topic is outdated; 0.61.6 is out; can't fix it myself
[23:25] <sagewk> sounds awesome
[23:28] <loicd> regarding swift I'm not so sure it would be a good idea because I don't know where yehudasa stands regarding https://github.com/zaitcev/swift-lfs ( i.e. the effort to create a swift API that is detached from the swift implementation )
[23:29] <_robbat2|irssi> loicd: please put IAM policies on that list!
[23:29] * loicd remembers that ccourtaut is probably enjoying dinner in Paris at this moment and won't be very responsive
[23:32] * sprachgenerator (~sprachgen@ Quit (Quit: sprachgenerator)
[23:32] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) has joined #ceph
[23:33] <sagewk> loicd: how much traction do you think the swift-lfs effort has?
[23:35] * lxo wishes there were prestodelta drpms in the ceph yum repos
[23:35] <loicd> I'm not sure. I'll have a better idea in Hong Kong in November but that's far from now ;-) It makes sense but it highly depends on the actual workforce people who are not swift core developers are willing to put into it.
[23:36] <loicd> sagewk: ^
[23:37] * scuttlemonkey (~scuttlemo@ has joined #ceph
[23:37] * ChanServ sets mode +o scuttlemonkey
[23:37] <yehudasa> sagewk, loicd: looking at swift-lfs now. From my early experience of trying to implement a single api for two substantially different backends (at the time it was rados vs posix fs), it's not an easy task
[23:38] <yehudasa> you end up going to the lowest common denominator
[23:38] <lxo> does anyone know whether the fc18 rpms work on fc19? I was thinking of upgrading, and it would be nice to not have to go back to rolling out my own rpms ;-)
[23:38] <loicd> yehudasa: and if swift developers are unwilling to cooperate with this initiative, it may prove very difficult.
[23:38] <paravoid> http://www.swiftstack.com/blog/2013/04/24/openstack-summit-api-discussion/
[23:38] <paravoid> that's John, the Swift PTL
[23:39] <loicd> I will try to find chmouel right now ( he is at OSCON ) and get his advice.
[23:39] <paravoid> (project tech lead)
[23:39] <dmick> sagewk: looking at wip-ceph-disk now
[23:39] <joshd> yehudasa: sagewk: in april they were talking about a higher layer where something like librgw could plug in at the proxy server level, that's different from the lfs effort. did that change?
[23:40] <dmick> sagewk: just the top commit?
[23:40] <yehudasa> joshd: you mean at the cds?
[23:40] <joshd> yehudasa: at the openstack design summit
[23:41] <yehudasa> I'm not aware of anything new that happened there.
[23:42] <joshd> I just wouldn't expect us to plug in at the lfs level at all
[23:42] <yehudasa> the thing is that you can't really assume anything about the backend capabilities. Either you abstract the top layer, or you end up with something that doesn't work with most other backends
[23:42] <yehudasa> joshd, right
[23:42] <loicd> joshd: LFS is the keyword I remember from the discussions we had at the summit but I don't know what happened since then.
[23:43] <sagewk> dmick: yeah
[23:43] <yehudasa> e.g., achieving atomicity with rgw is being done in one way, but I don't expect swift / glusterfs / any posix filesystem to do it like that
[23:43] <sjustlaptop> sagewk: wip-5677 is ready for review (also wip-5723)
[23:44] <yehudasa> so providing an abstraction into our object write function doesn't make sense
[23:44] <yehudasa> I assume it's true for the other way around also
[23:45] * scuttlemonkey (~scuttlemo@ Quit (Ping timeout: 480 seconds)
[23:45] <joshd> yehudasa: yes I agree
[23:47] <loicd> yehudasa: if you think the effort to adapt ceph to be the backend of an independent SWIFT API is more important than chasing the evolution of the swift API in radosgw, then it would make sense to compile a description of the swift API and check what radosgw is missing. Don't you think ?
[23:49] * leseb (~Adium@pha75-6-82-226-32-84.fbx.proxad.net) Quit (Quit: Leaving.)
[23:49] <loicd> basically adding to http://ceph.com/docs/next/radosgw/swift/ ?
[23:52] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:52] * diegows (~diegows@ has joined #ceph
[23:52] <yehudasa> loicd: documenting the differences is certainly a good idea. The problem I'm having is with creating artificial abstractions based on swift itself
[23:53] <yehudasa> looking at lfs, I don't think it makes sense with regard to rados
[23:54] <loicd> yehudasa: I understand how difficult it can be ;-) It may work out of the core swift developers are motivated. As paravoid points ( http://www.swiftstack.com/blog/2013/04/24/openstack-summit-api-discussion/ ) some of them are. I'm not sure there is a consensus though. Do you have a hunch joshd ?
[23:54] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[23:57] <loicd> s/of the core/if the core/
[23:57] <joshd> loicd: I spoke to John Dickinson at the summit, and at that point it sounded like they'd accept a higher level plugin (iirc what the proxy server uses) than lfs, but we'd need to write that abstraction layer
[23:57] * markbby (~Adium@ has joined #ceph
[23:57] <joshd> loicd: other swift devs seemed accepting of the idea as well
[23:58] <loicd> joshd: sounds like good news :-) Do you know if there has been any work done in this direction since then ?
[23:59] <joshd> loicd: I don't think so, but I haven't been paying too much attention. John suggested that the lfs stuff be settled before a higher level abstraction layer was introduced as well

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.