#ceph IRC Log

Index

IRC Log for 2013-07-30

Timestamps are in GMT/BST.

[0:03] <mozg> sagewk: can I mix sata and sas drives in the same osd server?
[0:03] <sagewk> y
[0:03] <mozg> would it not cause any issues with slow requests, etc?
[0:04] <mozg> i've got 4 sas disks
[0:04] <mozg> and i want to save some money and purchase additional 4 disks
[0:04] <mozg> was thinking of getting sata disks instead of sas
[0:04] <mozg> i think they are about 30% less
[0:05] <lautriv> mozg, since they work even on the same layer but SAS is faster, your average speed may fall down but i guess your network is slower than those disks ;)
[0:05] <mozg> my network is way faster actually
[0:05] <mozg> i've got infiniband
[0:06] <mozg> ipoib
[0:06] <lautriv> nice
[0:06] <mozg> )))
[0:06] <mozg> thanks
[0:06] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[0:06] <lautriv> mozg, 0.,b
[0:07] <mozg> it's a pain, but very fast ))
[0:07] <lautriv> mozg, i never had the time to investigate in details but isn't IB very limited in length of wires ?
[0:09] <mozg> not really
[0:09] <mozg> it depends
[0:09] <mozg> i think it's about the same as ethernet
[0:09] <mozg> but you can also get fiber cables
[0:09] <mozg> which are more expensive
[0:09] <mozg> but they run for longer range
[0:10] * jluis (~JL@89.181.148.68) Quit (Ping timeout: 480 seconds)
[0:11] <Kioob> Hi
[0:12] <mozg> hi
[0:12] <Kioob> when a Xen VM crashes, using the RBD kernel client, I see data loss : files from the last 2 days are now empty
[0:12] <Kioob> I use RBD snapshots, and files are also empty on snapshots
[0:12] <lautriv> mozg, i do myrinet but play with the idea to replace it since all patches above kernel 2.6.38 are from me and they won't GPL it so i refused to send it to myricom.
[0:12] <Kioob> but backups done externally are fine
[0:13] <mozg> hehe
[0:13] <mozg> not had much experience with myrinet
[0:13] <Kioob> So I suppose a problem in ext4, xen, or rbd...
[0:13] <Kioob> How can I track that ?
[0:13] <Kioob> (OSD are in 0.61.4 version)
[0:14] <darkfaded> Kioob: i know one thing for the ext4 level
[0:14] <darkfaded> add block_validity to your mount options, hoping i spelled it correctly
[0:14] <lautriv> mozg, they perform best without IP and have very low response-time since Myri was developed for clustering, i have one line with 30m in between.
[0:14] <darkfaded> that could, with some luck, detect an issue
[0:15] <mozg> wow
[0:15] <mozg> nice
[0:15] <darkfaded> but from my general point of view ext4 hardly ever notices if it becomes corrupted
[0:15] <mozg> infiniband works better without ip as well
[0:15] <mozg> but unfortunately, ceph is not rdma friendly
[0:15] <mozg> so i have to use ipoib
[0:15] <mozg> which limits the speed and latency somewhat
[0:15] <Kioob> ok darkfaded, thanks
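
A minimal sketch of darkfaded's block_validity suggestion, assuming ext4 on the /dev/xvdc -> /home mount that appears later in this log; the option makes ext4 sanity-check metadata block numbers, so some corruptions can be caught at read time instead of going unnoticed:

    mount -o remount,block_validity /home
    # or persistently, via the fstab options column:
    # /dev/xvdc  /home  ext4  defaults,noatime,block_validity  0 2
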
[0:16] <lautriv> mozg, same issue here also needed to tell my nameservers to use views to not mix private and public nets for ceph.
[0:17] <mozg> i have a basic setup and i do not separate private/public
[0:17] <mozg> as my ceph is on ipoib and it's only used for storage + kvm
[0:17] <lautriv> mozg, still testing here, actually not even a ceph-cluster up ;)
[0:18] <mozg> ah
[0:18] <mozg> i c
[0:18] <mozg> i've got it up and running
[0:18] <mozg> at the moment testing ceph with ubuntu and centos
[0:18] <mozg> centos servers seem to crash and restart under load
[0:18] <lautriv> mozg, had it up once and got 322MB/s but some issues with preparation on certain disks.
[0:18] <mozg> so, trying to use ubuntu instead
[0:19] <mozg> nice!
[0:19] <lautriv> i won't suggest ubu over cent
[0:19] <mozg> i can't get writes above 200mb/s as it seems the limit on the journal side
[0:19] <mozg> but reads are very nice )))
[0:19] <lautriv> cent is not the bleeding edge but a crapload more stable
[0:19] <mozg> i am getting 1.2-1.4GB/s
[0:20] <mozg> well, it constantly crashes
[0:20] <mozg> i've got two servers
[0:20] <mozg> which fall over
[0:20] <mozg> about once every day
[0:20] <lautriv> kernel ?
[0:20] <mozg> seems so
[0:20] <mozg> panics
[0:20] <lautriv> which one ?
[0:20] <mozg> the other host just freezes
[0:20] <mozg> and nothing in the logs
[0:21] <mozg> i've got 6.3 and 6.4 centos
[0:21] <mozg> which run kvm
[0:21] <mozg> and vms are doing performance testing
[0:21] <mozg> using phoronix test suite
[0:21] <lautriv> mozg, i observed the very same on debian running kernels 3.9+ where the freeze happens only on high network-traffic, even via NFS
[0:21] <mozg> pts/disk benchmarks
[0:22] <mozg> the crashes are very specific to ceph
[0:22] <mozg> nfs tests are not crashing the server
[0:22] <mozg> with the same benchmarks
[0:22] <lautriv> mozg, actually that behaviour led me to ceph, i had the opinion NFS was the failing part.
[0:22] <mozg> and the strange thing is that they do not crash consistently
[0:22] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Quit: ...)
[0:23] <mozg> i mean
[0:23] <Kioob> darkfaded : note that fsck doesn't see any problem :(
[0:23] <mozg> sometimes the tests complete without crashing the server
[0:23] <lautriv> mozg, do you have some box around kernel 3.3 up ?
[0:23] <mozg> but i've not seen an uptime of more than 2 days
[0:23] <mozg> i've got ubuntu with 3.5
[0:24] <mozg> and centos with the 2.6 branch
[0:24] <lautriv> mozg, just to be sure, might be worth a try to put some ceph on the cent 2.6 and see what happens
[0:25] <darkfaded> Kioob: i try to hold back and not rant. but unless you figure out another, any, whatsoever valid reason for a journaled filesystem that was mounted and written to not to see a problem now while you know data is amiss... well :)
[0:25] <mozg> i've got the following setup: ceph servers - all running ubuntu 12.04 3.5 kernel
[0:25] <darkfaded> anyway, block_validity will only affect the future
[0:25] <mozg> ceph rbd clients are on centos
[0:25] <darkfaded> it could then notice a block that wasn't written out correctly etc
[0:25] <mozg> and i've just installed ubuntu to test
[0:25] <mozg> as centos falls over
[0:26] <lautriv> mozg, the clients fail ?
[0:26] <mozg> yeah
[0:26] <mozg> clients fall over and either panic or just freeze
[0:26] <mozg> servers are not crashing
[0:26] <lautriv> mozg, my issue with NFS on recent kernels was freezing servers
[0:26] <mozg> ah
[0:26] <mozg> i see
[0:27] <lautriv> even no word while dying
[0:27] <mozg> have you tried some nfs tuning?
[0:27] <mozg> like increasing number of threads
[0:27] <mozg> and some ip tuning
[0:27] <mozg> ?
[0:27] <lautriv> mozg, since i managed to have diskless clients on NFS-V4 you may assume each option is where it should be ;)
[0:27] <mozg> the only time i've had nfs server crashes is when I was running zfs filesystem
[0:28] <lautriv> and btrfs because it can't handle COW
[0:28] <mozg> ah,
[0:28] <mozg> i am on xfs for now
[0:28] <mozg> i was advised not to use btrfs in production
[0:29] <mozg> on this #
[0:29] <lautriv> xfs is still my flavour
[0:29] <mozg> i've seen btrfs having far better 4k results
[0:30] <mozg> that is my current issue with ceph
[0:30] <mozg> very poor 4k performance (((
[0:30] <mozg> large blocks are very fast
[0:30] <mozg> but small block size ops are so terrible (((
[0:30] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) Quit (Quit: Computer has gone to sleep.)
[0:30] <lautriv> when i tested btrfs it was incredibly fast but buggy, after it went more stable, it also lost speed.
[0:30] <Kioob> yes darkfaded : I was thinking that if it's a Ceph/RBD bug, it's logical that all snapshots are also affected. But if it's a Ceph/RBD corruption, ext4 should see these errors... or at least fsck :S
[0:30] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) has joined #ceph
[0:31] <darkfaded> well interesting how it'll turn out
[0:32] <mozg> kioob: i've seen horrible errors on ext4
[0:32] <mozg> and it was unnoticed
[0:32] <mozg> until i forced fsck
[0:32] <mozg> that shocked me
[0:33] <darkfaded> mozg: in my last storage class i gave the same lun to two participants and had them first carefully export / import the vg's and then had them do it both at the same time and we had a lot of fun with writing to the same files etc
[0:33] <lautriv> i agree on that, ext4 still receives a crapload of patches and is called "stable"; it killed entire partitions on 2.6.31
[0:34] <darkfaded> but tbh i often don't understand its bugs, why/how it doesn't die from this and correctly survives that
[0:36] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) has joined #ceph
[0:36] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) Quit (Quit: Textual IRC Client: www.textualapp.com)
[0:38] <lautriv> darkfaded, finally all ends in numbers where one bit is almighty ;)
[0:42] <davidz> Kioob: When you say empty files, you mean that from the Xen VM in the mounted filesystem you have files that stat as 0 length?
[0:42] <Kioob> yes davidz
[0:44] <davidz> I would be looking first at your filesystem mount options. If fsck doesn't detect lots of corruption to repair that would indicate that rbd never failed to update a directory, inode or journal. That seems strange if the problem were in rbd.
[0:45] <Kioob> /dev/xvdc /home ext4 defaults,noatime 0 2
[0:46] <Kioob> so : /dev/xvdc on /home type ext4 (rw,noatime)
[0:46] <Kioob> nothing very special I suppose
[0:51] <darkfaded> i missed this about 0 byte files. that's what ext does on a journal abort if the writes weren't done and you'd have had the unsafest journal mode
[0:51] <darkfaded> whose name is not in my head right now
[0:51] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[0:51] <darkfaded> now, that should of course only affect the files written since 5 minutes before the crash
[0:51] <Kioob> (writeback ?)
[0:52] <darkfaded> Kioob: i guess
[0:52] <Kioob> I lose between 1 and 3 days of files
[0:52] <darkfaded> you can do a tune2fs -l to see default / builtin options
[0:52] <Kioob> not 5 minutes ;)
[0:52] <darkfaded> i know :)
[0:53] * lautriv (~lautriv@f050082152.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[0:53] <darkfaded> are all lost files around as 0 byte files?
[0:53] <Kioob> yes
[0:53] <darkfaded> cool
[0:53] <darkfaded> i don't understand it
[0:53] <darkfaded> for next time:
[0:54] <darkfaded> http://confluence.adminspace.info/display/Adminspace/Mount+ext4+without+journal+replay
[0:54] <darkfaded> i mean, if you run into the same issue and wanna debug
[0:54] <darkfaded> i'd first make a backup of the broken fs
[0:54] <Kioob> I have a lot of RBD snapshots with the problem
[0:54] <darkfaded> and then peek around w/o journal replay
[0:54] <Kioob> so I can do some debugging
[0:55] <darkfaded> just never do rw,noload
[0:55] <darkfaded> or you lose the journal
[0:55] <Kioob> I try that
[0:55] <darkfaded> but it sounds like a corruption issue to me
[0:55] <darkfaded> not on ext layer
[0:55] <darkfaded> ext is just being dumb about it
[0:56] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Ping timeout: 480 seconds)
[0:56] <Kioob> with « mount -t ext4 -o ro,noload /dev/xvdp /mnt/t2/ » my test file is still empty
[0:57] <Kioob> I retry to flush caches...
[0:57] <davidz> Kioob: It could be that data was written, but inode update didn't journal nor ever make it to disk. I wonder if the mtime/ctime are the same and old as you'd expect from a newly created file that never had inode written again.
[0:58] <darkfaded> it has to be one that wasn't mounted yet, just to make that clear
[0:58] * devoid (~devoid@130.202.135.213) Quit (Quit: Leaving.)
[0:58] * sprachgenerator (~sprachgen@130.202.135.206) Quit (Quit: sprachgenerator)
[1:02] <Kioob> davidz: mtime/ctime/inode are the same on all snapshots yes, and on the valid backup too
[1:03] * lautriv (~lautriv@f050082216.adsl.alicedsl.de) has joined #ceph
[1:03] <Kioob> darkfaded: RBD snapshots are RO, so yes, it wasn't mounted yet
[1:03] * devoid (~devoid@130.202.135.213) has joined #ceph
[1:03] <darkfaded> ah ok
[1:03] <darkfaded> sorry i gotta go sleep ;)
[1:04] <darkfaded> brain's already offline
[1:04] <Kioob> to be able to mount them, I use the «CoW device mapper» feature
[1:04] <lautriv> darkfaded, gn
[1:04] <Kioob> thanks for the help darkfaded ;)
[1:05] <Kioob> So... it could be a RBD corruption, and ext4 isn't able to see it !? Really strange
[1:06] <lautriv> Kioob, real man use XFS anyway :o)
[1:06] <Kioob> ;)
[1:11] <lautriv> sage, since the call from ceph-disk to sgdisk does not even contain any sector or size but "--largest-new" it is rather sure the sgdisk itself is guilty.
[1:17] * jluis (~JL@89-181-148-68.net.novis.pt) has joined #ceph
[1:18] * LeaChim (~LeaChim@0540ae5a.skybroadband.com) Quit (Ping timeout: 480 seconds)
[1:20] <lautriv> sage, finally found the real issue, will change a line and tell if that works :o)
[1:22] * mozg (~andrei@host109-151-35-94.range109-151.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[1:24] <Kioob> and since I use ceph 0.61.4 (and not 0.61.7), could the problem be a wrongly erased rados object ? (and so, affecting all snapshots)
[1:25] <Kioob> I suppose I could look at the mtime of those objects... but it's hard to find which objects correspond to that data :/
[1:26] * devoid (~devoid@130.202.135.213) Quit (Quit: Leaving.)
[1:29] * devoid (~devoid@130.202.135.213) has joined #ceph
[1:29] * devoid (~devoid@130.202.135.213) has left #ceph
[1:31] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[1:31] <lautriv> sage, are you responsible for ceph-disk or know one who are ?
[1:35] <lautriv> anyone around who may be responsible for ceph-disk ? i found an issue with sgdisk and its solution.
[1:36] <joshd> Kioob: snapshots aren't stored in exactly the same place as the original objects (unless you're using btrfs beneath the osds), but looking at the objects representing the ranges for the file (shown by debugfs in the guest) could see if there's anything strange with them
[1:37] <joshd> Kioob: or use blktrace or something to see what metadata blocks are accessed on a freshly mounted system when you access those files
[1:38] * LeaChim (~LeaChim@0540ae5a.skybroadband.com) has joined #ceph
[1:38] <lautriv> sagewk, bah always highlighted you second nick :o) ....... are you around ?
[1:40] <lautriv> late night here, tomorrow then ...
[1:40] <Tamil> lautriv: please file a bug under "devops" project
[1:47] <Kioob> great, thanks joshd !
[1:53] * aliguori_ (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[1:57] <Kioob> so, if debugfs says that the inode is part of block group 1312 located at block 42991900, offset 0x0900, I should look for a rados object, at offset 4096*42991900 ?
[2:02] <joshd> yeah, or possibly 512*42991900
[2:03] <Kioob> are rados object numbers direct offsets, or based on the size of the objects ? (here 64K)
[2:03] <joshd> looking at it directly on disk would be best (use 'ceph osd map pool object' to tell which osds it's on)
[2:04] <joshd> the object names are based on the size of the objects for rbd
[2:04] <Kioob> ok thanks
[2:04] * Henson_D (~kvirc@69.166.23.191) has joined #ceph
[2:05] <Henson_D> does anyone know of any tricks to make a filesystem on a Ceph RBD go faster? I asked someone here earlier about performance issues, and they pointed out that a single IO thread might only get about 20 MB/s read while many could get 160 MB/s (in the context of the rados benchmark).
[2:07] <Henson_D> I thought maybe then striping many RBD devices together into a RAID might help, allowing multiple read and write requests in parallel to increase the performance, or using the stripe unit and width parameters for an XFS filesystem, but my read performance is still only about 40 MB/s at best. Does anyone have any suggestions or tricks for increasing the filesystem parallelism to increase the
[2:08] <Henson_D> performance for mostly single-user access for filesystems on RBD?
[2:08] <Kioob> ./DIR_4/DIR_B/rb.0.1b3422.238e1f29.000000290011__2b88_1CC8B412__3 ← 2b88 here is the snapshot version, right ?
[2:08] * xmltok_ (~xmltok@pool101.bizrate.com) Quit (Quit: Bye!)
[2:09] <Kioob> Henson_D: for sequential reads, you can increase the readahead
[2:10] <Kioob> $ cat /sys/block/xvda/queue/read_ahead_kb
[2:10] <Kioob> 512
[2:10] <Kioob> (instead of the default value of 128)
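
For completeness, a hedged example of raising the readahead Kioob describes; the device name xvda and the value 4096 are only illustrative, and the setting does not survive a reboot:

    cat /sys/block/xvda/queue/read_ahead_kb          # current value, e.g. 128
    echo 4096 > /sys/block/xvda/queue/read_ahead_kb  # needs root
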
[2:12] * LeaChim (~LeaChim@0540ae5a.skybroadband.com) Quit (Ping timeout: 480 seconds)
[2:15] <Henson_D> Kioob: let me give that a try
[2:16] <joshd> Kioob: yeah, that's the snapshot version - you might check xattrs on those files to make sure they exist
[2:16] <Kioob> joshd: "rbd snap ls" doesn't give me the same snapshot versions :(
[2:16] <joshd> Kioob: and a deep scrub didn't show any inconsistency right?
[2:17] <joshd> Kioob: yeah, they're internal unfortunately
[2:17] <Kioob> I have a long standing inconsistency, but on other pools
[2:17] <joshd> sjust might be able to say how the filename snapshot version corresponds
[2:18] <Kioob> (I have inconsistencies in pools 4 and 8, but I can't remove the pools. Here it's the pool 3)
[2:20] <joshd> if the files were text you could run strings on the object files
[2:20] <sjust> 2b88 is the snapshot version
[2:21] <joshd> sjust: is that just hex of the rados-level snapshot version?
[2:21] <sjust> yeah
[2:21] <Kioob> joshd: a jpeg, so I can maybe search for the header but... not sure to be able to use that
[2:21] <Kioob> thanks sjust
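
A rough sketch of locating an object the way joshd describes, using the object name quoted above; the pool name "rbd" is an assumption (the log only says it is pool 3):

    ceph osd map rbd rb.0.1b3422.238e1f29.000000290011    # which PG and OSDs hold it
    rados -p rbd stat rb.0.1b3422.238e1f29.000000290011   # size and mtime as RADOS sees it
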
[2:23] * sagelap (~sage@2600:1012:b02c:f86e:b1dd:2879:bafb:4558) has joined #ceph
[2:24] * huangjun (~kvirc@111.175.165.62) has joined #ceph
[2:26] <Kioob> So, go to bed. I will think about that tomorow. Thanks for the help !
[2:26] <joshd> you're welcome!
[2:31] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[3:00] * axisys (~axisys@ip68-98-189-233.dc.dc.cox.net) Quit (Remote host closed the connection)
[3:02] * dosaboy (~dosaboy@host109-155-13-224.range109-155.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:03] * grepory (~Adium@50-76-55-246-ip-static.hfc.comcastbusiness.net) has joined #ceph
[3:10] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) has joined #ceph
[3:14] * yy-nm (~chatzilla@115.198.96.222) has joined #ceph
[3:15] * rturk is now known as rturk-away
[3:18] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[3:29] * julian (~julianwa@125.70.133.36) has joined #ceph
[3:31] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) has joined #ceph
[3:34] * yanzheng (~zhyan@jfdmzpr03-ext.jf.intel.com) has joined #ceph
[3:36] * AfC (~andrew@2001:44b8:31cb:d400:b874:d094:ec64:aec3) has joined #ceph
[3:40] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Remote host closed the connection)
[3:40] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) Quit (Quit: Leaving.)
[3:40] * zhangjf_zz2 (~zjfhappy@222.128.1.105) has joined #ceph
[3:41] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:46] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[3:53] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[3:57] * jluis (~JL@89-181-148-68.net.novis.pt) Quit (Ping timeout: 480 seconds)
[3:59] <nhm> Henson_D: depends on what things you want to go faster.
[4:00] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) has joined #ceph
[4:00] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Quit: Leaving.)
[4:00] * Cube1 (~Cube@66-87-118-175.pools.spcsdns.net) has joined #ceph
[4:00] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[4:14] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[4:20] * grepory (~Adium@50-76-55-246-ip-static.hfc.comcastbusiness.net) Quit (Quit: Leaving.)
[4:20] <Henson_D> nhm: reading files, mostly. My write speed is limited by having slow journal drives, but my data drives are fast. I want to use my Ceph system for storing lots of files and serving them over NFS, as well as hosting several KVM virtual machines. If I have a ton of files, I'd like reading, copying, etc to be as fast as possible. So I'm trying to figure out how to set it up right
[4:21] <Henson_D> nhm: the second time, because the first time I just put ext4 filesystems on the RBD devices and dumped almost 2 TB of data onto them, only to discover that I got about 6 MB/s read speed.
[4:32] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[4:40] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[4:53] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has left #ceph
[4:58] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[5:18] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) has joined #ceph
[5:26] * BillK (~BillK-OFT@124-169-67-32.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[5:28] * BillK (~BillK-OFT@124-168-243-244.dyn.iinet.net.au) has joined #ceph
[6:01] * markl (~mark@tpsit.com) Quit (Ping timeout: 480 seconds)
[6:02] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[6:30] * gregaf1 (~Adium@cpe-76-174-249-52.socal.res.rr.com) Quit (Quit: Leaving.)
[6:32] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[6:33] * AfC (~andrew@2001:44b8:31cb:d400:b874:d094:ec64:aec3) Quit (Quit: Leaving.)
[6:49] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[7:09] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[7:10] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) has joined #ceph
[7:24] * haomaiwa_ (~haomaiwan@117.79.232.214) Quit (Remote host closed the connection)
[7:25] * haomaiwang (~haomaiwan@li565-182.members.linode.com) has joined #ceph
[7:27] * haomaiwang (~haomaiwan@li565-182.members.linode.com) Quit (Remote host closed the connection)
[7:28] * haomaiwang (~haomaiwan@117.79.232.150) has joined #ceph
[7:32] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[7:33] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Quit: Leaving.)
[7:33] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[7:36] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[7:46] * yy-nm (~chatzilla@115.198.96.222) Quit (Quit: ChatZilla 0.9.90.1 [Firefox 22.0/20130618035212])
[8:10] * Vjarjadian (~IceChat77@90.214.208.5) Quit (Quit: If at first you don't succeed, skydiving is not for you)
[8:15] <cmdrk> hmm, trying to copy some files into CephFS from a host mounting it on kernel 3.7.8 and with a 10Gb connection, only seeing about ~1Gb of performance on the network. seeing a LOT of "slow request [30-120] seconds old.". right now i have about 60 OSDs across 10 servers
[8:15] <cmdrk> i made a little distribution of the complaining OSDs.. http://fpaste.org/28906/51648011/
[8:15] <cmdrk> possibly shitty disks?
[8:19] <cmdrk> osd.34 looks especially bad, some kernel messages on the host
[8:19] <cmdrk> http://fpaste.org/28907/51651361/ more fpaste action if anyone is interested
[8:33] <cmdrk> disks formatted with XFS, each server is 4 CPU / 8 GB RAM / 1Gbps NIC running 6 OSDs. server with CephFS kernel module is 24 CPU / 48 GB RAM / 10Gbps
[8:33] * Cube1 (~Cube@66-87-118-175.pools.spcsdns.net) Quit (Quit: Leaving.)
[8:34] * agh (~oftc-webi@gw-to-666.outscale.net) has joined #ceph
[8:34] <agh> Hello, i need help ! I've broken my RadosGW installation !
[8:34] <agh> worse, i've broken internally my S3 installation
[8:34] <agh> I had to play with "rados -p .rgw.... rm ...
[8:35] <agh> and now, i can't create new buckets anymore
[8:35] <agh> => how can I zap all .rgw pools to start from scratch ?
[8:48] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[9:04] * AfC (~andrew@2001:44b8:31cb:d400:b874:d094:ec64:aec3) has joined #ceph
[9:04] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) has joined #ceph
[9:07] * waxzce (~waxzce@2a01:e35:2e1e:260:4cd5:5158:434c:d1dd) has joined #ceph
[9:12] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) Quit (Read error: Connection reset by peer)
[9:21] * mschiff (~mschiff@p4FD7DDB9.dip0.t-ipconnect.de) has joined #ceph
[9:28] <agh> Hi to all,
[9:28] <agh> i've a question about Garbage Collector and RadosGW
[9:29] <agh> I've put some big files in my Ceph S3 space. It works fine.
[9:29] <agh> Now, I remove them
[9:29] <agh> So they are marked for deletion, but not deleted yet
[9:29] <agh> I thought that a garbage collector was there to do the job, but, if i do
[9:29] <agh> radosgw-admin gc list
[9:29] <agh> I get an empty list
[9:30] <agh> .. is it normal ?
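
For reference, two radosgw-admin invocations that may help here, assuming the installed version supports them: deleted objects sit in a deferred-delete window before "gc list" shows them by default, so listing everything and forcing a pass can confirm the collector is working:

    radosgw-admin gc list --include-all   # include entries whose delete is still deferred
    radosgw-admin gc process              # run a garbage-collection pass by hand
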
[9:31] * waxzce (~waxzce@2a01:e35:2e1e:260:4cd5:5158:434c:d1dd) Quit (Remote host closed the connection)
[9:37] <huangjun> has someone here compiled ceph on centos 5.9 final?
[9:37] <huangjun> how to install the crypto package on centos 5.9
[9:48] * ScOut3R (~ScOut3R@catv-89-133-25-52.catv.broadband.hu) has joined #ceph
[9:53] <huangjun> uhh, i see, yum install nss nss-devel will resolve this problem
[9:58] * waxzce (~waxzce@2a01:e34:ee97:c5c0:8949:d844:5f29:4a80) has joined #ceph
[9:59] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) has joined #ceph
[9:59] * fireD_ (~fireD@93-142-231-192.adsl.net.t-com.hr) has joined #ceph
[10:03] * bergerx_ (~bekir@78.188.204.182) has joined #ceph
[10:05] * yy (~michealyx@115.198.96.222) has joined #ceph
[10:07] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) Quit (Ping timeout: 480 seconds)
[10:14] * odyssey4me (~odyssey4m@196-215-110-38.dynamic.isadsl.co.za) has joined #ceph
[10:18] <agh> Hi to all,
[10:18] <agh> i've a question about Garbage Collector and RadosGW
[10:18] <agh> I've put some big files in my Ceph S3 space. It works fine.
[10:18] <agh> Now, I remove them
[10:18] <agh> So they are marked for deletion, but not deleted yet
[10:18] <agh> I thought that a garbage collector was there to do the job, but, if i do
[10:18] <agh> radosgw-admin gc list
[10:18] <agh> I get an empty list
[10:18] <agh> .. is it normal ?
[10:19] * LeaChim (~LeaChim@0540ae5a.skybroadband.com) has joined #ceph
[10:22] * fireD_ is now known as fireD
[10:31] <Kioob> darkfaded, joshd : for information, it seems that my data loss problem come from a kernel writeback bug : http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-3.10.y&id=9fa65e09730a2e63c5b45d008f269b2b271b74bc
[10:35] * stepan_cz (~Adium@2a01:348:94:30:f920:dee7:1703:2ff4) has joined #ceph
[10:39] * jaydee (~jeandanie@124x35x46x11.ap124.ftth.ucom.ne.jp) has joined #ceph
[10:40] * odyssey4me2 (~odyssey4m@165.233.71.2) has joined #ceph
[10:43] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[10:43] * odyssey4me (~odyssey4m@196-215-110-38.dynamic.isadsl.co.za) Quit (Ping timeout: 480 seconds)
[10:47] * xdeller (~xdeller@91.218.144.129) Quit (Read error: Connection reset by peer)
[10:51] * yy (~michealyx@115.198.96.222) has left #ceph
[10:58] <agh> i need some experts on radosgw garbage collector.. Someone ?
[11:08] * odyssey4me2 (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[11:08] * silversurfer (~jeandanie@124x35x46x12.ap124.ftth.ucom.ne.jp) has joined #ceph
[11:12] * jaydee (~jeandanie@124x35x46x11.ap124.ftth.ucom.ne.jp) Quit (Ping timeout: 480 seconds)
[11:14] * yanzheng (~zhyan@jfdmzpr03-ext.jf.intel.com) Quit (Remote host closed the connection)
[11:22] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) has joined #ceph
[11:22] <mozg> wido: hi
[11:22] <mozg> do you have a few minutes?
[11:23] <wido> mozg: yes
[11:23] <mozg> i have a question about rbd format type when the image is created from cloudstack
[11:23] <mozg> from what i can see it is created using format version 1
[11:23] <mozg> by default
[11:23] <mozg> is there a way to change that to version 2?
[11:23] <mozg> so that I can do snapshotting, etc?
[11:23] <wido> mozg: Correct, that is due to your underlying librbd version
[11:23] <wido> Long story short: libvirt tells librbd to create the image
[11:24] <mozg> okay
[11:24] <wido> when the libvirt integration was written there was no format 2
[11:24] <wido> so libvirt calls rbd_create()
[11:24] <mozg> i see
[11:24] <wido> mozg: What version of librbd are you running on the hypervisor?
[11:24] <mozg> one sec
[11:24] <mozg> let me check
[11:24] * zhangjf_zz2 (~zjfhappy@222.128.1.105) Quit (Quit: Leaving)
[11:25] <mozg> librbd1 0.61.7-1precise
[11:25] <wido> mozg: http://ceph.com/docs/master/release-notes/#v0-61-3-cuttlefish
[11:25] <wido> ibrbd: make image creation defaults configurable (e.g., create format 2 images via qemu-img)
[11:25] <wido> iirc it should create format 2 by default
[11:26] <wido> otherwise you can configure with a ceph.conf on the hypervisor
[11:26] * loicd running in circles trying to find how backfilling relates to the PG state machine
[11:28] <mozg> wido: what option should I be looking at on the hypervisor side? I can't find it in the link that you've sent. Perhaps the coffee didn't kick in yet ((
[11:29] <wido> mozg: rbd_default_format
[11:29] <wido> defaults to 1 i see
[11:29] <wido> see src/common/config_opts.h
[11:30] <mozg> NAME SIZE PARENT FMT PROT LOCK
[11:30] <mozg> 7b670f93-219a-45ac-b1c3-d95fa75f4e20 102400M 1
[11:30] <mozg> that is what i've got by default
[11:30] <mozg> when i've created a vm from CS
[11:30] <mozg> 4.1
[11:30] <wido> mozg: Yes, so it's librbd which defaults to version 1
[11:30] <wido> change with "rbd_default_format"
[11:30] <wido> CloudStack 4.2 will create format 2 images though
[11:31] <mozg> so, something like [client ] rbd_default_format = 2 ?
[11:31] <wido> mozg: Yes, or [global]
[11:31] <mozg> nice
[11:31] <mozg> i will try that
[11:31] <mozg> thanks
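
A sketch of the ceph.conf change wido describes, on the hypervisor that runs librbd; whether it goes under [global] or [client] follows the conversation above:

    # /etc/ceph/ceph.conf on the hypervisor
    [global]
        rbd_default_format = 2
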
[11:39] * mschiff_ (~mschiff@p4FD7DDB9.dip0.t-ipconnect.de) has joined #ceph
[11:39] * mschiff (~mschiff@p4FD7DDB9.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[11:48] * sagelap1 (~sage@76.89.177.113) has joined #ceph
[11:49] <agh> hello
[11:49] <agh> I need help with garbage collector and radosgw
[11:49] <agh> i do not understand one thing
[11:49] <agh> i've delete some objects via s3cmd
[11:49] * waxzce_ (~waxzce@office.clever-cloud.com) has joined #ceph
[11:49] <agh> it works. But, the objects are not really deleted from the cluster. Normal, there is a gc to do that
[11:49] <agh> but, i do not understand why radosgw-admin gc list gives me an empty list
[11:50] <agh> any idea ?
[11:50] * waxzce (~waxzce@2a01:e34:ee97:c5c0:8949:d844:5f29:4a80) Quit (Ping timeout: 480 seconds)
[11:54] * sagelap (~sage@2600:1012:b02c:f86e:b1dd:2879:bafb:4558) Quit (Ping timeout: 480 seconds)
[11:55] * janisg (~troll@85.254.50.23) Quit (Ping timeout: 480 seconds)
[12:09] * ScOut3R (~ScOut3R@catv-89-133-25-52.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[12:10] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[12:17] <huangjun> hi,now i'm compiling ceph on centos 5.9 and it reports error like this:
[12:17] <huangjun> cls/lock/cls_lock_client.cc:199: error: in call to ‘lock’
[12:22] <huangjun> what did i miss?
[12:26] <joelio> uff, upgraded to 0.61.7 and now one mon has cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
[12:28] <mozg> wido: do you know if i need to restart ceph services or libvirtd for changes to take effect? I've set the global rbd_default_format and created a new vm, but the format version is still 1
[12:32] <joelio> hmm, yes, 'if it aint broke don't fix it' - now have a broken mon due to upgrade :( cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
[12:32] <joelio> failed verifying authorize reply
[12:32] <wido> mozg: No, that shouldn't. It should read it on image creation
[12:38] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[12:39] <joelio> ahh, ignore me, only one mon had actually upgraded.. phew!
[12:48] * janisg (~troll@85.254.50.23) has joined #ceph
[12:50] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[12:58] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) Quit (Quit: Leaving.)
[12:59] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[13:03] * odyssey4me (~odyssey4m@41.222.225.231) has joined #ceph
[13:03] <lautriv> sagewk, *ping*
[13:05] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[13:05] * huangjun (~kvirc@111.175.165.62) Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[13:12] * Azrael (~azrael@terra.negativeblue.com) Quit (Ping timeout: 480 seconds)
[13:17] * dosaboy (~dosaboy@host109-158-236-137.range109-158.btcentralplus.com) has joined #ceph
[13:26] * diegows (~diegows@190.190.2.126) has joined #ceph
[13:28] <lautriv> anyone around who is responsible for /usr/sbin/ceph-disk ?
[13:31] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[13:41] <joao> lautriv, the rest of the guys will only be around by 9am PST; that's CEST+9
[13:42] * humbolt (~elias@194-166-74-53.adsl.highway.telekom.at) has joined #ceph
[13:43] <joao> and as much as I'd like to help, I have no idea how wrt ceph-disk
[14:03] * markbby (~Adium@168.94.245.3) has joined #ceph
[14:03] <lautriv> joao, i found the issue already but won't screw with the project on my own.
[14:12] * Henson_D (~kvirc@69.166.23.191) Quit (Quit: KVIrc 4.1.3 Equilibrium http://www.kvirc.net/)
[14:23] * mathlin (~mathlin@dhcp2-pc112059.fy.chalmers.se) has joined #ceph
[14:25] <madkiss> so for a RAID5 built out of 10 INTEL SSDSC2CW240A3 disks (250g SSDs), what would be the estimated read-performance?
[14:26] <Gugge-47527> MB/s or IOPS ?
[14:26] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Quit: Ex-Chat)
[14:26] <madkiss> mb/s
[14:27] <Gugge-47527> around 9-10 times what one disk can deliver, if your controller can do that
[14:27] <lautriv> Gugge-47527, raid5
[14:28] <Gugge-47527> lautriv: okay
[14:29] <lautriv> i would guess around 6 times that of one disk, IF nothing else is a bottleneck.
[14:29] <Gugge-47527> what makes you think a good controller cant read data from 9 disks fullspeed in a raid5 config?
[14:29] <yanzheng> why no 5 times
[14:29] <Gugge-47527> with the proper readahead
[14:30] <yanzheng> why not 5 times
[14:30] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) has joined #ceph
[14:32] <lautriv> Gugge-47527, the fact that there is some overhead for level5
[14:32] <Gugge-47527> some overhead?
[14:32] <Gugge-47527> a single stripe is data on 9 disks, with no calculation
[14:33] <Gugge-47527> reading that stripe can be done with full speed from 9 disks
[14:33] <madkiss> the funny thing is that apparently, one of these disks, when reading from it with dd, claims to get ma 340mb/s
[14:33] <lautriv> Gugge-47527, since this ain't raid 0, you need to respect the parity
[14:33] <madkiss> (i created a RAID0 with one of the SSDs)
[14:34] <Gugge-47527> its not like raid5 controllers check the parity on reads
[14:34] <lautriv> Gugge-47527, level 5 is pointless anyway, even worse if they don't check on read.
[14:34] <Gugge-47527> no argument there
[14:35] <Gugge-47527> raid5 is crap and useless :)
[14:35] <Gugge-47527> raidz[1-3] is much better :)
[14:35] <lautriv> Gugge-47527, where is the benefit if possibly corrupted data is not checked while reading ?
[14:35] <Gugge-47527> lautriv: that is one of the problems with raid5, you can get bad data without ever knowing :)
[14:36] <joelio> mmm silent data corruption
[14:36] * jluis (~JL@89-181-148-68.net.novis.pt) has joined #ceph
[14:36] <lautriv> Gugge-47527, reason why i use level 0 but have that multi-redundant ;)
[14:37] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[14:43] <lautriv> ok, since the devs will appear in 6+ hours, i'll give it a shot on the public ....
[14:43] <lautriv> i had issues preparing the disk on ONE osd, where it turned out to be an issue with the sgdisk call,
[14:44] <lautriv> prepare calls something like sgdisk --largest-new=1 --change-name=1:"ceph data" ....
[14:45] <lautriv> but this should be rather sgdisk --set-alignment=2048 --largest-new=1 --change-name=1:"ceph data" ....
[14:45] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[14:45] <lautriv> the direct call on sgdisk works, but a changed /usr/sbin/ceph-disk doesn't. any thoughts ?
[14:50] <lautriv> my first idea was, it may have a defined "num of args" for the subprocess.check_call()s args=[] but that can't be because it's followed by "data," which can be any unpredictable count.
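
A hand-run version of the call lautriv proposes, against a hypothetical /dev/sdb; the trailing arguments of the real ceph-disk invocation are elided in the log, so only the flags quoted above are shown:

    sgdisk --set-alignment=2048 --largest-new=1 --change-name=1:"ceph data" /dev/sdb
    sgdisk -p /dev/sdb    # print the table to confirm where the partition starts
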
[14:52] <madkiss> in any case, a RAID0 out of two disks ought to be massively faster than RAID1 out of 2 disks, right?
[14:53] <lautriv> madkiss, sure, raid1 is mirroring and doesn't speed up much (just when one drive is closer and needs less seek)
[14:53] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[14:53] * ChanServ sets mode +o elder
[14:54] <madkiss> just what i thought. On this controller here, the speed difference between a raid1 and a raid0 actually is, well, 10%.
[14:54] <lautriv> madkiss, should be rather 100%. is the raid actually in sync ?
[15:00] <lautriv> madkiss, some details about CPU/controller/drives ?
[15:01] * stepan_cz (~Adium@2a01:348:94:30:f920:dee7:1703:2ff4) has left #ceph
[15:01] <madkiss> controller is an LSI 9271-8i
[15:02] <madkiss> CPU is a quad-quadcore+ht, drives are Intel SSDSC2CW240A3
[15:08] * Henson_D (~kvirc@lord.uwaterloo.ca) has joined #ceph
[15:09] <lautriv> no idea about that SSD but CPU/controller should not be any hurdle
[15:10] * Azrael (~azrael@terra.negativeblue.com) has joined #ceph
[15:10] * The_Bishop (~bishop@2001:470:50b6:0:d176:f45d:f651:852a) has joined #ceph
[15:10] * aliguori (~anthony@32.97.110.51) has joined #ceph
[15:10] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[15:11] <madkiss> something is rotten here, that's for sure.
[15:11] <lautriv> madkiss, it may happen you have the wrong measure b/c the controller itself has 1G RAM, try something like "dd if=/dev/zero bs=1M count=2500 of=/testfile"
[15:11] <mozg> wido: i've read on ceph.com that your company provides european support for ceph
[15:12] <darkfaded> add conv=fdatasync to the dd options if you want to avoid everything being cached by the OS
[15:12] <madkiss> lautriv: the controller has a gig of ram, yes
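
Combining lautriv's command with darkfaded's flag gives a write test that is not inflated by the controller's 1G cache, since more than the cache size is written and a flush is forced at the end:

    dd if=/dev/zero of=/testfile bs=1M count=2500 conv=fdatasync
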
[15:13] * odyssey4me (~odyssey4m@41.222.225.231) Quit (Read error: Connection reset by peer)
[15:14] <lautriv> have to move, laters ....
[15:21] <joao> mozg, fwiw, inktank does too; wido's is however based in Europe :)
[15:21] <mozg> as I am based in Europe I think it would be reasonable to speak with your local representative
[15:22] <mozg> plus its a good excuse to visit Holland ;-)
[15:22] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Ping timeout: 480 seconds)
[15:22] <mozg> and get some nice wine from France on the way there
[15:22] <mozg> )))
[15:22] <joao> can't argue with a trip to the Netherlands :p
[15:23] <mozg> joao: yeah, indeed
[15:23] <mozg> any excuse will do!
[15:27] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[15:28] <mozg> i've got a question. I am planning to replace one of the osd servers in a few weeks time
[15:28] <mozg> this server is also one of three mon servers in the cluster
[15:28] <mozg> how do I go about replacing the monitors
[15:28] <mozg> from what i remember, you should have an odd number of mons
[15:29] <joao> first of all, if it's going to be offline, you'd be advised to move that monitor to other server
[15:29] <joao> w8, changing to laptop
[15:29] <mozg> cheers
[15:29] <mozg> how do i migrate the monitor?
[15:29] <jluis> mozg, you don't strictly need to have an odd number of monitors
[15:29] <mozg> really?
[15:29] <jluis> that's advised, but not a requirement
[15:29] <mozg> so, could I add the 4th monitor?
[15:29] <jluis> you need a majority
[15:30] <mozg> and then remove the old server with the mon
[15:30] <mozg> to have it back to 3 mons?
[15:30] <jluis> the cluster will work fine with 2 or 4 monitors, you'll just have a greater mean time to failure
[15:30] <mozg> okay
[15:30] <jluis> mozg, there's docs about it
[15:30] <jluis> let me get those for you
[15:31] <mozg> coz I remember i had an issue with ceph becoming unavailable when i moved from 1 mon to 2 mons
[15:31] <jluis> http://ceph.com/docs/master/rados/operations/add-or-rm-mons/
[15:31] <jluis> specifically: http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address
[15:31] <mozg> i think i did follow this guide actually
[15:31] <mozg> and it froze my ceph cluster
[15:31] <jluis> mozg, yeah, moving from 1 to 2 can be a problem *if* the second monitor is unable to start
[15:31] <mozg> as soon as i've added the second mon
[15:32] <mozg> but you are saying that moving from 3 to 4 should not do that?
[15:32] <jluis> you need a majority of up monitors to form a quorum; when you go from 1 to 2, you go from needing just the one to needing the two
[15:32] <jluis> the cluster won't work without a formed quorum
[15:33] <mozg> yeah
[15:33] <mozg> i think that is what happened
[15:33] <jluis> so, moving from 2 to 3 is simpler, because the 3rd monitor doesn't have to come up straight away: as long as you still have the other 2 monitors, they form a majority thus a quorum
[15:34] <jluis> that's why having monitors in even numbers is not advised: if you lose one you're in the woods
[15:34] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[15:34] <jluis> well, lunch is ready
[15:34] <jluis> brb
[15:36] <mozg> thanks for your help
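
A rough outline of adding a monitor along the lines of the docs jluis links, with a hypothetical id "newmon" and an illustrative address; the new daemon is started only after the mkfs and "mon add" steps:

    ceph mon getmap -o /tmp/monmap              # current monitor map
    ceph auth get mon. -o /tmp/mon.keyring      # monitor keyring
    ceph-mon -i newmon --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
    ceph mon add newmon 192.168.0.10:6789       # then start ceph-mon -i newmon
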
[15:37] * PerlStalker (~PerlStalk@72.166.192.70) has joined #ceph
[15:38] * waxzce_ (~waxzce@office.clever-cloud.com) Quit (Remote host closed the connection)
[15:52] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) has left #ceph
[15:54] * huangjun (~kvirc@119.147.167.202) has joined #ceph
[15:55] * aliguori_ (~anthony@32.97.110.51) has joined #ceph
[16:01] * aliguori (~anthony@32.97.110.51) Quit (Ping timeout: 480 seconds)
[16:02] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) has joined #ceph
[16:02] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[16:03] * calit (~thorsten@212.224.79.27) has joined #ceph
[16:04] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[16:04] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[16:05] <calit> hi, I want to add a monitor node to an existing cluster (following http://ceph.com/docs/next/rados/operations/add-or-rm-mons/), but when i try to start the mon i get 'unable to read magic from mon data.. did you run mkcephfs?'
[16:06] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[16:06] <calit> am i missing some step?
[16:09] <jluis> did you run mkfs?
[16:10] <jluis> or even, ceph-mon -i foo --mkfs ?
[16:12] <calit> 'ceph-mon -i 1 --mkfs --monmap monmap.txt --keyring key ' ran fine
[16:13] * BillK (~BillK-OFT@124-168-243-244.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[16:18] <jluis> calit, 'ls /path/to/mon/data' ?
[16:18] * yanzheng (~zhyan@jfdmzpr01-ext.jf.intel.com) Quit (Remote host closed the connection)
[16:18] <jluis> assuming default, would be /var/lib/ceph/mon/ceph-foo
[16:19] <calit> jluis: keyring store.db
[16:20] <calit> at /var/lib/ceph/mon/ceph-1/
[16:20] <jluis> and are you starting the monitor with the proper id?
[16:22] <calit> ah, missed '-i 1' for starting the mon
[16:22] <jluis> give it a try and let us know how it goes :)
[16:23] <calit> ok, thanks already :)
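
The missing piece in calit's case was the id flag at start time; a minimal example with the id from this exchange, the address being illustrative and possibly optional once the monitor is already in the monmap:

    ceph-mon -i 1 --public-addr 192.168.0.11:6789
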
[16:28] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Ping timeout: 480 seconds)
[16:28] <josef> sagewk: spoke too soon, rawhide built fine, everything else blew up
[16:30] * jeff-YF (~jeffyf@67.23.117.122) Quit (Quit: jeff-YF)
[16:43] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[16:45] * huangjun|2 (~kvirc@60.55.9.12) has joined #ceph
[16:45] * huangjun (~kvirc@119.147.167.202) Quit (Read error: Connection reset by peer)
[17:06] * sagelap (~sage@2600:1012:b023:d7c:b1dd:2879:bafb:4558) has joined #ceph
[17:06] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[17:07] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) has joined #ceph
[17:08] * haomaiwang (~haomaiwan@117.79.232.150) Quit (Remote host closed the connection)
[17:09] * aliguori (~anthony@32.97.110.51) has joined #ceph
[17:09] * aliguori_ (~anthony@32.97.110.51) Quit (Ping timeout: 480 seconds)
[17:10] * haomaiwang (~haomaiwan@106.3.103.147) has joined #ceph
[17:10] * lyncos (~chatzilla@208.71.184.41) has joined #ceph
[17:11] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[17:11] <lyncos> Hi, I'm not sure if I'm crazy or what.. but I have a healthy cluster of 3 MON and 4 OSD .. when I set a pool to size 2 it works fine.. but when I set it to 3 it doesn't work even if the status says my cluster is healthy .. I probably did miss something
[17:12] <janos> how many hosts are the osd's spread across?
[17:12] <janos> the default failure domain is at the host level
[17:12] * sagelap1 (~sage@76.89.177.113) Quit (Ping timeout: 480 seconds)
[17:12] <lyncos> 4 hosts 1 os per host
[17:12] <lyncos> *1 osd per host
[17:12] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[17:13] <janos> hrm, that should be ok with repl 3
[17:13] <janos> are the osd's all similarly sized?
[17:14] <lyncos> they are all exactly the same
[17:14] <lyncos> same weight etc
[17:14] <lyncos> will post my ceph osd tree
[17:14] <lyncos> http://pastebin.com/8Zh0R2Kb
[17:14] <janos> i was just scratching the surface. hopefully someone with better debugging skills can get you the rest of the way
[17:15] <lyncos> I hope.. this looks strange ... as soon as I change the size to 3 it kinda lock the pool
[17:15] <lyncos> but cluster is still healthy
[17:15] <janos> interesting weights
[17:15] <janos> cuttlefish?
[17:15] <janos> my cluster is still bobtail
[17:15] * rudolfsteiner_ (~federicon@200.68.116.185) has joined #ceph
[17:16] <lyncos> ceph version 0.61.7
[17:16] * rudolfsteiner__ (~federicon@200.68.116.185) has joined #ceph
[17:17] <maciek> hi, I have some problem with OSD, my cluster is still "unclean" since forever, did I do something wrong? see paste: http://pastebin.com/X1YByvJY
[17:18] <lyncos> janos to start with ceph we have a 100T cluster... on 5 nodes (now a node is broken)
[17:18] <janos> ah
[17:18] <janos> is each osd big raid disk?
[17:18] <janos> +a
[17:19] * Cube (~Cube@66-87-118-175.pools.spcsdns.net) Quit (Quit: Leaving.)
[17:19] <janos> sounds like you're going to want someone more experienced than i to help with that. i was just trying to flush out the most obvious issues
[17:20] <lyncos> yes it's a big raid 5
[17:21] <janos> so with repl 3 it gives HEALTH_OK, but is otherwise unresponsive?
[17:21] <janos> very odd
[17:21] * rudolfsteiner___ (~federicon@200.68.116.185) has joined #ceph
[17:23] * rudolfsteiner (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[17:23] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (Read error: Connection reset by peer)
[17:23] * rudolfsteiner___ is now known as rudolfsteiner
[17:23] * rudolfsteiner_ (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[17:24] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[17:25] * rudolfsteiner_ (~federicon@200.68.116.185) has joined #ceph
[17:25] <lyncos> yeah exactly
[17:25] <lyncos> it is really strange
[17:26] <calit> jluis: after getting all nodes to the same ceph version everything is fine, thanks for the help again
[17:26] * haomaiwang (~haomaiwan@106.3.103.147) Quit (Ping timeout: 480 seconds)
[17:28] * rudolfsteiner__ (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[17:30] * buck (~buck@bender.soe.ucsc.edu) has joined #ceph
[17:30] * valzaq (~valzaq@168.96.255.73) has joined #ceph
[17:31] * rudolfsteiner (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[17:31] * rudolfsteiner_ is now known as rudolfsteiner
[17:31] <valzaq> hello there, can someone tell me if the current cuttlefish release of ceph works with centos 6.4? it's driving me nuts..
[17:35] * dobber (~dobber@213.169.45.222) Quit (Remote host closed the connection)
[17:35] <jluis> calit, glad to know it
[17:37] * calit (~thorsten@212.224.79.27) Quit (Quit: Lost terminal)
[17:38] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[17:41] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[17:42] <mozg> guys, has anyone tried using io throttling with ceph + kvm?
[17:43] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[17:43] <mozg> any tips or howto that you could point to?
[17:45] * JM__ (~oftc-webi@193.252.138.241) Quit (Quit: Page closed)
[17:47] * rudolfsteiner (~federicon@200.68.116.185) Quit ()
[17:49] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[17:52] * devoid (~devoid@130.202.135.214) has joined #ceph
[17:52] * rudolfsteiner (~federicon@200.68.116.185) Quit ()
[17:54] * valzaq (~valzaq@168.96.255.73) Quit (Ping timeout: 480 seconds)
[17:54] * gregmark (~Adium@68.87.42.115) has joined #ceph
[17:56] <huangjun|2> valzaq: yes, it works well on centos 6.4
[17:57] <huangjun|2> but i can't make it on centos 5.9
[17:59] * AfC (~andrew@2001:44b8:31cb:d400:b874:d094:ec64:aec3) Quit (Quit: Leaving.)
[18:00] * huangjun|2 (~kvirc@60.55.9.12) Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[18:04] * markbby (~Adium@168.94.245.3) Quit (Quit: Leaving.)
[18:05] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[18:09] <mozg> valzaq: i am running clients on centos 6.4 and 6.3
[18:09] <mozg> but to be honest, i am having issues with centos under load
[18:09] <mozg> now checking ubuntu
[18:19] * alram (~alram@38.122.20.226) has joined #ceph
[18:24] * gillesMo (~gillesMo@00012912.user.oftc.net) has joined #ceph
[18:28] * xmltok (~xmltok@pool101.bizrate.com) has joined #ceph
[18:37] <joelio> mozg: have you looked into cgroups?
[18:38] <mozg> i thought that cgroups work with block devices
[18:38] <mozg> how would this work with rbd on the hypervisor side?
[18:39] <mozg> i would like to make sure that no one vm would consume all io when there is a load
[18:40] <joelio> well, it depends on your middleware, how you are dispatching VM's I guess - I use ONE and it has support for this - http://opennebula.org/documentation:rel4.2:kvmg
[18:40] <joelio> check the 'Working with cgroups'
[18:40] <joelio> there may be a way to do it more elegantly, specifically for Ceph I/O - but that is a generic way of throttling resources
[18:41] <darkfaded> joelio: is it just me or does that not show disk io limiting?
[18:41] <darkfaded> i only see cpu/mem
[18:42] * sagelap (~sage@2600:1012:b023:d7c:b1dd:2879:bafb:4558) Quit (Ping timeout: 480 seconds)
[18:42] <joelio> darkfaded: that's just examples... check Throttling I/O bandwidth (found via google .. ymmv ) http://docs.oracle.com/cd/E37670_01/E37355/html/ol_use_cases_cgroups.html
[18:43] * sagelap (~sage@38.122.20.226) has joined #ceph
[18:43] * bergerx_ (~bekir@78.188.204.182) Quit (Quit: Leaving.)
[18:44] <darkfaded> hmm. sounds like a lot of work to have it detect if the devmapper paths change for a device. I'll dig into that a little
[18:45] <joelio> yea, as I say, probably more elegant ways.. cgroups generally seen as a resource throttler though
[18:45] * alphe (~alphe@0001ac6f.user.oftc.net) has joined #ceph
[18:45] <joelio> I guess if you were feeling masochistic you could knock up an HTB/QoS/IPTables mashup.. that sounds even more evil though :)
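
A generic sketch of the blkio throttling joelio's link describes (cgroup v1, assuming the blkio controller is mounted under /sys/fs/cgroup/blkio); the device major:minor, the 10 MB/s cap and the group name are placeholders, and this only applies when the guest's I/O actually goes through a host block device:

    mkdir -p /sys/fs/cgroup/blkio/vmlimit
    echo "254:0 10485760" > /sys/fs/cgroup/blkio/vmlimit/blkio.throttle.read_bps_device
    echo $QEMU_PID > /sys/fs/cgroup/blkio/vmlimit/tasks   # pid of the qemu process to throttle
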
[18:46] <loicd> scuttlemonkey: the topic should probably change to s/v0.61.5/v0.61.7/ ;-)
[18:46] <alphe> hi loicd and everyone
[18:46] <loicd> alphe: good evening sir
[18:46] <alphe> loicd is a rados block device necesary for a ceph cluster
[18:46] * scuttlemonkey changes topic to 'Latest stable (v0.61.7 "Cuttlefish") -- http://ceph.com/get || Ceph Developer Summit: Emperor - http://goo.gl/yy2Jh || Ceph Day NYC 01AUG2013 - http://goo.gl/TMIrZ'
[18:47] <darkfaded> joelio: if i wanted that... there's two paths, one to lunacy, second to go and buy vmware licenses for a few K and have a live
[18:48] <darkfaded> s/live/life/
[18:48] <darkfaded> plus 2 hours to configure it, tops
[18:48] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[18:48] <joelio> yea, but where's the fun in that? :)
[18:49] <alphe> oh ok lets start with other questions
[18:49] <alphe> I need to separate my ceph clusters net i/O in two
[18:50] <alphe> one network for outside i/o and one private lan for osd <-> mds sync
[18:50] <alphe> in version 0.38 I could do that with ceph-deploy I don't know what is the correct procedure
[18:50] <loicd> alphe: no it is not
[18:51] <loicd> alphe: no a rados block device is not necesary for a ceph cluster
[18:52] <alphe> I can not have private network dedicated to replication data or I can t set that with ceph-deploy
[18:52] <alphe> loicd ok cool !
[18:52] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[18:53] <alphe> hi Tamil
[18:53] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[18:53] <Tamil1> alphe: hi
[18:55] <alphe> ok so for setting up the private network for replication purpose ?
[18:55] <alphe> how do I proceed ?
[18:56] <alphe> I benchmarked: 1 file of 1G uploads at 11.7MB/s and 1000 files of 4MB upload at 11.7MB/s
[18:56] <loicd> alphe: http://ceph.com/docs/master/rados/configuration/network-config-ref/#ceph-networks
[18:56] <loicd> add these to all /etc/ceph/ceph.conf files and restart the OSDs
[18:57] * loicd going out for the night
[18:57] <alphe> so I change the ceph.conf with public and cluster network params and I propagate that new config file using ceph-deploy push conf to each machine ?
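
The ceph.conf fragment the linked network-config page describes, with placeholder subnets; per loicd it goes into every node's ceph.conf, followed by an OSD restart:

    [global]
        public network  = 10.0.1.0/24
        cluster network = 10.0.2.0/24
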
[18:59] * sakari (sakari@turn.ip.fi) has joined #ceph
[18:59] <lyncos> I found my pool problem.. the max_size parameter was set to 2 .. I set it to 3 and now it works ...
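
The checks behind lyncos's fix, with a placeholder pool name: a pool's replication size must not exceed the max_size of the CRUSH rule it uses, which is what was biting here. A hedged sketch using the crushmap decompile/recompile cycle:

    ceph osd pool get mypool size                                  # replication level of the pool
    ceph osd getcrushmap -o /tmp/cm && crushtool -d /tmp/cm -o /tmp/cm.txt
    # in /tmp/cm.txt the rule the pool uses needs max_size >= the pool size; after editing:
    crushtool -c /tmp/cm.txt -o /tmp/cm.new && ceph osd setcrushmap -i /tmp/cm.new
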
[18:59] <alphe> k bye loicd
[19:00] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Ping timeout: 480 seconds)
[19:00] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[19:01] * rennu_ (sakari@turn.ip.fi) Quit (Ping timeout: 480 seconds)
[19:01] * markbby (~Adium@168.94.245.2) has joined #ceph
[19:01] * mschiff_ (~mschiff@p4FD7DDB9.dip0.t-ipconnect.de) Quit (Read error: Connection reset by peer)
[19:01] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) Quit (Ping timeout: 480 seconds)
[19:02] * Meyer^_ (meyer@c64.org) has joined #ceph
[19:05] * Meyer^ (meyer@c64.org) Quit (Ping timeout: 480 seconds)
[19:06] * Daviey (~DavieyOFT@bootie.daviey.com) Quit (Ping timeout: 480 seconds)
[19:07] * Daviey (~DavieyOFT@bootie.daviey.com) has joined #ceph
[19:07] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[19:12] * _Tassadar (~tassadar@tassadar.xs4all.nl) Quit (Remote host closed the connection)
[19:12] * _Tassadar (~tassadar@tassadar.xs4all.nl) has joined #ceph
[19:18] * jjgalvez (~jjgalvez@ip72-193-215-88.lv.lv.cox.net) has joined #ceph
[19:21] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[19:29] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) has joined #ceph
[19:30] * sleinen1 (~Adium@2001:620:0:25:401:b540:2ca5:4e24) has joined #ceph
[19:31] * gillesMo (~gillesMo@00012912.user.oftc.net) Quit (Quit: Konversation terminated!)
[19:34] <Tamil1> alphe: how is it going?
[19:34] <alphe> hum I m benchmarking my network
[19:35] <alphe> but that can wait
[19:35] <Tamil1> alphe: oh ok
[19:36] <alphe> does osd journal size has an impact on speed ?
[19:37] * sleinen (~Adium@77-58-245-10.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[19:37] <lautriv> hey Tamil1 ;) did you see my posts 4h54 min above ?
[19:38] <alphe> strange I have one ceph.conf generated by ceph-deploy in local dir that has no underscores in param names
[19:39] <alphe> and the /etc/ceph/ceph.conf that has underscores in params names
[19:39] <Tamil1> lautriv: sorry, what was that?
[19:39] <Tamil1> alphe: thats expected
[19:40] <lautriv> Tamil1, about the alignment of sgdisk which is the bug on one of my OSDs
[19:41] <alphe> how works ceph-deploy config push and pull ?
[19:42] <Tamil1> lautriv: bug id?
[19:42] <alphe> I mean what is the file I need to modify to be able to use config push ?
[19:42] <lautriv> Tamil1, i found the call is sgdisk --largest-new=1 --change-name=1:"ceph data" ....
[19:42] <Tamil1> alphe: config push will push your ceph.conf from local dir to the remote machines
[19:43] <Tamil1> alphe: config pull is the reverse
[19:43] <lautriv> Tamil1, but should be sgdisk --set-alignment=2048 --largest-new=1 --change-name=1:"ceph data" ....
[19:43] <alphe> ok so i need to modify the local dir ceph.conf to be able to push it everywhere else right ?
[19:44] <Tamil1> alphe: otherwise there is no need to push the config, unless there is any change to it, right?
[19:44] <alphe> sure but as the conf is in 2 different locations ...
[19:44] * sjm (~oftc-webi@c73-103.rim.net) has joined #ceph
[19:45] <alphe> and if I want to copy my local dir ceph.conf to my localhost /etc/ceph/ i have to push it too no ?
[19:45] <alphe> I know it can sound dumb but i'd rather be dumb than sorry :)
[19:47] <lautriv> alphe, since it's only read when the services start, even a cp/scp would do.
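(For completeness, the reverse direction and the plain-copy alternative lautriv mentions; the hostname is an example:

    ceph-deploy config pull node1       # fetch node1's /etc/ceph/ceph.conf into the local dir
    cp ceph.conf /etc/ceph/ceph.conf    # on the local host a straight copy does the same job as a push
)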
[19:48] * jluis (~JL@89-181-148-68.net.novis.pt) Quit (Ping timeout: 480 seconds)
[19:49] <alphe> lautriv don't know, one has underscores and the other doesn't ...
[19:51] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[19:51] <lautriv> alphe, since you mentioned it, i checked my own and found the local conf doesn't have underscores either, but i'm pretty sure they have to be there because whitespace is IFS
[19:51] <Tamil1> alphe: sorry, i think it pushes from /etc/ceph to remote host. maybe try the command ?
[19:52] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit (Quit: Leaving.)
[19:53] <alphe> nope, it pushes the ceph.conf from the local dir
[19:53] <alphe> I just tested
[19:56] * diegows (~diegows@200.68.116.185) has joined #ceph
[20:00] * rudolfsteiner_ (~federicon@200.68.116.185) has joined #ceph
[20:02] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) has joined #ceph
[20:04] * rudolfsteiner__ (~federicon@200.68.116.185) has joined #ceph
[20:04] <joelio> wido: 17
[20:04] <joelio> doh, sorry /window fail
[20:04] <lautriv> darn, a failing osd produces a network storm on the public side, takes hours to fix that remotely :(
[20:07] * rudolfsteiner (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[20:07] * rudolfsteiner__ is now known as rudolfsteiner
[20:10] * rudolfsteiner_ (~federicon@200.68.116.185) Quit (Ping timeout: 480 seconds)
[20:12] * jeff-YF (~jeffyf@67.23.117.122) Quit (Quit: jeff-YF)
[20:14] * diegows (~diegows@200.68.116.185) Quit (Ping timeout: 480 seconds)
[20:17] * danieagle (~Daniel@177.97.248.72) has joined #ceph
[20:19] * haomaiwang (~haomaiwan@117.79.232.239) has joined #ceph
[20:19] * mschiff (~mschiff@85.182.236.82) has joined #ceph
[20:22] * diegows (~diegows@200.68.116.185) has joined #ceph
[20:24] * jeff-YF (~jeffyf@216.14.83.26) has joined #ceph
[20:26] <cmdrk> seeing a lot of errors in my osd logs along the lines of "fault, initiating reconnect" before "wrongly marked me down" and the cluster starts to recover
[20:26] <cmdrk> any idea what causes these?
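(The usual suspects for "wrongly marked me down" are a flaky network between OSDs or overloaded OSD hosts; as a stopgap while investigating, some people raise the heartbeat grace period. A sketch, with illustrative values only:

    [osd]
    osd heartbeat grace = 35      ; default is 20 seconds
    osd heartbeat interval = 12   ; default is 6 seconds
)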
[20:31] <alphe> bye all sorry have to go
[20:31] * alphe (~alphe@0001ac6f.user.oftc.net) Quit (Quit: Leaving)
[20:46] * dxd828 (~dxd828@host-2-97-70-33.as13285.net) Quit (Quit: Computer has gone to sleep.)
[20:47] * AaronSchulz_ (~chatzilla@192.195.83.36) has joined #ceph
[20:50] * sjm (~oftc-webi@c73-103.rim.net) Quit (Remote host closed the connection)
[20:53] * AaronSchulz (~chatzilla@216.38.130.164) Quit (Ping timeout: 480 seconds)
[20:53] * AaronSchulz_ is now known as AaronSchulz
[20:59] * fridudad (~oftc-webi@p4FC2C58C.dip0.t-ipconnect.de) has joined #ceph
[20:59] * fridudad (~oftc-webi@p4FC2C58C.dip0.t-ipconnect.de) Quit ()
[21:08] * dscastro (~dscastro@187.37.40.112) has joined #ceph
[21:08] * dscastro (~dscastro@187.37.40.112) Quit ()
[21:09] * dscastro (~dscastro@187.37.40.112) has joined #ceph
[21:09] <dscastro> hi
[21:09] <dscastro> is there anyone who implemented cephfs and selinux ?
[21:10] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[21:12] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) Quit ()
[21:17] <joao> dscastro, don't know about selinux, but for cephfs this would be the right place
[21:17] <dscastro> joao: about ceph block storage
[21:18] <joao> dscastro, ceph*
[21:18] <dscastro> joao: is it a concurrent block device?
[21:18] <dscastro> i mean, can i attach the same block device on different servers?
[21:19] <joao> afaik, yes, as long as those servers guarantee they don't write in the same portions of the block device
[21:19] <joao> but someone else might know better
[21:20] <dscastro> i'm looking for some kind of distributed fs for a http cluster
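(If the goal is a shared filesystem for several web servers, mounting CephFS on each of them is simpler than sharing one RBD image, which would additionally need a cluster filesystem such as OCFS2 on top. A sketch, with example hostnames and paths:

    mount -t ceph mon1:6789:/ /var/www/shared -o name=admin,secretfile=/etc/ceph/admin.secret
)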
[21:22] * sagelap (~sage@38.122.20.226) Quit (Quit: Leaving.)
[21:22] * rturk-away is now known as rturk
[21:25] * sagelap (~sage@2607:f298:a:607:b1dd:2879:bafb:4558) has joined #ceph
[21:30] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[21:33] <cmdrk> hmm. im seeing about 220MB/s reads and writes via rsync --progress on a machine mounting CephFS, writing into it from ramdisk, via 10GbE. i have ~60 OSDs, one for each disk (w/ XFS) across 10 servers (connected via 1Gb), with 3 separate mons. is that in the ballpark or should I be seeing higher performance?
[21:35] <Kioob> wow, "ceph-deploy" is really simple
[21:35] <cmdrk> right now I'm running stock kernels (CentOS 6.4), but I have a custom kernel 3.7.8 available -- considering upgrading and trying btrfs?
[21:35] <cmdrk> also curious if my journals should be on a different partition
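(A sketch of what putting journals elsewhere can look like, assuming a hypothetical dedicated partition per OSD; the device names are made up:

    [osd.0]
    osd journal = /dev/sdg1    ; point this OSD's journal at a separate partition
    [osd.1]
    osd journal = /dev/sdg2
)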
[21:36] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[21:41] * jeff-YF_ (~jeffyf@67.23.123.228) has joined #ceph
[21:44] <grepory> are there any recommendations on segmenting production and non-production storage in ceph? whether for or against and how?
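(One common pattern is separate pools for production and non-production, optionally pinned to different CRUSH rules so they land on disjoint hardware; a sketch with made-up names, PG counts and ruleset ids:

    ceph osd pool create prod 1024
    ceph osd pool create staging 128
    ceph osd pool set prod crush_ruleset 1
    ceph osd pool set staging crush_ruleset 2
)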
[21:48] * jeff-YF (~jeffyf@216.14.83.26) Quit (Ping timeout: 480 seconds)
[21:48] * jeff-YF_ is now known as jeff-YF
[21:49] * diegows (~diegows@200.68.116.185) Quit (Ping timeout: 480 seconds)
[22:01] * dscastro (~dscastro@187.37.40.112) Quit (Remote host closed the connection)
[22:10] * diegows (~diegows@200.68.116.185) has joined #ceph
[22:21] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:23] * PerlStalker (~PerlStalk@72.166.192.70) Quit (Remote host closed the connection)
[22:24] <lautriv> sagelap, *wave* found where the prepare failed.
[22:27] * terje-_ (~root@135.109.216.239) has joined #ceph
[22:27] * markbby (~Adium@168.94.245.2) Quit (Remote host closed the connection)
[22:29] * terje- (~root@135.109.216.239) Quit (Ping timeout: 480 seconds)
[22:36] * devoid (~devoid@130.202.135.214) Quit (Ping timeout: 480 seconds)
[22:37] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[22:40] <lautriv> anyone around who uses / is responsible for /usr/sbin/ceph-disk ?
[22:41] * Tamil1 (~Adium@cpe-108-184-66-69.socal.res.rr.com) has joined #ceph
[22:44] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[22:46] * LeaChim (~LeaChim@0540ae5a.skybroadband.com) Quit (Ping timeout: 480 seconds)
[22:46] * jluis (~JL@89.181.148.68) has joined #ceph
[22:47] * Vjarjadian (~IceChat77@90.214.208.5) has joined #ceph
[22:55] <lautriv> can anyone imagine why one additional argument in ceph-disk might fail/be ignored ? to be more specific, i need to insert '--set-alignment=2048' in line 1022 (before --largest-new=1) in ceph-disk, because it truncates my osd otherwise and proper alignment is a must-have anyway.
[22:56] * LeaChim (~LeaChim@2.122.178.96) has joined #ceph
[22:59] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Remote host closed the connection)
[23:00] <lautriv> to be even more specific, i tested direct invocations of sgdisk with success but for some reason the script won't, and that is the only occurrence.
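(For reference, the two invocations being compared; the trailing arguments are elided here exactly as in lautriv's quotes above:

    # call currently made by ceph-disk, as quoted:
    sgdisk --largest-new=1 --change-name=1:"ceph data" ....
    # his proposed call with explicit alignment:
    sgdisk --set-alignment=2048 --largest-new=1 --change-name=1:"ceph data" ....
)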
[23:01] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[23:04] * humbolt_ (~elias@80-121-55-112.adsl.highway.telekom.at) has joined #ceph
[23:06] * humbolt (~elias@194-166-74-53.adsl.highway.telekom.at) Quit (Ping timeout: 480 seconds)
[23:06] * humbolt_ is now known as humbolt
[23:07] * lyncos (~chatzilla@208.71.184.41) Quit (Remote host closed the connection)
[23:08] * mschiff (~mschiff@85.182.236.82) Quit (Ping timeout: 480 seconds)
[23:08] * mschiff (~mschiff@85.182.236.82) has joined #ceph
[23:12] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[23:19] * guppy (~quassel@guppy.xxx) Quit (Remote host closed the connection)
[23:19] * guppy (~quassel@guppy.xxx) has joined #ceph
[23:19] * dscastro (~dscastro@187.37.40.112) has joined #ceph
[23:20] <cmdrk> could i be getting a lot of "slow requests" messages in the log because my disks are just too slow? right now i have 1 OSD per disk in JBOD
[23:21] <cmdrk> with XFS
[23:21] * mschiff (~mschiff@85.182.236.82) Quit (Ping timeout: 480 seconds)
[23:21] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[23:22] * LPG (~kvirc@c-76-104-197-224.hsd1.wa.comcast.net) has joined #ceph
[23:22] * jeff-YF (~jeffyf@67.23.123.228) Quit (Quit: jeff-YF)
[23:22] * Henson_D (~kvirc@lord.uwaterloo.ca) Quit (Quit: KVIrc KVIrc Equilibrium 4.1.3, revision: 5988, sources date: 20110830, built on: 2011-12-05 12:15:22 UTC http://www.kvirc.net/)
[23:24] <lautriv> cmdrk, more likely the machine load. some details about net/CPU/RAM/bus ?
[23:26] * mschiff (~mschiff@85.182.236.82) has joined #ceph
[23:26] * jeff-YF (~jeffyf@67.23.117.122) has joined #ceph
[23:28] <cmdrk> lautriv: 1Gbps NIC, 4 CPUs, 8GB RAM. currently six OSDs per machine (1:1 map to disks)
[23:29] <cmdrk> 750GB barracuda disks, 7200 RPM SATA, older but i have a ton of them
[23:32] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) Quit (Quit: sprachgenerator)
[23:33] <cmdrk> im currently using a journal size = 1000
[23:33] <cmdrk> for whatever that's worth.
[23:35] * rudolfsteiner (~federicon@200.68.116.185) Quit (Quit: rudolfsteiner)
[23:37] <cmdrk> so i'm wondering if, instead of having six OSDs per node * 10 nodes, i ought to do something like 1 OSD per node * 10 nodes
[23:37] <cmdrk> all of this being in XFS land for now
[23:38] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:40] * sprachgenerator (~sprachgen@c-50-141-192-36.hsd1.il.comcast.net) has joined #ceph
[23:40] <sjust> usually it's best to do 1 osd/disk <- cmdrk
[23:41] * danieagle (~Daniel@177.97.248.72) Quit (Quit: See you :-) and Thank You Very Much for Everything!!! ^^)
[23:43] * diegows (~diegows@200.68.116.185) Quit (Ping timeout: 480 seconds)
[23:47] * rudolfsteiner (~federicon@200.68.116.185) has joined #ceph
[23:52] <gregmark> Hi folks. Getting this error when I try to upload image to Glance using rbd: http://pastebin.com/TkrCWigt
[23:53] <gregmark> RBD settings work fine with Cinder… all my VM volumes are happily stored here. I haven't seen much discussion on that error anywhere
[23:56] * jluis (~JL@89.181.148.68) Quit (Ping timeout: 480 seconds)
[23:57] <loicd> sage: in http://wiki.ceph.com/01Planning/02Blueprints/Emperor/osd:_object_redirects when you write "forward" you mean the same as when you write "send EAGAIN with redirect metadata" ?
[23:59] <joshd> gregmark: you need to install the python-ceph package
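(A sketch of joshd's suggestion, assuming a Debian/Ubuntu glance-api host; the pool and user names in the config excerpt are only examples:

    apt-get install python-ceph
    service glance-api restart

    # relevant glance-api.conf excerpt (values are examples):
    # default_store = rbd
    # rbd_store_user = images
    # rbd_store_pool = images
    # rbd_store_ceph_conf = /etc/ceph/ceph.conf
)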

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.