#ceph IRC Log

IRC Log for 2015-07-20

Timestamps are in GMT/BST.

[0:07] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) has joined #ceph
[0:28] * yanzheng (~zhyan@182.139.205.112) has joined #ceph
[0:29] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[0:32] * bitserker (~toni@188.87.126.67) has joined #ceph
[0:34] * rendar (~I@host146-227-dynamic.116-80-r.retail.telecomitalia.it) Quit ()
[0:36] * stiopa (~stiopa@cpc73828-dals21-2-0-cust630.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[0:42] * bitserker (~toni@188.87.126.67) Quit (Quit: Leaving.)
[0:43] * Cybertinus (~Cybertinu@cybertinus.customer.cloud.nl) Quit (Remote host closed the connection)
[0:43] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) Quit (Remote host closed the connection)
[0:44] * Cybertinus (~Cybertinu@cybertinus.customer.cloud.nl) has joined #ceph
[0:44] * yanzheng (~zhyan@182.139.205.112) Quit (Quit: This computer has gone to sleep)
[0:46] * Concubidated (~Adium@71.21.5.251) Quit (Quit: Leaving.)
[1:00] * yanzheng (~zhyan@182.139.205.112) has joined #ceph
[1:05] * yanzheng (~zhyan@182.139.205.112) Quit ()
[1:13] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[1:13] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[1:14] * scott__ (~scott@69.65.17.194) has joined #ceph
[1:19] * oms101 (~oms101@p20030057EA5E1D00EEF4BBFFFE0F7062.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:20] * capri (~capri@212.218.127.222) has joined #ceph
[1:23] * capri_on (~capri@212.218.127.222) Quit (Ping timeout: 480 seconds)
[1:27] * gucki (~smuxi@212-51-155-85.fiber7.init7.net) Quit (Ping timeout: 480 seconds)
[1:27] * oms101 (~oms101@p20030057EA56A200EEF4BBFFFE0F7062.dip0.t-ipconnect.de) has joined #ceph
[1:42] * yghannam (~yghannam@0001f8aa.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:48] * scott__ (~scott@69.65.17.194) Quit (Quit: Lost terminal)
[1:51] * rlrevell (~leer@184.52.129.221) Quit (Read error: Connection reset by peer)
[1:57] * thansen (~thansen@63-248-145-253.static.layl0103.digis.net) has joined #ceph
[2:00] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[2:00] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[2:14] * nsoffer (~nsoffer@bzq-109-65-255-114.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[2:16] * rotbeard (~redbeard@2a02:908:df18:6480:76f0:6dff:fe3b:994d) Quit (Ping timeout: 480 seconds)
[2:20] * primechuck (~primechuc@173-17-128-216.client.mchsi.com) has joined #ceph
[2:25] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[2:32] * frednass (~fred@dn-infra-12.lionnois.univ-lorraine.fr) has joined #ceph
[2:32] * frednass1 (~fred@dn-infra-12.lionnois.univ-lorraine.fr) Quit (Read error: Connection reset by peer)
[2:35] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[2:35] * lucas1 (~Thunderbi@218.76.52.64) has joined #ceph
[2:41] * lucas1 (~Thunderbi@218.76.52.64) Quit (Quit: lucas1)
[2:48] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[2:50] * lucas1 (~Thunderbi@218.76.52.64) has joined #ceph
[3:02] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[3:02] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[3:15] * OutOfNoWhere (~rpb@199.68.195.101) Quit (Ping timeout: 480 seconds)
[3:19] * fam_away is now known as fam
[3:21] * frednass1 (~fred@dn-infra-12.lionnois.univ-lorraine.fr) has joined #ceph
[3:21] * frednass (~fred@dn-infra-12.lionnois.univ-lorraine.fr) Quit (Read error: Connection reset by peer)
[3:29] * yanzheng (~zhyan@182.139.205.112) has joined #ceph
[3:37] <rkeene> So, the Sourceforge/Slashdot outage was related to Ceph ?
[3:42] * wink (~wink@218.30.116.10) has joined #ceph
[3:42] <via> i too am interested in what the failure mode was
[3:42] <via> but i doubt they made it public
[3:43] <rkeene> Maybe all their backend OSDs were hosted by a single array that died ?
[3:43] * fam is now known as fam_away
[3:44] <via> a single array?
[3:44] * Debesis (~0x@169.171.46.84.mobile.mezon.lt) Quit (Ping timeout: 480 seconds)
[3:44] <rkeene> Yeah
[3:44] <rkeene> Surely you've heard of storage arrays ? They're typically attached to Storage Area Networks ?
[3:44] <doppelgrau> when I read between the lines correctly (http://sourceforge.net/blog/sourceforge-infrastructure-and-service-restoration/) my guess is that a few OSDs (a single storage node?) delivered wrong data (either a broken disk controller or something like that)
[3:45] <doppelgrau> => for every PG where it had been primary, wrong data in the rbd images
[3:45] <doppelgrau> …
[3:45] * fam_away is now known as fam
[3:45] <via> rkeene: i've heard of sans and storage arrays of course, but i can't imagine how someone would be using one for ceph
[3:45] <via> seems silly
[3:45] <rkeene> via, https://www.reddit.com/r/sysadmin/comments/3do9k0/sourceforge_is_down_due_to_storage_problems_no_eta/ct77o49
[3:47] <doppelgrau> It's only my guess, but I think that's not totally unreasonable (I use XFS because it's more mature, but btrfs seems interesting since it uses hashes to verify reads)
[3:47] <via> rkeene: is there any reason to believe that that is someone from sourceforge?
[3:48] <rkeene> via, Nope
[3:48] * baotiao (~baotiao@218.30.116.7) has joined #ceph
[3:49] <via> i would like to think whoever decided to set up ceph would know it would be really dumb to try to use something like that as a storage node
[3:49] <rkeene> But if that's where your disks are and you are not funded to get new disks then it doesn't matter.
[3:49] <via> no, but in that case why use ceph
[3:50] <via> since the technology he described was providing network accessible volumes
[3:50] <rkeene> So you can provide RBD storage to VMs
[3:50] <via> and they were using ceph rbd volumes
[3:50] <via> you only need one or the other
[3:50] <via> if they were using a proprietary san solution that provided iscsi, they wouldn't be using rbd
[3:50] <rkeene> You need disks to provide RBD volumes -- with the EMC array your storage options are more manual
[3:51] <rkeene> If you were using iSCSI then you'd have to use the kernel to talk to the iSCSI target as an iSCSI initiator, it's a lot more work
[3:51] <via> er...how do you think rbd works?
[3:51] <rkeene> And they may have had hopes that one day they could move their disks
[3:51] <rkeene> QEMU can talk RBD/Ceph directly in userspace.
[3:51] <via> true
[3:51] <via> but
[3:51] <rkeene> Which is most likely what they are doing
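
(A minimal sketch of what rkeene describes, QEMU attaching an RBD image through librbd in userspace rather than through a kernel iSCSI initiator; the pool, image, and user names are placeholders, not anything from the SourceForge setup.)

    # guest disk served straight from the cluster via librbd
    qemu-system-x86_64 -m 2048 \
        -drive file=rbd:rbd/vm-disk-1:id=admin:conf=/etc/ceph/ceph.conf,format=raw,if=virtio
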
[3:52] <via> setting up ceph to purely avoid using iscsi
[3:52] <via> would be insane
[3:52] <via> and thus unlikely
[3:52] <rkeene> Setting up Ceph is just as easy as setting up iSCSI
[3:52] <doppelgrau> via: I agree
[3:52] <rkeene> With the added bonus that you could move off the array while entirely online at some point in the future
[3:52] <via> i strongly disagree
[3:52] <rkeene> If I were in that boat, I would look at setting up Ceph
[3:53] <rkeene> Why ? Everything I said is correct.
[3:53] <via> iscsi is pretty trivial to implement and understand
[3:53] <via> its conceptually easy
[3:53] <rkeene> If you are using iSCSI then to move off you have to make the controller perform a handoff
[3:54] <via> nobody would layer ceph on top of a storage solution to make that storage solution easier, it just makes no sense
[3:54] <rkeene> iSCSI is not trivial to implement, between iSCSI multipathing and array configuration options, it's a real annoyance -- plus provisioning all your LUNs of random sizes
[3:54] <rkeene> I would
[3:54] <wkennington> how do i upgrade from 0.94.2 to 9.0.2?
[3:54] <via> well, then maybe you'll host the next massive ceph outage :p
[3:54] <rkeene> If I planned to get rid of that storage array soon, there's no reason not to
[3:54] <wkennington> i updated one of the mons but it isn't joining back in
[3:55] <doppelgrau> rkeene: sounds like a good way to hell to me (much performance overhead, multiple network layers => latency increases)
[3:55] <doppelgrau> wkennington: from 9.4 to 9.0? thats a downgrade
[3:55] <wkennington> 0.94.2
[3:55] <wkennington> not 9.4
[3:57] <rkeene> doppelgrau, Mostly you're bound by the latency of your journal -- if you want to speed things up, get a local journal and you could probably outpace the remote storage (for a while, obviously, once the journal is full it can only be added to as fast as it can be dequeued)
[3:57] <rkeene> If you wanted it to be really fast and you didn't care about your data at all, put your journal on tmpfs :-D
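
(A hedged ceph.conf sketch of the journal placement being discussed, for a FileStore-era OSD; the paths are illustrative, and the tmpfs variant is strictly for throwaway benchmarks since the journal contents vanish on reboot or power loss.)

    [osd]
        osd journal size = 5120                               ; MB
        ; local journal on a fast device (example path)
        osd journal = /var/lib/ceph/osd/$cluster-$id/journal
        ; benchmark-only: journal on tmpfs, data is not safe
        ;osd journal = /dev/shm/osd.$id.journal
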
[3:58] <doppelgrau> via, rkeene: I think some unoticed data corruption is the most likely option in my eyes. With xfs or ext4 that can easily happen
[3:58] <via> regardless of the performance issues, what you're describing sounds like a maintenance nightmare. it's like creating partitions inside partitions so that you can carve lvm LVs out of them and then remount them via iscsi
[3:58] <rkeene> If it were data corruption then you'd see the scrubs and deep scrubs reporting errors
[3:58] <via> doppelgrau: yeah, but i would like to think that couldn't cause an outage of those proportions
[3:58] <rkeene> And it wouldn't likely affect every RBD image as they say it did
[3:59] <doppelgrau> wkennington: sorry, didn't read that correctly. Sorry no idea, using only stable versions
[3:59] <via> i do wish ceph didn't have to rely on the filesystem for silent corruption prevention
[3:59] <via> since nobody is going to use btrfs in production
[3:59] <rkeene> via, It's no more difficult than doing iSCSI/FC and Ceph -- except that you only setup the iSCSI/FC once and allocate all your storage to all your storage nodes
[4:00] <rkeene> I tried to use BtrFS in production, it was too slow and too buggy
[4:00] <doppelgrau> via, rkeene: why not? if the source of trouble is a large controller 16 or 24 OSDs can be affected => the chance that each rbd image has at least one PG with a primary copy on one of these OSDs is very high
[4:00] <rkeene> via, Isn't that the point of the deep scrub ?
[4:01] <via> doppelgrau: i would like to think crush rules would be set up so that all replicas in PG did not end up on one host
[4:01] <doppelgrau> rkeene: but deep scrub only detects it at some point in the future; the "client" had operated on data with failures for up to one week
[4:01] <via> rkeene: if you have two replicas and a deep scrub finds that they are different...i don't believe there's much it can do
[4:02] <doppelgrau> via: but if you read data from primary with errors, later write back ..
[4:02] <rkeene> That's why you would have 3 replicas :-D
[4:02] <doppelgrau> via: then the other copies are filled with the garbage too
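
(For reference, a hedged sketch of the knobs under discussion, assuming a pool named rbd and a placeholder PG id; note that repair in this era generally restores from the primary's copy, which is exactly the propagation problem doppelgrau describes.)

    ceph osd pool set rbd size 3        # keep three replicas of every object
    ceph osd pool set rbd min_size 2    # require two live replicas before acking writes
    ceph pg deep-scrub 2.1f             # read and compare all replicas of one PG
    ceph pg repair 2.1f                 # only if the scrub flags the PG inconsistent
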
[4:02] * shang (~ShangWu@175.41.48.77) has joined #ceph
[4:02] <via> rkeene: i can't speak to FC, but thats completely independent of iscsi vs ceph. iscsi is a 5 minute set up activity, ceph... is not
[4:02] <rkeene> via, Ceph takes me less than 5 minutes to setup
[4:02] <via> and i imagine the iscsi targets are done by the san
[4:02] <rkeene> (But I do it manually)
[4:03] <doppelgrau> and they say they compare "local" data with backups, so they expect some more or less random files to be broken
[4:03] <via> i mean, i don't want to say i don't believe you
[4:03] <via> but i don't believe you
[4:03] <rkeene> Part of the product I sell is a "Storage Cloud" that uses Ceph
[4:03] <via> unless you have config management set up to be creating and tearing down ceph clusters on the fly somehow
[4:04] <via> in which case thats not really at all fair to compare...
[4:04] <rkeene> You boot up a node, it adds all its disks to the Ceph cluster
[4:04] <via> since you could do the exact same thing with any system
[4:04] <rkeene> (It PXE boots, if you're on the storage network you boot as a Storage Cloud Node)
[4:04] * baotiao (~baotiao@218.30.116.7) Quit (Ping timeout: 480 seconds)
[4:05] <via> okay, in the case that you have invested many previous hours into being able to create ceph clusters, it doesn't take hours now to do it :p
[4:05] <rkeene> Doing it with iSCSI is a lot less meaningful, since it has no provisioning protocol
[4:05] <via> we're talking about a proprietary san that presents an iscsi interface
[4:05] <rkeene> Certainly I could invent one for iSCSI but it'd be limited to that one node and have no node-redundancy
[4:05] <via> i'm sure it does
[4:06] <doppelgrau> rkeene: simple example where deep scrub won't help you: the primary copy has (unnoticed) errors and a database operates on that => reads data, makes some modifications => writes. After the first write the error is also written to the other copies, so even with size=3 you'll have problems getting the data right again
[4:06] <rkeene> The array has a provisioning protocol, which is not part of iSCSI
[4:06] <rkeene> doppelgrau, Can you turn on read-checksum-verify ?
[4:07] <via> what checksum is it verifying
[4:07] <doppelgrau> rkeene: only with zfs as far as I know, and I'm unhappy with that
[4:07] <via> the point is there is no checksum
[4:07] <rkeene> Ah, I see -- the deep scrub is just verifying data between replicas
[4:08] <via> yes, i consider it a bit of a flaw
[4:08] <rkeene> doppelgrau, Why are you unhappy with ZFS ? I've used it for ~7 years (not with Ceph) and I like it way more than BtrFS
[4:08] <rkeene> I've been thinking about trying Ceph on ZFS, I've got it all compiled up for my appliance but I haven't done it yet
[4:09] <doppelgrau> rkeene: sorry, meant btrfs
[4:09] <rkeene> Yeah, I've never been able to be happy with BtrFS -- I keep trying it and every time I'm disappointed
[4:09] <via> my experience with ceph and btrfs was that the btrfs fs had massive fragmentation issues over time
[4:09] <via> and the disk became very slow
[4:09] * shyu (~Shanzhi@119.254.120.66) has joined #ceph
[4:09] <rkeene> To be fair, almost all my kernel panics in Solaris were related to ZFS and I had a data loss incident with ZFS due to a bug
[4:10] <doppelgrau> rkeene: the same problem with (cheap) raid 1 where for performance reasons only one copy is read, the fault model is "fail silent", meaning if the drive returns data it has to be all right, but if the drive returns erroneous data…
[4:11] * doppelgrau hopes that some sort of read-checksums gets usable in the near future, either by btrfs getting mature enough or by other means
[4:12] * zhaochao (~zhaochao@111.161.77.231) has joined #ceph
[4:12] <rkeene> doppelgrau, Indeed -- this was the argument against SATA when SATA first appeared, FWIW, there was an article from the SCSI vendors called "More than an Interface -- SCSI vs. ATA" (disclaimer, I worked with one of the authors): http://pages.cs.wisc.edu/~remzi/Classes/838/Fall2001/Papers/scsi-ata.pdf
[4:14] <via> and now seagate is known for the worst drives
[4:14] <rkeene> They were then too :-)
[4:14] <via> i have heard many people say that the only difference between enterprise and consumer drives nowadays, if anything, is the controller firmware
[4:15] <rkeene> And the price
[4:15] <via> right
[4:15] <rkeene> And, most likely, the firmware is what does the checksumming
[4:15] <via> to be fair the checksumming on disk drives is kinda orthogonal to the fs checksumming
[4:16] <via> all hdd's as long as i've been using computers had some form of ecc
[4:16] <via> that the physical layer
[4:16] <via> at*
[4:16] <rkeene> SCSI drives traditionally had a lot more, and did sector remapping and all that
[4:16] <doppelgrau> rkeene: but even if scsi/sas have better internal ECC, if the hba is creating the errors that won't help
[4:17] * kefu (~kefu@114.86.210.64) has joined #ceph
[4:19] <rkeene> doppelgrau, Indeed, and if the host has memory failures it won't help; with more than a double-bit error even ECC won't help
[4:19] <doppelgrau> in the end the problem is that ceph has to trust the data from the filesystem, and the "usable" filesystems simply trust the data they get from the controller, and they trust the drive …
[4:21] <rkeene> And memory before that
[4:22] <doppelgrau> that's the reason why I'd bet on data corruption at the OSD-level, assuming they are sane enough to use size=3 (or larger)
[4:23] <doppelgrau> with size=2 the reason could be "bad luck - told you so"
[4:24] <rkeene> (If anyone cares, my appliance does size=1 initially and mirrors the disks internally, until you get at least 3 nodes, then it does size=3 and unmirrors the disks)
[4:35] * kefu_ (~kefu@120.204.164.217) has joined #ceph
[4:39] * kefu_ (~kefu@120.204.164.217) Quit (Max SendQ exceeded)
[4:40] * kefu_ (~kefu@120.204.164.217) has joined #ceph
[4:41] * shang (~ShangWu@175.41.48.77) Quit (Ping timeout: 480 seconds)
[4:42] * kefu (~kefu@114.86.210.64) Quit (Ping timeout: 480 seconds)
[4:46] * primechuck (~primechuc@173-17-128-216.client.mchsi.com) Quit (Remote host closed the connection)
[4:49] * kefu_ (~kefu@120.204.164.217) Quit (Max SendQ exceeded)
[4:49] * kefu (~kefu@120.204.164.217) has joined #ceph
[4:51] * shang (~ShangWu@42-73-69-115.EMOME-IP.hinet.net) has joined #ceph
[4:53] * vbellur (~vijay@122.171.200.74) Quit (Ping timeout: 480 seconds)
[4:55] * kefu (~kefu@120.204.164.217) Quit (Max SendQ exceeded)
[4:56] * kefu (~kefu@120.204.164.217) has joined #ceph
[5:00] * kefu_ (~kefu@114.86.210.64) has joined #ceph
[5:07] * kefu (~kefu@120.204.164.217) Quit (Ping timeout: 480 seconds)
[5:07] * kefu_ is now known as kefu
[5:41] * shang (~ShangWu@42-73-69-115.EMOME-IP.hinet.net) Quit (Read error: Connection reset by peer)
[5:45] * shang (~ShangWu@175.41.48.77) has joined #ceph
[5:51] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:51] * Vacuum__ (~Vacuum@88.130.200.58) has joined #ceph
[5:54] * jclm (~jclm@ip24-253-45-236.lv.lv.cox.net) has joined #ceph
[5:58] * jclm1 (~jclm@ip24-253-45-236.lv.lv.cox.net) has joined #ceph
[5:58] * Vacuum_ (~Vacuum@88.130.200.158) Quit (Ping timeout: 480 seconds)
[6:02] * amote (~amote@121.244.87.116) has joined #ceph
[6:04] * jclm (~jclm@ip24-253-45-236.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[6:07] * fam is now known as fam_away
[6:11] * Shnaw (~Heliwr@h-213.61.149.100.host.de.colt.net) has joined #ceph
[6:13] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Ping timeout: 480 seconds)
[6:18] * vbellur (~vijay@121.244.87.124) has joined #ceph
[6:31] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[6:41] * Shnaw (~Heliwr@9S0AACGGE.tor-irc.dnsbl.oftc.net) Quit ()
[6:46] * Phase (~KapiteinK@80.82.64.233) has joined #ceph
[6:46] * Phase is now known as Guest766
[6:48] * kefu is now known as kefu|afk
[6:48] * kefu|afk (~kefu@114.86.210.64) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[7:01] * essjayhch (~sid79416@ealing.irccloud.com) Quit (Ping timeout: 480 seconds)
[7:01] * immesys (sid44615@id-44615.charlton.irccloud.com) Quit (Ping timeout: 480 seconds)
[7:02] * essjayhch (sid79416@id-79416.ealing.irccloud.com) has joined #ceph
[7:02] * immesys (sid44615@id-44615.charlton.irccloud.com) has joined #ceph
[7:16] * Guest766 (~KapiteinK@80.82.64.233) Quit ()
[7:19] * kefu (~kefu@114.86.210.64) has joined #ceph
[7:20] * Concubidated (~Adium@66.87.144.220) has joined #ceph
[7:21] * fam_away is now known as fam
[7:34] * Concubidated (~Adium@66.87.144.220) Quit (Read error: Connection reset by peer)
[7:34] * Concubidated (~Adium@66.87.144.220) has joined #ceph
[7:38] * devicenull (~sid4013@id-4013.charlton.irccloud.com) Quit (Ping timeout: 480 seconds)
[7:38] * carmstrong (sid22558@id-22558.highgate.irccloud.com) Quit (Ping timeout: 480 seconds)
[7:39] * JohnPreston78 (sid31393@id-31393.charlton.irccloud.com) Quit (Ping timeout: 480 seconds)
[7:39] * Pintomatic (sid25118@id-25118.charlton.irccloud.com) Quit (Ping timeout: 480 seconds)
[7:41] * dopesong (~dopesong@78-61-129-9.static.zebra.lt) has joined #ceph
[7:42] * devicenull (sid4013@id-4013.charlton.irccloud.com) has joined #ceph
[7:43] * JohnPreston78 (~sid31393@id-31393.charlton.irccloud.com) has joined #ceph
[7:44] * carmstrong (sid22558@id-22558.highgate.irccloud.com) has joined #ceph
[7:49] * Pintomatic (~sid25118@charlton.irccloud.com) has joined #ceph
[7:49] * derjohn_mob (~aj@x590e22b2.dyn.telefonica.de) has joined #ceph
[7:56] * karnan (~karnan@121.244.87.117) has joined #ceph
[8:05] * overclk (~overclk@121.244.87.117) has joined #ceph
[8:05] * jclm1 (~jclm@ip24-253-45-236.lv.lv.cox.net) Quit (Ping timeout: 480 seconds)
[8:07] * rdas (~rdas@121.244.87.116) has joined #ceph
[8:08] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Ping timeout: 480 seconds)
[8:09] * nilshug (~nils@aftr-95-222-31-62.unity-media.net) has joined #ceph
[8:23] * Hemanth (~Hemanth@121.244.87.117) has joined #ceph
[8:30] * derjohn_mob (~aj@x590e22b2.dyn.telefonica.de) Quit (Remote host closed the connection)
[8:35] * derjohn_mob (~aj@x590e22b2.dyn.telefonica.de) has joined #ceph
[8:36] * Concubidated (~Adium@66.87.144.220) Quit (Quit: Leaving.)
[8:39] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[8:43] * sleinen (~Adium@2001:620:0:82::100) Quit (Ping timeout: 480 seconds)
[8:44] <T1w> mornings
[8:45] <T1w> does anyone have a few details as to what happened at sourceforge - ie. what affected all their rbds?
[8:45] * derjohn_mob (~aj@x590e22b2.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[8:47] * nsoffer (~nsoffer@bzq-109-65-255-114.red.bezeqint.net) has joined #ceph
[8:50] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[8:50] * nilshug (~nils@aftr-95-222-31-62.unity-media.net) Quit (Ping timeout: 480 seconds)
[8:50] * muni (~muni@bzq-218-200-252.red.bezeqint.net) has joined #ceph
[8:50] * cok (~chk@2a02:2350:18:1010:517:9345:aca0:246) has joined #ceph
[8:53] * sleinen (~Adium@2001:620:0:2d:7ed1:c3ff:fedc:3223) has joined #ceph
[8:53] * dopesong (~dopesong@78-61-129-9.static.zebra.lt) Quit (Remote host closed the connection)
[8:58] * Nacer (~Nacer@203-206-190-109.dsl.ovh.fr) Quit (Remote host closed the connection)
[8:59] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[9:01] * sleinen (~Adium@2001:620:0:2d:7ed1:c3ff:fedc:3223) Quit (Quit: Leaving.)
[9:05] * kefu_ (~kefu@120.204.164.217) has joined #ceph
[9:06] * derjohn_mob (~aj@x590cf7da.dyn.telefonica.de) has joined #ceph
[9:08] * gucki (~smuxi@31.24.15.113) has joined #ceph
[9:11] * kefu (~kefu@114.86.210.64) Quit (Ping timeout: 480 seconds)
[9:12] * rendar (~I@host170-180-dynamic.36-79-r.retail.telecomitalia.it) has joined #ceph
[9:14] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) Quit (Ping timeout: 480 seconds)
[9:14] * derjohn_mob (~aj@x590cf7da.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[9:15] * dgurtner (~dgurtner@178.197.231.188) has joined #ceph
[9:16] * kefu (~kefu@114.86.210.64) has joined #ceph
[9:19] * b0e (~aledermue@213.95.25.82) has joined #ceph
[9:20] * timfreund (~tim@ec2-54-209-140-45.compute-1.amazonaws.com) has joined #ceph
[9:22] * oro (~oro@84-73-73-158.dclient.hispeed.ch) has joined #ceph
[9:23] * shohn (~shohn@p57A1499B.dip0.t-ipconnect.de) has joined #ceph
[9:23] * kefu_ (~kefu@120.204.164.217) Quit (Ping timeout: 480 seconds)
[9:24] * linjan_ (~linjan@195.91.236.115) has joined #ceph
[9:25] * analbeard (~shw@support.memset.com) has joined #ceph
[9:27] * derjohn_mob (~aj@tmo-112-240.customers.d1-online.com) has joined #ceph
[9:30] * sleinen (~Adium@130.59.94.177) has joined #ceph
[9:31] * nsoffer (~nsoffer@bzq-109-65-255-114.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[9:31] * immesys (sid44615@id-44615.charlton.irccloud.com) Quit (Read error: Connection reset by peer)
[9:31] * kamalmarhubi (sid26581@highgate.irccloud.com) Quit (Read error: Connection reset by peer)
[9:31] * JohnPreston78 (~sid31393@id-31393.charlton.irccloud.com) Quit (Read error: Connection reset by peer)
[9:31] * ade (~abradshaw@tmo-111-112.customers.d1-online.com) has joined #ceph
[9:31] * Pintomatic (~sid25118@charlton.irccloud.com) Quit (Read error: Connection reset by peer)
[9:31] * ivotron (~sid25461@id-25461.brockwell.irccloud.com) Quit (Read error: Connection reset by peer)
[9:32] * gabrtv (sid36209@brockwell.irccloud.com) Quit (Read error: Connection reset by peer)
[9:35] * sleinen1 (~Adium@macsb.switch.ch) has joined #ceph
[9:36] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[9:37] * branto (~branto@178.253.161.174) has joined #ceph
[9:40] * immesys (sid44615@id-44615.charlton.irccloud.com) has joined #ceph
[9:40] * JohnPreston78 (~sid31393@id-31393.charlton.irccloud.com) has joined #ceph
[9:41] * Pintomatic (sid25118@id-25118.charlton.irccloud.com) has joined #ceph
[9:41] * sleinen (~Adium@130.59.94.177) Quit (Ping timeout: 480 seconds)
[9:44] * vbellur (~vijay@121.244.87.124) Quit (Ping timeout: 480 seconds)
[9:45] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[9:46] * kefu (~kefu@114.86.210.64) has joined #ceph
[9:48] * jcsp (~jspray@summerhall-meraki1.fluency.net.uk) has joined #ceph
[9:48] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) has joined #ceph
[9:49] * dneary (~dneary@AGrenoble-651-1-414-123.w90-52.abo.wanadoo.fr) has joined #ceph
[9:57] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[9:58] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Ping timeout: 480 seconds)
[9:58] * kefu (~kefu@114.86.210.64) has joined #ceph
[10:03] * ivotron (~sid25461@id-25461.brockwell.irccloud.com) has joined #ceph
[10:04] * gabrtv (~sid36209@id-36209.brockwell.irccloud.com) has joined #ceph
[10:10] * kamalmarhubi (sid26581@id-26581.highgate.irccloud.com) has joined #ceph
[10:16] <s3an2> T1w: nothing more than on the SF Blog
[10:16] <T1w> alas also the only thing I've seen
[10:16] * arcimboldo (~antonio@dhcp-y11-zi-s3it-130-60-34-042.uzh.ch) has joined #ceph
[10:20] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[10:21] * kefu (~kefu@114.86.210.64) has joined #ceph
[10:23] * dalgaaf (~uid15138@id-15138.charlton.irccloud.com) has joined #ceph
[10:23] * nardial (~ls@dslb-088-072-089-026.088.072.pools.vodafone-ip.de) has joined #ceph
[10:27] * jordanP (~jordan@scality-jouf-2-194.fib.nerim.net) has joined #ceph
[10:28] * fam is now known as fam_away
[10:28] * kefu (~kefu@114.86.210.64) Quit (Read error: Connection reset by peer)
[10:31] * fam_away is now known as fam
[10:32] * kefu (~kefu@114.86.210.64) has joined #ceph
[10:32] * vbellur (~vijay@121.244.87.124) has joined #ceph
[10:32] * kefu (~kefu@114.86.210.64) Quit ()
[10:35] * jordanP (~jordan@scality-jouf-2-194.fib.nerim.net) Quit (Ping timeout: 480 seconds)
[10:35] * kefu_ (~kefu@114.86.210.64) has joined #ceph
[10:38] * kefu_ (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[10:38] * yghannam (~yghannam@0001f8aa.user.oftc.net) has joined #ceph
[10:39] * kefu_ (~kefu@114.86.210.64) has joined #ceph
[10:39] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[10:45] * dopesong (~dopesong@mail.bvgroup.eu) has joined #ceph
[10:46] * jordanP (~jordan@213.215.2.194) has joined #ceph
[10:48] * oblu (~o@62.109.134.112) Quit (Ping timeout: 480 seconds)
[10:52] * derjohn_mob (~aj@tmo-112-240.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[10:55] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[10:55] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[11:05] * adrian (~abradshaw@tmo-113-110.customers.d1-online.com) has joined #ceph
[11:06] * adrian is now known as Guest790
[11:10] * ade (~abradshaw@tmo-111-112.customers.d1-online.com) Quit (Ping timeout: 480 seconds)
[11:12] * Guest790 (~abradshaw@tmo-113-110.customers.d1-online.com) Quit (Remote host closed the connection)
[11:13] * ade (~abradshaw@tmo-113-110.customers.d1-online.com) has joined #ceph
[11:15] * thomnico (~thomnico@cro38-2-88-180-16-18.fbx.proxad.net) has joined #ceph
[11:17] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Remote host closed the connection)
[11:21] * derjohn_mob (~aj@x590e0bf2.dyn.telefonica.de) has joined #ceph
[11:22] * Kioob`Taff (~plug-oliv@2a01:e35:2e8a:1e0::42:10) has joined #ceph
[11:22] * nardial (~ls@dslb-088-072-089-026.088.072.pools.vodafone-ip.de) Quit (Quit: Leaving)
[11:23] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[11:24] * shang (~ShangWu@175.41.48.77) Quit (Quit: Ex-Chat)
[11:24] * shang (~ShangWu@175.41.48.77) has joined #ceph
[11:25] * thomnico (~thomnico@cro38-2-88-180-16-18.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[11:28] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) has joined #ceph
[11:28] * derjohn_mobi (~aj@x590e1b2e.dyn.telefonica.de) has joined #ceph
[11:30] * derjohn_mob (~aj@x590e0bf2.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[11:34] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[11:35] * kefu_ is now known as kefu|afk
[11:35] * kefu|afk (~kefu@114.86.210.64) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[11:37] * derjohn_mobi (~aj@x590e1b2e.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[11:41] * kefu (~kefu@114.86.210.64) has joined #ceph
[11:50] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[11:54] * dopesong (~dopesong@mail.bvgroup.eu) Quit (Remote host closed the connection)
[11:56] * Vacuum_ (~Vacuum@i59F792BE.versanet.de) has joined #ceph
[12:03] * Vacuum__ (~Vacuum@88.130.200.58) Quit (Ping timeout: 480 seconds)
[12:04] * oblu (~o@62.109.134.112) has joined #ceph
[12:08] * dugravot6 (~dugravot6@dn-infra-04.lionnois.univ-lorraine.fr) has joined #ceph
[12:08] * capri_on (~capri@212.218.127.222) has joined #ceph
[12:11] * kefu is now known as kefu|afk
[12:11] * capri (~capri@212.218.127.222) Quit (Ping timeout: 480 seconds)
[12:11] * muni (~muni@bzq-218-200-252.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[12:12] * kefu|afk (~kefu@114.86.210.64) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[12:13] * JohnO (~MonkeyJam@ns316491.ip-37-187-129.eu) has joined #ceph
[12:15] * lcurtis (~lcurtis@2601:194:8101:7100:917b:a8ea:3f2b:128b) has joined #ceph
[12:15] * thomnico (~thomnico@cro38-2-88-180-16-18.fbx.proxad.net) has joined #ceph
[12:17] * bitserker (~toni@cli-5b7ef18e.bcn.adamo.es) has joined #ceph
[12:18] * lucas1 (~Thunderbi@218.76.52.64) Quit (Ping timeout: 480 seconds)
[12:19] * shyu (~Shanzhi@119.254.120.66) Quit (Remote host closed the connection)
[12:20] * Debesis (Debesis@169.171.46.84.mobile.mezon.lt) has joined #ceph
[12:25] * rburkholder (~overonthe@199.68.193.54) Quit (Read error: Connection reset by peer)
[12:26] * muni (~muni@bzq-218-200-252.red.bezeqint.net) has joined #ceph
[12:27] * oro (~oro@84-73-73-158.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[12:33] * derjohn_mobi (~aj@b2b-94-79-172-98.unitymedia.biz) has joined #ceph
[12:36] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) has joined #ceph
[12:38] * thomnico (~thomnico@cro38-2-88-180-16-18.fbx.proxad.net) Quit (Quit: Ex-Chat)
[12:38] * lcurtis (~lcurtis@2601:194:8101:7100:917b:a8ea:3f2b:128b) Quit (Ping timeout: 480 seconds)
[12:40] * kefu (~kefu@114.86.210.64) has joined #ceph
[12:43] * JohnO (~MonkeyJam@9S0AACGYV.tor-irc.dnsbl.oftc.net) Quit ()
[12:43] * xarses_ (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[12:46] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[12:47] * ira (~ira@c-71-233-225-22.hsd1.ma.comcast.net) has joined #ceph
[12:48] * lcurtis (~lcurtis@2601:194:8101:7100:917b:a8ea:3f2b:128b) has joined #ceph
[12:52] * nardial (~ls@dslb-088-072-089-026.088.072.pools.vodafone-ip.de) has joined #ceph
[12:53] * overclk (~overclk@121.244.87.117) Quit (Quit: Leaving)
[12:54] * shohn is now known as shohn_afk
[12:55] * dopesong (~dopesong@mail.bvgroup.eu) has joined #ceph
[12:57] * lcurtis (~lcurtis@2601:194:8101:7100:917b:a8ea:3f2b:128b) Quit (Ping timeout: 480 seconds)
[13:02] * kefu is now known as kefu|afk
[13:02] * kefu|afk (~kefu@114.86.210.64) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[13:03] * dopesong (~dopesong@mail.bvgroup.eu) Quit (Ping timeout: 480 seconds)
[13:04] * thomnico (~thomnico@cro38-2-88-180-16-18.fbx.proxad.net) has joined #ceph
[13:05] * dopesong (~dopesong@mail.bvgroup.eu) has joined #ceph
[13:08] * ifur (~osm@0001f63e.user.oftc.net) Quit (Quit: Lost terminal)
[13:09] * ifur (~osm@hornbill.csc.warwick.ac.uk) has joined #ceph
[13:10] * nsoffer (~nsoffer@nat-pool-tlv-t.redhat.com) has joined #ceph
[13:10] * karnan (~karnan@121.244.87.117) Quit (Remote host closed the connection)
[13:12] * kefu (~kefu@114.86.210.64) has joined #ceph
[13:12] * rburkholder (~overonthe@199.68.193.62) has joined #ceph
[13:13] * zhaochao (~zhaochao@111.161.77.231) Quit (Quit: ChatZilla 0.9.91.1 [Iceweasel 38.1.0/20150711212448])
[13:14] * ifur (~osm@0001f63e.user.oftc.net) Quit (Quit: leaving)
[13:15] * shohn_afk is now known as shohn
[13:16] * ifur (~osm@0001f63e.user.oftc.net) has joined #ceph
[13:26] * nils_ (~nils@doomstreet.collins.kg) has joined #ceph
[13:31] * fam is now known as fam_away
[13:32] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[13:33] * kefu (~kefu@114.86.210.64) has joined #ceph
[13:37] * totalwormage (~cryptk@162.216.46.160) has joined #ceph
[13:41] * overclk (~overclk@59.93.226.16) has joined #ceph
[13:47] * Concubidated (~Adium@66-87-144-220.pools.spcsdns.net) has joined #ceph
[13:52] * vbellur (~vijay@121.244.87.124) Quit (Ping timeout: 480 seconds)
[14:00] * dynamicudpate (~overonthe@199.68.193.54) has joined #ceph
[14:01] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[14:02] * kefu (~kefu@114.86.210.64) has joined #ceph
[14:02] * dugravot6 (~dugravot6@dn-infra-04.lionnois.univ-lorraine.fr) Quit (Quit: Leaving.)
[14:05] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[14:07] * totalwormage (~cryptk@162.216.46.160) Quit ()
[14:07] * rburkholder (~overonthe@199.68.193.62) Quit (Ping timeout: 480 seconds)
[14:08] * sleinen1 (~Adium@macsb.switch.ch) Quit (Ping timeout: 480 seconds)
[14:09] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[14:11] * linjan (~linjan@195.91.236.115) has joined #ceph
[14:11] * overclk (~overclk@59.93.226.16) Quit (Remote host closed the connection)
[14:11] * overclk (~overclk@59.93.226.16) has joined #ceph
[14:12] * danieagle (~Daniel@187.10.25.6) has joined #ceph
[14:14] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[14:14] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[14:14] * cdelatte (~cdelatte@cpe-172-72-105-98.carolina.res.rr.com) has joined #ceph
[14:16] * linjan_ (~linjan@195.91.236.115) Quit (Ping timeout: 480 seconds)
[14:21] * kawa2014 (~kawa@89.184.114.246) Quit (Ping timeout: 480 seconds)
[14:22] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[14:25] * jo00nas (~jonas@188-183-5-254-static.dk.customer.tdc.net) has joined #ceph
[14:25] * jo00nas (~jonas@188-183-5-254-static.dk.customer.tdc.net) Quit ()
[14:28] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[14:28] * freman (~freman@griffin.seedboxes.cc) has joined #ceph
[14:29] <freman> hi! anyone here that can maybe answer a quick ceph-deploy question?
[14:29] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) Quit (Read error: Connection reset by peer)
[14:29] <alfredodeza> I would always ask the question and hope someone can answer it at some point :)
[14:29] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[14:31] * kefu (~kefu@114.86.210.64) has joined #ceph
[14:31] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) has joined #ceph
[14:32] <freman> well, that would be my next step. When I run ceph-deploy install [node], ceph-deploy tries to get rpm to import the release.asc file, even if the file is already imported. Problem is we are behind a proxy that limits access to the file from the server network. And we have tried to specify a local url in the cephdeploy.conf but it always goes to the ceph.com/git URL anyway
[14:33] <alfredodeza> freman: can you share your cephdeploy.conf file in a paste site?
[14:33] <freman> sure 1 second
[14:33] * jks (~jks@178.155.151.121) has joined #ceph
[14:34] * KristopherBel (~Jamana@5NZAAFA2E.tor-irc.dnsbl.oftc.net) has joined #ceph
[14:35] <freman> http://pastebin.com/TcegaCTc
[14:36] * wink (~wink@218.30.116.10) Quit (Ping timeout: 480 seconds)
[14:36] <alfredodeza> freman: and how are you calling the install sub command?
[14:36] * alfredodeza already spotted a couple of things
[14:36] * kefu (~kefu@114.86.210.64) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[14:37] <freman> alfredodeza: I just run ceph-deploy install [node]
[14:37] <freman> alfredodeza: and then the output states it found the cephdeploy.conf file, but nothing changes.
[14:38] <alfredodeza> ok, so ceph-deploy can't infer what repo you want from the config if you don't either specify it via a flag or mark it with the 'default = True' value
[14:39] <alfredodeza> freman: the default value is documented here: http://ceph.com/ceph-deploy/docs/conf.html#default-repository
[14:39] <alfredodeza> otherwise you would use the `--release` flag to point to a section. So in your case, your repo section is called cephrepo, so you would do something like `ceph-deploy install --release cephrepo [node]`
[14:39] <freman> ah, yes. I added default = true and now it works :D thanks a bunch
[14:40] <alfredodeza> so a bit of advice though with that flag: it has happened to me a lot of times where I expect to get something different and forget that value is in there, so it uses different defaults than what I expect :)
[14:41] <freman> Yeah good tip :) thanks for the assist :)
[14:41] <alfredodeza> no problem!
[14:42] * oro (~oro@2001:620:20:16:a91d:bef5:54a:5f41) has joined #ceph
[14:43] * treenerd (~treenerd@85.193.140.98) has joined #ceph
[14:43] * treenerd (~treenerd@85.193.140.98) Quit ()
[14:44] * linjan (~linjan@195.91.236.115) Quit (Ping timeout: 480 seconds)
[14:45] * treenerd (~treenerd@85.193.140.98) has joined #ceph
[14:46] * oblu (~o@62.109.134.112) Quit (Quit: ~)
[14:53] * DV__ (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[14:56] * cok (~chk@2a02:2350:18:1010:517:9345:aca0:246) Quit (Quit: Leaving.)
[14:56] <freman> alfredodeza: maybe you can help with another strange issue ;) now that it finally installs on the node. Why on earth does it install firefly when the config states hammer?
[14:58] <alfredodeza> freman: what distro is this? can you share the full output of the install command (on a paste site) ?
[14:58] * nsoffer (~nsoffer@nat-pool-tlv-t.redhat.com) Quit (Ping timeout: 480 seconds)
[14:59] * DV__ (~veillard@2001:41d0:1:d478::1) has joined #ceph
[14:59] * jrankin (~jrankin@d53-64-170-236.nap.wideopenwest.com) has joined #ceph
[14:59] <freman> alfredodeza: http://pastebin.com/9PaUA1S2
[15:00] <freman> centos 7.1 distro
[15:00] * t0rn (~ssullivan@2607:fad0:32:a02:56ee:75ff:fe48:3bd3) has joined #ceph
[15:01] <alfredodeza> freman: line 118
[15:01] * t0rn (~ssullivan@2607:fad0:32:a02:56ee:75ff:fe48:3bd3) has left #ceph
[15:01] <alfredodeza> --> ceph x86_64 1:0.80.7-0.4.el7 epel 9.4 M
[15:01] <alfredodeza> it is coming from epel
[15:01] <alfredodeza> would you mind sending the full output (it seems the log was cut off from the beginning)
[15:01] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[15:01] * sjm (~sjm@pool-100-1-115-73.nwrknj.fios.verizon.net) has joined #ceph
[15:02] * wschulze (~wschulze@cpe-69-206-240-164.nyc.res.rr.com) has joined #ceph
[15:02] <freman> yeah sure. 1 second.
[15:03] * arbrandes (~arbrandes@179.210.13.90) has joined #ceph
[15:03] * sjm (~sjm@pool-100-1-115-73.nwrknj.fios.verizon.net) Quit (Read error: Connection reset by peer)
[15:04] * KristopherBel (~Jamana@5NZAAFA2E.tor-irc.dnsbl.oftc.net) Quit ()
[15:04] <freman> alfredodeza: http://pastebin.com/Y7cLQ4Lw
[15:05] * kawa2014 (~kawa@89.184.114.246) Quit (Ping timeout: 480 seconds)
[15:06] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit ()
[15:06] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[15:07] * rlrevell (~leer@vbo1.inmotionhosting.com) has joined #ceph
[15:08] <alfredodeza> freman: ok, I have a solution for you
[15:08] <alfredodeza> :)
[15:08] * bene (~ben@nat-pool-bos-t.redhat.com) has joined #ceph
[15:08] <freman> alfredodeza: whats it gonna cost me? :)
[15:08] <alfredodeza> so what happens is that you are installing ceph on a box that has epel enabled and epel itself has a version of ceph that is preferred by yum
[15:09] <alfredodeza> what needs to happen is that the repo file that ceph-deploy is installing needs to have a higher priority
[15:09] <alfredodeza> this is surely a (documentation) bug
[15:10] <alfredodeza> but you need to ensure that the `priority` key is set in your cephdeploy repo section
[15:10] <alfredodeza> so in the `[cephrepo]` section, add this: `priority = 1`
[15:10] * nsoffer (~nsoffer@nat-pool-tlv-u.redhat.com) has joined #ceph
[15:10] <alfredodeza> and try again
[15:11] <alfredodeza> (make sure you uninstall ceph before that though)
[15:11] * kefu (~kefu@114.86.210.64) has joined #ceph
[15:11] <freman> ok, trying again :)
[15:11] * kefu is now known as kefu|afk
[15:12] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[15:13] * DV__ (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[15:14] * kefu|afk is now known as kefu
[15:15] <freman> alfredodeza: still installed firefly ... I set priority = 1 in the cephdeploy.conf and in the yum.repos.d/ceph.repo
[15:15] <freman> hang on 1 second
[15:15] * sankarshan_ (~sankarsha@121.244.87.117) has joined #ceph
[15:15] * overclk (~overclk@59.93.226.16) Quit (Quit: Leaving...)
[15:16] <alfredodeza> you should look for a 'ensuring cephrepo contains a high priority' in your logs
[15:17] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[15:17] * marrusl (~mark@cpe-67-247-9-253.nyc.res.rr.com) has joined #ceph
[15:17] <freman> yes, but that file was not in yum.repos.d I added it.
[15:18] <freman> alfredodeza: pfft. still firefly :P
[15:18] * jclm (~jclm@ip24-253-45-236.lv.lv.cox.net) has joined #ceph
[15:18] * sankarshan (~sankarsha@121.244.87.117) Quit (Ping timeout: 480 seconds)
[15:18] <alfredodeza> what do you mean by the 'file was not in yum.repos.d'
[15:18] <alfredodeza> what file
[15:19] <freman> alfredodeza: [seldceph02][WARNIN] ensuring that /etc/yum.repos.d/cephrepo.repo contains a high priority
[15:19] <freman> [seldceph02][WARNIN] altered cephrepo.repo priorities to contain: priority=1
[15:19] <freman> [seldceph02][INFO ] Running command: sudo yum -y install ceph
[15:19] <alfredodeza> did it install the yum priorities plugin?
[15:20] <freman> yes [seldceph02][DEBUG ] Loaded plugins: fastestmirror, priorities
[15:20] <alfredodeza> and why did you add a `priority = 1` to ceph.repo? you need to remove that file and just use whatever ceph-deploy added
[15:21] <alfredodeza> which is (I think) cephrepo.repo
[15:21] <alfredodeza> and also, ensure that you have removed ceph before attempting to install again
[15:21] <alfredodeza> you can also troubleshoot this with yum
[15:21] * primechuck (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[15:21] <freman> yepp, removed the alternate repo file and ran uninstall/purge on the ceph node.
[15:21] <alfredodeza> see where ceph is coming from etc....
[15:22] * sjm (~sjm@pool-100-1-115-73.nwrknj.fios.verizon.net) has joined #ceph
[15:23] * DV__ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[15:23] <freman> alfredodeza: Name : ceph-common
[15:23] <freman> Arch : x86_64
[15:23] <freman> Epoch : 1
[15:23] <freman> Version : 0.80.7
[15:23] <freman> Release : 2.el7
[15:24] <freman> Size : 18 M
[15:24] <freman> Repo : installed
[15:24] <freman> From repo : base
[15:24] <freman> Summary : Ceph Common
[15:24] <freman> URL : http://ceph.com/
[15:24] <freman> output from yum info ceph-common
[15:25] <alfredodeza> so that is coming from the base repo? :/
[15:25] <freman> for some reason yes.
[15:26] * yanzheng (~zhyan@182.139.205.112) Quit (Quit: This computer has gone to sleep)
[15:27] <freman> I am not a centos man, i usually run ubuntu. how can ceph be in the normal repo?
[15:27] * yanzheng (~zhyan@182.139.205.112) has joined #ceph
[15:28] * tupper (~tcole@173.38.117.66) has joined #ceph
[15:28] <alfredodeza> can you run a `yum clean all`, ensure that `priority=1` exists in the cephrepo.repo file, and try again querying yum to see where ceph is coming from?
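
(One way to do that check from the node, a sketch using stock yum commands; nothing here is specific to freman's setup.)

    sudo yum clean all
    grep priority /etc/yum.repos.d/cephrepo.repo    # expect priority=1
    yum --showduplicates list ceph                  # every repo offering a ceph package, with versions
    yum info ceph                                   # the version/repo a plain `yum install ceph` would pick
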
[15:29] * vbellur (~vijay@122.171.200.74) has joined #ceph
[15:30] * harold (~hamiller@71-94-227-123.dhcp.mdfd.or.charter.com) has joined #ceph
[15:30] <alfredodeza> OH
[15:30] <alfredodeza> 'ceph repo noarch'
[15:30] <alfredodeza> that is surely wrong :)
[15:30] <alfredodeza> ceph is definitely not noarch
[15:31] <freman> should it be x86_64/?
[15:31] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[15:32] * kefu (~kefu@114.86.210.64) has joined #ceph
[15:32] <alfredodeza> you should have a $basearch at the end
[15:32] <alfredodeza> one sec
[15:33] <alfredodeza> freman: baseurl=http://ceph.com/rpm-hammer/rhel7/$basearch
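
(Putting the thread together, a hedged sketch of the cephdeploy.conf repo section that ends up working here: a baseurl pointing at the arch-specific hammer directory instead of noarch, plus the priority and default keys discussed above. The section name matches freman's paste; mirror.example.local stands in for whatever local copy of release.asc is reachable behind the proxy.)

    [cephrepo]
    name = Ceph hammer
    baseurl = http://ceph.com/rpm-hammer/rhel7/$basearch
    gpgkey = http://mirror.example.local/ceph/release.asc   ; local copy, ceph.com/git is blocked by the proxy
    default = True      ; used by a plain `ceph-deploy install <node>`
    priority = 1        ; outrank the older ceph packages in epel/base
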
[15:34] * nardial (~ls@dslb-088-072-089-026.088.072.pools.vodafone-ip.de) Quit (Remote host closed the connection)
[15:36] * treenerd (~treenerd@85.193.140.98) Quit (Quit: Verlassend)
[15:36] <freman> Available Packages
[15:36] <freman> Name : ceph
[15:36] <freman> Arch : x86_64
[15:36] <freman> Epoch : 1
[15:36] <freman> Version : 0.94.2
[15:36] <freman> Release : 0.el7.centos
[15:36] <freman> Size : 20 M
[15:36] <freman> Repo : ceph-noarch/x86_64
[15:36] <freman> Summary : User space components of the Ceph file system
[15:36] <freman> URL : http://ceph.com/
[15:37] <freman> License : GPL-2.0
[15:37] <freman> Yes :) now it shows 0.94.2 :)
[15:37] * harold (~hamiller@71-94-227-123.dhcp.mdfd.or.charter.com) Quit (Quit: Leaving)
[15:38] * yanzheng (~zhyan@182.139.205.112) Quit (Quit: This computer has gone to sleep)
[15:41] <freman> alfredodeza: [ceph@seldceph02 ~]$ sudo ceph --version
[15:41] <freman> ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
[15:41] <freman> [ceph@seldceph02 ~]$ it works! you my man deserve a beer :)
[15:41] <alfredodeza> freman: glad it is working for you, sorry for all the woes
[15:42] * alfredodeza will create a ticket to better explain what happens with cephdeploy.conf repo sections
[15:42] * ade (~abradshaw@tmo-113-110.customers.d1-online.com) Quit (Quit: Too sexy for his shirt)
[15:42] <freman> no problems, also I am sure the ceph howto describes noarch as default value on the repofiles.
[15:43] <analbeard> hi guys, quick question - i've been seeing an issue whereby i've just restarted all the osds in a host individually and when they come back up they end up in the default bucket (which doesn't exist until they come back up and then it gets created). this then makes the cluster go nuts rebalancing when it doesn't need to and degrades the client performance horrendously
[15:43] <analbeard> any ideas?
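
(A hedged ceph.conf sketch of the settings that usually govern this: by default ceph-osd updates its own CRUSH location at startup and, without an explicit location, places itself under host=<hostname> root=default, recreating that root if it is missing. The host value below is a placeholder.)

    [osd]
        ; stop ceph-osd from moving itself in the CRUSH map at startup
        osd crush update on start = false
        ; or pin an explicit location instead (example values)
        ;osd crush location = root=default host=storage01
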
[15:44] * Concubidated (~Adium@66-87-144-220.pools.spcsdns.net) Quit (Quit: Leaving.)
[15:45] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[15:46] * kefu (~kefu@114.86.210.64) has joined #ceph
[15:47] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[15:48] * georgem (~Adium@206.108.127.32) has joined #ceph
[15:52] * nsoffer (~nsoffer@nat-pool-tlv-u.redhat.com) Quit (Ping timeout: 480 seconds)
[15:54] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[16:00] * rwheeler (~rwheeler@nat-pool-rdu-u.redhat.com) has joined #ceph
[16:02] * derjohn_mobi (~aj@b2b-94-79-172-98.unitymedia.biz) Quit (Remote host closed the connection)
[16:04] * Hemanth (~Hemanth@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:04] * nsoffer (~nsoffer@nat-pool-tlv-t.redhat.com) has joined #ceph
[16:06] * derjohn_mob (~aj@b2b-94-79-172-98.unitymedia.biz) has joined #ceph
[16:06] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[16:06] * kefu (~kefu@114.86.210.64) has joined #ceph
[16:07] * dopesong (~dopesong@mail.bvgroup.eu) Quit (Remote host closed the connection)
[16:08] * squizzi (~squizzi@nat-pool-rdu-t.redhat.com) has joined #ceph
[16:09] * bene2 (~ben@nat-pool-bos-t.redhat.com) has joined #ceph
[16:09] * dgurtner_ (~dgurtner@178.197.231.188) has joined #ceph
[16:10] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[16:10] * dgurtner (~dgurtner@178.197.231.188) Quit (Ping timeout: 480 seconds)
[16:11] * kefu (~kefu@114.86.210.64) has joined #ceph
[16:11] <zenpac> jscp: you have time to talk?
[16:13] * bene (~ben@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[16:13] * wushudoin (~wushudoin@2601:646:8201:7769:2ab2:bdff:fe0b:a6ee) has joined #ceph
[16:13] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[16:14] * Concubidated (~Adium@161.225.196.30) has joined #ceph
[16:14] * kefu (~kefu@114.86.210.64) has joined #ceph
[16:15] * jcsp (~jspray@summerhall-meraki1.fluency.net.uk) Quit (Ping timeout: 480 seconds)
[16:19] * shohn is now known as shohn_afk
[16:20] * kefu (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[16:21] * haomaiwang (~haomaiwan@60-250-10-249.HINET-IP.hinet.net) Quit (Remote host closed the connection)
[16:21] * kefu (~kefu@114.86.210.64) has joined #ceph
[16:21] * haomaiwang (~haomaiwan@60-250-10-249.HINET-IP.hinet.net) has joined #ceph
[16:25] * nsoffer (~nsoffer@nat-pool-tlv-t.redhat.com) Quit (Ping timeout: 480 seconds)
[16:26] * t0rn (~ssullivan@2607:fad0:32:a02:56ee:75ff:fe48:3bd3) has joined #ceph
[16:26] * t0rn (~ssullivan@2607:fad0:32:a02:56ee:75ff:fe48:3bd3) has left #ceph
[16:27] * xoritor (~xoritor@cpe-72-177-85-116.austin.res.rr.com) has joined #ceph
[16:27] <xoritor> hey you guys!!!!
[16:28] * xoritor swings in
[16:28] * shohn_afk (~shohn@p57A1499B.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[16:28] <xoritor> who works on ceph-deploy
[16:28] <xoritor> well does not matter
[16:28] <xoritor> kudos
[16:28] <xoritor> last time i tried to use it it did NOT work
[16:28] <xoritor> this time it is gravy
[16:29] <xoritor> so kudos to you
[16:29] <alfredodeza> xoritor: off_rhoden is doing all the hard work these days
[16:29] <xoritor> well you guys are doing an amazing job on it
[16:34] * yanzheng (~zhyan@182.139.205.112) has joined #ceph
[16:38] * jordanP (~jordan@213.215.2.194) Quit (Quit: Leaving)
[16:38] * tdb (~tdb@myrtle.kent.ac.uk) Quit (Read error: Connection reset by peer)
[16:38] * jordanP (~jordan@scality-jouf-2-194.fib.nerim.net) has joined #ceph
[16:39] * oblu (~o@62.109.134.112) has joined #ceph
[16:41] <xoritor> ok when using ceph-deploy to make osd's and having a split public/cluster network... can i use the cluster network name or should i use the public network name for the osd?
[16:42] * linjan (~linjan@176.195.70.29) has joined #ceph
[16:44] * alram (~alram@ip-64-134-238-181.public.wayport.net) has joined #ceph
[16:45] <freman> alfredodeza: if you are still here, one more issue? ;) mon.seldcadm01 monitor is not yet in quorum, tries left: 5
[16:45] <freman> none of the monitors want to join the quorum
[16:52] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[16:53] * thomnico (~thomnico@cro38-2-88-180-16-18.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[16:57] * tdb (~tdb@myrtle.kent.ac.uk) has joined #ceph
[16:58] * kefu_ (~kefu@120.204.164.217) has joined #ceph
[17:03] * kefu_ (~kefu@120.204.164.217) Quit (Read error: Connection reset by peer)
[17:04] * kefu_ (~kefu@120.204.164.217) has joined #ceph
[17:04] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:05] * georgem (~Adium@206.108.127.32) Quit (Quit: Leaving.)
[17:05] * kefu (~kefu@114.86.210.64) Quit (Ping timeout: 480 seconds)
[17:07] * reed (~reed@75-101-54-131.dsl.static.fusionbroadband.com) has joined #ceph
[17:08] * yanzheng (~zhyan@182.139.205.112) Quit (Quit: This computer has gone to sleep)
[17:08] * thomnico (~thomnico@AMarseille-652-1-242-23.w86-210.abo.wanadoo.fr) has joined #ceph
[17:08] * georgem (~Adium@206.108.127.32) has joined #ceph
[17:10] * sankarshan (~sankarsha@106.216.157.142) has joined #ceph
[17:11] * elder (~elder@c-24-245-18-91.hsd1.mn.comcast.net) has joined #ceph
[17:11] * ChanServ sets mode +o elder
[17:11] * kefu_ (~kefu@120.204.164.217) Quit (Max SendQ exceeded)
[17:11] * kefu (~kefu@120.204.164.217) has joined #ceph
[17:12] * TheSov (~TheSov@cip-248.trustwave.com) has joined #ceph
[17:13] * georgem (~Adium@206.108.127.32) Quit ()
[17:14] * georgem (~Adium@206.108.127.32) has joined #ceph
[17:16] * jwilkins (~jwilkins@2601:644:4100:bfef:ea2a:eaff:fe08:3f1d) Quit (Quit: Leaving)
[17:17] * georgem (~Adium@206.108.127.32) Quit ()
[17:17] * jwilkins (~jwilkins@2601:644:4100:bfef:ea2a:eaff:fe08:3f1d) has joined #ceph
[17:19] * georgem (~Adium@206.108.127.32) has joined #ceph
[17:19] * scuttlemonkey (~scuttle@nat-pool-rdu-t.redhat.com) Quit (Read error: Connection reset by peer)
[17:19] * kefu_ (~kefu@114.86.210.64) has joined #ceph
[17:20] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:23] * sjm (~sjm@pool-100-1-115-73.nwrknj.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[17:23] * georgem (~Adium@206.108.127.32) Quit ()
[17:27] * kefu (~kefu@120.204.164.217) Quit (Ping timeout: 480 seconds)
[17:28] * linuxkidd (~linuxkidd@li824-68.members.linode.com) has joined #ceph
[17:28] <xoritor> ok i am getting REALLY abysmal performance
[17:29] <gleam> anyone know what the story was with slashdot/sourceforge's cluster?
[17:29] * rwheeler (~rwheeler@nat-pool-rdu-u.redhat.com) Quit (Remote host closed the connection)
[17:30] <xoritor> on a local drive... same drive as is in the ceph cluster i get 371 MB/s
[17:30] <xoritor> using ceph rbd i get 121 MB/s
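
(For numbers like these, a sketch of the stock benchmarks, assuming a pool named rbd and an image named test; rados bench hits the object store directly, while rbd bench-write goes through librbd and is closer to the mapped-device case.)

    rados bench -p rbd 30 write --no-cleanup   # raw object write throughput for 30 seconds
    rados bench -p rbd 30 seq                  # sequential reads of the objects just written
    rados -p rbd cleanup                       # remove the benchmark objects
    rbd bench-write test --pool rbd            # librbd write path
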
[17:31] * kefu_ (~kefu@114.86.210.64) Quit (Max SendQ exceeded)
[17:32] * kefu (~kefu@114.86.210.64) has joined #ceph
[17:33] * scuttle|afk (~scuttle@nat-pool-rdu-t.redhat.com) has joined #ceph
[17:33] * scuttle|afk is now known as scuttlemonkey
[17:35] * arbrandes (~arbrandes@179.210.13.90) Quit (Ping timeout: 480 seconds)
[17:36] * kefu (~kefu@114.86.210.64) Quit ()
[17:37] * sjm (~sjm@pool-100-1-115-73.nwrknj.fios.verizon.net) has joined #ceph
[17:38] * jcsp (~Adium@0001bf3a.user.oftc.net) has joined #ceph
[17:38] * kefu (~kefu@114.86.210.64) has joined #ceph
[17:40] <rkeene> xoritor, Welcome to the club
[17:41] <xoritor> rkeene, you too?
[17:42] * moore (~moore@64.202.160.88) has joined #ceph
[17:44] <rkeene> Yep
[17:45] <xoritor> i have a cluster_network setup but i am not sure it is using it
[17:45] <xoritor> i think it is using my public network
[17:45] <rkeene> That's pretty easy to verify by looking at the interface counters
[17:46] <xoritor> true
[17:46] <xoritor> i should go look
[17:46] * alram (~alram@ip-64-134-238-181.public.wayport.net) Quit (Ping timeout: 480 seconds)
[17:46] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[17:47] * oro (~oro@2001:620:20:16:a91d:bef5:54a:5f41) Quit (Ping timeout: 480 seconds)
[17:50] * m0zes would be willing to bet sf's outage is related to unproven ssds for journals and sudden power issues. Someone hit an EPO? Twice?
[17:50] <m0zes> that would be why I lost some data with ceph.
[17:51] * branto (~branto@178.253.161.174) Quit (Quit: Leaving.)
[17:52] <m0zes> xfs corruption and journal issues due to ssds *claiming* things being written when they weren't. someone hit the epo, we bring stuff up and start rebalancing, epo gets triggered again...
[17:52] <m0zes> *kaboom*
[17:55] * thansen (~thansen@63-248-145-253.static.layl0103.digis.net) Quit (Quit: Ex-Chat)
[17:56] * ifur (~osm@0001f63e.user.oftc.net) Quit (Quit: leaving)
[17:58] * ifur (~osm@0001f63e.user.oftc.net) has joined #ceph
[17:58] * derjohn_mob (~aj@b2b-94-79-172-98.unitymedia.biz) Quit (Ping timeout: 480 seconds)
[17:59] <xoritor> well it is using my cluster network
[17:59] <xoritor> i think the rbd map uses the "mon" interface
[18:00] <xoritor> and the cluster network is only used for the replication
[18:00] <xoritor> so performance sucks for my test but may be good for my replication
[18:01] * muni (~muni@bzq-218-200-252.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[18:03] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[18:03] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[18:03] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[18:03] <xoritor> ok on a bonded interface i get better performance 178 MB/s
[18:04] * arcimboldo (~antonio@dhcp-y11-zi-s3it-130-60-34-042.uzh.ch) Quit (Ping timeout: 480 seconds)
[18:05] * theambient (~theambien@178.162.35.192) has joined #ceph
[18:06] <xoritor> for reference my glusterfs tests using the same hardware were 2-6 times this performance
[18:07] <xoritor> now i realize that my SSDs are not good journals i get it... i will fix that soon, but right now i need to get it working better
[18:08] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Ping timeout: 480 seconds)
[18:08] <xoritor> i followed this http://docs.ceph.com/docs/hammer/rados/configuration/network-config-ref/
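
(The split that document describes comes down to two lines in ceph.conf; the subnets below are placeholders. Clients and monitors only ever use the public network, so an rbd map or benchmark from a client never touches the cluster network, which carries replication and recovery traffic between OSDs.)

    [global]
        public network  = 10.0.0.0/24    ; monitors + client traffic
        cluster network = 10.0.1.0/24    ; OSD replication / recovery
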
[18:10] <TheSov> gleam, what do you mean about sourceforges cluster?
[18:10] * TMM (~hp@188.207.119.60) has joined #ceph
[18:10] <TheSov> did they have a cluster failure?
[18:10] <gleam> that's what they say
[18:11] * theambient (~theambien@178.162.35.192) Quit (Quit: Leaving...)
[18:11] * alram (~alram@12.124.18.126) has joined #ceph
[18:11] * theambient (~theambien@178.162.35.192) has joined #ceph
[18:12] <TheSov> did they forget to delete failed disks from the cluster?
[18:12] <TheSov> and where did you read that?
[18:13] <gleam> https://sourceforge.net/blog/sourceforge-infrastructure-and-service-restoration/
[18:13] <gleam> We responded immediately and confirmed the issue was related to filesystem corruption on our storage platform. This incident impacted all block devices on our Ceph cluster.
[18:13] <theambient> hi, can anybody tell me why rados_read/rados_write could hang on one machine while working pretty good on another?
[18:13] <theambient> also is it appropriate channel to ask such questions?
[18:13] <gleam> this is the right place, although if you don't get results here i'd suggest the mailing lists too
[18:14] <TheSov> hmmmm. did they perhaps forget to turn off write caching on the osds?
[18:14] <gleam> is it perhaps an mtu issue?
[18:14] <TheSov> MTU issues should never cause corruption
[18:14] <TheSov> just slowness
[18:14] <TheSov> packet fragmentation occurs end to end.
[18:14] * cholcombe (~chris@c-73-180-29-35.hsd1.or.comcast.net) has joined #ceph
[18:14] <gleam> depends on the setup
[18:14] * nardial (~ls@dslb-088-072-089-026.088.072.pools.vodafone-ip.de) has joined #ceph
[18:15] <TheSov> oh?
[18:15] <gleam> no no, i was talking to theambient
[18:15] <gleam> about his rados_read/write problem
[18:15] <TheSov> in all seriousness unless you are using 10G+ leave your mtu at 1500
[18:16] <gleam> ^^
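For what it's worth, a quick way to check what MTU is configured and whether a jumbo-frame path actually works end to end (example interface and peer address; 8972 = 9000 minus 28 bytes of IP/ICMP header):

    ip link show dev eth0 | grep -o 'mtu [0-9]*'
    ping -M do -s 8972 -c 3 10.10.10.2   # -M do forbids fragmentation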
[18:17] * tacticus (~tacticus@v6.kca.id.au) Quit (Ping timeout: 480 seconds)
[18:17] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[18:18] * Concubidated (~Adium@161.225.196.30) Quit (Quit: Leaving.)
[18:18] * Concubidated (~Adium@161.225.196.30) has joined #ceph
[18:21] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[18:22] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit ()
[18:23] <xoritor> what setting can i use to set rdma?
[18:24] * dugravot6 (~dugravot6@2a01:e35:8bbf:4060:156a:e249:6b02:baff) has joined #ceph
[18:25] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[18:27] * TMM (~hp@188.207.119.60) Quit (Ping timeout: 480 seconds)
[18:27] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[18:28] * dugravot6 (~dugravot6@2a01:e35:8bbf:4060:156a:e249:6b02:baff) Quit ()
[18:32] * jordanP (~jordan@scality-jouf-2-194.fib.nerim.net) Quit (Quit: Leaving)
[18:33] * thansen (~thansen@17.253.sfcn.org) has joined #ceph
[18:34] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[18:36] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Ping timeout: 480 seconds)
[18:37] * dgurtner_ (~dgurtner@178.197.231.188) Quit (Ping timeout: 480 seconds)
[18:37] * JurB (~JurB@dhcp-077-251-054-060.chello.nl) has joined #ceph
[18:37] * rlrevell (~leer@vbo1.inmotionhosting.com) Quit (Ping timeout: 480 seconds)
[18:39] * sankarshan (~sankarsha@106.216.157.142) Quit (Quit: Leaving...)
[18:40] * tacticus (~tacticus@v6.kca.id.au) has joined #ceph
[18:41] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[18:41] <zenpac> SO I can have multiple clusters using the same Monitors. But each cluster has a unique set of OSDs?
[18:44] <TheSov> so how good is ceph at random IO, would you say its mature enough to replace a compellent san?
[18:44] <foxxx0> random i/o is where ceph excels
[18:44] <TheSov> odd i just read a reddit post saying that exact opposite
[18:45] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[18:46] <TheSov> quoting here "In aggregate you get great throughput, but to a single host, it's pretty awful. "
[18:46] <TheSov> " And don't even talk to me about random throughput; ever wonder why you never see anyone talk about how many iops their ceph cluster is capable of?"
[18:47] <TheSov> is this guy just full of shit?
[18:47] * thomnico (~thomnico@AMarseille-652-1-242-23.w86-210.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[18:48] <xoritor> TheSov, lots of people are...
[18:48] <xoritor> i myself am some times
[18:48] <xoritor> actually i am more full of vitriol
[18:48] <xoritor> maybe its the dark side
[18:49] <xoritor> but in general it depends on your use case on what works for you
[18:49] <xoritor> i am doing testing now
[18:49] <xoritor> on my hardware with certain setups i am getting wildly different results
[18:49] <xoritor> i mean WILDLY
[18:50] <TheSov> So i have a question, if the block size of a FS is 1M, does that mean all reads and writes are 1M?
[18:50] <xoritor> when i was running coreos with ceph in a container and everything on ipoib i got 600 MB/s easy
[18:50] * rlrevell (~leer@vbo1.inmotionhosting.com) has joined #ceph
[18:50] <xoritor> now following the network guide and setting it up that way i get 121 MB/s ish
[18:51] <xoritor> ie... public network on gigE and cluster on IB
[18:51] <xoritor> very uncool
[18:51] <xoritor> very very not ok
[18:51] <xoritor> rebuilding with everything on IPoIB right now to test again
[18:56] <xoritor> may try to do a backend rdma with a front end on ipoib
[18:58] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[18:59] * andreww (~xarses@12.164.168.117) has joined #ceph
[19:01] <xoritor> TheSov, on the block size it means that all files are at least 1M, so all reads are 1M, all writes are 1M, and all files use at least 1M of space
[19:02] <TheSov> how does vmware resolve this, it has a 1MB block size yet my sql server seems to run fine on it
[19:02] <xoritor> ie.. a file containing the letter z (echo z >> file) would be 1M in size on the disk
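The apparent-size vs on-disk-size distinction xoritor is describing is easy to see on any Linux filesystem (the allocation unit here is whatever the local filesystem uses, commonly 4K, rather than VMFS's 1M):

    echo z > file
    stat -c '%s bytes apparent, %b blocks of %B bytes allocated' file
    du -h --apparent-size file   # a few bytes
    du -h file                   # at least one full filesystem block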
[19:02] <TheSov> xoritor, i get that. but how do apps do sub block reads and writes?
[19:02] * moore (~moore@64.202.160.88) Quit (Read error: Connection reset by peer)
[19:03] <xoritor> what?
[19:03] * moore (~moore@64.202.160.88) has joined #ceph
[19:03] <xoritor> what do you mean by that?
[19:03] <TheSov> vmware has a 1MB block size right
[19:03] <TheSov> does that mean when a vm writes a 1k file, vmware reads and writes 1MB of information?
[19:03] <xoritor> that just says how much data is going in and out
[19:03] <xoritor> they pad it with zeros
[19:03] <TheSov> right but the vm doesnt know that
[19:04] <TheSov> it still does 4k reads and writes internally
[19:04] <xoritor> data -> memory -> disk
[19:04] * bene (~ben@nat-pool-bos-t.redhat.com) has joined #ceph
[19:04] <xoritor> its padded with zeros if it is not 1M
[19:04] <TheSov> so its 1M no matter what?
[19:04] <xoritor> yep
[19:04] <TheSov> holy crap, that would slow the hell out of any vm doing a lot of iops
[19:04] <xoritor> yep
[19:05] <TheSov> why the hell would they make the block size so massive?
[19:05] * snakamoto (~Adium@192.16.26.2) has joined #ceph
[19:05] <xoritor> cause it speeds up big vms
[19:05] <xoritor> and when you get into system tuning and page tables... they make the pages match
[19:05] <TheSov> so what happens when im changing a few bytes on a file, like changing the header of 5k
[19:05] <TheSov> does that read and write 1M too?
[19:06] <xoritor> so 1 page of memory is 1M
[19:06] <xoritor> and one block on disk is 1M
[19:06] <xoritor> get it
[19:06] <xoritor> 1M
[19:06] <TheSov> im just trying to understand how vmware handles things smaller than that size
[19:06] <xoritor> they dont
[19:07] <TheSov> if it has to do reads and writes that size every time it explains why the RDM's tend to run a lot faster than vmdk's
[19:07] <xoritor> and 1M is not the biggest they could have done
[19:07] <TheSov> I know they have an 8MB option
[19:07] * bene2 (~ben@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[19:09] <xoritor> https://lwn.net/Articles/469805/
[19:09] <xoritor> that explains some... there is a ton of info on this
[19:10] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[19:10] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[19:12] <TheSov> i assumed thats how it worked, but for some reason or another i refused to believe it was that badly designed
[19:12] <xoritor> http://www.dba-oracle.com/art_so_blocksize.htm
[19:12] * kefu (~kefu@114.86.210.64) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[19:12] <xoritor> there is one for you
[19:12] <TheSov> i figured vmware used their vmfs blocks like objects so they could do sub-block IOs
[19:12] * jordanP (~jordan@bdv75-2-81-57-250-57.fbx.proxad.net) has joined #ceph
[19:12] <TheSov> but now i got a sick feeling
[19:14] * shohn (~Adium@83.171.135.2) has joined #ceph
[19:14] * rlrevell1 (~leer@vbo1.inmotionhosting.com) has joined #ceph
[19:19] * oro (~oro@84-73-73-158.dclient.hispeed.ch) has joined #ceph
[19:19] * rlrevell (~leer@vbo1.inmotionhosting.com) Quit (Ping timeout: 480 seconds)
[19:19] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) Quit (Read error: Connection reset by peer)
[19:20] * mgolub (~Mikolaj@91.225.200.101) has joined #ceph
[19:20] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[19:22] * i_m (~ivan.miro@pool-109-191-87-71.is74.ru) has joined #ceph
[19:23] <xoritor> ok... when using ceph-deploy big thing to remember... the hostname MATTERS
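A sketch of the hostname check implied here, assuming a ceph-deploy workflow and example node names: the short hostname of each node has to match the name passed on the command line and resolve to the node's real address.

    hostname -s            # run on each node; must match the name used below
    getent hosts node1     # must resolve to the node's actual (public network) IP
    ceph-deploy new node1 node2 node3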
[19:24] <xoritor> TheSov, i do not use vmware... ever
[19:25] * b0e (~aledermue@p5083D18E.dip0.t-ipconnect.de) has joined #ceph
[19:26] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) has joined #ceph
[19:28] * stiopa (~stiopa@cpc73828-dals21-2-0-cust630.20-2.cable.virginm.net) has joined #ceph
[19:32] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[19:32] * bitserker (~toni@cli-5b7ef18e.bcn.adamo.es) Quit (Ping timeout: 480 seconds)
[19:38] * b0e (~aledermue@p5083D18E.dip0.t-ipconnect.de) Quit (Quit: Leaving.)
[19:38] * arbrandes (~arbrandes@179.210.13.90) has joined #ceph
[19:41] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[19:42] * angdraug (~angdraug@12.164.168.117) has joined #ceph
[19:42] * davidz (~davidz@2605:e000:1313:8003:91a2:29da:6b0b:6902) has joined #ceph
[19:44] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[19:50] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[19:52] <doppelgrau> is there an easy way to find blocked requests that are not shown in 'ceph -w' and 'ceph -s' (other than connecting to each server and looking in the osd logs)?
[19:54] <linuxkidd> doppelgrau, 'ceph health detail' should show them all
[19:54] <doppelgrau> linuxkidd: thanks
[19:55] <linuxkidd> np
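A usage sketch of linuxkidd's suggestion (the output shown is only illustrative of the kind of message to expect):

    ceph health detail | grep -iE 'slow|blocked'
    # e.g. HEALTH_WARN 30 requests are blocked > 32 sec; 1 osds have slow requests
    # the per-OSD lines that follow identify which OSDs to look at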
[19:55] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[19:56] <xoritor> 271 MB/s
[19:56] <xoritor> still not great
[19:56] <xoritor> thats all SSD backed over 40 Gbit/s ipoib
[19:57] <xoritor> 4 nodes (right now) but will be adding more later if i stick with it
[19:59] * Sysadmin88 (~IceChat77@05452b2b.skybroadband.com) has joined #ceph
[20:00] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[20:02] * bene (~ben@nat-pool-bos-t.redhat.com) Quit (Quit: Konversation terminated!)
[20:02] * brutuscat (~brutuscat@17.Red-83-47-123.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[20:02] <doppelgrau> xoritor: if I had seen it correctly in the backlog, you use the cluster for rbd-Images? how many synchronous IO/s do your SSDs deliver?
[20:02] <xoritor> not enough
[20:02] <xoritor> i found that out the hard way
[20:02] <xoritor> :-(
[20:03] <xoritor> when i bought them all i had to go on was marketing data
[20:03] <xoritor> :-(
[20:03] <xoritor> well... they lie
[20:03] <xoritor> lie lie lie lie lie lie lie lie
[20:03] <xoritor> they are samsung 850pro 1tb drives
[20:04] * nsoffer (~nsoffer@bzq-109-65-255-114.red.bezeqint.net) has joined #ceph
[20:04] <Kvisle> 1TB SSDs?
[20:05] <xoritor> doppelgrau, i want to use it for many things... not just rbd, however... rbd is a convenient test and i will be doing lots of rbd
[20:05] <xoritor> yes
[20:05] <xoritor> 1tb ssds
[20:05] <xoritor> why?
[20:05] <xoritor> too small?
[20:06] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[20:07] <doppelgrau> xoritor: perhaps mix the 850s with faster SSDs; the faster ones get the journals, the 850s the main data
[20:07] <xoritor> i dont have that option
[20:07] <xoritor> not for a while
[20:08] <doppelgrau> xoritor: journal on a ramdisk (just kidding)
[20:08] <xoritor> bwahahahahahahahahahahahaha
[20:08] <xoritor> thats a good one
[20:08] <xoritor> i want to get some intel 750s
[20:08] * primechuck (~primechuc@host-95-2-129.infobunker.com) Quit (Read error: No route to host)
[20:08] <xoritor> i had some of the other pcie nvme drives for testing but they took them away
[20:08] <xoritor> :-(
[20:09] <doppelgrau> xoritor: are you using rbd-cache? That could reduce the problem a bit, since it can aggregate small writes into larger ones
[20:09] * primechuck (~primechuc@host-95-2-129.infobunker.com) has joined #ceph
[20:09] <xoritor> hmm no
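For reference, a minimal sketch of enabling the RBD cache doppelgrau mentions (values are examples). Note that this only affects librbd clients such as qemu; the kernel 'rbd map' path being benchmarked here does not use it.

    [client]
        rbd cache = true
        rbd cache writethrough until flush = true
        rbd cache size = 67108864    # 64 MB, example value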
[20:10] <xoritor> right now i am upgrading to a newer kernel (kernel-ml) and seeing if that helps any
[20:10] <xoritor> the centos stock kernel is a 3.10 and lots of things have changed since then
[20:14] <doppelgrau> okay, thats really old, i upgraded from 4.0 to 4.1 for straw2-kernel support :)
[20:16] <xoritor> yeah
[20:16] <xoritor> 4.1.2-1.el7.elrepo.x86_64
[20:16] <xoritor> testing
[20:19] <xoritor> 18.1 MB/s
[20:19] <xoritor> wtf
[20:19] <doppelgrau> :)
[20:20] * capri (~capri@212.218.127.222) has joined #ceph
[20:20] <xoritor> uh
[20:20] <xoritor> im uh
[20:20] <doppelgrau> xoritor: I'd try changing from cfq to noop, some said it helps
[20:20] <xoritor> it should be
[20:20] <xoritor> lemme check again
[20:20] <xoritor> its deadline
[20:21] <xoritor> ill try noop
[20:21] <xoritor> deadline
[20:21] <xoritor> sorry
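A sketch of switching the scheduler at runtime (sda is an example device; the change is immediate but not persistent across reboots):

    cat /sys/block/sda/queue/scheduler           # current choice is shown in brackets
    echo noop > /sys/block/sda/queue/scheduler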
[20:22] * capri_on (~capri@212.218.127.222) Quit (Ping timeout: 480 seconds)
[20:23] <xoritor> well noop was a bit faster
[20:23] <xoritor> 18.9 MB/s
[20:24] <xoritor> rados bench gives better numbers than my dd does
[20:24] <xoritor> Bandwidth (MB/sec): 240.790
[20:25] * TheSov2 (~TheSov@cip-248.trustwave.com) has joined #ceph
[20:25] <xoritor> Bandwidth (MB/sec): 2085.061
[20:25] <xoritor> Bandwidth (MB/sec): 2130.529
[20:25] <xoritor> write, seq, rand
[20:29] <snakamoto> what happens if you do a rados bench with -t set to 1?
[20:29] <snakamoto> docs say by default it does 16 parallel ops
[20:30] <xoritor> i can try it
[20:30] <xoritor> wait one
[20:31] <xoritor> Bandwidth (MB/sec): 100.000
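For reference, a sketch of the rados bench invocations being compared (pool name is an example; --no-cleanup keeps the written objects around so the seq and rand phases have something to read):

    rados bench -p rbd 60 write --no-cleanup   # 16 concurrent ops by default
    rados bench -p rbd 60 seq
    rados bench -p rbd 60 rand
    rados bench -p rbd 60 write -t 1           # single outstanding op, per snakamoto's suggestion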
[20:31] * TheSov (~TheSov@cip-248.trustwave.com) Quit (Ping timeout: 480 seconds)
[20:32] <snakamoto> I think it makes sense
[20:32] <snakamoto> when you're testing with a single file, you're most likely writing to a small subset of SSDs
[20:33] <xoritor> probably
[20:34] * rendar (~I@host170-180-dynamic.36-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[20:37] * rendar (~I@host170-180-dynamic.36-79-r.retail.telecomitalia.it) has joined #ceph
[20:39] * shohn (~Adium@83.171.135.2) Quit (Quit: Leaving.)
[20:39] * shohn (~Adium@83.171.135.2) has joined #ceph
[20:44] * Nacer (~Nacer@2001:41d0:fe82:7200:3cc6:2ed2:c132:c0c8) has joined #ceph
[20:44] * fsimonce (~simon@host249-48-dynamic.53-79-r.retail.telecomitalia.it) has joined #ceph
[20:45] * dopesong (~dopesong@78-61-129-9.static.zebra.lt) has joined #ceph
[20:47] * TheSov2 is now known as TheSov
[20:52] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[21:03] * dneary (~dneary@AGrenoble-651-1-414-123.w90-52.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[21:04] * CobraKhan007 (~PuyoDead@freedom.ip-eend.nl) has joined #ceph
[21:06] * _Tassadar (~tassadar@D57DEE42.static.ziggozakelijk.nl) Quit (Remote host closed the connection)
[21:06] * gucki (~smuxi@31.24.15.113) Quit (Ping timeout: 480 seconds)
[21:07] * stj (~stj@2604:a880:800:10::2cc:b001) Quit (Quit: rebuut)
[21:10] * stj (~stj@2604:a880:800:10::2cc:b001) has joined #ceph
[21:11] * gucki (~smuxi@212-51-155-85.fiber7.init7.net) has joined #ceph
[21:12] <ndru> Is there per-connection or per-cephx-user usage I can see, for IOPS and I/O throughput?
[21:12] <doppelgrau> xoritor: and rados bench uses 4MB objects as default
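The default 4 MB object size can be overridden for the write phase, which gives numbers closer to a small-block dd; an example:

    rados bench -p rbd 60 write -t 1 -b 4096   # 4 KB writes instead of 4 MB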
[21:17] * Debesis_ (~0x@169.171.46.84.mobile.mezon.lt) has joined #ceph
[21:18] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[21:18] <ndru> Or is there any way I can see a list of client connections to the cluster?
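Nobody answers in the log; one approach sometimes used (assuming admin-socket access on a monitor host, and that the local short hostname matches the mon id) is to dump that monitor's session list, which includes the addresses of connected clients:

    ceph daemon mon.$(hostname -s) sessions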
[21:22] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[21:22] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[21:22] * Debesis (Debesis@169.171.46.84.mobile.mezon.lt) Quit (Ping timeout: 480 seconds)
[21:25] * zaitcev (~zaitcev@c-76-113-49-212.hsd1.nm.comcast.net) has joined #ceph
[21:34] * CobraKhan007 (~PuyoDead@5NZAAFBL5.tor-irc.dnsbl.oftc.net) Quit ()
[21:35] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[21:35] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[21:36] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) has joined #ceph
[21:40] * fmanana (~fdmanana@bl13-155-79.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[21:43] * shohn (~Adium@83.171.135.2) Quit (Quit: Leaving.)
[21:44] * mgolub (~Mikolaj@91.225.200.101) Quit (Quit: away)
[21:52] * nardial (~ls@dslb-088-072-089-026.088.072.pools.vodafone-ip.de) Quit (Quit: Leaving)
[21:53] * TheSov2 (~TheSov@204.13.200.248) has joined #ceph
[21:54] * JurB (~JurB@dhcp-077-251-054-060.chello.nl) Quit (Quit: Leaving...)
[21:56] * pdrakeweb (~pdrakeweb@oh-71-50-39-25.dhcp.embarqhsd.net) Quit (Quit: Leaving...)
[21:58] * TheSov (~TheSov@cip-248.trustwave.com) Quit (Ping timeout: 480 seconds)
[22:01] * via (~via@smtp2.matthewvia.info) Quit (Remote host closed the connection)
[22:02] * pdrakeweb (~pdrakeweb@cpe-65-185-74-239.neo.res.rr.com) has joined #ceph
[22:02] * jrankin (~jrankin@d53-64-170-236.nap.wideopenwest.com) Quit (Quit: Leaving)
[22:02] * alram (~alram@12.124.18.126) Quit (Ping timeout: 480 seconds)
[22:03] * via (~via@smtp2.matthewvia.info) has joined #ceph
[22:07] * derjohn_mob (~aj@tmo-109-52.customers.d1-online.com) has joined #ceph
[22:07] * nsoffer (~nsoffer@bzq-109-65-255-114.red.bezeqint.net) Quit (Quit: Segmentation fault (core dumped))
[22:11] * dopesong (~dopesong@78-61-129-9.static.zebra.lt) Quit (Remote host closed the connection)
[22:28] * i_m (~ivan.miro@pool-109-191-87-71.is74.ru) Quit (Ping timeout: 480 seconds)
[22:29] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Remote host closed the connection)
[22:29] * nils_ (~nils@doomstreet.collins.kg) Quit (Quit: This computer has gone to sleep)
[22:34] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[22:39] * oro (~oro@84-73-73-158.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[22:45] * snakamoto (~Adium@192.16.26.2) Quit (Quit: Leaving.)
[22:49] * fmanana (~fdmanana@bl13-155-79.dsl.telepac.pt) has joined #ceph
[22:50] <TheSov2> im willing to bet that the issue with sourceforge was that they forgot to disable write caching
[22:52] * MACscr1 (~Adium@2601:247:4102:c3ac:6530:5d69:ce63:967e) Quit (Quit: Leaving.)
[22:53] * reed (~reed@75-101-54-131.dsl.static.fusionbroadband.com) Quit (Quit: Ex-Chat)
[22:54] <TheSov2> when i delete an osd because its down, how long should the cluster take to rebalance?
[22:55] * Pettis (~Helleshin@freedom.ip-eend.nl) has joined #ceph
[22:57] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[23:07] * tupper (~tcole@173.38.117.66) Quit (Ping timeout: 480 seconds)
[23:08] * rlrevell1 (~leer@vbo1.inmotionhosting.com) Quit (Ping timeout: 480 seconds)
[23:10] * fmanana (~fdmanana@bl13-155-79.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[23:15] * Ceph-Log-Bot (~logstash@185.66.248.215) has joined #ceph
[23:15] * Ceph-Log-Bot (~logstash@185.66.248.215) Quit (Read error: Connection reset by peer)
[23:20] * MACscr (~Adium@2601:247:4102:c3ac:29db:fd93:88ea:e3a8) has joined #ceph
[23:25] * Pettis (~Helleshin@9S0AACHWE.tor-irc.dnsbl.oftc.net) Quit ()
[23:30] * snakamoto (~Adium@192.16.26.2) has joined #ceph
[23:37] * rendar (~I@host170-180-dynamic.36-79-r.retail.telecomitalia.it) Quit ()
[23:38] <doppelgrau> TheSov2: depends on the setup
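It depends on cluster size, how full the OSDs are, and recovery tuning; a sketch of the usual removal sequence and of watching the rebalance (osd id 12 is an example):

    ceph osd out 12              # marks it out; data starts migrating
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    ceph -w                      # follow recovery/degraded counts until HEALTH_OK
    ceph status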
[23:39] * jordanP (~jordan@bdv75-2-81-57-250-57.fbx.proxad.net) Quit (Remote host closed the connection)
[23:39] <doppelgrau> hmm, got a strange problem: a few seconds after I started an osd I got tons of log lines like the following (and slow requests start to build up). But only one OSD on the server
[23:40] <doppelgrau> [ownIp]:6801/1730223 >> [ownIp]:0/2078381 pipe(0x21f05000 sd=481 :6801 s=0 pgs=0 cs=0 l=1 c=0x21de02c0).accept replacing existing (lossy) channel (new one lossy=1)
[23:41] * nsoffer (~nsoffer@bzq-109-65-255-114.red.bezeqint.net) has joined #ceph
[23:44] * georgem (~Adium@69-196-163-65.dsl.teksavvy.com) Quit (Quit: Leaving.)
[23:44] <TheSov2> weird. i just built out on newegg a 1U, 10-disk system with 64 gigs of ram, dual hex-core procs, and a dual-port 10 gig SFP card for less than 4k, using 1tb red drives
[23:44] * danieagle (~Daniel@187.10.25.6) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[23:49] * arbrandes (~arbrandes@179.210.13.90) Quit (Ping timeout: 480 seconds)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.