#ceph IRC Log


IRC Log for 2012-08-28

Timestamps are in GMT/BST.

[0:02] * adjohn (~adjohn@ Quit (Quit: adjohn)
[0:03] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[0:07] * Leseb_ (~Leseb@ has joined #ceph
[0:07] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) Quit (Quit: Leaving)
[0:08] * lofejndif (~lsqavnbok@9YYAAI4WI.tor-irc.dnsbl.oftc.net) has joined #ceph
[0:11] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[0:11] * Leseb_ is now known as Leseb
[0:21] * Leseb (~Leseb@ Quit (Ping timeout: 480 seconds)
[0:37] * loicd (~loic@brln-4dbac51f.pool.mediaWays.net) Quit (Quit: Leaving.)
[0:37] * loicd (~loic@brln-4dbac51f.pool.mediaWays.net) has joined #ceph
[0:42] <mrjack_> gregaf: are you still there?
[0:46] <gregaf> yeah
[0:49] * MarkN1 (~nathan@ has joined #ceph
[0:50] * MarkN1 (~nathan@ has left #ceph
[0:53] <mrjack_> it worked
[0:53] <mrjack_> i recreated osd filestore
[0:53] <mrjack_> readded it to crushmap
[0:53] <gregaf> cool; glad to hear it!
[0:53] <mrjack_> 384 pgs: 278 active+clean, 29 active+recovering+degraded+remapped+backfill, 52 active+recovering, 25 active+recovering+degraded+backfill; 7322 MB data, 1000 GB used, 461 GB / 1540 GB avail; 3065/7124 degraded (43.024%)
[0:54] <mrjack_> pgmap v378493: 384 pgs: 283 active+clean, 29 active+recovering+degraded+remapped+backfill, 48 active+recovering, 24 active+recovering+degraded+backfill; 7322 MB data, 1001 GB used, 460 GB / 1540 GB avail; 2926/7124 degraded (41.072%)
[0:54] <mrjack_> is that good or bad? seeing degraded percentage is decreasing?
[0:55] <mrjack_> would you add this patch to next release?
[0:56] <gregaf> that's good - it means that the OSDs are copying objects around to get proper redundancy
[0:56] <mrjack_> ok
[0:57] <mrjack_> i tested ceph with rbd and ocfs2
[0:57] <mrjack_> i have it successfully up and running on a 4node ceph cluster
[0:57] <gregaf> the patch I gave you is a bit of a wide brush so it probably won't go in, but I made a bug so we'll investigate it a bit more (see if ext3's ABI is off, or our assumptions about return codes, etc) and come up with something to handle it
[0:57] <dmick> anyone investigating plana37, plana48, plana68? 37 is dead, 48 is in kdb from xfs, 68 is looping crashing and whining about things that look like out-of-memory
[0:58] <mrjack_> but when i use it on a 2 node cluster, ocfs2 starts fencing the nodes when i put load on ocfs2
[0:58] <mikeryan> dmick: none of those belong to me
[0:59] <mrjack_> i would like to use cephfs, but the last time i tested it it crashed when untarring kernel tree
[0:59] <dmick> yeah, they're scheduled runs
[0:59] <gregaf> mrjack_: perhaps you're just placing too much load on it for a 2-node cluster to handle
[1:00] <mrjack_> gregaf: no, i just wanted to untar kernel tree that should not be too much load
[1:01] <mrjack_> i can try again when it is sync again
[1:14] <mrjack_> 2012-08-28 01:14:04.143308 mon.0 [INF] pgmap v378804: 384 pgs: 384 active+clean; 7322 MB data, 1006 GB used, 456 GB / 1540 GB avail
[1:14] <mrjack_> looks good
[1:14] * loicd (~loic@brln-4dbac51f.pool.mediaWays.net) Quit (Quit: Leaving.)
[1:42] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:45] <mrjack_> gregaf: i now know why ocfs2 fenced
[1:45] <mrjack_> i have 2 ceph osds
[1:46] <mrjack_> running 1 kvm on rbd image, created second rbd image, trying to mkfs.ext4 on second image leads to stalled kvm
[1:46] <mrjack_> when that happens to ocfs2, it will freak out and fence :)
[1:48] * adjohn (~adjohn@ has joined #ceph
[1:48] <sagewk> joshd: can you take a look at the fixes in wip-objecter?
[1:48] <mrjack_> [277199.933736] libceph: tid 750 timed out on osd1, will reset osd
[1:48] <mrjack_> [277199.933784] libceph: tid 783 timed out on osd0, will reset osd
[1:48] <mrjack_> [277289.626997] INFO: task khugepaged:40 blocked for more than 120 seconds.
[1:48] <mrjack_> hm what could that be?
[1:49] <gregaf> dunno - maybe joshd or mikeryan have ideas
[1:49] <mrjack_> 2012-08-28 01:49:40.600806 osd.0 [WRN] slow request 44.812263 seconds old, received at 2012-08-28 01:48:55.788455: osd_op(client.5599.1:888 rb.0.170b.2797152a.000000000434 [write 524288~524288] 2.280483f RETRY) currently delayed
[1:50] <mrjack_> lots of these
[1:50] <mrjack_> hm
[1:50] <mrjack_> 2012-08-28 01:50:17.605742 mon.0 [INF] osd.1 failed (by osd.0
[1:50] <mrjack_> 2012-08-28 01:50:22.606669 mon.0 [INF] osd.1 failed (by osd.0
[1:50] <mrjack_> ow
[1:50] <mrjack_> not again ;(
[1:51] <gregaf> see if the process is still running - if it is, yeah, I think you're just overloading them
[1:51] <mrjack_> how can i overload it?
[1:51] <mrjack_> i did the following
[1:51] <mrjack_> rbd create ext4test --size 10240
[1:51] <mrjack_> rbd map ext4test --user admin --secret /etc/ceph/secret
[1:51] <mrjack_> mkfs.ext4 /dev/rbd1
[1:52] <mrjack_> that led to overloading?! weird?!
[1:53] <mrjack_> the osd.1 crashed
[1:54] <mrjack_> log is full of msgr things
[1:54] <mrjack_> could be my logging killed my disk-io?!
[1:54] <mrjack_> how can i reduce osd logging?
[1:55] <mrjack_> 2012-08-28 01:49:51.670608 f6a28b70 1 heartbeat_map is_healthy 'OSD::op_tp thread 0xe7ff9b70' had suicide timed out after 300
[1:55] <mrjack_> 2012-08-28 01:49:51.689158 f6a28b70 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*
[1:55] <mrjack_> common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
[1:56] <gregaf> mrjack: yes, that generally means that your disk/local filesystem wasn't responding quickly enough for the load on the OSD
[1:57] <gregaf> you could turn down debug logging if you have that enabled by commenting it out of your ceph.conf
[1:57] <gregaf> and you should look at what your iowait and system times are on the OSD nodes when you're doing writes
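For context on gregaf's suggestion: debug levels live in ceph.conf, and "turning it down" means commenting out or zeroing the debug lines for the noisy subsystems. A minimal sketch (subsystem names are the standard ones; the levels shown are illustrative):

```ini
; illustrative ceph.conf fragment - quiet an OSD's logging
[osd]
    debug osd = 0        ; OSD core
    debug ms = 0         ; messenger (msgr) traffic, often the noisiest
    debug filestore = 0  ; backing filestore
    debug journal = 0    ; journal activity
```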
[1:57] <mrjack_> hm
[1:57] <mrjack_> there is no debug loging set
[1:57] <mrjack_> the lots of logs i see is after the crash
[1:58] <mrjack_> 2012-08-28 01:41:44.764358 f020bb70 1 journal check_for_full at 379375616 : JOURNAL FULL 379375616 >= 163839 (max_size 536870912 start 379539456)
[1:58] <mrjack_> hm
[1:58] <mrjack_> maybe journal too small?
[1:58] <mrjack_> what size should it have?
[1:58] <gregaf> the OSD is still running if that is getting output - perhaps the crash is an old one
[1:58] <mrjack_> 2012-08-28 01:47:46.668226 f6a28b70 1 heartbeat_map is_healthy 'OSD::op_tp thread 0xe87fab70' had timed out after 30
[1:58] <gregaf> your journal is 2GB, right?
[1:59] <mrjack_> no
[1:59] <mrjack_> 512m
[1:59] <gregaf> oh, yeah, give it at least a few gigs
[1:59] <mrjack_> 4?
[1:59] <gregaf> but if you have persistent full journal warnings, that means that you are trying to send the OSD more traffic than its backing store can handle
[1:59] <mrjack_> i cannot imagine that the machine is too slow
[2:00] <mrjack_> Intel(R) Xeon(R) CPU E31230 @ 3.20GHz
[2:00] <mrjack_> 16gb ddr3 ecc
[2:00] <mrjack_> 4x500gb sata raid10
[2:00] <gregaf> it's about the disk, not the cpu...
[2:00] <mrjack_> should handle single mkfs shouldnt it?
[2:00] <mrjack_> 2x1gbits connected
[2:00] <gregaf> look, you're the one with access to it, but given your symptoms, that's the problem
[2:01] * Cube (~Adium@ Quit (Quit: Leaving.)
[2:01] <gregaf> we've found that many controllers do an astoundingly poor job of handling our workload for no apparent reason
[2:01] <mrjack_> i have the same servers in a 4 node setup working with > 100 kvm images and ocfs2 without problems...
[2:01] <gregaf> and in terms of writes you effectively have two disks (that you are perhaps using to host both the journal and the backing store)
[2:01] * yoshi (~yoshi@p22043-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:02] <mrjack_> how can i enlarge the journal?
[2:03] <gregaf> change its size in the ceph.conf and restart the OSD should do it, right mikeryan?
[2:03] <mrjack_> stop osd, ceph-osd -i 1 --mk-journal, start osd?
[2:03] <gregaf> I don't think so, but I'm not really sure
[2:04] <mrjack_> i changed it in the config
[2:04] <mrjack_> but it is not changed after restart
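Putting the journal-resize steps together: `osd journal size` in ceph.conf only applies when a journal is created, so an existing journal has to be flushed and recreated rather than just restarted. A hedged sketch (the OSD id, paths, and init invocation are illustrative):

```shell
# stop the OSD so the journal is quiescent
/etc/init.d/ceph stop osd.1

# flush pending journal entries into the filestore, then recreate the
# journal at the size now set in ceph.conf ("osd journal size", in MB)
ceph-osd -i 1 --flush-journal
rm /data/ceph_backend/osd/journal
ceph-osd -i 1 --mkjournal

/etc/init.d/ceph start osd.1
```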
[2:06] * fzylogic (~fzylogic@ has joined #ceph
[2:10] <mrjack_> how can i configure rbd caching to use with rbd device?
[2:12] <joshd> rbd caching can be used by qemu/librbd, but not the kernel rbd driver (which is what you're using when you run rbd map)
[2:13] <mrjack_> ah ok
[2:14] <mrjack_> node01:/ocfs2# dd if=/dev/zero of=test bs=1M count=100
[2:14] <mrjack_> 100+0 records in
[2:14] <mrjack_> 100+0 records out
[2:14] <mrjack_> 104857600 bytes (105 MB) copied, 0.312363 s, 336 MB/s
[2:15] <mrjack_> seems cached to me
[2:15] <mrjack_> but is ocfs2 cache
[2:16] <joshd> yeah, the linux page cache
[2:16] <mrjack_> node01:/ocfs2# dd bs=1M count=128 if=/dev/zero of=test conv=fdatasync
[2:16] <mrjack_> 128+0 records in
[2:16] <mrjack_> 128+0 records out
[2:16] <mrjack_> 134217728 bytes (134 MB) copied, 5.70149 s, 23.5 MB/s
[2:17] <mrjack_> seems slow
[2:17] <mrjack_> :)
[2:18] <mrjack_> this is inside a kvm image
[2:18] <mrjack_> dd bs=1M count=128 if=/dev/zero of=test conv=fdatasync
[2:18] <mrjack_> 128+0 records in
[2:18] <mrjack_> 128+0 records out
[2:18] <mrjack_> 134217728 bytes (134 MB) copied, 2.55952 s, 52.4 MB/s
[2:18] <mrjack_> ocfs2 kills performance by 50%
[2:18] <mrjack_> dlm locking
[2:19] <mrjack_> but since cephfs isnt usable yet..
[2:22] <nhm> mrjack_: in a single mds configuration it's at least experimentable. ;)
[2:22] <joshd> sagewk: wip-objecter looks good other than a typo
[2:23] <sagewk> joshd: thanks
[2:23] * bchrisman (~Adium@ Quit (Quit: Leaving.)
[2:23] <mrjack_> i never tried single mds
[2:23] <mrjack_> that would be pointless imho
[2:24] <mrjack_> i ended up panicking machines
[2:24] <mrjack_> freezing them
[2:24] <mrjack_> having directories i could not remove saying they are not empty but they were
[2:24] <mrjack_> performance was okay, though..
[2:26] <joshd> rbd caching also won't work with ocfs2, since it's local to each client
[2:26] <joshd> with one mds, data access is still in parallel
[2:26] <joshd> do you have a very metadata-heavy workloads?
[2:27] <mrjack_> what is this
[2:27] <mrjack_> [273993.208694] ioctl32(ceph-osd:4463): Unknown cmd fd(20) cmd(00009408){t:ffffff94;sz:0} arg(0000000f) on /data/ceph_backend/osd
[2:27] <mrjack_> joshd: no, i just want multiple mds for redundancy
[2:27] <joshd> mrjack_: ah, then you can have one active and others standby
[2:28] <mrjack_> hot standby?
[2:28] <mrjack_> so it would survive powerloss of one running mds?
[2:28] <joshd> yeah, it should
[2:28] <mrjack_> well
[2:28] <joshd> the mds's don't store anything themselves, it's all in objects on the osds
[2:29] <joshd> they act more like caches mostly
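As a sketch of the one-active-plus-standby layout joshd describes, in ceph.conf terms (daemon names and hosts are illustrative; the standby options are the ones from this era of Ceph):

```ini
[mds.a]
    host = node01
[mds.b]
    host = node02
    ; follow mds.a, replaying its journal so takeover is fast
    mds standby replay = true
    mds standby for name = a
```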
[2:29] <mrjack_> the problem i see is that there is no fsck
[2:29] <mrjack_> it is a pain if you cannot undelete directory
[2:30] <joshd> yeah, that's one of those things we'd like to have before it's ready for production
[2:30] <mrjack_> i once wrote fsck for mysqlfs fuse module..
[2:31] <mrjack_> but i do not know enough about how ceph works to contribute anything to ceph right now...
[2:31] <mrjack_> when the mds dont store anything
[2:31] <mrjack_> what is the point of it being a separate daemon?
[2:32] <joshd> it's the part that understands how the filesystem hierarchy and metadata are stored in objects, and it keeps stateful sessions with clients
[2:33] <mrjack_> again
[2:33] <mrjack_> do you know what this is
[2:33] <mrjack_> [273993.208694] ioctl32(ceph-osd:4463): Unknown cmd fd(20) cmd(00009408){t:ffffff94;sz:0} arg(0000000f) on /data/ceph_backend/osd
[2:33] <joshd> it's very much a layer on top of the object store
[2:34] <joshd> it sounds like the osd is trying to use an ioctl that the kernel doesn't support (strange, since it tries to detect support for them)
[2:34] <joshd> what fs are the osds using?
[2:39] <mrjack_> ext3
[2:39] <mrjack_> i first used btrfs
[2:39] <mrjack_> and ended up with unmountable filesystems killing my first two ceph installations
[2:39] <mrjack_> then switched to ext4
[2:40] <mrjack_> there, data corruption somehow screwed up all 4 filestores
[2:40] <mrjack_> ceph crashed ;)
[2:40] <mrjack_> now i use ext3
[2:40] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[2:41] <iggy> have you tried xfs?
[2:41] <mrjack_> no
[2:41] <mrjack_> does it run stable?
[2:42] <mrjack_> i thought xfs cannot be resized?
[2:42] <iggy> I think it's currently the suggested option until btrfs gets further along
[2:42] <mrjack_> there was a reason i didnt try xfs
[2:42] <mrjack_> ah o
[2:42] <mrjack_> i c
[2:42] <mrjack_> hm
[2:42] <mrjack_> can i just backup osd filestore with tar , format with xfs and replace backup?
[2:43] <mikeryan> mrjack_: xfs can grow, not shrink
[2:43] <joshd> looks like the only ioctl that isn't btrfs specific is fiemap, which shouldn't be being used anyway
[2:43] <mrjack_> so stop osd on a node, backup mountpoint, format with xfs, restore, start osd?
[2:47] <mrjack_> or is that a bad idea because of xattrs?
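On the xattr worry: it is real, because the filestore keeps object metadata in xattrs, so the copy must be made with a tool that preserves them. A hedged sketch of the migration (device, mountpoint, and backup paths are illustrative, and this assumes an rsync built with xattr support):

```shell
# stop the OSD so the filestore is quiescent
/etc/init.d/ceph stop osd.1

# copy off the filestore, keeping hardlinks, ACLs and xattrs (-H -A -X)
rsync -aHAX /data/ceph_backend/osd/ /backup/osd.1/

# reformat with xfs, remount, restore, restart
mkfs.xfs -f /dev/sdX
mount /dev/sdX /data/ceph_backend/osd
rsync -aHAX /backup/osd.1/ /data/ceph_backend/osd/
/etc/init.d/ceph start osd.1
```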
[2:54] * maelfius (~mdrnstm@ Quit (Quit: Leaving.)
[2:58] <nhm> iggy: it's probably more stable. Performance is lower, but performance degrades more slowly than btrfs as well.
[3:02] * lofejndif (~lsqavnbok@9YYAAI4WI.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[3:03] * lofejndif (~lsqavnbok@82VAAFYY0.tor-irc.dnsbl.oftc.net) has joined #ceph
[3:13] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) Quit (Quit: Leaving.)
[3:31] * fzylogic (~fzylogic@ Quit (Quit: fzylogic)
[3:33] <mrjack_> hm
[3:33] <mrjack_> does more osds mean more speed?
[3:47] * adjohn (~adjohn@ Quit (Quit: adjohn)
[3:54] * chutzpah (~chutz@ Quit (Quit: Leaving)
[3:55] * lofejndif (~lsqavnbok@82VAAFYY0.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[4:28] <iggy> mrjack_: that's one of the main ideas behind rados
[4:32] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[4:37] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[4:48] * nhm (~nhm@67-220-20-222.usiwireless.com) Quit (Remote host closed the connection)
[4:48] * nhm (~nhm@67-220-20-222.usiwireless.com) has joined #ceph
[4:58] * renzhi (~renzhi@ has joined #ceph
[4:59] <renzhi> morning
[5:00] <dmick> hi renzhi
[5:02] <renzhi> dmick: how's going?
[5:02] <dmick> I am well. you?
[5:02] <renzhi> doing fine, still keeping an eye on our ceph production system :)
[5:03] <renzhi> with 200TB of storage space
[5:04] <iggy> cephfs in production?
[5:05] <renzhi> no, just rados
[5:05] <dmick> healthy cluster
[5:05] <iggy> ahhh... kvm machines?
[5:06] <renzhi> we have a few kvm, yes, but still considering if we should put in on ceph.
[5:06] <renzhi> In testing, it's kinda slow, just couldn't figure out why.
[5:06] <renzhi> right now, it's used for object storage, e.g. files
[5:08] <renzhi> I have a small question. I'm trying to create a keyring that allows an app to access only a specific pool.
[5:08] <renzhi> The caps is set like this:
[5:08] <renzhi> client.pool.test2
[5:08] <renzhi> key: AQB1RjtQCBW/DxAAHwsIlErtojWKzrNXL5Am/w==
[5:08] <renzhi> caps: [mds] allow
[5:08] <renzhi> caps: [mon] allow r
[5:08] <renzhi> caps: [osd] allow rw pool=test2
[5:09] <renzhi> but when I try to access it via rados, I always got an authentication error, and operation not permitted.
[5:09] <renzhi> what's wrong with this caps?
[5:11] * nhm (~nhm@67-220-20-222.usiwireless.com) Quit (Ping timeout: 480 seconds)
[5:11] <dmick> that's exactly what is documented, so it seems like it ought to work
[5:11] <dmick> are you sure the key is correct and the one you're using?
[5:12] <renzhi> yes, I do "rados -c ceph.conf -p test2 ls -" and in ceph.conf, I specified the exact path of the keyring
[5:15] <dmick> and ceph auth list shows what you expect?
[5:15] <renzhi> the key information above is from the command "ceph auth list"
[5:15] <dmick> heh
[5:17] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:18] <renzhi> here is some debug info from rados:
[5:18] <renzhi> 2012-08-28 11:17:22.308537 7fd7464c9780 2 auth: KeyRing::load: loaded key file /home/xp/tmp2/test2.keyring
[5:18] <renzhi> 2012-08-28 11:17:22.336122 7fd7464c9780 0 librados: client.admin authentication error (1) Operation not permitted
[5:18] <renzhi> couldn't connect to cluster! error -1
[5:18] <renzhi> it's loading the right keyring file, but auth failed
[5:21] <dmick> that section of ceph auth list is from the client.admin section?
[5:23] <renzhi> http://pastebin.com/Y3rUjuQw
[5:23] <renzhi> the result of running ceph auth list, using the client.admin.keyring
[5:23] <dmick> oh you said client.pool.test2, right
[5:24] <renzhi> yes
[5:25] <renzhi> client.pool.test2 key is supposed to let me access only the pool test2, and not others
[5:25] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[5:26] <renzhi> I don't want every app to have client.admin.keyring, so each will be limited to access its own pool only
[5:28] <dmick> right
[5:31] * nhm (~nhm@67-220-20-222.usiwireless.com) has joined #ceph
[5:32] <renzhi> I was expecting this to be a walk in the park :)
[5:35] <dmick> what is setting the client id to rados.pool.test2 when you run the rados command?
[5:37] <renzhi> here are the commands I ran:
[5:38] <renzhi> ceph-authtool -C --gen-key --name client.pool.test2 --caps caps.txt test2.keyring
[5:38] <renzhi> ceph auth add client.pool.test2 -i test2.keyring
[5:39] <dmick> right, but then when you run rados, the client name defaults to client.admin
[5:39] <renzhi> I reviewed this against the wiki and documentation a few times, it should be correct
[5:39] <renzhi> :O
[5:40] <dmick> I *think* if you add -n client.pool.test2 that'll help
[5:40] <dmick> trying to test a similar setup here
[5:41] <renzhi> that's right
[5:41] <renzhi> as always, you are the man :)
[5:42] <dmick> happy I could help. I'm not very familiar with this stuff but I can dig around :)
[5:42] <renzhi> this is like redundant, why do I have to specify the name if the name is already in the keyring file, and it's loading the right keyring already?
[5:42] <dmick> the client has an identity
[5:43] <dmick> that identity is used to look up which key to use
[5:43] <renzhi> Note to self: pay attention to names, and read more ceph code
[5:43] <dmick> the keyring holds potentially many keys; the right one has to be identified for this client session; that's done by client ID
[5:43] <renzhi> ok
[5:44] <dmick> and then the right caps have to be set on the entity receiving that key to allow the access
[5:44] <dmick> (that's the ceph auth add part)
[5:44] <dmick> I believe the caps in the keyring file are not used except at auth add time
[5:45] <renzhi> thanks
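Putting dmick's fix together with the earlier commands, the whole pool-restricted flow looks roughly like this (names and files are the ones from the paste; `-k` points rados at the keyring directly):

```shell
# generate a key whose caps limit it to pool test2 (caps.txt as before)
ceph-authtool -C --gen-key --name client.pool.test2 --caps caps.txt test2.keyring

# register the key and its caps with the monitors
ceph auth add client.pool.test2 -i test2.keyring

# rados defaults to client.admin, so name the identity explicitly
rados -n client.pool.test2 -k test2.keyring -p test2 ls
```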
[5:46] <renzhi> I don't seem to remember a place where to set the identity in the rados API though, I might have missed it
[5:46] <dmick> librados?
[5:46] <renzhi> going back to the api docs...
[5:46] <renzhi> yeah
[5:46] <dmick> I remember that being slightly confusing
[5:47] <dmick> rados_create, 2nd arg
[5:48] <dmick> btw, if you're not already, it's easy to experiment with the API from Python
[5:48] <dmick> and then code C once it makes sense
[5:49] <renzhi> I see, I was scratching my head what that id is for :)
[5:49] <dmick> I'm happy to see that ids with extra '.' in them are handled :)
[5:50] <renzhi> funny thing is, the id "pool.test2" is not accepted
[5:53] <dmick> is not accepted where?
[5:54] <renzhi> by ceph-authtool
[5:55] <dmick> yes, it wants the 'client' bit, but apparently rados_create does not
[5:55] <renzhi> yeah
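Following dmick's suggestion to experiment from Python first: in the python-rados bindings the identity is passed as `rados_id`, which maps to rados_create()'s second argument, so it likewise goes in without the `client.` prefix (paths are the ones from the paste). A sketch, assuming a reachable cluster:

```python
import rados

# rados_id has no "client." prefix, matching rados_create()'s 2nd argument
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                      rados_id='pool.test2',
                      conf={'keyring': '/home/xp/tmp2/test2.keyring'})
cluster.connect()
try:
    ioctx = cluster.open_ioctx('test2')  # the only pool this key may touch
    for obj in ioctx.list_objects():
        print(obj.key)
    ioctx.close()
finally:
    cluster.shutdown()
```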
[5:56] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[6:18] * deepsa (~deepsa@ has joined #ceph
[6:58] * deepsa_ (~deepsa@ has joined #ceph
[7:02] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[7:02] * deepsa_ is now known as deepsa
[7:37] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[7:53] * dmick (~dmick@2607:f298:a:607:1a03:73ff:fedd:c856) Quit (Quit: Leaving.)
[8:03] * andret (~andre@pcandre.nine.ch) Quit (Remote host closed the connection)
[8:04] * andret (~andre@pcandre.nine.ch) has joined #ceph
[8:12] * Tobarja (~athompson@cpe-071-075-064-255.carolina.res.rr.com) Quit (Ping timeout: 480 seconds)
[8:15] * ihwtl (~ihwtl@odm-mucoffice-02.odmedia.net) has joined #ceph
[8:16] <ihwtl> .
[8:36] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * glowell (~Adium@c-98-210-226-131.hsd1.ca.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * rosco (~r.nap@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * gregaf (~Adium@2607:f298:a:607:4990:c1e3:3fe9:b77f) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * dpemmons (~dpemmons@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * rturk (~rturk@ps94005.dreamhost.com) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * acaos (~zac@209-99-103-42.fwd.datafoundry.com) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * ajm (~ajm@adam.gs) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * eightyeight (~atoponce@pinyin.ae7.st) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * mkampe (~markk@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * MK_FG (~MK_FG@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * eternaleye (~eternaley@tchaikovsky.exherbo.org) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * kblin (~kai@kblin.org) Quit (resistance.oftc.net oxygen.oftc.net)
[8:36] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[8:36] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[8:36] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[8:36] * glowell (~Adium@c-98-210-226-131.hsd1.ca.comcast.net) has joined #ceph
[8:36] * gregaf (~Adium@2607:f298:a:607:4990:c1e3:3fe9:b77f) has joined #ceph
[8:36] * rosco (~r.nap@ has joined #ceph
[8:36] * dpemmons (~dpemmons@ has joined #ceph
[8:36] * kblin (~kai@kblin.org) has joined #ceph
[8:36] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[8:36] * rturk (~rturk@ps94005.dreamhost.com) has joined #ceph
[8:36] * acaos (~zac@209-99-103-42.fwd.datafoundry.com) has joined #ceph
[8:36] * ajm (~ajm@adam.gs) has joined #ceph
[8:36] * eightyeight (~atoponce@pinyin.ae7.st) has joined #ceph
[8:36] * mkampe (~markk@ has joined #ceph
[8:36] * MK_FG (~MK_FG@ has joined #ceph
[8:36] * eternaleye (~eternaley@tchaikovsky.exherbo.org) has joined #ceph
[8:48] * mkampe (~markk@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * ajm (~ajm@adam.gs) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * acaos (~zac@209-99-103-42.fwd.datafoundry.com) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * rturk (~rturk@ps94005.dreamhost.com) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * kblin (~kai@kblin.org) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * dpemmons (~dpemmons@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * rosco (~r.nap@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * gregaf (~Adium@2607:f298:a:607:4990:c1e3:3fe9:b77f) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * glowell (~Adium@c-98-210-226-131.hsd1.ca.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * MK_FG (~MK_FG@ Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * eternaleye (~eternaley@tchaikovsky.exherbo.org) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * eightyeight (~atoponce@pinyin.ae7.st) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (resistance.oftc.net oxygen.oftc.net)
[8:48] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[8:48] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[8:48] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[8:48] * glowell (~Adium@c-98-210-226-131.hsd1.ca.comcast.net) has joined #ceph
[8:48] * gregaf (~Adium@2607:f298:a:607:4990:c1e3:3fe9:b77f) has joined #ceph
[8:48] * rosco (~r.nap@ has joined #ceph
[8:48] * dpemmons (~dpemmons@ has joined #ceph
[8:48] * kblin (~kai@kblin.org) has joined #ceph
[8:48] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[8:48] * rturk (~rturk@ps94005.dreamhost.com) has joined #ceph
[8:48] * acaos (~zac@209-99-103-42.fwd.datafoundry.com) has joined #ceph
[8:48] * ajm (~ajm@adam.gs) has joined #ceph
[8:48] * eightyeight (~atoponce@pinyin.ae7.st) has joined #ceph
[8:48] * mkampe (~markk@ has joined #ceph
[8:48] * MK_FG (~MK_FG@ has joined #ceph
[8:48] * eternaleye (~eternaley@tchaikovsky.exherbo.org) has joined #ceph
[8:51] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * lxo (~aoliva@lxo.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * masterpe (~masterpe@2001:990:0:1674::1:82) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * stan_theman (~stan_them@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * blufor (~blufor@adm-1.candycloud.eu) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * wido (~wido@2a00:f10:104:206:9afd:45af:ae52:80) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * ferai (~quassel@quassel.jefferai.org) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * NaioN (stefan@andor.naion.nl) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * gohko (~gohko@natter.interq.or.jp) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * laevar (~jochen@laevar.de) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * hijacker (~hijacker@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * mengesb (~bmenges@servepath-gw3.servepath.com) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * ogelbukh (~weechat@nat3.4c.ru) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * sagewk (~sage@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * yehudasa (~yehudasa@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * todin (tuxadero@kudu.in-berlin.de) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * Anticimex (anticimex@netforce.csbnet.se) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * andret (~andre@pcandre.nine.ch) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * _are__ (~quassel@2a01:238:4325:ca02::42:4242) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * kibbu (claudio@owned.ethz.ch) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * al (d@fourrooms.bandsal.at) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * mikeryan (mikeryan@lacklustre.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * deepsa (~deepsa@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * mrjack_ (mrjack@office.smart-weblications.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * joao (~JL@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * darkfaded (~floh@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * exec (~defiler@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * benner (~benner@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * pmjdebruijn (~pmjdebrui@overlord.pcode.nl) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * cclien (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * spaceman-39642 (l@ Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * wonko_be (bernard@november.openminds.be) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * Ormod (~valtha@ohmu.fi) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * morpheus (~morpheus@foo.morphhome.net) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * raso (~raso@deb-multimedia.org) Quit (resistance.oftc.net synthon.oftc.net)
[8:51] * MarkS (~mark@irssi.mscholten.eu) Quit (resistance.oftc.net synthon.oftc.net)
[8:56] * andret (~andre@pcandre.nine.ch) has joined #ceph
[8:56] * deepsa (~deepsa@ has joined #ceph
[8:56] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) has joined #ceph
[8:56] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[8:56] * masterpe (~masterpe@2001:990:0:1674::1:82) has joined #ceph
[8:56] * mrjack_ (mrjack@office.smart-weblications.net) has joined #ceph
[8:56] * joao (~JL@ has joined #ceph
[8:56] * stan_theman (~stan_them@ has joined #ceph
[8:56] * darkfaded (~floh@ has joined #ceph
[8:56] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[8:56] * blufor (~blufor@adm-1.candycloud.eu) has joined #ceph
[8:56] * wido (~wido@2a00:f10:104:206:9afd:45af:ae52:80) has joined #ceph
[8:56] * ferai (~quassel@quassel.jefferai.org) has joined #ceph
[8:56] * _are__ (~quassel@2a01:238:4325:ca02::42:4242) has joined #ceph
[8:56] * exec (~defiler@ has joined #ceph
[8:56] * mengesb (~bmenges@servepath-gw3.servepath.com) has joined #ceph
[8:56] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) has joined #ceph
[8:56] * benner (~benner@ has joined #ceph
[8:56] * NaioN (stefan@andor.naion.nl) has joined #ceph
[8:56] * spaceman-39642 (l@ has joined #ceph
[8:56] * gohko (~gohko@natter.interq.or.jp) has joined #ceph
[8:56] * laevar (~jochen@laevar.de) has joined #ceph
[8:56] * pmjdebruijn (~pmjdebrui@overlord.pcode.nl) has joined #ceph
[8:56] * hijacker (~hijacker@ has joined #ceph
[8:56] * ogelbukh (~weechat@nat3.4c.ru) has joined #ceph
[8:56] * cclien (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) has joined #ceph
[8:56] * guerby (~guerby@nc10d-ipv6.tetaneutral.net) has joined #ceph
[8:56] * kibbu (claudio@owned.ethz.ch) has joined #ceph
[8:56] * sagewk (~sage@ has joined #ceph
[8:56] * yehudasa (~yehudasa@ has joined #ceph
[8:56] * todin (tuxadero@kudu.in-berlin.de) has joined #ceph
[8:56] * Anticimex (anticimex@netforce.csbnet.se) has joined #ceph
[8:56] * al (d@fourrooms.bandsal.at) has joined #ceph
[8:56] * mikeryan (mikeryan@lacklustre.net) has joined #ceph
[8:56] * raso (~raso@deb-multimedia.org) has joined #ceph
[8:56] * MarkS (~mark@irssi.mscholten.eu) has joined #ceph
[8:56] * morpheus (~morpheus@foo.morphhome.net) has joined #ceph
[8:56] * Ormod (~valtha@ohmu.fi) has joined #ceph
[8:56] * wonko_be (bernard@november.openminds.be) has joined #ceph
[9:07] * adjohn (~adjohn@108-225-130-229.lightspeed.sntcca.sbcglobal.net) Quit (Quit: adjohn)
[9:09] * EmilienM (~EmilienM@ has joined #ceph
[9:10] * BManojlovic (~steki@ has joined #ceph
[9:20] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[9:21] * yoshi (~yoshi@p22043-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[9:24] * Leseb (~Leseb@ has joined #ceph
[9:26] * loicd (~loic@brln-4dbac51f.pool.mediaWays.net) has joined #ceph
[9:29] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[9:34] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[9:52] * mikeryan (mikeryan@lacklustre.net) Quit (Remote host closed the connection)
[10:07] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[10:09] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[10:11] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[10:12] * BManojlovic (~steki@smile.zis.co.rs) has joined #ceph
[10:18] * BManojlovic (~steki@smile.zis.co.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[11:31] * BManojlovic (~steki@ has joined #ceph
[11:49] <masterpe> Good morning
[11:50] <masterpe> I have a little problem, somehow I can't get the mds really working
[11:50] <masterpe> mdsmap e5201: 8/8/8 up {0=c=up:resolve,1=b=up:resolve(laggy or crashed),2=a=up:resolve(laggy or crashed),3=c=up:resolve(laggy or crashed),4=a=up:resolve(laggy or crashed),5=b=up:resolve,6=d=up:resolve,7=d=up:resolve(laggy or crashed)}
[11:51] <masterpe> I have 8 mds systems and somehow duplicate systems
[11:53] <renzhi> shouldn't the number of mds be odd instead of even?
[11:56] <masterpe> That I don't know
[12:02] <NaioN> renzhi: not for the mds
[12:02] <NaioN> for the mon's
[12:03] <NaioN> the mds'es divide the cephfs tree among them (with multiple active)
[12:04] <NaioN> masterpe: could you pastebin the ceph.conf?
[12:06] <renzhi> ok
[12:07] <NaioN> mds is only relevant if you use cephfs
[12:13] <masterpe> http://pastebin.com/raw.php?i=nqFshxGW
[12:20] * deepsa_ (~deepsa@ has joined #ceph
[12:21] * mikeryan (mikeryan@lacklustre.net) has joined #ceph
[12:21] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[12:21] * deepsa_ is now known as deepsa
[12:25] <masterpe> Can I do a ceph mds newfs metadata data
[12:25] <masterpe> to fix my problem?
[12:44] * Cube (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[13:16] * mrjack_ (mrjack@office.smart-weblications.net) Quit (Ping timeout: 480 seconds)
[13:17] <masterpe> On the faulty nodes the /etc/ceph/keyring.mds.* is empty
[13:17] <NaioN> hmmm that's a problem
[13:18] <NaioN> if the key isn't correct the node can't connect with the cluster
[13:18] <NaioN> but I would recommend you to only use the mgmt-vhosts as mon and mds servers
[13:18] <masterpe> How do I generate a new file?
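(One possible way to regenerate a missing mds keyring, as a hedged sketch rather than a confirmed recipe: the mds name, capability strings, and init command are examples and should be checked against the cluster's own setup. This is run from a node that still has a working admin keyring.)

```
# mds.a is an example daemon name; adjust caps to match the cluster's policy
ceph auth get-or-create mds.a mon 'allow rwx' osd 'allow *' mds 'allow' \
    -o /etc/ceph/keyring.mds.a
# restart that mds so it picks up the regenerated key
service ceph restart mds.a
```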
[13:18] <NaioN> furthermore, what do you mean by that?
[13:19] <NaioN> furthermore by default only 1 mds is active
[13:19] <NaioN> and you only need more mds'es if you have a big active cephfs (with active I mean metadata active)
[13:20] <NaioN> for redundancy you could make 1 active mds and one failover mds and the rest standby
[13:20] <NaioN> or two active and two failover
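(The active/standby split NaioN describes can be expressed in ceph.conf; a hedged fragment, where the section names are examples and only the standby options are the point:)

```
; example only: mds.b shadows rank 0 as a hot standby,
; any mds section without these options acts as a plain standby
[mds.b]
    mds standby for rank = 0
    mds standby replay = true
```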
[13:20] <masterpe> On the vhost hosts I want to have kvm guests running; for that I need to move the ceph cluster to the storage cluster.
[13:21] <NaioN> but the two active serve different parts of the cephfs tree
[13:21] <NaioN> ok
[13:21] <masterpe> I started the project with everything on the vhosts systems, but that wasn't very stable
[13:22] <NaioN> well in that case you could use the first three osds or something
[13:22] <NaioN> it isn't stable to have the client and server side on the same nodes
[13:22] <masterpe> indeed
[13:22] <NaioN> it's a known issue
[13:23] <NaioN> but it's something you don't want in a production environment
[13:23] <masterpe> It is an test environment
[13:24] <NaioN> I have a production setup with three physical servers for MON+MDS (although I don't use cephfs)
[13:24] <NaioN> and 4 OSD nodes
[13:25] <NaioN> connected with an internal network
[13:25] <masterpe> In the process of moving to the storage system, I did something wrong
[13:25] <NaioN> and at the moment three so called storheads, that connect to the cluster as clients and re-export the volumes
[13:25] <NaioN> I only use rbd's at the moment
[13:26] <NaioN> well the problem is your cluster contains data
[13:26] <masterpe> yes
[13:27] <NaioN> can you build it back to the point where it worked?
[13:27] <masterpe> I have tried it and it didn't work
[13:27] <NaioN> then you can remove mons and mds'es and move the remaining to the good nodes
[13:28] <NaioN> well the mds'es don't use any local storage, they only depend on the config file and, if you use auth, the keyring
[13:30] <NaioN> do the logfiles of the mds'es tell anything?
[13:48] * lofejndif (~lsqavnbok@09GAAHY8H.tor-irc.dnsbl.oftc.net) has joined #ceph
[13:50] * deepsa_ (~deepsa@ has joined #ceph
[13:51] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[13:51] * deepsa_ is now known as deepsa
[14:01] * mrjack_ (~mrjack@tmo-096-99.customers.d1-online.com) has joined #ceph
[14:09] * loicd (~loic@brln-4dbac51f.pool.mediaWays.net) Quit (Ping timeout: 480 seconds)
[14:12] * nhm (~nhm@67-220-20-222.usiwireless.com) Quit (Ping timeout: 480 seconds)
[14:14] * loicd (~loic@brln-4dbc3b23.pool.mediaWays.net) has joined #ceph
[14:14] <jamespage> anyone know the magic to make s3cmd work well with radosgw? The s3 command from libs3 works fine - but s3cmd keeps getting authorization failures...
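(s3cmd signs requests against the configured S3 host, so a common cause of authorization failures against radosgw is a host_base/host_bucket mismatch; a hedged .s3cfg fragment, where the gateway hostname is an example:)

```
host_base = radosgw.example.com
host_bucket = %(bucket)s.radosgw.example.com
use_https = False
```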
[14:23] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[14:31] * deepsa (~deepsa@ Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[14:32] * deepsa (~deepsa@ has joined #ceph
[14:38] * tnt (~tnt@212-166-48-236.win.be) has joined #ceph
[14:38] <tnt> Is it possible to convert an OSD to use a partition instead of a file for journal ?
[14:43] <exec> tnt: hi. osd journal = /dev/V$name/journal
[14:44] <exec> then stop osd, journal flush, mkjournal, start osd
[14:44] <exec> should it work?
[14:45] <tnt> what is "journal flush" exactly ?
[14:46] <exec> one sec
[14:46] <exec> ceph-osd --flush-journal
[14:47] <exec> ceph-osd -i $ID --flush-journal )
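(The steps exec lists can be strung together like this; a hedged sketch where the osd id, the journal partition path, and the service commands are examples that vary by distribution and setup:)

```
ID=0
# 1. point the osd at the new journal device in ceph.conf, e.g.
#      [osd.0]
#      osd journal = /dev/sdb1
service ceph stop osd.$ID
ceph-osd -i $ID --flush-journal   # drain the old file-based journal to the store
ceph-osd -i $ID --mkjournal       # initialize the journal on the partition
service ceph start osd.$ID
```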
[14:47] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[14:47] * BManojlovic (~steki@ has joined #ceph
[14:47] <exec> btw, what is size of your journal per osd?
[14:47] <tnt> ok, I'll give that a shot
[14:47] <tnt> currently 1G file and I'll try 2G partition
[14:49] <exec> hm. I've switched to 8G partition b/c I saw "JOURNAL FULL" messages too often
[14:50] <exec> 0.48.1 version ?
[14:50] <tnt> Good to know. Currently this is only a test cluster so I'm not too decided yet about the size of the journal.
[14:50] <tnt> 0.48.1 yes
[14:54] <exec> mine is also pretty much a test cluster, and it almost works )
[14:57] * ivan` (~ivan`@li125-242.members.linode.com) Quit (Ping timeout: 480 seconds)
[14:57] * ivan` (~ivan`@li125-242.members.linode.com) has joined #ceph
[15:30] * deepsa_ (~deepsa@ has joined #ceph
[15:33] * deepsa (~deepsa@ Quit (Ping timeout: 480 seconds)
[15:33] * deepsa_ is now known as deepsa
[15:41] * deepsa (~deepsa@ Quit (Quit: ["Textual IRC Client: www.textualapp.com"])
[15:44] * jbd_ (~jbd_@34322hpv162162.ikoula.com) Quit (Ping timeout: 480 seconds)
[15:48] * mrjack_ (~mrjack@tmo-096-99.customers.d1-online.com) Quit (Read error: Connection reset by peer)
[15:57] * nhm (~nhm@67-220-20-222.usiwireless.com) has joined #ceph
[16:16] * BManojlovic (~steki@ Quit (Ping timeout: 480 seconds)
[16:31] <tnt> Anyone using rbd ceph xen ? I'm seeing a fairly big impact from exposing a kernel RBD device in dom0 to a domU. This impact doesn't happen when exposing a physical device to the domU though so it's some kind of weird xen <-> ceph interaction ...
[16:44] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[16:49] * BManojlovic (~steki@ has joined #ceph
[17:12] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[17:17] * BManojlovic (~steki@ Quit (Quit: Ja odoh a vi sta 'ocete...)
[17:18] <nhm> wow, this rocketraid card is benchmarking surprisingly well.
[17:40] * deepsa (~deepsa@ has joined #ceph
[17:41] * tnt (~tnt@212-166-48-236.win.be) Quit (Read error: Operation timed out)
[17:47] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:00] * tnt (~tnt@11.164-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[18:02] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) Quit (Quit: Ex-Chat)
[18:18] * shdb (~shdb@80-219-123-230.dclient.hispeed.ch) has joined #ceph
[18:22] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Ping timeout: 480 seconds)
[18:23] * Leseb (~Leseb@ Quit (Quit: Leseb)
[18:23] * BManojlovic (~steki@ has joined #ceph
[18:24] * Leseb (~Leseb@ has joined #ceph
[18:28] * aliguori (~anthony@ has joined #ceph
[18:28] * chutzpah (~chutz@ has joined #ceph
[18:36] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[18:53] * Leseb (~Leseb@ Quit (Quit: Leseb)
[18:58] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[18:59] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[19:02] * bchrisman (~Adium@ has joined #ceph
[19:04] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[19:05] * Cube (~Adium@ has joined #ceph
[19:06] * morse (~morse@supercomputing.univpm.it) Quit (Remote host closed the connection)
[19:12] * deepsa (~deepsa@ Quit ()
[19:13] * danieagle (~Daniel@ has joined #ceph
[19:22] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[19:23] * fghaas (~florian@ has joined #ceph
[19:34] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[19:37] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:40] * dmick (~dmick@2607:f298:a:607:44ac:37a3:2aad:d0eb) has joined #ceph
[19:41] * mrjack_ (mrjack@office.smart-weblications.net) has joined #ceph
[19:49] * ferai (~quassel@quassel.jefferai.org) Quit (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
[19:50] * Ryan_Lane (~Adium@ has joined #ceph
[19:50] * jefferai (~quassel@quassel.jefferai.org) has joined #ceph
[19:52] * fghaas (~florian@ Quit (Read error: Operation timed out)
[19:58] * jefferai (~quassel@quassel.jefferai.org) Quit (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
[20:00] * jefferai (~quassel@quassel.jefferai.org) has joined #ceph
[20:00] * Ryan_Lane (~Adium@ Quit (Quit: Leaving.)
[20:01] * danieagle (~Daniel@ Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[20:01] * Ryan_Lane (~Adium@ has joined #ceph
[20:03] * ihwtl (~ihwtl@odm-mucoffice-02.odmedia.net) Quit (Ping timeout: 480 seconds)
[20:04] * maelfius (~mdrnstm@ has joined #ceph
[20:04] * fghaas (~florian@ has joined #ceph
[20:17] * mrjack_ (mrjack@office.smart-weblications.net) Quit (Ping timeout: 480 seconds)
[20:17] * fghaas (~florian@ Quit (Quit: Leaving.)
[20:44] <joao> gregaf, do you know if the messenger delivers messages to a monitor that was down once the message was sent, once that same monitor becomes available?
[20:45] <joao> s/down once/down when/
[20:45] * dspano (~dspano@rrcs-24-103-221-202.nys.biz.rr.com) has joined #ceph
[20:49] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[21:00] <mikeryan> sjust: review on wip_bug_3048_rados_bench when you get a chance
[21:20] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Quit: Leaving.)
[21:31] * MK_FG (~MK_FG@ Quit (Quit: o//)
[21:35] <sjust> this is ok for all users of completion_wait()?
[21:35] * MK_FG (~MK_FG@ has joined #ceph
[21:53] <gregaf> joao: a little more context for that question?
[21:54] <gregaf> if we're talking a new process, no, the messenger is not going to deliver those; the connection will reset and it'll get propagated up the chain
[21:54] <joao> gregaf, say mon.a sends a message to mon.b while mon.b is not running; would it even be considered as possible that mon.b would nonetheless receive the message once it was brought up?
[21:55] <joao> so, there is no chance that the messenger will 'retry' sending the message at some point?
[21:57] <joao> if there's no chance of that happening, I'll have to figure out what happened then
[21:57] <gregaf> umm, I think that is possible
[21:58] <gregaf> if the monitors hadn't been previously connected, then the sending monitor would start up a new Pipe with the (nonexistent) receiver as the endpoint, and queue a message for send
[21:58] <joao> yes, assuming that both monitors hadn't been connected previously
[22:00] <gregaf> yeah, they're non-lossy connections so it would just keep trying to connect on an exponential backoff and if it ever did connect then it would send
[22:00] <joao> well, if that's so, it pretty much sucks in my case; I'll have to review the assumptions I make when handling messages
[22:00] <gregaf> if they had been previously connected, then I think the messages will get dropped and a reset will be propagated to the monitor
[22:01] <gregaf> what assumptions does it impact?
[22:03] <joao> for instance, that you should only handle a sync request if you are part of the quorum, or if you replied to a probe message (triggering a sync on the other side)
[22:03] <gregaf> fence the messages with an election_epoch that they apply to?
[22:04] <joao> I suppose that would mean we would have to go through an election first, no?
[22:04] <gregaf> new monitors don't join without an election, right?
[22:04] <joao> they don't join, but they do sync
[22:05] <joao> they may even sync without enough peers to form a quorum
[22:05] <gregaf> right, so they'd say "the monmap has this election version when we sent this message" and the receiving monitor could check and see if that's the same as their version
[22:05] <gregaf> maybe that's insufficient
[22:06] <joao> the monmap is shared during the probe; the receiving monitor would have the same monmap version
[22:06] <gregaf> ah, right
[22:07] <joao> I'm going to take another look at the messenger's code
[22:07] <gregaf> so you're worried about a sync provider not being available, and then getting the sync request after they've already started syncing with a different provider, and starting to send those messages back anyway?
[22:07] <joao> maybe there's a way to avoid the message's being retried
[22:07] <gregaf> no, there isn't
[22:08] <joao> well, I'm not only worried, as I experienced a scenario that I assumed as impossible
[22:08] <gregaf> I didn't mean to imply that, just trying to understand the scenario
[22:09] <joao> say, mon.b is syncing with mon.c; mon.c fails; mon.b tries mon.d; mon.d is unavailable; mon.b finally syncs with mon.a
[22:09] <gregaf> I think you'll need to add a bit more state to make sure the messages remain valid somehow
[22:09] <gregaf> yeah
[22:09] <joao> when mon.d comes back, with a fresh store, receives a sync_start_chunks message
[22:09] <joao> and simply goes boom
[22:09] <joao> booms on an assert, btw
[22:10] <joao> yeah
[22:10] <joao> I think I've figured it out
[22:10] <joao> add the 'paxos version at which point we started to sync' on the message
[22:10] <joao> make it imperative there
[22:11] <joao> just drop the message if the monitor's paxos version is lower than the message's version
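(The drop rule joao describes reduces to a single version comparison; a minimal sketch of the idea, where the function and variable names are made up for illustration:)

```shell
# Handle a sync message only if the local paxos version is at least the
# version stamped on the message when the sync started; otherwise drop it.
should_handle_sync() {
    local mon_version=$1 msg_version=$2
    if [ "$mon_version" -ge "$msg_version" ]; then
        echo handle
    else
        echo drop
    fi
}
```

As joao notes right after, this fences the specific stale-message case but is not yet a general solution for a provider with a nonempty store.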
[22:11] <gregaf> does that handle the case where the "provider" store is nonempty?
[22:12] <joao> no, I was talking how it would fix this specific case
[22:12] <joao> a generic approach should be found
[22:12] <gregaf> I think you want a general solution, not a collection of patches :)
[22:13] <joao> yeah, but maybe I can extrapolate this onto something else more generic
[22:14] <joao> I can only wonder if there would be a better way to choose a new monitor for the sync, besides randomly trying the monmap's monitors
[22:14] <gregaf> you are looking at the ones it lists as in the quorum, not just present, right?
[22:15] <joao> no, just present
[22:15] <joao> we cannot rely on the quorum as there may be none
[22:15] <gregaf> wouldn't you want to take quorum members and then fall back on the most up-to-date one available if there is no quourm?
[22:15] <gregaf> *quorum
[22:16] <joao> well, that's certainly true, but we don't have a quorum until we trigger an election
[22:16] <joao> well
[22:16] <joao> I think the best approach here is to go through the probe phase again
[22:16] <joao> keep some sane amount of info to restart the sync from where we left off, without losing our connection to the leader
[22:17] <joao> but go through the probe phase nonetheless
[22:17] <gregaf> that sounds like it could work
[22:17] <joao> it will involve some reworking on the probe handling functions though
[22:18] <joao> but it's not like the monitor didn't suffer enough reworks by now... :)
[22:20] <gregaf> well, the probe handlers shouldn't need any changes, just the requester, right?
[22:22] <joao> in fact, the only change we might have to do regards the check on whether we are probing or not, when we are handling the probe reply
[22:23] <joao> we will want to add an "or if we are synchronizing and currently looking for a new peer" to the "if we are probing, do continue on this function without dropping the message right away"
[22:32] * jamespage (~jamespage@tobermory.gromper.net) Quit (Quit: Coyote finally caught me)
[22:34] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:45] <dmick> elder: you around?
[23:03] * mrjack_ (mrjack@office.smart-weblications.net) has joined #ceph
[23:03] <mrjack_> re
[23:14] * sjust (~sam@2607:f298:a:607:baac:6fff:fe83:5a02) has left #ceph
[23:36] * EmilienM (~EmilienM@ has left #ceph
[23:39] * loicd (~loic@brln-4dbc3b23.pool.mediaWays.net) Quit (Quit: Leaving.)
[23:56] * aliguori (~anthony@ Quit (Quit: Ex-Chat)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.