#ceph IRC Log

IRC Log for 2011-11-11

Timestamps are in GMT/BST.

[0:03] <sagewk> nwatkins`: that write throughput looks pretty slow. can you do a 'rados -p rbd bench 120 write' and see what kind of throughput that gives you?
[0:05] <nwatkins`> sagewk: sure. does it need to be on an osd node?
[0:06] <sagewk> from wherever... ideally a node where a hadoop worker would also run
[0:10] <nwatkins`> sagewk: i'm guessing this isn't supposed to happen: http://pastebin.com/KzU5VTrM
[0:11] <sagewk> hmm, what does ceph -s say?
[0:11] <nwatkins`> 2011-11-10 15:11:34.440814 pg v12917: 2376 pgs: 2376 active+clean; 41669 MB data, 97423 MB used, 2515 GB / 2750 GB avail
[0:11] <nwatkins`> 2011-11-10 15:11:34.448704 mds e24: 1/1/1 up {0=a=up:active}
[0:11] <nwatkins`> 2011-11-10 15:11:34.448746 osd e20: 12 osds: 12 up, 12 in
[0:11] <nwatkins`> 2011-11-10 15:11:34.448854 log 2011-11-10 14:54:01.342717 mds.0 192.168.141.123:6801/10943 35 : [INF] closing stale session client.6730 192.168.141.139:0/1010579 after 304.336945
[0:11] <nwatkins`> 2011-11-10 15:11:34.448972 mon e1: 1 mons at {a=192.168.141.123:6789/0}
[0:12] <nwatkins`> hmm. is that readable?
[0:12] <sagewk> yeah.. weird.
[0:30] * fronlius (~fronlius@e182093240.adsl.alicedsl.de) has joined #ceph
[0:34] <nwatkins`> sagewk: we're gonna bump these tests up to 30 nodes. is there a best practice configuration for utilizing multiple disks per node?
[0:40] * nwatkins` (~user@kyoto.soe.ucsc.edu) Quit (Remote host closed the connection)
[0:42] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[0:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[0:55] * fronlius (~fronlius@e182093240.adsl.alicedsl.de) Quit (Quit: fronlius)
[1:36] <sjust> nwatkins: we tend to recommend one osd process per disk
[1:38] <sagewk> nwatkins`: if you set host = in the ceph.conf the crush map mkcephfs generates will set things up properly
[2:35] * bchrisman (~Adium@108.60.121.114) Quit (Quit: Leaving.)
[2:36] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:46] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[3:18] * Nightdog_ (~karl@190.84-48-62.nextgentel.com) has joined #ceph
[3:18] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Read error: Connection reset by peer)
[3:24] * aa (~aa@r190-135-242-109.dialup.adsl.anteldata.net.uy) has joined #ceph
[3:25] * cp (~cp@206.15.24.21) Quit (Quit: cp)
[4:00] * aa (~aa@r190-135-242-109.dialup.adsl.anteldata.net.uy) Quit (Quit: Konversation terminated!)
[4:00] * aa (~aa@r190-135-242-109.dialup.adsl.anteldata.net.uy) has joined #ceph
[4:02] * aa (~aa@r190-135-242-109.dialup.adsl.anteldata.net.uy) Quit ()
[4:02] * aa (~aa@r190-135-242-109.dialup.adsl.anteldata.net.uy) has joined #ceph
[4:12] * sagewk (~sage@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[4:21] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) has joined #ceph
[4:21] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) Quit ()
[5:29] * Nightdog_ (~karl@190.84-48-62.nextgentel.com) Quit (Remote host closed the connection)
[6:17] * aa (~aa@r190-135-242-109.dialup.adsl.anteldata.net.uy) Quit (Remote host closed the connection)
[6:35] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[6:37] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) has joined #ceph
[7:13] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[7:59] * fronlius (~fronlius@e182093240.adsl.alicedsl.de) has joined #ceph
[9:17] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[9:31] * yoshi (~yoshi@p9224-ipngn1601marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[10:43] * Nightdog (~karl@190.84-48-62.nextgentel.com) has joined #ceph
[11:06] * verwilst (~verwilst@dD5769260.access.telenet.be) has joined #ceph
[11:25] * fronlius (~fronlius@e182093240.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[11:35] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[12:01] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[12:11] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:21] * fronlius_ (~fronlius@testing78.jimdo-server.com) has joined #ceph
[14:21] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[14:21] * fronlius_ is now known as fronlius
[14:34] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[14:38] * svenx (92744@diamant.ifi.uio.no) has joined #ceph
[14:39] <svenx> about http://ceph.newdream.net/wiki/Adjusting_replication_level , have there been ideas about having ceph be clever with placement of replicas?
[14:40] <svenx> say a pool spans over two locations, and you don't want all replicas to reside in one of them (by chance)
[14:40] <svenx> possibly with multiple levels (e.g. datacenters, racks, even blade enclosures)
[14:42] <psomas> svenx: http://ceph.newdream.net/wiki/Custom_data_placement_with_CRUSH
[14:42] <svenx> ah, d'oh!
[14:42] <svenx> saw it just now :)
[14:42] <svenx> thanks!
[14:42] <psomas> i think now the crushmap will parse the host= and rack= options in the conf
[14:46] <svenx> okay, sounds good
[15:32] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[15:33] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[16:49] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[16:51] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[16:58] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[17:02] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) has joined #ceph
[17:03] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[17:45] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[18:09] * adjohn (~adjohn@70-36-139-211.dsl.dynamic.sonic.net) Quit (Quit: adjohn)
[18:12] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) has joined #ceph
[18:13] <jmlowe> What is the best way to move from 0.37 to 0.38 without taking everything down?
[18:39] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:57] * mgalkiewicz (~maciej.ga@85.89.186.247) has joined #ceph
[19:00] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[19:01] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) has joined #ceph
[19:03] <mgalkiewicz> Hello guys. What should caps for mon, mds and osd look like?
[19:04] <mgalkiewicz> what happens when they are not set?
[19:04] <mgalkiewicz> and what is the difference between rwx and *?
[19:27] <todin> jmlowe: I think you should update the mon's first and then the osd
[19:27] <todin> psomas: you have an example for the host and rack options in the ceph.conf?
[19:32] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[19:37] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[19:38] * adjohn (~adjohn@208.90.214.43) Quit (Quit: adjohn)
[19:38] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[19:38] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) has joined #ceph
[19:45] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:52] * nwatkins (~user@kyoto.soe.ucsc.edu) has joined #ceph
[19:54] * adjohn is now known as Guest16753
[19:54] * Guest16753 (~adjohn@208.90.214.43) Quit (Read error: Connection reset by peer)
[19:54] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[19:58] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[20:00] * adjohn is now known as Guest16754
[20:00] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[20:06] * Guest16754 (~adjohn@208.90.214.43) Quit (Ping timeout: 480 seconds)
[20:06] * adjohn is now known as Guest16755
[20:06] * adjohn (~adjohn@mc60536d0.tmodns.net) has joined #ceph
[20:12] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:12] * Guest16755 (~adjohn@208.90.214.43) Quit (Ping timeout: 480 seconds)
[20:20] * verwilst (~verwilst@dD5769260.access.telenet.be) Quit (Quit: Ex-Chat)
[20:39] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[20:47] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:47] * lxo (~aoliva@lxo.user.oftc.net) Quit ()
[20:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[20:53] * cp (~cp@c-98-234-218-251.hsd1.ca.comcast.net) Quit (Quit: cp)
[20:53] * adjohn (~adjohn@mc60536d0.tmodns.net) Quit (Read error: No route to host)
[20:53] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[21:07] * jmlowe (~Adium@129-79-195-139.dhcp-bl.indiana.edu) Quit (Quit: Leaving.)
[21:08] * jmlowe (~Adium@140-182-215-237.dhcp-bl.indiana.edu) has joined #ceph
[21:16] * jmlowe (~Adium@140-182-215-237.dhcp-bl.indiana.edu) Quit (Ping timeout: 480 seconds)
[21:25] * cp (~cp@206.15.24.21) has joined #ceph
[21:27] * adjohn (~adjohn@208.90.214.43) Quit (Quit: adjohn)
[21:44] * sagewk (~sage@aon.hq.newdream.net) has joined #ceph
[22:06] * adjohn (~adjohn@208.90.214.43) has joined #ceph
[22:13] * adjohn (~adjohn@208.90.214.43) Quit (Read error: Connection reset by peer)
[22:14] * adjohn (~adjohn@mc60536d0.tmodns.net) has joined #ceph
[22:24] * votz (~votz@pool-108-52-121-103.phlapa.fios.verizon.net) Quit (Quit: Leaving)
[22:40] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[22:43] * jpieper (~josh@209-6-86-62.c3-0.smr-ubr2.sbo-smr.ma.cable.rcn.com) Quit (Quit: Ex-Chat)
[22:50] * Nightdog (~karl@190.84-48-62.nextgentel.com) Quit (Remote host closed the connection)
[23:01] <todin> If there are >2
[23:01] <todin> racks, separate across racks.
[23:02] <todin> why not >1 rack? I think that should work as well?
[23:03] <todin> and in the code it is if (racks.size() > 3) {
[23:03] * tjikkun (~tjikkun@82-169-255-84.ip.telfort.nl) has joined #ceph
[23:10] * mig5 (~mig5@ppp59-167-182-161.vic.adsl.internode.on.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.