#ceph IRC Log


IRC Log for 2012-10-01

Timestamps are in GMT/BST.

[0:17] * amatter (~amatter@c-174-52-137-136.hsd1.ut.comcast.net) Quit (Ping timeout: 480 seconds)
[0:18] * lofejndif (~lsqavnbok@9YYAAJEGQ.tor-irc.dnsbl.oftc.net) has joined #ceph
[0:29] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[0:32] * lofejndif (~lsqavnbok@9YYAAJEGQ.tor-irc.dnsbl.oftc.net) Quit (Quit: gone)
[0:38] * tziOm (~bjornar@ti0099a340-dhcp0358.bb.online.no) Quit (Remote host closed the connection)
[0:39] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[1:16] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) has joined #ceph
[1:17] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:23] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:79ef:b329:5e0f:1c24) has joined #ceph
[1:40] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:79ef:b329:5e0f:1c24) Quit (Quit: LarsFronius)
[2:01] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:09] * steki-BLAH (~steki@212.200.241.170) Quit (Quit: I'm off, you lot do what you want...)
[3:32] * The_Bishop (~bishop@2001:470:50b6:0:58d6:8844:6d:102e) Quit (Quit: Who the hell is this peer? If I catch him I'll reset his connection!)
[4:00] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[4:06] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[4:08] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[4:15] * eightyeight (~atoponce@pinyin.ae7.st) has left #ceph
[4:28] * dty (~derek@pool-71-178-175-208.washdc.fios.verizon.net) has joined #ceph
[4:40] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[4:40] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit ()
[4:47] * dty (~derek@pool-71-178-175-208.washdc.fios.verizon.net) Quit (Quit: dty)
[4:47] * amatter (amatter@c-174-52-137-136.hsd1.ut.comcast.net) has joined #ceph
[4:54] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[4:54] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[5:37] * maelfius1 (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[5:43] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[5:49] * maelfius1 (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[6:32] * amatter_ (~amatter@209.63.136.130) has joined #ceph
[6:37] * amatter (amatter@c-174-52-137-136.hsd1.ut.comcast.net) Quit (Ping timeout: 480 seconds)
[6:58] * amatter_ (~amatter@209.63.136.130) Quit (Ping timeout: 480 seconds)
[7:16] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[7:26] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) has joined #ceph
[7:37] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[7:37] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[7:54] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[8:01] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[8:04] * loicd (~loic@magenta.dachary.org) has joined #ceph
[8:05] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[8:05] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[8:06] * yoshi (~yoshi@EM117-55-68-182.emobile.ad.jp) has joined #ceph
[8:16] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[8:30] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[8:31] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[8:45] * maelfius (~mdrnstm@pool-71-160-33-115.lsanca.fios.verizon.net) Quit (Quit: Leaving.)
[8:52] * yoshi (~yoshi@EM117-55-68-182.emobile.ad.jp) Quit (Remote host closed the connection)
[9:07] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[9:11] * masterpe_ is now known as masterpe
[9:17] * BManojlovic (~steki@91.195.39.5) has joined #ceph
[9:36] * loicd (~loic@178.20.50.225) has joined #ceph
[9:45] * Leseb (~Leseb@193.172.124.196) has joined #ceph
[9:54] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[9:55] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[9:56] * yoshi_ (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[9:56] * yoshi (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Read error: Connection reset by peer)
[10:28] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:561:a8dd:b174:63f1) has joined #ceph
[10:51] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[10:51] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:561:a8dd:b174:63f1) Quit (Quit: LarsFronius)
[11:33] * yoshi_ (~yoshi@p37219-ipngn1701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[11:48] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) Quit (Quit: Leaving.)
[11:49] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:54] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[11:54] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[11:57] * karmin (ca037809@ircip4.mibbit.com) has joined #ceph
[11:57] <karmin> Hi, folks!
[11:58] * loicd (~loic@178.20.50.225) Quit (Ping timeout: 480 seconds)
[11:58] <karmin> Given the time differences most developers must be enjoying sweet dreams, but is anyone available here now?
[12:08] * karmin (ca037809@ircip4.mibbit.com) Quit (Quit: http://www.mibbit.com ajax IRC Client)
[12:19] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[12:20] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit ()
[12:26] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[12:26] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[12:26] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[12:37] * MikeMcClurg (~mike@62.200.22.2) has joined #ceph
[12:39] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[12:39] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[12:45] * tryggvil_ (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[12:45] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Read error: Connection reset by peer)
[12:45] * tryggvil_ is now known as tryggvil
[13:24] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) has joined #ceph
[13:48] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (Remote host closed the connection)
[13:48] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[14:22] * antsygeek (~antsygeek@108-166-90-85.static.cloud-ips.com) has joined #ceph
[14:25] <antsygeek> can i configure fault tolerance (like a RAID-5) in ceph? can't find it in the docs
[14:26] <andreask> it already does replication by default
[14:27] <antsygeek> andreask: hmm, but i have mounted it from a separate machine and it shows the accumulated space of all servers
[14:27] <antsygeek> how can that be replicated?
[14:28] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) Quit (Quit: zzzzzzzzzzzzzzzzzzzz)
[14:28] <andreask> antsygeek: yes, but if you put data into rados you will see it is counted twice ... do a "rados ls"
[14:29] <andreask> antsygeek: rados df ... sorry
[14:29] <antsygeek> andreask: I don't have access to the servers right now, but thanks
[14:30] <antsygeek> so if i have 3 nodes with 5TB each, i will have 5TB available?
[14:31] <antsygeek> and if i have 10 nodes with 5TB each, i will still have 5TB?
[14:34] <andreask> no, by default 2 replicas are kept so you will have half the disk space for data
[14:34] <andreask> but of course you can change that
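A rough worked example of the default 2x replication: 3 nodes x 5TB = 15TB raw, which holds roughly 15/2 = 7.5TB of data; 10 nodes x 5TB = 50TB raw gives roughly 25TB usable, minus journal and filesystem overhead.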
[14:34] <antsygeek> what is this setting called?
[14:35] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[14:35] <antsygeek> or the "ceph term" for it :-)
[14:36] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[14:38] <andreask> antsygeek: a pool has a size==number of replicas for each object
[14:39] <antsygeek> ok, thanks
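For reference, a minimal sketch of inspecting and changing the per-pool replica count from the CLI (the pool name 'data' is just an example; output format varies a little between versions):

    ceph osd dump | grep 'rep size'     # each pool line shows its replication level
    ceph osd pool set data size 3       # keep 3 copies of every object in pool 'data'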
[14:41] * antsygeek (~antsygeek@108-166-90-85.static.cloud-ips.com) has left #ceph
[14:45] * kibbu (claudio@owned.ethz.ch) Quit (Remote host closed the connection)
[14:51] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[15:00] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) has joined #ceph
[15:02] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * nwl (~levine@atticus.yoyo.org) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * ogelbukh (~weechat@nat3.4c.ru) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * benner (~benner@193.200.124.63) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * mkampe (~markk@2607:f298:a:607:222:19ff:fe31:b5d3) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * jamespage (~jamespage@tobermory.gromper.net) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (synthon.oftc.net charon.oftc.net)
[15:02] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (synthon.oftc.net charon.oftc.net)
[15:04] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[15:04] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[15:04] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) has joined #ceph
[15:04] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[15:04] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[15:04] * jamespage (~jamespage@tobermory.gromper.net) has joined #ceph
[15:04] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has joined #ceph
[15:04] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[15:04] * mkampe (~markk@2607:f298:a:607:222:19ff:fe31:b5d3) has joined #ceph
[15:04] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[15:04] * benner (~benner@193.200.124.63) has joined #ceph
[15:04] * ogelbukh (~weechat@nat3.4c.ru) has joined #ceph
[15:04] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[15:15] * nwl (~levine@atticus.yoyo.org) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * ogelbukh (~weechat@nat3.4c.ru) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * benner (~benner@193.200.124.63) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * mkampe (~markk@2607:f298:a:607:222:19ff:fe31:b5d3) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * jamespage (~jamespage@tobermory.gromper.net) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * alexxy (~alexxy@2001:470:1f14:106::2) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * MikeMcClurg (~mike@62.200.22.2) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * darkfader (~floh@188.40.175.2) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * laevar (~jochen@laevar.de) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * gregorg_taf (~Greg@78.155.152.6) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * SpamapS (~clint@xencbyrum2.srihosting.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * f4m8 (f4m8@kudu.in-berlin.de) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * jeffhung_ (~jeffhung@60-250-103-120.HINET-IP.hinet.net) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * stan_theman (~stan_them@173.208.221.221) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * rturk (~rturk@ps94005.dreamhost.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * iggy (~iggy@theiggy.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * markl (~mark@tpsit.com) Quit (resistance.oftc.net synthon.oftc.net)
[15:15] * asadpanda (~asadpanda@67.231.236.80) Quit (resistance.oftc.net synthon.oftc.net)
[15:17] * dty (~derek@pool-71-178-175-208.washdc.fios.verizon.net) has joined #ceph
[15:17] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[15:17] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[15:17] * SvenDowideit (~SvenDowid@203-206-171-38.perm.iinet.net.au) has joined #ceph
[15:17] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[15:17] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[15:17] * jamespage (~jamespage@tobermory.gromper.net) has joined #ceph
[15:17] * scheuk (~scheuk@67.110.32.249.ptr.us.xo.net) has joined #ceph
[15:17] * alexxy (~alexxy@2001:470:1f14:106::2) has joined #ceph
[15:17] * mkampe (~markk@2607:f298:a:607:222:19ff:fe31:b5d3) has joined #ceph
[15:17] * tjikkun (~tjikkun@2001:7b8:356:0:225:22ff:fed2:9f1f) has joined #ceph
[15:17] * benner (~benner@193.200.124.63) has joined #ceph
[15:17] * ogelbukh (~weechat@nat3.4c.ru) has joined #ceph
[15:17] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[15:17] * aliguori (~anthony@cpe-70-123-140-180.austin.res.rr.com) has joined #ceph
[15:17] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[15:17] * silversurfer (~silversur@124x35x68x250.ap124.ftth.ucom.ne.jp) has joined #ceph
[15:17] * mtk (~mtk@ool-44c35bb4.dyn.optonline.net) has joined #ceph
[15:17] * Karcaw (~evan@68-186-68-219.dhcp.knwc.wa.charter.com) has joined #ceph
[15:17] * SpamapS (~clint@xencbyrum2.srihosting.com) has joined #ceph
[15:17] * rturk (~rturk@ps94005.dreamhost.com) has joined #ceph
[15:17] * markl (~mark@tpsit.com) has joined #ceph
[15:17] * stan_theman (~stan_them@173.208.221.221) has joined #ceph
[15:17] * iggy (~iggy@theiggy.com) has joined #ceph
[15:17] * asadpanda (~asadpanda@67.231.236.80) has joined #ceph
[15:17] * f4m8 (f4m8@kudu.in-berlin.de) has joined #ceph
[15:17] * jeffhung_ (~jeffhung@60-250-103-120.HINET-IP.hinet.net) has joined #ceph
[15:17] * MikeMcClurg (~mike@62.200.22.2) has joined #ceph
[15:17] * darkfader (~floh@188.40.175.2) has joined #ceph
[15:17] * laevar (~jochen@laevar.de) has joined #ceph
[15:17] * gregorg_taf (~Greg@78.155.152.6) has joined #ceph
[15:31] * scuttlemonkey (~scuttlemo@173-14-58-198-Michigan.hfc.comcastbusiness.net) has joined #ceph
[15:32] * tziOm (~bjornar@194.19.106.242) has joined #ceph
[15:34] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[15:38] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) Quit ()
[15:40] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[15:44] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[15:54] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[15:55] * loicd (~loic@90.84.144.37) has joined #ceph
[16:03] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[16:05] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[16:08] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[16:12] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has left #ceph
[16:16] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[16:16] <exec> hi there. how can I move pg data from $osd_dir/current/$pg_head to another osd without starting the old one?
[16:17] <exec> I've tried to move the directory itself to another osd, but that osd coredumps during startup
[16:20] * morse (~morse@supercomputing.univpm.it) has joined #ceph
[16:22] * tziOm (~bjornar@194.19.106.242) Quit (Remote host closed the connection)
[16:24] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Quit: Leaving.)
[16:28] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[16:34] * gaveen (~gaveen@112.134.114.109) has joined #ceph
[16:48] * nhorman (~nhorman@nat-pool-rdu.redhat.com) has joined #ceph
[16:49] * idnc_sk (~idnc_sk@92.240.234.20) has joined #ceph
[16:49] <idnc_sk> Hi
[16:50] <idnc_sk> I've been testing ceph on a 2-node system for the last two weeks but I can't get around an HA issue
[16:50] <idnc_sk> if I stop mds on one node, all works OK, but if I stop the mon daemon, this is what I get
[16:51] <idnc_sk> 2012-10-01 16:47:34.902285 7f010f753700 1 mon.21@1(probing) e1 discarding message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere; we are not in quorum
[16:51] <idnc_sk> ceph version 0.48.1argonaut
[16:52] <idnc_sk> config coming...
[16:53] <idnc_sk> http://pastebin.com/7bjKeFV8
[16:54] <idnc_sk> the mon msg is from the mon log on the running server
[16:55] <Fruit> you need an odd number of mon nodes
[16:55] <Fruit> most commonly 3
[16:56] <idnc_sk> can I run one 2mons on one system/in a vm - for testing purposes
[16:56] <idnc_sk> *ignore the "one"
[16:56] <idnc_sk> part
[16:56] <Fruit> note sure. you could run a tie-breaker mon node on your desktop :)
[16:56] <Fruit> s/note/not/
[16:57] <idnc_sk> sed-ster
[16:57] <idnc_sk> :)
[16:57] <Fruit> http://ceph.com/docs/master/config-cluster/ceph-conf/#monitors
[16:57] <idnc_sk> ok, let me check this - good to know about that 3node min requirement, thx
[16:58] <joao> idnc_sk, you can run with two monitors, but as Fruit said, there is no tie-breaker
[16:58] <Fruit> that explains about the odd node requirement
[16:58] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[17:00] <joao> also, with only two monitors, if one goes down the other one won't be able to form a quorum; hence the message you got
[17:02] <idnc_sk> went through the mon part - damn, must have forgotten about that, should re-rtfm first :) will probably configure the 3rd mon in a failover VM(if possible), we'll see how that goes
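A minimal sketch of a three-monitor ceph.conf, with the third mon on a tie-breaker VM as suggested above (hostnames and addresses are made up):

    [mon.a]
        host = node1
        mon addr = 192.168.0.1:6789
    [mon.b]
        host = node2
        mon addr = 192.168.0.2:6789
    [mon.c]
        host = tiebreaker-vm
        mon addr = 192.168.0.3:6789

With all three up, 'ceph mon stat' should show a quorum of three, and losing any single monitor still leaves a majority.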
[17:06] * jantje_ (~jan@paranoid.nl) Quit (Ping timeout: 480 seconds)
[17:08] * slang (~slang@ace.ops.newdream.net) has joined #ceph
[17:10] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Remote host closed the connection)
[17:11] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[17:16] * slang (~slang@ace.ops.newdream.net) Quit (Ping timeout: 480 seconds)
[17:18] * jantje (~jan@paranoid.nl) has joined #ceph
[17:26] * Cube1 (~Adium@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[17:28] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[17:28] * BManojlovic (~steki@91.195.39.5) Quit (Quit: I'm off, you lot do what you want...)
[17:38] * cblack101 (86868949@ircip2.mibbit.com) has joined #ceph
[17:40] * scuttlemonkey (~scuttlemo@173-14-58-198-Michigan.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[17:41] * scuttlemonkey (~scuttlemo@173-14-58-198-Michigan.hfc.comcastbusiness.net) has joined #ceph
[17:49] * idnc_sk (~idnc_sk@92.240.234.20) Quit (Quit: leaving)
[17:51] <exec> is it normal that ceph reports pgs as stale from osds which were removed from the cluster? from my PoV they should be "down"
[17:52] * amatter (~amatter@209.63.136.130) has joined #ceph
[17:54] * Cube1 (~Adium@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[17:55] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[18:02] * Leseb (~Leseb@193.172.124.196) Quit (Quit: Leseb)
[18:05] * jlogan1 (~Thunderbi@72.5.59.176) has joined #ceph
[18:05] * nhorman (~nhorman@nat-pool-rdu.redhat.com) Quit (Quit: Leaving)
[18:09] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[18:10] * Tv_ (~tv@2607:f298:a:607:b899:20f7:e1bb:234c) has joined #ceph
[18:14] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Read error: Connection reset by peer)
[18:16] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Ping timeout: 480 seconds)
[18:16] * LarsFronius_ is now known as LarsFronius
[18:18] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[18:21] * amatter_ (~amatter@209.63.136.130) has joined #ceph
[18:24] * amatter (~amatter@209.63.136.130) Quit (Ping timeout: 480 seconds)
[18:25] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[18:29] <gregaf> exec: it reports PGs as stale if it hasn't gotten a progress report on them in a long time, so you can generally consider that to mean down
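A quick, hypothetical way to see which PGs the monitors consider stale:

    ceph health detail | grep stale     # lists stale PGs and where they were last reported
    ceph pg dump | grep stale           # full PG table, filtered to stale states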
[18:30] * gregaf (~Adium@2607:f298:a:607:25ce:c8f2:6135:761c) Quit (Quit: Leaving.)
[18:31] * gregaf (~Adium@38.122.20.226) has joined #ceph
[18:32] <exec> gregaf: I have their data directories; can I put these dirs into any existing osds?
[18:32] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[18:32] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[18:32] * LarsFronius_ is now known as LarsFronius
[18:33] <gregaf> exec: I'm not sure exactly what'd be involved if you can't turn on the old OSD containing them; but it's not quite that simple
[18:33] <gregaf> I'll poke sjust when he gets in; he can figure it out
[18:35] <exec> gregaf: thanks. I have stopped these osds, but they were also removed from the cluster
[18:35] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[18:38] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[18:49] * MikeMcClurg (~mike@62.200.22.2) Quit (Quit: Leaving.)
[18:49] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[18:49] * nhm (~nhm@174-20-35-45.mpls.qwest.net) has joined #ceph
[18:49] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[18:50] <Tv_> exec: if you did "ceph osd rm" to remove the osds from the cluster, i don't see how their data directories would be easily useful for anything anymore...
[18:50] <Tv_> manual disaster recovery, perhaps, but easy automatic things, unlikely
[18:52] * jjgalvez (~jjgalvez@cpe-76-175-17-226.socal.res.rr.com) has joined #ceph
[18:52] <exec> Tv_: i'm talking about manual recovery of the PGs' data
[18:54] <exec> now I've brought the removed osds back. however it's painful because the "ceph osd create $n" command ignores the requested osd number
[18:54] * exec afk
[18:55] <Tv_> exec: i don't think creating a new osd with the same id is going to help you much, anyway
[18:56] <Tv_> but sjust would be the expert
[18:56] <Tv_> each object is a file, so on that level recovery should be doable
[18:57] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Remote host closed the connection)
[18:58] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[18:59] * Cube1 (~Adium@12.248.40.138) has joined #ceph
[19:00] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit ()
[19:01] * adjohn (~adjohn@69.170.166.146) has joined #ceph
[19:05] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) has joined #ceph
[19:10] * joshd (~joshd@2607:f298:a:607:221:70ff:fe33:3fe3) has joined #ceph
[19:11] * Leseb_ (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[19:11] * Leseb (~Leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[19:11] * Leseb_ is now known as Leseb
[19:14] * Ryan_Lane (~Adium@c-67-160-217-184.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[19:15] * dmick (~dmick@38.122.20.226) has joined #ceph
[19:19] * danieagle (~Daniel@186.214.56.89) has joined #ceph
[19:27] * chutzpah (~chutz@199.21.234.7) has joined #ceph
[19:40] <exec> Tv_: it helped
[19:40] <exec> Tv_: I mean starting old osds with the same id's
[19:41] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[19:41] <exec> however I'd like to know some *manual* way to inject pg data into an existing osd, if it's possible.
[19:42] * houkouonchi-work (~linux@12.248.40.138) Quit (Remote host closed the connection)
[19:42] <exec> waiting for sjust comments )
[19:42] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[19:43] <sjust> exec: sorry, here now
[19:44] * LarsFronius_ (~LarsFroni@testing78.jimdo-server.com) has joined #ceph
[19:44] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Read error: Connection reset by peer)
[19:44] * LarsFronius_ is now known as LarsFronius
[19:44] <sjust> there isn't really a good way to reinject the information short of writing a utility to scan the old data directory and re-upload the data via rados
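A very rough sketch of the kind of utility described above, assuming the objects have already been salvaged into plain files named after their rados object names (real filestore filenames encode the object name plus a hash suffix, so an actual tool would have to decode those first; pool and paths are made up):

    POOL=data
    SRCDIR=/tmp/salvaged-objects
    for f in "$SRCDIR"/*; do
        obj=$(basename "$f")
        rados -p "$POOL" put "$obj" "$f"   # re-upload each object through librados
    done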
[19:44] <Tv_> sagewk: fyi the teuthology "console" task; frequent ipmi sol connects/disconnects cause the drac serial console to get stuck as locked, beware running that feature for real
[19:45] <nhm> Tv_: It's hard to believe how buggy those things are.
[19:45] <exec> sjust: however I haven't seen anything osd-specific in the $pg_head directory
[19:46] <Tv_> nhm: Enterprise Quality!
[19:46] <exec> sjust: where is some meta data stored?
[19:46] <sjust> the problem is how the osds decide whether the information they have is Good Enough
[19:46] <sjust> one sec
[19:47] <sjust> the meta data is in the meta directory (each pg has a log and an info file)
[19:47] <sjust> also, some information for each object may be stored in the omap subdirectory (leveldb store)
[19:47] <sjust> so it's generally not enough to yank the pg directory
[19:47] <exec> sjust: will look into, thanks.
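Putting the pieces sjust lists together, the relevant parts of a filestore osd data directory look roughly like this (names may differ slightly between versions):

    $osd_data/current/<pgid>_head/   # the PG's object files
    $osd_data/current/meta/          # per-PG log and info objects
    $osd_data/current/omap/          # leveldb store holding per-object key/value data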
[19:48] <nhm> metadata stored in the cloud!
[19:48] * nhm runs
[19:48] <sjust> but even so, reintegrating the pg back into an osd is tricky since the osds where it used to be stored are logically dead
[19:48] <exec> sjust: another q: ceph osd create $n won't work.
[19:48] <sjust> no, I don't think so
[19:48] <exec> it creates the first free id instead of the one specified
[19:49] <sagewk> tv_: yeah, i figured. i've had that task sitting around for a while and figured i'd push it, even if we can't use it for nightlies or whatever
[19:49] <sjust> yeah, probably to avoid this exact situation
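A hypothetical sketch of why this recovery worked: 'ceph osd create' hands out the lowest free id rather than honouring a requested number, so re-adding osd.0 only lands back on id 0 while that id is still free.

    ceph osd create        # prints the newly allocated id, e.g. 0
    ceph-osd -i 0          # start the daemon against the old data directory
                           # (the cephx key and crush entry must still match)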
[19:49] <exec> sjust: I've repaired the cluster by recreating the osds with the old pg data; haven't checked data consistency but all pgs are in active+clean state for now
[19:49] <sjust> oh
[19:50] <sjust> how did you do that?
[19:50] <exec> sjust: by recreating the old osds ). luckily their numbers start from zero
[19:50] <exec> b/c I hadn't purged the data on them
[19:51] <sjust> using ceph osd create?
[19:51] <exec> yup
[19:52] * exec going to fsck on repaired volume.
[19:52] <sjust> ah, I know why it worked, cool
[19:53] <exec> b/c it has the same ids?
[19:53] <sjust> yeah, but pg_interval_t doesn't know the osds were removed :)
[19:53] <sjust> just an interval detail I found funny
[19:54] <sjust> *internal
[19:54] <exec> what is default param?
[19:54] <sjust> sorry?
[19:54] <exec> the osds were removed several hours ago.
[19:55] <exec> do you mean that by default pg_interval_t should be finished before the osds are marked as lost forever?
[19:56] <exec> I mean these osds had been removed for several hours or so
[19:56] <sjust> sorry, pg_interval_t is part of a structure used by the osds to track where the pg has been in the past, it doesn't appear to explicitly track the difference between the newly created osds and the old ones with the sam ids
[19:56] <sjust> *same ids
[19:56] <sjust> it's not a config variable
[19:57] <exec> yup, I've seen via ceph health detail that these pgs were on old osds.
[19:57] <exec> what is value of pg_interval_t ?
[19:58] <sjust> it's a C++ struct in osd_types.h. probably not interesting to you, I just wanted to track down what exactly happened :)
[19:58] * edv (~edjv@107-1-75-186-ip-static.hfc.comcastbusiness.net) has joined #ceph
[19:58] <exec> sjust: ok, but i'd like to know how long ceph remembers a lost osd
[19:58] <exec> )
[20:00] * scuttlemonkey (~scuttlemo@173-14-58-198-Michigan.hfc.comcastbusiness.net) Quit (Quit: zzzzzzzzzzzzzzzzzzzz)
[20:01] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[20:01] <sjust> exec: immediately, there isn't a time limit
[20:02] * LarsFronius (~LarsFroni@testing78.jimdo-server.com) Quit (Quit: LarsFronius)
[20:02] <sjust> though I wouldn't count on being able to re-add osds in a case like this, the behavior is arguably a bug
[20:02] <exec> sjust: but I've managed my issue thanks to this bug )
[20:03] <sjust> yeah, we'd definitely have to add an alternate mechanism before fixing/changing it :)
[20:05] <exec> sjust: yup. btw, could you say something about kernel-rbd client vs qemu-rbd? as I understand, the second one should be more stable and tested
[20:06] <sjust> yeah, I think we currently consider qemu-rbd more stable
[20:07] <exec> can I trust it with my backups? )
[20:07] <sjust> also, I think it has better performance due to better caching support, but joshd would have better info
[20:08] <joshd> if you can use qemu-rbd, I'd recommend it
[20:08] <exec> it's hard to remember, who is responsible for each part )
[20:08] * Ryan_Lane (~Adium@216.38.130.167) has joined #ceph
[20:09] <joshd> like sam said, the client-side caching tends to help performance a lot
[20:09] <exec> joshd: yup, I can, however it's a more complex setup to support
[20:09] <exec> joshd: aha. I've tried both.
[20:10] <exec> thank you guys
[20:10] <sjust> sure!
[20:10] <joshd> no problem
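A minimal sketch of attaching an rbd image through qemu's librbd driver with client-side caching enabled (pool/image names are made up, and the exact cache options depend on the qemu and librbd versions):

    qemu-system-x86_64 ... \
        -drive format=rbd,file=rbd:rbd/vm-disk1:rbd_cache=true,if=virtio,cache=writeback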
[20:13] * loicd (~loic@90.84.144.37) Quit (Ping timeout: 480 seconds)
[20:14] <nhm> joshd: it'd be nice to add caching to kernel-rbd too.
[20:14] <elder> Yes.
[20:14] <joshd> it would be a good chunk of work
[20:14] <elder> Eventually we can, but we have other priorities...
[20:15] <joshd> discard support too
[20:16] <nhm> What are the circumstances where people can't use qemu-rbd? I seem to remember someone on the mailing list saying they had to use the kernel one.
[20:16] <elder> That would be really good, and easier.
[20:16] <exec> do you mean ext4 discard opt? It should be cool )
[20:17] <joshd> ext4, xfs, blkdev -- they all use the TRIM command
[20:18] <elder> It's used on SSD's to efficiently tell the device "block no longer in use" so they don't have to get zeroed.
[20:18] <elder> But also on VM's, etc.
[20:18] <exec> aha.
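For context, discard/TRIM is how a filesystem tells the block device that freed blocks no longer matter; once a block driver supports it, it can be enabled roughly like this (device and mountpoint are hypothetical):

    mount -o discard /dev/vdb /mnt/data   # issue TRIMs inline as blocks are freed
    fstrim /mnt/data                      # or trim in one batch after the fact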
[20:18] <joshd> nhm: usually if they're not using qemu
[20:19] * mgalkiewicz (~mgalkiewi@staticline-31-182-149-180.toya.net.pl) has joined #ceph
[20:21] <houkouonchi-work> vercoi01 is down and being looked at
[20:21] <nhm> elder: For VM images, discard would let you reclaim space from sparse images that had temporary data written to them right?
[20:23] <mgalkiewicz> Hi guys, I am dealing with poor osd performance, and after some time I have ruled out filesystem ageing by reinstalling the osds.
[20:24] <elder> Well, I'm not exactly sure what you mean, but for block devices anyway, they have no knowledge of how content is used. So telling it "TRIM" a range of blocks means "these are all zero-filled" and in some cases the device can take great advantage of that information.
[20:24] <elder> RBD would be an excellent case in point. But SSD's as well.
[20:24] <sjust> mgalkiewicz: I don't quite follow :)
[20:24] <mgalkiewicz> I have also upgraded to version 0.51 which has better support for small IO operations. I am using rbd volumes on top of which are postgresql and mongodb.
[20:25] <sjust> ok
[20:26] * jlogan1 (~Thunderbi@72.5.59.176) Quit (Remote host closed the connection)
[20:29] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) Quit (Ping timeout: 480 seconds)
[20:29] <nhm> mgalkiewicz: we spoke about this a week or two ago right?
[20:29] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:29] <mgalkiewicz> nhm: yes
[20:29] <mgalkiewicz> my production cluster is still at least 2 times slower than staging. My config includes 1 brand new osd, 1 installed around 2 weeks ago, and the last is shut down
[20:31] <nhm> sjust: from my recollection, basically there's multiple VMs that run a mix of posgresql and mongodb. Lots of small IO, and at least for mongo I think there's some kind of write lock involved.
[20:32] <nhm> A nasty storm of things Ceph doesn't like.
[20:34] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:a8d1:ab3a:90a4:4ead) has joined #ceph
[20:35] <mgalkiewicz> I am not sure what to check next. Last time you asked me to shut down the clients and then run some benchmarks, but that is not currently possible. I like ceph but it is too slow for me right now; I am considering migrating to iscsi/drbd. It would be helpful for ceph debugging.
[20:35] * jlogan1 (~Thunderbi@2600:c00:3010:1:e419:2dcf:b362:1800) has joined #ceph
[20:35] <mgalkiewicz> what do you suggest?
[20:37] <nhm> mgalkiewicz: You had mentioned that the production cluster is 3x faster. Is that performing adequately?
[20:41] <houkouonchi-work> and vercoi01 back up
[20:46] * mgalkiewicz (~mgalkiewi@staticline-31-182-149-180.toya.net.pl) Quit (Ping timeout: 480 seconds)
[21:00] * LarsFronius_ (~LarsFroni@2a02:8108:3c0:79:4a3:e866:bb76:605b) has joined #ceph
[21:02] * verwilst (~verwilst@d5152FEFB.static.telenet.be) Quit (Ping timeout: 480 seconds)
[21:03] * verwilst (~verwilst@d5152FEFB.static.telenet.be) has joined #ceph
[21:06] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:a8d1:ab3a:90a4:4ead) Quit (Ping timeout: 480 seconds)
[21:06] * LarsFronius_ is now known as LarsFronius
[21:13] <Tv_> nhm: "What are the circumstances where people can't use qemu-rbd?" - 1. hypervisors without librbd integration, such as xen 2. desire to use some kernel-level features, e.g. BMW tends to run RAID-1 on top of two LUNs from separate SANs, they want to stay close to that setup even when replacing a SAN with RBD
[21:13] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[21:15] <Tv_> nhm: a single mongo server is a single-writer design
[21:16] * gaveen (~gaveen@112.134.114.109) Quit (Ping timeout: 480 seconds)
[21:18] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) has joined #ceph
[21:22] <mgalkiewicz> nhm: looks like my message has not been sent, my production cluster is 2x slower than staging
[21:24] * dilemma (~dilemma@2607:fad0:32:a02:1e6f:65ff:feac:7f2a) has joined #ceph
[21:25] <dilemma> anyone notice that when doing a "ceph osd dump", the PID in the OSD address is off?
[21:25] <dilemma> example: 1.2.3.4:6800/5678 (where 5678 is supposed to be the pid of the osd process)
[21:25] <dilemma> when the actual pid is 5680
[21:26] <dilemma> presumably, this PID is gathered before daemonizing, which would explain why it tends to be exactly 2 less than the actual PID
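That guess matches how the address is built: the number after the '/' is a nonce the daemon records when its messenger binds, before it forks into the background, e.g.:

    ceph osd dump | grep '^osd'   # addresses look like 1.2.3.4:6800/5678, where 5678 is the bind-time nonce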
[21:27] * cblack101 (86868949@ircip2.mibbit.com) Quit (Ping timeout: 480 seconds)
[21:28] * scuttlemonkey (~scuttlemo@c-69-244-181-5.hsd1.mi.comcast.net) has joined #ceph
[21:30] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[21:30] * loicd (~loic@magenta.dachary.org) has joined #ceph
[21:30] * stxShadow (~Jens@ip-178-203-169-190.unitymediagroup.de) has joined #ceph
[21:31] * stxShadow (~Jens@ip-178-203-169-190.unitymediagroup.de) Quit ()
[21:31] <nhm> mgalkiewicz: oh, I had it backwards. Ok. This is a bit involved, but you could try running blktrace on both the production and staging cluster, then use seekwatcher to look at the seek behavior on the devices behind the OSDs.
[21:32] * stxShadow (~Jens@ip-178-203-169-190.unitymediagroup.de) has joined #ceph
[21:32] <mgalkiewicz> nhm: ok I will take a look
[21:33] <nhm> mgalkiewicz: If it's slow and the seeks look bad, it might just mean that the underlying filesystem is causing a lot of fragmentation. If the seek behavior looks fine, there may be some other problem.
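A hypothetical capture-and-plot run for the disk behind one osd (device name and durations are examples; seekwatcher flags vary a bit between versions):

    blktrace -d /dev/sdb -o osd0 -w 60        # trace the osd's disk for 60 seconds
    seekwatcher -t osd0 -o osd0-seeks.png     # render the seek pattern from the trace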
[21:34] <mgalkiewicz> I mentioned that the filesystem on the osds is new, created only 2 weeks ago
[21:37] <nhm> mgalkiewicz: yeah, it has more to do with how files and metadata get laid out on the disk.
[21:38] <nhm> mgalkiewicz: Also directory metadata.
[21:38] * tziOm (~bjornar@ti0099a340-dhcp0358.bb.online.no) has joined #ceph
[21:38] * adjohn (~adjohn@69.170.166.146) Quit (Quit: adjohn)
[21:39] * EmilienM (~EmilienM@195-132-228-252.rev.numericable.fr) has joined #ceph
[21:40] <Tv_> tracker is down :(
[21:41] <Tv_> intermittent
[21:42] * cblack101 (86868949@ircip3.mibbit.com) has joined #ceph
[21:43] * _are_ (~quassel@vs01.lug-s.org) Quit (Ping timeout: 480 seconds)
[21:44] * LarsFronius_ (~LarsFroni@2a02:8108:3c0:79::3) has joined #ceph
[21:50] * LarsFronius (~LarsFroni@2a02:8108:3c0:79:4a3:e866:bb76:605b) Quit (Ping timeout: 480 seconds)
[21:50] * LarsFronius_ is now known as LarsFronius
[21:51] <sagewk> tv_: behaving now?
[21:51] <nhm> mgalkiewicz: If you are using XFS, you may also want to look at using xfs_bmap to look at extent fragmentation of various object's data and attribute forks.
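For example, on an XFS-backed osd something like this shows how fragmented a single object file's data and attribute forks are (the path is hypothetical):

    xfs_bmap -v  /data/osd.0/current/0.1_head/some_object   # data fork extents
    xfs_bmap -av /data/osd.0/current/0.1_head/some_object   # attribute fork extents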
[21:51] <Tv_> sagewk: yeah
[21:54] <dilemma> I see v0.52 in the channel topic, and in the blog, saying it's "released", but it doesn't say it's "stable", and there are no debs built. Is it considered the latest stable, or is that a development release?
[21:55] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) has joined #ceph
[21:56] <Tv_> dilemma: stable releases have a codename like "argonaut", "bobtail", "cuttlefish"..
[21:56] <Tv_> but yes, we need to be clearer about that
[21:56] <dilemma> ahh, thanks
[21:59] <dmick> the distinguished (numbered) builds have gone through a lot of QA though, and are "more stable" than the top of tree
[21:59] <dmick> it's a range
[21:59] * MikeMcClurg (~mike@cpc10-cmbg15-2-0-cust205.5-4.cable.virginmedia.com) has joined #ceph
[21:59] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) Quit (Ping timeout: 480 seconds)
[21:59] <Tv_> dmick: it's tiered, not a range :-p
[22:00] <dmick> tomato, tomahto
[22:00] <Tv_> tomaatti
[22:00] <dmick> I've been Finned
[22:04] * tryggvil (~tryggvil@163-60-19-178.xdsl.simafelagid.is) Quit (Quit: tryggvil)
[22:05] * dilemma (~dilemma@2607:fad0:32:a02:1e6f:65ff:feac:7f2a) Quit (Quit: Leaving)
[22:08] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) has joined #ceph
[22:14] * tryggvil (~tryggvil@rtr1.tolvusky.sip.is) has joined #ceph
[22:18] * adjohn (~adjohn@69.170.166.146) has joined #ceph
[22:22] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) Quit (Ping timeout: 480 seconds)
[22:23] * The_Bishop (~bishop@2001:470:50b6:0:58d6:8844:6d:102e) has joined #ceph
[22:23] <slang> dmick: is that like being rick rolled?
[22:24] <dmick> it's much shorter and more intense
[22:24] <Tv_> and when done properly, includes a sauna
[22:24] <slang> heh
[22:25] <slang> dmick: I guess it doesn't really work to say: I've been Finnished
[22:25] * dmick disengages
[22:29] * kyle_ (~kyle@ip03.foxyf.simplybits.net) has joined #ceph
[22:31] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) has joined #ceph
[22:37] <kyle_> hello all, i have a couple questions about my ceph cluster if anyone has a minute to discuss.
[22:38] * danieagle (~Daniel@186.214.56.89) Quit (Quit: See you later :-) and Many Thanks For Everything!!! ^^)
[22:39] <gregaf> kyle_: what's up?
[22:39] <kyle_> I am doing a large initial transfer of files to my cluster. OSD is now using swap, i assumed the 16GB it has would be enough. Is my journal size too high?
[22:40] <kyle_> or if you can offer any insight/advice etc... that would be awesome.
[22:41] <gregaf> how many osd daemons are on that node?
[22:41] <kyle_> just one
[22:41] <sjust> transferring a large number of files shouldn't balloon memory at all
[22:41] <sjust> can you paste you rceph.conf?
[22:41] <sjust> *ceph.conf
[22:41] <gregaf> yeah, I wouldn't expect it to get even close to 16GB
[22:42] <kyle_> i only have one osd node total until i transfer all of my existing content. ceph.conf coming...
[22:42] <gregaf> how much memory is the ceph-osd process claiming?
[22:42] <joshd> do you have tcmalloc installed (in the google-perftools package in debian and derivatives)?
[22:43] <kyle_> virtual is 17.3GB and RES is 6.3GB
[22:44] <nhm> kyle_: interesting
[22:44] <kyle_> not sure about tcmalloc. i'm running a vanilla Ubuntu 12.04
[22:45] <kyle_> clearly hitting swap though. top is showing 11213840k for swap and things are moving pretty slowly. seeing up to a 25% wait on disk from CPU
[22:46] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:46] <kyle_> not sure what the best way to spit out my conf is
[22:46] <gregaf> kyle_: what distro are you using, and where did your binaries come from?
[22:46] <kyle_> using ubuntu 12.04 and installed via the git repo.
[22:46] <gregaf> pastebin the file?
[22:47] <gregaf> so you built it from source?
[22:47] <kyle_> yes
[22:47] <kyle_> i used the wiki instructions
[22:47] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) Quit (Ping timeout: 480 seconds)
[22:47] <nhm> kyle_: is it repeatable?
[22:47] <nhm> kyle_: Will it balloon if you start up ceph-osd again?
[22:47] <kyle_> although now that i think about it i believe i downloaded straight from the tarball and built directly from that
[22:48] <kyle_> but the other nodes i had to use the git repo because of weird dependency errors i could not resolve
[22:49] <gregaf> okay, you should have a configure.log file in the root ceph directory; can you search through that for tcmalloc and see what pops up?
[22:49] * EmilienM (~EmilienM@195-132-228-252.rev.numericable.fr) Quit (Ping timeout: 480 seconds)
[22:49] <gregaf> because I've certainly never seen memory usage like that in our builds, even when all hell breaks loose
[22:49] <kyle_> here's my conf
[22:49] <kyle_> http://pastebin.com/zhAhNhj7
[22:50] <nhm> gregaf: it'd be very interesting to see what that is.
[22:50] <kyle_> okay yeah i'll search the log now
[22:50] <gregaf> sjust: hmm, could a large number of open files balloon memory inappropriately?
[22:51] <sjust> not without hitting open file limits
[22:51] <sjust> and also no
[22:51] <gregaf> ah, right, 1024 max anyway
[22:51] <kyle_> also, i'm using rsync to do this copy, which i was told isn't well optimized for ceph
[22:51] <gregaf> (not even sure if that's an OSD setting or an MDS one, to be honest)
[22:51] <sjust> yeah, but if it were going to cause a problem, it would be on the mds
[22:51] <gregaf> kyle_: that just means slow, not...this
[22:51] <sjust> actually, are you sure it's not the ceph-mds process which has its memory ballooning?
[22:52] <kyle_> mds box is doing fine. 18GB free of 32GB
[22:53] <gregaf> sjust: oh, that limit is used by the init scripts to increase the open file limits
[22:53] <kyle_> mds process is using 14.1 VIRT and 13B RES
[22:53] <sjust> what limit?
[22:53] <kyle_> 13GB*
[22:54] <gregaf> that's also awfully high
[22:54] <gregaf> sjust: ulimit -n $max_open_files
[22:54] <sjust> yes, I know
[22:54] <sjust> but it defaults to 1024, right?
[22:55] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) has joined #ceph
[22:55] <gregaf> yeah
[22:55] <gregaf> I'm just looking at a ceph.conf with 128k, but I don't think that could do it without the OSD's internal limits being changed as well
[22:56] <sjust> filestore_flusher_max_fds specifically, which doesn't get changed with that limit
[22:57] <sjust> and there isn't anywhere else the osd opens lots of fds
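For reference, the ceph.conf option being discussed; the init script applies it via 'ulimit -n' before starting each daemon, and it is separate from filestore_flusher_max_fds:

    [global]
        max open files = 131072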
[22:59] <kyle_> hmmm not finding configure.log at the moment.
[23:01] <kyle_> i did just realize my conf file points to "log file = /var/log/ceph/$name.log". that dir did not even exist
[23:02] <dmick> $name of course is a metavariable
[23:02] <dmick> that expands to the name of the daemon in question
[23:03] <kyle_> right. but /var/log/ceph didn't exist.
[23:03] <kyle_> it did on the monitor though
[23:03] <gregaf> so, no logging then
[23:04] <gregaf> kyle_: you ought to have configure stuff if you built from source
[23:04] <gregaf> you did at some point run autogen.sh && ./configure && make (or equivalent), right?
[23:04] <kyle_> yes
[23:04] <kyle_> exactly that on all nodes
[23:08] * The_Bishop (~bishop@2001:470:50b6:0:58d6:8844:6d:102e) Quit (Quit: Who the hell is this peer? If I catch him I'll reset his connection!)
[23:08] * mgalkiewicz (~mgalkiewi@staticline-31-183-23-218.toya.net.pl) Quit (Ping timeout: 480 seconds)
[23:08] <gregaf> kyle_: which directory did you run that in, and what files are in that directory?
[23:09] <kyle_> i usually go to /usr/local/src/ to build things from source
[23:09] <kyle_> so in there i have ceph/
[23:10] <gregaf> right, and what's in that directory?
[23:10] <kyle_> which currently has.
[23:10] <kyle_> aclocal.m4 ChangeLog configure.ac fusetrace Makefile.am RELEASE_CHECKLIST
[23:10] <kyle_> admin CodingStyle COPYING INSTALL Makefile.in src
[23:10] <kyle_> AUTHORS compile COPYING-LGPL2.1 install-sh man SubmittingPatches
[23:10] <kyle_> autogen.sh config.guess debian keys missing udev
[23:10] <kyle_> autom4te.cache config.log depcomp libtool NEWS wireshark
[23:10] <kyle_> ceph-object-corpus config.status do_autogen.sh ltmain.sh py-compile
[23:10] <kyle_> ceph.spec config.sub doc m4 qa
[23:10] <kyle_> ceph.spec.in configure Doxyfile Makefile README
[23:11] <gregaf> okay, search in config.log for "tcmalloc"
[23:12] <kyle_> oh sorry about that. i was thinking somewhere else. just a second
[23:12] <kyle_> first up is this:
[23:12] <kyle_> configure:17591: checking for malloc in -ltcmalloc
[23:12] <kyle_> configure:17616: gcc -o conftest -g -O2 conftest.c -ltcmalloc >&5
[23:12] <kyle_> conftest.c:38:6: warning: conflicting types for built-in function 'malloc' [enabled by default]
[23:12] <kyle_> configure:17616: $? = 0
[23:12] <kyle_> configure:17625: result: yes
[23:13] <gregaf> hmm, that does approximately match mine
[23:13] <kyle_> | #define HAVE_LIBTCMALLOC 1
[23:14] <kyle_> that comes up a few times
[23:14] <kyle_> ac_cv_lib_tcmalloc_malloc=yes
[23:14] <kyle_> LIBTCMALLOC='-ltcmalloc'
[23:14] <kyle_> WITH_TCMALLOC_FALSE='#'
[23:14] <kyle_> WITH_TCMALLOC_TRUE=''
[23:14] <kyle_> #define HAVE_LIBTCMALLOC 1
[23:15] <kyle_> that's all
[23:17] <gregaf> yeah, that looks right
[23:17] <gregaf> hrmmm
[23:19] <kyle_> i'm using XFS too. just thought i'd mention.
[23:20] <kyle_> i'm setting up my second and third monitor now as well as a standby mds
[23:21] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[23:22] <kyle_> i thought i was being bottlenecked by the mds initially. then i finally started watching the osd and realized the deal with swap. it's taken like 4-5 days to copy about 500 gig of the 735GB total
[23:22] <gregaf> yeah, that's not anything close to what we're used to
[23:23] <kyle_> yeah i was getting really good sysbench numbers. 320MB/s or so locally
[23:23] <kyle_> for random writes
[23:27] <gregaf> kyle_: we're really fishing here
[23:27] <gregaf> did you make any modifications to the source?
[23:27] <gregaf> and what's the output of ldd ceph-osd | grep tcmalloc
[23:27] <gregaf> ?
[23:30] <kyle_> no i didn't modify the source. i would be lost in there. i'm a web dev when it comes to coding.
[23:30] <kyle_> libtcmalloc.so.0 => /usr/lib/libtcmalloc.so.0 (0x00007fac34547000)
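A hypothetical follow-up to confirm where that tcmalloc actually comes from:

    dpkg -S /usr/lib/libtcmalloc.so.0          # which package owns the library
    dpkg -l | grep -Ei 'perftools|tcmalloc'    # is google-perftools installed and current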
[23:31] <kyle_> maybe after this initial copy is over i'll try a fresh build
[23:31] * stxShadow (~Jens@ip-178-203-169-190.unitymediagroup.de) has left #ceph
[23:52] * Kioob (~kioob@luuna.daevel.fr) Quit (Ping timeout: 480 seconds)
[23:53] * joey_ (~root@135.13.255.151) has joined #ceph
[23:53] * joey_ is now known as joey_alu
[23:56] * The_Bishop (~bishop@p5DC1134C.dip.t-dialin.net) has joined #ceph
[23:59] <gregaf> kyle_: just another random check, what version is it? (ceph-osd --version will print it out)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.