#ceph IRC Log


IRC Log for 2014-08-15

Timestamps are in GMT/BST.

[0:00] <ganders> ThiagoMiranda: Sorry, I meant that you could choose any kernel version from 3.15, like 3.15.0 or 3.15.8, ...
[0:01] * steki (~steki@212.200.65.143) Quit (Ping timeout: 480 seconds)
[0:05] * nljmo_ (~nljmo@217-195-251-180.dsl.easynet.nl) has joined #ceph
[0:05] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) has joined #ceph
[0:08] * ganders (~root@200-127-158-54.net.prima.net.ar) Quit (Quit: WeeChat 0.4.1)
[0:09] * burley (~khemicals@cpe-98-28-233-158.woh.res.rr.com) Quit (Quit: burley)
[0:09] * burley (~khemicals@cpe-98-28-233-158.woh.res.rr.com) has joined #ceph
[0:13] * jhujhiti_ (~jhujhiti@elara.adjectivism.org) has joined #ceph
[0:15] <jhujhiti_> so i just accidentally created an rbd image far larger than i can actually store.. rbd rm seems to be hanging, as does rbd resize. or at least acting really slowly. is there some other, messier, way to delete this thing?
[0:15] * jhujhiti_ is now known as jhujhiti
[0:16] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[0:18] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Read error: Operation timed out)
[0:19] <nwf_> jhujhiti: I think "rbd rm" issues delete commands for each object that could be part of the rbd; at least, it does not seem, observationally, to reflect the sparsity of the actual allocations. Barring astronomical rbd sizes, you might just be best off letting it run.
[0:20] <jhujhiti> nwf_: 500 petabytes? :(
[0:21] <jhujhiti> i attempted to resize an RBD and did the math in bytes instead of megabytes, so it's a factor of a million larger
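For anyone reading the log later, a minimal sketch of the two commands under discussion; the pool and image names are placeholders, and on releases of this era rbd resize interprets --size in megabyte units, which is how a byte count turns into a petabyte-scale image:
    # --size is taken as megabytes here; feeding it a byte count makes
    # the image roughly a million times larger than intended
    rbd resize --size 524288 rbd/myimage   # 512 GiB, expressed in MB units
    # rbd rm walks every object the image could contain, so it can run
    # for a long time even when the allocation is sparse
    rbd rm rbd/myimage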
[0:22] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[0:22] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[0:23] <lurbs> I've had to delete an exbibyte-scale RBD volume before. That took a *long* time, but it did start immediately and show progress.
[0:23] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[0:23] <jhujhiti> all right. sounds like i might just let it run overnight then
[0:24] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[0:28] * rweeks (~rweeks@pat.hitachigst.com) has joined #ceph
[0:28] * dmsimard is now known as dmsimard_away
[0:35] * rturk|afk is now known as rturk
[0:37] * wabat (~wbatterso@65.182.109.4) has joined #ceph
[0:39] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) Quit (Quit: ...)
[0:48] * rendar (~I@host100-179-dynamic.23-79-r.retail.telecomitalia.it) Quit ()
[0:51] * jobewan (~jobewan@c-75-65-191-17.hsd1.la.comcast.net) has joined #ceph
[0:52] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[0:55] * tryggvil (~tryggvil@83.151.131.116) Quit (Quit: tryggvil)
[0:56] * ircolle (~Adium@2601:1:a580:145a:40d9:4fd3:c7c1:34c4) Quit (Quit: Leaving.)
[1:12] * sjusthm (~sam@24-205-54-233.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[1:15] * dneary (~dneary@pool-108-20-148-222.bstnma.fios.verizon.net) Quit (Ping timeout: 480 seconds)
[1:19] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) has joined #ceph
[1:23] * Concubidated (~Adium@12.248.40.138) Quit (Quit: Leaving.)
[1:24] * joef (~Adium@2620:79:0:131:828:ccd1:2e5e:60c7) Quit (Quit: Leaving.)
[1:24] * rweeks (~rweeks@pat.hitachigst.com) Quit (Quit: Leaving)
[1:24] * sjustlaptop (~sam@24-205-54-233.dhcp.gldl.ca.charter.com) has joined #ceph
[1:34] * kevinc_ (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[1:36] * funnel (~funnel@0001c7d4.user.oftc.net) Quit (Server closed connection)
[1:36] * funnel (~funnel@0001c7d4.user.oftc.net) has joined #ceph
[1:36] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Ping timeout: 480 seconds)
[1:37] * markl (~mark@knm.org) Quit (Server closed connection)
[1:37] * markl (~mark@knm.org) has joined #ceph
[1:38] * dneary (~dneary@pool-108-20-148-222.bstnma.fios.verizon.net) has joined #ceph
[1:39] * kevinc_ (~kevinc__@client65-44.sdsc.edu) Quit (Read error: Connection reset by peer)
[1:41] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[1:43] * andreask (~andreask@h081217017238.dyn.cm.kabsi.at) has joined #ceph
[1:43] * ChanServ sets mode +v andreask
[1:43] * andreask (~andreask@h081217017238.dyn.cm.kabsi.at) Quit (Read error: Connection reset by peer)
[1:44] * sjustlaptop (~sam@24-205-54-233.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[1:46] * elder (~elder@c-24-245-18-91.hsd1.mn.comcast.net) Quit (Server closed connection)
[1:46] * elder (~elder@c-24-245-18-91.hsd1.mn.comcast.net) has joined #ceph
[1:46] * ChanServ sets mode +o elder
[1:47] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[1:48] * dneary (~dneary@pool-108-20-148-222.bstnma.fios.verizon.net) Quit (Remote host closed the connection)
[1:48] * sputnik13 (~sputnik13@207.8.121.241) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:48] * morfair (~mf@office.345000.ru) Quit (Server closed connection)
[1:48] * morfair (~mf@office.345000.ru) has joined #ceph
[1:50] * masterpe (~masterpe@2a01:670:400::43) Quit (Server closed connection)
[1:50] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[1:50] * bandrus (~Adium@216.57.72.205) Quit (Quit: Leaving.)
[1:50] * houkouonchi-work (~linux@12.248.40.138) Quit (Server closed connection)
[1:51] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[1:51] * Guest5118 (~coyo@thinks.outside.theb0x.org) Quit (Server closed connection)
[1:51] * theanalyst (theanalyst@0001c1e3.user.oftc.net) Quit (Server closed connection)
[1:52] * Coyo (~coyo@thinks.outside.theb0x.org) has joined #ceph
[1:52] * LeaChim (~LeaChim@host86-159-115-162.range86-159.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[1:52] * Coyo is now known as Guest568
[1:54] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) Quit (Quit: Leaving.)
[1:56] * dmick (~dmick@2607:f298:a:607:9c98:cad1:478d:3044) Quit (Server closed connection)
[1:56] * dmick (~dmick@2607:f298:a:607:754f:5ad6:1184:5323) has joined #ceph
[1:58] <Nats> -25/5335686 objects degraded. negative objects degraded, that's a new one to me
[2:00] * oms101 (~oms101@p20030057EA0D2400EEF4BBFFFE0F7062.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[2:00] * singler (~singler@zeta.kirneh.eu) Quit (Server closed connection)
[2:00] * singler (~singler@zeta.kirneh.eu) has joined #ceph
[2:01] * ccourtaut_ (~ccourtaut@2001:41d0:1:eed3::1) Quit (Server closed connection)
[2:01] * ccourtaut_ (~ccourtaut@2001:41d0:1:eed3::1) has joined #ceph
[2:05] * blahnana (~bman@us1.blahnana.com) Quit (Server closed connection)
[2:06] * alram (~alram@38.122.20.226) Quit (Ping timeout: 480 seconds)
[2:06] * blahnana (~bman@us1.blahnana.com) has joined #ceph
[2:07] * hybrid512 (~walid@195.200.167.70) Quit (Server closed connection)
[2:08] * hybrid512 (~walid@195.200.167.70) has joined #ceph
[2:08] * oms101 (~oms101@p20030057EA0C3800EEF4BBFFFE0F7062.dip0.t-ipconnect.de) has joined #ceph
[2:09] * tcatm (~quassel@2a01:4f8:151:13c3:5054:ff:feff:cbce) Quit (Server closed connection)
[2:09] * tcatm (~quassel@2a01:4f8:151:13c3:5054:ff:feff:cbce) has joined #ceph
[2:10] * ultimape (~Ultimape@c-174-62-192-41.hsd1.vt.comcast.net) Quit (Server closed connection)
[2:10] * ultimape (~Ultimape@c-174-62-192-41.hsd1.vt.comcast.net) has joined #ceph
[2:11] * Concubidated (~Adium@66-87-67-194.pools.spcsdns.net) has joined #ceph
[2:15] <jcsp1> Nats: http://tracker.ceph.com/issues/5884 :-/
[2:16] * mattch (~mattch@pcw3047.see.ed.ac.uk) Quit (Server closed connection)
[2:16] * dosaboy (~dosaboy@65.93.189.91.lcy-01.canonistack.canonical.com) Quit (Server closed connection)
[2:16] * dosaboy (~dosaboy@65.93.189.91.lcy-01.canonistack.canonical.com) has joined #ceph
[2:17] * mattch (~mattch@pcw3047.see.ed.ac.uk) has joined #ceph
[2:17] <Nats> i'm on emperor, but sounds similar
[2:17] <Nats> i reweighted a drive that was getting close to full
[2:18] * angdraug (~angdraug@12.164.168.117) Quit (Quit: Leaving)
[2:20] * Andreas-IPO (~andreas@2a01:2b0:2000:11::cafe) Quit (Server closed connection)
[2:20] * Andreas-IPO (~andreas@2a01:2b0:2000:11::cafe) has joined #ceph
[2:22] * gregsfortytwo (~Adium@38.122.20.226) Quit (Server closed connection)
[2:22] * gregsfortytwo (~Adium@2607:f298:a:607:816a:1722:9336:a6ef) has joined #ceph
[2:22] * rmoe (~quassel@12.164.168.117) Quit (Ping timeout: 480 seconds)
[2:23] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[2:23] * bens (~ben@c-71-231-52-111.hsd1.wa.comcast.net) Quit (Server closed connection)
[2:24] * bens (~ben@c-71-231-52-111.hsd1.wa.comcast.net) has joined #ceph
[2:24] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) Quit (Server closed connection)
[2:25] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) has joined #ceph
[2:25] <Nats> restarting that osd lowers the negative value but doesn't fix it
[2:25] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Read error: Operation timed out)
[2:26] * runfromnowhere (~runfromno@pool-108-29-25-203.nycmny.fios.verizon.net) Quit (Server closed connection)
[2:26] * runfromnowhere (~runfromno@pool-108-29-25-203.nycmny.fios.verizon.net) has joined #ceph
[2:27] * rmoe (~quassel@173-228-89-134.dsl.static.sonic.net) has joined #ceph
[2:27] * jhujhiti (~jhujhiti@00012a8b.user.oftc.net) Quit (Server closed connection)
[2:28] * jhujhiti (~jhujhiti@elara.adjectivism.org) has joined #ceph
[2:28] * muhanpon1 (~povian@kang.sarang.net) Quit (Server closed connection)
[2:28] * muhanpong (~povian@kang.sarang.net) has joined #ceph
[2:29] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[2:30] * xarses (~andreww@12.164.168.117) Quit (Ping timeout: 480 seconds)
[2:30] * winston-1 (~ubuntu@ec2-54-244-213-72.us-west-2.compute.amazonaws.com) Quit (Server closed connection)
[2:31] * winston-d (~ubuntu@ec2-54-244-213-72.us-west-2.compute.amazonaws.com) has joined #ceph
[2:31] * tupper (~chatzilla@2607:f298:a:607:2677:3ff:fe64:c3f4) Quit (Remote host closed the connection)
[2:36] * blynch_ (~blynch@vm-nat.msi.umn.edu) Quit (Server closed connection)
[2:36] * blynch (~blynch@vm-nat.msi.umn.edu) has joined #ceph
[2:41] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Quit: Leaving.)
[2:41] * fouxm (~foucault@ks3363630.kimsufi.com) Quit (Server closed connection)
[2:41] * fouxm (~foucault@ks3363630.kimsufi.com) has joined #ceph
[2:42] * Psi-Jack (~psi-jack@lhmon.linux-help.org) Quit (Server closed connection)
[2:43] * yguang11 (~yguang11@182.18.27.82) has joined #ceph
[2:43] * Psi-Jack (~psi-jack@lhmon.linux-help.org) has joined #ceph
[2:43] * KindOne (kindone@0001a7db.user.oftc.net) Quit (Server closed connection)
[2:43] * KindOne (kindone@0001a7db.user.oftc.net) has joined #ceph
[2:45] * yguang11 (~yguang11@182.18.27.82) Quit (Remote host closed the connection)
[2:45] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) has joined #ceph
[2:45] * zackc (~zackc@0001ba60.user.oftc.net) Quit (Server closed connection)
[2:46] * chowmeined (~chow@c-24-19-66-251.hsd1.wa.comcast.net) Quit (Server closed connection)
[2:46] * chowmeined (~chow@c-24-19-66-251.hsd1.wa.comcast.net) has joined #ceph
[2:46] * MrBy2 (~MrBy@85.115.23.42) Quit (Server closed connection)
[2:46] * zackc (~zackc@0001ba60.user.oftc.net) has joined #ceph
[2:46] * MrBy2 (~MrBy@85.115.23.42) has joined #ceph
[2:54] * rotbeard (~redbeard@2a02:908:df19:4b80:76f0:6dff:fe3b:994d) Quit (Ping timeout: 480 seconds)
[2:54] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) Quit (Remote host closed the connection)
[2:55] * Nats__ (~Nats@2001:8000:200c:0:c40d:8553:6e00:d963) Quit (Server closed connection)
[2:55] * oblu (~o@62.109.134.112) Quit (Server closed connection)
[2:56] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[2:56] * Nats__ (~Nats@2001:8000:200c:0:c40d:8553:6e00:d963) has joined #ceph
[2:56] * oblu (~o@62.109.134.112) has joined #ceph
[2:59] * lucas1 (~Thunderbi@222.240.148.154) has joined #ceph
[3:00] * KevinPerks (~Adium@2606:a000:80a1:1b00:d807:f232:c2fd:6b24) Quit (Quit: Leaving.)
[3:01] * rturk is now known as rturk|afk
[3:03] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[3:04] * boichev2 (~boichev@213.169.56.130) Quit (Server closed connection)
[3:04] * boichev (~boichev@213.169.56.130) has joined #ceph
[3:05] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) Quit (Server closed connection)
[3:06] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) has joined #ceph
[3:07] * zack_dolby (~textual@p8505b4.tokynt01.ap.so-net.ne.jp) Quit (Read error: Connection reset by peer)
[3:07] * zack_dolby (~textual@p8505b4.tokynt01.ap.so-net.ne.jp) has joined #ceph
[3:10] * reed (~reed@75-101-54-131.dsl.static.sonic.net) Quit (Server closed connection)
[3:11] * reed (~reed@75-101-54-131.dsl.static.sonic.net) has joined #ceph
[3:12] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) Quit (Server closed connection)
[3:12] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) has joined #ceph
[3:15] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[3:20] * jobewan (~jobewan@c-75-65-191-17.hsd1.la.comcast.net) Quit (Server closed connection)
[3:20] * jobewan (~jobewan@c-75-65-191-17.hsd1.la.comcast.net) has joined #ceph
[3:21] * capri (~capri@212.218.127.222) Quit (Read error: Connection reset by peer)
[3:30] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) has joined #ceph
[3:31] * wabat (~wbatterso@65.182.109.4) has left #ceph
[3:32] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) Quit ()
[3:38] * Tamil (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Quit: Leaving.)
[3:39] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[3:43] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[3:43] * aknapp (~aknapp@64.202.160.233) has joined #ceph
[3:49] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has joined #ceph
[3:57] * zhaochao (~zhaochao@111.161.17.102) has joined #ceph
[3:57] * pentabular (~Adium@2601:9:4980:a01:ed8f:2abe:89ba:ac1c) Quit (Quit: Leaving.)
[4:07] * longguang_ (~chatzilla@123.126.33.253) has joined #ceph
[4:08] * classicsnail (~David@2600:3c01::f03c:91ff:fe96:d3c0) has left #ceph
[4:08] * classicsnail (~David@2600:3c01::f03c:91ff:fe96:d3c0) has joined #ceph
[4:12] * longguang (~chatzilla@123.126.33.253) Quit (Ping timeout: 480 seconds)
[4:12] * longguang_ is now known as longguang
[4:14] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:29] * diegows (~diegows@190.190.5.238) Quit (Ping timeout: 480 seconds)
[4:30] * haomaiwa_ (~haomaiwan@223.223.183.114) has joined #ceph
[4:35] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[4:36] * haomaiwang (~haomaiwan@124.248.208.2) Quit (Ping timeout: 480 seconds)
[4:49] * bandrus (~Adium@216.57.72.205) has joined #ceph
[4:49] * zerick (~eocrospom@190.187.21.53) Quit (Ping timeout: 480 seconds)
[5:05] * bandrus (~Adium@216.57.72.205) Quit (Quit: Leaving.)
[5:14] * Vacum__ (~vovo@88.130.195.27) has joined #ceph
[5:16] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[5:19] * AfC (~andrew@nat-gw2.syd4.anchor.net.au) Quit (Quit: Leaving.)
[5:20] * haomaiwa_ (~haomaiwan@223.223.183.114) Quit (Remote host closed the connection)
[5:20] * haomaiwang (~haomaiwan@124.248.208.2) has joined #ceph
[5:21] * Vacum_ (~vovo@88.130.202.151) Quit (Ping timeout: 480 seconds)
[5:26] * shang (~ShangWu@175.41.48.77) has joined #ceph
[5:27] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) Quit (Quit: Leaving.)
[5:31] <Jakey> how can i add a node to monmap?
[5:31] <Jakey> node8][WARNIN] monitor node8 does not exist in monmap
[5:31] <Jakey> i'm getting this warning using ceph-deploy
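A hedged sketch of how a monitor usually gets added in this situation, assuming a reasonably recent ceph-deploy and the default monitor port; node8 is the hostname from the log, the IP address is a placeholder:
    # let ceph-deploy create the daemon and register it with the cluster
    ceph-deploy mon add node8
    # or register it by hand so that it appears in the monmap
    ceph mon add node8 192.168.0.18:6789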
[5:38] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[5:39] * haomaiwa_ (~haomaiwan@223.223.183.114) has joined #ceph
[5:45] * haomaiwang (~haomaiwan@124.248.208.2) Quit (Ping timeout: 480 seconds)
[5:54] <Nats> is node8 new?
[5:58] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[6:01] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit ()
[6:16] * pentabular (~Adium@c-50-148-128-107.hsd1.ca.comcast.net) has joined #ceph
[6:17] * pentabular (~Adium@c-50-148-128-107.hsd1.ca.comcast.net) Quit ()
[6:22] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[6:23] * KevinPerks (~Adium@2606:a000:80a1:1b00:6095:8564:cf42:6eff) has joined #ceph
[6:25] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) Quit (Quit: Leaving.)
[6:27] * bandrus (~Adium@216.57.72.205) has joined #ceph
[6:27] * bandrus (~Adium@216.57.72.205) Quit (Remote host closed the connection)
[6:27] * bandrus (~Adium@216.57.72.205) has joined #ceph
[6:29] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[6:34] * bandrus (~Adium@216.57.72.205) Quit (Quit: Leaving.)
[6:50] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[7:05] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[7:11] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[7:14] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[7:27] * reed (~reed@75-101-54-131.dsl.static.sonic.net) Quit (Ping timeout: 480 seconds)
[7:35] * aknapp (~aknapp@64.202.160.233) Quit (Remote host closed the connection)
[7:35] * aknapp (~aknapp@64.202.160.233) has joined #ceph
[7:41] * lucas1 (~Thunderbi@222.240.148.154) Quit (Quit: lucas1)
[7:43] * aknapp (~aknapp@64.202.160.233) Quit (Ping timeout: 480 seconds)
[7:46] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[7:54] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Read error: Operation timed out)
[7:57] * rendar (~I@host13-179-dynamic.7-87-r.retail.telecomitalia.it) has joined #ceph
[8:05] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[8:16] * lucas1 (~Thunderbi@222.247.57.50) has joined #ceph
[8:19] * nljmo_ (~nljmo@217-195-251-180.dsl.easynet.nl) Quit (Quit: My MacBook Pro has gone to sleep. ZZZzzz…)
[8:24] * Meths_ (~meths@2.27.83.72) has joined #ceph
[8:28] * Meths (~meths@2.25.220.171) Quit (Read error: Operation timed out)
[8:34] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[8:39] * steki (~steki@91.195.39.5) has joined #ceph
[8:41] * cok (~chk@2a02:2350:18:1012:84b4:3102:e165:9889) has joined #ceph
[8:45] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[8:49] * Concubidated (~Adium@66-87-67-194.pools.spcsdns.net) Quit (Quit: Leaving.)
[8:54] * jtang_ (~jtang@80.111.83.231) has joined #ceph
[9:04] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[9:11] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[9:13] * danieljh (~daniel@0001b4e9.user.oftc.net) has joined #ceph
[9:14] * branto (~borix@ip-213-220-214-245.net.upcbroadband.cz) has joined #ceph
[9:14] * purpleidea is now known as Guest609
[9:14] * purpleidea (~james@216.252.94.224) has joined #ceph
[9:16] * Guest609 (~james@216.252.87.54) Quit (Ping timeout: 480 seconds)
[9:17] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[9:20] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:22] * analbeard (~shw@support.memset.com) has joined #ceph
[9:23] * MACscr (~Adium@c-50-158-183-38.hsd1.il.comcast.net) has joined #ceph
[9:25] * linjan (~linjan@82.102.126.145) has joined #ceph
[9:25] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[9:27] * nljmo_ (~nljmo@46.44.153.234) has joined #ceph
[9:28] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:29] * cookednoodles (~eoin@eoin.clanslots.com) has joined #ceph
[9:35] * tryggvil (~tryggvil@83.151.131.116) has joined #ceph
[9:39] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[9:47] * jtang_ (~jtang@80.111.83.231) Quit (Remote host closed the connection)
[9:47] * jtang_ (~jtang@80.111.83.231) has joined #ceph
[9:48] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[9:51] * KevinPerks (~Adium@2606:a000:80a1:1b00:6095:8564:cf42:6eff) Quit (Quit: Leaving.)
[9:59] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[10:00] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) Quit (Quit: Leaving.)
[10:03] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) has joined #ceph
[10:12] * ksingh (~Adium@2001:708:10:10:e872:e14f:4379:8de0) has joined #ceph
[10:16] * dignus_ (~jkooijman@t-x.dignus.nl) has joined #ceph
[10:16] * dignus_ (~jkooijman@t-x.dignus.nl) Quit ()
[10:16] * dignus_ (~jkooijman@t-x.dignus.nl) has joined #ceph
[10:17] * dignus (~jkooijman@t-x.dignus.nl) Quit (Ping timeout: 480 seconds)
[10:17] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[10:18] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[10:19] * lucas1 (~Thunderbi@222.247.57.50) Quit (Quit: lucas1)
[10:25] * tryggvil (~tryggvil@83.151.131.116) Quit (Quit: tryggvil)
[10:26] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[10:30] * mgarcesMZ (~mgarces@5.206.228.5) has joined #ceph
[10:30] <mgarcesMZ> good morning everyone
[10:35] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[10:37] * analbeard (~shw@support.memset.com) Quit (Read error: Operation timed out)
[10:37] * cookednoodles (~eoin@eoin.clanslots.com) Quit (Remote host closed the connection)
[10:43] * LeaChim (~LeaChim@host86-159-115-162.range86-159.btcentralplus.com) has joined #ceph
[10:44] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:46] * analbeard (~shw@support.memset.com) has joined #ceph
[10:47] * lucas1 (~Thunderbi@222.247.57.50) has joined #ceph
[11:04] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[11:11] * Sysadmin88 (~IceChat77@054533bc.skybroadband.com) Quit (Quit: Depression is merely anger without enthusiasm)
[11:19] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[11:19] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[11:27] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[11:29] * topro (~prousa@host-62-245-142-50.customer.m-online.net) has joined #ceph
[11:38] * _Tassadar (~tassadar@D57DEE42.static.ziggozakelijk.nl) has joined #ceph
[11:42] <ksingh> good morning
[11:42] <ksingh> * good afternoon
[11:46] * lucas1 (~Thunderbi@222.247.57.50) Quit (Quit: lucas1)
[11:54] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[11:57] * mgarcesMZ (~mgarces@5.206.228.5) Quit (Quit: mgarcesMZ)
[12:00] * AbyssOne is now known as a1-away
[12:03] * a1-away is now known as AbyssOne
[12:04] * mgarcesMZ (~mgarces@5.206.228.5) has joined #ceph
[12:05] * cok (~chk@2a02:2350:18:1012:84b4:3102:e165:9889) has left #ceph
[12:05] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[12:06] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) has joined #ceph
[12:06] * topro (~prousa@host-62-245-142-50.customer.m-online.net) Quit (Quit: Konversation terminated!)
[12:09] * cookednoodles (~eoin@eoin.clanslots.com) has joined #ceph
[12:15] * nljmo_ (~nljmo@46.44.153.234) Quit (Quit: My MacBook Pro has gone to sleep. ZZZzzz…)
[12:20] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[12:28] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[12:29] <rotbeard> Regarding best practices, I set the number of PGs for my pools to 8192
[12:29] <rotbeard> is there any problem if I add more pools with 8192 PGs also?
[12:29] <rotbeard> for now I have for example 5 pools
[12:30] <rotbeard> if I add another 5 pools, should I decrease the number of PGs or can I just use 8192
[12:33] * mgarcesMZ (~mgarces@5.206.228.5) Quit (Quit: mgarcesMZ)
[12:48] * mtl1 (~Adium@c-67-174-109-212.hsd1.co.comcast.net) has joined #ceph
[12:48] * Gugge-47527 (gugge@kriminel.dk) Quit (Ping timeout: 480 seconds)
[12:53] * mtl2 (~Adium@c-67-174-109-212.hsd1.co.comcast.net) Quit (Ping timeout: 480 seconds)
[12:57] * blSnoopy (~snoopy@miram.persei.mw.lg.virgo.supercluster.net) has joined #ceph
[13:03] * diegows (~diegows@190.190.5.238) has joined #ceph
[13:08] * andreask (~andreask@h081217017238.dyn.cm.kabsi.at) has joined #ceph
[13:08] * ChanServ sets mode +v andreask
[13:09] * shang (~ShangWu@175.41.48.77) Quit (Ping timeout: 480 seconds)
[13:09] * Gugge-47527 (gugge@kriminel.dk) has joined #ceph
[13:14] * zhaochao (~zhaochao@111.161.17.102) has left #ceph
[13:15] * andreask (~andreask@h081217017238.dyn.cm.kabsi.at) has left #ceph
[13:16] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) has joined #ceph
[13:21] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[13:22] * danieljh (~daniel@0001b4e9.user.oftc.net) Quit (Quit: leaving)
[13:28] * Cataglottism (~Cataglott@dsl-087-195-030-184.solcon.nl) has joined #ceph
[13:29] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[13:30] <Vacum__> rotbeard: the recommendation of 100 pgs per osd is total for all pools. so you have to plan ahead a bit
[13:30] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[13:33] <rotbeard> Vacum__, thanks, I'll fix that later
[13:34] <Vacum__> rotbeard: it's (currently) not possible to decrease the number of PGs, and IIRC increasing works only by doubling the number
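A back-of-the-envelope for splitting that per-OSD budget across pools; the OSD count, pool count and replica size below are purely illustrative, not taken from the log:
    OSDS=160; POOLS=10; SIZE=3
    # total PG budget is roughly OSDS * 100 / SIZE, shared by all pools
    echo $(( OSDS * 100 / SIZE / POOLS ))   # ~533, round to a power of two
    ceph osd pool create mypool 512 512     # pg_num and pgp_num per pool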
[13:35] * MrBy2 (~MrBy@85.115.23.42) Quit (Remote host closed the connection)
[13:36] <blSnoopy> are there any recommendations for improving single stream performance (rbd)?
[13:37] <blSnoopy> (iops, specifically)
[13:37] * Cataglottism (~Cataglott@dsl-087-195-030-184.solcon.nl) Quit (Quit: Textual IRC Client: www.textualapp.com)
[13:42] <rotbeard> our ceph cluster isn't in a _real_ production state yet, so I am able to delete the pools
[13:43] * mgarcesMZ (~mgarces@5.206.228.5) has joined #ceph
[13:43] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[13:56] * i_m (~ivan.miro@gbibp9ph1--blueice4n2.emea.ibm.com) has joined #ceph
[14:06] * danieljh (~daniel@0001b4e9.user.oftc.net) has joined #ceph
[14:07] <mgarcesMZ> guys, I am trying to connect to my radosgw, using the user created in the examples. I am testing with Cyberduck
[14:07] <mgarcesMZ> this is the info for the user: https://gist.github.com/mgarces/491196411cb209574b3c
[14:08] <mgarcesMZ> I can't log in or test via the command line..
[14:08] <mgarcesMZ> can you help me sort this out?
[14:11] * codice_ (~toodles@97-94-175-73.static.mtpk.ca.charter.com) has joined #ceph
[14:13] * codice (~toodles@97-94-175-73.static.mtpk.ca.charter.com) Quit (Ping timeout: 480 seconds)
[14:14] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) has joined #ceph
[14:21] * gregmark (~Adium@68.87.42.115) has joined #ceph
[14:22] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[14:22] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[14:27] * KevinPerks (~Adium@2606:a000:80a1:1b00:71d8:9a0d:3699:6d5e) has joined #ceph
[14:29] * cok (~chk@2a02:2350:1:1203:1d4e:3b39:fb4a:4b13) has joined #ceph
[14:30] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Ping timeout: 480 seconds)
[14:32] * nljmo_ (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) has joined #ceph
[14:35] * nljmo_ (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) Quit ()
[14:39] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[14:50] * cok (~chk@2a02:2350:1:1203:1d4e:3b39:fb4a:4b13) Quit (Quit: Leaving.)
[14:52] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) Quit (Quit: Leaving.)
[15:05] <TiCPU> I'm using the RBD client with a Btrfs filesystem, and sometimes it seems to freeze for no actual reason; looking at /sys/kernel/debug/ceph/8/osdc I can see there are multiple requests to only one (random) OSD, and restarting that OSD restores the operations. How could I troubleshoot this?
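A sketch of the kind of inspection being described; the directory name under /sys/kernel/debug/ceph/ is normally the cluster fsid plus a client id, and the OSD id below is a placeholder:
    cat /sys/kernel/debug/ceph/*/osdc   # in-flight object requests, per OSD
    ceph osd tree                       # map the stuck OSD id to a host
    # restarting the OSD in question, as done in the log (init scripts vary)
    sudo service ceph restart osd.12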
[15:10] * shang (~ShangWu@220-138-38-112.dynamic.hinet.net) has joined #ceph
[15:15] * wschulze (~wschulze@cpe-69-206-251-158.nyc.res.rr.com) has joined #ceph
[15:17] * amospalla (~amospalla@0001a39c.user.oftc.net) Quit (Quit: WeeChat 0.4.3)
[15:17] * rwheeler_ (~rwheeler@nat-pool-bos-t.redhat.com) has joined #ceph
[15:17] * amospalla (~amospalla@0001a39c.user.oftc.net) has joined #ceph
[15:18] <swat30> hi all
[15:19] <TiCPU> hi
[15:19] <swat30> having problems with our MON, seemed to happen during a snap delete:
[15:19] <swat30> http://pastebin.com/LBsT4FKi
[15:19] <swat30> now it won't start
[15:20] * ksingh (~Adium@2001:708:10:10:e872:e14f:4379:8de0) Quit (Quit: Leaving.)
[15:21] <TiCPU> one or multiple mon setup?
[15:21] <swat30> single
[15:22] <swat30> tidbit more: http://pastebin.com/fqNZwChJ
[15:22] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[15:23] <swat30> TiCPU, any help is appreciated, gotta get this thing back online!
[15:24] * Cataglottism (~Cataglott@dsl-087-195-030-170.solcon.nl) has joined #ceph
[15:24] <swat30> I take it upgrading to Dumpling at this time would probably be unsafe
[15:25] * markbby (~Adium@168.94.245.4) has joined #ceph
[15:25] * twx (~twx@rosamoln.org) Quit (Read error: Connection reset by peer)
[15:25] <TiCPU> I'd like opinions from others; last time I had a failure it was because I accidentally upgraded an OSD to a newer version without updating the Mon, recovered by updating everything
[15:26] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Read error: Operation timed out)
[15:26] <swat30> OK. Cuttlefish is a little old, but we're definitely running it everywhere
[15:26] <TiCPU> in this situation, I'm unsure; removing a snapshot from a cephfs once got me an assert on MDS startup, which I removed from the source and continued, but this is not an assert
[15:27] * mancdaz (~mancdaz@2a00:1a48:7807:102:94f4:6b56:ff08:886c) Quit (Quit: ZNC - http://znc.in)
[15:27] <swat30> looks like it is an assert?
[15:27] <TiCPU> and it is not a cephfs snapshot, is it a pool or rbd snapshot?
[15:27] <swat30> rbd
[15:27] * mancdaz (~mancdaz@2a00:1a48:7807:102:94f4:6b56:ff08:886c) has joined #ceph
[15:28] <swat30> ./include/interval_set.h: In function 'void interval_set<T>::insert(T, T) [with T = snapid_t]' thread 7f744bd56700 time 2014-08-15 13:01:14.947084
[15:28] <swat30> ./include/interval_set.h: 340: FAILED assert(0) <- assert before crash
[15:28] <swat30> (sorry for pasting)
[15:29] <swat30> how did you fix your snap issue TiCPU ?
[15:29] <swat30> just removed the assert from code?
[15:29] <TiCPU> for my MDS problem I only removed the assert line and restarted the MDS, not sure if you would be that lucky
[15:30] <swat30> hm
[15:30] <swat30> I could certainly try it, however don't want to risk any data loss or anything of the sort
[15:30] * twx (~twx@rosamoln.org) has joined #ceph
[15:30] <TiCPU> I had no problem losing my MDS data in my case
[15:31] <TiCPU> those were only used for ISOs
[15:31] <swat30> gotcha
[15:31] <swat30> did you lose any?
[15:31] <TiCPU> nothing
[15:31] * ksingh (~Adium@teeri.csc.fi) has joined #ceph
[15:32] <swat30> found this which seemed similar http://tracker.ceph.com/issues/7915
[15:32] <swat30> but affects the OSDs
[15:32] * shang (~ShangWu@220-138-38-112.dynamic.hinet.net) Quit (Quit: Ex-Chat)
[15:33] <swat30> sage, don't suppose you're around and able to help out once again?
[15:34] * cok (~chk@2a02:2350:18:1012:ac7e:151c:cbfb:adec) has joined #ceph
[15:36] <swat30> I really want to just remove that assert but I don't know if that'll have dire consequences
[15:38] <TiCPU> I don't know either, I'm still wondering if trying to start a newer version of the mon wouldn't help
[15:39] <TiCPU> if no OSDs are started, you could just snapshot or back up the leveldb, maybe?
[15:39] * cok (~chk@2a02:2350:18:1012:ac7e:151c:cbfb:adec) Quit (Quit: Leaving.)
[15:40] <infernix> after I do a "radosgw-admin region set" and a "radosgw-admin regions list" (neither of which return an error), "radosgw-admin regions list" does not show the region I just created
[15:40] <swat30> the OSDs are currently running and complaining about auth
[15:40] <infernix> where is this created and how do I go about increasing log levels to figure out why this fails?
[15:42] <infernix> oh. n/m. forgot to pass correct --name
[15:43] <infernix> now that that checks out, back to my previous issue: when I run 'radosgw-admin user create --uid="mia-1" --display-name="Region-Miami-1" --name client.radosgw.mia-1 --system' (region mia, zone 1 - both verified now to exist), I get "couldn't init storage provider"
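The pattern that resolved the earlier listing problem, sketched with the names from the log (region.json is a placeholder file): radosgw-admin acts as whatever --name it is given, so the same identity has to be passed on every call:
    radosgw-admin region set --name client.radosgw.mia-1 < region.json
    radosgw-admin regions list --name client.radosgw.mia-1
    radosgw-admin user create --uid=mia-1 --display-name="Region-Miami-1" \
        --name client.radosgw.mia-1 --system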
[15:44] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[15:44] * Hell_Fire (~hellfire@123-243-155-184.static.tpgi.com.au) Quit (Read error: Connection reset by peer)
[15:45] * vbellur (~vijay@122.167.213.139) has joined #ceph
[15:46] * brad_mssw (~brad@shop.monetra.com) has joined #ceph
[15:48] <swat30> TiCPU, upgrade to dumpling on the MON didn't help
[15:50] <TiCPU> no disk error in dmesg neither?
[15:52] <swat30> nope, it's just running off of the root disk
[15:52] <mgarcesMZ> guys, can you help me figure out why I cannot login in radosgw using the swift user?
[15:52] <swat30> OSDs have separate partitions
[15:57] <mgarcesMZ> I keep getting AccessDenied with curl or “Account not found” with the swift client
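A minimal way to test a radosgw swift subuser from the command line, assuming the default auth endpoint; the hostname, subuser name and key below are placeholders, not values from the log:
    swift -A http://radosgw.example.com/auth/1.0 \
          -U johndoe:swift -K 'SWIFT_SECRET_KEY' list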
[16:05] <swat30> wow, found the problem TiCPU
[16:05] <swat30> http://tracker.ceph.com/issues/7210
[16:06] <mgarcesMZ> any ideas
[16:10] * pressureman (~pressurem@62.217.45.26) Quit (Quit: Ex-Chat)
[16:12] * pinoyskull (~pinoyskul@112.201.143.35) has joined #ceph
[16:15] <pinoyskull> any resource online that talks about ceph optimization?
[16:16] * JC1 (~JC@80.12.59.111) has joined #ceph
[16:19] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) Quit (Remote host closed the connection)
[16:20] * drankis (~drankis__@89.111.13.198) Quit (Remote host closed the connection)
[16:23] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) has joined #ceph
[16:25] * scuttle|afk (~scuttle@nat-pool-rdu-t.redhat.com) has joined #ceph
[16:25] * scuttle|afk is now known as scuttlemonkey
[16:25] <mgarcesMZ> ok, changed something in apache's rgw.conf, and now S3 works very well, fast…
[16:25] <mgarcesMZ> but I still can't log in with the Swift user/key
[16:28] <TiCPU> swat30, you had to stop glance and the mon started?
[16:28] <swat30> TiCPU, yup but now OSDs are not starting properly
[16:28] <swat30> only 1/3 will start
[16:30] <TiCPU> swat30, do you have any logs?
[16:32] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[16:32] <swat30> sure, sec
[16:33] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[16:33] <swat30> TiCPU, http://pastebin.com/PK0mag68
[16:34] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[16:36] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) has joined #ceph
[16:37] * sputnik13 (~sputnik13@wsip-68-105-248-60.sd.sd.cox.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[16:37] <swat30> any ideas?
[16:38] <swat30> seems that 1 of them works fine
[16:38] <swat30> the other two just crap out
[16:38] <TiCPU> swat30, I guess your pool is size=1 ?
[16:38] <swat30> yea
[16:38] <swat30> sorry
[16:38] <swat30> size=2
[16:38] <swat30> min_size=1
[16:40] <saturnine> Is there a way to see the quota set for a pool?
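Nobody answers this in the log, but for reference, a sketch assuming a release recent enough to have pool quotas; the pool name and limits are placeholders:
    ceph osd pool get-quota mypool                      # shows max objects / max bytes
    ceph osd pool set-quota mypool max_objects 100000   # how quotas are set
    ceph osd pool set-quota mypool max_bytes $((100 * 1024 * 1024 * 1024))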
[16:40] <swat30> TiCPU, OSD dump
[16:40] <swat30> http://pastebin.com/z9ze9Nb1
[16:40] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[16:41] <TiCPU> swat30, maybe flushing the journal to disk before starting them up would help
[16:41] <TiCPU> it could help them complete transaction offline
[16:42] <swat30> not sure how to do that?
[16:42] <swat30> sorry
[16:42] <TiCPU> ceph-osd --flush-journal -i # -f -d
[16:42] <swat30> ahh gotcha ok
[16:44] <swat30> nope, they still go through all of their "catch-up" and then crash
[16:44] <TiCPU> so they crash before talking to mon
[16:44] <TiCPU> I guess clearing the journal would lead to some data loss :/
[16:45] <TiCPU> (random) data loss
[16:45] <swat30> a bit more of a detailed log: http://pastebin.com/NE1fNXR8
[16:46] <swat30> the mon seems to know that they come up momentarily
[16:46] <TiCPU> maybe some dev has a clue to clear specific transactions from the replay log
[16:47] <swat30> goes through a lot of this: osd.21 pg_epoch: 18838 pg[1.82( v 4702'1543 (237'543,4702'1543] local-les=18791 n=5 ec=1 les/c 18740/18740 18836/18836/18836) [21] r=0 lpr=18836 pi=18733-18835/33 lcod 0'0 mlcod 0'0 down+peering
[16:47] <swat30> ] state<Started/Primary/Peering>: Peering, affected_by_map, going to Rese
[16:48] <TiCPU> also forget about clearing the journal, it normally makes the OSD unbootable
[16:49] <TiCPU> unless you're using btrfs
[16:49] <swat30> yea, seems that this thing starts up and starts to do things
[16:49] <swat30> nope
[16:49] <kraken> http://i.minus.com/iUgVCKwjISSke.gif
[16:49] <swat30> xfs
[16:51] * ircolle (~Adium@2601:1:a580:145a:9953:901e:2331:5287) has joined #ceph
[16:51] <swat30> any devs around that may be able to help?
[16:52] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[16:55] * JC1 (~JC@80.12.59.111) Quit (Ping timeout: 480 seconds)
[16:57] <swat30> this seems to be related to snapshots once again
[16:57] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[16:58] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) has joined #ceph
[16:58] <TiCPU> swat30, yes it is the same problem
[16:58] * sz0 (~sz0@94.55.197.185) has joined #ceph
[16:58] <TiCPU> I guess in the ticket they had more OSD so they would start anyway
[16:58] <swat30> hmm
[16:59] <TiCPU> losing 3 OSDs in a pool of 16 is less of a problem (in case of size=3)
[16:59] <swat30> yea
[17:00] <swat30> I wonder what can be done here. I have a feeling it's going to be code related
[17:00] <swat30> I really need to get these things back up
[17:03] <TiCPU> I've seen in the ticket that the fix has been backported to both emperor and dumpling though
[17:04] <mgarcesMZ> now I am getting this: “FastCGI: comm with server "/var/www/cgi-bin/s3gw.fcgi" aborted: idle timeout (30 sec)”
[17:06] <swat30> TiCPU, interesting, OK
[17:07] <swat30> I hate upgrading things in such a degraded state
[17:08] * i_m (~ivan.miro@gbibp9ph1--blueice4n2.emea.ibm.com) Quit (Quit: Leaving.)
[17:09] <swat30> I've also found this http://marc.info/?l=ceph-devel&m=133616385318839&w=2
[17:09] <swat30> but code has changed significantly since then
[17:09] <TiCPU> you're right that the less change, the better
[17:10] <TiCPU> you can probably clone that branch tog et you back up and running
[17:10] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[17:11] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[17:12] <swat30> it's just so old tho, 0.45 I believe
[17:12] <TiCPU> I thought it was based on cuttlefish
[17:14] <swat30> mine is
[17:14] <swat30> this http://marc.info/?l=ceph-devel&m=133616385318839&w=2 is old
[17:15] <swat30> or you mean http://tracker.ceph.com/projects/ceph/repository/revisions/6bf46e23e09c38717a83e6eba202f15e56090748?
[17:16] <swat30> I don't think that np-> stuff exists in cuttlefish
[17:20] <TiCPU> swat30, the second one is for mon only, your problem happens before communication with mon (flush-journal)
[17:21] <swat30> not sure I see a first one
[17:22] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[17:22] <TiCPU> I did not see any patch in the first link, however there was a branch name which does not exist anymore in Git
[17:23] <swat30> oh what is it?
[17:23] * Nacer (~Nacer@pai34-4-82-240-124-12.fbx.proxad.net) Quit (Remote host closed the connection)
[17:23] <TiCPU> wip-snap-workaround based on v0.45 :/
[17:23] <swat30> yea
[17:23] <swat30> I found it
[17:23] <swat30> https://github.com/ceph/ceph/compare/historic/snap-workaround
[17:24] <TiCPU> it works for OSD and PG
[17:24] <TiCPU> I wonder why it wasn't kept
[17:24] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[17:24] * AbyssOne is now known as a1-away
[17:24] <swat30> not sure
[17:25] * a1-away is now known as AbyssOne
[17:25] <swat30> also not sure how to port to cuttlefish
[17:25] <swat30> :/
[17:25] * reed (~reed@75-101-54-131.dsl.static.sonic.net) has joined #ceph
[17:27] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) has joined #ceph
[17:28] <swat30> sage already sent me a PG fix, now just have to figure out this OSD
[17:30] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[17:31] * Pedras1 (~Adium@50.185.218.255) Quit (Quit: Leaving.)
[17:32] * dignus_ (~jkooijman@t-x.dignus.nl) Quit (Quit: Lost terminal)
[17:33] <swat30> TiCPU, any ideas? really not sure where to go from here
[17:35] * Sysadmin88 (~IceChat77@054533bc.skybroadband.com) has joined #ceph
[17:35] * alram (~alram@cpe-172-250-2-46.socal.res.rr.com) has joined #ceph
[17:36] <TiCPU> not sure how to proceed with critical data on this one, removing the assert when dealing with an erase is far from being safe
[17:37] * ircolle is now known as ircolle-afk
[17:37] * steveeJ (~junky@HSI-KBW-085-216-022-246.hsi.kabelbw.de) has joined #ceph
[17:37] <swat30> yea
[17:37] <swat30> I'm not sure how upgrading to Dumpling would go / if it will even help things
[17:38] * sputnik13 (~sputnik13@207.8.121.241) has joined #ceph
[17:39] <TiCPU> there's a fix to prevent the problem from happening but maybe not to bypass the current problem
[17:39] <steveeJ> what happens if an OSD is part of the cache pool and the backend pool? will objects be re-written on a cache flush?
[17:39] <TiCPU> on my side I'm still trying to diagnose freezing operations on random OSD using an RBD
[17:40] <TiCPU> I have to restart the OSD which the RBD says is trying to read from when stuck
[17:40] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[17:41] <TiCPU> I'm trying to replace iSCSI with direct RBD
[17:43] * stupidnic (~foo@cpe-70-94-232-191.sw.res.rr.com) Quit ()
[17:43] * rmoe (~quassel@173-228-89-134.dsl.static.sonic.net) Quit (Ping timeout: 480 seconds)
[17:44] <swat30> hm
[17:44] <swat30> TiCPU, that's what I was thinking. not sure if the fix will actually repair the issue
[17:45] <swat30> I wish sage were around to work his magic, once I get this thing stable in Cuttlefish I'm going to upgrade it right through
[17:48] * cok (~chk@2a02:2350:1:1202:353a:fc82:6619:bc37) has joined #ceph
[17:48] <swat30> sage, are you around? I am running into a problem, believe I need something like this: https://github.com/ceph/ceph/commit/3adaea723e0396c5480c97b03792beddb0dd6bbf except in Cuttlefish (this is for 0.45)
[17:49] <swat30> the PG fix looks straightforward, but this has changed
[17:50] <swat30> and it looks like the bug may have been fixed here: http://tracker.ceph.com/issues/7915
[17:52] * adamcrume (~quassel@50.247.81.99) has joined #ceph
[17:56] <infernix> when I run 'radosgw-admin user create --uid="mia-1" --display-name="Region-Miami-1" --name client.radosgw.mia-1 --system' (region mia, zone 1 - both verified now to exist), I get "couldn't init storage provider"
[17:56] <infernix> what is it trying to init? a specific rados pool?
[17:57] <infernix> stracing it doesn't show anything obvious
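A hedged way to see why radosgw-admin gives up, which is usually more informative than strace for "couldn't init storage provider": raise the rgw and messenger debug levels on the command line (the levels are just examples):
    radosgw-admin user create --uid=mia-1 --display-name="Region-Miami-1" \
        --name client.radosgw.mia-1 --system \
        --debug-rgw 20 --debug-ms 1 2>&1 | tail -n 50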
[17:57] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) has joined #ceph
[17:58] * cok (~chk@2a02:2350:1:1202:353a:fc82:6619:bc37) has left #ceph
[17:58] * rmoe (~quassel@12.164.168.117) has joined #ceph
[17:58] * cok (~chk@2a02:2350:1:1202:353a:fc82:6619:bc37) has joined #ceph
[18:00] * joef (~Adium@2620:79:0:131:c24:72c7:e09:2552) has joined #ceph
[18:00] <mgarcesMZ> ;( FastCGI: comm with server "/var/www/cgi-bin/s3gw.fcgi" aborted: idle timeout (30 sec)
[18:00] <mgarcesMZ> it was working so great
[18:04] * zidarsk8 (~zidar@89-212-142-10.dynamic.t-2.net) has joined #ceph
[18:05] * danieagle (~Daniel@179.184.165.184.static.gvt.net.br) has joined #ceph
[18:05] * zidarsk8 (~zidar@89-212-142-10.dynamic.t-2.net) has left #ceph
[18:09] * baylight (~tbayly@74-220-196-40.unifiedlayer.com) Quit (Ping timeout: 480 seconds)
[18:10] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[18:10] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[18:12] <sage> swat30: is that where it is crashing?
[18:13] <swat30> sage: crash logs - http://pastebin.com/PK0mag68
[18:13] * rweeks (~rweeks@pat.hitachigst.com) has joined #ceph
[18:13] <swat30> still Cuttlefish w/ your PG fix
[18:13] * burley (~khemicals@cpe-98-28-233-158.woh.res.rr.com) Quit (Quit: burley)
[18:14] <swat30> I sent an email out to the users ML a sec ago as well with compiled details
[18:17] * Pauline_ (~middelink@bigbox.ch.polyware.nl) Quit (Remote host closed the connection)
[18:20] <swat30> sage, also found http://tracker.ceph.com/issues/7915, but not sure if upgrading to a version with that fix will fix my existing problem, or just prevent it from happening (or if it's even related)
[18:24] * cok (~chk@2a02:2350:1:1202:353a:fc82:6619:bc37) has left #ceph
[18:26] * ircolle-afk is now known as ircolle
[18:30] <infernix> ok, i think i figured out what's wrong
[18:30] <infernix> can I delete a radosgw zone?
[18:30] * pinoyskull (~pinoyskul@112.201.143.35) Quit (Remote host closed the connection)
[18:31] <sage> swat30: 7915 is unrelated
[18:31] <swat30> sage, OK
[18:31] <sage> btw that commit you referenced is identical to the workaround i already pushed
[18:32] <swat30> is it? ok, there are two commits in that branch
[18:33] <swat30> I knew one of them was, assumed PG.cc
[18:34] <swat30> any ideas on what I'm running into here? I'm assuming it's snap related, just unsure how to fix to get these to come back up
[18:34] <sage> pushed another bandaid. you really need to get off cuttlefish, though, it went EOL a while back.
[18:34] <swat30> you bet, going to upgrade once I get it back online and healthy
[18:34] <swat30> same branch?
[18:34] <sage> and stop using (esp deleting) snaps until you have upgraded and scrubbed
[18:35] <swat30> unfortunately we have glance hooked up to this, so just have to upgrade sooner rather than later
[18:35] <sage> same branch. untested etc
[18:35] <swat30> kk
[18:36] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Remote host closed the connection)
[18:36] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[18:36] <swat30> you're awesome
[18:36] <swat30> btw
[18:39] * linjan (~linjan@82.102.126.145) Quit (Ping timeout: 480 seconds)
[18:40] * angdraug (~angdraug@12.164.168.117) has joined #ceph
[18:42] <saturnine> sage: hi
[18:42] <saturnine> Anyone know the best way to roll back to a previous version of radosgw?
[18:43] <saturnine> 0.80.5 broke S3 uploads. Getting 411 responses for all multipart uploads now.
[18:44] <saturnine> How can I roll back to 0.80.4, and can I do that on just radosgw (leaving the rest on 0.80.5)?
[18:49] * sz0 (~sz0@94.55.197.185) Quit ()
[18:49] <saturnine> cool :D
[18:49] <gleam> i don't have any problems with radosgw+s3 api on 0.80.5
[18:49] * steki (~steki@91.195.39.5) Quit (Read error: Connection reset by peer)
[18:49] <gleam> even with multipart
[18:49] <gleam> hmm
[18:51] <saturnine> It may just be with the Java bindings?
[18:51] <gleam> perhaps, i'm only using it with fog
[18:51] <saturnine> 162.223.12.111 - - [15/Aug/2014:12:43:11 -0400] "POST /snapshots%2F2%2F14%2F8b1a5713-679c-473a-bb2e-b3279248bbfd?uploads HTTP/1.1" 411 424 "-" "aws-sdk-java/1.3.22 Linux/3.13.0-32-generic OpenJDK_64-Bit_Server_VM/24.51-b03"
[18:54] * mgarcesMZ (~mgarces@5.206.228.5) Quit (Ping timeout: 480 seconds)
[18:54] <gleam> all of mine have ?partNumber=X&?uploadId=Y
[18:55] <saturnine> it usually does a POST then a bunch of PUTs with the partNumber
[18:55] * bandrus (~Adium@216.57.72.205) has joined #ceph
[18:55] <gleam> yeah, i have the same thing but with an uploadid on all of them (and then partnumber on the puts)
[18:56] <gleam> chef/2014.08.15.09.00.02/chef.tar?uploadId=2%2F8pgJ2pyfMXy3j2A92qTFLjxi80c2-63 HTTP/1.1" 200 324
[18:56] <gleam> chef/2014.08.15.09.00.02/chef.tar?partNumber=17&uploadId=2%2F8pgJ2pyfMXy3j2A92qTFLjxi80c2-63 HTTP/1.1" 200
[18:56] <gleam> eg
[18:56] <saturnine> I dunno, maybe it's just with the java bindings. I saw that someone else had a similar issue since 80.5
[18:56] <gleam> weird
[18:57] <saturnine> https://bugs.launchpad.net/horizon/+bug/1352256
[18:57] <saturnine> Throwing "Length Required"
[18:57] <saturnine> I suppose that's with swift though
[18:59] * adamcrume (~quassel@50.247.81.99) Quit (Remote host closed the connection)
[19:01] * Tamil (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[19:01] * georgem2 (~oftc-webi@69-196-154-104.dsl.teksavvy.com) has joined #ceph
[19:02] <georgem2> any idea how to fix this broken cluster (http://paste.openstack.org/show/95640/)? I just started experimenting with Ceph and it's pretty complex..
[19:02] <swat30> georgem2, ceph health detail will show you a little more about the stuck/unhealthy pages
[19:02] <swat30> placement groups*
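The suggestion above, plus a couple of commands that usually go with it when diagnosing stuck placement groups (no cluster-specific names needed):
    ceph health detail            # names the stuck PGs and the reason
    ceph pg dump_stuck unclean    # also accepts: inactive, stale
    ceph osd tree                 # confirms which OSDs are up/in and where they live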
[19:04] * sjustwork (~sam@2607:f298:a:607:685f:fa56:be5c:579e) has joined #ceph
[19:04] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:05] <georgem2> swat30: I think all pgs are stuck, I basically have three nodes each with one OSD; my initial deploy partially failed and I did it again and ended up with 6 osds, three in and three down; after I removed the ones that were down I'm in this situation now
[19:06] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) has joined #ceph
[19:06] <georgem2> swat30: I don't have any data so I could probably wipe everything and start fresh, but if there is a simpler way..
[19:09] <swat30> ceph osd dump ?
[19:09] <saturnine> gleam: luckily radosgw doesn't have an object size limit
[19:09] <saturnine> 6GB single part upload worked fine
[19:09] <saturnine> So I dunno :x
[19:10] <gleam> haha
[19:10] * cok (~chk@2a02:2350:18:1012:f424:e96d:70af:fc68) has joined #ceph
[19:10] <georgem2> swat30: http://paste.openstack.org/show/95647/
[19:12] <swat30> georgem2, anything interesting in the OSD logs?
[19:13] <swat30> might want to turn up debugging if not, and looks like these are all on separate hosts. make sure they can talk to each other (firewall etc.)
[19:13] * scuttlemonkey is now known as scuttle|afk
[19:14] * scuttle|afk is now known as scuttlemonkey
[19:14] * Cataglottism (~Cataglott@dsl-087-195-030-170.solcon.nl) Quit (Ping timeout: 480 seconds)
[19:18] <georgem2> swat30: last msg is from yesterday (c=0x7fbc2677f4a0).fault with nothing to send, going to standby) and connectivity between nodes is working; I'm just trying to start fresh and get back later if I have any issues, thanks for help
[19:20] * adamcrume (~quassel@c-71-204-162-10.hsd1.ca.comcast.net) has joined #ceph
[19:20] * zerick (~eocrospom@190.187.21.53) has joined #ceph
[19:24] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[19:24] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Remote host closed the connection)
[19:25] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) has joined #ceph
[19:25] * Concubidated (~Adium@66.87.67.248) has joined #ceph
[19:28] * Concubidated1 (~Adium@66.87.67.47) has joined #ceph
[19:28] * Concubidated (~Adium@66.87.67.248) Quit (Read error: Connection reset by peer)
[19:30] * baylight (~tbayly@204.15.85.169) has joined #ceph
[19:31] * aknapp (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[19:33] * aknapp_ (~aknapp@fw125-01-outside-active.ent.mgmt.glbt1.secureserver.net) Quit (Ping timeout: 480 seconds)
[19:34] * baylight (~tbayly@204.15.85.169) has left #ceph
[19:42] * rweeks (~rweeks@pat.hitachigst.com) Quit (Quit: Leaving)
[19:42] * Kupo1 (~tyler.wil@wsip-68-14-231-140.ph.ph.cox.net) has joined #ceph
[19:43] <Kupo1> Quick question, on 'qemu-img info -f rbd' I am showing virtual size: 25G (26843545600 bytes) however 26843545600 bytes = 26.84GB. Am I crazy or is this incorrect?
[19:43] * georgem2 (~oftc-webi@69-196-154-104.dsl.teksavvy.com) Quit (Remote host closed the connection)
[19:44] * xarses (~andreww@12.164.168.117) has joined #ceph
[19:46] <TiCPU> Kupo1, 26843545600/1024/1024/1024=25
[19:47] <TiCPU> aka 26843545600/1GB not 1Gb
[19:47] <Kupo1> ok yeah i just figured that, google betrayed me D:
[19:47] <TiCPU> Kupo1, man 7 units :)
[19:48] <TiCPU> GiB = 1024^3 bytes, GB = 10^9 bytes, I was confused myself
[19:48] <Kupo1> yeah was using this https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=26843545600bytes%20in%20GB lol
[19:49] <Kupo1> correct usage is 'Gibibyte"
[19:49] <Kupo1> ./boggle
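The arithmetic from the exchange, spelled out; the second line uses integer division and simply truncates the 26.84:
    echo $(( 26843545600 / 1024 / 1024 / 1024 ))   # 25 -> GiB, what qemu-img prints as 25G
    echo $(( 26843545600 / 1000 / 1000 / 1000 ))   # 26 -> decimal GB, ~26.84 GB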
[19:50] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[19:51] * cok (~chk@2a02:2350:18:1012:f424:e96d:70af:fc68) has left #ceph
[19:51] * aknapp (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[19:52] * aknapp (~aknapp@64.202.160.233) has joined #ceph
[19:54] * ksingh (~Adium@teeri.csc.fi) has left #ceph
[19:57] * hasues1 (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[19:58] * hasues1 (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit ()
[19:58] * rendar (~I@host13-179-dynamic.7-87-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[19:58] * danieljh (~daniel@0001b4e9.user.oftc.net) Quit (Quit: leaving)
[20:00] * Concubidated1 is now known as Concubidated
[20:01] * rendar (~I@host13-179-dynamic.7-87-r.retail.telecomitalia.it) has joined #ceph
[20:02] <infernix> so I'm a lot further but am now slightly confused when I am trying to do metadata sync between to regions: ERROR:root:destination cannot be master zone of master region
[20:02] <infernix> isn't each region its own master?
[20:02] <infernix> or, how do I configure which region is master if I'm only looking to do metadata sync?
[20:05] <Kupo1> Is this the only current method of getting the used size of an RBD ? http://www.sebastien-han.fr/blog/2013/12/19/real-size-of-a-ceph-rbd-image/
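The method in the linked post boils down to summing the extents that rbd diff reports for the image; a sketch with a placeholder pool/image name:
    rbd diff rbd/myimage | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'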
[20:05] * Meths_ is now known as Meths
[20:10] * thomnico (~thomnico@2a01:e35:8b41:120:7c4e:e245:59a3:9f06) has joined #ceph
[20:10] * thomnico (~thomnico@2a01:e35:8b41:120:7c4e:e245:59a3:9f06) Quit (Remote host closed the connection)
[20:14] <yuriw> I see new failed tests with error:
[20:14] <yuriw> Command failed on vpm062 with status 1: "S3TEST_CONF=/home/ubuntu/cephtest/archive/s3-tests.client.1.conf /home/ubuntu/cephtest/s3-tests/virtualenv/bin/nosetests -w /home/ubuntu/cephtest/s3-tests -v -a '!fails_on_rgw'
[20:14] <yuriw> in
[20:14] <yuriw> http://pulpito.front.sepia.ceph.com/teuthology-2014-08-15_07:35:30-upgrade:dumpling-firefly-x-firefly-distro-basic-vps/426427/
[20:14] <yuriw> sorry wrong chat :)
[20:21] * llpamies (~oftc-webi@pat.hitachigst.com) has joined #ceph
[20:22] <llpamies> my osd stats say "10 osds: 8 up, 8 in". Is there any command to figure out which OSDs are the ones that are "down" or "out"?
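Either of these answers that question (no cluster-specific names needed):
    ceph osd tree                   # per-OSD up/down, in/out, and CRUSH location
    ceph osd dump | grep '^osd\.'   # the same state in flat form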
[20:26] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[20:28] * llpamies (~oftc-webi@pat.hitachigst.com) Quit (Remote host closed the connection)
[20:37] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) has joined #ceph
[20:38] <infernix> ah! "Set is_master to false." for multiple regions
[20:38] <infernix> great.
[20:40] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) Quit (Ping timeout: 480 seconds)
[20:51] <infernix> ok now i broke my region-map, and i can't use region-map set anymore
[20:54] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) has joined #ceph
[20:56] <infernix> success! metadata sync running
[20:58] * Midnightmyth_ (~quassel@93-167-84-102-static.dk.customer.tdc.net) has joined #ceph
[20:59] * dalegaard (~dalegaard@vps.devrandom.dk) Quit (Remote host closed the connection)
[20:59] * dalegaard (~dalegaard@vps.devrandom.dk) has joined #ceph
[21:01] * markbby (~Adium@168.94.245.4) Quit (Quit: Leaving.)
[21:02] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[21:02] * markbby (~Adium@168.94.245.4) has joined #ceph
[21:03] * Midnightmyth (~quassel@93-167-84-102-static.dk.customer.tdc.net) Quit (Ping timeout: 480 seconds)
[21:04] * dalegaard (~dalegaard@vps.devrandom.dk) Quit (Remote host closed the connection)
[21:06] * dalegaard (~dalegaard@vps.devrandom.dk) has joined #ceph
[21:09] * jtang_ (~jtang@80.111.83.231) Quit (Remote host closed the connection)
[21:11] * dalegaard (~dalegaard@vps.devrandom.dk) Quit (Remote host closed the connection)
[21:12] * dalegaard (~dalegaard@vps.devrandom.dk) has joined #ceph
[21:21] * rweeks (~rweeks@pat.hitachigst.com) has joined #ceph
[21:30] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) Quit (Read error: Connection reset by peer)
[21:30] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) has joined #ceph
[21:31] * JC1 (~JC@AMontpellier-651-1-280-182.w92-143.abo.wanadoo.fr) has joined #ceph
[21:32] * JC (~JC@AMontpellier-651-1-319-156.w92-133.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[21:40] * sjm (~sjm@108.53.250.33) has left #ceph
[21:40] * seapasulli (~seapasull@95.85.33.150) Quit (Remote host closed the connection)
[21:40] * seapasulli (~seapasull@95.85.33.150) has joined #ceph
[21:41] * rwheeler_ (~rwheeler@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[21:43] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) Quit (Read error: Connection reset by peer)
[21:43] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) has joined #ceph
[21:49] * jnq (~jnq@0001b7cc.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:49] * bandrus (~Adium@216.57.72.205) Quit (Quit: Leaving.)
[21:53] * aknapp_ (~aknapp@ip68-99-237-112.ph.ph.cox.net) has joined #ceph
[21:54] * seapasul1i (~seapasull@95.85.33.150) has joined #ceph
[21:58] * seapasulli (~seapasull@95.85.33.150) Quit (Ping timeout: 480 seconds)
[21:58] * jnq (~jnq@95.85.22.50) has joined #ceph
[22:01] * aknapp (~aknapp@64.202.160.233) Quit (Ping timeout: 480 seconds)
[22:03] * aknapp_ (~aknapp@ip68-99-237-112.ph.ph.cox.net) Quit (Remote host closed the connection)
[22:04] * aknapp (~aknapp@64.202.160.233) has joined #ceph
[22:04] * markbby (~Adium@168.94.245.4) Quit (Quit: Leaving.)
[22:05] * jcollins (~jcollins@c-50-132-65-22.hsd1.wa.comcast.net) has joined #ceph
[22:05] * markbby (~Adium@168.94.245.4) has joined #ceph
[22:06] * tinklebear (~tinklebea@50.97.94.13-static.reverse.softlayer.com) has joined #ceph
[22:06] <jcollins> I'm setting up a ceph cluster and am a bit confused on the proper value for pg_num and pgp_num for my main pool. I'm trying to follow the documentation here: http://ceph.com/docs/master/rados/operations/placement-groups/
[22:07] <jcollins> however, based on the calculation I should be setting pg_num to 512 (7*100/2 = 350, rounded up to the next power of 2 = 512)
[22:07] <jcollins> Error E2BIG: specified pg_num 512 is too large (creating 312 new PGs on ~7 OSDs exceeds per-OSD max of 32)
[22:08] <jcollins> seems like most any cluster would result in a pg_num value larger than 32 per OSD
[22:19] * markbby (~Adium@168.94.245.4) Quit (Quit: Leaving.)
[22:29] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) Quit (Read error: Connection reset by peer)
[22:30] <codice_> I believe the default number of replicas is 3, not 2
[22:31] <codice_> which means that pg_num should be 256, not 512
[22:31] <jcollins> codice_: it is, but that doesn't work well with only two hosts
[22:31] * codice_ is now known as codice
[22:31] <jcollins> I won't be adding the third host for a while, so I've reconfigured the replica size to 2
[22:31] <jcollins> and the min_size to 1
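For completeness, those replica settings are per-pool commands of this form (pool name is a placeholder):
    ceph osd pool set <pool> size 2
    ceph osd pool set <pool> min_size 1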
[22:32] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) has joined #ceph
[22:32] <codice> what's the current number of placement groups on your pool?
[22:32] * kevinc (~kevinc__@client65-44.sdsc.edu) Quit (Quit: This computer has gone to sleep)
[22:32] <codice> i.e. ceph osd pool get somename pg_num ?
[22:33] <jcollins> right now, 256, as that's the highest power of two it will let me set
[22:33] <codice> also, why 7 osds?
[22:33] <codice> and not an even number?
[22:34] <jcollins> that's all I've added to the pool so far as I migrate existing storage off LVM volumes
[22:34] * Japje (~Japje@2001:968:672:1::12) Quit (Ping timeout: 480 seconds)
[22:35] <jcollins> but the docs state 50-100 placement groups per OSD, yet the tools say a max of 32
[22:35] <jcollins> seems one or the other is incorrect
[22:36] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit (Quit: Leaving.)
[22:37] <codice> what version are you running?
[22:38] <jcollins> ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
[22:39] * dignus (~jkooijman@t-x.dignus.nl) has joined #ceph
[22:40] <codice> root@cdn006:~# ceph osd pool set data pg_num 1024
[22:40] <codice> set pool 0 pg_num to 1024
[22:40] <codice> root@cdn006:~# ceph osd pool set data pgp_num 1024
[22:40] <codice> set pool 0 pgp_num to 1024
[22:41] <jcollins> $ ceph osd pool set vms pg_num 512
[22:41] <jcollins> Error E2BIG: specified pg_num 512 is too large (creating 256 new PGs on ~7 OSDs exceeds per-OSD max of 32)
[22:41] <codice> It could be a bug (maybe?) or something to do with the odd number of OSDs
[22:41] <codice> to be honest, I'm not sure
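The two figures aren't actually in conflict: the 50-100 PGs per OSD in the docs is a steady-state total, while the error above appears to come from a cap on how many new PGs a single pg_num increase may create per OSD (mon_osd_max_split_count, default 32 in this release, if memory serves). With ~7 OSDs that allows roughly 7*32 = 224 new PGs per step, so 256 -> 512 (256 new PGs) is rejected. A sketch of the workaround, growing pg_num in smaller steps:
    ceph osd pool set vms pg_num 448     # 192 new PGs, under the ~224 cap
    # wait for the new PGs to finish creating (watch ceph -s), then:
    ceph osd pool set vms pg_num 512
    ceph osd pool set vms pgp_num 512
Raising mon osd max split count under [mon] in ceph.conf would be the other way around the cap, assuming that option is indeed the source of the limit.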
[22:43] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[22:44] <jcollins> what's the impact of having fewer PGs than suggested?
[22:44] * bandrus (~Adium@216.57.72.205) has joined #ceph
[22:44] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) Quit ()
[22:44] * tinklebear2 (~tinklebea@50.97.94.13-static.reverse.softlayer.com) has joined #ceph
[22:51] * tinklebear (~tinklebea@50.97.94.13-static.reverse.softlayer.com) Quit (Read error: Operation timed out)
[22:53] * t0rn (~ssullivan@2607:fad0:32:a02:d227:88ff:fe02:9896) Quit (Quit: Leaving.)
[22:56] * brad_mssw (~brad@shop.monetra.com) Quit (Quit: Leaving)
[22:56] * Tamil (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Quit: Leaving.)
[23:00] * Japje (~Japje@2001:968:672:1::12) has joined #ceph
[23:02] * Tamil (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[23:03] <infernix> ok, so I have my multi region (mia, las) single zone (named "1" on each site) rgw setup complete. I'm replicating metadata just fine
[23:03] <infernix> i create a bucket in the master region las with the constraint set to mia: "<CreateBucketConfiguration><LocationConstraint>mia</LocationConstraint></CreateBucketConfiguration>"
[23:03] <infernix> that works fine, metadata is replicated
[23:05] * linuxkidd (~linuxkidd@cpe-066-057-017-151.nc.res.rr.com) Quit (Quit: Leaving)
[23:05] <infernix> then when I try to upload data to this bucket in the 'mia' region by addressing mybucket.mia-1.s3.mydomain.com, the ceph log states " NOTICE: request for data in a different region (las != mia)"
[23:06] <infernix> so it seems that requests which don't have any specific location set in their XML are somehow being mapped to the master region by default
[23:06] <infernix> this is very confusing because my s3gw is configured as "exec /usr/bin/radosgw -d -c /etc/ceph/ceph.conf -n client.radosgw.mia-1 --rgw-region=mia --rgw-zone=1"
[23:08] <infernix> ceph.conf has "rgw region = mia"
[23:09] <infernix> region list shows "{ "default_info": { "default_region": "mia"}"
[23:09] <infernix> literally everything points to the "mia" region on the rados gw server on that site, except for master_region in the region-map
[23:12] <infernix> i would really appreciate some pointers on this one, i've figured out all the other variables here but am now beating my head against the wall
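One thing worth double-checking for the symptom above, going by the firefly federated-gateway docs: each gateway host keeps its own default-region setting and its own copy of the region map, so the mia gateway needs both set locally (and radosgw restarted) after the regions change; a sketch, reusing names from this conversation:
    radosgw-admin region default --rgw-region=mia --name client.radosgw.mia-1
    radosgw-admin regionmap update --name client.radosgw.mia-1
    # then restart the radosgw instance on the mia host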
[23:16] * llpamies (~oftc-webi@pat.hitachigst.com) has joined #ceph
[23:23] <infernix> well, seems help on this irc channel isn't what it used to be :/
[23:24] <jcollins> infernix: wow, you're being more than a little rude
[23:25] <infernix> jcollins: i'm not actually, it's been my experience over the past few months
[23:26] <jcollins> infernix: you are, have you paid anyone on this channel for help? I suspect not, yet you act as if it is your god-given right
[23:26] * Tamil (~Adium@cpe-108-184-74-11.socal.res.rr.com) Quit (Read error: Connection reset by peer)
[23:26] * Tamil (~Adium@cpe-108-184-74-11.socal.res.rr.com) has joined #ceph
[23:26] <infernix> i don't act like that, you misinterpret; i'm just saddened by it.
[23:26] * dalgaaf (uid15138@id-15138.ealing.irccloud.com) has joined #ceph
[23:26] <infernix> this :/ is a sadface, not an i'm-entitled-face
[23:26] <jcollins> no, you do in fact act like it, you waited all of 11 minutes for a response and then started bitching
[23:27] <infernix> jcollins: i asked my first question 2.5 hours ago, in case you just walked in
[23:27] <jcollins> infernix: it doesn't matter
[23:27] <jcollins> what have you done that entitles you to a response?
[23:27] <gleam> why are we being dicks to each other?
[23:28] <infernix> look, you're completely misinterpreting me. i am just sad that there is less activity here. it was a lot higher. i don't demand anything
[23:28] <jcollins> the software is free, that they even offer help in IRC is more than anyone should expect
[23:28] <infernix> and i am entitled to exactly nothing
[23:28] <infernix> and i've been using free software since 1995, you don't have to explain to me how it works.
[23:28] <codice> infernix: I'd love to help, but I can't because I don't have a similar setup, so my experience is limited in that regard
[23:28] <codice> (also, I'm at work)
[23:28] <codice> I imagine a few of the other folks here are likely occupied as well
[23:29] <jcollins> precisely, ask and wait... if you get an answer, great, if you don't... *shrug*
[23:29] <infernix> all i was saying is that things were much more lively in #ceph a year or so ago. maybe i should take it to the mailing list
[23:29] <TiCPU> my day just ended and I only have 2 ceph clusters with 6 nodes, can't help either
[23:29] <infernix> jcollins: thank you for educating me on netiquette, i highly appreciate your input
[23:29] * davidz (~Adium@cpe-23-242-12-23.socal.res.rr.com) Quit (Remote host closed the connection)
[23:30] <codice> jcollins: with regard to fewer than suggested pg_nums, my understanding would be that data wouldn't be distributed as efficiently throughout your cluster
[23:30] * tryggvil (~tryggvil@17-80-126-149.ftth.simafelagid.is) Quit (Quit: tryggvil)
[23:30] <codice> this may help: http://ceph.com/docs/firefly/dev/placement-group/
[23:31] <jcollins> codice: thank you
[23:31] <codice> sure thing
[23:32] <codice> infernix: have you tried the mailing list, or the archives? I seem to recall someone having a similar issue
[23:32] <TiCPU> I encounter problems with RBD: I have a client using an RBD with a Btrfs filesystem and it sporadically freezes waiting for one OSD, as seen in debugfs/osdc. Other VMs using Ceph work perfectly, the iSCSI client too, but to get the RBD back up and running I must either wait for it to unlock after about 5 to 15 minutes or restart the OSD it is waiting on. Any ideas?
[23:32] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) Quit (Read error: Connection reset by peer)
[23:32] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) has joined #ceph
[23:33] <TiCPU> I also get this line in the osd.#.log when frozen: 2014-08-15 07:51:16.090645 7fb4232c9700 0 -- 10.10.252.4:6800/19085 >> 10.10.4.13:0/4012069674 pipe(0xe7e9980 sd=94 :6800 s=0 pgs=0 cs=0 l=0 c=0xc67c940).accept peer addr is really 10.10.4.13:0/4012069674 (socket is 10.10.4.13:35658/0)
[23:34] <TiCPU> could it be a network routing issue? (I'm using the same network for mon and data, but the server is multi-homed.)
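One hedged way to check the multi-homing angle is to compare the addresses each OSD registered with against what the RBD client can actually route to:
    ceph osd dump | grep '^osd'        # public and cluster address per OSD
    ip route get 10.10.252.4           # on the client, for the OSD address in the log line above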
[23:34] <llpamies> I have a cluster with two networks (192.168.5.0 and 192.168.6.0) and I configured ceph.conf with "public_network = 192.168.6.0/8" and "cluster_network = 192.168.6.0/8". However, when I look at the osd dump, or the logs, I can still see all osds using the 192.168.5.0 network. Should I configure the networks somewhere else too?
[23:35] <TiCPU> llpamies, same setup as mine, did you set up the DNS correctly?
[23:36] <TiCPU> nslookup <host>
[23:36] <codice> do you have separate interfaces?
[23:36] <llpamies> TiCPU, I don't have a DNS
[23:36] <TiCPU> llpamies, then the host file
[23:36] <codice> i.e. one for public, one for cluster?
[23:36] <llpamies> yes, these are separate interfaces
[23:36] <Vacum__> llpamies: 192.168.6.0/8 covers the whole 192.x.x.x network
[23:36] <codice> good. You don't have specific routes, do you?
[23:37] <llpamies> should it be /24 ?
[23:37] <TiCPU> didn't see that /8 !
[23:37] <codice> what may be happening is your cluster network is using your default gw for the traffic
[23:37] <Vacum__> llpamies: did you actually mean /24 ?
[23:37] <llpamies> ohhh I see,
[23:37] <llpamies> no, my bad :(
[23:37] <llpamies> thanks guys
[23:37] <llpamies> I missed this
[23:37] <codice> you need a network route
[23:37] * kevinc (~kevinc__@client65-44.sdsc.edu) has joined #ceph
[23:38] <codice> i.e. ip route add <network>/<prefix> dev <device>
[23:38] <TiCPU> codice, a route to handle packets from 192.168.5.0/24 to the cluster, right
[23:38] <codice> to keep the cluster traffic on its own interface, yes
[23:38] * BManojlovic (~steki@79-101-68-222.dynamic.isp.telekom.rs) Quit (Read error: Connection reset by peer)
[23:39] <TiCPU> codice, this route will be generated by the kernel when you assign the IP on the interface
[23:39] <llpamies> TiCPU, Vacum__: Thanks, now it works!
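For reference, the corrected form of that config uses /24 masks so the two subnets stay distinct; a sketch, assuming the .6 network is the public side and .5 the cluster side (swap if reversed), with the OSDs restarted afterwards to pick it up:
    [global]
    public_network = 192.168.6.0/24
    cluster_network = 192.168.5.0/24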
[23:40] <TiCPU> to use RBD on another network, I had to enable IP forwarding on the Ceph machines and set up a route on the router for each Ceph node; did I do this right? Otherwise, no one could query the cluster network
[23:40] * xarses (~andreww@12.164.168.117) Quit (Remote host closed the connection)
[23:41] <TiCPU> I thought Ceph would listen on my internal network as well as the cluster and public networks
[23:43] * davidz (~Adium@cpe-23-242-12-23.socal.res.rr.com) has joined #ceph
[23:46] * xarses (~andreww@12.164.168.117) has joined #ceph
[23:46] <tinklebear2> Is CentOS 7 the best distribution to start playing with ceph since ceph is now a RedHat project?
[23:49] * sprachgenerator (~sprachgen@130.202.135.20) has joined #ceph
[23:49] <TiCPU> I must say that Ceph isn't very OS dependent, but with kernel RBD being what it is, the more recent the kernel the better, so I guess Ubuntu or openSUSE is still the best platform for Ceph. If you don't use RBD, well, it doesn't matter.
[23:50] * Eco (~eco@adsl-99-102-132-133.dsl.pltn13.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[23:56] * sjustwork (~sam@2607:f298:a:607:685f:fa56:be5c:579e) Quit (Quit: Leaving.)
[23:59] * nhm (~nhm@65-128-141-191.mpls.qwest.net) Quit (Quit: Lost terminal)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.