#ceph IRC Log

Index

IRC Log for 2016-01-02

Timestamps are in GMT/BST.

[0:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[0:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[0:13] * codice (~toodles@75-128-34-237.static.mtpk.ca.charter.com) has joined #ceph
[0:15] * thansen (~thansen@162.219.43.108) Quit (Quit: Ex-Chat)
[0:17] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[1:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[1:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[1:07] * olid1982 (~olid1982@aftr-185-17-206-143.dynamic.mnet-online.de) Quit (Ping timeout: 480 seconds)
[1:07] * eXeler0n (~Swompie`@65.19.167.130) has joined #ceph
[1:22] * oms101 (~oms101@p20030057EA6CBA00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:32] * oms101 (~oms101@p20030057EA5E9700C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[1:37] * eXeler0n (~Swompie`@76GAAAWI4.tor-irc.dnsbl.oftc.net) Quit ()
[1:37] * DJComet (~Chrissi_@euve59226.serverprofi24.de) has joined #ceph
[2:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[2:01] * DJComet (~Chrissi_@6YRAABUT9.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[2:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[2:02] * Discovery (~Discovery@178.239.49.68) Quit (Ping timeout: 480 seconds)
[2:24] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[2:46] * Discovery (~Discovery@178.239.49.68) has joined #ceph
[2:58] * kefu (~kefu@114.92.107.250) has joined #ceph
[2:58] * LeaChim (~LeaChim@host86-185-146-193.range86-185.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[3:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[3:03] * aj__ (~aj@x4db24b53.dyn.telefonica.de) has joined #ceph
[3:06] * kefu (~kefu@114.92.107.250) Quit (Ping timeout: 480 seconds)
[3:11] * derjohn_mobi (~aj@x590cacd0.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[3:26] * redbeast12 (~rogst@59-234-47-212.rev.cloud.scaleway.com) has joined #ceph
[3:45] * yanzheng (~zhyan@182.139.23.32) has joined #ceph
[3:45] * calvinx (~calvin@101.100.172.246) has joined #ceph
[3:50] * yanzheng (~zhyan@182.139.23.32) Quit (Quit: This computer has gone to sleep)
[3:56] * redbeast12 (~rogst@6YRAABUU1.tor-irc.dnsbl.oftc.net) Quit ()
[3:57] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[3:59] * Discovery (~Discovery@178.239.49.68) Quit (Remote host closed the connection)
[4:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[4:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[4:04] * yanzheng (~zhyan@182.139.23.32) has joined #ceph
[4:09] * thadood (~thadood@slappy.thunderbutt.org) Quit (Remote host closed the connection)
[4:10] * thadood (~thadood@slappy.thunderbutt.org) has joined #ceph
[4:11] * yanzheng (~zhyan@182.139.23.32) Quit (Quit: This computer has gone to sleep)
[4:22] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[4:26] * calvinx (~calvin@101.100.172.246) Quit (Quit: calvinx)
[4:28] * Mousey (~neobenedi@tor.asmer.com.ua) has joined #ceph
[4:36] * calvinx (~calvin@101.100.172.246) has joined #ceph
[4:46] * kefu (~kefu@114.92.107.250) has joined #ceph
[4:54] * kefu (~kefu@114.92.107.250) Quit (Ping timeout: 480 seconds)
[4:58] * Mousey (~neobenedi@4MJAAAXYL.tor-irc.dnsbl.oftc.net) Quit ()
[5:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[5:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[5:07] * blip2 (~Grimmer@exit.tor.uwaterloo.ca) has joined #ceph
[5:14] * MACscr (~Adium@2601:247:4101:a0be:dd5:7ff6:d859:fc2c) Quit (Quit: Leaving.)
[5:20] * johnhunter (~hunter@101.16.202.230) has joined #ceph
[5:30] * calvinx (~calvin@101.100.172.246) Quit (Quit: calvinx)
[5:37] * calvinx (~calvin@101.100.172.246) has joined #ceph
[5:37] * blip2 (~Grimmer@4MJAAAXZT.tor-irc.dnsbl.oftc.net) Quit ()
[5:37] * Freddy (~Mraedis@59-234-47-212.rev.cloud.scaleway.com) has joined #ceph
[5:51] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[5:51] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[5:54] * vbellur (~vijay@2601:647:4f00:4960:5e51:4fff:fee8:6a5c) has joined #ceph
[5:54] * Vacuum_ (~Vacuum@88.130.210.217) has joined #ceph
[5:55] * hunter_ (~hunter@101.16.202.230) has joined #ceph
[6:00] * Vacuum__ (~Vacuum@i59F79D8A.versanet.de) Quit (Read error: Connection reset by peer)
[6:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[6:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[6:01] * johnhunter (~hunter@101.16.202.230) Quit (Ping timeout: 480 seconds)
[6:01] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[6:07] * Freddy (~Mraedis@84ZAAAY21.tor-irc.dnsbl.oftc.net) Quit ()
[6:07] * poller (~osuka_@178.162.216.42) has joined #ceph
[6:30] * calvinx (~calvin@101.100.172.246) Quit (Quit: calvinx)
[6:37] * poller (~osuka_@76GAAAWR6.tor-irc.dnsbl.oftc.net) Quit ()
[6:45] * Nats__ (~natscogs@114.31.195.238) Quit (Read error: Connection reset by peer)
[6:54] * i_m (~ivan.miro@83.149.37.87) has joined #ceph
[6:54] * calvinx (~calvin@101.100.172.246) has joined #ceph
[7:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[7:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[7:07] * petracvv (~quassel@c-50-158-9-81.hsd1.il.comcast.net) Quit (Read error: Connection reset by peer)
[7:07] * petracvv (~quassel@c-50-158-9-81.hsd1.il.comcast.net) has joined #ceph
[7:47] * tansy (~sahithi_r@14.139.82.6) has joined #ceph
[8:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[8:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[8:11] * Throlkim (~adept256@185.60.144.31) has joined #ceph
[8:11] * tansy (~sahithi_r@14.139.82.6) Quit (Ping timeout: 480 seconds)
[8:16] * johnhunter (~hunter@101.16.202.230) has joined #ceph
[8:17] * hunter_ (~hunter@101.16.202.230) Quit (Ping timeout: 480 seconds)
[8:21] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[8:22] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[8:22] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[8:23] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[8:32] * johnhunter (~hunter@101.16.202.230) Quit (Ping timeout: 480 seconds)
[8:41] * Throlkim (~adept256@4MJAAAX2R.tor-irc.dnsbl.oftc.net) Quit ()
[8:42] * r_await (~r_await@ec2-52-18-254-85.eu-west-1.compute.amazonaws.com) has joined #ceph
[8:42] * tansy (~sahithi_r@14.139.82.6) has joined #ceph
[8:47] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:ed0c:dd2f:18a6:e24d) has joined #ceph
[8:53] <tansy> hello! I am tansy and I am new to this community. I wish to contribute to Ceph. I have gone through the 2015 GSoC projects. I took a course on operating systems, I can code in C/C++ and Python, and I use a Unix system. Can someone help me get started?
[8:55] <tansy> I would like to know if there are any small tasks or some easy bugs to fix.
[8:57] * mykola (~Mikolaj@91.225.201.25) has joined #ceph
[9:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[9:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[9:04] * Swompie` (~tallest_r@108.61.68.152) has joined #ceph
[9:11] * rendar (~I@95.233.118.222) has joined #ceph
[9:18] * i_m1 (~ivan.miro@83.149.37.169) has joined #ceph
[9:21] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:ed0c:dd2f:18a6:e24d) Quit (Ping timeout: 480 seconds)
[9:23] * i_m (~ivan.miro@83.149.37.87) Quit (Ping timeout: 480 seconds)
[9:30] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[9:34] * Swompie` (~tallest_r@108.61.68.152) Quit ()
[9:36] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[9:49] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[9:51] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[9:52] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[9:55] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[9:55] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[9:55] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[9:59] * MACscr (~Adium@2601:247:4101:a0be:c46f:f51b:1bda:b9ba) has joined #ceph
[9:59] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:ed0c:dd2f:18a6:e24d) has joined #ceph
[10:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[10:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[10:07] * MACscr (~Adium@2601:247:4101:a0be:c46f:f51b:1bda:b9ba) Quit (Quit: Leaving.)
[10:07] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Read error: Connection reset by peer)
[10:10] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[10:10] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[10:10] * xarses_ (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[10:13] * MACscr (~Adium@2601:247:4101:a0be:14ca:756a:f7ec:2ca0) has joined #ceph
[10:23] * LRWerewolf (~murmur@109.201.133.100) has joined #ceph
[10:27] * johnhunter (~hunter@101.16.202.230) has joined #ceph
[10:32] * yanzheng (~zhyan@182.139.23.32) has joined #ceph
[10:37] * yanzheng (~zhyan@182.139.23.32) Quit ()
[10:49] * yanzheng (~zhyan@182.139.23.32) has joined #ceph
[10:53] * LRWerewolf (~murmur@84ZAAAZA7.tor-irc.dnsbl.oftc.net) Quit ()
[10:57] * yanzheng (~zhyan@182.139.23.32) Quit (Quit: This computer has gone to sleep)
[11:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[11:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[11:09] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:ed0c:dd2f:18a6:e24d) Quit (Ping timeout: 480 seconds)
[11:27] * i_m1 (~ivan.miro@83.149.37.169) Quit (Ping timeout: 480 seconds)
[11:34] * SweetGirl (~Coe|work@tor-exit.dhalgren.org) has joined #ceph
[12:00] * johnhunter (~hunter@101.16.202.230) Quit (Ping timeout: 480 seconds)
[12:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[12:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[12:03] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:ed0c:dd2f:18a6:e24d) has joined #ceph
[12:04] * SweetGirl (~Coe|work@76GAAAW0O.tor-irc.dnsbl.oftc.net) Quit ()
[12:16] * olid1982 (~olid1982@aftr-185-17-206-143.dynamic.mnet-online.de) has joined #ceph
[12:18] * pabluk_ is now known as pabluk
[12:23] * pabluk is now known as pabluk_
[12:30] * olid1982 (~olid1982@aftr-185-17-206-143.dynamic.mnet-online.de) Quit (Ping timeout: 480 seconds)
[12:37] * fdmanana (~fdmanana@2001:8a0:6dfd:6d01:ed0c:dd2f:18a6:e24d) Quit (Ping timeout: 480 seconds)
[12:37] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Remote host closed the connection)
[12:40] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[12:55] * calvinx (~calvin@101.100.172.246) Quit (Quit: calvinx)
[13:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[13:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[13:12] * Miho (~Curt`@37.48.81.27) has joined #ceph
[13:18] * krypto (~krypto@43.224.130.104) has joined #ceph
[13:38] * tansy (~sahithi_r@14.139.82.6) Quit (Ping timeout: 480 seconds)
[13:41] * Miho (~Curt`@84ZAAAZGV.tor-irc.dnsbl.oftc.net) Quit ()
[13:46] * Discovery (~Discovery@178.239.49.69) has joined #ceph
[13:50] * Popz (~Tralin|Sl@104.238.169.56) has joined #ceph
[14:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[14:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[14:08] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[14:10] * johnhunter (~hunter@101.16.202.230) has joined #ceph
[14:19] * LDA (~DM@host217-114-156-249.pppoe.mark-itt.net) has joined #ceph
[14:20] * Popz (~Tralin|Sl@104.238.169.56) Quit ()
[14:26] * LeaChim (~LeaChim@host86-185-146-193.range86-185.btcentralplus.com) has joined #ceph
[14:27] * tansy (~sahithi_r@14.139.82.6) has joined #ceph
[14:28] * johnhunter (~hunter@101.16.202.230) Quit (Remote host closed the connection)
[14:30] * uhtr5r (~Kealper@109.201.143.40) has joined #ceph
[14:52] * olid1982 (~olid1982@193.24.209.81) has joined #ceph
[14:56] * etienne (~textual@ARennes-650-1-42-118.w86-215.abo.wanadoo.fr) has joined #ceph
[14:57] * nardial (~ls@dslb-088-072-094-077.088.072.pools.vodafone-ip.de) has joined #ceph
[14:59] * uhtr5r (~Kealper@76GAAAW50.tor-irc.dnsbl.oftc.net) Quit ()
[15:00] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[15:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[15:07] <flaf> Hi @all and happy new year! ;)
[15:08] <flaf> tansy: I'm not a ceph expert, but if you're looking for bugs to fix, you can search here: http://tracker.ceph.com
[15:09] * ade (~abradshaw@dslb-092-078-131-084.092.078.pools.vodafone-ip.de) has joined #ceph
[15:17] * fabioFVZ (~fabiofvz@250-4-187-213.wifi4all.it) has joined #ceph
[15:18] * fabioFVZ (~fabiofvz@250-4-187-213.wifi4all.it) Quit ()
[15:18] * maku1 (~Grimhound@109.201.143.40) has joined #ceph
[15:18] * i_m (~ivan.miro@31.207.230.243) has joined #ceph
[15:23] * olid1982 (~olid1982@193.24.209.81) Quit (Ping timeout: 480 seconds)
[15:30] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[15:30] * xarses_ (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Read error: Connection reset by peer)
[15:33] * danieagle (~Daniel@187.34.2.79) has joined #ceph
[15:39] * etienne (~textual@ARennes-650-1-42-118.w86-215.abo.wanadoo.fr) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[15:47] * maku1 (~Grimhound@84ZAAAZJ3.tor-irc.dnsbl.oftc.net) Quit ()
[15:47] * bvi (~bastiaan@152-64-132-5.ftth.glasoperator.nl) has joined #ceph
[16:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[16:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[16:04] * olid1982 (~olid1982@dslb-084-059-135-089.084.059.pools.vodafone-ip.de) has joined #ceph
[16:05] * DougalJacobs (~GuntherDW@172.98.67.86) has joined #ceph
[16:13] * markl (~mark@knm.org) Quit (Quit: leaving)
[16:14] * markl (~mark@knm.org) has joined #ceph
[16:16] <tansy> flaf: thank you
[16:20] * med (~medberry@71.74.177.250) Quit (Ping timeout: 480 seconds)
[16:24] * med (~medberry@71.74.177.250) has joined #ceph
[16:26] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Ping timeout: 480 seconds)
[16:27] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[16:35] * DougalJacobs (~GuntherDW@172.98.67.86) Quit ()
[16:40] * duderonomy (~duderonom@c-24-7-50-110.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:52] * dyasny (~dyasny@dsl.198.58.161.209.ebox.ca) has joined #ceph
[16:52] * mykola (~Mikolaj@91.225.201.25) Quit (Remote host closed the connection)
[16:56] * etienne (~textual@ARennes-650-1-42-118.w86-215.abo.wanadoo.fr) has joined #ceph
[17:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[17:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[17:05] * Discovery (~Discovery@178.239.49.69) Quit ()
[17:06] * krypto (~krypto@43.224.130.104) Quit (Read error: Connection reset by peer)
[17:07] * Salamander_ (~blip2@109.201.143.40) has joined #ceph
[17:09] * Discovery (~Discovery@178.239.49.69) has joined #ceph
[17:15] * i_m (~ivan.miro@31.207.230.243) Quit (Ping timeout: 480 seconds)
[17:16] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Ping timeout: 480 seconds)
[17:26] * dyasny_ (~dyasny@dsl.198.58.152.230.ebox.ca) has joined #ceph
[17:29] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[17:29] * dyasny (~dyasny@dsl.198.58.161.209.ebox.ca) Quit (Ping timeout: 480 seconds)
[17:37] * Salamander_ (~blip2@4MJAAAYDE.tor-irc.dnsbl.oftc.net) Quit ()
[17:45] <flaf> Hello. I have a (little) cluster with ~20 osds and I have stopped an osd daemon. After that, I have 43 pgs stuck active+undersized+degraded.
[17:45] <flaf> That seems very curious to me, because I have size = 3 and 5 physical servers.
[17:47] <flaf> Ah... no, now all is OK (active+clean)
[17:48] <flaf> It seemed to me a little long. ;)
[17:48] <flaf> and I saw no activity...
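For readers following along, the stuck state flaf describes can be inspected with the standard PG query commands; a minimal sketch (assuming a normal admin keyring):

    ceph health detail            # lists the degraded/undersized PGs by id
    ceph pg dump_stuck unclean    # PGs that have not returned to active+clean
    ceph osd tree                 # shows which OSD is down and where it sits in CRUSH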
[17:49] <T1> flaf: it could also depend on how your CRUSH map is configured
[17:50] <flaf> Yes, indeed, but my CRUSH map seems correct to me (and simple too). Do you want to see a paste?
[17:51] <T1> sure, why not
[17:52] * etienne (~textual@ARennes-650-1-42-118.w86-215.abo.wanadoo.fr) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[17:53] <flaf> T1 => http://paste.alacon.org/39210
[17:53] <flaf> I accept remarks with pleasure. ;)
[17:54] <T1> please update the paste (or create a new one) with the output from ceph osd crush tree
[17:54] <flaf> It's a basic CRUSH map; I just have a specific rule to put objects on SSDs only (a rule used by the cephfsmetada pool only).
[17:54] <T1> it's easier on the eye for a quick lookover.. :)
[17:54] <flaf> T1 ok...
[17:55] <flaf> http://paste.alacon.org/39211
[17:55] <flaf> Indeed, the tree helps a lot ;)
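For context, an SSD-only rule like the one flaf mentions typically looks something like this in a decompiled CRUSH map. This is a sketch, not flaf's actual paste; the "ssd" root bucket and the rule name are assumptions:

    rule ssd_only {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take ssd                        # start from the assumed all-SSD root
        step chooseleaf firstn 0 type host   # one replica per host
        step emit
    }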
[17:56] <T1> yeah
[17:56] * tansy (~sahithi_r@14.139.82.6) Quit (Quit: Leaving)
[17:56] <T1> well that looks pretty standard
[17:56] <flaf> Yes indeed.
[17:56] <T1> nothing out of the ordinary (looks very much like mine apart from the ssd/sata partitioning)
[17:57] <flaf> I have just stopped 2 osds (with stop ceph-osd id=$id) and it seemed to me a little long to go back to 100% active+clean.
[17:58] <T1> how much IO does your cluster have?
[17:58] <flaf> But finally I have 100% active+clean (maybe I'm a little impatient).
[17:58] <flaf> T1: how can I see that?
[17:58] <T1> it could simply be that it takes that much time to get back to active + clean when the cluster has to rebalance the lost data from the 2 OSDs
[17:58] <T1> uhmm.. ceph -s
[17:59] <T1> last few lines should show client I/O
[17:59] <T1> client io 34315 B/s wr, 3 op/s
[17:59] <flaf> My cluster is not in production, it's completely idle currently and has... just 21 objects ;) according to "ceph df"
[17:59] <T1> (wow.. pretty idle here)
[17:59] <T1> ah
[17:59] <T1> ok
[18:00] <T1> well.. when you stop an OSD the MONs need to notice it
[18:00] <flaf> This is why I found the balancing a little long.
[18:00] <T1> I think there is a 30 second (I seem to remember having read that somewhere at some point) timeout
[18:00] <T1> after that the OSD gets marked as down
[18:00] <T1> you could probably tune it a bit
[18:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[18:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[18:01] <flaf> In my case, after a "stop ceph-osd id=$id" the osd was marked as down immediately.
[18:01] <T1> ah, ok
[18:02] <T1> probably due to a "nice" shutdown
[18:02] <flaf> Yes :)
[18:02] <T1> makes good sense anyway
[18:02] <T1> well..
[18:02] <T1> it probably depends on what you deem "reasonable" for the cluster to recover
[18:02] <T1> you could try this
[18:03] <T1> open a new console
[18:03] <T1> do a
[18:03] <T1> ceph -w
[18:03] <flaf> I think I was too impatient
[18:03] <T1> this shows at runtime everything that happens in the cluster
[18:03] <flaf> T1 ok.
[18:03] <T1> then try and stop an OSD
[18:03] <T1> and see what happens
[18:03] <flaf> I'll restart all the osds first...
[18:03] <T1> then you can see what the monitors are doing
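T1's two-terminal suggestion, condensed into a sketch (the OSD id is illustrative, and the stop command uses the Upstart syntax flaf mentioned earlier):

    # terminal 1: stream cluster events as they happen
    ceph -w
    # terminal 2: cleanly stop one OSD
    sudo stop ceph-osd id=4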
[18:04] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) has joined #ceph
[18:04] <flaf> I will make a little screencast.... :)
[18:04] * bvi (~bastiaan@152-64-132-5.ftth.glasoperator.nl) Quit (Ping timeout: 480 seconds)
[18:05] * dyasny_ (~dyasny@dsl.198.58.152.230.ebox.ca) Quit (Ping timeout: 480 seconds)
[18:07] * dyasny_ (~dyasny@dsl.dynamic.190-116-74-99.electronicbox.net) has joined #ceph
[18:16] <flaf> T1: osd stopped at 18:07:56 => cluster 100% OK at ~18:13 (my cluster has only 21 objects according to "ceph df").
[18:16] * nardial (~ls@dslb-088-072-094-077.088.072.pools.vodafone-ip.de) Quit (Quit: Leaving)
[18:17] <flaf> T1: activity according to "ceph -w" http://paste.alacon.org/39212
[18:18] <T1> flaf: okay - that is a bit long
[18:19] <flaf> Yes... I'm creating the wonderful video. :)
[18:19] <T1> I wonder why
[18:19] <T1> 18:13:00.446310 osd.4 [INF] 0.16 starting backfill to osd.9 from (0'0,0'0] MAX to 1207'3669
[18:19] <T1> first starts there
[18:20] <T1> ooooh
[18:20] <T1> sorry
[18:20] <T1> !
[18:20] <T1> osd.1 out (down for 302.316077)
[18:20] <T1> there is a 300 sec delay where the OSD has a chance to get back up and in before backfilling starts
[18:20] <flaf> What is the meaning of "1207'3669" etc. ?
[18:21] * brian_ is now known as brian-
[18:21] <T1> no idea
[18:21] <T1> we probably need to look at the code to get an idea
[18:21] <T1> (and I've got no idea where the proper place is)
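(For the record: strings like 1207'3669 appear to be PG log versions, printed as epoch'version; see eversion_t in the Ceph source.)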
[18:21] * vata (~vata@cable-21.246.173-197.electronicbox.net) Quit (Ping timeout: 480 seconds)
[18:23] * brian- is now known as brian
[18:24] <T1> somewhere on http://docs.ceph.com/docs/master/rados/operations/monitoring-osd-pg/ there are the following lines
[18:24] <T1> If an OSD is down and the degraded condition persists, Ceph may mark the down OSD as out of the cluster and remap the data from the down OSD to another OSD. The time between being marked down and being marked out is controlled by mon osd down out interval, which is set to 300 seconds by default.
[18:24] * brian is now known as brian-
[18:24] <flaf> Ah ok, so it's completely normal, in fact, correct ?
[18:24] <T1> yeah
[18:25] <T1> look here for explanation
[18:25] <T1> http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/
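The interval T1 and the docs refer to can be tuned; a sketch of both the persistent and the runtime form (the 600-second value is only an example):

    # ceph.conf on the monitors
    [mon]
    mon osd down out interval = 600

    # or injected at runtime, without a restart
    ceph tell mon.* injectargs '--mon-osd-down-out-interval 600'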
[18:25] <flaf> Ok thx. What a pity, my video was ready here :) https://echanges.ac-versailles.fr/get?k=Kd7BKM5LSL88Na3Rmkd
[18:25] <T1> dang.. ;)
[18:26] <flaf> (a wonderful video isn't it? ;))
[18:26] <flaf> (lots of suspense...)
[18:27] <T1> but the 300 seconds are a giveaway (as 30 seconds would be in other instances) that it's a configured default value.. :)
[18:27] <T1> haha, yeah
[18:28] <T1> on a cluster with client I/O ceph -w updates at least once a second
[18:29] <T1> more often if there is reason for it
[18:29] <T1> (ie. something happens)
[18:30] <flaf> Ok. thx T1 and well done.
[18:30] <T1> np np
[18:30] <T1> I'm still learning the ropes, so I'm happy to have helped out
[18:32] <flaf> Me too, except that I'm rarely able to help ;)
[18:33] <T1> well.. I started looking at ceph in the early part of 2015
[18:33] <T1> got the hardware during summer
[18:33] <flaf> (But I agree, helping can be a good way to learn)
[18:33] * ade (~abradshaw@dslb-092-078-131-084.092.078.pools.vodafone-ip.de) Quit (Ping timeout: 480 seconds)
[18:33] <T1> spent a few months doing nothing (other work had priority)
[18:33] <T1> and then created a cluster and entered it into production during October and November
[18:34] <flaf> ah ok, and since November, no problems with your cluster in production?
[18:34] <T1> and right now we are pouring all data in our production env into it
[18:34] <T1> no
[18:35] <flaf> What part of ceph do you use in production? (radosgw, cephfs, rbd?)
[18:37] <T1> all data = 10 million+ PDFs (20,000+ are added daily), PDF generator logfiles (where the data in the PDF comes from and why we calculated what/how we did), logfiles for all systems, as well as 30+ million "versions" (binary blobs only 2-3kb in size) of our questionnaires and other misc data
[18:37] <T1> we are using RBD
[18:37] <T1> cephfs would be nice instead, but it has performance problems that would cause us major problems with our use case
[18:38] <flaf> Yes I understand. I will use cephfs in production soon but it's for storage of a web site.
[18:38] <T1> so we have a number of RBDs with data (depending on what system, what kind of data etc etc) that are fronted by a server and served to clients over NFS
[18:39] <T1> for us it's the lack of multiple active+active MDSs
[18:39] <flaf> Ah ok, but in this case, there is only one nfs server, correct?
[18:39] <T1> as well as poor (read: almost no IOPS) performance when you have more than 1000 files in a single directory
[18:39] <flaf> Yes, indeed.
[18:40] <T1> + no fsck (or equivalent)
[18:40] <T1> yeah, only a single NFS server
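A minimal sketch of the RBD-plus-NFS pattern T1 describes; the image name, size, and export line are illustrative, not T1's actual setup:

    rbd create data01 --size 102400    # 100 GB image in the default pool
    rbd map data01                     # exposes e.g. /dev/rbd0 on the NFS head
    mkfs.xfs /dev/rbd0                 # XFS inside the RBD (see T1's note below)
    mount /dev/rbd0 /export/data01
    # /etc/exports on the NFS head:
    #   /export/data01  10.0.0.0/24(rw,sync,no_subtree_check)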
[18:40] <r_await> T1, what is the typical size of a PDF and what is the order size for the rbds, if you don't mind?
[18:41] <flaf> T1: And how do you monitor your cluster? (collectd?)
[18:41] <T1> since we are using XFS inside the RBDs a crash of the server should not cause data loss (and the risk of losing all I/O for everything in that event has been accepted by management)
[18:41] <T1> the server = the NFS server
[18:42] <T1> r_await: I really can't tell you that - it depends highly on what it contains.. probably between a few hundred kb and up
[18:43] <T1> flaf: that is a work in progress..
[18:43] <T1> right now just some basic metrics through nagios
[18:43] <T1> but I have plans for a lot more
[18:43] <T1> I'm also planning on setting up calamari
[18:44] <flaf> T1 ok, I see. Thx T1 for all these explanations. ;)
[18:44] <T1> I have a few reminders on adding nagios alerts for a few important SMART metrics
[18:45] <T1> as well as making sure to get an alert if the cluster's data capacity reaches a point where IOPS would grind to a halt if a single node drops out
[18:46] <T1> hmmm.. afk
[18:46] <T1> dinnertime
[18:46] <flaf> bye. :)
[18:46] <T1> cya later
[18:46] <T1> :)
[18:47] * toabctl (~toabctl@toabctl.de) Quit (Quit: Adios)
[18:49] <r_await> flaf, do you find it takes a long time to do an ls in a cephfs directory with more than 1k files?
[18:50] <flaf> r_await: I put 1000 files in a cephfs directory and I do a "\ls /dir", correct? Not "ls -l"?
[18:50] * wjw-freebsd (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[18:50] <r_await> I've seen at least one object storage system, where the file metadata is stored with the files, on which ls, or worse ls -l, in large directories could take an hour
[18:50] * etienne (~textual@ARennes-650-1-42-118.w86-215.abo.wanadoo.fr) has joined #ceph
[18:51] <r_await> ls -l
[18:51] <flaf> r_await: let me try...
[18:51] <r_await> Is there a way to cache the metadata for fast lookup?
[18:52] * LeaChim (~LeaChim@host86-185-146-193.range86-185.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[18:52] <flaf> I don't think so. metadata is handled by the mds server and you can increase its cache, but that's on the server side.
[18:53] <flaf> but I'm not a ceph expert at all.
[18:53] <flaf> But yes ls and "ls -l" don't work well in cephfs.
[18:54] <r_await> makes sense, thank you for your input
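For reference, the server-side MDS cache flaf mentions was sized in inodes at the time; a sketch (400000 is an illustrative value; the default was 100000):

    # ceph.conf on the MDS host
    [mds]
    mds cache size = 400000    # inodes kept in the MDS cache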
[18:54] <flaf> I'm trying your ls...
[18:55] <flaf> r_await: I have created 1000 _empty_ files in a cephfs dir.
[18:56] <flaf> "ls -l" takes ~0.5s
[18:56] <flaf> And "\ls" takes ~0.008s.
[18:57] * lightspeed_ (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Remote host closed the connection)
[18:58] <r_await> in the past I've seen 1k be fine, but 50k could be very rough. I haven't tested on cephfs yet.
[18:58] <flaf> For the "ls -l", if I try from the second cephfs client, it takes ~1s the first time
[18:58] <r_await> that's a good time
[18:58] <flaf> and ~0.5s after that
[18:59] <flaf> r_await: but if there are writes to the directory from another cephfs client, the "ls -l" is terrible.
[18:59] * dyasny__ (~dyasny@dsl.198.58.155.234.ebox.ca) has joined #ceph
[19:00] <flaf> r_await: remark: my cephfsmetada pool is on SSDs only, not on spinning disks.
[19:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[19:01] <r_await> wouldn't SSD be better?
[19:01] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[19:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[19:02] <flaf> That's the point, I _use_ SSD for cephfsmetada
[19:02] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[19:03] * dyasny_ (~dyasny@dsl.dynamic.190-116-74-99.electronicbox.net) Quit (Ping timeout: 480 seconds)
[19:04] <flaf> I'm just creating 50,000 files in the directory.
[19:04] <flaf> It's long...
[19:05] <flaf> And during this work, an "ls -l" in the directory from the other cephfs client is just impossible.
[19:05] <r_await> running just a touch in a loop? or some other program?
[19:05] <flaf> But a \ls takes 0.5s.
[19:05] <flaf> r_await: a touch in a loop.
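flaf's test, as a sketch (paths and filenames are illustrative):

    mkdir -p /mnt/cephfs/test
    time for i in $(seq 1 50000); do touch /mnt/cephfs/test/f$i; done
    # from a second cephfs client, while the loop runs:
    time ls -l /mnt/cephfs/test > /dev/null    # stats every file; crawls
    time \ls /mnt/cephfs/test > /dev/null      # names only; much faster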
[19:07] * LeaChim (~LeaChim@host81-157-237-29.range81-157.btcentralplus.com) has joined #ceph
[19:08] <flaf> loop is finished
[19:08] <r_await> not bad
[19:09] <flaf> now (touch finished, no writing in the directory), "\ls" takes ~1.2 seconds
[19:09] <r_await> though it's all metadata
[19:10] <flaf> ls -l => 27s
[19:10] <flaf> Fortunately, I don't have such a directory in my use case. ;)
[19:12] <r_await> 27s, I think that's good performance for object storage
[19:12] <flaf> it seems to be a stable 27s, even on the 2nd attempt
[19:12] <flaf> Well... now cleaning ;)
[19:16] <flaf> rm -r is long............
[19:19] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) Quit (Remote host closed the connection)
[19:20] <flaf> r_await: rm -r /mnt/cephfs/test/ => ~ 6 minutes (50,000 empty files in the directory)
[19:24] * dyasny_ (~dyasny@dsl.198.58.154.14.ebox.ca) has joined #ceph
[19:27] * Fapiko (~straterra@104.238.169.56) has joined #ceph
[19:29] * dyasny__ (~dyasny@dsl.198.58.155.234.ebox.ca) Quit (Ping timeout: 480 seconds)
[19:47] * lightspeed (~lightspee@fw-carp-wan.ext.lspeed.org) has joined #ceph
[19:49] * Freeaqingme (~quassel@nl3.s.kynet.eu) has joined #ceph
[19:53] * i_m (~ivan.miro@31.207.230.243) has joined #ceph
[19:57] * Fapiko (~straterra@104.238.169.56) Quit ()
[19:59] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[20:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[20:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[20:06] <flaf> Can someone tell me if there is a command to know if an osd is marked out (or not) by monitors?
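flaf's question goes unanswered in the log; for reference, the in/out state is recorded in the OSD map:

    ceph osd dump | grep '^osd'    # each line shows up/down and in/out per OSD
    ceph osd tree                  # an out OSD is shown with a reweight of 0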
[20:08] * toabctl (~toabctl@toabctl.de) has joined #ceph
[20:19] * dyasny__ (~dyasny@dsl.198.58.154.125.ebox.ca) has joined #ceph
[20:23] * dyasny_ (~dyasny@dsl.198.58.154.14.ebox.ca) Quit (Ping timeout: 480 seconds)
[20:25] * vbellur (~vijay@2601:647:4f00:4960:5e51:4fff:fee8:6a5c) Quit (Ping timeout: 480 seconds)
[20:29] <T1> flaf: (re rm of 50k files): ouch! that is soooo slow
[20:29] <flaf> T1: yes, cephfs doesn't like directories with lots of files.
[20:31] <IcePic> few filesystems do
[20:31] <T1> it would be nice to have multiple active MDSs that partition and load-balance the hotspots of the directory tree between them, but it must still be possible to make some performance enhancements for single directories
[20:31] <IcePic> a good test of filesystems is to make a million files in a dir, and measure the time it takes to make the millionth one.
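IcePic's benchmark, sketched (the directory and filenames are illustrative):

    cd /mnt/cephfs/bigdir
    for i in $(seq 1 999999); do touch f$i; done
    time touch f1000000    # cost of one create in an already-huge directory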
[20:31] <T1> 50k files is not that many
[20:32] <T1> I've got a mailserver with one file per mail (ie. Maildir) with a few million files in some "week" folders
[20:32] <IcePic> flaf: would be neat to compare the time for rm -rf to the time to just stat each of those files.
[20:33] <T1> I can perform "cleanup" (ie. delete everything) in half a minute or so for an entire week
[20:34] <T1> and that is even just on ext4
[20:38] * mykola (~Mikolaj@91.225.200.219) has joined #ceph
[20:39] * dyasny_ (~dyasny@dsl.198.58.174.47.ebox.ca) has joined #ceph
[20:44] * dyasny__ (~dyasny@dsl.198.58.154.125.ebox.ca) Quit (Ping timeout: 480 seconds)
[20:51] <flaf> IcePic: do you mean "time rm -rf /dir" vs "time stat /dir/*"?
[20:52] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Remote host closed the connection)
[20:52] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[20:55] * dyasny__ (~dyasny@dsl.198.58.170.202.ebox.ca) has joined #ceph
[20:56] * dyasny_ (~dyasny@dsl.198.58.174.47.ebox.ca) Quit (Ping timeout: 480 seconds)
[21:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[21:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[21:05] * ade (~abradshaw@dslb-092-078-131-084.092.078.pools.vodafone-ip.de) has joined #ceph
[21:05] * ade (~abradshaw@dslb-092-078-131-084.092.078.pools.vodafone-ip.de) Quit (Remote host closed the connection)
[21:08] * dyasny__ (~dyasny@dsl.198.58.170.202.ebox.ca) Quit (Ping timeout: 480 seconds)
[21:09] * ade (~abradshaw@dslb-092-078-131-084.092.078.pools.vodafone-ip.de) has joined #ceph
[21:10] * etienne (~textual@ARennes-650-1-42-118.w86-215.abo.wanadoo.fr) Quit (Quit: Textual IRC Client: www.textualapp.com)
[21:15] * olid1982 (~olid1982@dslb-084-059-135-089.084.059.pools.vodafone-ip.de) Quit (Ping timeout: 480 seconds)
[21:17] * rendar (~I@95.233.118.222) Quit (Ping timeout: 480 seconds)
[21:19] * rendar (~I@95.233.118.222) has joined #ceph
[21:29] * i_m (~ivan.miro@31.207.230.243) Quit (Ping timeout: 480 seconds)
[21:29] * shaunm (~shaunm@208.102.161.229) Quit (Ping timeout: 480 seconds)
[21:30] * Salamander_ (~elt@46.166.188.237) has joined #ceph
[21:51] * Anticimex (anticimex@185.19.66.194) Quit (Quit: apt-get dist-upgrade)
[21:54] * ade (~abradshaw@dslb-092-078-131-084.092.078.pools.vodafone-ip.de) Quit (Quit: Too sexy for his shirt)
[22:00] * Salamander_ (~elt@76GAAAXK8.tor-irc.dnsbl.oftc.net) Quit ()
[22:00] <IcePic> I don't think stat has a recursive mode like that, but a "find /dir -exec stat {} \; > /dev/null" should suffice
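The comparison IcePic and T1 discuss, side by side (a sketch; /dir stands in for the test directory):

    time find /dir -exec stat {} \; > /dev/null    # metadata reads only
    time rm -rf /dir                               # the same traversal plus unlinks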
[22:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[22:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[22:07] * Anticimex (anticimex@185.19.66.194) has joined #ceph
[22:10] * xarses_ (~xarses@rrcs-76-79-238-170.west.biz.rr.com) has joined #ceph
[22:10] * xarses (~xarses@rrcs-76-79-238-170.west.biz.rr.com) Quit (Ping timeout: 480 seconds)
[22:16] * olid1982 (~olid1982@aftr-185-17-206-143.dynamic.mnet-online.de) has joined #ceph
[22:18] * Nijikokun (~loft@lqfb.piraten-lsa.de) has joined #ceph
[22:23] * i_m (~ivan.miro@88.206.113.199) has joined #ceph
[22:48] * Nijikokun (~loft@76GAAAXMW.tor-irc.dnsbl.oftc.net) Quit ()
[22:53] * ilken (~unknown@2602:63:c2a2:af00:2dfa:aba6:3063:ad27) Quit (Ping timeout: 480 seconds)
[22:58] * dgbaley27 (~matt@75.148.118.217) has joined #ceph
[23:01] * haomaiwang (~haomaiwan@103.15.217.218) Quit (Remote host closed the connection)
[23:01] * haomaiwang (~haomaiwan@103.15.217.218) has joined #ceph
[23:08] * vata (~vata@cable-21.246.173-197.electronicbox.net) has joined #ceph
[23:16] * danieagle (~Daniel@187.34.2.79) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[23:18] * mykola (~Mikolaj@91.225.200.219) Quit (Quit: away)
[23:29] * T1 (~the_one@87.104.212.66) Quit (Quit: Where did the client go?)
[23:38] * aeroevan (~aeroevan@00015f77.user.oftc.net) Quit (Quit: ZNC 1.6.1 - http://znc.in)
[23:40] * aeroevan (~aeroevan@00015f77.user.oftc.net) has joined #ceph
[23:44] * i_m (~ivan.miro@88.206.113.199) Quit (Ping timeout: 480 seconds)
[23:48] * LDA (~DM@host217-114-156-249.pppoe.mark-itt.net) Quit (Quit: Nettalk6 - www.ntalk.de)
[23:58] * Geoffrey (~geoffrey@169-0-138-190.ip.afrihost.co.za) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.