#ceph IRC Log

Index

IRC Log for 2016-06-06

Timestamps are in GMT/BST.

[0:01] * rendar (~I@host4-179-dynamic.23-79-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)
[0:07] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[0:07] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[0:11] * badone (~badone@66.187.239.16) has joined #ceph
[0:11] * agsha (~agsha@124.40.246.234) has joined #ceph
[0:19] * agsha (~agsha@124.40.246.234) Quit (Ping timeout: 480 seconds)
[0:21] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) has joined #ceph
[0:27] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) Quit (Quit: Miouge)
[0:29] <flaf> I have just migrated to jewel on Trusty but if I try to restart an osd I have an error in upstart => /proc/self/fd/9: 8: /proc/self/fd/9: /usr/libexec/ceph/ceph-osd-prestart.sh: not found
[0:29] <flaf> In fact, the osd is not started.
[0:30] <flaf> If I try to start the osd in foreground, no problem.
[0:31] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:3df9:f187:182b:27cf) has joined #ceph
[0:39] <flaf> this "/usr/libexec/ceph/ceph-osd-prestart.sh" is very strange.
[0:40] <flaf> In /etc/init/ceph-osd.conf I have /usr/lib/ceph/ceph-osd-prestart.sh but not /usr/libexec/ceph/ceph-osd-prestart.sh...?
[0:49] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) Quit (Ping timeout: 480 seconds)
[0:54] * LeaChim (~LeaChim@host86-168-126-119.range86-168.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[0:57] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:3df9:f187:182b:27cf) Quit (Ping timeout: 480 seconds)
[1:11] * musca (musca@tyrael.eu) has left #ceph
[1:13] * _s1gma (~neobenedi@46.166.186.250) has joined #ceph
[1:20] * oms101 (~oms101@p20030057EA105F00C6D987FFFE4339A1.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:28] * oms101 (~oms101@p20030057EA13E100C6D987FFFE4339A1.dip0.t-ipconnect.de) has joined #ceph
[1:34] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[1:43] * _s1gma (~neobenedi@4MJAAF09U.tor-irc.dnsbl.oftc.net) Quit ()
[1:43] * Tonux1 (~Scaevolus@watchme.tor-exit.network) has joined #ceph
[1:52] <flaf> it's like the init system of trusty is using the old version of /etc/init/ceph-osd.conf (from the infernalis package).
[2:10] * agsha (~agsha@124.40.246.234) has joined #ceph
[2:13] * Tonux1 (~Scaevolus@06SAADJ3S.tor-irc.dnsbl.oftc.net) Quit ()
[2:13] * Aramande_ (~Eric@tor.piratenpartei-nrw.de) has joined #ceph
[2:18] * agsha (~agsha@124.40.246.234) Quit (Ping timeout: 480 seconds)
[2:28] * grauzikas (grauzikas@78-56-222-78.static.zebra.lt) Quit (Ping timeout: 480 seconds)
[2:32] * yanzheng (~zhyan@125.70.23.87) has joined #ceph
[2:33] * hellertime (~Adium@pool-173-48-155-219.bstnma.fios.verizon.net) has joined #ceph
[2:37] * yanzheng (~zhyan@125.70.23.87) Quit (Quit: This computer has gone to sleep)
[2:43] * Aramande_ (~Eric@06SAADJ4W.tor-irc.dnsbl.oftc.net) Quit ()
[2:48] * drupal (~Zeis@06SAADJ6E.tor-irc.dnsbl.oftc.net) has joined #ceph
[3:03] * georgem (~Adium@104-222-119-175.cpe.teksavvy.com) has joined #ceph
[3:05] * georgem (~Adium@104-222-119-175.cpe.teksavvy.com) Quit ()
[3:05] * georgem (~Adium@206.108.127.16) has joined #ceph
[3:07] * ircuser-1 (~Johnny@158.183-62-69.ftth.swbr.surewest.net) has joined #ceph
[3:13] * dalegaard-39554 (~dalegaard@vps.devrandom.dk) Quit (Ping timeout: 480 seconds)
[3:18] * drupal (~Zeis@06SAADJ6E.tor-irc.dnsbl.oftc.net) Quit ()
[3:18] * TheDoudou_a (~N3X15@64.ip-37-187-176.eu) has joined #ceph
[3:20] * natarej (~natarej@101.188.54.14) has joined #ceph
[3:20] * NTTEC (~nttec@122.53.162.158) has joined #ceph
[3:24] * EinstCrazy (~EinstCraz@58.247.119.250) has joined #ceph
[3:25] * natarej_ (~natarej@101.188.54.14) Quit (Ping timeout: 480 seconds)
[3:29] * sudocat (~dibarra@2602:306:8bc7:4c50:cc78:32ee:7fc3:d8f4) has joined #ceph
[3:44] * aj__ (~aj@x4db1e739.dyn.telefonica.de) has joined #ceph
[3:48] * TheDoudou_a (~N3X15@06SAADJ7L.tor-irc.dnsbl.oftc.net) Quit ()
[3:48] * capitalthree (~Sketchfil@94.242.195.186) has joined #ceph
[3:52] * derjohn_mobi (~aj@x4db2a8c3.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[4:00] * hellertime (~Adium@pool-173-48-155-219.bstnma.fios.verizon.net) Quit (Quit: Leaving.)
[4:00] * yanzheng (~zhyan@125.70.23.87) has joined #ceph
[4:08] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[4:12] * flisky (~Thunderbi@210.12.157.91) has joined #ceph
[4:13] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[4:18] * capitalthree (~Sketchfil@4MJAAF1FU.tor-irc.dnsbl.oftc.net) Quit ()
[4:18] * Nanobot (~Guest1390@tor.yrk.urgs.uk0.bigv.io) has joined #ceph
[4:18] * wgao (~wgao@106.120.101.38) Quit (Ping timeout: 480 seconds)
[4:19] * wgao (~wgao@106.120.101.38) has joined #ceph
[4:31] * shyu (~shyu@218.241.172.114) has joined #ceph
[4:32] * ReSam (ReSam@0001fe39.user.oftc.net) Quit (Quit: ZNC - http://znc.in)
[4:33] * ReSam (ReSam@catrobat-irc.ist.tu-graz.ac.at) has joined #ceph
[4:41] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) has joined #ceph
[4:46] * NTTEC (~nttec@122.53.162.158) Quit (Remote host closed the connection)
[4:48] * Nanobot (~Guest1390@06SAADJ9O.tor-irc.dnsbl.oftc.net) Quit ()
[4:48] * Rosenbluth (~Throlkim@tor-exit1-readme.dfri.se) has joined #ceph
[4:54] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[4:54] <IvanJobs> hi, cephers
[4:55] <IvanJobs> when I use "ceph-deploy rgw create ceph-node1" to create a RGW instance, error occurs saying "config file /etc/ceph/ceph.conf exists with different content"
[4:56] <IvanJobs> I guess it means "/etc/ceph/ceph.conf" is different from the current installation dir's ceph.conf.
[4:57] <IvanJobs> but when I replace current installation dir's ceph.conf with /etc/ceph/ceph.conf, the error remains.
[4:57] <IvanJobs> why? is my guess wrong? I even cp with --preserve=all to make timestamp the same.
[4:58] * aarontc (~aarontc@2001:470:e893::1:1) Quit (Quit: Bye!)
[4:59] <IvanJobs> any help, would be appreciated.
[5:06] <badone> IvanJobs: you can use ceph-deploy to copy out the conf...
[5:07] <IvanJobs> badone: you mean, I can use ceph-deploy to do copying from /etc/ceph/ceph.conf to current directory?
[5:08] <badone> IvanJobs: to copy from admin directory to /etc/ceph/
[5:08] <badone> ceph-deploy config push ...
[5:08] <IvanJobs> badone, yep, I know that one, the config push command.
[5:09] <IvanJobs> I need kind of config pull.
[5:09] <badone> why?
[5:10] <IvanJobs> because I think /etc/ceph/ceph.conf is better than the config file in the current dir. Thx anyway, there is a config pull command in ceph-deploy
[5:10] <IvanJobs> I will try that.
[5:11] * flisky1 (~Thunderbi@210.12.157.86) has joined #ceph
[5:13] * flisky (~Thunderbi@210.12.157.91) Quit (Ping timeout: 480 seconds)
[5:13] * flisky1 is now known as flisky
[5:13] <badone> IvanJobs: how did they get out of sync?
[5:15] * aarontc (~aarontc@2001:470:e893::1:1) has joined #ceph
[5:18] * Rosenbluth (~Throlkim@06SAADKAS.tor-irc.dnsbl.oftc.net) Quit ()
[5:18] <IvanJobs> badone: maybe some editing ops on ceph-admin's ceph.conf made their content different. Here the key problem is that it keeps saying "[ceph_deploy.rgw][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite", and I don't want to use "--overwrite-conf" before I know exactly what this option does.
[5:20] <badone> IvanJobs: fair enough, you could make a backup and test the various options like removing the ceph.conf after making a backup, etc...
[5:22] <badone> IvanJobs: if the copy on the rgw is the "best" copy then I would be inclined to copy that to the ceph-deploy directory and push it out to all nodes
[5:22] * flisky (~Thunderbi@210.12.157.86) Quit (Remote host closed the connection)
[5:24] * flisky (~Thunderbi@106.38.61.190) has joined #ceph
[5:25] <IvanJobs> badone: thx anyway, i will check ceph-deploy's source code to find out what ceph-deploy --overwrite-conf does and how to check conf file content differences.
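A minimal sketch of the two ceph-deploy sync directions discussed above, run from the admin (ceph-deploy) working directory; the node name ceph-node1 is only an example, and both subcommands exist in ceph-deploy:

    ceph-deploy config pull ceph-node1                    # copy the node's /etc/ceph/ceph.conf into the working directory
    ceph-deploy --overwrite-conf config push ceph-node1   # push the working directory's ceph.conf out to /etc/ceph/ on the node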
[5:27] * Vacuum__ (~Vacuum@88.130.217.57) has joined #ceph
[5:32] * flisky (~Thunderbi@106.38.61.190) Quit (Ping timeout: 480 seconds)
[5:33] * Vacuum_ (~Vacuum@88.130.214.130) Quit (Ping timeout: 480 seconds)
[5:38] * linuxkidd (~linuxkidd@ip70-189-207-54.lv.lv.cox.net) has joined #ceph
[5:48] * Solvius (~MatthewH1@chulak.enn.lu) has joined #ceph
[5:50] <badone> IvanJobs: sure
[5:51] * flisky (~Thunderbi@106.38.61.181) has joined #ceph
[5:56] * overclk (~quassel@117.202.96.41) has joined #ceph
[5:58] <flaf> I said => it's like the init system of trusty is using the old version of /etc/init/ceph-osd.conf (from the infernalis package) instead of the current version (from the jewel package).
[5:58] <flaf> Very curious, I'm pretty sure of this assertion.
[5:59] <flaf> It was the case with "start ceph-osd id=$id", I'm sure.
[6:00] <flaf> But, I don't know why, no problem with "start ceph-osd-all".
[6:00] <flaf> Why...? I don't know. Fuck upstart! ;)
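For reference, a minimal sketch of the upstart commands flaf is comparing on Trusty; the OSD id is hypothetical, and after a package upgrade it may also be worth asking upstart to re-read its job files:

    sudo initctl reload-configuration   # make upstart re-read /etc/init/*.conf
    sudo start ceph-osd id=2            # start a single OSD instance (id is an example)
    sudo start ceph-osd-all             # start every OSD configured on the host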
[6:02] * sankarshan (~sankarsha@121.244.87.117) has joined #ceph
[6:07] * kefu (~kefu@ec2-52-193-227-229.ap-northeast-1.compute.amazonaws.com) has joined #ceph
[6:14] * flisky1 (~Thunderbi@210.12.157.86) has joined #ceph
[6:14] * linuxkidd (~linuxkidd@ip70-189-207-54.lv.lv.cox.net) Quit (Quit: Leaving)
[6:18] * Solvius (~MatthewH1@06SAADKC5.tor-irc.dnsbl.oftc.net) Quit ()
[6:19] * flisky (~Thunderbi@106.38.61.181) Quit (Ping timeout: 480 seconds)
[6:19] * flisky1 is now known as flisky
[6:24] * rdas (~rdas@121.244.87.116) has joined #ceph
[6:35] * aj__ (~aj@x4db1e739.dyn.telefonica.de) Quit (Ping timeout: 480 seconds)
[6:39] <IvanJobs> badone: I found it.
[6:40] <IvanJobs> there is a bug in ceph-deploy, and it hasn't been fixed. this PR attempts to fix it, but ran into a dependency problem. https://github.com/ceph/ceph-deploy/pull/207
[6:41] <IvanJobs> so if you can confirm that the confs are the same, use --overwrite-conf anyway.
[6:46] * matj345314 (~matj34531@element.planetq.org) has joined #ceph
[6:50] * matj345314 (~matj34531@element.planetq.org) Quit ()
[6:52] * vikhyat (~vumrao@121.244.87.116) has joined #ceph
[6:52] <badone> IvanJobs: nice find, I wasn't aware of that one
[6:53] * mog_ (~zviratko@192.42.115.101) has joined #ceph
[6:53] <IvanJobs> badone: so it is like this: it compares two files, one with the original content and one with lines stripped and spaces/underscores normalized, so they mostly end up not the same even with the same content.
[6:54] <badone> verifiably not identical though so the bug is understandable
[6:54] <IvanJobs> the only difference is that all config items in the conf file use underscores and each line has no leading or trailing space chars.
[6:54] * gauravbafna (~gauravbaf@122.167.72.77) has joined #ceph
[6:55] <badone> IvanJobs: I understand what you are saying but they still differ in content
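A rough way to check whether two ceph.conf files differ only by the whitespace/underscore normalization IvanJobs describes; this is only a sketch (it also rewrites underscores inside values) and the paths are examples:

    normalize() { sed -e 's/_/ /g' -e 's/^[[:space:]]*//;s/[[:space:]]*$//' "$1"; }
    diff <(normalize /etc/ceph/ceph.conf) <(normalize ~/my-cluster/ceph.conf)   # empty output means they only differ cosmetically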
[6:57] * geli12 (~geli@1.136.96.234) has joined #ceph
[6:58] * vikhyat is now known as vikhyat|brb
[7:00] * dgbaley27 (~matt@2601:b00:c600:f800:45e0:8dd4:ec1:17bb) has joined #ceph
[7:03] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Ping timeout: 480 seconds)
[7:05] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Remote host closed the connection)
[7:06] * kefu is now known as kefu|afk
[7:07] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[7:11] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[7:15] * TomasCZ (~TomasCZ@yes.tenlab.net) Quit (Quit: Leaving)
[7:17] * kefu|afk is now known as kefu
[7:18] * naga1 (~oftc-webi@15.219.201.83) has joined #ceph
[7:19] * vikhyat|brb is now known as vikhyat
[7:22] * mog_ (~zviratko@7V7AAFRXY.tor-irc.dnsbl.oftc.net) Quit ()
[7:23] * gauravba_ (~gauravbaf@122.172.203.166) has joined #ceph
[7:24] * dalegaard-39554 (~dalegaard@vps.devrandom.dk) has joined #ceph
[7:28] * gauravbafna (~gauravbaf@122.167.72.77) Quit (Ping timeout: 480 seconds)
[7:29] * Atomizer (~TehZomB@chomsky.torservers.net) has joined #ceph
[7:31] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[7:33] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:33] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Remote host closed the connection)
[7:33] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:33] * deepthi (~deepthi@106.216.177.46) has joined #ceph
[7:33] * rotbeard (~redbeard@185.32.80.238) has joined #ceph
[7:43] * agsha (~agsha@121.244.155.9) has joined #ceph
[7:50] * skyrat (~skyrat@94.230.156.78) has joined #ceph
[7:50] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:51] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:51] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:51] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:51] * kefu (~kefu@ec2-52-193-227-229.ap-northeast-1.compute.amazonaws.com) Quit (Quit: My Mac has gone to sleep. ZZZzzz…)
[7:51] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:51] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:51] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:52] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:52] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:52] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:52] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:53] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:53] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:54] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:54] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:54] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:54] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:54] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:54] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:55] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:55] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:55] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:55] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:56] <skyrat> Hi, is there any way to re-activate OSD on Infernalis? I see the option was added in Jewel.
[7:57] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:57] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:57] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:57] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:57] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:57] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:58] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[7:58] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:58] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:58] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[7:58] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[7:58] * Atomizer (~TehZomB@4MJAAF1N4.tor-irc.dnsbl.oftc.net) Quit ()
[7:58] * Nanobot (~qable@edwardsnowden2.torservers.net) has joined #ceph
[7:59] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[8:02] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[8:02] * itamarl is now known as Guest3301
[8:02] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:04] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) has joined #ceph
[8:05] * Kurt (~Adium@2001:628:1:5:e098:843e:2a6b:f76) has joined #ceph
[8:05] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[8:06] * Guest3301 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:09] * itamarl is now known as Guest3302
[8:09] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:11] * Guest3302 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:16] * itamarl is now known as Guest3303
[8:16] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:18] * kefu (~kefu@183.193.187.174) has joined #ceph
[8:19] * itamarl is now known as Guest3304
[8:19] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:21] * Guest3303 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:22] * karnan (~karnan@121.244.87.117) has joined #ceph
[8:22] * karnan_ (~karnan@121.244.87.117) has joined #ceph
[8:23] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[8:24] * matj345314 (~matj34531@141.255.254.208) has joined #ceph
[8:24] * Guest3304 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:25] * kefu_ (~kefu@114.92.104.47) has joined #ceph
[8:25] * itamarl is now known as Guest3305
[8:25] * itamarl_ (~itamarl@194.90.7.244) has joined #ceph
[8:25] * itamarl_ is now known as itamarl
[8:26] * kefu_ (~kefu@114.92.104.47) Quit ()
[8:27] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Ping timeout: 480 seconds)
[8:28] * Guest3305 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:28] * Nanobot (~qable@06SAADKHG.tor-irc.dnsbl.oftc.net) Quit ()
[8:28] * mohmultihouse (~mohmultih@gw01.mhitp.dk) has joined #ceph
[8:28] * W|ldCraze (~danielsj@watchme.tor-exit.network) has joined #ceph
[8:29] * karnan_ (~karnan@121.244.87.117) Quit (Quit: Leaving)
[8:30] * kefu (~kefu@183.193.187.174) Quit (Ping timeout: 480 seconds)
[8:30] * itamarl is now known as Guest3307
[8:30] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:31] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[8:32] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[8:33] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[8:33] * Guest3307 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:35] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) has joined #ceph
[8:36] * dugravot6 (~dugravot6@4cy54-1-88-187-244-6.fbx.proxad.net) has joined #ceph
[8:36] * itamarl is now known as Guest3309
[8:36] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:37] * b0e (~aledermue@213.95.25.82) has joined #ceph
[8:38] * Guest3309 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:39] * flisky (~Thunderbi@210.12.157.86) Quit (Quit: flisky)
[8:41] * itamarl is now known as Guest3310
[8:41] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:43] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Remote host closed the connection)
[8:44] * Guest3310 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:44] * naga1 (~oftc-webi@15.219.201.83) Quit (Ping timeout: 480 seconds)
[8:45] * dugravot6 (~dugravot6@4cy54-1-88-187-244-6.fbx.proxad.net) Quit (Quit: Leaving.)
[8:53] * itamarl is now known as Guest3312
[8:53] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[8:57] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[8:57] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[8:58] * Guest3312 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[8:58] * W|ldCraze (~danielsj@7V7AAFR1F.tor-irc.dnsbl.oftc.net) Quit ()
[9:00] * straterra (~CydeWeys@tor-exit.insane.us.to) has joined #ceph
[9:03] * gauravba_ (~gauravbaf@122.172.203.166) Quit (Remote host closed the connection)
[9:04] * itamarl is now known as Guest3314
[9:04] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:07] * Guest3314 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:09] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[9:10] * kefu (~kefu@27.115.92.94) has joined #ceph
[9:12] * analbeard (~shw@support.memset.com) has joined #ceph
[9:13] * itamarl is now known as Guest3316
[9:13] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:14] * Guest3316 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:18] * itamarl is now known as Guest3317
[9:18] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:19] * kefu (~kefu@27.115.92.94) Quit (Ping timeout: 480 seconds)
[9:21] * Guest3317 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:24] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) has joined #ceph
[9:27] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[9:28] * straterra (~CydeWeys@7V7AAFR2U.tor-irc.dnsbl.oftc.net) Quit ()
[9:29] * rdas (~rdas@121.244.87.116) has joined #ceph
[9:32] * itamarl is now known as Guest3320
[9:32] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:34] * Guest3320 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:37] * itamarl is now known as Guest3322
[9:37] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:40] * Guest3322 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:41] * itamarl is now known as Guest3323
[9:41] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:43] * ceph_ (~chris@180.168.197.82) has joined #ceph
[9:43] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[9:43] * ceph_ (~chris@180.168.197.82) Quit (Read error: Connection reset by peer)
[9:46] * Guest3323 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:47] * itamarl is now known as Guest3324
[9:47] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:48] * chengpeng (~chris@180.168.170.2) has joined #ceph
[9:49] * Guest3324 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:51] * dgurtner (~dgurtner@82.199.64.68) has joined #ceph
[9:53] * itamarl is now known as Guest3325
[9:53] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[9:55] * Guest3325 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[9:59] * Xylios (~tokie@163.172.129.70) has joined #ceph
[10:00] * dugravot6 (~dugravot6@194.199.223.4) has joined #ceph
[10:02] * itamarl is now known as Guest3327
[10:02] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:04] * chopmann (~sirmonkey@2a02:8108:8b40:5600::2) has joined #ceph
[10:05] * Guest3327 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:07] * itamarl is now known as Guest3328
[10:07] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:09] * dvanders (~dvanders@130.246.253.64) has joined #ceph
[10:09] * dugravot61 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) has joined #ceph
[10:10] * Guest3328 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:10] * allaok (~allaok@machine107.orange-labs.com) Quit (Quit: Leaving.)
[10:11] * dugravot6 (~dugravot6@194.199.223.4) Quit (Ping timeout: 480 seconds)
[10:11] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[10:12] * itamarl is now known as Guest3329
[10:12] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:14] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Ping timeout: 480 seconds)
[10:15] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) Quit (Quit: Miouge)
[10:15] * Guest3329 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:16] * gauravbafna (~gauravbaf@49.44.57.225) has joined #ceph
[10:17] * rraja (~rraja@121.244.87.117) has joined #ceph
[10:18] * itamarl is now known as Guest3330
[10:18] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:19] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) has joined #ceph
[10:20] * T1w (~jens@node3.survey-it.dk) has joined #ceph
[10:21] * Guest3330 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:23] * rendar (~I@host159-59-dynamic.22-79-r.retail.telecomitalia.it) has joined #ceph
[10:23] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) has joined #ceph
[10:24] * itamarl is now known as Guest3331
[10:24] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:25] * flisky (~Thunderbi@210.12.157.88) has joined #ceph
[10:26] * gauravba_ (~gauravbaf@49.32.0.228) has joined #ceph
[10:26] * Guest3331 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:28] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[10:28] * gauravba_ (~gauravbaf@49.32.0.228) Quit ()
[10:28] * Xylios (~tokie@7V7AAFR5I.tor-irc.dnsbl.oftc.net) Quit ()
[10:29] * LeaChim (~LeaChim@host86-148-117-255.range86-148.btcentralplus.com) has joined #ceph
[10:29] * itamarl is now known as Guest3332
[10:29] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:32] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) Quit (Ping timeout: 480 seconds)
[10:32] * Guest3332 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:33] * EinstCra_ (~EinstCraz@58.247.119.250) has joined #ceph
[10:33] * EinstCrazy (~EinstCraz@58.247.119.250) Quit (Read error: Connection reset by peer)
[10:33] * gauravbafna (~gauravbaf@49.44.57.225) Quit (Ping timeout: 480 seconds)
[10:34] * itamarl is now known as Guest3336
[10:34] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:34] * natarej (~natarej@101.188.54.14) Quit (Read error: Connection reset by peer)
[10:35] * maku (~allenmelo@tollana.enn.lu) has joined #ceph
[10:35] * bara (~bara@nat-pool-brq-t.redhat.com) has joined #ceph
[10:37] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) has joined #ceph
[10:38] * Guest3336 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:40] * chopmann (~sirmonkey@2a02:8108:8b40:5600::2) Quit (Quit: chopmann)
[10:41] <DaveOD> Hey
[10:41] * dlan (~dennis@116.228.88.131) Quit (Remote host closed the connection)
[10:42] * Tene (~tene@173.13.139.236) Quit (Ping timeout: 480 seconds)
[10:42] <DaveOD> Anybody can help me with a 1 OSD Full which is causing my CephFS to display: no space left on device
[10:42] <DaveOD> I have adjusted the full ratio to 0.98 temporarily
[10:42] <DaveOD> and i have reweighted the one OSD in crush
[10:42] <DaveOD> so it's rebalancing the data
[10:43] <DaveOD> but I want to make the CephFS cluster available again
[10:43] <DaveOD> it looks like adjusting the full ratio to 98% did not temporarily remove the full OSD message
[10:43] <DaveOD> which is causing the no space left on device
[10:43] <DaveOD> any idea how to proceed without having to wait for the rebalance?
[10:47] * itamarl is now known as Guest3337
[10:47] * itamarl (~itamarl@194.90.7.244) has joined #ceph
[10:47] * dlan (~dennis@116.228.88.131) has joined #ceph
[10:48] * Guest3337 (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[10:49] * TMM (~hp@185.5.121.201) has joined #ceph
[10:52] * ade (~abradshaw@85.158.226.30) has joined #ceph
[10:56] * dugravot61 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) Quit (Quit: Leaving.)
[10:57] * allaok (~allaok@machine107.orange-labs.com) has left #ceph
[10:58] * dugravot6 (~dugravot6@194.199.223.4) has joined #ceph
[10:58] * dugravot6 (~dugravot6@194.199.223.4) Quit (Remote host closed the connection)
[11:00] * dugravot6 (~dugravot6@194.199.223.4) has joined #ceph
[11:02] * itamarl (~itamarl@194.90.7.244) Quit (Ping timeout: 480 seconds)
[11:03] * agsha_ (~agsha@121.244.155.15) has joined #ceph
[11:03] * dvanders (~dvanders@130.246.253.64) Quit (Ping timeout: 480 seconds)
[11:04] * matj345314 (~matj34531@141.255.254.208) Quit (Quit: matj345314)
[11:04] * maku (~allenmelo@4MJAAF1WE.tor-irc.dnsbl.oftc.net) Quit ()
[11:06] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[11:06] * allaok (~allaok@machine107.orange-labs.com) Quit ()
[11:06] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[11:08] * agsha (~agsha@121.244.155.9) Quit (Ping timeout: 480 seconds)
[11:09] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[11:10] * thomnico (~thomnico@2a01:e35:8b41:120:147a:ce20:a39f:4a17) has joined #ceph
[11:12] * matj345314 (~matj34531@141.255.254.208) has joined #ceph
[11:13] * dcwangmit01_ (~dcwangmit@162-245.23-239.PUBLIC.monkeybrains.net) has joined #ceph
[11:14] * dcwangmit01 (~dcwangmit@162-245.23-239.PUBLIC.monkeybrains.net) Quit (Ping timeout: 480 seconds)
[11:14] * matj345314 (~matj34531@141.255.254.208) Quit ()
[11:16] * jordanP (~jordan@204.13-14-84.ripe.coltfrance.com) Quit (Quit: Leaving)
[11:17] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[11:20] * danieagle (~Daniel@177.188.65.64) Quit (Quit: Obrigado por Tudo! :-) inte+ :-))
[11:21] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:3df9:f187:182b:27cf) has joined #ceph
[11:25] * swami1 (~swami@49.32.0.164) has joined #ceph
[11:29] * rakeshgm (~rakesh@121.244.87.118) has joined #ceph
[11:30] * sankarshan (~sankarsha@121.244.87.117) Quit (Quit: Are you sure you want to quit this channel (Cancel/Ok) ?)
[11:33] <IvanJobs> DaveOD: I didn't get it. what do you mean by "causing my CephFS to display"?
[11:35] * skney1 (~sardonyx@192.87.28.28) has joined #ceph
[11:37] * agsha (~agsha@121.244.155.9) has joined #ceph
[11:40] <DaveOD> IvanJobs: touch /mountpointCephFS/test.txt
[11:40] <DaveOD> that displays: no space left on device
[11:40] <DaveOD> because 1 OSD has reached osd full ratio
[11:41] * chengpeng (~chris@180.168.170.2) Quit (Ping timeout: 480 seconds)
[11:42] * chengpeng (~chris@180.168.197.82) has joined #ceph
[11:44] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Remote host closed the connection)
[11:44] * agsha_ (~agsha@121.244.155.15) Quit (Ping timeout: 480 seconds)
[11:50] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[11:51] * rakeshgm (~rakesh@121.244.87.118) Quit (Ping timeout: 480 seconds)
[11:55] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[11:56] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[11:57] * wgao (~wgao@106.120.101.38) Quit (Remote host closed the connection)
[12:01] * rakeshgm (~rakesh@121.244.87.117) has joined #ceph
[12:04] * skney1 (~sardonyx@7V7AAFR8Z.tor-irc.dnsbl.oftc.net) Quit ()
[12:13] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[12:14] * secate (~Secate@dsl-197-245-164-90.voxdsl.co.za) has joined #ceph
[12:15] * kefu (~kefu@183.193.187.174) has joined #ceph
[12:19] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[12:19] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[12:20] <raeven> does a ceph client use multiple connections to a single node or is it just one?
[12:20] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[12:20] <raeven> I am checking if a single client can benefit from lacp
[12:21] * allaok (~allaok@machine107.orange-labs.com) Quit (Remote host closed the connection)
[12:22] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[12:23] <badone> raeven: it will connect to multiple OSDs and each connection is a pair, one send, one receive thread
[12:23] <raeven> badone: Thanks for the quick response.
[12:24] <badone> raeven: np. There's *some* details here, http://docs.ceph.com/docs/master/dev/index-old/
[12:24] * dvanders (~dvanders@130.246.253.64) has joined #ceph
[12:24] <badone> raeven: #ceph-devel may be more appropriate, depending how deep you want to go
[12:26] <badone> if you want to look at source look at ./src/msg/ and at the Simple messenger to begin with
[12:26] <badone> a new implementation, async messenger is coming, so keep that in mind going forward
[12:26] <badone> HTH
[12:28] <BranchPredictor> raeven: depends on how much data you're going to transfer
[12:28] <BranchPredictor> and what links are you going to use (1gbps, 10gbps, faster? slower?)
[12:29] <raeven> BranchPredictor: I'm using 2x1gbps at the moment
[12:29] <badone> true
[12:29] <raeven> BranchPredictor: This is a home lab so there is not much data to be moved
[12:30] <BranchPredictor> raeven: then you should benefit, assuming your storage can attain more than 128MB/s of sustained read/write/both
[12:31] <raeven> reads should be no problem, the writes can be hurting but thats why i am playing with it. Breaking ceph one piece at a time.
[12:32] <BranchPredictor> writes will always be slower, but having more bandwidth for writes won't hurt
[12:33] <BranchPredictor> anyway
[12:33] <raeven> right now my reads are hurting because my zfs storage server is doing a scrub, will play with ceph more after it is done.
[12:34] <BranchPredictor> on 3 osds on loopback and with cached rados bench read --no-verify, I attain over 2GB/s (~2250MB/s) on a core i7-6700, so I'm pretty sure you'll see a difference.
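The kind of rados bench runs being compared here, sketched with an example pool name, duration and thread count; --no-verify is only available in newer rados builds:

    rados bench -p rbd 60 write -t 16 --no-cleanup   # write objects and keep them for the read test
    rados bench -p rbd 60 seq -t 16 --no-verify      # sequential read benchmark against those objects
    rados -p rbd cleanup                             # remove the benchmark objects afterwards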
[12:35] * sardonyx (~Hazmat@06SAADKSC.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:36] <Be-El> BranchPredictor: so an osd should be able to saturate a link if the data is available in the page cache?
[12:37] <BranchPredictor> Be-El: yes, assuming your cpu can keep up
[12:37] <BranchPredictor> 2250
[12:38] <BranchPredictor> 2250MB/s equals around 17.5 Gbps
[12:38] <Be-El> BranchPredictor: we have a setup with 9 hosts with 12 nl-sas disks, 40GB frontend network. a client with 2x10GB lacp bond is not able to transfer more than 500MB/s after multiple rados bench runs
[12:38] <Be-El> cpus are more or less idle in the storage hosts
[12:39] <Be-El> and 500MB/s single threaded, but it does not scale well with number of threads
[12:39] <BranchPredictor> Be-El: try rados bench --no-verify (if yours has one)
[12:40] <Be-El> the storage hosts have 128gb ram, so the benchmark set should fit into memory
[12:40] <BranchPredictor> have you tried iperf to rule out any network issues
[12:41] <BranchPredictor> ?
[12:41] <Be-El> iperf gives 10GBit/s, so it can saturate a link
[12:41] <Be-El> i get similar speeds around 400-500 MB/s if i use another storage host, so there's a 40GB link in between (same switch)
[12:42] <BranchPredictor> Be-El: no idea about your cluster/crushmap, but you may be saturating osds.
[12:43] <Be-El> BranchPredictor: do you know about any documentation for the osd perf counters?
[12:43] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:43] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:43] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:43] <vikhyat> DaveOD: if your cluster is marked full you need to somehow manage to come out of *full* status; it could be adding new osds or changing the *full* ratio
[12:44] <vikhyat> DaveOD: how did you do this "adjusting the full ratio to 98%"
[12:44] <vikhyat> ?
[12:44] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:44] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:44] <BranchPredictor> Be-El: not from top of my head, sorry
[12:44] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:44] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:44] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:44] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:45] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:45] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:45] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:45] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:45] <Be-El> BranchPredictor: that's a rados write bench: http://pastebin.com/5FrMDB6G
[12:45] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:45] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:46] <Be-El> with 10 threads, so it is able to saturate at least one 10GB link
[12:46] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:46] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:46] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:46] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[12:46] <BranchPredictor> Be-El: yeah, but as I said, writes are always slower than reads.
[12:46] <BranchPredictor> Be-El: try with read bench.
[12:47] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[12:47] <Be-El> BranchPredictor: single threaded directly after writing: http://pastebin.com/gaEvRDdg
[12:48] <BranchPredictor> Be-El: not bad, for single thread.
[12:48] <BranchPredictor> try 8 or 16.
[12:50] <Be-El> BranchPredictor: it may not be bad, but data stored in page cache should be served faster
[12:51] <Be-El> 8 threads: http://pastebin.com/DEma4jPS
[12:51] * onyb (~ani07nov@112.133.232.30) has joined #ceph
[12:51] <BranchPredictor> Be-El: 20 8 1746 1738 347.55 0 - 0.0756305
[12:52] <BranchPredictor> Be-El: for some reason it doesn't keep up with load
[12:52] <BranchPredictor> Be-El: try sshing into hosts that participate in that load and monitoring it with nmon
[12:52] <Be-El> and it keeps on printing that line every second
[12:53] <BranchPredictor> Be-El: if cpu usage is mostly blue, then you're saturating drives, not pagecache.
[12:53] <BranchPredictor> cur mb/s equal to 0 means that in that second there were no IO requests completed.
[12:54] <BranchPredictor> so you're definitely saturating *something*
[12:54] <Be-El> and there are 8 pending requests. the benchmark run also does not stop after 60 seconds
[12:55] * vicente (~~vicente@125-227-238-55.HINET-IP.hinet.net) Quit (Quit: Leaving)
[12:56] <raeven> I don't think this is the problem, but is the journal on the same drives or do you have a dedicated ssd for that?
[12:56] <Be-El> some iowait spikes, but one of the them are permanent
[12:57] <Be-El> journals are on intel p3700 pci cards, 6 journals per card
[12:57] <BranchPredictor> Be-El: you may want to figure out why is that happening (bad drive?)
[12:58] <DaveOD> vikhyat: ceph tell mon.\* injectargs '--mon_osd_full_ratio 0.98'
[12:59] <vikhyat> DaveOD: it wont help you
[12:59] <vikhyat> as this option is checked when you create pgs initially
[12:59] <Be-El> BranchPredictor: half of the cluster is currently running on btrfs. the spiky machine is a btrfs one; a xfs one does not show any peaks
[13:00] * tullis (~ben@cust94-dsl91-135-7.idnet.net) has joined #ceph
[13:00] <vikhyat> DaveOD: ceph pg set_full_ratio <float[0.0-1.0]> set ratio at which pgs are considered
[13:00] <vikhyat> full
[13:00] <vikhyat> use this command
[13:00] <vikhyat> ceph pg set_full_ratio 0.98
[13:01] <vikhyat> once it re-balances the data and the cluster is out of full, add osds as soon as you can
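A compact recap of the two knobs in this exchange, using the 0.98 value from DaveOD's case; ceph pg set_full_ratio changes the live cluster-wide threshold, while mon_osd_full_ratio is only consulted when the pgs are initially created:

    ceph pg set_full_ratio 0.98   # raise the live full threshold temporarily
    ceph health detail            # confirm the full/nearfull flags clear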
[13:01] <Be-El> BranchPredictor: but nmon was a good hint. i've used atop before, which also gives you a lot of information, but is focussing on processes
[13:02] <tullis> Hello. I'm currently setting up a cluster and I have a quick question for anyone please...
[13:03] * b0e (~aledermue@213.95.25.82) Quit (Ping timeout: 480 seconds)
[13:04] <tullis> I've got an admin node, three running monitors, two osd servers. Each of the osd servers has 10 disks and a jounal partition for each. I'm currently preparing and activating the osd services. Everything is Jewel 10.2.1.
[13:04] * sardonyx (~Hazmat@06SAADKSC.tor-irc.dnsbl.oftc.net) Quit ()
[13:06] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has left #ceph
[13:06] <tullis> Whenever I use 'ceph-deploy osd prepare host:disk:/dev/partition' or a similar command, I am being prompted with: "config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite"
[13:06] <tullis> Is this normal? It *does* seem to work if I use the 'overwrite-conf' switch, but I'm currently wondering why I should have to do it every time.
[13:08] * hellertime (~Adium@72.246.3.14) has joined #ceph
[13:08] * matj345314 (~matj34531@141.255.254.208) has joined #ceph
[13:08] <skyrat> Hi, is there any way to re-activate OSD on Infernalis? I see the option was added in Jewel.
[13:09] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) has joined #ceph
[13:09] <tullis> Does everyone just set 'overwrite_conf = True' in their ~/.cephdeploy.conf
[13:09] <tullis> ?
[13:09] <Be-El> tullis: i haven't used ceph-deploy before, but adding a new osd may also add an entry for it to ceph.conf, which is different from the version deployed on the hosts afterwards
[13:10] * dgurtner_ (~dgurtner@178.197.233.176) has joined #ceph
[13:12] * dgurtner (~dgurtner@82.199.64.68) Quit (Ping timeout: 480 seconds)
[13:13] <tullis> Be-el: Thanks for the response. As far as I can tell there is no entry in ceph.conf for these OSDs. I've added 2 of 20 so far and there is currently no mention of either my osd servers or numbered osds in the ceph.conf. I'll try adding a third now.
[13:18] * b0e (~aledermue@213.95.25.82) has joined #ceph
[13:19] * KpuCko (~KpuCko@87-126-68-130.ip.btc-net.bg) has joined #ceph
[13:19] * KpuCko (~KpuCko@87-126-68-130.ip.btc-net.bg) has left #ceph
[13:20] * tumeric (~jcastro@89.152.250.115) has joined #ceph
[13:21] <tumeric> Hello guys. I am trying to connect cephfs through a firewall (cisco). I have 6789 going to a public IP and then NAT'ted to my monitor which is in my private network. I am getting "connection timed out"
[13:21] <tumeric> If I am inside my private network I can mount the FS without any issue.
[13:21] <tumeric> Do you have any idea why this might happen?
[13:22] <tumeric> I have the ports 6789, 6800, 6801, 6802 open in my firewall
[13:22] * dgurtner_ (~dgurtner@178.197.233.176) Quit (Ping timeout: 480 seconds)
[13:22] <Be-El> tumeric: all ceph clients including cephfs need direct access to all osd hosts
[13:23] <tumeric> Aha. Ok
[13:23] <tumeric> then I need to open the port range for the osds, right?
[13:23] <tumeric> What would be the best way to do this?
[13:24] <tumeric> I have to mount it through the internet. Something like this. Cephfs-client (internet) -> jumpserver (internet-nat) -> ceph (private network)
[13:24] <Be-El> tumeric: you can't
[13:25] <raeven> tumeric: I think Client -> VPN -> ceph is your best bet
[13:25] <Be-El> tumeric: use some fileserver gateway (e.g. an NFS server) or a vpn
[13:25] <tumeric> I am using a VPN, Cisco ASA
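For context, the kind of kernel cephfs mount that works from inside the private network in tumeric's setup; the monitor address, mount point and secret file are placeholders, and the same mount will stall from outside unless the client can also reach every OSD directly (ports 6800-7300 by default):

    sudo mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret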
[13:26] <tuxcrafter> osd.4 [ERR] OSD full dropping all updates 97% full
[13:26] <tuxcrafter> if I stop osd.4 then my cluster starts working again
[13:26] <tuxcrafter> if I start it after a while then it comes to a halt again
[13:26] <tuxcrafter> it does not go to ERR state
[13:26] <Be-El> is it possible to force a cephfs kernel client to reconnect to the mds?
[13:27] <tumeric> Be-El, how?
[13:27] <badone> tullis: may be something like https://bugs.launchpad.net/fuel/+bug/1333814 ?
[13:31] <BranchPredictor> Be-El: yeah, nmon in general is very useful as an monitoring tool.
[13:32] <tullis> badone: I think you are right. I just found a discrepancy between /etc/ceph/ceph.conf and ~ceph-admin/my-cluster/ceph.conf
[13:32] <tullis> One had underscores between the words e.g. public_network and the other didn't.
[13:32] <tullis> I've now done a complete pull->push so that they all use underscores and now I think I am no longer being prompted. I'll double-check now with a 4th OSD.
[13:33] * NTTEC (~nttec@122.53.162.158) has joined #ceph
[13:34] <tullis> Yep. Absolutely right. Many thanks.
[13:34] <badone> tullis: yes, looks like the same issue, it was mentioned here earlier today by IvanJobs, I'd never heard of it before
[13:34] <badone> tullis: no problem at all
[13:35] <DaveOD> vikhyat: thanks for that
[13:35] <DaveOD> but what is the difference?
[13:35] <DaveOD> will ceph pg set_full_ratio force a rebalance?
[13:35] <vikhyat> DaveOD: I think you missed my comment
[13:35] <DaveOD> I thought it was just a configured threshold
[13:35] <vikhyat> <vikhyat> as this option is checked when you create pgs initially
[13:35] <tullis> I also got bitten by this bug, http://tracker.ceph.com/issues/15645
[13:35] <tullis> ...but I've upgraded to Jewel now and everything looks good so far. I guess I'll be back again with more questions when I'm further down the road.
[13:36] <vikhyat> '--mon_osd_full_ratio 0.98' this option is only checked
[13:36] <vikhyat> when you initially create pgs
[13:36] <raeven> Is it possible to have a ceph node as write only for backup?
[13:36] <vikhyat> DaveOD: not when pgs are already created
[13:36] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:37] <DaveOD> understood
[13:37] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:37] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:37] <DaveOD> but then again
[13:37] <DaveOD> will it rebalance my data?
[13:37] * dvanders (~dvanders@130.246.253.64) Quit (Ping timeout: 480 seconds)
[13:37] * rmart04 (~rmart04@support.memset.com) has joined #ceph
[13:37] <vikhyat> DaveOD: yes
[13:37] <DaveOD> because right now i've reweighted some OSD's
[13:37] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:37] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:37] <DaveOD> and I also added 2 new OSD's
[13:37] <vikhyat> okay then it should be fine
[13:38] <DaveOD> so it's currently rebalancing because of the OSD's and the reweighted OSD's
[13:38] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:38] <vikhyat> no need to change
[13:38] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:38] <vikhyat> as rebalance will take care of it
[13:38] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:38] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:38] <DaveOD> yes I know but new data is being added and i've given a lower prio to recovery
[13:38] <DaveOD> but then again
[13:38] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:38] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:38] <DaveOD> I find recovery being very slow
[13:38] <DaveOD> even I have 10G NIC and disks are not fully used atm
[13:39] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:39] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:39] <DaveOD> so i'm not sure where it's haning
[13:39] <DaveOD> hanging*
[13:39] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:39] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:39] * BlS (~cmrn@tor2r.ins.tor.net.eu.org) has joined #ceph
[13:39] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:39] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:40] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:40] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:40] <DaveOD> and there are so many tunables
[13:40] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:40] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:40] <DaveOD> without any comment, i don't know for sure which one to edit :(
[13:40] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:40] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:41] <DaveOD> there is comment, but like not enough to fully understand which one I need to tune
[13:41] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:41] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:41] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:41] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:42] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:42] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:43] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:43] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:43] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:43] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:44] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:44] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:44] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:44] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:44] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:44] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:45] <DaveOD> anyway vikhyat thnx for the help
[13:45] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:45] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:45] <DaveOD> btw: could you briefly explain why it would rebalance?
[13:45] * IvanJobs_ (~ivanjobs@103.50.11.146) has joined #ceph
[13:45] * IvanJobs_ (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:45] <DaveOD> I mean, all the weight was equally divided based on the size
[13:45] <DaveOD> of each OSD, so default
[13:45] <DaveOD> but some OSD's have some huge differences
[13:45] <DaveOD> most OSD's were at 81%
[13:46] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:46] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:46] <DaveOD> but one was at 95% which basically put down my cluster
[13:46] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:46] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:46] <DaveOD> so that did not really make sense to me
[13:46] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:46] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:47] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) Quit (Quit: Miouge)
[13:47] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:47] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:47] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:47] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:47] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:47] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:48] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[13:48] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:48] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:48] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:48] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:49] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:49] * IvanJobs (~ivanjobs@103.50.11.146) Quit (Read error: Connection reset by peer)
[13:49] * IvanJobs (~ivanjobs@103.50.11.146) has joined #ceph
[13:50] <tullis> Can anyone explain briefly what is likely to happen to my OSD server if my /dev device names change for my back-end OSD drives? I'm a little surprised that the docs don't tell me to add devices using their /dev/disk/by-id/ entries instead of e.g. /dev/sdb which might change. I've read something about its using udev rules, but I can't see anything new under /etc/udev/
[13:51] <badone> tullis: pretty sure ceph-disk has some smarts about what disk belongs to which osd
[13:54] * flisky (~Thunderbi@210.12.157.88) Quit (Ping timeout: 480 seconds)
[13:55] * wjw-freebsd2 (~wjw@smtp.digiware.nl) Quit (Ping timeout: 480 seconds)
[13:55] * rwheeler (~rwheeler@pool-173-48-195-215.bstnma.fios.verizon.net) Quit (Quit: Leaving)
[13:56] <Be-El> tullis: devices are usually recognized by their partition type guid
[13:56] * bniver (~bniver@nat-pool-bos-u.redhat.com) has joined #ceph
[13:57] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[13:57] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[13:57] <badone> try ceph-disk list
[13:59] <tullis> badone and Be-El: Thanks again. I see, so the sort of sgdisk command that I see when calling create is this:
[13:59] <tullis> command_check_call: Running command: /sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdf
[13:59] <tullis> That's the guid for the disk then? I'm fairly reassured already.
[14:00] * jmn` (~jmn@nat-pool-bos-t.redhat.com) Quit (Quit: Coyote finally caught me)
[14:00] <Be-El> tullis: you can verify the guid in /usr/sbin/ceph-disk (python script)
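One way to confirm the partition type GUID Be-El refers to, reusing the /dev/sdf device from the sgdisk call above (partition number 1 is an example); the ceph data type code is the 4fbd7e29-... value shown earlier:

    sudo sgdisk -i 1 /dev/sdf   # prints the partition's type GUID
    sudo ceph-disk list         # maps devices to osd ids regardless of /dev names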
[14:02] <vikhyat> DaveOD: ceph does not guarantee equal distribution, it tries its best with the help of crush tunables http://docs.ceph.com/docs/master/rados/operations/crush-map/#tunables
[14:03] <vikhyat> and you can check if you are running optimal tunables if your client and cluster can support it
[14:03] <vikhyat> for example if you are running hammer version
[14:03] <vikhyat> and if you set tunable as optimal then profile will become : hammer
[14:03] <vikhyat> same can be verified with $ ceph osd crush show-tunables
[14:04] <vikhyat> NOTE: running optimal tunables requires cluster and clients on the same version
[14:05] <tullis> badone and Be-El: I don't quite see how to get that info from ceph-disk list. Should I run this on the admin server, or on the osd itself? I'm running it with sudo with my unprivileged admin account, but if I run it on the admin host I just get its local disks. If I run it on the osd host, then I just get output like this:
[14:05] <tullis> /dev/sdb1 ceph data, active, cluster ceph, osd.0
[14:05] * wjw-freebsd (~wjw@176.74.240.1) has joined #ceph
[14:05] * jmn (~jmn@nat-pool-bos-t.redhat.com) has joined #ceph
[14:05] <vikhyat> DaveOD: also you should always check your cluster health
[14:06] <vikhyat> I hope you got a nearfull warning before going to full
[14:06] <vikhyat> and as soon as you get the nearfull warning you should take action to fix it
[14:06] <vikhyat> and if you are running *optimal* tunable
[14:07] <vikhyat> and you are still getting this data imbalance issue
[14:07] <vikhyat> then you can take the help of *test-reweight-by-utilization* and *reweight-by-utilization* if you are running latest hammer : 0.94.7
[14:07] <vikhyat> or above
[14:09] * BlS (~cmrn@7V7AAFSD9.tor-irc.dnsbl.oftc.net) Quit ()
[14:11] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[14:11] * allaok (~allaok@machine107.orange-labs.com) Quit ()
[14:11] * karnan (~karnan@121.244.87.117) Quit (Remote host closed the connection)
[14:11] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[14:12] * penguinRaider (~KiKo@67.159.20.163) Quit (Ping timeout: 480 seconds)
[14:13] * shylesh__ (~shylesh@121.244.87.118) has joined #ceph
[14:14] * shylesh__ (~shylesh@121.244.87.118) Quit ()
[14:15] * dgurtner (~dgurtner@178.197.233.176) has joined #ceph
[14:15] * shylesh (~shylesh@121.244.87.118) has joined #ceph
[14:16] <badone> tullis: right, on the osd node, so if the name changes to /dev/sdzz1 it will still know it belongs to osd0
[14:17] <tullis> badone: Got it. Thanks.
[14:20] <agsha> can anybody explain why pg log entries take up so much memory ?
[14:20] * penguinRaider (~KiKo@67.159.20.163) has joined #ceph
[14:20] <agsha> or the life cycle of pg logs
[14:20] <agsha> and how to trim it
[14:21] * praveen (~praveen@121.244.155.11) has joined #ceph
[14:21] <praveen> hi
[14:25] <vikhyat> agsha: praveen OPTION(osd_min_pg_log_entries, OPT_U32, 3000) // number of entries to keep in the pg log when trimming it
[14:25] <vikhyat> OPTION(osd_max_pg_log_entries, OPT_U32, 10000) // max entries, say when degraded, before we trim
[14:25] <vikhyat> OPTION(osd_pg_log_trim_min, OPT_U32, 100)
[14:25] <vikhyat> I have never tested them but may be it will help you
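A sketch of how one might inspect or adjust the pg log options vikhyat lists; osd.0 is an example id, and values set with injectargs do not persist across a restart:

    ceph daemon osd.0 config get osd_min_pg_log_entries
    ceph tell osd.\* injectargs '--osd_min_pg_log_entries 3000 --osd_max_pg_log_entries 10000'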
[14:25] * kefu (~kefu@183.193.187.174) Quit (Remote host closed the connection)
[14:28] <skyrat> Hi, I would like to ask, what is the correct way on Infernalis to re-activate an OSD after a reboot which took a long time and the OSD was marked as down&out meanwhile. The OSD was deployed with --dmcrypt and the problem is the disk isn't decrypted (and the corresponding /dev/mapper/<uuid> isn't created). Thanks.
[14:29] <IvanJobs> Hi cephers, is ceph already using C++11 now? I found include/unordered_map.h which uses the C++11 header <unordered_map>.
[14:30] * onyb (~ani07nov@112.133.232.30) Quit (Quit: raise SystemExit())
[14:31] * IvanJobs (~ivanjobs@103.50.11.146) Quit ()
[14:32] * thomnico (~thomnico@2a01:e35:8b41:120:147a:ce20:a39f:4a17) Quit (Remote host closed the connection)
[14:32] * chopmann (~sirmonkey@2a02:8108:8b40:5600::2) has joined #ceph
[14:35] * thomnico (~thomnico@2a01:e35:8b41:120:147a:ce20:a39f:4a17) has joined #ceph
[14:37] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[14:38] <etienneme> skyrat: ceph osd in X ?
[14:38] <etienneme> check reweight/weight on ceph osd tree
[14:39] * n0x1d (~Coestar@edwardsnowden0.torservers.net) has joined #ceph
[14:39] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[14:41] <sep> when using rbd via libvirt/kvm as a disk for a vm, i get sda abort messages in dmesg in the vm, usually related to when something happens on the cluster, like backfilling or similar. is there any way to prioritize rbd more ?
[14:41] <sep> debian jessie + latest hammer
[14:42] * dvanders (~dvanders@130.246.253.64) has joined #ceph
[14:42] * rotbeard (~redbeard@185.32.80.238) Quit (Quit: Leaving)
[14:42] * dvanders (~dvanders@130.246.253.64) Quit (Remote host closed the connection)
[14:42] * kefu (~kefu@183.193.187.174) has joined #ceph
[14:44] * thomnico (~thomnico@2a01:e35:8b41:120:147a:ce20:a39f:4a17) Quit (Quit: Ex-Chat)
[14:44] * thomnico (~thomnico@2a01:e35:8b41:120:147a:ce20:a39f:4a17) has joined #ceph
[14:45] * dvanders (~dvanders@130.246.253.64) has joined #ceph
[14:51] * ira (~ira@nat-pool-bos-t.redhat.com) has joined #ceph
[14:53] * georgem (~Adium@104-222-119-175.cpe.teksavvy.com) has joined #ceph
[14:53] * georgem (~Adium@104-222-119-175.cpe.teksavvy.com) Quit ()
[14:54] * georgem (~Adium@206.108.127.16) has joined #ceph
[14:55] * lmb (~Lars@ip5b41f0a4.dynamic.kabel-deutschland.de) Quit (Quit: Leaving)
[14:57] <skyrat> etienneme, thanks for the reply, but unfortunately this won't help; the OSD daemon is failing to start because it cannot find the disk to mount at /var/lib/ceph/osd/ceph-2. The source device doesn't exist: first it must be decrypted and mapped by device mapper to appear.
[14:58] <skyrat> osd in will take into account the running osd
[14:58] <DaveOD> vikhyat: Thanks! great info. I'm running infernalis on Centos &
[14:59] <DaveOD> Centos 7
[14:59] * pi_zhjw (~pi@139.129.6.152) has joined #ceph
[14:59] <Be-El> sep: you can reduce the number of running backfill operation (osd-max-backfills), adjust client and backfill io priorities, use different io scheduler classes for client and recovery io etc.
[15:00] * lmb (~Lars@ip5b41f0a4.dynamic.kabel-deutschland.de) has joined #ceph
[15:02] * EinstCra_ (~EinstCraz@58.247.119.250) Quit (Remote host closed the connection)
[15:02] <sep> Be-El, i'll look at the number of backfills. priorities have been tweaked already. i'll google the io scheduler part
[15:02] <sep> Be-El, thanks
[15:02] <Be-El> sep: ceph tell osd.* injectargs '--osd-max-backfills 3'
[15:03] <Be-El> or use the corresponding setting in ceph.conf for persistent values
[15:03] * NTTEC (~nttec@122.53.162.158) Quit (Remote host closed the connection)
[15:04] <etienneme> skyrat: i've never used dmcrypt :(
[15:04] <Be-El> sep: keep in mind that the value defines both how many osds a given osd backfills to and how many other osds are allowed to backfill to a given osd
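A minimal sketch of making Be-El's suggestion persistent; the option names are the standard ones and the values are illustrative, mirroring the injectargs example above rather than being tuned recommendations:

    # runtime, as shown above:
    ceph tell osd.* injectargs '--osd-max-backfills 3'
    # persistent, in ceph.conf on the OSD nodes:
    #   [osd]
    #   osd max backfills = 3
    #   osd recovery op priority = 1    # lower recovery priority relative to client io
    #   osd client op priority = 63     # the default client priority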
[15:04] <skyrat> etienneme, it was a working setup, deployed by ceph-deploy, I just rebooted one node and it did not start up the osd daemon, while throwing an error about the missing volume to mount. I was not able to start it back again even manually. Definitely it is connected with the --dmcrypt. Where are the keys for dmcrypt stored? I can perform prepare & activate but I'm surprised that a reboot of a node crashes the osd. Am I missing something?
[15:06] <skyrat> etienneme, ok, thanks
[15:06] <skyrat> Anybody's using dmcrypt?
[15:07] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[15:09] * n0x1d (~Coestar@4MJAAF17Z.tor-irc.dnsbl.oftc.net) Quit ()
[15:09] * qable (~clusterfu@87.120.254.200) has joined #ceph
[15:12] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[15:12] * Tene (~tene@173.13.139.236) has joined #ceph
[15:15] * bene (~bene@2601:193:4003:4c7a:ea2a:eaff:fe08:3c7a) has joined #ceph
[15:23] * deepthi (~deepthi@106.216.177.46) Quit (Ping timeout: 480 seconds)
[15:25] * kefu (~kefu@183.193.187.174) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[15:26] * vata (~vata@207.96.182.162) has joined #ceph
[15:26] <vikhyat> DaveOD: then I think for infernalis the optimal tunables would be hammer only, as new tunables come with LTS releases
[15:27] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) has joined #ceph
[15:27] * dugravot6 (~dugravot6@194.199.223.4) Quit (Ping timeout: 480 seconds)
[15:27] * nass5 (~fred@dn-infra-12.lionnois.site.univ-lorraine.fr) Quit (Ping timeout: 480 seconds)
[15:27] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) Quit ()
[15:28] * jcsp (~jspray@82-71-16-249.dsl.in-addr.zen.co.uk) has joined #ceph
[15:28] * flisky (~Thunderbi@106.37.236.216) has joined #ceph
[15:28] * flisky (~Thunderbi@106.37.236.216) Quit (autokilled: This host may be infected. Mail support@oftc.net with questions. BOPM (2016-06-06 13:28:52))
[15:30] <rkeene> skyrat, I am using dmcrypt
[15:30] <rkeene> skyrat, But I manage the keys myself
[15:35] * m0zes__ (~mozes@n117m02.cis.ksu.edu) has joined #ceph
[15:37] * chopmann (~sirmonkey@2a02:8108:8b40:5600::2) Quit (Quit: chopmann)
[15:38] <skyrat> rkeene, does it mean you decrypt the disks manually using cryptsetup/cryptdisks_start? Do you have any experience with ceph's internal encryption handling?
[15:38] * dugravot6 (~dugravot6@194.199.223.4) has joined #ceph
[15:39] * qable (~clusterfu@4MJAAF19W.tor-irc.dnsbl.oftc.net) Quit ()
[15:40] <rkeene> I have a script that decrypts the disks "manually"
[15:41] <rkeene> I use smartcards to store RSA private keys, and have a header on the disk with the symmetric key encrypted with the public keys from many smartcards
[15:41] * nass5 (~fred@dn-infra-12.lionnois.site.univ-lorraine.fr) has joined #ceph
[15:43] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) has joined #ceph
[15:43] <rkeene> So all my monitors have the smartcards attached and when OSD nodes want to decrypt their header they talk over the network to request decryption
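For skyrat's earlier question about where the keys live: with a stock ceph-disk --dmcrypt deployment (not rkeene's smartcard scheme), the per-partition key files are usually kept under /etc/ceph/dmcrypt-keys/ on the OSD host, and the device can be opened and activated by hand. A rough sketch only; <part-uuid> and /dev/sdX1 are placeholders, and whether luksOpen or a plain dm-crypt "create" applies depends on how the OSD was prepared:

    # LUKS-style OSD data partition (placeholders throughout):
    cryptsetup --key-file /etc/ceph/dmcrypt-keys/<part-uuid> luksOpen /dev/sdX1 <part-uuid>
    # then let ceph-disk mount the mapped device and start the OSD:
    ceph-disk activate /dev/mapper/<part-uuid>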
[15:44] * spgriffi_ (~spgriffin@66.46.246.206) Quit (Quit: Leaving...)
[15:44] * squizzi (~squizzi@107.13.31.195) has joined #ceph
[15:46] <tumeric> guys how to clean update stats?
[15:46] <tumeric> does it make sense to you?
[15:46] <tumeric> what can happend when update states reach 100%?
[15:46] <tumeric> happen*
[15:46] * rkeene doesn't know what that means
[15:46] <tumeric> stats*, sorry for the typos
[15:46] <tumeric> 0 mon.monitor01@0(leader).data_health(700) update_stats avail 40% total 13695 MB, used 7452 MB, avail 5524 MB
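Those update_stats lines are the monitor's periodic data-store health report rather than an error in themselves; the number to watch is the avail percentage, which is compared against thresholds in ceph.conf. A sketch assuming the usual option names; the defaults shown are from memory and worth verifying against your release:

    # ceph.conf, [mon] section -- free-space thresholds for the mon data store
    #   mon data avail warn = 30    # health warning once avail drops below this percent
    #   mon data avail crit = 5     # health error (and likely mon shutdown) below this
    # check the values a running monitor is using:
    ceph daemon mon.monitor01 config show | grep mon_data_avail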
[15:51] * shylesh (~shylesh@121.244.87.118) Quit (Remote host closed the connection)
[15:53] * allaok (~allaok@machine107.orange-labs.com) Quit (Remote host closed the connection)
[15:55] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[15:55] * allaok (~allaok@machine107.orange-labs.com) Quit ()
[15:56] * mattbenjamin (~mbenjamin@12.118.3.106) has joined #ceph
[15:56] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[16:01] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[16:03] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[16:04] * mohmultihouse (~mohmultih@gw01.mhitp.dk) Quit (Ping timeout: 480 seconds)
[16:06] * scg (~zscg@valis.gnu.org) has joined #ceph
[16:10] * danieagle (~Daniel@177.188.65.64) has joined #ceph
[16:11] * masteroman (~ivan@93-142-28-162.adsl.net.t-com.hr) has joined #ceph
[16:12] * matj345314 (~matj34531@141.255.254.208) Quit (Quit: matj345314)
[16:13] * yanzheng (~zhyan@125.70.23.87) Quit (Quit: This computer has gone to sleep)
[16:13] * kefu (~kefu@183.193.187.174) has joined #ceph
[16:16] * gregmark (~Adium@68.87.42.115) has joined #ceph
[16:17] * ram (~oftc-webi@static-202-65-140-146.pol.net.in) has joined #ceph
[16:17] <ram> Hi
[16:17] <skyrat> rkeene, ok got the idea, thanks anyway
[16:17] * T1w (~jens@node3.survey-it.dk) Quit (Ping timeout: 480 seconds)
[16:18] <ram> I am configuring a Ceph cluster using the ceph Infernalis release and I am getting a dpkg issue. http://paste.openstack.org/show/508380/
[16:19] * andreww (~xarses@c-73-202-191-48.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[16:19] <ram> When I tried to configure the ceph cluster with the firefly release, it was created successfully.
[16:20] * b0e (~aledermue@213.95.25.82) Quit (Remote host closed the connection)
[16:20] <ram> Please give me an idea of how to resolve that dpkg issue
[16:21] * tdb (~tdb@myrtle.kent.ac.uk) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * tserong (~tserong@203-214-92-220.dyn.iinet.net.au) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * Frank_ (~Frank@149.210.210.150) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * asalor (~asalor@0001ef37.user.oftc.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * IcePic (~jj@c66.it.su.se) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * wujson (zok@neurosis.pl) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * bloatyfloat (~bloatyflo@46.37.172.253.srvlist.ukfast.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * dvahlin (~saint@battlecruiser.thesaint.se) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * huats (~quassel@stuart.objectif-libre.com) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * malevolent_ (~quassel@124.red-88-11-251.dynamicip.rima-tde.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * jagardaniel (~daniel@bottenskrap.se) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * dosaboy (~dosaboy@33.93.189.91.lcy-02.canonistack.canonical.com) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * DaveOD (~DaveOD@nomadesk.com) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * skoude (~skoude@193.142.1.54) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * GeoTracer (~Geoffrey@41.77.153.99) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * Heebie (~thebert@dub-bdtn-office-r1.net.digiweb.ie) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * jamespd_ (~mucky@mucky.socket7.org) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * mistur (~yoann@kewl.mistur.org) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * jayjay (~jayjay@185.27.175.112) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * BranchPredictor (branch@00021630.user.oftc.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * robbat2 (~robbat2@178.63.9.89) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * topro (~prousa@host-62-245-142-50.customer.m-online.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * BlaXpirit (~irc@blaxpirit.com) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * lj (~liujun@111.202.176.44) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * gahan (gahan@breath.hu) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * titzer (~titzer@cs.13ad.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * Mosibi (~Mosibi@dld.unixguru.nl) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * koollman (samson_t@87.252.5.161) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * hroussea (~hroussea@000200d7.user.oftc.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * olc- (~olecam@93.184.35.82) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * delaf (~delaf@legendary.xserve.fr) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * mivaho (~quassel@2001:983:eeb4:1:c0de:69ff:fe2f:5599) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * babilen (~babilen@babilen.user.oftc.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * kevcampb (~kev@orchid.vm.bytemark.co.uk) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * CustosLim3n (~CustosLim@ns343343.ip-91-121-210.eu) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * Freeaqingme (~quassel@nl3.s.kynet.eu) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * tomaw (tom@tomaw.netop.oftc.net) Quit (magnet.oftc.net charon.oftc.net)
[16:21] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) Quit (magnet.oftc.net charon.oftc.net)
[16:22] * vikhyat (~vumrao@121.244.87.116) Quit (Quit: Leaving)
[16:22] * tdb (~tdb@myrtle.kent.ac.uk) has joined #ceph
[16:22] * tserong (~tserong@203-214-92-220.dyn.iinet.net.au) has joined #ceph
[16:22] * Frank_ (~Frank@149.210.210.150) has joined #ceph
[16:22] * asalor (~asalor@0001ef37.user.oftc.net) has joined #ceph
[16:22] * jamespd_ (~mucky@mucky.socket7.org) has joined #ceph
[16:22] * babilen (~babilen@babilen.user.oftc.net) has joined #ceph
[16:22] * daviddcc (~dcasier@84.197.151.77.rev.sfr.net) has joined #ceph
[16:22] * Heebie (~thebert@dub-bdtn-office-r1.net.digiweb.ie) has joined #ceph
[16:22] * gahan (gahan@breath.hu) has joined #ceph
[16:22] * GeoTracer (~Geoffrey@41.77.153.99) has joined #ceph
[16:22] * lj (~liujun@111.202.176.44) has joined #ceph
[16:22] * skoude (~skoude@193.142.1.54) has joined #ceph
[16:22] * mivaho (~quassel@2001:983:eeb4:1:c0de:69ff:fe2f:5599) has joined #ceph
[16:22] * DaveOD (~DaveOD@nomadesk.com) has joined #ceph
[16:22] * BlaXpirit (~irc@blaxpirit.com) has joined #ceph
[16:22] * dosaboy (~dosaboy@33.93.189.91.lcy-02.canonistack.canonical.com) has joined #ceph
[16:22] * Mosibi (~Mosibi@dld.unixguru.nl) has joined #ceph
[16:22] * jagardaniel (~daniel@bottenskrap.se) has joined #ceph
[16:22] * malevolent_ (~quassel@124.red-88-11-251.dynamicip.rima-tde.net) has joined #ceph
[16:22] * topro (~prousa@host-62-245-142-50.customer.m-online.net) has joined #ceph
[16:22] * huats (~quassel@stuart.objectif-libre.com) has joined #ceph
[16:22] * kevcampb (~kev@orchid.vm.bytemark.co.uk) has joined #ceph
[16:22] * koollman (samson_t@87.252.5.161) has joined #ceph
[16:22] * tomaw (tom@tomaw.netop.oftc.net) has joined #ceph
[16:22] * dvahlin (~saint@battlecruiser.thesaint.se) has joined #ceph
[16:22] * mistur (~yoann@kewl.mistur.org) has joined #ceph
[16:22] * titzer (~titzer@cs.13ad.net) has joined #ceph
[16:22] * bloatyfloat (~bloatyflo@46.37.172.253.srvlist.ukfast.net) has joined #ceph
[16:22] * Freeaqingme (~quassel@nl3.s.kynet.eu) has joined #ceph
[16:22] * robbat2 (~robbat2@178.63.9.89) has joined #ceph
[16:22] * CustosLim3n (~CustosLim@ns343343.ip-91-121-210.eu) has joined #ceph
[16:22] * BranchPredictor (branch@00021630.user.oftc.net) has joined #ceph
[16:22] * wujson (zok@neurosis.pl) has joined #ceph
[16:22] * olc- (~olecam@93.184.35.82) has joined #ceph
[16:22] * hroussea (~hroussea@000200d7.user.oftc.net) has joined #ceph
[16:22] * delaf (~delaf@legendary.xserve.fr) has joined #ceph
[16:22] * IcePic (~jj@c66.it.su.se) has joined #ceph
[16:22] * jayjay (~jayjay@185.27.175.112) has joined #ceph
[16:23] * lcurtis (~lcurtis@47.19.105.250) has joined #ceph
[16:23] * ChanServ sets mode +v sage
[16:23] * ChanServ sets mode +v jluis
[16:23] * ChanServ sets mode +v nhm
[16:23] * ChanServ sets mode +o scuttlemonkey
[16:27] * garphy is now known as garphy`aw
[16:28] * pdrakeweb (~pdrakeweb@cpe-65-185-74-239.neo.res.rr.com) Quit (Remote host closed the connection)
[16:28] * pdrakeweb (~pdrakeweb@cpe-65-185-74-239.neo.res.rr.com) has joined #ceph
[16:29] * shyu (~shyu@218.241.172.114) Quit (Ping timeout: 480 seconds)
[16:36] * b0e (~aledermue@213.95.25.82) has joined #ceph
[16:38] * wjw-freebsd2 (~wjw@176.74.240.1) has joined #ceph
[16:40] * andreww (~xarses@64.124.158.100) has joined #ceph
[16:40] * TheDoudou_a (~Shesh@93.115.95.204) has joined #ceph
[16:41] * wjw-freebsd (~wjw@176.74.240.1) Quit (Ping timeout: 480 seconds)
[16:41] * swami1 (~swami@49.32.0.164) Quit (Quit: Leaving.)
[16:48] * davidz (~davidz@2605:e000:1313:8003:46b:c01d:af13:2a90) has joined #ceph
[16:51] * fritchie (~oftc-webi@199.168.151.107) has joined #ceph
[16:52] * raso (~raso@deb-multimedia.org) Quit (Ping timeout: 480 seconds)
[16:52] * tullis (~ben@cust94-dsl91-135-7.idnet.net) has left #ceph
[16:56] * fritchie (~oftc-webi@199.168.151.107) Quit (Quit: Page closed)
[16:59] * dgurtner (~dgurtner@178.197.233.176) Quit (Ping timeout: 480 seconds)
[16:59] * joshd1 (~jdurgin@71-92-201-212.dhcp.gldl.ca.charter.com) has joined #ceph
[17:00] * Wahmed (~wahmed@206.174.203.195) Quit (Quit: Nettalk6 - www.ntalk.de)
[17:01] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[17:02] <tumeric> ram, please post dpkg -l | grep ceph
[17:02] <tumeric> also what do you get with apt-get -f install ?
[17:06] * LeaChim (~LeaChim@host86-148-117-255.range86-148.btcentralplus.com) Quit (Remote host closed the connection)
[17:07] * wushudoin (~wushudoin@2601:646:9501:d2b2:2ab2:bdff:fe0b:a6ee) has joined #ceph
[17:07] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:07] * jmn (~jmn@nat-pool-bos-t.redhat.com) Quit (Quit: Coyote finally caught me)
[17:09] * wushudoin (~wushudoin@2601:646:9501:d2b2:2ab2:bdff:fe0b:a6ee) Quit ()
[17:09] * wushudoin (~wushudoin@2601:646:9501:d2b2:2ab2:bdff:fe0b:a6ee) has joined #ceph
[17:09] * TheDoudou_a (~Shesh@93.115.95.204) Quit ()
[17:09] * Throlkim (~offer@ded31663.iceservers.net) has joined #ceph
[17:13] * Wahmed (~wahmed@206.174.203.195) has joined #ceph
[17:15] * jmn (~jmn@nat-pool-bos-t.redhat.com) has joined #ceph
[17:15] * dgbaley27 (~matt@2601:b00:c600:f800:45e0:8dd4:ec1:17bb) Quit (Quit: Leaving.)
[17:18] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:19] * sbillah (~Adium@ool-3f8fc6cc.dyn.optonline.net) has joined #ceph
[17:19] * thomnico (~thomnico@2a01:e35:8b41:120:147a:ce20:a39f:4a17) Quit (Quit: Ex-Chat)
[17:20] * sbillah (~Adium@ool-3f8fc6cc.dyn.optonline.net) has left #ceph
[17:22] * dgurtner (~dgurtner@178.197.233.176) has joined #ceph
[17:23] * vvb (~vvb@168.235.85.239) has joined #ceph
[17:25] * agsha (~agsha@121.244.155.9) Quit (Remote host closed the connection)
[17:27] * ade (~abradshaw@85.158.226.30) Quit (Ping timeout: 480 seconds)
[17:28] * ircolle (~Adium@2601:285:201:633a:b482:ba8e:ae5d:2e88) has joined #ceph
[17:29] * rakeshgm (~rakesh@121.244.87.117) Quit (Ping timeout: 480 seconds)
[17:32] * evelu (~erwan@2a01:e34:eecb:7400:4eeb:42ff:fedc:8ac) has joined #ceph
[17:37] * sudocat (~dibarra@2602:306:8bc7:4c50:cc78:32ee:7fc3:d8f4) Quit (Remote host closed the connection)
[17:39] * jdillaman (~jdillaman@pool-108-18-97-82.washdc.fios.verizon.net) has joined #ceph
[17:39] * Throlkim (~offer@7V7AAFSQ9.tor-irc.dnsbl.oftc.net) Quit ()
[17:39] * w2k (~tokie@tor-exit-4.all.de) has joined #ceph
[17:40] * kefu is now known as kefu|afk
[17:41] * allaok (~allaok@machine107.orange-labs.com) Quit (Remote host closed the connection)
[17:41] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[17:42] * allaok (~allaok@machine107.orange-labs.com) Quit ()
[17:42] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[17:43] * w2k (~tokie@06SAADK86.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[17:44] * vvb (~vvb@168.235.85.239) Quit (Quit: leaving)
[17:47] * allaok (~allaok@machine107.orange-labs.com) Quit (Quit: Leaving.)
[17:48] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[17:49] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Remote host closed the connection)
[17:50] * allaok (~allaok@machine107.orange-labs.com) Quit ()
[17:50] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[17:51] * allaok (~allaok@machine107.orange-labs.com) has joined #ceph
[17:52] <m0zes__> How can leveldb be so ungodly slow? Multiple days to extract 200MB out of leveldb is horrendous.
[17:52] <m0zes__> especially considering the leveldb is *so* much smaller than main memory.
[17:55] * raso (~raso@ns.deb-multimedia.org) has joined #ceph
[17:57] * TMM (~hp@185.5.121.201) Quit (Quit: Ex-Chat)
[17:58] * jmn (~jmn@nat-pool-bos-t.redhat.com) Quit (Quit: Coyote finally caught me)
[18:00] * allaok (~allaok@machine107.orange-labs.com) has left #ceph
[18:01] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[18:02] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[18:02] * shaunm (~shaunm@cpe-192-180-17-174.kya.res.rr.com) has joined #ceph
[18:03] * jmn (~jmn@nat-pool-bos-t.redhat.com) has joined #ceph
[18:03] <Walex> m0zes__: bizarre, it may be corrupted... hipster written sw? :-) https://en.wikipedia.org/wiki/LevelDB#Bugs_and_Reliability
[18:04] * NTTEC (~nttec@122.53.162.158) has joined #ceph
[18:05] <darkfader> m0zes__: can you try to copy it to a ramdisk? or is it just burning cpu?
[18:06] <m0zes__> darkfader: it is just burning cpu. the leveldbs are sitting on ssds designed for 13000 iops ;)
[18:07] <darkfader> hmrpf :(
[18:07] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) has joined #ceph
[18:07] * linuxkidd (~linuxkidd@ip70-189-207-54.lv.lv.cox.net) has joined #ceph
[18:08] <m0zes__> I've got 8 osds down now, extracting data off them so i can re-insert it into ceph. they keep hitting suicide timeouts while reading in the pg data. dumping and re-inserting seems to make them work, though. so it has to be fragmentation or something silly happening in the leveldb code.
[18:09] * alexbligh1 (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) Quit (Quit: Terminated with extreme prejudice - dircproxy 1.0.5)
[18:09] * alexbligh1 (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) has joined #ceph
[18:12] * NTTEC (~nttec@122.53.162.158) Quit (Ping timeout: 480 seconds)
[18:13] * branto (~branto@ip-78-102-208-181.net.upcbroadband.cz) Quit (Quit: Leaving.)
[18:14] <m0zes__> http://thread.gmane.org/gmane.comp.file-systems.ceph.user/30016 http://thread.gmane.org/gmane.comp.file-systems.ceph.user/30094 and http://thread.gmane.org/gmane.comp.file-systems.ceph.user/30116 for those that want to follow along at home ;)
[18:15] <Walex> m0zes__: thanks for sharing an interesting recovery technique
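The dump-and-reinsert m0zes__ describes can be done with ceph-objectstore-tool against a stopped OSD; a minimal sketch of that kind of export/import (not necessarily the exact procedure from the linked threads), with default data/journal paths and a placeholder pgid:

    # on the source OSD, with the daemon stopped:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
        --journal-path /var/lib/ceph/osd/ceph-0/journal \
        --pgid 1.2a --op export --file /tmp/1.2a.export
    # on the target OSD, also stopped:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
        --journal-path /var/lib/ceph/osd/ceph-5/journal \
        --op import --file /tmp/1.2a.export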
[18:18] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[18:19] * Xerati (~starcoder@192.42.115.101) has joined #ceph
[18:19] * Skaag (~lunix@65.200.54.234) has joined #ceph
[18:23] * Xerati (~starcoder@06SAADLBH.tor-irc.dnsbl.oftc.net) Quit ()
[18:24] * georgem (~Adium@206.108.127.16) Quit (Quit: Leaving.)
[18:26] * kefu|afk (~kefu@183.193.187.174) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[18:28] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[18:29] * Jyron (~Misacorp@185.100.85.236) has joined #ceph
[18:37] * sudocat (~dibarra@192.185.1.20) has joined #ceph
[18:40] * wjw-freebsd2 (~wjw@176.74.240.1) Quit (Ping timeout: 480 seconds)
[18:42] * Be-El (~blinke@nat-router.computational.bio.uni-giessen.de) Quit (Quit: Leaving.)
[18:43] * dgurtner (~dgurtner@178.197.233.176) Quit (Read error: Connection reset by peer)
[18:45] * shylesh__ (~shylesh@45.124.226.133) has joined #ceph
[18:49] * zaitcev (~zaitcev@c-50-130-189-82.hsd1.nm.comcast.net) has joined #ceph
[18:59] * Jyron (~Misacorp@06SAADLCA.tor-irc.dnsbl.oftc.net) Quit ()
[19:06] * vasu (~vasu@c-73-231-60-138.hsd1.ca.comcast.net) has joined #ceph
[19:06] * rraja (~rraja@121.244.87.117) Quit (Quit: Leaving)
[19:11] * dvanders (~dvanders@130.246.253.64) Quit (Ping timeout: 480 seconds)
[19:11] * mykola (~Mikolaj@193.93.217.33) has joined #ceph
[19:13] * Brochacho (~alberto@2601:243:504:6aa:894b:e78b:5c11:849a) has joined #ceph
[19:13] * rmart04 (~rmart04@support.memset.com) Quit (Ping timeout: 480 seconds)
[19:15] * swami1 (~swami@106.216.155.3) has joined #ceph
[19:17] * joshd1 (~jdurgin@71-92-201-212.dhcp.gldl.ca.charter.com) Quit (Quit: Leaving.)
[19:21] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[19:22] * karnan (~karnan@106.51.128.50) has joined #ceph
[19:22] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[19:22] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:3df9:f187:182b:27cf) Quit (Ping timeout: 480 seconds)
[19:27] * overclk (~quassel@117.202.96.41) Quit (Remote host closed the connection)
[19:27] * MACscr1 (~Adium@c-73-9-230-5.hsd1.il.comcast.net) has joined #ceph
[19:29] * luigiman (~MatthewH1@tor-exit.insane.us.to) has joined #ceph
[19:29] * derjohn_mob (~aj@46.189.28.94) has joined #ceph
[19:31] * agsha (~agsha@124.40.246.234) has joined #ceph
[19:32] * matj345314 (~matj34531@element.planetq.org) has joined #ceph
[19:33] * MACscr (~Adium@c-73-9-230-5.hsd1.il.comcast.net) Quit (Ping timeout: 480 seconds)
[19:33] * dyasny (~dyasny@cable-192.222.152.136.electronicbox.net) has joined #ceph
[19:34] <GooseYArd> Walex: haha nice
[19:34] <GooseYArd> (the link)
[19:37] * shylesh__ (~shylesh@45.124.226.133) Quit (Remote host closed the connection)
[19:38] * bniver (~bniver@nat-pool-bos-u.redhat.com) Quit (Remote host closed the connection)
[19:40] * swami2 (~swami@106.216.155.3) has joined #ceph
[19:42] * swami2 (~swami@106.216.155.3) Quit (Read error: Connection reset by peer)
[19:47] * tumeric (~jcastro@89.152.250.115) Quit (Remote host closed the connection)
[19:47] * swami1 (~swami@106.216.155.3) Quit (Ping timeout: 480 seconds)
[19:53] * matj345314 (~matj34531@element.planetq.org) Quit (Quit: matj345314)
[19:53] * rmart04 (~rmart04@host86-138-245-51.range86-138.btcentralplus.com) has joined #ceph
[19:53] * LeaChim (~LeaChim@host86-148-117-255.range86-148.btcentralplus.com) has joined #ceph
[19:54] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:9135:a55b:27ec:cc48) has joined #ceph
[19:57] * wjw-freebsd (~wjw@smtp.digiware.nl) has joined #ceph
[19:58] * bvi (~Bastiaan@152-64-132-5.ftth.glasoperator.nl) has joined #ceph
[19:59] * luigiman (~MatthewH1@4MJAAF2QP.tor-irc.dnsbl.oftc.net) Quit ()
[19:59] * Grimmer (~Shesh@06SAADLHN.tor-irc.dnsbl.oftc.net) has joined #ceph
[20:03] * fdmanana (~fdmanana@2001:8a0:6e0c:6601:9135:a55b:27ec:cc48) Quit (Ping timeout: 480 seconds)
[20:11] * bara (~bara@nat-pool-brq-t.redhat.com) Quit (Quit: Bye guys!)
[20:12] * rmart04 (~rmart04@host86-138-245-51.range86-138.btcentralplus.com) Quit (Quit: rmart04)
[20:13] * rmart04 (~rmart04@host86-138-245-51.range86-138.btcentralplus.com) has joined #ceph
[20:15] * agsha (~agsha@124.40.246.234) Quit (Remote host closed the connection)
[20:26] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Quit: wes_dillingham)
[20:27] * secate (~Secate@dsl-197-245-164-90.voxdsl.co.za) Quit (Quit: leaving)
[20:28] * TomasCZ (~TomasCZ@yes.tenlab.net) has joined #ceph
[20:29] * Grimmer (~Shesh@06SAADLHN.tor-irc.dnsbl.oftc.net) Quit ()
[20:29] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Remote host closed the connection)
[20:29] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[20:30] <The1_> m0zes__: interesting read..
[20:30] <m0zes__> and unfortunate ;)
[20:31] <m0zes__> I've still got 8 osds down, they've been exporting data since thursday.
[20:31] <The1_> yeah, it seems like a slow and painful way
[20:33] * linjan_ (~linjan@176.195.239.133) Quit (Ping timeout: 480 seconds)
[20:33] <m0zes__> I'm open to better ways ;-) (Say a defragment_on_mount option for the osds, or a recreate_leveldb_on_mount) or a key-value store that isn't so horribly slow when something isn't quite right.
[20:33] * Jaska (~Shesh@tor-node.com) has joined #ceph
[20:33] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[20:39] * rmart04 (~rmart04@host86-138-245-51.range86-138.btcentralplus.com) Quit (Quit: rmart04)
[20:39] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[20:39] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[20:40] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) has joined #ceph
[20:47] * georgem (~Adium@24.114.58.223) has joined #ceph
[20:48] * MACscr (~Adium@c-73-9-230-5.hsd1.il.comcast.net) has joined #ceph
[20:49] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[20:49] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[20:51] * SamYaple (~SamYaple@162.209.126.134) Quit (Quit: leaving)
[20:51] * SamYaple (~SamYaple@162.209.126.134) has joined #ceph
[20:53] * karnan (~karnan@106.51.128.50) Quit (Quit: Leaving)
[20:53] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[20:54] * srk (~oftc-webi@cpe-70-113-23-93.austin.res.rr.com) has joined #ceph
[20:54] * MACscr1 (~Adium@c-73-9-230-5.hsd1.il.comcast.net) Quit (Ping timeout: 480 seconds)
[20:56] * georgem1 (~Adium@24.114.53.216) has joined #ceph
[20:58] * ira (~ira@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[20:58] * georgem (~Adium@24.114.58.223) Quit (Ping timeout: 480 seconds)
[21:03] * Jaska (~Shesh@4MJAAF2UD.tor-irc.dnsbl.oftc.net) Quit ()
[21:04] * KindOne_ (kindone@h254.30.30.71.dynamic.ip.windstream.net) has joined #ceph
[21:06] * cronburg (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[21:08] * debian112 (~bcolbert@24.126.201.64) Quit (Ping timeout: 480 seconds)
[21:09] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[21:09] * KindOne (~KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:09] * KindOne_ is now known as KindOne
[21:09] * cronburg_ (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[21:11] * cronburg (~cronburg@nat-pool-bos-t.redhat.com) Quit ()
[21:12] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Quit: wes_dillingham)
[21:13] * georgem (~Adium@24.114.69.110) has joined #ceph
[21:14] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)
[21:16] * blizzow (~jburns@50.243.148.102) has joined #ceph
[21:18] <blizzow> I shut down an OSD to do some upgrades over a couple of hours. When I restart the OSD, the entire cluster starts blocking requests. ceph -s shows a line like:
[21:18] <blizzow> 813 requests are blocked > 32 sec
[21:18] * wes_dillingham (~wes_dilli@140.247.242.44) has joined #ceph
[21:19] <blizzow> That will stick around for a bit, then go away, then show back up. Is there a way to turn off an OSD for a while, and restart it without pummeling my entire ceph cluster?
[21:19] <blizzow> I have 11 OSDs and 3 mons all running infernalis.
[21:19] * georgem1 (~Adium@24.114.53.216) Quit (Ping timeout: 480 seconds)
[21:21] * rwheeler (~rwheeler@nat-pool-bos-t.redhat.com) Quit (Quit: Leaving)
[21:23] <blizzow> Full ceph -s output... http://pastebin.ca/3619233
[21:24] * jermudgeon (~jhaustin@199.200.6.148) has joined #ceph
[21:25] <The1_> did you set noout before shutting down that osd?
[21:25] * cronburg (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[21:26] <The1_> otherwise your cluster would have begun backfill and recover of all data on that OSD to other OSDs
[21:26] <blizzow> nope. It's not in this... http://docs.ceph.com/docs/jewel/rados/operations/operating/
[21:28] <blizzow> Shutting off an OSD and using the remaining two copies of data to backfill should not cause my entire cluster to poop all over itself.
[21:28] <The1_> then there's your explanation - slow requests due to backfill and the other stuff mentioned
[21:29] <The1_> well.. if you do not tell ceph that it should not rebalance data due to a missing OSD, then it WILL protect the data that's assumed to be lost
[21:29] * Brochacho (~alberto@2601:243:504:6aa:894b:e78b:5c11:849a) Quit (Quit: Brochacho)
[21:30] <The1_> .. and depending on your hardware, network and IO pattern that is more than enough to cause slow IO/blocked requests
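A minimal sketch of the noout workflow The1_ is describing for planned maintenance on one OSD host; the osd id and systemd unit name are placeholders for whatever init system the host runs:

    ceph osd set noout             # keep the cluster from marking down OSDs out (no rebalance)
    systemctl stop ceph-osd@<id>   # stop the OSD, do the upgrade/reboot
    # ... maintenance ...
    systemctl start ceph-osd@<id>
    ceph osd unset noout           # restore normal behaviour once the OSD is back up and in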
[21:30] <blizzow> The1_: I get that, but a rebalance with 2 remaining copies of data should not cause such performance degradations.
[21:30] <blizzow> 10Gbe NICs, switch, and SSDs.
[21:30] <blizzow> all on the same switch and I'm still getting brutalized.
[21:32] <wes_dillingham> You could lower osd max backfills (not sure how safe this is to lower mid backfill)
[21:32] <rkeene> wes_dillingham, It's safe to do mid backfill, it just doesn't actually DO anything
[21:32] <blizzow> Each OSD has a 500GB SSD for storage.
[21:32] <wes_dillingham> rkeene: it's a totally un-honored configuration?
[21:33] <rkeene> wes_dillingham, I think it's only read when backfills start -- it works if you set it before backfilling starts
[21:33] * KapiteinKoffie (~Kayla@85.93.218.204) has joined #ceph
[21:33] * jermudgeon (~jhaustin@199.200.6.148) Quit (Quit: jermudgeon)
[21:33] <wes_dillingham> what about lowering osd recovery max active ?
[21:34] <srk> I think, changing recovery max active is possible on a running cluster.
[21:35] <srk> might as well reduce the max recovery threads, as the default value causes a recovery storm
[21:35] <rkeene> wes_dillingham, Yes, changing "osd recovery max active" and "osd max backfills" doesn't DO anything while a backfill is going on
[21:36] * stiopa (~stiopa@cpc73832-dals21-2-0-cust453.20-2.cable.virginm.net) has joined #ceph
[21:36] <wes_dillingham> good to know
[21:36] <rkeene> Well, at least nothing that I observed -- this is based on observations of setting those during a backfill storm and before one
[21:36] <rkeene> Not actually looking at the source
[21:37] <wes_dillingham> rkeene: I am about to do some stress testing so i will add that to the list
[21:37] <rkeene> I recently started encrypting my OSDs, and to do this I just delete an OSD and let it backfill; with the default settings and 30 OSDs on 3 nodes I had recovery I/O kill user I/O
[21:37] <rkeene> delete an OSD and re-add it with an encrypted filesystem
[21:39] <SamYaple> rkeene: is that on jewel?
[21:39] <rkeene> SamYaple, No, I'm still on Hammer (0.94.7)
[21:39] <srk> DId any one upgrade to Jewel and see performance improvements?
[21:39] <SamYaple> in jewel that shouldnt be an issue anymore with proper io queueing
[21:40] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Remote host closed the connection)
[21:40] <rkeene> Yeah, it's on the list -- I'm upgrading to OpenNebula 5.0 first, but the next release of my appliance will have Ceph 10.2 and OpenNebula 5.0
[21:41] * derjohn_mob (~aj@46.189.28.94) Quit (Ping timeout: 480 seconds)
[21:41] <SamYaple> did 5.0 already release? i thought it was still beta
[21:41] <rkeene> It's still in beta, but I have many patches to port forward
[21:41] <rkeene> It's going to take a week or so to integrate everything
[21:42] <SamYaple> ah. was worried i missed something
[21:43] <srk> I'm seeing a degradation after upgrading Ceph Monitor, osd, openstack cinder and qemu-kvm compute hosts to Jewel..
[21:43] <rkeene> I haven't even looked into the Ceph upgrade, someone else will likely take care of that
[21:45] <srk> When only the ceph osd nodes were upgraded, I didn't see a problem. But as soon as the compute and openstack controllers were upgraded, I saw a reduction in 4k randread/randwrite (using fio)
[21:45] <srk> Anyone know if qemu version need to be upgraded for Ceph Jewel?
[21:46] <srk> The1_ ?
[21:47] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[21:47] <srk> SamYaple: Did you try ceph with bcache ?
[21:47] * jermudgeon (~jhaustin@199.200.6.147) has joined #ceph
[21:48] <SamYaple> srk: i run ceph with bcache, yes
[21:48] <SamYaple> i do journal+data on a /dev/bcache* volume
[21:50] <srk> Oh nice.. I've got /dev/bcache as the osd, and the bcache cache device + journal on a SSD
[21:50] <SamYaple> yea i had to patch the bcache module. by default you can't do partitions. i havent submitted a patch to the kernel for this yet
[21:50] <srk> What kind of volume size do your customers use?
[21:51] <SamYaple> srk: wrote a thing about it if youre interested http://yaple.net/2016/03/31/bcache-partitions-and-dkms/
[21:51] <SamYaple> srk: small, 20GB-50GB or large 1TB+
[21:51] <SamYaple> no in between
[21:51] <srk> I see good performance up to 10GB. Seeing degradation after that
[21:52] <SamYaple> that seems odd. perhaps a new default was changed (objectmap or something) in jewel?
[21:52] <SamYaple> look at the rbds and their features for new rbds vs existing
[21:53] <srk> I see this with Hammer as well
[21:54] <SamYaple> oh that means youre probably using up your writeback cache on the compute node srk
[21:54] <srk> right
[21:54] <SamYaple> moar ram
[21:55] <srk> more ram on computes?
[21:56] * jermudgeon (~jhaustin@199.200.6.147) Quit (Ping timeout: 480 seconds)
[21:56] <SamYaple> always. but i dont know thats your issue srk
[21:56] <srk> SamYaple: I got: [client] rbd cache = true rbd cache writethrough until flush = true rbd concurrent management ops = 20
[21:56] <srk> that is in ceph.conf
[21:57] <SamYaple> then you're only using the defaults, like 32 or 64MB of writeback cache for starters
[21:57] <SamYaple> but it depends on your use case. if its a throwaway machine when it crashes you should destroy it and start a new one, in which case you can really jack that number up
[21:57] <SamYaple> the VM that is
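For reference, the writeback-cache knobs SamYaple is alluding to sit in the same [client] section as the options srk already has; a sketch with illustrative sizes (the ~32MB default he mentions is from memory), not a recommendation:

    # ceph.conf, [client] section -- illustrative values only
    #   rbd cache = true
    #   rbd cache writethrough until flush = true
    #   rbd cache size = 134217728        # 128MB, up from the ~32MB default
    #   rbd cache max dirty = 100663296   # must stay below rbd cache size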
[21:58] <srk> would you mind sharing your bcache tuning?
[21:59] <SamYaple> sure lemme check what i have. its pretty basic, i only change seqwrites i believe...
[22:00] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[22:01] <SamYaple> srk: sequential_cutoff = 0 and writeback_percent = 40 (the max) is all i tweak on startup
[22:01] <srk> SamYaple: Thanks.. This is the bcache ceph implementation that I use
[22:01] <srk> https://github.com/blueboxgroup/ursula/blob/master/roles/ceph-osd/tasks/bcache.yml
[22:01] <SamYaple> mmm ursula
[22:02] * Skaag (~lunix@65.200.54.234) Quit (Quit: Leaving.)
[22:03] * d_shizzzzle (~d_shizzzz@128.104.173.229) has joined #ceph
[22:03] * d_shizzzzle (~d_shizzzz@128.104.173.229) Quit ()
[22:03] <SamYaple> fyi you can enable writeback mode here https://github.com/blueboxgroup/ursula/blob/master/roles/ceph-osd/tasks/bcache.yml#L46 its one of the few persistent settings and can be enabled at creation
[22:03] * KapiteinKoffie (~Kayla@06SAADLMM.tor-irc.dnsbl.oftc.net) Quit ()
[22:04] <srk> SamYaple: Where to set those numbers you suggested?
[22:04] <SamYaple> doesnt look like you are actually tweaking ceph at all so you should be fine. cant really explain your 10GB issue, i suspect some issue with flushing all those random writes
[22:04] * georgem (~Adium@24.114.69.110) Quit (Quit: Leaving.)
[22:04] <SamYaple> srk: in /sys like normal
[22:04] <SamYaple> /sys/block or /sys/fs/bcache
[22:06] <srk> my current values are : 4M and 10%
[22:06] <SamYaple> yea those are defaults
[22:06] <srk> ok, will try changing to 0 and 40
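A sketch of the sysfs settings SamYaple describes, plus the persistent writeback mode he points at in the ursula role; bcache0 is a placeholder and the values are his, not general recommendations:

    # per bcache device, at boot or from a startup script/udev rule:
    echo 0  > /sys/block/bcache0/bcache/sequential_cutoff   # don't bypass the cache for sequential io
    echo 40 > /sys/block/bcache0/bcache/writeback_percent   # the maximum bcache allows, per SamYaple
    # writeback mode is one of the persistent settings and can also be flipped here:
    echo writeback > /sys/block/bcache0/bcache/cache_mode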
[22:07] * johnavp1989 (~jpetrini@8.39.115.8) Quit (Ping timeout: 480 seconds)
[22:07] <srk> I still need the [client] settings in ceph.conf for rbd cache = true and writethrough = true ?
[22:08] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Remote host closed the connection)
[22:08] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[22:08] <srk> Someone suggested that "rbd cache writethrough until flush = true" might keep writethrough active even though rbd cache is set to true.
[22:10] <srk> SamYaple: ^^
[22:10] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Remote host closed the connection)
[22:10] * hellertime (~Adium@72.246.3.14) Quit (Quit: Leaving.)
[22:10] * Skaag (~lunix@65.200.54.234) has joined #ceph
[22:11] <SamYaple> yea its been a while since i looked at that srk, so at the risk of providing bad information, i believe that was used when the cache got full before flush
[22:11] <SamYaple> i dont have the best memory of it, but I set it as well
[22:11] <srk> both are set to True?
[22:11] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[22:11] <SamYaple> yea
[22:12] <srk> ok, Thank you
[22:17] * matj345314 (~matj34531@element.planetq.org) has joined #ceph
[22:18] * mykola (~Mikolaj@193.93.217.33) Quit (Quit: away)
[22:18] * rendar (~I@host159-59-dynamic.22-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[22:19] * johnavp1989 (~jpetrini@8.39.115.8) has joined #ceph
[22:19] <- *johnavp1989* To prove that you are human, please enter the result of 8+3
[22:21] * matj345314 (~matj34531@element.planetq.org) Quit ()
[22:22] * scg (~zscg@valis.gnu.org) Quit (Ping timeout: 480 seconds)
[22:25] * wushudoin (~wushudoin@2601:646:9501:d2b2:2ab2:bdff:fe0b:a6ee) Quit (Quit: Leaving)
[22:26] * wushudoin (~wushudoin@2601:646:9501:d2b2:2ab2:bdff:fe0b:a6ee) has joined #ceph
[22:31] * bvi (~Bastiaan@152-64-132-5.ftth.glasoperator.nl) Quit (Ping timeout: 480 seconds)
[22:33] * tuhnis (~MatthewH1@4MJAAF21K.tor-irc.dnsbl.oftc.net) has joined #ceph
[22:34] * gregmark (~Adium@68.87.42.115) has joined #ceph
[22:35] * Annttu (annttu@0001934a.user.oftc.net) Quit (Remote host closed the connection)
[22:39] * cronburg (~cronburg@nat-pool-bos-t.redhat.com) Quit (Ping timeout: 480 seconds)
[22:39] * georgem (~Adium@24.114.77.83) has joined #ceph
[22:41] * Annttu (annttu@0001934a.user.oftc.net) has joined #ceph
[22:45] * rendar (~I@host159-59-dynamic.22-79-r.retail.telecomitalia.it) has joined #ceph
[22:46] * kjetijor (kjetijor@microbel.pvv.ntnu.no) has joined #ceph
[22:47] * cronburg (~cronburg@nat-pool-bos-t.redhat.com) has joined #ceph
[22:48] * shaunm (~shaunm@cpe-192-180-17-174.kya.res.rr.com) Quit (Ping timeout: 480 seconds)
[22:49] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[22:53] * dcwangmit01_ (~dcwangmit@162-245.23-239.PUBLIC.monkeybrains.net) Quit (Quit: leaving)
[22:53] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Remote host closed the connection)
[22:53] * dcwangmit01 (~dcwangmit@162-245.23-239.PUBLIC.monkeybrains.net) has joined #ceph
[22:53] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[22:54] * untoreh (~fra@151.50.200.100) has joined #ceph
[22:57] * _28_ria (~kvirc@opfr028.ru) has joined #ceph
[22:58] * bniver (~bniver@71-9-144-29.static.oxfr.ma.charter.com) Quit (Remote host closed the connection)
[23:03] * tuhnis (~MatthewH1@4MJAAF21K.tor-irc.dnsbl.oftc.net) Quit ()
[23:03] * wes_dillingham (~wes_dilli@140.247.242.44) Quit (Ping timeout: 480 seconds)
[23:07] * Wahmed (~wahmed@206.174.203.195) Quit (Quit: Nettalk6 - www.ntalk.de)
[23:08] * Miouge (~Miouge@h-4-155-222.a163.priv.bahnhof.se) Quit (Quit: Miouge)
[23:08] * Kizzi (~Silentkil@192.42.115.101) has joined #ceph
[23:09] * georgem (~Adium@24.114.77.83) Quit (Read error: Connection reset by peer)
[23:14] * owasserm (~owasserm@2001:984:d3f7:1:5ec5:d4ff:fee0:f6dc) Quit (Ping timeout: 480 seconds)
[23:14] * matj345314 (~matj34531@element.planetq.org) has joined #ceph
[23:16] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) Quit (Remote host closed the connection)
[23:18] * joelc (~joelc@cpe-24-28-78-20.austin.res.rr.com) has joined #ceph
[23:26] <kjetijor> Does ceph-mds carry any changing/mutable state on local filesystem that's required for i.e. daemon restart ? (Context: looking at running ceph-mds inside a docker container)
[23:27] <gregsfortytwo> it does not
[23:28] <kjetijor> thanks :)
[23:33] <flaf> Hi. Imagine I want to reboot the active mds server of a cephfs cluster (I have the other mds stand-by state).
[23:34] <flaf> What is the best way to minimize the little freeze in the cephfs client side?
[23:34] <flaf> a) stop the mds daemon and reboot the server
[23:34] * Wahmed (~wahmed@206.174.203.195) has joined #ceph
[23:34] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[23:35] <flaf> b) or "ceph mds fail 0" and reboot the server
[23:35] <flaf> ?
[23:35] <gregsfortytwo> either of those should be fine, although you'll want to put the other mds in standby-replay first
[23:36] <flaf> gregsfortytwo: ah and how can I do that?
[23:36] <gregsfortytwo> mmm, you might have to stop it and set the config value appropriately
[23:37] <gregsfortytwo> that's generally a good idea to do anyway though; it keeps the cache warm in case a failover is required
[23:38] * Kizzi (~Silentkil@4MJAAF225.tor-irc.dnsbl.oftc.net) Quit ()
[23:38] * N3X15 (~Grimhound@static-ip-85-25-103-119.inaddr.ip-pool.com) has joined #ceph
[23:39] <flaf> gregsfortytwo: so if I understand well you recommend to set "mds standby replay = fale" for one mds, correct?
[23:39] <flaf> s/fale/false/
[23:39] <gregsfortytwo> you want to set it to true
[23:40] <gregsfortytwo> so that it *does* do standby-replay
[23:40] <flaf> Err... yes one mds, at least, with "mds standby replay = true", sorry.
[23:40] <gregsfortytwo> yeah
[23:41] <gregsfortytwo> probably all of them really, unless you have one specifically you want to be MDS most of the time and are going to go to the effort of making sure it's active
[23:41] <flaf> Ok, "standby replay" is a state between "standby" and "active", correct. Well supported I guess.
[23:43] <flaf> gregsfortytwo: so I can put "mds standby replay = true" directly in the [global] section or in the [mds] section, correct?
[23:43] * dgurtner (~dgurtner@178.197.239.98) has joined #ceph
[23:43] <gregsfortytwo> yeah
[23:44] <flaf> gregsfortytwo: ok, thx.
[23:44] <gregsfortytwo> standby-replay will make the mds follow the journal of the active MDS; it keeps the cache warm and will let it go through the restart+replay process faster in case of failover
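A minimal sketch of the configuration gregsfortytwo describes together with the failover command flaf mentions; the section placement is as discussed above, and the standby daemon needs a restart to pick the setting up:

    # ceph.conf, [mds] (or [global]) section:
    #   mds standby replay = true
    # restart the standby mds so it re-registers in standby-replay, then to fail over:
    ceph mds fail 0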
[23:46] <flaf> Ok, it's perfect. I didn't know that. I guess the negative side is that a mds in "standby-replay" state uses resources (ie RAM and CPU), contrary to the standby state. But it's not a problem for me.
[23:47] <m0zes__> you'd want the standby mds servers to have the ram and cpu to takeover in the event of a failure anyway.
[23:47] <gregsfortytwo> yeah, and a little bit of read IOPS in your RADOS cluster
[23:48] <gregsfortytwo> but if you're redlining that you're going to be in trouble anyway
[23:48] <m0zes__> you'd want the
[23:49] <flaf> Ok, thx. ;)
[23:51] * kjetijor (kjetijor@microbel.pvv.ntnu.no) Quit (Quit: removing-old-iso-8859-1-sins)
[23:52] * matj345314 (~matj34531@element.planetq.org) Quit (Quit: matj345314)
[23:55] * kjetijor (kjetijor@microbel.pvv.ntnu.no) has joined #ceph
[23:55] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) Quit (Quit: My Mac has gone to sleep. ZZZzzz...)
[23:56] * m0zes__ (~mozes@n117m02.cis.ksu.edu) Quit (Ping timeout: 480 seconds)
[23:57] * LeaChim (~LeaChim@host86-148-117-255.range86-148.btcentralplus.com) Quit (Remote host closed the connection)
[23:58] * vata (~vata@207.96.182.162) Quit (Quit: Leaving.)
[23:59] * MentalRay (~MentalRay@office-mtl1-nat-146-218-70-69.gtcomm.net) has joined #ceph
[23:59] * rendar (~I@host159-59-dynamic.22-79-r.retail.telecomitalia.it) Quit (Quit: std::lower_bound + std::less_equal *works* with a vector without duplicates!)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.