#ceph IRC Log

IRC Log for 2013-12-01

Timestamps are in GMT/BST.

[0:00] <Gugge-47527> symmcom: what did you do?
[0:03] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[0:04] <symmcom> Gugge: I changed the IP addresses of the MONs through /etc/network/interfaces, instead of the proposed Ceph way of adding new MONs and removing old ones
[0:05] <symmcom> when i saw what a mess i created, i put old IP back but now #ceph command does not work
[0:11] * ScOut3R (~scout3r@dsl51B69BF7.pool.t-online.hu) Quit ()
[0:14] <Guyou> not sure if that can help, but in my really first experience, I noticed that mkceph store IP on some database
[0:14] <kraken> ≖_≖
[0:15] <Guyou> changing IP in config files seems not enough
[0:15] <Guyou> good luck
[0:15] * Guyou (~bonnefil@mrb31-1-88-184-0-166.fbx.proxad.net) has left #ceph
[0:16] <symmcom> thats what i read on CEPH docs, MON really does not care about conf file IPs. it stores them somewhere. The docs says i can download Mon Map, change what i have to change, then inject the Mon Map back into cluster. But the problem is authtool wont even work now to get the map
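The monmap procedure symmcom refers to can be sketched as a dry-run plan. This is only an illustration, not the exact steps from the docs: because the `ceph` command cannot reach the monitors here, it assumes the offline route, in which the monitor daemon is stopped and its map is extracted with `ceph-mon --extract-monmap`, edited with `monmaptool`, and injected back. The monitor id `a`, the address `192.0.2.10:6789`, and the path `/tmp/monmap` are hypothetical placeholders. The function only builds and prints the commands; it does not run them, and the flags should be checked against the documentation for the installed release.

```python
import shlex


def monmap_repair_plan(mon_id, stale, fresh, path="/tmp/monmap"):
    """Build the command lines for an offline monmap edit on one stopped monitor.

    mon_id: id of the (stopped) monitor whose store is edited.
    stale:  names of monitor entries to remove from the map.
    fresh:  mapping of monitor name -> "ip:port" to add back.
    """
    plan = [
        # Pull the current map out of the monitor's local store.
        ["ceph-mon", "-i", mon_id, "--extract-monmap", path],
        # Show what the monitor currently believes.
        ["monmaptool", "--print", path],
    ]
    # Drop entries that carry the wrong address.
    plan += [["monmaptool", "--rm", name, path] for name in stale]
    # Re-add them with the address that should be used.
    plan += [["monmaptool", "--add", name, addr, path] for name, addr in fresh.items()]
    # Write the edited map back into the monitor's store.
    plan.append(["ceph-mon", "-i", mon_id, "--inject-monmap", path])
    return plan


if __name__ == "__main__":
    # Hypothetical monitor name and address, for illustration only.
    for argv in monmap_repair_plan("a", ["a"], {"a": "192.0.2.10:6789"}):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the plan first makes it easy to review each step before anything touches a monitor store; the same edit would have to be repeated for every monitor.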
[0:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[0:20] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[0:20] * DarkAce-Z (~BillyMays@50.107.53.200) has joined #ceph
[0:25] * DarkAceZ (~BillyMays@50.107.53.200) Quit (Ping timeout: 480 seconds)
[0:25] * DarkAce-Z is now known as DarkAceZ
[0:29] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[0:56] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[1:12] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[1:16] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[1:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[1:20] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[1:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[1:32] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[1:34] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[1:36] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[1:38] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[1:40] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[1:45] <symmcom> can anybody still help me figure out my MON issue
[1:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[2:01] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[2:09] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[2:19] * sarob (~sarob@2601:9:7080:13a:883e:2517:2425:c3d1) has joined #ceph
[2:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[2:23] <bloodice> freaking ceph keeps changing the permissions on the ceph directory during installation and then fails
[2:27] * sarob (~sarob@2601:9:7080:13a:883e:2517:2425:c3d1) Quit (Ping timeout: 480 seconds)
[2:27] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[2:30] <pmatulis> i've seen permission changes on /etc/ceph/ceph.conf in the past, but that's it
[2:34] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[2:34] <bloodice> yea that and log
[2:36] <bloodice> its going into sudo to edit things but then it leaves sudo and because it did a sudo the file is owned and locked to root
[2:44] <bloodice> if i run the command under sudo, it asks me for a password that doesn't exist... sigh..
[2:45] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[2:48] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[2:49] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[2:49] <bloodice> it seems the root priv is not taking for the ceph user
[2:49] <bloodice> thats the issue
[2:59] * Shmouel (~Sam@ns1.anotherservice.com) has joined #ceph
[3:00] * Shmouel1 (~Sam@fny94-12-83-157-27-95.fbx.proxad.net) Quit (Ping timeout: 480 seconds)
[3:03] * dmsimard (~Adium@69-165-206-93.cable.teksavvy.com) has joined #ceph
[3:07] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[3:09] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[3:11] <bloodice> its completely ignoring my edits to sudoers, see if a reboot fixes it
[3:13] <bloodice> now the server wont start argh!
[3:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[3:19] * sarob (~sarob@2601:9:7080:13a:54b4:9276:fa4d:2f9d) has joined #ceph
[3:27] * diegows (~diegows@190.190.11.42) Quit (Ping timeout: 480 seconds)
[3:29] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:31] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[3:32] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[3:39] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:45] * dmsimard1 (~Adium@108.163.152.66) has joined #ceph
[3:47] * dmsimard (~Adium@69-165-206-93.cable.teksavvy.com) Quit (Read error: Connection reset by peer)
[3:47] <symmcom> any idea anybody how i can take care authentication error? none of my #ceph commands works. cant check how the cluster doing.
[3:48] * dmsimard (~Adium@69-165-206-93.cable.teksavvy.com) has joined #ceph
[3:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[3:49] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[3:53] * mtanski (~mtanski@cpe-74-65-252-48.nyc.res.rr.com) has joined #ceph
[3:53] * dmsimard1 (~Adium@108.163.152.66) Quit (Ping timeout: 480 seconds)
[3:53] * nigwil_ (~chatzilla@2001:44b8:5144:7b00:d4a9:149d:3e80:2be1) has joined #ceph
[3:54] * mtanski (~mtanski@cpe-74-65-252-48.nyc.res.rr.com) Quit ()
[3:54] * sarob_ (~sarob@2601:9:7080:13a:3536:8ca7:11fd:1306) has joined #ceph
[3:54] * LeaChim (~LeaChim@host86-162-2-255.range86-162.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[3:55] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[3:56] * sarob (~sarob@2601:9:7080:13a:54b4:9276:fa4d:2f9d) Quit (Ping timeout: 480 seconds)
[3:57] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[4:00] * nigwil (~chatzilla@2001:44b8:5144:7b00:6870:5c5b:c52e:e0cd) Quit (Ping timeout: 480 seconds)
[4:06] <bloodice> omg... seriously going to put my fist through the screen
[4:06] <bloodice> ubuntu will not grant root priv to ceph user
[4:06] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[4:17] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[4:17] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[4:17] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[4:18] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[4:18] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[4:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[4:31] * sarob_ (~sarob@2601:9:7080:13a:3536:8ca7:11fd:1306) Quit (Ping timeout: 480 seconds)
[4:31] * dmsimard (~Adium@69-165-206-93.cable.teksavvy.com) Quit (Quit: Leaving.)
[4:36] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[4:43] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[4:46] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[4:47] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) Quit (Ping timeout: 480 seconds)
[4:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[4:51] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[4:52] <aarontc> well thar's your problem ;)
[4:53] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[4:54] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[4:56] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[5:00] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[5:01] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[5:02] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[5:02] <bloodice> its somehow fixed now
[5:02] <bloodice> no idea why
[5:08] <symmcom> i wish my issue would go away just like that on its own :(
[5:10] * Hakisho (~Hakisho@0001be3c.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:12] * Hakisho (~Hakisho@0001be3c.user.oftc.net) has joined #ceph
[5:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[5:19] * sarob (~sarob@2601:9:7080:13a:51eb:a58f:d28e:80d1) has joined #ceph
[5:26] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[5:27] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[5:29] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[5:32] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[5:32] <bloodice> well i kept kicking it in the face and it eventually gave up
[5:33] <bloodice> hopefully rados doesnt do the same thing though
[5:36] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) has joined #ceph
[5:41] * CAPSLOCK2000 (~oftc@2001:610:748:1::8) has joined #ceph
[5:44] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[5:46] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[5:46] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit ()
[5:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[5:49] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[5:55] * sarob (~sarob@2601:9:7080:13a:51eb:a58f:d28e:80d1) Quit (Ping timeout: 480 seconds)
[6:04] <bloodice> anyone know how to properly clear a hard drive that has data on it so that the ceph-deploy osd create command will work on it?
[6:05] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[6:19] * sarob (~sarob@2601:9:7080:13a:714e:af90:87f0:ea4f) has joined #ceph
[6:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[6:21] <bloodice> found it, parted rm 1 2 done
[6:30] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[6:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[6:50] * Tamil (~tamil@38.122.20.226) Quit (Ping timeout: 480 seconds)
[6:51] * dmick (~dmick@38.122.20.226) Quit (Quit: Leaving.)
[6:56] * sarob (~sarob@2601:9:7080:13a:714e:af90:87f0:ea4f) Quit (Ping timeout: 480 seconds)
[6:56] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[7:14] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[7:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[7:33] * Tamil (~tamil@38.122.20.226) has joined #ceph
[7:36] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[7:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[7:51] * Sodo_ (~Sodo@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[7:55] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:01] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[8:17] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[8:17] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) has joined #ceph
[8:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[8:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[8:27] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[8:36] <symmcom> still seeking help from whoever can provide it to figure out MON issue :(
[8:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[8:50] * Pauline (~middelink@2001:838:3c1:1:be5f:f4ff:fe58:e04) Quit (Ping timeout: 480 seconds)
[8:56] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[8:57] * Sodo_ (~Sodo@a88-113-108-239.elisa-laajakaista.fi) Quit (Ping timeout: 480 seconds)
[8:57] * Pauline (~middelink@2001:838:3c1:1:be5f:f4ff:fe58:e04) has joined #ceph
[9:01] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[9:04] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[9:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[9:23] * rendar (~s@host9-176-dynamic.22-79-r.retail.telecomitalia.it) has joined #ceph
[9:29] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[9:41] <bloodice> anyone have radosgw running?
[9:42] <symmcom> seems like it is night time for Ceph IRC user
[9:42] <bloodice> lol yea
[9:43] <symmcom> i think i got bruises from banging my head on the keyboard thousands of times already
[9:43] <bloodice> yea me too
[9:43] <bloodice> i am sooo close to having a working unit
[9:43] <bloodice> i have developers scheduled for 4am but i am stuck argh
[9:44] <symmcom> my unit was working 8 months straight till i made A STUPID mistake yesterday
[9:44] <bloodice> ouch?
[9:45] <symmcom> tried to change IP address of MON node by editing /etc/network/interfaces and MON quorum went boom, although the cluster is still working, because i can access all Virtual machines that are stored on RBD
[9:45] <bloodice> ouch dont change ip lol
[9:46] <symmcom> ya , i know it now. a costly learning process :)
[9:46] <symmcom> whats ur issue
[9:47] <bloodice> ERROR: S3 error: 405 (MethodNotAllowed):
[9:47] <bloodice> when i try to create a bucket
[9:47] <bloodice> i can only find vague references online
[9:48] <symmcom> Rados object bucket?
[9:49] <bloodice> yea
[9:50] <bloodice> ohhh i am doing it wrong
[9:50] <bloodice> err wait
[9:50] <bloodice> not sure
[9:51] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[9:57] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit (Quit: Computer has gone to sleep.)
[9:57] <bloodice> i need to give the user access to the bucket
[9:57] <bloodice> hrm...
[9:57] <bloodice> no idea
[10:07] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[10:11] * ScOut3R (~scout3r@dsl51B69BF7.pool.t-online.hu) has joined #ceph
[10:15] <symmcom> never got that working . my problem was lack of understanding of RADOSGW documentation. i mainly focus on RBD and CephFS
[10:17] <bloodice> its a bit tough, ihave dev site working but umm this prod site is not
[10:18] <bloodice> got it!
[10:18] <bloodice> i needed to change the rados gateway domain
[10:18] <bloodice> holy balls just in time
[10:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[10:19] <bloodice> i have been at this for like 10 hours
[10:20] <bloodice> but its all good now!
[10:20] <bloodice> time to do the haproxy lb
[10:20] <bloodice> do you know anything about bind?
[10:20] <symmcom> i m pulling close to 14 hours now :(
[10:20] <bloodice> ouch
[10:21] <symmcom> bind as i understand is IP Address tied to a Domain Name
[10:21] <bloodice> lol
[10:21] <bloodice> yea i have been four days without any decent sleep
[10:21] <bloodice> spent the last week building a datacenter with the "used" equipment
[10:21] <bloodice> freaking used equipment wasted all my time
[10:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[10:28] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit (Quit: Computer has gone to sleep.)
[10:28] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[10:33] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[10:36] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[10:36] <symmcom> glad it worked out for u , let see how many more hours i need to go before the solution just comes to me
[10:48] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[10:58] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[10:58] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[10:58] * madkiss (~madkiss@089144194208.atnat0003.highway.a1.net) has joined #ceph
[11:02] * madkiss1 (~madkiss@chello062178057005.20.11.vie.surfer.at) Quit (Ping timeout: 480 seconds)
[11:04] * nigwil_ (~chatzilla@2001:44b8:5144:7b00:d4a9:149d:3e80:2be1) Quit (Quit: ChatZilla 0.9.90.1 [Firefox 25.0.1/20131112160018])
[11:13] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[11:15] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[11:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[11:24] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[11:25] * ScOut3R (~scout3r@dsl51B69BF7.pool.t-online.hu) Quit ()
[11:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[11:42] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[11:49] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[12:02] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[12:07] * madkiss1 (~madkiss@chello062178057005.20.11.vie.surfer.at) has joined #ceph
[12:12] * madkiss (~madkiss@089144194208.atnat0003.highway.a1.net) Quit (Ping timeout: 480 seconds)
[12:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 481 seconds)
[12:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[12:27] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[12:43] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[12:44] * Sodo (~Sodo@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[12:50] * diegows (~diegows@190.190.11.42) has joined #ceph
[12:54] * BillK (~BillK-OFT@58-7-120-52.dyn.iinet.net.au) Quit (Read error: Operation timed out)
[12:55] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[12:56] * Sodo (~Sodo@a88-113-108-239.elisa-laajakaista.fi) Quit (Ping timeout: 480 seconds)
[12:57] * LeaChim (~LeaChim@host86-162-2-255.range86-162.btcentralplus.com) has joined #ceph
[12:57] * BillK (~BillK-OFT@58-7-66-254.dyn.iinet.net.au) has joined #ceph
[13:00] * ggreg (~ggreg@int.0x80.net) has joined #ceph
[13:05] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[13:07] <symmcom> can anybody tell me how to fix "error connecting to cluster: PermissionError" that comes up after i try to run #ceph -s?
[13:07] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[13:13] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[13:15] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[13:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[13:30] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[13:35] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Read error: Connection reset by peer)
[13:36] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[13:46] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[13:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[13:52] * diegows (~diegows@190.190.11.42) Quit (Ping timeout: 480 seconds)
[13:56] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[13:56] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[14:18] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[14:18] * glzhao (~glzhao@118.195.65.67) has joined #ceph
[14:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[14:19] * sarob (~sarob@2601:9:7080:13a:19ee:8f01:1dfb:d1d4) has joined #ceph
[14:27] * sarob (~sarob@2601:9:7080:13a:19ee:8f01:1dfb:d1d4) Quit (Ping timeout: 480 seconds)
[14:31] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[14:35] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[14:39] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[14:43] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[14:56] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[14:58] * DarkAceZ (~BillyMays@50.107.53.200) Quit (Ping timeout: 480 seconds)
[15:00] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[15:03] * DarkAceZ (~BillyMays@50.107.53.200) has joined #ceph
[15:04] * BillK (~BillK-OFT@58-7-66-254.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:15] * BillK (~BillK-OFT@58-7-66-254.dyn.iinet.net.au) has joined #ceph
[15:16] * erwan_taf (~erwan@lns-bzn-48f-62-147-157-222.adsl.proxad.net) Quit (Quit: ZNC - http://znc.sourceforge.net)
[15:18] * erwan_taf (~erwan@lns-bzn-48f-62-147-157-222.adsl.proxad.net) has joined #ceph
[15:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[15:19] * sarob (~sarob@2601:9:7080:13a:ec08:8999:38ce:3b67) has joined #ceph
[15:23] * Sodo (~Sodo@a88-113-108-239.elisa-laajakaista.fi) has joined #ceph
[15:26] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[15:26] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[15:35] * BillK (~BillK-OFT@58-7-66-254.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[15:36] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) Quit (Remote host closed the connection)
[15:43] * b1tbkt (~b1tbkt@24-217-192-155.dhcp.stls.mo.charter.com) has joined #ceph
[15:44] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[15:50] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[15:51] * sarob (~sarob@2601:9:7080:13a:ec08:8999:38ce:3b67) Quit (Ping timeout: 480 seconds)
[16:02] <symmcom> i m in desperate need of help to fix "librados:client.admin authentication error (1) Operation not permitted" . anybody?
[16:05] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[16:07] * danieagle (~Daniel@179.186.121.14.dynamic.adsl.gvt.net.br) has joined #ceph
[16:19] * sarob (~sarob@2601:9:7080:13a:5117:abe0:26c9:bc81) has joined #ceph
[16:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[16:27] <pmatulis> sounds like a cephx problem
[16:27] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Read error: Connection reset by peer)
[16:27] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[16:29] <pmatulis> which means the client you're using doesn't have a proper key or the key it does have hasn't been given the permissions for doing what the client is trying to do
[16:30] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[16:31] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[16:31] <symmcom> i have tried to give 777 permission, tried putting it in home folder and used -k option, none works
[16:35] * sarob (~sarob@2601:9:7080:13a:5117:abe0:26c9:bc81) Quit (Ping timeout: 480 seconds)
[16:42] <pmatulis> no, the permissions associated with the key itself
[16:43] <pmatulis> this gives you an overview of keys and their permissions: 'ceph auth list'
[16:45] * Jean-Roger (Jean-Roger@ALille-651-1-30-11.w2-5.abo.wanadoo.fr) has joined #ceph
[16:45] <Jean-Roger> Hi
[16:46] <symmcom> pmatulis: #ceph auth list gives the same authentication error
[16:47] <pmatulis> ah
[16:48] <pmatulis> maybe you should turn off cephx and perform repairs. sounds like a chicken 'n egg problem
[16:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[16:49] <pmatulis> [global]
[16:49] <pmatulis> auth cluster required = none
[16:49] <pmatulis> auth service required = none
[16:49] <symmcom> i disabled cephx in the conf, then rebooted but for some reason it is still giving authentication issue. i double checked ceph.conf /etc/ceph, cephx is disabled
[16:49] <pmatulis> auth client required = none
[16:50] <pmatulis> on all nodes?
[16:50] <symmcom> unfortunately yes, thats what i m not understanding, i only have 3 mons and all behaving same way
[16:51] <pmatulis> sounds fishy to me. afaiu, "client.admin" refers to a key
[16:52] <symmcom> thats ceph.client.admin.keyring right ?
[16:52] <pmatulis> if your cluster name is "ceph", yes
[16:53] <pmatulis> well, the key referred to in that keyring
[16:54] <symmcom> the way the whole thing got messed was i tried to change IP address of all MONs and didnt know its a big NO NO
[16:54] <pmatulis> hmm, i haven't thought of researching that one :/
[16:55] <symmcom> after i realized how badly MON was out of quorum, i put old IP address back and nothing came back to normal.
[16:55] <pmatulis> did you change them all at once? or one at a time, allowing the cluster to adapt after each change?
[16:56] <symmcom> i should mention though, my cluster itself working and all Virtual Machine stored on RBD are 100% functioning. its just that i cannot see the health of cluster and any command requires #ceph
[16:56] <pmatulis> interesting
[16:56] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[16:56] <pmatulis> other clients are working, just not the admin client
[16:57] <symmcom> to make sure that all VMs are indeed working, i m actually chatting here from one of the Windows 7 Virtual Machine in the cluster
[16:57] * DarkAce-Z (~BillyMays@50.107.53.200) has joined #ceph
[16:57] <symmcom> i used MON #1 as admin
[16:57] <symmcom> 3 MONs, 2 MDSs and 2 OSD nodes
[16:57] <pmatulis> ah
[16:58] <pmatulis> what about the ceph command from another node in the cluster? does that work?
[16:59] <symmcom> nope
[16:59] <kraken> http://i.imgur.com/2xwe756.gif
[16:59] * pmatulis prefers a dedicated admin host
[16:59] <pmatulis> and ceph-deploy is borked as well i presume?
[17:00] <symmcom> from other node, it actually sits with blank line for eternity after #ceph -s command
[17:00] <symmcom> ceph-deploy works
[17:00] <pmatulis> ah good
[17:00] <pmatulis> so try to re-admin your admin host
[17:01] <symmcom> #ceph-deploy admin <node> ?
[17:01] * DarkAceZ (~BillyMays@50.107.53.200) Quit (Ping timeout: 480 seconds)
[17:02] <pmatulis> it's been a while but i think so, yeah
[17:03] <symmcom> success! but now ceph -s wont work without sudo. after i do #sudo ceph -s cursor gives a blank line and sits without any response
[17:04] <pmatulis> become the root user and apply strace
[17:05] <pmatulis> i bet it's still trying to contact the old IPs. consider tcpdump as well
[17:06] <symmcom> which strace option should i use
[17:06] <symmcom> tcpdump
[17:13] <symmcom> odd, tcpdump shows activity on an IP address that has not been used in months
[17:14] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Read error: Operation timed out)
[17:16] * hflai_ (hflai@alumni.cs.nctu.edu.tw) Quit (Ping timeout: 480 seconds)
[17:17] <pmatulis> unanswered: did you change them all at once? or one at a time, allowing the cluster to adapt after each change?
[17:18] <symmcom> ummm. i changed all at once.....:(
[17:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[17:20] <pmatulis> d'oh
[17:21] <symmcom> lesson learned, although through a very very bad way :(
[17:21] <pmatulis> tcpdump shows my admin host to be communicating with my leader monitor when invoking 'ceph status'
[17:22] <pmatulis> so your admin is attempting to contact the wrong IP then
[17:22] <symmcom> it is contacting all the right current ones plus old ones
[17:24] <pmatulis> hm, there might be a way to repair this but methinks it might be easier to simply deploy another monitor, possibly on the same host as another monitor if a new host is a problem
[17:25] <symmcom> ok, i m going to try that
[17:26] <pmatulis> thing is, i hope it doesn't confuse the already connected clients
[17:26] <pmatulis> also, we need to get the new host to become the leader i think. i'm not sure how to force that
[17:26] <pmatulis> maybe by turning off the old leader
[17:27] <symmcom> ceph-deploy from old faulty monitor to new node will transfer key and conf needed right ?
[17:28] <pmatulis> actually
[17:29] <pmatulis> ceph-deploy only takes the hostname as argument, but if that hostname already has a monitor...
[17:29] <pmatulis> you might need to look into deploying things manually (i.e. not with c-d)
[17:30] <pmatulis> something i'm not knowledgeable in
[17:30] * glzhao (~glzhao@118.195.65.67) Quit (Quit: leaving)
[17:30] <symmcom> that actually makes sense. i would have used ceph-deploy
[17:31] <pmatulis> too bad you only have 3 MONs. if you take one down and purge it, you will lose quorum
[17:31] <pmatulis> actually no
[17:31] <pmatulis> you will still have 2/3
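pmatulis's correction rests on the majority rule: monitors form a quorum when more than half of the monitors listed in the monmap can reach one another. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def has_quorum(total_mons, reachable_mons):
    """Return True when a strict majority of the monitors in the map is reachable."""
    return reachable_mons > total_mons // 2


if __name__ == "__main__":
    # Three monitors, one taken down and purged: 2 of 3 is still a majority.
    print(has_quorum(3, 2))
    # Three monitors that cannot see each other at all: each one is alone.
    print(has_quorum(3, 1))
```

This only covers the counting; it says nothing about whether the monitors agree on each other's addresses, which is the open question in this conversation.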
[17:31] <symmcom> i think i already lot qourum
[17:31] <symmcom> *lost
[17:32] <pmatulis> if you lost quorum your cluster would cease to function afaiu
[17:32] <symmcom> ah ok
[17:33] <pmatulis> have you tried querying the monitors directly?
[17:33] <symmcom> so behind the scene everything working fine, except the #ceph command which is depended on cephx key
[17:33] <pmatulis> seems like that
[17:34] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[17:35] <symmcom> i have some old pc, i m going to cook up a fresh Ubuntu node and manually install admin and lets see what happens
[17:37] <pmatulis> if you do, you will see if they are in quorum
[17:39] <pmatulis> example, invoked on a monitor: sudo ceph --admin-daemon /var/run/ceph/ceph-mon.mon1.asok mon_status
[17:40] <symmcom> if that command gets invoked that means there is quorum ?
[17:41] * DarkAce-Z is now known as DarkAceZ
[17:42] <pmatulis> no, its output will tell you about quorum as well as IPs of the monitors, according to that particular monitor
[17:43] <pmatulis> it will also tell you if the monitor is a peon or a leader
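The `mon_status` output is JSON, so the fields pmatulis mentions can be pulled out with a few lines of Python. The sketch below is an assumption-laden illustration: the sample document is invented (monitor names `mon1` to `mon3` and the 192.0.2.x addresses are placeholders, not symmcom's data), and it only includes the keys it reads (`name`, `rank`, `state`, `quorum`, `monmap.mons`). Real output contains more fields and may differ between releases.

```python
import json

# Invented sample resembling the shape of mon_status output; not real cluster data.
SAMPLE = """
{
  "name": "mon1",
  "rank": 0,
  "state": "probing",
  "quorum": [],
  "monmap": {
    "epoch": 3,
    "mons": [
      {"rank": 0, "name": "mon1", "addr": "192.0.2.11:6789/0"},
      {"rank": 1, "name": "mon2", "addr": "192.0.2.12:6789/0"},
      {"rank": 2, "name": "mon3", "addr": "192.0.2.13:6789/0"}
    ]
  }
}
"""


def summarize(raw):
    """Reduce a mon_status JSON document to the facts discussed in the channel."""
    status = json.loads(raw)
    return {
        "name": status["name"],
        # "leader" or "peon" when healthy; "probing" or "electing" otherwise.
        "state": status["state"],
        # A monitor is in quorum when its own rank appears in the quorum list.
        "in_quorum": status["rank"] in status["quorum"],
        # Addresses this particular monitor believes its peers have.
        "mon_addrs": {m["name"]: m["addr"] for m in status["monmap"]["mons"]},
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, value)
```

Running the same summary against each monitor's own output and comparing `mon_addrs` with ceph.conf is one way to spot a monitor that still holds stale addresses.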
[17:43] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) has joined #ceph
[17:45] <symmcom> ok, the command shows some data, rank:0, state : Probing and current IP address of all other MONs
[17:45] <pmatulis> pastebin?
[17:46] <pmatulis> i think what happened is that the monitors have lost contact with one another but i'm surprised that the cluster did not heal itself after you put the old IPs back
[17:46] * mtk (~mtk@ool-44c35983.dyn.optonline.net) Quit (Remote host closed the connection)
[17:49] <symmcom> http://pastebin.com/QhepdL5W
[17:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[17:50] * nwl (~levine@atticus.yoyo.org) Quit (Ping timeout: 480 seconds)
[17:50] * ingard (~cake@tu.rd.vc) Quit (Ping timeout: 480 seconds)
[17:51] * mtk (~mtk@ool-44c35983.dyn.optonline.net) has joined #ceph
[17:52] <symmcom> its not a problem to have the Admin and a MON on same node is it ?
[17:55] <Gugge-47527> What is "the Admin" ?
[17:56] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[17:56] <symmcom> the node with all keys,ceph-deploy and where almost all administrative commands are run i guess
[17:57] <Gugge-47527> you are not sure what you are calling "the Admin"?
[17:58] <Gugge-47527> But no, it is not problem running ceph-deploy on the same machine as a mon or/and osd
[17:59] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[18:00] <symmcom> CEPH docs don't specifically talk about 'Admin Node'. But what i understood is this is the node the initial Cluster was created from and that pushes to the rest of the nodes in the cluster
[18:01] <Gugge-47527> ceph-deploy has to run somewhere, you decide where :)
[18:09] * gdavis331 (~gdavis@38.122.12.254) has joined #ceph
[18:11] * wmat (wmat@wallace.mixdown.ca) has left #ceph
[18:11] * dxd828 (~dxd828@host-92-24-127-29.ppp.as43234.net) has joined #ceph
[18:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[18:20] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[18:21] <pmatulis> yeah, best to use a dedicated host for that
[18:21] <pmatulis> symmcom: so that monitor is in a probing state
[18:22] <symmcom> what is the state it should b in and how can i put it
[18:22] <pmatulis> either 'peon' or 'leader'
[18:23] <pmatulis> the monitors talk to one another and decide who is the leader and so on
[18:23] <symmcom> i m sorry, although i have been using the cluster for several months, i m still learning.... no clue how to change state or which command to use
[18:24] <pmatulis> so this confirms what i said before. the monitors have lost contact with one another
[18:24] <pmatulis> i recommend querying the other 2 monitors as well
[18:24] <pmatulis> what release of ceph is this anyway?
[18:24] * WarrenUsui1 (~Warren@2607:f298:a:607:3d58:9a3b:c9f8:8961) has joined #ceph
[18:24] <symmcom> .72
[18:25] <pmatulis> k
[18:25] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[18:25] <symmcom> just checked the other MONs. all in probing state
[18:25] * allsystemsarego (~allsystem@5-12-240-115.residential.rdsnet.ro) has joined #ceph
[18:26] * nwat (~textual@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[18:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[18:29] <pmatulis> so the IP addresses for monitors in ceph.conf are accurate?
[18:29] <symmcom> yes
[18:30] <pmatulis> and they don't correspond to the output of the direct monitor queries?
[18:30] <pmatulis> the pastebins were truncated
[18:31] * aardvark1 (~Warren@2607:f298:a:607:3d58:9a3b:c9f8:8961) Quit (Ping timeout: 480 seconds)
[18:31] * wusui (~Warren@2607:f298:a:607:3d58:9a3b:c9f8:8961) Quit (Ping timeout: 480 seconds)
[18:31] * WarrenUsui (~Warren@2607:f298:a:607:3d58:9a3b:c9f8:8961) has joined #ceph
[18:32] <pmatulis> but looks like you lost quorum. hm, but rbd clients are working. i wonder if that would be the case if you stopped one of those VMs and restarted it
[18:32] * john_barbee_ (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[18:33] <symmcom> going to try to reboot one of the unimportant VM to find out
[18:33] <pmatulis> if you lost quorum on a monitor (you have on the one you pastebinned) then you might consider purging that host and reinstalling a monitor
[18:33] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[18:34] * john_barbee (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[18:34] * john_barbee_ is now known as john_barbee
[18:34] <pmatulis> symmcom: this is a 'mon status' for a healthy cluster's leader monitor
[18:35] <pmatulis> http://paste.ubuntu.com/6505234/
[18:35] <pmatulis> notice 'quorum'
[18:36] <symmcom> ah i see
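(Editor's sketch, not part of the original conversation.) The check pmatulis describes, looking at 'state' and 'quorum' in a monitor's mon_status output, can be done by eye or with a few lines of Python. The sample JSON below is illustrative only: the monitor names and 192.0.2.x addresses are made up, and the field names are assumed from typical mon_status output rather than taken from the pastebins in this log.

```python
import json

def summarize_mon_status(raw_json):
    """Return (name, state, in_quorum, addrs) from a mon_status JSON string."""
    status = json.loads(raw_json)
    state = status.get("state", "unknown")
    quorum = status.get("quorum", [])
    rank = status.get("rank")
    # A healthy monitor is a 'leader' or 'peon' and its rank appears in 'quorum'.
    in_quorum = state in ("leader", "peon") and rank in quorum
    # Addresses this monitor believes its peers have (from its own monmap copy).
    mons = status.get("monmap", {}).get("mons", [])
    addrs = {m["name"]: m["addr"] for m in mons}
    return status.get("name"), state, in_quorum, addrs

# Illustrative sample only: hypothetical names and documentation-range IPs.
SAMPLE_PROBING = """
{"name": "mon1", "rank": 0, "state": "probing", "election_epoch": 0,
 "quorum": [], "outside_quorum": ["mon1"],
 "monmap": {"epoch": 1, "mons": [
   {"rank": 0, "name": "mon1", "addr": "192.0.2.11:6789/0"},
   {"rank": 1, "name": "mon2", "addr": "192.0.2.12:6789/0"}]}}
"""

if __name__ == "__main__":
    name, state, in_quorum, addrs = summarize_mon_status(SAMPLE_PROBING)
    print(name, state, "in quorum" if in_quorum else "NOT in quorum")
    for mon, addr in sorted(addrs.items()):
        print("  monmap entry:", mon, addr)
```

Comparing the 'monmap entry' addresses against the real interface addresses on each host is the part that matters after an IP change: the monitors use the monmap, not ceph.conf, to find each other.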
[18:39] <symmcom> any command i can use to make them see each other ?
[18:41] <pmatulis> there's probably some way to tell a monitor the IPs of the other ones
[18:41] <symmcom> the VM i rebooted is not starting any more
[18:41] <pmatulis> ah ha
[18:41] <symmcom> so looks like definitely a quorum issue
[18:42] <pmatulis> your pastebin was truncated. it only showed 2 monitor IPs
[18:42] <symmcom> i can repaste it, but i only have 3 MONs
[18:45] <symmcom> ok, my new node for MON is ready. since i m installing the MON manually, which files i should copy there? client.admin.keyring and ceph.conf?
[18:45] <pmatulis> but are those IPs correct? in the pastebin
[18:46] <symmcom> yes, all the IPs are correct
[18:46] <pmatulis> re re-installing, i thought you were going to purge and install a monitor all with ceph-deploy?
[18:47] <pmatulis> re querying, instead of 'mon_status' put 'help'. there are ways to write/set stuff (config set). and not sure what 'sync_force' does
[18:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[18:50] <symmcom> i decided to go with a completely new PC to set up a new mon instead of purging existing ones
[18:50] <symmcom> changing the MONs hardware was in the future plan anyway, so now it seems like a good idea
[18:52] <pmatulis> ok
[18:56] * danieagle_ (~Daniel@186.214.63.175) has joined #ceph
[18:58] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:01] * nwat (~textual@eduroam-240-40.ucsc.edu) has joined #ceph
[19:03] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[19:03] * danieagle (~Daniel@179.186.121.14.dynamic.adsl.gvt.net.br) Quit (Ping timeout: 480 seconds)
[19:06] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[19:08] <symmcom> ls
[19:08] <symmcom> oppss wrong window
[19:09] <symmcom> ok, i m ready to copy necessary files to turn the new node into a MON . i only need to copy 2 files ? client.admin and ceph.conf?
[19:13] <pmatulis> at least, but i have some stuff under /var/lib/ceph, not sure if that needs to be tampered with
[19:14] <pmatulis> http://paste.ubuntu.com/6505431/
[19:14] * pmatulis is now known as pmatulis_afk
[19:14] <pmatulis_afk> i need to go out, bbl
[19:17] * hflai (hflai@alumni.cs.nctu.edu.tw) has joined #ceph
[19:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[19:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[19:28] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[19:31] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:32] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[19:40] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) Quit (Quit: Leaving.)
[19:45] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit (Quit: Computer has gone to sleep.)
[19:46] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[19:49] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[19:50] * ScOut3R (~scout3r@dsl51B69BF7.pool.t-online.hu) has joined #ceph
[19:54] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[19:57] * Cube (~Cube@66-87-65-52.pools.spcsdns.net) has joined #ceph
[20:08] <bloodice> oy screwed up..... i setup an osd server when the hostname was wrong... i just renamed it but the osd tree is still showing all the osd's under the same host. I am not/will not change the server ip though. any way to fix the mapping?
[20:08] <bloodice> all osds are still online
[20:12] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) has joined #ceph
[20:14] * mozg (~andrei@host81-151-251-29.range81-151.btcentralplus.com) has joined #ceph
[20:15] <mozg> hello guys
[20:15] <mozg> i've got several questions about benchmarking; my results do not make sense to me at all
[20:16] <mozg> i've got two servers with 8 osds in one and 9 osds in the second one
[20:16] <mozg> they are all the same hard disk
[20:16] <mozg> hard disks
[20:16] <mozg> i also have two ssd disks for journals in each server
[20:16] <mozg> the servers are linked with 40gbit/s infiniband link using ipoib
[20:17] <mozg> when I am using rados benchmark with various thread levels I am not seeing more than 200MB/s for my writes
[20:18] <mozg> however, I am testing the individual osds using ceph tell osd.N bench and each osd is giving me between 120 - 150MB/s
[20:18] <bloodice> only 40gbs? i want the one with the larger gbs!
[20:19] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[20:19] <mozg> What I am struggling to understand is that if each of my osds can do between 120-150MB/s why am I only seeing around 200MB/s in my rados benchmarks
[20:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[20:19] <mozg> even if I bump up the thread level to 64 or 128 I am still seeing poor results
[20:20] <bloodice> seems to me that the rados can only send at its network speed
[20:20] <mozg> in comparison, the read tests are giving me around 1GB/s
[20:20] <mozg> so, there is no limitation on the network side
[20:21] * KevinPerks (~Adium@cpe-066-026-252-218.triad.res.rr.com) Quit (Ping timeout: 480 seconds)
[20:21] <mozg> bloodice, well, i've done some tests with iperf and I can reach the speeds of between 16 - 20Gbit/s on my ipoib link
[20:22] <mozg> so, I do not see the network being the issue here
[20:22] <bloodice> different networks... gateway issue?
[20:22] <mozg> bloodice, same network
[20:22] <mozg> no gw or fw in between
[20:23] <bloodice> switch?
[20:23] <mozg> as I've said i can reach speeds over 16-20Gbit/s on this link
[20:23] <mozg> using the same servers
[20:23] <mozg> so, no no networking/switch/cable issues there
[20:23] <bloodice> oh you said writes
[20:24] <bloodice> my bad
[20:24] <mozg> with rados reads I can get over 1GB/s
[20:24] <mozg> however, writes are pretty slow
[20:24] <bloodice> can i borrow your network... i have some cheap people here who think bonding is enough
[20:24] <mozg> ))
[20:25] <mozg> you can get one off ebay
[20:25] <bloodice> if its more than a grand, they wont
[20:25] <mozg> secondhand kit should cost you less than 1k
[20:25] <bloodice> i just put in 20 servers they bought for like 10k
[20:25] <mozg> 20 servers for 10k usd?
[20:25] <bloodice> they spend more money on my time trying to fix them so they work than they would just buying new ones
[20:26] <bloodice> lol
[20:26] <bloodice> yea
[20:26] <mozg> what kind of servers are these?
[20:26] <bloodice> dell cloud servers
[20:26] <bloodice> basically pre-blade
[20:26] <symmcom> bloodice: all working ok ?
[20:26] <bloodice> yes!
[20:26] <symmcom> awesome
[20:26] <bloodice> DNS is too
[20:27] <mozg> anyway, anyone have an idea what could be the issue with my write performance?
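(Editor's sketch, not part of the original conversation.) Nobody in the channel answers this question, so what follows is only one hypothesis to test, not a diagnosis. With journals on separate SSDs, every write passes through a journal device before reaching the data disk, and with replication each client write is stored more than once. A rough ceiling is therefore the smaller of the aggregate journal and aggregate disk bandwidth, divided by the replica count. The replica count and the per-SSD figure used below are placeholders that mozg never states; they would have to be measured.

```python
def estimate_write_ceiling_mb_s(journal_devices, journal_mb_s,
                                data_disks, disk_mb_s, replicas):
    """Back-of-envelope client write ceiling in MB/s.

    Assumes every write goes through a journal device first and that each
    client byte is written `replicas` times across the cluster.
    """
    journal_limit = journal_devices * journal_mb_s  # aggregate journal bandwidth
    disk_limit = data_disks * disk_mb_s              # aggregate data-disk bandwidth
    return min(journal_limit, disk_limit) / replicas

if __name__ == "__main__":
    # From the log: 2 servers x 2 journal SSDs, 8 + 9 = 17 OSD disks at ~120 MB/s.
    # ASSUMPTIONS (not from the log): 2 replicas, 100 MB/s sustained per SSD.
    ceiling = estimate_write_ceiling_mb_s(
        journal_devices=4, journal_mb_s=100,
        data_disks=17, disk_mb_s=120, replicas=2)
    print("estimated write ceiling: %.0f MB/s" % ceiling)
```

The point of the arithmetic is that four journal SSDs can become the bottleneck long before seventeen spinning disks do, which would be consistent with fast reads and slow writes. Measuring the SSDs' sustained synchronous write speed directly is the way to confirm or rule this out.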
[20:27] <bloodice> i put the wrong ips in my dhcp server.... i must have been completely dead tired...
[20:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[20:27] <bloodice> hey sym, you know that thing you did....
[20:27] <Sysadmin88> bloodice, how many replicas?
[20:27] <bloodice> well i had to rename a host and now my osd map is still showing the old host name argh
[20:28] <bloodice> i only do 2
[20:28] <bloodice> ... or 2way
[20:28] <bloodice> i have 40 osds ( 4tb each )
[20:28] <symmcom> bloodice: did u change host name through crush?
[20:28] <bloodice> did you fix your issue?
[20:29] <bloodice> nope just renamed the host on the server... i will check into that
[20:29] <bloodice> the IP is not changing... i just named it wrong when i set it up
[20:29] <bloodice> i still have 20 kvm servers i need to get assigned static ips... argh
[20:29] <symmcom> no :( still fighting it. Pmatulis here was helping me, gave me some good pointers. so trying something new
[20:30] <bloodice> the devs are on and other than the haproxy redirect issue(http calling https), are happy
[20:30] <bloodice> hey thats progress!
[20:30] <Sysadmin88> any news on the geo replication features that ceph was implementing?
[20:30] <bloodice> one location at a time for me lol
[20:30] <bloodice> sounds intriguing though...
[20:30] <symmcom> Ya lol me too
[20:31] <bloodice> they need to SERIOUSLY update the rados gateway docs... they dont tell you where to create what...
[20:31] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[20:31] <bloodice> developers make the worst documentarians
[20:32] <bloodice> admins in a hurry do also....lol
[20:32] * xarses (~andreww@c-24-23-183-44.hsd1.ca.comcast.net) has joined #ceph
[20:32] <bloodice> its sunday and we are all here... what does that say about us?
[20:32] <symmcom> still... CEPH has come a very long way, an excellent storage system
[20:32] <symmcom> LOL. we are dedicated!
[20:33] <bloodice> yes, i agree completely... we are planning to expand it to multiple petabytes in the coming months
[20:34] <bloodice> yea lol... and for those who are not, the documentation weeds them out lol
[20:34] <Sysadmin88> i wish i had that kind of hardware to use... i've got one place i'm trying to push virtualization... going very slowly. ceph will be a long way off
[20:34] <bloodice> they bought all used stuff off ebay
[20:34] <symmcom> bloodice: any idea what to do after #ceph-mon --mkfs ....... and #ceph mon add ..... ? how do i start the newly created MON?
[20:35] <bloodice> brought up kvm on it, though we are not using the ceph storage for that.... yet
[20:35] <bloodice> are you doing it manually?
[20:35] <bloodice> i used ceph-deploy
[20:36] <bloodice> ceph-deploy mon create <hostname> ( make sure to add public network entry into ceph.conf if you didnt initially install with that host )
[20:36] <symmcom> doing it manually to avoid all the bad stuff that happened, as somebody suggested here an hour ago
[20:36] <bloodice> err, not 'create', 'add'
[20:37] <bloodice> yea, i had to fight with the ssh authentication in order to get mine up. each time i ran through in the test environment the user/auth setup was killing me
[20:37] <bloodice> but you just issue commands from a central point and its easier to manage in my opinion.
[20:38] <symmcom> ya thats how mine was setup till i killed it :(
[20:38] <bloodice> ack
[20:38] <bloodice> imma look up this crush change thing.. cause i need to fix it before things go live
[20:38] <bloodice> the healthcheck is having a cow
[20:40] <symmcom> ya easy to fix in the beginning
[20:41] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[20:48] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[20:52] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit (Quit: Computer has gone to sleep.)
[20:56] * nigwil (~chatzilla@2001:44b8:5144:7b00:4107:7c27:140d:dc74) has joined #ceph
[20:56] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[21:04] <bloodice> i hate to say this, but i think its going to be easier to remove the osds and then readd them
[21:04] <bloodice> editing the map makes me nervous
[21:04] <bloodice> gonna fix my monitor server first though
[21:11] <Gugge-47527> bloodice: what about "ceph osd crush move" ?
[21:14] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[21:18] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[21:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[21:24] <symmcom> ok, how do i activate newly created MON
[21:26] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[21:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[21:30] * Cube (~Cube@66-87-65-52.pools.spcsdns.net) Quit (Quit: Leaving.)
[21:34] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[21:40] * onizo (~onizo@cpe-75-80-122-116.san.res.rr.com) has joined #ceph
[21:43] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) has joined #ceph
[21:45] <symmcom> can anybody tell me how can i start newly created MON daemon
[21:46] * Cube (~Cube@66-87-65-52.pools.spcsdns.net) has joined #ceph
[21:48] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[21:49] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[21:59] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[22:01] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[22:10] * onizo (~onizo@cpe-75-80-122-116.san.res.rr.com) Quit (Remote host closed the connection)
[22:14] * otisspud (~otisspud@198.15.79.50) has left #ceph
[22:15] * mattt_ (~textual@cpc25-rdng20-2-0-cust162.15-3.cable.virginm.net) Quit (Quit: Computer has gone to sleep.)
[22:17] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[22:19] * pmatulis_afk is now known as pmatulis
[22:19] <pmatulis> symmcom: hi
[22:20] <pmatulis> symmcom: got it going?
[22:22] * thomnico (~thomnico@2a01:e35:8b41:120:c17:240c:3f05:30d5) has joined #ceph
[22:23] <symmcom> i have prepared 2 MON machines, installed ceph, copied client.admin.keyring and ceph.conf
[22:24] <symmcom> did ceph-mon --mkfs....... but now how do i activate the MON
[22:24] * dmsimard (~Adium@108.163.152.66) has joined #ceph
[22:25] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[22:26] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Read error: Operation timed out)
[22:31] * xmltok (~xmltok@cpe-23-240-222-226.socal.res.rr.com) has joined #ceph
[22:42] * jesus (~jesus@emp048-51.eduroam.uu.se) Quit (Ping timeout: 480 seconds)
[22:42] * nwat (~textual@eduroam-240-40.ucsc.edu) Quit (Quit: My MacBook has gone to sleep. ZZZzzz…)
[22:46] * nwat (~textual@eduroam-240-40.ucsc.edu) has joined #ceph
[22:46] * nwat (~textual@eduroam-240-40.ucsc.edu) Quit ()
[22:47] <bloodice> Gugge-47527: interesting idea... is there any documentation on the syntax and capabilities? I cant find anything online.
[22:50] * jesus (~jesus@emp048-51.eduroam.uu.se) has joined #ceph
[22:51] * BillK (~BillK-OFT@58-7-66-254.dyn.iinet.net.au) has joined #ceph
[22:51] <bloodice> nevermind, i found it... lol
[22:56] <pmatulis> symmcom: well, what distro are you using?
[22:56] * dmsimard (~Adium@108.163.152.66) Quit (Quit: Leaving.)
[22:56] * thomnico (~thomnico@2a01:e35:8b41:120:c17:240c:3f05:30d5) Quit (Quit: Ex-Chat)
[22:56] <symmcom> Ubuntu
[22:56] <pmatulis> this will restart all monitors on a ubuntu host, from the host itself run: 'sudo restart ceph-mon-all'
[22:57] <pmatulis> or start|stop|restart
[22:58] * ScOut3R (~scout3r@dsl51B69BF7.pool.t-online.hu) Quit ()
[22:58] <symmcom> ok, after #sudo restart ceph-mon-all it shows ceph-mon-all start/running
[22:58] <symmcom> i still dont have the .asok file under /var/run/ceph/
[22:59] <pmatulis> and no process probably? 'ps ax | grep ceph'
[23:01] <symmcom> which process i m particularly looking for
[23:01] <symmcom> i do not see any ceph-mon running
[23:04] <bloodice> Gugge-47527: awesome worked perfectly!
[23:04] <bloodice> thanks
[23:05] <bloodice> for those who want to know: ceph osd crush set osd.<number> 1.0 root=default host=<hostname> You might have to replace 1.0 with the actual weight if its different
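(Editor's sketch, not part of the original conversation.) bloodice's command has to be repeated once per OSD on the renamed host. The snippet below only builds the argument lists in the syntax quoted above and prints them for review; it does not run anything. The OSD ids 0-3 and the hostname 'node2' are hypothetical, and the weight should be whatever 'ceph osd tree' currently shows for each OSD.

```python
def crush_set_command(osd_id, weight, hostname, root="default"):
    """Build the argv for: ceph osd crush set osd.<id> <weight> root=<root> host=<host>."""
    return ["ceph", "osd", "crush", "set",
            "osd.%d" % osd_id, str(weight),
            "root=%s" % root, "host=%s" % hostname]

if __name__ == "__main__":
    # Hypothetical ids and hostname; print the commands instead of executing them.
    for osd_id in range(4):
        print(" ".join(crush_set_command(osd_id, 1.0, "node2")))
```

Printing first and pasting afterwards keeps a human in the loop, which suits a change that rewrites the CRUSH map of a cluster that is about to go live.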
[23:06] <bloodice> how you doin there symm
[23:07] <bloodice> ps -ef |grep mon :)
[23:08] <symmcom> #ps -ef | grep mon shows this http://pastebin.com/aY1dadiZ
[23:08] <bloodice> next, i guess i need to deal with this: health HEALTH_WARN too few pgs per osd (6 < min 20)
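(Editor's sketch, not part of the original conversation.) For the 'too few pgs per osd' warning, the rule of thumb commonly cited in the Ceph documentation of this period is total PGs = (OSDs x 100) / replicas, rounded up to the next power of two. The function below applies it to the figures bloodice gives earlier (40 OSDs, 2 replicas); the target of 100 per OSD is the rule's default, not a value from this log.

```python
def suggested_total_pgs(num_osds, replicas, target_per_osd=100):
    """Rule-of-thumb PG count: (OSDs * target) / replicas, rounded up to a power of two."""
    raw = num_osds * target_per_osd / float(replicas)
    power = 1
    while power < raw:
        power *= 2
    return power

if __name__ == "__main__":
    # 40 OSDs and 2-way replication, as stated earlier in the log.
    print(suggested_total_pgs(40, 2))
```

The result is a total to spread across the pools that hold most of the data, so it is worth deciding per-pool numbers before raising pg_num and pgp_num on any of them.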
[23:08] <pmatulis> symmcom: nada :(
[23:09] <symmcom> i dont understand why the daemon wont start
[23:09] <pmatulis> http://paste.ubuntu.com/6506550/
[23:09] <pmatulis> symmcom: apply strace
[23:12] <pmatulis> symmcom: for a monitor of id=mon2, but you can also do the ceph-mon-all:
[23:12] <pmatulis> # strace -f -o strace-mon2-restart.txt restart ceph-mon id=mon2
[23:12] <pmatulis> you should prolly use the 'start' command though
[23:13] * AfC (~andrew@2407:7800:400:1011:2ad2:44ff:fe08:a4c) has joined #ceph
[23:15] <pmatulis> hm, here, when using 'start ceph-mon-all' i got
[23:15] <pmatulis> 'job is already running'
[23:15] <pmatulis> i stopped my mon2 and then:
[23:15] <pmatulis> # strace -f -o strace-mon2-start.txt start ceph-mon id=mon2
[23:16] <symmcom> strace produced quite a long txt file
[23:16] <pmatulis> yup
[23:17] <pmatulis> you can look it over quickly to try to spot anything suspicious
[23:17] <pmatulis> mine (from a working start):
[23:17] <pmatulis> http://paste.ubuntu.com/6506573/
[23:17] <symmcom> this is from start http://pastebin.com/Hwz8C1GE
[23:19] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) has joined #ceph
[23:27] * sarob (~sarob@c-50-161-65-119.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[23:29] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[23:34] * allsystemsarego (~allsystem@5-12-240-115.residential.rdsnet.ro) Quit (Quit: Leaving)
[23:41] * BillK (~BillK-OFT@58-7-66-254.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[23:43] <symmcom> possible reason of not running the daemon "mon Keyring not found"; run 'new' to create a new cluster
[23:43] <symmcom> i really really dont want to create a new cluster
[23:47] * BillK (~BillK-OFT@124-148-75-108.dyn.iinet.net.au) has joined #ceph
[23:49] * nigwil (~chatzilla@2001:44b8:5144:7b00:4107:7c27:140d:dc74) Quit (Quit: ChatZilla 0.9.90.1 [Firefox 25.0.1/20131112160018])
[23:50] <pmatulis> hm, did you try to "gather keys" using ceph-deploy?
[23:50] <pmatulis> ceph-deploy gatherkeys <some monitor>
[23:53] <symmcom> ah ok, got the keys
[23:57] <pmatulis> i would think you would need to then get them over to the new monitor or redeploy the monitor
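(Editor's sketch, not part of the original conversation.) One thing worth checking when a manually created monitor will not start under Ubuntu's upstart jobs is the monitor's data directory. This rests on two assumptions: that the default data path /var/lib/ceph/mon/<cluster>-<id> is in use, and that the upstart starter job of this era only launches monitors whose directory contains 'done' and 'upstart' marker files (ceph-deploy creates them; a hand-run 'ceph-mon --mkfs' does not). Verify the second point against /etc/init/ceph-mon-all-starter.conf on the host. The monitor id 'mon4' is hypothetical.

```python
import os

def missing_mon_files(mon_id, cluster="ceph", base="/var/lib/ceph/mon",
                      markers=("keyring", "done", "upstart")):
    """List which expected files are absent from a monitor's data directory."""
    mon_dir = os.path.join(base, "%s-%s" % (cluster, mon_id))
    return [name for name in markers
            if not os.path.exists(os.path.join(mon_dir, name))]

if __name__ == "__main__":
    # 'mon4' is a hypothetical monitor id; substitute the real one.
    missing = missing_mon_files("mon4")
    if missing:
        print("missing from mon data dir:", ", ".join(missing))
    else:
        print("keyring and marker files are present")
```

If only the marker files are missing, that alone would be consistent with 'ceph-mon-all start/running' being reported while no ceph-mon process appears.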

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.