#ceph IRC Log

IRC Log for 2015-04-07

Timestamps are in GMT/BST.

[0:00] <loicd> shivark: now all we have to do is find the difference between your system and mine
[0:01] <shivark> ok
[0:02] <shivark> are you done setting up the cluster?
[0:02] <loicd> shivark: yes
[0:02] * puffy (~puffy@216.207.42.144) Quit (Quit: Leaving.)
[0:02] <loicd> shivark: and rebooted and ceph-osd starts as expected
[0:03] <shivark> ah, really
[0:03] <loicd> http://paste2.org/p6xhPAw1
[0:04] <loicd> $ sudo ls -l /var/lib/ceph/osd/ceph-0/journal
[0:04] <loicd> lrwxrwxrwx 1 root root 58 6 avril 21:57 /var/lib/ceph/osd/ceph-0/journal -> /dev/disk/by-partuuid/ce323479-ace3-453d-9d7e-dbfb73e2a564
[0:05] * loicd tries to modify this manually to match what shivark has
[0:06] * bandrus1 (~brian@117.sub-70-211-78.myvzw.com) Quit (Ping timeout: 480 seconds)
[0:08] <loicd> shivark: could you pastebin the list of packages you have installed ?
[0:08] * starcoder (~Jebula@luna115.startdedicated.net) has joined #ceph
[0:08] <loicd> all of them
[0:08] <sbfox> Hey cephers, anyone using ceph in anger with openstack? could do with some help
[0:08] <loicd> shivark: having /dev/vdc1 instead of the /dev/disk/by-partuuid does not change a thing for me
[0:09] <loicd> shivark: and after rebooting it's all good
[0:10] <shivark> ok
[0:10] <shivark> we also deploy the cluster with ceph-deploy
[0:11] <shivark> but, the journals are getting set up with a label, instead of a uuid
[0:11] * bandrus1 (~brian@50.97.232.157-static.reverse.softlayer.com) has joined #ceph
[0:11] <shivark> This happens on both 80.7 and 80.9 clusters
[0:12] <shivark> do we conclude that the journal needs to have the softlink to the uuid, as you have?
[0:12] * brad_mssw (~brad@66.129.88.50) Quit (Quit: Leaving)
[0:13] <loicd> shivark: it does not seem to make a difference
[0:13] <loicd> after replacing the uuid with vdc1 it all works fine
[0:14] * puffy (~puffy@216.207.42.144) has joined #ceph
[0:14] <shivark> ok
[0:15] <shivark> We got multiple clusters where this issue happened.
[0:15] <shivark> with 80.9
[0:15] * bandrus (~brian@198.23.71.111-static.reverse.softlayer.com) Quit (Ping timeout: 480 seconds)
[0:16] <shivark> need to figure out from our side then?
[0:16] <loicd> shivark: I think we can find what's different
[0:16] <shivark> ok
[0:16] <loicd> could you pastebin the list of packages you have installed ?
[0:16] <shivark> ok
[0:17] <shivark> rpm -qa , right ?
[0:17] * rongze (~rongze@106.39.154.69) has joined #ceph
[0:19] <loicd> shivark: or sudo yum list installed
[0:19] <loicd> shivark: # sgdisk --info 1 /dev/sdr
[0:19] <loicd> Partition GUID code: EBD0A0A2-B9E5-4433-87C0-68B6B72699C7 (Microsoft basic data)
[0:20] <loicd> it should be
[0:20] <loicd> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
[0:20] <shivark> I think we do mklabel MSDOS
[0:21] * oro (~oro@80-219-254-208.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[0:22] <shivark> or gpt
[0:23] <loicd> shivark: how do you mean "we do" ?
[0:23] <loicd> shivark: something in your provisioning system runs that command ?
[0:23] <loicd> what is the output of sudo sgdisk --info 1 /dev/vdb ?
[0:24] <loicd> sorry
[0:24] <loicd> sudo sgdisk --info 1 /dev/sdb
[0:24] <shivark> yes, something in our provisions
[0:25] <shivark> middle of copy paste of rpm -qa, 700+ pkgs :)
[0:25] <loicd> :-)
[0:25] <loicd> JOURNAL_UUID = '45b0969e-9b03-4f30-b4c6-b4b80ceff106'
[0:25] <loicd> OSD_UUID = '4fbd7e29-9d25-41b8-afd0-062c0ceff05d'
[0:25] * rongze (~rongze@106.39.154.69) Quit (Ping timeout: 480 seconds)
[0:25] * dupont-y (~dupont-y@2a01:e34:ec92:8070:9075:21c0:2a0f:dc51) Quit (Quit: Ex-Chat)
[0:26] <loicd> /dev/sdb1 must have OSD_UUID and /dev/sdr1 must have JOURNAL_UUID for the udev logic to work
[0:26] <loicd> otherwise here is what can (and probably does) happen:
[0:27] <loicd> udev notices sdb but sdr is not available yet therefore it fails to start the osd (but that's ok because it will get another chance when sdr comes up)
[0:27] <shivark> http://pastebin.com/9741Qqvb
[0:28] <loicd> udev notices sdr but since it does not have a ceph UUID in the typecode, it ignores it
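
A minimal check of what loicd describes above, using the device names from this conversation; the expected values are the OSD_UUID and JOURNAL_UUID type codes quoted earlier:

    # Verify the partition type GUIDs the ceph udev logic keys on
    sudo sgdisk --info 1 /dev/sdb   # data: expect 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (OSD_UUID)
    sudo sgdisk --info 1 /dev/sdr   # journal: expect 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (JOURNAL_UUID)
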
[0:28] <loicd> shivark: could you pastebinit the output of sudo sgdisk --info 1 /dev/sdb ?
[0:29] <shivark> http://pastebin.com/KwUgN6Bc
[0:30] * loicd does not see anything suspicious in the list of packages
[0:30] <loicd> Partition GUID code: 4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D (Unknown) is OSD_UUID which is good
[0:31] <loicd> shivark: I think you can fix the problem with
[0:31] <loicd> JOURNAL_UUID=45b0969e-9b03-4f30-b4c6-b4b80ceff106
[0:32] <loicd> sudo sgdisk --typecode="1:${JOURNAL_UUID}" /dev/sdr
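
Put together, the fix suggested here looks roughly like this; the partprobe step is an assumption (a plain reboot, as tried below, serves the same purpose):

    # Retag partition 1 of the journal disk with the ceph journal type code
    JOURNAL_UUID=45b0969e-9b03-4f30-b4c6-b4b80ceff106
    sudo sgdisk --typecode="1:${JOURNAL_UUID}" /dev/sdr
    sudo partprobe /dev/sdr            # re-read the partition table, or reboot
    sudo sgdisk --info 1 /dev/sdr      # "Partition GUID code" should now show the journal GUID
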
[0:32] * puffy (~puffy@216.207.42.144) Quit (Quit: Leaving.)
[0:32] <loicd> shivark: can you try that ?
[0:32] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[0:32] <shivark> sure
[0:32] * puffy (~puffy@216.207.42.129) has joined #ceph
[0:33] * B_Rake (~B_Rake@69-195-66-67.unifiedlayer.com) Quit (Ping timeout: 480 seconds)
[0:33] <shivark> The operation has completed successfully.
[0:33] <shivark> so, try rebooting now?
[0:34] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) has joined #ceph
[0:34] <loicd> yes
[0:35] <loicd> that should fix one of the two osds
[0:35] <shivark> ok.. rebooted..
[0:38] * starcoder (~Jebula@5NZAAA8GD.tor-irc.dnsbl.oftc.net) Quit ()
[0:38] * SaneSmith1 (~Vale@enjolras.gtor.org) has joined #ceph
[0:40] * puffy (~puffy@216.207.42.129) Quit (Ping timeout: 480 seconds)
[0:41] <shivark> hmm, no luck.. both osds still down
[0:43] <loicd> shivark: hum
[0:44] <loicd> sudo sgdisk --info 1 /dev/sdr ?
[0:44] <loicd> shivark: ^
[0:45] <shivark> sure
[0:45] * ircolle is now known as ircolle-afk
[0:46] <shivark> http://pastebin.com/ri6cEKsV
[0:47] <loicd> shivark: and sudo ceph-disk list ?
[0:50] <shivark> http://pastebin.com/B18hYRUN
[0:51] <loicd> hum
[0:51] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) Quit (Ping timeout: 480 seconds)
[0:51] <loicd> that's weird
[0:52] <loicd> shivark: could you cat /var/lib/ceph/osd/ceph-8/journal-uuid ?
[0:52] <loicd> it should match # sgdisk --info 1 /dev/sdr
[0:52] <loicd> Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106 (Unknown)
[0:52] <loicd> Partition unique GUID: 0A45987B-DC77-4816-B506-DEE0DC60AE9E
[0:54] <shivark> empty directory right now as the partition is not mounted
[0:54] <shivark> let me mount it
[0:54] <loicd> or you can ceph-disk activate /dev/sdb1
[0:55] * sbfox (~Adium@72.2.49.50) Quit (Ping timeout: 480 seconds)
[0:55] * PerlStalker (~PerlStalk@162.220.127.20) Quit (Quit: ...)
[0:55] * bandrus (~brian@198.23.103.87-static.reverse.softlayer.com) has joined #ceph
[0:56] <shivark> there is no journal-uuid file
[0:57] * bandrus1 (~brian@50.97.232.157-static.reverse.softlayer.com) Quit (Ping timeout: 480 seconds)
[0:58] <loicd> journal_uuid (with an underscore)
[0:58] * bandrus1 (~brian@169.53.156.224-static.reverse.softlayer.com) has joined #ceph
[0:59] <shivark> ls -l j* lrwxrwxrwx 1 root root 9 Mar 30 23:04 journal -> /dev/sdr1
[1:00] <shivark> only one journal file
[1:00] <loicd> shivark: ok. that explains why ceph-disk does not match the journal
[1:00] <loicd> as far as I can tell it's no big deal
[1:01] <loicd> shivark: you can echo 0a45987b-dc77-4816-b506-dee0dc60ae9e > /var/lib/ceph/osd/ceph-8/journal_uuid
[1:02] <loicd> and ceph-disk list should match the journal with the partition
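
As a sketch, restoring the missing file and re-checking; the UUID is the "Partition unique GUID" that sgdisk reported for the journal partition above:

    # Write the journal partition's unique GUID into the osd's journal_uuid file
    echo 0a45987b-dc77-4816-b506-dee0dc60ae9e | sudo tee /var/lib/ceph/osd/ceph-8/journal_uuid
    sudo ceph-disk list   # the data partition should now be listed with its matching journal
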
[1:03] * rendar (~I@host163-180-dynamic.23-79-r.retail.telecomitalia.it) Quit ()
[1:03] <shivark> ok, got it
[1:03] * bandrus (~brian@198.23.103.87-static.reverse.softlayer.com) Quit (Ping timeout: 480 seconds)
[1:04] * dneary (~dneary@nat-pool-bos-u.redhat.com) Quit (Ping timeout: 480 seconds)
[1:04] <loicd> c6ac0ddf91915ba2aeae46d21367f017e18e82cd is the commit that added the journal_uuid file back in feb 2013
[1:04] <loicd> shivark: you made the change and it shows ceph-disk differently ?
[1:05] <shivark> I ran sgdisk --info 1 /dev/sdr again
[1:05] <shivark> Partition unique GUID: 0A45987B-DC77-4816-B506-DEE0DC60AE9E
[1:06] <loicd> it seems unlikely that your osd was created in 2012 though, right ?
[1:06] <shivark> no, it is 2015
[1:06] <shivark> not more than a week old
[1:06] <loicd> the absence of journal_uuid is not a good sign then
[1:07] <loicd> shivark: how was this osd created ?
[1:07] * zack_dolby (~textual@pa3b3a1.tokynt01.ap.so-net.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz???)
[1:07] <loicd> oh, maybe it was created with something that's not ceph-disk !
[1:07] <loicd> which would explain the /dev/sdr1 symlink and the absence of journal_uuid
[1:08] * cpceph (~Adium@67.21.63.155) Quit (Quit: Leaving.)
[1:08] <shivark> Our provisioning partitions the ssd disks first and then runs ceph-deploy to deploy the cluster
[1:08] * SaneSmith1 (~Vale@5NZAAA8G6.tor-irc.dnsbl.oftc.net) Quit ()
[1:08] <loicd> ceph-deploy uses ceph-disk: it would not miss journal_uuid. There must be something else.
[1:09] <loicd> but that's not why it does not start
[1:09] <shivark> ok.
[1:10] <loicd> could you umount the partition and try to ceph-disk activate /dev/sdb1 to check if that works ?
[1:10] <shivark> ceph-deploy osd-prepare --zapdisk
[1:10] <shivark> ceph-disk activate worked before
[1:10] <loicd> ok
[1:11] <shivark> it mounted /dev/sdb1 with osd partition
[1:11] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has joined #ceph
[1:11] * hasues (~hazuez@108-236-232-243.lightspeed.knvltn.sbcglobal.net) has left #ceph
[1:13] <loicd> shivark: let's focus on what udev does then
[1:13] <loicd> could you try
[1:13] <loicd> udevadm test /block/sdb/sdb1
[1:13] <loicd> (note the trailing sdb1 that was missing previously)
[1:14] <shivark> http://pastebin.com/CffkZBaK
[1:14] <loicd> shivark: I'm looking at the output on my rhel6.5
[1:14] <loicd> we see a lot more there
[1:15] * bandrus1 (~brian@169.53.156.224-static.reverse.softlayer.com) Quit (Ping timeout: 480 seconds)
[1:15] * puffy (~puffy@216.207.42.129) has joined #ceph
[1:16] <shivark> ok
[1:18] * rongze (~rongze@106.39.154.69) has joined #ceph
[1:18] <loicd> /lib/udev/rules.d/95-ceph-osd.rules should do the right thing but ... does not
[1:19] <loicd> udevadm test /block/sdr/sdr1
[1:19] <loicd> shivark: what's the output of that ?
[1:19] <shivark> http://pastebin.com/CffkZBaK
[1:19] <shivark> oops
[1:19] <loicd> :-)
[1:22] <shivark> http://pastebin.com/sH2t0Q0T
[1:23] * jcsalem (~Jim@pool-108-49-214-102.bstnma.fios.verizon.net) has joined #ceph
[1:25] * bandrus (~brian@37.sub-70-211-75.myvzw.com) has joined #ceph
[1:26] * rongze (~rongze@106.39.154.69) Quit (Ping timeout: 480 seconds)
[1:29] <shivark> loic, are you still there?
[1:30] <loicd> yes
[1:30] <loicd> reading https://wiki.ubuntu.com/DebuggingUdev
[1:31] <loicd> shivark: let's triple confirm ceph-disk activate works *with the journal*
[1:31] <shivark> sure
[1:31] <loicd> sudo ceph-disk -v activate /dev/sdr1
[1:32] <loicd> should start osd.8
[1:32] <loicd> can you try that ?
[1:32] <shivark> let me shutdown the old one
[1:32] <shivark> I mean osd.8
[1:32] <loicd> ok
[1:34] <loicd> sorry
[1:34] <loicd> I meant
[1:34] <loicd> sudo ceph-disk -v activate-journal /dev/sdr1
[1:35] <loicd> I think it won't work
[1:35] <shivark> it worked
[1:37] <loicd> i'm impressed
[1:38] * Peaced (~sardonyx@herngaard.torservers.net) has joined #ceph
[1:38] <loicd> shivark: can you pastebin the output ?
[1:39] <loicd> shivark: we're getting close, there are not many options left ;-)
[1:40] <shivark> http://pastebin.com/sRYff1Hd
[1:42] <loicd> shivark: let's try calling /usr/sbin/ceph-disk-udev manually
[1:42] <loicd> could you shutdown the osd.8 and call
[1:42] <shivark> ok
[1:43] <loicd> bash -x /usr/sbin/ceph-disk-udev 1 sdr1 sdr
[1:43] <shivark> sure
[1:43] <loicd> and keep the output
[1:43] <loicd> that's what's called from udev
[1:44] <shivark> it came up
[1:44] <loicd> what's the output ?
[1:45] <shivark> http://pastebin.com/WufPYe8F
[1:45] * p66kumar (~p66kumar@74.119.205.248) Quit (Quit: p66kumar)
[1:45] * sjmtest (uid32746@id-32746.uxbridge.irccloud.com) Quit (Quit: Connection closed for inactivity)
[1:45] <loicd> exactly what's expected
[1:45] <loicd> can you try with
[1:45] <loicd> bash -x /usr/sbin/ceph-disk-udev 1 sdb1 sdb
[1:46] <loicd> it should also work
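
For reference, these manual invocations mirror what udev runs on a partition add; the three arguments are the partition number, the partition name and the parent device, and the behaviour noted in the comments is inferred from this conversation:

    # Simulate the udev-triggered call by hand, once per partition
    bash -x /usr/sbin/ceph-disk-udev 1 sdr1 sdr   # journal partition: brings the osd up via activate-journal
    bash -x /usr/sbin/ceph-disk-udev 1 sdb1 sdb   # data partition: brings the osd up via activate
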
[1:47] <shivark> yes, it came up
[1:48] <shivark> just fyi, I got 1 out of 3 mons down.
[1:48] * jcsalem (~Jim@pool-108-49-214-102.bstnma.fios.verizon.net) has left #ceph
[1:48] <shivark> unrelated to this problem
[1:49] <loicd> that should not cause any issue but you may want to not have the IP of the mon that's down in /etc/ceph/ceph.conf
[1:49] <shivark> http://pastebin.com/Q3Yux6Wf
[1:49] <shivark> I was mentioning it because you will see the message in the paste
[1:52] * oms101 (~oms101@p20030057EA632A00EEF4BBFFFE0F7062.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[1:54] <loicd> shivark: sudo udevadm trigger --sysname-match=sdb1
[1:54] <loicd> does that work ?
[1:54] <loicd> I mean does that start the osd.8
[1:55] <shivark> I tried the: ceph-disk-udev 1 sdb1 sdb
[1:55] <shivark> and it started the osd
[1:56] <loicd> shivark: ok
[1:56] <loicd> sudo udevadm trigger --sysname-match=sdb1 should also work
[1:56] <loicd> I tried it on my rhel6.5 and it does
[1:57] * yanzheng (~zhyan@171.216.95.48) has joined #ceph
[1:59] <shivark> ok, let me try
[1:59] <loicd> shivark: what kernel do you have ?
[2:00] * LeaChim (~LeaChim@host86-151-147-249.range86-151.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[2:00] <loicd> I have a good old 2.6.32-431.el6.x86_64 ;-)
[2:00] <loicd> now... the -431 suggests a number of patches that is beyond imagination
[2:00] <shivark> 2.6.32-504.3.3.el6.x86_64 #1 SMP Fri Dec 12 16:05:43 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
[2:01] * puffy (~puffy@216.207.42.129) Quit (Ping timeout: 480 seconds)
[2:01] * moore (~moore@64.202.160.88) Quit (Read error: Connection reset by peer)
[2:01] * moore (~moore@64.202.160.88) has joined #ceph
[2:01] * oms101 (~oms101@p20030057EA018600EEF4BBFFFE0F7062.dip0.t-ipconnect.de) has joined #ceph
[2:02] <shivark> mine is 504 patches then?
[2:03] <loicd> version 504 is not necessarily 504 patches but certainly more than a few dozens ;-)
[2:03] <loicd> did sudo udevadm trigger --sysname-match=sdb1 work ?
[2:03] <shivark> yes
[2:03] <shivark> it worked
[2:03] <loicd> ok
[2:04] <loicd> can you tear it down and try
[2:04] * yanzheng (~zhyan@171.216.95.48) Quit (Quit: This computer has gone to sleep)
[2:04] <loicd> udevadm trigger --sysname-match=sdr1 ?
[2:04] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) has joined #ceph
[2:05] <shivark> that worked too
[2:06] <loicd> do you have a machine installed where v0.80.9 works as expected ? or is it not starting at boot time on all of them ?
[2:06] <shivark> all of them
[2:07] <shivark> it's the other way with 80.7 :)
[2:07] * zack_dolby (~textual@nfmv001067004.uqw.ppp.infoweb.ne.jp) has joined #ceph
[2:08] * bandrus (~brian@37.sub-70-211-75.myvzw.com) Quit (Quit: Leaving.)
[2:08] * Peaced (~sardonyx@98EAAAZ2H.tor-irc.dnsbl.oftc.net) Quit ()
[2:08] * bandrus (~brian@37.sub-70-211-75.myvzw.com) has joined #ceph
[2:11] * puffy (~puffy@216.207.42.144) has joined #ceph
[2:11] * debian112 (~bcolbert@24.126.201.64) Quit (Quit: Leaving.)
[2:12] <loicd> shivark: as a workaround, could you try echo udevadm trigger --sysname-match=sd* >> /etc/rc.local
[2:12] <loicd> and reboot to see if that works ?
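
A sketch of that workaround as it would land in rc.local; quoting the glob keeps the shell from expanding sd* itself, both when appending the line and when rc.local runs:

    # Re-trigger udev for all sd* block devices at the end of boot
    echo 'udevadm trigger --sysname-match="sd*"' >> /etc/rc.local
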
[2:12] * jdillaman (~jdillaman@pool-173-66-110-250.washdc.fios.verizon.net) has joined #ceph
[2:12] <shivark> ok
[2:12] * Crisco1 (~starcoder@37.187.129.166) has joined #ceph
[2:14] <loicd> the next step would be to trace and check if udev is called as it should at boot time but it's a little late for me. I don't think it can be a bug in ceph-disk-udev that would prevent it from working during boot (for instance because there is no PATH environment variable) otherwise it would also fail on my rhel6.5
[2:15] <shivark> I've changed the rc.local and rebooted the system.
[2:15] <shivark> Thanks for working on this so late into your time. It helps a lot towards us making a call on whether to go with 80.9 or not
[2:15] <loicd> shivark: https://wiki.ubuntu.com/DebuggingUdev has instructions on how to get more information about udev during boot time
[2:16] <shivark> Sure, i'll look into that.
[2:17] <loicd> shivark: I'm convinced this is an environmental issue, unrelated to ceph. The partition uuid that was incorrect is a good reason for the osd not to come up.
[2:18] <shivark> you mean journal_uuid being not setup?
[2:19] <loicd> if you're having journals with an incorrect partition code on 0.80.7 you want to fix that, otherwise you'll risk having the same issue if the osd needs to be brought up based on the journal partition
[2:19] <shivark> with the work-around both osds came up
[2:19] <shivark> after reboot
[2:19] <loicd> shivark: not the journal_uuid file, the ptype
[2:20] <shivark> ok, Windows vs Unknown ?
[2:20] <loicd> yes
[2:20] * joef1 (~Adium@2601:9:280:f2e:ad95:f2b7:3817:62f) has joined #ceph
[2:20] <shivark> ok, i'll look more into that tonight
[2:21] <loicd> shivark: cool, I'll dream about it ;-)
[2:21] <shivark> your work-around worked :-)
[2:21] * puffy (~puffy@216.207.42.144) Quit (Quit: Leaving.)
[2:21] <loicd> yeah, that's kind of frustrating but useful
[2:22] <loicd> shivark: http://tracker.ceph.com/issues/8498 is the ticket I was referring to
[2:23] <shivark> ok, got it.
[2:24] <shivark> Thanks again, good night
[2:24] <shivark> hope I don't have to catch you again on this.
[2:25] * joef (~Adium@2620:79:0:2420::6) Quit (Ping timeout: 480 seconds)
[2:27] <loicd> shivark: do you have the same kernel on the v0.80.7 machines and the v0.80.9 machines ?
[2:28] * xarses (~andreww@12.164.168.117) Quit (Ping timeout: 480 seconds)
[2:30] <shivark> yes, same kernel
[2:30] <loicd> super weird
[2:31] * jdillaman (~jdillaman@pool-173-66-110-250.washdc.fios.verizon.net) Quit (Quit: jdillaman)
[2:31] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) has joined #ceph
[2:31] <loicd> night !
[2:34] * shivark (~oftc-webi@32.97.110.54) Quit (Remote host closed the connection)
[2:39] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) Quit (Ping timeout: 480 seconds)
[2:40] * yguang11 (~yguang11@vpn-nat.corp.tw1.yahoo.com) Quit (Ping timeout: 480 seconds)
[2:42] * Crisco1 (~starcoder@98EAAAZ3F.tor-irc.dnsbl.oftc.net) Quit ()
[2:43] * narthollis (~Peaced@h2343030.stratoserver.net) has joined #ceph
[2:45] * bandrus (~brian@37.sub-70-211-75.myvzw.com) Quit (Quit: Leaving.)
[2:47] * yanzheng (~zhyan@171.216.95.48) has joined #ceph
[2:48] * joef1 (~Adium@2601:9:280:f2e:ad95:f2b7:3817:62f) Quit (Quit: Leaving.)
[2:54] * eternaleye (~eternaley@50.245.141.77) Quit (Remote host closed the connection)
[2:55] * moore (~moore@64.202.160.88) Quit (Remote host closed the connection)
[3:04] * neurodrone (~neurodron@pool-100-1-89-227.nwrknj.fios.verizon.net) has joined #ceph
[3:06] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[3:09] * joef (~Adium@c-24-130-254-66.hsd1.ca.comcast.net) has joined #ceph
[3:12] * narthollis (~Peaced@5NZAAA8KX.tor-irc.dnsbl.oftc.net) Quit ()
[3:13] * Pommesgabel (~Mraedis@tor-exit.xshells.net) has joined #ceph
[3:15] * eternaleye (~eternaley@50.245.141.73) has joined #ceph
[3:19] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) has joined #ceph
[3:20] * rongze (~rongze@106.39.154.69) has joined #ceph
[3:23] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Remote host closed the connection)
[3:26] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) has joined #ceph
[3:28] * rongze (~rongze@106.39.154.69) Quit (Ping timeout: 480 seconds)
[3:29] * moore (~moore@97-124-123-201.phnx.qwest.net) has joined #ceph
[3:38] * wushudoin (~wushudoin@209.132.181.86) Quit (Ping timeout: 480 seconds)
[3:41] * Mika_c (~quassel@125.227.22.217) has joined #ceph
[3:42] * root2 (~root@p57B2E90E.dip0.t-ipconnect.de) has joined #ceph
[3:42] * Pommesgabel (~Mraedis@98EAAAZ4V.tor-irc.dnsbl.oftc.net) Quit ()
[3:43] * K3NT1S_aw (~darks@2WVAAA5XC.tor-irc.dnsbl.oftc.net) has joined #ceph
[3:44] * neurodrone (~neurodron@pool-100-1-89-227.nwrknj.fios.verizon.net) Quit (Quit: neurodrone)
[3:49] * root (~root@p57B2EE05.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[3:56] * joef (~Adium@c-24-130-254-66.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[3:56] * joshd1 (~joshd@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[4:12] * K3NT1S_aw (~darks@2WVAAA5XC.tor-irc.dnsbl.oftc.net) Quit ()
[4:13] * Diablothein (~Rehevkor@tor-exit.server9.tvdw.eu) has joined #ceph
[4:18] * kefu (~kefu@114.92.108.72) has joined #ceph
[4:18] * zhaochao (~zhaochao@111.161.17.97) has joined #ceph
[4:33] * rongze (~rongze@182.48.117.114) has joined #ceph
[4:42] * Diablothein (~Rehevkor@98EAAAZ5Y.tor-irc.dnsbl.oftc.net) Quit ()
[4:45] * lucas1 (~Thunderbi@218.76.52.64) has joined #ceph
[4:47] * shang (~ShangWu@175.41.48.77) has joined #ceph
[5:09] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) Quit (Quit: p66kumar)
[5:13] * Kwen (~zapu@tor-exit.server6.tvdw.eu) has joined #ceph
[5:20] * Mika_c (~quassel@125.227.22.217) Quit (Remote host closed the connection)
[5:21] * davidzlap (~Adium@2605:e000:1313:8003:4c44:9477:d73e:cd35) Quit (Quit: Leaving.)
[5:22] * davidzlap (~Adium@cpe-23-242-189-171.socal.res.rr.com) has joined #ceph
[5:31] * KevinPerks (~Adium@cpe-75-177-32-14.triad.res.rr.com) Quit (Quit: Leaving.)
[5:32] * sbfox (~Adium@S0106c46e1fb849db.vf.shawcable.net) has joined #ceph
[5:33] * hellertime1 (~Adium@pool-173-48-154-80.bstnma.fios.verizon.net) Quit (Quit: Leaving.)
[5:37] * Vacuum_ (~vovo@i59F797DD.versanet.de) has joined #ceph
[5:38] * shylesh (~shylesh@121.244.87.124) has joined #ceph
[5:42] * Kwen (~zapu@2WVAAA52V.tor-irc.dnsbl.oftc.net) Quit ()
[5:43] * KungFuHamster (~Pettis@2.tor.exit.bbln.org) has joined #ceph
[5:44] * Vacuum (~vovo@88.130.209.146) Quit (Ping timeout: 480 seconds)
[5:47] <lurbs> Anyone else seen a monitor develop issues (unresponsive, kicked out of quorum, stuck 'synchronizing' on restart) just after a daylight saving (time going back) change?
[5:48] * kuroneko (~kuroneko@2600:3c01::f03c:91ff:fe96:1bfe) Quit (Remote host closed the connection)
[5:49] * kuroneko (~kuroneko@2600:3c01::f03c:91ff:fe96:1bfe) has joined #ceph
[6:02] * m0zes (~mozes@beocat.cis.ksu.edu) Quit (Ping timeout: 480 seconds)
[6:03] * yghannam (~yghannam@0001f8aa.user.oftc.net) Quit (Ping timeout: 480 seconds)
[6:08] * rdas (~rdas@121.244.87.116) has joined #ceph
[6:12] * joshd1 (~joshd@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[6:12] * KungFuHamster (~Pettis@2WVAAA54D.tor-irc.dnsbl.oftc.net) Quit ()
[6:12] * overclk (~overclk@121.244.87.117) has joined #ceph
[6:12] * Random (~osuka_@46.28.202.81) has joined #ceph
[6:13] * kanagaraj (~kanagaraj@121.244.87.117) has joined #ceph
[6:14] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[6:15] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:17] * kefu (~kefu@114.92.108.72) Quit (Max SendQ exceeded)
[6:18] * kefu (~kefu@114.92.108.72) has joined #ceph
[6:22] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) has joined #ceph
[6:27] * davidzlap (~Adium@cpe-23-242-189-171.socal.res.rr.com) Quit (Quit: Leaving.)
[6:28] * davidzlap (~Adium@2605:e000:1313:8003:ccaf:ec69:d9f1:a963) has joined #ceph
[6:28] * lxo (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[6:28] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[6:31] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[6:32] * davidzlap (~Adium@2605:e000:1313:8003:ccaf:ec69:d9f1:a963) Quit ()
[6:35] * rdas (~rdas@121.244.87.116) has joined #ceph
[6:36] * davidzlap (~Adium@cpe-23-242-189-171.socal.res.rr.com) has joined #ceph
[6:37] * vbellur (~vijay@121.244.87.117) has joined #ceph
[6:41] * wschulze (~wschulze@cpe-74-73-11-233.nyc.res.rr.com) Quit (Quit: Leaving.)
[6:42] * derjohn_mob (~aj@ip-95-223-126-17.hsi16.unitymediagroup.de) Quit (Ping timeout: 480 seconds)
[6:42] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) has joined #ceph
[6:42] * Random (~osuka_@1GLAAA1QF.tor-irc.dnsbl.oftc.net) Quit ()
[6:42] * nih (~phyphor@95.130.15.96) has joined #ceph
[6:42] * davidzlap (~Adium@cpe-23-242-189-171.socal.res.rr.com) Quit (Quit: Leaving.)
[6:48] * davidzlap (~Adium@cpe-23-242-189-171.socal.res.rr.com) has joined #ceph
[6:50] * brutuscat (~brutuscat@105.34.133.37.dynamic.jazztel.es) Quit (Ping timeout: 480 seconds)
[6:56] * sbfox (~Adium@S0106c46e1fb849db.vf.shawcable.net) Quit (Quit: Leaving.)
[7:02] * davidzlap (~Adium@cpe-23-242-189-171.socal.res.rr.com) Quit (Quit: Leaving.)
[7:07] * sbfox (~Adium@S0106c46e1fb849db.vf.shawcable.net) has joined #ceph
[7:12] * nih (~phyphor@3OZAAAWTF.tor-irc.dnsbl.oftc.net) Quit ()
[7:15] * kefu (~kefu@114.92.108.72) Quit (Max SendQ exceeded)
[7:16] * kefu (~kefu@114.92.108.72) has joined #ceph
[7:18] * kefu (~kefu@114.92.108.72) Quit ()
[7:22] * shang (~ShangWu@175.41.48.77) Quit (Quit: Ex-Chat)
[7:23] * karnan (~karnan@121.244.87.117) has joined #ceph
[7:25] * MACscr (~Adium@2601:d:c800:de3:f19d:4c12:3088:1412) Quit (Quit: Leaving.)
[7:27] * lalatenduM (~lalatendu@121.244.87.117) has joined #ceph
[7:28] * kefu (~kefu@114.92.108.72) has joined #ceph
[7:30] * Hemanth (~Hemanth@121.244.87.117) has joined #ceph
[7:32] * shang (~ShangWu@175.41.48.77) has joined #ceph
[7:32] * oro (~oro@80-219-254-208.dclient.hispeed.ch) has joined #ceph
[7:32] * Swat- (~Swat-@vps.frasa.net) Quit (Quit: leaving)
[7:42] * kefu (~kefu@114.92.108.72) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[7:43] * w2k (~mason@95.128.43.164) has joined #ceph
[7:45] * pevans (~pevans@c-73-189-80-117.hsd1.ca.comcast.net) has joined #ceph
[7:45] <Anticimex> osd's do support writing larger objects than will fit on journal, right?
[7:47] * pevans (~pevans@c-73-189-80-117.hsd1.ca.comcast.net) Quit ()
[7:48] * pevans (~pevans@c-73-189-80-117.hsd1.ca.comcast.net) has joined #ceph
[7:49] * pevans (~pevans@c-73-189-80-117.hsd1.ca.comcast.net) Quit ()
[7:57] * derjohn_mob (~aj@88.128.80.213) has joined #ceph
[8:07] * Hemanth (~Hemanth@121.244.87.117) Quit (Ping timeout: 480 seconds)
[8:12] * w2k (~mason@98EAAA0AY.tor-irc.dnsbl.oftc.net) Quit ()
[8:16] * Rickus__ (~Rickus@office.protected.ca) has joined #ceph
[8:17] * Lunk2 (~Jones@exit1.ipredator.se) has joined #ceph
[8:20] * Hemanth (~Hemanth@121.244.87.117) has joined #ceph
[8:21] * rotbeard (~redbeard@2a02:908:df10:d300:76f0:6dff:fe3b:994d) Quit (Quit: Verlassend)
[8:23] * Rickus_ (~Rickus@office.protected.ca) Quit (Ping timeout: 480 seconds)
[8:23] * rdas (~rdas@121.244.87.116) Quit (Remote host closed the connection)
[8:26] * Nacer_ (~Nacer@2001:41d0:fe82:7200:78c4:1ebb:82d7:906d) Quit (Remote host closed the connection)
[8:27] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[8:28] * rdas (~rdas@121.244.87.116) has joined #ceph
[8:34] * DV (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[8:36] * derjohn_mob (~aj@88.128.80.213) Quit (Ping timeout: 480 seconds)
[8:38] * shang (~ShangWu@175.41.48.77) Quit (Quit: Ex-Chat)
[8:40] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) Quit (Quit: p66kumar)
[8:42] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) has joined #ceph
[8:47] * cok (~chk@2a02:2350:18:1010:ace7:51f3:af2:ceff) has joined #ceph
[8:47] * Lunk2 (~Jones@2WVAAA6DZ.tor-irc.dnsbl.oftc.net) Quit ()
[8:47] <anorak> hi
[8:52] * roaet (~spidu_@hessel0.torservers.net) has joined #ceph
[8:53] * OnTheRock (~overonthe@199.68.193.62) has joined #ceph
[8:57] * oro (~oro@80-219-254-208.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[9:00] * boredatwork (~overonthe@199.68.193.62) Quit (Ping timeout: 480 seconds)
[9:02] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) has joined #ceph
[9:04] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[9:08] * kawa2014 (~kawa@89.184.114.246) has joined #ceph
[9:09] * dugravot6 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) has joined #ceph
[9:13] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) has joined #ceph
[9:13] <Be-El> hi
[9:16] * DV (~veillard@2001:41d0:1:d478::1) has joined #ceph
[9:16] * moore (~moore@97-124-123-201.phnx.qwest.net) Quit (Remote host closed the connection)
[9:18] * analbeard (~shw@support.memset.com) has joined #ceph
[9:20] * wicope (~wicope@0001fd8a.user.oftc.net) has joined #ceph
[9:21] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[9:21] * roaet (~spidu_@2WVAAA6F1.tor-irc.dnsbl.oftc.net) Quit ()
[9:22] * Plesioth (~Aramande_@5NZAAA8YQ.tor-irc.dnsbl.oftc.net) has joined #ceph
[9:25] * fsimonce (~simon@host178-188-dynamic.26-79-r.retail.telecomitalia.it) has joined #ceph
[9:26] * Kvisle (~tv@tv.users.bitbit.net) has joined #ceph
[9:27] * m0zes (~mozes@beocat.cis.ksu.edu) has joined #ceph
[9:27] * Concubidated (~Adium@71.21.5.251) Quit (Quit: Leaving.)
[9:30] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[9:32] <Kvisle> playing around with a test-cluster with no data of value; one of the nodes died, and will not get back up again ... I already did ceph osd rm on the osd running on the downed node, and it doesn't seem like I can fix the stale pgs anymore
[9:32] <Kvisle> anyone have any ideas?
[9:33] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[9:35] * DV (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[9:36] * shang (~ShangWu@175.41.48.77) has joined #ceph
[9:39] * TMM (~hp@178-84-46-106.dynamic.upc.nl) Quit (Quit: Ex-Chat)
[9:40] * _NiC (~kristian@aeryn.ronningen.no) Quit (Ping timeout: 480 seconds)
[9:41] * kefu (~kefu@114.92.108.72) has joined #ceph
[9:43] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) Quit (Quit: p66kumar)
[9:44] * DV (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[9:51] * Plesioth (~Aramande_@5NZAAA8YQ.tor-irc.dnsbl.oftc.net) Quit (Remote host closed the connection)
[9:51] * W|ldCraze (~tunaaja@tor-elpresidente.piraten-nds.de) has joined #ceph
[9:54] * dugravot6 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) Quit (Quit: Leaving.)
[9:55] * oro (~oro@2001:620:20:16:d84d:b122:4902:e044) has joined #ceph
[9:56] * dugravot6 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) has joined #ceph
[9:57] * gravitystorm (~gravityst@host109-149-148-33.range109-149.btcentralplus.com) has joined #ceph
[9:58] * MACscr (~Adium@2601:d:c800:de3:b866:e2cb:6322:1ce2) has joined #ceph
[10:02] * dugravot61 (~dugravot6@dn-infra-04.lionnois.univ-lorraine.fr) has joined #ceph
[10:06] * sbfox (~Adium@S0106c46e1fb849db.vf.shawcable.net) Quit (Quit: Leaving.)
[10:06] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) Quit (Quit: Ex-Chat)
[10:07] * DV_ (~veillard@2001:41d0:1:d478::1) has joined #ceph
[10:07] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[10:07] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) has joined #ceph
[10:08] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) Quit ()
[10:08] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[10:09] * bigtoch (~bigtoch@41.189.169.250) has joined #ceph
[10:10] * dugravot6 (~dugravot6@nat-persul-plg.wifi.univ-lorraine.fr) Quit (Ping timeout: 480 seconds)
[10:10] * jordanP (~jordan@213.215.2.194) has joined #ceph
[10:11] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[10:11] * wicope (~wicope@0001fd8a.user.oftc.net) Quit (Remote host closed the connection)
[10:12] * MACscr1 (~Adium@2601:d:c800:de3:a4a1:791e:c9bb:c787) has joined #ceph
[10:13] * MACscr (~Adium@2601:d:c800:de3:b866:e2cb:6322:1ce2) Quit (Ping timeout: 480 seconds)
[10:14] * DV (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[10:15] * ngoswami (~ngoswami@121.244.87.116) has joined #ceph
[10:16] * bitserker (~toni@63.pool85-52-240.static.orange.es) has joined #ceph
[10:16] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:19] * DV_ (~veillard@2001:41d0:1:d478::1) Quit (Ping timeout: 480 seconds)
[10:21] * W|ldCraze (~tunaaja@98EAAA0F5.tor-irc.dnsbl.oftc.net) Quit ()
[10:21] * PeterRabbit (~anadrom@2.tor.exit.bbln.org) has joined #ceph
[10:27] * wicope (~wicope@0001fd8a.user.oftc.net) has joined #ceph
[10:29] <Be-El> Kvisle: does the osd still show up in the output of 'ceph osd tree' ?
[10:29] * brutuscat (~brutuscat@93.Red-88-1-121.dynamicIP.rima-tde.net) has joined #ceph
[10:30] <Kvisle> Be-El: it shows up as DNE
[10:30] * DV_ (~veillard@2001:41d0:a:f29f::1) has joined #ceph
[10:31] * b0e (~aledermue@213.95.25.82) has joined #ceph
[10:33] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[10:33] * TMM (~hp@sams-office-nat.tomtomgroup.com) has joined #ceph
[10:34] <Be-El> Kvisle: you can try to remove it from the crush map with 'ceph osd crush remove osd.XX'. this should also trigger the recovery and backfilling
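
The usual full removal sequence for a dead osd, as a sketch; osd.0 turns out below to be the dead one, and Kvisle has already run the last step:

    # Remove a dead osd from the crush map, auth database and osd map
    ceph osd crush remove osd.0   # drop it from crush; this triggers re-peering/backfill
    ceph auth del osd.0           # delete its cephx key
    ceph osd rm 0                 # remove it from the osd map (already done in this case)
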
[10:36] * vbellur (~vijay@121.244.87.117) Quit (Ping timeout: 480 seconds)
[10:51] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[10:51] * PeterRabbit (~anadrom@98EAAA0HE.tor-irc.dnsbl.oftc.net) Quit ()
[10:52] * tom__ (~tom@167.88.45.146) Quit (Remote host closed the connection)
[10:53] <Kvisle> Be-El: it's out of the tree, but it doesn't seem to have any effect on the stale pg's
[10:53] * vbellur (~vijay@121.244.87.124) has joined #ceph
[10:57] * tom (~tom@167.88.45.146) has joined #ceph
[10:58] <Be-El> Kvisle: can you find out which pgs are affected and upload the output of ' ceph pg X.Y query' of one of them to a pastebin?
[11:00] <Kvisle> https://gist.github.com/kvisle/38374f35668e58fe1f04
[11:00] <Kvisle> Be-El: ^
[11:01] <Be-El> how many osds are left in the cluster?
[11:01] <Kvisle> 4
[11:01] <Kvisle> (out of a total of 5)
[11:01] * zack_dolby (~textual@nfmv001067004.uqw.ppp.infoweb.ne.jp) Quit (Quit: My MacBook has gone to sleep. ZZZzzz???)
[11:02] <Be-El> and osd.0 is still alive?
[11:03] * topro (~prousa@host-62-245-142-50.customer.m-online.net) has joined #ceph
[11:04] <Kvisle> no, that's the dead one
[11:05] <Be-El> the problem is that the affected pgs are only available on that osd (see the primary and acting columns)
[11:05] <Be-El> there's no replicate of them on another osd
[11:05] <Be-El> which size and min_size settings do you use for the affected pool?
[11:07] <Kvisle> aha!
[11:07] <Kvisle> so those pg's are bound to a certain pool, that explains it
[11:07] <Be-El> the first part of the pg id is the pool id
[11:08] <Kvisle> I had a bunch of pools, one of them with a min_size / size of 1 --- removed that pool now, there were no data in it
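
A short sketch of the diagnosis path used here: the pool id is the part of the pg id before the dot, and the pool's size/min_size show whether any replica survives the lost osd (the pool name is illustrative):

    # Map stale pgs back to their pool and check its replication settings
    ceph pg dump_stuck stale             # pg ids look like <pool-id>.<hash>
    ceph osd lspools                     # translate the pool id into a pool name
    ceph osd pool get mypool size        # number of replicas
    ceph osd pool get mypool min_size    # replicas required to keep serving I/O
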
[11:09] * m0zes (~mozes@beocat.cis.ksu.edu) Quit (Ping timeout: 480 seconds)
[11:12] <Kvisle> Be-El: thanks, I learned :)
[11:12] <Be-El> the state 'stale+active+clean' is wrong in this case nonetheless. you may want to file a bug report, since the osd should be marked as down in this case
[11:12] <Be-El> you're welcome
[11:13] * kefu (~kefu@114.92.108.72) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[11:14] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[11:20] * rdas (~rdas@121.244.87.116) has joined #ceph
[11:20] * jordanP (~jordan@213.215.2.194) Quit (Quit: Leaving)
[11:21] * KristopherBel (~MonkeyJam@5.9.158.75) has joined #ceph
[11:24] * kefu (~kefu@114.92.108.72) has joined #ceph
[11:27] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) Quit (Ping timeout: 480 seconds)
[11:27] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[11:33] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[11:40] * m0zes (~mozes@beocat.cis.ksu.edu) has joined #ceph
[11:43] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[11:44] * jordanP (~jordan@213.215.2.194) has joined #ceph
[11:51] * KristopherBel (~MonkeyJam@2WVAAA6PE.tor-irc.dnsbl.oftc.net) Quit ()
[11:54] * jcsp1 (~Adium@82-71-16-249.dsl.in-addr.zen.co.uk) has joined #ceph
[11:54] * DiabloD3 (~diablo@exelion.net) has joined #ceph
[11:54] <DiabloD3> hey guys
[11:55] <DiabloD3> lets say someone is using qemu's native ceph support (for live migration purposes)
[11:55] <DiabloD3> and they wanted to use host local ssds to write through cache things
[11:56] * Misacorp (~Sirrush@98EAAA0KI.tor-irc.dnsbl.oftc.net) has joined #ceph
[12:12] <joelm> DiabloD3: you can use rbd cache (only a small buffer). I'm not aware that a hypervisor caching solution exists and, to be honest, you wouldn't want to have large caches that could potentially be corrupted. If anyone does this, let me know :)
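
For reference, the librbd client cache joelm mentions is enabled on the hypervisor side in ceph.conf; a minimal sketch, with option values that are illustrative rather than recommendations:

    # /etc/ceph/ceph.conf on the hypervisor (client side)
    [client]
        rbd cache = true
        rbd cache writethrough until flush = true
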
[12:12] * t0rn (~ssullivan@2607:fad0:32:a02:56ee:75ff:fe48:3bd3) has joined #ceph
[12:13] <joelm> you can live migrate local storage btw
[12:14] <DiabloD3> joelm: yeah, but Im trying to eat the hit of that
[12:14] <DiabloD3> joelm: customer of mine is trying to do it with ovz + that thing's live migrate
[12:16] * KevinPerks (~Adium@cpe-75-177-32-14.triad.res.rr.com) has joined #ceph
[12:16] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[12:19] <DiabloD3> joelm: it takes several minutes to move
[12:19] <joelm> it always will
[12:19] <joelm> if you have a memory footprint that's large
[12:19] <joelm> and the links are not as fast
[12:20] <joelm> a small VM will migrate a lot quicker, as there's less memory to map up to the target node
[12:20] <joelm> storage is one thing, running state is another
[12:21] <joelm> (this isn't the same as locktepping btw, which does keep things in sync)
[12:21] <joelm> *lockstepping
[12:22] * lightspeed (~lightspee@2001:8b0:16e:1:8326:6f70:89f:8f9c) Quit (Ping timeout: 480 seconds)
[12:26] <DiabloD3> joelm: its all the disk doing it
[12:26] * Misacorp (~Sirrush@98EAAA0KI.tor-irc.dnsbl.oftc.net) Quit ()
[12:26] * dux0r (~nastidon@176.10.99.201) has joined #ceph
[12:26] <DiabloD3> joelm: he tested it with tiny ram VMs, they only took forever when they had lots of files (due to chrootness)
[12:27] <joelm> on Ceph?
[12:27] <DiabloD3> no, local to local
[12:27] <joelm> oh well, can't help you there
[12:27] <DiabloD3> which is why I'm looking into ceph for him, so he doesn't need to move everything
[12:32] * lightspeed (~lightspee@2001:8b0:16e:1:8326:6f70:89f:8f9c) has joined #ceph
[12:49] * kefu (~kefu@114.92.108.72) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[12:53] * bilco105 is now known as bilco105_
[12:55] * rwheeler (~rwheeler@pool-173-48-214-9.bstnma.fios.verizon.net) has joined #ceph
[12:56] * dux0r (~nastidon@98EAAA0LK.tor-irc.dnsbl.oftc.net) Quit ()
[12:56] * w2k (~dontron@mail.calyx.com) has joined #ceph
[13:01] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[13:02] * psiekl (psiekl@wombat.eu.org) Quit (Quit: leaving)
[13:07] * fdmanana (~fdmanana@bl13-135-166.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[13:09] * bilco105_ is now known as bilco105
[13:11] <DiabloD3> joelm: can ovz use ceph with live migration?
[13:11] <DiabloD3> I cant find anything on the internet about it
[13:11] <DiabloD3> keeps trying to push parallel's product
[13:12] * lucas1 (~Thunderbi@218.76.52.64) Quit (Quit: lucas1)
[13:13] <joelm> DiabloD3: I have no idea, never used it
[13:13] * joelm just uses libvirt/kvm/opennebula/openstack
[13:14] * vbellur (~vijay@121.244.87.124) Quit (Ping timeout: 480 seconds)
[13:17] * rongze (~rongze@182.48.117.114) Quit (Remote host closed the connection)
[13:18] * rongze (~rongze@182.48.117.114) has joined #ceph
[13:20] * hellertime (~Adium@a23-79-238-10.deploy.static.akamaitechnologies.com) has joined #ceph
[13:21] * overclk (~overclk@121.244.87.117) Quit (Quit: Leaving)
[13:21] <boolman> Can you change pg_num of an active pool?
[13:22] <joelm> sure
[13:23] <joelm> only up though
[13:26] * w2k (~dontron@2WVAAA6UB.tor-irc.dnsbl.oftc.net) Quit ()
[13:26] * rongze (~rongze@182.48.117.114) Quit (Ping timeout: 480 seconds)
[13:26] <boolman> joelm: ok thx
[13:27] <Be-El> boolman: don't forget to adjust pgp_num, too
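
As a sketch, both values are bumped with ceph osd pool set; the pool name and target count are illustrative, and pg_num can only be increased, as noted above:

    # Raise the placement group count of an existing pool, then match pgp_num
    ceph osd pool set rbd pg_num 256
    ceph osd pool set rbd pgp_num 256   # rebalancing only starts once pgp_num follows
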
[13:29] * _NiC (~kristian@aeryn.ronningen.no) has joined #ceph
[13:30] * brannmar (~PcJamesy@tor.laquadrature.net) has joined #ceph
[13:33] * fdmanana (~fdmanana@bl13-135-166.dsl.telepac.pt) has joined #ceph
[13:35] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) has joined #ceph
[13:44] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[13:44] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[13:47] * fdmanana (~fdmanana@bl13-135-166.dsl.telepac.pt) Quit (Ping timeout: 480 seconds)
[13:47] * karnan (~karnan@121.244.87.117) Quit (Quit: Leaving)
[13:48] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[13:57] * derjohn_mob (~aj@fw.gkh-setu.de) has joined #ceph
[13:58] * jdillaman (~jdillaman@pool-173-66-110-250.washdc.fios.verizon.net) has joined #ceph
[14:00] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[14:00] * brannmar (~PcJamesy@2WVAAA6V5.tor-irc.dnsbl.oftc.net) Quit ()
[14:00] * Inverness (~Vidi@tor-exit.server9.tvdw.eu) has joined #ceph
[14:01] * b0e (~aledermue@213.95.25.82) Quit (Ping timeout: 480 seconds)
[14:01] * zok_ (zok@neurosis.pl) Quit (Quit: leaving)
[14:03] <boolman> anyone running ceph rbd on xenserver ?
[14:05] <boolman> is there a native support? or is this the way to go? http://xenserver.org/blog/entry/tech-preview-of-xenserver-libvirt-ceph.html
[14:09] * karnan (~karnan@121.244.87.117) has joined #ceph
[14:10] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[14:12] * vbellur (~vijay@122.172.34.195) has joined #ceph
[14:14] * ksingh (~Adium@2001:708:10:10:7dc3:431f:15b5:6b97) has joined #ceph
[14:16] * yghannam (~yghannam@0001f8aa.user.oftc.net) has joined #ceph
[14:17] * kefu (~kefu@114.92.108.72) has joined #ceph
[14:19] * Concubidated (~Adium@71.21.5.251) has joined #ceph
[14:20] * Concubidated (~Adium@71.21.5.251) Quit ()
[14:22] * shohn (~Adium@dslb-188-102-031-093.188.102.pools.vodafone-ip.de) has joined #ceph
[14:23] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Read error: Connection reset by peer)
[14:25] * shang (~ShangWu@175.41.48.77) Quit (Remote host closed the connection)
[14:25] * wschulze (~wschulze@cpe-74-73-11-233.nyc.res.rr.com) has joined #ceph
[14:27] * fdmanana (~fdmanana@bl13-135-166.dsl.telepac.pt) has joined #ceph
[14:29] * dneary (~dneary@nat-pool-bos-u.redhat.com) has joined #ceph
[14:30] * Shnaw (~andrew_m@strasbourg-tornode.eddai.su) has joined #ceph
[14:30] * Inverness (~Vidi@425AAAHZN.tor-irc.dnsbl.oftc.net) Quit ()
[14:31] * kanagaraj (~kanagaraj@121.244.87.117) Quit (Ping timeout: 480 seconds)
[14:32] * amote (~amote@121.244.87.116) Quit (Quit: Leaving)
[14:36] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[14:37] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) Quit ()
[14:37] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[14:39] * kanagaraj (~kanagaraj@121.244.87.124) has joined #ceph
[14:40] * b0e (~aledermue@213.95.25.82) has joined #ceph
[14:48] * i_m (~ivan.miro@deibp9eh1--blueice3n2.emea.ibm.com) has joined #ceph
[14:50] * sankarshan (~sankarsha@183.87.39.242) has joined #ceph
[14:55] * kanagaraj_ (~kanagaraj@121.244.87.117) has joined #ceph
[14:58] * DV (~veillard@2001:41d0:1:d478::1) has joined #ceph
[14:58] * sjm (~sjm@pool-173-70-76-86.nwrknj.fios.verizon.net) has joined #ceph
[14:59] * brad_mssw (~brad@66.129.88.50) has joined #ceph
[15:00] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[15:00] * Shnaw (~andrew_m@5NZAAA9BI.tor-irc.dnsbl.oftc.net) Quit ()
[15:01] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) Quit (Read error: No route to host)
[15:01] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) has joined #ceph
[15:02] * ninkotech_ (~duplo@static-84-242-87-186.net.upcbroadband.cz) Quit (Ping timeout: 480 seconds)
[15:02] * kanagaraj (~kanagaraj@121.244.87.124) Quit (Ping timeout: 480 seconds)
[15:04] * thomnico_ (~thomnico@2a01:e35:8b41:120:8177:7a44:5fa6:3c9d) has joined #ceph
[15:05] * DV_ (~veillard@2001:41d0:a:f29f::1) Quit (Ping timeout: 480 seconds)
[15:05] * visored (~Sketchfil@tor-exit2-readme.puckey.org) has joined #ceph
[15:06] * karnan (~karnan@121.244.87.117) Quit (Remote host closed the connection)
[15:06] * tupper_ (~tcole@2001:420:2280:1272:8900:f9b8:3b49:567e) has joined #ceph
[15:07] * kanagaraj_ (~kanagaraj@121.244.87.117) Quit (Quit: Leaving)
[15:07] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[15:11] * kefu (~kefu@114.92.108.72) Quit (Max SendQ exceeded)
[15:11] * thomnico (~thomnico@2a01:e35:8b41:120:4576:4fc6:1b9b:34b8) Quit (Ping timeout: 480 seconds)
[15:12] * thomnico_ (~thomnico@2a01:e35:8b41:120:8177:7a44:5fa6:3c9d) Quit (Quit: Ex-Chat)
[15:12] * thomnico (~thomnico@2a01:e35:8b41:120:8177:7a44:5fa6:3c9d) has joined #ceph
[15:12] * kefu (~kefu@114.92.108.72) has joined #ceph
[15:15] * justyns (~justyns@li916-116.members.linode.com) has joined #ceph
[15:19] <loicd> ksingh: I don't know ;-)
[15:19] * yanzheng (~zhyan@171.216.95.48) Quit (Quit: This computer has gone to sleep)
[15:19] <ksingh> :-) thanks
[15:20] * yanzheng (~zhyan@171.216.95.48) has joined #ceph
[15:20] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[15:29] * kapil (~ksharma@2620:113:80c0:5::2222) has joined #ceph
[15:30] * rdas (~rdas@121.244.87.116) Quit (Quit: Leaving)
[15:30] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[15:33] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) has joined #ceph
[15:35] * visored (~Sketchfil@2WVAAA620.tor-irc.dnsbl.oftc.net) Quit ()
[15:37] * marrusl (~mark@cpe-24-90-46-248.nyc.res.rr.com) has joined #ceph
[15:39] * hellertime1 (~Adium@72.246.0.14) has joined #ceph
[15:40] * dyasny (~dyasny@173.231.115.58) has joined #ceph
[15:40] * hellertime (~Adium@a23-79-238-10.deploy.static.akamaitechnologies.com) Quit (Read error: Connection reset by peer)
[15:43] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[15:45] * harold (~hamiller@71-94-227-66.dhcp.mdfd.or.charter.com) has joined #ceph
[15:46] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has joined #ceph
[15:47] * hasues (~hasues@kwfw01.scrippsnetworksinteractive.com) has left #ceph
[15:49] * scuttle|afk is now known as scuttlemonkey
[15:50] * yanzheng (~zhyan@171.216.95.48) Quit (Quit: This computer has gone to sleep)
[16:05] * overclk (~overclk@122.167.106.102) has joined #ceph
[16:05] * Misacorp (~Tarazed@rosaluxemburg.tor-exit.calyxinstitute.org) has joined #ceph
[16:05] * bigtoch_ (~bigtoch@41.189.169.250) has joined #ceph
[16:05] * dynamicudpate (~overonthe@199.68.193.54) has joined #ceph
[16:05] * Nats_ (~natscogs@114.31.195.238) has joined #ceph
[16:07] * boichev2 (~boichev@213.169.56.130) has joined #ceph
[16:09] * fghaas (~florian@nat-pool-brq-u.redhat.com) has joined #ceph
[16:09] * bigtoch (~bigtoch@41.189.169.250) Quit (Read error: No route to host)
[16:09] * schmee_ (~quassel@phobos.isoho.st) Quit (Read error: Connection reset by peer)
[16:09] * dec (~dec@ec2-54-66-50-124.ap-southeast-2.compute.amazonaws.com) Quit (Max SendQ exceeded)
[16:09] * _nick (~nick@zarquon.dischord.org) Quit (Max SendQ exceeded)
[16:09] * ndevos (~ndevos@nat-pool-ams2-5.redhat.com) Quit (Read error: Connection reset by peer)
[16:09] * ndevos_ (~ndevos@nat-pool-ams2-5.redhat.com) has joined #ceph
[16:10] * _nick (~nick@zarquon.dischord.org) has joined #ceph
[16:10] * dec (~dec@ec2-54-66-50-124.ap-southeast-2.compute.amazonaws.com) has joined #ceph
[16:10] * schmee (~quassel@phobos.isoho.st) has joined #ceph
[16:11] * boichev (~boichev@213.169.56.130) Quit (Ping timeout: 481 seconds)
[16:12] * OnTheRock (~overonthe@199.68.193.62) Quit (Ping timeout: 481 seconds)
[16:12] * Nats (~natscogs@114.31.195.238) Quit (Ping timeout: 481 seconds)
[16:13] <loicd> kapil: what version of ceph are you running ?
[16:13] * huihoo (~oftc-webi@45.62.104.34.16clouds.com) has joined #ceph
[16:13] <kapil> Its firefly
[16:13] <huihoo> hello
[16:13] <huihoo> everyone
[16:14] <huihoo> I have a problem with ceph right now
[16:14] <kapil> Hi folks, I see some issues with some of the rados commands but I'm not sure if they are features or bugs and wanted to discuss them here
[16:14] <kapil> example - The ceph pool "data" is already present. if I try to create a new pool named data again, there is no error -
[16:14] <kapil> host-44-0-2-109:~ # rados mkpool data
[16:14] <kapil> successfully created pool data
[16:14] * ngoswami (~ngoswami@121.244.87.116) Quit (Quit: Leaving)
[16:14] * harold (~hamiller@71-94-227-66.dhcp.mdfd.or.charter.com) Quit (Quit: Leaving)
[16:14] * rendar (~I@host157-176-dynamic.35-79-r.retail.telecomitalia.it) has joined #ceph
[16:15] * cok (~chk@2a02:2350:18:1010:ace7:51f3:af2:ceff) Quit (Quit: Leaving.)
[16:15] <kapil> I learned that that's a feature rather than a bug because all commands should be idempotent where possible. However, if you look at rados rmpool, then it's the opposite. It does throw an error message -
[16:15] <kapil> rados rmpool data2 data2 --yes-i-really-really-mean-it
[16:15] <kapil> pool data2 does not exist
[16:15] <kapil> error 2: (2) No such file or directory
[16:15] * rongze (~rongze@106.39.138.252) has joined #ceph
[16:17] <huihoo> @kapil It seems nobody helps here
[16:17] <cephalobot> huihoo: Error: "kapil" is not a valid command.
[16:18] <m0zes> huihoo: for someone that joined less than 5 minutes ago, you seem to be certain no one is around.
[16:18] <m0zes> it would also seem that you didn't ask a question.
[16:18] * topro (~prousa@host-62-245-142-50.customer.m-online.net) Quit (Read error: No route to host)
[16:18] * ircolle-afk is now known as ircolle
[16:19] * blueskin (~a@0001ffcf.user.oftc.net) has joined #ceph
[16:19] <huihoo> sorry for my misbehaviour
[16:20] * topro (~prousa@host-62-245-142-50.customer.m-online.net) has joined #ceph
[16:20] <blueskin> I have one server that was brought down for maintenance and now back up, and all of its osds are stuck at 'booting'
[16:20] <blueskin> seems they aren't catching up with the current epoch in newest_map - e.g. one is still at 3060 where latest is 3120
[16:21] <blueskin> with osd tree, they are all showing as down - am I just being impatient?
[16:21] <m0zes> kapil: destruction of data is a fairly special case. personally, I would want errors returned if the destruction wasn't possible, especially in an automated setting. I'm not sure if that is the reason behind that design, though.
[16:22] <Be-El> blueskin: depending on the number of pgs and objects restarting an osd process may take a significant amount of time
[16:22] <huihoo> I'm setting up my first mon. But when I try to test it with ceph -s, it always complains: Illegal instruction (core dumped).
[16:22] <blueskin> Be-El: thanks; was wondering as I already redid another server in the cluster last week and that one was much faster to come back in
[16:22] <Be-El> blueskin: you can check the logs during startup. in my experience scanning the existing pgs takes the most time
[16:22] <blueskin> 10min rather than 90min and counting
[16:23] <Be-El> 90mins is definitely long....
[16:23] <loicd> kapil: rados lspool shows the pool but you can't delete it, right ?
[16:23] * dyasny (~dyasny@173.231.115.58) Quit (Remote host closed the connection)
[16:23] <m0zes> huihoo: I've seen that before, but not from an enterprise distro. what linux distro are you running the command from? (when I saw it, it was gentoo, and boost had been updated and ceph wasn't rebuilt against the new library)
[16:23] <blueskin> well, I did restart several osds in troubleshooting in that time, but several of them are that old since bringing the server up, I think
[16:24] <blueskin> the one I'm watching now as it's got the highest epoch is perhaps 20min old
[16:24] <huihoo> I'm running arch
[16:24] <huihoo> I'm running ceph on arch
[16:24] * dyasny (~dyasny@173.231.115.58) has joined #ceph
[16:25] <loicd> kapil: ah, now I understand your question :-)
[16:25] <m0zes> does arch provide ceph packages or are they from AUR?
[16:25] <Be-El> blueskin: i also have startup times > 1h on some systems. they are due to btrfs and btrfs cleanups at startup and limited cpu power
[16:25] <huihoo> I build ceph from AUR
[16:25] <blueskin> well, these are xfs rather than btrfs, most of them are new (empty) drives. also, when my osds are catching up, should I have any noout or similar flags set for optimal time?
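
On the noout question, the usual pattern for planned maintenance is to set the flag before taking a node down and to clear it afterwards, roughly:

    # Keep osds from being marked out (and data from rebalancing) during planned maintenance
    ceph osd set noout
    # ... perform the maintenance / reboot ...
    ceph osd unset noout
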
[16:26] * thomnico (~thomnico@2a01:e35:8b41:120:8177:7a44:5fa6:3c9d) Quit (Quit: Ex-Chat)
[16:26] <m0zes> I'd double check that ceph is built against the currently installed libraries, then.
[16:26] <kapil> loicd: looks like this has been fixed in Hammer, I do get proper error message -
[16:26] <kapil> jenkins@host-44-0-2-109:~> rados -p data mkpool data
[16:26] <kapil> error creating pool data: (17) File exists
[16:27] <huihoo> my boost version is 1.57.0-3
[16:27] <huihoo> Is it too old ?
[16:27] <blueskin> lots of messages in my osd logs like "2015-04-07 15:25:21.466907 7f1ab258e700 1 heartbeat_map reset_timeout 'FileStore::op_tp thread 0x7f1ab258e700' had timed out after 4
[16:28] <m0zes> I don't think that it is a problem with being too old, but more of an instance of a library being updated and the things that depend on the library not being rebuilt to use the new ABI.
[16:29] <loicd> kapil: I would prefer rados to be idempotent but since it's mostly used by human beings, I guess it's better to not have it idempotent
[16:29] <m0zes> boost is particularly bad about that, because their ABI changes frequently.
[16:29] <Be-El> blueskin: looks like the disks are busy
[16:29] <loicd> kapil: why does it matter that it's not idempotent ?
[16:30] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[16:30] <Be-El> blueskin: i've no experience with xfs, so i cannot comment on osd startup times
[16:30] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[16:31] <huihoo> maybe I should try to use a old boost version?
[16:31] <kapil> loicd: I think what matters more is consistency. for example in firefly, "rados mkpool" was not throwing an error however, "rados rmpool" was throwing an error. This I think is not right
[16:31] <m0zes> huihoo: or re-install ceph.
[16:31] <loicd> kapil: I agree
[16:32] <huihoo> Is there an easy way to install ceph without compiling it?
[16:32] <huihoo> on arch
[16:32] * shylesh (~shylesh@121.244.87.124) Quit (Remote host closed the connection)
[16:33] <loicd> loicd: I suggest you file a bug, it's a legitimate request (i.e backward compatibility should not be broken)
[16:33] <loicd> kapil: http://tracker.ceph.com/projects/ceph/issues/new
[16:33] * loicd talking to himself ;-)
[16:35] * Misacorp (~Tarazed@1GLAAA19Z.tor-irc.dnsbl.oftc.net) Quit ()
[16:35] * AotC (~colde@marylou.nos-oignons.net) has joined #ceph
[16:35] <kapil> loicd: sure, thanks :-)
[16:36] * PerlStalker (~PerlStalk@162.220.127.20) has joined #ceph
[16:36] * wushudoin (~wushudoin@209.132.181.86) has joined #ceph
[16:38] * _prime_ (~oftc-webi@199.168.44.192) has joined #ceph
[16:38] * wushudoin (~wushudoin@209.132.181.86) Quit ()
[16:38] * wushudoin (~wushudoin@209.132.181.86) has joined #ceph
[16:39] <MaZ-> okay.. so i have a primary and secondary zone in the same region configured for federated rgw, under normal circumstances we only push data to the primary zone (these are 'small' deployments btw). Is there some sort of process for making the secondary zone a primary so I can blow away the current primary? Is it just a case of updating the region / zone config?
[16:40] * dugravot61 (~dugravot6@dn-infra-04.lionnois.univ-lorraine.fr) Quit (Quit: Leaving.)
[16:41] * lpabon (~quassel@24-151-54-34.dhcp.nwtn.ct.charter.com) has joined #ceph
[16:41] * Hemanth (~Hemanth@121.244.87.117) Quit (Ping timeout: 480 seconds)
[16:43] * burley (~khemicals@cpe-98-28-239-78.cinci.res.rr.com) Quit (Read error: Connection reset by peer)
[16:43] * thomnico (~thomnico@2a01:e35:8b41:120:793b:12f5:f699:3969) has joined #ceph
[16:44] * burley (~khemicals@cpe-98-28-239-78.cinci.res.rr.com) has joined #ceph
[16:44] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[16:44] <flaf> Hi, just a question about the journal of an OSD. If a ceph client makes an IO request just for *reading*, the journals of the requested OSDs (primary OSD, the secondary, etc.) are absolutely not used in this case. Right?
[16:45] * sherlocked (~watson@14.139.82.6) has joined #ceph
[16:45] <m0zes> huihoo: probably not.
[16:47] <huihoo> I will rebuild ceph again
[16:47] <huihoo> this time I will make check
[16:48] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[16:53] * togdon (~togdon@74.121.28.6) has joined #ceph
[16:53] * togdon (~togdon@74.121.28.6) Quit ()
[16:55] * linuxkidd (~linuxkidd@vpngac.ccur.com) has joined #ceph
[17:00] * huihoo (~oftc-webi@45.62.104.34.16clouds.com) Quit (Quit: Page closed)
[17:01] <kapil> folks: I would really appreciate it if someone could try "rados cppool <src-pool> <dest-pool>" on their Hammer ceph cluster. It throws an error for me, not sure if this is an upstream issue or something related to our distro only
[17:01] <kapil> http://pastebin.com/gVkbiPLa
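If anyone wants to reproduce kapil's test, a minimal sketch (pool names are placeholders; keep it to throwaway pools):
    $ ceph osd pool create cppool-src 8
    $ ceph osd pool create cppool-dst 8
    $ echo hello > /tmp/obj.txt
    $ rados -p cppool-src put obj1 /tmp/obj.txt
    $ rados cppool cppool-src cppool-dst
    $ rados -p cppool-dst ls        # should list obj1 if the copy succeeded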
[17:02] * fghaas (~florian@nat-pool-brq-u.redhat.com) Quit (Quit: Leaving.)
[17:02] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[17:05] * AotC (~colde@2WVAAA7AF.tor-irc.dnsbl.oftc.net) Quit ()
[17:05] * adept256 (~Nephyrin@ds1789779.dedicated.solnet.ch) has joined #ceph
[17:05] * davidzlap (~Adium@2605:e000:1313:8003:68c4:1457:41bc:7f05) has joined #ceph
[17:06] * mourgaya (~mourgaya@80.124.164.139) has joined #ceph
[17:06] * lalatenduM (~lalatendu@121.244.87.117) Quit (Quit: Leaving)
[17:06] * mourgaya (~mourgaya@80.124.164.139) has left #ceph
[17:06] * ksingh (~Adium@2001:708:10:10:7dc3:431f:15b5:6b97) has left #ceph
[17:11] * analbeard (~shw@support.memset.com) Quit (Quit: Leaving.)
[17:11] * thomnico (~thomnico@2a01:e35:8b41:120:793b:12f5:f699:3969) Quit (Quit: Ex-Chat)
[17:16] * Kioob (~Kioob@200.254.0.109.rev.sfr.net) has joined #ceph
[17:17] * sankarshan (~sankarsha@183.87.39.242) Quit (Quit: Are you sure you want to quit this channel (Cancel/Ok) ?)
[17:18] * bilco105 is now known as bilco105_
[17:21] * adeel (~adeel@fw1.ridgeway.scc-zip.net) Quit (Quit: Leaving...)
[17:21] * bilco105_ is now known as bilco105
[17:27] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) has joined #ceph
[17:28] * b0e (~aledermue@213.95.25.82) Quit (Quit: Leaving.)
[17:30] * blueskin (~a@0001ffcf.user.oftc.net) Quit (Quit: leaving)
[17:30] * Kioob (~Kioob@200.254.0.109.rev.sfr.net) Quit (Quit: Leaving.)
[17:32] * ifur (~osm@0001f63e.user.oftc.net) Quit (Quit: Lost terminal)
[17:33] * TMM (~hp@sams-office-nat.tomtomgroup.com) Quit (Quit: Ex-Chat)
[17:33] * ifur (~osm@0001f63e.user.oftc.net) has joined #ceph
[17:35] * elder (~elder@c-24-245-18-91.hsd1.mn.comcast.net) has left #ceph
[17:35] * adept256 (~Nephyrin@2WVAAA7BV.tor-irc.dnsbl.oftc.net) Quit ()
[17:35] * tunaaja (~Helleshin@171.ip-5-135-148.eu) has joined #ceph
[17:36] * alram (~alram@38.122.20.226) has joined #ceph
[17:40] * rongze (~rongze@106.39.138.252) Quit (Remote host closed the connection)
[17:40] * p66kumar (~p66kumar@c-67-188-232-183.hsd1.ca.comcast.net) Quit (Quit: p66kumar)
[17:41] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Remote host closed the connection)
[17:42] * ChrisNBlum (~ChrisNBlu@178.255.153.117) has joined #ceph
[17:45] * joef (~Adium@2601:9:280:f2e:9d69:5448:100c:ef36) has joined #ceph
[17:45] * zack_dolby (~textual@pa3b3a1.tokynt01.ap.so-net.ne.jp) has joined #ceph
[17:48] * rongze (~rongze@106.39.138.252) has joined #ceph
[17:48] * rongze (~rongze@106.39.138.252) Quit (Remote host closed the connection)
[17:52] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) has joined #ceph
[17:52] * puffy (~puffy@50.185.218.255) has joined #ceph
[17:52] * brutuscat (~brutuscat@93.Red-88-1-121.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[17:52] * rotbeard (~redbeard@b2b-94-79-138-170.unitymedia.biz) Quit (Quit: Leaving)
[17:52] * jwilkins (~jwilkins@2601:9:4580:f4c:ea2a:eaff:fe08:3f1d) Quit (Remote host closed the connection)
[17:53] * Concubidated (~Adium@71.21.5.251) has joined #ceph
[17:56] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) Quit (Quit: Ex-Chat)
[17:57] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[17:57] * Nacer (~Nacer@252-87-190-213.intermediasud.com) has joined #ceph
[17:58] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) Quit ()
[17:58] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[18:01] <devicenull> so, I have a few PGs stuck stale+active+undersized+degraded
[18:01] <devicenull> how the hell do I fix them? everything I've tried has failed
[18:01] * CheKoLyN (~saguilar@bender.parc.xerox.com) has joined #ceph
[18:02] * bandrus (~brian@198.23.71.101-static.reverse.softlayer.com) has joined #ceph
[18:04] * setmason (~setmason@128.107.241.185) has joined #ceph
[18:04] <devicenull> it's also fun, because
[18:04] <devicenull> pg 0.0 is stuck undersized for 1011158.873869, current state stale+active+undersized+degraded, last acting [14]
[18:04] <devicenull> # ceph pg 0.0 query
[18:04] <devicenull> Error ENOENT: i don't have pgid 0.0
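A stale PG generally means no OSD in the up set is currently reporting it, which is also why "pg query" has nothing to ask. The usual first diagnostic steps, as a sketch:
    $ ceph health detail | grep 'pg 0.0'     # which PGs are stuck, and for how long
    $ ceph pg dump_stuck stale               # every PG the monitors consider stale
    $ ceph pg map 0.0                        # which OSDs CRUSH maps the PG to right now
    $ ceph osd tree | grep -w osd.14         # is the last acting OSD (14 here) actually up and in?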
[18:04] * bkopilov (~bkopilov@bzq-109-64-149-201.red.bezeqint.net) Quit (Ping timeout: 480 seconds)
[18:04] <setmason> Anybody know when http://tracker.ceph.com/issues/6494 was fixed? 80.7? 80.9?
[18:05] * tunaaja (~Helleshin@1GLAAA2E3.tor-irc.dnsbl.oftc.net) Quit ()
[18:05] * VampiricPadraig (~VampiricP@95.211.169.35) has joined #ceph
[18:06] * scuttlemonkey is now known as scuttle|afk
[18:07] * bilco105 is now known as bilco105_
[18:07] <gleam> http://tracker.ceph.com/journals/diff/47154?detail_id=46832
[18:08] * qhartman (~qhartman@den.direwolfdigital.com) has joined #ceph
[18:09] * bilco105_ is now known as bilco105
[18:11] * zhaochao (~zhaochao@111.161.17.97) Quit (Ping timeout: 480 seconds)
[18:13] * bkopilov (~bkopilov@bzq-79-178-155-52.red.bezeqint.net) has joined #ceph
[18:15] * zhaochao (~zhaochao@111.161.77.236) has joined #ceph
[18:20] * i_m (~ivan.miro@deibp9eh1--blueice3n2.emea.ibm.com) Quit (Ping timeout: 480 seconds)
[18:20] * Kioob (~Kioob@2a01:e34:ec0a:c0f0:7e7a:91ff:fe3c:6865) has joined #ceph
[18:26] * kawa2014 (~kawa@89.184.114.246) Quit (Quit: Leaving)
[18:27] * brutuscat (~brutuscat@93.Red-88-1-121.dynamicIP.rima-tde.net) has joined #ceph
[18:30] <setmason> thanks
[18:30] * setmason (~setmason@128.107.241.185) has left #ceph
[18:35] * VampiricPadraig (~VampiricP@3OZAAAXJT.tor-irc.dnsbl.oftc.net) Quit ()
[18:35] * CydeWeys (~kalleeen@171.ip-5-135-148.eu) has joined #ceph
[18:35] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[18:39] * gravitystorm (~gravityst@host109-149-148-33.range109-149.btcentralplus.com) has left #ceph
[18:41] * B_Rake (~B_Rake@69-195-66-67.unifiedlayer.com) has joined #ceph
[18:41] * rongze (~rongze@106.39.138.252) has joined #ceph
[18:42] * p66kumar (~p66kumar@74.119.205.248) has joined #ceph
[18:42] * smiley_ (~smiley@205.153.36.170) Quit (Read error: Connection reset by peer)
[18:45] * Nacer (~Nacer@252-87-190-213.intermediasud.com) Quit (Ping timeout: 480 seconds)
[18:46] * vata (~vata@208.88.110.46) has joined #ceph
[18:50] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) has joined #ceph
[18:54] * alram (~alram@38.122.20.226) Quit (Quit: leaving)
[18:55] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[18:56] * alram (~alram@38.122.20.226) has joined #ceph
[18:57] * oro (~oro@2001:620:20:16:d84d:b122:4902:e044) Quit (Ping timeout: 480 seconds)
[18:57] * dupont-y (~dupont-y@2a01:e34:ec92:8070:f036:a06f:495f:db99) has joined #ceph
[18:58] * joshd1 (~jdurgin@68-119-140-18.dhcp.ahvl.nc.charter.com) Quit (Quit: Leaving.)
[18:58] * jordanP (~jordan@213.215.2.194) Quit (Quit: Leaving)
[19:00] * Kioob (~Kioob@2a01:e34:ec0a:c0f0:7e7a:91ff:fe3c:6865) Quit (Quit: Leaving.)
[19:00] * Kioob (~Kioob@sal69-4-78-192-172-15.fbxo.proxad.net) has joined #ceph
[19:01] * davidzlap (~Adium@2605:e000:1313:8003:68c4:1457:41bc:7f05) Quit (Quit: Leaving.)
[19:01] * davidzlap (~Adium@2605:e000:1313:8003:68c4:1457:41bc:7f05) has joined #ceph
[19:03] * debian112 (~bcolbert@24.126.201.64) Quit (Ping timeout: 480 seconds)
[19:05] * CydeWeys (~kalleeen@1GLAAA2HP.tor-irc.dnsbl.oftc.net) Quit ()
[19:05] * Frymaster (~offender@176.10.99.200) has joined #ceph
[19:06] * debian112 (~bcolbert@24.126.201.64) has joined #ceph
[19:08] * kefu (~kefu@114.92.108.72) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[19:09] * lpabon (~quassel@24-151-54-34.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[19:10] * daniel2_ (~daniel2_@cpe-24-28-6-151.austin.res.rr.com) has joined #ceph
[19:12] * rotbeard (~redbeard@2a02:908:df10:d300:6267:20ff:feb7:c20) has joined #ceph
[19:13] * bigtoch_ (~bigtoch@41.189.169.250) Quit (Remote host closed the connection)
[19:14] * overclk (~overclk@122.167.106.102) Quit (Quit: Leaving)
[19:18] * bene (~ben@nat-pool-bos-t.redhat.com) has joined #ceph
[19:23] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) has joined #ceph
[19:23] * xarses (~andreww@c-76-126-112-92.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[19:23] * sbfox (~Adium@72.2.49.50) has joined #ceph
[19:24] <cetex> hm. has anyone fitted hp sl230s with 2 3.5" hdd's and an ssd for journal?
[19:25] <cetex> I'm skeptical. i don't think it will fit without using one pci slot for the ssd..
[19:26] <cetex> so.. the issue i have currently is that we have a few hundred hp sl230s we're planning to equip with 2x4TB sata hdd's. and I'm wondering how i should handle the journal.
[19:26] <darkfaded> cetex: haven't. i have supermicro twin^2 which are similar and also are problematic with the journal
[19:26] <cetex> as far as i know the servers have 2 pci slots, one sd-card, one usb-stick and 2x3.5" hdd's.
[19:26] <cetex> ok.
[19:26] <darkfaded> if the pic slot size fits a _good_ pcie ssd then it would be all great
[19:27] <darkfaded> *pci
[19:27] <cetex> we can get 256GB without a problem. it's expensive though.
[19:27] <darkfaded> pic would be really small
[19:27] <cetex> basically ramdisk + battery backup :)
[19:27] <cetex> we were offered that in the beginning
[19:28] <darkfaded> battery doesn't protect against OS crashes :/
[19:28] <darkfaded> power outages are much less likely than bugs
[19:28] * Nacer (~Nacer@203-206-190-109.dsl.ovh.fr) has joined #ceph
[19:28] <cetex> ah, yeah. but it's branded as an ssd, appears as an ssd, but has 256GB ram-chips + battery
[19:28] <darkfaded> oh! sorry
[19:28] <darkfaded> i didn't get it
[19:28] <cetex> :)
[19:29] <darkfaded> sounds like a lot of money, but also sounds like you'd have one of the nicest ceph installs world-wide
[19:29] <darkfaded> what is that ram+battery 256GB thing called?
[19:30] <Be-El> cetex: using sl230s as ceph nodes sounds like a waste of resources
[19:30] <Be-El> cetex: we use them as workhorses in our compute cluster.....
[19:30] <darkfaded> in my case i'm thinking hard about burning through m.2 ssds and just replacing them every 3 months
[19:32] <Be-El> darkfaded: replacing the ssds requires taking the osds down
[19:33] <darkfaded> Be-El: of course
[19:34] <darkfaded> Be-El: sucks and is generally a reason to not use any shit ssds
[19:34] <Be-El> darkfaded: the maintenance burden is quite high, yes
[19:34] <darkfaded> supermicro just doesn't have really good hw engineers, so it's less spacious inside the server than with hp / intel / ...
[19:35] <Be-El> before starting with ceph our hosts were equipped with samsung 850 ssds....i'm curious how long these can stand ceph traffic
[19:35] * Frymaster (~offender@2WVAAA7I4.tor-irc.dnsbl.oftc.net) Quit ()
[19:35] * TGF (~nastidon@5.135.85.23) has joined #ceph
[19:36] <darkfaded> Be-El: i've looked at some TBW numbers, it's funny
[19:37] <darkfaded> samsung 830 or older crucial to same-time hitachi: 1:500
[19:37] <darkfaded> micron too
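If SSD journal endurance is the worry, the write counters can usually be pulled from SMART; attribute names differ per vendor, so this is only a sketch:
    $ sudo smartctl -a /dev/sdX | grep -i -e 'wear' -e 'total_lbas_written' -e 'media_wearout'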
[19:38] <cetex> sorry :)
[19:38] <cetex> delay.. at the pub :>
[19:39] <cetex> but then again, those ssd's use one pci slot. we're trying to standardize on one for standard 10gbe networking, and the other for secondary 10gbe networking (router-functionality) or sdi (video-thingie) or <whatever>
[19:40] <cetex> Be-El: we have a few hundred already for workhorse-stuff. we don't need much storage compared to cpu, we'll easily fit 1PB in our machines with 4TB harddrives.
[19:40] <cetex> so we thought we'll just reuse the machines we have. less management and more efficient resource use.
[19:40] * Hemanth (~Hemanth@117.192.242.187) has joined #ceph
[19:40] <Be-El> cetex: well, you can also test it without extra journals
[19:40] <Be-El> cetex: i had that setup on our sl230s
[19:41] <cetex> darkfaded: i'll see if i can find the p/n for those ssd's. i think they're insanely expensive though..
[19:42] <Be-El> cetex: if you are willing to use btrfs, the osd can write journal and data in parallel
[19:42] <cetex> hm, yeah. the harddrives are only used for ceph. (kinda)
[19:43] <cetex> also for aurora and mesos databases on a few machines, but those don't do much iops and only use 20-50MB of disk space.
[19:43] <darkfaded> brb sry
[19:43] * dgurtner (~dgurtner@217-162-119-191.dynamic.hispeed.ch) Quit (Ping timeout: 480 seconds)
[19:43] <cetex> .. and consul which may do a bit more iops on a large cluster, but still 1:100 of what ceph is doing.
[19:43] * Kioob (~Kioob@sal69-4-78-192-172-15.fbxo.proxad.net) Quit (Quit: Leaving.)
[19:43] * jwilkins (~jwilkins@c-67-180-123-48.hsd1.ca.comcast.net) has joined #ceph
[19:44] <cetex> Be-El: i'm running 9 nodes with drives (7 nodes 2x4tb + 2 nodes 2x3tb) for testing currently.
[19:45] <cetex> getting between 120 and 400MB/sec.
[19:45] <cetex> would be nice to get this to sit at a more stable number.
[19:45] <cetex> in writes.
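To get a comparable write number as the cluster grows, a rados bench run against a scratch pool is the usual yardstick; a sketch (pool name, PG count and thread count are placeholders):
    $ ceph osd pool create bench-test 128
    $ rados bench -p bench-test 60 write -t 16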
[19:45] <Be-El> cetex: did you try a better nice/ionice setting for the osd processes?
[19:46] <cetex> osd is the only process that uses the harddrives.
[19:46] <cetex> everything else is on ramdisk
[19:46] <cetex> so. boot host -> mount drives under /data/x where x is the harddrive number -> launch osd on /data/x/ceph
[19:47] <cetex> so /dev/sda1 is /data/1
[19:47] <cetex> /dev/sdb1 is /data/2
[19:47] <cetex> hosts are pxe booted so we run the same setup on diskless hosts.
[19:47] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[19:48] <cetex> so i don't think ionice would make a difference?
[19:48] <Be-El> probably not
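For completeness: ceph does have per-OSD ioprio knobs, but they apply to the OSD's disk thread (scrubbing and similar background work) and only take effect with the CFQ scheduler, so they would not help here. A ceph.conf sketch, assuming a release recent enough to have them (roughly Giant onwards):
    [osd]
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 7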
[19:49] <Be-El> well, time to call it a day
[19:49] <cetex> maybe. :)
[19:49] * Be-El (~quassel@fb08-bcf-pc01.computational.bio.uni-giessen.de) Quit (Remote host closed the connection)
[19:49] <cetex> My guess is that i will get kinda linear improvement in iops when i increase the number of hosts with 2xdisks
[19:49] <cetex> but i'm not entirely sure if that's the case.
[19:49] * mgolub (~Mikolaj@91.225.203.116) has joined #ceph
[19:52] <cetex> darkfaded: i can't find the order for those ssd's. but they're kinda expensive. i believe 3k usd each or something.
[19:53] <cetex> we don't have them though. never ordered them since i said we don't need them :)
[19:53] <darkfaded> hehe
[19:53] <cetex> question: if using btrfs, is the journal not needed any more?
[19:53] <cetex> or is the journal and the file written in one write?
[19:54] <cetex> so, would we get a large decrease in iops needed?
[19:55] * dyasny (~dyasny@173.231.115.58) Quit (Quit: Ex-Chat)
[19:55] * dyasny (~dyasny@173.231.115.58) has joined #ceph
[19:55] * xarses (~andreww@12.164.168.117) has joined #ceph
[19:57] * ircolle (~Adium@c-71-229-136-109.hsd1.co.comcast.net) Quit (Quit: Leaving.)
[19:58] * rongze (~rongze@106.39.138.252) Quit (Remote host closed the connection)
[19:59] * joef1 (~Adium@2620:79:0:2420::5) has joined #ceph
[20:00] * LeaChim (~LeaChim@host86-151-147-249.range86-151.btcentralplus.com) has joined #ceph
[20:01] * brutuscat (~brutuscat@93.Red-88-1-121.dynamicIP.rima-tde.net) Quit (Remote host closed the connection)
[20:05] * TGF (~nastidon@2FBAAA78D.tor-irc.dnsbl.oftc.net) Quit ()
[20:05] * joef (~Adium@2601:9:280:f2e:9d69:5448:100c:ef36) Quit (Ping timeout: 480 seconds)
[20:05] * KUSmurf (~x303@tor-exit.bronk-ict.nl) has joined #ceph
[20:07] <cetex> no input? :)
[20:07] <cetex> or why is btrfs recommended?
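A partial answer to the btrfs question: with filestore the journal is always present, but on btrfs the OSD can run it in "parallel" mode (journal and data written concurrently, with btrfs snapshots providing the consistency point) instead of the write-ahead mode used on xfs/ext4, which is where the iops saving Be-El mentioned comes from. The mode is chosen automatically per filesystem, though it can be forced; a ceph.conf sketch:
    [osd]
    filestore journal parallel = true      # the default on btrfs
    ;filestore journal writeahead = true   # the default on xfs/ext4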
[20:12] * kmccormick (~kmccormic@frodo.ilinkadv.com) has joined #ceph
[20:16] * subscope (~subscope@92-249-244-64.pool.digikabel.hu) has joined #ceph
[20:18] * lalatenduM (~lalatendu@122.171.118.51) has joined #ceph
[20:20] * dyasny (~dyasny@173.231.115.58) Quit (Ping timeout: 480 seconds)
[20:20] * vbellur (~vijay@122.172.34.195) Quit (Ping timeout: 480 seconds)
[20:20] <kmccormick> Hoping someone can help me. Our cluster has had performance issues during deep scrubbing, so we have been doing it off-hours with a script. It looks like somehow those scrubs have been queued up and are now running, not off-hours. I currently have nodeep-scrub set, and another scrub just started. It's really hurting performance. Is there any way to stop a scrub in progress? Any way to see which
[20:20] <kmccormick> PGs have been "instructed" to scrub, but haven't yet?
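Partly answering the second question: the noscrub/nodeep-scrub flags only stop new scrubs from being scheduled, they do not abort one already running, and the per-PG scrub state and timestamps are visible in the pg dump. A sketch of the usual checks:
    $ ceph osd set noscrub ; ceph osd set nodeep-scrub   # prevent further scrubs from starting
    $ ceph pg dump | grep scrubbing                      # PGs with a scrub in flight right now
    $ ceph pg dump | head                                # per-PG last_scrub / last_deep_scrub stamps are in the same dump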
[20:21] * psiekl (psiekl@wombat.eu.org) has joined #ceph
[20:24] * nljmo_ (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) has joined #ceph
[20:24] * nljmo (~nljmo@5ED6C263.cm-7-7d.dynamic.ziggo.nl) Quit (Read error: Connection reset by peer)
[20:28] <cetex> it would be nice to be able to disable journal-use entirely and just go for synchronous writes of all files instead. it would mean a little bit higher delay, but it would remove the need for 2xiops.
[20:29] <cetex> if i understand it correctly..
[20:29] * B_Rake (~B_Rake@69-195-66-67.unifiedlayer.com) Quit (Remote host closed the connection)
[20:31] * dyasny (~dyasny@173.231.115.58) has joined #ceph
[20:32] * vbellur (~vijay@122.171.75.59) has joined #ceph
[20:35] * KUSmurf (~x303@5NZAAA9YV.tor-irc.dnsbl.oftc.net) Quit ()
[20:35] * jakekosberg (~Grum@5NZAAA903.tor-irc.dnsbl.oftc.net) has joined #ceph
[20:37] * lpabon (~quassel@24-151-54-34.dhcp.nwtn.ct.charter.com) has joined #ceph
[20:42] * cdelatte (~cdelatte@cpe-75-176-84-224.carolina.res.rr.com) has joined #ceph
[20:42] * derjohn_mob (~aj@fw.gkh-setu.de) Quit (Ping timeout: 480 seconds)
[20:43] * Hemanth (~Hemanth@117.192.242.187) Quit (Ping timeout: 480 seconds)
[20:47] * shaunm (~shaunm@74.215.76.114) Quit (Ping timeout: 480 seconds)
[20:47] * b0e (~aledermue@x2f2325b.dyn.telefonica.de) has joined #ceph
[20:50] * b0e (~aledermue@x2f2325b.dyn.telefonica.de) Quit ()
[20:50] * lalatenduM (~lalatendu@122.171.118.51) Quit (Quit: Leaving)
[20:53] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Quit: Leaving.)
[20:53] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) has joined #ceph
[20:54] <davidzlap> kmccormick: A fix is in the works to be able to stop manually initiated scrubs.
[20:57] * x45d (~oz@62.179.62.81.dynamic.wline.res.cust.swisscom.ch) has joined #ceph
[20:57] <cetex> :)
[20:58] <cetex> has anyone done tests with disabling the journal entirely?
[20:59] <cetex> what would the performance be like compared to using journal and doing 2x the writes?
[21:03] <cetex> my guess is it would be faster than write synched to journal and then write to disk.
[21:03] * oro (~oro@80-219-254-208.dclient.hispeed.ch) has joined #ceph
[21:03] <cetex> which triggers a kernel sync once in a while which may wreak havoc on the system..
[21:05] * jakekosberg (~Grum@5NZAAA903.tor-irc.dnsbl.oftc.net) Quit ()
[21:05] <cetex> which in turn lowers writes and breaks stuff for us. :>
[21:05] * w2k (~Unforgive@tor-exit.info.ucl.ac.be) has joined #ceph
[21:12] * i_m (~ivan.miro@nat-5-carp.hcn-strela.ru) Quit (Ping timeout: 480 seconds)
[21:14] <cetex> any ideas is welcome
[21:14] <cetex> *are welcome
[21:14] <cetex> :>
[21:22] * subscope (~subscope@92-249-244-64.pool.digikabel.hu) Quit (Quit: My Mac has gone to sleep. ZZZzzz???)
[21:26] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[21:27] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[21:29] * omar_m (~omar_m@209.163.140.194) has joined #ceph
[21:30] * shaunm (~shaunm@74.215.76.114) has joined #ceph
[21:35] * w2k (~Unforgive@98EAAA05K.tor-irc.dnsbl.oftc.net) Quit ()
[21:35] * Arfed (~mr_flea@178-175-139-142.ip.as43289.net) has joined #ceph
[21:35] <visbits_> centos kernel now includes RBD enabled by default
[21:35] <visbits_> you can thank me later
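A quick way to confirm whether the running kernel actually ships the module (a sketch):
    $ sudo modprobe rbd && lsmod | grep rbd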
[21:36] * sbfox (~Adium@72.2.49.50) Quit (Quit: Leaving.)
[21:36] * subscope (~subscope@92-249-244-64.pool.digikabel.hu) has joined #ceph
[21:36] <kmccormick> davidzlap: Any ideas on the multiple hour delay between ceph pg deep-scrub $pg and the actual scrubbing? Those must be queued up somewhere? Will restarting mons and/or osds clear the queue?
[21:37] * sherlocked (~watson@14.139.82.6) Quit (Read error: Connection reset by peer)
[21:37] <visbits_> kmccormick just restart all of your osds and see if that corrects it
[21:52] * jo00nas (~jonas@188-183-5-254-static.dk.customer.tdc.net) has joined #ceph
[21:52] * jo00nas (~jonas@188-183-5-254-static.dk.customer.tdc.net) Quit ()
[21:54] * rendar (~I@host157-176-dynamic.35-79-r.retail.telecomitalia.it) Quit (Ping timeout: 480 seconds)
[21:56] * shohn (~Adium@dslb-188-102-031-093.188.102.pools.vodafone-ip.de) Quit (Quit: Leaving.)
[21:56] * ajs (~alistair@12.201.5.10) has joined #ceph
[21:57] * rendar (~I@host157-176-dynamic.35-79-r.retail.telecomitalia.it) has joined #ceph
[21:58] <ajs> hi, does anybody know if there's a reason librados doesn't call add_observer() on the Objecter instance it creates? AFAICT it means that e.g. rbd does not respect 'crush location'.
[21:59] <visbits_> #ceph-devel
[22:01] * ajs (~alistair@12.201.5.10) has left #ceph
[22:03] * getup (~getup@dhcp-077-251-206-162.chello.nl) has joined #ceph
[22:04] <getup> hi, when starting radosgw it returns an error: ERROR: can't read user header: ret=-2, any idea what could cause this and what to do to fix it?
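ret=-2 is -ENOENT, i.e. radosgw could not find an object it expected (typically user or zone metadata in the .rgw/.users pools). Two sanity checks, with the uid as a placeholder:
    $ rados lspools | grep -E '\.rgw|\.users'        # are the rgw pools there at all?
    $ radosgw-admin user info --uid=<your-rgw-uid>   # does the configured gateway user exist?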
[22:05] * Arfed (~mr_flea@2WVAAA7SI.tor-irc.dnsbl.oftc.net) Quit ()
[22:05] * demonspork (~theghost9@luxemburg.gtor.org) has joined #ceph
[22:07] <kmccormick> thanks visbits_
[22:14] * t0rn (~ssullivan@2607:fad0:32:a02:56ee:75ff:fe48:3bd3) Quit (Remote host closed the connection)
[22:15] * lpabon (~quassel@24-151-54-34.dhcp.nwtn.ct.charter.com) Quit (Remote host closed the connection)
[22:16] * TMM (~hp@178-84-46-106.dynamic.upc.nl) has joined #ceph
[22:17] * hellertime1 (~Adium@72.246.0.14) Quit (Quit: Leaving.)
[22:21] * kmccormick (~kmccormic@frodo.ilinkadv.com) Quit (Ping timeout: 480 seconds)
[22:27] * subscope (~subscope@92-249-244-64.pool.digikabel.hu) Quit (Ping timeout: 480 seconds)
[22:30] * burley (~khemicals@cpe-98-28-239-78.cinci.res.rr.com) Quit (Ping timeout: 480 seconds)
[22:35] * demonspork (~theghost9@2WVAAA7T4.tor-irc.dnsbl.oftc.net) Quit ()
[22:36] * getup (~getup@dhcp-077-251-206-162.chello.nl) Quit (Quit: My MacBook Pro has gone to sleep. ZZZzzz???)
[22:37] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) Quit (Ping timeout: 480 seconds)
[22:37] * kmccormick (~kmccormic@ip-206-63-187-74-spk.cet.com) has joined #ceph
[22:37] * mgolub (~Mikolaj@91.225.203.116) Quit (Quit: away)
[22:39] * dmick (~dmick@38.122.20.226) Quit (Quit: Leaving.)
[22:39] * loft1 (~SinZ|offl@195.169.125.226) has joined #ceph
[22:40] * yuan (~yzhou67@134.191.220.72) Quit (Ping timeout: 480 seconds)
[22:40] * burley (~khemicals@cpe-98-28-239-78.cinci.res.rr.com) has joined #ceph
[22:40] * dmick (~dmick@38.122.20.226) has joined #ceph
[22:41] * dmick (~dmick@38.122.20.226) Quit ()
[22:42] * dmick (~dmick@38.122.20.226) has joined #ceph
[22:48] * oro (~oro@80-219-254-208.dclient.hispeed.ch) Quit (Ping timeout: 480 seconds)
[22:56] * joef1 (~Adium@2620:79:0:2420::5) Quit (Ping timeout: 480 seconds)
[22:56] * ChrisNBlum (~ChrisNBlu@178.255.153.117) Quit (Quit: Goodbye)
[22:58] <qhartman> congrats devs on Hammer release \o/
[22:59] * _prime_ (~oftc-webi@199.168.44.192) Quit (Remote host closed the connection)
[22:59] <visbits_> i dont see a release for hammer
[22:59] <qhartman> I just got the release announcement on the users mailing list
[22:59] <visbits_> hmm
[22:59] <qhartman> may not be everywhere yet
[22:59] <qhartman> http://ceph.com/docs/master/release-notes/#v0-94-hammer
[23:00] <visbits_> well this channel is about to become crazy
[23:01] <magicrobotmonkey> awesome!
[23:02] * owasserm (~owasserm@52D9864F.cm-11-1c.dynamic.ziggo.nl) has joined #ceph
[23:02] * derjohn_mob (~aj@p578b6aa1.dip0.t-ipconnect.de) has joined #ceph
[23:03] <visbits_> can you mix the release?
[23:03] * oftc (~androirc@78-32-127-104.static.enta.net) has joined #ceph
[23:04] * oftc is now known as Guest1404
[23:04] <lurbs> Test cluster upgraded. Let's see what breaks. :)
[23:04] <darkfaded> They said RDMA
[23:04] <darkfaded> I omg
[23:04] <qhartman> dang, lurbs is on task...
[23:04] <darkfaded> need to quit work so i can play.
[23:05] <visbits_> burn down for what lol
[23:07] * fxmulder_ (~fxmulder@cpe-24-55-6-128.austin.res.rr.com) Quit (Remote host closed the connection)
[23:07] * fxmulder (~fxmulder@cpe-24-55-6-128.austin.res.rr.com) has joined #ceph
[23:09] * loft1 (~SinZ|offl@2WVAAA7VX.tor-irc.dnsbl.oftc.net) Quit ()
[23:09] * verbalins (~xanax`@politkovskaja.torservers.net) has joined #ceph
[23:13] * wicope (~wicope@0001fd8a.user.oftc.net) Quit (Read error: Connection reset by peer)
[23:13] * Guest1404 is now known as kavanagh
[23:15] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[23:17] * dupont-y (~dupont-y@2a01:e34:ec92:8070:f036:a06f:495f:db99) Quit (Quit: Ex-Chat)
[23:17] * dyasny (~dyasny@173.231.115.58) Quit (Ping timeout: 480 seconds)
[23:18] * dupont-y (~dupont-y@2a01:e34:ec92:8070:f036:a06f:495f:db99) has joined #ceph
[23:20] * kmccormick (~kmccormic@ip-206-63-187-74-spk.cet.com) Quit (Ping timeout: 480 seconds)
[23:20] <Archeron> Ceph seems to be great for big stuff, but is it worth considering using RBD in a small (4 node, 8 disk) deployment to aggregate disks across the cluster?
[23:20] <m0zes> whoo!
[23:21] <qhartman> Archeron, depends on what you want to do with those disks. I use it to provide fault tolerant aggregate storage on a small (14 nodes, 3 disks per node) cluster for openstack VMs.
[23:21] <qhartman> and it's totally worth it for the fault tolerance and simplicity
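For a small setup like Archeron describes, the basic RBD workflow is just a pool plus an image; a minimal sketch (names and sizes are placeholders, and mapping with the kernel client assumes an image format/feature set the running kernel supports):
    $ ceph osd pool create volumes 128
    $ rbd create volumes/disk0 --size 10240     # --size is in MB on these releases
    $ sudo rbd map volumes/disk0
    $ sudo mkfs.xfs /dev/rbd0 && sudo mount /dev/rbd0 /mnt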
[23:21] * kmccormick (~kmccormic@ip-206-63-187-74-spk.cet.com) has joined #ceph
[23:28] * dmick (~dmick@38.122.20.226) Quit (Quit: Leaving.)
[23:28] * badone_ (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[23:29] * dmick (~dmick@38.122.20.226) has joined #ceph
[23:29] * jluis (~joao@249.38.136.95.rev.vodafone.pt) has joined #ceph
[23:29] * ChanServ sets mode +o jluis
[23:31] * badone__ (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) has joined #ceph
[23:31] * davidzlap (~Adium@2605:e000:1313:8003:68c4:1457:41bc:7f05) Quit (Quit: Leaving.)
[23:31] * rotbeard (~redbeard@2a02:908:df10:d300:6267:20ff:feb7:c20) Quit (Quit: Leaving)
[23:32] * davidz (~davidz@2605:e000:1313:8003:1830:20d1:3c25:a9dd) has joined #ceph
[23:32] * badone (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[23:34] * tupper_ (~tcole@2001:420:2280:1272:8900:f9b8:3b49:567e) Quit (Ping timeout: 480 seconds)
[23:35] * joef (~Adium@2620:79:0:2420::16) has joined #ceph
[23:35] * joao (~joao@249.38.136.95.rev.vodafone.pt) Quit (Ping timeout: 480 seconds)
[23:37] * badone_ (~brad@CPE-121-215-241-179.static.qld.bigpond.net.au) Quit (Ping timeout: 480 seconds)
[23:38] * yehudasa_ (~yehudasa@2607:f298:a:607:f564:1f99:b052:5ca2) Quit (Ping timeout: 480 seconds)
[23:38] * B_Rake (~B_Rake@45.56.23.41) has joined #ceph
[23:39] * badone__ is now known as badone
[23:39] * linuxkidd (~linuxkidd@vpngac.ccur.com) Quit (Remote host closed the connection)
[23:39] * verbalins (~xanax`@2WVAAA7XT.tor-irc.dnsbl.oftc.net) Quit ()
[23:39] * ain (~Oddtwang@thoreau.gtor.org) has joined #ceph
[23:41] * MACscr1 (~Adium@2601:d:c800:de3:a4a1:791e:c9bb:c787) Quit (Quit: Leaving.)
[23:41] * B_Rake (~B_Rake@45.56.23.41) Quit (Remote host closed the connection)
[23:42] * B_Rake (~B_Rake@45.56.23.41) has joined #ceph
[23:47] * yehudasa_ (~yehudasa@2607:f298:a:607:ec72:dc25:4408:b78e) has joined #ceph
[23:47] * B_Rake (~B_Rake@45.56.23.41) Quit (Remote host closed the connection)
[23:47] * Concubidated (~Adium@71.21.5.251) Quit (Quit: Leaving.)
[23:48] * Concubidated (~Adium@71.21.5.251) has joined #ceph
[23:49] * B_Rake (~B_Rake@45.56.23.41) has joined #ceph
[23:50] * B_Rake (~B_Rake@45.56.23.41) Quit (Remote host closed the connection)
[23:52] * B_Rake (~B_Rake@2605:a601:5b9:dd01:d907:52b3:a3fc:234e) has joined #ceph
[23:52] * B_Rake (~B_Rake@2605:a601:5b9:dd01:d907:52b3:a3fc:234e) Quit (Remote host closed the connection)
[23:53] * ircolle (~Adium@2601:1:a580:1735:2d2d:e429:8d1c:f96c) has joined #ceph
[23:54] * sbfox (~Adium@72.2.49.50) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.