#ceph IRC Log


IRC Log for 2013-01-20

Timestamps are in GMT/BST.

[0:41] * allsystemsarego (~allsystem@188.27.166.249) Quit (Quit: Leaving)
[1:06] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[1:06] * loicd (~loic@2a01:e35:2eba:db10:89e4:eb83:ca16:d26f) has joined #ceph
[1:17] * tnt (~tnt@120.194-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[1:18] <Psi-jack> sagelap: Perfect, good to know. So far, I haven't hit any issues, and just Friday I had a full outage due to home power going out for 3.5~4 hours, yet, my ceph cluster was able to fully recover itself from the failure with no issues. :)
[1:19] <Psi-jack> sagelap: And I just recently got word from the current AUR maintainer for Ceph that if I want it, he'll offer me to adopt it, since he no longer can properly maintain it himself.
[1:21] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) Quit (Remote host closed the connection)
[1:24] <Psi-jack> But, I am curious.. About obsync and boto_tool.. What are those, and are they needed, or useful?
[1:27] * jmlowe (~Adium@c-71-201-31-207.hsd1.in.comcast.net) has joined #ceph
[3:19] * loicd (~loic@2a01:e35:2eba:db10:89e4:eb83:ca16:d26f) Quit (Ping timeout: 480 seconds)
[3:22] * Steki (~steki@85.222.221.79) Quit (Quit: Ja odoh a vi sta 'ocete...)
[3:40] * mattbenjamin (~matt@75.45.226.110) Quit (Quit: Leaving.)
[4:04] * xiaoxi (~xiaoxiche@134.134.137.75) has joined #ceph
[4:09] * LeaChim (~LeaChim@b0faf18a.bb.sky.com) Quit (Ping timeout: 480 seconds)
[4:10] * mattbenjamin (~matt@adsl-75-45-226-110.dsl.sfldmi.sbcglobal.net) has joined #ceph
[4:20] * mattbenjamin (~matt@adsl-75-45-226-110.dsl.sfldmi.sbcglobal.net) Quit (Quit: Leaving.)
[4:21] * mattbenjamin (~matt@adsl-75-45-226-110.dsl.sfldmi.sbcglobal.net) has joined #ceph
[4:29] * mattbenjamin (~matt@adsl-75-45-226-110.dsl.sfldmi.sbcglobal.net) Quit (Ping timeout: 480 seconds)
[4:37] * exec (~defiler@109.232.144.194) Quit (Ping timeout: 480 seconds)
[4:55] * xiaoxi (~xiaoxiche@134.134.137.75) Quit (Remote host closed the connection)
[4:57] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) has joined #ceph
[5:39] * sjustlaptop (~sam@71-83-191-116.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[5:54] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Ping timeout: 480 seconds)
[6:23] * tsygrl (~tsygrl314@c-75-68-140-25.hsd1.vt.comcast.net) Quit (Read error: Connection reset by peer)
[6:24] * tsygrl (~tsygrl314@c-75-68-140-25.hsd1.vt.comcast.net) has joined #ceph
[7:16] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[7:35] * xiaoxi (~xiaoxiche@134.134.137.75) has joined #ceph
[8:39] * xiaoxi (~xiaoxiche@134.134.137.75) Quit (Ping timeout: 480 seconds)
[9:41] <jksM> sagelap, I am here now
[9:42] <jksM> timezones are dreadful ;-)
[9:42] * tnt (~tnt@120.194-67-87.adsl-dyn.isp.belgacom.be) has joined #ceph
[9:49] * Meths_ (~meths@2.27.72.227) has joined #ceph
[9:53] * Meths (~meths@2.27.95.119) Quit (Ping timeout: 480 seconds)
[10:09] * xiaoxi (~xiaoxiche@134.134.137.75) has joined #ceph
[10:09] * loicd (~loic@2a01:e35:2eba:db10:f91f:e105:e759:9a2d) has joined #ceph
[10:10] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[10:24] * loicd (~loic@2a01:e35:2eba:db10:f91f:e105:e759:9a2d) Quit (Quit: Leaving.)
[10:24] * loicd (~loic@magenta.dachary.org) has joined #ceph
[10:25] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has joined #ceph
[10:33] * stxShadow (~Jens@ip-178-201-147-146.unitymediagroup.de) has left #ceph
[10:40] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[10:40] * loicd (~loic@magenta.dachary.org) has joined #ceph
[10:43] * Machske (~Bram@d5152D87C.static.telenet.be) has joined #ceph
[10:44] <Machske> Hi guys! I hope someone is online and can help me, because I'm becoming a little desperate ;s
[10:44] <Machske> This morning I found out that all of my mds daemons had crashed (2 of them).
[10:45] <Machske> Now I seem to be unable to start them again, which makes it impossible to access any files stored on the cephfs
[10:45] <Machske> I use cephfs combined with ceph-fuse
[10:45] <Machske> and running version 0.56.1
[10:46] <Machske> ceph mds getmap seems to work
[10:46] <Machske> The problem is that I do not know how to proceed
[10:46] <Machske> the mds processes do start but ceph -s reports that they are laggy or crashed
[10:47] <Machske> Anyone know what to do?
[10:55] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[11:09] * lxo (~aoliva@lxo.user.oftc.net) Quit (Quit: later)
[11:20] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[11:43] * ScOut3R (~ScOut3R@catv-89-133-32-74.catv.broadband.hu) has joined #ceph
[11:43] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) has joined #ceph
[11:46] * sleinen1 (~Adium@2001:620:0:26:b578:72d2:7448:3db7) has joined #ceph
[11:48] * sleinen (~Adium@217-162-132-182.dynamic.hispeed.ch) Quit (Read error: Operation timed out)
[12:12] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[12:12] * loicd (~loic@magenta.dachary.org) has joined #ceph
[12:34] * salvatore (~salvatore@ppp-34-2.26-151.libero.it) has joined #ceph
[12:36] * leseb (~leseb@5ED01FAC.cm-7-1a.dynamic.ziggo.nl) has joined #ceph
[12:38] * salvatore (~salvatore@ppp-34-2.26-151.libero.it) Quit (Quit: Sto andando via)
[12:51] * LeaChim (~LeaChim@b0faf18a.bb.sky.com) has joined #ceph
[12:56] * salvatore (~salvatore@ppp-34-2.26-151.libero.it) has joined #ceph
[13:02] * salvatore (~salvatore@ppp-34-2.26-151.libero.it) Quit (Remote host closed the connection)
[13:07] * tnt (~tnt@120.194-67-87.adsl-dyn.isp.belgacom.be) Quit (Ping timeout: 480 seconds)
[13:10] * madkiss (~madkiss@178.188.60.118) has joined #ceph
[13:12] * sleinen (~Adium@2001:620:0:26:b578:72d2:7448:3db7) has joined #ceph
[13:12] * sleinen1 (~Adium@2001:620:0:26:b578:72d2:7448:3db7) Quit (Read error: No route to host)
[13:19] <xdeller> why new osdmaps are issued at every snapshot creation? just kind of interest
[13:20] * madkiss (~madkiss@178.188.60.118) Quit (Quit: Leaving.)
[13:28] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) Quit (Remote host closed the connection)
[13:46] * xdeller_ (~xdeller@broadband-77-37-224-84.nationalcablenetworks.ru) has joined #ceph
[13:46] * xdeller (~xdeller@broadband-77-37-224-84.nationalcablenetworks.ru) Quit (Read error: Connection reset by peer)
[14:02] * Meths_ is now known as Meths
[14:29] <Machske> anyone know how to recover the mdsmap ?
[14:29] <Machske> I've got this error: mon.0 [INF] mdsmap e152: 1/1/1 up {0=2=up:replay(laggy or crashed)}
[14:29] <Machske> it won't come up anymore
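For anyone hitting the same up:replay(laggy or crashed) state, a minimal diagnostic sketch; the daemon id "2" comes from the mdsmap line above, while the debug levels and the idea of a foreground restart are assumptions, not something shown in this log:

    ceph -s                           # overall cluster and mds state
    ceph mds dump                     # decode the current mdsmap (ranks, states, laggy flags)
    ceph mds getmap -o /tmp/mdsmap    # keep a copy of the binary mdsmap
    # restart one mds in the foreground with verbose logging to see where replay stalls
    ceph-mds -i 2 -d --debug-mds 20 --debug-ms 1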
[15:53] * Machske (~Bram@d5152D87C.static.telenet.be) Quit ()
[15:55] <Psi-jack> Alrighty! Ceph 0.56.1 installed. :D
[16:04] * xdeller_ (~xdeller@broadband-77-37-224-84.nationalcablenetworks.ru) Quit (Quit: Leaving)
[16:04] * gohko (~gohko@natter.interq.or.jp) Quit (Read error: Connection reset by peer)
[16:21] * xiaoxi (~xiaoxiche@134.134.137.75) Quit (Remote host closed the connection)
[16:25] * xdeller (~xdeller@broadband-77-37-224-84.nationalcablenetworks.ru) has joined #ceph
[16:30] * tziOm (~bjornar@ti0099a340-dhcp0628.bb.online.no) has joined #ceph
[16:33] <Psi-jack> Hmmm... I'm wondering if my kvm servers would be able to support rbd format 2. heh
[16:35] <Psi-jack> They seem to be just using rbd directly via qemu-rbd using -drive file=rbd:rbd/vm-106-disk-1:id=blah:auth_supported=cephx:keyring=/file:mon_host=<host1:port>,if=none,id=drive-virtio0,cache=writeback,aio=native
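Since qemu drives rbd through librbd rather than the kernel client, format 2 images should work as long as the installed librbd is new enough to understand them (bobtail-era librbd does). A small sketch, where the image name and size are assumptions and the flag of that era is --format rather than the later --image-format:

    rbd create vm-200-disk-1 --pool rbd --size 20480 --format 2   # 20 GB, format 2 image
    rbd info rbd/vm-200-disk-1                                    # should report "format: 2"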
[16:39] * madkiss (~madkiss@178.188.60.118) has joined #ceph
[16:44] * madkiss (~madkiss@178.188.60.118) Quit (Quit: Leaving.)
[17:00] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[17:00] * loicd (~loic@magenta.dachary.org) has joined #ceph
[17:00] * webbber (~me@219.85.215.162) has joined #ceph
[17:26] * Zethrok (~martin@95.154.26.34) Quit (Ping timeout: 480 seconds)
[17:42] <sage> jksm: there?
[17:51] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[17:51] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:01] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[18:02] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:05] * sleinen1 (~Adium@2001:620:0:26:18fe:448f:44dc:e4f3) has joined #ceph
[18:05] <jksM> sage, yep!
[18:05] <sage> yay there you are!
[18:05] <sage> can you attach map 554 too?
[18:05] <jksM> sure, one sec
[18:06] <jksM> done
[18:10] <sage> did you do 'ceph osd rm 4' at some point?
[18:10] <sage> strange that the osd is referenced but doesn't exist
[18:11] * sleinen (~Adium@2001:620:0:26:b578:72d2:7448:3db7) Quit (Ping timeout: 480 seconds)
[18:11] <sage> jksm: ^
[18:11] <sage> confused how this came about :/
[18:12] * mistur (~yoann@kewl.mistur.org) Quit (Ping timeout: 480 seconds)
[18:14] * mistur (~yoann@kewl.mistur.org) has joined #ceph
[18:18] <jksM> sage, hmm, can I offer any clues? :-)
[18:19] <jksM> sage, and no, I haven't removed osds at any time
[18:19] <sage> maybe you can tar up the osdmap_full directory from your mon so i can see all the maps?
[18:19] <jksM> sage, I started out with osd.0, osd.1 and osd.2 only... HEALTH_OK
[18:19] <jksM> sage, then I added osd.3 and my problems started... I then added osd.4 because the system became unusable because of lack of space
[18:20] <jksM> sage, sure, one sec
[18:20] <jksM> sage, which mon should I take it from, or doesn't it matter?
[18:20] <sage> oh.. there is an osd.4 somewhere in the picture. that's a clue!
[18:20] <sage> doesn't matter, they're all the same
[18:21] <sage> ah, i think i see the bug.
[18:21] <sage> can you attach the output from 'find current/meta' on the node that is crashing too?
[18:21] <jksM> sage, https://www.dropbox.com/s/33b1ukvl86ruzk4/osdmap_full.tgz
[18:22] <jksM> sage, do you mean literally running find like that?!?
[18:22] <sage> yeah, just to see what files are present in that dir
[18:23] <jksM> but which current/meta directory do you want?
[18:23] <jksM> do you want the one from osd.2 that is the one that is crashing all the time?
[18:23] <jksM> or osd.1 which then subsequently crashed as well
[18:24] <sage> from the osd that crashed..
[18:24] <sage> osd.2
[18:25] <jksM> sage, https://www.dropbox.com/s/29qbsyg5ykqmgsc/current_meta_files.txt
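What was gathered above amounts to roughly the following; the monitor and osd data paths are assumptions based on the defaults of that era:

    # on any monitor (the full osdmaps are identical across mons)
    tar czf osdmap_full.tgz -C /var/lib/ceph/mon/ceph-a osdmap_full
    # on the crashing osd (osd.2): list the files under current/meta
    find /var/lib/ceph/osd/ceph-2/current/meta > current_meta_files.txt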
[18:28] <sage> ok, i have a fix to test
[18:28] <jksM> weeeh :-) super!
[18:28] <jksM> is it somewhat easy for me to build myself - or?
[18:30] <sage> not as easy as it is for gitbuilder to do it
[18:30] <sage> pushed to git, the packages will appear at that url in a few minutes
[18:30] <jksM> fine by me :)
[18:30] <sage> or you can pull from git and build yourself. fix is on top of the wip-pg-removal branch
[18:30] <sage> http://ceph.com/gitbuilder.cgi for stats
[18:30] <sage> status
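The do-it-yourself route sage mentions would look roughly like this for the autotools-based tree of that era; the branch name is from the log, everything else is an assumption:

    git clone --recursive https://github.com/ceph/ceph.git
    cd ceph
    git checkout wip-pg-removal
    ./autogen.sh && ./configure && make -j4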
[18:31] <sage> is the osd.1 crash the same?
[18:32] <jksM> well, I haven't upgraded that one to the wip-pg-removal version
[18:32] <jksM> but the symptoms are the same as the other osd before I upgraded that one
[18:32] <jksM> (i.e. dies with the failed assert "hit suicide timeout")
[18:32] <sage> k
[18:33] <sage> i'd make sure osd.2 starts, then upgrade the others and verify the symptoms go away
[18:33] <jksM> I'll try it out - thanks for all the help!
[18:33] <sage> thanks for testing!
[18:46] <jksM> perhaps I'm blind... I know it is still building, but where do I actually download the binaries from? :)
[18:47] <jksM> it seems like all the links go to various build status pages
[18:48] <jksM> ah, found it - sorry :)
[18:48] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[18:48] * loicd (~loic@magenta.dachary.org) has joined #ceph
[18:49] <sage> np :)
[18:49] <sage> we should add a link from that page
[18:55] * ScOut3R (~ScOut3R@catv-89-133-32-74.catv.broadband.hu) Quit (Remote host closed the connection)
[19:07] <jksM> it has lived for longer than the previous build ;-)
[19:08] <jksM> it is in the up state now... but gives out the same messages as the original 0.56.1: heartbeat_map is_healthy 'OSD::op_tp thread 0x7ff265ffb700' had timed out after 30
[19:08] <jksM> but now also: 'heartbeat_map is_healthy 'FileStore::op_tp thread 0x7ff2767fc700' had timed out after 60'
[19:13] <jksM> that aside it seems to have fixed the problem! - it is still running and things seem to be improving :)
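Those two messages come from the internal heartbeat map: a worker thread has been stuck past its warning threshold (osd op thread timeout and filestore op thread timeout, matching the 30s and 60s shown), and the "hit suicide timeout" assert mentioned earlier is the corresponding, much larger suicide threshold firing. Raising the thresholds is only a stop-gap while the underlying stall is tracked down; a hedged sketch using the injectargs form of that era, with option values that are purely illustrative:

    ceph osd tell 2 injectargs '--osd-op-thread-timeout 60 --filestore-op-thread-timeout 120'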
[19:32] * ScOut3R (~ScOut3R@dsl51B61EED.pool.t-online.hu) has joined #ceph
[19:38] * mattbenjamin (~matt@75.45.228.196) has joined #ceph
[19:48] <xdeller> from http://ceph.com/dev-notes/rados-snapshots/ single snapshots should not be a reason for issuing new osdmap, right?
[19:51] <jksM> sage, damn, it crashed again :-(
[19:51] <jksM> sage, common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
[19:53] * dosaboy (~gizmo@12.231.120.253) has joined #ceph
[19:57] * dosaboy (~gizmo@12.231.120.253) Quit ()
[19:59] <sage> jksm: lame... can you post the log again?
[20:00] <jksM> I didn't have it set to debug=20, only default debug
[20:00] <jksM> do you want me restart with debug=20 ?
[20:00] * danieagle (~Daniel@177.133.175.185) has joined #ceph
[20:00] * noob21 (~noob2@ext.cscinfo.com) has joined #ceph
[20:01] <noob21> anyone around?
[20:02] <jksM> yes
[20:02] <noob21> hey what's up :)
[20:02] <noob21> any idea what the difference is between data and used when you run ceph -w?
[20:02] <jksM> just ceph'ing away
[20:02] <noob21> i have 1609GB of data and 5017GB used
[20:02] <jksM> noob21, data is the amount of "real" data you have stored
[20:03] <noob21> ah
[20:03] <noob21> i suspected as much
[20:03] <jksM> noob21, used is the amount of storage that it takes up on the osds
[20:03] <noob21> right
[20:03] <jksM> so if you have replication 3 on everything, expect that value to be three times higher
[20:03] <noob21> oh ok
[20:03] <noob21> that makes sense now
[20:03] <noob21> yes it's almost exactly 3x
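As a rough check on those numbers: 1609 GB of logical data at 3x replication comes to about 4827 GB, so the 5017 GB "used" figure leaves roughly 190 GB that is plausibly journals, filesystem overhead and other metadata on the OSD filesystems, which is why the ratio is only almost exactly 3x.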
[20:05] <noob21> i'm running bonnie++ in a for loop to try and get that last nasty bug to show its face again
[20:06] <noob21> my proxy server that i have ceph exported through likes to kernel panic randomly.
[20:10] <sage> jksm: please, with full debug
[20:11] <jksM> okay, restarting it now! :)
[20:11] <sage> and attach or link from the bug, if you don't mind
[20:13] <sage> er, actually post it here
[20:13] <sage> it's a different bug :)
[20:16] <jksM> okay :)
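Turning debug logging up to 20 on an osd of that era could be done live or at restart; a sketch with assumed option names (the daemon id comes from the log):

    ceph osd tell 2 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'   # live
    # or restart the daemon with the options on its command line (or in ceph.conf):
    ceph-osd -i 2 --debug-osd 20 --debug-filestore 20 --debug-ms 1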
[20:17] <noob21> working weekends sage?
[20:36] * ebo^ (~ebo@233.195.116.85.in-addr.arpa.manitu.net) has joined #ceph
[20:36] <ebo^> is there any documentation on the meaning of the osd perfcounters?
[20:38] * scalability-junk is still looking for a suitable backup solution for ceph...
[20:42] <Vjarjadian> more OSDs :)
[20:43] <jksM> scalability-junk, why not just use traditional backup software?
[20:44] <scalability-junk> jksM, how would you backup a 12tb ceph cluster then?
[20:44] <ebo^> with another 12tb ceph cluster?
[20:44] <scalability-junk> ebo^, elaborate ;)
[20:45] <jksM> scalability-junk, like you would backup 12 tb of other data... it is hard to give more information without knowing what you store on there
[20:45] <ebo^> build second cluster, copy files, ..., profit
[20:46] <scalability-junk> ebo^, jksM I don't know what data is in there. see me as the provider, who wants to backup all objects from rados.
[20:46] <ebo^> define backup
[20:46] <jksM> scalability-junk, well, you must know what you store on there ;-)
[20:46] <jksM> scalability-junk, do you use it for qemu-kvm images or?
[20:47] <scalability-junk> jksM, yeah that's the first plan ;)
[20:47] <jksM> scalability-junk, so backup each image individually with traditional software?
[20:47] <jksM> or backup the files on the filesystem inside each image
[20:47] <scalability-junk> ebo^, backup in snapshot of data. for disaster recovery
[20:47] <ebo^> off site?
[20:48] <scalability-junk> jksM, yeah but when one of my users is using ceph, I really don't see the point of going into their image and doing it like that
[20:48] <scalability-junk> ebo^, yeah
[20:48] <jksM> scalability-junk, then just backup the image
[20:49] <jksM> you're not going to be able to backup kvm virtual machines without guest cooperation... unless you can just power them down at will
[20:49] <ebo^> yeah consistent state and all
[20:49] <scalability-junk> jksM, but backing up 100+ volumes doesn't seem like the solution someone (a ceph provider) would wanna do
[20:50] <scalability-junk> ebo^, yeah true
[20:50] <jksM> scalability-junk, what would you like to do instead then?
[20:52] <Vjarjadian> scalability junk... maybe the geo-replication feature coming in the next version will be good for you...
[20:52] <scalability-junk> jksM, since you asked for what i would like ;) I would love to have one async replica, which is made available to the main cluster and therefore syncs with snapshots.
[20:52] <scalability-junk> Vjarjadian, yeah something like that.
[20:52] <scalability-junk> I just wanna discuss some solutions from within ceph.
[20:52] <jksM> I don't think you can do that with the current ceph
[20:53] <Vjarjadian> you can use ceph over WAN if you have an insanely good WAN...
[20:53] <scalability-junk> Vjarjadian, haha good one ;)
[20:54] <jksM> or perhaps shutdown all osds and mds, snapshot all underlying filesystems and then start up the osds and mds again
[20:54] <jksM> and then backup the snapshot
[20:54] <scalability-junk> jksM, for one server that's my plan yeah
[20:54] <Vjarjadian> that still requires it being offline...
[20:54] <scalability-junk> but imagine 10+ servers with 2-4 osds ...
[20:54] <jksM> yep, of course
[20:54] <jksM> it would mean a long downtime while the osds start up again
[20:55] <scalability-junk> syncing the objects with rsync into snapshots seems like it would give corrupt backups, right?
[20:55] <jksM> I guess you could just snapshot everything at exactly the same time, which would simulate a power failure of all servers... but I doubt that is a good idea ;-)
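A very rough sketch of the stop/snapshot/restart/back-up idea discussed above, run per storage host; the service name, volume names and the use of LVM are all assumptions:

    service ceph stop                                  # stop mon/osd/mds on this host
    lvcreate -s -L 10G -n osd0-snap /dev/vg0/osd0      # snapshot the osd's data volume
    service ceph start                                 # bring the daemons back
    mkdir -p /mnt/osd0-snap
    mount -o ro /dev/vg0/osd0-snap /mnt/osd0-snap
    rsync -a /mnt/osd0-snap/ backuphost:/backups/osd0/
    umount /mnt/osd0-snap && lvremove -f /dev/vg0/osd0-snap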
[20:56] <jksM> scalability-junk, what do you mean by syncing objects with snapshots?
[20:56] <scalability-junk> rados uses the underlying filesystem to store the storage objects and my plan was to sync these to another offsite storage
[20:56] <jksM> you cannot do that while the system is running
[20:57] * dosaboy (~gizmo@12.231.120.253) has joined #ceph
[20:57] * loicd (~loic@magenta.dachary.org) Quit (Quit: Leaving.)
[20:57] * loicd (~loic@magenta.dachary.org) has joined #ceph
[20:57] <scalability-junk> mh damn my plan is falling apart
[20:58] <scalability-junk> still I can't imagine how big providers like dreamhost do it.
[20:58] <Vjarjadian> just get a bigger cluster... and wait for the new features :)
[20:58] <jksM> scalability-junk, how do you know that they have backups? ;-)
[20:58] <scalability-junk> Vjarjadian, yeah but a bigger cluster doesn't help to prevent user dumbness and update failures
[20:58] <scalability-junk> jksM, I don't, but I would even love to hear that they don't
[20:59] <scalability-junk> then I know alright stop trying
[20:59] <Vjarjadian> iirc, rados or one of the things has snapshots internally...
[20:59] <jksM> why is it you so badly don't want to take backups the traditional way?
[20:59] <scalability-junk> jksM, how would you do that?
[20:59] <jksM> like I outlined before
[20:59] * kloo (~kloo@a82-92-246-211.adsl.xs4all.nl) has joined #ceph
[20:59] <kloo> hi.
[20:59] <jksM> hi
[21:00] <Vjarjadian> scalability, you have VMs or something accessing your cluster... use that to back it up...
[21:00] <scalability-junk> yeah but what about a volume being 4tb, hard to backup something like that on a 3tb disk ;)
[21:00] <kloo> i have an osd that sigsegvs now that i try to run it on bobtail.
[21:00] <jksM> scalability-junk, you can never backup 4 tb of data on a 3 tb disk... that would require magic
[21:00] <kloo> .. immediately on start-up.
[21:00] <scalability-junk> Vjarjadian, but imagine ceph being used as object and ebs store
[21:01] <scalability-junk> jksM, yeah that's why backing up the chunked-up volume is the idea in my head instead of backing up the volume the old-fashioned way ;)
[21:01] <Vjarjadian> maybe you just need users who arent horrible :)
[21:01] <jksM> scalability-junk, you're not making any sense now
[21:02] <scalability-junk> Vjarjadian, still leaves onsite disaster and backup stuff ;)
[21:02] <Vjarjadian> you could dedup the data.. then 4tb might fit on a 3tb disk...
[21:02] <scalability-junk> Vjarjadian, hehe
[21:02] <jksM> scalability-junk, if you need to backup 4 tb of data, and you only have a 3 tb disk to keep it on... it will never work... you need to get additional disks then
[21:02] <Vjarjadian> scalabilty, then maybe ceph isnt the solution that does 'everything' you want... just ask, does it do everythin you 'need'
[21:03] <scalability-junk> jksM, i was talking about one 4tb volume on ceph. ceph chunks up the data so I can distribute it to more than one 3tb disk
[21:03] <jksM> scalability-junk, yes, but as I mentioned, you do not want to do it like that...
[21:03] <jksM> scalability-junk, just use traditional backups... they have been "chunking" up data for decades
[21:04] <scalability-junk> jksM, just theoretically, how would you do the "use traditional backups" thing with 100+ servers of data?
[21:04] <jksM> scalability-junk, why does it matter to you that it is 100 servers instead of 1 server?
[21:04] <jksM> scalability-junk, backup one server at a time
[21:04] <Vjarjadian> this is going in circles :)
[21:05] <scalability-junk> Vjarjadian, don't worry I love ceph and the featureset, but that doesn't mean I can't question the status quo
[21:05] <Vjarjadian> like most things open source... it's a work in progress
[21:05] <jksM> scalability-junk, what I'm saying is that you want to take a backup of EVERYTHING in ONE go... I'm saying you could backup one image at a time
[21:05] <Vjarjadian> theyre planning geo-replication... but theyre not magicians :)
[21:06] <Vjarjadian> i tried shouting at my computer once... other than amusing kids... it doesnt work
[21:06] <scalability-junk> jksM, so I would get the list of all available objects/volumes/filesystems, then shut them off/snapshot them, and then ask the radosgw, rbd etc. to retrieve the data and copy it offsite with some chunking logic of my own?
[21:06] <jksM> scalability-junk, that is one way of doing it, yes
[21:07] <jksM> you don't need to actually program "chunking logic" yourself... just use ordinary backup programs
[21:07] <jksM> they handle it fine if the backup needs to be split over 5 tapes or 2 hard drives or whatever
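Backing up one image at a time, as suggested, can be done with tools that already existed in bobtail; a snapshot gives a crash-consistent (not application-consistent) copy, which ties back to the earlier point about guest cooperation. The pool name and destination path here are assumptions:

    for img in $(rbd ls rbd); do
        rbd snap create rbd/$img@backup               # point-in-time, crash-consistent
        rbd export rbd/$img@backup /backups/$img.img
        rbd snap rm rbd/$img@backup
    done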
[21:07] <scalability-junk> jksM, still doesn't seem right. but let's go with this.
[21:08] <jksM> you need to explain why it doesn't seem right
[21:08] <jksM> it's hard to guess requirements you might have, that you haven't told about ;-)
[21:08] <scalability-junk> jksM, true
[21:08] <scalability-junk> jksM, one feature ceph has is the unified storage backend for different frontend uses
[21:09] <ebo^> is there any documentation on the meaning of the osd perfcounters?
[21:09] <scalability-junk> with the traditional methods I would write one script for each frontend use, which retrieves the data (objects, volumes etc.) and backs them up.
[21:12] <jksM> scalability-junk, yes?
[21:13] <scalability-junk> just to make sure: is there an admin user, which can see every object stored via radosgw for example?
[21:13] <scalability-junk> this would be needed for retrieving all objects and backing them up.
[21:14] <jksM> you can access all the objects if you own the ceph cluster
[21:16] <scalability-junk> jksM, but with a convenient api method? #interest
[21:16] <jksM> yes, there are apis available?
[21:17] <jksM> scalability-junk, check for example: http://ceph.com/docs/master/radosgw/admin/adminops/#get-object
[21:18] <scalability-junk> jksM, yeah cool
[21:19] <scalability-junk> mh could be quite easy as a solution
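On the radosgw side, the gateway's own admin tool can already enumerate everything it stores, which covers the "admin user" question; the bucket and uid below are placeholders:

    radosgw-admin bucket list                          # all buckets known to the gateway
    radosgw-admin bucket stats --bucket=somebucket     # object count and size for one bucket
    radosgw-admin user info --uid=someuser             # keys and metadata for one user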
[21:26] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[21:45] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Ping timeout: 480 seconds)
[21:58] * Pagefaulted (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) has joined #ceph
[21:59] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[22:06] * dosaboy (~gizmo@12.231.120.253) Quit (Quit: Leaving.)
[22:11] <jksM> sagelap, weird.. it won't crash now that I have enabled debugging :-|
[22:22] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[22:25] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) has joined #ceph
[22:26] * Pagefaulted (~AndChat73@c-67-168-132-228.hsd1.wa.comcast.net) Quit (Read error: No route to host)
[22:27] * dosaboy (~gizmo@12.231.120.253) has joined #ceph
[22:33] * danieagle (~Daniel@177.133.175.185) Quit (Quit: Inte+ :-) e Muito Obrigado Por Tudo!!! ^^)
[22:40] * slang1 (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) has joined #ceph
[22:40] * slang (~slang@207-229-177-80.c3-0.drb-ubr1.chi-drb.il.cable.rcn.com) Quit (Read error: Connection reset by peer)
[22:42] * Cube (~Cube@cpe-76-95-223-199.socal.res.rr.com) Quit (Quit: Leaving.)
[22:48] * dosaboy (~gizmo@12.231.120.253) Quit (Quit: Leaving.)
[22:56] * dosaboy (~gizmo@12.231.120.253) has joined #ceph
[22:58] * xdeller (~xdeller@broadband-77-37-224-84.nationalcablenetworks.ru) Quit (Quit: Leaving)
[22:58] <elder> cd /
[22:58] <elder> (:
[23:00] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (Quit: Leaving)
[23:01] <kloo> sigh, no luck with bobtail. :(
[23:01] * kloo (~kloo@a82-92-246-211.adsl.xs4all.nl) Quit (Quit: good night.)
[23:06] * sleinen1 (~Adium@2001:620:0:26:18fe:448f:44dc:e4f3) Quit (Quit: Leaving.)
[23:14] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[23:14] * ChanServ sets mode +o elder
[23:24] * dty (~derek@pool-71-191-131-36.washdc.fios.verizon.net) has joined #ceph
[23:38] * mdxi (~mdxi@74-95-29-182-Atlanta.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[23:41] * LeaChim (~LeaChim@b0faf18a.bb.sky.com) Quit (Remote host closed the connection)
[23:52] * xiaoxi (~xiaoxiche@jfdmzpr06-ext.jf.intel.com) has joined #ceph
[23:52] <noob21> in general does adding more monitors to the ceph cluster slow it down?

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.