#ceph IRC Log


IRC Log for 2013-08-02

Timestamps are in GMT/BST.

[0:01] * gentleben (~sseveranc@c-98-207-40-73.hsd1.ca.comcast.net) has joined #ceph
[0:02] <gentleben> is there any info other than the trac ticket on getting ceph (or at least the rados lib) building on osx?
[0:08] <dmick> gentleben: there's a blueprint for it on ceph.com, and Noah will be discussing it next week at the Ceph Summit
[0:09] <dmick> https://ceph.com/community/ceph-developer-summit-emperor/
[0:10] <dmick> http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Increasing_Ceph_portability
[0:11] <gentleben> dmick: thanks. I don't need to actually run it so this may be sufficient
[0:13] * sagelap1 (~sage@2600:1001:b116:55f3:2def:815e:75b9:3a94) has joined #ceph
[0:16] <lautriv> dmick, interested in what my real issue with the truncated disklabels was ?
[0:16] <dmick> lautriv: don't recall the issue, but yes, description and resolution are always interesting
[0:19] <lautriv> dmick, problem was that disk prepare always ended up in "unrecognizable disklabel", but only on one OSD and a specific kind of drive ............... there were some RAID superblocks which only become visible if one creates a partition manually and runs "blkid", where they appear with a sub-uuid.
[0:19] * sagelap (~sage@216.194.44.151) Quit (Ping timeout: 480 seconds)
[0:19] <dmick> ugh. linux block device management has become byzantine
[0:20] <lautriv> the heavy part is, they survived several dd's
[0:22] <lautriv> however, the other thing i found _should_ really be adopted in ceph-disk: adding '--set-alignment=2048' in line 1022, since it is highly desirable to have proper alignment.
[0:23] <dmick> and what defines proper?
[0:24] <lautriv> dmick, i define proper in this case anything which is the multiple of common blocksizes and flush-page sizes.
[0:24] <dmick> I've had a lot of bad experience with assumptions about partition alignment
[0:24] <dmick> it's generally driver-dependent at least
[0:24] * diegows (~diegows@200.68.116.185) Quit (Ping timeout: 480 seconds)
[0:25] <dmick> but if you could file an issue with details of what you saw and how that fixed it, we could do some research
[0:25] <lautriv> dmick, you are right, at least since newer drives don't match that scheme at all, but they are still somewhat optimized for a "cylinder-like" behaviour
[0:27] <lautriv> dmick, i have nothing to file as bug, maybe a recommendation for alignment and a warning on old superblocks.
[0:27] <dmick> lautriv: surely you saw a problem with the existing ceph-disk that caused you to add the --set-alignment=2048?
[0:29] <lautriv> dmick, the dry run showed that sgdisk with that option doesn't run into trouble, but the real issue behind it was that the remaining superblock "suggested" the last cylinder was somewhere on another disk. finally not a ceph-disk issue.
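
(A minimal sketch of the sgdisk alignment option under discussion; the device name and partition layout are placeholders, not the actual ceph-disk change:)

    # dry run first: --pretend prints what would be written without touching the disk
    sgdisk --pretend --set-alignment=2048 --largest-new=1 /dev/sdX
    # then create the data partition aligned to 2048 sectors (1 MiB)
    sgdisk --set-alignment=2048 --largest-new=1 /dev/sdX
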
[0:32] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) has joined #ceph
[0:35] * alram (~alram@cpe-76-167-50-51.socal.res.rr.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * iggy (~iggy@theiggy.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * markl (~mark@tpsit.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * cmdrk (~lincoln@c-24-12-206-91.hsd1.il.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * terje-_ (~root@135.109.216.239) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * janisg (~troll@85.254.50.23) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * cjh_ (~cjh@ps123903.dreamhost.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * mjeanson (~mjeanson@00012705.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * sjust (~sam@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * joshd (~joshd@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * tchmnkyz (~jeremy@0001638b.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * _robbat2|irssi (nobody@www2.orbis-terrarum.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * alexbligh (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * baffle_ (baffle@jump.stenstad.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * nwl (~levine@atticus.yoyo.org) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Ormod (~valtha@ohmu.fi) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * josef (~seven@li70-116.members.linode.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * liiwi (liiwi@idle.fi) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * [cave] (~quassel@boxacle.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * [fred] (fred@konfuzi.us) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Sargun_ (~sargun@208-106-98-2.static.sonic.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * devoid (~devoid@130.202.135.210) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * zackc (~zack@0001ba60.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jeroenmoors (~quassel@193.104.8.40) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jnq (~jon@0001b7cc.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * portante (~portante@nat-pool-bos-t.redhat.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * lmb (lmb@212.8.204.10) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * cclien_ (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * nigwil (~idontknow@174.143.209.84) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Jakdaw (~chris@puma-mxisp.mxtelecom.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jf-jenni (~jf-jenni@stallman.cse.ohio-state.edu) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * beardo (~sma310@beardo.cc.lehigh.edu) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Kdecherf (~kdecherf@shaolan.kdecherf.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * AaronSchulz (~chatzilla@192.195.83.36) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * guppy (~quassel@guppy.xxx) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * gregaf (~Adium@2607:f298:a:607:112c:1fa8:77e1:af2e) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * masterpe (~masterpe@2a01:670:400::43) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Azrael (~azrael@terra.negativeblue.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * maswan (maswan@kennedy.acc.umu.se) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * sbadia (~sbadia@yasaw.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * soren (~soren@hydrogen.linux2go.dk) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * ivan` (~ivan`@000130ca.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * rturk-away (~rturk@ds2390.dreamservers.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * chutz (~chutz@rygel.linuxfreak.ca) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * yeled (~yeled@spodder.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * infernix (nix@cl-1404.ams-04.nl.sixxs.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jjgalvez (~jjgalvez@ip72-193-217-254.lv.lv.cox.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * sprachgenerator (~sprachgen@130.202.135.194) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * gregmark (~Adium@68.87.42.115) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jochen (~jochen@laevar.de) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * NaioN_ (stefan@andor.naion.nl) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * tdb (~tdb@willow.kent.ac.uk) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Fetch_ (fetch@gimel.cepheid.org) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Tamil (~tamil@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * dmick (~dmick@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * cfreak201 (~cfreak200@p4FF3E75F.dip0.t-ipconnect.de) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * LeaChim (~LeaChim@2.122.178.96) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * \ask (~ask@oz.develooper.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Psi-Jack_ (~Psi-Jack@yggdrasil.hostdruids.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * mbjorling (~SilverWol@130.226.133.120) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * jks (~jks@3e6b5724.rev.stofanet.dk) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * Daviey (~DavieyOFT@bootie.daviey.com) Quit (resistance.oftc.net synthon.oftc.net)
[0:35] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) Quit (resistance.oftc.net synthon.oftc.net)
[0:36] <lautriv> ok, next stop: those pools and how to manage them; somehow confusing to browse the docs, could anyone point me to the right link ?
[0:36] * nwf_ (~nwf@67.62.51.95) has joined #ceph
[0:36] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[0:36] * devoid (~devoid@130.202.135.210) has joined #ceph
[0:36] * jjgalvez (~jjgalvez@ip72-193-217-254.lv.lv.cox.net) has joined #ceph
[0:36] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[0:36] * alram (~alram@cpe-76-167-50-51.socal.res.rr.com) has joined #ceph
[0:36] * zackc (~zack@0001ba60.user.oftc.net) has joined #ceph
[0:36] * sprachgenerator (~sprachgen@130.202.135.194) has joined #ceph
[0:36] * mbjorling (~SilverWol@130.226.133.120) has joined #ceph
[0:36] * [cave] (~quassel@boxacle.net) has joined #ceph
[0:36] * Ormod (~valtha@ohmu.fi) has joined #ceph
[0:36] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[0:36] * Sargun_ (~sargun@208-106-98-2.static.sonic.net) has joined #ceph
[0:36] * liiwi (liiwi@idle.fi) has joined #ceph
[0:36] * baffle_ (baffle@jump.stenstad.net) has joined #ceph
[0:36] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[0:36] * [fred] (fred@konfuzi.us) has joined #ceph
[0:36] * alexbligh (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) has joined #ceph
[0:36] * _robbat2|irssi (nobody@www2.orbis-terrarum.net) has joined #ceph
[0:36] * tchmnkyz (~jeremy@0001638b.user.oftc.net) has joined #ceph
[0:36] * joshd (~joshd@38.122.20.226) has joined #ceph
[0:36] * sjust (~sam@38.122.20.226) has joined #ceph
[0:36] * josef (~seven@li70-116.members.linode.com) has joined #ceph
[0:36] * mjeanson (~mjeanson@00012705.user.oftc.net) has joined #ceph
[0:36] * cjh_ (~cjh@ps123903.dreamhost.com) has joined #ceph
[0:36] * janisg (~troll@85.254.50.23) has joined #ceph
[0:36] * terje-_ (~root@135.109.216.239) has joined #ceph
[0:36] * cmdrk (~lincoln@c-24-12-206-91.hsd1.il.comcast.net) has joined #ceph
[0:36] * markl (~mark@tpsit.com) has joined #ceph
[0:36] * iggy (~iggy@theiggy.com) has joined #ceph
[0:36] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[0:36] * LeaChim (~LeaChim@2.122.178.96) has joined #ceph
[0:36] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) has joined #ceph
[0:36] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) has joined #ceph
[0:36] * infernix (nix@cl-1404.ams-04.nl.sixxs.net) has joined #ceph
[0:36] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[0:36] * gregaf (~Adium@2607:f298:a:607:112c:1fa8:77e1:af2e) has joined #ceph
[0:36] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[0:36] * cfreak201 (~cfreak200@p4FF3E75F.dip0.t-ipconnect.de) has joined #ceph
[0:36] * guppy (~quassel@guppy.xxx) has joined #ceph
[0:36] * AaronSchulz (~chatzilla@192.195.83.36) has joined #ceph
[0:36] * Daviey (~DavieyOFT@bootie.daviey.com) has joined #ceph
[0:36] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[0:36] * Azrael (~azrael@terra.negativeblue.com) has joined #ceph
[0:36] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[0:36] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[0:36] * dmick (~dmick@38.122.20.226) has joined #ceph
[0:36] * Psi-Jack_ (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[0:36] * Tamil (~tamil@38.122.20.226) has joined #ceph
[0:36] * yeled (~yeled@spodder.com) has joined #ceph
[0:36] * jks (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[0:36] * Kdecherf (~kdecherf@shaolan.kdecherf.com) has joined #ceph
[0:36] * beardo (~sma310@beardo.cc.lehigh.edu) has joined #ceph
[0:36] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[0:36] * ivan` (~ivan`@000130ca.user.oftc.net) has joined #ceph
[0:36] * jf-jenni (~jf-jenni@stallman.cse.ohio-state.edu) has joined #ceph
[0:36] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) has joined #ceph
[0:36] * Jakdaw (~chris@puma-mxisp.mxtelecom.com) has joined #ceph
[0:36] * nigwil (~idontknow@174.143.209.84) has joined #ceph
[0:36] * cclien_ (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) has joined #ceph
[0:36] * lmb (lmb@212.8.204.10) has joined #ceph
[0:36] * portante (~portante@nat-pool-bos-t.redhat.com) has joined #ceph
[0:36] * sbadia (~sbadia@yasaw.net) has joined #ceph
[0:36] * maswan (maswan@kennedy.acc.umu.se) has joined #ceph
[0:36] * rturk-away (~rturk@ds2390.dreamservers.com) has joined #ceph
[0:36] * jnq (~jon@0001b7cc.user.oftc.net) has joined #ceph
[0:36] * jeroenmoors (~quassel@193.104.8.40) has joined #ceph
[0:36] * soren (~soren@hydrogen.linux2go.dk) has joined #ceph
[0:36] * Fetch_ (fetch@gimel.cepheid.org) has joined #ceph
[0:36] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[0:36] * tdb (~tdb@willow.kent.ac.uk) has joined #ceph
[0:36] * NaioN_ (stefan@andor.naion.nl) has joined #ceph
[0:36] * jochen (~jochen@laevar.de) has joined #ceph
[0:36] * \ask (~ask@oz.develooper.com) has joined #ceph
[0:37] <lautriv> (repeat because netsplit) ok, next stop: those pools and how to manage them; somehow confusing to browse the docs, could anyone point me to the right link ?
[0:38] * terje (~joey@63-154-140-200.mpls.qwest.net) has joined #ceph
[0:38] * kyann (~oftc-webi@did75-15-88-160-187-237.fbx.proxad.net) Quit (Quit: Page closed)
[0:44] <lautriv> dmick, small hint for me ?
[0:46] <dmick> sorry; the last I saw was that I'd asked a question about your experience with ceph-disk without --set-alignment
[0:46] * terje (~joey@63-154-140-200.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[0:53] * BillK (~BillK-OFT@58-7-165-124.dyn.iinet.net.au) has joined #ceph
[0:54] * lautriv (~lautriv@f050082253.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[0:55] * sprachgenerator (~sprachgen@130.202.135.194) Quit (Quit: sprachgenerator)
[0:55] * jefferai (~quassel@corkblock.jefferai.org) Quit (Quit: No Ping reply in 180 seconds.)
[0:56] * jefferai (~quassel@corkblock.jefferai.org) has joined #ceph
[0:56] * dpippenger1 (~riven@tenant.pas.idealab.com) has joined #ceph
[0:56] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Read error: Connection reset by peer)
[1:02] <loicd> \o/
[1:03] * lautriv (~lautriv@f050084144.adsl.alicedsl.de) has joined #ceph
[1:07] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) Quit (Ping timeout: 480 seconds)
[1:08] <loicd> sjustlaptop: I understand why the newest last_update helps select the authoritative log. But I'm not sure I understand why it should be the one with the longest tail.
[1:08] <dmick> lautriv: can you file a bug about the alignment issue? also, what small hint did you want?
[1:12] <lautriv> dmick, the dry run showed that sgdisk with that option doesn't run into trouble, but the real issue behind it was that the remaining superblock "suggested" the last cylinder was somewhere on another disk. finally not a ceph-disk issue.
[1:14] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Quit: Ex-Chat)
[1:14] <dmick> ok?...
[1:14] * TiCPU (~jeromepou@190-130.cgocable.ca) Quit (Quit: Ex-Chat)
[1:14] <lautriv> dmick, about the hint: i read somewhere about pools in ceph and i have some idea of what/how they could be used for, but if i go to ceph.com and search deeper, it is very well hidden.
[1:16] <dmick> a pool is the top-level division of the cluster
[1:17] * sagelap1 (~sage@2600:1001:b116:55f3:2def:815e:75b9:3a94) Quit (Read error: No route to host)
[1:17] <dmick> when you refer to objects in a cluster, you refer to them by "poolname, objectname"
[1:18] <dmick> there are several well-known pools for defaults for various services. 'data' and 'metadata' are used by cephfs by default; 'rbd' is used for rbd images by default; many more are used by radosgw. You can always override the pool used.
[1:19] <dmick> pools are where the replication size is set, where crush rule(sets) are selected, and where access permissions are applied (until namespaces add another level of abstraction; namespaces are brand new)
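
(For reference, a rough sketch of the per-pool settings dmick mentions; the pool name, pg count, rule number and object/file names below are placeholders:)

    ceph osd pool create mypool 128            # create a pool with 128 placement groups
    ceph osd pool set mypool size 3            # replication size is a pool property
    ceph osd pool set mypool crush_ruleset 0   # select the crush rule(set) for the pool
    rados -p mypool put myobject ./somefile    # objects are addressed as "poolname, objectname"
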
[1:19] <lautriv> dmick, i don't use radosgw but like to read some details for CephFS
[1:22] * BillK (~BillK-OFT@58-7-165-124.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[1:24] <lautriv> dmick, or let me explain a bit more ..... i went from NFS to Ceph since NFS on recent kernels freezes servers on V4 with massive traffic, so i asked in ##linux on freenode for suggestions for a network FS i could mount the root of diskless clients on; now i need some more details to manage this part.
[1:25] <dmick> well, that doesn't have much to do with pools
[1:25] <dmick> but cephfs documentation is on ceph.com along with the rest
[1:25] * alfredod_ (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[1:27] * alfredod_ is now known as alfredodeza_
[1:28] <lautriv> dmick, the documentation talks only about "mount monitor:6789/ /somewhere", can ceph do something like "mount monitor:6789/some/other/location /somewhere/else " ?
[1:30] * alram (~alram@cpe-76-167-50-51.socal.res.rr.com) Quit (Quit: leaving)
[1:30] <dmick> SYNOPSIS
[1:30] <dmick> mount.ceph monaddr1[,monaddr2,...]:/[subdir] dir [
[1:30] <dmick> -o options ]
[1:30] <dmick> looks like it to me
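
(Spelled out, a hedged example of mounting a CephFS subdirectory at an arbitrary mountpoint; the monitor address, paths and credentials are placeholders:)

    mount -t ceph monitor:6789:/some/other/location /somewhere/else \
        -o name=admin,secretfile=/etc/ceph/admin.secret
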
[1:32] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Ping timeout: 480 seconds)
[1:38] * alfredodeza_ is now known as alfredodeza
[1:44] * mozg (~andrei@host109-151-35-94.range109-151.btcentralplus.com) Quit (Ping timeout: 480 seconds)
[1:49] * BillK (~BillK-OFT@58-7-165-124.dyn.iinet.net.au) has joined #ceph
[1:53] * devoid (~devoid@130.202.135.210) Quit (Quit: Leaving.)
[1:54] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[2:03] * mschiff (~mschiff@port-36117.pppoe.wtnet.de) Quit (Remote host closed the connection)
[2:04] <lautriv> dmick, sorry, wouldn't ignore you, just busy with a lot of xterms and you didn't hightlight me ;)
[2:04] * diegows (~diegows@190.190.2.126) has joined #ceph
[2:05] <ntranger> hey all, quick question, starting from scratch here. lol I have 2 nodes that I'm wanting to setup ceph for, so when creating the cluster, I should "ceph-deploy new node1 node2"?
[2:05] <ntranger> I'm looking at the quick start, but it doesn't mention about a second node.
[2:05] <lautriv> ntranger, new only for those who will become a mon too.
[2:05] <alfredodeza> ntranger: you could do `ceph-deploy new node{1, 2}`
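
(Note: brace expansion only works without a space inside the braces; with placeholder hostnames, the two forms below are equivalent:)

    ceph-deploy new node{1,2}
    ceph-deploy new node1 node2
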
[2:06] <ntranger> ok. is it recommended to have a monitor on each node, or should I just keep it on 1?
[2:08] * joao (~JL@89.181.144.108) Quit (Quit: Leaving)
[2:08] <lautriv> iirc i read it is recommended to not have the mon on the osd
[2:08] <ntranger> yeah, I saw that as well, but with the hardware we have, its pretty much my option. :/
[2:09] <lautriv> ntranger, i assume that recommendation is mostly for performance ( just a guess ) but at least have enough resources for both.
[2:10] <ntranger> yeah, the hardware we have is pretty beefy.
[2:10] <dmick> you probably want 3 mons for fault tolerance. It doesn't matter a whole lot where they live, but if you have enough resources it's always better to separate functions
[2:11] <dmick> you definitely would like to not share disks between mons and osds but I don't think it's bad to have the daemons on the same machine
[2:12] <lautriv> is the output of free space on the osd-disks trustworthy ?
[2:13] <lurbs> lautriv: You can run a cluster with a single monitor, but it's a single point of failure. You really need at least three, because quorum requires a majority vote (2 of 3 in this case).
[2:13] <lurbs> With two monitors you can't have either fail, because the one remaining can't get a majority quorum.
[2:14] <lautriv> lurbs, ntranger asked that part ;)
[2:14] <lurbs> Bleh, eyes fail.
[2:14] <lurbs> ntranger: See above. :)
[2:14] <ntranger> yeah, I have a 3rd node that will be in place as soon as I get these 2 up and running. :)
[2:15] <ntranger> my brain is mush, and I don't even really know why I asked that, knowing the answer. lol
[2:15] <ntranger> sorry about that
[2:16] <lautriv> i wonder about this : built a test-cluster where osd1 has 2x73G and OSD2 has 1x146G, which makes 146G with n+1 redundancy. now i try to copy 70G of data and it actually reports 96% full ?
[2:16] <dmick> ntranger: you probably want more than one mon, and the next useful number is 3. that doesn't *have* to mean 3 machines, although it means that if the machine running two mons fails, you're dead. That might be an acceptable risk to you.
[2:17] <dmick> but on a two-node cluster fault-tolerance is not very important
[2:18] <dmick> lautriv: you might want to study ceph df detail. it can be tricky
[2:19] <lurbs> lautriv: Are those two disks striped into a single OSD, or separate OSDs on the same machine?
[2:21] <lautriv> lurbs, prepared/activated in one call, i have actually no idea how to control the amount.
[2:24] <dmick> is machine osd1 running two osds or one?
[2:25] <lautriv> dmick, the old way i could define [osd.a] node:drive and further [osd.b] samenode:otherdrive, but ceph-deploy only does node:drivea node:driveb
[2:26] <dmick> and that's fine, but what's the state of the cluster, regardless of how it was deployed?
[2:27] <lurbs> I suspect that if you have three OSDs (2 * 73 GB on one machine, and 1 * 146 GB on the other) then you'll need to change the weights such that the 146 GB drive gets twice the data.
[2:27] <lurbs> What does 'ceph osd tree' say?
[2:28] <dmick> or even ps | grep osd
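
(A quick sketch of the inspection and reweighting commands being suggested; the osd id and weight are placeholders:)

    ceph df detail                      # cluster-wide and per-pool usage breakdown
    ceph osd tree                       # osd layout and crush weights
    ps aux | grep ceph-osd              # how many osd daemons are actually running on the host
    ceph osd crush reweight osd.2 2.0   # e.g. give the larger drive proportionally more weight
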
[2:29] * houkouonchi-work (~linux@12.248.40.138) Quit (Remote host closed the connection)
[2:29] <dmick> lautriv^
[2:30] <lautriv> moment, i observed some unstable behaviour ...
[2:31] <lautriv> actual tree -> http://pastebin.com/pFaaQNbg one crashed for unknown reason :(
[2:32] <dmick> so there are indeed two osds on that one machine, although their waights are set correctly
[2:32] <dmick> *weights
[2:33] <dmick> however one object is going to be limited to something less than 73G, of course
[2:35] <lautriv> ceph-deploy has no call for one osd with 2 drives then.
[2:35] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (Ping timeout: 480 seconds)
[2:35] <dmick> right. nothing about ceph glues two drives into one filesystem
[2:35] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[2:35] <dmick> that was true with mkcephfs as well
[2:36] <lurbs> Oh, I wasn't aware that upstart set weights based on df. My main cluster is of mkcephs/sysvinit vintage.
[2:36] <lurbs> s/mkcephs/mkcephfs/
[2:38] <lautriv> dmick, so if i had 3 servers with 20 drives each i'd have no glue for one large FS ?
[2:39] <lautriv> but the old could do [osd.a] node devs= drive drive drive ...
[2:39] <dmick> lautriv: not to my knowledge from within the ceph administration. Of course you can glue drives together however you like and then present that block device to ceph
[2:40] <dmick> it *might* have been possible to convince mkcephfs to do stupid btrfs tricks, but I kinda doubt it; that's all I can imagine you're referring to
[2:41] <ntranger> I just ran ceph-deploy install, and got this error. [ERROR ] Warning: RPMDB altered outside of yum. Is this something I should be worried about? I just googled it, and it shows another program that someone installed, and got this error, and it says its just a warning, and it should be fine?
[2:43] <lautriv> ntranger, since ceph does its own thing with a bunch of python and not asking your package-manager ...
[2:43] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[2:44] * huangjun (~kvirc@111.173.100.212) has joined #ceph
[2:44] * lautriv gets the feeling the stacking of ceph is worse than LVM
[2:44] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[2:44] <dmick> ceph-deploy installs with rpm
[2:45] <dmick> or rather yum
[2:47] * xmltok (~xmltok@pool101.bizrate.com) Quit (Quit: Bye!)
[2:47] <ntranger> yeah, I thought it was yum.
[2:47] <dmick> I guess on fedora it's rpm
[2:47] <dmick> I'm no rpm/yum expert but it seems like I've seen it whine like that a lot
[2:48] <lautriv> ok, what is the recommended way of gluing drives together if the admin-node is unable to do so ?
[2:48] <dmick> perhaps provoked by rpm --import
[2:54] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) has joined #ceph
[2:58] * yy-nm (~chatzilla@218.74.33.110) has joined #ceph
[2:59] <lautriv> no recommended way ?
[3:00] <dmick> lautriv: typically we don't recommend using multiple drives with one OSD, but it's up to you if you want to do that
[3:00] <ntranger> I'm at the mon create step, which I've run, and it's saying that I should have keyrings in my bootstrap folders, but I've run it twice and they aren't being created; I'm not getting any errors when it runs either.
[3:01] <dmick> and no, there isn't any particular recommended solution. You increase chances of osd failure because each drive has its own error probability, and in the same chassis they're somewhat correlated of course too
[3:01] <lautriv> dmick, so what do people do when they have dedicated drive-servers ?
[3:01] <dmick> run multiple OSDs, like you are
[3:02] <dmick> nothing wrong with that fundamentally. We don't yet know why you appear to be running out of space early, but I haven't seen any discussion of ceph df detail like I suggested
[3:02] <lautriv> dmick, but that gives me several smaller FS
[3:02] <dmick> yes, and that's ok (that's what crush weights are for)
[3:03] <dmick> ntranger: so the monitor has to come up successfully, and then ceph-create-keys runs and creates those keys
[3:03] <dmick> so first check if the mons actually came up without error
[3:03] <dmick> and then see if you have a ceph-create-keys process running
[3:04] <ntranger> ok
[3:04] <lautriv> dmick, in other words you are not talking about any mountable FS but the internal fragmentation.
[3:04] <dmick> lautriv: I don't understand what you mean. if by "mountable FS" you mean "cephfs", that uses the whole cluster, not any one OSD
[3:05] <dmick> if you mean instead "the FS that the OSD process mounts", then yeah, I'm talking about one per disk basically
[3:05] <dmick> maybe you could just ask what is not making sense to you; I sense your model of OSDs/RADOS/Ceph is not quite right
[3:05] <lautriv> dmick, i talk about the logical representation to a/any client.
[3:06] <dmick> but, for example, an RBD client doesn't have any "filesystem" at all; it has a block device
[3:06] <dmick> an RGW client has "s3 objects"
[3:06] <dmick> RADOS is not a filesystem
[3:07] <ntranger> I just did a grep and see a ceph-create-keys process running, but not exactly sure how to see if the monitor has come up successfully.
[3:09] <lautriv> ntranger, ps shows a whole line with path and options if running.
[3:10] <ntranger> yeah, I have /usr/bin/ceph-mon -i ceph01 --pid-file /var/run/ceph/mon.ceph01.pid -c /etc/ceph/ceph.conf, and /usr/bin/python /usr/sbin/ceph-create-keys -i ceph01
[3:10] <lautriv> ntranger, like this :
[3:10] <lautriv> root 4502 0.4 0.5 212564 36060 ? Sl Aug01 3:59 /usr/bin/ceph-mon -i node003 --pid-file /var/run/ceph/mon.node003.pid -c /etc/ceph/ceph.conf
[3:13] <ntranger> looks like I have that. :)
[3:14] <dmick> does ceph -s work?
[3:14] <dmick> (from the machine where you ran ceph-deploy, or from one that was the target of ceph-deploy admin)
[3:15] <ntranger> ah, negative.
[3:15] <ntranger> 2013-08-01 15:13:35.511656 7f151ec52760 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
[3:15] * dpippenger1 (~riven@tenant.pas.idealab.com) Quit (Remote host closed the connection)
[3:15] <ntranger> 2013-08-01 15:13:35.511681 7f151ec52760 -1 ceph_tool_common_init failed.
[3:15] <lautriv> looks like a rebooted osd doesn't even remember its external logs o.O
[3:16] <dmick> ntranger: so you have auth problems. Are you running ceph -s from the directory where you ran ceph-deploy new? there's a ceph.conf file there, most likely?
[3:16] <ntranger> correct
[3:17] <dmick> it mentions a client.admin keyring file; is that present?
[3:17] <ntranger> there isn't. ceph.mon.keyring is present.
[3:17] <ntranger> and ceph.conf and .log
[3:18] <dmick> can you pastebin your ceph.conf?
[3:18] * terje (~joey@63-154-130-169.mpls.qwest.net) has joined #ceph
[3:21] * LeaChim (~LeaChim@2.122.178.96) Quit (Ping timeout: 480 seconds)
[3:22] <ntranger> I'd be happy to, but not familiar with pastebin?
[3:23] <dmick> fpaste.org or pastebin.com or google pastebin
[3:23] <dmick> website to dump text so I can see it
[3:26] * terje (~joey@63-154-130-169.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[3:28] <ntranger> okay, I just did it on pastebin.com, its nytangercephconf
[3:28] <ntranger> ntrangercephconf
[3:31] <dmick> you need to send the url for the post
[3:31] <ntranger> http://pastebin.com/m3KF06H4
[3:32] <dmick> there we go
[3:32] <ntranger> sorry about that. :)
[3:32] <dmick> ok, so it will be using default paths for the keyring. /var/lib/ceph/*client.admin*key*, probably?
[3:32] <ntranger> correct
[3:33] <dmick> is that file readable by the user you're trying to run ceph -s as?
[3:33] <ntranger> yes
[3:34] <dmick> ok. maybe it's just not injected yet then. try ceph -n mon. -k ceph.mon.keyring auth list
[3:34] <dmick> (the 'mon.' key, in the ceph.mon.keyring, is kinda "key0")
[3:36] <ntranger> okay, I tried that, and get this.
[3:36] <ntranger> 2013-08-01 15:34:38.626082 7fb26cdb4700 0 -- 10.128.1.110:0/15900 >> 10.128.1.111:6789/0 pipe(0x7fb260000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
[3:37] <dmick> that sounds like the monitor isn't really running
[3:37] <dmick> if it's still apparently in the ps list, check its log in /var/log/ceph
[3:37] <dmick> see if it's complaining
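
(One way to check, assuming the default log naming where the mon id is the short hostname:)

    tail -n 50 /var/log/ceph/ceph-mon.$(hostname -s).log
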
[3:37] <ntranger> ok
[3:38] <dmick> presumably you can ping from 10.128.1.110 to 10.128.1.111 OK/
[3:38] <dmick> (and back)
[3:38] <dmick> (which you'd have to for ping)
[3:39] <ntranger> yeah, they can both ping each other.
[3:39] <ntranger> (just wanted to double check)
[3:39] <dmick> oh, wait, sorry
[3:39] <dmick> you have two monitors
[3:40] <dmick> that will never form a quorum
[3:40] <dmick> that's probably the problem
[3:40] <dmick> one, or three, not two
[3:40] <ntranger> ah! ok. so I need to put in my 3rd node to get this straight. that makes sense.
[3:42] <ntranger> I'll rack this 3rd node up tomorrow, see if I can get it rolling.
[3:42] <ntranger> I greatly appreciate your help with this. Thanks.
[3:44] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[3:44] <dmick> ntranger: doesn't necessarily have to be another node, but it does need to be another mon
[3:45] <dmick> or one less :)
[3:45] <dmick> although with ceph-deploy it's more difficult to run two mons on one host. as in not really supported by ceph-deploy
[3:46] <dmick> http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/ to nail the concept home
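
(A possible sequence for adding the third monitor once the extra host is racked; "node3" is a placeholder hostname and assumes it is already listed in ceph.conf / mon_initial_members:)

    ceph-deploy install node3
    ceph-deploy mon create node3
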
[3:46] <lautriv> seems the ceph-deploy was a bit of a fast shot ;)
[3:46] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (Remote host closed the connection)
[3:46] <dmick> it's intentionally not one-size-fits-all
[3:47] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[3:48] <ntranger> dmick: thanks brother. :)
[3:48] <dmick> yw
[3:49] * haomaiwang (~haomaiwan@li565-182.members.linode.com) has joined #ceph
[3:53] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) Quit (Quit: Ja odoh a vi sta 'ocete...)
[3:56] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Quit: Leaving.)
[3:57] <lautriv> does ceph do anything special on top of the FS such that it can't be placed into fstab (OSDs in /var/lib/ceph/osd/*) ?
[3:57] * leseb (~leseb@88-190-214-97.rev.dedibox.fr) Quit (Killed (NickServ (Too many failed password attempts.)))
[3:58] * leseb (~leseb@88-190-214-97.rev.dedibox.fr) has joined #ceph
[4:02] <dmick> the startup scripts and udev manage mounting the FS if they're whole-partition (including whole-disk) FSes; if not, you must manage mounting yourself (say, with fstab)
[4:03] <dmick> the FS isn't special, but it's expected that the daemon "owns" it
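
(If you do manage mounting yourself, an fstab entry might look like the following; the device, filesystem and options are assumptions:)

    /dev/sdc1   /var/lib/ceph/osd/ceph-0   xfs   noatime,inode64   0 0
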
[4:03] <lautriv> yes they are but looks like the journal was not meant to be the XFS journal :)
[4:04] * terje_ (~joey@63-154-141-95.mpls.qwest.net) has joined #ceph
[4:05] <lautriv> ok, seems to work but OSD won't come back ...
[4:06] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[4:10] * xmltok (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit (Remote host closed the connection)
[4:10] * xmltok (~xmltok@relay.els4.ticketmaster.com) has joined #ceph
[4:11] <lautriv> i assume kicking the box won't help ...
[4:12] * terje_ (~joey@63-154-141-95.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[4:12] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[4:14] <dmick> lautriv: of course it's impossible to say anything without any specifics, if you're expecting any help
[4:16] <lautriv> dmick, one server crashed unexpectedly, that is the one with the 2 osds; i rebooted and can't get it up again, as shown by my last paste.
[4:16] <dmick> ok
[4:17] <lautriv> since i use rock-solid hardware, there are just 2 possible causes : recent kernel or ceph.
[4:18] <dmick> ok
[4:20] <lautriv> most annoying part : it's already 4:20 AM in germany, and if the test fails i need to waste even more time on alternatives.
[4:21] <lautriv> dmick, did you ever see a broken OSD producing massive network saturation ? ( close to a net-storm )
[4:22] <dmick> not personally, but I can imagine it's possible if communication goes bad somehow and the OSD is trying to recontact the rest of the cluster. that's the kind of thing distributed systems do
[4:24] <lautriv> i was on several like lustre/afs and friends in the past, none went so massive. but before we shift the topic, how could i kick that broken one in the ass and avoid this happening again ?
[4:25] <dmick> there's absolutely no way to advise you without specifics. So far I know a host went down and when it came back up the OSDs didn't start, or at least didn't rejoin the cluster. That's it.
[4:26] <lautriv> doesn't say much just not starting :P
[4:27] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[4:32] <lautriv> and when it says something, that's even a lie : starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
[4:33] <dmick> how is that a lie, exactly?
[4:34] <lautriv> starting osd.foo when there are zero osd up is not true, is it ?
[4:34] <dmick> what? of course it's true; if it started and then died, that's exactly what happens, right?
[4:35] <lautriv> there is no word about terminating/missing/failing/too much effort/out of space/something else
[4:36] <dmick> I don't know what to tell you lautriv. Diagnosing why something isn't working is more than just guessing what logs you're reading and how you're interpreting them.
[4:36] <dmick> but calling it a lie sounds like you're not interested in finding the problem, you just want to blame something or someone
[4:36] <dmick> I'll let someone else try to help; I can't do much.
[4:38] <lautriv> no. of course i'm annoyed about the time this thingie wastes on things that should not happen, but i miss a bunch of standards, e.g. why doesn't it write to syslog like any sane daemon ?
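
(For what it's worth, the daemons can be pointed at syslog through ceph.conf; a minimal sketch:)

    [global]
        log to syslog = true
        err to syslog = true
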
[4:39] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[4:39] * terje (~joey@63-154-141-95.mpls.qwest.net) has joined #ceph
[4:39] <lautriv> someone mentioned "ceph takes the best from several distributed FS"
[4:43] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[4:45] <lautriv> and finally.......... long after the osd stopped, i get "Extended attributes don't appear to work" and "no space left on device", where we talk about XFS and an OSD which went absent before i could reach 50%
[4:47] * terje (~joey@63-154-141-95.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[4:54] <MACscr> hmm, so a file system is needed under ceph? Don't we really start to lose performance when stacking file systems?
[4:54] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[4:56] <lautriv> MACscr, since ceph doesn't stack something on top of it, no.
[4:56] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit ()
[4:58] <dmick> MACscr: presumably you mean if using CephFS on top of RADOS, which is on top of OSDs and mons that use filesystems
[4:58] <dmick> and, no, it doesn't follow that the chain is less performant because of the use of filesystems on the bottom instead of raw disk access
[4:58] <MACscr> lautriv: i guess you are right. I would though be using it for block device storage for my vm's too. So it would have a FS on a FS. Though I guess there isn't really a way to avoid that.
[4:59] <MACscr> i wouldn't be using CephFS since it's not production ready yet
[4:59] <dmick> in which case, yes, VMs use a filesystem on top of a filesystem, but manage to work OK
[5:00] <lautriv> MACscr, NBD or NFS with images.
[5:00] <dmick> NFS with images would be three levels of FS
[5:00] <MACscr> well if i were to do an iscsi san with lets say ZFS (though not working as a file system), then just offered block devices to my hosts, i would essentially only have one file system
[5:01] <dmick> MACscr: one on the VM host, one at ZFS
[5:01] <lautriv> dmick, but it would be also "well known to work"
[5:01] <dmick> lots of people running lots of VMs on Ceph clusters.
[5:01] <lautriv> dmick, and where is the 3rd ?
[5:01] <MACscr> well why would i have to have one on on the ZFS node?
[5:02] <dmick> ZFS *is* a filesystem
[5:02] <MACscr> its not just a file system. You don't have to use it as a file system
[5:02] <dmick> lautriv: VMFS, NFS, filesystem-that-backs-NFS
[5:02] <dmick> OK, then I have no idea what we're talking about. Carry on.
[5:02] <lautriv> dmick, NFS is NO FS
[5:03] <dmick> I'm sure that will be news to the name, which means network file system
[5:03] <lautriv> in another context like regular FS
[5:04] <MACscr> you can just use zfs to manage storage pools and actually leave off the file system part of it
[5:04] <MACscr> im no zfs expert, its just what i have read
[5:05] <lautriv> MACscr, ZFS will waste a lot of resources to give you reasonable performance.
[5:05] <dmick> and I don't know how that's meaningful; it's not like a zdev doesn't involve at least as much code as a filesystem before getting to the pool of devices. It's not a problem, either; focusing on "how many filesystems" is the wrong question
[5:05] * fireD_ (~fireD@93-139-174-231.adsl.net.t-com.hr) has joined #ceph
[5:06] <MACscr> sorry guys, im just completely new to OpenStack and Ceph, so just trying to wrap my head around the whole setup. Especially with the fact that my compute nodes are diskless and will be booted through pxe, so im not sure how the ephemeral storage part will come into play
[5:07] * fireD (~fireD@93-142-252-173.adsl.net.t-com.hr) Quit (Ping timeout: 480 seconds)
[5:07] <MACscr> though that doenst really have to much to do with exactly what we were talking about =P]
[5:07] <lautriv> MACscr, i have a couple of diskless clients via PXE, they just mount toot on NFS-V4 and are done with the rest.
[5:08] <lautriv> *root
[5:09] <lautriv> MACscr, recent NFS can even handle swapfiles, just 3.10+ seems to have bugs.
[5:16] <MACscr> This was my original design when i was just going to have a zfs/iscsi storage node: http://content.screencast.com/users/MACscr/folders/Snagit/media/7205532a-a2dd-47dc-baf7-7361ff4ad561/2013-07-31_23-15-03.png
[5:16] <MACscr> I am not sure which way I am going to do it now yet
[5:17] <MACscr> as this plan was before i even thought about openstack and hadnt really considered ceph
[5:17] <janos> aww pfsense
[5:18] <janos> i love my home pfsense router
[5:18] <MACscr> ha. im just going to be using it for internal stuff. VM traffic wont go through it
[5:18] <janos> yeah
[5:18] <janos> mine is totally for external routing
[5:18] <janos> snagged a little netgate years ago
[5:19] * sjustlaptop (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Ping timeout: 480 seconds)
[5:20] <MACscr> here is my actual full thread on my buildout idea http://forums.servethehome.com/general-chat/2203-openstack-buildout-advice-needed.html
[5:21] <MACscr> obviously working with limited equipment, but not to bad
[5:31] * mjeanson (~mjeanson@00012705.user.oftc.net) Quit (Quit: No Ping reply in 180 seconds.)
[5:31] * mjeanson (~mjeanson@bell.multivax.ca) has joined #ceph
[5:33] * yy-nm (~chatzilla@218.74.33.110) Quit (Read error: Connection reset by peer)
[5:35] * yy-nm (~chatzilla@218.74.33.110) has joined #ceph
[5:35] <lautriv> ok, i'm done here, thanks for 10 wasted days of pointless digging for bugs on a bad fork of lustre, maybe i'll give it another shot if the goods are re-implemented in a decade :P (waiting a few minutes to laugh about answers)
[5:38] <dmick> lautriv: sorry you feel that way.
[5:39] <lautriv> dmick, i could discuss and rant a lot but that's even more of a waste of time. i just wonder how so many people could hold the line.
[5:40] <dmick> shrug. A lot of people are having success. There's no way to know what went wrong for you without getting to the bottom of it.
[5:40] <lautriv> dmick, but just to show you it's not pointless, here is the issue :
[5:40] <lautriv> /dev/sdd1 71651328 71651308 20 100% /var/lib/ceph/osd/ceph-1
[5:40] <lautriv> /dev/sdc1 71651328 71651308 20 100% /var/lib/ceph/osd/ceph-0
[5:41] <lautriv> both osds full, even though i did not save that much, and it won't come up because of disk full ( no space for xattrs ); there was NO disk-full warning or anything sane.
[5:41] <dmick> look like full osd filesystems to me, and yes, OSDs aren't gonna work that way.
[5:42] <lautriv> that should just not happen
[5:42] <dmick> I've personally seen OSD full warnings, and have experienced cluster write failures because the cluster won't allow OSDs to write to that level. I don't know how you got there, but yes, it's wrong.
[5:43] <lautriv> my only luck was to test a small one before i migrated the whole
[5:44] <lautriv> whatever, time to leave this chan and find another solution.
[5:44] * lautriv (~lautriv@f050084144.adsl.alicedsl.de) has left #ceph
[5:46] <MACscr> lol, well he just rage quit
[5:47] * yy-nm (~chatzilla@218.74.33.110) Quit (Read error: Connection reset by peer)
[5:47] * yy-nm (~chatzilla@218.74.33.110) has joined #ceph
[6:09] <MACscr> dmick: what do you think of this concept for a ceph/openstack setup? http://www.screencast.com/t/qs1bqWbTMBxh
[6:11] * huangjun (~kvirc@111.173.100.212) Quit (Read error: Connection reset by peer)
[6:11] * huangjun (~kvirc@111.173.100.212) has joined #ceph
[6:13] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[6:15] * terje_ (~joey@63-154-139-224.mpls.qwest.net) has joined #ceph
[6:23] * terje_ (~joey@63-154-139-224.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[6:24] <MACscr> shoot, i thought the quorum was more with the ceph-mon nodes, so i should typically run an odd number like 3. Now when it comes to ceph-osd nodes, is it ok to only run two?
[6:32] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[6:40] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[6:44] <dmick> MACscr: sure, although that's a mighty small cluster with not very much redundancy or load distribution
[6:44] <dmick> but there's no quorum constraint on OSDs
[6:44] <MACscr> well you have start somewhere, right?
[6:46] <MACscr> What are your thoughts on running ceph-mon on the same systems as some of the openstack management services?
[6:49] <dmick> I dunno. Some of the services use very little bandwidth or resources but I'm not very expert on which
[6:49] <dmick> I tend to address such questions with "wth, try it, if something gets overloaded you can reconfigure"
[6:49] * AfC (~andrew@2001:44b8:31cb:d400:cca6:9abc:d330:8406) has joined #ceph
[6:55] * huangjun (~kvirc@111.173.100.212) Quit (Read error: Connection reset by peer)
[6:58] * xmltok_ (~xmltok@cpe-76-170-26-114.socal.res.rr.com) has joined #ceph
[6:59] * xmltok_ (~xmltok@cpe-76-170-26-114.socal.res.rr.com) Quit ()
[7:05] * xmltok (~xmltok@relay.els4.ticketmaster.com) Quit (Ping timeout: 480 seconds)
[7:10] * terje (~joey@63-154-139-224.mpls.qwest.net) has joined #ceph
[7:15] * terje_ (~joey@63-154-139-224.mpls.qwest.net) has joined #ceph
[7:18] * terje (~joey@63-154-139-224.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[7:23] * sjusthm (~sam@24-205-35-233.dhcp.gldl.ca.charter.com) Quit (Remote host closed the connection)
[7:24] * terje_ (~joey@63-154-139-224.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[8:06] * huangjun|2 (~kvirc@59.173.200.16) has joined #ceph
[8:30] * huangjun|2 (~kvirc@59.173.200.16) Quit (Ping timeout: 480 seconds)
[9:01] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Remote host closed the connection)
[9:01] * rongze (~quassel@754fe8ea.test.dnsbl.oftc.net) Quit (Read error: Connection reset by peer)
[9:01] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[9:06] * haomaiwang (~haomaiwan@li565-182.members.linode.com) Quit (Ping timeout: 480 seconds)
[9:07] * jcfischer (~fischer@user-28-12.vpn.switch.ch) has joined #ceph
[9:11] * terje (~joey@63-154-143-44.mpls.qwest.net) has joined #ceph
[9:13] <jcfischer> after (yet another) btrfs related crash on one of our servers, I have started the migration to xfs - the first couple of osds are done, and the cluster is recovering. However, it seems to me that recovery speed is abysmal (earlier recoveries took a couple of minutes (30-60) for 5-10% degradation; this recovery has taken hours and isn't finished yet)
[9:13] <jcfischer> Any word on optimal parameters when creating xfs? I have: -i size=2048 and am mounting with "inode64, native"
[9:13] <jcfischer> s/native/noatime/
[9:15] * leseb (~leseb@88-190-214-97.rev.dedibox.fr) Quit (Killed (NickServ (Too many failed password attempts.)))
[9:16] * leseb (~leseb@88-190-214-97.rev.dedibox.fr) has joined #ceph
[9:17] <bandrus> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/11568
[9:17] <bandrus> i've had luck with -n size=64k, but make sure it suits your needs
[9:19] <bandrus> not sure if it will aid your recovery speeds either, that's not specifically something I've tested
[9:19] * terje (~joey@63-154-143-44.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[9:20] <jcfischer> bandrus: thanks - hmm, what I chose doesn't seem too bad
[9:23] <bandrus> if your fs is a bottleneck, it might increase your recovery times due to 10-20% faster writes and 20-30% faster reads (based on my simple rados bench tests)
[9:25] <jcfischer> I am not sure it's the fs - I just noticed that recovery is a lot slower than it used to be (but I don't have any scientific evidence ready or a complete understanding of where it is losing time)
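
(Putting the options from this discussion together, one possible mkfs/mount combination; the device and osd id are placeholders:)

    mkfs.xfs -f -i size=2048 -n size=64k /dev/sdX1
    mount -o noatime,inode64 /dev/sdX1 /var/lib/ceph/osd/ceph-N
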
[9:30] * bergerx_ (~bekir@78.188.101.175) has joined #ceph
[9:42] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[9:45] <jcfischer> sigh - while working on the cluster anyway I upgraded to 0.67.1 and tried to restart the mons. The first mon is hanging on "Starting ceph-create-keys…"
[9:45] <jcfischer> and the mons log file has numerous: "cephx: verify_reply coudln't decrypt with error: error decoding block for decryption"
[9:46] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has joined #ceph
[9:47] <jcfischer> and of course all of this happens the day before I go on holidays
[9:48] <bandrus> what version did you have running previously?
[9:50] <jcfischer> 61.5
[9:53] * mschiff (~mschiff@port-12390.pppoe.wtnet.de) has joined #ceph
[9:55] <bandrus> okay, that axes the quorum issue, where 0.61.4 or earlier mons cannot create a quorum with later versions
[9:56] <jcfischer> restarting the mon on the server itself seems to work, but the mon doesn't enter the quorum
[9:56] <jcfischer> and I have the same error messages
[9:56] <bandrus> try and get all mons upgraded
[9:57] <jcfischer> can I roll back to an older version if that should fail?
[9:57] <bandrus> I'd hate to give you bad advice, but once they or perhaps just a majority is on the same version...
[9:57] <bandrus> I don't see why not
[9:57] <jcfischer> what could possibly go wrong…. *famous last words*
[9:57] <bandrus> heh..
[9:58] <bandrus> I couldn't tell you specific procedures, but I'd imagine you could simply rollback the package using your package manager
[9:58] <jcfischer> current status: 2 mons of 5 down
[9:58] <bandrus> you might need to work some magic to pull from the proper repositories, but that's about the extent of my suggestions
[9:59] * bandrus bites fingernails
[9:59] <jcfischer> so now the big question - will it work?
[10:00] <jcfischer> that seemed like a bad idea: ceph -s gives me no information
[10:01] <bandrus> how many in total were upgraded so far?
[10:02] <jcfischer> upgraded software on all of them, restarted 3 of 5
[10:02] <jjgalvez> ceph --admin-daemon /var/run/ceph/ceph-mon.*.asok quorum_status
[10:02] <jcfischer> ah wait
[10:03] <jcfischer> the new mons have reached quorum
[10:03] <jcfischer> *sighs breath of relief*
[10:04] <jcfischer> I'm always *amazed* at how stable ceph is
[10:06] <jcfischer> off to restart osd then
[10:07] * LeaChim (~LeaChim@2.122.178.96) has joined #ceph
[10:09] * ninkotech (~duplo@static-84-242-87-186.net.upcbroadband.cz) has joined #ceph
[10:13] * agh (~oftc-webi@gw-to-666.outscale.net) Quit (Quit: Page closed)
[10:15] * haomaiwang (~haomaiwan@117.79.232.197) has joined #ceph
[10:22] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[10:23] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[10:28] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Ping timeout: 480 seconds)
[10:31] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[10:31] * rongze (~quassel@li565-182.members.linode.com) has joined #ceph
[10:34] * odyssey4me (~odyssey4m@165.233.205.190) has joined #ceph
[10:47] * xinxinsh (~xinxinsh@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[10:47] <jcfischer> uh oh - now the mons are using up diskspace like crazy - hopefully that stops before they overflow the disks
[10:48] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[10:50] * odyssey4me (~odyssey4m@165.233.205.190) Quit (Quit: odyssey4me)
[10:51] * xinxinsh (~xinxinsh@jfdmzpr01-ext.jf.intel.com) Quit ()
[10:58] * xinxinsh (~xinxinsh@jfdmzpr01-ext.jf.intel.com) has joined #ceph
[11:00] * odyssey4me (~odyssey4m@165.233.205.190) has joined #ceph
[11:01] * xinxinsh (~xinxinsh@jfdmzpr01-ext.jf.intel.com) Quit ()
[11:03] * dobber (~dobber@213.169.45.222) has joined #ceph
[11:03] * xinxinsh (~xinxinsh@134.134.139.72) has joined #ceph
[11:05] * xinxinsh (~xinxinsh@134.134.139.72) Quit ()
[11:08] * odyssey4me (~odyssey4m@165.233.205.190) Quit (Ping timeout: 480 seconds)
[11:10] * sleinen (~Adium@p5B37393C.dip0.t-ipconnect.de) has joined #ceph
[11:14] * odyssey4me (~odyssey4m@165.233.71.2) has joined #ceph
[11:18] * yy-nm (~chatzilla@218.74.33.110) Quit (Quit: ChatZilla 0.9.90.1 [Firefox 22.0/20130618035212])
[11:21] * dobber (~dobber@213.169.45.222) Quit (Remote host closed the connection)
[11:22] * huangjun (~kvirc@59.173.200.16) has joined #ceph
[11:29] * sleinen (~Adium@p5B37393C.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[11:31] * ron-slc (~Ron@173-165-129-125-utah.hfc.comcastbusiness.net) Quit (Quit: Leaving)
[11:32] * sleinen (~Adium@2001:620:0:25:47b:4bec:6d5d:7617) has joined #ceph
[11:34] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[11:36] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) Quit ()
[11:43] * X3NQ (~X3NQ@195.191.107.205) has joined #ceph
[11:44] * BManojlovic (~steki@fo-d-130.180.254.37.targo.rs) has joined #ceph
[11:50] * madkiss (~madkiss@2001:6f8:12c3:f00f:9540:7eb0:e1ac:1100) has joined #ceph
[11:52] * mikedawson (~chatzilla@c-98-220-189-67.hsd1.in.comcast.net) Quit (Ping timeout: 480 seconds)
[11:54] * yanzheng (~zhyan@jfdmzpr05-ext.jf.intel.com) Quit (Remote host closed the connection)
[11:55] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[11:56] * terje_ (~joey@63-154-129-196.mpls.qwest.net) has joined #ceph
[11:57] * sleinen (~Adium@2001:620:0:25:47b:4bec:6d5d:7617) Quit (Ping timeout: 480 seconds)
[12:00] * sleinen (~Adium@2001:620:0:25:3143:bcfa:1dd9:9149) has joined #ceph
[12:04] * terje_ (~joey@63-154-129-196.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[12:16] * sleinen (~Adium@2001:620:0:25:3143:bcfa:1dd9:9149) has left #ceph
[13:02] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[13:06] * terje_ (~joey@63-154-129-196.mpls.qwest.net) has joined #ceph
[13:15] * terje_ (~joey@63-154-129-196.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[13:16] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[13:21] * Rocky (~r.nap@188.205.52.204) Quit (Quit: **Poof**)
[13:22] * Rocky (~r.nap@188.205.52.204) has joined #ceph
[13:22] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[13:35] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[13:42] * yanzheng (~zhyan@134.134.137.73) has joined #ceph
[13:42] * huangjun (~kvirc@59.173.200.16) Quit (Quit: KVIrc 4.2.0 Equilibrium http://www.kvirc.net/)
[13:46] * qhead (~Adium@91-157-211-78.elisa-laajakaista.fi) has joined #ceph
[13:49] <qhead> Quick question about CephFS. Does it work properly if I'm reading & writing the same files from multiple machines? After reading the documentation I'm pretty sure it works but just wanted to make sure :)
[13:50] <qhead> I'm mostly concerned about how conflict resolution works.
[14:01] * tziOm (~bjornar@ti0099a340-dhcp0395.bb.online.no) has joined #ceph
[14:09] <phantomcircuit> qhead, it's designed to work the same as if it was multiple processes instead of machines
[14:11] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[14:14] <qhead> phantomcircuit: ok. I'm trying to find a solution for getting a shared directory between multiple nginx servers. Each server can change the data in the directory, so does Ceph just overwrite the previous object with the one it just got, or does it do some sort of timestamp check? Sorry if I'm asking dumb questions. My background is in distributed databases so I'm probably thinking about this a bit from that point of view.
[14:14] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) Quit (Quit: smiley)
[14:18] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[14:22] <phantomcircuit> qhead, honestly have no idea
[14:22] <phantomcircuit> but you get POSIX guarantees
[14:22] <phantomcircuit> which aren't as strict as a lot of people think
[14:24] * diegows (~diegows@190.190.2.126) has joined #ceph
[14:25] <qhead> Ok, I'll take a look at those. By the way, if I'm using CephFS, does it do any kind of caching at all, or if I happen to read a huge file does it fetch it every time from the OSDs?
[14:27] * AfC (~andrew@2001:44b8:31cb:d400:cca6:9abc:d330:8406) Quit (Quit: Leaving.)
[14:31] <qhead> Hmm, I think I can avoid a lot of problems by assigning each nginx that needs one specific directory to the same primary OSD
[14:39] * hybrid5121 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[14:40] <yanzheng> qhead, if multiple clients read/write a file, the local cache is disabled and all reads/writes go to the osds directly
[14:41] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) has joined #ceph
[14:41] <qhead> yanzheng: ok. you know what happens if two clients write same file simultaneously?
[14:42] <yanzheng> yes
[14:42] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) has joined #ceph
[14:42] <qhead> yanzheng: great, my bad. can you tell me what happens if two clients write same file simultaneously?
[14:43] <yanzheng> it guarantees eventual consistency
[14:44] <yanzheng> writes go to the osd directly, and the osd guarantees each write is atomic
[14:45] * mathlin (~mathlin@dhcp2-pc112059.fy.chalmers.se) Quit (Read error: Connection reset by peer)
[14:45] <qhead> does it pay attention to the file's timestamp or does it use osd's "received_at" timestamp?
[14:46] <yanzheng> if a write crosses an object boundary, cephfs doesn't guarantee atomicity
[14:47] <qhead> ok, thanks.
[14:47] <yanzheng> timestamp is maintained by mds
[14:47] * mathlin (~mathlin@dhcp2-pc112059.fy.chalmers.se) has joined #ceph
[14:47] <qhead> ah yes.
[14:47] <yanzheng> the client tells the mds when the file was last modified
[14:49] <yanzheng> the mds updates the file's timestamp when receiving the client's message. it also guarantees the timestamp does not go backwards
[14:51] <yanzheng> if the mds and clients' clocks are out of sync, we have trouble
[14:51] <qhead> common ntp should help?
[14:52] <yanzheng> yes, ntp should work
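A minimal sketch of the two-client setup qhead is describing, assuming a monitor at the hypothetical address 10.0.0.1 and the default admin keyring; the directory is mounted with ceph-fuse and clocks are kept in sync with ntp, as yanzheng suggests:

    # on each client that shares the directory (addresses and paths are examples)
    sudo mkdir -p /mnt/cephfs
    sudo ceph-fuse -m 10.0.0.1:6789 /mnt/cephfs    # FUSE mount of the same CephFS tree
    # keep client and MDS clocks in sync so mtime ordering behaves
    sudo ntpdate -u pool.ntp.org                   # one-off sync; running ntpd/chrony continuously is the usual setup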
[15:02] * HDDC (~hichem@41.228.216.84) has joined #ceph
[15:02] * HDDC (~hichem@41.228.216.84) has left #ceph
[15:04] <MACscr> anyone know of any openstack deployment/management tools that work with ceph for vm storage? Most of the ones I'm finding don't work with ceph
[15:06] <mbjorling> Can Ceph use actual T10 OSD disks through the scsi interface? or will it need Ceph-osd -> Exofs -> OSD to work correctly?
[15:06] <mbjorling> Thanks :)
[15:07] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) has joined #ceph
[15:15] * jjgalvez (~jjgalvez@ip72-193-217-254.lv.lv.cox.net) Quit (Quit: Leaving.)
[15:17] * qhead (~Adium@91-157-211-78.elisa-laajakaista.fi) has left #ceph
[15:17] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[15:21] * joao (~JL@89.181.144.108) has joined #ceph
[15:21] * ChanServ sets mode +o joao
[15:27] * tziOm (~bjornar@ti0099a340-dhcp0395.bb.online.no) Quit (Remote host closed the connection)
[15:28] * SubOracle (~quassel@coda-6.gbr.ln.cloud.data-mesh.net) has joined #ceph
[15:37] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[15:38] * yanzheng (~zhyan@134.134.137.73) Quit (Remote host closed the connection)
[15:38] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[16:08] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) has joined #ceph
[16:22] * dpippenger (~riven@cpe-76-166-208-83.socal.res.rr.com) Quit (Ping timeout: 480 seconds)
[16:27] * yanzheng (~zhyan@134.134.137.71) has joined #ceph
[16:29] * mikedawson (~chatzilla@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Ping timeout: 480 seconds)
[16:39] * stxShadow (~Jens@ip-88-152-161-249.unitymediagroup.de) Quit (Read error: Connection reset by peer)
[16:51] <joelio> mbjorling: objects on objects? Never really heard of open-osd before tbh - Out of interest, why?
[16:55] * BillK (~BillK-OFT@58-7-165-124.dyn.iinet.net.au) Quit (Ping timeout: 480 seconds)
[16:57] * alphe (~alphe@0001ac6f.user.oftc.net) has joined #ceph
[16:57] <alphe> hello everyone, I have a question: on which machine of the cluster should a rados block device be created?
[16:58] <mbjorling> joelio, I'm trying to get a deeper understanding of how the Ceph architecture works in regard to its object storage. As far as I understand, Ceph OSD maps its objects into files on a local file system. What I'm building is an SSD that implements an interface similar to OSD, but expands it to work with more advanced data structures. Similar to how btrfs is doing its data management, but tied to the architecture of the SSD.
[16:59] <alphe> do I have to create the rbd on an extra machine (client) that will be used as a gateway?
[17:00] <joelio> mbjorling: the docs may help :) http://ceph.com/docs/next/architecture/
[17:00] <mbjorling> joelio, thanks :)
[17:01] <joelio> ceph's docs are some of the best around I've found.. (bar ceph-deploy but let's not mention that :D)
[17:02] <joelio> alphe: do you mean RADOS gateway (s3) or rbd? block device a la /dev/rbd/{image}
[17:03] <mbjorling> joelio, I also found them better than the average fs docs :)
[17:05] * sagelap (~sage@12.130.118.17) has joined #ceph
[17:06] <paravoid> 0.67-rc3 works fwiw
[17:06] <paravoid> release-notes & blog need an update though
[17:06] <joelio> mbjorling: don't know if you're aware btw, but ceph-osd != open-osd
[17:09] <yanzheng> mbjorling, read codes in ceph/src/os/
[17:10] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Remote host closed the connection)
[17:10] * ishkabob (~c7a82cc0@webuser.thegrebs.com) has joined #ceph
[17:11] <yanzheng> there are FileStore, LevelDBStore. you can add an OSDStore
[17:11] <alphe> joelio that is the point, I'm lost ...
[17:12] <mbjorling> yanzheng, thanks. I'll look into it
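A quick way to browse the backends yanzheng points at, assuming git is available and using the public Ceph repository:

    git clone --depth 1 https://github.com/ceph/ceph.git
    ls ceph/src/os/        # FileStore, LevelDBStore and the other ObjectStore backends live here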
[17:12] <alphe> what I want is something fast to transfer a ton of small files, something faster than samba at least ...
[17:13] <ntranger> I'm following the quick start, and everything seems to go fine, until I get to the mon create command, and its not creating the keyrings in the bootstrap folders. Anyone know what I might be missing?
[17:13] <alphe> since there's no cephfs for windows (we could use dokanfs as an api and build the ceph protocol on top, but that implies a ton of work ... that I can't afford alone ...)
[17:14] <ntranger> I run "ceph-deploy -overwrite-conf mon create ceph01 ceph02 ceph03", and get no errors, it just doesn't create the keyrings for some reason.
[17:14] <alfredodeza> ntranger: what OS are you using?
[17:14] <alphe> ntranger I had that problem too ...
[17:14] <alfredodeza> I am currently looking into an issue for CentOS for mon create
[17:14] <alphe> you need to reset your installation and create one monitor alone
[17:15] <yanzheng> alphe, cephfs is not ideal for small files
[17:15] <alphe> then after waiting a moment add the other monitors (spare)
[17:15] <ntranger> Hey Alfred! Yeah, I'm running Scientific Linux.
[17:16] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[17:16] <alphe> yanzheng zfs isn't ideal either, etc.
[17:16] <alphe> but I want the best possible compromise
[17:16] <yanzheng> metadata operations are very slow
[17:16] <alphe> because zfs has a ton of issues
[17:16] <ishkabob> hi again Ceph devs :) I'm having some trouble creating a ceph cluster WITHOUT ceph-deploy or mkcephfs (trying to use puppet). I'm having trouble authenticating after bootstrapping my monitors - http://pastebin.com/raw.php?i=Hs87QVNi
[17:17] <ishkabob> i think that short paste should explain what the problem is, but basically, I create a monmap using my keyring (with entries for both mon. and client.admin), I mkfs for the monitor, start the monitor, and then it doesn't let me authenticate
[17:17] <yanzheng> for metadata operations, local filesystem is at least 10x faster than cephfs
[17:18] <ishkabob> yanzheng: and alphe: not to use a four letter word in this chat, but wouldn't HDFS be better for smaller files?
[17:19] <alphe> yanzheng hum, look, I did a real-case scenario yesterday: I had 1TB of small files to upload to the ceph cluster using samba as the gateway
[17:20] <yanzheng> how many files
[17:20] <alphe> I noticed a ton of pauses; the transfer rate was extremely slow
[17:20] <alphe> around 20kB/s ...
[17:20] <alphe> on a gigabit ethernet lan ...
[17:21] <alphe> but each time a big file popped up then bang, 100MB/s
[17:21] <ishkabob> alphe: how did you have your samba gateway connected to Ceph?
[17:21] <ishkabob> RBD? or CephFS?
[17:21] <alphe> cephfs
[17:21] <mtanski> Alphene, you need to paralyze it
[17:21] <alphe> ceph-fuse to be exact
[17:21] <mtanski> parallelize*
[17:21] <mtanski> auto correct NFTW
[17:22] <alphe> mtanski parallelize is at the client level no ?
[17:22] <ishkabob> alphe: that might be your problem, Fuse can be a bit kludgy and also CephFS isn't really ready yet (someone feel free to correct me on that)
[17:22] <mozg> hello guys
[17:22] <alphe> mtanski I tried putting a big cache too but that doesn't solve things ...
[17:22] <yanzheng> I think cephfs can only create hundreds of files per second
[17:22] <mtanski> Yeah, if you sequentially copy millions of small files you're going to get bad performance
[17:22] <alphe> so I want to know if S3 could help me or not
[17:22] <mozg> does anyone know what the current status of the geo-replication feature in ceph is?
[17:22] <ishkabob> alphe: why not try to expose an RBD device to your samba gateway?
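A rough sketch of the RBD-backed gateway ishkabob suggests; the pool, image name, size and mount point are made up, and the smb.conf wiring is left out:

    # on the gateway host (names and sizes are illustrative)
    rbd create rbd/sambastore --size 102400             # 100 GB image in the default 'rbd' pool
    sudo rbd map rbd/sambastore                         # kernel client typically exposes it as /dev/rbd/rbd/sambastore
    sudo mkfs.xfs /dev/rbd/rbd/sambastore
    sudo mkdir -p /srv/samba/share
    sudo mount /dev/rbd/rbd/sambastore /srv/samba/share
    # point the samba share path at /srv/samba/share; only one gateway should mount the image at a time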
[17:22] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[17:23] <mozg> i would like to be able to replicate data across wan over to another site
[17:23] <alphe> yanzheng sorry, it wasn't doing even 1 small file per second ...
[17:23] <yanzheng> creating a file needs to send a request to the mds, then wait for the reply.
[17:23] <joelio> alphe: There's no fscache in cephfs yet (there's a patch, don't think it's in there yet though)
[17:23] <alphe> in fact, out of 650 000 files, I transferred like 70 in half an hour ...
[17:23] <alphe> joelio in samba there is one ...
[17:24] * bergerx_ (~bekir@78.188.101.175) Quit (Remote host closed the connection)
[17:24] <mtanski> Fscache is not going to help the write case, just read
[17:24] * gregmark (~Adium@68.87.42.115) has joined #ceph
[17:24] <mtanski> And only if you've accessed the data beforehand
[17:24] <alphe> joelio I tried, obviously, displacing the cache problem to the gateway instead of attributing it to the main cluster ...
[17:25] <alphe> what I use is bigger buffers for tcp connections and the oplocks ...
[17:25] <alphe> the basic enhancements that everyone suggests but that have no real impact in fact
[17:26] <mtanski> Can you tell us how you're trying to copy the files? The process your using?
[17:26] <alphe> mtanski from a windows 7 server it is data replication
[17:27] <joelio> just one single cp process? Can you parallelise it, like in a map/reduce fashion?
[17:27] <mtanski> So you're copying it from a cifs mount via the cp copy over to cephfs?
[17:27] <alphe> like you take your c:\windows\ dir and you drop it onto your ceph/samba connected drive
[17:27] <yanzheng> I think you can try compressing these files and putting the output file on the cephfs
[17:28] <alphe> mtanski I mount ceph on the gateway using ceph-fuse, then I run samba on the gateway with a share that points to that directory
[17:28] <joelio> .. or do a parallel copy.. one copy for each letter of the alphabet for example?
[17:28] * sprachgenerator (~sprachgen@130.202.135.202) has joined #ceph
[17:29] <alphe> and trust me, on very big files like 100GB the ceph/samba setup reaches almost max gigabit ethernet capacity, but on small files it is really slow
[17:30] <alphe> yanzheng hum ... s3 will have no impact ?
[17:30] <alphe> the s3 or openstack way has to be installed on a gateway outside the cluster too ?
[17:32] * terje (~joey@63-154-138-72.mpls.qwest.net) has joined #ceph
[17:33] <yanzheng> I think they can be installed on the same node, if the node has enough cpu power
[17:33] <mtanski> This is what I would do in the case of lots of small files… since it's network latency and not throughput that's your problem
[17:34] <ntranger> Alfred: Yeah, it's weird. When I did it with one mon, it seemed to work fine, but when I added the other 2 mons, it runs without error, but doesn't create the keyrings. Not sure if that will help you out or not.
[17:36] <mtanski> cd /mnt/samba_mnt ; find -type d | xargs -P 8 mkdir -t /mnt/CEPH_DIR
[17:37] <mtanski> find -type f | xargs -P 8 cp -a -t /mnt/CEPH_DIR
[17:37] <mtanski> That might not be the perfect command (eg. you need to try it / verify it with the manpage)
[17:38] <mtanski> but it should start copying the data in parallel
[17:38] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[17:38] <mtanski> You're still going to experience latency but you should be able to get a higher throughput, you can play with the -P to control how many copies to make at a time
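As mtanski says, the exact commands need checking against the manpages; a corrected sketch, assuming the CIFS source is mounted at /mnt/samba_mnt and CephFS at /mnt/CEPH_DIR (GNU mkdir has no -t option, and cp needs --parents to keep the directory layout):

    cd /mnt/samba_mnt
    # recreate the directory tree first, 8 mkdirs at a time
    find . -type d -print0 | xargs -0 -P 8 -I{} mkdir -p /mnt/CEPH_DIR/{}
    # then copy the files in parallel, preserving relative paths and attributes
    find . -type f -print0 | xargs -0 -P 8 -I{} cp -a --parents {} /mnt/CEPH_DIR/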
[17:40] * terje (~joey@63-154-138-72.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[17:40] <ishkabob> I'm having some trouble creating a ceph cluster WITHOUT ceph-deploy or mkcephfs (trying to use puppet). I'm having trouble authenticating after boostrapping my monitors - http://pastebin.com/raw.php?i=Hs87QVNi
[17:40] <alphe> mtanski nice ...
[17:41] <alphe> hum ...
[17:41] <alphe> I need something to act as the gateway ...
[17:42] <alphe> and with such volumes I don't know if I can do a small FIFO-like buffer using 10GB of my local hard drive, for example
[17:44] <alphe> I need to copy everything that is dropped onto the gateway into the FIFO-like share and, as soon as it is transferred to ceph, remove it from the FIFO
[17:44] <alphe> I can't have 10TB of local disk on my gateway ...
[17:45] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[17:46] * hybrid512 (~walid@LPoitiers-156-86-25-85.w193-248.abo.wanadoo.fr) Quit (Quit: Leaving.)
[17:54] * sprachgenerator (~sprachgen@130.202.135.202) Quit (Quit: sprachgenerator)
[17:54] * sprachgenerator (~sprachgen@130.202.135.202) has joined #ceph
[17:55] * sprachgenerator (~sprachgen@130.202.135.202) Quit ()
[17:57] * sprachgenerator (~sprachgen@130.202.135.202) has joined #ceph
[17:57] <mtanski> I wouldn't recommend POCing something without a backup
[18:04] <ntranger> hey Alfred, what issue were you running in to?
[18:04] <ntranger> with the mons?
[18:08] * devoid (~devoid@130.202.135.213) has joined #ceph
[18:09] <alfredodeza> ntranger: you basically cannot create mons with ceph-deploy on CentOS/Scientific/RHEL
[18:09] <alfredodeza> just found the issue and I am in the middle of fixing it
[18:11] <ntranger> You're a machine. :)
[18:14] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[18:17] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[18:22] <ishkabob> if I'm bootstrapping ceph without ceph-deploy or mkcephfs, do I need to create my client.admin key with cephx disabled AND THEN enable it after I've created the key?
[18:23] <mattch> ishkabob: I don't know the full process, but you can create mons and retrieve the client.admin keyring from them by starting out with just a ceph.mon.keyring in ceph-deploy
[18:23] <mattch> (with cephx on from the start)
[18:27] * wschulze1 (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) has joined #ceph
[18:28] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * sagelap (~sage@12.130.118.17) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * mjeanson (~mjeanson@00012705.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * iggy (~iggy@theiggy.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * markl (~mark@tpsit.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * cmdrk (~lincoln@c-24-12-206-91.hsd1.il.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * terje-_ (~root@135.109.216.239) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * janisg (~troll@85.254.50.23) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * cjh_ (~cjh@ps123903.dreamhost.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * sjust (~sam@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * joshd (~joshd@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * tchmnkyz (~jeremy@0001638b.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * _robbat2|irssi (nobody@www2.orbis-terrarum.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * alexbligh (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * baffle_ (baffle@jump.stenstad.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * nwl (~levine@atticus.yoyo.org) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Ormod (~valtha@ohmu.fi) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * josef (~seven@li70-116.members.linode.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * liiwi (liiwi@idle.fi) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * [cave] (~quassel@boxacle.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * [fred] (fred@konfuzi.us) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Sargun_ (~sargun@208-106-98-2.static.sonic.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * wschulze (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * mschiff (~mschiff@port-12390.pppoe.wtnet.de) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * houkouonchi-work (~linux@12.248.40.138) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * jeroenmoors (~quassel@193.104.8.40) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * jnq (~jon@0001b7cc.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * portante (~portante@nat-pool-bos-t.redhat.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * lmb (lmb@212.8.204.10) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * cclien_ (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * nigwil (~idontknow@174.143.209.84) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Jakdaw (~chris@puma-mxisp.mxtelecom.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * jf-jenni (~jf-jenni@stallman.cse.ohio-state.edu) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * beardo (~sma310@beardo.cc.lehigh.edu) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Kdecherf (~kdecherf@shaolan.kdecherf.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * AaronSchulz (~chatzilla@192.195.83.36) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * guppy (~quassel@guppy.xxx) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * gregaf (~Adium@2607:f298:a:607:112c:1fa8:77e1:af2e) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * zackc (~zack@0001ba60.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * masterpe (~masterpe@2a01:670:400::43) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Azrael (~azrael@terra.negativeblue.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * maswan (maswan@kennedy.acc.umu.se) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * sbadia (~sbadia@yasaw.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * soren (~soren@hydrogen.linux2go.dk) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * ivan` (~ivan`@000130ca.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * rturk-away (~rturk@ds2390.dreamservers.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * chutz (~chutz@rygel.linuxfreak.ca) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * yeled (~yeled@spodder.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * infernix (nix@cl-1404.ams-04.nl.sixxs.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * sprachgenerator (~sprachgen@130.202.135.202) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * ishkabob (~c7a82cc0@webuser.thegrebs.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * mtanski (~mtanski@69.193.178.202) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * joao (~JL@89.181.144.108) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * fireD_ (~fireD@93-139-174-231.adsl.net.t-com.hr) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * KindOne (KindOne@0001a7db.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * jochen (~jochen@laevar.de) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * NaioN_ (stefan@andor.naion.nl) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * tdb (~tdb@willow.kent.ac.uk) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Fetch_ (fetch@gimel.cepheid.org) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * jks (~jks@3e6b5724.rev.stofanet.dk) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Tamil (~tamil@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * dmick (~dmick@38.122.20.226) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * wer (~wer@206-248-239-142.unassigned.ntelos.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * cfreak201 (~cfreak200@p4FF3E75F.dip0.t-ipconnect.de) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * nwf_ (~nwf@67.62.51.95) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * \ask (~ask@oz.develooper.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Psi-Jack_ (~Psi-Jack@yggdrasil.hostdruids.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * mbjorling (~SilverWol@130.226.133.120) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * Daviey (~DavieyOFT@bootie.daviey.com) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) Quit (resistance.oftc.net synthon.oftc.net)
[18:28] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) Quit (resistance.oftc.net synthon.oftc.net)
[18:29] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[18:29] * sagelap (~sage@12.130.118.17) has joined #ceph
[18:29] * alfredodeza (~alfredode@c-24-131-46-23.hsd1.ga.comcast.net) has joined #ceph
[18:29] * nhorman (~nhorman@hmsreliant.think-freely.org) has joined #ceph
[18:29] * mjeanson (~mjeanson@00012705.user.oftc.net) has joined #ceph
[18:29] * iggy (~iggy@theiggy.com) has joined #ceph
[18:29] * markl (~mark@tpsit.com) has joined #ceph
[18:29] * cmdrk (~lincoln@c-24-12-206-91.hsd1.il.comcast.net) has joined #ceph
[18:29] * terje-_ (~root@135.109.216.239) has joined #ceph
[18:29] * janisg (~troll@85.254.50.23) has joined #ceph
[18:29] * cjh_ (~cjh@ps123903.dreamhost.com) has joined #ceph
[18:29] * josef (~seven@li70-116.members.linode.com) has joined #ceph
[18:29] * sjust (~sam@38.122.20.226) has joined #ceph
[18:29] * joshd (~joshd@38.122.20.226) has joined #ceph
[18:29] * tchmnkyz (~jeremy@0001638b.user.oftc.net) has joined #ceph
[18:29] * _robbat2|irssi (nobody@www2.orbis-terrarum.net) has joined #ceph
[18:29] * alexbligh (~alexbligh@89-16-176-215.no-reverse-dns-set.bytemark.co.uk) has joined #ceph
[18:29] * [fred] (fred@konfuzi.us) has joined #ceph
[18:29] * nolan (~nolan@2001:470:1:41:20c:29ff:fe9a:60be) has joined #ceph
[18:29] * baffle_ (baffle@jump.stenstad.net) has joined #ceph
[18:29] * liiwi (liiwi@idle.fi) has joined #ceph
[18:29] * Sargun_ (~sargun@208-106-98-2.static.sonic.net) has joined #ceph
[18:29] * nwl (~levine@atticus.yoyo.org) has joined #ceph
[18:29] * Ormod (~valtha@ohmu.fi) has joined #ceph
[18:29] * [cave] (~quassel@boxacle.net) has joined #ceph
[18:29] * sprachgenerator (~sprachgen@130.202.135.202) has joined #ceph
[18:29] * ishkabob (~c7a82cc0@webuser.thegrebs.com) has joined #ceph
[18:29] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[18:29] * joao (~JL@89.181.144.108) has joined #ceph
[18:29] * fireD_ (~fireD@93-139-174-231.adsl.net.t-com.hr) has joined #ceph
[18:29] * KindOne (KindOne@0001a7db.user.oftc.net) has joined #ceph
[18:29] * nwf_ (~nwf@67.62.51.95) has joined #ceph
[18:29] * s15y (~s15y@sac91-2-88-163-166-69.fbx.proxad.net) has joined #ceph
[18:29] * mbjorling (~SilverWol@130.226.133.120) has joined #ceph
[18:29] * mynameisbruce (~mynameisb@tjure.netzquadrat.de) has joined #ceph
[18:29] * cfreak201 (~cfreak200@p4FF3E75F.dip0.t-ipconnect.de) has joined #ceph
[18:29] * Daviey (~DavieyOFT@bootie.daviey.com) has joined #ceph
[18:29] * wer (~wer@206-248-239-142.unassigned.ntelos.net) has joined #ceph
[18:29] * elder (~elder@c-71-195-31-37.hsd1.mn.comcast.net) has joined #ceph
[18:29] * dmick (~dmick@38.122.20.226) has joined #ceph
[18:29] * Psi-Jack_ (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[18:29] * Tamil (~tamil@38.122.20.226) has joined #ceph
[18:29] * jks (~jks@3e6b5724.rev.stofanet.dk) has joined #ceph
[18:29] * Fetch_ (fetch@gimel.cepheid.org) has joined #ceph
[18:29] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[18:29] * tdb (~tdb@willow.kent.ac.uk) has joined #ceph
[18:29] * NaioN_ (stefan@andor.naion.nl) has joined #ceph
[18:29] * jochen (~jochen@laevar.de) has joined #ceph
[18:29] * \ask (~ask@oz.develooper.com) has joined #ceph
[18:29] * houkouonchi-work (~linux@12.248.40.138) has joined #ceph
[18:29] * mnash (~chatzilla@66-194-114-178.static.twtelecom.net) has joined #ceph
[18:29] * zackc (~zack@0001ba60.user.oftc.net) has joined #ceph
[18:29] * masterpe (~masterpe@2a01:670:400::43) has joined #ceph
[18:29] * MACscr (~Adium@c-98-214-103-147.hsd1.il.comcast.net) has joined #ceph
[18:29] * infernix (nix@cl-1404.ams-04.nl.sixxs.net) has joined #ceph
[18:29] * gregaf (~Adium@2607:f298:a:607:112c:1fa8:77e1:af2e) has joined #ceph
[18:29] * guppy (~quassel@guppy.xxx) has joined #ceph
[18:29] * AaronSchulz (~chatzilla@192.195.83.36) has joined #ceph
[18:29] * Azrael (~azrael@terra.negativeblue.com) has joined #ceph
[18:29] * yeled (~yeled@spodder.com) has joined #ceph
[18:29] * soren (~soren@hydrogen.linux2go.dk) has joined #ceph
[18:29] * jeroenmoors (~quassel@193.104.8.40) has joined #ceph
[18:29] * jnq (~jon@0001b7cc.user.oftc.net) has joined #ceph
[18:29] * rturk-away (~rturk@ds2390.dreamservers.com) has joined #ceph
[18:29] * maswan (maswan@kennedy.acc.umu.se) has joined #ceph
[18:29] * sbadia (~sbadia@yasaw.net) has joined #ceph
[18:29] * portante (~portante@nat-pool-bos-t.redhat.com) has joined #ceph
[18:29] * lmb (lmb@212.8.204.10) has joined #ceph
[18:29] * cclien_ (~cclien@ec2-50-112-123-234.us-west-2.compute.amazonaws.com) has joined #ceph
[18:29] * nigwil (~idontknow@174.143.209.84) has joined #ceph
[18:29] * Jakdaw (~chris@puma-mxisp.mxtelecom.com) has joined #ceph
[18:29] * jeffhung (~jeffhung@60-250-103-120.HINET-IP.hinet.net) has joined #ceph
[18:29] * jf-jenni (~jf-jenni@stallman.cse.ohio-state.edu) has joined #ceph
[18:29] * ivan` (~ivan`@000130ca.user.oftc.net) has joined #ceph
[18:29] * chutz (~chutz@rygel.linuxfreak.ca) has joined #ceph
[18:29] * beardo (~sma310@beardo.cc.lehigh.edu) has joined #ceph
[18:29] * Kdecherf (~kdecherf@shaolan.kdecherf.com) has joined #ceph
[18:30] * ChanServ sets mode +v joao
[18:31] * odyssey4me (~odyssey4m@165.233.71.2) Quit (Ping timeout: 480 seconds)
[18:34] * buck (~buck@c-24-6-91-4.hsd1.ca.comcast.net) has joined #ceph
[18:35] * yanzheng (~zhyan@134.134.137.71) Quit (Remote host closed the connection)
[18:36] * DarkAceZ (~BillyMays@50.107.55.36) Quit (Ping timeout: 480 seconds)
[18:36] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Remote host closed the connection)
[18:48] * markbby (~Adium@168.94.245.4) has joined #ceph
[18:51] * DarkAceZ (~BillyMays@50.107.55.36) has joined #ceph
[18:58] * gregmark (~Adium@68.87.42.115) Quit (Quit: Leaving.)
[19:00] * markbby (~Adium@168.94.245.4) Quit (Remote host closed the connection)
[19:00] * jackhill (jackhill@pilot.trilug.org) Quit (Read error: Connection reset by peer)
[19:01] * X3NQ (~X3NQ@195.191.107.205) Quit (Remote host closed the connection)
[19:02] * terje (~joey@63-154-137-37.mpls.qwest.net) has joined #ceph
[19:04] * jcfischer (~fischer@user-28-12.vpn.switch.ch) Quit (Quit: jcfischer)
[19:08] * mtanski (~mtanski@69.193.178.202) Quit (Ping timeout: 480 seconds)
[19:10] * terje (~joey@63-154-137-37.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[19:11] * mozg (~andrei@host217-46-236-49.in-addr.btopenworld.com) Quit (Ping timeout: 480 seconds)
[19:11] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[19:14] * gregmark (~Adium@68.87.42.115) has joined #ceph
[19:18] * terje_ (~joey@63-154-137-37.mpls.qwest.net) has joined #ceph
[19:20] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) has joined #ceph
[19:21] * diegows (~diegows@190.190.2.126) has joined #ceph
[19:21] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) Quit (Quit: Leaving.)
[19:21] * davidz (~Adium@ip68-5-239-214.oc.oc.cox.net) has joined #ceph
[19:21] * sjustlaptop (~sam@2607:f298:a:697:2113:5ff6:2f49:6047) has joined #ceph
[19:25] * scuttlemonkey (~scuttlemo@38.106.54.2) has joined #ceph
[19:25] * ChanServ sets mode +o scuttlemonkey
[19:26] * terje_ (~joey@63-154-137-37.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[19:29] * alram (~alram@38.122.20.226) has joined #ceph
[19:32] * gentleben (~sseveranc@c-98-207-40-73.hsd1.ca.comcast.net) Quit (Quit: gentleben)
[19:33] * sjustlaptop (~sam@2607:f298:a:697:2113:5ff6:2f49:6047) Quit (Ping timeout: 480 seconds)
[19:33] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[19:34] * jbd_ (~jbd_@34322hpv162162.ikoula.com) has left #ceph
[19:38] <ishkabob> does anyone know why this is happening?
[19:38] <ishkabob> # ceph osd rm unknown command rm
[19:39] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Read error: Connection reset by peer)
[19:42] * nwat (~nwat@c-50-131-197-174.hsd1.ca.comcast.net) has joined #ceph
[19:43] * Cube (~Cube@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[19:45] * gentleben (~sseveranc@c-98-207-40-73.hsd1.ca.comcast.net) has joined #ceph
[19:47] * mtanski (~mtanski@69.193.178.202) Quit (Quit: mtanski)
[19:48] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) has joined #ceph
[20:06] * diegows (~diegows@190.190.2.126) Quit (Ping timeout: 480 seconds)
[20:07] <alphe> ishkabob probably because the path of the rm command is not in your environment PATH variable
[20:08] <alphe> or because rm is not the appropriate command for the context
[20:08] <alphe> ceph osd rm <osd-id> [<osd-id>...]
[20:09] <alphe> you have to give the hostname of your osd target after the ceph osd rm
[20:10] <alphe> it's the id of the osd you want to remove
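For reference, the usual removal sequence, shown for a hypothetical osd.8 (the stop command assumes sysvinit on the node hosting the osd):

    ceph osd out 8                    # stop placing data on it
    sudo service ceph stop osd.8      # run on the host where the osd lives
    ceph osd crush remove osd.8       # take it out of the CRUSH map
    ceph auth del osd.8               # drop its key
    ceph osd rm 8                     # remove it from the osd map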
[20:13] * smiley (~smiley@pool-173-73-0-53.washdc.fios.verizon.net) has joined #ceph
[20:27] * sjustlaptop (~sam@38.122.20.226) has joined #ceph
[20:43] * Guest1109 (~coyo@thinks.outside.theb0x.org) Quit (Quit: om nom nom delicious bitcoins...)
[20:44] * dpippenger (~riven@tenant.pas.idealab.com) has joined #ceph
[20:54] * rongze_ (~quassel@notes4.com) has joined #ceph
[20:58] * lx0 (~aoliva@lxo.user.oftc.net) Quit (Remote host closed the connection)
[20:59] * gentleben (~sseveranc@c-98-207-40-73.hsd1.ca.comcast.net) Quit (Quit: gentleben)
[21:00] * Coyo (~coyo@thinks.outside.theb0x.org) has joined #ceph
[21:00] * Coyo is now known as Guest2137
[21:01] * ishkabob (~c7a82cc0@webuser.thegrebs.com) Quit (Quit: TheGrebs.com CGI:IRC)
[21:01] * scuttlemonkey (~scuttlemo@38.106.54.2) Quit (Read error: Operation timed out)
[21:01] * rongze (~quassel@li565-182.members.linode.com) Quit (Ping timeout: 480 seconds)
[21:04] * lx0 (~aoliva@lxo.user.oftc.net) has joined #ceph
[21:04] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[21:04] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) Quit (Read error: Connection reset by peer)
[21:04] * sagelap (~sage@12.130.118.17) Quit (Remote host closed the connection)
[21:05] * sagelap (~sage@12.130.118.17) has joined #ceph
[21:10] * scuttlemonkey (~scuttlemo@38.106.54.2) has joined #ceph
[21:10] * ChanServ sets mode +o scuttlemonkey
[21:12] * john_barbee (~jbarbee@23-25-46-97-static.hfc.comcastbusiness.net) has joined #ceph
[21:14] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) has joined #ceph
[21:15] * alphe (~alphe@0001ac6f.user.oftc.net) has left #ceph
[21:16] * ScOut3R (~ScOut3R@catv-89-133-17-71.catv.broadband.hu) Quit (Remote host closed the connection)
[21:19] * sagelap (~sage@12.130.118.17) Quit (Remote host closed the connection)
[21:28] <ntranger> Hey Alfred, could you shoot me a message when you might be done with the mon issue? I'd be happy to test it for ya.
[21:28] <alfredodeza> ntranger: I will when you say alfredo or alfredodeza :)
[21:28] * alfredodeza 's highlighters don't work on Alfred
[21:29] <alfredodeza> :D
[21:29] <ntranger> LOL! Sorry about that brother! :D
[21:29] <alfredodeza> ntranger: I am just giving you a hard time
[21:29] <alfredodeza> the issue *just* got merged
[21:29] <alfredodeza> so if you grab the master branch in github you should see it fixed
[21:29] <alfredodeza> \o/
[21:29] <ntranger> and I'm completely cool with that. You guys have put up with my crap for a week. :)
[21:30] <alfredodeza> you should also see much better information as to what is going on when you do `ceph-deploy mon create {nodes}`
[21:30] <alfredodeza> like super granular
[21:30] <alfredodeza> ntranger: try it out and let me know if you find any problems
[21:31] <john_barbee> joshd: I work with mikedawson on our ceph deployment. We have had an ongoing issue with windows instances becoming wedged and unresponsive until we intervene with some type of gui activity. We have reason to believe this issue is more frequent when we have writeback caching turned on. Have you seen or heard of anything like this before?
[21:33] <ntranger> alfredodeza: get it from here? https://github.com/ceph/ceph-deploy/tarball/master
[21:33] <alfredodeza> yes
[21:33] <alfredodeza> you could actually pip install from that url I think
[21:33] <ntranger> awesome. doing it now. :)
[21:33] <alfredodeza> pip install https://github.com/ceph/ceph-deploy/tarball/master
[21:33] * gentleben (~sseveranc@216.55.31.102) has joined #ceph
[21:33] <alfredodeza> ntranger: we are also working very hard on making way more frequent releases
[21:33] <alfredodeza> especially when we fix bugs like this one
[21:34] <alfredodeza> so you don't need to be doing the github madness
[21:37] * lx0 is now known as lxo
[21:37] <joshd> john_barbee: someone reported something like that on qemu-devel - did you report this yesterday on #qemu, or was that someone else? https://bugs.launchpad.net/qemu/+bug/1207686
[21:39] <ntranger> alfredodeza: should I uninstall the old ceph, and install this one?
[21:39] <alfredodeza> yes please
[21:39] <alfredodeza> well by ceph, you mean the old ceph-deploy
[21:40] <alfredodeza> right?
[21:41] <ntranger> correct
[21:45] * dpippenger1 (~riven@tenant.pas.idealab.com) has joined #ceph
[21:45] * dpippenger (~riven@tenant.pas.idealab.com) Quit (Read error: Connection reset by peer)
[21:47] <ntranger> alfredodeza: okay, I ran ceph-deploy uninstall ceph01, and when I went to install the new one, I get this.
[21:47] <ntranger> Requirement already satisfied (use --upgrade to upgrade): distribute in /usr/lib/python2.6/site-packages (from ceph-deploy==1.1)
[21:47] <alfredodeza> run: `pip uninstall ceph-deploy`
[21:47] <ntranger> ok
[21:47] <alfredodeza> then `pip install https://github.com/ceph/ceph-deploy/tarball/master`
[21:48] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) has joined #ceph
[21:52] * sjustlaptop (~sam@38.122.20.226) Quit (Ping timeout: 480 seconds)
[21:56] <john_barbee> joshd: that was someone else, it does not look exactly like our problem. our instances appear to be asleep and we can wake them up immediately with some interaction like a vnc console or virsh screenshot. we have not had the issue much recently, but yesterday mikedawson noticed our writeback caching was not enabled and turned it back on and hard rebooted all instances. Since then we have...
[21:56] <john_barbee> ...had much more frequent wedge occurrences in the past 24 hours.
[21:57] <ntranger> alfredodeza: hrmmm. getting the same thing after uninstall, and reboot.
[21:58] <alfredodeza> ntranger: how did you install ceph-deploy in the first place?
[21:58] <alfredodeza> with pip or the RPM?
[21:58] <ntranger> pip
[21:58] * terje_ (~joey@63-154-145-64.mpls.qwest.net) has joined #ceph
[22:00] <alfredodeza> run pip uninstall again
[22:00] <alfredodeza> and then make sure that thing is completely gone by running a python console and getting an error when doing `import ceph_deploy`
[22:01] * mtanski (~mtanski@69.193.178.202) Quit (Quit: mtanski)
[22:01] <joshd> john_barbee: do you notice any slow requests reported by ceph?
[22:01] * mtanski (~mtanski@69.193.178.202) has joined #ceph
[22:02] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Quit: Leaving.)
[22:03] <ntranger> I'm getting 'import- command not found'
[22:03] <alfredodeza> I meant in a Python shell
[22:03] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[22:03] <alfredodeza> run `python`
[22:04] <alfredodeza> and then in the prompt: `import ceph_deploy`
[22:04] <alfredodeza> if that succeeds after uninstalling you should run `pip uninstall ceph-deploy` again
[22:04] * saabylaptop (~saabylapt@1009ds5-oebr.1.fullrate.dk) Quit (Quit: Leaving.)
[22:05] <ntranger> got it. importerror: no module.
[22:05] * nhorman (~nhorman@hmsreliant.think-freely.org) Quit (Quit: Leaving)
[22:06] <joshd> john_barbee: I'm wondering if this is a strange behavior of the windows block layer recovering from very slow requests when prodded by the gui - does it happen on any linux guests?
[22:06] <alfredodeza> ntranger: excellent, now try installing again
[22:06] <alfredodeza> hopefully this should work this time
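The full cycle alfredodeza is describing, as a short sketch:

    pip uninstall ceph-deploy
    python -c "import ceph_deploy"    # should now fail with ImportError once it is really gone
    pip install https://github.com/ceph/ceph-deploy/tarball/master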
[22:06] * terje_ (~joey@63-154-145-64.mpls.qwest.net) Quit (Ping timeout: 480 seconds)
[22:07] * sagelap (~sage@2600:1012:b01a:6f28:a178:d090:f337:b242) has joined #ceph
[22:08] <ntranger> alfredodeza: here is what its saying. http://pastebin.com/r8bBQtpc
[22:08] <alfredodeza> that sounds to me like success :)
[22:09] <ntranger> ok
[22:09] <ntranger> I didn't know if those "requirement already satisfied" messages were anything to worry about
[22:12] <n1md4> evening, all. how can i remove the down osds ? http://pastie.org/pastes/8201020/text
[22:13] <john_barbee> joshd: the only slow request we have noticed is during a reblancing, but not during normal operation. To date, we have never seen one of our linux instances get wedged only windows.
[22:15] * allsystemsarego (~allsystem@188.25.130.190) has joined #ceph
[22:16] <dmick> n1md4: ceph osd rm <id> [<id> ...]
[22:16] <dmick> and those aren't just down, they're "does not exist"
[22:16] <n1md4> dmick: thanks, let me try that.
[22:17] <n1md4> dmick: okay, right, because they don't exist they can't be removed. are they counting for anything being there?
[22:18] <n1md4> I'm trying to set up my first storage cluster, and ran into a small problem, and hence have 3 active+clean and 189 active+degraded. there is no data on there; ideally I just want to wipe the thing and get a nice active+clean setup
[22:19] <dmick> hm. if osd rm doesn't do it, then perhaps they're just in the crush map. how about ceph osd crush rm osd.<n>?
[22:19] <dmick> as for why things are degraded, we can look after we get the dead OSDs out
[22:20] <dmick> (as they could well be the cause)
[22:20] <n1md4> "device 'osd.8' does not appear in the crush map"
[22:21] <ntranger> alfredodeza: when I go to create the mon, I get this error. etc/init.d/ceph: ceph conf /etc/ceph/ceph.conf not found; system is not configured.
[22:21] <dmick> hm. so ceph osd rm 8 gave an error saying.. ?
[22:22] <n1md4> dmick: the same, but '8'
[22:22] <ntranger> alfredodeza: seems like the install isn't creating the files in /etc/ceph.
[22:23] <alfredodeza> what OS are you using ntranger?
[22:23] <dmick> n1md4: sorry, can you post the exact error msg?
[22:23] <alfredodeza> CentOS?
[22:23] <joshd> john_barbee: it could very well be the writeback cache not doing a callback at some point - if you could gather logs of a vm getting stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be great
[22:24] <john_barbee> will do. thanks joshd
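Those debug settings normally go in the [client] section of ceph.conf on the hypervisor running the guest; the heredoc and the log file line (using the $pid metavariable) are just one way to do it and are assumptions here:

    sudo tee -a /etc/ceph/ceph.conf >/dev/null <<'EOF'
    [client]
        debug rbd = 20
        debug ms = 1
        debug objectcacher = 30
        log file = /var/log/ceph/client.$pid.log
    EOF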
[22:26] <n1md4> dmick: http://pastie.org/pastes/8201045/text
[22:27] <dmick> ah. but you notice osd.8 is not in the dump anymore
[22:27] <n1md4> ah!!! thank you! didn't think to check!
[22:27] <dmick> does ceph osd rm 9 succeed, and then does osd.9 disappear from ceph osd tree?
[22:28] <n1md4> ... ;) just getting to that bit, will let you know. thanks.
[22:28] <n1md4> perfect! well, the tree looks good.
[22:28] <dmick> hows ceph -s look now
[22:29] <n1md4> win! http://pastie.org/pastes/8201052/text
[22:29] <john_barbee> joshd: to be clear, where do I gather these logs? are they client, or monitor logs?
[22:29] <n1md4> (-w) but still ..
[22:29] <n1md4> all clean now
[22:29] <dmick> cool.
[22:30] <n1md4> thank you, very much!
[22:30] <dmick> so just for my edification: ceph osd rm <id> was the right command, right?
[22:30] <n1md4> ceph osd crush rm osd.<id>
[22:30] <john_barbee> joshd: do you think your async qemu patch could have any play here? currently we do not have that patch in place.
[22:30] <dmick> ok
[22:31] <n1md4> dmick: forgive my beginner's approach, but what is it I'd have now, from what you can see from ceph -s; it's a 2-server setup, with 3x 1TB drives committed to osds in each box.
[22:32] <n1md4> (just wondering what my next step is to get to actually using it - with xenserver)
[22:33] <joshd> john_barbee: client logs from the client qemu uses
[22:33] <joshd> john_barbee: the async patch could make a difference there, yes - there could be a flush that takes a long time, causing the guest to become unresponsive
[22:34] <ntranger> alfredodeza: Scientific Linux
[22:34] <ntranger> 6.4
[22:34] <alfredodeza> oh right
[22:34] <alfredodeza> do you get specific errors on install?
[22:34] <alfredodeza> can I get a paste
[22:34] <alfredodeza> ?
[22:36] <ntranger> alfredodeza: http://pastebin.com/ZwBPygqy
[22:36] * yanzheng (~zhyan@134.134.139.74) has joined #ceph
[22:36] <ntranger> thats what I get for install
[22:37] <alfredodeza> ntranger: that sounds to me like a working install :/
[22:38] * sagelap (~sage@2600:1012:b01a:6f28:a178:d090:f337:b242) Quit (Quit: Leaving.)
[22:39] <ntranger> and then I get this during the mon creation
[22:39] <ntranger> http://pastebin.com/Hk07kQZD
[22:40] <ntranger> alfredodeza: the whole "system is not configured" is throwing me.
[22:40] <alfredodeza> hrmnn
[22:40] <alfredodeza> right
[22:40] <alfredodeza> everything else looks fine though
[22:40] <ntranger> yeah, seems to look that way
[22:44] * grepory1 (~Adium@50-115-70-146.static-ip.telepacific.net) has joined #ceph
[22:44] * grepory (~Adium@50-115-70-146.static-ip.telepacific.net) Quit (Read error: Connection reset by peer)
[22:51] <ntranger> alfredodeza: here is what I run gatherkeys.
[22:51] <ntranger> http://pastebin.com/hmvuXqUZ
[22:51] * scuttlemonkey (~scuttlemo@38.106.54.2) Quit (Ping timeout: 480 seconds)
[22:51] <alfredodeza> aha
[22:52] <alfredodeza> ok
[22:52] <alfredodeza> so that seems to me like some progress (as bad as that sounds)
[22:52] <ntranger> :)
[22:52] <alfredodeza> so installation good, mon create good
[22:52] <alfredodeza> we are now stuck at gatherkeys
[22:53] * scuttlemonkey (~scuttlemo@38.122.20.226) has joined #ceph
[22:53] * ChanServ sets mode +o scuttlemonkey
[22:54] <ntranger> I thought maybe it was a mon issue not putting the files in the right place, and then gatherkeys couldn't find them
[22:54] <sage> alfredodeza: btw, i'm thinking we should have a streamlined mode that does all the mon creates and gatherkeys for you at once
[22:55] <sage> ceph-deploy new <list of mons>
[22:55] <sage> ceph-deploy mon create-all
[22:55] <sage> or something
[22:55] <alfredodeza> sage: sure, that sounds to me reasonable
[22:55] <alfredodeza> I was wondering why the extra step, but having granular and compounded commands sounds good
[22:55] <sage> the gatherkeys step is confusing and frequently where things go wrong
[22:55] <sage> its bc gatherkeys won't work until after there is a mon quorum
[22:56] * yanzheng (~zhyan@134.134.139.74) Quit (Remote host closed the connection)
[22:58] <dmick> n1md4: sorry, got pulled away. what's your question?
[22:59] <dmick> and, ntranger, alfredodeza: this is the 'if the mons are up, ceph-create-keys should be talking to them and creating keys' situation
[23:00] <dmick> ntranger: can you try ceph -n mon. -k ceph.mon.keyring -s
[23:00] <ntranger> dmick: sure thing
[23:02] <ntranger> dmick: http://pastebin.com/JAnSNL8Y
[23:03] <dmick> seems like the mons aren't up
[23:03] <dmick> ps -ef | grep ceph-mon?
[23:04] <ntranger> this is all that shows up
[23:04] <ntranger> root 6828 6186 0 11:02 pts/0 00:00:00 grep ceph-mon
[23:05] <alfredodeza> yep, mons are not up
[23:06] * sjustlaptop (~sam@38.122.20.226) has joined #ceph
[23:06] <dmick> check mon logs in /var/log/ceph to see why they're complaining
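One way to see why a monitor refuses to start when /var/log/ceph is empty is to run it in the foreground against the conf ceph-deploy generated; the id, hostname and paths below are examples:

    sudo ceph-mon -i ceph01 -d -c /root/ceph_cluster/ceph.conf    # -d stays in the foreground and logs to stderr
    # or, once /etc/ceph/ceph.conf exists, use the sysvinit script and watch the log
    sudo service ceph start mon.ceph01
    tail -f /var/log/ceph/ceph-mon.ceph01.log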
[23:08] * aliguori (~anthony@cpe-70-112-157-87.austin.res.rr.com) Quit (Remote host closed the connection)
[23:11] <ntranger> ceph folder under logs is empty
[23:12] * mozg (~andrei@host109-151-35-94.range109-151.btcentralplus.com) has joined #ceph
[23:16] * Psi-Jack_ (~Psi-Jack@yggdrasil.hostdruids.com) Quit (Ping timeout: 480 seconds)
[23:16] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Ping timeout: 480 seconds)
[23:17] * sjustlaptop (~sam@38.122.20.226) Quit (Ping timeout: 480 seconds)
[23:17] <ntranger> when I try "service ceph start", i get this.
[23:17] <ntranger> [root@ceph01 ceph_cluster]# service ceph start
[23:17] <ntranger> ./etc/init.d/ceph: ceph conf /etc/ceph/ceph.conf not found; system is not configured.
[23:21] <dmick> ntranger: what directory were you in when you ran mon create?
[23:22] <ntranger> the quick start told me to create a folder, which is ceph_cluster, and I was in that.
[23:23] <dmick> is there a ceph.conf in that folder, and not one in /etc/ceph/ceph.conf?
[23:24] <ntranger> correct, there is.
[23:24] <n1md4> dmick: the question is probably too broad to answer easily, and I should probably have a good read of the manuals first.
[23:24] <n1md4> thanks.
[23:28] <dmick> n1md4: OK. ntranger: so both of those assertions are true?
[23:28] <ntranger> dmick: yes
[23:28] <dmick> ok. I don't understand how that could happen. ceph mon create should have written /etc/ceph/ceph.conf
[23:28] <ntranger> dmick: the files in the ceph_cluster folder are: ceph.conf ceph.log ceph.mon.keyring
[23:29] <ntranger> and the /etc/ceph folder is empty
[23:29] <dmick> ntranger: this is the ceph-deploy you just got from pip installing, right?
[23:29] <ntranger> correct
[23:30] * wschulze1 (~wschulze@cpe-69-203-80-81.nyc.res.rr.com) Quit (Quit: Leaving.)
[23:30] * sjustlaptop (~sam@2607:f298:a:697:2113:5ff6:2f49:6047) has joined #ceph
[23:32] <ntranger> it shouldn't cause any issues that I'm doing 3 nodes at the same time, should it?
[23:32] <dmick> no
[23:33] <dmick> can you pastebin the ceph.log?
[23:33] <ntranger> absolutely
[23:34] * bandrus (~Adium@cpe-76-95-217-129.socal.res.rr.com) has joined #ceph
[23:35] <ntranger> dmick: http://pastebin.com/FP4gupD0
[23:40] * sjustlaptop (~sam@2607:f298:a:697:2113:5ff6:2f49:6047) Quit (Ping timeout: 480 seconds)
[23:42] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) has joined #ceph
[23:42] <sage> sjust: not sure why we need multiple versions of the same object
[23:42] <dmick> alfredodeza: can you look at that log and see why there's apparently no evidence of writing ceph.conf to /etc/ceph?
[23:43] <sjust> sage: in order to roll back a delete
[23:43] <sage> oh that then gets recreated
[23:43] <sjust> yeah
[23:43] <sage> and that version_t would be the creation version
[23:43] <sjust> I suspect there will be fairly common cases where you atomically remove the old version and create a new one
[23:43] <sjust> yeah
[23:43] <sage> yeah
[23:43] <sage> ok
[23:44] * PerlStalker (~PerlStalk@2620:d3:8000:192::70) Quit (Quit: ...)
[23:46] <alfredodeza> dmick: that looks like a bug
[23:46] <alfredodeza> looking but can't put my finger on it yet
[23:47] <alfredodeza> ntranger: can you purge and try again?
[23:47] <alfredodeza> ceph-deploy purge {node}
[23:48] <ntranger> purge, and try the install again?
[23:48] * Psi-Jack_ (~Psi-Jack@yggdrasil.hostdruids.com) has joined #ceph
[23:49] <alfredodeza> yes
[23:49] <ntranger> sure thing.
[23:49] <Tamil> ntranger: purge followed by purgedata and then try again
[23:50] <alfredodeza> what Tamil said
[23:50] <Tamil> ntranger: purgedata is to make sure all the /var/lib/ceph and /etc/ceph entries are removed and osd disks are unmounted [if any]
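The reset cycle Tamil and alfredodeza describe, sketched for the three nodes mentioned earlier:

    ceph-deploy purge ceph01 ceph02 ceph03        # remove the ceph packages
    ceph-deploy purgedata ceph01 ceph02 ceph03    # wipe /var/lib/ceph and /etc/ceph, unmount osd disks
    ceph-deploy forgetkeys                        # drop locally cached keyrings
    ceph-deploy install ceph01 ceph02 ceph03
    ceph-deploy mon create ceph01 ceph02 ceph03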
[23:51] <sage> sjust: this doc is great, btw. should probably get it into ceph.git and link to it from the blueprint
[23:51] <sjust> it'll definitely get linked, just wanted at least most of it written first
[23:51] <sjust> working on the recovery stuff atm
[23:52] <sjust> any especially obvious areas I missed?
[23:52] * Psi-jack (~psi-jack@psi-jack.user.oftc.net) Quit (Remote host closed the connection)
[23:52] * Psi-Jack_ is now known as Psi-Jack
[23:53] <ntranger> ok, I did the purges and did the install.
[23:54] <sage> don't think so.
[23:54] <sjust> ok with punting on object classes?
[23:54] <ntranger> still no files in /etc/ceph
[23:54] <sage> btw my first instinct is to add the rank to pg_t instead of creating a new type
[23:54] <ntranger> no conf or anything
[23:54] <alfredodeza> ntranger: did you keep the log output?
[23:54] <sjust> sagewk: yeah, could do that too, but pg_t really refers to the pg itself rather than to the shard
[23:54] <sage> there is already a (subtle) distinction between the "raw pg" in teh request and the actual pg_t as a container of objects
[23:55] <sjust> true
[23:55] <ntranger> the one I pasted?
[23:55] <alfredodeza> the new one
[23:55] <alfredodeza> I am trying to replicate here as well
[23:56] <alfredodeza> ok
[23:56] <alfredodeza> nevermind, I am opening a ticket to address this
[23:56] <sjust> I only fear that there are users of pg_t which don't care at all about the chunk id, and we will tend to leave non-sensical values in the chunk_id in those areas
[23:56] <alfredodeza> I just verified this is a problem in CentOS too
[23:56] <sage> yeah and now that I think about it the PG itself is the pg_t and not cpg_t one
[23:56] <sjust> right
[23:57] <ntranger> alfredodeza: http://pastebin.com/hU5Dqcu6
[23:58] <alfredodeza> ntranger: thanks, I have created a ticket to work on this and get it fixed asap
[23:58] <sjust> sage: I gave no thought at all to the names, they should probably be more meaningful when we get to that point
[23:58] <alfredodeza> ticket ==> http://tracker.ceph.com/issues/5849
[23:58] * sage wonders if we can find a less overloaded term than 'chunk' for the different ranks
[23:58] <sage> yeah
[23:58] <ntranger> alfredodeza: thanks brother!
[23:58] <alfredodeza> no problem, hopefully we can get this done rather quickly

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.