#ceph IRC Log


IRC Log for 2012-02-14

Timestamps are in GMT/BST.

[0:12] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[0:16] <Tv|work> sagewk1: i read through wip-signals quickly.. i don't see anything horrible, void *entry looks unused (added github comment), i would have gone with fd-per-signal myself because i'm paranoid about short reads&writes ;)
[0:17] <SpamapS> sagewk1: Do you expect another release before Thursday?
[0:17] <sagewk1> on or about thursday..
[0:17] <sagewk1> tv|work: thanks
[0:17] <sagewk1> tv|work: entry() is the thread entry point
[0:18] <Tv|work> sagewk1: ah i just did git log -p wip-signals ^master | grep entry
[0:18] <SpamapS> sagewk1: ok cool, I'm going to put 0.41 into precise tomorrow. It should be no big deal to get a feature freeze exception if 0.42 slips past Thursday.
[0:19] <sagewk1> spamaps: ok!
[0:19] <SpamapS> sagewk1: speaking of Thursday.. http://www.meetup.com/106MilesSoCal/ .. in Santa Monica.. might be a bit of a long haul for you
[0:19] <Tv|work> sagewk1: oh i should probably say this explicitly: i'm much less comfortable vouching for signal handling when it involves any kind of threads :-/
[0:20] <Tv|work> sagewk1: as in, the sigprocmask doesn't affect signal blocking for any other thread
[0:20] <sagewk1> spamaps: santa monica is close for me (westwood), i'll see if i can make it!
[0:20] <Tv|work> sagewk1: which means your worker thread doing that over the read() means just about nothing
[0:20] <Tv|work> sagewk1: oops that's a bug :(
[0:21] <Tv|work> sagewk1: the self-pipe trick is meant to coordinate between your main thread running poll/select vs signals, not signals vs worker thread running another poll/select independent of the main poll/select
[0:21] <sagewk1> tv|work: that's only to ensure we read a full word from the socket
[0:22] <Tv|work> sagewk1: but what if the signal triggers in main thread while you're in there?
[0:22] <Tv|work> sagewk1: it's like trying to protect state shared across threads with a variable in thread-local data
[0:22] <sagewk1> ah
[0:22] <sagewk1> yeah, should use a pipe per signal i guess
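
The pipe-per-signal idea from the exchange above can be sketched roughly as follows. This is a minimal Python illustration, not the wip-signals code: the SignalPipes class and its method names are hypothetical, and CPython runs signal handlers in the main thread, so it only approximates what a C handler would do. The point it shows is the one Tv|work raises about short reads and writes: with one pipe per signal, each notification is a single byte and the pipe itself identifies the signal, so there is no multi-byte word to split.

```python
import os
import select
import signal


class SignalPipes:
    """One non-blocking pipe per signal (hypothetical helper, not Ceph code)."""

    def __init__(self, signums):
        self.read_fd = {}   # signum -> read end of that signal's pipe
        self.write_fd = {}  # signum -> write end of that signal's pipe
        for signum in signums:
            r, w = os.pipe()
            os.set_blocking(r, False)
            os.set_blocking(w, False)
            self.read_fd[signum] = r
            self.write_fd[signum] = w

    def handler(self, signum, frame):
        # Write one byte per delivery. Its value is irrelevant because the
        # pipe identifies the signal, so no framing is needed.
        try:
            os.write(self.write_fd[signum], b"\0")
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    def install(self):
        # CPython only allows signal.signal() from the main thread.
        for signum in self.write_fd:
            signal.signal(signum, self.handler)

    def wait(self, timeout=None):
        """Return the sorted list of signals seen, draining their pipes."""
        by_fd = {fd: signum for signum, fd in self.read_fd.items()}
        ready, _, _ = select.select(list(by_fd), [], [], timeout)
        seen = []
        for fd in ready:
            try:
                os.read(fd, 4096)  # drain; repeated deliveries coalesce
            except BlockingIOError:
                continue
            seen.append(by_fd[fd])
        return sorted(seen)
```

A caller would construct SignalPipes([signal.SIGUSR1, signal.SIGUSR2]), call install() once from the main thread, and then loop on wait() wherever it wants to react to signals. Repeated deliveries of the same signal collapse into one wakeup, which matches ordinary (non-realtime) signal semantics.
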
[0:25] * joao (~joao@89.181.154.123) has joined #ceph
[0:26] * sagewk1 is now known as sagewk
[0:33] * yehudasa__ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[0:33] * gregaf1 (~Adium@aon.hq.newdream.net) has joined #ceph
[0:34] * sjust1 (~sam@aon.hq.newdream.net) has joined #ceph
[0:34] * dmick1 (~dmick@aon.hq.newdream.net) has joined #ceph
[0:34] * Tv|work (~Tv|work@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[0:34] * sagewk1 (~sage@aon.hq.newdream.net) has joined #ceph
[0:36] * dmick (~dmick@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[0:38] * yehudasa_ (~yehudasa@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[0:39] * joshd1 (~joshd@aon.hq.newdream.net) has joined #ceph
[0:39] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Read error: Operation timed out)
[0:39] * joshd (~joshd@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[0:39] * sjust (~sam@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:39] * gregaf (~Adium@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:40] * sagewk (~sage@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:51] <SpamapS> So, if I wanted to split out the ceph *filesystem* from the ceph block system in the packaging. (so that we can keep the FS in universe) .. I'm thinking these are the bits to move:
[0:51] <SpamapS> usr/bin/ceph-mds
[0:51] <SpamapS> sbin/mkcephfs
[0:51] <SpamapS> and associated man pages
[0:51] <SpamapS> anything else?
[0:52] * lofejndif (~lsqavnbok@194.Red-83-52-212.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[0:52] <joshd1> SpamapS: mkcephfs is confusingly named - it's just used to initialize the cluster, nothing fs-specific afaik
[0:52] <SpamapS> joshd1: ahh good point. :)
[0:53] <SpamapS> libcephfs probably needs to stay out in universe as well.. but thats already in its own package
[0:54] * yehudasa_ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[0:54] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[0:55] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[0:55] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[0:55] * sagewk (~sage@aon.hq.newdream.net) has joined #ceph
[0:56] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[0:57] * yehudasa (~yehudasa@aon.hq.newdream.net) has joined #ceph
[0:58] * joshd2 (~joshd@aon.hq.newdream.net) has joined #ceph
[0:58] * gregaf2 (~Adium@aon.hq.newdream.net) has joined #ceph
[0:58] * dmick2 (~dmick@aon.hq.newdream.net) has joined #ceph
[0:58] <joshd2> the cephfs command line tool is fs-specific, not sure which package it's in
[0:58] * sjust1 (~sam@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:58] * sagewk2 (~sage@aon.hq.newdream.net) has joined #ceph
[0:58] * gregaf1 (~Adium@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:58] * dmick1 (~dmick@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[0:59] * gregaf (~Adium@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[0:59] * yehudasa_ (~yehudasa@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[0:59] * dmick (~dmick@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:00] * sagewk (~sage@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:00] * joshd1 (~joshd@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:00] <SpamapS> joshd1: its in ceph-common..
[1:00] * sagewk1 (~sage@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:01] * yehudasa__ (~yehudasa@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:01] <SpamapS> really, mount.ceph is also fs specific
[1:02] * sagewk2 (~sage@aon.hq.newdream.net) Quit (Remote host closed the connection)
[1:04] * joshd (~joshd@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:12] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) Quit (Quit: Ex-Chat)
[1:21] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[1:21] * yehudasa_ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[1:22] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[1:22] <SpamapS> Does anyone happen to know what piece of libcrypto++ is actually used in ceph?
[1:22] <SpamapS> it gets linked in to everything
[1:22] <SpamapS> but I can't find any of the headers included
[1:22] * yehudasa (~yehudasa@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:22] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[1:24] <gregaf> SpamapS: the includes are cryptopp/
[1:24] <gregaf> so ceph_crypto, for one
[1:24] <SpamapS> hm my grep must have been flawed I was looking for the individual file names.. :p
[1:25] <gregaf> gregf@kai:~/ceph/src [master]$ git grep cryptopp
[1:25] <gregaf> auth/Crypto.cc:# include <cryptopp/modes.h>
[1:25] <gregaf> auth/Crypto.cc:# include <cryptopp/aes.h>
[1:25] <gregaf> auth/Crypto.cc:# include <cryptopp/filters.h>
[1:25] <gregaf> common/ceph_crypto.h:# include <cryptopp/md5.h>
[1:25] <gregaf> common/ceph_crypto.h:# include <cryptopp/sha.h>
[1:25] <gregaf> common/ceph_crypto.h:# include <cryptopp/hmac.h>
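
The grep output above shows the include lines written as "# include" with a space after the hash, which a search for an exact "#include" string would miss. A small sketch of a matcher that tolerates that spacing is below; the function name cryptopp_headers is hypothetical and the regular expression is an assumption about how the lines are laid out, based only on the six lines pasted above.

```python
import re

# Accepts both '#include <cryptopp/x.h>' and the '# include <...>' form.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*<cryptopp/([^>]+)>')


def cryptopp_headers(source_text):
    """Return the cryptopp header names included by the given source text."""
    headers = []
    for line in source_text.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            headers.append(match.group(1))
    return headers
```

Feeding it the contents of common/ceph_crypto.h as quoted above would be expected to yield md5.h, sha.h and hmac.h, in the order the includes appear.
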
[1:26] <SpamapS> Interesting.. looks like most of the calls have an NSS alternative
[1:26] <gregaf> yeah, they're supposed to but I don't recall the exact state of it
[1:26] * sagewk (~sage@aon.hq.newdream.net) has joined #ceph
[1:26] <SpamapS> Problem is, libcrypto++ is a *massive* chunk of code for Ubuntu to start supporting when they're already supporting openssl and libnss
[1:26] <gregaf> I think "available but not recommended"; it might be for the Red Hat guys because NSS fulfills some gov standard but it's a pain to work with?
[1:27] * SpamapS digresses.. s/they/we/
[1:27] <gregaf> Yehuda or tv would remember better, they did it
[1:27] * dmick2 (~dmick@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:27] * joshd2 (~joshd@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[1:27] * gregaf2 (~Adium@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[1:28] <sagewk> just joined.. what's the question?
[1:28] <SpamapS> sagewk: how complete/useful is the NSS support vs. cryptopp ?
[1:29] <SpamapS> sagewk: if it's 100% compatible, our security team would probably rather not add another crypto library to the arsenal. :)
[1:29] <sagewk> iirc nss works, but the api sucks.. there's some weird/inefficient stuff. tv|work probably remembers best
[1:30] <sagewk> ..and he's gone, let me see if he's next door
[1:30] * SpamapS wishes openssl wasn't so weird with their license
[1:32] <SpamapS> sagewk: nothing urgent. I will put libcrypto++ as an option with a note that upstream does not recommend NSS but that it may be a viable alternative
[1:34] * gregorg_taf (~Greg@78.155.152.6) has joined #ceph
[1:34] <sagewk> spamaps: seriously. tv's double checking the libnss strangeness, he should pop in shortly
[1:34] * gregorg (~Greg@78.155.152.6) Quit (Ping timeout: 480 seconds)
[1:34] <SpamapS> sagewk: thanks. :)
[1:34] <sagewk> it *should* work, we just don't test it regularly
[1:35] * Tv|work (~Tv__@aon.hq.newdream.net) has joined #ceph
[1:35] <Tv|work> ahh i got disconnected
[1:35] <SpamapS> sagewk: thats what worries me. I think I'll leave it up to our security team as to whether or not they care much about supporting crypto++
[1:36] <SpamapS> other than ceph.. bitcoind and amule seem to be the only other consumers in Debian/Ubuntu. :)
[1:36] <sagewk> k
[1:37] <SpamapS> Tv|work: any info you can provide about the state of the NSS bits would be helpful in that decision.
[1:37] <Tv|work> SpamapS: so when i worked on the crypto.. 1) i didn't want to just replace crypto++ with nss, in case there's trouble or bad platform support 2) the NSS APIs are just so utterly horrible that i felt keeping crypto++ in the loop allowed me to keep my sanity
[1:37] <SpamapS> Tv|work: you are not alone in your dislike of libnss. :)
[1:37] <Tv|work> SpamapS: NSS looks like it was written by monkeys on crack
[1:38] <Tv|work> much like the million monkeys on typewriters writing Shakespeare, except with fewer monkeys because the bar was set way lower
[1:38] <Tv|work> it only needed to compile, not make sense
[1:38] <SpamapS> Tv|work: monkeys on crack isn't that far off from the Bubble-fueled early Netscape coders.. ;)
[1:39] <Tv|work> SpamapS: as far as the ceph codebase is concerned, as far as we've been able to figure out NSS, our support for Crypto++ and NSS are both "perfect"
[1:39] <Tv|work> ;)
[1:39] <SpamapS> Tv|work: the tradeoff I'm most concerned about is that you guys don't test NSS much.
[1:42] <Tv|work> SpamapS: true, but we use only a few crypto functions, wrapped by the same functions with both libs, and i actually put in unit tests for a lot of that
[1:42] <Tv|work> SpamapS: i mean, i'm more worried about weird random crap like NSS breaks if you keep the library initialized over a fork
[1:45] <Tv|work> honestly, if this was a migration from NSS to Crypto++, I'd have already thrown away the NSS support code
[1:46] <Tv|work> but now it's a migration from somewhat-pretty to ugly-as-hell, and that makes me try to hold on to the non-NSS like my sanity depended on it
[1:53] * yoshi (~yoshi@p8031-ipngn2701marunouchi.tokyo.ocn.ne.jp) has joined #ceph
[2:06] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[2:06] <SpamapS> Tv|work: what was the reason for adding NSS anyway?
[2:06] <Tv|work> SpamapS: some distro didn't have crypto++, i think
[2:07] <SpamapS> Tv|work: hah.. so in essence, the same problem we have.
[2:08] <Tv|work> http://tracker.newdream.net/issues/812
[2:08] <nhm> Tv: hey, how tough is it to tell teuthology to install packages when it deploys the OS to a node?
[2:08] <Tv|work> nhm: oh you'll get to do that ;)
[2:09] <Tv|work> nhm: but only after we have reimaging nice & fast on the new sepia hardware
[2:09] <Tv|work> nhm: teuthology doesn't "deploy the OS", and that is the source of a lot of pain
[2:09] <nhm> Tv|work: Ah, ok. So for now would it be terrible if I installed a couple of packages manually? ;)
[2:10] <SpamapS> Tv|work: ahh thanks!
[2:10] <Tv|work> nhm: as is, apt-get or dpkg do not belong inside teuthology
[2:10] <Tv|work> nhm: and teuthology runs must not depend on anything that isn't already installed by default, or they'll fail on all the *other* nodes
[2:11] <nhm> Tv|work: hrm. Can I use it to copy some perl in place and run that?
[2:12] * The_Bishop (~bishop@p5DC11432.dip.t-dialin.net) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[2:12] <Tv|work> nhm: yes but there may be cleaner ways
[2:12] <Tv|work> nhm: why?
[2:13] <nhm> Tv|work: I want to have collectl running when I do some initial performance tests
[2:14] <Tv|work> nhm: well, collectd isn't currently part of the installed packages
[2:14] <Tv|work> nhm: so your tests won't work just like that
[2:15] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph
[2:15] <nhm> Tv|work: collectl is different from collectd. It's just some perl that basically scans proc on periodic intervals. You can just put the perl src in your home directory and run it as a user.
[2:16] <nhm> though you can run it as a full blown daemon to.
[2:16] <nhm> s/to/too
[2:16] <Tv|work> nhm: oh sorry didn't realize that
[2:17] <nhm> Tv|work: no problem, there are so many statistics collecting daemons out there it's tough to keep track of them. ;)
[2:17] * BManojlovic (~steki@212.200.241.85) Quit (Remote host closed the connection)
[2:18] <Tv|work> nhm: that's not packaged so it'll have to be pulled in somehow by the scripts
[2:18] <Tv|work> nhm: if that really is the right tool to use; not being packaged makes me worry
[2:19] <nhm> Tv|work: It's written by a guy out at HP and is probably more common in the HPC world than the general community.
[2:21] <Tv|work> nhm: honestly, i'm wary of adding much new stuff to teuthology at this point; once the new sepia reimaging stuff is in place, things will get a lot easier
[2:21] <Tv|work> nhm: so if there's a way to run what you need manually for now, for a few weeks, i'd heavily recommend that
[2:21] <nhm> Really all it is though is a way to pull proc data in and write it into a defined log format. The reason I'm interested in it is because I've already got tools written to parse the logs and visualize the results. I used it a lot for the lustre performance testing on our 8k core cluster.
[2:22] <nhm> Tv|work: yeah, if I can distribute the files to a node via teuthology and run them, then I'm golden.
[2:22] <nhm> Ok, gotta go put kids to bed. Have a good evening
[2:22] <Tv|work> nhm: i'm saying, i wouldn't bother integrating that with teuthology right now
[2:23] <nhm> ok
[2:23] <Tv|work> nhm: new way will be making a chef cookbook for it and teuthology just saying "oh and these nodes have these roles"
[2:23] <Tv|work> then most stuff can be debs, etc
[2:23] <Tv|work> and at the end, the install gets scrapped and reimaged
[2:23] <Tv|work> time for me to head out too
[2:32] * Tv|work (~Tv__@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[2:38] * bchrisman (~Adium@108.60.121.114) Quit (Quit: Leaving.)
[3:05] * The_Bishop (~bishop@e179004135.adsl.alicedsl.de) has joined #ceph
[3:54] * aa (~aa@r190-135-24-151.dialup.adsl.anteldata.net.uy) has joined #ceph
[4:03] * joshd (~joshd@aon.hq.newdream.net) Quit (Quit: Leaving.)
[4:05] * aa (~aa@r190-135-24-151.dialup.adsl.anteldata.net.uy) Quit (Ping timeout: 480 seconds)
[4:30] * aa (~aa@r190-135-24-151.dialup.adsl.anteldata.net.uy) has joined #ceph
[4:40] * chutzpah (~chutz@216.174.109.254) Quit (Quit: Leaving)
[5:28] * aa (~aa@r190-135-24-151.dialup.adsl.anteldata.net.uy) Quit (Ping timeout: 480 seconds)
[5:45] * dmick (~dmick@aon.hq.newdream.net) Quit (Quit: Leaving.)
[5:47] * aa (~aa@r190-135-24-151.dialup.adsl.anteldata.net.uy) has joined #ceph
[6:09] * aa (~aa@r190-135-24-151.dialup.adsl.anteldata.net.uy) Quit (Ping timeout: 480 seconds)
[6:23] * monrad-51468 (~mmk@domitian.tdx.dk) Quit (Ping timeout: 480 seconds)
[6:24] * darkfader (~floh@188.40.175.2) Quit (Ping timeout: 480 seconds)
[6:25] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) has joined #ceph
[6:25] * DLange (~DLange@dlange.user.oftc.net) Quit (Ping timeout: 480 seconds)
[6:27] * monrad-51468 (~mmk@domitian.tdx.dk) has joined #ceph
[6:29] * DLange (~DLange@dlange.user.oftc.net) has joined #ceph
[6:33] * darkfader (~floh@188.40.175.2) has joined #ceph
[6:45] * The_Bishop (~bishop@e179004135.adsl.alicedsl.de) Quit (Quit: Wer zum Teufel ist dieser Peer? Wenn ich den erwische dann werde ich ihm mal die Verbindung resetten!)
[7:41] * fghaas (~florian@pd95be58f.dip0.t-ipconnect.de) has joined #ceph
[7:51] * ghaskins (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) has joined #ceph
[7:54] * ghaskins_ (~ghaskins@68-116-192-32.dhcp.oxfr.ma.charter.com) Quit (Ping timeout: 480 seconds)
[7:58] * fghaas (~florian@pd95be58f.dip0.t-ipconnect.de) Quit (Ping timeout: 480 seconds)
[8:28] * fronlius (~fronlius@e176055060.adsl.alicedsl.de) has joined #ceph
[8:45] * fronlius (~fronlius@e176055060.adsl.alicedsl.de) Quit (Quit: fronlius)
[8:51] * Kioob`Taff1 (~plug-oliv@local.plusdinfo.com) Quit (Quit: Leaving.)
[8:54] * Kioob`Taff (~plug-oliv@local.plusdinfo.com) has joined #ceph
[9:08] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph
[9:31] * yoshi (~yoshi@p8031-ipngn2701marunouchi.tokyo.ocn.ne.jp) Quit (Remote host closed the connection)
[9:41] * fronlius (~fronlius@testing78.jimdo-server.com) has joined #ceph
[11:54] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[11:58] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit ()
[12:18] * aa (~aa@r186-52-133-37.dialup.adsl.anteldata.net.uy) has joined #ceph
[12:20] * aa (~aa@r186-52-133-37.dialup.adsl.anteldata.net.uy) Quit (Remote host closed the connection)
[12:29] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) Quit (Quit: Ex-Chat)
[12:58] * joaoluis (~Joao@89.181.154.123) has left #ceph
[12:59] * darkfader (~floh@188.40.175.2) Quit (Read error: Operation timed out)
[13:05] * monrad-51468 (~mmk@domitian.tdx.dk) Quit (Read error: Connection reset by peer)
[13:10] * monrad (~mmk@domitian.tdx.dk) has joined #ceph
[13:15] * billy (~billy@195.97.27.244) has joined #ceph
[13:15] <billy> hello from athens greece...
[13:15] * darkfader (~floh@188.40.175.2) has joined #ceph
[13:15] <billy> want to set up a test environment with 3 virtual machines and setup ceph
[13:17] <billy> but I can't seem to find info on how to install on ubuntu/debian
[13:17] <billy> the info on the wiki is confusing
[13:17] <billy> any pointers/directions welcome
[13:17] <billy> oh and thanx in advance
[13:21] * billy (~billy@195.97.27.244) Quit (Quit: Leaving)
[13:21] * billy (~billy@195.97.27.244) has joined #ceph
[13:34] * Meyer is now known as Guest2549
[13:44] * Guest2549 is now known as Meyer_
[13:44] * Meyer_ is now known as Meyer__
[13:56] * The_Bishop (~bishop@cable-89-16-138-109.cust.telecolumbus.net) has joined #ceph
[14:45] * billy (~billy@195.97.27.244) Quit (Remote host closed the connection)
[14:47] * lofejndif (~lsqavnbok@194.Red-83-52-212.dynamicIP.rima-tde.net) has joined #ceph
[15:03] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[15:03] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit ()
[15:04] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) has joined #ceph
[15:19] * aliguori (~anthony@cpe-70-123-132-139.austin.res.rr.com) has joined #ceph
[15:28] * monrad (~mmk@domitian.tdx.dk) Quit (Read error: Connection reset by peer)
[15:29] * DLange (~DLange@dlange.user.oftc.net) Quit (Ping timeout: 480 seconds)
[15:29] * darkfader (~floh@188.40.175.2) Quit (Ping timeout: 480 seconds)
[15:31] * monrad-51468 (~mmk@domitian.tdx.dk) has joined #ceph
[15:33] * DLange (~DLange@dlange.user.oftc.net) has joined #ceph
[15:34] * darkfader (~floh@188.40.175.2) has joined #ceph
[16:40] * lofejndif (~lsqavnbok@194.Red-83-52-212.dynamicIP.rima-tde.net) Quit (Quit: Leaving)
[17:13] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) has joined #ceph
[17:41] * bchrisman (~Adium@c-76-103-130-94.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[17:51] * fronlius (~fronlius@testing78.jimdo-server.com) Quit (Quit: fronlius)
[18:05] * Tv|work (~Tv__@aon.hq.newdream.net) has joined #ceph
[18:07] * fghaas (~florian@pd95be58f.dip0.t-ipconnect.de) has joined #ceph
[18:08] * fghaas (~florian@pd95be58f.dip0.t-ipconnect.de) Quit ()
[18:34] * gregaf (~Adium@aon.hq.newdream.net) Quit (Quit: Leaving.)
[18:35] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has joined #ceph
[18:42] * fronlius (~fronlius@e176055060.adsl.alicedsl.de) has joined #ceph
[18:52] * BManojlovic (~steki@212.200.241.85) has joined #ceph
[18:53] * chutzpah (~chutz@216.174.109.254) has joined #ceph
[18:55] * lxo (~aoliva@lxo.user.oftc.net) Quit (Ping timeout: 480 seconds)
[18:56] * joshd (~joshd@aon.hq.newdream.net) has joined #ceph
[18:57] * bchrisman (~Adium@108.60.121.114) has joined #ceph
[18:59] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[19:04] * lxo (~aoliva@lxo.user.oftc.net) has joined #ceph
[19:11] * fronlius_ (~fronlius@f054184098.adsl.alicedsl.de) has joined #ceph
[19:15] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) has joined #ceph
[19:16] * fronlius (~fronlius@e176055060.adsl.alicedsl.de) Quit (Ping timeout: 480 seconds)
[19:16] * fronlius_ is now known as fronlius
[19:33] <Tv|work> elder: so.. i used to use lkcd and kdb a lot in ~2003. I know kdump has shown up since then, and is sort of the same idea with a kexec transition and generic linux userspace for dumper, instead of kernel C code. What I really wanted to say was, I've had disk-based kernel crash dumping eat a hard drive a few times, and would like to avoid that when data integrity is the thing we really care about. I'm thinking we should look at netdump, provided that it wor
[19:33] <Tv|work> with out hardware.. http://www.redhat.com/support/wpapers/redhat/netdump/ etc
[19:33] <Tv|work> hmm irc line splits are not reliable -- did that come out right, all the way to the link?
[19:35] <nhm> Tv|work: clipped at "that it wor" and then "with out hardware.." is on the next line.
[19:35] <Tv|work> "that is works"
[19:35] <Tv|work> and s/with out/with our/
[19:35] <Tv|work> err, "that it works"
[19:35] <Tv|work> my brain-finger connector is flaky
[19:36] <nhm> Tv|work: indeed, brian->fingers->keyboard->computer seems like a rather non-optimal data path.
[19:38] <nhm> Tv|work: we've got a UV1000 here with 3TB of ram. kernel dumps are pretty crazy on that machine.
[19:38] <nhm> I'm amazed kdump mostly works.
[19:38] <Tv|work> hah.. the above page is also from the 100Mbps era
[19:40] <Tv|work> the target network can pull of 10gig, and we're looking at boxes with 8-16GB of RAM ;)
[19:40] <Tv|work> *pull off
[19:42] <nhm> hopefully netdump can do more than 4GB of memory now. ;)
[19:42] <Tv|work> that sounds like a configurable
[19:42] <Tv|work> as is typical of kernel debugging tools, the documentation is half a decade out of date
[19:42] <nhm> I figured it was a 32bit unsigned value.
[19:43] <Tv|work> above rule is also valid for tools <5 years old; in those cases, documentation will be non-existent
[19:43] <Tv|work> nhm: they introduced the 4GB limit because they had a 100Mbps network, and waiting for 5 minutes with interrupts locked was a bad user experience
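
The "5 minutes" figure is easy to check with back-of-the-envelope arithmetic. The sketch below assumes the dump moves at the full nominal link rate with no protocol overhead, and takes 4GB to mean 4 GiB; both are simplifying assumptions, and transfer_seconds is a hypothetical helper name.

```python
GIB = 2 ** 30  # bytes in one GiB


def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal time to push size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


# 4 GiB over 100 Mbps: about 344 seconds, a bit under 6 minutes.
old_limit = transfer_seconds(4 * GIB, 100e6)

# 16 GiB over 10 Gbps: about 14 seconds.
new_target = transfer_seconds(16 * GIB, 10e9)
```

Under those assumptions a 16GB box on a 10gig network would dump in well under the time the old 4GB limit was designed around, which is consistent with the limit looking like a tunable rather than a hard constraint.
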
[19:52] * yehudasa__ (~yehudasa@aon.hq.newdream.net) has joined #ceph
[19:53] * joshd1 (~joshd@aon.hq.newdream.net) has joined #ceph
[19:53] * joshd (~joshd@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[19:53] * yehudasa_ (~yehudasa@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[19:54] * Tv|work (~Tv__@aon.hq.newdream.net) Quit (Read error: Operation timed out)
[19:54] * sagewk1 (~sage@aon.hq.newdream.net) has joined #ceph
[19:59] * sagewk (~sage@aon.hq.newdream.net) Quit (Ping timeout: 480 seconds)
[20:17] * sjust (~sam@aon.hq.newdream.net) has joined #ceph
[20:23] * gregaf (~Adium@aon.hq.newdream.net) has joined #ceph
[20:33] * todin_ (tuxadero@kudu.in-berlin.de) Quit (Ping timeout: 480 seconds)
[20:40] <elder> nhm, despite working at SGI in storage, I never got access to a real UV system. And certainly not one that large.
[20:40] <elder> As such, I too am surprised it mostly works...
[20:40] <elder> What kind of bandwidth did you get to storage on that system?
[20:49] <nhm> elder: that's a difficult question to answer succinctly... Things like distributed pagecache, numalink/qpi topology, etc, etc all come into play... With CXFS I think the max I could get was somewhere around 6GB/s and typically it was quite a bit lower. With XFS I could get if I recall correctly about 18GB/s max, with 12-13GB/s being more typical.
[20:50] <elder> FC disks?
[20:50] <nhm> yeah, 15k FC for the scratch partition, 7.2k for the project spaces.
[20:51] <nhm> 136TB of scratch and 524TB of project.
[20:52] <elder> I was working at one time on using the FC path to a drive that went through the physically closest HBA on the NUMAlink. Someone else picked that up and I think it's working, maybe not yet released.
[20:53] <elder> Not sure how much that helps but it would use the links available wisely, and concurrently.
[20:53] <nhm> Yeah, numalink contention was/is a huge problem for us. Management at the time didn't really realize the ramifications of buying a 2D torus topology.
[20:54] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Quit: adjohn)
[20:58] <nhm> oh, I should mention those speeds were aggregate with writers placed manually on optimal boards in the machine.
[20:58] <nhm> and pagecache was kept inside the cpuset.
[21:01] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph
[21:02] * andreask (~andreas@chello062178013131.5.11.vie.surfer.at) has left #ceph
[21:13] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) Quit (Quit: Ex-Chat)
[21:14] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[21:34] * adjohn is now known as Guest2600
[21:34] * Guest2600 (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Read error: Connection reset by peer)
[21:34] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[21:35] * adjohn is now known as Guest2601
[21:35] * adjohn (~adjohn@rackspacesf.static.monkeybrains.net) has joined #ceph
[21:38] * nhorman (~nhorman@99-127-245-201.lightspeed.rlghnc.sbcglobal.net) Quit (Quit: Leaving)
[21:43] * Guest2601 (~adjohn@rackspacesf.static.monkeybrains.net) Quit (Ping timeout: 480 seconds)
[21:52] * Tv|work (~Tv__@aon.hq.newdream.net) has joined #ceph
[22:35] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) has joined #ceph
[22:42] * fronlius (~fronlius@f054184098.adsl.alicedsl.de) Quit (Quit: fronlius)
[23:00] * MarkDude (~MT@c-71-198-138-155.hsd1.ca.comcast.net) Quit (Quit: Leaving)
[23:08] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) has joined #ceph
[23:10] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) Quit (Max SendQ exceeded)
[23:11] * lofejndif (~lsqavnbok@57.Red-88-19-214.staticIP.rima-tde.net) has joined #ceph
[23:24] * aa (~aa@r200-40-114-26.ae-static.anteldata.net.uy) Quit (Remote host closed the connection)
[23:26] * verwilst (~verwilst@d51A5B5DF.access.telenet.be) Quit (Quit: Ex-Chat)
[23:29] * dmick (~dmick@aon.hq.newdream.net) has joined #ceph

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.