#ceph IRC Log

IRC Log for 2011-03-21

Timestamps are in GMT/BST.

[0:16] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[1:15] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) has joined #ceph
[2:08] * rajeshr1 (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) has joined #ceph
[2:08] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) Quit (Read error: Connection reset by peer)
[2:58] * rajeshr1 (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) Quit (Quit: Leaving.)
[6:33] * eternaleye_ is now known as eternaleye
[7:01] * votz (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) has joined #ceph
[7:01] * votz_ (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) has joined #ceph
[7:04] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) has joined #ceph
[7:53] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[8:07] * allsystemsarego (~allsystem@188.25.130.175) has joined #ceph
[8:27] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[8:38] * votz_ (~votz@dhcp0020.grt.resnet.group.UPENN.EDU) Quit (Quit: Leaving)
[8:54] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Read error: Operation timed out)
[8:59] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[9:09] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[9:24] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) Quit (Quit: neurodrone)
[10:15] * Yoric (~David@213.144.210.93) has joined #ceph
[10:38] <johnl_> morning all
[11:20] * sjust (~sam@ip-66-33-206-8.dreamhost.com) Quit (Read error: Operation timed out)
[11:21] * sjust (~sam@ip-66-33-206-8.dreamhost.com) has joined #ceph
[11:36] * WhiteKIBA (~WhiteKIBA@vm3.rout0r.org) Quit (Server closed connection)
[11:37] * WhiteKIBA (~WhiteKIBA@vm3.rout0r.org) has joined #ceph
[11:59] * Meths_ (rift@91.106.212.117) has joined #ceph
[12:03] * Meths (rift@91.106.167.9) Quit (Read error: Operation timed out)
[12:16] * rajeshr (~Adium@99-7-122-114.lightspeed.brbnca.sbcglobal.net) Quit (Quit: Leaving.)
[12:22] * Yoric_ (~David@213.144.210.93) has joined #ceph
[12:22] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[12:22] * Yoric_ is now known as Yoric
[12:54] * Hugh (~hughmacdo@soho-94-143-249-50.sohonet.co.uk) has joined #ceph
[13:00] * Yoric (~David@213.144.210.93) Quit (Ping timeout: 480 seconds)
[13:08] * Yoric (~David@213.144.210.93) has joined #ceph
[13:19] * dmd17 (~ddumitriu@211.192.11.47) has joined #ceph
[13:28] * lxo (~aoliva@201.82.54.5) Quit (Ping timeout: 480 seconds)
[13:28] * lxo (~aoliva@201.82.54.5) has joined #ceph
[13:41] * samsung (~samsung@58.51.197.99) has joined #ceph
[13:42] <samsung> hi all
[13:57] * Meths (rift@91.106.232.128) has joined #ceph
[14:00] * samsung (~samsung@58.51.197.99) Quit (Quit: Leaving)
[14:04] * Meths_ (rift@91.106.212.117) Quit (Ping timeout: 480 seconds)
[14:06] * samsung (~samsung@61.184.205.40) has joined #ceph
[14:11] * Yoric_ (~David@213.144.210.93) has joined #ceph
[14:11] * Yoric (~David@213.144.210.93) Quit (Read error: Connection reset by peer)
[14:11] * Yoric_ is now known as Yoric
[14:18] * hijacker_ (~hijacker@213.91.163.5) Quit (Server closed connection)
[14:18] * dmd17 (~ddumitriu@211.192.11.47) Quit (Quit: dmd17)
[14:18] * hijacker_ (~hijacker@213.91.163.5) has joined #ceph
[14:41] * Yoric (~David@213.144.210.93) Quit (Ping timeout: 480 seconds)
[14:58] * Yuki (~Yuki@61.184.205.40) has joined #ceph
[15:00] <samsung> help
[15:04] * samsung (~samsung@61.184.205.40) Quit (Quit: Leaving)
[15:04] <Yuki> quit
[15:04] * Yuki (~Yuki@61.184.205.40) Quit (Quit: Leaving)
[15:09] * Yoric (~David@213.144.210.93) has joined #ceph
[15:37] * Yoric_ (~David@80.70.32.140) has joined #ceph
[15:39] * Yoric (~David@213.144.210.93) Quit (Ping timeout: 480 seconds)
[15:39] * Yoric_ is now known as Yoric
[15:41] * Yoric (~David@80.70.32.140) Quit ()
[15:41] * Yoric (~David@80.70.32.140) has joined #ceph
[16:09] * greglap (~Adium@static-72-67-79-74.lsanca.dsl-w.verizon.net) has joined #ceph
[16:17] * greglap (~Adium@static-72-67-79-74.lsanca.dsl-w.verizon.net) Quit (Ping timeout: 480 seconds)
[16:19] * greglap (~Adium@166.205.139.38) has joined #ceph
[16:25] <greglap> morning folks
[16:46] * alexxy[home] (~alexxy@79.173.81.171) Quit (Server closed connection)
[16:47] * alexxy (~alexxy@79.173.81.171) has joined #ceph
[16:52] * bchrisman (~Adium@c-98-207-207-62.hsd1.ca.comcast.net) Quit (Quit: Leaving.)
[16:54] * Tv (~Tv|work@ip-66-33-206-8.dreamhost.com) has joined #ceph
[17:04] * cmccabe (~cmccabe@c-24-23-254-199.hsd1.ca.comcast.net) has joined #ceph
[17:14] * Juul (~Juul@slim.dhcp.lbl.gov) has joined #ceph
[17:15] <Tv> i can't reach any of the hosts in the internal ceph network, except .1 = router
[17:16] <Tv> and logging in via serial console i can't ping the router
[17:16] <Tv> filing ops ticket i guess
[17:19] <cmccabe> tv: same problem here. sage says "rk is looking at it"
[17:20] <Tv> ok.. https://dev.newdream.net/issues/9212
[17:40] * greglap (~Adium@166.205.139.38) Quit (Quit: Leaving.)
[17:43] <Tv> cmccabe: can you get the rpm build to do something like if arch != i386: configure_flags=--without-libatomic-ops
[17:45] <cmccabe> tv: I'm not sure, but probably
[17:47] <cmccabe> tv: I still need to integrate that crazy hairball of code to find $pythonpath
[17:47] <Tv> hahah
[17:47] <Tv> fully agreed on hairiness ;)
[17:47] * Yoric (~David@80.70.32.140) Quit (Ping timeout: 480 seconds)
[17:48] <cmccabe> tv: it's not really ruben's fault. It's just dealing with the versioning and such... gets tricky
[17:49] <Tv> oh absolutely not blaming him
[17:49] <Tv> hatin' on rpm, then again... >:->
[17:55] <cmccabe> tv: yeah
[17:55] <cmccabe> man this outage is a downer. at least I have some other things to do I guess
[17:57] * Yoric (~David@213.144.210.93) has joined #ceph
[18:04] * bchrisman (~Adium@70-35-37-146.static.wiline.com) has joined #ceph
[18:04] <gregaf> cmccabe: Tv: wait, so is libatomic-ops not packaged on RHEL for non-x86 archs?
[18:05] <Tv> gregaf: what i hear is they omitted it on 64-bit because the stack inspection (= profiler) is buggy on 64-bit
[18:06] <cmccabe> tv: I think you're thinking of google-perftools
[18:06] <Tv> err
[18:06] <Tv> yeah now you have me confused
[18:06] <gregaf> okay good — I assumed that was just a bit of confusion on the list but then you had me worried
[18:06] <gregaf> because running without atomic-ops was a bitch for Jim
[18:06] <cmccabe> gregaf: I think tv was just giving libatomic-ops as an example of something that should be mandatory
[18:07] <Tv> "> 3. How should we handle tcmalloc, if at all? Google-perftools is not
[18:07] <Tv> > bundled for 64 bit on CentOS. (It seems like this decision was made"
[18:07] <Tv> that threw me off
[18:07] <Tv> for some reason i started talking about libatomicops, when i meant tcmalloc
[18:08] <Tv> i blame monday morning
[18:08] <gregaf> you probably got confused because of me bitching about atomic-ops with debian armel :)
[18:08] <cmccabe> well, I'm filing a red hat bugzilla now, so hopefully they'll at least consider adding it to RHEL
[18:08] <Tv> i still blame monday morning, though
[18:08] <Tv> for anything
[18:09] <Tv> including the network outage
[18:09] <gregaf> cmccabe: you should really try and get one of our lab users (or somebody with a RHEL support contract) to do it
[18:09] <cmccabe> gregaf: hmm
[18:10] <gregaf> sage could tell you who those users are; I'm not exactly sure but I know we have some because they asked for crypto++ and made the RHEL people pretty happy about government crypto requirements
[18:12] <gregaf> also, did you guys see the happy news from Al Viro this morning?
[18:12] <gregaf> "Misc patches from the last cycle + fix for nfs_path() braino +
[18:12] <gregaf> sys_syncfs(). Please, pull from
[18:12] <gregaf> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6.git/ for-linus"
[18:13] <cmccabe> gregaf: great!
[18:14] <cmccabe> gregaf: it's a pretty sensible system call
[18:14] <cmccabe> gregaf: I didn't understand the random proposals to overload sync_file_range or other syscalls. I'm glad those got a thumbs down
[18:19] <gregaf> yeah, it was just that one guy and I think he was pretty much opposed to adding kernel interfaces
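
For context: syncfs() flushes just the filesystem containing a given fd, unlike sync(), which walks every mounted filesystem. A minimal sketch of a caller, assuming a glibc new enough to expose the wrapper (2.14+) and a hypothetical mount point:

    #define _GNU_SOURCE            /* syncfs() is a Linux extension */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* /mnt/ceph is a hypothetical mount point, not from the log */
        int fd = open("/mnt/ceph", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }
        if (syncfs(fd) < 0)        /* flush only this one filesystem */
            perror("syncfs");
        close(fd);
        return 0;
    }
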
[18:24] * joshd (~joshd@ip-66-33-206-8.dreamhost.com) has joined #ceph
[18:35] <sagewk> let's do the standup at 11
[18:36] <cmccabe> sagewk: ok
[18:40] * Juul (~Juul@slim.dhcp.lbl.gov) Quit (Ping timeout: 480 seconds)
[18:46] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[18:52] <johnl_> hi cephers
[18:53] <cmccabe> johnl_: hi
[18:54] <johnl_> got some questions about use of paxos. ok here?
[18:55] <cmccabe> johnl_: we're about to have a meeting, so there might be a delay in some of the answers
[18:56] * nolan (~nolan@phong.sigbus.net) has joined #ceph
[18:56] <johnl_> ok, will wait then. though I suppose the mailing list might be a better place for it anyway
[18:56] <cmccabe> johnl_: well, if you ask at 11:20 or so I think it will be good
[18:56] <johnl_> ace :)
[18:56] <johnl_> ta
[18:56] * nolan (~nolan@phong.sigbus.net) Quit (Remote host closed the connection)
[18:56] <johnl_> nice short meetings, that's what I like to see :)
[18:56] * Juul (~Juul@slim.dhcp.lbl.gov) has joined #ceph
[18:57] * nolan (~nolan@phong.sigbus.net) has joined #ceph
[19:07] <Tv> johnl_: it seems the meeting is stuck in a repeating postpone loop, so maybe you should just type your question..
[19:27] <gregaf> lxo: I saw you had some questions over the weekend; did you figure them out or did you still want help?
[19:35] * GeoAnnn (~Geo@85.186.183.99) has joined #ceph
[19:37] * GeoAnnn (~Geo@85.186.183.99) Quit (autokilled: This host violated network policy. Mail support@oftc.net if you think this in error. (2011-03-21 18:37:18))
[20:02] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[20:05] <johnl_> Tv: ah heh
[20:05] <johnl_> will do shortly
[20:09] <sagewk> tv: internal network is back
[20:14] * rajeshr (~Adium@98.159.94.26) has joined #ceph
[20:59] * Juul (~Juul@slim.dhcp.lbl.gov) Quit (Ping timeout: 480 seconds)
[21:24] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) Quit (Quit: Leaving.)
[21:24] * gregaf (~Adium@ip-66-33-206-8.dreamhost.com) has joined #ceph
[21:41] <Tv> sagewk, *: you may like the email i just sent to the list, it's a decent indication of where we're heading..
[21:43] <Tv> now, food
[22:36] * Juul (~Juul@slim.dhcp.lbl.gov) has joined #ceph
[22:48] * allsystemsarego (~allsystem@188.25.130.175) Quit (Quit: Leaving)
[23:04] * neurodrone (~neurodron@cpe-76-180-162-12.buffalo.res.rr.com) has joined #ceph
[23:07] <johnl_> ok, finally here for paxos questions, anyone still around?
[23:07] <Tv> johnl_: don't ask about asking, just ask
[23:08] <johnl_> super
[23:08] <johnl_> been studying a paper on paxos
[23:08] <johnl_> now trying to understand where paxos is used in ceph
[23:09] <johnl_> I assume in the monitor cluster, and perhaps the metadata cluster?
[23:10] <johnl_> they both use journals though too right? didn't think that was necessary if they're kept in sync with paxos?
[23:10] <Tv> johnl_: my understanding says monitor cluster does (modified) paxos
[23:11] <johnl_> I must admit, the paxos papers I read were pretty tough to follow, so I might be getting the wrong end of the stick.
[23:11] <Tv> johnl_: mds's journal to objects in osd; osds journal to a local file; monitors don't write anything afaik
[23:11] <johnl_> yer
[23:11] <johnl_> ah right cool
[23:12] <cmccabe> johnl_: there's a pretty good writeup about paxos here: http://the-paper-trail.org/blog/?p=173
[23:13] <cmccabe> johnl_: it's a little bit less abstract and a little bit easier to follow than Leslie Lamport's paper
[23:13] <cmccabe> johnl_: well ok, maybe a lot.
[23:14] <johnl_> not read that one, ta
[23:14] <cmccabe> true story: when Lamport submitted the original paxos paper, it was rejected by the conference he submitted it to...
[23:14] <johnl_> yeah, I read the lamport paper
[23:14] <cmccabe> because he phrased the whole thing in terms of this elaborate, very abstract analogy
[23:14] <johnl_> ah yeah, part-time parliament
[23:15] <cmccabe> and the reviewers couldn't see the actual application to computer systems :)
[23:15] <johnl_> haha
[23:16] <cmccabe> johnl_: "Lamport had presented the work as solving the problem of achieving consensus on measures voted for by a rather lazy parliament on the ancient island of Paxos. Lamport already had form for describing his work through analogy with the Byzantine Generals paper, and felt a similar approach would be successful here. Instead, the referees demanded the removal of the “Paxos stuff”, and Lamport, rather than comply, let the paper sit
[23:16] <cmccabe> johnl_: However, the protocol began to gain some popularity through word of mouth... Lamport resubmitted the paper some eight years after the original attempt, and TOCS published it at the second time of asking in nearly its original form
[23:18] <johnl_> so how do the metadata servers agree on who is responsible for what? (for example, when shifting around responsibility during the dynamic subtree partitioning?)
[23:18] <johnl_> do they use the monitors somehow?
[23:18] <johnl_> if not, I guess they use paxos too?
[23:18] <Tv> johnl_: i believe monitors use paxos to decide how to split work
[23:20] <Tv> johnl_: it goes something like this: osds store actual data; mds is used to create the filesystem metadata structure on top of this (stored in objects also); mons are used to decide who does what; there's lots of osd operations, fewer mds operations, and even fewer mon operations
[23:20] <cmccabe> I don't _think_ the mds does paxos itself... although I could be wrong, since I haven't checked on that
[23:20] <cmccabe> I think it relies on the monitors for that
[23:20] <Tv> data written >> metadata operations >> delegation changes
[23:20] <johnl_> so when a client wants to do a metadata operation, they ask the monitor cluster which mds to speak to?
[23:21] <cmccabe> I know that all osdmap changes flow through the monitors
[23:21] <cmccabe> I assume the mdsmap is a similar story
[23:21] <Tv> johnl_: the first time, yes; after that they'll have a cache
[23:21] <johnl_> yeah, cache is mentioned in the paper
[23:21] <johnl_> unfortunately the original paper seems to have the monitors and mds daemons as one entity
[23:21] * lxo (~aoliva@201.82.54.5) Quit (Read error: Connection reset by peer)
[23:21] <johnl_> I assume it changed after the paper was written :)
[23:21] <Tv> johnl_: of ceph? that's not my reading of it
[23:21] <johnl_> oh
[23:22] * lxo (~aoliva@201.82.54.5) has joined #ceph
[23:22] <Tv> then again i went in with some understanding of the current architecture
[23:22] <cmccabe> johnl_: the thesis clearly separates the two
[23:22] <cmccabe> johnl_: not sure about all the earlier work... I've read most but not all
[23:23] <johnl_> hrm, maybe we're talking about diff papers then!
[23:23] <johnl_> I'm reading "Technical Report UCSC-SSRC-06-01"
[23:24] * Juul (~Juul@slim.dhcp.lbl.gov) Quit (Quit: Leaving)
[23:24] <johnl_> this definitely doesn't mention a monitor cluster
[23:26] <johnl_> yep, different paper!
[23:26] <johnl_> right, got some more reading to do
[23:26] <gregaf> johnl_, everybody: the monitors aren't involved in metadata management at all
[23:27] <gregaf> they keep track of which MDSes are in the cluster
[23:27] <gregaf> but the partitioning is done solely by the MDSes
[23:27] <cmccabe> gregaf: the MDSes don't run paxos, though, right?
[23:27] <cmccabe> gregaf: they use the mdsmap to do that stuff
[23:27] <cmccabe> gregaf: right?
[23:27] <gregaf> the MDSes don't run paxos at all
[23:28] <johnl_> holy shit this is 239 pages long!
[23:28] <Tv> yeah nothing that scales up high can run paxos directly
[23:28] <gregaf> the only info they send through the monitors is export targets
[23:28] <Tv> paxos is only happy with <<100 nodes
[23:28] <gregaf> they don't even include which subtrees are on which nodes
[23:28] <gregaf> and the clients don't ask the monitors for information on where metadata is either
[23:28] <Tv> yeah that comes from mds
[23:28] <cmccabe> tv: interesting perspective. I guess that's probably just because paxos requires everything to talk to one another
[23:28] <gregaf> they just ask the monitors because all the monitors can direct the request elsewhere if they need to
[23:29] <gregaf> cmccabe: yeah, the latency and inter-node bandwidth doesn't scale very far
[23:29] <gregaf> all metadata safety is maintained via journaling and coordinated agreements between the MDSes
[23:30] <johnl_> how do they coordinate agreements?
[23:30] <gregaf> they do their own two-phase commit
[23:30] <gregaf> it's safe as long as you don't lose both of the involved MDS journals and daemons all at the same time
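
A compressed sketch of the two-phase commit shape being described, with hypothetical names rather than Ceph's actual Migrator code. The safety property gregaf mentions comes from each side journaling its vote before replying, so a crash between the phases can be replayed:

    #include <stdio.h>

    enum vote { VOTE_ABORT = 0, VOTE_COMMIT = 1 };

    /* Stub participant: a real MDS would write the vote to its
       journal durably before answering (names are hypothetical). */
    static enum vote participant_prepare(int id)
    {
        printf("mds%d: vote journaled, voting yes\n", id);
        return VOTE_COMMIT;
    }

    static void participant_finish(int id, int commit)
    {
        printf("mds%d: %s\n", id, commit ? "commit" : "abort");
    }

    static int two_phase_commit(const int ids[], int n)
    {
        int i, all_yes = 1;

        /* Phase 1: collect votes; any refusal (or timeout) forces abort */
        for (i = 0; i < n; i++)
            if (participant_prepare(ids[i]) != VOTE_COMMIT)
                all_yes = 0;

        /* Phase 2: broadcast the outcome; a participant that crashed
           in between recovers the decision from its journal */
        for (i = 0; i < n; i++)
            participant_finish(ids[i], all_yes);
        return all_yes;
    }

    int main(void)
    {
        int mds[] = { 0, 1 };
        return !two_phase_commit(mds, 2);
    }
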
[23:30] <Tv> what picks new owner of part of a busy subtree? (yeah it's in the paper but i forgot already ;)
[23:30] <cmccabe> gregaf: so it's _not_ the mdsmap per se, but other messages that are sent
[23:30] <gregaf> but it doesn't go through all the MDSes
[23:31] <cmccabe> gregaf: scanning through the mdsmap source, I don't see references to directories at all.
[23:31] <gregaf> the mds balancing is done in the Migrator
[23:31] <Tv> cmccabe: i think it just bootstraps from known root inode on mds 0, then everything else uses the subtree/frag delegation mechanism
[23:32] <gregaf> they all share use statistics and I believe they just individually calculate moves based on a well-known set of use data and rules
[23:32] <cmccabe> tv: yeah, mdsmap does tell you how to get to the root inode
[23:32] <Tv> gregaf: so gossip makes everyone see the same load everywhere, and make the same deterministic decision?
[23:32] <gregaf> Tv: yeah
[23:33] <gregaf> IIRC that's how it goes
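
A toy version of that "same inputs, same decision" idea, with entirely hypothetical names (this is not the Migrator's real heuristic): once every MDS has gossiped its load and holds the same snapshot, running the same deterministic rule over it gives every node the same answer, with no coordinator needed:

    #include <stdio.h>

    /* Every node evaluates this over an identical gossiped snapshot;
       the strict comparison breaks ties by rank, so all nodes agree. */
    static int pick_donor(const double load[], int n)
    {
        int i, busiest = 0;
        for (i = 1; i < n; i++)
            if (load[i] > load[busiest])
                busiest = i;
        return busiest;
    }

    int main(void)
    {
        double load[] = { 0.2, 0.9, 0.4 };   /* hypothetical per-MDS load */
        printf("mds%d should export some subtrees\n", pick_donor(load, 3));
        return 0;
    }
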
[23:33] <cmccabe> except some caches will get stale, and that's ok, as long as we respond correctly to queries that are out of date
[23:33] <gregaf> cmccabe: which caches are you talking about?
[23:33] <cmccabe> I assume there's a lot of client-side caching of the metadata
[23:33] <cmccabe> as well as caching of which MDS is handling what
[23:34] <gregaf> yeah, the clients get notified if metadata changes hands
[23:34] <cmccabe> oh, so there's a notification mechanism for that.
[23:34] <gregaf> and their capabilities are maintained in the transition and rebuilt later if the transition gets broken
[23:34] <gregaf> yep
[23:34] <cmccabe> what's the name of that message?
[23:34] <gregaf> one of the MExport* messages, not sure which
[23:35] <cmccabe> gregaf: in CIFS that is called an "oplock break"
[23:35] <gregaf> I think — not sure about the specifics at this point
[23:35] <cmccabe> gregaf: don't ask... it's microsoft terminology
[23:36] <gregaf> johnl_: the monitors don't use a journal except in the sense that paxos is a journalling protocol
[23:36] <johnl_> gregaf: yeah, that was my understanding of paxos.
[23:37] <johnl_> which is why I suspected the mds cluster was not paxos
[23:37] <gregaf> yeah
[23:37] <johnl_> all the monitors essentially agree on all "operations" in lock step right?
[23:38] <gregaf> two-phase commits (which the MDSes use when determining ownership) have similar points of data safety but Paxos also adds the >50% agreement amongst the whole cluster rule
[23:38] <gregaf> so yes, all the monitors agree on stuff in lock step
[23:38] <johnl_> so it's like an in-memory synchronised distributed journal :)
[23:38] <gregaf> where each "operation" here is one of the map updates
[23:38] <gregaf> which can each individually contain many changes
[23:38] <johnl_> yeah
[23:39] <gregaf> well, it's not really in-memory (it does live there, but writing it out to disks before anything's done is an important safety consideration)
[23:39] <gregaf> but an absolutely-coherent distributed journal is the whole point of paxos :)
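
To make the "absolutely-coherent journal" concrete, here is the core of a single Paxos decree from the acceptor's side, a minimal sketch with integer proposals and values. Persistence is omitted for brevity, although as noted above a real acceptor must write this state to disk before replying:

    #include <stdio.h>

    struct acceptor {
        int promised_n;   /* highest proposal number promised so far */
        int accepted_n;   /* proposal number of accepted value (-1: none) */
        int accepted_v;   /* the accepted value itself */
    };

    /* Phase 1: on prepare(n), promise to ignore anything lower and
       report any value already accepted. Returns 1 on promise. */
    static int on_prepare(struct acceptor *a, int n, int *prev_n, int *prev_v)
    {
        if (n <= a->promised_n)
            return 0;                 /* already promised a higher round */
        a->promised_n = n;
        *prev_n = a->accepted_n;
        *prev_v = a->accepted_v;
        return 1;
    }

    /* Phase 2: on accept(n, v), accept unless a higher prepare arrived
       in the meantime; a value is chosen once a majority accepts it. */
    static int on_accept(struct acceptor *a, int n, int v)
    {
        if (n < a->promised_n)
            return 0;
        a->promised_n = n;
        a->accepted_n = n;
        a->accepted_v = v;
        return 1;
    }

    int main(void)
    {
        struct acceptor a = { -1, -1, 0 };
        int pn, pv;
        if (on_prepare(&a, 1, &pn, &pv) && on_accept(&a, 1, 42))
            printf("chose value %d at n=%d\n", a.accepted_v, a.accepted_n);
        return 0;
    }

Running one such decree per slot of a log, where each chosen value is the next map update, yields the lock-step journal described here.
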
[23:39] <johnl_> ok, great, thanks for clearing that up.
[23:40] <gregaf> yep
[23:40] <johnl_> got more reading to do now too :)
[23:40] <gregaf> I like Paxos, my senior project was all about it
[23:40] <johnl_> yeah, it was quite exciting to me. I started looking for nails to hit with my new paxos hammer, heh
[23:40] <gregaf> hehe
[23:40] <johnl_> bam! paxos!
[23:41] <gregaf> keep in mind it's slow as hell
[23:41] <johnl_> and gets slower with more nodes!
[23:41] <gregaf> yeah
[23:41] <gregaf> we worked on a stripped-down version of Zookeeper for our implementation and got it to the point where we could commit ~5000 ops/second (up to a max bandwidth of ~30MB/s) on a gigabit network with 3 nodes
[23:42] <gregaf> apparently if we'd used a newer version of the Zookeeper codebase we probably could have doubled that
[23:42] <johnl_> wow
[23:42] <gregaf> but that's only if you can pipeline them, if the ops depend on each other it'll be much, much slower than that
[23:42] <gregaf> err, doubled the number of ops, not the bandwidth
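
As a rough sanity check using only the figures quoted above: 30 MB/s divided by ~5000 ops/s comes out to about 6 KB per committed op, so larger entries hit the bandwidth ceiling before the op-rate ceiling. And without pipelining, throughput degenerates to 1/commit-latency; a hypothetical 1 ms commit round trip would cap fully dependent ops at around 1000/s, regardless of bandwidth.
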
[23:43] <johnl_> I'm saddened you had to code some java though. poor you ;)
[23:43] <gregaf> heh, I don't mind Java so much
[23:43] <gregaf> just keep me the hell away from SML and Prolog
[23:43] <johnl_> heh
[23:44] <johnl_> right, I actually better go wind down. it's almost 11pm here and I repeatedly dreamt about ceph the other night after reading about it late
[23:44] <gregaf> heh
[23:44] <gregaf> sounds like a nice dream to me :p
[23:45] <johnl_> heh, nah, one of those kinda repeating obsessive things.
[23:45] <johnl_> not like a nice dream where all my data was safe, heh
[23:46] <johnl_> probably my subconscious playing out the paxos algorithm over and over
[23:46] <Tv> oh wow ecryptfs can make mount(2) return -ETIME, "Timer expired", for crypto keys that have been revoked
[23:46] <Tv> that sounds evil
[23:47] <johnl_> right, nnight all. thanks again for the help.
[23:47] <gregaf> night!
[23:48] <Tv> every now and then i end up thinking one of the biggest failures of POSIX was numeric errors :-/
[23:51] <cmccabe> tv: I really don't mind numeric errors
[23:52] <cmccabe> tv: I do sometimes wish there was a broader selection of errnos to choose from
[23:52] <Tv> kinda hard to say mount(2) is refusing your request because the crypto key you requested has been marked as revoked
[23:53] <cmccabe> tv: the thing is, if you want to return an error that is comprehensible to programs (as opposed to humans) you end up creating something that is equivalent to an int anyway
[23:54] <Tv> it could be hierarchic & extensible outside of the standard
[23:55] <cmccabe> tv: if you return strings, every time you add a new string to the roster you have to update all the programs that call that syscall. Same deal with errno
[23:55] <Tv> "EINVAL because of foo" is much nicer than EINVAL
[23:56] <cmccabe> tv: I think if there are so many errors that you need hierarchies, your interface is wrong
[23:56] <Tv> cmccabe: see the mount example
[23:57] <cmccabe> tv: I do feel like mount's error returns are not particularly helpful.
[23:57] <Tv> there's no way the standard could have included that case
[23:57] <sagewk> tv: the worst part of mount(2) is that most errors the fs generates are translated into 'invalid superblock'
[23:57] <cmccabe> tv: but I think mount's interface is wrong. There was an article on LWN about how mount should have been two separate operations: making the device ready, and adding it to the file namespace
[23:57] <Tv> now, imagine the case where i have >1 key and want to say which one failed..
[23:58] <Tv> sagewk: EINVAL, yeah
[23:58] <cmccabe> tv: with a better interface, the errors would have been a lot clearer. And there would have been no need for mount --bind, mount --move, and all those other special cases.
[23:58] <Tv> cmccabe: oh sure, gimme plan 9 any day ;)
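
The complaint is easier to see in code: mount(2) reports failure through a single errno, so very different root causes collapse into the same few values. A sketch, with a hypothetical device and mount point:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mount.h>

    int main(void)
    {
        if (mount("/dev/sdb1", "/mnt", "ext4", 0, NULL) < 0) {
            /* EINVAL here might mean a bad superblock, a bad flag
               combination, or an fs-specific failure squashed into the
               same code; ecryptfs can even surface a revoked key as
               ETIME, as mentioned above. */
            fprintf(stderr, "mount: %s (errno=%d)\n", strerror(errno), errno);
            return 1;
        }
        return 0;
    }
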
[23:59] <Tv> on a tangent, djb made an interesting point that bsd socket api should have returned two fds from listen(): one incoming, one outgoing -- then you wouldn't need e.g. shutdown(2)
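
For context on that point: with a single bidirectional fd per connection, finishing your sending side while still reading takes the extra shutdown(2) call; with separate read and write fds, a plain close() of the write fd would express the same thing. A small self-contained illustration using a socketpair:

    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        char buf[16];

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
            return 1;

        write(sv[0], "ping", 4);
        /* Half-close: sv[0] is done sending (peer sees EOF) but can
           still read replies on the very same descriptor. */
        shutdown(sv[0], SHUT_WR);

        write(sv[1], "pong", 4);             /* peer can still answer */
        ssize_t n = read(sv[0], buf, sizeof buf);
        printf("got %zd bytes after half-close\n", n);
        return 0;
    }
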
[23:59] <Tv> there's lots of old ugly in unix.. yet, it's still our last, best hope for sanity
[23:59] <Tv> (can you tell I've been rewatching Babylon 5?)
[23:59] <cmccabe> heh

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.