5/23/04: What to do about replacing symbols in later passes? A complete replacement will break existing pointers, and a symbolic reference would require substantial code changes. I'll probably produce output after each pass and wipe the symbol table; this is less efficient, but similar to how the code was originally intended to work. How do we determine whether a pass 2 is needed if the symbol lookup succeeded due to stale external data that would be overwritten? Perhaps eliminate the --overwrite option; then it wouldn't be possible, but it could be annoying to have to wipe everything before each compilation. It would pretty much enforce one batch of source IDLs per output directory, though that's not necessarily a bad thing. In fact, I think I'll just make --overwrite act on the entire output directory; if it's not set, you get an error if there's *anything* in there.

Now, what happens when you get around to declaring a symbol in a later pass, after it has already been loaded from the filesystem? If the only thing the linkage is used for is to generate a name in the output, then just replace it and let the old version stay refcounted. Is there enough in the pass 1 output for error checking, other than constant ranges? I think so. When will circular inheritance checks be done? That'll require the ability to compare references, meaning we can't just go replacing things. Other than that, I think it can be done at the end of pass 2. So instead of replacing, I'll just add information to existing objects (which means I get to go fix all the places where that sort of work is done in the constructor).

5/25/04: In conclusion on the above, no replacement is done for future passes. Constructor calls have been replaced with calls to a static declare() function, which will either call a constructor or return the existing one (or complain if there's a real conflict, i.e. within this pass or with an external symbol), as well as possibly initialize some data.
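The declare() idea above can be sketched roughly like this. This is a minimal illustration, not idlc's actual code; Symbol, table(), and the pass-tracking fields are all invented for the example:

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical sketch: instead of calling a constructor directly, callers
// ask a static declare(), which returns the existing symbol (adding any
// new information to it, so old pointers stay valid) or constructs a
// fresh one.  A redeclaration within the same pass, or a clash with a
// symbol loaded from external output, is a real conflict.
struct Symbol {
    std::string name;
    int declared_in_pass;   // pass in which this symbol was last declared
    bool external;          // loaded from previously generated output?

    static std::map<std::string, Symbol*>& table() {
        static std::map<std::string, Symbol*> t;
        return t;
    }

    static Symbol* declare(const std::string& name, int current_pass) {
        auto it = table().find(name);
        if (it == table().end()) {
            Symbol* s = new Symbol{name, current_pass, false};
            table()[name] = s;
            return s;
        }
        Symbol* s = it->second;
        if (s->declared_in_pass == current_pass || s->external)
            throw std::runtime_error("conflicting declaration of " + name);
        s->declared_in_pass = current_pass;  // add info; don't replace
        return s;
    }
};
```

The important property is that a later pass gets back the same object pointer, so nothing holding a reference from pass 1 breaks.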
Still to do:
- Implement pass 3.
- Finish implementing output (including a sanity check for incomplete data).
- Nuke --overwrite, and complain if anything is in the target directory.

8/1/04: Most vtable-related design is done. The GUID pointers will have to be dynamically generated somehow (either by the dynamic linker or by startup code) to make sure the same pointer is used in all components. The compiled type representation will break badly on a case-insensitive filesystem; this is already seen in the current IDL files. Some sort of alternate mapping will be needed. Also, it looks like the performance of output generation will be fairly poor under UFS on OS X; even with the current set of IDL it takes a quarter second to generate all output. Not that it was expected to perform well on legacy filesystems, but still, I wonder how long it will take on the full set of IDL...

9/21/04: Enum and bitfield inheritance may be useful...

9/22/04: YYError() should probably be turned into UserError()...

9/25/04: Or more specifically, into something like RecoverableUserError().

12/7/04: Arrays need some more thought, specifically multi-dimensional and inline arrays, and how they interact with typedefs. Currently, multi-dimensional arrays are simply not supported, but what happens if you typedef an array and then create an array of that? It's accepted at the moment, and if you accept that, why not regular multi-dimensional arrays? Plus, with out-of-line arrays, multi-dimensional arrays cannot be created simply by multiplying the sizes of each dimension. Support should be added.

12/21/04: A separate type of reference will likely be needed for persistent entities, as the overhead would be too much to always do it. This would also allow an entity to be locked against decaching (but not ordinary swap-out) by acquiring a non-persistent reference.
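On the out-of-line array point from 12/7/04: if each out-of-line array is carried as a (pointer, count) descriptor, a two-dimensional array becomes an array of descriptors rather than one contiguous block of rows*cols elements, so the dimension sizes can't simply be multiplied. A hypothetical sketch (ArrayRef and total_elements are invented names, not idlc's):

```cpp
#include <cstddef>

// An out-of-line array as a (pointer, count) descriptor.  A 2-D
// out-of-line array is then an array of descriptors; the inner arrays
// live elsewhere and may even have different lengths.
template <typename T>
struct ArrayRef {
    T* ptr;
    size_t count;
};

// The total element count has to be computed by walking the outer
// array, not by multiplying fixed dimension sizes.
inline size_t total_elements(ArrayRef<ArrayRef<int>> m) {
    size_t total = 0;
    for (size_t i = 0; i < m.count; i++)
        total += m.ptr[i].count;
    return total;
}
```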
If one is willing to disallow such locking, persistence could simply be an attribute of a type, but you'd still have problems with references to embedded types; a persistent type should be able to contain non-persistent types (either inline or by reference).

One implementation of persistence would be for a persistent reference to have two states. An uncached reference consists of a non-persistent reference to a storage object (or perhaps a cache object backed by a storage object). A cached reference is like a normal, non-persistent reference. The state would have to be checked on every dereference. If it is found to be uncached, the entity is retrieved (either from storage, or from cache, as it may have gotten there via another reference), the state of the reference is changed, and the reference is added to a list to be swept when trying to decache the object. Something would need to be done to prevent races with asynchronous decaching (perhaps an in-use bit or refcount in the reference). However, implementing such a mechanism would be difficult on top of an ordinary language.

An alternative, which is less "automatic" from a programmer's perspective but still much better than the current state of things, is to have the programmer always acquire an ordinary reference before dereferencing (essentially, the in-use refcount of the previous mechanism would be managed manually or by garbage collection). The programmer can choose whether to keep the ordinary reference around (which favors simplicity, determinism, and speed) or the storage reference (which minimizes memory consumption, but requires more programmer and CPU time to acquire a usable reference more often). The difference between this and simply having serialize/deserialize methods is that you would receive the same entity address if you convert a storage reference multiple times. This causes a problem if you do this from different address spaces, though.
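A minimal sketch of the two-state reference described above, assuming a stand-in Entity type and a fetch callback in place of the real storage/cache objects (the in-use bit/refcount against asynchronous decaching is omitted):

```cpp
#include <functional>

struct Entity { int value; };

// Hypothetical two-state persistent reference.  Every dereference checks
// the state; an uncached reference retrieves the entity (from storage,
// or from cache if another reference already pulled it in) and flips to
// Cached, so later derefs skip the lookup.
struct PersistentRef {
    enum class State { Uncached, Cached } state = State::Uncached;
    std::function<Entity*()> fetch;  // stands in for the storage object
    Entity* cached = nullptr;

    explicit PersistentRef(std::function<Entity*()> f) : fetch(f) {}

    Entity* deref() {
        if (state == State::Uncached) {
            cached = fetch();        // may hit storage or the cache
            state = State::Cached;
        }
        return cached;
    }

    // Decaching sweeps references back to the Uncached state; in the
    // real design this would be driven by the per-object sweep list.
    void decache() { state = State::Uncached; cached = nullptr; }
};
```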
Shared memory is a possibility, but it would be unsuitable in many circumstances due to either races or memory wastage (you'd pretty much need to allocate a page per entity, so that access can be controlled precisely; you shouldn't be able to access entity B just because some other process has it in the same page as entity A, to which you do have access rights, and you can't move one of them to another page without breaking the other process's references).

12/25/04: Security issues need some more thought. In particular, how to handle the case where the rights of multiple processes are needed to do something, with no one process fully trusted with all of those rights. If you just pass a handle to one process, and don't have any further restrictions, then it can do other things with that handle, long after it's returned. Delegates would allow it to be limited to one method, and handle revocation would be nice as well. However, it could still be more privilege than was intended to be granted. To be fully secure, one-time-use objects could be created that only allow a certain, specific operation, but that would have too much overhead in many cases.

12/28/04: An ordinary reference has certain rights associated with it, and these rights are transferred to the callee when the reference is passed. For persistent references, only the namespace lookup permission is bypassed; unserializing (or serializing) the object requires whatever capability token has been set for the relevant operation. I don't think it would be worthwhile to implement a third type of reference that is unserialized but carries no privilege grant; if one wants that, one could make a custom storage object that doesn't actually serialize anything, but just hands out the real reference upon presentation of the right capability token. Access revocation is important for making sure the callee doesn't hold onto the reference longer than it is supposed to (especially if the access rights change after the call).
However, it cannot be determined automatically how long a call-granted reference should remain valid. Many calls may only need it for the duration of the call, but some will need to hold the reference longer. The reference also must be revoked if the caller's access to that object is revoked (technically, it could remain if the callee has another path-to-privilege, but it may not want to, if the action it takes assumes that the caller had privilege to carry out the action). Implementing access revocation requires that we either say fuck-you to the app and make it unserialize again if it does happen to have an alternate path-to-privilege (I believe this is what Unix does), or somehow link the unserialized entity to the persistent reference, and give it a chance to prove that it's allowed to retain the reference. I greatly favor the latter approach; though it's more complicated to implement, going the other way will make lots of apps either buggy or hideously complicated.

Alternatively, a reference could be bound more tightly to the exact path-to-privilege, requiring the app to explicitly specify which source(s) of privilege to consider. This has benefits in avoiding odd races where an app would have asked the user for a password to elevate privilege, but didn't because it happened to have the authority already for some other reason, and that authority got revoked before the operation completed. It'd also be nice in general in helping server processes manage inherited permissions sanely. It'd open the multiple-references-per-object can of worms, though, in that a single address space could have references to the same object compare unequal (or else need a more complicated comparison operation than simply checking the reference pointer).

Aah, fuck it. If you pass a reference to a task, you're trusting it not to do bad stuff with it. If you can't give it that trust, send it a more limited reference.
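One form such a more limited reference could take is the one-time-use object mentioned on 12/25/04: a grant good for exactly one invocation of one specific operation, revoked as soon as it's used. A hypothetical sketch (File and truncate() are invented for illustration; a real grant would be an ORB-level object):

```cpp
#include <stdexcept>

struct File {
    bool truncated = false;
    void truncate() { truncated = true; }
};

// A wrapper granting a single call to one specific operation.  The
// grantor can also revoke it early; either way, a second use fails.
class OneShotTruncate {
    File* target;  // null once the grant is used or revoked
public:
    explicit OneShotTruncate(File* f) : target(f) {}

    void invoke() {
        if (!target) throw std::runtime_error("grant revoked");
        File* f = target;
        target = nullptr;  // revoke before acting: strictly single-use
        f->truncate();
    }

    void revoke() { target = nullptr; }
};
```

The per-grant allocation is the overhead complained about above; this only pays off when the operation is sensitive enough to justify it.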
The major time you'd really want to do a revocation is when the access rights to an object change, and the fuck-you-legitimate-reference-holder approach could be sufficient for the case where the owner of the object is pretty sure there are no legitimate references remaining. Existing OSes don't handle anything beyond that very well AFAIK, so if I come up with anything better it'll be a bonus, but I'm not too worried. The problem with the trust-it approach is that it's harder to know who you're talking to in a polymorphic OS; all you really know (without excessive ORB queries) is the interface type. The trust level for the implementation will often be zero, and (just about) anything that can be done to limit leakage of privilege is a good thing. Oh well, we'll see how it turns out after further API design. It might turn out not to be such a big deal, and I need to get on with making stuff work.

2/1/05: GCC on PPC violates the SysV ABI by not returning small structs in registers. This could have a noticeable performance impact, given that object references are really small structs. While fixing it for existing OSes is unlikely due to existing binaries, perhaps it should be fixed for this OS while it still can be...

3/13/05: Typedefs are not going to be allowed for interfaces. The immediate, selfish reason for this is that allowing them would cause some minor ugliness in the idlc code that I'd rather avoid (it's ugly enough already). However, I'm also having a hard time thinking of legitimate uses for them compared to inheritance. If such a use comes up, it can be re-allowed later. Or maybe I'll implement it soon, and consider it a FIXME until then.

3/19/05: Oops. There was an ambiguity in the IDL, in that the double-dot was used both as a global namespace prefix and as a range specifier. This wasn't caught before, because idlc wasn't allowing namespace-qualified constants. The range specifier is now a triple-dot.
3/20/05: The memory management scheme is *still* really screwed up; an interface declared in the namespace of a superinterface (and similar constructs) will cause reference loops. I think when it finally gets to the point that I try to make memory management actually work right (which probably won't be until this code is made library-able), I'll just declare the entire tree to be a monolithic entity, freed in one go when it is no longer needed. Reference counting could still be used for things that aren't part of the tree, like strings and lists.

5/18/05: I'm thinking of allowing actual return values instead of using only out parameters (even with overloaded return-the-last-out-parameter features of language bindings). It would more clearly express the intent of the programmer to designate one of the out parameters as a return value, and it would make it easier to take advantage of an ABI's return-value registers (instead of always using pointers, or continuing the last-out-param hack at the function pointer level).

Enums in C++ will be typesafe against assigning an initialized enum to an enum of a different type; however, it doesn't look like they can be made safe against initializing an enum with a const initializer from a different enum type, at least not without breaking things like switch. Languages such as D should be able to do it properly with strong typedefs.

GCC is refusing to do CSE on upcasts, even with const all over the place; this means that repeatedly calling methods from a derived interface will be less efficient than casting once to the parent interface and using that. At some point this should be resolved, but that's an optimization issue which can wait (and it may require compiler changes to let the compiler know that the data *really* is not going to change, ever, by anyone, apart from initialization which happens before the code in question runs). Maybe LLVM will do better. CSE on downcasts would be nice too.
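The workaround implied above, until the compiler learns to CSE the upcast, is to hoist the cast out of the loop by hand. A generic sketch (not idlc's actual binding layout), with a derived interface struct embedding its parent:

```cpp
// A crude C-style interface layout for illustration: the derived
// interface embeds its parent, so the upcast is just &d->parent.
struct Parent {
    int (*get)(Parent* self);
};
struct Derived {
    Parent parent;
    int extra;
};

// Sample method implementation for the demonstration.
static int get_impl(Parent*) { return 7; }

// Upcast is (conceptually) redone on every call through the derived
// interface; this is what GCC fails to common up in the real bindings.
inline int sum_slow(Derived* d, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += d->parent.get(&d->parent);
    return total;
}

// Cast once to the parent interface and reuse the pointer.
inline int sum_fast(Derived* d, int n) {
    Parent* p = &d->parent;
    int total = 0;
    for (int i = 0; i < n; i++)
        total += p->get(p);
    return total;
}
```

Both give the same result; the point is only where the cast happens.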
5/20/05: I've decided against method return values. The efficiency part is a minor optimization, and would be better handled with an ABI that can handle out parameters directly (thus giving you many return registers). Of course, switching ABIs will be painful, but it's probably going to happen a few times anyway during the early phases of development. As for "express[ing] the intent of the programmer", it's really not that big of a deal. Eventually, instead of having last-out-param hacks like the C++ binding, a language could allow some keyword or symbol to replace one (or more) arguments, causing them to be treated as return values. It has nothing to do with me being lazy. No, not at all. *whistling and walking away*

It should be possible to disable async as an implementation attribute, so that in-process wrappers can execute directly (e.g. FileStream's async methods directly sending out an async method to the file, rather than requiring both steps to be async).

5/23/05: FileStream's methods probably should be async anyway, and then either call a sync method or provide their own notifier. That way, it can keep the position correct if the read or write did not fully succeed. It'll also need to keep all operations strictly ordered, so if async calls are used, it needs a message serializer object. See update 7/02/06.

10/04/05: There should be a way to check whether a pointer to a virtual struct is of the most derived type.

7/02/06: FileStream no longer exists as an interface; instead, an object combining Seekable, HasFile, and the appropriate stream interface(s) (which have both sync and async methods) should be used. This object will usually be local, so async isn't an issue, but it can be used remotely if it's really needed to synchronize the file offset pointer across multiple address spaces.