From 5ee1a4b6695c7889466f97d42b8c416b080b361d Mon Sep 17 00:00:00 2001 From: Scott Wood Date: Wed, 11 Apr 2007 18:49:26 -0500 Subject: [PATCH] Document new memory management semantics. --- doc/abi/x86 | 36 +-- doc/orb/memory-management | 454 ++++++++++++----------------------- doc/orb/parameter-info-block | 47 ++-- 3 files changed, 185 insertions(+), 352 deletions(-) diff --git a/doc/abi/x86 b/doc/abi/x86 index 2038cd5..41f4bb4 100644 --- a/doc/abi/x86 +++ b/doc/abi/x86 @@ -1,12 +1,10 @@ -Function Calls and In-Process Method Invocation: - SysV i386 ABI +Basic ABI: SysV i386 Out-of-Process Method Invocation: Caller: - eax: object ID - ecx: method ID - - edx: pointer to parameter info block (PIB), described below + eax: reserved, must be zero + edx: pointer to parameter info block (PIB) + ecx: maximum "out" bytes, excluding Inlines Call the 32-bit address stored at 0x7fff0000 to invoke the method. @@ -24,37 +22,13 @@ Out-of-Process Method Invocation: was passed in by the caller. Callee: - params on stack (low addr to high), ids replaced with pointers, - at least 4 bytes of spare space beyond the high element - - eax: object pointer + eax: pointer to PIB edx: pointer to caller information struct, if such was requested ecx: return address Upon return: - params on stack (low addr to high), in params may be clobbered eax: pointer to exception, or NULL if none. ebx, esi, edi, ebp, esp: should be preserved ecx, edx: may be clobbered - -Stack: - esp is stack pointer, grows down, decrement before store - -Object structure: - The object ID is stored as a 32-bit quantity at an offset - specified by calling a method TBD. - -Wrapper object creation: - The function to create wrapper objects is specified by calling a method - TBD. The function shall conform to the local ABI, and takes an ID as a - 32-bit integer as the first parameter, and a pointer to the class as - the second. It returns a pointer. 
- - Wrapper objects may be preemptively declared to avoid infinite loops by - calling a method TBD. - -Struct padding: - All fields are padded so that basic types are naturally aligned. - diff --git a/doc/orb/memory-management b/doc/orb/memory-management index 98871b6..3189696 100644 --- a/doc/orb/memory-management +++ b/doc/orb/memory-management @@ -4,145 +4,123 @@ 1.1 Overview ============ -The writeability, sharedness, and lifetime of memory passed by reference -as a parameter (in or out) depends on the parameter attributes, as well as -whether the method is asynchronous. The implementation of the semantics -will be different for in-process and remote methods, and not all -restrcitions are enforced as strictly for in-process methods, but the -semantics when the rules are followed must be the same for both. - -The data type of parameter also affects how it is handled. Built-in -datatypes, bitfields, enums, and inline structs and arrays are passed -directly by value (or, depending on the ABI, simple reference-valid- -until-end-of-method for out parameters), and thus do not require memory -management. Object references are reference counted both at the ORB and -process level, and are unique in that the client data structures are -read-only and uncopyable (because a unique object needs a unique address) -within an address space; most of this section does not apply to them. - -That leaves non-inline structs and arrays. The ideal way of treating -these depends on how they're being used. Simple, small structs and arrays -would benefit from being passed as a simple pointer (plus a length field -in the case of arrays), with the method copying anything it wants to keep -and/or change. 
For async methods, the caller would need to avoid -modifying referenced memory until it knows the method has been executed -(in the case of remote methods, it only matters until the ORB has COWed or -copied the memory, but client code should not depend on this, or it will -not work with an in-process server). This is the default behavior if no -parameter attributes are specified. With this type of parameter, there -are no restrictions on what allocator the memory being passed came from -for in parameters. "Out" and inout parameters will be discussed later. - -The only major complexity here is in how the duplicate is made. Struct -duplications will need to do a deep copy without being confused by -reference loops; the stubs will need to provide a "duplicate" function -that does this (possibly using a hash table or other associative array to -identify reference loops if IDLC determines that they are possible with -that particular type of struct). - -To avoid a double-copy, out-of-process methods may want to merely ask the -ORB to give it full, permanent access (via COW if not already copied) to -the memory. This may be hard to implement, though, as it requires the -server to know that the data came from an out-of-process implementation -(even if it's been passed to other functions, some of which may be -in-process object-model methods). This optimization may only be useful -for large amounts of data that is not likely to have most of its pages -modified; if it is likely to be heavily modified, then COW doesn't help -much, and may hurt (and if it's large, it probably hasn't already been -copied). It may be better to not implement this optimization, and instead -recommend that the user use a parameter attribute to deal with the -problem. - -1.2 Parameter Attributes -======================== +This document defines the lifetime and writeability of memory passed by +reference as a parameter under various circumstances (in or out, sync or +async). 
The implementation of the semantics will be different for local +and remote methods, and not all restrictions are enforced as strictly for +local methods, but the semantics when the rules are followed must be the +same for both. + +1.2 Definitions +=============== -With larger arrays and complex data structures, one would often benefit -from being able to avoid the copy altogether. This can be accomplished by -altering the interface semantics. All of the parameter attributes but -"copy" do this, and thus cannot be changed without breaking interface -compatibility. None of the current parameter attributes can be combined -with any other current parameter attribute. - -1.2.1 Default Semantics -======================= - -If no attribute is specified, "in" parameters are visible only until the -method returns, and are read-only. There will be a "duplicate" function -provided by the language binding for the server to use if it wants to -retain and/or write to the data. For "small" data (the threshold needs to -be empirically determined), it just makes a copy. For "large" data, the -pages will be copy-on-write mapped (unless the caller asks for an -immediate copy). The only real reason not to use the immediate flag for -small data (as determined by the programmer) as well (rather than have a -threshold) is so that the threshold can be tunable based on the relative -performance of copies versus traps on a given system. It'd also be nice -if the programmer could ask a profiler to determine whether large data -should be COWed or copied immediately on a per-call basis. - -When the "duplicate" function is called, a copy-on-write mapping of the -data will be created. Edge data will be overmapped regardless of page -type, but the overmap status will be retained (so that edge pages will not -be overmapped into some other address space), though read overmap will be -promoted to read/write overmap, as the extra data in the copy will not be -used anymore. 
There will be an option to the "duplicate" function to -create fully overmappable pages by copying the edge data and zeroing the -rest of the edge pages (in case the caller wants to share the data). - -1.2.2 Copy -========== +implementation attribute: An attribute that is defined only for a specific +server object, and affects only the server stubs. It can be changed +without breaking client compatibility. + +interface attribute: An attribute that is defined as a part of an IDL +interface, and cannot be changed without breaking compatibility. + +local: The method being called exists in the caller's address space, and +thus no marshalling need occur. + +remote: The method being called exists in an address space other than the +caller's. All data must be marshalled and passed through some form of +IPC. + +1.3 In Parameters +================= + +Built-in datatypes, bitfields, enums, and inline structs and arrays are +passed directly by value (or, depending on the ABI, a reference valid +until the end of the method for out parameters), and thus do not require +memory management for local calls. For remote calls, data passed by +value is treated as a single non-inline struct for synchronous methods, +and as an inline struct (copied at invocation time, and freed when the +message is removed from the queue by the ORB) for async methods. + +Object references are reference counted both at the ORB and process +level, and are unique in that the client data structures are read-only +and uncopyable (because a unique object needs a unique address) within an +address space; most of this section does not apply to them. + +That leaves non-inline structs and arrays. These are passed as a simple +pointer (plus a length field in the case of arrays), with the method +copying anything it wants to change and/or keep beyond the end of the +method. 
For async methods, the caller needs to avoid modifying referenced +memory until it knows the method has been executed (in the remote case, it +only matters until the ORB has COWed or copied the memory, but client code +should not depend on this, or it will not work with a local server). +There are no allocator or alignment restrictions on the memory being +passed (though page alignment and overmap can improve performance when +sending large buffers). + +If the server wants to keep the data past the end of the method, or if it +wants to modify the data, it must call a duplicate() function to copy the +data or map it copy-on-write. Struct duplications will need to do a deep +copy without being confused by reference loops (possibly by using a hash +table or other associative array to identify reference loops if IDLC +determines that they are possible with that particular type of struct). + +To avoid a double-copy, remote methods may want to merely ask the ORB to +give them full, permanent access (via COW if not already copied) to the +memory. This may be hard to implement, though, as it requires the server +to know that the data came from a remote implementation, and be able to +modify the set of pages that the ORB reclaims at the end of the method +call. This may not provide a significant benefit over simply creating a +new copy-on-write mapping. + +In remote calls, the caller is charged with the temporary allocations made +by the ORB in the callee's address space. + +1.3.1 Copy (interface) +====================== + +The "copy" interface attribute may be used to achieve the effect of +calling duplicate() in the server. For local calls, this simply results +in a call to duplicate() in the client stubs. For remote calls, the +copied data segments are marked with the Copy flag; such segments are read/write +(using copy-on-write) and not freed at the end of the method. These pages +are charged to the callee, and must be freed using the callee's ORB memory +manager. 
The callee should be able to specify how much data it is willing +to have copied to a given object. + +The "copy" interface attribute may not be used on "out" parameters, and +for "inout" parameters only applies to the "in" phase. Do not confuse +this with the implementation "copy" attribute used on "out" parameters. -The "copy" attribute affects only the implementation; changing it does not -break interface compatibility (and thus require a new GUID). As such, the -use of this attribute is specified in the CDL rather than the IDL. -Language bindings that do not require a CDL file will provide some way of -specifying copy semantics directly from the implementation language. +1.4 Out parameters +================== -This attribute directs the ORB to automatically make a copy (possibly via -COW, but no read-only or shared mappings) of the parameter. For -in-process invocation, methods with any "copy" parameters will need to go -through a stub to do the copy before calling the real method. +When a non-inline struct or array is being returned via an out or inout +parameter, there is no end of method on which to base reference lifetime. +As such, the ownership of such data is transferred to the ORB memory +manager in the caller's address space. For remote calls, the caller +should be able to specify a limit on how much memory it is willing to +receive from the callee. -1.2.3 Shared -============ +Inline "out" parameters are returned into a buffer provided by the caller. +All value types are implicitly inline. -The "shared" attribute declares that the method implementation and caller -will treat the data as a read/write shared memory region, lasting beyond -the end of the method. The caller must ensure that all segments of data -provided (including every nested struct, or struct/array in an array of -structs/arrays) either begins and ends exactly on a page boundary, or has -all partial pages marked for read/write overmap. 
For out-of-process -methods, an exception will be thrown if this is not the case. For -in-process methods, you'll merely go to hell for writing bugs that won't -show up until someone hands your code a reference to an out-of-process -implementation. All data must be under the management of the ORB memory -manager. - -The shared region is terminated when the caller or callee frees the memory -using the ORB memory manager. This requires calling some function that -knows whether to actually free it (in the out-of-process case), or release -a reference or something (in the in-process case). For arrays, where -we're already using a struct with pointer and length, adding a reference -count pointer wouldn't be a big deal. Struct pointers, OTOH, are -currently plain pointers, and adding an indirection struct with pointer -and reference pointer would be ugly, but doable. The alternative is to -have the memory manager look up the memory fragment and find the refcount -in its internal data structures, which would require a lookup for every -reference count operation. - -1.2.4 Push -========== +1.4.1 Copy (implementation) +=========================== -The "push" attribute transfers the memory region to the destination, -unlinking it from the source address space. The source memory region will -be unmapped for out-of-process invocations; for in-process invocations, -the memory region will simply belong to the callee, and the caller must -not reference it any more. Like "shared", all data fragments must either -begin and end on a page boundary, or be in a page with read/write overmap -enabled. It is also required that every page being pushed be under the -management of the ORB allocator. +If the "copy" implementation attribute is specified on the out parameter, +then the buffer the caller receives will be newly allocated, and the +buffer provided by the callee remains the callee's. In this case, there +are no allocator or alignment restrictions on the callee's buffer. 
-1.2.5 Inline -============ +If the "copy" attribute is not specified, then the buffer provided by the +callee must be under the control of the ORB memory manager, and it must +begin and end on page boundaries. When returning from a remote call, the +pages will be unmapped from the callee and mapped into the caller. + +The "copy" implementation attribute may not be used on "in" parameters, +and for "inout" parameters only applies to the "out" phase. Do not +confuse this with the interface "copy" attribute used on "in" parameters. + +1.5 Inline +========== When used as a parameter attribute (either explicitly or as part of the type definition), "inline" specifies that the struct or array will be @@ -150,11 +128,6 @@ passed directly as a value parameter. A NULL reference cannot be passed, and for out parameters, the method cannot return a reference to an existing struct or array, but instead must fill in the reference given. -The "inline" attribute is similar to "copy", except that no in-process -stub is required because it is part of the interface (though a stub may be -required in certain languages if they do not provide automatic copying of -value-passed structs/arrays). - An inline array cannot be of variable length, and may be treated differently from other arrays in the language binding (e.g. in C++, inline arrays are bare language arrays, and do not use the Array or MutableArray @@ -164,24 +137,7 @@ The "inline" attribute can also be used on members of a struct, to indicate that the member is embedded directly in the outer struct, rather than linked with a pointer. -1.2.6 Immutable -=============== - -The "immutable" attribute is similar to "const" in C/C++ ("const" in -IDL is used only for compile-time constants), and specifies that the -array or struct cannot be modified through this particular reference. 
-It is the default for parameters when neither "shared" nor "push" is -used ("copy" and "inline" parameters will accept immutable references -on the caller side, but will produce a mutable copy for the server -side). It may be specified without effect when it is the default. - -Immutable "shared"/"push" parameters will result in read-only -mappings in the destination address space, though for "push" -parameters, the process will have permission to enable writes (this -is intended for use by the ORB memory manager when the memory is -freed). - -The "immutable" attribute may also be used on members of a struct. +"Inline" is an interface attribute. 1.3 Asynchronous methods ======================== @@ -193,65 +149,26 @@ When invoking an async method, there should be a mechanism to get a handle to track the progress of the invocation. This can be used by the caller to try to cancel the method on a timeout, which can only be done if the message has not yet been accepted by the recipient. Once the message has -been accepted, in the in-process, non-"copy" case, the caller must not +been accepted, in the local, non-"copy" case, the caller must not touch the data after this point until it receives a message from the -callee that it is finished. In the out-of-process case, if a caller loses +callee that it is finished. In the remote case, if a caller loses patience with the callee, it can free the memory (thus making it exist only in the callee's address space, non-shared). -1.4 Out parameters -================== - -When a struct or array is being returned via an out or inout parameter, -there is no end of method on which to base reference lifetime. As such, -if neither "shared" nor "inline" is specified, an out parameter is treated -as "push". 
The default of "push" only applies to the out half of an inout -parameter; in general, use of inout should be probably limited to value -types and parameters that use "push" in both directions so as to avoid -potentially confusing semantics. - -To return static data that does not need to be freed, out parameters can -use the "copy" implementation attribute. The interface semantics will -still be "push", but the ORB (or a wrapper function for in-process calls) -will allocate a pushable buffer and copy the data into it. If the static -data is managed by the ORB memory manager, it will reference count the -page rather than make a copy if the buffer is of sufficient size. - 1.5 Exceptions ============== -Exceptions are thrown as copy/push out parameters. This will often mean -unnecessary copies at in-process IDL method boundaries, but exceptions -should be small and infrequently thrown, and usually not managed by the -ORB memory manager except across method boundaries. - -1.6 Unmet needs -=============== - -It is currently impossible to specify attributes (other than "immutable" -and "inline") on individual struct members. This would be useful to pass -a struct normally that contains a status code or other metadata along with -a pushed or shared buffer, without making the outer struct also pushed or -shared. It would also be useful to pass a pushed buffer through some -fixed superstruct such as a NotifierInfo. +Exceptions are thrown as copied out parameters. This will often mean +unnecessary copies at local IDL method boundaries, but exceptions should +be small and infrequently thrown, and usually not managed by the ORB +memory manager except across method boundaries. 2. The ORB Memory Manager (ORBMM) ================================= -The ORB memory manager keeps track of memory allocated by the ORB during -an out-of-process method invocation. 
It is also used for allocations made -by user code for memory that may need to be freed by another component in -the same address space (such as when using the shared or push attributes). - -A reference count is kept on each page, so that shared-within-a-process -mappings must be released by both caller and callee before the memory is -freed. Passing a chunk of data through a "shared" parameter to in-process -method increments the page's reference count; this requires a memory -manager lookup for every such parameter. Because of this, consider using -"push" instead if sharing is not required. - -In the out-of-process case, each mapping is freed separately, and the -kernel handles reference counting the physical page. +The ORB memory manager keeps track of memory that is allocated in the +process of calling or returning from a method. This happens with "copy" +in parameters, and all non-inline out parameters. 2.1 Methods =========== @@ -266,121 +183,56 @@ System::RunTime::ORBMM. 2.1.1 alloc =========== -void *ORBMM::alloc(size_t size, ORBMM::AllocGroup *group = NULL); - -Allocate the memory required for the given type (and the given array size, -if the type is an array). A group handle may be passed; if not, no page -will contain more than one allocation. The reference count on the page is -incremented if a page has been reused and per-object refcounts are not -supported; otherwise, the object's reference count is one. If the -allocation spans multiple pages, it will be tracked as an "object", so -that each page will have its reference count incremented and/or -decremented when appropriate. +void *ORBMM::alloc(size_t size, int refs = 1); -The implementation may, but is not required to, track reference counts on -a per-page basis rather than per-object. The former will generally be -more efficient, but will preclude the reuse of an object's memory upon -release until the entire page is released. 
+Allocate the memory required for the given type, array length, and +initial refcount. -Alternative forms: +Alternate forms: Type *obj = new(orbmm) Type; Type *obj = new(orbmm) Type[]; -Type *obj = new(orbmm, group) Type; -Type *obj = new(orbmm, group) Type[]; +Type *obj = new(orbmm, refs) Type; +Type *obj = new(orbmm, refs) Type[]; 2.1.2 retain ============ -void ORBMM::retain(Region region); - -Increment the reference count on the specified object. +void ORBMM::retain(void *ptr, int refs = 1); -The region must refer to only one ORBMM object; the implementation may, -but is not required to, throw an exception if this rule is violated. If a -region smaller than the object is retained, it will not prevent other -pages in the region from being freed. +Add the specified number of references to the ORBMM object. "ptr" can +be anywhere inside the object; it does not have to point to the +beginning of the object. 2.1.3 release ============= -void ORBMM::release(Region region); - -Decrement the reference count on the specified object, freeing it if it -reaches zero. - -It is allowed, but not required, that space in multi-object groups be -reused when freed, if the same group is used to allocate new objects. -This is only possible if reference counts are kept on a per-object basis -rather than per-page. - -The region must refer to only one ORBMM object; the implementation may, -but is not required to, throw an exception if this rule is violated. If a -region smaller than the object is released resulting in a reference count -of zero, portions may be freed prior to the rest of the region's reference -count reaching zero. - -2.1.4 super_retain -================== - -void ORBMM::super_retain(Region region); - -Increment the reference and super-reference counts on the specified -object. If the reference count ever goes below the super-reference count, -an exception is thrown. 
This mechanism is intended to ease debugging -reference count problems, by turning memory corruption into an exception. - -It would typically be used when a given object is not intended to be -released until the program exits (or some well-defined cleanup procedure -is done), such as program and module code and static data. It should also -be used when a mapping is created using mmap() or other higher-level -function, so as to be able to detect if such a reference is released -through release() rather than through the high-level mechanism. - -The region must refer to only one ORBMM object; the implementation may, -but is not required to, throw an exception if this rule is violated. +void ORBMM::free(void *ptr, int refs = 1); -2.1.5 super_release -=================== +Alternate forms: +delete(orbmm) Type; +delete(orbmm) Type[]; +delete(orbmm, refs) Type; +delete(orbmm, refs) Type[]; -void ORBMM::super_release(Region region); +Release the specified number of references to the ORBMM object. If +the refcount falls to zero, the memory will be unmapped. If the +refcount goes below zero, an exception may be thrown. "ptr" can be +anywhere inside the object; it does not have to point to the +beginning of the object. -Decrement the reference and super-reference counts on the given object. - -The region must refer to only one ORBMM object; the implementation may, -but is not required to, throw an exception if this rule is violated. - -2.1.6 create_group -================== - -ORBMM::AllocGroup *ORBMM::create_group(); - -Create a group handle that can be passed to the alloc function to pack -multiple allocations into the same page(s). - -2.1.7 destroy_group -=================== - -void ORBMM::destroy_group(ORBMM::AllocGroup *group); - -Free the memory associated with the group handle returned by create_group. -The allocations made under the group are unaffected, and must be released -separately. 
- -2.1.8 add_region +2.1.4 add_region ================ -void *ORBMM::add_region(System::Mem::Region region); - -The ORB memory manager can manage reference counts of pages that were not -allocated using ORBMM. This can be used to refcount non-anonymous -mappings (and thus make them usable with parameters that require ORBMM -memory). It can also be used on static pages that will never be freed -until the program exits. +void ORBMM::add_region(System::Mem::Region region, bool unmap_orig, + int refs = 1); The add_region method places a non-ORBMM controlled region under ORBMM -control. The ORBMM may use the existing mapping, or it may remap the -pages into its own region of the virtual address space. The address that -it uses will be returned. The entire region will be one ORBMM object. - -Upon a reference count of zero, the pages will be unmapped using -System.AddrSpace.unmap(). +control. This can be used to return data that was not allocated using +ORBMM (such as static data or non-anonymous mappings), without having +to always do so (as the "copy" attribute would require). It is also +used internally by IDL stubs. + +The entire region will be one ORBMM object, and will be freed all at +once when the refcount goes to zero. When it is freed, the region +will cease to be under ORBMM control. The original region will only +be unmapped if unmap_orig is true. diff --git a/doc/orb/parameter-info-block b/doc/orb/parameter-info-block index 2885294..8f32e1c 100644 --- a/doc/orb/parameter-info-block +++ b/doc/orb/parameter-info-block @@ -1,6 +1,7 @@ Parameter Info Block (PIB), all offsets in pointer-length words - Name Offset Meaning - buffer_size 0 Size of the destination buffer + Name Offset Meaning + buffer_size 0 Size of the destination buffer, excluding + Copy segments. The total number of bytes in all of the segments that require a buffer to be created in the destination address space. 
This is @@ -12,22 +13,31 @@ Parameter Info Block (PIB), all offsets in pointer-length words (because the kernel does not need to allocate a caller-side buffer for them). The kernel may throw an exception if the actual size is greater than specified in this field. + + This only covers the "normal" segments which are mapped only + for the duration of the call. Copy segments are handled + separately. + + copy_size 1 Size of all Copy segments. + + This is like buffer_size, but for Copy segments. The pages - objlist_ptr 1 Pointer to the object list - objlist_len 2 Length of the object list, in IDs + objlist_ptr 2 Pointer to the object list + objlist_len 3 Length of the object list, in IDs The object list is a special segment that contains object IDs rather than arbitrary data. Each object ID will be translated into the destination ID-space, allocating new IDs when necessary. The IDs themselves are 32 bits each, unsigned, - regardless of the pointer size. + regardless of the pointer size. The first object in the list + is the object to receive the message. - num_segments 3 Number of data segments - segment.ptr 4+n*4 Pointer to data segment - segment.len 5+n*4 Length of data segment in bytes - segment.flags 6+n*4 Attributes of data segment - reserved 7+n*4 Reserved for future use, and for - power-of-two indexing + num_segments 4 Number of data segments + segment.ptr 5+n*4 Pointer to data segment + segment.len 6+n*4 Length of data segment in bytes + segment.flags 7+n*4 Attributes of data segment + reserved 8+n*4 Reserved for future use, and for + power-of-two indexing Each segment describes data being transmitted to and/or from the callee. For out segments, the caller may designate a buffer to hold @@ -41,14 +51,11 @@ Parameter Info Block (PIB), all offsets in pointer-length words Segment Flags (see doc/orb/memory-management for more details): In 0x01 Data is copied/mapped from caller to callee. Out 0x02 Data is copied/mapped from callee to caller. 
- Shared 0x04 A permanent shared mapping is created. - Push 0x08 The region is unmapped from the source and - transferred to the destination. - Inline 0x10 The callee cannot change the length of an - Out segment. Ignored for In segments. - Immutable 0x20 The segment is to be mapped read-only in - the destination. Ignored unless Shared is - set. - Copy 0x8000 The segment is permanently copied into the + The data is unmapped from the callee unless + Copy is set. + Inline 0x04 The callee cannot change the length of an + Out segment, and the caller must provide the + buffer. Ignored for In segments. + Copy 0x08 The segment is permanently copied into the destination address space, with read/write access. -- 2.39.2
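
The updated PIB word layout and segment flag values from doc/orb/parameter-info-block can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not code from the tree: the namespace, enum, and function names are hypothetical, and only the word offsets (0 through 4 for the header, 5+n*4 onward for segments) and the flag values (In, Out, Inline, Copy) are taken from the table in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; only the numeric values come from the patch.
namespace pib_sketch {

// Segment flag values after this patch (Shared, Push, Immutable removed).
enum SegmentFlags : std::uintptr_t {
    SegIn     = 0x01,  // data is copied/mapped from caller to callee
    SegOut    = 0x02,  // data is copied/mapped from callee to caller
    SegInline = 0x04,  // callee cannot change the length of an Out segment
    SegCopy   = 0x08,  // segment is permanently copied, read/write
};

// Header fields, as offsets in pointer-length words.
enum HeaderWord : std::size_t {
    BufferSize  = 0,
    CopySize    = 1,
    ObjlistPtr  = 2,
    ObjlistLen  = 3,
    NumSegments = 4,
};

// Fields within one four-word segment descriptor.
enum SegmentField : std::size_t { Ptr = 0, Len = 1, Flags = 2, Reserved = 3 };

constexpr std::size_t kFirstSegmentWord = 5;
constexpr std::size_t kWordsPerSegment  = 4;

// Word offset of a field of segment n, e.g. segment.ptr is at 5 + n*4.
inline std::size_t segment_word(std::size_t n, SegmentField field) {
    assert(field <= Reserved);
    return kFirstSegmentWord + n * kWordsPerSegment + field;
}

// Per the flag table: Out data is unmapped from the callee unless Copy is set.
inline bool unmapped_from_callee(std::uintptr_t flags) {
    return (flags & SegOut) != 0 && (flags & SegCopy) == 0;
}

}  // namespace pib_sketch
```

For example, segment_word(1, Flags) evaluates to 11, which matches the table's 7+n*4 for n = 1.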