Here are some more notes in a convenient to read form. I'll turn these into a documentation patch later. Rob, can you double-check this?
This document assumes you are familiar with the basics of DCOM. If you aren't read this first:
http://winehq.com/site/docs/wine-devel/dcom-1
This is not suitable study material for beginners. Don't say I didn't warn you.
* Apartments
Before a thread can use COM it must enter an apartment. Apartments are an abstraction of a COM objects thread safety level. There are many types of apartment but the only two we care about right now are single threaded apartments (STAs) and the multi-threaded apartment (MTA).
Any given process may contain at most one MTA and potentially many STAs. You enter an apartment by calling CoInitializeEx and passing the desired thread model in as a parameter. The default if you use the deprecated CoInitialize is a STA, and this is the most common type of apartment used in COM.
An object in the multi-threaded apartment may be accessed concurrently by multiple threads: eg, it's supposed to be entirely thread safe. It must also not care about thread-affinity, the object should react the same way no matter which thread is calling it.
An object inside a STA does not have to be thread safe, and all calls upon it should come from the same thread - the thread that entered the apartment in the first place.
The apartment system was originally designed to deal with the disparity between the Windows NT/C++ world in which threading was given a strong emphasis, and the Visual Basic world in which threading was barely supported and even if it had been fully supported most developers would not have used it. Visual Basic code is not truly multi-threaded, instead if you start a new thread you get an entirely new VM, with separate sets of global variables. Changes made in one thread do NOT reflect in another, which pretty much violates the expected semantics of multi- threading entirely but this is Visual Basic, so what did you expect? If you access a VB object concurrently from multiple threads, behind the scenes each VM runs in a STA and the calls are marshaled between the threads using DCOM.
In the Windows 2000 release of COM, several new types of apartment were added, the most important of which are RTAs (the rental threaded apartment) in which concurrent access are serialised by COM using an apartment-wide lock but thread affinity is not guaranteed.
* Structure of a marshaled interface pointer
When an interface is marshaled using CoMarshalInterface, the result is a serialized OBJREF structure. An OBJREF actually contains a union, but we'll be assuming the variant that embeds a STDOBJREF here which is what's used by the system provided standard marshaling. A STDOBJREF (standard object reference) consists of the magic signature 'MEOW', then some flags, then the IID of the marshaled interface. Quite what MEOW stands for is a mystery, but it's definitely not "Microsoft Extended Object Wire". Next comes the STDOBJREF flags, identified by their SORF_ prefix. Most of these are reserved, and their purpose (if any) is unknown, but a few are defined.
After the SORF flags comes a count of the references represented by this marshaled interface. Typically this will 1 in the case of a normal marshal, but may be 5 for table-strong marshals and 0 for table-weak marshals (the difference between these is explained below).
The most interesting part of a STDOBJREF is the OXID, OID, IPID triple. This triple identifies any given marshaled interface pointer in the network. OXIDs are apartment identifiers, and are supposed to be unique network-wide. How this is guaranteed is currently unknown: the original algorithm Windows used was something like the current UNIX time and a local counter.
OXIDs are generated and registered with the OXID resolver by performing local RPCs to the RPC subsystem (rpcss.exe). In a fully security-patched Windows system they appear to be randomly generated. This registration is done using the ILocalOxidResolver interface, however the exact structure of this interface is currently unknown.
OIDs are object identifiers, and identify a stub manager. The stub manager manages interface stubs. For each exported COM object there are multiple interfaces and therefore multiple interface stubs (IRpcStubBuffer implementations). OIDs are apartment scoped. Each ifstub is identified by an IPID, which identifies a marshaled interface pointer. IPIDs are apartment scoped.
Unmarshaling one of these streams therefore means setting up a connection to the object exporter (the apartment holding the marshaled interface pointer) and being able to send RPCs to the right ifstub. Each apartment has its own RPC endpoint and calls can be routed to the correct interface pointer by embedding the IPID into the call using RpcBindingSetObject. IRemUnknown, discussed below, uses a reserved IPID.
Both standard and handler marshaled OBJREFs contains an OXID resolver endpoint which is an RPC string binding in a DUALSTRINGARRAY. This is necessary because an OXID alone is not enough to contact the host, as it doesn't contain any network address data. Instead, the combination of the remote OXID resolver RPC endpoint and the OXID itself are passed to the local OXID resolver. It then returns the apartment string binding.
This step is an optimisation: technically the OBJREF itself could contain the string binding of the apartment endpoint and the OXID resolver could be bypassed, but by using this DCOM can optimise out a server round-trip by having the local OXID resolver cache the query results. The OXID resolver is a service in the RPC subsystem (rpcss.exe) which implements a raw (non object-oriented) RPC interface called IOXIDResolver. Despite the identical naming convention this is not a COM interface.
Unmarshaling an interface pointer stream therefore consists of reading the OXID, OID and IPID from the STDOBJREF, then reading one or more RPC string bindings for the remote OXID resolver. Then RpcBindingFromStringBinding is used to convert this remote string binding into an RPC binding handle which can be passed to the local IOXIDResolver::ResolveOxid implementation along with the OXID. The local OXID resolver consults its list of same-machine OXIDs, then its cache of remote OXIDs, and if not found does an RPC to the remote OXID resolver using the binding handle passed in earlier. The result of the query is stored for future reference in the cache, and finally the unmarshaling application gets back the apartment string binding, the IPID of that apartments IRemUnknown implementation, and a security hint (let's ignore this for now).
Once the remote apartments string binding has been located the unmarshalling process constructs an RPC Channel Buffer implementation with the connection handle and the IPID of the needed interface, loads and constructs the IRpcProxyBuffer implementation for that IID and connects it to the channel. Finally the proxy is passed back to the application.
* Handling IUnknown
There are some subtleties here with respect to IUnknown. IUnknown itself is never marshaled directly: instead a version of it optimised for network usage is used. IRemUnknown is similar in concept to IUnknown except that it allows you to add and release arbitrary numbers of references at once, and it also allows you to query for multiple interfaces at once.
IRemUnknown is used for lifecycle management, and for marshaling new interfaces on an object back to the client. Its definition can be seen in dcom.idl - basically the IRemUnknown::RemQueryInterface method takes an IPID and a list of IIDs, then returns STDOBJREFs of each new marshaled interface pointer.
There is one IRemUnknown implementation per apartment, not per stub manager as you might expect. This is OK because IPIDs are apartment not object scoped.
* Table marshaling
Normally once you have unmarshaled a marshaled interface pointer that stream is dead, you can't unmarshal it again. Sometimes this isn't what you want. In this case, table marshaling can be used. There are two types: strong and weak. In table-strong marshaling, selected by a specific flag to CoMarshalInterface, a stream can be unmarshaled as many times as you like. Even if all the proxies are released, the marshaled object reference is still valid. Effectively the stream itself holds a ref on the object. To release the object entirely so its server can shut down, you must use CoReleaseMarshalData on the stream.
In table-weak marshaling the stream can be unmarshaled many times, however the stream does not hold a ref. If you unmarshal the stream twice, once those two proxies have been released remote object will also be released. Attempting to unmarshal the stream at this point will yield CO_E_DISCONNECTED.
* RPC dispatch
Exactly how RPC dispatch occurs depends on whether the exported object is in a STA or the MTA. If it's in the MTA then all is simple: the RPC dispatch thread can temporarily enter the MTA, perform the remote call, and then leave it again. If it's in a STA things get more complex, because of the requirement that only one thread can ever access the object.
Instead, when entering a STA a hidden window is created implicitly by COM, and the user must manually pump the message loop in order to service incoming RPCs. The RPC dispatch thread performs the context switch into the STA by sending a message to the apartments window, which then proceeds to invoke the remote call in the right thread.
RPC dispatch threads are pooled by the RPC runtime. When an incoming RPC needs to be serviced, a thread is pulled from the pool and invokes the call. The main RPC thread then goes back to listening for new calls. It's quite likely for objects in the MTA to therefore be servicing more than one call at once.
* Message filtering and re-entrancy
When an outgoing call is made from a STA, it's possible that the remote server will re-enter the client, for instance to perform a callback. Because of this potential re-entrancy, when waiting for the reply to an RPC made inside a STA, COM will pump the message loop. That's because while this thread is blocked, the incoming callback will be dispatched by a thread from the RPC dispatch pool, so it must be processing messages.
While COM is pumping the message loop, all incoming messages from the operating system are filtered through one or more message filters. These filters are themselves COM objects which can choose to discard, hold or forward window messages. The default message filter drops all input messages and forwards the rest. This is so that if the user chooses a menu option which triggers an RPC, they then cannot choose that menu option *again* and restart the function from the beginning. That type of unexpected re-entrancy is extremely difficult to debug, so it's disallowed.
Unfortunately other window messages are allowed through, meaning that it's possible your UI will be required to repaint itself during an outgoing RPC. This makes programming with STAs more complex than it may appear, as you must be prepared to run all kinds of code any time an outgoing call is made. In turn this breaks the idea that COM should abstract object location from the programmer, because an object that was originally free-threaded and is then run from a STA could trigger new and untested codepaths in a program.
Oh well, it was nice in theory.
thanks -mike