COM `IStream` interface pointer and access from different threads - windows

Is it an official COM requirement for any IStream implementation to be thread-safe, in the sense of allowing concurrent calls to IStream methods through the same interface pointer from different threads?
I am not talking about data integrity (normally, reads/writes/seeks should be synchronized with locks anyway). The question is about whether the COM marshaller must be used to pass an IStream object to a thread in a different COM apartment.
This is a more general question than I asked about IStream as returned by CreateStreamOnHGlobal, please refer there for more technical details. I'm just trying to understand this stuff better.
EDITED, I have found this info on MSDN:
Thread safety. The stream created by SHCreateMemStream is thread-safe
as of Windows 8. On earlier systems, the stream is not thread-safe.
The stream created by CreateStreamOnHGlobal is thread-safe.
Now I believe, the IStream object returned by CreateStreamOnHGlobal is thread-safe, but there is NO requirement that other IStream implementations should follow this.

No, it isn't. And the accepted answer to the other question is dead wrong. Hans Passant's answer is correct. You should delete this question because it presupposes a falsehood, namely that CreateStreamOnHGlobal returns a thread-safe IStream. It doesn't. You then ask if this is true of other IStream implementations. It isn't.
In computer programming generally, and COM in particular, objects have guarantees they give and guarantees they do not give. If you use an object in conformance with its guarantees, then it will work all the time (barring bugs). If you exceed the guarantees, it may still work most of the time, but this is no longer guaranteed.
Generally in COM, the thread-safety guarantee is given by one of the standard threading models.
See here: http://msdn.microsoft.com/en-us/library/ms809971.aspx
Apartment threaded objects can be instantiated on multiple threads, but can only be used from the particular thread they were instantiated on.
Multi-threaded apartment objects can be instantiated in a multi-threaded apartment and can be used from any of those threads.
"Both"-threaded objects can be instantiated in any thread, and used from any thread.
Note: The threading model belongs to the object, not the interface. Some objects supporting IStream may be single-threaded, others may be fully thread-safe. This depends on the code which implements the interface, because an interface is just a specification and thread-safety is not something it covers.
It is always harmless to marshal an interface. If the threading models of the threads are compatible with the object's home thread, you will get the exact same interface pointer back. If they are not compatible, you will get a proxy. But it never hurts to marshal, and unless you know that the objects are compatible, you should always marshal.
However it is always open to an implementer to give additional guarantees.
In the case of CoMarshalInterthreadInterfaceInStream, you are told in the documentation that the returned IStream interface can be used for unmarshalling at the destination thread, using CoUnmarshalInterfaceAndReleaseStream.
That is, you have been given an additional guarantee. So you can rely on that working.
But that does not apply to any other instance of IStream at any time.
So you should always marshal them.

Related

How does COM avoid deadlocks when 2 objects call into each other?

Let's say there are two apartment-threaded COM objects, located in different apartments. Or maybe they're in different processes altogether. If one object calls a method on another, which in turn calls a method back on the first object, how does COM prevent the whole thing from deadlocking?
What you describe is called reentrancy.
The truth is that COM doesn’t do anything explicit to prevent reentrancy issues. It’s up to the implementer of each object to take precautions where needed, as applicable.
Funny enough, reentrancy in COM is far less common in real life than you would think. Object graphs in COM tend to be mostly trees, which do not exhibit reentrancy. When you have cycles it’s almost always because of objects exposing event-type functionality of some sort, typically Connection Points.
Event callbacks are very limited in scope and they trigger under the explicit control of each object’s code, so the programmer is able to easily time them so they occur at safe places (for example at/near the end of a method’s body after all the real work is done). This prevents serious reentrancy issues from developing.
But nothing stops you from coding something dangerous. For example, if an object triggers an event while its internal object state is inconsistent, all bets are off.
You mention deadlocks. Deadlocks require a locking mechanism of some sort (for example a Critical Section) and should be extremely rare to impossible in COM apartments for the reasons listed above. Any object that triggers an event while holding a lock is asking for serious trouble, and a deadlock is not the biggest of its worries: by virtue of being an STA object the reentrant call will run on the same thread, and it will be able to acquire the locks again and proceed right through, which means it’s very likely that the object will corrupt its internal state, cause a crash, or worse. Note that locks in an STA thread only make sense if the resources controlled by the lock are accessible to threads outside the object’s STA.
And finally, nothing in COM stops you from causing an infinite recursion loop and subsequent stack overflow either. For example, take two COM objects Obj1 and Obj2, with Obj2 implementing an event. We can have Obj1 call pObj2->SomeMethod(…), which causes Obj2 to fire the event; then have Obj1 listen to ("sink") that event, and have that event handler call SomeMethod() again.
UPDATE:
Profound thanks to Remy Lebeau for pointing out in his comment something I had forgotten to discuss, via a link to the CodeGuru article Understanding COM Apartments, Part I. In the process I also learned something new that I should have known about.
There is one aspect of reentrancy and locking to consider, and that is what happens during inter-apartment calls (STA<->STA, STA<->MTA, or even STA<->out-of-process). During an inter-apartment call the caller's STA thread needs to stall and wait for an answer to the call request; the response cannot (by definition) execute on the same thread. But it can't just fully block (e.g. with WaitForSingleObject) while waiting for the response, because the thread needs to be able to respond to and process not only potential callbacks to the original object, but also callbacks to any other object inside the same apartment. If it were to fully block, the COM infrastructure itself would be introducing the potential for a deadlock, and you wouldn't even need a dependency cycle between objects. So the COM marshalling infrastructure uses a more complex form of wait that can unblock for a few other situations (Hans Passant points to CoWaitForMultipleHandles, which looks right to me, but I don't know the infrastructure to that level). If an applicable callback occurs, the marshalling infrastructure will unblock and allow that call to enter the apartment and proceed.
This is a form of locking induced by the COM infrastructure itself, rather than one coded explicitly as part of the object's implementation, which is why I hadn't thought of bringing it up. So COM does in fact "do something to prevent deadlocks", but to prevent deadlock potentials induced by its own infrastructure.
The part that I hadn't consciously realized was that this mechanism is very selective. It only lets through COM calls that form part of the same causality chain, that is, a callback, a direct consequence of the call that the thread was waiting on. Other COM calls into the apartment have to queue up and wait for that call chain to conclude, and for the STA thread to return to the thread's message loop. [1]
[1] It makes complete sense that it needs to be that way, but I don't think I ever realized it.

Confused about performance implications of Sync

I have a question about the marker trait Sync after reading Extensible Concurrency with the Sync and Send Traits.
Java's "synchronized" means blocking, so I was very confused about how a Rust struct that implements Sync could be efficient when its methods are executed on multiple threads.
I searched but found no meaningful answer. I'm thinking about it this way: every thread will get the struct's reference synchronously (blocking), but call the method in parallel, is that true?
Java: Accesses to this object from multiple threads become a synchronized sequence of actions when going through this codepath.
Rust: It is safe to access this type synchronously through a reference from multiple threads.
(The two points above are not canonical definitions, they are just demonstrations how similar words can be used in sentences to obtain different meanings)
Java's synchronized is implemented as a mutual exclusion lock at runtime. Rust's Sync is a compile-time promise about runtime properties of a specific type, which allows other types to depend on those properties through trait bounds. A Mutex just happens to be one way to provide Sync behavior. Immutable types usually provide this behavior too, without any runtime cost.
Generally you shouldn't rely on words having exactly the same meaning in different contexts. Java IO stream != java collection stream != RxJava reactive stream ~= tokio Stream. C volatile != java volatile. etc. etc.
Ultimately the prose matters a lot more than the keywords, which are just shorthand.

Go destructors?

I know there are no destructors in Go since technically there are no classes. As such, I use initClass to perform the same functions as a constructor. However, is there any way to create something to mimic a destructor in the event of a termination, for the use of, say, closing files? Right now I just call defer deinitClass, but this is rather hackish and I think a poor design. What would be the proper way?
In the Go ecosystem, there exists a ubiquitous idiom for dealing with objects which wrap precious (and/or external) resources: a special method designated for freeing that resource, called explicitly — typically via the defer mechanism.
This special method is typically named Close(), and the user of the object has to call it explicitly when they're done with the resource the object represents. The io standard package does even have a special interface, io.Closer, declaring that single method. Objects implementing I/O on various resources such as TCP sockets, UDP endpoints and files all satisfy io.Closer, and are expected to be explicitly Closed after use.
Calling such a cleanup method is typically done via the defer mechanism which guarantees the method will run no matter if some code which executes after resource acquisition will panic() or not.
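As a minimal sketch of that guarantee, the following program defers Close() and then panics. The resource type and the useResource function are hypothetical names made up for this illustration; only io.Closer, defer, panic and recover come from Go itself.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// resource is a hypothetical stand-in for an object wrapping an external resource.
type resource struct {
	closed bool
}

// Close marks the resource as released.
func (r *resource) Close() error {
	r.closed = true
	return nil
}

// Compile-time check that *resource satisfies io.Closer.
var _ io.Closer = (*resource)(nil)

// useResource panics after acquiring the resource. Deferred calls run in
// last-in, first-out order, so Close runs first and the recover handler second.
func useResource(r *resource) (err error) {
	defer func() {
		if p := recover(); p != nil {
			err = fmt.Errorf("recovered: %v", p)
		}
	}()
	defer r.Close()
	panic(errors.New("boom"))
}

func main() {
	r := &resource{}
	err := useResource(r)
	fmt.Println(r.closed, err) // prints "true recovered: boom"
}
```

The recover handler is only there so the example can report the result; the deferred Close() would run during the panic even without it.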
You might also notice that not having implicit "destructors" quite balances not having implicit "constructors" in Go. This actually has nothing to do with not having "classes" in Go: the language designers just avoid magic as much as practically possible.
Note that Go's approach to this problem might appear to be somewhat low-tech, but in fact it's the only workable solution for a runtime featuring garbage collection. In a language with objects but without GC, say C++, destructing an object is a well-defined operation because an object is destroyed either when it goes out of scope or when delete is called on its memory block. In a runtime with GC, the object will be destroyed at some mostly indeterminate point in the future by the GC scan, and may not be destroyed at all. So if the object wraps some precious resource, that resource might get reclaimed way past the moment in time the last live reference to the enclosing object was lost, and it might not get reclaimed at all, as has been well explained by @twotwotwo in their answer.
Another interesting aspect to consider is that Go's GC is fully concurrent (with the regular program execution). This means a GC thread which is about to collect a dead object might be (and usually will be) a different thread from the one(s) which executed that object's code when it was alive. In turn, this means that if Go types could have destructors, the programmer would need to make sure whatever code the destructor executes is properly synchronized with the rest of the program, if the object's state affects some data structures external to it. This actually might force the programmer to add such synchronization even if the object does not need it for its normal operation (and most objects fall into that category). And think about what happens if those external data structures happen to be destroyed before the object's destructor is called (the GC collects dead objects in a non-deterministic way). In other words, it's much easier to control, and to reason about, object destruction when it is explicitly coded into the program's flow: both for specifying when the object has to be destroyed, and for guaranteeing proper ordering of its destruction with regard to the destruction of the data structures external to it.
If you're familiar with .NET, it deals with resource cleanup in a way which resembles that of Go quite closely: your objects which wrap some precious resource have to implement the IDisposable interface, and a method, Dispose(), exported by that interface, must be called explicitly when you're done with such an object. C# provides some syntactic sugar for this use case via the using statement which makes the compiler arrange for calling Dispose() on the object when it goes out of the scope declared by the said statement. In Go, you'll typically defer calls to cleanup methods.
One more note of caution. Go wants you to treat errors very seriously (unlike most mainstream programming languages with their "just throw an exception and don't give a fsck about what happens due to it elsewhere and what state the program will be in" attitude) and so you might consider checking error returns of at least some calls to cleanup methods.
A good example is instances of the os.File type representing files on a filesystem. The fun stuff is that calling Close() on an open file might fail due to legitimate reasons, and if you were writing to that file this might indicate that not all the data you wrote to that file had actually landed in it on the file system. For an explanation, please read the "Notes" section in the close(2) manual.
In other words, just doing something like
fd, err := os.Open("foo.txt")
if err != nil { return err } // check the open error before deferring Close
defer fd.Close()
is okay for read-only files in 99.9% of cases, but for files opened for writing, you might want to implement more involved error checking and some strategy for dealing with the errors (mere reporting, wait-then-retry, ask-then-maybe-retry or whatever).
runtime.SetFinalizer(ptr, finalizerFunc) sets a finalizer--not a destructor but another mechanism to maybe eventually free up resources. Read the documentation there for details, including downsides. They might not run until long after the object is actually unreachable, and they might not run at all if the program exits first. They also postpone freeing memory for another GC cycle.
If you're acquiring some limited resource that doesn't already have a finalizer, and the program would eventually be unable to continue if it kept leaking, you should consider setting a finalizer. It can mitigate leaks. Unreachable files and network connections are already cleaned up by finalizers in the stdlib, so it's only other sorts of resources where custom ones can be useful. The most obvious class is system resources you acquire through syscall or cgo, but I can imagine others.
Finalizers can help get a resource freed eventually even if the code using it omits a Close() or similar cleanup, but they're too unpredictable to be the main way to free resources. They don't run until GC does. Because the program could exit before next GC, you can't rely on them for things that must be done, like flushing buffered output to the filesystem. If GC does happen, it might not happen soon enough: if a finalizer is responsible for closing network connections, maybe a remote host hits its limit on open connections to you before GC, or your process hits its file-descriptor limit, or you run out of ephemeral ports, or something else. So it's much better to defer and do cleanup right when it's necessary than to use a finalizer and hope it's done soon enough.
You don't see many SetFinalizer calls in everyday Go programming, partly because the most important ones are in the standard library and mostly because of their limited range of applicability in general.
In short, finalizers can help by freeing forgotten resources in long-running programs, but because not much about their behavior is guaranteed, they aren't fit to be your main resource-management mechanism.
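A minimal sketch of that division of labour follows: an explicit Close() is the main mechanism and a finalizer is only a backstop. The handle type and its fields are hypothetical stand-ins for a real limited resource; runtime.SetFinalizer and sync.Once are the only library facilities used.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// handle is a hypothetical wrapper around some limited external resource.
type handle struct {
	once     sync.Once
	released bool
}

// release frees the underlying resource at most once.
func (h *handle) release() {
	h.once.Do(func() { h.released = true })
}

// newHandle acquires the resource and registers a finalizer as a backstop.
// The finalizer may run late, or never if the program exits first.
func newHandle() *handle {
	h := &handle{}
	runtime.SetFinalizer(h, (*handle).release)
	return h
}

// Close is the primary cleanup path. It clears the finalizer so the
// garbage collector has no extra work to do for this object.
func (h *handle) Close() error {
	runtime.SetFinalizer(h, nil)
	h.release()
	return nil
}

func main() {
	h := newHandle()
	h.Close()
	fmt.Println(h.released) // prints "true"
}
```

Callers are still expected to defer Close(); the finalizer only limits the damage when somebody forgets.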
There are Finalizers in Go. I wrote a little blog post about it. They are even used for closing files in the standard library as you can see here.
However, I think using defer is preferable because it's more readable and less magical.

What is the difference Between Singleton object and sessionfactory singleton object

As far as I know, a singleton object is not thread-safe, but the SessionFactory singleton object is thread-safe.
How is this possible? Could someone please explain?
The singleton pattern is neither thread-safe nor not thread-safe per se. You have to take a look at your specific implementation. The major question is, does it manage state?
If so, then you have to make sure that no more than one thread is ever allowed to change the state at the same time. That is the same problem global variables suffer from regarding thread-safety. But there are mechanisms to ensure this safety; one is called mutual exclusion. Two threads concurrently modifying the same variable is one of the problematic events, and there are more to be aware of, like two threads sequentially modifying a variable, where the question is whose result counts.
Mutually exclusive events in general and a specific explanation in the java context can be found here (Mutually exclusive events) and here (Oracle concurrency guide) respectively. Global variables are explained here. Stateless and stateful are also good terms to look at regarding concurrency, parallelism and thread-safety.
Back to your question: A factory usually doesn't introduce any state and thus can be shared freely between several threads. Instances produced by the factory most probably are stateful and should only be shared between threads after having been made thread-safe.
Important Note:
But don't get me wrong here. Don't forget to always check the implementation of your singletons! In Java you can introduce annotations to document your investigations and mark specific code elements as thread-safe. There are packages which already define commonly usable annotations to document such behavior; take a look at Apache's org.apache.http.annotation. When you use an API, it is a good idea to inspect the documentation for such hints.
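The mutual-exclusion point can be sketched in a few lines. The example is written in Go; the same idea applies to a Java singleton guarded by synchronized or a lock. The registry type, getRegistry and hammer are hypothetical names invented for the illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stateful singleton.
type registry struct {
	mu    sync.Mutex
	count int
}

var (
	instance *registry
	once     sync.Once
)

// getRegistry returns the single shared instance. sync.Once makes the
// lazy initialization itself safe under concurrent callers.
func getRegistry() *registry {
	once.Do(func() { instance = &registry{} })
	return instance
}

// increment changes shared state, so it takes the mutex.
func (r *registry) increment() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.count++
}

// value reads shared state under the same mutex.
func (r *registry) value() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.count
}

// hammer increments the singleton from n goroutines and returns the counter.
func hammer(n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			getRegistry().increment()
		}()
	}
	wg.Wait()
	return getRegistry().value()
}

func main() {
	fmt.Println(hammer(1000)) // prints "1000"
}
```

Without the mutex, the concurrent count++ operations would race and the final value could be lower than the number of goroutines. A singleton with no mutable state needs none of this.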
The session factory object is also implemented using the singleton design pattern.
The singleton design pattern can be made thread-safe, and that is how the singleton has been implemented for the session factory.
When we implement a singleton, we should decide whether it needs to be thread-safe and implement it accordingly.
See the various implementations of the singleton in my blog under design patterns:
java guide

Implementation of MTA COM server

I can't find any source code on the prerequisites of an MTA compliant COM. I tried changing the ThreadingModel registry key of my object from Apartment to Both, and it results in a crash when a secondary thread calls the method before any data is accessed.
If STA COMs require a message pump, what kind of plumbing code do MTA COM objects require?
I do not think that there is anything special about MTA, except that you need to use synchronization primitives like mutexes to synchronize access to your internal structures. Does the "Multithreaded Apartments" documentation not give you all that you need?
Quoting from the documentation, emphasis is mine:
Because calls to objects are not serialized in any way, multithreaded object concurrency offers the highest performance and takes the best advantage of multiprocessor hardware for cross-thread, cross-process, and cross-machine calling. This means, however, that the code for objects must provide synchronization in their interface implementations, typically through the use of synchronization primitives such as event objects, critical sections, mutexes, or semaphores, which are described later in this section. In addition, because the object doesn't control the lifetime of the threads that are accessing it, no thread-specific state may be stored in the object (in thread local storage).

Resources