COM in .NET world. Thread affinity

August 17, 2018

Posted in: .NET, COM

Another thing a developer should be aware of when consuming a COM interface is the apartment. It is quite a long topic and I will not explain it in detail. For today’s discussion let’s talk about STA. If you never use a COM interface in .NET from different threads, you can ignore this post.

STA means single-threaded apartment. Normally the main thread of a desktop UI application is an STA thread. A COM interface that arrives in the .NET runtime will usually be marked as STA. This means that the COM interface will be used only from the thread it was originally passed to. This is called thread affinity. There are ways to change that behavior, but that is a topic for later posts.
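To see which apartment a thread belongs to, you can ask the runtime directly. The following is a minimal sketch, assuming a Windows console application; the class name `ApartmentDemo` and the helper `Describe` are made up for illustration.

```csharp
using System;
using System.Threading;

static class ApartmentDemo
{
    // Builds a one-line description of the calling thread and its apartment.
    public static string Describe(string name) =>
        $"{name}: managed thread {Environment.CurrentManagedThreadId}, " +
        $"apartment {Thread.CurrentThread.GetApartmentState()}";

    [STAThread] // asks the runtime to initialize the main thread as STA
    static void Main()
    {
        Console.WriteLine(Describe("main"));

        // The apartment of a thread has to be chosen before the thread starts.
        var staWorker = new Thread(() => Console.WriteLine(Describe("sta worker")));
        staWorker.SetApartmentState(ApartmentState.STA); // Windows only
        staWorker.Start();
        staWorker.Join();

        // A thread started without SetApartmentState gets the default,
        // which is normally MTA (multithreaded apartment).
        var defaultWorker = new Thread(() => Console.WriteLine(Describe("default worker")));
        defaultWorker.Start();
        defaultWorker.Join();
    }
}
```

Printing the same line just before a COM call that misbehaves is a cheap way to check which thread and which apartment the call is really coming from.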

As a result, when you call any function on an STA COM interface and the current thread is not the one that originally received that interface, the .NET runtime will attempt to switch to that thread.

If the interface was originally passed to the main thread, then .NET is able to switch to the main thread via its message loop, execute the function call there, and return to the original thread. But there are a few issues with that approach:

The function call executes slower because the .NET runtime has to switch to a different thread and back. If you make a call once per second or even less often, that is totally fine and you should not worry about it. But if you make many calls, it will be much slower.
If some functions of that COM interface take a long time to execute, you might think that running them on a different thread will keep your app responsive. But all calls still execute on the main thread anyway, so it does not help, and the application will still be unresponsive.
Also, if you are trying to do some work in parallel and have created a few background threads, that will not work either, because all calls are serialized onto the main thread.
If the main thread is blocked for some reason, the function call stalls and waits until the main thread is able to process it. In some scenarios this can also create deadlocks.
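The effect of these issues can be reproduced without any COM at all. Below is a toy sketch, not the real COM marshaling code: a hypothetical `FakeApartment` class owns one thread that drains a queue of calls, which is roughly what a message loop does for an STA. Four workers each make a slow call "in parallel", yet every call runs on the owner thread, one after another.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Toy stand-in for an STA: a single owner thread executes all queued calls.
sealed class FakeApartment : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _owner;

    public FakeApartment()
    {
        _owner = new Thread(() =>
        {
            // Plays the role of the message loop: run calls one at a time.
            foreach (var call in _queue.GetConsumingEnumerable()) call();
        });
        _owner.IsBackground = true;
        _owner.Start();
    }

    public int OwnerThreadId => _owner.ManagedThreadId;

    // Runs func on the owner thread and blocks the caller until it finishes.
    public T Invoke<T>(Func<T> func)
    {
        var done = new TaskCompletionSource<T>();
        _queue.Add(() =>
        {
            try { done.SetResult(func()); }
            catch (Exception ex) { done.SetException(ex); }
        });
        return done.Task.GetAwaiter().GetResult();
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _owner.Join();
        _queue.Dispose();
    }
}

static class Program
{
    static void Main()
    {
        using (var apartment = new FakeApartment())
        {
            var clock = Stopwatch.StartNew();

            // Four workers, each making one slow (100 ms) call.
            var tasks = Enumerable.Range(0, 4).Select(_ => Task.Run(() =>
                apartment.Invoke(() =>
                {
                    Thread.Sleep(100);
                    return Environment.CurrentManagedThreadId;
                }))).ToArray();
            Task.WaitAll(tasks);

            // Every call ran on the owner thread, so nothing ran in parallel.
            bool allOnOwner = tasks.All(t => t.Result == apartment.OwnerThreadId);
            Console.WriteLine($"all calls on owner thread: {allOnOwner}");
            Console.WriteLine($"elapsed: {clock.ElapsedMilliseconds} ms (roughly 4 x 100 ms, not 100 ms)");
        }
    }
}
```

The same sketch shows the blocking problem: if the owner thread is busy or waiting on something, every `Invoke` caller waits with it, and if the owner thread itself waits on one of those callers, nobody can make progress.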

But if the interface was originally passed to a different thread, then the behavior depends on many factors. With simple COM objects you get an invalid cast exception. Sometimes, if the COM object implements IMarshal or INoMarshal, it can work correctly. Sometimes you get strange errors because .NET is not able to switch to that thread, or because that thread has already been destroyed.

Keep in mind that .NET holds a reference to a COM interface for quite some time. This can create quite interesting scenarios. Let’s assume you have code on the main thread that periodically uses the IMyIntf COM interface and a worker thread that periodically uses the same COM interface. Now let’s go through some scenarios.

Scenario 1. At moment A the main thread gets the COM interface IMyIntf and calls a function on it. No IMyIntf has been passed to the .NET side before, so .NET creates an RCW (runtime callable wrapper) and a native object and executes the call. Then at moment A+1 a background thread gets the COM interface IMyIntf and calls a function on it. Assume that .NET has not collected the native object and the RCW in the meantime. In this case the .NET runtime returns the original native object, which is “linked” to the main thread, so .NET switches to the main thread, executes the call, and returns. Everything looks correct; it just works slower than expected.

Scenario 2. At moment A the main thread gets the COM interface IMyIntf and calls a function on it. No IMyIntf has been passed to the .NET side before, so .NET creates an RCW and a native object and executes the call. Then at moment A+1 a background thread gets the COM interface IMyIntf and calls a function on it. Assume that .NET has collected the native object and the RCW in the meantime. In this case the .NET runtime creates a new RCW and native object and executes the call without a thread switch, and everything is fine.

Scenario 3. At moment A a background thread gets the COM interface IMyIntf and calls a function on it. In this case the .NET runtime creates a new RCW and native object and executes the call. Then at moment A+1 the main thread gets the COM interface IMyIntf and calls a function on it. Assume that .NET has not collected the native object and the RCW for that interface. In this case the .NET runtime uses the same native object and RCW and attempts to switch to the worker thread to execute the call. Usually this fails, and an exception is thrown.
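The three scenarios can be summarized with a small toy model. This is only an illustration of the behavior described above, not how the runtime is actually implemented: the class `ToyRcwCache`, its methods, and the thread numbers are all invented for the example, and garbage collection is modeled by an explicit `Collect` call.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: remembers which thread first received each interface.
sealed class ToyRcwCache
{
    // COM identity -> id of the thread the wrapper is "linked" to.
    private readonly Dictionary<string, int> _ownerByIdentity = new Dictionary<string, int>();

    // Called when an interface arrives on callerThread.
    // Returns the thread the call would have to execute on.
    public int Enter(string identity, int callerThread)
    {
        if (!_ownerByIdentity.TryGetValue(identity, out int owner))
        {
            owner = callerThread; // first sighting: new wrapper, linked to the caller
            _ownerByIdentity[identity] = owner;
        }
        return owner;
    }

    // Models the GC collecting the wrapper.
    public void Collect(string identity) => _ownerByIdentity.Remove(identity);
}

static class Scenarios
{
    const int MainThread = 1, Worker = 2;

    public static string Describe(int owner, int caller) =>
        owner == caller
            ? "direct call"
            : $"must switch from thread {caller} to thread {owner}";

    static void Main()
    {
        // Scenario 1: main thread first, wrapper still alive when the worker calls.
        var cache = new ToyRcwCache();
        cache.Enter("IMyIntf", MainThread);
        Console.WriteLine(Describe(cache.Enter("IMyIntf", Worker), Worker));
        // prints "must switch from thread 2 to thread 1" (works, but slower)

        // Scenario 2: main thread first, wrapper collected before the worker calls.
        cache = new ToyRcwCache();
        cache.Enter("IMyIntf", MainThread);
        cache.Collect("IMyIntf");
        Console.WriteLine(Describe(cache.Enter("IMyIntf", Worker), Worker));
        // prints "direct call"

        // Scenario 3: worker first, wrapper still alive when the main thread calls.
        cache = new ToyRcwCache();
        cache.Enter("IMyIntf", Worker);
        Console.WriteLine(Describe(cache.Enter("IMyIntf", MainThread), MainThread));
        // prints "must switch from thread 1 to thread 2" (usually fails in practice)
    }
}
```

The only difference between scenario 1 and scenario 2 is whether the wrapper was collected in between, which is exactly the part you do not control.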

So, depending on your scenario, timing, and the GC, it can sometimes work and sometimes fail. It is quite hard to debug and even harder to understand what is going on. And if you add more background threads to the picture and longer time spans between calls, it becomes really interesting to debug.