A lot of network exceptions in .NET 6

A few weeks after we converted our application to .NET, I received an email stating that our application was throwing a lot of network exceptions during normal work. None of them were visible to the customer; there were just a lot of exceptions.

Naturally, I started investigating, and after some time I found a way to reproduce them. All I needed was to enable breaking on all exceptions in Visual Studio and tell our application to establish a network connection to any server. After some time, execution stopped with a System.IO.IOException and the message: “Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..”
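The same kind of handled exception can also be counted without a debugger by subscribing to the AppDomain.CurrentDomain.FirstChanceException event. What follows is only a minimal sketch: FirstChanceCounter and its members are hypothetical names made up for illustration, and the IOException thrown in Main merely simulates the real network exception.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Runtime.ExceptionServices;

// Hypothetical helper: counts first-chance exceptions by type name.
public static class FirstChanceCounter
{
    private static readonly ConcurrentDictionary<string, int> Counts =
        new ConcurrentDictionary<string, int>();

    // Subscribe once, early in application startup.
    public static void Install()
    {
        AppDomain.CurrentDomain.FirstChanceException += OnFirstChance;
    }

    // Runs for every thrown exception, including ones that are caught later.
    // Keep this handler trivial: it must not throw.
    private static void OnFirstChance(object sender, FirstChanceExceptionEventArgs e)
    {
        var type = e.Exception.GetType();
        Counts.AddOrUpdate(type.FullName ?? type.Name, 1, (_, n) => n + 1);
    }

    // Number of first-chance exceptions of exactly type T seen so far.
    public static int CountOf<T>() where T : Exception
    {
        var type = typeof(T);
        return Counts.TryGetValue(type.FullName ?? type.Name, out var n) ? n : 0;
    }
}

public static class FirstChanceDemo
{
    public static void Main()
    {
        FirstChanceCounter.Install();

        // Simulated exception; it is caught, so the user would never see it.
        try { throw new IOException("simulated network failure"); }
        catch (IOException) { }

        Console.WriteLine(
            "IOException count: " + FirstChanceCounter.CountOf<IOException>());
    }
}
```

In a real application the counts could be logged periodically, which shows how often such exceptions occur in normal work even though nothing surfaces to the customer.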

Initially, I thought it was some kind of bug in our code, but none of our code was in the call stack. The call stack looked like this:

System.Net.Sockets.dll!System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException
System.Net.Sockets.dll!System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<int>.GetResult
System.Private.CoreLib.dll!System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable<int>.ConfiguredValueTaskAwaiter.GetResult() Line 130
System.Net.Security.dll!System.Net.Security.SslStream.ReadAsyncInternal<System.Net.Security.AsyncReadWriteAdapter>(…)
[Resuming Async Method]                        Annotated Frame
System.Private.CoreLib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext executionContext, System.Threading.ContextCallback callback, object state) Line 137
System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder<int>.AsyncStateMachineBox<System.Net.Security.SslStream.<ReadAsyncInternal>d__188<System.Net.Security.AsyncReadWriteAdapter>>.MoveNext(…) Line 134
System.Private.CoreLib.dll!System.Threading.ThreadPool..cctor.AnonymousMethod__87_0(object state) Line 26
System.Net.Sockets.dll!System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.InvokeContinuation(System.Action<object> continuation, object state, bool forceAsync, bool requiresExecutionContextFlow)
System.Net.Sockets.dll!System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.OnCompleted(System.Net.Sockets.SocketAsyncEventArgs _)
System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs.OnCompletedInternal()
System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs.HandleCompletionPortCallbackError(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped)
System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs..cctor.AnonymousMethod__179_0(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped)
System.Private.CoreLib.dll!System.Threading.ThreadPoolBoundHandleOverlapped.CompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* nativeOverlapped) Line 41
System.Private.CoreLib.dll!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(uint errorCode, uint numBytes, System.Threading.NativeOverlapped* pNativeOverlapped) Line 36
[Async Call Stack]                         Annotated Frame
[Async] System.Net.Http.dll!System.Net.Http.HttpConnection.CheckUsabilityOnScavenge.__ReadAheadWithZeroByteReadAsync|45_0()

The name CheckUsabilityOnScavenge was a hint about what was going on. So I put a breakpoint on this function and let the application run until it stopped again with the following call stack:

System.Net.Http.dll!System.Net.Http.HttpConnection.CheckUsabilityOnScavenge() Line 1226
System.Net.Http.dll!System.Net.Http.HttpConnectionPool.CleanCacheAndDisposeIfUnused.__IsUsableConnectio(…) in Line 1668
System.Net.Http.dll!System.Net.Http.HttpConnectionPool.CleanCacheAndDisposeIfUnused.__ScavengeConnectionList|116_1<System.Net.Http.HttpConnection>(…) Line 1681
System.Net.Http.dll!System.Net.Http.HttpConnectionPool.CleanCacheAndDisposeIfUnused() Line 1624
System.Net.Http.dll!System.Net.Http.HttpConnectionPoolManager.RemoveStalePools() Line 387
System.Net.Http.dll!System.Net.Http.HttpConnectionPoolManager..ctor.AnonymousMethod__11_0(…) Line 139

Then I checked the .NET source code here: https://source.dot.net/#System.Net.Http/System/Net/Http/SocketsHttpHandler/HttpConnectionPoolManager.cs,1a3682da225e4555

Check line #106 and thisRef.RemoveStalePools(). By now you probably have an idea of what is going on: the class name HttpConnectionPoolManager tells you that it is some kind of connection pooling. Please note that the code behind the link may look different from what you have, because the link shows the latest version. For example, the .NET 6 version looks slightly different.

Then I decided to check my assumption. This is very important: it is not hard to come up with a plausible explanation, but it still needs to be verified. So I loaded that assembly into dnSpy (a great tool), put a breakpoint on the IF statement, and started our application.

Each time execution stopped in HttpConnectionPoolManager, I used “Set Next Statement” to move execution past the IF statement. It took quite some time because there were a lot of connections. Then I waited and waited, and I didn’t see a single System.IO.IOException.

Great, at least I found out why it happened. But there is still a possibility that there is something wrong with our code that triggers that issue. So I created a simple program to check my assumptions. Here is the source code:

using System;
using System.Net.Http;
using System.Threading.Tasks;

namespace ConsoleApp5
{
    internal class Program
    {
        static async Task Main(string[] args)
        {
            // Create and dispose a fresh handler and client on every outer iteration.
            for (var j = 0; j < 10; j++)
            {
                // Short lifetime and idle timeout so pooled connections expire quickly.
                using (var socketsHandler = new SocketsHttpHandler
                {
                    PooledConnectionLifetime = TimeSpan.FromSeconds(5),
                    PooledConnectionIdleTimeout = TimeSpan.FromSeconds(5),
                    MaxConnectionsPerServer = 10
                })
                using (var client = new HttpClient(socketsHandler))
                {
                    for (var i = 0; i < 5; i++)
                    {
                        // The response content is not needed; dispose it at the end of the iteration.
                        using var response = await client.GetAsync("https://www.google.com");
                        await Task.Delay(TimeSpan.FromSeconds(2));
                    }
                }
            }
        }
    }
}

This test produced exactly the same exceptions. Then I found some explanations about connection pooling in Microsoft's documentation.

It looks like I need to use HttpClientFactory, but we talk to different servers and they use different cookie containers. On top of that, WCF, which we still use, creates its own HttpClient instances and we have no control over them. At least I didn’t find a way to control them after some research. Anyway, that will be a different story.
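For the part that is under our control, one alternative mentioned in Microsoft's HttpClient guidance is to keep a long-lived HttpClient per server instead of creating and disposing clients repeatedly. Below is a minimal sketch under that assumption: PerServerHttpClients is a hypothetical helper, the 15-minute lifetime is an arbitrary example value, and whether this actually reduces the number of exceptions would have to be measured. It does nothing for the clients that WCF creates internally.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Net.Http;

// Hypothetical helper: one long-lived HttpClient per server,
// each with its own cookie container.
public static class PerServerHttpClients
{
    private static readonly ConcurrentDictionary<string, Lazy<HttpClient>> Clients =
        new ConcurrentDictionary<string, Lazy<HttpClient>>(StringComparer.OrdinalIgnoreCase);

    // Returns the shared client for the scheme + host + port of the given address.
    public static HttpClient For(Uri address)
    {
        var key = address.GetLeftPart(UriPartial.Authority);
        // Lazy<T> makes sure only one client is ever created per key.
        return Clients.GetOrAdd(
            key,
            k => new Lazy<HttpClient>(() => Create(new Uri(k)))).Value;
    }

    private static HttpClient Create(Uri baseAddress)
    {
        var handler = new SocketsHttpHandler
        {
            // Cookies stay separate because every server gets its own handler.
            UseCookies = true,
            CookieContainer = new CookieContainer(),
            // Example value (an assumption): recycle connections periodically
            // so DNS changes are still picked up.
            PooledConnectionLifetime = TimeSpan.FromMinutes(15)
        };
        return new HttpClient(handler) { BaseAddress = baseAddress };
    }
}
```

A caller would then write something like PerServerHttpClients.For(new Uri("https://www.google.com")).GetAsync("/") and never dispose the returned client.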

I hope it helps someone.