In this post I will describe my investigation of a handle leak and how it led to an interesting discovery in the MemoryCache class.
I received a few reports stating that our application was consuming handles even when it was supposed to be idle, so I started investigating. I explained how to find where handles are leaking here.
After the application had leaked about 50 handles, I paused it and checked the call stacks. To my surprise, many of them looked like this:
ntdll!NtCreateThreadEx+0x0000000000000014
KERNELBASE!CreateRemoteThreadEx+0x0000000000000294
KERNELBASE!CreateThread+0x000000000000003b
verifier!AVrfpCreateThread+0x00000000000000bf
clr!Thread::CreateNewOSThread+0x00000000000000b2
clr!Thread::CreateNewThread+0x0000000000000091
clr!ThreadpoolMgr::CreateUnimpersonatedThread+0x00000000000000cb
clr!ThreadpoolMgr::MaybeAddWorkingWorker+0x000000000000011c
clr!ManagedPerAppDomainTPCount::SetAppDomainRequestsActive+0x0000000000000024
clr!ThreadpoolMgr::SetAppDomainRequestsActive+0x000000000000003f
clr!ThreadPoolNative::RequestWorkerThread+0x000000000000002b
--------------------------------------
ntdll!NtCreateEvent+0x0000000000000014
verifier!AVrfpNtCreateEvent+0x0000000000000080
KERNELBASE!CreateEventW+0x000000000000006b
verifier!AVrfpCreateEventW+0x00000000000000a5
clr!CLREventBase::CreateManualEvent+0x000000000000003a
clr!Thread::AllocHandles+0x000000000000007b
clr!Thread::CreateNewOSThread+0x000000000000007f
clr!Thread::CreateNewThread+0x0000000000000091
clr!ThreadpoolMgr::CreateUnimpersonatedThread+0x00000000000000cb
clr!ThreadpoolMgr::MaybeAddWorkingWorker+0x000000000000011c
clr!ManagedPerAppDomainTPCount::SetAppDomainRequestsActive+0x0000000000000024
clr!ThreadpoolMgr::SetAppDomainRequestsActive+0x000000000000003f
clr!ThreadPoolNative::RequestWorkerThread+0x000000000000002b
mscorlib_ni!DomainNeutralILStubClass.IL_STUB_PInvoke()$##6000000+0x0000000000000069
From these call stacks it wasn't obvious which part of the code was creating new threads, but I could be sure the requests were coming through the thread pool. So I decided to set a breakpoint on clr!ThreadPoolNative::RequestWorkerThread.
To do this I executed the following command:

bp clr!ThreadPoolNative::RequestWorkerThread

Then I resumed execution, and after that the breakpoint was hit many times with the following call stack:
clr!ThreadPoolNative::RequestWorkerThread
mscorlib_ni!DomainNeutralILStubClass.IL_STUB_PInvoke()$##6000000+0x69
mscorlib_ni!System.Threading.ThreadPool.QueueUserWorkItemHelper(System.Threading.WaitCallback, System.Object, System.Threading.StackCrawlMark ByRef, Boolean)$##6003CE6+0x114
mscorlib_ni!System.Threading.ThreadPool.QueueUserWorkItemHelper(System.Threading.WaitCallback, System.Object, System.Threading.StackCrawlMark ByRef, Boolean)$##6003CE6+0x35
mscorlib_ni!System.Threading.ThreadPool.UnsafeQueueUserWorkItem(System.Threading.WaitCallback, System.Object)$##6003CE5+0x21
mscorlib_ni!System.Threading.TimerQueue.QueueTimerCompletion(System.Threading.TimerQueueTimer)$##6003D1F+0x31
mscorlib_ni!System.Threading.TimerQueue.FireNextTimers()$##6003D1E+0x11c
mscorlib_ni!System.Threading.TimerQueue.FireNextTimers()$##6003D1E+0x55
clr!CallDescrWorkerInternal+0x83
clr!CallDescrWorkerWithHandler+0x4e
clr!MethodDescCallSite::CallTargetWorker+0x102
clr!AppDomainTimerCallback_Worker+0x34
clr!ManagedThreadBase_DispatchInner+0x40
clr!ManagedThreadBase_DispatchMiddle+0x6c
clr!ManagedThreadBase_DispatchOuter+0x4c
clr!ManagedThreadBase_FullTransitionWithAD+0x2f
clr!AppDomainTimerCallback+0x85
clr!ThreadpoolMgr::AsyncTimerCallbackCompletion+0x7e
clr!UnManagedPerAppDomainTPCount::DispatchWorkItem+0x1b9
clr!ThreadpoolMgr::ExecuteWorkRequest+0x64
clr!ThreadpoolMgr::WorkerThreadStart+0xf6
clr!Thread::intermediateThreadProc+0x8b
verifier!AVrfpStandardThreadFunction+0x44
KERNEL32!BaseThreadInitThunk+0x14
ntdll!RtlUserThreadStart+0x21
At first I thought all these hits were just delayed timers firing, because my application had been paused for a while. But even after I let the application run for some time, the breakpoint kept being hit quite often, periodically, roughly every 15-20 seconds, which was really strange. I then checked how many threads had been created by executing the !handle command and found 130 thread handles, even though Task Manager reported only 59 threads.
At that moment my theory was that a timer was bombarding the thread pool with ThreadPool.QueueUserWorkItem calls, and the thread pool had to create a lot of threads to process these requests. Then I checked the .NET source code, found ThreadpoolMgr::MaybeAddWorkingWorker in win32threadpool.cpp, and it looks like I was right.
The next step was to find which part of our code created so many timers, and the simplest way is to see which callback is being invoked. To do this I needed to load the .NET debugging support library in WinDbg by executing this command:
.cordll -ve -u -l
And then display the call stack with parameters:

!clrstack -p

Then I saw something like this (I removed irrelevant items):
System.Threading.ThreadPool.QueueUserWorkItemHelper(System.Threading.WaitCallback, System.Object, System.Threading.StackCrawlMark ByRef, Boolean)
PARAMETERS:
    callBack (0x0000000170cee6b0) = 0x000000011e740938
    state (0x0000000170cee6b8) = 0x000000011daeb308
    stackMark (0x0000000170cee6c0) = 0x0000000170cee6d0
    compressStack (0x0000000170cee6c8) = 0x0000000000000000
It is possible to click on the value of the state parameter, and WinDbg will execute this command automatically:
!DumpObj /d 000000011daeb308
The output shows that the object is a System.Threading.TimerQueueTimer with a field m_timerCallback pointing to 000000011daeb2a8. Clicking on 000000011daeb2a8 makes WinDbg execute the command:
!DumpObj /d 000000011daeb2a8
WinDbg then shows that this is a System.Threading.TimerCallback, and its _target field has the value 000000011daeac20. After repeating the process once more, I found that the target is a System.Runtime.Caching.CacheExpires. What? Why does it create so many timers? Anyway, I decided to keep going and find out which callback is actually supposed to execute. The TimerCallback object also has a field called _methodPtr with the value 7ffdeb641600. To find the function that corresponds to this address, I executed the following command:

!ip2md 7ffdeb641600

And found that it is System.Runtime.Caching.CacheExpires.TimerCallback.
As many developers know, the C# source code of .NET can be browsed at https://sourceof.net, but for some reason CacheExpires is not there. You can check out the .NET source code from the official GitHub repo, and that file and code will be there.
Anyway, to make a long story short, I found that every MemoryCache instance creates as many CacheExpires objects as there are cores (Environment.ProcessorCount). See MemoryCache.Init, which initializes _storeCount, and MemoryCache.InitDisposableMembers, which allocates the MemoryCacheStore instances. Each MemoryCacheStore creates a CacheExpires and calls MemoryCacheStore.InitDisposableMembers, which calls CacheExpires.EnableExpirationTimer, and that creates a timer. I didn't find any way to disable this or prevent the timers from being created.
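The chain above can be sketched as a toy model (hypothetical names; this is a simplified illustration of the behavior I observed, not the real System.Runtime.Caching implementation):

```csharp
using System;
using System.Threading;

// Stand-in for CacheExpires: owns one periodic expiration timer.
class ToyExpires : IDisposable
{
    public Timer ExpirationTimer;
    public void EnableExpirationTimer()
    {
        // One timer per store, firing every 20 seconds.
        ExpirationTimer = new Timer(_ => { /* flush expired entries */ },
                                    null, 20000, 20000);
    }
    public void Dispose() => ExpirationTimer?.Dispose();
}

// Stand-in for MemoryCache: one "store" (and one timer) per logical processor.
class ToyCache : IDisposable
{
    readonly ToyExpires[] _expires;
    public ToyCache()
    {
        int storeCount = Environment.ProcessorCount; // like _storeCount
        _expires = new ToyExpires[storeCount];
        for (int i = 0; i < storeCount; i++)
        {
            _expires[i] = new ToyExpires();
            _expires[i].EnableExpirationTimer();
        }
    }
    public int TimerCount => _expires.Length;
    public void Dispose()
    {
        foreach (var e in _expires) e.Dispose();
    }
}

class Program
{
    static void Main()
    {
        // Three cache instances => 3 * ProcessorCount timers in total.
        using (var c1 = new ToyCache())
        using (var c2 = new ToyCache())
        using (var c3 = new ToyCache())
            Console.WriteLine(c1.TimerCount + c2.TimerCount + c3.TimerCount);
    }
}
```

On a 16-core machine this prints 48, matching the numbers below.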
To confirm my finding, I put a breakpoint on MemoryCache creation and checked how many timers it creates. I also executed the following command:
!dumpheap -type System.Threading.Timer
And it confirmed that many timers had been created, and every 20 seconds they fire one by one in quick succession. Most of our code uses MemoryCache.Default, but there are two places that create new instances of MemoryCache. Many of our developers have PCs with at least 12 cores, and most have 16. As a result, just 3 instances of MemoryCache create 48 timers on a 16-core PC.
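One obvious mitigation (a sketch, assuming those extra instances do not actually need isolated caches) is to route everything through the single shared MemoryCache.Default instance, so only one set of ProcessorCount timers exists instead of one set per instance:

```csharp
// Requires a reference to the System.Runtime.Caching assembly
// (a NuGet package on .NET Core / modern .NET).
using System;
using System.Runtime.Caching;

static class SharedCache
{
    // Exposing the default instance instead of "new MemoryCache(...)"
    // keeps the timer count at one set of ProcessorCount timers total.
    public static ObjectCache Cache => MemoryCache.Default;
}

class Program
{
    static void Main()
    {
        SharedCache.Cache.Set("answer", 42, new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(5)
        });
        Console.WriteLine(SharedCache.Cache.Get("answer")); // prints 42
    }
}
```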
I have no idea why Microsoft decided to create so many timers, but I believe this design dates back to the times when only servers had more than one processor and even the most expensive ones had 4, which made the decision seem reasonable. But today you can buy an AMD Threadripper 3990X, essentially a desktop processor, with 128 logical processors. In that case .NET would create 384 timers for the same 3 instances, which looks rather bad.
Well, all of this is quite interesting and fascinating. But why were handles leaking? It turns out that they are not really leaking: eventually they will be picked up by the GC. But if the application does nothing, there are few allocations, so the GC will be inactive most of the time. Another question is why the thread pool retires extra threads without disposing their handles; I'm still investigating that.
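A quick way to sanity-check this explanation (a sketch; Process.HandleCount is only a coarse proxy for the handles in question) is to force a full collection with finalizers and watch whether the handle count drops:

```csharp
using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        var self = Process.GetCurrentProcess();
        self.Refresh();
        Console.WriteLine("Handles before GC: " + self.HandleCount);

        // Force a full collection and run finalizers; if the "leaked"
        // handles are merely awaiting finalization, this releases them.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        self.Refresh();
        Console.WriteLine("Handles after GC: " + self.HandleCount);
    }
}
```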