Recently I got email from customer, that our product consumes relatively lot of CPU on idle. About 3%-7%. He is using old notebook and as result fan is constantly cycles between on and off state and it is really annoying. I decided to check this because very often it could be sign of bigger problems.
Usually for cases like this I am asking customer to run PerfView to collect events that happens during problematic time. It is quite useful application; it is developed by Microsoft and completely free. It has quite unusual interface and requires some learning. There have links to learning videos from app itself.
Anyway, customer was nice and provided quite a few PerfView captures in different state of our application. I opened one that customer said consumed most CPU and went to CPU Stacks. There I disabled all grouping as I would like to see everything and remove all folding because often, I need to see even smaller functions. Then I went to CallTree tab and start expanding main thread.
I immediately see that Idle function consumed about 10% of total capture time. That function does check where is mouse was, where mouse is now to general mouse in and out events. It also does some housekeeping stuff and call Idle for different other modules that does about the same. Consuming 10% is a quite a lot of for Idle. PerfView does not show how many times function was called, because it just capture call stack every millisecond of so and there is no way to tell if function executing for long amount of time or just call very often. As result at that moment I had two theories: Idle executing slow and Idle executes a lot of times.
I decided to check first idea first. I thought that perhaps there is something on his PC that slows some operation by considerable margin and that as result that leads to slow execution. I did run PerfView on my machine and compared percentages on my and customer’s PC for that part. And there were quite close. After that I assumed that Idle executed fast.
As result remaining theory should be true. Then I started to think why Idle can execute so many times and one obvious explanation – there were a lot of messages. And to find who send these messages, I need to check who receives them. So I checked user32!DispatchMessageWorker node. After expanding that node few times I quickly found this node: presentationcore.ni!MediaContext.AnimatedRenderMessageHandler. And this is quite interesting. We are using WPF extensively in our application and looks like there is animation going on. Then I did try to reproduce it. First thing to check is to enable CPU Time column in Task Manager’s Details tab. What good about it, that it shows combined CPU time of all threads. Nobody sees this problem as most of us have more than 8 CPUs and even when one CPU has 10% load, it will look like 0%-1% in Task manager. But if you have 4 CPUs slower CPUs (like in notebooks) then it will look like 3%-7%. I started our application, did something there and let it idle for some time. After 40 minutes I found that CPU time grew by about 4 minutes. The same 10% as in customer’s case.
Then I decided to see if AnimatedRenderMessageHandler is actually called in my scenario and how often. It is possible to do this using WinDbg but it quite long. For this type of task I do recommend using great tool named dnSpy at https://github.com/0xd4d/dnSpy. It allows you to debug .NET Framework and .NET Core without having access to source code. It also allows you to recompile class or method in any compiled dll. I highly recommend tool.
Anyway, I started dnSpy, went to File|Open from GAC, typed presentationcore on right side, selected version 22.214.171.124 and clicked ok. Then I press Ctrl+Shift+K to search and typed AnimatedRenderMessageHandler. Double clicked on result and jumped to this method and pressed F5. After that selected application to run and pressed Ok. After our application started, and I did usual stuff and switched back to dnSpy. I put breakpoint on AnimatedRenderMessageHandler method and application immediately stopped there. After pressing F5 few times I did realize that it called very often. Then I changed breakpoint properties to just output some text and do not stop, switched to Output window and continued execution. Very quickly I found that there are around 30-100 messages per second. It is confirmed that problem is in WPF and somehow related to animations.
After consulting our WPF guru I was told that it looks like there is some animations that never stop. And then he started checking for RepeatBehavior="Forever" in our xaml files. He found quite a few. Mostly it was used to display progress or to show user that something is processing. And my understanding that after that thing was finished these objects with animation were still is Visual Tree somehow and WPF continues to animate them. After that we addressed this issue and problem disappeared.
I hope it helps someone.