Multitenancy and DbCompiledModel

I was asked to investigate why one of our web services was constantly consuming more than 3 GB of memory and was constantly being recycled. Nothing new had been deployed at that time, and the developers did not understand what was going on.

After I got a memory dump containing .NET code, I executed this command first (it loads the CLR debugging support into WinDbg):
.cordll -ve -u -l

Then, for memory-related issues, I usually execute this command:
!DumpHeap -stat

This prints how many instances of each type have been created and the total memory those instances consume. Then I try to find some specific types near the bottom of that list. Generic types like strings are usually hard to investigate, because many objects hold a lot of strings, so it is just time-consuming. In my case I found 1,953,459 instances of System.Data.Entity.Core.Metadata.Edm.TypeUsage. This type belongs to Entity Framework. Initially I thought the developers had missed a using statement or a Dispose call somewhere. But before checking the source code, I clicked the link to the left of System.Data.Entity.Core.Metadata.Edm.TypeUsage, which prints the addresses of all objects of that type. Usually I immediately press Ctrl+Break, or else it takes quite a lot of time. Then I execute this command:
!GCRoot <address of TypeUsage>

This command prints a kind of ownership “path” to that object:

[address] System.Data.Entity.Infrastructure.DbCompiledModel
[address] System.Data.Entity.Internal.CodeFirstCachedMetadataWorkspace
[address] System.Data.Entity.Core.Metadata.Edm.MetadataWorkspace
...
[address] System.Data.Entity.Core.Metadata.Edm.TypeUsage

This is how I found that DbCompiledModel owns that object. Then I checked how many instances of DbCompiledModel had been created (for example with !DumpHeap -stat -type DbCompiledModel) and found about 500 of them, which is quite a lot. But who owns DbCompiledModel? Going up the !GCRoot output, I found some concurrent dictionary. Initially I thought it was some kind of Entity Framework cache, but then I noticed that the TKey of that dictionary was a type from our namespace. After some searching I found this code:

if (!compiledModelCache.ContainsKey(tenantId))
{
    // Build and compile the model for this tenant, then cache it by tenant id.
    var modelBuilder = new DbModelBuilder();
    initDbModelBuilder(modelBuilder);
    var model = modelBuilder.Build(connection);
    compiledModel = model.Compile();
    compiledModelCache.TryAdd(tenantId, compiledModel);
}
else
{
    compiledModel = compiledModelCache[tenantId];
}
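
For context, the dictionary itself was declared roughly like this (a sketch with assumed names; the real TKey is our tenant id type):

// Sketch, not the exact declaration: one compiled model cached per tenant.
private static readonly ConcurrentDictionary<string, DbCompiledModel> compiledModelCache =
    new ConcurrentDictionary<string, DbCompiledModel>();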

It turns out that most people recommend building the compiled model and caching it. The simplest way to build a compiled model is to pass a connection, so most people assume that a compiled model is per connection. As a result, they create some kind of dictionary whose key is the connection string (or, in our case, the tenant id) and whose value is the compiled model.

This works fine if you have a few connections. But in our case we have a lot of tenants that use exactly the same database, just with different connection strings. As a result, each tenant gets its own compiled model, and a compiled model is quite an expensive object, especially for big models.
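
To put numbers on it: if those 1,953,459 TypeUsage instances are spread across roughly 500 compiled models, that is about 4,000 TypeUsage objects per model, before counting anything else each MetadataWorkspace holds.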

The next step was to test my theory. We deployed to production a special version of our software that, at startup, accessed a few tables for each tenant in an infinite loop. As predicted, memory immediately skyrocketed, and IIS very quickly had to recycle the website due to memory limits.

OK, but how do we fix it? After some research I found that, internally, those compiled models are all exactly the same: a model depends only on the Entity Framework provider and the version of the server. But how do you build a compiled model without a connection? I checked the decompiled source code of Entity Framework and found that it does not key its cache on the connection itself; instead it derives a key from the provider and the server version, something like this:

var providerFactory = DbProviderServices.GetProviderFactory(connection);
var providerInvariantName = DbConfiguration.DependencyResolver
    .GetService<IProviderInvariantName>(providerFactory).Name;
var providerManifestToken = DbConfiguration.DependencyResolver
    .GetService<IManifestTokenResolver>().ResolveManifestToken(connection);
string cacheKey = $"{providerInvariantName},{providerManifestToken}";
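
For SQL Server, for example, this key comes out as something like "System.Data.SqlClient,2012", which is the same value for every tenant running on the same provider and server version, no matter how many connection strings there are.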

As a result, we changed our code to something like this:

// Derive the cache key from the provider and server version instead of the tenant.
var providerFactory = DbProviderServices.GetProviderFactory(connection);
var providerInvariantName = DbConfiguration.DependencyResolver
    .GetService<IProviderInvariantName>(providerFactory).Name;
var providerManifestToken = DbConfiguration.DependencyResolver
    .GetService<IManifestTokenResolver>().ResolveManifestToken(connection);
string cacheKey = $"{providerInvariantName},{providerManifestToken}";
if (!compiledModelCache.ContainsKey(cacheKey))
{
    // Build the model against provider info rather than a concrete connection.
    var modelBuilder = new DbModelBuilder();
    initDbModelBuilder(modelBuilder);
    var dbInfo = new DbProviderInfo(providerInvariantName, providerManifestToken);
    var model = modelBuilder.Build(dbInfo);
    compiledModel = model.Compile();
    compiledModelCache.TryAdd(cacheKey, compiledModel);
}
else
{
    compiledModel = compiledModelCache[cacheKey];
}
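
One more note on the caching pattern itself: the ContainsKey/TryAdd pair can still build the same model twice if two requests race on a cold cache. Here is a minimal sketch of a variant that avoids this (GetCompiledModel is a hypothetical helper; it assumes the same initDbModelBuilder as above and a cache whose values are Lazy<DbCompiledModel>):

private static readonly ConcurrentDictionary<string, Lazy<DbCompiledModel>> compiledModelCache =
    new ConcurrentDictionary<string, Lazy<DbCompiledModel>>();

// Hypothetical helper: GetOrAdd plus Lazy<T> guarantees the expensive
// build runs at most once per cache key, even under concurrent first access.
DbCompiledModel GetCompiledModel(string cacheKey, string providerInvariantName,
    string providerManifestToken)
{
    return compiledModelCache.GetOrAdd(cacheKey, _ => new Lazy<DbCompiledModel>(() =>
    {
        var modelBuilder = new DbModelBuilder();
        initDbModelBuilder(modelBuilder);
        var dbInfo = new DbProviderInfo(providerInvariantName, providerManifestToken);
        return modelBuilder.Build(dbInfo).Compile();
    })).Value;
}

We did not need this for our fix, since building a duplicate model once in a rare race is harmless, but it is the standard way to de-duplicate expensive factory work in a ConcurrentDictionary.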

We ran our special version again, and memory stayed at the same level, not growing at all. After more testing by different departments, we finally deployed these changes to the production server, and IIS has not recycled our site even once in the last two weeks; it is working great. Moreover, the site is faster, because the GC is no longer scanning tens of millions of live objects.

And then I found the answer to the question of why this happened without any change from the developers. The answer was simple: we got more tenants. :)

I hope it helps someone.