Disposing the garbage

The .NET way

What is Garbage Collection (GC)?

Garbage collection in software is a process through which we throw away unwanted objects to free or reclaim the occupied memory space.

Throughout its lifetime, an application creates many objects, and over time most of them are no longer needed. If they are never cleared, they keep occupying memory and make the application resource heavy. Hence these unwanted objects must be cleared regularly so the application stays as light as possible. Depending on the platform, this responsibility falls either on the developer (manual) or on the runtime (automatic). This post is about .NET (.NET Core and later), which provides automatic garbage collection.

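A (very) basic flow of automatic GC looks something like this: the application creates objects, some of them eventually become unreachable, and at some later point the collector finds the unreachable ones and reclaims their memory.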

Well, what's the problem then?

Automatic garbage collection happens at some point in the future. Exactly when depends on various factors, such as memory pressure, some of which are beyond a developer's control. This behavior can cause a few issues.

  1. Inefficiency -

    Usually, when we think of objects we think of relatively tiny objects. Occasionally, however, we use some very large ones. Think of a scenario where we're processing a large file. Such heavy objects are better cleared as soon as possible. The automatic GC promises to clear them at some point in the future, but keeping them in memory longer than necessary makes our application memory inefficient, which also hurts its usability.

  2. Customization -

    In theory, every object may have a very specific way of being disposed of. In C# most objects can simply be thrown away, but there are scenarios where an object has additional demands to be fulfilled first, e.g. a network connection object may need to close the connection before being discarded.

    An automated process is a generalized process and usually will not have the necessary awareness to handle such custom demands out of the box. And since every object can have its custom demands, it is almost impossible to account for all those possibilities.

  3. Neither instantaneous nor guaranteed -

    As we've already mentioned, automatic GC happens sometime in the future, so it is not ideal for scenarios that need instant disposal. There's one more issue: while the runtime makes its best effort, there is no guarantee that a collection (and any cleanup code tied to it) will run at all, for example if the process ends first. So if you must dispose of some resources, you can't rely on the automatic GC.

Automatic GC = It will happen when it happens if it happens.

To solve these problems, the framework needs some explicit direction from the developer, so that cleanup can happen reliably and efficiently while also accounting for any necessary customizations.

Welcome IDisposable

The IDisposable interface gives developers a dependable mechanism to proactively mark objects that hold valuable resources and to provide an efficient disposal routine tailored to the resource in question. Note that the garbage collector never calls Dispose() on its own: IDisposable is a contract between the author of a type and the code that consumes it, and the language and framework honor it through constructs such as the using statement.

Let's see how we can implement the IDisposable contract. There are two parts to it.

public void Dispose()
{
    Dispose(true); // Explicit disposal: release managed and unmanaged resources.
    GC.SuppressFinalize(this);  // Cleanup is done, so the finalizer no longer needs to run.
}

This is the publicly visible part of the contract.

  1. Dispose(true) : This calls another method that does the actual disposal work. The true flag indicates that this is a proactive disposal call.

  2. GC.SuppressFinalize(this) : This tells the GC that the valuable resources have already been disposed of manually, so it can skip running the object's finalizer and discard the container object as ordinary waste, instead of keeping it around for finalization.

private bool _disposed; // set once disposal has run

protected virtual void Dispose(bool disposing)
{
    // Check if already disposed
    if (_disposed)
    {
        return;
    }

    if (disposing)
    {
        // 1. Free managed resources (call Dispose() on owned disposable objects)
    }

    // 2. Free unmanaged resources

    // 3. Set large fields to null.

    _disposed = true;
}

This is the actual implementation that does the disposal. It has a _disposed flag that records whether the object was already disposed, in which case we don't need to dispose it again.

  1. Managed resources

    These are objects whose lifecycle is managed by the framework. Most objects in C# fall into this category, and the framework knows how to discard them. In the context of this block of code, however, we're not talking about all C# objects but only those that are themselves containers for some recyclable resources. In this block, we can manually call Dispose() on such objects. Please note that this block is only executed when the flag is true, i.e. when Dispose() is being called by developer code.

  2. Unmanaged resources

    These resources lie outside the boundary of the framework. They may include things like pointers to objects of a library built in a different language (say C++), or operating system handles such as those behind network connections (e.g. sockets). The framework may either not know how to deal with them or not know about a customization that is necessary before disposal, e.g. a socket may need to be closed before its container is discarded. Hence we do the specialized disposal in this block, but only for resources that lie outside the realm of the framework.

  3. Large fields

    These are mostly simple managed objects. However, they take up quite a bit of memory, so they're better discarded as soon as possible. The references to such large fields can be released by setting them to null so the allocated memory can be reclaimed.
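To make the three blocks concrete, here is a minimal sketch of the pattern filled in for one class. The type ReportBuffer, its fields and the demo helper are hypothetical names made up for illustration; the class is assumed to own one managed disposable (a MemoryStream) and one large array, and no unmanaged resources.

```csharp
using System;
using System.IO;

// Hypothetical example type; the name and fields are illustrative only.
public class ReportBuffer : IDisposable
{
    private MemoryStream _stream = new MemoryStream();    // managed, disposable
    private byte[] _scratch = new byte[10 * 1024 * 1024]; // large managed field
    private bool _disposed;

    public bool IsDisposed => _disposed;

    public void Write(byte value)
    {
        // Guard against use after disposal.
        if (_disposed) throw new ObjectDisposedException(nameof(ReportBuffer));
        _stream.WriteByte(value);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;

        if (disposing)
        {
            // 1. Managed resources: only touched on an explicit Dispose() call.
            _stream.Dispose();
        }

        // 2. Unmanaged resources: none owned by this type.

        // 3. Large fields: drop the references so the memory can be reclaimed.
        _scratch = null;
        _stream = null;

        _disposed = true;
    }
}

public static class ReportBufferDemo
{
    public static bool Run()
    {
        var buffer = new ReportBuffer();
        buffer.Write(42);
        buffer.Dispose();
        buffer.Dispose(); // second call is a no-op thanks to the _disposed flag
        return buffer.IsDisposed;
    }
}
```

Calling ReportBufferDemo.Run() disposes the object twice to show that the flag makes repeated calls harmless. Throwing ObjectDisposedException from Write is an extra guard added in this sketch, not something the skeleton above requires.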

Flow with IDisposable

Well that's it then, right? We've got a mechanism to automatically collect which works most of the time, and then we also have a mechanism to proactively identify and efficiently manage valuable resources. We seem to have found the holy grail of garbage collection.

Not exactly! Notice the emphasis on the word 'proactively'. That's because

To err is human!

It is always possible that, despite being given a mechanism for proactive disposal, developers consuming the object forget to call Dispose() on it. In that case, we are back to square one.

So, what if there was a mix of the two? Something that allows us to run explicit code but also happens automatically?

Finalizers

Just as we can customize the setup when constructing an object, C# gives us a way to perform some customization when the object is being destroyed. This is known as a finalizer (also called a destructor). It looks something like this.

~MyClass()
{
    Console.WriteLine("MyClass finalizer called"); // any action
    Dispose(false); // when combined with Dispose method shown above
}

While finalizers can be used for other purposes too, for the sake of this article we're scoping their use to disposal only. As you can see, the finalizer calls into the same internal Dispose method as earlier, so it gives us most of the benefits of the Dispose pattern. The exception is the managed resources block, which is skipped because we pass false: by the time a finalizer runs, other managed objects it references may already have been finalized, so touching them is unsafe. It will still clean up the unmanaged resources and large fields. An object with a finalizer is tracked on a special queue called the finalization queue; when the GC finds the object unreachable, its finalizer runs, disposing of the valuable resources properly, and only then does the container become a throw-away.

So great, I just slap a finalizer on every object that uses IDisposable right, so I get the best of everything?

Not exactly! Well you see, since the finalizer is called by the automatic garbage collection process, it has the same challenges as that of the automatic GC process. There's no guarantee it will always happen! For disposal purposes, the finalizer should be looked at as our last option to do the disposal but not as a guarantee. It's simply a fallback!

Also, since disposal from a finalizer only deals with unmanaged or large resources, an object that only has managed dependencies doesn't need a finalizer. Finalization is costly: an object with a finalizer survives at least one extra collection while it waits for its finalizer to run, and because we are executing extra code, we are not only doing additional processing but also running the risk of errors.
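As a rough illustration of the finalizer acting purely as a fallback, the sketch below wraps a block of unmanaged memory. NativeBlock, its FinalizerRuns counter and the demo helper are hypothetical names invented for this example; the counter exists only so the demo can observe whether the finalizer ran.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical wrapper around a block of unmanaged memory.
public class NativeBlock : IDisposable
{
    // Counts finalizer runs so the demo can observe them (illustration only).
    public static int FinalizerRuns;

    private IntPtr _memory;
    private bool _disposed;

    public NativeBlock(int bytes)
    {
        _memory = Marshal.AllocHGlobal(bytes); // unmanaged allocation
    }

    ~NativeBlock()
    {
        Interlocked.Increment(ref FinalizerRuns);
        Dispose(false); // fallback path: touch unmanaged state only
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // cleanup is done, skip the finalizer
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;

        // No managed disposables are owned here, so nothing depends on 'disposing'.
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }

        _disposed = true;
    }
}

public static class NativeBlockDemo
{
    public static int Run()
    {
        var block = new NativeBlock(1024);
        block.Dispose();

        // Force a collection to show that the suppressed finalizer does not run.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return NativeBlock.FinalizerRuns;
    }
}
```

Because Dispose() calls GC.SuppressFinalize, the finalizer is skipped for objects that were disposed explicitly. If the Dispose() call were forgotten, the finalizer would free the memory at some later, unspecified time, which is exactly the "fallback, not a guarantee" behavior described above.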

Note:

In .NET Core / .NET 5 and above, finalizers are not called on application termination. Hence if you must reliably dispose of objects on application termination, you may want to handle the System.AppDomain.ProcessExit event and dispose of them manually there.
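One possible way to wire this up is sketched below. AuditLog and ShutdownCleanup are hypothetical names used only for illustration; AppDomain.CurrentDomain.ProcessExit is the real event. Keep in mind that the event is raised on a normal shutdown only, so it will not help if the process is killed abruptly.

```csharp
using System;

// Hypothetical resource that should be cleaned up before the process ends.
public sealed class AuditLog : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        // A real implementation would flush and close its resources here.
        IsDisposed = true;
    }
}

public static class ShutdownCleanup
{
    // Registers a handler that disposes the resource when the process exits.
    // The handler is returned so callers can unsubscribe or invoke it in tests.
    public static EventHandler Register(IDisposable resource)
    {
        EventHandler handler = (sender, args) => resource.Dispose();
        AppDomain.CurrentDomain.ProcessExit += handler;
        return handler;
    }
}
```

A typical call would be ShutdownCleanup.Register(log) early in application startup. Since the handler may run after the object was already disposed elsewhere, Dispose() should be safe to call more than once.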

Refined workflow (sort of)

The above image can give you a rough overview of the complexity of the disposal process. Of course, there will be more to it, but we can sort of get a general idea.


Action plan

So based on what we've learned so far, let's figure out a simple plan of action that we can use for most of the scenarios we'd be dealing with.

  • If your object has no managed/unmanaged/large resources -

    You may choose to do nothing and let the framework manage the lifecycle.

  • If your object doesn't have unmanaged/large resources directly but owns managed disposable objects -

    Implement the IDisposable interface and dispose of the managed objects in the managed resource disposal block.

  • If your object has unmanaged/large resources -

    Implement the IDisposable interface. Dispose of the resources in the appropriate blocks. Additionally, implement a finalizer as a safety net.


Usage sample

There are two common ways of using a disposable object.

  1. Manually call Dispose - As part of the IDisposable contract, we can call the Dispose method on the object ourselves, as soon as the object is no longer useful.

     public void MyMethod()
     {
         var disposableObject = new MyDisposableObject();
         try
         {
             // do some operation on disposableObject
         }
         finally
         {
             disposableObject.Dispose(); // runs even if the operation throws
         }
     }
    
  2. Using block - This is syntactic sugar for the manual call wrapped in a try/finally block: Dispose is called when the block exits, even if an exception is thrown. It takes away the responsibility of calling Dispose correctly and scopes the object to the block so we don't accidentally use it outside that scope.

    The same example would then look like

     public void MyMethod()
     {
         using(var disposableObject = new MyDisposableObject())
         {
             // do some operation on disposableObject
         }
     }
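To check the behavior for yourself, the small sketch below records whether Dispose was called when the body of a using block throws. TrackedResource and UsingDemo are hypothetical names for illustration. The second method uses a using declaration, a shorter form available from C# 8 that disposes the object at the end of the enclosing scope.

```csharp
using System;

// Hypothetical disposable that only records whether Dispose() was called.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class UsingDemo
{
    public static bool DisposedDespiteException()
    {
        TrackedResource observed = null;
        try
        {
            using (var resource = new TrackedResource())
            {
                observed = resource;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // The using block has already called Dispose() by the time we get here.
        }
        return observed.IsDisposed;
    }

    public static bool UsingDeclaration()
    {
        TrackedResource observed;
        {
            // C# 8+: disposed when the enclosing scope ends.
            using var resource = new TrackedResource();
            observed = resource;
        }
        return observed.IsDisposed;
    }
}
```

Both methods return true: the object is disposed once control leaves the scope, whether it leaves normally or through an exception.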
    

(Possible) Design Considerations

Along with the action plan, I would also like to mention some design considerations, around disposable objects. Now I am no expert, so take these suggestions with a pinch of salt, but do consider these if it makes sense and they help simplify life.

  1. Only dispose what you own

    A disposable object could be accessed in multiple ways. Either your class creates the disposable object or it is supplied as a parameter or a dependency (typically constructor injection).

    A decent rule of thumb is to dispose only what you own. If the object's lifetime is not controlled by you, you very likely shouldn't dispose it either, because it may still be in use elsewhere.

  2. Disposables may need propagation

    Depending on the usage and object nesting, it may not be sufficient to make only the immediate consumer of such resources IDisposable; the entire chain that depends on them may need it. Since a manual Dispose call can only be initiated on the outermost layer, a nested object structure may force you to slap IDisposable on all of those layers. So consider how you're structuring these objects.

  3. Consider using using

    I may be a little biased toward this, but I much prefer to avoid slapping IDisposable on everything and try to structure the usage in a way that can remove the need for propagation with a using block instead.

    Consider patterns like depending on a factory that creates the disposable object instead of having the object passed in as a dependency. This makes the class the owner of the disposable object, which can then be proactively disposed.

  4. Not every disposable object needs a finalizer

    We should only implement a finalizer if the object itself owns unmanaged or large resources. If you don't need it, don't add it.

  5. Consider setting disposed objects to null

    As a matter of hygiene, you may also set references to disposed objects to null. Not only are you proactively releasing the reference, but you also avoid the pitfall of mistakenly using a disposed object.
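The ownership and factory suggestions above could look roughly like this. LineCounter and its members are hypothetical names chosen for illustration. The instance method receives a factory, creates the stream itself and therefore disposes it, so the class does not need to implement IDisposable. The static method borrows a stream owned by the caller and leaves it open by using the StreamReader constructor overload that takes a leaveOpen flag.

```csharp
using System;
using System.IO;
using System.Text;

public sealed class LineCounter
{
    // Hypothetical factory dependency: creates a fresh stream on demand.
    private readonly Func<Stream> _openStream;

    public LineCounter(Func<Stream> openStream) => _openStream = openStream;

    // This method creates the stream, so it owns it and disposes it.
    public int Count()
    {
        using (var stream = _openStream())
        using (var reader = new StreamReader(stream))
        {
            return CountLines(reader);
        }
    }

    // The caller owns this stream, so it is left open for the caller to dispose.
    public static int CountBorrowed(Stream borrowed)
    {
        using (var reader = new StreamReader(borrowed, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            return CountLines(reader);
        }
    }

    private static int CountLines(TextReader reader)
    {
        int lines = 0;
        while (reader.ReadLine() != null) lines++;
        return lines;
    }
}
```

With this structure the disposable never escapes the method that created it, which is one way to avoid propagating IDisposable up the object graph.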


Further Reading:

If you'd like to further read on this topic in more technical detail I would surely recommend the following articles.

Disclaimer:

This article is based on my understanding, built by stitching together information found across a few articles. I have tried to create an extremely simplified version of the entire process and to keep it free of technical jargon, while doing my best to stay correct. If you do find inaccuracies, please reach out and I will make the necessary amends.
