Demystifying Garbage Collection: A Basic Overview
Simplifying .NET Garbage Collection Concepts
What is Garbage Collection (GC)?
Garbage collection in software is a process that removes unwanted objects to free up memory space.
Throughout its lifetime, a software application uses many objects, and over time, some of these objects become unnecessary. If not cleared, they occupy memory space, making the application resource-heavy. Therefore, we must regularly clear these unwanted objects to keep our application as light as possible. Depending on the framework, this responsibility may fall on the developer (manual) or the framework (automatic). This post focuses on .NET Core, which provides an automatic garbage collection process.
A (very) basic flow of automatic GC would look something like this.
Well, what's the problem then?
An automatic garbage collection process occurs at some point in the future. The exact timing depends on various factors, some of which may be beyond a developer's control. This behavior can cause a few issues.
Inefficiency -
Usually, when we think of objects, we imagine relatively small ones. However, sometimes we deal with very large objects, like when processing a big file. These heavy objects should be cleared as soon as possible. While automatic GC will eventually remove them, keeping them in memory longer than necessary makes our application memory inefficient, which also affects its usability.
Customization -
In theory, each object may have a specific way it needs to be disposed of. In C#, most objects can be discarded easily, but there are scenarios where objects have additional requirements. For example, a network connection object may need to close the connection before being discarded.
An automated process is generalized and usually lacks the awareness to handle these custom requirements. Since every object can have unique demands, it is almost impossible to account for all possibilities.
Neither instantaneous nor guaranteed -
As we've mentioned, automatic GC happens sometime in the future, so it is not ideal for scenarios that need instant disposal. There's another issue: automatic GC is not guaranteed to reach a given object. The framework makes its best effort, but, for example, the process may exit before a collection ever runs. So, if you must dispose of some resources, you can't rely on automatic GC.
Automatic GC = It will happen when it happens, if it happens.
To solve these problems, the framework needs explicit direction from the developer to perform garbage collection reliably and efficiently, while also handling any necessary customizations.
Welcome IDisposable
The IDisposable interface gives developers a reliable way to manage valuable resources. Developers can set up an efficient disposal process tailored to specific resources. This helps the framework to effectively remove these resources. IDisposable creates an agreement between the developer and the framework that will be followed.
Let's see how we can implement the IDisposable contract. There are two parts to it.
public void Dispose()
{
    Dispose(true); // Proactive call: dispose of managed and unmanaged resources.
    GC.SuppressFinalize(this); // Suppress finalization.
}
This is the publicly visible part of the contract.
Dispose(true): This calls another method that performs the actual disposal work. The true flag indicates that this is a proactive disposal call.
GC.SuppressFinalize(this): This tells the system that the valuable resources have been manually disposed of, so it can discard the container object as waste instead of attempting to recycle it (running its finalizer via the finalization queue).
protected virtual void Dispose(bool disposing)
{
    // Check if already disposed
    if (_disposed)
    {
        return;
    }
    // 1. Free Managed resources
    if (disposing)
    {
    }
    // 2. Free Unmanaged resources
    // 3. Set large fields to null.
    _disposed = true;
}
This is the actual implementation that handles the disposal. It has a _disposed flag that helps identify if the object has already been disposed of, in which case we don't need to dispose of it again.
Managed resources
These are objects whose lifecycle is managed by the framework. Most objects in C# fall into this category, and the framework knows how to discard them. However, in this context, we are referring to managed objects that may themselves contain recyclable resources, that is, objects that implement IDisposable. In this block, we can manually call the Dispose() method on such objects. Note that this block is only executed when the flag is true, meaning Dispose() is being called by developer code.
Unmanaged resources
These resources are outside the framework's control. They may include pointers to objects from a library built in a different language (like C++), or operating system handles such as those behind network connections (e.g., sockets). The framework might not know how to handle them or any special steps needed before disposal. For example, sockets may need to be closed before the container is disposed of. Therefore, we handle the specialized disposal in this block, but only for objects outside the framework's scope.
Large fields
These are mostly simple managed objects. However, they take up a lot of memory, so it's better to discard them as soon as possible. You can drop the reference to such large fields by setting them to null, allowing the allocated memory to be reclaimed.
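Putting the two parts together, a minimal sketch of a class following this pattern might look like the one below. The class name ReportBuilder, its MemoryStream field, and the _buffer array are hypothetical names chosen purely for illustration. The class owns no unmanaged resources, so that step is left as a comment.

```csharp
using System;
using System.IO;

// Hypothetical example class; all names here are illustrative only.
public class ReportBuilder : IDisposable
{
    private readonly MemoryStream _stream = new MemoryStream(); // managed, disposable
    private byte[] _buffer = new byte[1024 * 1024];             // "large" field (1 MB)
    private bool _disposed;

    public bool IsDisposed => _disposed;

    // Public part of the contract.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Actual disposal logic.
    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return; // already disposed, nothing to do
        }

        if (disposing)
        {
            // 1. Free managed resources (only on a proactive call).
            _stream.Dispose();
        }

        // 2. Free unmanaged resources: none in this sketch.

        // 3. Drop the reference to the large field.
        _buffer = null;

        _disposed = true;
    }
}

public static class Program
{
    public static void Main()
    {
        var builder = new ReportBuilder();
        builder.Dispose();
        builder.Dispose(); // second call is a no-op thanks to the _disposed flag
        Console.WriteLine(builder.IsDisposed); // prints "True"
    }
}
```

Calling Dispose() twice is harmless here because the flag check returns early on the second call.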
Flow with IDisposable
Well, that's it then, right? We have a mechanism that automatically collects most of the time, and we also have a way to proactively identify and manage valuable resources. It seems like we've found the holy grail of garbage collection.
Not exactly! As you may have noticed, I have the word 'proactively' in bold. That's because:
To err is human!
It's always possible that, even with a proactive disposal mechanism, developers might forget to call Dispose() on an object. In that case, we're back to square one.
So, what if there was a mix of the two? Something that allows us to run explicit code but also happens automatically?
Finalizers
Just as we can customize the setup when creating an object, C# allows us to customize actions when an object is being destroyed. This is known as a finalizer (also called a destructor). It looks something like this:
~MyClass()
{
    Console.WriteLine("MyClass finalizer called"); // any action
    Dispose(false); // when combined with Dispose method shown above
}
While finalizers can be used for various purposes, for this article, we are focusing on their use for disposal only. As you can see, the finalizer calls the same internal Dispose method as before, giving us most of the benefits of the Dispose pattern, except for managed resources since we pass false. Managed resources are skipped because, by the time a finalizer runs, those objects may already have been finalized themselves. This means it will still clean up unmanaged or large objects. An object with a finalizer is tracked on a special queue called the finalization queue. Once the object is no longer referenced, its finalizer runs to release the valuable resources, after which the container itself can be discarded.
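As a rough sketch of how the finalizer and the Dispose pattern fit together, the example below wraps a block of unmanaged memory. NativeBuffer is a hypothetical class invented for illustration. Marshal.AllocHGlobal and Marshal.FreeHGlobal are used only as a convenient stand-in for "some unmanaged resource".

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around unmanaged memory; names are illustrative only.
public class NativeBuffer : IDisposable
{
    private IntPtr _handle; // unmanaged resource the GC knows nothing about
    private bool _disposed;

    public NativeBuffer(int sizeInBytes)
    {
        _handle = Marshal.AllocHGlobal(sizeInBytes);
    }

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // cleanup is done, the finalizer is not needed
    }

    // Fallback: runs only if Dispose() was never called, and only if the GC gets to it.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return;
        }

        if (disposing)
        {
            // Managed disposables would be released here; this sketch has none.
        }

        // Unmanaged memory is released on both paths.
        if (_handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_handle);
            _handle = IntPtr.Zero;
        }

        _disposed = true;
    }
}

public static class Program
{
    public static void Main()
    {
        var buffer = new NativeBuffer(64);
        buffer.Dispose(); // proactive path; the finalizer is suppressed
        Console.WriteLine(buffer.IsDisposed); // prints "True"
    }
}
```

For real handles, the .NET documentation generally suggests wrapping them in a SafeHandle-derived type rather than writing a finalizer by hand. The sketch above only mirrors the pattern described in this article.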
So, should I just add a finalizer to every object that uses IDisposable to get the best of both worlds?
Not exactly! Since the finalizer is called by the automatic garbage collection process, it shares the same challenges as automatic garbage collection. There's no guarantee it will always run! For disposal purposes, the finalizer should be seen as a last resort, not a guarantee. It's simply a fallback!
Also, since finalization only deals with unmanaged or large resources, if the object only has managed dependencies, it doesn't need a finalizer. Finalization is a costly process. Executing extra code means additional processing and the risk of encountering errors.
Note:
In .NET Core/.NET 5 and above, finalizers are not run when the application terminates. Therefore, if you need to reliably dispose of objects when the application ends, you should subscribe to the System.AppDomain.ProcessExit event and dispose of them manually in the handler.
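A minimal sketch of that idea is shown below. SharedLog is a hypothetical long-lived resource, and a MemoryStream stands in for something more valuable. Keep in mind that ProcessExit is raised on a normal shutdown, not when the process is forcibly killed, so treat it as a best-effort hook as well.

```csharp
using System;
using System.IO;

public static class Program
{
    // Hypothetical long-lived resource that lives for the whole process.
    private static readonly MemoryStream SharedLog = new MemoryStream();

    public static void Main()
    {
        // Register cleanup that runs when the process exits normally.
        AppDomain.CurrentDomain.ProcessExit += OnProcessExit;

        SharedLog.WriteByte(42);
        Console.WriteLine("Work done");
    }

    private static void OnProcessExit(object sender, EventArgs e)
    {
        // Dispose proactively, since finalizers will not run at shutdown.
        SharedLog.Dispose();
    }
}
```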
Refined workflow (sort of)
The above image can give you a rough overview of the complexity of the disposal process. Of course, there will be more to it, but we can sort of get a general idea.
Action plan
So based on what we've learned so far, let's figure out a simple plan of action that we can use for most scenarios we might encounter.
- If your object owns no disposable managed objects, unmanaged resources, or large resources: You can choose to do nothing and let the framework manage the lifecycle.
- If your object doesn't have unmanaged or large resources directly but owns managed disposable objects: Implement the IDisposable interface and dispose of the managed objects as part of the managed resource disposal block.
- If your object has unmanaged or large resources: Implement the IDisposable interface. Dispose of the resources in the appropriate blocks. Additionally, implement a finalizer as a safety measure.
Usage sample
There are two common ways to use a disposable object.
- Manually call Dispose - As part of the IDisposable contract, we can manually call the Dispose method on the object. We should do this right after the object is no longer needed.
public void MyMethod()
{
    var disposableObject = new MyDisposableObject();
    // do some operation on disposableObject
    disposableObject.Dispose(); // once no longer useful
}
- Using block - This is a simpler way to handle disposal. It automatically calls the Dispose method and limits the object's usage to the block, preventing accidental use outside the scope.
The same example would then look like this:
public void MyMethod()
{
    using (var disposableObject = new MyDisposableObject())
    {
        // do some operation on disposableObject
    }
}
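One reason the using form is usually the safer of the two: a manual Dispose call is skipped if the code before it throws, whereas a using block behaves roughly like a try/finally. The sketch below shows that rough equivalent, plus the shorter using declaration available from C# 8. MemoryStream is used here only as a readily available disposable type.

```csharp
using System;
using System.IO;

public static class Program
{
    public static void Main()
    {
        // Roughly what a using block does behind the scenes.
        var first = new MemoryStream();
        try
        {
            first.WriteByte(1);
        }
        finally
        {
            first.Dispose(); // runs even if the code in the try block throws
        }

        // C# 8 using declaration: disposed when the enclosing scope ends.
        using var second = new MemoryStream();
        second.WriteByte(2);
        Console.WriteLine(second.Length); // prints "1"
    } // second.Dispose() is called here
}
```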
(Possible) Design Considerations
Along with the action plan, I would like to mention some design considerations for disposable objects. I am not an expert, so take these suggestions with a grain of salt, but consider them if they make sense and help simplify your work.
Only dispose of what you own
A disposable object can be accessed in multiple ways. Either your class creates the disposable object, or it is supplied as a parameter or a dependency (typically through constructor injection).
A good rule of thumb is to dispose of only what you own. If you do not control the object's lifetime, you likely shouldn't dispose of it either, as it may still be in use elsewhere.
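The ownership rule above can be sketched as follows. StreamConsumer and its _ownsStream flag are hypothetical names. The point is only that the class disposes the stream it created itself and leaves an injected one alone.

```csharp
using System;
using System.IO;

// Hypothetical class; names are illustrative only.
public sealed class StreamConsumer : IDisposable
{
    private readonly Stream _stream;
    private readonly bool _ownsStream;

    // The class creates the stream, so it owns it.
    public StreamConsumer()
    {
        _stream = new MemoryStream();
        _ownsStream = true;
    }

    // The stream is supplied by the caller, so the caller keeps ownership.
    public StreamConsumer(Stream stream)
    {
        _stream = stream;
        _ownsStream = false;
    }

    public void Dispose()
    {
        if (_ownsStream)
        {
            _stream.Dispose(); // only dispose of what we own
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var shared = new MemoryStream();

        using (var consumer = new StreamConsumer(shared))
        {
            // work with the consumer
        }

        // The injected stream is still usable after the consumer is disposed.
        Console.WriteLine(shared.CanRead); // prints "True"
        shared.Dispose(); // the actual owner disposes of it
    }
}
```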
Disposables may need propagation
Depending on the usage and object nesting, it may not be enough to have only the immediate consumer of such resources implement IDisposable. The entire chain that depends on these resources might need to implement IDisposable. Since the manual Dispose call can only be initiated at the outermost layer, if you have a nested object structure, you may need to apply IDisposable to all those layers. So, consider how you structure these objects.
Consider using using
I prefer to avoid adding IDisposable to everything. Instead, I try to structure the usage to eliminate the need for propagation by using a using block.
Consider patterns like using a factory to create a disposable object instead of passing the object as a dependency. This makes the class the owner of the disposable object, which can then be proactively disposed of.
Not every disposable object needs a finalizer. We should only implement finalizers if the object owns any unmanaged or large resources. If you don't need it, don't add it.
Consider setting references to disposed objects to null as a good practice. This not only proactively drops the reference but also helps avoid mistakenly using a disposed object.
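The factory suggestion above can be sketched as follows. Exporter and its Func<Stream> parameter are hypothetical. The idea is that the class receives a way to create the disposable rather than a live instance, so it can own the object for the duration of a single using block and does not need to implement IDisposable itself.

```csharp
using System;
using System.IO;

// Hypothetical service; it takes a factory instead of a live Stream.
public class Exporter
{
    private readonly Func<Stream> _streamFactory;

    public Exporter(Func<Stream> streamFactory)
    {
        _streamFactory = streamFactory;
    }

    public long Export(byte[] data)
    {
        // The stream is created, used, and disposed within this method.
        using (var stream = _streamFactory())
        {
            stream.Write(data, 0, data.Length);
            return stream.Length; // read before the stream is disposed
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var exporter = new Exporter(() => new MemoryStream());
        Console.WriteLine(exporter.Export(new byte[] { 1, 2, 3 })); // prints "3"
    }
}
```

Because each call creates and disposes its own stream, nothing disposable outlives the method, and callers of Exporter have nothing to clean up.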
Further Reading:
If you'd like to read about this topic in more technical detail, I recommend the following articles.
Disclaimer:
This article is based on my understanding, built by stitching together information found across a few articles. I have tried to create an extremely simplified version of the entire process, keeping it simple and free of technical jargon while doing my best to stay correct. If you do find any inaccuracies, please reach out and I will make the necessary amends.