(How) C# You Are - 8 April 2007
Garbage collection and memory management are the focus this week.
- How does the .NET garbage collector know when an object is no longer needed? Answer
This is an academic description of how the GC works. It is by no means meant
to be a line-by-line analysis of the actual algorithm, nor does it take
the various special cases into account.
The garbage collector is aware of all allocated objects. It uses the term
root references for references that are "rooted" in the program unit that is running
or global to the program. Local variables and static fields fall into this
class. All other references are known as nonroot references. Objects held by root references
are normally considered live and therefore will not be collected, but objects held by nonroots are
considered live only if they are reachable, directly or indirectly, from a root reference.
All "dead" objects can be freed during garbage collection.
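The reachability rule above can be sketched as a small graph walk. This is a toy model under my own assumptions, not the CLR's implementation: the object names are plain strings, and the Reachability class and FindLive method are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy model of reachability; NOT how the CLR actually stores or walks objects.
public static class Reachability
{
    // references: maps each object name to the names of the objects it references.
    // roots: names of objects held by locals or static fields (assumed live).
    public static HashSet<string> FindLive(
        IDictionary<string, string[]> references, IEnumerable<string> roots)
    {
        var live = new HashSet<string>();
        var pending = new Stack<string>(roots);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!live.Add(current))
                continue; // already visited, so cycles terminate

            string[] targets;
            if (references.TryGetValue(current, out targets))
            {
                foreach (string target in targets)
                    pending.Push(target);
            }
        }

        // Anything not in this set is "dead" in the sense used above.
        return live;
    }
}
```

Given a graph where "list" references "items" and "items" references "a", passing "list" as the only root marks all three as live, while an object that nothing rooted points to is left out of the result.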
An exception exists for local variables. If a local variable is no longer
needed in the unit that contains it, then the object it references is not considered live and can be collected.
Here is an example:
public MessageList Foo ( string message )
{
    MessageList list = new MessageList();
    string[] tokens = message.Split(',');
    foreach (string token in tokens)
    {
        //A
        list.Add(token);
        Console.WriteLine(token);
    }
    //B
    return list;
}
In the above example at line A the root references are message, list,
tokens and token. There are no nonroot references except the
fields of MessageList. At line B however only message and
list are root references. Root status is determined by the lifetime the JIT
compiler computes for each variable and not by the scoping rules of the language.
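A practical consequence of JIT-determined lifetimes is that a local can stop being a root before it goes out of scope. The sketch below shows the usual way to pin a lifetime down, GC.KeepAlive. The LocalLifetime class and its method name are mine, made up for illustration.

```csharp
using System;

public static class LocalLifetime
{
    // Reports whether an object was still alive after a collection that ran
    // while a later GC.KeepAlive call kept the local "in use".
    public static bool SurvivesWhileKeptAlive()
    {
        object data = new object();
        var tracker = new WeakReference(data); // does not keep 'data' alive

        GC.Collect();
        bool alive = tracker.IsAlive;

        // Without this call an optimizing JIT may treat 'data' as dead right
        // after the WeakReference is created, even though it is still in scope.
        GC.KeepAlive(data);
        return alive;
    }
}
```

If the KeepAlive call is removed, whether the object survives the collection depends on the build configuration and the JIT, which is exactly the point made above.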
- How many generations does the GC, up to v2, support? Answer
There are currently 3 generations (0-2) in .NET although more could be added later.
Objects start in G0. Each time an object survives a garbage collection it
is promoted to the next generation (up to 2). People often call GC.Collect
to try to free up memory, but this frequently hurts more than it helps. Each
call to Collect promotes every object that is still live, including those held only by local variables, to the next generation.
Call it twice and those objects reach G2, where they will likely stay around a long time even though
they will never be referenced again. Therefore any use of Collect
should be carefully evaluated. I have yet to find a good reason for its usage
that better design could not resolve.
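To see the promotion effect for yourself, the following sketch samples the generation of one object around two forced collections. It uses GC.Collect purely as a demonstration, which is the opposite of a recommendation. The GenerationDemo class and method name are made up for illustration.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one object before and after two forced collections.
    public static int[] TrackPromotion()
    {
        object survivor = new object();
        var generations = new int[3];

        generations[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();
        generations[1] = GC.GetGeneration(survivor); // after surviving one collection
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor); // after surviving two

        GC.KeepAlive(survivor); // keep the object rooted for the whole method
        return generations;
    }
}
```

On a typical run the values climb toward GC.MaxGeneration, although the exact numbers depend on the runtime and its settings.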
The earlier question did not mention how the GC determines which objects
to look at. Generations answer that. Normally the GC wants to be
fast but still free enough memory to allow the program to continue. Therefore
the GC starts at G0 and frees any dead objects. If more memory is still
needed after G0 is cleaned up, G1 objects are evaluated. If more memory
is still needed after G1, G2 is cleaned up. If that is still not enough, an OutOfMemoryException will
probably occur.
- What is a finalizer? Answer
A finalizer is a method, much like a C++ destructor but not the same, that is called
before an object's memory is reclaimed. A finalizer allows you to clean up any unmanaged
resources before the object goes away. A finalizer has a dramatic impact on
cleanup. Firstly, a finalizer runs on a separate finalizer thread, not the thread that created the object, and therefore
cannot rely on any thread-local storage. Secondly, finalizers run in no
defined order, so any finalizable object referenced by a
field might already have been finalized and therefore should not be used.
Finally, a finalizer forces the object to require two collections before
it is cleaned up. When an object with a finalizer is created it is placed
on the finalization queue. During the first collection the GC determines that
the object is dead but finds it on the finalization queue, so instead of being freed the object
is moved to a list of objects awaiting finalization. The finalizer thread, sometime
later, walks that list and calls each object's finalizer. The next GC actually cleans
up the object. Therefore define a finalizer only when your object holds unmanaged
resources that must be cleaned up. There is a better way to identify objects
that should be cleaned up.
Finalizers in C# use the C++ style destructor syntax but they are not destructors
in the C++ sense.
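Here is a minimal sketch of the syntax, assuming an object that owns a block of unmanaged memory. The NativeBuffer class is hypothetical; Marshal.AllocHGlobal and Marshal.FreeHGlobal simply stand in for whatever unmanaged resource you actually hold.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public class NativeBuffer
{
    private IntPtr _memory;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsAllocated
    {
        get { return _memory != IntPtr.Zero; }
    }

    // Finalizer: destructor-like syntax, but it runs on the finalizer thread
    // at some unspecified time after the object becomes unreachable.
    ~NativeBuffer()
    {
        // Touch only this object's own unmanaged state here.
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}
```

Note that nothing in this class lets the caller release the memory early, which is one reason a finalizer on its own is rarely the whole answer.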
- What is a critical finalizer? Answer
The big problem with finalizers is that they are not actually guaranteed to be called.
If something goes wrong in the application the finalizer may never be called.
For example, if the application runs out of stack space, finalizers will not be called.
Critical finalizers, introduced in v2, work around this issue by guaranteeing that
they will always be called at some point. They do this by using constrained execution
regions (CERs), which are pre-JITted blocks of code that are guaranteed to execute
and are guaranteed to be safe. CERs can make these guarantees because they
are pre-JITted and therefore the CLR knows what it will take to call them even in
the face of out-of-memory errors.
A critical finalizer is rarely used outside core classes provided by the framework.
Handles to OS resources are the most prevalent users of critical finalizers.
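In practice the usual way to get a critical finalizer is to derive from SafeHandle, which itself derives from CriticalFinalizerObject. The sketch below assumes a handle that is really just unmanaged memory from Marshal.AllocHGlobal. The UnmanagedMemoryHandle class is hypothetical and only there to show the shape.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical handle type. Because SafeHandle derives from
// CriticalFinalizerObject, ReleaseHandle is run from a critical finalizer
// if the handle is never disposed explicitly.
public sealed class UnmanagedMemoryHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public UnmanagedMemoryHandle(int size)
        : base(true) // true: this instance owns the handle and must release it
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    // Called at most once by the runtime, either from Dispose or from the finalizer.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}
```

A class that holds one of these as a field usually does not need a finalizer of its own, since the handle looks after itself.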
- What is the IDisposable interface? Answer
This interface is used to identify an object that should be cleaned up. In
most cases an object that implements the interface also implements a finalizer just
in case the object is not cleaned up. The interface follows the disposable
pattern and is covered in this article.
- How do you suppress finalization of an object? Answer
The GC.SuppressFinalize method is used to suppress the finalizer
of an object. It is generally called by the object's public Dispose
method. If a user properly disposes of an object then the finalizer would
have nothing to do. This method effectively tells the GC not to bother with
the finalizer and, hence, lets it free the object earlier.
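A minimal sketch of how Dispose, the fallback finalizer and SuppressFinalize fit together might look like the following. The DisposableBuffer class is hypothetical, and it is a simplified version of the pattern rather than the full guidance.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical resource holder showing Dispose paired with a fallback finalizer.
public class DisposableBuffer : IDisposable
{
    private IntPtr _memory;

    public DisposableBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        // Cleanup is done, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    ~DisposableBuffer()
    {
        // Safety net for callers that forgot to call Dispose.
        Release();
    }

    // Safe to call more than once.
    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}
```

Callers would normally wrap the object in a using statement, for example using (DisposableBuffer buffer = new DisposableBuffer(64)) { ... }, so that Dispose runs even if an exception is thrown.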
- What happens if an object throws an exception during finalization? Answer
As of v2, an unhandled exception thrown by a finalizer terminates the process.
In previous versions the finalizer thread would swallow the exception and carry on. Do not
allow finalizers to throw unhandled exceptions!
- What happens if an object throws an unhandled exception in its constructor? Answer
The caller never receives a reference to the object and the exception is passed back to the caller.
The finalizer, if any, will still be called, so write it to cope with a partially constructed object.
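One way to build the finalizer accordingly is to record whether construction finished and check that flag before cleaning up. The FragileResource class, its name parameter and the ThrowsOnCreate helper below are all hypothetical.

```csharp
using System;

// Hypothetical type whose constructor can fail after the object is allocated.
public class FragileResource
{
    private readonly bool _initialized; // stays false if the constructor throws

    public FragileResource(string name)
    {
        if (string.IsNullOrEmpty(name))
            throw new ArgumentException("A name is required.", "name");

        // ... acquire the real resource here ...
        _initialized = true;
    }

    ~FragileResource()
    {
        // May run for an object whose constructor never completed.
        if (!_initialized)
            return;

        // ... release the real resource here ...
    }

    // Helper for the demonstration: reports whether construction failed.
    public static bool ThrowsOnCreate(string name)
    {
        try
        {
            new FragileResource(name);
            return false;
        }
        catch (ArgumentException)
        {
            return true; // the exception reached the caller; no reference was returned
        }
    }
}
```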
- What is the large object heap? Answer
This heap is used for storing large objects in managed code. The threshold for a large object
is arbitrary and, at last check, was 85,000 bytes (roughly 85KB) or larger. Unlike
the normal heap, which is compacted when a GC runs, the large object heap is never
compacted as it would take too long to move all the large objects.
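You can get a rough view of this from code by asking the runtime which generation it reports for a new array. The LargeObjectDemo class and its method are made up for illustration, and the sizes you pass in are up to you.

```csharp
using System;

public static class LargeObjectDemo
{
    // Returns the generation the runtime reports for a freshly allocated array.
    public static int GenerationOfNewArray(int byteCount)
    {
        byte[] buffer = new byte[byteCount];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array rooted until after the query
        return generation;
    }
}
```

A small array (say 1,000 bytes) typically reports generation 0, while an array of 100,000 bytes is above the threshold and reports GC.MaxGeneration, because the large object heap is collected together with G2.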
- How can you determine how much memory is being used in the different GC generations? Answer
The easiest way is to use PerfMon, as .NET installs counters for
the different generations, including the number of collections of each generation and the
size of each generation's heap. They are available under the .NET CLR Memory performance
object. Many other tools exist as well. Process Explorer (see the links
section) provides columns for the GC information of .NET processes.
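If you want a quick look from inside the process rather than from a tool, the GC class exposes a few numbers as well. This is an alternative to the counters above, not a replacement: it gives collection counts per generation and a total size, but no per-generation sizes. The GcStats class is made up for illustration.

```csharp
using System;

public static class GcStats
{
    // Builds one line per generation plus a final line with the total managed size.
    public static string[] Snapshot()
    {
        var lines = new string[GC.MaxGeneration + 2];

        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            lines[gen] = "Gen " + gen + " collections: " + GC.CollectionCount(gen);
        }

        // false: report the current estimate without forcing a collection first.
        lines[GC.MaxGeneration + 1] =
            "Managed bytes allocated: " + GC.GetTotalMemory(false);

        return lines;
    }
}
```

Printing the result with foreach (string line in GcStats.Snapshot()) Console.WriteLine(line); at a few points in the program is often enough to spot unexpected G2 collections.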