The sample application is fairly simple; it manipulates an array and displays the total memory size. One important detail we should pay attention to is that a garbage collection is forced when "DisplayMemory()" calls "GC.GetTotalMemory()" with a value of "true." Consequently, the application flow looks like this:
When the application is built with Visual Studio 2005 in the "Debug" configuration and run from the command line, the results are predictable.
However, when built in the "Release" configuration, the results are quite unexpected.
In the "Release" build, the byte array is reclaimed by the garbage collector before it is set to null. From a memory-optimization perspective, this makes sense because the byte array isn't actually used after its elements are assigned except to set it to null. Releasing the memory early relieves pressure on the garbage collector. However, to the old-school programmer who is accustomed to memory remaining allocated until explicitly freed, this is harder to swallow.
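To make the discussion concrete, here is a minimal sketch of a program with the shape described above. It is not the exact listing: the class name, the labels, and the array size are assumptions chosen for illustration. Only "Main()", "DisplayMemory()" and the call to "GC.GetTotalMemory(true)" come from the description.

```csharp
using System;

static class MemoryDemo
{
    // Hypothetical size, chosen only so the array is easy to spot in the totals.
    public const int ArraySize = 10 * 1024 * 1024;

    // Reports the managed heap size. Passing true asks GetTotalMemory to
    // let a garbage collection run before it returns its figure.
    public static long DisplayMemory(string label)
    {
        long bytes = GC.GetTotalMemory(true);
        Console.WriteLine("{0}: {1:N0} bytes", label, bytes);
        return bytes;
    }

    static void Main()
    {
        DisplayMemory("Before allocation");

        byte[] array = new byte[ArraySize];
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = 1;
        }

        // The only remaining use of 'array' is the null assignment below,
        // so an optimizing JIT may let the collector reclaim it here.
        DisplayMemory("After assignment");

        array = null;
        DisplayMemory("After null assignment");
    }
}
```

With optimizations off, the second reading should still include the array. With optimizations on and no debugger attached, it may already be gone. The exact numbers vary by runtime and machine.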
The lesson we learn from this example is not to expect the memory in a garbage-collected environment to be treated in predictable ways. Anyone who tries to pin down the garbage collector (excepting CLR developers) will find it to be an elusive and magical creature. I am always looking for the "why" behind the magic, so let's track down exactly how this optimization is applied under the covers.
Our first step is to check the IL emitted by the C# compiler to see if the compiler simply moves the null assignment. Here is the disassembled IL code with annotations to show exactly what each chunk of instructions does:
The code above shows us that the byte array is explicitly set to null after "DisplayMemory()" is called. While the IL code is interesting, the optimization isn't there. That leaves only one possibility. The optimization must happen at JIT time when the IL is compiled to native instructions.
The Visual Studio 2005 debugger is the most robust tool for viewing native JIT-compiled code. However, we must set a few options first.
At this point, we can return to the sample code, set a breakpoint at the beginning of the "Main()" method and run the application by pressing F5. When the application enters break mode, we can view the disassembled native code by selecting Debug -> Windows -> Disassembly from the Visual Studio menu. Here's what it looks like:
Examining the native instructions reveals that the optimization isn't there, either. The byte array is stored in the ESI register at 00000031 and is explicitly cleared using the old assembly trick of XOR'ing a register with itself at 00000078. However, the register is cleared only after "DisplayMemory()" is called. In fact, when we remove the breakpoint and allow the application to finish running, we see results similar to those we saw in the "Debug" configuration.
What did we do wrong?
The problem is that, by default, Visual Studio 2005 disables the JIT optimizer during debugging. Doing so significantly simplifies managed code debugging. For those of you who are interested, Vance Morrison has written a great article about this "feature". Here is a summary of his instructions to force Visual Studio to show optimized JIT-compiled code:
Now when we run the application, the results show the same optimization that we saw when we ran the "Release" build from the command line.
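When results differ between runs like this, it can help to have the program report how it was built and whether a debugger is attached. The following is a rough sketch, with a hypothetical class name, that reads the "DebuggableAttribute" the compiler places on an assembly. Note the limitation: it reflects only the build setting. It cannot see the Visual Studio option that suppresses JIT optimization when a module is loaded under the debugger.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class JitSettingsProbe
{
    // True when the assembly carries a DebuggableAttribute that asks the
    // JIT compiler to skip optimizations (typical of "Debug" builds).
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        object[] attributes =
            assembly.GetCustomAttributes(typeof(DebuggableAttribute), false);
        foreach (DebuggableAttribute attribute in attributes)
        {
            if (attribute.IsJITOptimizerDisabled)
            {
                return true;
            }
        }
        return false;
    }

    static void Main()
    {
        Assembly assembly = typeof(JitSettingsProbe).Assembly;
        Console.WriteLine("Debugger attached: {0}", Debugger.IsAttached);
        Console.WriteLine("JIT optimizer disabled by build: {0}",
            IsJitOptimizerDisabled(assembly));
    }
}
```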
With the additional options selected, we again set a breakpoint at the beginning of the "Main()" method and run the application. Here is the disassembled (and fully optimized) native code:
This time, the byte array is stored in the EDI register (ESI is used for the index variable of the for-loop) at 00000020. However, this register isn't ever cleared! The code to set the byte array to null has been completely removed. How does the garbage collector know that the array can be reclaimed? To find out, we'll have to peek into the internal data structures of the CLR. Fortunately, Microsoft has provided us with a tool to allow us to do so.
The SOS.DLL ("Son of Strike") debugger extension is one of the most powerful tools available for debugging .NET applications. With it, we can examine the inner workings of the CLR at runtime. However, because it is so low-level, many .NET developers don't know that it exists or have never tried using it. In the case of our byte array, figuring out why the memory is being reclaimed earlier than we expect would be nearly impossible without SOS. It can show us how the JIT optimizer provides the garbage collector with the information that it needs to reclaim the memory.
Loading the SOS extension is easy. While still at the breakpoint, we can open the Immediate Window by selecting Debug -> Windows -> Immediate from the Visual Studio main menu. Then, we just have to type the following into the Immediate Window:
We should be rewarded with the following message:
First, we need to examine the "MethodDesc" for the current method. This data structure describes important runtime details about a managed method. SOS provides a command called "ip2md" (Instruction-Pointer-To-MethodDesc). According to the documentation, "given an address in managed jitted code, ip2md attempts to find the MethodDesc associated with it." Since we're interested in the MethodDesc for the current method, we can pass the CPU's instruction pointer (stored in the EIP register) to ip2md. To display the contents of the CPU registers, we select Debug -> Windows -> Registers from the Visual Studio main menu. Using the current value of the EIP register, we type the following command into the Immediate Window (your EIP value may differ from the one shown here):
SOS will display the MethodDesc information for the current method.
Now that we have the MethodDesc, its address can be used with the "gcinfo" command to gather detailed information about how the garbage collector tracks registers containing managed objects. I should point out that the gcinfo documentation states that it is really intended for CLR developers and is "very difficult for anyone else to make sense of." In other words, we're being a bit naughty, but I'm OK with that. To use the gcinfo command with the MethodDesc address, we type the following:
And here's what comes back:
The highlighted lines in the pointer table show the range of instructions where the EDI register (which holds the byte array) is live. While EDI is live, the byte array it references is not a candidate for garbage collection. EDI becomes live at 00000022 and is considered dead at 0000004f. If we refer back to the method's optimized JIT-compiled code, we can highlight this range of instructions.
Since the EDI register is live only until DisplayMemory() is called at 0000004f, the garbage collector is free to reclaim the byte array's memory when that call forces a collection. That's the magic. Although we thought that we were holding a strong reference to the byte array, the JIT optimizer decided otherwise.
Like most magic, this puzzle turns out to have a simple solution in the end. Instead of some convoluted reordering of code, the JIT optimizer simply marks a range of instructions.
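One way to see a live range in action is to watch an array through a "WeakReference" while forcing a collection. The sketch below is an illustration under assumptions: the class and method names are hypothetical, and the result of the first method depends on the runtime, the build configuration, and whether a debugger is attached. "GC.KeepAlive()" is a real framework method whose only job is to count as a use of its argument, which should stretch the live range past the collection.

```csharp
using System;

static class LiveRangeDemo
{
    // Nothing uses 'array' after the collection, so an optimizing JIT may
    // consider the reference dead before GC.Collect() runs.
    public static bool SurvivesWithoutKeepAlive()
    {
        byte[] array = new byte[1024 * 1024];
        WeakReference weak = new WeakReference(array);
        GC.Collect();
        return weak.IsAlive;
    }

    // GC.KeepAlive(array) is a use of the reference that comes after the
    // collection, so the array stays reachable while GC.Collect() runs.
    public static bool SurvivesWithKeepAlive()
    {
        byte[] array = new byte[1024 * 1024];
        WeakReference weak = new WeakReference(array);
        GC.Collect();
        bool alive = weak.IsAlive;
        GC.KeepAlive(array);
        return alive;
    }

    static void Main()
    {
        Console.WriteLine("Without KeepAlive: {0}", SurvivesWithoutKeepAlive());
        Console.WriteLine("With KeepAlive:    {0}", SurvivesWithKeepAlive());
    }
}
```

The second method should report that the array survived. The first may report either answer, which is exactly the behavior the pointer table explains.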
Spelunking the JIT optimizer has been a whirlwind adventure. Each step has taken us deeper under the hood of the CLR and revealed new sparkling gems of information. It is unlikely that this information can be applied directly to daily coding. However, the tools that we have explored can help us to investigate some of the most elusive bugs that we might encounter. Also, pinning down the garbage collector has served to strengthen our faith in that magical creature's ability to do its job. And finally, we have learned that the JIT optimizer is a sorcerer that toils especially hard to work its own magic.