Thursday, October 05, 2006

Today I was iterating a List<int> using a foreach-loop and feeling a bit smug in knowing how much more performance-conscious I was being than if I'd tried doing the same thing with an ArrayList filled with ints. Thanks to the wonder of generics, the C# compiler neatly avoids numerous boxing operations by using a System.Collections.Generic.IEnumerator<int> instance instead of the older System.Collections.IEnumerator. Then I got to thinking: "is this really the fastest way?" Upon investigation, it turns that, no, it isn't the fastest way.

.NET Framework 2.0 shipped full of little nuggets that make life easier and code more elegant. Among my favorites are the additional methods that were added to Array and List<T> which take the Action<T>, Converter<TInput,TOutput> and Predicate<T> generic delegates. Apparently, I'm actually pretty obsessed with them. These methods really shine when used with anonymous methods or lambda expressions.

The particular method that I'm interested in is List<T>.ForEach(Action<T>). I'm curious to know if calling ForEach is slower, faster or about the same as using a standard foreach-loop. Consider this code:

long Sum(List<int> intList)
{
  long result = 0;
  foreach (int i in intList)
    result += i;
  return result;
}

The C# compiler will generate IL instructions for the above code that translate roughly to this:

long Sum(List<int> intList)
{
  long result = 0;
  List<T>.Enumerator enumerator = intList.GetEnumerator();
  try
  {
    while (enumerator.MoveNext())
    {
      int i = enumerator.Current;
      result += i;
    }
  }
  finally
  {
    enumerator.Dispose();
  }
  return result;
}

Essentialy, the C# compiler is generating two method calls per iteration: IEnumerator<T>.MoveNext() and IEnumerator<T>.Current. The use of the List<T>.Enumerator struct (which implements IEnumerator) allows the compiler to generate call IL instructions instead of callvirt to provide a tiny low-level speed boost. In contrast, consider this code:

long Sum(List<int> intList)
{
  long result = 0;
  intList.ForEach(delegate(int i) { result += i; });
  result result;
}

Or, the lambda expression equivalent:

long Sum(List<int> intList)
{
  long result = 0;
  intList.ForEach(i => result += i);
  return result;
}

Using List<T>.ForEach results in only one method call per iteration: whatever Action<T> delegate that you supply. This will be called with a callvirt IL instruction but two calls should be slower than one callvirt. So, my expectation is that List<T>.ForEach will actually be faster.

Armed with my assumptions, I created a small console app that samples timings of four different ways to iterate an instance of List<int> and sum the values:

  1. Iterate using a for-loop: for (int i = 0; i < List<int>.Count; i++)
  2. Iterate using a for-loop with no Count call: for (int i = 0; i < NUM_ITEMS; i++)
  3. Iterate using a foreach loop: foreach (int i in List<int>)
  4. Iterate using a List<int>.ForEach call: List<int>.ForEach(delegate(int i) { result += i; });

First, I generated results without compiler optimizations:

# iterations for-loop for-loop (no count) foreach-loop List<int>.ForEach
1,000 0.000016 0.000013 0.000017 0.000013
2,000 0.000028 0.000024 0.000034 0.000022
3,000 0.000045 0.000034 0.000046 0.000030
4,000 0.000055 0.000045 0.000060 0.000039
5,000 0.000066 0.000054 0.000075 0.000047
10,000 0.000130 0.000113 0.000143 0.000090
20,000 0.000268 0.000207 0.000284 0.000174
30,000 0.000408 0.000314 0.000426 0.000260
40,000 0.000508 0.000414 0.000564 0.000343
50,000 0.000657 0.000523 0.000706 0.000425
100,000 0.001256 0.001032 0.001413 0.000854
200,000 0.002506 0.002071 0.002818 0.001697
300,000 0.003752 0.003120 0.004229 0.002549
400,000 0.005049 0.004159 0.005666 0.003412
500,000 0.006362 0.005227 0.007094 0.004305
1,000,000 0.012625 0.010466 0.014219 0.008609
2,000,000 0.025144 0.020899 0.028395 0.017189
3,000,000 0.037745 0.031426 0.042947 0.026193
4,000,000 0.050619 0.041806 0.056717 0.034362
5,000,000 0.062931 0.052470 0.071795 0.043364
10,000,000 0.125695 0.104486 0.143066 0.086563

These results are mind boggling. It turns out that, when compiling without optimizations, List<int>.ForEach is even faster than a standard for-loop! Of equal interest is to notice that (without compiler optimization) foreach is nearly as fast as a standard for-loop. So, if you're compiling your app without compiler optimizations, List<int>.ForEach is the fastest way to go.

Next, I ran my tests with compiler optimizations enabled to get better "real world" results:

# iterations for-loop for-loop (no count) foreach-loop List<int>.ForEach
1,000 0.000008 0.000007 0.000014 0.000012
2,000 0.000014 0.000013 0.000026 0.000022
3,000 0.000019 0.000016 0.000036 0.000028
4,000 0.000024 0.000022 0.000047 0.000035
5,000 0.000029 0.000025 0.000058 0.000043
10,000 0.000059 0.000047 0.000117 0.000081
20,000 0.000128 0.000093 0.000225 0.000161
30,000 0.000157 0.000141 0.000336 0.000233
40,000 0.000221 0.000180 0.000442 0.000310
50,000 0.000263 0.000236 0.000553 0.000307
100,000 0.000530 0.000443 0.001103 0.000773
200,000 0.001070 0.000879 0.002194 0.001531
300,000 0.001641 0.001345 0.003281 0.002308
400,000 0.002233 0.001783 0.004388 0.003083
500,000 0.002615 0.002244 0.005521 0.003873
1,000,000 0.005303 0.004520 0.011072 0.007767
2,000,000 0.010543 0.009074 0.022127 0.015536
3,000,000 0.015738 0.013569 0.033186 0.023268
4,000,000 0.021039 0.018113 0.044335 0.031188
5,000,000 0.026280 0.022593 0.055521 0.038793
10,000,000 0.052528 0.046090 0.111517 0.078482

Looking at these numbers, I'm impressed by how must the compiler optimizes for-loops to get more than 50% savings. foreach-loops also get about 20% shaved off. List<int>.ForEach does not receive as much optimization but the important thing to notice is that is still beats foreach-loops by a considerable margin. List<T>.ForEach is faster than a standard for-each loop.

Note that these tests were run on Dell Inspiron 9400 laptop with a Core Duo T2400 and 2 GB of ram. If you want to try the tests for yourself, you can download the console app here: ForEachTest.zip (3.82 KB).

And because, a picture is worth a thousand words, I generated a couple of charts to demonstrate the differing speeds of the iteration tests. The charts show 5 different samples that iterate 10,000,000 to 50,000,000 int values.

Without compiler optimizations:



With compiler optimizations:



One final point: while the ForEach method is much faster at iterating a List<T>, this is not the case with arrays. An Array.ForEach method does exist for single-dimension arrays (or vectors) and it is actually slower than using foreach. The reason is that the compiler doesn't generate code that uses IEnumerator<T> for foreach-loops over vectors. Instead, it generates special IL instructions that are designed for working with vectors. Thus, using foreach on a vector results in no method calls while Array.ForEach still requires one call to the supplied delegate per iteration.

kick it on DotNetKicks.com

posted on Thursday, October 05, 2006 7:07:51 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]

kick it on DotNetKicks.com