
Previous: What's in a Closure?
Today I want to revisit our recursive Fibonacci number function from my
previous article and see if there
might be some ways to improve it by drawing on another functional programming
technique. For reference, here's the Fibonacci function:
delegate int
FibonacciCalculator(int n);
int Fibonacci(int
n)
{
int[] results = new
int[n + 1];
FibonacciCalculator calculator = null;
calculator = delegate(int
x)
{
if (x < 2)
return x;
else
{
if (results[x] == 0)
results[x] = calculator(x - 1) +
calculator(x - 2);
return results[x];
}
};
return calculator(n);
}
If you stuck with me through the article on
closures, you know that a closure is created by this function because the
local "results" variable is used within an anonymous method
even though it is declared outside of it. So, the C# compiler will turn the
"results" variable into a field on a compiler-generated helper class.
What's not as obvious is that "calculator" is also a local variable declared
outside of the anonymous method but used inside. If you missed it, look again.
"calculator" isn't a method call. In reality, it's a variable
of the FibonacciCalculator delegate type which happens to reference the method that
we're calling from (and hence, resulting in recursion). This might seem like a
minor
detail but it's extremely important. The compiler actually generates
code that looks something like this:
delegate int
FibonacciCalculator(int n);
sealed class
FibonacciClosure
{
public int[]
results;
public FibonacciCalcualtor calculator;
public int
CalculatorMethod(int x)
{
if (x < 2)
return x;
else
{
if (results[x] == 0)
results[x] = calculator(x - 1) +
calculator(x - 2);
return results[x];
}
}
}
int Fibonacci(int
n)
{
FibonacciClosure closure = new
FibonacciClosure();
closure.results = new
int[n + 1];
closure.calculator = new
FibonacciCalculator(closure.CalculatorMethod);
return closure.calculator(n);
}
Looking at the compiler-generated code actually helps clarify this. If you
look at "CalculatorMethod", you'll notice that it is not calling itself recursively. Instead it is calling the
"calculator" delegate and the effect is that it calls itself
recursively. This has importance that is actually a bit mind-bending but we'll get to that in
a moment.
First, in the original article, I referred to the technique of storing the results from our
function for retrieval later as caching. I used that term generally but there is
a more specific term: Memoization.
Now, a good object-oriented programmer would recognize that this "memoization" is an
idea that should be reused. Perhaps some sort of abstract class could be created to
solve this problem? The class could maintain the lookup table and handle when
the function is called and when results from the lookup table are returned. That
would work, right? Well, it might work but it sounds like a lot of extra effort to me
and it would be awkward to reuse. In reality, there is a much more elegant (and
reusable) solution to this problem. The answer is to use another closure. As
usual, we'll stick with C# 2.0 for now.
static FibonacciCalculator Memoize(FibonacciCalculator
function)
{
Dictionary<int, int>
results = new Dictionary<int,
int>();
return delegate(int
key)
{
int value;
if (results.TryGetValue(key,
out value))
return value;
value = function(key);
results.Add(key, value);
return value;
};
}
Look carefully at what this method does. It takes a delegate of type "FibonacciCalculator" and returns a new
delegate of the same type. The new delegate is declared as an anonymous method which uses a
Dictionary to store and retrieve the results of the passed function. This new
delegate is a closure because it references the local variables "results" and "function". That
means that the scope of those referenced variables will be promoted and will exist outside the Memoize function (see my previous
article if your need a refresher). With that in mind, here's how our Fibonacci function changes:
int Fibonacci(int
n)
{
FibonacciCalculator calculator = null;
calculator = delegate(int
x)
{
if (x < 2)
return x;
else
return calculator(x
- 1) + calculator(x - 2);
};
calculator = Memoize(calculator);
return calculator(n);
}
This is a bit mind-bending if you're seeing a technique like this for the
first time so let's tear it apart a bit. For those less-familiar with
functional programming, it can be easier to see when expressed in the classes
that are generated by the compiler:
delegate int
FibonacciCalculator(int n);
sealed class
MemoizeClosure
{
public FibonacciCalculator function;
public Dictionary<int,
int> results;
public int
MemoizedFunction(int
key)
{
int value;
if (!results.TryGetValue(key,
out value))
{
value = function(key);
results.Add(key, value);
}
return value;
}
}
FibonacciCalculator Memoize(FibonacciCalculator function)
{
MemoizeClosure closure = new
MemoizeClosure();
closure.function = function;
closure.results = new Dictionary<int,
int>();
return new
FibonacciCalculator(closure.MemoizedFunction);
}
sealed class
FibonacciClosure
{
public FibonacciCalculator calculator;
public int
FibonacciFunction(int x)
{
if (x < 2)
return x;
else
return calcualtor(x
- 1) + calculator(x - 2);
}
}
int Fibonacci(int
n)
{
FibonacciClosure closure = new
FibonacciClosure()
closure.calculator = new
FibonacciCalculator(closure.FibonacciFunction);
closure.calculator = Memoize(closure.calculator);
}
Again, read the code carefully to see what's actually happening. The idea is
that we've created a function which wraps our Fibonacci function with the
memoization feature. This works because our Fibonacci function uses a delegate
("calculator") for recursion instead of calling itself directly. The delegate
gets assigned to the new "memoized" function so the Fibonacci function actually
calls that instead.
If you run this code, you will find that it is almost as fast as the
optimized Fibonacci function from the
previous article. The only differences are:
- A slightly longer time to get the first result because the extra closure
and Dictionary have to be instantiated.
- The added indirection and Dictionary slow the algorithm down slightly
but the difference doesn't register at the millisecond level.
Regardless of these extremely minor consequences, we now have a reusable Memoize
function that can be used in other situations. Err... ok, it's not that reusable
yet because it is tied to our FibonacciCalculator type. I held off on using
generics until now because they can look bizarre to the uninitiated but we can
make this reusable by creating our own generic delegate type. Instead of FibonacciCalculator,
we could use this:
delegate TResult Func<TArg, TResult>(TArg arg);
(Note that several generic delegates just like will be included with the .NET Framework 3.5 and used by the C# 3.0 compiler.)
Our Memoize function can become generic itself and use the new
delegate like this:
static Func<TArg, TResult> Memoize<TArg,
TResult>(Func<TArg, TResult> function)
{
Dictionary<TArg, TResult> results = new
Dictionary<TArg, TResult>();
return delegate(TArg
key)
{
TResult value;
if (results.TryGetValue(key,
out value))
return value;
value = function(key);
results.Add(key, value);
return value;
};
}
And a small change is made to our Fibonacci function:
int Fibonacci(int
n)
{
Func<int, int>
calculator = null;
calculator = delegate(int
x)
{
if (x < 2)
return x;
else
return calculator(x
- 1) + calculator(x - 2);
};
calculator = Memoize(calculator);
return calculator(n);
}
Now we have a reusable memoization function that could be used with any function that takes one
argument and returns a result. And, the solution allows us to capture the
elegance of the original Fibonacci algorithm and achieve reasonable performance
without the added baggage of explicitly declaring and maintaining storage.
If you are interested in other applications of this technique,
Wes Dyer has some
excellent blog posts on memoization
here and
here. Wes has some very cool ways to use this technique to implement the
Singleton and
Multiton design
patterns.
Good luck and happy coding!
Previous: What's in a Closure?
