Tuesday, November 13, 2007
While exploring F#, I've grown increasingly impressed by the libraries that ship with it. One of the main purposes of the libraries is to provide underlying support for the language itself. In addition, they contain important modules and classes necessary for functional programming (e.g. immutable List and Map types). However, the most practical aspect of these libraries to me is the rich set of APIs that facilitate using the .NET Framework in a more functional way. These APIs are often directly portable to C#. Let's look at a simple example.

The following C# code is typical of how we might create an array containing the natural numbers from 1 to 20:

int[] a = new int[20];
for (int x = 0; x < a.Length; x++)
  a[x] = x + 1;

There's nothing special about that code. It's representative of the sort of thing that we write all the time. However, it won't fly in the functional world because it's written in an imperative style. That is, the code specifies the exact steps that should be taken to create and initialize the array:

  1. Create a new int array of 20 elements.
  2. Initialize a new indexer variable, x, to 0.
  3. Check to see that x is less than the length of the array. If it isn't, STOP.
  4. Assign the result of x + 1 to the array element at index x.
  5. Increment x.
  6. GO BACK to step 3. Repeat as necessary.

In contrast, the F# libraries provide a special module, Array, for manipulating single-dimensional .NET arrays in a more functional style. (Array2 and Array3 are also available for manipulating two- and three-dimensional arrays respectively.) Using the Array module, the C# code above could be translated to F# like so:

let a = Array.init 20 (fun x -> x + 1)

Instead of a specific code recipe, this F# code says (in a more declarative fashion), "create an array of 20 elements, and use this function to initialize each element." An interesting feature of the F# version is that the type of the array is never declared. Because the compiler can infer that the result of the passed function (fun x -> x + 1) will be an int, "a" must be an int array.

To me, this code is beautiful. In addition, it is declarative instead of imperative; it describes what should be done but doesn't dictate exactly how it should be done. When I see such elegant code, I immediately start trying to figure out which of its aspects could be used to improve the code in my daily C# work. Here's how we might "borrow" the F# "Array.init" function in C#:

public static class ArrayEx
{
  public delegate T CreateItem<T>(int index);
 
  public static T[] Create<T>(int length, CreateItem<T> createItem)
  {
    if (length < 0)
      throw new ArgumentOutOfRangeException("length");
 
    if (length == 0)
      return new T[0];
 
    T[] result = new T[length];
    if (createItem != null)
    {
      for (int i = 0; i < length; i++)
        result[i] = createItem(i);
    }
    return result;
  }
}

With this function defined, we can rewrite our array creation sample declaratively using C# 3.0 syntax.

var a = ArrayEx.Create(20, x => x + 1);

Notice that this code takes advantage of the C# compiler's type inference in the same way that the F# sample does. Sweet!
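
For comparison, the LINQ extension methods that ship with .NET 3.5 can produce the same array without a custom helper. This is only a sketch, not part of the F# libraries or of the ArrayEx class above; the names RangeExample, OneToTwenty and ViaSelect are made up for illustration.

```csharp
using System.Linq;

public static class RangeExample
{
    // Enumerable.Range(start, count) yields `count` consecutive ints beginning at `start`.
    public static int[] OneToTwenty()
    {
        return Enumerable.Range(1, 20).ToArray();
    }

    // Closer in spirit to Array.init: map each index 0..19 through a function.
    public static int[] ViaSelect()
    {
        return Enumerable.Range(0, 20).Select(x => x + 1).ToArray();
    }
}
```

Both methods return the numbers 1 through 20. The second form generalizes to any initializer function, at the cost of building the array through an intermediate sequence rather than filling it in place.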

Let's take a look at another example. Suppose we want to iterate through all of the elements in our int array and output each element's value to the console. We have a few options available to us. First, there's the familiar for-loop approach:

for (int x = 0; x < a.Length; x++)
  Console.WriteLine(a[x]);

Second, there's the more declarative foreach-loop:

foreach (int val in a)
  Console.WriteLine(val);

Finally, the underused "Array.ForEach" BCL method is also a possibility:

Array.ForEach(a, val => Console.WriteLine(val));

In addition, because "Console.WriteLine" has an overload which accepts a single int parameter, we can rewrite the previous code without a lambda expression:

Array.ForEach(a, Console.WriteLine);

Now, for the monkey wrench. Suppose we want to print the index of each element in the array along with the value. With this added requirement, the for-loop is our most attractive choice because the indexer variable is already built in. The other two options would require awkwardly creating an indexer variable and explicitly incrementing it. This additional code looks especially ugly with the "Array.ForEach" option.

int x = 0;
Array.ForEach(a, val => Console.WriteLine("{0}: {1}", x++, val));

Nasty.

How might we handle this in F#? Simple. F# provides an API designed to iterate an array with an index.

Array.iteri (fun x value -> printfn "%i: %i" x value) a

Like the BCL's "Array.ForEach" method, F#'s "Array.iteri" iterates through an array and applies the given function to each element. The difference is that the function to be applied includes an additional parameter representing the element's index in the array.

NeRd Note
Curious about why the parameter ordering of the F# "Array.iteri" API places the function to be applied before the array to be iterated? Isn't that backwards? Wouldn't it make more sense to move the array parameter to the first position? Nope. The parameter ordering is intentional.

Unless specified, F# functions are implicitly curried. Hence, parameters are usually ordered to take advantage of partial application. If the parameters of "Array.iteri" were reordered, we could not easily use partial application to build useful functions from it.

let print = Array.iteri (fun x value -> printfn "%i: %i" x value)

print a

Besides, if passing "a" as the last parameter is awkward, we can always pass it with the F# pipeline operator.

a |> Array.iteri (fun x value -> printfn "%i: %i" x value)

Make sense? OK. Take a deep breath...
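
C# has no implicit currying, but the effect of the partially applied "print" function can be approximated by a method that takes only the action and returns a delegate still waiting for the array. The following is a rough sketch; PartialEx and IterateWith are hypothetical names, and the built-in Action<int, T> delegate stands in for an F# function value.

```csharp
using System;

public static class PartialEx
{
    // Takes only the action and returns a delegate that still expects the array,
    // roughly what "Array.iteri f" evaluates to in F#.
    public static Action<T[]> IterateWith<T>(Action<int, T> action)
    {
        if (action == null)
            throw new ArgumentNullException("action");

        return array =>
        {
            for (int i = 0; i < array.Length; i++)
                action(i, array[i]);
        };
    }
}
```

Assuming "a" is the int array from earlier, the usage mirrors the F# version:

Action<int[]> print = PartialEx.IterateWith<int>((i, val) => Console.WriteLine("{0}: {1}", i, val));
print(a);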

Using F#'s "Array.iteri" as a model, we can define an equivalent function in C#.

public static class ArrayEx
{
  public delegate void IndexedAction<T>(int index, T item);
 
  public static void Iterate<T>(T[] array, IndexedAction<T> action)
  {
    if (array == null)
      throw new ArgumentNullException("array");
    if (action == null)
      throw new ArgumentNullException("action");

    if (array.Length <= 0)
      return;

    int lower = array.GetLowerBound(0);
    int upper = array.GetUpperBound(0);

    for (int i = lower; i <= upper; i++)
      action(i, array[i]);
  }
}

Now we can iterate our array and output the index and value of each element to the console with one line of code!

ArrayEx.Iterate(a, (i, val) => Console.WriteLine("{0}: {1}", i, val));

Since we're using C# 3.0, we can declare "ArrayEx.Iterate" as an extension method to make the client code more readable.

a.Iterate((i, val) => Console.WriteLine("{0}: {1}", i, val));
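
For the call above to compile, the first parameter of "Iterate" needs the "this" modifier. Here is a minimal sketch of that declaration. It assumes a zero-based array (which is always the case for T[]) and so loops from 0 to Length - 1 instead of querying the bounds.

```csharp
using System;

public static class ArrayEx
{
    public delegate void IndexedAction<T>(int index, T item);

    // The "this" modifier on the first parameter turns Iterate into an
    // extension method, so it can be called as a.Iterate(...).
    public static void Iterate<T>(this T[] array, IndexedAction<T> action)
    {
        if (array == null)
            throw new ArgumentNullException("array");
        if (action == null)
            throw new ArgumentNullException("action");

        for (int i = 0; i < array.Length; i++)
            action(i, array[i]);
    }
}
```

Extension methods must live in a non-generic, non-nested static class, which ArrayEx already is. The original static call syntax, ArrayEx.Iterate(a, ...), continues to work unchanged.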

In conclusion, using F# as a source of inspiration, it's easy to create APIs that enable more declarative C# code to be written. Do you have a cool declarative API that you've written for C# or VB? If so, I'd love to hear about it. Feel free to post your creations in the comments or email me directly.

posted on Tuesday, November 13, 2007 11:17:23 PM (Eastern Standard Time, UTC-05:00)  #    Comments [18]

Wednesday, November 14, 2007 2:51:25 AM (Eastern Standard Time, UTC-05:00)
I haven't used F# yet, so maybe this is a blatantly obvious comment/question. But why can't you just use the F# libraries from a C# app? Can't you just reference them and use them? Don't they get compiled to IL as any other .NET language? Are they not CLS compliant?
Matt
Wednesday, November 14, 2007 7:58:40 AM (Eastern Standard Time, UTC-05:00)
Since the array parameter of the Iterate method is declared as T[], there's no need to use the GetLowerBound / GetUpperBound methods. Although it's possible to create an array with a non-zero lower bound using the Array.CreateInstance method, the result cannot be cast to a vector:

int[] lowerBounds = new int[1] { 50 };
int[] lengths = new int[1] { 5 };
int[] numbers = (int[])Array.CreateInstance(typeof(int), lengths, lowerBounds);

=> System.InvalidCastException: Unable to cast object of type 'System.Int32[*]' to type 'System.Int32[]'.
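
The non-zero lower bound can still be observed by keeping the result typed as System.Array rather than casting it. The sketch below illustrates this; LowerBoundExample and its method names are hypothetical.

```csharp
using System;

public static class LowerBoundExample
{
    // Builds a one-dimensional array of 5 ints whose first valid index is 50.
    public static Array CreateOffsetArray()
    {
        return Array.CreateInstance(typeof(int), new int[] { 5 }, new int[] { 50 });
    }

    // "as" yields null instead of throwing when the object is not an int[] vector.
    public static bool IsVector(Array array)
    {
        return (array as int[]) != null;
    }
}
```

On the resulting array, GetLowerBound(0) reports 50 and GetUpperBound(0) reports 54, while IsVector returns false. Anything that arrives typed as T[] is therefore guaranteed to start at index 0.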


Also, it would be useful to overload the Iterate method to operate on IEnumerable<T> instances:

public static void Iterate<T>(IEnumerable<T> array, IndexedAction<T> action)
{
  if (null == array) throw new ArgumentNullException("array");
  if (null == action) throw new ArgumentNullException("action");

  int index = 0;
  foreach (T item in array)
  {
    action(index, item);
    index++;
  }
}
Richard
Wednesday, November 14, 2007 8:34:40 AM (Eastern Standard Time, UTC-05:00)
@Matt:

Yes, the F# libraries can be called from C#. Unfortunately, it isn't very comfortable to do so. Unless extra effort is taken on the part of the author, libraries written in F# are not as easily consumable by other .NET languages. Here's how one might call "Array.init" and "Array.iteri" from the standard F# libraries using C#:

using System;
using Microsoft.FSharp.Core;
using FSharpArray = Microsoft.FSharp.Collections.Array;

namespace FSharpFromCSharp {
  class Program {
    static void Main() {
      Converter<int, int> createItem = x => x + 1;
      FastFunc<int, int> createItemFunc = FuncConvert.ToFastFunc(createItem);
      int[] a = FSharpArray.init(20, createItemFunc);

      Action<int, int> printItem = (x, i) => Console.WriteLine("{0}: {1}", x, i);
      FastFunc<int, FastFunc<int, Unit>> printItemFunc = FuncConvert.ToFastFunc(printItem);
      FSharpArray.iteri(printItemFunc, a);
    }
  }
}

It's simply not very comfortable to call F# libraries that are designed for F# from C#.
Wednesday, November 14, 2007 8:56:24 AM (Eastern Standard Time, UTC-05:00)
@Richard:

You are indeed correct! I was actually unaware of this limitation of Array.CreateInstance -- though the docs say that it "creates a multidimensional array." I suppose this does make sense from a CLR perspective. Vectors are treated a bit differently under the hood than their multi-dimensional cousins.

Also, your suggestion to use IEnumerable<T> and foreach is a good one. It really only makes sense to call GetLowerBound(), GetUpperBound() and hard-code for-loops when Iterate is overloaded to support multi-dimensional arrays:

public delegate void IndexedAction2<T>(int index1, int index2, T item);

public static void Iterate<T>(T[,] array, IndexedAction2<T> action) {
  if (array == null)
    throw new ArgumentNullException("array");
  if (action == null)
    throw new ArgumentNullException("action");

  int lower1 = array.GetLowerBound(0);
  int upper1 = array.GetUpperBound(0);
  int lower2 = array.GetLowerBound(1);
  int upper2 = array.GetUpperBound(1);

  for (int index1 = lower1; index1 <= upper1; index1++)
    for (int index2 = lower2; index2 <= upper2; index2++)
      action(index1, index2, array[index1, index2]);
}
Thursday, November 15, 2007 8:43:09 AM (Eastern Standard Time, UTC-05:00)
You didn't need F# to learn about this sort of stuff. Enumerable already contains a lot of functionality similar to this in the next release. I'd also suggest that most of these be extension methods. Also, there are built-in Action, Func and Predicate generic delegates that could have been used instead of the custom delegates.
Thursday, November 15, 2007 9:43:33 AM (Eastern Standard Time, UTC-05:00)
@William:

I'm not sure what methods in Enumerable you're referring to that provide declarative ways to create arrays or iterate an array with the index and value of each element. Could you clarify?

As for your suggestion to make most of these extension methods, that's exactly what I did. I presented two methods: ArrayEx.Create and ArrayEx.Iterate. ArrayEx.Create can't actually become an extension method, and at the end of the article, ArrayEx.Iterate *did* become an extension method. Am I missing something?

Action<T> and Predicate<T> aren't appropriate for any of the code samples in this article. I do use Action<T> in the "Array.ForEach" example, but it doesn't quite offer what is needed to solve the problem. As for the Func delegates, I considered using them (as I have in previous articles). The problem is that, while they work fine, they tend to obfuscate the concepts that I'm presenting. At a glance, it's easier to understand IndexedAction<T> than Func<int, T> -- especially if the reader is new to the concepts.
Thursday, November 15, 2007 7:55:32 PM (Eastern Standard Time, UTC-05:00)
I really enjoy the blog, Dustin. Thanks.

This post actually inspired a response of sorts. Though, I guess a real response would've contrasted the usage of the newly created method with an imperative-style batching approach.
Thursday, November 15, 2007 9:45:56 PM (Eastern Standard Time, UTC-05:00)
Great post Jacob!

I played with your idea a bit. If you're comfortable with Slice returning an IEnumerable<IEnumerable<T>> instead of IEnumerable<T[]>, you could implement it using the existing Skip() and Take() extension methods:

public static bool IsEmpty<T>(this IEnumerable<T> sequence) {
  if (sequence == null)
    throw new ArgumentNullException("sequence");

  if (sequence.GetEnumerator().MoveNext())
    return false;

  return true;
}

public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> sequence, int size) {
  if (sequence == null)
    throw new ArgumentNullException("sequence");
  if (size <= 0)
    throw new ArgumentOutOfRangeException("size");

  while (!sequence.IsEmpty())
  {
    yield return sequence.Take(4);
    sequence = sequence.Skip(4);
  }
}

What do you think?
Friday, November 16, 2007 1:49:50 PM (Eastern Standard Time, UTC-05:00)
Wow. That is incredibly succinct and elegant code. I am a little [possibly unnecessarily] concerned about Take/Skip, though.

I posted a bit about that.
Friday, November 16, 2007 1:51:13 PM (Eastern Standard Time, UTC-05:00)
Oh yeah; I edited your snippet a little bit to get rid of the hard-coded "4". I hope that's what you intended.
Friday, November 16, 2007 3:54:41 PM (Eastern Standard Time, UTC-05:00)
Yeah, the 4 is an artifact of my testing. Sorry 'bout that!
Friday, November 16, 2007 4:14:08 PM (Eastern Standard Time, UTC-05:00)
@Jacob:

Here's another possibility:

private static IEnumerable<T> NextChunk<T>(IEnumerator<T> enumerator, int size)
{
  yield return enumerator.Current;

  for (int i = 0; i < size - 1; i++)
  {
    if (enumerator.MoveNext())
      yield return enumerator.Current;
    else
      yield break;
  }
}

public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> sequence, int size)
{
  if (sequence == null)
    throw new ArgumentNullException("sequence");
  if (size <= 0)
    throw new ArgumentOutOfRangeException("size");

  var enumerator = sequence.GetEnumerator();
  while (enumerator.MoveNext())
    yield return NextChunk(enumerator, size);
}

That's not quite as pretty, but it's certainly more performant.
Friday, November 16, 2007 6:35:11 PM (Eastern Standard Time, UTC-05:00)
True, that does avoid the excessive MoveNext calls. But I think it now relies on the client to fully consume each slice. I haven't tested it, but I think this would break the latest implementation:

foreach (var slice in (new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }).Slice(3)) {
  foreach (var n in slice) {
    if (n == 5) break;
    // consume items...
  }
}

I think you'd see batches come back: { 1, 2, 3 }, { 4, 5 }, { 6, 7, 8 }, { 9, 10 }.

I really liked the simple elegance of the Take/Skip-based approach, despite the potential performance drawbacks. It was a very pure solution.
Saturday, November 17, 2007 4:39:41 AM (Eastern Standard Time, UTC-05:00)
Ah, right! Forcing evaluation of NextChunk() will fix that:

public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> sequence, int size)
{
  if (sequence == null)
    throw new ArgumentNullException("sequence");
  if (size <= 0)
    throw new ArgumentOutOfRangeException("size");

  var enumerator = sequence.GetEnumerator();
  while (enumerator.MoveNext())
    yield return NextChunk(enumerator, size).ToList();
}
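
For reference, the eager approach can be folded into a single self-contained helper. This is only a sketch under a few assumptions: the names SliceEx and SliceIterator are hypothetical, each chunk is returned as a List<T>, and the enumerator is wrapped in a using block so that it gets disposed. Argument validation sits in a separate non-iterator method because the body of an iterator method does not run until the first MoveNext call.

```csharp
using System;
using System.Collections.Generic;

public static class SliceEx
{
    // Validates eagerly, then hands off to the lazy iterator below.
    public static IEnumerable<List<T>> Slice<T>(this IEnumerable<T> sequence, int size)
    {
        if (sequence == null)
            throw new ArgumentNullException("sequence");
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size");

        return SliceIterator(sequence, size);
    }

    private static IEnumerable<List<T>> SliceIterator<T>(IEnumerable<T> sequence, int size)
    {
        using (IEnumerator<T> enumerator = sequence.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                // Fill the chunk completely before yielding it, so a caller that
                // abandons a chunk early cannot shift the boundaries of later ones.
                List<T> chunk = new List<T>();
                chunk.Add(enumerator.Current);
                while (chunk.Count < size && enumerator.MoveNext())
                    chunk.Add(enumerator.Current);

                yield return chunk;
            }
        }
    }
}
```

Slicing the numbers 1 through 10 with a size of 3 yields chunks of 3, 3, 3 and 1 elements, regardless of how much of each chunk the caller reads.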
Tuesday, December 11, 2007 10:36:17 PM (Eastern Standard Time, UTC-05:00)
Sorry I missed the end tag with previous post. (hangs head in shame)
Saturday, January 12, 2008 11:34:50 AM (Eastern Standard Time, UTC-05:00)
Was at codemash, enjoyed your F# talk. Had thought of a question as I was driving home.
You overwrote the (+) operator to (-); to go back, you overwrote (+) to (--).

My question is, how do you discard the overwritten operators? I.e., if you then overwrote (-) to (*), would your (+) operator get messed up inadvertently?

Again, Very good presentation.
Steve Bak
Saturday, January 12, 2008 11:48:46 AM (Eastern Standard Time, UTC-05:00)
Hi Steve,

I'm glad that you enjoyed the talk. In the session, we didn't actually overwrite the operators. Instead, we were declaring a new operator with the same name that now has scope.

let inline (+) x y = x - y;;

Once I type the above code into the F# console, there is now an operator called "+" in scope. That means any future uses of "+" will actually perform subtraction. To test this, just define a "-" operator like you suggested:

let inline (-) x y = x * y;;

Now, try using the "+" operator:

3 + 2;;

The F# console will print 1 instead of 6. In other words, declaring a new function or operator with the same name doesn't actually replace the old one--it just steals scope.
Comments are closed.