Monday, March 03, 2008
Help! I've painted myself into a corner. While writing articles for this series, I try very hard to introduce only a little F# syntax at a time. It is a personal goal of mine not to use syntax that hasn't already been introduced in a previous article. Because of this, my hands are now tied. You see, I have some very cool articles in the queue, but I simply cannot post them until I've introduced (what is arguably) the most fundamental data structure in functional programming: the list.

In functional programming (and F#), lists are actually linked lists of the singly-linked variety. As a classic data structure, the linked list should be familiar to any programmer. In the everyday imperative world, a linked list is simply a group of nodes (or "cells"). Each node represents a value in the list and contains a pointer to the next node. The main benefit of a linked list is that insertion and removal are, asymptotically, O(1) operations. ([ed.] The use of large computer science terms helps the author to feel smarter than he actually is.) In other words, insertion and removal are constant time operations whose performance is not affected by the number of items in the list.

The lists of functional programming are very different from their imperative cousins. For instance, functional lists are recursive data structures.1 A functional list is really just a value (or the "head") and another list (or the "tail"). Consider a list containing the elements 1, 2, and 3. In the functional world, that would be a list with 1 as its head, and a tail containing a list with 2 as its head, and a tail containing a list with 3 as its head, and a tail containing the special "empty list"—which has no head or tail. Did you get all that? No? Well, a picture is worth a thousand recursive words:

Simple list (recursive structure)

In the above diagram, lists are represented by boxes, and their heads are represented by circles.2 The empty list is represented by the square containing the special value, []. Because all of those boxes are pretty cumbersome to draw, we'll use diagrams like the one below. However, the diagrams are equivalent.

Simple list

Make sense so far? OK, enough jibber-jabber! Let's see some syntax.

> 1::2::3::[];;

val it : int list = [1; 2; 3]

As you can see, the F# syntax is nearly identical to the diagrams above. We even have to append the empty list explicitly. In fact, the F# interactive environment complains if we forget.

> 1::2::3;;

  1::2::3;;
  ------^^

stdin(16,6): error: FS0001: This expression has type
        int
but is here used with type
        int list
stopped due to error

Thankfully, F# provides a more compact syntax for declaring lists. Just place the contents inside of square brackets, separated by semi-colons—no empty list required.

> [1;2;3];;

val it : int list = [1; 2; 3]

There are lots of other ways to declare lists in F#. Many of you will be pleased to know that range expressions are supported.

> [1..3];;

val it : int list = [1; 2; 3]

In addition, powerful list comprehensions are available.

> [for x in 1..3 -> x * x];;

val it : int list = [1; 4; 9]

As I stated earlier, the lists of functional programming (and hence, F# lists) are very different from imperative linked lists. Another fundamental difference is that F# lists are immutable. Once created, the contents of an F# list can't be changed3—that is, nothing can be added or removed.

Wait. Stop. Didn't I state at the beginning of this very article that the primary benefit of linked lists is fast insertion and removal? If a list can't be changed, haven't we lost the primary motivation for using a linked list in the first place? Well, yes and no.

If we were hoping to use an F# list like an imperative linked list, immutability is deal-breaker.4 However, if we use an F# list in a more functional style, our goals are different, and immutability actually helps us achieve those goals. One primary goal of functional programming is to avoid side effects—e.g., when a function modifies some bit of state in addition to returning the value of a calculation. If values are immutable, many side effects aren't even possible. However, it is possible to perform basic operations with an immutable list. Such operations (e.g., insertion and removal) return a new list. Let's look at a simple example: appending two lists.

Appending lists in F# is trivial. In fact, F# even provides a special @ operator to do the trick.

> [1;2;3] @ [4;5;6];;

val it : int list = [1; 2; 3; 4; 5; 6]

You see? Trivial.

OK, let's define a couple of lists.

> let first = [1;2;3];;

val first : int list

> let second = [4;5;6];;

val second : int list

At this point, our lists look like the following diagrams:

Simple lists (before append)

Now, let's append the two lists, creating a new list.

> let combined = first @ second;;

val combined : int list

> combined;;

val it : int list = [1; 2; 3; 4; 5; 6]

So, what do our lists look like now?

(Downshiftng to the imperative world...)

In the imperative world, linked lists support mutation. If we append two linked lists, the result must be a new list containing a copy of every node. The new list cannot share nodes with the original two lists. Why? Because node sharing would mean that any mutation to the original lists would mutate the new list.

(Shifting gears back to the functional world...)

In the functional world, lists are immutable. This means that node sharing is possible because the original lists will never change. Because the first list ends with the empty list, its nodes must be copied in order to point its last node to the second list. After the append operation, our lists look like so:

Simple lists (after append)

At this point, the more skeptical among you might be saying, "Well, that's a pretty interesting theory, but can you prove it?"

No problem.

Using the knowledge that F# lists are recursive, we can retrieve the last half of combined (the inner list starting at 4) by taking the tail, of its tail, of its tail. List.tl is the function that F# provides for extracting a list's tail.

> let lastHalf = List.tl (List.tl (List.tl combined));;

val lastHalf : int list

> lastHalf;;

val it : int list = [4; 5; 6]

Finally, because F# is first-class citizen of the .NET Framework, we have full access to all of the base class libraries. So, we can use the Object.ReferenceEquals method to test whether or not lastHalf and second are indeed the same instance.

> System.Object.ReferenceEquals(lastHalf, second);;

val it : bool = true

And there you have it. Believe it or not, appending two immutable lists can actually be faster and more memory efficient than appending mutable lists because fewer nodes have to be copied.

Hopefully this is enough to whet your appetites for more information. If so, Nate Hoellein has a series of posts that explore many of the facets of F# lists and the libraries supporting them. Check out his posts here, here and here.

1The recursive structure of lists in functional programming was discussed in my mind-twisting article, Building Data Out Of Thin Air.
2It might be helpful to visualize the diagram without arrows.
3To be fair, F# lists don't enforce any sort of "deep" immutability. Since F# is a multi-paradigm language that fully supports imperative and object-oriented programming, it is certainly feasible to stuff an F# list full of mutable objects.
4If you really want to use a mutable linked list in F#, you don't have to look any further than the .NET Framework. Just use the System.Collections.Generic.LinkedList<T> class.

posted on Monday, March 03, 2008 6:24:53 AM (Pacific Standard Time, UTC-08:00)  #    Comments [3]

kick it on DotNetKicks.com