Tuesday, May 06, 2008

Project Euler problem six is another easy one.

The sum of the squares of the first ten natural numbers is,

12 + 22 + ... + 102 = 385

The square of the sum of the first ten natural numbers is,

(1 + 2 + ... + 10)2 = 552 = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

The solution to this problem boils down to a few folding operations and a map. The one-liner is below.

List.fold_left (+) 0 [1..100] * List.fold_left (+) 0 [1..100] - List.fold_left (+) 0 (List.map (fun x -> x * x) [1..100])

Pretty nasty, eh? Quite a bit of code duplication can be removed. Since they're identical, let's generalize all of the folds first by extracting them to a sum function.

let sum lst = List.fold_left (+) 0 lst

sum [1..100] * sum [1..100] - sum (List.map (fun x -> x * x) [1..100])

That already looks a lot better.

Next, we can generalize the multiplication operations. Each time multiplication occurs in the solution above, it's simply squaring a value. So, we can extract those operations into a square function.

let square x = x * x

square (sum [1..100]) - sum (List.map (fun x -> square x) [1..100])

We can simplify that even further. Because the anonymous function passed to List.map just applies its argument to the square function, we can pass square directly.

square (sum [1..100]) - sum (List.map square [1..100])

Next, let's generalize the call to List.map that produces a list of squares by moving it to a new function, squares.

let squares lst = List.map square lst

square (sum [1..100]) - sum (squares [1..100])

At this point, we have a perfectly acceptable solution. It states the problem almost like natural English: "The square of the sum of 1 to 100 minus the sum of the squares of 1 to 100." So, why are there a few more inches left in this article? Well, I'd like to take this a step further.

Thinking more abstractly, what does our solution do? It computes the difference of two calculations that are based on the same list. We can extract this general process to a new function like so:

let difference f1 f2 lst = f1 lst - f2 lst

difference (fun l -> square (sum l)) (fun l -> sum (squares l)) [1..100]

It turns out that we can simplify these anonymous functions in the same way that we did with the square function earlier. However, because there are two functions involved in each calculation, we must compose the functions together. In F#, there are two operators used to perform function composition: the >> operator, which applies the functions from left to right, and the << operator, which applies the functions from right to left. Obviously, we need the latter.

difference (square << sum) (sum << squares) [1..100]

After using the forward pipe operator to move the list to the front, we're finished.

[1..100] |> difference (square << sum) (sum << squares)

"Take the numbers 1 to 100 and find the difference of the square of the sum and the sum of the squares."

Function composition is beautiful.

posted on Tuesday, May 06, 2008 6:21:26 AM (Eastern Standard Time, UTC-05:00)  #    Comments [3]

kick it on DotNetKicks.com

A few days ago, I presented a solution for Project Euler problem four that I didn't really like. The challenge of problem four is to write a function that determines whether a number is a palindrome, that is, whether it reads the same backward as forward. When presented with that challenge, I took an approach that I feel is a bit of a cop-out: converting the number to a string, reversing the string and comparing the result. This felt somehow wrong because I'm not really solving the problem in a mathematical way. So, I'm declaring a mulligan. Below is a new function which properly performs the math necessary to reverse a base-10 number.

let reverse n =
  let rec loop x res =
    if x = 0 then res
    else loop (x/10) (res*10 + (x%10))

  loop n 0

let isPalindrome n =
  n = reverse n

Our original list comprehension below still works properly with the new isPalindrome function.

[ for x in 100..999
    for y in 100..999
      when isPalindrome(x*y) -> x*y ] |> toLargest

This solution is twice as fast as the original string-based solution. In addition, I'd argue that the tail-recursive loop is at least four times as beautiful. :-)

posted on Tuesday, May 06, 2008 6:21:13 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]

kick it on DotNetKicks.com
 Monday, May 05, 2008

At first glance, Project Euler problem five looks like a walk in the park:

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?

Sounds easy! The most straightforward solution is to take the sequence of all natural numbers, filter those that are evenly divisible by 1 through 20, and pop off the first element of the sequence (the head). Something like the code below would do the trick.

{ 1L .. Int64.max_int }
  |> Seq.filter (fun n ->
       [1L .. 20L] |> List.for_all (fun d -> n % d = 0L))
  |> Seq.hd

Unfortunately, that solution, while direct, falls far outside of Project Euler's "one-minute rule." It eventually calculates the correct answer but takes as much as 10 minutes on my machine!

OK, let's take a step back. What exactly is the problem asking us to find? Stating it differently, "What is the least common multiple of all of the numbers from 1 to 20?"

The least common multiple (LCM) of two numbers is the smallest number that is evenly divisible by each. Still not familiar? Think about how fractions are added. The first step in adding fractions is to find the least common denominator, which is simply the LCM of the denominators. For example, given the fractions 1/8 and 1/12, we would find the LCM of 8 and 12. Then, the fractions would be rewritten with the LCM (24) as their denominators. Once this is done, we can easily add the fractions 3/24 and 2/24 to get the answer, 5/24.

So, how should we go about calculating the LCM of two numbers? It turns out that there are many well-known possibilities. One of the most popular methods involves finding the prime factors of both numbers. It goes something like this:

Suppose we wanted to find the least common multiple of 160 and 90. First, we would write out the prime factors of each:

160 = 25 * 51
90 = 21 * 32 * 51

The least common multiple can be computed by multiplying the highest power of each unique factor.

lcm(160,90) = 25 * 32 * 51 = 1440.

Many people have chosen this method when working through problem five, and I was tempted to take this road as well because it would allow us to reuse our primeFactors function from problem three. However, the code would be fairly complex.

let countItems lst =
  let incrCount m i =
    match Map.tryfind i m with
    | Some(c) -> Map.add i (c+1) m
    | None -> Map.add i 1 m

  lst |> List.fold_left incrCount Map.empty |> Map.to_list

let lcm x y =
  let rec updateMap m t =
    let i,c = t
    match Map.tryfind i m with
    | Some(v) when v < c -> Map.add i c m
    | None -> Map.add i c m
    | _ -> m

  let factors =
    [x; y]
    |> List.map primeFactors
    |> List.map countItems
    |> List.fold_left (List.fold_left updateMap) Map.empty

  Map.fold (fun i c res -> res * int64 (float i ** float c)) factors 1L

Personally, I feel a sense of accomplishment at writing all of those folds—particularly the double-fold near the end, that's really cool. :-) However, it's pretty far below my standard for code beauty. If you recall, I'm trying to present the most beautiful solution that I can. So, I'm rejecting this solution, even though it's efficient enough to meet Project Euler's requirements. Admittedly, there's a certain beauty in the list transformations, but there's a much better method.

The least common multiple of two numbers can be calculated quite simply using their greatest common divisor (GCD), or the largest number that divides evenly into both numbers. The GCD can be computed easily with the Euclidean algorithm. Here's how it works:

  1. Start with 2 natural numbers, x and y
  2. If y is equal to zero, the answer is x.
  3. If not, set x to the value of y, and y to the remainder of dividing x by y.
  4. Go back to step 2.

For the more visual among you, a flowchart of the Euclidean algorithm is pictured below.

EuclideanAlgorithm

Once we have the GCD, calculating the LCM is easy. Simply divide x by the GCD of x and y, and multiply the result by y. These two algorithms can be implemented quite beautifully in F#.

let rec gcd x y =
  if y = 0 then x
  else gcd y (x%y)

let lcm x y =
  if x = 0 or y = 0 then 0
  else (x / (gcd x y)) * y

However, the F# libraries already supply a function to calculate the GCD of two numbers. The greatest common denominator also goes by another name, highest common factor (HCF), and there is an HCF function in the Microsoft.FSharp.Math.BigInt module. It's a simple matter to rewrite lcm using BigInt.hcf.

open Microsoft.FSharp.Math

let lcm x y =
  if x = 0I or y = 0I then 0I
  else (x / (BigInt.hcf x y)) * y

With lcm in place, would you believe that our solution looks like this?

[1I..20I] |> List.reduce_left lcm

F# can produce truly beautiful code indeed!

posted on Monday, May 05, 2008 8:12:44 AM (Eastern Standard Time, UTC-05:00)  #    Comments [6]

kick it on DotNetKicks.com
 Friday, May 02, 2008

Yet Another Project Euler Series (YAPES) continues with problem four:

A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.

Find the largest palindrome made from the product of two 3-digit numbers.

The most straightforward way to determine if a number is palindromic is to convert it to a string and compare that string with its reverse. Sound easy? It is!

One minor snag is the lack of a library function in F# for reversing strings, but that's easily defined like so:

module String =
  let rev (s : string) = new string(s.ToCharArray() |> Array.rev)

With String.rev in place, writing an isPalindrome function is trivial.

let isPalindrome n =
  let text = Int32.to_string n
  text = String.rev text

Using a list comprehension, we can generate all of the palindromes that are products of 3-digit numbers. Once we have this list, producing the result is as simple as passing it to the toLargest function that we defined for Problem Three.

[ for x in 100..999
    for y in 100..999
      when isPalindrome (x*y) -> x*y ] |> toLargest

Short and sweet—my favorite!

posted on Friday, May 02, 2008 7:49:17 AM (Eastern Standard Time, UTC-05:00)  #    Comments [1]

kick it on DotNetKicks.com
 Thursday, May 01, 2008

Project Euler problem three is first of many to deal with prime numbers.

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143?

Eventually, we'll need a prime number generator to solve some of the more advanced problems, but this problem can be solved efficiently without one. The number in question is small enough (just 12 digits) that the divide-and-conquer method that many of us learned in elementary school will suffice.

Consider how we might use this process to find the prime factors of 140.

140 Is 140 evenly divisible by 2? Yes! Remember 2 and divide 140 by 2.
2 * 70 Is 70 evenly divisible by 2? Yes! Remember 2 and divide 70 by 2.
2 * 2 * 35 Is 35 evenly divisible by 2? No, how about 3? No. 4? Nope. 5? Yes! Remember 5 and divide 35 by 5.
2 * 2 * 5 * 7 And we're done!

This method isn't rocket science, but it gets the job done. In fact, it's pretty fast for reasonably small numbers. After all, we're not trying to find the factors of RSA-200. :-)

The basic algorithm is pictured as a flowchart below.

PrimeFactorization

The following F# function implements our algorithm.

let primeFactors n =
  let inline isFactor n d = n % d = 0L

  let rec nextFactor n d =
    let x = if d = 2L then 3L else d+2L
    if isFactor n x then x else nextFactor n x

  let rec findFactors n d acc =
    if isFactor n d then
      findFactors (n/d) d (d::acc)
    elif n > d then
      findFactors n (nextFactor n d) acc
    else
      acc

  findFactors n 2L [] |> List.rev

To the uninitiated, that function might look pretty complex. In reality, it's extremely simple, but three other functions are nested inside of it. Let's look at each nested function in turn.

let inline isFactor n d = n % d = 0L

There's nothing tricky about isFactor. It simply abstracts the modulo operation that determines whether n is evenly divisible by d.

let rec nextFactor n d =
  let x = if d = 2L then 3L else d+2L
  if isFactor n x then x else nextFactor n x

nextFactor recursively determines the next value of d to be used in the algorithm. There is a small optimization here: nextFactor only produces odd numbers. Since 2 is the only even prime, why bother checking any other evens?

let rec findFactors n d acc =
  if isFactor n d then
    findFactors (n/d) d (d::acc)
  elif n > d then
    findFactors n (nextFactor n d) acc
  else
    acc

The meat of the algorithm is handled by findFactors. Any factors found are cons'd up with the accumulator variable, acc. Note that both findFactors and nextFactor are written tail-recursively, so they can be optimized by the compiler to conserve stack space.

The real body of primeFactors kicks off the recursion:

findFactors n 2L [] |> List.rev.

The result of findFactors is passed to List.rev to return the prime factors in a more logical order (smallest to largest).

A simple test in the F# Interactive Environment shows that primeFactors works as expected.

> primeFactors 140L;;

val it : int64 list = [2L; 2L; 5L; 7L]

Almost done.

Project Euler Problem Three asks, "What is the largest prime factor of the number 600851475143?" That's just a matter of folding the list of prime factors with the max function (from the F# libraries) to get the answer.

primeFactors 600851475143L |> List.fold1_left max

We can generalize the folding logic above with a new function...

let toLargest l = List.fold1_left max l

...And now we can write the following solution.

primeFactors 600851475143L |> toLargest

That's just lovely.

NeRd Note
Eagle-eyed readers might have noticed that the problem could have been solved several inches ago. If primeFactors didn't reorder its results from smallest to largest, the solution to the problem would be in the head of the result list!
primeFactors 600851475143L |> List.hd
However, that solution has some very real consequences. First of all, primeFactors won't return its results in the most logical order, which limits its reusability. Secondly, the intent of the code isn't stated as clearly. And finally, it's a leaky abstraction because the solution relies upon intimate knowledge of how primeFactors returns its results. If primeFactors were changed later, the solution would be broken!
posted on Thursday, May 01, 2008 9:55:56 AM (Eastern Standard Time, UTC-05:00)  #    Comments [2]

kick it on DotNetKicks.com
 Friday, April 25, 2008

Today, I'm tackling Project Euler problem two in F#:

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

Find the sum of all the even-valued terms in the sequence which do not exceed four million.

Like problem one, this is pretty trivial. It's a simple matter of filtering even numbers, taking only the numbers less than some value and folding to get the sum. However, a couple of supporting cast members need to be introduced.

First, I need a way to generate the Fibonacci sequence. There are several ways to calculate Fibonacci numbers (including a clever method that takes advantage of the relationship between Fibonacci numbers and the Golden Ratio). However, I'll heed the heavy hints of the Project Euler problem description above and go with the iterative approach.

My first stab at writing a Fibonacci sequence generator in F# is simply an imperative approach, similar to what I might write in C#, wrapped in an F# sequence expression.

let fibs =
  seq {
        let i = ref 1
        let j = ref 2
        while true do
          let x = !i
          yield x
          do i := !j
          do j := !j + x
      }

In fact, that looks remarkably similar to how I might write a Fibonacci generator as a C# iterator.

static IEnumerable<int> Fibs
{
  get
  {
    int i = 1;
    int j = 2;
    while (true)
    {
      int x = i;
      yield return x;
      i = j;
      j = j + x;
    }
  }
}

The F# and C# Fibonacci generators above are functionally equivalent. The obvious syntactic difference is that the F# version uses reference cells to support mutable variables. Because it uses reference cells, the F# version inherits a bit of operator baggage that might looks strange to C# developers. Most confusing is the unary ! operator, which is used to retrieve the value from a reference cell (i.e., i is a reference cell containing an int, and !i is the int contained within). This will likely look bizarre to many programmers used to C-style syntax where the unary ! operator is used to negate its operand.

NeRd Note
While the C# and F# Fibonacci sequence generators above look essentially the same, they're implemented very differently under the covers. The C# iterator compiles to a class implementing IEnumerable<int> that works like a little state machine. However, the F# sequence expression is compiled as a series of continuations.
let fibs =
  Seq.delay (fun () ->
    let i = ref 1
    let j = ref 2
    Seq.generated
      (fun () -> true)
      (Seq.delay (fun () ->
        let x = !i
        Seq.append
          (Seq.singleton x)
          (Seq.delay (fun () ->
            i := !j
            j := !j + x
            Seq.empty)))))
It's OK if your brain hurts.

I dislike the F# sequence expression approach above for one reason: it seems like a cop-out. It works fine, but it just feels wrong. There has to be a more declarative way to generate the Fibonacci sequence. Fortunately, there is! I can use the Seq.unfold function like so:1

let fibs =
  Seq.unfold (fun (i,j) -> Some(i,(j,i+j))) (1,2)

The generator function passed to Seq.unfold uses a tuple to represent both values needed to iterate the Fibonacci sequence. We can verify that this works properly using the F# Interative Environment.

> fibs |> Seq.take 10;;

val it : int list = [1; 2; 3; 5; 8; 13; 21; 34; 55; 89]

OK. Almost done. I just need a way to take values from the Fibonacci sequence that are less than or equal to four million, and then stop. Effectively, I need something like the LINQ TakeWhile query operator. If I had that, I could use it similarly to the following C#.

foreach (int n in Fibs.TakeWhile(n => n <= 4000000)
  Console.WriteLine(n);

I looked through the F# libraries for a function like TakeWhile but couldn't find one. Either I missed it, or it just isn't there (Don?). Fortunately, this function is trivial to define in F# with a sequence expression. In fact, it's the perfect opportunity to use a sequence expression because the code must interact with an IEnumerator<int>, which has an inherently imperative programming model.

module Seq =
  let
takeWhile f (ie : #seq<'a>) =
    seq { use e = ie.GetEnumerator()
          while e.MoveNext() && f e.Current do
            yield e.Current }

It took a little while to get here, but now I'm ready to solve Project Euler problem two. To restate, we're looking for the sum of the even-valued terms from the Fibonacci sequence that are less than or equal to four million. No problem!

fibs
  |> Seq.filter (fun n -> n % 2 = 0)
  |> Seq.takeWhile (fun n -> n <= 4000000)
  |> Seq.fold (+) 0

As stated last time, I want to present the most beautiful solution that I can. To me, that means the solution should be concise, and it should read like natural language. As before, we can achieve this by abstracting some of the logic into functions whose names better indicate the intent.

let isEven n = n % 2 = 0

let isLessThanOrEqualTo x = (fun n -> n <= x)

let sum s = Seq.fold (+) 0 s

With these functions defined, we can rewrite the solution like so:

fibs
  |> Seq.filter isEven
  |> Seq.takeWhile (isLessThanOrEqualTo 4000000)
  |> sum

The beauty of this code is that it simply states the original problem:

Find the sum of all the even-valued terms in the (Fibonacci) sequence which do not exceed four million.

F# takes care of the rest.

1For those curious about how Seq.unfold works, check out my Apples and Oranges post. For fun, try generating the Fibonacci sequence in C# using the Unfold function presented in that article.

posted on Friday, April 25, 2008 12:44:24 PM (Eastern Standard Time, UTC-05:00)  #    Comments [2]

kick it on DotNetKicks.com
 Thursday, April 24, 2008

For the past several months, I've been using F# to solve at least two Project Euler problems each week. I find this is a great way to sharpen my math skills and my F# skills simultaneously. If you're looking for a way to flex your programming muscles, you really should check out Project Euler.

Last week at the MVP Summit, my friend, Bill Wagner, pressured me to suggested that I post some of my solutions. Now, there are already plenty of smart people posting F# solutions to the Project Euler problems. That's why I've resisted starting my own series: I'm not certain that I have anything new to say on the topic. However, Bill was very convincing (especially when he mentioned that I would be starting a series to a couple hundred C# MVPs).

So, here's the deal. I will try to present the most beautiful solution that I can. Naturally, beauty is in the eye of the beholder, so you might disagree with me. That's OK. Just make certain to let me know why you disagree so that I can grow as a developer. If anything, this about learning to be a better programmer.

Let's get started.

Problem One

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

Obviously, Project Euler starts with some easy problems and moves up to harder fare. The first problem is pretty trivial. It's even reminiscent of the famous FizzBuzz question.

In F#, a list of all multiples of 3 or 5 less than 1000 can be created using a list comprehension. Once we have the list, it can be summed by folding with the addition operator.

[ for x in 1 .. 999 when x % 3 = 0 or x % 5 = 0 -> x ]
  |> List.fold_left (+) 0

The power of F# (and functional programming in general) is its ability to express problems in a more declarative way. This lends the language to mathematical problems very naturally. Looking at the solution above, there are some obvious changes that could make it more succinct. First, the duplicated modulo operation can be abstracted away by declaring a new operator. (Did I mention that F# allows you to declare new operators?) Second, we can extract the folding logic to a new function that better describes its intent.

let inline (/:) x y = x % y = 0 // evenly-divisible by...

let sum list = List.fold_left (+) 0 list

With these defined, we can express the solution more cleanly.

[ for x in 1 .. 999 when x /: 3 or x /: 5 -> x ] |> sum

That's beautiful.

posted on Thursday, April 24, 2008 3:37:46 PM (Eastern Standard Time, UTC-05:00)  #    Comments [3]

kick it on DotNetKicks.com
 Thursday, March 20, 2008

A few days ago, my friend Michael Letterle (the artist formerly known as Michael.NET) twat the following tweet:

MichalL-20YearsToLate

The story Michael referred to is Landon Dyer's "Donkey Kong and Me" blog post, which chronicles his conversion of the Donkey Kong arcade game to the 8-bit Atari 400/800 systems. (Screenshots and a review of the port can be found here.) A fascinating yarn, Dyer's post evokes a feeling of nostalgia for the swashbuckling coder days of more than two decades ago. His recent post about the development of the Atari ST is equally enjoyable.

I've often shared Michael's sentiment. Sometimes, I feel like I was born a bit too late. At the advanced age of 0x20, I am fascinated by stories of the Herculean coding efforts of those who came before me—the original early adopters. (Although, there's a strong argument that the present day is just as, if not more, exciting.) Perhaps the most interesting aspect of tech history is how our forefathers were forced to invent creative solutions for just about everything. For me personally, that's what makes "Donkey Kong and Me" so much fun. The same appeal can be found in the early-Macintosh hardware-tweaking stories at Andy Hertzfeld's Folklore.org.

To fuel my interest in computer tech history, I've recently begun re-reading its bible: Programmers At Work.

ProgrammersAtWork2

Published in 1986, this book features interviews with an amazing array of programmers, including figures like Gary Kildall, Charles Simonyi, Jaron Lanier and even Bill Gates. It's out-of-print but can still be purchased used. (I "borrowed" my water-damaged copy from my father's bookshelf). Thankfully, Susan Lammers, the author, has recently started a "Programmers At Work" blog where she's posting the original interviews. So, if you can't find the book, these classic interviews should all be available soon.

What interesting tech history articles or books have you read recently?

posted on Thursday, March 20, 2008 12:39:47 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]

kick it on DotNetKicks.com
 Friday, March 14, 2008

Recently, I was refactoring some trivial F# code, and the results were so elegant that I felt it would be instructive to share them. My tale begins simply with a list of lists...

> let lists = [[1;2];[5;6;7];[9;10];[3;4];[8]];;

val lists : int list list

Now, suppose we wanted to sort lists by the lengths of the inner lists. How might we do that? Easy! The F# libraries include a List.sort function which does the trick.

val sort: ('a -> 'a -> int) -> 'a list -> 'a list

List.sort takes two arguments. The first argument is a function used to compare elements from the list, and the second argument is the list to be sorted. Obviously, most of the work is in defining the first argument. This comparison function returns a negative value if the first element is less than the second, a positive value if the second element is less than the first, or zero if the two elements are, in fact, equal. With that in mind, we could sort lists using List.sort and List.length like so:

> List.sort (fun x y -> if (List.length x) < (List.length y) then -1
-                       elif (List.length x) > (List.length y) then 1
-                       else 0) lists;;

val it : int list list = [[8]; [1; 2]; [9; 10]; [3; 4]; [5; 6; 7]]

OK. It worked, but that's an awful lot of code. Typing all of that into the F# Interactive Environment is fraught with peril ([ed.] the author spelled "length" as "lentgh" at least twice).

Thankfully, F# provides a function, compare, which can be used to calculate a generic comparison of two arguments.

val inline compare: 'a -> 'a -> int

compare can do most of the heavy lifting and greatly decreases the amount of code we have to write.

> List.sort (fun x y -> compare (List.length x) (List.length y)) lists;;

val it : int list list = [[8]; [1; 2]; [9; 10]; [3; 4]; [5; 6; 7]]

That's much better!

NeRd Note
Did you know that the .NET Framework also provides an API for generic comparison? For types that implement IComparable or IComparable<T>, System.Collections.Generic.Comparer<T>.Compare() can handle the dirty work!
int CompareGuids(Guid x, Guid y)
{
  return Comparer<Guid>.Default.Compare(x, y);
}

Our sort is looking pretty good, but we can do better. Let's take a closer look at the comparison function we're passing to List.sort.

(fun x y -> compare (List.length x) (List.length y))

What exactly are we doing here? Essentially, we're inserting a function application for each argument before calling compare. It sure would be nice to have a function that generalizes this for us. Perhaps something like this:

let inline compareWith f x y = compare (f x) (f y)

I can already sense the snickers. Some of you are thinking, "How could that possibly work? There aren't any types! How would F#'s statically-typed compiler handle that?"

The answer to my hecklers is yet another reason why I love F#: automatic generalization. If necessary, F# will attempt to insert generic type parameters into a function as part of its type inference. This allows very sophisticated code to be written with breathtaking succinctness. The following code shows automatic generalization in action.

> let inline compareWith f x y = compare (f x) (f y);;

val inline compareWith : ('a -> 'b) -> 'a -> 'a -> int

As you can see, F# allows us to define the essence of a function without the noise of type annotations. It looks very similar to code written in dynamically-typed languages, but has all of the benefits of static-typing.

Armed with our new compareWith function (any chance of getting that into the libraries Don?), we can sort lists using List.length like so:

> List.sort (fun x y -> compareWith List.length x y) lists;;

val it : int list list = [[8]; [1; 2]; [9; 10]; [3; 4]; [5; 6; 7]]

But wait! There's more!

I intentionally inserted what I consider to be a sophomoric blunder in that last bit of code. Try to find it. Notice that both of the parameters of our anonymous comparison function are passed, in order, as the last two arguments of compareWith. That's a big clue. Here's another. Consider the signatures of List.sort and comparewith. I'll highlight the interesting bits.

val sort: ('a -> 'a -> int) -> 'a list -> 'a list

val inline compareWith : ('a -> 'b) -> 'a -> 'a -> int

Do you see it? compareWith returns a function whose signature matches the signature of the comparison function expected by List.sort. In essence, the anonymous function is an extra "function layer" that really isn't necessary. Instead, we could write this1:

> List.sort (compareWith List.length) lists;;

val it : int list list = [[8]; [1; 2]; [9; 10]; [3; 4]; [5; 6; 7]]

This an excellent example of the benefits of currying and partial application (and yet another reason why I love F#). If you need to brush up, I've written about these topics in the past here, here and here.

There is one final bit of refactoring that I'd like to do. Notice how lists appears at the very end of the argument list for List.sort. We can make the code more readable by moving lists ahead of List.sort and using the forward pipe operator (|>) like so:

> lists |> List.sort (compareWith List.length);;

val it : int list list = [[8]; [1; 2]; [9; 10]; [3; 4]; [5; 6; 7]]

Now the code reads like an English sentence:

"Take lists and sort it, comparing with List.length."

It's a tiny jump to see that other functions could be easily used to sort lists. For example, we might sort using the head of each inner list (assuming that none of the inner lists is the empty list).

> lists |> List.sort (compareWith List.hd);;

val it : int list list = [[1; 2]; [3; 4]; [5; 6; 7]; [8]; [9; 10]]

Or, we could sort lists by the sum of each inner list (using List.fold_left with the + operator to perform the sum).

> lists |> List.sort (compareWith (List.fold_left (+) 0));;

val it : int list list = [[1; 2]; [3; 4]; [8]; [5; 6; 7]; [9; 10]]

The possibilities are endless!

Next time, we'll take a closer look at the wickedly clever forward pipe operator to see how its very existence hangs upon currying.

1If you don't immediately see why this works, try again. Work it out on paper. The reward is worth the effort.

posted on Friday, March 14, 2008 1:18:34 PM (Eastern Standard Time, UTC-05:00)  #    Comments [1]

kick it on DotNetKicks.com
 Monday, March 03, 2008
Help! I've painted myself into a corner. While writing articles for this series, I try very hard to introduce only a little F# syntax at a time. It is a personal goal of mine not to use syntax that hasn't already been introduced in a previous article. Because of this, my hands are now tied. You see, I have some very cool articles in the queue, but I simply cannot post them until I've introduced (what is arguably) the most fundamental data structure in functional programming: the list.

In functional programming (and F#), lists are actually linked lists of the singly-linked variety. As a classic data structure, the linked list should be familiar to any programmer. In the everyday imperative world, a linked list is simply a group of nodes (or "cells"). Each node represents a value in the list and contains a pointer to the next node. The main benefit of a linked list is that insertion and removal are, asymptotically, O(1) operations. ([ed.] The use of large computer science terms helps the author to feel smarter than he actually is.) In other words, insertion and removal are constant time operations whose performance is not affected by the number of items in the list.

The lists of functional programming are very different from their imperative cousins. For instance, functional lists are recursive data structures.1 A functional list is really just a value (or the "head") and another list (or the "tail"). Consider a list containing the elements 1, 2, and 3. In the functional world, that would be a list with 1 as its head, and a tail containing a list with 2 as its head, and a tail containing a list with 3 as its head, and a tail containing the special "empty list"—which has no head or tail. Did you get all that? No? Well, a picture is worth a thousand recursive words:

Simple list (recursive structure)

In the above diagram, lists are represented by boxes, and their heads are represented by circles.2 The empty list is represented by the square containing the special value, []. Because all of those boxes are pretty cumbersome to draw, we'll use diagrams like the one below. However, the diagrams are equivalent.

Simple list

Make sense so far? OK, enough jibber-jabber! Let's see some syntax.

> 1::2::3::[];;

val it : int list = [1; 2; 3]

As you can see, the F# syntax is nearly identical to the diagrams above. We even have to append the empty list explicitly. In fact, the F# interactive environment complains if we forget.

> 1::2::3;;

  1::2::3;;
  ------^^

stdin(16,6): error: FS0001: This expression has type
        int
but is here used with type
        int list
stopped due to error

Thankfully, F# provides a more compact syntax for declaring lists. Just place the contents inside of square brackets, separated by semi-colons—no empty list required.

> [1;2;3];;

val it : int list = [1; 2; 3]

There are lots of other ways to declare lists in F#. Many of you will be pleased to know that range expressions are supported.

> [1..3];;

val it : int list = [1; 2; 3]

In addition, powerful list comprehensions are available.

> [for x in 1..3 -> x * x];;

val it : int list = [1; 4; 9]

As I stated earlier, the lists of functional programming (and hence, F# lists) are very different from imperative linked lists. Another fundamental difference is that F# lists are immutable. Once created, the contents of an F# list can't be changed3—that is, nothing can be added or removed.

Wait. Stop. Didn't I state at the beginning of this very article that the primary benefit of linked lists is fast insertion and removal? If a list can't be changed, haven't we lost the primary motivation for using a linked list in the first place? Well, yes and no.

If we were hoping to use an F# list like an imperative linked list, immutability is deal-breaker.4 However, if we use an F# list in a more functional style, our goals are different, and immutability actually helps us achieve those goals. One primary goal of functional programming is to avoid side effects—e.g., when a function modifies some bit of state in addition to returning the value of a calculation. If values are immutable, many side effects aren't even possible. However, it is possible to perform basic operations with an immutable list. Such operations (e.g., insertion and removal) return a new list. Let's look at a simple example: appending two lists.

Appending lists in F# is trivial. In fact, F# even provides a special @ operator to do the trick.

> [1;2;3] @ [4;5;6];;

val it : int list = [1; 2; 3; 4; 5; 6]

You see? Trivial.

OK, let's define a couple of lists.

> let first = [1;2;3];;

val first : int list

> let second = [4;5;6];;

val second : int list

At this point, our lists look like the following diagrams:

Simple lists (before append)

Now, let's append the two lists, creating a new list.

> let combined = first @ second;;

val combined : int list

> combined;;

val it : int list = [1; 2; 3; 4; 5; 6]

So, what do our lists look like now?

(Downshiftng to the imperative world...)

In the imperative world, linked lists support mutation. If we append two linked lists, the result must be a new list containing a copy of every node. The new list cannot share nodes with the original two lists. Why? Because node sharing would mean that any mutation to the original lists would mutate the new list.

(Shifting gears back to the functional world...)

In the functional world, lists are immutable. This means that node sharing is possible because the original lists will never change. Because the first list ends with the empty list, its nodes must be copied in order to point its last node to the second list. After the append operation, our lists look like so:

Simple lists (after append)

At this point, the more skeptical among you might be saying, "Well, that's a pretty interesting theory, but can you prove it?"

No problem.

Using the knowledge that F# lists are recursive, we can retrieve the last half of combined (the inner list starting at 4) by taking the tail, of its tail, of its tail. List.tl is the function that F# provides for extracting a list's tail.

> let lastHalf = List.tl (List.tl (List.tl combined));;

val lastHalf : int list

> lastHalf;;

val it : int list = [4; 5; 6]

Finally, because F# is first-class citizen of the .NET Framework, we have full access to all of the base class libraries. So, we can use the Object.ReferenceEquals method to test whether or not lastHalf and second are indeed the same instance.

> System.Object.ReferenceEquals(lastHalf, second);;

val it : bool = true

And there you have it. Believe it or not, appending two immutable lists can actually be faster and more memory efficient than appending mutable lists because fewer nodes have to be copied.

Hopefully this is enough to whet your appetites for more information. If so, Nate Hoellein has a series of posts that explore many of the facets of F# lists and the libraries supporting them. Check out his posts here, here and here.

1The recursive structure of lists in functional programming was discussed in my mind-twisting article, Building Data Out Of Thin Air.
2It might be helpful to visualize the diagram without arrows.
3To be fair, F# lists don't enforce any sort of "deep" immutability. Since F# is a multi-paradigm language that fully supports imperative and object-oriented programming, it is certainly feasible to stuff an F# list full of mutable objects.
4If you really want to use a mutable linked list in F#, you don't have to look any further than the .NET Framework. Just use the System.Collections.Generic.LinkedList<T> class.

posted on Monday, March 03, 2008 9:24:53 AM (Eastern Standard Time, UTC-05:00)  #    Comments [3]

kick it on DotNetKicks.com