Monday, March 09, 2009

Welcome back F# junkies! In this installment of YAPES1, we’re tackling Project Euler problem nine.

A Pythagorean triplet is a set of three natural numbers, a < b < c, for which,

a2 + b2 = c2

For example, 32 + 42 = 9 + 16 = 25 = 52.

There exists exactly one Pythagorean triplet for which a + b + c = 1000. Find the product abc.

Like we’ve done before, we’ll break this problem down into steps:

  1. Generate a stream of Pythagorean triples.
  2. Find the Pythagorean triple whose sum is equal to 1000.
  3. Calculate the product of that triple.

Sound good?

The first order of business is to determine how to generate Pythagorean triples. It turns out that there are many ways to do this. However, to keep it simple, we will use Euclid’s classic formula:

Given a pair of positive integers m and n with m > n

a = 2mn, b = m2n2, c = m2 + n2

Truthfully, this formula doesn’t actually produce every possible Pythagorean triple. However, since the solution is within the set that it generates, that really isn’t a problem.

NeRd Note
To produce every Pythagorean triple, we would have to introduce an additional parameter, k, into the formula.
a = 2kmn, b = k(m2n2), c = k(m2 + n2)
This is left as an exercise for you, dear reader!

First, we’ll declare an F# function that calculates a Pythagorean triple using Euclid’s formula.

let getTriple m n =
  let a = 2*m*n
  let b = m*m - n*n
  let c = m*m + n*n
  a,b,c

Using the above function, it’s easy to produce a stream of Pythagorean triples. Our strategy will be to start with the pair, m = 2 and n = 1. To generate the next pair, we’ll first check if n + 1 is less than m. If it is, we’ll use m and n + 1 as the next pair, otherwise we’ll use m + 1 and 1. Using Seq.unfold, we can use this algorithm to produce an infinite stream of Pythagorean triples:

let triples =
  Seq.unfold (fun (m,n) ->
                let res = getTriple m n
                let next = if n+1 < m then m,n+1 else m+1,1
                Some(res, next))
             (2,1)

With our generator in place, we can easily solve the problem.

triples
  |> Seq.find (fun (a,b,c) –> a+b+c = 1000)
  |> (fun (a,b,c) –> a*b*c)

While the above solution works well, we can abstract further to produce a more elegant solution. First, we’ll declare three functions to handle the three operations that we’re performing explicitly, sum, product and equals.

let sum (a,b,c) = a+b+c
let product (a,b,c) = a*b*c
let equals x y = x = y

Now, we can restate our solution using these functions with a bit of function composition to produce a truly beautiful solution.

triples
  |> Seq.find (sum >> equals 1000)
  |> product

As we’ve seen once before, function composition is beautiful.

1Yet Another Project Euler Series

posted on Monday, March 09, 2009 7:05:47 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]

kick it on DotNetKicks.com
 Saturday, March 07, 2009

Welcome coding ninjas!1 Continuing with Yet Another Project Euler Series (YAPES™), we come upon problem eight.

Find the greatest product of five consecutive digits in the 1000-digit number.

73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450

Being one of the earlier Project Euler problems, this is pretty trivial to solve. We’ll approach the problem by first breaking it into a set of simple steps:

  1. Transform the 1000-digit number into a list of digits.
  2. Transform the list of digits into a list of quintuplets—that is, a list containing tuples of each five consecutive elements.
  3. Transform the list of quintuplets into a list containing the product of each.
  4. Find the maximum element in the list.

Some or all of the steps above could be combined to produce a more optimal solution. However, the beauty of F#—and functional programming in general—is the ability to easily break a problem like this into a series of data transformations that run quickly enough for most situations. When a problem is broken down into simple operations, it is easier to optimize in other ways (e.g. parallelization).

The first step is to transform the 1000-digit number into a list of digits. To facilitate working with such a large number in code, we’ll represent it as a multi-line string. Like before, we’ll break the problem of converting a string into a list of digits into a set of simple steps:

  1. Transform the string into a list of chars.
  2. Because this was a multi-line string, some of the chars will be line breaks and whitespace. So, we’ll filter the list to include only the chars that represent digits.
  3. Transform the list of digit chars into a list containing the numerical value of each.

Let’s define a few library functions to to help with each step above.

module String =
  /// Takes a string and produces a list of chars.
  let toChars (s : string) =
    s.ToCharArray() |> Array.to_list

module Char =
  /// Determines whether a char represents a digit.
  let isDigit (c : char) =
    System.Char.IsDigit(c)

  /// Converts a char representing a digit into its numerical value.
  let toNumber (c : char) =
    int c - int '0'

Armed with these, we can perform the 3 steps above in a declarative fashion.

let number =
  "73167176531330624919225119674426574742355349194934
   96983520312774506326239578318016984801869478851843
   85861560789112949495459501737958331952853208805511
   12540698747158523863050715693290963295227443043557
   66896648950445244523161731856403098711121722383113
   62229893423380308135336276614282806444486645238749
   30358907296290491560440772390713810515859307960866
   70172427121883998797908792274921901699720888093776
   65727333001053367881220235421809751254540594752243
   52584907711670556013604839586446706324415722155397
   53697817977846174064955149290862569321978468622482
   83972241375657056057490261407972968652414535100474
   82166370484403199890008895243450658541227588666881
   16427171479924442928230863465674813919123162824586
   17866458359124566529476545682848912883142607690042
   24219022671055626321111109370544217506941658960408
   07198403850962455444362981230987879927244284909188
   84580156166097919133875499200524063689912560717606
   05886116467109405077541002256983155200055935729725
   71636269561882670428252483600823257530420752963450"
   |> String.toChars
   |> List.filter Char.isDigit
   |> List.map Char.toNumber

The next step is to transform the list of digits into a list of quintuplets. To achieve this, we’ll write a simple recursive function.

let rec toQuintuplets l =
  match l with
  | x1::(x2::x3::x4::x5::_ as t) -> (x1,x2,x3,x4,x5)::(toQuintuplets t)
  | _ -> []

Notice that the first pattern match above is slightly more complicated as we bind the list starting with the next element in the list to t, to make the recursion cleaner.

NeRd Note
The toQuintuplets function defined above will work well for this particular problem but will stack overflow if passed too large of a list because it is head-recursive. There are a couple of ways to make it tail-recursive and avoid blowing the stack, but the most elegant is to employ a continuation.
let toQuintuplets l =
  let rec loop l cont =
    match l with
    | x1::(x2::x3::x4::x5::_ as t) -> loop t (fun l –> cont ((x1,x2,x3,x4,x5)::l))
    | _ -> cont []

  loop l (fun x -> x)
Of course, that unnecessarily complicates this solution by turning the logic inside out!

The last function we’ll define is a small helper to produce the product of all of the values in a quintuplet.

let product (x1,x2,x3,x4,x5) = x1*x2*x3*x4*x5

All of the pieces are now in place, and the steps can be combined to solve Project Euler problem eight.

number
  |> toQuintuplets
  |> List.map product
  |> List.max

It’s that simple!

1Or pirates.

posted on Saturday, March 07, 2009 1:09:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]

kick it on DotNetKicks.com
 Saturday, February 14, 2009

As a Program Manager at Microsoft, I wear several different hats. First and foremost, I am the Visual Basic IDE Program Manager, whose job is to keep the powerful engine that is the VB IDE Team running smoothly. However, another hat I wear is that of the Debugger Experience PM, whose job is to ensure that C#, Visual Basic and, yes, F# programmers have a great debug-time experience. From the language perspective, this often comes down to making the right tweaks in the debugger expression evaluators.

So, what are these debugger expression evaluators? The expression evaluators (affectionately known as the EEs) are magic machinery that deal with language-specific details on behalf of the core debugging engine that lives inside Visual Studio. In general, the EEs are responsible for the following:1

Notice that all three of the items above must be tailored to the language that the user is developing in. For example, it doesn’t make sense to display the C#-style hexadecimal value, 0x2a, to Visual Basic users. VB users expect to see the value rendered in the Visual Basic style for hexadecimal numbers, &H2A. Thus, to keep the experiences consistent, C# and VB both provide their own debugger expression evaluators.2

Recently, I was playing with a bit of Visual Basic code using an XML literal.

Module Module1

  Sub Main()
    Dim contacts = <contacts>
                     <contact name="Bob"/>
                     <contact name="Sally"/>
                   </contacts>

        Console.WriteLine(contacts)
    End Sub

End Module

If you’ve never seen a VB XML literal before, know that they are beautiful and powerful language constructs.3 Essentially, the code above is equivalent to the following C# code.

var contacts = new XElement("contacts",
                 new XElement("contact", new XAttribute("name", "Bob")),
                 new XElement("contact", new XAttribute("name", "Sally")));

Console.WriteLine(contacts);

If I set a breakpoint in the VB code above on Console.WriteLine(contacts) and press F5 to start debugging, I can see in the Locals window that my XML literal has effectively produced the same tree of XElement and XAttribute objects that the C# code explicitly declares above.

Locals

Now, suppose that I’ve made a mistake when declaring the XML literal (e.g. “Bob” should be “Harry”), but I don’t want to stop debugging, fix the code, and F5 to start again. I have a couple of options available to me for modifying values at debug-time.

  1. I can use Edit and Continue. This feature allows me to change my code at debug-time and move the instruction pointer in order to execute the modified code. However, this might not always be convenient.
  2. I can edit the value directly in one of the variable windows, the datatip, or the Immediate Window.

Choosing option two above, I’ll fix my mistake by editing the XML value directly in the Locals Window.

EditingWhenNotRemovingQuotes

Upon pressing ENTER, I’m presented with the following error dialog.

EndOfStatementExpected

What the heck does that mean? Well, this is the message that would be displayed by the Visual Basic compiler for a string value with inner quotes that are not escaped.4

Attempt number two: I’ll edit the value again and be careful to escape the quotes.

EditingAndEscapingQuotes

This time, when I press ENTER, I get yet another error message.

ValueOfTypeStringCannotBeConvertedToXElement

Grrr. Now I’m getting irritated!

Fortunately, the error message above gives me the clue I need to make this work. The problem is that I’m producing a string value, and I should be producing an XML value. What do I need to do differently? I need to remove the quotes that are enclosing the whole XML value.

Last try.

Editing XML in Locals

Success!

This time, I’m rewarded with a red display color for the value, indicating that the value has changed. If I expand some of the inner nodes, I can see that the tree of XElement and XAttribute objects has indeed been rebuilt.

Modified XML in Locals

That’s a pretty nifty trick that you can use right now in Visual Studio 2008. However, when I put on my Debugger PM hat, I consider how to improve this experience for Visual Studio 2010. Easy! We’ve removed the enclosing double-quotes from XElement values in the Visual Basic Expression Evaluator.

Sometimes achieving the best experience is just a simple tweak away.

1This is not an exhaustive list.
2F# currently uses the C# EE, but eventually F# will provide its own. Feel free to ping Luke and Tim to help make this happen.
3For a good example of how powerful XML literals are, see Dmitry Robsman's ASP .NET MVC Engine Using VB.NET XML Literals.
4Displaying this string without escaping the quotes strikes me as a bug. :-)

posted on Saturday, February 14, 2009 12:07:24 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]

kick it on DotNetKicks.com
 Tuesday, January 20, 2009

A few months ago, I was honored to record an episode of Deep Fried Bytes my good friend (and partner-in-crime) Chris Smith. We blabbed on about F#, functional programming, pink vodka. The usual stuff.

Check it out!

Episode 24: Chatting about F# with Chris Smith and Dustin Campbell

posted on Tuesday, January 20, 2009 12:01:22 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]

kick it on DotNetKicks.com
 Monday, January 19, 2009

Last time, we took a brute force approach to solving Project Euler problem seven. Unfortunately, the resulting prime number generator turned out to be fairly ugly and not really efficient enough to handle the needs of future YAPES1 problems. This time, we’ll approach the problem using a tried-and-true algorithm that should give us the desired performance and produce a beautiful solution. I can already hear some of you rolling your eyes, because we’ll be looking at...

The Sieve of Eratosthenes

Amazingly, the Sieve of Eratosthenes is perhaps the oldest known algorithm for generating prime numbers, and yet it remains reasonably efficient by today’s standards. It’s true that there are faster methods available, but this should be fast enough. The basic algorithm is nothing earth-shattering—it’s first semester computer science stuff. However, we’ll add a twist to keep things interesting.

The Sieve of Eratosthenes works by marking composite numbers:

  1. Start with the natural numbers from 2 up to some maximum limit.
  2. Take the first unmarked number (2) and mark all multiples of that number. Note that 2 remains unmarked. Only its multiples (e.g. 4,6,8,…) are marked.
  3. Take the next unmarked number and mark all of its multiples.
  4. Repeat step 3 until there are no more numbers left to test.
  5. When finished, the unmarked numbers are the primes.

As described, this algorithm doesn’t quite work for our purposes. It is certainly easy to implement, but because of the starting requirement (a big list of numbers), it doesn’t lend itself well to generating a, potentially, infinite stream of primes. However, with a small modification, we can make it work.

The adaptation we’ll implement comes from from the excellent paper, The Genuine Sieve of Eratosthenes2, by Melissa E. O’Neill. The idea is to turn the algorithm inside out. Instead of keeping track of which numbers are prime, we’ll maintain a table of composite numbers along with their known factors. As we iterate the natural numbers, the table will be updated with new information about which numbers are known to be composite and what factors have been found.

Confused? Let’s walk through the algorithm, step by step.

Tracking Composites

We start with 2 as the first natural number and check to see if it exists in the composite table. Since the table is empty, 2 isn’t present, so we add a new list containing 2 to the table with the square of 2 (4) as the key. Finally, since 2 was not found in the composite table, we know it is prime.

Sieve Diagram 1 (after 2)

We take the next natural number, 3. Like before, 3 is not found in the composite table, so we add it to the table using its square (9) as the key and return 3 as prime.

Sieve Diagram 2 (after 3)

We take the next natural number, 4. This time is different—4 is found in the composite table, so we know it’s not prime. In this case, we take the list found in the table under 4 (the known factors), and for each factor, we add 4 and insert the result into the table. Thus, we insert 6 into the table along with its known factor, 2. (Interestingly, this algorithm doesn’t tell us that 3 is a factor of 6). Finally, we remove 4 from the table.

Sieve Diagram 3 (after 4)

Let's skip ahead to some more interesting cases.

After 9 has been processed, the table looks like so:

Sieve Diagram 4 (after 9)

The next natural number, 10, is found in the table, so we know that it’s composite and not prime. Again, we take the list of known factors, and for each factor, we add 10 and insert the result into the table. This time, the value (12) already exists in the table, so we add it to the list of known factors for that value. Lastly, we remove 10 from the table.

Sieve Diagram 5 (after 10)

The next natural number, 11, is not found in the table, so we insert its square and return it as prime.

Sieve Diagram 6 (after 11)

We find the next natural number, 12, in the table. However, because there is more than one known factor in the table, we add two new composites, 12+2=14 and 12+3=15, before finally removing 12.

Sieve Diagram 7 (after 12)

As you can see, this take on the classic algorithm allows us to determine which numbers are prime and composite by using a minimal amount of memory. Obviously, the table can grow large as we find larger primes, but the memory usage is quite reasonable relative to the traditional sieve where a large list of natural numbers would have to be generated before even the first prime is returned.

Show Me Some Code!

The F# code nearly writes itself.

let reinsert x table prime =
   let comp = x+prime
   match Map.tryfind comp table with
   | None        -> table |> Map.add comp [prime]
   | Some(facts) -> table |> Map.add comp (prime::facts)

let rec sieve x table =
  seq {
    match Map.tryfind x table with
    | None ->
        yield x
        yield! sieve (x+1L) (table |> Map.add (x*x) [x])
    | Some(factors) ->
        yield! sieve (x+1L) (factors |> List.fold_left (reinsert x) (table |> Map.remove x)) }

let primes =
  sieve 2L Map.empty

The meat of the algorithm is in the sieve function. Here, we try to locate x in the table. If it is not found, we yield x (because it’s prime) and recursively call sieve passing x+1 and a table with the square of x added to it. (Note that Map is an immutable data structure, so “mutating” functions like Map.add actually return a brand new Map.) If x is found in the table, we recursively call sieve passing x+1 and a table from which x has been removed and each known factor has been reinserted. The reinsert function performs the trivial work of reinserting a known factor into the table.

A couple of other interesting details about this code are worth pointing out:

  • sieve returns a sequence expression, denoted by the seq keyword and the curly braces. Sequence expressions are lazily evaluated, meaning that the next prime won't be calculated until requested. In addition, the sequence expression is recursive due to the use of the yield! keyword, which can be used to return another sequence expression as part of a sequence.
  • The call to List.fold_left might look a bit exotic because we are folding over the list of known factors to produce a new table with each factor reinserted.

Mutation Isn't All Bad

Now that we have a working prime number generator, let’s take a moment to optimize. Until now, we’ve been sticking with the immutable F# Map type, but we could employ a mutable data structure to improve performance. To replace the data structure in a way that preserves the beauty of the algorithm, we’ll define a set of APIs for Dictionary<TKey,TValue> that mimic the Map APIs:

module Dict =
  open System.Collections.Generic

  let empty() = new Dictionary<_,_>()

  let add k v (d : #IDictionary<'a,'b>) =
    d.[k] <- v; d

  let remove (k : 'a) (d : #IDictionary<'a,'b>) =
    d.Remove(k) |> ignore; d

  let tryfind k (d : #IDictionary<'a,'b>) =
    match d.TryGetValue(k) with
    | true, v  -> Some(v)
    | false, _ -> None

Now, it’s a simple matter of replacing references to Map with Dict, and our algorithm immediately benefits from the performance boost of using a mutable table.

let reinsert x table prime =
   let comp = x+prime
   match Dict.tryfind comp table with
   | None        -> table |> Dict.add comp [prime]
   | Some(facts) -> table |> Dict.add comp (prime::facts)

let rec sieve x table =
  seq {
    match Dict.tryfind x table with
    | None ->
        yield x
        yield! sieve (x+1L) (table |> Dict.add (x*x) [x])
    | Some(factors) ->
        yield! sieve (x+1L) (factors |> List.fold_left (reinsert x) (table |> Dict.remove x)) }

let primes =
  sieve 2L (Dict.empty())

With our prime number generator in hand, solving Project Euler problem seven is trivial:

primes |> Seq.nth 10000

At last, we’ve produced a prime number generator with a good balance of performance and beauty. If we’ve learned anything from this article or its predecessor, I hope it’s that finding an optimal solution often requires finding an appropriate algorithm.

Take that Chris!

1Yet Another Project Euler Series
2O'Neill's paper appears in The Journal of Functional Programming, Volume 19, Issue 01.

posted on Monday, January 19, 2009 11:44:09 PM (Pacific Standard Time, UTC-08:00)  #    Comments [6]

kick it on DotNetKicks.com
 Monday, December 29, 2008

The F# library provides a variety of functions (based on the printf functions found in OCaml) that produce formatted text.

printf outputs formatted text to the console using the stdout stream.
printfn outputs formatted text suffixed with a new-line character to the console using the stdout stream.
eprintf outputs formatted text to the console using the stderr stream.
eprintfn outputs formatted text suffixed with a new-line character to the console using the stderr stream.
sprintf returns formatted text as a string.
bprintf appends formatted text to a System.Text.StringBuilder.
fprintf writes formatted text to a System.IO.TextWriter.
fprintfn writes formatted text suffixed with a new-line character to a System.IO.TextWriter.
twprintf writes formatted text to a System.IO.TextWriter.
twprintfn writes formatted text suffixed with a new-line character to a System.IO.TextWriter.

The functions above are especially powerful because the F# compiler will type-check the format arguments and give design-time errors.

Type-safe Format String Error with Tooltip

While this set of functions covers most of the cases where formatted text is needed, there is one glaring omission: debug output. If we want to pass formatted text to System.Diagnostics.Debug.Write, we have to do it using sprintf like so:

System.Diagnostics.Debug.Write(sprintf "The answer = %d" 42)

Obviously, that's not ideal. What we would really like is a function that behaves exactly like the other printf functions but writes debug output. Fortunately, extensibility has been built into the F# library, making it possible to create additional formatting functions that benefit from the same sweet type-checking. So, how do we go about doing that? The trick is to wrap our functions around a special F# library function, ksprintf.

ksprintf is similar to sprintf in that it formats text as a string. However, instead of returning that string to the caller, it passes the string into a continuation.

I haven't covered the broad topic of continuations yet because a) they can be pretty eye-crossing when presented plainly, and b) there are already fantastic treatments out there. The basic idea is to pass a lambda1 (the so-called continuation) into a computation that will be executed when the computation is finished. Really. That's it. This is a simple idea, but it can become complicated quickly. However, in the case of ksprintf it remains simple. We'll pass a lambda that calls Debug.Write with the final string after it has been formatted.

module Debug =
  open System.Diagnostics

  let writef fmt = Printf.ksprintf (fun s -> Debug.Write(s)) fmt

  let writefn fmt = Printf.ksprintf (fun s -> Debug.WriteLine(s)) fmt

In fact, the wrapper lambdas are redundant because Debug.Write and Debug.WriteLine have overloads that match the expected signature. We can simply pass the functions themselves and remove the wrappers:

module Debug =
  open System.Diagnostics

  let writef fmt = Printf.ksprintf Debug.Write fmt

  let writefn fmt = Printf.ksprintf Debug.WriteLine fmt

Just drop that code into an F# project, and you'll be writing debug output in a beautiful, concise F# style.

Debug.writef "The answer = %d" 42

You'll even get helpful design-time type errors:

Debug.writef Type-safe Format String Error with Tooltip

Now it's just a matter of convincing the F# team to add it to the libraries. ;-)

1Did It With .NET readers probably already know that "lambda" = "anonymous function".

posted on Monday, December 29, 2008 11:35:14 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]

kick it on DotNetKicks.com
 Wednesday, December 17, 2008

It’s that time of year again. CodeMash time.

When I first moved to the Seattle area, I was concerned that I wouldn’t be able to come back to Ohio for this year’s CodeMash. Of all of the conferences that I’ve attended over the years, CodeMash has been among the most rewarding, and I really didn’t want to miss this one. Fortunately, the stars have aligned, and my family’s holiday vacation is bringing me back to the Midwest just in time to join in on all the fun.

This year’s speaker list is just insane. With names like Bill Wagner, Richard Campbell, Steve Smith, Mads Torgersen and David Laribee, I’m expecting to have a very full and satisfied brain. But CodeMash is about more than just the sessions; it’s also about the pure geek nirvana of hanging out with a group of people who are just as excited about technology as I am.

Given the list of heavyweights speaking this year, I feel pretty honored to be filling one of the slots. If you’re coming to CodeMash, check out my talk, “Multi-threading Mojo with F#.” It should be a blast.

CodeMashGears

posted on Wednesday, December 17, 2008 11:39:46 PM (Pacific Standard Time, UTC-08:00)  #    Comments [6]

kick it on DotNetKicks.com
 Wednesday, September 17, 2008

Prime numbers.

If there’s one mathematical curiosity that appears more often than any other in the Project Euler problems, it’s prime numbers. To be fair, we've dealt with primes before, but problem seven is the first that requires a prime number generator as part of its solution.

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

I must admit that I’ve been riffing on this problem for quite a while now. There are so many variations to prime number generation that I’ve had difficulty choosing one with the right balance of elegance and efficiency. Because I’ll be reusing my generator for future problems, I must be certain that it’s fast enough. However, as always my primary goal is to produce the most beautiful solution that I can.

Spoiler alert! I'm revealing the problem solution early.

primes |> Seq.nth 10000

Trivial, right? Of course, the challenge of this problem is in declaring the magic behind primes. We have a couple of options available to us, but since there’s a great deal that can be learned from using brute force to solve a problem, we’ll first try the…

The Naïve Approach

The most straightforward way to generate prime numbers is to simply test every number, starting from 2—the first prime—with some primality test. Perhaps something like the code below.

{ 2L..System.Int64.MaxValue } |> Seq.filter isPrime

For the primality test, we can test the candidate prime against every smaller number. If it is evenly divisible by any smaller number other than 1, it isn't prime.

let isPrime n =
  { 2L..n-1L } |> Seq.for_all (fun x -> n%x <> 0L)

Putting the pieces together gives us a real working solution.

let isPrime n =
  { 2L..n-1L } |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  { 2L..System.Int64.MaxValue } |> Seq.filter isPrime

primes |> Seq.nth 10000

Are we finished? Hardly! While simple, this prime number generator takes a whopping 30 seconds on my hefty-yet-quickly-obsoleting machine! That might fall within Project Euler's "one-minute rule," but we can certainly do better.

Optimizing Naïvely

The obvious optimization is to reduce the set of numbers that isPrime tests. Observe that1 the largest factor of a number (other than itself) is its square root. Armed with this knowledge, we can improve the primality test by testing just the natural numbers from 2 through the square root of n.

let isPrime n =
  let limit = int64 (sqrt (float n))
  { 2L..limit } |> Seq.for_all (fun x -> n%x <> 0L)

That's a huge improvement! Finding the 10001st prime now only takes about .25 seconds.

We can do better yet. If 2 is the only even prime, why are we bothering to test any other even numbers? We can save more time by testing only the odds.

let odds limit =
  Seq.unfold (fun n ->
    if n > limit then None
    else Some(n,n+2L)) 3L

let isPrime n =
  let limit = int64 (sqrt (float n))
  odds limit |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  seq { yield 2L
        yield! odds System.Int64.MaxValue |> Seq.filter isPrime }

That brings the solution down to approximately .2 seconds.

NeRd Note
Curious about the use of yield and yield! in the example above? The difference between these keywords is simple yet powerful. yield simply returns a single element of a sequence expression, much like the yield return of C# iterators. However, yield! returns another sequence expression as part of the sequence.2 This is an extraordinarily powerful feature that offers a lot of flexibility. In part 2, we'll use yield! to produce an elegant recursive sequence expression.

Before moving on, let's make one last optimization to our naïve algorithm. We can take advantage of the fact that every prime, after 2 and 3, is of the form 6k ± 1. By reducing the set of numbers used by our primality test to those of this form, we can eke out a tiny bit more speed.

let inline next k i =
  if i = -1L then (k,1L)
  else ((k+1L),-1L)

let candidates limit =
  Seq.unfold (fun (k,i) ->
    let v = 6L*k + i
    if (v > limit) then None
    else Some(v, (next k i))) (1L,-1L)

let isPrime n =
  let limit = int64 (sqrt (float n))
  candidates limit |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  seq { yield! [2L;3L]
        yield! candidates System.Int64.MaxValue |> Seq.filter isPrime }

Using the prime number generator above, our solution takes around .15 seconds. Not too shabby! To be fair, this problem deals with reasonably small prime numbers. Future Project Euler problems (like problem ten) will benefit from a more efficient algorithm. Next time we’ll take a look at another well-known algorithm for generating prime numbers.

 

1"Observe that"? Clearly, I've been reading too many academic papers lately.
2F#'s yield! is similar to the stream-flattening concept of . Perhaps this would be an useful extension to the C# language? Mads? What do you think?

posted on Wednesday, September 17, 2008 1:33:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]

kick it on DotNetKicks.com
 Monday, September 01, 2008

NOTE: I have shamefully stolen the title of this post from the custom T-shirt sported by Amanda Laucher at Tech Ed Developer 2008. The cleverness is all hers.

I’m a bit late to the blog posting party, but the F# Team shipped the F# September 2008 CTP on Friday. Kudos to Don, Luke, Chris, Brian, Jomo and the gang! Note that this isn’t yet another research release1, but an honest-to-goodness preview of F# as a fully-supported .NET language. In addition to the CTP release, the F# Developer Center is now open for business!

The CTP is jam-packed with hotness including…

  • The new F# Project System. Brian has the details here, here, here, here, and here.
  • Sweet scripting support. Check out Jomo’s “Zero to Execute in Ten Seconds” for the nitty-gritty.
  • Units of Measure. That’s right—amidst all of the effort to pull the CTP together, the F# Team managed to include a ground-breaking new feature! Read the first part of Andrew Kennedy’s series on this killer language feature here.2

This is just a taste of what’s in the CTP. Read the detailed (and I do mean detailed) release notes for more info.

1YARR — pirate speak.
2I have a certain amount of affection for the Units of Measure language feature after seeing its power firsthand while we competed in this year’s ICFP Programming Contest.

posted on Monday, September 01, 2008 7:01:51 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]

kick it on DotNetKicks.com