Wednesday, September 17, 2008

Prime numbers.

If there’s one mathematical curiosity that appears more often than any other in the Project Euler problems, it’s prime numbers. To be fair, we've dealt with primes before, but problem seven is the first that requires a prime number generator as part of its solution.

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

I must admit that I’ve been riffing on this problem for quite a while now. There are so many variations to prime number generation that I’ve had difficulty choosing one with the right balance of elegance and efficiency. Because I’ll be reusing my generator for future problems, I must be certain that it’s fast enough. However, as always my primary goal is to produce the most beautiful solution that I can.

Spoiler alert! I'm revealing the problem solution early.

primes |> Seq.nth 10000

Trivial, right? Of course, the challenge of this problem is in declaring the magic behind primes. We have a couple of options available to us, but since there’s a great deal that can be learned from using brute force to solve a problem, we’ll first try the…

The Naïve Approach

The most straightforward way to generate prime numbers is to simply test every number, starting from 2—the first prime—with some primality test. Perhaps something like the code below.

{ 2L..System.Int64.MaxValue } |> Seq.filter isPrime

For the primality test, we can test the candidate prime against every smaller number. If it is evenly divisible by any smaller number other than 1, it isn't prime.

let isPrime n =
  { 2L..n-1L } |> Seq.for_all (fun x -> n%x <> 0L)

Putting the pieces together gives us a real working solution.

let isPrime n =
  { 2L..n-1L } |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  { 2L..System.Int64.MaxValue } |> Seq.filter isPrime

primes |> Seq.nth 10000

Are we finished? Hardly! While simple, this prime number generator takes a whopping 30 seconds on my hefty-yet-quickly-obsoleting machine! That might fall within Project Euler's "one-minute rule," but we can certainly do better.

Optimizing Naïvely

The obvious optimization is to reduce the set of numbers that isPrime tests. Observe that1 the largest factor of a number (other than itself) is its square root. Armed with this knowledge, we can improve the primality test by testing just the natural numbers from 2 through the square root of n.

let isPrime n =
  let limit = int64 (sqrt (float n))
  { 2L..limit } |> Seq.for_all (fun x -> n%x <> 0L)

That's a huge improvement! Finding the 10001st prime now only takes about .25 seconds.

We can do better yet. If 2 is the only even prime, why are we bothering to test any other even numbers? We can save more time by testing only the odds.

let odds limit =
  Seq.unfold (fun n ->
    if n > limit then None
    else Some(n,n+2L)) 3L

let isPrime n =
  let limit = int64 (sqrt (float n))
  odds limit |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  seq { yield 2L
        yield! odds System.Int64.MaxValue |> Seq.filter isPrime }

That brings the solution down to approximately .2 seconds.

NeRd Note
Curious about the use of yield and yield! in the example above? The difference between these keywords is simple yet powerful. yield simply returns a single element of a sequence expression, much like the yield return of C# iterators. However, yield! returns another sequence expression as part of the sequence.2 This is an extraordinarily powerful feature that offers a lot of flexibility. In part 2, we'll use yield! to produce an elegant recursive sequence expression.

Before moving on, let's make one last optimization to our naïve algorithm. We can take advantage of the fact that every prime, after 2 and 3, is of the form 6k ± 1. By reducing the set of numbers used by our primality test to those of this form, we can eke out a tiny bit more speed.

let inline next k i =
  if i = -1L then (k,1L)
  else ((k+1L),-1L)

let candidates limit =
  Seq.unfold (fun (k,i) ->
    let v = 6L*k + i
    if (v > limit) then None
    else Some(v, (next k i))) (1L,-1L)

let isPrime n =
  let limit = int64 (sqrt (float n))
  candidates limit |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  seq { yield! [2L;3L]
        yield! candidates System.Int64.MaxValue |> Seq.filter isPrime }

Using the prime number generator above, our solution takes around .15 seconds. Not too shabby! To be fair, this problem deals with reasonably small prime numbers. Future Project Euler problems (like problem ten) will benefit from a more efficient algorithm. Next time we’ll take a look at another well-known algorithm for generating prime numbers.


1"Observe that"? Clearly, I've been reading too many academic papers lately.
2F#'s yield! is similar to the stream-flattening concept of . Perhaps this would be an useful extension to the C# language? Mads? What do you think?

posted on Wednesday, September 17, 2008 1:33:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]

kick it on
 Monday, September 01, 2008

NOTE: I have shamefully stolen the title of this post from the custom T-shirt sported by Amanda Laucher at Tech Ed Developer 2008. The cleverness is all hers.

I’m a bit late to the blog posting party, but the F# Team shipped the F# September 2008 CTP on Friday. Kudos to Don, Luke, Chris, Brian, Jomo and the gang! Note that this isn’t yet another research release1, but an honest-to-goodness preview of F# as a fully-supported .NET language. In addition to the CTP release, the F# Developer Center is now open for business!

The CTP is jam-packed with hotness including…

  • The new F# Project System. Brian has the details here, here, here, here, and here.
  • Sweet scripting support. Check out Jomo’s “Zero to Execute in Ten Seconds” for the nitty-gritty.
  • Units of Measure. That’s right—amidst all of the effort to pull the CTP together, the F# Team managed to include a ground-breaking new feature! Read the first part of Andrew Kennedy’s series on this killer language feature here.2

This is just a taste of what’s in the CTP. Read the detailed (and I do mean detailed) release notes for more info.

1YARR — pirate speak.
2I have a certain amount of affection for the Units of Measure language feature after seeing its power firsthand while we competed in this year’s ICFP Programming Contest.

posted on Monday, September 01, 2008 7:01:51 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]

kick it on