Wednesday, September 17, 2008

Prime numbers.

If there’s one mathematical curiosity that appears more often than any other in the Project Euler problems, it’s prime numbers. To be fair, we've dealt with primes before, but problem seven is the first that requires a prime number generator as part of its solution.

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

I must admit that I’ve been riffing on this problem for quite a while now. There are so many variations to prime number generation that I’ve had difficulty choosing one with the right balance of elegance and efficiency. Because I’ll be reusing my generator for future problems, I must be certain that it’s fast enough. However, as always my primary goal is to produce the most beautiful solution that I can.

Spoiler alert! I'm revealing the problem solution early.

primes |> Seq.nth 10000

Trivial, right? Of course, the challenge of this problem is in declaring the magic behind primes. We have a couple of options available to us, but since there’s a great deal that can be learned from using brute force to solve a problem, we’ll first try the…

The Naïve Approach

The most straightforward way to generate prime numbers is to simply test every number, starting from 2—the first prime—with some primality test. Perhaps something like the code below.

{ 2L..System.Int64.MaxValue } |> Seq.filter isPrime

For the primality test, we can test the candidate prime against every smaller number. If it is evenly divisible by any smaller number other than 1, it isn't prime.

let isPrime n =
  { 2L..n-1L } |> Seq.for_all (fun x -> n%x <> 0L)

Putting the pieces together gives us a real working solution.

let isPrime n =
  { 2L..n-1L } |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  { 2L..System.Int64.MaxValue } |> Seq.filter isPrime

primes |> Seq.nth 10000

Are we finished? Hardly! While simple, this prime number generator takes a whopping 30 seconds on my hefty-yet-quickly-obsoleting machine! That might fall within Project Euler's "one-minute rule," but we can certainly do better.

Optimizing Naïvely

The obvious optimization is to reduce the set of numbers that isPrime tests. Observe that1 the largest factor of a number (other than itself) is its square root. Armed with this knowledge, we can improve the primality test by testing just the natural numbers from 2 through the square root of n.

let isPrime n =
  let limit = int64 (sqrt (float n))
  { 2L..limit } |> Seq.for_all (fun x -> n%x <> 0L)

That's a huge improvement! Finding the 10001st prime now only takes about .25 seconds.

We can do better yet. If 2 is the only even prime, why are we bothering to test any other even numbers? We can save more time by testing only the odds.

let odds limit =
  Seq.unfold (fun n ->
    if n > limit then None
    else Some(n,n+2L)) 3L

let isPrime n =
  let limit = int64 (sqrt (float n))
  odds limit |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  seq { yield 2L
        yield! odds System.Int64.MaxValue |> Seq.filter isPrime }

That brings the solution down to approximately .2 seconds.

NeRd Note
Curious about the use of yield and yield! in the example above? The difference between these keywords is simple yet powerful. yield simply returns a single element of a sequence expression, much like the yield return of C# iterators. However, yield! returns another sequence expression as part of the sequence.2 This is an extraordinarily powerful feature that offers a lot of flexibility. In part 2, we'll use yield! to produce an elegant recursive sequence expression.

Before moving on, let's make one last optimization to our naïve algorithm. We can take advantage of the fact that every prime, after 2 and 3, is of the form 6k ± 1. By reducing the set of numbers used by our primality test to those of this form, we can eke out a tiny bit more speed.

let inline next k i =
  if i = -1L then (k,1L)
  else ((k+1L),-1L)

let candidates limit =
  Seq.unfold (fun (k,i) ->
    let v = 6L*k + i
    if (v > limit) then None
    else Some(v, (next k i))) (1L,-1L)

let isPrime n =
  let limit = int64 (sqrt (float n))
  candidates limit |> Seq.for_all (fun x -> n%x <> 0L)

let primes =
  seq { yield! [2L;3L]
        yield! candidates System.Int64.MaxValue |> Seq.filter isPrime }

Using the prime number generator above, our solution takes around .15 seconds. Not too shabby! To be fair, this problem deals with reasonably small prime numbers. Future Project Euler problems (like problem ten) will benefit from a more efficient algorithm. Next time we’ll take a look at another well-known algorithm for generating prime numbers.


1"Observe that"? Clearly, I've been reading too many academic papers lately.
2F#'s yield! is similar to the stream-flattening concept of . Perhaps this would be an useful extension to the C# language? Mads? What do you think?

posted on Wednesday, September 17, 2008 1:33:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]

kick it on