Monday, January 19, 2009

Last time, we took a brute force approach to solving Project Euler problem seven. Unfortunately, the resulting prime number generator turned out to be fairly ugly and not really efficient enough to handle the needs of future YAPES1 problems. This time, we’ll approach the problem using a tried-and-true algorithm that should give us the desired performance and produce a beautiful solution. I can already hear some of you rolling your eyes, because we’ll be looking at...

The Sieve of Eratosthenes

Amazingly, the Sieve of Eratosthenes is perhaps the oldest known algorithm for generating prime numbers, and yet it remains reasonably efficient by today’s standards. It’s true that there are faster methods available, but this should be fast enough. The basic algorithm is nothing earth-shattering—it’s first semester computer science stuff. However, we’ll add a twist to keep things interesting.

The Sieve of Eratosthenes works by marking composite numbers:

  1. Start with the natural numbers from 2 up to some maximum limit.
  2. Take the first unmarked number (2) and mark all multiples of that number. Note that 2 remains unmarked. Only its multiples (e.g. 4,6,8,…) are marked.
  3. Take the next unmarked number and mark all of its multiples.
  4. Repeat step 3 until there are no more numbers left to test.
  5. When finished, the unmarked numbers are the primes.

As described, this algorithm doesn’t quite work for our purposes. It is certainly easy to implement, but because of the starting requirement (a big list of numbers), it doesn’t lend itself well to generating a, potentially, infinite stream of primes. However, with a small modification, we can make it work.

The adaptation we’ll implement comes from from the excellent paper, The Genuine Sieve of Eratosthenes2, by Melissa E. O’Neill. The idea is to turn the algorithm inside out. Instead of keeping track of which numbers are prime, we’ll maintain a table of composite numbers along with their known factors. As we iterate the natural numbers, the table will be updated with new information about which numbers are known to be composite and what factors have been found.

Confused? Let’s walk through the algorithm, step by step.

Tracking Composites

We start with 2 as the first natural number and check to see if it exists in the composite table. Since the table is empty, 2 isn’t present, so we add a new list containing 2 to the table with the square of 2 (4) as the key. Finally, since 2 was not found in the composite table, we know it is prime.

Sieve Diagram 1 (after 2)

We take the next natural number, 3. Like before, 3 is not found in the composite table, so we add it to the table using its square (9) as the key and return 3 as prime.

Sieve Diagram 2 (after 3)

We take the next natural number, 4. This time is different—4 is found in the composite table, so we know it’s not prime. In this case, we take the list found in the table under 4 (the known factors), and for each factor, we add 4 and insert the result into the table. Thus, we insert 6 into the table along with its known factor, 2. (Interestingly, this algorithm doesn’t tell us that 3 is a factor of 6). Finally, we remove 4 from the table.

Sieve Diagram 3 (after 4)

Let's skip ahead to some more interesting cases.

After 9 has been processed, the table looks like so:

Sieve Diagram 4 (after 9)

The next natural number, 10, is found in the table, so we know that it’s composite and not prime. Again, we take the list of known factors, and for each factor, we add 10 and insert the result into the table. This time, the value (12) already exists in the table, so we add it to the list of known factors for that value. Lastly, we remove 10 from the table.

Sieve Diagram 5 (after 10)

The next natural number, 11, is not found in the table, so we insert its square and return it as prime.

Sieve Diagram 6 (after 11)

We find the next natural number, 12, in the table. However, because there is more than one known factor in the table, we add two new composites, 12+2=14 and 12+3=15, before finally removing 12.

Sieve Diagram 7 (after 12)

As you can see, this take on the classic algorithm allows us to determine which numbers are prime and composite by using a minimal amount of memory. Obviously, the table can grow large as we find larger primes, but the memory usage is quite reasonable relative to the traditional sieve where a large list of natural numbers would have to be generated before even the first prime is returned.

Show Me Some Code!

The F# code nearly writes itself.

let reinsert x table prime =
   let comp = x+prime
   match Map.tryfind comp table with
   | None        -> table |> Map.add comp [prime]
   | Some(facts) -> table |> Map.add comp (prime::facts)

let rec sieve x table =
  seq {
    match Map.tryfind x table with
    | None ->
        yield x
        yield! sieve (x+1L) (table |> Map.add (x*x) [x])
    | Some(factors) ->
        yield! sieve (x+1L) (factors |> List.fold_left (reinsert x) (table |> Map.remove x)) }

let primes =
  sieve 2L Map.empty

The meat of the algorithm is in the sieve function. Here, we try to locate x in the table. If it is not found, we yield x (because it’s prime) and recursively call sieve passing x+1 and a table with the square of x added to it. (Note that Map is an immutable data structure, so “mutating” functions like Map.add actually return a brand new Map.) If x is found in the table, we recursively call sieve passing x+1 and a table from which x has been removed and each known factor has been reinserted. The reinsert function performs the trivial work of reinserting a known factor into the table.

A couple of other interesting details about this code are worth pointing out:

  • sieve returns a sequence expression, denoted by the seq keyword and the curly braces. Sequence expressions are lazily evaluated, meaning that the next prime won't be calculated until requested. In addition, the sequence expression is recursive due to the use of the yield! keyword, which can be used to return another sequence expression as part of a sequence.
  • The call to List.fold_left might look a bit exotic because we are folding over the list of known factors to produce a new table with each factor reinserted.

Mutation Isn't All Bad

Now that we have a working prime number generator, let’s take a moment to optimize. Until now, we’ve been sticking with the immutable F# Map type, but we could employ a mutable data structure to improve performance. To replace the data structure in a way that preserves the beauty of the algorithm, we’ll define a set of APIs for Dictionary<TKey,TValue> that mimic the Map APIs:

module Dict =
  open System.Collections.Generic

  let empty() = new Dictionary<_,_>()

  let add k v (d : #IDictionary<'a,'b>) =
    d.[k] <- v; d

  let remove (k : 'a) (d : #IDictionary<'a,'b>) =
    d.Remove(k) |> ignore; d

  let tryfind k (d : #IDictionary<'a,'b>) =
    match d.TryGetValue(k) with
    | true, v  -> Some(v)
    | false, _ -> None

Now, it’s a simple matter of replacing references to Map with Dict, and our algorithm immediately benefits from the performance boost of using a mutable table.

let reinsert x table prime =
   let comp = x+prime
   match Dict.tryfind comp table with
   | None        -> table |> Dict.add comp [prime]
   | Some(facts) -> table |> Dict.add comp (prime::facts)

let rec sieve x table =
  seq {
    match Dict.tryfind x table with
    | None ->
        yield x
        yield! sieve (x+1L) (table |> Dict.add (x*x) [x])
    | Some(factors) ->
        yield! sieve (x+1L) (factors |> List.fold_left (reinsert x) (table |> Dict.remove x)) }

let primes =
  sieve 2L (Dict.empty())

With our prime number generator in hand, solving Project Euler problem seven is trivial:

primes |> Seq.nth 10000

At last, we’ve produced a prime number generator with a good balance of performance and beauty. If we’ve learned anything from this article or its predecessor, I hope it’s that finding an optimal solution often requires finding an appropriate algorithm.

Take that Chris!

1Yet Another Project Euler Series
2O'Neill's paper appears in The Journal of Functional Programming, Volume 19, Issue 01.

posted on Monday, January 19, 2009 11:44:09 PM (Pacific Standard Time, UTC-08:00)  #    Comments [6]

kick it on DotNetKicks.com
Tuesday, January 20, 2009 3:16:21 AM (Pacific Standard Time, UTC-08:00)
You need some brackets in:

let primes =
sieve 2L (Dict.empty())

Nice solution! But I like mine too (don't know how to format properly, alas):

#r "FSharp.PowerPack.dll"
#nowarn "40"
module Ochkarik =
let rec primes =
let rec odd n = seq {
if Seq.take_while (fun p->p*p<=n) primes
|> Seq.exists ((%) n >> (=) 0L)
|> not then yield n
yield! odd (n+2L) }
Seq.cons 2L (odd 3L) |> Seq.cache
Thursday, January 22, 2009 7:38:46 AM (Pacific Standard Time, UTC-08:00)
Thanks for the correction to the code. Hopefully, that will teach me that editing code in Live Writer is fraught with peril! :-)

Your solution definitely is a bit of clever code. It's very reminiscent of some LINQ solutions that I've seen. However, it looks to be an implementation of the brute force approach that I wrote last time. The performance of such an approach simply won't scale to later Euler problems.
Sunday, January 25, 2009 10:44:20 PM (Pacific Standard Time, UTC-08:00)
You're right about the performance. But no need to worry about my Euler solutions: I have another - sieving - implementation using asynchronous workflows that gets me 10M primes in under 4 seconds. It's simply too ugly to post ;-)
Wednesday, February 25, 2009 7:46:15 PM (Pacific Standard Time, UTC-08:00)
Your little sieve program is "good enough" for this purpose although it could be improved by applying more of the Melissa E. O'Neill article plus other ideas that would avoid intensive recursion, allow more parallelism, perform even better, and still be beautiful!

I found your program takes many seconds on a 1.6 GHz machine to just calculate primes up to 300,000 and started investigating the above techniques to try to understand why using imperative techniques with C# takes only tiny fractions of a second. All of the above algorithmic techniques are good and can be applied to either imperative or functional programming, but one of the main limitations in speed in the current CTP of F# (September 2008) is the speed of handling seq's.

For instance, the following code just performs the trivial exercise of printing the billionth (and one) element of the infinite sequence of integers starting at zero. The time required is dependent on the sequence generator and that used in the built in library function has clearly been optimized as ie. "test1" as compared to Chris Smith's method ie. "test0", as follows:

let test0 = 0 |> Seq.unfold (fun i -> Some(i, i + 1))
let test1 = Seq.init_infinite (fun i -> i)

let st = System.DateTime.Now.Ticks
let doit = test1 |> Seq.nth 1000000000
let elpsd = (System.DateTime.Now.Ticks - st) / 10000L

printfn "%d in %d milliseconds.\r\n" doit elpsd

This takes about 25 seconds for test1 on my low end machine where as the Chris Smith generator takes over 10 times as long. Even the built in one is taking about 40 clock cycles per loop which is still a little long but I suppose ;)

Looks like the library functions and the compiler could do with some more speed optimization as to seq operations. As you say, everything is to do with algorithms.

Regards, Gordon Goodsman
Gordon Goodsman
Saturday, March 07, 2009 10:49:15 AM (Pacific Standard Time, UTC-08:00)
Gordon: Thanks for your excellent comments. It's true that I only went about halfway into to O'Neill's complete solution. I stopped short of introducing a priority queue, and chose to use a mutable map instead. FWIW, there are some excellent performance improvements coming in the next release of F# for sequence expressions.
Saturday, March 14, 2009 3:22:17 AM (Pacific Standard Time, UTC-08:00)
Dustin: This problem intrigues me and has led me into investigating F# a little more for this particular use. I realize that you stopped short of all the improvements discussed in O'Neill's article, but even the priority queue and the "factoring wheel" have their limits and would only increase the speed by some single digit factor. In fact, I also investigated use of the Richard Bird mutable list (lazy because it's in Haskell) as discussed at the end of the article, which is difficult to implement in F# because it seems to have lost its LazyList type somewhere before CTP version 1.9.6.2. The real limit of these and all similar declarative algorithms as compared to the imperative ones is the need to do either an efficient sort (for the Richard Bird or the priority queue algorithms) or an efficient look up (for the use of the Map or Dictionary), where as the imperative algorithms don't need to do this. Thus, the imperative algorithms can be written to take just over linear time per iteration and the declarative style can never approach that.

This leads one to look at where functional programming should properly be used and one immediately sees that it shouldn't really even be considered for implementation of Eratosthenes Sieve for any sort of large range of numbers such as even the 32 bit number range of something a little over 200 million primes. One can do this on our quad core machines in something like 7 seconds for imperative programming and using parallel processing (C#), but I wouldn't want to wait on it using functional programming in F# (or any functional language), even with the most optimized of O'Neill's algorithms.

F# fans will say "Ah, but you can mix imperative and declarative programming styles in F#.", but I think what you will find as one tries to come anywhere close to the processing speed of the imperative style for large numeric problems such as this is a single declarative line that says "Filter all of the possible odd primes by the imperative 'isprime' function", and none of the brilliant algorithms of which we have spoken will have been used at all.

Yes, F# is an elegant language capabable of expressing a mathematical concept in a few lines of code and testing that for a reasonable range of numbers, but the language and its encouraged programming style run into performance limits when larger but still practical ranges of numbers are to be used. At this point, it becomes just another mostly imperative language and will be performance limited by the quality of its compiler as compared to other compilers once the product is mature. However, other than having type inference and first class functions (which C# offers more and more with ongoing versions), its syntax is not really more concise than C# and being a less mature and evolving language, still has many inconsitencies in its syntax and structure. As far as supporting parallel processing, while it does support the concept, it doesn't really do it any better than any other language where the programmer still must know what they are doing in order to effectively use it (this includes the use of the new Parallel library functions in DotNet).

So while F# may find its uses for problems that don't put any real burden on the processing power of our current desktop computers such as the relatively small numeric ranges of the Euler Problems, and for parallel processing where the processes are pretty much independent of each other until complete, it should never be seriously considered for a problem such as calculating all the primes in the 32 bit number range, at least using the Sieve of Eratosthenes or any algorithm derived from it.
Gordon Goodsman
Comments are closed.
msn cams