Tuesday, March 17, 2009

I’m back again with another post in my Yet Another Project Euler Series. As a reminder, my approach to these problems is to try to find the most beautiful solution that I can using F#. While performance is an important, I’m looking for the most elegant solutions that I can find.

Project Euler problem ten is another dealing prime numbers.

The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.

Find the sum of all the primes below two million.

As you can guess, there’s not too much to this problem. In fact, we’ll declare just one simple supporting function.

let lessThan y x = x < y

Using the above function and our existing prime number generator, our solution is short and simple.

primes
  |> Seq.take_while (lessThan 2000000L)
  |> Seq.sum

I feel a little guilty that this post is so short, but that’s the benefit of spending two whole blog posts working out a prime number generator for problem seven.

posted on Tuesday, March 17, 2009 2:13:48 AM (Pacific Standard Time, UTC-08:00)  #    Comments [5]

kick it on DotNetKicks.com
Tuesday, March 17, 2009 5:48:08 AM (Pacific Standard Time, UTC-08:00)
One could also use (<) instead of defining a new lessThan function:

primes
|> Seq.take_while ((<) 2000000L)
|> Seq.sum

I guess your solution is slightly more readable...
Martin
Tuesday, March 17, 2009 6:32:16 AM (Pacific Standard Time, UTC-08:00)
Martin: Great point. I actually considered adding a footnote for this. Generally, I've been avoiding passing operators as functions in my solutions, simply for the sake of the small readability boost.
Friday, March 27, 2009 9:23:27 PM (Pacific Standard Time, UTC-08:00)
Hi. Meetings are indispensable when you don't want to do anything.
I am from Zealand and learning to write in English, give true I wrote the following sentence: "Zed was formed in to simplify leisure travel arrangements for airline employees travelling on another carrier."

Thanks for the help ;), Jada.
Thursday, April 09, 2009 4:25:19 AM (Pacific Standard Time, UTC-08:00)
Dustin, I see your comment about finding the most elegant solution to this problem using F#. I submit that applying Melissa E. O'Neill's work to this trivial problem of calculating all the primes up to two million is not the most elegant solution, even if you were to make the very minor changes to four lines to make it only concerned with the odd primes (add 2 instead of 1 to the two calls to the recursive sieve, make the comp = x + prime + prime or two times prime, and make primes equal appending two to the front of the sequence starting at 3 for all odd - add 2 for each). Problem 10 using this algorithm would then take about 1.8 seconds in "Release" mode on a machine of about the same power as yours instead of about twice that as currently implemented.

Remember that O'Neill set out to prove that the usual recursive functional programming solution to finding the sequence of primes is not a true Eratosthenes Sieve on two counts: 1) that it isn't true to the strike-off pattern of the Eratosthenes algorithm in that it just uses trial division instead, and 2) that it isn't an implementation of the Eratosthenes Sieve because it takes many more operations per prime cycle than the true sieve does. She most ably proved number two and then submitted her own incremental strike-off recursive algorithm that is better. Even not considering implementing the wheel factorization to the problem, which is likely overkill for such a limited range of primes, she glossed over the problem of finding and writing a functional priority queue that would reduce the average time to increment to the next value after re-inserting the primes in the queue to anything less than O(log n) time. Although there are very complex priority queues that can limit this to linear time for some special limited uses, none of the queues that she suggests do so. Also, she did not consider the memory usage for her algorithm in that to find even the first two million primes there will be all the primes less than two million entries in the various linked lists and given that you are using 64 bit keys and values in order to prevent some unique overflow problems with the algorithm as expressed, will use about 360 KBytes of RAM for even this trivial number of primes, not including the hash values (in the case of a Dictionary) or the extra link pointers (in the case of a queue).

I submit that if we revert back to the pure Eratosthenes Sieve using a check-off array, we both have an algorithm that is more true to his original idea and gain performance for such a limited range of primes. I have written the following using your idea of easy readability using helper functions but I have made them inline in order to not sacrifice performance. Since it uses a fixed array for the whole problem, I size this to include all of the odd possible primes up to the required value plus a few in that I have read that there is never a jump between primes of more than 255 until one reaches a very large prime:

First, a couple of constants to establish the size of the array and the index to the number just less than the root of the maximum prime:

let PGSZ = 2000000 / 2 + 256
let NXT = (int32(sqrt(float(PGSZ * 2 - 1))) - 3) / 2

and the actual check-off boolean array:

let oddPossibles = Array.create PGSZ true

then the helper functions as discussed:

let inline isNdxPrime i = oddPossibles.[i]
let inline prime2ndx = (n - 3) >>> 1 //convert to index by - 3) / 2
let inline prime2startndx p = prime2ndx(p * p)
let inline ndx2prime i = i + i + 3 // is 2 * i + 3
let inline lessThan lim i = i < lim
let inline repeatPattern p = seq{(prime2startndx p)..int(p)..(PGSZ - 1)}
let inline markoffComposite f = oddPossibles.[f] <- false
let inline eliminateComposites i = repeatPattern (ndx2prime i) |> Seq.iter markoffComposite

Note the elegant expression of the strike-off repeat pattern using a (lazy) sequence.

Finally, the expression of the odd primes sequence, which consists of two lines, one to strike-off the composites and one to provide the sequence of the remaining primes:

let oddprimes =
__seq{0..NXT} |> Seq.filter isNdxPrime |> Seq.iter eliminateComposites
__seq{0..(PGSZ - 1)} |> Seq.filter isNdxPrime |> Seq.map ndx2prime

and the actual primes sequence that assembles it with the only even prime of two:

let primes = Seq.append (Seq.singleton 2) oddprimes

Now, I agree that this gets a little more complex if we were to consider finding much larger primes in that the array could get very large, but you know we would then burn about twice the lines of code and implement array paging and possibly wheel factorization to reduce the requirements on memory bandwidth and memory; however, you said you wanted adequate elegance for the problem at hand. This solution solves problem 10 in about 0.260 seconds on the same machine as above including the time to allocate and initialize the check-off array, although it does require about a megabyte of spare memory (nothing these days). Note that this algorithm doesn't have an overflow problem up to finding primes of two billion and thus doesn't need to use 64 bit integers although one needs a map of the prime results to uint64 or int64 before calculating the sum in order to avoid sum overflow.

Alternatively, for much larger ranges of primes we could implement O'Neil's final form of her incremental strike-off algorithm and find an elegant implementation of her required priority queue - I am thinking an implementation of an array based queue would fill the requirements of linear average operations per prime check cycle and this might be even faster than the paged array technique. For a fixed maximum primes limit, this could likely be written in about 10 to 15 "elegant" F# lines.

Now that's elegant, thank you Eratosthenes, and take that Dustin ;-)
Gordon
Friday, April 10, 2009 7:33:02 PM (Pacific Standard Time, UTC-08:00)
Looks fantastic Gordon!
Comments are closed.
msn cams