F# performance difference between tail recursion and Seq library - performance

I have this code in F# which finds the smallest positive number that is evenly divisible by all of the numbers from 1 to 20. It takes 10 seconds to complete.
let isDivisableByAll num (divisors: int[]) = Array.forall (fun div -> num % div = 0) divisors
let minNumDividedBy (divisors: int[]) =
    let rec minNumDividedByAll stopAt acc =
        if acc >= stopAt then 0
        else if isDivisableByAll acc divisors then acc
        else minNumDividedByAll stopAt (acc + 1)
    minNumDividedByAll 400000000 1
minNumDividedBy [|1..20|]
So, I thought I could make it more elegant, because I prefer less code and wrote the following.
let answer =
    { 1..400000000 }
    |> Seq.tryFind (fun el -> isDivisableByAll el [|1..20|])
It took 10 minutes! I couldn't explain the huge difference, since sequences are lazy. In an effort to investigate, I wrote an imperative loop.
let mutable i = 1
while i < 232792561 do
    if isDivisableByAll i [|1..20|] then
        printfn "%d" i
    i <- i + 1
It took 8 minutes. Therefore, it's not the sequence's fault either, right? So, why is the initial function so fast? It can't be avoiding building up the stack, due to tail recursion, can it? Because I wouldn't expect a considerable stack if any, being built in the slow examples either.
It doesn't make much sense to me, can someone tell me?
Thank you.

If I understand correctly, you are trying to find how many numbers between 1 and 400000000 (inclusive) are divisible by all the numbers from 1 to 20. I made my own crude version of it:
let factors = Array.rev [| 2 .. 20 |]
let divisible f n =
    Array.forall (fun x -> n % x = 0) f
let solution () =
    {1 .. 400000000}
    |> Seq.filter (divisible factors)
    |> Seq.length
This solution takes over 90 seconds to run where I tested it. But I came to realize that it is a variation of Euler problem number 5, where we learn that 2520 is the first number divisible by all the numbers from 1 to 10. Using this fact, we can create a sequence of multiples of 2520, and test only the numbers from 11 to 19, as the multiples are guaranteed to be divisible by all the numbers from 1 to 10, and 20 as well:
let factors = Array.rev [| 11 .. 19 |]
let divisible f n =
    Array.forall (fun x -> n % x = 0) f
let solution () =
    Seq.initInfinite (fun i -> (i + 1) * 2520)
    |> Seq.takeWhile (fun i -> i <= 400000000)
    |> Seq.filter (divisible factors)
    |> Seq.length
This solution takes 0.191 seconds.
If you don't know about Euler problem number 5, you can even algorithmically compute sequences with elements that are multiples of a given starting value. We feed the algorithm a sequence of numbers divisible by all numbers from 2 to n - 1, and it computes the first number divisible by all numbers from 2 to n. This is iterated through until we have a sequence of multiples of the first number divisible by all the factors we want:
let narrowDown m n s =
    (s, {m .. n})
    ||> Seq.fold (fun a i ->
        let j = Seq.find (fun x -> x % i = 0) a
        Seq.initInfinite (fun i -> (i + 1) * j))
let solution () =
    Seq.initInfinite (fun i -> i + 1)
    |> narrowDown 2 20
    |> Seq.takeWhile (fun i -> i <= 400000000)
    |> Seq.length
This solution runs in 0.018 seconds.
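Each step of that fold ends up picking the least common multiple of the previous value and the next factor, so the same count can probably be reached with plain arithmetic. A minimal sketch, assuming the helper names gcd and lcm (they are not part of the answer above):

```fsharp
// Hypothetical helpers, not from the answer above.
let rec gcd a b = if b = 0 then a else gcd b (a % b)

// Divide before multiplying to keep the intermediate value small.
let lcm a b = a / gcd a b * b

// Smallest positive number divisible by every number from 2 to 20.
let smallest = List.fold lcm 1 [ 2 .. 20 ]

// How many multiples of it are at most 400000000.
let count = 400000000 / smallest

printfn "%d %d" smallest count // 232792560 1
```

This swaps the sequence search for Euclid's algorithm, so it does no enumeration at all.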

As Fyodor Soikin commented, making a new array [|1..20|] for each iteration in the seq solution is the main culprit. If I define the array once and pass it in, I can run it in 10 seconds, compared to 27 seconds for the recursive solution. The remaining disparity must be down to the extra machinery needed around a lazy sequence, compared to recursion that is tail-call optimised into a loop.
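A minimal sketch of that fix, using a smaller range (divisors 1 to 10) so it finishes instantly. The names isDivisibleByAll and divisors are illustrative, and the argument order is flipped from the question's function so it can be partially applied:

```fsharp
// Divisors come first so the function can be partially applied below.
let isDivisibleByAll (divisors: int[]) num =
    Array.forall (fun d -> num % d = 0) divisors

// Allocated once, outside the search, instead of once per candidate.
let divisors = [| 1 .. 10 |]

let answer =
    seq { 1 .. 10000 }
    |> Seq.tryFind (isDivisibleByAll divisors)

printfn "%A" answer // Some 2520
```

Writing [|1..10|] inside the lambda instead would allocate and fill a fresh array for every candidate number.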
Making the isDivisableByAll an inline function makes a significant difference for the recursive solution (down to 6 seconds). It doesn't seem to affect the seq solution.

Related

F#: Two Dimensional Array - Generate all possible binary combinations

What approach would you use to generate the set of NxN matrices containing only zeros and ones which represents all possible distinct combinations?
let matrix = Array2D.init N N (fun x y -> something)
If you don't know F# then pseudocode would be a contribution as well.
So what I want is a list/array of all the distinct matrix combinations.
So, I think the hard part is generating the list of elements. We can do it recursively.
The base case is easy. For a 1x1 matrix, you have 1 element which can only have two combinations: [|[|0|]; [|1|]|].
For a 2x2 matrix, we have 2^2 = 4 elements. Each one of these can be either 1 or 0, so there are 2^4 = 16 combinations possible. To get all the combinations possible for this 2x2 array, we can think of it as an array of length 4.
But first, let's think about an array of length 2. Then we have to find all the combinations between [|[|0|]; [|1|]|] and [|[|0|]; [|1|]|]. This would be [|[|0; 0|]; [|0;1|]; [|1;0|]; [|1; 1|]|]. Luckily, there's a function called Array.allPairs which will generate the array of all possible combinations between two arrays, which already does this for us!
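As a small illustration of that step (the names bits, pairs and rows are only for this sketch), Array.allPairs returns tuples, which then need turning back into arrays:

```fsharp
let bits = [| 0; 1 |]

// Every (a, b) with a taken from the first array and b from the second.
let pairs = Array.allPairs bits bits

// Turn each tuple into a two-element array.
let rows = pairs |> Array.map (fun (a, b) -> [| a; b |])

printfn "%A" pairs // [|(0, 0); (0, 1); (1, 0); (1, 1)|]
```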
So, we can apply Array.allPairs to each element of our array of length 4 sequentially to get all the possible combinations for the entire matrix using Array.reduce. I make a function called pairsToArray to basically flatten the data structure.
let pairsToArray x = Array.concat [|fst x; snd x|]
let rec binary N =
    match N with
    | 0 -> [||]
    | 1 -> [|[|0|]; [|1|]|]
    | n ->
        let elements = n*n
        let combinations = Array.init elements (fun i -> binary 1)
        let result = Array.reduce (fun acc i -> Array.allPairs acc i |> Array.map pairsToArray) combinations
        result
Now, all that remains is converting this to an Array2D.
Something like this should do the trick for the 2x2 case:
let c = binary 2
c |> Array.map (fun i -> Array2D.init 2 2 (fun j k -> i.[j+k*2]))
Maybe something like this:
let rec addOne (N1: int, N2: int) (M: int[,]) (i: int, j: int) =
    if M.[i,j] = 0 then
        M.[i,j] <- 1
        true
    else
        M.[i,j] <- 0
        let newi, newj =
            if i < N1-1
            then (i+1,j)
            else (0,j+1)
        if newj = N2
        then false
        else addOne (N1, N2) M (newi,newj)
combined with this:
let N = 3
let M: int[,] = Array2D.zeroCreate N N
let mylist =
    [ yield M
      while addOne (N,N) M (0,0) do
          yield Array2D.copy M ]
I don't know if it makes sense.
It is a method to find the "next" matrix, and then make a list of all the matrices that we encounter that way.
edit: replaced bool with int (0 and 1) to better fit the original question.

F# List optimisation

From an unordered list of int, I want to find the smallest difference between two elements. I have code that works but is way too slow. Can anyone suggest a change to improve the performance? Please explain why you made the change and what the performance gain will be.
let allInt = [ 5; 8; 9 ]
let N = List.length allInt
let sortedList = allInt |> List.sort
let differenceList = [ for a in 0 .. N-2 do yield sortedList.Item(a + 1) - sortedList.Item(a) ]
printfn "%i" (List.min differenceList) // prints 1 (because 9 - 8 is the smallest difference)
I think I'm doing too much list creation or iteration but I don't know how to write it differently in F#...yet.
Edit: I'm testing this code on lists with 100 000 items or more.
Edit 2: I believe that if I can calculate the difference and find the min in one go it should improve the perf a lot, but I don't know how to do that. Any idea?
Thanks in advance
List.Item runs in O(n) time and is probably the main performance bottleneck in your code. The evaluation of differenceList accesses the elements of sortedList by index, which means the performance is around O((N-2)(2(N-2))), which simplifies to O(N^2), where N is the number of elements in sortedList. For long lists, this will eventually perform badly.
What I would do is eliminate the calls to Item and use the List.pairwise operation instead:
let data =
    [ let rnd = System.Random()
      for i in 1..100000 do yield rnd.Next() ]
#time
let result =
    data
    |> List.sort
    |> List.pairwise // convert list from [a;b;c;...] to [(a,b); (b,c); ...]
    |> List.map (fun (a,b) -> a - b |> abs) // Calculates the absolute difference
    |> List.min
#time
The #time directives let me measure execution time in F# Interactive, and the output I get when running this code is:
--> Timing now on
Real: 00:00:00.029, CPU: 00:00:00.031, GC gen0: 1, gen1: 1, gen2: 0
val result : int = 0
--> Timing now off
F#'s built-in list type is implemented as a linked list, which means accessing an element by index has to walk the list all the way to that index each time. In your case you have two index accesses repeated N-1 times, getting slower and slower with each iteration, as the index grows and each access needs to go through a longer part of the list.
First way out of this would be using an array instead of a list, which is a trivial change, but grants you faster index access.
(*
[| and |] let you define an array literal,
alternatively use List.toArray allInt
*)
let allInt = [| 5; 8; 9 |]
let sortedArray = allInt |> Array.sort
// larger neighbour minus smaller one, so the differences are non-negative
let differenceList = [ for a in 0 .. sortedArray.Length - 2 do yield sortedArray.[a + 1] - sortedArray.[a] ]
Another approach might be pairing up the neighbours in the list, subtracting them and then finding a min.
let differenceList =
    sortedList
    |> List.pairwise
    |> List.map (fun (x,y) -> y - x) // y >= x because the list is sorted
List.pairwise takes a list of elements and returns a list of the neighbouring pairs. E.g. in your example List.pairwise [ 5; 8; 9 ] = [ (5, 8); (8, 9) ], so that you can easily work with the pairs in the next step, the subtraction mapping.
This way is better, but these functions from the List module take a list as input and produce a new list as output, so the data is traversed 3 times (once for pairwise, once for map, once for min at the end). To solve this, you can use functions from the Seq module, which work with .NET's IEnumerable<'a> interface and evaluate lazily, usually resulting in fewer passes.
Fortunately in this case Seq defines alternatives for all the functions we use here, so the next step is trivial:
let differenceSeq =
    sortedList
    |> Seq.pairwise
    |> Seq.map (fun (x,y) -> y - x) // y >= x because the list is sorted
let minDiff = Seq.min differenceSeq
This should need only one enumeration of the list (excluding the sorting phase of course).
But I cannot guarantee which approach will be fastest. My bet would be on simply using an array instead of the list, but to find out, you will have to try it and measure for yourself, on your data and your hardware. The BenchmarkDotNet library can help you with that.
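For comparison, here is a rough sketch of the array route that computes the difference and tracks the minimum in one loop, without building an intermediate collection. The function name minAdjacentDifference is made up for this example, and it has not been benchmarked against the other versions:

```fsharp
// Hypothetical helper: smallest difference between any two elements.
// Returns Int32.MaxValue when there are fewer than two elements.
let minAdjacentDifference (values: int list) =
    let sorted = values |> List.toArray |> Array.sort
    let mutable best = System.Int32.MaxValue
    for i in 0 .. sorted.Length - 2 do
        // After sorting, the closest pair is always a pair of neighbours.
        let d = sorted.[i + 1] - sorted.[i]
        if d < best then best <- d
    best

printfn "%d" (minAdjacentDifference [ 5; 8; 9 ]) // 1
```

Array indexing is O(1), so after the O(N log N) sort the loop itself is a single linear pass.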
The rest of your question is adequately covered by the other answers, so I won't duplicate them. But nobody has yet addressed the question you asked in your Edit 2. To answer that question, if you're doing a calculation and then want the minimum result of that calculation, you want List.minBy. One clue that you want List.minBy is when you find yourself doing a map followed by a min operation (as both the other answers are doing): that's a classic sign that you want minBy, which does that in one operation instead of two.
There's one gotcha to watch out for when using List.minBy: It returns the original value, not the result of the calculation. I.e., if you do ints |> List.pairwise |> List.minBy (fun (a,b) -> abs (a - b)), then what List.minBy is going to return is a pair of items, not the difference. It's written that way because if it gives you the original value but you really wanted the result, you can always recalculate the result; but if it gave you the result and you really wanted the original value, you might not be able to get it. (Was that difference of 1 the difference between 8 and 9, or between 4 and 5?)
So in your case, you could do:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> List.sort // an unordered list must be sorted first so the closest values are neighbours
    |> List.pairwise
    |> List.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
The List.minBy operation also exists on sequences, so if your list is large enough that you want to avoid creating an intermediate list of pairs, then use Seq.pairwise and Seq.minBy instead:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> Seq.sort // an unordered list must be sorted first so the closest values are neighbours
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
EDIT: Yes, I see that you've got a list of 100,000 items. So you definitely want the Seq version of this. The F# seq type is just IEnumerable, so if you're used to C#, think of the Seq functions as LINQ expressions and you'll have the right idea.
P.S. One thing to note here: see how I'm doing let a, b = minPair? That's called destructuring assignment, and it's really useful. I could also have done this:
let a, b =
    allInt
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
and it would have given me the same result. Seq.minBy returns a tuple of two integers, and the let a, b = (tuple of two integers) expression takes that tuple, matches it against the pattern a, b, and thus assigns a to have the value of that tuple's first item, and b to have the value of that tuple's second item. Notice how I used the phrase "matches it against the pattern": this is the exact same thing as when you use a match expression. Explaining match expressions would make this answer too long, so I'll just point you to an excellent reference on them if you haven't already read it:
https://fsharpforfunandprofit.com/posts/match-expression/
Here is my solution:
let minPair xs =
    let foo (x, y) = abs (x - y)
    xs
    |> List.allPairs xs
    |> List.filter (fun (x, y) -> x <> y)
    |> List.minBy foo
    |> foo

dynamic programming and continuation passing style

For simple problems like Fibonacci, writing CPS is relatively straightforward:
let fibonacciCPS n =
    let rec fibonacci_cont a cont =
        if a <= 2 then cont 1
        else
            fibonacci_cont (a - 2) (fun x ->
                fibonacci_cont (a - 1) (fun y ->
                    cont(x + y)))
    fibonacci_cont n (fun x -> x)
However, in the case of the rod-cutting example from here (or the book Introduction to Algorithms), the number of closures is not always equal to 2, and can't be hard-coded.
I imagine one has to change the intermediate variables to sequences.
(I like to think of the continuation as a contract saying "when you have the value, pass it on to me, then i'll pass it on to my boss after treatment" or something along those line, which defers the actual execution)
For the rod cutting, we have
//rod cutting
let p = [|1;5;8;9;10;17;17;20;24;30|]
let rec r n = seq { yield p.[n-1]; for i in 1..(n-1) -> (p.[i-1] + r (n-i)) } |> Seq.max
[1 .. 10] |> List.map (fun i -> i, r i)
In this case, I will need to attach the newly created continuation
let cont' = fun (results: _ array) -> cont(seq { yield p.[n-1]; for i in 1..(n-1) -> (p.[i-1] + ks.[n-i]) } |> Seq.max)
to the "cartesian product" continuation made by the returning subproblems.
Has anyone seen a CPS version of rod-cutting / has any tips on this ?
I assume you want to explicitly CPS everything, which means some nice stuff like the list comprehension will be lost (maybe using async blocks can help, I don't know F# very well) -- so starting from a simple recursive function:
let rec cutrod (prices: int[]) = function
    | 0 -> 0
    | n ->
        [1 .. min n (prices.Length - 1)]
        |> List.map (fun i -> prices.[i] + cutrod prices (n - i))
        |> List.max
It's clear that we need CPS versions of the list functions used (map, max and perhaps a list-building function if you want to CPS the [1..(blah)] expression too). map is quite interesting since it's a higher-order function, so its first parameter needs to be modified to take a CPS-ed function instead. Here's an implementation of a CPS List.map:
let rec map_k f list k =
    match list with
    | [] -> k []
    | x :: xs -> f x (fun y -> map_k f xs (fun ys -> k (y :: ys)))
Note that map_k invokes its argument f like any other CPS function, and puts the recursion in map_k into the continuation. With map_k, max_k, gen_k (which builds a list from 1 to some value), the cut-rod function can be CPS-ed:
let rec cutrod_k (prices: int[]) n k =
    match n with
    | 0 -> k 0
    | n ->
        gen_k (min n (prices.Length - 1)) (fun indices ->
            map_k
                (fun i k -> cutrod_k prices (n - i) (fun ret -> k (prices.[i] + ret)))
                indices
                (fun totals -> max_k totals k))
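The answer does not show gen_k or max_k. One possible way to fill them in, as an assumption rather than the author's definitions, is to wrap the ordinary list operations and hand the result to the continuation. The prices array below reuses the question's price table with a placeholder 0 at index 0, because cutrod_k indexes prices from 1:

```fsharp
// Assumed helper: passes the list [1 .. n] to the continuation.
let gen_k n k = k [ 1 .. n ]

// Assumed helper: passes the largest element to the continuation.
let max_k (xs: int list) k = k (List.max xs)

// map_k and cutrod_k as in the answer.
let rec map_k f list k =
    match list with
    | [] -> k []
    | x :: xs -> f x (fun y -> map_k f xs (fun ys -> k (y :: ys)))

let rec cutrod_k (prices: int[]) n k =
    match n with
    | 0 -> k 0
    | n ->
        gen_k (min n (prices.Length - 1)) (fun indices ->
            map_k
                (fun i k -> cutrod_k prices (n - i) (fun ret -> k (prices.[i] + ret)))
                indices
                (fun totals -> max_k totals k))

// prices.[i] is the price of a piece of length i; index 0 is unused.
let prices = [| 0; 1; 5; 8; 9; 10; 17; 17; 20; 24; 30 |]

let best4 = cutrod_k prices 4 id
let best10 = cutrod_k prices 10 id
printfn "%d %d" best4 best10 // 10 30
```

These two helpers are not written in CPS internally; a fully explicit version would recurse with continuations the way map_k does.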

F# using sequence cache correctly

I'm trying to use Seq.cache with a function that I made that returns a sequence of primes up to a number N excluding the number 1. I'm having trouble figuring out how to keep the cached sequence in scope but still use it in my definition.
let rec primesNot1 n =
    {2 .. n}
    |> Seq.filter (fun i ->
        (primesNot1 (i / 2) |> Seq.forall (fun o -> i % o <> 0)))
    |> Seq.append {2 .. 2}
    |> Seq.cache
Any ideas of how I could use Seq.cache to make this faster? Currently it keeps dropping from scope and is only slowing down performance.
Seq.cache caches an IEnumerable<T> instance so that each item in the sequence is only calculated once. In your case, though, you're caching the sequence returned by a function, and each time you call the function you get a new cached sequence, which doesn't do you any good. I don't think caching is really the right approach to your problem as you've outlined it; instead you should probably look into memoization.
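A rough sketch of what memoization could look like here. The names memoize, isPrime, primesUpTo and calls are all invented for this example, and the prime test is plain trial division rather than the question's recursive definition:

```fsharp
open System.Collections.Generic

// Hypothetical helper: wraps f so each distinct argument is computed once.
let memoize f =
    let cache = Dictionary<_, _>()
    fun x ->
        match cache.TryGetValue x with
        | true, v -> v
        | false, _ ->
            let v = f x
            cache.[x] <- v
            v

// Trial division up to the square root.
let isPrime n =
    n >= 2
    && seq { 2 .. n - 1 }
       |> Seq.takeWhile (fun d -> d * d <= n)
       |> Seq.forall (fun d -> n % d <> 0)

// Counts how often the underlying function actually runs.
let calls = ref 0

let primesUpTo =
    memoize (fun n ->
        calls.Value <- calls.Value + 1
        [ 2 .. n ] |> List.filter isPrime)

let first = primesUpTo 20
let second = primesUpTo 20 // served from the cache
printfn "%A (computed %d time(s))" second calls.Value
```

The cache lives in the closure created by the single call to memoize, so it stays in scope for as long as primesUpTo does.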
If instead of defining a function giving the primes less than n you want to define an infinite enumerable sequence of primes, then caching makes more sense. That would look more like this:
let rec upFrom i =
    seq {
        yield i
        yield! upFrom (i+1)
    }
let rec primes =
    seq {
        yield 2
        yield!
            upFrom 3
            |> Seq.filter (fun p -> primes |> Seq.takeWhile (fun j -> j*j <= p) |> Seq.forall (fun j -> p % j <> 0))
    }
    |> Seq.cache
I haven't compared the performance of this method to yours.
I figured out how to solve my problem with a fold, but not with my idea of using Seq.cache.
let primesNot1 n =
    {2 .. n}
    |> Seq.fold (fun primes i ->
        if primes |> Seq.forall (fun o -> i % o <> 0) then
            List.append primes [i]
        else
            primes) [2]
Have you taken a look at LazyList? Seems like it's designed to solve the same problem. It's in PowerPack.

Functional learning woes

I'm a beginner to functional languages, and I'm trying to get the whole thing down in Haskell. Here's a quick-and-dirty function that finds all the factors of a number:
factors :: (Integral a) => a -> [a]
factors x = filter (\z -> x `mod` z == 0) [2..x `div` 2]
Works fine, but I found it to be unbearably slow for large numbers. So I made myself a better one:
factorcalc :: (Integral a) => a -> a -> [a] -> [a]
factorcalc x y z
    | y `elem` z      = sort z
    | x `mod` y == 0  = factorcalc x (y+1) (z ++ [y] ++ [(x `div` y)])
    | otherwise       = factorcalc x (y+1) z
But here's my problem: Even though the code works, and can cut literally hours off the execution time of my programs, it's hideous!
It reeks of ugly imperative thinking: It constantly updates a counter and a data structure in a loop until it finishes. Since you can't change state in purely functional programming, I cheated by holding the data in the parameters, which the function simply passes to itself over and over again.
I may be wrong, but there simply must be a better way of doing the same thing...
Note that the original question asked for all the factors, not for only the prime factors. There being many fewer prime factors, they can probably be found more quickly. Perhaps that's what the OQ wanted. Perhaps not. But let's solve the original problem and put the "fun" back in "functional"!
Some observations:
The two functions don't produce the same output---if x is a perfect square, the second function includes the square root twice.
The first function checks a number of potential factors proportional to the size of x; the second function checks only a number proportional to the square root of x, then stops (with the bug noted above).
The first function (factors) allocates a list of all integers from 2 to x div 2, whereas the second function never allocates a list but instead visits fewer integers one at a time in a parameter. I ran the optimizer with -O and looked at the output with -ddump-simpl, and GHC just isn't smart enough to optimize away those allocations.
factorcalc is tail-recursive, which means it compiles into a tight machine-code loop; filter is not and does not.
Some experiments show that the square root is the killer:
Here's a sample function that produces the factors of x from z down to 2:
factors_from x 1 = []
factors_from x z
    | x `mod` z == 0 = z : factors_from x (z-1)
    | otherwise      = factors_from x (z-1)
factors'' x = factors_from x (x `div` 2)
It's a bit faster because it doesn't allocate, but it's still not tail-recursive.
Here's a tail-recursive version that is more faithful to the original:
factors_from' x 1 l = l
factors_from' x z l
    | x `mod` z == 0 = factors_from' x (z-1) (z:l)
    | otherwise      = factors_from' x (z-1) l
factors''' x = factors_from' x (x `div` 2) []
This is still slower than factorcalc because it enumerates all the integers from 2 to x div 2, whereas factorcalc stops at the square root.
Armed with this knowledge, we can now create a more functional version of factorcalc which replicates both its speed and its bug:
factors'''' x = sort $ uncurry (++) $ unzip $ takeWhile (uncurry (<=)) $
    [ (z, x `div` z) | z <- [2..x], x `mod` z == 0 ]
I didn't time it exactly, but given 100 million as an input, both it and factorcalc terminate instantaneously, where the others all take a number of seconds.
How and why the function works is left as an exercise for the reader :-)
ADDENDUM: OK, to mitigate the eyeball bleeding, here's a slightly saner version (and without the bug):
saneFactors x = sort $ concat $ takeWhile small $
    [ pair z | z <- [2..], x `mod` z == 0 ]
  where pair z = if z * z == x then [z] else [z, x `div` z]
        small [z, z'] = z < z'
        small [z] = True
Okay, take a deep breath. It'll be all right.
First of all, why is your first attempt slow? How is it spending its time?
Can you think of a recursive definition for the prime factorization that doesn't have that property?
(Hint.)
Firstly, although factorcalc is "ugly", you could add a wrapper function factors' x = factorcalc x 2 [], add a comment, and move on.
If you want to make a 'beautiful' factors fast, you need to find out why it is slow. Looking at your two functions, factors walks the list about n/2 elements long, but factorcalc stops after around sqrt n iterations.
Here is another factors that also stops after about sqrt n iterations, but uses a fold instead of explicit iteration. It also breaks the problem into three parts: finding the factors (factor); stopping at the square root of x (small) and then computing pairs of factors (factorize):
factors' :: (Integral a) => a -> [a]
factors' x = sort (foldl factorize [] (takeWhile small (filter factor [2..])))
  where
    factor z = x `mod` z == 0
    small z = z <= (x `div` z)
    factorize acc z = z : (if z == y then acc else y : acc)
      where y = x `div` z
This is marginally faster than factorcalc on my machine. You can fuse factor and factorize, and then it is about twice as fast as factorcalc.
The Profiling and Optimization chapter of Real World Haskell is a good guide to the GHC suite's performance tools for tackling tougher performance problems.
By the way, I have a minor style nitpick with factorcalc: prepending a single element to the front of a list is O(1), which is much more efficient than appending to the end of a list of length n, which is O(n). The lists of factors are typically small, so it is not such a big deal, but factorcalc should probably be something like:
factorcalc :: (Integral a) => a -> a -> [a] -> [a]
factorcalc x y z
    | y `elem` z      = sort z
    | x `mod` y == 0  = factorcalc x (y+1) (y : (x `div` y) : z)
    | otherwise       = factorcalc x (y+1) z
"Since you can't change state in purely functional programming, I cheated by holding the data in the parameters, which the function simply passes to itself over and over again."
Actually, this is not cheating; this is a—no, make that the—standard technique! That sort of parameter is usually known as an "accumulator," and it's generally hidden within a helper function that does the actual recursion after being set up by the function you're calling.
A common case is when you're doing list operations that depend on the previous data in the list. The two problems you need to solve are, where do you get the data about previous iterations, and how do you deal with the fact that your "working area of interest" for any particular iteration is actually at the tail of the result list you're building. For both of these, the accumulator comes to the rescue. For example, to generate a list where each element is the sum of all of the elements of the input list up to that point:
sums :: Num a => [a] -> [a]
sums inp = helper inp []
  where
    helper [] acc = reverse acc
    helper (x:xs) [] = helper xs [x]
    helper (x:xs) acc@(h:_) = helper xs (x+h : acc)
Note that we flip the direction of the accumulator, so we can operate on the head of that, which is much more efficient (as Dominic mentions), and then we just reverse the final output.
By the way, I found reading The Little Schemer to be a useful introduction and offer good practice in thinking recursively.
This seemed like an interesting problem, and I hadn't coded any real Haskell in a while, so I gave it a crack. I've run both it and Norman's factors'''' against the same values, and it feels like mine's faster, though they're both so close that it's hard to tell.
factors :: Int -> [Int]
factors n = firstFactors ++ reverse [ n `div` i | i <- firstFactors ]
  where
    firstFactors = filter (\i -> n `mod` i == 0) (takeWhile ( \i -> i * i <= n ) [2..n])
Factors can be paired up into those that are greater than sqrt n, and those that are less than or equal to it (for simplicity's sake, the exact square root, if n is a perfect square, falls into this category). So if we just take the ones that are less than or equal to sqrt n, we can calculate the others later by doing div n i. They'll be in reverse order, so we can either reverse firstFactors first or reverse the result later. It doesn't really matter.
This is my "functional" approach to the problem. ("Functional" in quotes, because I'd approach this problem the same way even in non-functional languages, but maybe that's because I've been tainted by Haskell.)
{-# LANGUAGE PatternGuards #-}
factors :: (Integral a) => a -> [a]
factors = multiplyFactors . primeFactors primes 0 [] . abs where
    multiplyFactors [] = [1]
    multiplyFactors ((p, n) : factors) =
        [ pn * x
        | pn <- take (succ n) $ iterate (* p) 1
        , x <- multiplyFactors factors ]
    primeFactors _ _ _ 0 = error "Can't factor 0"
    primeFactors (p:primes) n list x
        | (x', 0) <- x `divMod` p
        = primeFactors (p:primes) (succ n) list x'
    primeFactors _ 0 list 1 = list
    primeFactors (_:primes) 0 list x = primeFactors primes 0 list x
    primeFactors (p:primes) n list x
        = primeFactors primes 0 ((p, n) : list) x
    primes = sieve [2..]
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
primes is the naive Sieve of Eratosthenes. There are better ones, but this is the shortest method.
sieve [2..]
    => 2 : sieve [x | x <- [3..], x `mod` 2 /= 0]
    => 2 : 3 : sieve [x | x <- [4..], x `mod` 2 /= 0, x `mod` 3 /= 0]
    => 2 : 3 : sieve [x | x <- [5..], x `mod` 2 /= 0, x `mod` 3 /= 0]
    => 2 : 3 : 5 : ...
primeFactors is the simple repeated trial-division algorithm: it walks through the list of primes, and tries dividing the given number by each, recording the factors as it goes.
primeFactors (2:_) 0 [] 50
    => primeFactors (2:_) 1 [] 25
    => primeFactors (3:_) 0 [(2, 1)] 25
    => primeFactors (5:_) 0 [(2, 1)] 25
    => primeFactors (5:_) 1 [(2, 1)] 5
    => primeFactors (5:_) 2 [(2, 1)] 1
    => primeFactors _ 0 [(5, 2), (2, 1)] 1
    => [(5, 2), (2, 1)]
multiplyFactors takes a list of primes and powers, and explodes it back out to a full list of factors.
multiplyFactors [(5, 2), (2, 1)]
    => [ pn * x
       | pn <- take (succ 2) $ iterate (* 5) 1
       , x <- multiplyFactors [(2, 1)] ]
    => [ pn * x | pn <- [1, 5, 25], x <- [1, 2] ]
    => [1, 2, 5, 10, 25, 50]
factors just strings these two functions together, along with an abs to prevent infinite recursion in case the input is negative.
I don't know much about Haskell, but somehow I think this link is appropriate:
http://www.willamette.edu/~fruehr/haskell/evolution.html
Edit: I'm not entirely sure why people are so aggressive about the downvoting on this. The original poster's real problem was that the code was ugly; while it's funny, the point of the linked article is, to some extent, that advanced Haskell code is, in fact, ugly; the more you learn, the uglier your code gets, to some extent. The point of this answer was to point out to the OP that apparently, the ugliness of the code that he was lamenting is not uncommon.

Resources