I'd like to implement an algorithm for the "highest product of 3" problem in Haskell. Here's the problem statement:
Given an array of integers, find the highest product you can get from
three of the integers.
For example, given [1, 2, 3, 4], the algorithm should return 24. And given [-10, -10, 5, 1, 6], the highest product of 3 would be 600 = -10*-10*6.
My attempt (assumed no negatives for the first try):
sol2' a b c [] = a*b*c
sol2' a b c (x:xs) = sol2' a' b' c' xs
    where
        a' = if (x > a) then x else a
        b' = if (x > a && a > b) then a else b
        c' = if (x > a && a > b && b > c) then b else c
sol2 li = sol2' a b c li
    where a = 0
          b = 0
          c = 0
I tested the implementation with [3, 5, 1, 2, 4, 10, 0, 4, 8, 11] but the return value is 550, which is supposed to be 880.
Positive numbers
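You are on the right track in the sense that you look for the highest numbers. The problem, however, is that b and c are only updated when x exceeds the current maximum a, so a value that belongs in second or third place is simply dropped.
Say for instance that we have the numbers [6,2,4]. Then the way (a,b,c) will evolve through the recursion is:
(0,0,0) -> (6,0,0) -> (6,0,0) -> (6,0,0)
Neither 2 nor 4 is larger than a=6, so neither is ever stored, even though both belong in the top three. In your test list the same thing happens to 8: it arrives after 10, is not larger than a, and is discarded, which is why you get 11*10*5 = 550 instead of 11*10*8 = 880.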
Although there are many ways to solve this, probably the best way to do this is to maintain order: ensure that a <= b <= c.
So we can use:
sol1 = sol2' (0,0,0)
sol2' (a,b,c) [] = a*b*c
sol2' t@(a,b,c) (x:xs) = sol2' f xs
    where f | x >= c = (b,c,x)
            | x >= b = (b,x,c)
            | x > a = (x,b,c)
            | otherwise = t
this produces the expected:
Prelude> sol1 [1,2,3,4]
24
Prelude> sol1 [3, 5, 1, 2, 4, 10, 0, 4, 8, 11]
880
Intermezzo: keep track of numbers if negatives are present
Your program first takes (0,0,0) as the first three values. But if the list contains only negative numbers (e.g. [-1,-2,-3]), we of course want to keep track of these instead of the zeros. We can do this for instance by initializing our tuple with elements from the list:
import Data.List(sort)
sol1 (xa:xb:xc:xs) = sol2' (a,b,c) xs
    where [a,b,c] = sort [xa,xb,xc]
So now we take the first three elements, sort these, and use these as the first tuple. The remainder of the list is then processed. This function will error in case sol1 is not given a list with at least three elements, but in that case there probably is no answer. We can use a Maybe to handle the fact that the function is non-total.
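The Maybe idea mentioned above could look roughly like the following. This is only a sketch: the name safeSol1 is made up for illustration, and the fold still tracks just the three largest values, so pairs of negative numbers are not yet taken into account.

```haskell
import Data.List (foldl', sort)

-- Hypothetical total variant of sol1: returns Nothing when the list
-- has fewer than three elements instead of failing at run time.
safeSol1 :: [Int] -> Maybe Int
safeSol1 (xa:xb:xc:xs) = Just (a * b * c)
  where
    [a0, b0, c0] = sort [xa, xb, xc]
    -- Keep the three largest values seen so far, ordered a <= b <= c.
    (a, b, c) = foldl' step (a0, b0, c0) xs
    step t@(p, q, r) x
      | x >= r    = (q, r, x)
      | x >= q    = (q, x, r)
      | x > p     = (x, q, r)
      | otherwise = t
safeSol1 _ = Nothing
```

With this, safeSol1 [1,2] gives Nothing rather than a pattern-match failure, and the caller decides what a missing answer means.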
All numbers
Of course we also want to deal with negative numbers. Multiplying two negative numbers results in a positive number. So by keeping track of the two smallest numbers as well, we can then do the math properly. So first we will use another argument (d,e) to keep track of the smallest numbers with d <= e:
sol1_all = sol2_all' (0,0,0) (0,0)
sol2_all' (a,b,c) (d,e) [] = -- ...
sol2_all' t@(a,b,c) u@(d,e) (x:xs) = sol2_all' f g xs
    where f | x >= c = (b,c,x)
            | x >= b = (b,x,c)
            | x > a = (x,b,c)
            | otherwise = t
          g | x <= d = (x,d)
            | x <= e = (d,x)
            | otherwise = u
So now we have obtained the greatest numbers (a,b,c) and the smallest numbers (d,e). If d and e are both negative, their product d*e is positive, so c*d*e can be larger than a*b*c. That leaves two candidates to consider: a*b*c and c*d*e. So we can write it as:
sol2_all' (a,b,c) (d,e) [] = max (a*b*c) (c*d*e)
sol2_all' t@(a,b,c) u@(d,e) (x:xs) = sol2_all' f g xs
    where f | x >= c = (b,c,x)
            | x >= b = (b,x,c)
            | x > a = (x,b,c)
            | otherwise = t
          g | x <= d = (x,d)
            | x <= e = (d,x)
            | otherwise = u
Note however that this will not always produce the correct result, because the same number can end up counted in both tuples. We can solve this by properly initializing the tuples:
import Data.List(sort)
sol1_all (xa:xb:xc:xs) = sol2_all' (a,b,c) (a,b) xs
    where [a,b,c] = sort [xa,xb,xc]
sol2_all' (a,b,c) (d,e) [] = max (a*b*c) (c*d*e)
sol2_all' t@(a,b,c) u@(d,e) (x:xs) = sol2_all' f g xs
    where f | x >= c = (b,c,x)
            | x >= b = (b,x,c)
            | x > a = (x,b,c)
            | otherwise = t
          g | x <= d = (x,d)
            | x <= e = (d,x)
            | otherwise = u
Rationale behind picking different (possibly equivalent) elements
How do we know that we will not use an element twice? Since we only use a*b*c or c*d*e, in the case of a list with three elements this boils down to max (a*b*c) (a*b*c) (with a, b and c the result of sort), so uniqueness is guaranteed. Since an element is only added to the first tuple if it is greater than a, an x that ends up in both tuples must satisfy a <= x <= b. In that case we obtain the tuples (x,b,c) and (a,x). But since we then evaluate x*b*c and a*x*c, x does not occur twice in either expression.
Leetcode challenge
I submitted a Python version of this code to the Leetcode Challenge and it was accepted:
class Solution:
    def maximumProduct(self, nums):
        a,b,c = d,e,_ = sorted(nums[:3])
        for x in nums[3:]:
            if x >= c:
                a,b,c = b,c,x
            elif x >= b:
                a,b = b,x
            elif x >= a:
                a = x
            if x <= d:
                d,e = x,d
            elif x < e:
                e = x
        return max(a*b*c,c*d*e)
There are somewhat more efficient solutions, but I would lean toward something more straightforward like:
import Data.List (subsequences)
f :: (Num a, Ord a) => [a] -> a
f = maximum . map product . filter ((==3) . length) . subsequences
Thinking about functional algorithms as sequences of transformations on collections makes them much more idiomatic than transforming imperative loops into recursive functions.
Note if you are doing this with really long lists where efficiency is a concern, you can sort the list first, then take the lowest two and the highest three, and the algorithm will still work:
takeFirstLast xs = (take 2 sorted) ++ (drop (length sorted - 3) sorted)
    where sorted = sort xs
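However, my original way is a lot easier to understand and is fast enough for short lists. Keep in mind that subsequences enumerates all 2^n sublists, so it stops being practical beyond a couple dozen elements. I don't believe in sacrificing readability for speed until I'm told it's an actual requirement.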
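The sorted variant described above can be written out as a complete function. This is only an illustration: the name highestProduct3 and the Maybe result for lists with fewer than three elements are assumptions, not part of the answer.

```haskell
import Data.List (sort)

-- Hypothetical helper: highest product of three elements via sorting.
-- Only the two smallest and the three largest elements can matter.
highestProduct3 :: [Int] -> Maybe Int
highestProduct3 xs
  | n < 3     = Nothing
  | otherwise = Just (max (a * b * c) (d * e * c))
  where
    sorted    = sort xs
    n         = length sorted
    [a, b, c] = drop (n - 3) sorted   -- three largest, ascending
    [d, e]    = take 2 sorted         -- two smallest
```

Sorting costs O(n log n), which is more than the single pass in the first answer but far less than enumerating every subsequence.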
Related
How can I efficiently represent the list [0..] \\ [t+0*p, t+1*p ..]?
I have defined:
Prelude> let factors p t = [t+0*p, t+1*p ..]
I want to efficiently represent an infinite list that is the difference of [0..] and factors p t, but using \\ from Data.List requires too much memory for even medium-sized lists:
Prelude Data.List> [0..10000] \\ (factors 5 0)
<interactive>: out of memory
I know that I can represent the values between t+0*p and t+1*p with:
Prelude> let innerList p1 p2 t = [t+p1+1, t+p1+2 .. t+p2-1]
Prelude> innerList 0 5 0
[1,2,3,4]
However, repeatedly calculating and concatenating innerList for increasing intervals seems clumsy.
Can I efficiently represent [0..] \\ (factors p t) without calculating rem or mod for each element?
For the infinite list [0..] \\ [t,t+p..],
yourlist t p = [0..t-1] ++ [i | m <- [0,p..], i <- [t+m+1..t+m+p-1]]
Of course this approach doesn't scale, at all, if you'd want to remove some other factors, like
[0..] \\ [t,t+p..] \\ [s,s+q..] \\ ...
in which case you'll have to remove them in sequence with minus, mentioned in Daniel Fischer's answer. There is no magic bullet here.
But there's also a union, with which the above becomes
[0..] \\ ( [t,t+p..] `union` [s,s+q..] `union` ... )
The advantage is that we can arrange the unions in a tree and get an algorithmic improvement.
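For ascending lists, the union and difference used above can be sketched as lazy merges. The names unionOrd and minusOrd are made up here to avoid clashing with Data.List.union, which is not an ordered merge; this sketch shows a single binary union only, not the tree arrangement.

```haskell
-- Merge two ascending lists, keeping one copy of elements found in both.
unionOrd :: Ord a => [a] -> [a] -> [a]
unionOrd xxs@(x:xs) yys@(y:ys)
  | x < y     = x : unionOrd xs yys
  | x == y    = x : unionOrd xs ys
  | otherwise = y : unionOrd xxs ys
unionOrd xs [] = xs
unionOrd [] ys = ys

-- Remove from the first ascending list every element of the second.
minusOrd :: Ord a => [a] -> [a] -> [a]
minusOrd xxs@(x:xs) yys@(y:ys)
  | x < y     = x : minusOrd xs yys
  | x == y    = minusOrd xs ys
  | otherwise = minusOrd xxs ys
minusOrd xs _ = xs
```

For example, take 10 ([0..] `minusOrd` ([0,5..] `unionOrd` [1,3..])) yields [2,4,6,8,12,14,16,18,22,24], and both inputs may be infinite because each step inspects only the list heads.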
You can't use (\\) for that, because
(\\) :: (Eq a) => [a] -> [a] -> [a]
(\\) = foldl (flip delete)
the list of elements you want to remove is infinite, and a left fold never terminates when the list it folds over is infinite.
If you would rather use something already written than write it yourself, you can use minus from the data-ordlist package.
The performance should be adequate.
Otherwise,
minus :: Ord a => [a] -> [a] -> [a]
minus xxs@(x:xs) yys@(y:ys)
    | x < y = x : minus xs yys
    | x == y = minus xs ys
    | otherwise = minus xxs ys
minus xs _ = xs
You can use a list comprehension with a predicate, using rem:
>>> let t = 0
>>> let p = 5
>>> take 40 $ [ x | x <- [1..], x `rem` p /= t ]
[1,2,3,4,6,7,8,9,11,12,13,14,16,17,18,19,21,22,23,24,26,27,28,29,31,32,33,34,36,37,38,39,41,42,43,44,46,47,48,49]
If you want efficiency, why does your solution have to use list comprehension syntax?
Why not something like this?
gen' n i p | i == p = gen' (n + p) 1 p
gen' n i p = (n+i) : gen' n (i+1) p
gen = gen' 0 1
and then do
gen 5
Because you have ascending lists, you can simply lazily merge them:
nums = [1..]
nogos = factors p t
result = merge nums (dropWhile (<head nums) nogos) where
    merge (a:as) (b:bs)
        | a < b = a : merge as (b:bs)
        | a == b = merge as bs
        | otherwise = error "should not happen"
Writing this in a general way, so that we have a function that builds the difference of two infinite lists provided only that they are in ascending order, is left as an exercise. In the end, the following should be possible
[1..] `infiniteDifference` primes `infiniteDifference` squares
For this, make it a left associative operator.
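One possible way to approach the exercise is sketched below, under the assumption that both lists are ascending. The squares list is only a stand-in for illustration. Backtick operators already default to infixl 9 in Haskell, so the fixity declaration mainly documents the intent.

```haskell
-- Explicitly left associative, so that
--   xs `infiniteDifference` ys `infiniteDifference` zs
-- parses as (xs `infiniteDifference` ys) `infiniteDifference` zs.
infixl 5 `infiniteDifference`

-- Difference of two ascending (possibly infinite) lists.
infiniteDifference :: Ord a => [a] -> [a] -> [a]
infiniteDifference xxs@(x:xs) yys@(y:ys)
  | x < y     = x : infiniteDifference xs yys
  | x == y    = infiniteDifference xs ys
  | otherwise = infiniteDifference xxs ys   -- y is not in the first list; skip it
infiniteDifference xs _ = xs

-- Stand-in example list: the perfect squares in ascending order.
squares :: [Integer]
squares = [n * n | n <- [1 ..]]
```

With these definitions, take 10 ([1..] `infiniteDifference` squares `infiniteDifference` [2,4..]) gives [3,5,7,11,13,15,17,19,21,23].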
I have to write a function which returns a list of all pairs (x,y) where x, y ∈ N, and:
x is the product of two natural numbers (x = a • b, where a, b ∈ N) and
x is strictly greater than 5 but strictly less than 500, and
y is a square number (y = c² where c ∈ N) not greater than 1000, and
x is a divisor of y.
My attempt:
listPairs :: [(Int, Int)]
listPairs = [(a*b, y) | y <- [0..], a <- [0..], b <- [0..],
                        (a*b) > 5, (a*b) < 500, (y*y) < 1001,
                        mod y (a*b) == 0]
But it doesn't return anything and the computer works a lot on it.
However if I choose a smaller range for a, b and y e. g. [0..400], it takes up to a minute but it returns the right result.
So how could I solve the performance issue?
So, of course nested list comprehensions on infinite lists do not terminate.
Fortunately, your lists are not infinite. There's a limit. If x = a*b < 500, then we know that a < 500 and b < 500 (taking a and b to be at least 1). Also, y = c*c < 1001 just means c < 32. So,
listPairs :: [(Int, Int)]
listPairs =
    [(x, c*c) | c <- [1..31], a <- [1..499],   -- a*b < 500 ==> b <= 499/a,
                b <- [a..div 499 a],           -- a*b==b*a ==> b >= a
                let x = a*b, x > 5,
                -- (a*b) < 500, (c*c) < 1001,  -- no need to test this
                rem (c*c) x == 0]
mod 0 n == 0 trivially holds, so I'm excluding 0 from "natural numbers" here.
There are still some duplicates produced here, even though we've limited the b value to b >= a in x=a*b, because x can have several representations (e.g. 1*6 == 2*3).
You can use Data.List.nub to get rid of them.
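A deduplicated version might look like the sketch below. The name listPairsUnique is made up, and the bounds are an assumption derived from the constraints: a <= b together with a*b <= 499 means a can be at most 22.

```haskell
import Data.List (nub)

-- Hypothetical deduplicated variant: every (x, y) appears once,
-- even when x has several factorisations such as 1*6 and 2*3.
listPairsUnique :: [(Int, Int)]
listPairsUnique = nub
  [ (x, y)
  | c <- [1 .. 31]             -- y = c*c <= 961 <= 1000
  , let y = c * c
  , a <- [1 .. 22]             -- a <= b and a*b <= 499
  , b <- [a .. 499 `div` a]
  , let x = a * b
  , x > 5
  , y `rem` x == 0             -- x divides y
  ]
```

nub is quadratic in the length of the list, which is acceptable for a result of this size. For much larger ranges, going through Data.Set would be the usual alternative.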
After learning some Scala and the benefits of FP, I am reimplementing some of my previous CS assignments to better understand FP. However, I got to one assignment that seems impractical to implement with FP (or at least trivially translate).
When solving a simple 2D maze it is necessary to remember which nodes have been visited. However, without shared state, how can each recursive call know what nodes the other recursive calls have examined? I could pass the maze as a parameter to each recursive call and return a new maze containing the places visited, but that seems too computationally intensive to copy an entire maze each recursive call. Would a more advanced approach be required to implement an immutable maze solver?
You can pass around a set containing the visited nodes (or their ids/names if nodes themselves are comparable for equality in your setup). Adding items to an immutable set generally takes O(log n), so does checking whether an element is contained in the set. So that's significantly cheaper than copying the maze.
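A minimal sketch of that idea, written in Haskell rather than Scala to match the other code on this page. The maze is assumed to be given as an adjacency map from node ids to neighbours; the names Graph, search, solve and exampleMaze are all illustrative. Each call returns the updated visited set alongside its result, so sibling branches never re-explore a node.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Node  = Int
type Graph = Map.Map Node [Node]

-- Depth-first search that threads an immutable visited set through the
-- recursion and hands the updated set back to the caller.
search :: Graph -> Node -> Node -> Set.Set Node -> (Maybe [Node], Set.Set Node)
search g goal node visited
  | node == goal = (Just [node], visited')
  | otherwise    = go (Map.findWithDefault [] node g) visited'
  where
    visited' = Set.insert node visited
    go [] seen = (Nothing, seen)
    go (n:ns) seen
      | n `Set.member` seen = go ns seen
      | otherwise =
          case search g goal n seen of
            (Just path, seen') -> (Just (node : path), seen')
            (Nothing,   seen') -> go ns seen'

solve :: Graph -> Node -> Node -> Maybe [Node]
solve g start goal = fst (search g goal start Set.empty)

-- Small made-up maze: 0-1-2 is a dead end, 0-3-4-5 reaches the exit.
exampleMaze :: Graph
exampleMaze = Map.fromList
  [(0, [1, 3]), (1, [0, 2]), (2, [1]), (3, [0, 4]), (4, [3, 5]), (5, [4])]
```

Here solve exampleMaze 0 5 evaluates to Just [0,3,4,5]. Each Set.insert shares most of its structure with the previous set, so no copy of the whole maze is made.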
Perhaps you noticed that my earlier answer was deleted. Although I was poking fun, by suggesting only that the "computer display in red all the dead-ends and in green the path that connects the entrance to the exit," at the same time, it was a metaphor for what I understand of the functional paradigm - a kind of encompassing precomputed certainty. Given my limited understanding and knowledge, I worked on an example in Haskell that avoids a recursive depth search, computing paths for a 4x5 maze, given an array where each cell in the maze (i.e., each array element) contains only the indexes of the cells it can connect to; and -1 for the entrance, -2 for the exit. (You can see an outline of the maze at the top of the code section.) I know, more experienced programmers could do much more and better. Please let me know if this seems to fit in with the spirit of this question (and thank you, Andrew, for the interesting challenge/direction).
{-M A Z E-}
[E]=[ ]=[ ]=[ ]
|
[ ]=[ ]=[ ]=[ ]
| |
[ ] [ ]=[ ] [ ]
| | |
[ ] [ ]=[ ]=[ ]
| |
[ ]=[ ]=[ ]=[E]
import Data.List
import Data.Maybe
--Each element in the maze lists the indexes of connected cells, '-1' for entrance, '-2' for exit
maze = [[-1,1], [0,2,5], [1,3], [2],
[5], [4,6,1,9], [5,7], [6,11],
[12], [5,13,10], [9], [7,15],
[8,16], [14,9,17], [13,15], [14,11],
[12,17], [13,16,18], [17,19], [18,-2]]
maze' = [[-1,1], [0,2], [1,3], [2,7],
[8,5], [4,6], [5,7], [3,6],
[4,9], [8,10], [9,11], [10,15],
[16,13], [12,14], [13,15], [11,14],
[12,17], [16,18], [17,19], [18,-2]]
index a = fromJust $ elemIndex a maze
indexes a = map (index) a
areConnected index_a index_b = elem index_a (maze !! index_b)
isStart a --(a :: cell)
  | elem (-1) a = True
  | otherwise = False
isEnd a --(a :: cell)
  | elem (-2) a = True
  | otherwise = False
hasStart a --(a :: [cell])
  | isStart (head a) = True
  | otherwise = False
hasEnd a --(a :: [cell])
  | isEnd (last a) = True
  | otherwise = False
isSequenced (w:x:xs) (y:z:zs) --includes possibility of overlap since we do not know how many cells comprise the solution
  | areConnected (index $ last xs) (index y)
    || last xs == y
    || let (b:c:cs) = reverse (w:x:xs) in [c,b] == [y,z] = True
  | otherwise = False
removeBacktracks (x:xs)
  | (x:xs) == [] = []
  | xs == [] = [x]
  | x == head xs = removeBacktracks xs
  | length xs > 1 && x == let (y:ys) = xs in head ys = removeBacktracks (tail xs)
  | otherwise = x : removeBacktracks xs
--list dead ends
dead_ends = filter (\x -> length x==1 && find (==(-1)) x == Nothing) maze
dead_ends_indexes = map (index) dead_ends
connectedToDeadEnd (x:xs)
  | x `elem` dead_ends_indexes = True
  | not (x `elem` dead_ends_indexes) && xs == [] = False
  | otherwise = connectedToDeadEnd xs
--list first from dead ends
first_from_dead_ends = filter (\x -> length x==2 && find (==(-1)) x == Nothing && connectedToDeadEnd x) maze
--create sequences
filtered = [l | l <- maze, not (elem l dead_ends) && not (elem l first_from_dead_ends)]
sequences_3 = [[a,b,c] | a <- filtered, not (isEnd a),
                         b <- filtered, not (isEnd b || isStart b), areConnected (index a) (index b),
                         c <- filtered, not (isStart c), a /= c, areConnected (index b) (index c)]
sequences_4 = [a ++ [b] | a <- sequences_3, not (hasEnd a), b <- filtered,
                          last a /= b, areConnected (index $last a) (index b)]
paths = take 1 [indexes $ concat [a, b, c, d, e] | a <- sequences, hasStart a,
                  b <- sequences, not (hasStart b || hasEnd b),
                  isSequenced a b,
                  c <- sequences, b /= c, not (hasStart c || hasEnd c),
                  isSequenced b c,
                  d <- sequences, c /= d, not (hasStart d || hasEnd d),
                  isSequenced c d,
                  e <- sequences, hasEnd e,
                  isSequenced d e]
  where sequences
          | length filtered < 16 = sequences_3
          | otherwise = sequences_4
path = removeBacktracks $ head paths
main = print path
--outputs: [0,1,5,9,13,17,18,19]
this has been bugging me for a while.
Let's say you have a function f x y where x and y are integers and you know that f is non-decreasing in its arguments,
i.e. f (x+1) y >= f x y and f x (y+1) >= f x y.
What would be the fastest way to find the largest f x y satisfying a property, given that x and y are bounded?
I was thinking that this might be a variation of saddleback search and I was wondering if there was a name for this type of problem.
Also, more specifically I was wondering if there was a faster way to solve this problem if you knew that f was the multiplication operator.
Thanks!
Edit: Seeing the comments below, the property can be anything
Given a property g (where g takes a value and returns a boolean) I am simply looking for the largest f such that g(f) == True
For example, a naive implementation (in haskell) would be:
maximise :: (Int -> Int -> Int) -> (Int -> Bool) -> Int -> Int -> Int
maximise f g xLim yLim = head . filter g . reverse . sort $ results
    where results = [f x y | x <- [1..xLim], y <- [1..yLim]]
Let's draw an example grid for your problem to help think about it. Here's an example plot of f for each x and y. It is monotone in each argument, which is an interesting constraint we might be able to do something clever with.
+------- x --------->
| 0 0 1 1 1 2
| 0 1 1 2 2 4
y 1 1 3 4 6 6
| 1 2 3 6 6 7
| 7 7 7 7 7 7
v
Since we don't know anything about the property, we can't really do better than to list the values in the range of f in decreasing order. The question is how to do that efficiently.
The first thing that comes to mind is to traverse it like a graph starting at the lower-right corner. Here is my attempt:
import Data.Maybe (listToMaybe)
maximise :: (Ord b, Num b) => (Int -> Int -> b) -> (b -> Bool) -> Int -> Int -> Maybe b
maximise f p xLim yLim =
    listToMaybe . filter p . map (negate . snd) $
        enumIncreasing measure successors (xLim,yLim)
    where
    measure (x,y) = negate $ f x y
    successors (x,y) = [ (x-1,y) | x > 0 ] ++ [ (x,y-1) | y > 0 ]
The signature is not as general as it could be (Num should not be necessary, but I needed it to negate the measure function because enumIncreasing returns an increasing rather than a decreasing list -- I could have also done it with a newtype wrapper).
Using this function, we can find the largest odd number which can be written as a product of two numbers <= 100:
ghci> maximise (*) odd 100 100
Just 9801
I wrote enumIncreasing using meldable-heap on hackage to solve this problem, but it is pretty general. You could tweak the above to add additional constraints on the domain, etc.
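The definition of enumIncreasing is not shown above. The following is one guess at what such a function could look like, using Data.Set from containers as a simple priority queue instead of meldable-heap. The type is inferred from how maximise calls it and is an assumption. The output is only increasing when every successor has a measure at least as large as its parent, which holds here because f is monotone.

```haskell
import qualified Data.Set as Set
import Data.Maybe (listToMaybe)

-- Best-first enumeration: repeatedly emit the frontier node with the
-- smallest measure and add its not-yet-seen successors to the frontier.
enumIncreasing :: (Ord m, Ord a) => (a -> m) -> (a -> [a]) -> a -> [(a, m)]
enumIncreasing measure successors start =
    go (Set.singleton (measure start, start)) (Set.singleton start)
  where
    go frontier seen =
      case Set.minView frontier of
        Nothing -> []
        Just ((m, node), rest) ->
          let new       = [ s | s <- successors node, s `Set.notMember` seen ]
              seen'     = foldr Set.insert seen new
              frontier' = foldr (\s acc -> Set.insert (measure s, s) acc) rest new
          in (node, m) : go frontier' seen'

-- maximise as in the answer, repeated so the sketch is self-contained.
maximise :: (Ord b, Num b) => (Int -> Int -> b) -> (b -> Bool) -> Int -> Int -> Maybe b
maximise f p xLim yLim =
    listToMaybe . filter p . map (negate . snd) $
      enumIncreasing measure successors (xLim, yLim)
  where
    measure (x, y)    = negate (f x y)
    successors (x, y) = [ (x - 1, y) | x > 0 ] ++ [ (x, y - 1) | y > 0 ]
```

Because the result list is lazy, only as much of the grid is explored as is needed to find the first value that satisfies the predicate.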
The answer depends on what's expensive. The case that might be interesting is when f is expensive.
What you might want to do is look at pareto-optimality. Suppose you have two points
(1, 2) and (3, 4)
Then you know that the latter point is going to be a better solution, so long as f is a nondecreasing function. However, of course, if you have points,
(1, 2) and (2, 1)
then you can't know. So, one solution would be to establish a Pareto-optimal frontier of points that the predicate g permits, and then evaluate these through f.
I'm a beginner to functional languages, and I'm trying to get the whole thing down in Haskell. Here's a quick-and-dirty function that finds all the factors of a number:
factors :: (Integral a) => a -> [a]
factors x = filter (\z -> x `mod` z == 0) [2..x `div` 2]
Works fine, but I found it to be unbearably slow for large numbers. So I made myself a better one:
factorcalc :: (Integral a) => a -> a -> [a] -> [a]
factorcalc x y z
    | y `elem` z = sort z
    | x `mod` y == 0 = factorcalc x (y+1) (z ++ [y] ++ [(x `div` y)])
    | otherwise = factorcalc x (y+1) z
But here's my problem: Even though the code works, and can cut literally hours off the execution time of my programs, it's hideous!
It reeks of ugly imperative thinking: It constantly updates a counter and a data structure in a loop until it finishes. Since you can't change state in purely functional programming, I cheated by holding the data in the parameters, which the function simply passes to itself over and over again.
I may be wrong, but there simply must be a better way of doing the same thing...
Note that the original question asked for all the factors, not for only the prime factors. There being many fewer prime factors, they can probably be found more quickly. Perhaps that's what the OQ wanted. Perhaps not. But let's solve the original problem and put the "fun" back in "functional"!
Some observations:
The two functions don't produce the same output---if x is a perfect square, the second function includes the square root twice.
The first function checks a number of potential factors proportional to the size of x; the second function checks only a number proportional to the square root of x, then stops (with the bug noted above).
The first function (factors) allocates a list of all integers from 2 to n div 2, where the second function never allocates a list but instead visits fewer integers one at a time in a parameter. I ran the optimizer with -O and looked at the output with -ddump-simpl, and GHC just isn't smart enough to optimize away those allocations.
factorcalc is tail-recursive, which means it compiles into a tight machine-code loop; filter is not and does not.
Some experiments show that the square root is the killer:
Here's a sample function that produces the factors of x from z down to 2:
factors_from x 1 = []
factors_from x z
    | x `mod` z == 0 = z : factors_from x (z-1)
    | otherwise = factors_from x (z-1)
factors'' x = factors_from x (x `div` 2)
It's a bit faster because it doesn't allocate, but it's still not tail-recursive.
Here's a tail-recursive version that is more faithful to the original:
factors_from' x 1 l = l
factors_from' x z l
    | x `mod` z == 0 = factors_from' x (z-1) (z:l)
    | otherwise = factors_from' x (z-1) l
factors''' x = factors_from' x (x `div` 2) []
This is still slower than factorcalc because it enumerates all the integers from 2 to x div 2, whereas factorcalc stops at the square root.
Armed with this knowledge, we can now create a more functional version of factorcalc which replicates both its speed and its bug:
factors'''' x = sort $ uncurry (++) $ unzip $ takeWhile (uncurry (<=)) $
    [ (z, x `div` z) | z <- [2..x], x `mod` z == 0 ]
I didn't time it exactly, but given 100 million as an input, both it and factorcalc terminate instantaneously, where the others all take a number of seconds.
How and why the function works is left as an exercise for the reader :-)
ADDENDUM: OK, to mitigate the eyeball bleeding, here's a slightly saner version (and without the bug):
saneFactors x = sort $ concat $ takeWhile small $
    [ pair z | z <- [2..], x `mod` z == 0 ]
    where pair z = if z * z == x then [z] else [z, x `div` z]
          small [z, z'] = z < z'
          small [z] = True
Okay, take a deep breath. It'll be all right.
First of all, why is your first attempt slow? How is it spending its time?
Can you think of a recursive definition for the prime factorization that doesn't have that property?
(Hint.)
Firstly, although factorcalc is "ugly", you could add a wrapper function factors' x = factorcalc x 2 [], add a comment, and move on.
If you want to make a 'beautiful' factors fast, you need to find out why it is slow. Looking at your two functions, factors walks the list about n/2 elements long, but factorcalc stops after around sqrt n iterations.
Here is another factors that also stops after about sqrt n iterations, but uses a fold instead of explicit iteration. It also breaks the problem into three parts: finding the factors (factor); stopping at the square root of x (small) and then computing pairs of factors (factorize):
factors' :: (Integral a) => a -> [a]
factors' x = sort (foldl factorize [] (takeWhile small (filter factor [2..])))
    where
        factor z = x `mod` z == 0
        small z = z <= (x `div` z)
        factorize acc z = z : (if z == y then acc else y : acc)
            where y = x `div` z
This is marginally faster than factorcalc on my machine. You can fuse factor and factorize and it is about twice as fast as factorcalc.
The Profiling and Optimization chapter of Real World Haskell is a good guide to the GHC suite's performance tools for tackling tougher performance problems.
By the way, I have a minor style nitpick with factorcalc: it is much more efficient to prepend single elements to the front of a list, which is O(1), than it is to append to the end of a list of length n, which is O(n). The lists of factors are typically small, so it is not such a big deal, but factorcalc should probably be something like:
factorcalc :: (Integral a) => a -> a -> [a] -> [a]
factorcalc x y z
    | y `elem` z = sort z
    | x `mod` y == 0 = factorcalc x (y+1) (y : (x `div` y) : z)
    | otherwise = factorcalc x (y+1) z
"Since you can't change state in purely functional programming, I cheated by holding the data in the parameters, which the function simply passes to itself over and over again."
Actually, this is not cheating; this is a—no, make that the—standard technique! That sort of parameter is usually known as an "accumulator," and it's generally hidden within a helper function that does the actual recursion after being set up by the function you're calling.
A common case is when you're doing list operations that depend on the previous data in the list. The two problems you need to solve are, where do you get the data about previous iterations, and how do you deal with the fact that your "working area of interest" for any particular iteration is actually at the tail of the result list you're building. For both of these, the accumulator comes to the rescue. For example, to generate a list where each element is the sum of all of the elements of the input list up to that point:
sums :: Num a => [a] -> [a]
sums inp = helper inp []
    where
        helper [] acc = reverse acc
        helper (x:xs) [] = helper xs [x]
        helper (x:xs) acc@(h:_) = helper xs (x+h : acc)
Note that we flip the direction of the accumulator, so we can operate on the head of that, which is much more efficient (as Dominic mentions), and then we just reverse the final output.
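For comparison, here is the accumulator version next to a library-based one. The name sumsScan is made up for illustration. scanl1 is in the Prelude and hides the accumulator entirely.

```haskell
-- Accumulator version: the running totals are built in reverse, with the
-- latest total at the head of acc, and flipped once at the end.
sums :: Num a => [a] -> [a]
sums inp = helper inp []
  where
    helper []     acc         = reverse acc
    helper (x:xs) []          = helper xs [x]
    helper (x:xs) acc@(h : _) = helper xs (x + h : acc)

-- The same running totals via the Prelude's scanl1.
sumsScan :: Num a => [a] -> [a]
sumsScan = scanl1 (+)
```

Both give [1,3,6,10] for the input [1,2,3,4]. The hand-written helper is still worth understanding, since not every accumulator pattern has a ready-made combinator.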
By the way, I found The Little Schemer to be a useful introduction that offers good practice in thinking recursively.
This seemed like an interesting problem, and I hadn't coded any real Haskell in a while, so I gave it a crack. I've run both it and Norman's factors'''' against the same values, and it feels like mine's faster, though they're both so close that it's hard to tell.
factors :: Int -> [Int]
factors n = firstFactors ++ reverse [ n `div` i | i <- firstFactors ]
    where
        firstFactors = filter (\i -> n `mod` i == 0) (takeWhile ( \i -> i * i <= n ) [2..n])
Factors can be paired up into those that are greater than sqrt n and those that are less than or equal to it (for simplicity's sake, the exact square root, if n is a perfect square, falls into the second category). So if we just take the ones that are less than or equal to sqrt n, we can calculate the others later by doing div n i. They'll be in reverse order, so we can either reverse firstFactors first or reverse the result later. It doesn't really matter.
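One detail worth noting: as written, factors appears to list the square root twice when n is a perfect square, because the root is its own partner (factors 36 contains 6 twice). A possible variant that skips the duplicate is sketched below; the name factorsNoDup is made up for illustration.

```haskell
-- Hypothetical variant of the pairing idea that lists the square root of
-- a perfect square only once.
factorsNoDup :: Int -> [Int]
factorsNoDup n = small ++ reverse large
  where
    -- Factors up to and including the square root, ascending.
    small = [ i | i <- takeWhile (\i -> i * i <= n) [2 ..], n `mod` i == 0 ]
    -- Their partners above the square root; the root itself has no
    -- distinct partner, so it is skipped here.
    large = [ n `div` i | i <- small, i * i /= n ]
```

For example, factorsNoDup 36 gives [2,3,4,6,9,12,18]. Like the original, it leaves out the trivial factors 1 and n.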
This is my "functional" approach to the problem. ("Functional" in quotes, because I'd approach this problem the same way even in non-functional languages, but maybe that's because I've been tainted by Haskell.)
{-# LANGUAGE PatternGuards #-}
factors :: (Integral a) => a -> [a]
factors = multiplyFactors . primeFactors primes 0 [] . abs where
    multiplyFactors [] = [1]
    multiplyFactors ((p, n) : factors) =
        [ pn * x
        | pn <- take (succ n) $ iterate (* p) 1
        , x <- multiplyFactors factors ]
    primeFactors _ _ _ 0 = error "Can't factor 0"
    primeFactors (p:primes) n list x
        | (x', 0) <- x `divMod` p
        = primeFactors (p:primes) (succ n) list x'
    primeFactors _ 0 list 1 = list
    primeFactors (_:primes) 0 list x = primeFactors primes 0 list x
    primeFactors (p:primes) n list x
        = primeFactors primes 0 ((p, n) : list) x
    primes = sieve [2..]
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
primes is the naive Sieve of Eratosthenes. There's better, but this is the shortest method.
sieve [2..]
=> 2 : sieve [x | x <- [3..], x `mod` 2 /= 0]
=> 2 : 3 : sieve [x | x <- [4..], x `mod` 2 /= 0, x `mod` 3 /= 0]
=> 2 : 3 : sieve [x | x <- [5..], x `mod` 2 /= 0, x `mod` 3 /= 0]
=> 2 : 3 : 5 : ...
primeFactors is the simple repeated trial-division algorithm: it walks through the list of primes, and tries dividing the given number by each, recording the factors as it goes.
primeFactors (2:_) 0 [] 50
=> primeFactors (2:_) 1 [] 25
=> primeFactors (3:_) 0 [(2, 1)] 25
=> primeFactors (5:_) 0 [(2, 1)] 25
=> primeFactors (5:_) 1 [(2, 1)] 5
=> primeFactors (5:_) 2 [(2, 1)] 1
=> primeFactors _ 0 [(5, 2), (2, 1)] 1
=> [(5, 2), (2, 1)]
multiplyFactors takes a list of primes and powers, and explodes it back out to a full list of factors.
multiplyFactors [(5, 2), (2, 1)]
=> [ pn * x
   | pn <- take (succ 2) $ iterate (* 5) 1
   , x <- multiplyFactors [(2, 1)] ]
=> [ pn * x | pn <- [1, 5, 25], x <- [1, 2] ]
=> [1, 2, 5, 10, 25, 50]
factors just strings these two functions together, along with an abs to prevent infinite recursion in case the input is negative.
I don't know much about Haskell, but somehow I think this link is appropriate:
http://www.willamette.edu/~fruehr/haskell/evolution.html
Edit: I'm not entirely sure why people are so aggressive about the downvoting on this. The original poster's real problem was that the code was ugly; while it's funny, the point of the linked article is, to some extent, that advanced Haskell code is, in fact, ugly; the more you learn, the uglier your code gets, to some extent. The point of this answer was to point out to the OP that apparently, the ugliness of the code that he was lamenting is not uncommon.