Haskell - list comprehension can't enumerate N × N - performance

I have to write a function which returns a list of all pairs (x, y) where x, y ∈ N, and:
x is the product of two natural numbers (x = a · b, where a, b ∈ N), and
x is strictly greater than 5 but strictly less than 500, and
y is a square number (y = c² where c ∈ N) not greater than 1000, and
x is a divisor of y.
My attempt:
listPairs :: [(Int, Int)]
listPairs = [(a*b, y) | y <- [0..], a <- [0..], b <- [0..],
                        (a*b) > 5, (a*b) < 500, (y*y) < 1001,
                        mod y (a*b) == 0]
But it doesn't return anything and the computer works a lot on it.
However, if I choose a smaller range for a, b and y, e.g. [0..400], it takes up to a minute but it returns the right result.
So how could I solve the performance issue?

So, of course nested list comprehensions on infinite lists do not terminate.
Fortunately, your lists are not infinite. There's a limit. If x = a*b < 500 (with a, b ≥ 1), then we know that a < 500 and b < 500. Also, y = c*c < 1001 is just c < 32. So,
listPairs :: [(Int, Int)]
listPairs =
  [(x, c*c) | c <- [1..31], a <- [1..499],   -- a*b < 500 ==> b < 500/a, i.e. b <= div 499 a
              b <- [a..div 499 a],            -- a*b==b*a ==> b >= a
              let x = a*b, x > 5,
              -- (a*b) < 500, (c*c) < 1001,   -- no need to test this
              rem (c*c) x == 0]
mod 0 n == 0 trivially holds, so I'm excluding 0 from "natural numbers" here.
There are still some duplicates produced here, even though we've limited the b value to b >= a in x=a*b, because x can have several representations (e.g. 1*6 == 2*3).
You can use Data.List.nub to get rid of them.
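A minimal sketch of that last step, assuming the bounds derived above; the names listPairsUnique and listPairsDirect are made up for illustration. Since every x can be written as 1 * x, the second version enumerates x directly and never produces duplicates, so it needs no nub.

```haskell
import Data.List (nub, sort)

-- De-duplicated version of the comprehension (hypothetical name).
listPairsUnique :: [(Int, Int)]
listPairsUnique = nub
  [ (x, c*c)
  | c <- [1..31]            -- c*c <= 961 < 1001
  , a <- [1..22]            -- a <= b and a*b < 500 ==> a*a < 500
  , b <- [a .. div 499 a]   -- a*b <= 499
  , let x = a*b
  , x > 5
  , rem (c*c) x == 0
  ]

-- Every natural number x is the product 1 * x, so x can be enumerated directly.
listPairsDirect :: [(Int, Int)]
listPairsDirect =
  [ (x, c*c) | c <- [1..31], x <- [6..499], rem (c*c) x == 0 ]
```

Both lists contain the same pairs, only in a different order, so sort listPairsUnique == sort listPairsDirect evaluates to True.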

Related

Algorithm to precisely compare two exponentiations for very large integers (order of 1 billion)

We want to compare a^b to c^d, and tell if the first is smaller, greater, or equal (where ^ denotes exponentiation).
Obviously, for very large numbers, we cannot explicitly compute these values.
The most common approach in this situation is to apply log on both sides and compare b * log(a) to d * log(c). The issue here is that logs are floating-point operations, and as such we cannot trust our answer with 100% confidence (there might be some values which are incredibly close, and because of floating-point error we get a wrong answer).
Is there an algorithm for solving this problem? I've been scouring the internet for this, but I can only find solutions which work for particular cases only (e.g. in which one exponent is a multiple of another), or which use floating point in some way (logarithms, division) etc.

This is sort of two questions in one:
Are they equal?
If not, which one is greater?
As Peter O. observes, it's easiest to build in a language that provides an arbitrary-precision fraction type. I'll use Python 3.
Let's assume without loss of generality that a ≤ c (swap if necessary) and b is relatively prime to d (divide both by the greatest common divisor).
To get at the core of the question, I'm going to assume that a, c > 0 and b, d ≥ 0. Removing this assumption is tedious but not difficult.
Equality test
There are some easy cases where a = 1 or b = 0 or c = 1 or d = 0.
Separately, necessary conditions for a^b = c^d are
i. b ≥ d, since otherwise b < d, which together with a ≤ c implies a^b < c^d;
ii. a is a divisor of c, since we know from (i) that a^b = c^d is a divisor of c^b = c^(b−d) c^d.
When these conditions hold, we can divide through by a^d to reduce the problem to testing whether a^(b−d) = (c/a)^d.
In Python 3:
def equal_powers(a, b, c, d):
    while True:
        lhs_is_one = a == 1 or b == 0
        rhs_is_one = c == 1 or d == 0
        if lhs_is_one or rhs_is_one:
            return lhs_is_one and rhs_is_one
        if a > c:
            a, b, c, d = c, d, a, b
        if b < d:
            return False
        q, r = divmod(c, a)
        if r != 0:
            return False
        b -= d
        c = q

def test_equal_powers():
    for a in range(1, 25):
        for b in range(25):
            for c in range(1, 25):
                for d in range(25):
                    assert equal_powers(a, b, c, d) == (a ** b == c ** d)

test_equal_powers()
Inequality test
Once we've established that the two quantities are not equal, it's time to figure out which one is greater. (Without the equality test, the code here could run forever.)
If you're doing this for real, you should consult an actual reference on computing elementary functions. I'm just going to try to do the simplest thing that works.
Time for a calculus refresher. We have the Taylor series
−log x = (1−x) + (1−x)^2/2 + (1−x)^3/3 + (1−x)^4/4 + ...
To get a lower bound, truncate the series. To get an upper bound, we can truncate but replace the final term (1−x)^n/n with (1−x)^n/n (1/x), since
(1−x)^n/n (1/x)
= (1−x)^n/n (1 + (1−x) + (1−x)^2 + ...)
= (1−x)^n/n + (1−x)^(n+1)/n + (1−x)^(n+2)/n + ...
> (1−x)^n/n + (1−x)^(n+1)/(n+1) + (1−x)^(n+2)/(n+2) + ...
To get a good convergence rate, we're going to want 0.5 ≤ x < 1, which we can achieve by dividing x by a power of two.
In Python, we'll represent a real number as an infinite generator of shrinking intervals that contain the true value. Once the intervals for b log a and d log c are disjoint, we can determine how they compare.
import fractions

def minus(x, y):
    while True:
        x_lo, x_hi = next(x)
        y_lo, y_hi = next(y)
        yield x_lo - y_hi, x_hi - y_lo

def times(b, x):
    for lo, hi in x:
        yield b * lo, b * hi

def restricted_log(a):
    series = 0
    n = 0
    numerator = 1
    while True:
        n += 1
        numerator *= 1 - a
        series += fractions.Fraction(numerator, n)
        yield -(series + fractions.Fraction(numerator * (1 - a), (n + 1) * a)), -series

def log(a):
    n = 0
    while a >= 1:
        a = fractions.Fraction(a, 2)
        n += 1
    return minus(restricted_log(a), times(n, restricted_log(fractions.Fraction(1, 2))))

def less_powers(a, b, c, d):
    lhs = times(b, log(a))
    rhs = times(d, log(c))
    while True:
        lhs_lo, lhs_hi = next(lhs)
        rhs_lo, rhs_hi = next(rhs)
        if lhs_hi < rhs_lo:
            return True
        if rhs_hi < lhs_lo:
            return False

def test_less_powers():
    for a in range(1, 10):
        for b in range(10):
            for c in range(1, 10):
                for d in range(10):
                    if a ** b != c ** d:
                        assert less_powers(a, b, c, d) == (a ** b < c ** d)

test_less_powers()

How can I speed up this haskell lastDigits x y function?

I have a Haskell assignment in which I have to create a function lastDigits x y of 2 arguments that calculates the sum of all i^i for i in [0..x]. Mine is too slow and I need to speed it up. Does anyone have any ideas?
list :: Integral x=>x->[x]
list 0 = []
list x = list(div x 10) ++ [(mod x 10)]
sqrall :: Integer->[Integer]
sqrall x y = [mod (mod x 10^y)^x 10^y | x <- [1..x]]
lastDigits :: Integer -> Int -> [Integer]
lastDigits x y = drop (length((list(sum (sqrall x y))))-y) (list(sum (sqrall x)))

The main reason this will take too long is because you calculate the entire number of x^x, which scales super exponentially. This means that even for very small x, it will still take a considerable amount of time.
The point, however, is that you do not need to calculate the entire number. Indeed, you can make use of the fact that x×y mod n = (x mod n) × (y mod n) mod n. For example, Haskell's arithmoi package makes use of this:
powMod :: (Integral a, Integral b) => a -> b -> a -> a
powMod x y m
  | m <= 0    = error "powModInt: non-positive modulo"
  | y < 0     = error "powModInt: negative exponent"
  | otherwise = f (x `rem` m) y 1 `mod` m
  where
    f _ 0 acc = acc
    f b e acc = f (b * b `rem` m) (e `quot` 2)
                  (if odd e then (b * acc `rem` m) else acc)
We can make a specific version for modulo 10 with:
pow10 :: Integral i => i -> i
pow10 x = go x x
  where go 0 _ = 1
        go i j | odd i = rec * j `mod` 10
               | otherwise = rec
          where rec = go (div i 2) ((j*j) `mod` 10)
This then matches x^x `mod` 10, except that we do not need to calculate the entire number:
Prelude> map pow10 [1 .. 20]
[1,4,7,6,5,6,3,6,9,0,1,6,3,6,5,6,7,4,9,0]
Prelude> [x^x `mod` 10 | x <- [1..20]]
[1,4,7,6,5,6,3,6,9,0,1,6,3,6,5,6,7,4,9,0]
Now that we have that, we can also calculate the sum of two last digits, working with integers that are at most 18:
sum10 :: Int -> Int -> Int
sum10 x y = (x + y) `mod` 10
we thus can calculate the last digit with:
import Data.List(foldl')
lastdigit :: Int -> Int
lastdigit x = foldl' sum10 0 (map pow10 [0 .. x])
For example for x = 26, we get:
Prelude Data.List> lastdigit 26
4
Prelude Data.List> sum [ x^x | x <- [0 .. 26] ]
6246292385799360560872647730684286774
I leave it as an exercise to generalize the above to calculate the last y digits. As long as y is relatively small, this will be efficient, since then the numbers never take huge amounts of memory. Furthermore, if the numbers have an upper bound, addition, multiplication, etc. are done in constant time. If, however, you use an Integer, then the numbers can be arbitrarily large, and thus operations like addition are not constant time.
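One possible way to do that generalization is sketched below. It assumes y ≥ 1 and returns the last y digits as a number rather than a list of digits; powModulo and lastDigitsSum are hypothetical names, not part of arithmoi or of the question's code.

```haskell
import Data.List (foldl')

-- Modular exponentiation by repeated squaring (hypothetical helper).
-- Every intermediate value stays below m*m.
powModulo :: Integer -> Integer -> Integer -> Integer
powModulo _ 0 m = 1 `mod` m
powModulo b e m
  | even e    = sq
  | otherwise = (b `mod` m) * sq `mod` m
  where
    half = powModulo b (e `div` 2) m
    sq   = half * half `mod` m

-- Sum of i^i for i in [0..x], reduced modulo 10^y after every addition.
lastDigitsSum :: Integer -> Int -> Integer
lastDigitsSum x y = foldl' step 0 [0 .. x]
  where
    m = 10 ^ y
    step acc i = (acc + powModulo i i m) `mod` m
```

For x = 26 and y = 1 this agrees with lastdigit 26 above, and for larger y it can be checked against sum [i^i | i <- [0..x]] `mod` 10^y for small x.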

Most efficient algorithm to find integer points within an ellipse

I'm trying to find all the integer lattice points within various 3D ellipses.
I would like my program to take an integer N, and count all the lattice points within the ellipses of the form ax^2 + by^2 + cz^2 = n, where a,b,c are fixed integers and n is between 1 and N. This program should then return N tuples of the form (n, numlatticePointsWithinEllipse n).
I'm currently doing it by counting the points on the ellipses ax^2 + by^2 + cz^2 = m, for m between 0 and n inclusive, and then summing over m. I'm also only looking at x, y and z all positive initially, and then adding in the negatives by permuting their signs later.
Ideally, I'd like to reach numbers of N = 1,000,000+ within the scale of hours
Taking a specific example of x^2 + y^2 + 3z^2 = N, here's the Haskell code I'm currently using:
import System.Environment
isqrt :: Int -> Int
isqrt 0 = 0
isqrt 1 = 1
isqrt n = head $ dropWhile (\x -> x*x > n) $ iterate (\x -> (x + n `div` x) `div` 2) (n `div` 2)
latticePointsWithoutNegatives :: Int -> [[Int]]
latticePointsWithoutNegatives 0 = [[0,0,0]]
latticePointsWithoutNegatives n = [[x,y,z] | x<-[0.. isqrt n], y<- [0.. isqrt (n - x^2)], z<-[max 0 (isqrt ((n-x^2 -y^2) `div` 3))], x^2 +y^2 + z^2 ==n]
latticePoints :: Int -> [[Int]]
latticePoints n = [ zipWith (*) [x1,x2,x3] y | [x1,x2,x3] <- (latticePointsWithoutNegatives n), y <- [[a,b,c] | a <- (if x1 == 0 then [0] else [-1,1]), b<-(if x2 == 0 then [0] else [-1,1]), c<-(if x3 == 0 then [0] else [-1,1])]]
latticePointsUpTo :: Int -> Int
latticePointsUpTo n = sum [length (latticePoints x) | x<-[0..n]]
listResults :: Int -> [(Int, Int)]
listResults n = [(x, latticePointsUpTo x) | x<- [1..n]]
main = do
  args <- getArgs
  let cleanArgs = read (head args)
  print (listResults cleanArgs)
I've compiled this with
ghc -O2 latticePointsTest
but using the PowerShell "Measure-Command" command, I get the following results:
Measure-Command{./latticePointsTest 10}
TotalMilliseconds : 12.0901
Measure-Command{./latticePointsTest 100}
TotalMilliseconds : 12.0901
Measure-Command{./latticePointsTest 1000}
TotalMilliseconds : 31120.4503
and going any more orders of magnitude up takes us onto the scale of days, rather than hours or minutes.
Is there anything fundamentally wrong with the algorithm I'm using? Is there any core reason why my code isn't scaling well? Any guidance will be greatly appreciated. I may also want to process the data between "latticePoints" and "latticePointsUpTo", so I can't just rely entirely on clever number theoretic counting techniques - I need the underlying tuples preserved.

Some things I would try:
isqrt is not efficient for the range of values you are working with. Simply use the floating point sqrt function:
isqrt n = floor $ sqrt (fromIntegral n :: Double)
Alternatively, instead of computing integer square roots, use logic like this in your list comprehensions:
x <- takeWhile (\x -> x*x <= n) [0..],
y <- takeWhile (\y -> y*y <= n - x*x) [0..]
Also, I would use expressions like x*x instead of x^2.
Finally, why not compute the number of solutions with something like this:
sols a b c n =
  length [ () | x <- takeWhile (\x -> a*x*x <= n) [0..]
              , y <- takeWhile (\y -> a*x*x+b*y*y <= n) [0..]
              , z <- takeWhile (\z -> a*x*x+b*y*y+c*z*z <= n) [0..]
         ]
This does not exactly compute the same answer that you want because it doesn't account for positive and negative solutions, but you could easily modify it to compute your answer. The idea is to use one list comprehension instead of iterating over various values of n and summing.
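One way to make that modification is sketched here, assuming a, b and c are all positive: each non-negative solution stands for 2^k signed solutions, where k is the number of non-zero coordinates. The name solsSigned is made up for this sketch.

```haskell
-- Counts every integer point (x, y, z) with a*x^2 + b*y^2 + c*z^2 <= n,
-- enumerating only the non-negative octant and weighting by sign choices.
solsSigned :: Int -> Int -> Int -> Int -> Int
solsSigned a b c n =
  sum [ weight x * weight y * weight z
      | x <- takeWhile (\x' -> a*x'*x' <= n) [0..]
      , y <- takeWhile (\y' -> a*x*x + b*y'*y' <= n) [0..]
      , z <- takeWhile (\z' -> a*x*x + b*y*y + c*z'*z' <= n) [0..]
      ]
  where
    -- A zero coordinate has one sign choice, a non-zero one has two.
    weight :: Int -> Int
    weight 0 = 1
    weight _ = 2
```

For example, solsSigned 1 1 1 1 counts the origin plus the six unit points on the axes.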
Finally, I think using floor and sqrt to compute the integral square root is completely safe in this case. This code checks that floor (sqrt (x*x)) == x for all x <= 3037000499 (it prints the first counterexample, so an empty-list error from head means none was found):
testAll :: Int -> IO ()
testAll n =
  print $ head [ (x,a) | x <- [n,n-1 .. 1], let a = floor $ sqrt (fromIntegral (x*x) :: Double), a /= x ]

main = testAll 3037000499
Note I am running this on a 64-bit GHC - otherwise just use Int64 instead of Int since Doubles are 64-bit in either case. Takes only a minute or so to verify.
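This shows that taking the floor of sqrt y gives the right answer for every perfect square y <= 3037000499^2. Note that the test only exercises perfect squares: a non-square y near that limit can be rounded up to a square when converted to Double (2^62 - 1 converts to 2^62, for example), so treat the result as evidence for the much smaller values in the question rather than as a proof for every y in that range.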

More efficient algorithm performs worse in Haskell

A friend of mine showed me a home exercise from a C++ course which he attends. Since I already know C++ but have just started learning Haskell, I tried to solve the exercise in the "Haskell way".
These are the exercise instructions (I translated from our native language so please comment if the instructions aren't clear):
Write a program which reads non-zero coefficients (A,B,C,D) from the user and places them in the following equation:
A*x + B*y + C*z = D
The program should also read from the user N, which represents a range. The program should find all possible integral solutions for the equation in the range -N/2 to N/2.
For example:
Input: A = 2,B = -3,C = -1, D = 5, N = 4
Output: (-1,-2,-1), (0,-2, 1), (0,-1,-2), (1,-1, 0), (2,-1,2), (2,0, -1)
The most straight-forward algorithm is to try all possibilities by brute force. I implemented it in Haskell in the following way:
triSolve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve a b c d n =
  let equation x y z = (a * x + b * y + c * z) == d
      minN = div (-n) 2
      maxN = div n 2
  in [(x,y,z) | x <- [minN..maxN], y <- [minN..maxN], z <- [minN..maxN], equation x y z]
So far so good, but the exercise instructions note that a more efficient algorithm can be implemented, so I thought how to make it better. Since the equation is linear, based on the assumption that Z is always the first to be incremented, once a solution has been found there's no point to increment Z. Instead, I should increment Y, set Z to the minimum value of the range and keep going. This way I can save redundant executions.
Since there are no loops in Haskell (to my understanding at least), I realized that such an algorithm should be implemented using recursion. I implemented the algorithm in the following way:
solutions :: (Integer -> Integer -> Integer -> Bool) -> Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solutions f maxN minN x y z
  | solved = (x,y,z):nextCall x (y + 1) minN
  | x >= maxN && y >= maxN && z >= maxN = []
  | z >= maxN && y >= maxN = nextCall (x + 1) minN minN
  | z >= maxN = nextCall x (y + 1) minN
  | otherwise = nextCall x y (z + 1)
  where solved = f x y z
        nextCall = solutions f maxN minN

triSolve' :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve' a b c d n =
  let equation x y z = (a * x + b * y + c * z) == d
      minN = div (-n) 2
      maxN = div n 2
  in solutions equation maxN minN minN minN minN
Both yield the same results. However, trying to measure the execution time yielded the following results:
*Main> length $ triSolve' 2 (-3) (-1) 5 100
3398
(2.81 secs, 971648320 bytes)
*Main> length $ triSolve 2 (-3) (-1) 5 100
3398
(1.73 secs, 621862528 bytes)
Meaning that the dumb algorithm actually performs better than the more sophisticated one. Based on the assumption that my algorithm is correct (which I hope won't turn out to be wrong :) ), I assume that the second algorithm suffers from an overhead created by the recursion, which the first algorithm avoids since it's implemented using a list comprehension.
Is there a way to implement in Haskell a better algorithm than the dumb one?
(Also, I'll be glad to receive general feedback about my coding style)

Of course there is. We have:
a*x + b*y + c*z = d
and as soon as we assume values for x and y, we have that
a*x + b*y = n
where n is a number we know.
Hence
c*z = d - n
z = (d - n) / c
And we keep only integral zs.

It's worth noticing that list comprehensions are given special treatment by GHC, and are generally very fast. This could explain why your triSolve (which uses a list comprehension) is faster than triSolve' (which doesn't).
For example, the solution
solve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
-- "Buffalo buffalo buffalo buffalo Buffalo buffalo buffalo..."
solve a b c d n =
  [(x,y,z) | x <- vals, y <- vals
           , let p = a*x +b*y
           , let z = (d - p) `div` c
           , z >= minN, z <= maxN, c * z == d - p ]
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
runs fast on my machine:
> length $ solve 2 (-3) (-1) 5 100
3398
(0.03 secs, 4111220 bytes)
whereas the equivalent code written using do notation:
import Control.Monad (guard)

solveM :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solveM a b c d n = do
    x <- vals
    y <- vals
    let p = a * x + b * y
        z = (d - p) `div` c
    guard $ z >= minN
    guard $ z <= maxN
    guard $ z * c == d - p
    return (x,y,z)
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
takes twice as long to run and uses twice as much memory:
> length $ solveM 2 (-3) (-1) 5 100
3398
(0.06 secs, 6639244 bytes)
The usual caveats about testing within GHCi apply: if you really want to see the difference, you need to compile the code with -O2 and use a decent benchmarking library (like Criterion).

Functional learning woes

I'm a beginner to functional languages, and I'm trying to get the whole thing down in Haskell. Here's a quick-and-dirty function that finds all the factors of a number:
factors :: (Integral a) => a -> [a]
factors x = filter (\z -> x `mod` z == 0) [2..x `div` 2]
Works fine, but I found it to be unbearably slow for large numbers. So I made myself a better one:
factorcalc :: (Integral a) => a -> a -> [a] -> [a]
factorcalc x y z
  | y `elem` z = sort z
  | x `mod` y == 0 = factorcalc x (y+1) (z ++ [y] ++ [(x `div` y)])
  | otherwise = factorcalc x (y+1) z
But here's my problem: Even though the code works, and can cut literally hours off the execution time of my programs, it's hideous!
It reeks of ugly imperative thinking: It constantly updates a counter and a data structure in a loop until it finishes. Since you can't change state in purely functional programming, I cheated by holding the data in the parameters, which the function simply passes to itself over and over again.
I may be wrong, but there simply must be a better way of doing the same thing...

Note that the original question asked for all the factors, not for only the prime factors. There being many fewer prime factors, they can probably be found more quickly. Perhaps that's what the OQ wanted. Perhaps not. But let's solve the original problem and put the "fun" back in "functional"!
Some observations:
The two functions don't produce the same output---if x is a perfect square, the second function includes the square root twice.
The first function checks a number of potential factors proportional to the size of x; the second function checks only a number proportional to the square root of x, then stops (with the bug noted above).
The first function (factors) allocates a list of all integers from 2 to n div 2, where the second function never allocates a list but instead visits fewer integers one at a time in a parameter. I ran the optimizer with -O and looked at the output with -ddump-simpl, and GHC just isn't smart enough to optimize away those allocations.
factorcalc is tail-recursive, which means it compiles into a tight machine-code loop; filter is not and does not.
Some experiments show that the square root is the killer:
Here's a sample function that produces the factors of x from z down to 2:
factors_from x 1 = []
factors_from x z
  | x `mod` z == 0 = z : factors_from x (z-1)
  | otherwise      =     factors_from x (z-1)

factors'' x = factors_from x (x `div` 2)
It's a bit faster because it doesn't allocate, but it's still not tail-recursive.
Here's a tail-recursive version that is more faithful to the original:
factors_from' x 1 l = l
factors_from' x z l
  | x `mod` z == 0 = factors_from' x (z-1) (z:l)
  | otherwise      = factors_from' x (z-1) l

factors''' x = factors_from' x (x `div` 2) []
This is still slower than factorcalc because it enumerates all the integers from 2 to x div 2, whereas factorcalc stops at the square root.
Armed with this knowledge, we can now create a more functional version of factorcalc which replicates both its speed and its bug:
factors'''' x = sort $ uncurry (++) $ unzip $ takeWhile (uncurry (<=)) $
                [ (z, x `div` z) | z <- [2..x], x `mod` z == 0 ]
I didn't time it exactly, but given 100 million as an input, both it and factorcalc terminate instantaneously, where the others all take a number of seconds.
How and why the function works is left as an exercise for the reader :-)
ADDENDUM: OK, to mitigate the eyeball bleeding, here's a slightly saner version (and without the bug):
saneFactors x = sort $ concat $ takeWhile small $
                [ pair z | z <- [2..], x `mod` z == 0 ]
  where pair z = if z * z == x then [z] else [z, x `div` z]
        small [z, z'] = z < z'
        small [z] = True

Okay, take a deep breath. It'll be all right.
First of all, why is your first attempt slow? How is it spending its time?
Can you think of a recursive definition for the prime factorization that doesn't have that property?
(Hint.)

Firstly, although factorcalc is "ugly", you could add a wrapper function factors' x = factorcalc x 2 [], add a comment, and move on.
If you want to make a 'beautiful' factors fast, you need to find out why it is slow. Looking at your two functions, factors walks the list about n/2 elements long, but factorcalc stops after around sqrt n iterations.
Here is another factors that also stops after about sqrt n iterations, but uses a fold instead of explicit iteration. It also breaks the problem into three parts: finding the factors (factor); stopping at the square root of x (small) and then computing pairs of factors (factorize):
factors' :: (Integral a) => a -> [a]
factors' x = sort (foldl factorize [] (takeWhile small (filter factor [2..])))
  where
    factor z = x `mod` z == 0
    small z = z <= (x `div` z)
    factorize acc z = z : (if z == y then acc else y : acc)
      where y = x `div` z
This is marginally faster than factorcalc on my machine. You can fuse factor and factorize and it is about twice as fast as factorcalc.
The Profiling and Optimization chapter of Real World Haskell is a good guide to the GHC suite's performance tools for tackling tougher performance problems.
By the way, I have a minor style nitpick with factorcalc: it is much more efficient to prepend single elements to the front of a list, O(1), than it is to append to the end of a list of length n, O(n). The lists of factors are typically small, so it is not such a big deal, but factorcalc should probably be something like:
factorcalc :: (Integral a) => a -> a -> [a] -> [a]
factorcalc x y z
  | y `elem` z = sort z
  | x `mod` y == 0 = factorcalc x (y+1) (y : (x `div` y) : z)
  | otherwise = factorcalc x (y+1) z

"Since you can't change state in purely functional programming, I cheated by holding the data in the parameters, which the function simply passes to itself over and over again."
Actually, this is not cheating; this is a—no, make that the—standard technique! That sort of parameter is usually known as an "accumulator," and it's generally hidden within a helper function that does the actual recursion after being set up by the function you're calling.
A common case is when you're doing list operations that depend on the previous data in the list. The two problems you need to solve are, where do you get the data about previous iterations, and how do you deal with the fact that your "working area of interest" for any particular iteration is actually at the tail of the result list you're building. For both of these, the accumulator comes to the rescue. For example, to generate a list where each element is the sum of all of the elements of the input list up to that point:
sums :: Num a => [a] -> [a]
sums inp = helper inp []
  where
    helper []     acc       = reverse acc
    helper (x:xs) []        = helper xs [x]
    helper (x:xs) acc@(h:_) = helper xs (x+h : acc)
Note that we flip the direction of the accumulator, so we can operate on the head of that, which is much more efficient (as Dominic mentions), and then we just reverse the final output.
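For comparison, here is a small sketch that puts the accumulator version next to the Prelude's scanl1, which produces the same running sums without an explicit helper. The names sumsAcc and sumsScan are chosen only for this illustration.

```haskell
-- Accumulator version: the running total lives at the head of acc,
-- and the result is reversed once at the end.
sumsAcc :: Num a => [a] -> [a]
sumsAcc inp = helper inp []
  where
    helper []     acc       = reverse acc
    helper (x:xs) []        = helper xs [x]
    helper (x:xs) acc@(h:_) = helper xs (x + h : acc)

-- The same running sums via the Prelude's scanl1.
sumsScan :: Num a => [a] -> [a]
sumsScan = scanl1 (+)
```

Both functions map [1,2,3,4] to [1,3,6,10]. The scan version is also lazy, so it works on infinite input, whereas the accumulator version has to consume the whole list before it can reverse.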
By the way, I found The Little Schemer to be a useful introduction that offers good practice in thinking recursively.

This seemed like an interesting problem, and I hadn't coded any real Haskell in a while, so I gave it a crack. I've run both it and Norman's factors'''' against the same values, and it feels like mine's faster, though they're both so close that it's hard to tell.
factors :: Int -> [Int]
factors n = firstFactors ++ reverse [ n `div` i | i <- firstFactors ]
  where
    firstFactors = filter (\i -> n `mod` i == 0) (takeWhile ( \i -> i * i <= n ) [2..n])
Factors can be paired up into those that are greater than sqrt n, and those that are less than or equal to it (for simplicity's sake, the exact square root, if n is a perfect square, falls into this category). So if we just take the ones that are less than or equal to sqrt n, we can calculate the others later by doing div n i. They'll be in reverse order, so we can either reverse firstFactors first or reverse the result later. It doesn't really matter.
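One detail of this pairing is worth a sketch: when n is a perfect square, the square root is its own partner, so mapping div n over every small factor lists it twice. The variant below (factorsNoDup is a hypothetical name, not the answer's code) skips the partner when it equals the factor itself.

```haskell
-- Proper factors of n (excluding 1 and n), in ascending order,
-- found by testing only candidates up to the square root.
factorsNoDup :: Int -> [Int]
factorsNoDup n = small ++ reverse [ q | i <- small, let q = n `div` i, q /= i ]
  where
    -- Factors i with i * i <= n.
    small = [ i | i <- takeWhile (\i -> i * i <= n) [2 ..], n `mod` i == 0 ]
```

For n = 16 this gives [2,4,8], with 4 appearing only once.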

This is my "functional" approach to the problem. ("Functional" in quotes, because I'd approach this problem the same way even in non-functional languages, but maybe that's because I've been tainted by Haskell.)
{-# LANGUAGE PatternGuards #-}
factors :: (Integral a) => a -> [a]
factors = multiplyFactors . primeFactors primes 0 [] . abs where
    multiplyFactors [] = [1]
    multiplyFactors ((p, n) : factors) =
        [ pn * x
        | pn <- take (succ n) $ iterate (* p) 1
        , x <- multiplyFactors factors ]
    primeFactors _ _ _ 0 = error "Can't factor 0"
    primeFactors (p:primes) n list x
        | (x', 0) <- x `divMod` p
        = primeFactors (p:primes) (succ n) list x'
    primeFactors _ 0 list 1 = list
    primeFactors (_:primes) 0 list x = primeFactors primes 0 list x
    primeFactors (p:primes) n list x
        = primeFactors primes 0 ((p, n) : list) x
    primes = sieve [2..]
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
primes is the naive Sieve of Eratosthenes. There's better, but this is the shortest method.
   sieve [2..]
=> 2 : sieve [x | x <- [3..], x `mod` 2 /= 0]
=> 2 : 3 : sieve [x | x <- [4..], x `mod` 2 /= 0, x `mod` 3 /= 0]
=> 2 : 3 : sieve [x | x <- [5..], x `mod` 2 /= 0, x `mod` 3 /= 0]
=> 2 : 3 : 5 : ...
primeFactors is the simple repeated trial-division algorithm: it walks through the list of primes, and tries dividing the given number by each, recording the factors as it goes.
   primeFactors (2:_) 0 [] 50
=> primeFactors (2:_) 1 [] 25
=> primeFactors (3:_) 0 [(2, 1)] 25
=> primeFactors (5:_) 0 [(2, 1)] 25
=> primeFactors (5:_) 1 [(2, 1)] 5
=> primeFactors (5:_) 2 [(2, 1)] 1
=> primeFactors _ 0 [(5, 2), (2, 1)] 1
=> [(5, 2), (2, 1)]
multiplyFactors takes a list of primes and powers, and explodes it back out to a full list of factors.
   multiplyFactors [(5, 2), (2, 1)]
=> [ pn * x
   | pn <- take (succ 2) $ iterate (* 5) 1
   , x <- multiplyFactors [(2, 1)] ]
=> [ pn * x | pn <- [1, 5, 25], x <- [1, 2] ]
=> [1, 2, 5, 10, 25, 50]
factors just strings these two functions together, along with an abs to prevent infinite recursion in case the input is negative.

I don't know much about Haskell, but somehow I think this link is appropriate:
http://www.willamette.edu/~fruehr/haskell/evolution.html
Edit: I'm not entirely sure why people are so aggressive about the downvoting on this. The original poster's real problem was that the code was ugly; while it's funny, the point of the linked article is, to some extent, that advanced Haskell code is, in fact, ugly; the more you learn, the uglier your code gets, to some extent. The point of this answer was to point out to the OP that apparently, the ugliness of the code that he was lamenting is not uncommon.
