A friend of mine showed me a home exercise from a C++ course that he attends. Since I already know C++ but have just started learning Haskell, I tried to solve the exercise the "Haskell way".
These are the exercise instructions (I translated from our native language so please comment if the instructions aren't clear):
Write a program which reads non-zero coefficients (A,B,C,D) from the user and places them in the following equation:
A*x + B*y + C*z = D
The program should also read from the user N, which represents a range. The program should find all possible integral solutions for the equation in the range -N/2 to N/2.
For example:
Input: A = 2,B = -3,C = -1, D = 5, N = 4
Output: (-1,-2,-1), (0,-2, 1), (0,-1,-2), (1,-1, 0), (2,-1,2), (2,0, -1)
The most straight-forward algorithm is to try all possibilities by brute force. I implemented it in Haskell in the following way:
triSolve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve a b c d n =
    let equation x y z = (a * x + b * y + c * z) == d
        minN = div (-n) 2
        maxN = div n 2
    in [(x,y,z) | x <- [minN..maxN], y <- [minN..maxN], z <- [minN..maxN], equation x y z]
So far so good, but the exercise instructions note that a more efficient algorithm can be implemented, so I thought about how to improve it. Since the equation is linear and C is non-zero, each (X, Y) pair has at most one matching Z. So, assuming Z is always the first variable to be incremented, once a solution has been found there's no point in incrementing Z any further. Instead, I should increment Y, reset Z to the minimum value of the range and keep going. This way I can skip redundant evaluations.
Since there are no loops in Haskell (to my understanding at least), I realized that such an algorithm should be implemented using recursion. I implemented the algorithm in the following way:
solutions :: (Integer -> Integer -> Integer -> Bool) -> Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solutions f maxN minN x y z
    | solved = (x,y,z):nextCall x (y + 1) minN
    | x >= maxN && y >= maxN && z >= maxN = []
    | z >= maxN && y >= maxN = nextCall (x + 1) minN minN
    | z >= maxN = nextCall x (y + 1) minN
    | otherwise = nextCall x y (z + 1)
    where solved = f x y z
          nextCall = solutions f maxN minN
triSolve' :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve' a b c d n =
    let equation x y z = (a * x + b * y + c * z) == d
        minN = div (-n) 2
        maxN = div n 2
    in solutions equation maxN minN minN minN minN
Both yield the same results. However, trying to measure the execution time yielded the following results:
*Main> length $ triSolve' 2 (-3) (-1) 5 100
(2.81 secs, 971648320 bytes)
*Main> length $ triSolve 2 (-3) (-1) 5 100
(1.73 secs, 621862528 bytes)
This means that the dumb algorithm actually performs better than the more sophisticated one. Assuming that my algorithm is correct (which I hope won't turn out to be wrong :) ), I suspect that the second algorithm suffers from overhead created by the recursion, which the first algorithm avoids since it's implemented using a list comprehension.
Is there a way to implement in Haskell a better algorithm than the dumb one?
(Also, I'll be glad to receive general feedback about my coding style)

Of course there is. We have:
a*x + b*y + c*z = d
and as soon as we assume values for x and y, we have that
a*x + b*y = n
where n is a number we know (not the range N from the question). Hence:
c*z = d - n
z = (d - n) / c
And we keep only integral zs.
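The derivation above can be sketched as a small helper. This is only an illustration: the name solveForZ is hypothetical (it does not appear in the posts), and it assumes c is non-zero, as the exercise states.

```haskell
-- Hypothetical helper: given the coefficients and chosen values for x and y,
-- return the unique integral z that satisfies a*x + b*y + c*z == d, if any.
solveForZ :: Integer -> Integer -> Integer -> Integer
          -> Integer -> Integer -> Maybe Integer
solveForZ a b c d x y
  | r == 0    = Just z
  | otherwise = Nothing
  where
    n      = a * x + b * y        -- the part that is already known
    (z, r) = (d - n) `divMod` c   -- assumes c /= 0 (exercise statement)

-- With the question's coefficients (A = 2, B = -3, C = -1, D = 5):
--   solveForZ 2 (-3) (-1) 5 1 (-1)  ==  Just 0      -- the solution (1,-1,0)
--   solveForZ 2 (-3) (-1) 5 0 0     ==  Just (-5)   -- integral, but outside -2..2
```

The caller still has to check that the returned z lies inside the range -N/2 to N/2.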

It's worth noticing that list comprehensions are given special treatment by GHC, and are generally very fast. This could explain why your triSolve (which uses a list comprehension) is faster than triSolve' (which doesn't).
For example, the solution
solve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
-- "Buffalo buffalo buffalo buffalo Buffalo buffalo buffalo..."
solve a b c d n =
    [(x,y,z) | x <- vals, y <- vals
             , let p = a*x +b*y
             , let z = (d - p) `div` c
             , z >= minN, z <= maxN, c * z == d - p ]
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
runs fast on my machine:
> length $ solve 2 (-3) (-1) 5 100
(0.03 secs, 4111220 bytes)
whereas the equivalent code written using do notation:
import Control.Monad (guard)
solveM :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solveM a b c d n = do
    x <- vals
    y <- vals
    let p = a * x + b * y
        z = (d - p) `div` c
    guard $ z >= minN
    guard $ z <= maxN
    guard $ z * c == d - p
    return (x,y,z)
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
takes twice as long to run and uses twice as much memory:
> length $ solveM 2 (-3) (-1) 5 100
(0.06 secs, 6639244 bytes)
The usual caveats about testing within GHCi apply: if you really want to see the difference, you need to compile the code with -O2 and use a decent benchmarking library (like Criterion).
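Before timing anything, it may be worth checking that the fast version returns the same list as the brute-force one. The following is a rough sketch with hypothetical names (bruteForce, viaDivision, agree). One assumption to flag: the question computes the lower bound as div (-n) 2 while the answer uses negate (n `div` 2), and these differ for odd n because div rounds towards negative infinity. The sketch uses the second form for both functions.

```haskell
type Triple = (Integer, Integer, Integer)

-- Brute force over all three variables, in the spirit of the question.
bruteForce :: Integer -> Integer -> Integer -> Integer -> Integer -> [Triple]
bruteForce a b c d n =
  [ (x, y, z) | x <- vals, y <- vals, z <- vals, a*x + b*y + c*z == d ]
  where
    vals = [negate (n `div` 2) .. n `div` 2]

-- Two nested loops plus one division, in the spirit of the answer.
viaDivision :: Integer -> Integer -> Integer -> Integer -> Integer -> [Triple]
viaDivision a b c d n =
  [ (x, y, z) | x <- vals, y <- vals
              , let (z, r) = (d - a*x - b*y) `divMod` c   -- assumes c /= 0
              , r == 0, z >= lo, z <= hi ]
  where
    lo   = negate (n `div` 2)
    hi   = n `div` 2
    vals = [lo .. hi]

-- True when both approaches produce the same solutions in the same order.
agree :: Integer -> Integer -> Integer -> Integer -> Integer -> Bool
agree a b c d n = bruteForce a b c d n == viaDivision a b c d n
```

For the example in the question, viaDivision 2 (-3) (-1) 5 4 yields the six triples listed there.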


How can I speed up this Haskell lastDigits x y function?

I have a Haskell assignment in which I have to create a function lastDigits x y of 2 arguments that calculates the sum of all [x^x | x <- [0..x]] and returns its last y digits. Mine is too slow and I need to speed it up. Does anyone have any ideas?
list :: Integral x => x -> [x]
list 0 = []
list x = list (div x 10) ++ [mod x 10]
sqrall :: Integer -> Int -> [Integer]
sqrall x y = [mod ((mod k (10^y))^k) (10^y) | k <- [1..x]]
lastDigits :: Integer -> Int -> [Integer]
lastDigits x y = drop (length (list (sum (sqrall x y))) - y) (list (sum (sqrall x y)))
The main reason this will take too long is because you calculate the entire number of x^x, which scales super exponentially. This means that even for very small x, it will still take a considerable amount of time.
The point, however, is that you do not need to calculate the entire number. You can make use of the fact that (x × y) mod n = ((x mod n) × (y mod n)) mod n. For example, Haskell's arithmoi package makes use of this:
powMod :: (Integral a, Integral b) => a -> b -> a -> a
powMod x y m
  | m <= 0 = error "powModInt: non-positive modulo"
  | y < 0 = error "powModInt: negative exponent"
  | otherwise = f (x `rem` m) y 1 `mod` m
  where
    f _ 0 acc = acc
    f b e acc = f (b * b `rem` m) (e `quot` 2)
                  (if odd e then (b * acc `rem` m) else acc)
We can make a specific version for modulo 10 with:
pow10 :: Integral i => i -> i
pow10 x = go x x
  where go 0 _ = 1
        go i j | odd i = rec * j `mod` 10
               | otherwise = rec
          where rec = go (div i 2) ((j*j) `mod` 10)
This then matches x^x `mod` 10, except that we do not need to calculate the entire number:
Prelude> map pow10 [1 .. 20]
[1,4,7,6,5,6,3,6,9,0,1,6,3,6,5,6,7,4,9,0]
Prelude> [x^x `mod` 10 | x <- [1..20]]
[1,4,7,6,5,6,3,6,9,0,1,6,3,6,5,6,7,4,9,0]
Now that we have that, we can also add two last digits. Each is at most 9, so the intermediate sum is at most 18 and we only need to reduce it modulo 10:
sum10 :: Int -> Int -> Int
sum10 x y = (x + y) `mod` 10
we thus can calculate the last digit with:
import Data.List(foldl')
lastdigit :: Int -> Int
lastdigit x = foldl' sum10 0 (map pow10 [0 .. x])
For example for x = 26, we get:
Prelude Data.List> lastdigit 26
Prelude Data.List> sum [ x^x | x <- [0 .. 26] ]
I leave it as an exercise to generalize the above to the last y digits. As long as y is relatively small, this will be efficient, since the numbers never take huge amounts of memory. Furthermore, if the numbers have an upper bound, addition, multiplication, etc. are done in constant time. If you use an Integer, however, the numbers can be arbitrarily large, and operations like addition are no longer constant time.
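One possible shape for that generalization is sketched below. The names powModSketch and lastDigitsSum are hypothetical, and the sketch uses Integer throughout for simplicity (a bounded type would need care to avoid overflow in the multiplication). It returns the last y digits as a number, so leading zeros are not preserved.

```haskell
import Data.List (foldl')

-- base^e mod m by repeated squaring; intermediate values stay below m*m.
powModSketch :: Integer -> Integer -> Integer -> Integer
powModSketch base e m = go (base `mod` m) e 1
  where
    go _ 0 acc = acc `mod` m
    go b k acc = go (b * b `mod` m) (k `div` 2)
                    (if odd k then acc * b `mod` m else acc)

-- Sum of x^x for x in [0 .. n], reduced modulo 10^y after every addition.
-- Like Haskell's (^), this treats 0^0 as 1.
lastDigitsSum :: Integer -> Int -> Integer
lastDigitsSum n y = foldl' step 0 [0 .. n]
  where
    m = 10 ^ y
    step acc x = (acc + powModSketch x x m) `mod` m
```

For small inputs the result can be compared against the direct computation sum [x^x | x <- [0 .. n]] `mod` 10^y.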

Most efficient algorithm to find integer points within an ellipse

I'm trying to find all the integer lattice points within various 3D ellipses.
I would like my program to take an integer N, and count all the lattice points within the ellipses of the form ax^2 + by^2 + cz^2 = n, where a,b,c are fixed integers and n is between 1 and N. This program should then return N tuples of the form (n, numlatticePointsWithinEllipse n).
I'm currently doing it by counting the points on the ellipses ax^2 + by^2 + cz^2 = m, for m between 0 and n inclusive, and then summing over m. I'm also only looking at x, y and z all positive initially, and then adding in the negatives by permuting their signs later.
Ideally, I'd like to reach numbers of N = 1,000,000+ within the scale of hours
Taking a specific example of x^2 + y^2 + 3z^2 = N, here's the Haskell code I'm currently using:
import System.Environment
isqrt :: Int -> Int
isqrt 0 = 0
isqrt 1 = 1
isqrt n = head $ dropWhile (\x -> x*x > n) $ iterate (\x -> (x + n `div` x) `div` 2) (n `div` 2)
latticePointsWithoutNegatives :: Int -> [[Int]]
latticePointsWithoutNegatives 0 = [[0,0,0]]
latticePointsWithoutNegatives n = [[x,y,z] | x<-[0.. isqrt n], y<- [0.. isqrt (n - x^2)], z<-[max 0 (isqrt ((n-x^2 -y^2) `div` 3))], x^2 +y^2 + 3*z^2 ==n]
latticePoints :: Int -> [[Int]]
latticePoints n = [ zipWith (*) [x1,x2,x3] y | [x1,x2,x3] <- (latticePointsWithoutNegatives n), y <- [[a,b,c] | a <- (if x1 == 0 then [0] else [-1,1]), b<-(if x2 == 0 then [0] else [-1,1]), c<-(if x3 == 0 then [0] else [-1,1])]]
latticePointsUpTo :: Int -> Int
latticePointsUpTo n = sum [length (latticePoints x) | x<-[0..n]]
listResults :: Int -> [(Int, Int)]
listResults n = [(x, latticePointsUpTo x) | x<- [1..n]]
main = do
    args <- getArgs
    let cleanArgs = read (head args)
    print (listResults cleanArgs)
I've compiled this with
ghc -O2 latticePointsTest
but using the PowerShell "Measure-Command" command, I get the following results:
Measure-Command{./latticePointsTest 10}
TotalMilliseconds : 12.0901
Measure-Command{./latticePointsTest 100}
TotalMilliseconds : 12.0901
Measure-Command{./latticePointsTest 1000}
TotalMilliseconds : 31120.4503
and going any more orders of magnitude up takes us onto the scale of days, rather than hours or minutes.
Is there anything fundamentally wrong with the algorithm I'm using? Is there any core reason why my code isn't scaling well? Any guidance will be greatly appreciated. I may also want to process the data between "latticePoints" and "latticePointsUpTo", so I can't just rely entirely on clever number theoretic counting techniques - I need the underlying tuples preserved.
Some things I would try:
isqrt is not efficient for the range of values you are working with. Simply use the floating point sqrt function:
isqrt n = floor $ sqrt ((fromIntegral n) :: Double)
Alternatively, instead of computing integer square roots, use logic like this in your list comprehensions:
x <- takeWhile (\x -> x*x <= n) [0..],
y <- takeWhile (\y -> y*y <= n - x*x) [0..]
Also, I would use expressions like x*x instead of x^2.
Finally, why not compute the number of solutions with something like this:
sols a b c n =
    length [ () | x <- takeWhile (\x -> a*x*x <= n) [0..]
                , y <- takeWhile (\y -> a*x*x+b*y*y <= n) [0..]
                , z <- takeWhile (\z -> a*x*x+b*y*y+c*z*z <= n) [0..]
                ]
This does not exactly compute the same answer that you want because it doesn't account for positive and negative solutions, but you could easily modify it to compute your answer. The idea is to use one list comprehension instead of iterating over various values of n and summing.
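One way to account for the negative solutions, sketched here under the assumption that a, b and c are all positive, is to weight each non-negative point by the number of sign combinations it stands for: a zero coordinate contributes one choice, a non-zero coordinate two. The name countWithSigns is hypothetical.

```haskell
-- Count all integer points (x, y, z), of any sign, with
-- a*x^2 + b*y^2 + c*z^2 <= n, by enumerating only the non-negative octant.
-- Assumes a, b, c > 0 so that every takeWhile terminates.
countWithSigns :: Int -> Int -> Int -> Int -> Int
countWithSigns a b c n =
  sum [ weight x * weight y * weight z
      | x <- takeWhile (\x' -> a*x'*x' <= n) [0..]
      , y <- takeWhile (\y' -> a*x*x + b*y'*y' <= n) [0..]
      , z <- takeWhile (\z' -> a*x*x + b*y*y + c*z'*z' <= n) [0..]
      ]
  where
    -- A zero coordinate has one sign choice, a non-zero one has two.
    weight :: Int -> Int
    weight 0 = 1
    weight _ = 2

-- countWithSigns 1 1 1 2  ==  19   -- origin, 6 axis points, 12 of type (±1,±1,0)
-- countWithSigns 1 1 3 1  ==  5    -- origin plus (±1,0,0) and (0,±1,0)
```

If the points themselves are needed rather than the count, the same comprehension can return (x, y, z) and the signs can be expanded afterwards, as in the question.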
Finally, I think using floor and sqrt to compute the integral square root is completely safe in this case. This code verifies that floor (sqrt (x*x)) == x for all x <= 3037000499:
testAll :: Int -> IO ()
testAll n =
    -- head fails on an empty list, i.e. when there is no counterexample
    print $ head [ (x,a) | x <- [n,n-1 .. 1], let a = floor $ sqrt (fromIntegral (x*x) :: Double), a /= x ]
main = testAll 3037000499
Note I am running this on a 64-bit GHC - otherwise just use Int64 instead of Int since Doubles are 64-bit in either case. Takes only a minute or so to verify.
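This shows that taking the floor of sqrt y gives the exact answer for every perfect square y <= 3037000499^2. It does not test non-squares: an Int above 2^53 is not always exactly representable as a Double, so for very large non-square y the result can be one too high. For n around 1,000,000, as in the question, this is not a concern.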
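If exactness matters beyond that range, one option is to treat the floating-point result as an estimate and correct it with integer comparisons. This is a sketch only; isqrtSafe is a hypothetical name, and the comparisons are done in Integer so that squaring the estimate cannot overflow Int.

```haskell
-- Integer square root: the largest r with r*r <= n.
-- Starts from the Double estimate and nudges it until it is exact.
isqrtSafe :: Int -> Int
isqrtSafe n
  | n < 0     = error "isqrtSafe: negative input"
  | otherwise = fixUp (floor (sqrt (fromIntegral n :: Double)))
  where
    n' = toInteger n
    sq :: Int -> Integer
    sq r = toInteger r * toInteger r     -- squared in Integer, no overflow
    fixUp r
      | sq r > n'        = fixUp (r - 1) -- estimate was too high
      | sq (r + 1) <= n' = fixUp (r + 1) -- estimate was too low
      | otherwise        = r

-- isqrtSafe (2^62 - 1)  ==  2^31 - 1
-- (the plain floor/sqrt version returns 2^31 here, because 2^62 - 1
--  rounds to 2^62 when converted to Double)
```

In practice the correction loop runs at most a step or two, so the cost over the plain floor/sqrt version is small.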

Generating a unique ID from two values

What would be an idiomatic way of generating a unique number (say, a 64bit unsigned int) from two values, in such a way that the input values (also numbers of the same type) could be regenerated from the number, as a Haskell function?
In C/C++ I would probably use something like
result = (((value1) << BITS) + ((value2) & ((1 << BITS) - 1)))
and, accordingly,
value1 = (result >> BITS)
value2 = (result & ((1 << BITS) - 1))
for regenerating the values, but I don't think I should be trying to use bitwise operations in Haskell.
After consideration, I simply abandoned the idea of using bitwise operations and resorted to Cantor's pairing function:
pair :: (Fractional a) => a -> a -> a
pair x y = (1 / 2) * (x + y) * (x + y + 1) + y
unpair :: (RealFrac a, Floating a) => a -> (a, a)
unpair z = (x, y) where
    q = (-1 / 2) + sqrt (1 / 4 + 2 * z)
    j = fromInteger (truncate q)
    y = z - ((1 / 2) * j * (j + 1))
    x = j - y
This is probably the way I should have thought about it from the beginning. Thank you all very much for helping me to better understand bit operations in Haskell, though.
You can use the exact same way in Haskell. Bitwise operations can be found in Data.Bits and unsigned, fixed-sized integer types in Data.Word. For example:
import Data.Bits
import Data.Word
combine :: Word32 -> Word32 -> Word64
combine a b = (fromIntegral a `shiftL` 32) + fromIntegral b
separate :: Word64 -> (Word32, Word32)
separate w = (fromIntegral $ w `shiftR` 32, fromIntegral $ w .&. 0xffffffff)
The thing that might trip you up compared to C is that Haskell never converts between different numeric types implicitly, so you need to use fromIntegral to convert between e.g. 32bit and 64bit unsigned integers.
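As a small illustration of those explicit conversions, here is a self-contained round-trip sketch. The names packIds and unpackIds are hypothetical; the point is that the low half needs a full 32-bit mask (0xffffffff) to survive the round trip.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32, Word64)

-- Put the first value in the upper 32 bits and the second in the lower 32.
packIds :: Word32 -> Word32 -> Word64
packIds hi lo = (fromIntegral hi `shiftL` 32) .|. fromIntegral lo

-- Recover both halves; each fromIntegral narrows Word64 back to Word32.
unpackIds :: Word64 -> (Word32, Word32)
unpackIds w = ( fromIntegral (w `shiftR` 32)
              , fromIntegral (w .&. 0xffffffff) )

-- packIds 1 2                    ==  4294967298
-- unpackIds (packIds 7 0x10000)  ==  (7, 65536)
```

Because the two halves never overlap, .|. and + give the same result here; .|. just makes the intent explicit.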

Haskell - list comprehension can't enumerate N × N

I have to write a function which returns a list of all pairs (x,y) where x, y ∈ N, and:
x is the product of two natural numbers (x = a • b, where a, b ∈ N) and
x is strictly greater than 5 but strictly less than 500, and
y is a square number (y = c² where c ∈ N) not greater than 1000, and
x is a divisor of y.
My attempt:
listPairs :: [(Int, Int)]
listPairs = [(a*b, y) | y <- [0..], a <- [0..], b <- [0..],
(a*b) > 5, (a*b) < 500, (y*y) < 1001,
mod y (a*b) == 0]
But it doesn't return anything and the computer works a lot on it.
However if I choose a smaller range for a, b and y e. g. [0..400], it takes up to a minute but it returns the right result.
So how could I solve the performance issue?
So, of course nested list comprehensions on infinite lists do not terminate.
Fortunately, your lists are not infinite. There's a limit. If x = a*b < 500, then we know that a < 500 and b < 500 must hold. Also, y = c*c < 1001 is just c < 32. So,
listPairs :: [(Int, Int)]
listPairs =
  [(x, c*c) | c <- [1..31], a <- [1..499],   -- a*b < 500 ==> b <= div 499 a
              b <- [a..min 499 (div 499 a)], -- a*b==b*a ==> b >= a
              let x = a*b, x > 5,
              -- (a*b) < 500, (c*c) < 1001,  -- no need to test this
              rem (c*c) x == 0]
mod 0 n == 0 trivially holds, so I'm excluding 0 from "natural numbers" here.
There are still some duplicates produced here, even though we've limited the b value to b >= a in x=a*b, because x can have several representations (e.g. 1*6 == 2*3).
You can use Data.List.nub to get rid of them.
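A minimal sketch of the deduplicated version follows. It assumes, as above, that 0 is not counted as a natural number; the name listPairsNub is hypothetical. Note that nub is quadratic in the length of its input, which is acceptable here because the list is small.

```haskell
import Data.List (nub)

-- All pairs (x, y) with 5 < x < 500, x = a*b, y = c*c <= 1000 and x dividing y,
-- with repeated pairs (from different factorizations of x) removed.
listPairsNub :: [(Int, Int)]
listPairsNub = nub
  [ (x, y)
  | c <- [1 .. 31], let y = c * c   -- 31*31 = 961 is the last square <= 1000
  , a <- [1 .. 499]
  , b <- [a .. div 499 a]           -- b >= a and a*b <= 499
  , let x = a * b, x > 5
  , rem y x == 0 ]
```

For example, (6, 36) appears exactly once in the result even though 6 can be written as 1*6 and as 2*3.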

Recursive addition in F# using

I'm trying to implement the following recursive definition for addition in F#
m + 0 := m
m + (n + 1) := (m + n) + 1
I can't seem to get the syntax correct. The closest I've come is
let rec plus x y =
    match y with
    | 0 -> x;
    | succ(y) -> succ( plus(x y) );
Where succ n = n + 1. It throws an error on pattern matching for succ.
I'm not sure what succ means in your example, but it is not a pattern defined in the standard F# library. Using just the basic functionality, you'll need to use a pattern that matches any number and then subtract one (and add one in the body):
let rec plus x y =
    match y with
    | 0 -> x
    | y -> 1 + (plus x (y - 1))
In F# (unlike e.g. in Prolog), you can't use your own functions inside patterns. However, you can define active patterns that specify how to decompose input into various cases. The following takes an integer and returns either Zero (for zero) or Succ y for value y + 1:
let (|Zero|Succ|) n =
    if n < 0 then failwith "Unexpected!"
    if n = 0 then Zero else Succ(n - 1)
Then you can write code that is closer to your original version:
let rec plus x y =
    match y with
    | Zero -> x
    | Succ y -> 1 + (plus x y)
As Tomas said, you can't use succ like this without declaring it. What you can do is to create a discriminated union that represents a number:
type Number =
    | Zero
    | Succ of Number
And then use that in the plus function:
let rec plus x y =
    match y with
    | Zero -> x
    | Succ(y1) -> Succ (plus x y1)
Or you could declare it as the + operator:
let rec (+) x y =
    match y with
    | Zero -> x
    | Succ(y1) -> Succ (x + y1)
If you kept y where I have y1, the code would work, because the second y would hide the first one. But I think doing so makes the code confusing.
type N = Zero | Succ of N
let rec NtoInt n =
    match n with
    | Zero -> 0
    | Succ x -> 1 + NtoInt x
let rec plus x y =
    match x with
    | Zero -> y
    | Succ n -> Succ (plus n y)
> plus (Succ (Succ Zero)) Zero |> NtoInt ;;
val it : int = 2
> plus (Succ (Succ Zero)) (Succ Zero) |> NtoInt ;;
val it : int = 3
let rec plus x y =
    match y with
    | 0 -> x
    | _ -> plus (x+1) (y-1)
