List of tuples by taking the same index for an element in haskell - algorithm

I have been trying to solve the following problem in haskell:
Generate a list of tuples (n, s) where 0 ≤ n ≤ 100 and n mod 2 = 0,
and where s = sum(1..n). The output should be the list
[(0,0),(2,3),(4,10),...,(100,5050)].
I tried to solve the problem with following code:
genListTupleSumUntilX :: Int -> [(Int,Int)]
genListTupleSumUntilX x =
  take x [(n, s) | n <- [1..x], s <- sumUntilN x]
  where
    sumUntilN :: Int -> [Int]
    sumUntilN n
      | n == 0 = []
      | n == 1 = [1]
      | otherwise = sumUntilN (n-1) ++ [sum[1..n]]
However, this code does not give the expected result (as @Guru Stron pointed out, thank you!).
I would also appreciate help making this code more concise. I feel it could be improved, especially in these respects:
take x in the function seems really inelegant. Is there a way to have a list comprehension pair up only the elements at the same index?
sumUntilN feels really verbose. Is there an idiomatic way to do the same in Haskell?
Finally, I am extremely new to Haskell and to lazy evaluation, and I have trouble evaluating the time and space complexity of the function. Can somebody help me there?

sumOfNumsUptoN n = n * (n + 1) `div` 2
genListTupleSumUntilX :: Int -> [(Int, Int)]
genListTupleSumUntilX n = zip [0, 2 .. n] $ map sumOfNumsUptoN [0, 2 .. n]
This is of linear complexity on the size of the list.

I would say that you overcomplicate things. To produce correct output you can use simple list comprehension:
genListTupleSumUntilX :: Int -> [(Int,Int)]
genListTupleSumUntilX x = [(n, sum [1..n]) | n <- [0,2..x]]
Note that this solution recalculates the same sums repeatedly: the sum for n + 2 is (n + 2) + (n + 1) + sum [1..n], so the previous result could be reused. The recomputation leads to O(n^2) complexity, but for such a relatively small n it is not a big issue. You can avoid it with the scanl function (though there may be a more idiomatic approach to memoization):
genListTupleSumUntilX :: Int -> [(Int,Int)]
genListTupleSumUntilX 0 = []
genListTupleSumUntilX x = scanl (\ (prev, prevSum) curr -> (curr, prevSum + prev + 1 + curr)) (0,0) [2,4..x]
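On the "same index" part of the question: zip pairs two lists position by position, whereas two generators in one comprehension produce every combination. A minimal sketch of that idea, where the helper names triangulars, everyOther and evenSums are hypothetical and not taken from the posts above:

```haskell
-- Running sums of 0 + 1 + 2 + ...: 0, 1, 3, 6, 10, ...
triangulars :: [Int]
triangulars = scanl (+) 0 [1 ..]

-- Keep the elements at positions 0, 2, 4, ...
everyOther :: [a] -> [a]
everyOther (x : _ : rest) = x : everyOther rest
everyOther xs = xs

-- Pair each even n with sum [1..n]; zip stops when the finite list ends.
evenSums :: Int -> [(Int, Int)]
evenSums x = zip [0, 2 .. x] (everyOther triangulars)
```

Because triangulars is lazy and infinite, only as many sums are computed as zip consumes, so the work stays linear in x.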

Related

Randomized algorithm not behaving as expected

I am implementing an approximate counting algorithm where we:
Maintain a counter X using log(log n) bits
Initialize X to 0
When an item arrives, increase X by 1 with probability (1/2)^X
When the stream is over, output 2^X − 1, so that E[2^X] = n + 1
My implementation is as follows:
import System.Random
type Prob = Double
type Tosses = Int
-- * for sake of simplicity we assume 0 <= p <= 1
tos :: Prob -> StdGen -> (Bool,StdGen)
tos p s = (q <= 100*p, s')
  where (q,s') = randomR (1,100) s
toses :: Prob -> Tosses -> StdGen -> [(Bool,StdGen)]
toses _ 0 _ = []
toses p n s = let t@(b,s') = tos p s in t : toses p (pred n) s'
toses' :: Prob -> Tosses -> StdGen -> [Bool]
toses' p n = fmap fst . toses p n
morris :: StdGen -> [a] -> Int
morris s xs = go s xs 0 where
  go _ [] n = n
  go s (_:xs) n = go s' xs n' where
    (h,s') = tos (0.5^n) s
    n' = if h then succ n else n
main :: IO Int
main = do
  s <- newStdGen
  return $ morris s [1..10000]
The problem is that my X is always incorrect for any |stream| > 2, and it seems like for all StdGen and |stream| > 1000, X = 7
I tested the same algorithm in Matlab and it works there, so I assume it's either
an issue with my random number generator, or
raising 1/2 to a large n in Double
Please suggest a path forward?
The problem is actually very simple: with randomR (1,100) you preclude values within the first percent, so you have a complete cutoff at high powers of 1/2 (which all lie in that small interval). Once X reaches 7, 100 * 0.5^7 ≈ 0.78 is below the smallest possible q of 1, so the toss can never succeed again and X sticks at 7. As a general rule, ranges should start at zero, not at one†, unless there's a specific reason.
But why even use a range of 100 in the first place? I'd just make it
tos :: Prob -> StdGen -> (Bool,StdGen)
tos p s = (q <= p, s')
  where (q,s') = randomR (0,1) s
†I know, Matlab gets this wrong all over the place. Just one of the many horrible things about that language.
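The cutoff can be checked without any random numbers at all. The sketch below only recomputes the threshold that the original tos compares q against; the names threshold and stuckAt are hypothetical helpers introduced for illustration.

```haskell
-- Threshold used by the original `tos`: the toss succeeds when q <= 100 * p,
-- with q drawn from [1, 100] and p = 0.5 ^ n.
threshold :: Int -> Double
threshold n = 100 * 0.5 ^ n

-- First counter value whose threshold falls below 1, the smallest possible q.
-- From this value on, no toss can succeed, so the counter stops growing.
stuckAt :: Int
stuckAt = head [n | n <- [0 ..], threshold n < 1]
```

Here threshold 6 is 1.5625 and threshold 7 is 0.78125, so stuckAt evaluates to 7, matching the value observed in the question.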
Unrelated to your problem: as chi remarked this kind of code looks a lot nicer if you use a suitable random monad, instead of manually passing around StdGens.
import Data.Random
import Data.Random.Source.Std
type Prob = Double
tos :: Prob -> RVar Bool
tos p = do
  q <- uniform 0 1
  return $ q <= p
morris :: [a] -> RVar Int
morris xs = go xs 0 where
  go [] n = return n
  go (_:xs) n = do
    h <- tos (0.5^n)
    go xs $ if h then succ n else n
morrisTest :: Int -> IO Int
morrisTest n = do
  runRVar (morris [1..n]) StdRandom

Most efficient algorithm to find integer points within an ellipse

I'm trying to find all the integer lattice points within various 3D ellipses.
I would like my program to take an integer N, and count all the lattice points within the ellipses of the form ax^2 + by^2 + cz^2 = n, where a,b,c are fixed integers and n is between 1 and N. This program should then return N tuples of the form (n, numlatticePointsWithinEllipse n).
I'm currently doing it by counting the points on the ellipses ax^2 + by^2 + cz^2 = m, for m between 0 and n inclusive, and then summing over m. I'm also only looking at x, y and z all positive initially, and then adding in the negatives by permuting their signs later.
Ideally, I'd like to reach N = 1,000,000+ within a few hours.
Taking a specific example of x^2 + y^2 + 3z^2 = N, here's the Haskell code I'm currently using:
import System.Environment
isqrt :: Int -> Int
isqrt 0 = 0
isqrt 1 = 1
isqrt n = head $ dropWhile (\x -> x*x > n) $ iterate (\x -> (x + n `div` x) `div` 2) (n `div` 2)
latticePointsWithoutNegatives :: Int -> [[Int]]
latticePointsWithoutNegatives 0 = [[0,0,0]]
latticePointsWithoutNegatives n =
  [ [x,y,z]
  | x <- [0 .. isqrt n]
  , y <- [0 .. isqrt (n - x^2)]
  , z <- [max 0 (isqrt ((n - x^2 - y^2) `div` 3))]
  , x^2 + y^2 + 3*z^2 == n  -- 3*z^2, to match x^2 + y^2 + 3z^2 = n
  ]
latticePoints :: Int -> [[Int]]
latticePoints n = [ zipWith (*) [x1,x2,x3] y | [x1,x2,x3] <- (latticePointsWithoutNegatives n), y <- [[a,b,c] | a <- (if x1 == 0 then [0] else [-1,1]), b<-(if x2 == 0 then [0] else [-1,1]), c<-(if x3 == 0 then [0] else [-1,1])]]
latticePointsUpTo :: Int -> Int
latticePointsUpTo n = sum [length (latticePoints x) | x<-[0..n]]
listResults :: Int -> [(Int, Int)]
listResults n = [(x, latticePointsUpTo x) | x<- [1..n]]
main = do
  args <- getArgs
  let cleanArgs = read (head args)
  print (listResults cleanArgs)
I've compiled this with
ghc -O2 latticePointsTest
but using the PowerShell "Measure-Command" command, I get the following results:
Measure-Command{./latticePointsTest 10}
TotalMilliseconds : 12.0901
Measure-Command{./latticePointsTest 100}
TotalMilliseconds : 12.0901
Measure-Command{./latticePointsTest 1000}
TotalMilliseconds : 31120.4503
and going any more orders of magnitude up takes us onto the scale of days, rather than hours or minutes.
Is there anything fundamentally wrong with the algorithm I'm using? Is there any core reason why my code isn't scaling well? Any guidance will be greatly appreciated. I may also want to process the data between "latticePoints" and "latticePointsUpTo", so I can't just rely entirely on clever number theoretic counting techniques - I need the underlying tuples preserved.
Some things I would try:
isqrt is not efficient for the range of values you are working with. Simply use the floating point sqrt function:
isqrt n = floor $ sqrt ((fromIntegral n) :: Double)
Alternatively, instead of computing integer square roots, use logic like this in your list comprehensions:
x <- takeWhile (\x -> x*x <= n) [0..],
y <- takeWhile (\y -> y*y <= n - x*x) [0..]
Also, I would use expressions like x*x instead of x^2.
Finally, why not compute the number of solutions with something like this:
sols a b c n =
  length [ () | x <- takeWhile (\x -> a*x*x <= n) [0..]
              , y <- takeWhile (\y -> a*x*x+b*y*y <= n) [0..]
              , z <- takeWhile (\z -> a*x*x+b*y*y+c*z*z <= n) [0..]
         ]
This does not exactly compute the same answer that you want because it doesn't account for positive and negative solutions, but you could easily modify it to compute your answer. The idea is to use one list comprehension instead of iterating over various values of n and summing.
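One possible way to make that modification is to weight each non-negative solution by the number of sign patterns it stands for. This is a sketch under the assumption that a, b and c are positive; countWithin and weight are hypothetical names, not part of the answer above.

```haskell
-- Count all integer points (x, y, z), of any sign, with
-- a*x^2 + b*y^2 + c*z^2 <= n. Assumes a, b, c > 0.
countWithin :: Int -> Int -> Int -> Int -> Int
countWithin a b c n =
  sum [ weight x * weight y * weight z
      | x <- takeWhile (\x' -> a*x'*x' <= n) [0 ..]
      , y <- takeWhile (\y' -> a*x*x + b*y'*y' <= n) [0 ..]
      , z <- takeWhile (\z' -> a*x*x + b*y*y + c*z'*z' <= n) [0 ..]
      ]
  where
    -- A zero coordinate has one sign; a non-zero one stands for +v and -v.
    weight :: Int -> Int
    weight 0 = 1
    weight _ = 2
```

For example, countWithin 1 1 1 1 counts the origin and the six unit points, giving 7. If the points themselves are needed later, the same comprehension can return [x,y,z] instead of a weight.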
Finally, I think using floor and sqrt to compute the integral square root is completely safe in this case. This code verifies that the integer square root computed with sqrt of x*x equals x for all x <= 3037000499:
testAll :: Int -> IO ()
testAll n =
  -- head fails on an empty list, i.e. when no counterexample exists
  print $ head [ (x,a) | x <- [n,n-1 .. 1], let a = floor $ sqrt (fromIntegral (x*x) :: Double), a /= x ]
main = testAll 3037000499
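Note I am running this on a 64-bit GHC; otherwise just use Int64 instead of Int, since Doubles are 64-bit in either case. It takes only a minute or so to verify.
This shows that taking the floor of sqrt y gives the right answer for every perfect square y <= 3037000499^2. Strictly, the test covers only perfect squares; for the values in this question (n on the order of 10^6, far below 2^53) every Int converts to a Double exactly, so rounding is not a concern.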
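If the floating-point estimate is ever in doubt, it can be corrected with exact integer arithmetic afterwards. The following is only a sketch: isqrtChecked and adjust are hypothetical names, n is assumed to be non-negative, and (r + 1) * (r + 1) is assumed not to overflow Int.

```haskell
-- Integer square root: start from the Double estimate, then nudge it
-- until r*r <= n < (r+1)*(r+1) holds in exact Int arithmetic.
isqrtChecked :: Int -> Int
isqrtChecked n = adjust (floor (sqrt (fromIntegral n :: Double)))
  where
    adjust r
      | r * r > n              = adjust (r - 1)  -- estimate too high
      | (r + 1) * (r + 1) <= n = adjust (r + 1)  -- estimate too low
      | otherwise              = r
```

For inputs in the range discussed here the estimate is normally already right, so adjust returns after the two comparisons.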

Recursion with accumulators that are not reversed - is it possible?

I've been playing with Haskell a fair amount lately, and I came up with this function to find the nth prime:
nthPrime 1 = 2
nthPrime 2 = 3
nthPrime n = aux [2, 3] 3 5 n
  where
    aux knownPrimes currentNth suspect soughtNth =
      let currentIsPrime = foldl (\l n -> l && suspect `mod` n /= 0)
                                 True knownPrimes
      in case (currentIsPrime, soughtNth == currentNth) of
        (True, True) -> suspect
        (True, False) -> aux (suspect:knownPrimes) (currentNth + 1)
                             (suspect + 2) soughtNth
        _ -> aux knownPrimes currentNth (suspect + 2) soughtNth
My question is, is there a way to have an accumulative parameter (in this case knownPrimes) that is not reversed (as occurs when passing (suspect:knownPrimes))?
I have tried using knownPrimes ++ [suspect] but this seems inefficient as well.
My hope is that if I can pass the known primes in order then I can shortcut some of the primality checks further.
In Haskell, if you are using an accumulator to build a list, but end up having to reverse it, it is often the case that it is better to drop the accumulator and instead produce the list lazily as the result of your computation.
If you apply this kind of thinking to searching for primes, and take full advantage of laziness, you end up with a well-known technique of producing an infinite list of all the primes. If we refactor your code as little as possible to use this technique, we get something like:
allPrimes = [2, 3] ++ aux 5
  where
    aux suspect =
      let currentIsPrime = foldl (\l n -> l && suspect `mod` n /= 0) True
                             $ takeWhile (\n -> n*n <= suspect) allPrimes
      in case currentIsPrime of
        True -> suspect : aux (suspect + 2)
        False -> aux (suspect + 2)
nthPrime n = allPrimes !! (n-1)
I have removed now unnecessary parameters and changed the code from accumulating into lazily producing, and to use its own result as the source of prime divisors to test (this is called "tying the knot"). Other than that, the only change here is to add a takeWhile check: since the list we are testing divisors from is defined in terms of itself, and is infinite to boot, we need to know where on the list to stop checking for divisors so that we don't get a truly infinite recursion.
Apart from this, there is an inefficiency in this code:
foldl (\l n -> l && suspect `mod` n /= 0) True
is not a good way for checking whether there are no divisors in a list, because as written, it won't stop once a divisor has been found, even though && itself is shortcutting (stopping as soon as its first argument is found to be False).
To allow proper shortcutting, a foldr could be used instead:
foldr (\n r -> suspect `mod` n /= 0 && r) True
Or, even better, use the predefined function all:
all (\n -> suspect `mod` n /= 0)
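The difference is easy to observe on an infinite list of candidate divisors, where only a short-circuiting check can terminate. A small sketch, with hypothetical names noDivisorFoldr and noDivisorAll:

```haskell
-- Both versions stop at the first divisor they find, so they return
-- even when the list of candidates is infinite (provided a divisor exists).
noDivisorFoldr :: Int -> [Int] -> Bool
noDivisorFoldr suspect = foldr (\n r -> suspect `mod` n /= 0 && r) True

noDivisorAll :: Int -> [Int] -> Bool
noDivisorAll suspect = all (\n -> suspect `mod` n /= 0)
```

For instance, noDivisorAll 9 [2 ..] returns False as soon as it reaches 3, whereas the foldl version would never finish on that input.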
Applying these remarks, this is what it would look like if you use all and refactor it a bit:
allPrimes :: [Integer]
allPrimes = 2 : 3 : aux 5
  where
    aux suspect
      | currentIsPrime = suspect : nextPrimes
      | otherwise = nextPrimes
      where
        currentIsPrime =
          all (\n -> suspect `mod` n /= 0)
            $ takeWhile (\n -> n*n <= suspect) allPrimes
        nextPrimes = aux (suspect + 2)
nthPrime :: Int -> Integer
nthPrime n = allPrimes !! (n-1)

performance issue on collatz chain

I have a working program to compute the longest collatz chain in a given range (project euler n°14). I think it works correctly, but it is very slow. I tried to look for a better solution, but I can only slightly reduce the evaluated domain. Am I doing something wrong?
The implementation uses memoization to avoid computing the same result twice. Is Data.Map bad for general performance?
import Data.Map ((!), member, insert, singleton, assocs, Map)
insertSolution::Integer->(Map Integer Integer)->(Map Integer Integer)
insertSolution n syracMap
  | n `member` syracMap = syracMap
  | otherwise = let
      next = if n `mod` 2 == 0 then n `div` 2 else 3 * n + 1
      newMap = insertSolution next syracMap
      solution = newMap ! next + 1
    in insert n solution newMap
bound = 1::Integer
lower = 999999::Integer
test::[Integer]
test = [lower,lower+2..bound]
values = takeWhile (\(k, v) -> k < bound) $ assocs $ foldr insertSolution (singleton 1 1) test
result = foldr (\(k, v) (k', v') -> if v > v' then (k, v) else (k', v')) (1, 1) values
main = putStr $ show $ result
Edit: updated the function to remove a bug. It is still pretty slow on my laptop.
FWIW, here's my solution:
module Main
where
import Data.List
import Data.Ord
next_hailstone n | even n = n `div` 2
                 | otherwise = 3*n+1
gen_next_hailstone n
  = if nh == 1
      then Nothing
      else Just (nh, nh)
  where nh = next_hailstone n
hailstone n = unfoldr gen_next_hailstone n
hailstone_seqs = map hailstone [1..1000000]
zip_hailstone = zip [1..1000000] hailstone_seqs
max_hailstone = maximumBy (comparing (length . snd)) zip_hailstone
main = print . fst $ max_hailstone
It's relatively fast. If you want more speed, consult the Haskell wiki (SPOILER ALERT!!!).
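A variant of the same brute-force idea avoids building each sequence as a list and counts steps with an accumulator instead. This is only a sketch: collatzLength and longestChain are hypothetical names, the length here counts both the starting number and the final 1 (unlike hailstone above), and Int is assumed to be 64-bit so that 3 * n + 1 does not overflow.

```haskell
import Data.List (foldl')

-- Number of terms in the chain starting at n, including n and the final 1.
-- Assumes n >= 1.
collatzLength :: Int -> Int
collatzLength = go 1
  where
    go acc 1 = acc
    go acc n
      | even n    = go (acc + 1) (n `div` 2)
      | otherwise = go (acc + 1) (3 * n + 1)

-- Starting number up to `limit` with the longest chain, and that length.
-- foldl' keeps only the best pair seen so far, with no list of chains.
longestChain :: Int -> (Int, Int)
longestChain limit = foldl' step (1, 1) [2 .. limit]
  where
    step best@(_, bestLen) n =
      let len = collatzLength n
      in if len > bestLen then (n, len) else best
```

For a small range, longestChain 10 gives (9, 20), since the chain starting at 9 has 20 terms.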

Performance comparison of two implementations of a primes filter

I have two programs to find prime numbers (just an exercise, I'm learning Haskell). "primes" is about 10X faster than "primes2", once compiled with ghc (with flag -O). However, in "primes2", I thought it would consider only prime numbers for the divisor test, which should be faster than considering odd numbers in "isPrime", right? What am I missing?
isqrt :: Integral a => a -> a
isqrt = floor . sqrt . fromIntegral
isPrime :: Integral a => a -> Bool
isPrime n = length [i | i <- [1,3..(isqrt n)], mod n i == 0] == 1
primes :: Integral a => a -> [a]
primes n = [2,3,5,7,11,13] ++ (filter (isPrime) [15,17..n])
primes2 :: Integral a => a -> [a]
primes2 n = 2 : [i | i <- [3,5..n], all ((/= 0) . mod i) (primes2 (isqrt i))]
I think what's happening here is that isPrime is a simple loop, whereas primes2 is calling itself recursively — and its recursion pattern looks exponential to me.
Searching through my old source code, I found this code:
primes :: [Integer]
primes = 2 : filter isPrime [3,5..]
isPrime :: Integer -> Bool
isPrime x = all (\n -> x `mod` n /= 0) $
  takeWhile (\n -> n * n <= x) primes
This tests each possible prime x only against the primes below sqrt(x), using the already generated list of primes. So it probably doesn't test any given prime more than once.
Memoization in Haskell:
Memoization in Haskell is generally explicit, not implicit. The compiler won't "do the right thing" but it will only do what you tell it to. When you call primes2,
*Main> primes2 5
[2,3,5]
*Main> primes2 10
[2,3,5,7]
Each time you call the function it calculates all of its results all over again. It has to. Why? Because 1) You didn't make it save its results, and 2) the answer is different each time you call it.
In the sample code I gave above, primes is a constant (i.e. it has arity zero) so there's only one copy of it in memory, and its parts only get evaluated once.
If you want memoization, you need to have a value with arity zero somewhere in your code.
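To still offer a "primes up to n" interface like primes2 while keeping the sharing, a function can simply read a prefix of the arity-zero list. A minimal sketch, using Int and the hypothetical names sharedPrimes, isPrimeShared and primesUpTo:

```haskell
-- Arity-zero value: evaluated lazily, and the evaluated prefix is kept
-- for as long as the list is referenced.
sharedPrimes :: [Int]
sharedPrimes = 2 : filter isPrimeShared [3, 5 ..]

-- Trial division by the primes already in the shared list, up to sqrt x.
isPrimeShared :: Int -> Bool
isPrimeShared x =
  x >= 2 && all (\p -> x `mod` p /= 0) (takeWhile (\p -> p * p <= x) sharedPrimes)

-- Function with an argument, but all calls reuse the same underlying list.
primesUpTo :: Int -> [Int]
primesUpTo n = takeWhile (<= n) sharedPrimes
```

Calling primesUpTo 5 and then primesUpTo 10 only extends the already evaluated part of sharedPrimes rather than starting over.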
I like what Dietrich has done with the memoization, but I think there's a data structure issue here too. Lists are just not the ideal data structure for this. They are, by necessity, Lisp-style cons cells with no random access. Set seems better suited to me.
import qualified Data.Set as S
sieve :: (Integral a) => a -> S.Set a
sieve top = let l = S.fromList (2:3:([5,11..top]++[7,13..top]))
                iter s c
                  | cur > (div (S.findMax s) 2) = s
                  | otherwise = iter (s S.\\ (S.fromList [2*cur,3*cur..top])) (S.deleteMin c)
                  where cur = S.findMin c
            in iter l (l S.\\ (S.fromList [2,3]))
I know it's kind of ugly, and not too declarative, but it runs rather quickly. I'm looking into a way to make this nicer looking using Set.fold and Set.union over the composites. Any other ideas for neatening this up would be appreciated.
PS - see how (2:3:([5,11..top]++[7,13..top])) avoids unnecessary multiples of 3 such as the 15 in your primes. Unfortunately, this ruins the ordering if you work with lists, so you would need to sort, but for sets that's not an issue.
