Determining if a given number is a prime in haskell - algorithm

So I have devised the following function for seeing if a given number is a prime in Haskell (it assumes the first prime is 2):
isPrime k = length [ x | x <- [2..k], k `mod` x == 0] == 1
it has the obvious pitfall of continuing the evaluation even if it is divisible by several numbers :(. Is there any sane way of "cutting" the evaluation when it finds more than one solution, using list comprehensions?
Also, which other implementations would you you try on? I'm not looking for performance here, I'm just trying to see if there are other more "haskellish" ways of doing the same thing.

A quick change to your code that will 'short circuit' the evaluation, and relies on the laziness of Haskell Lists, is:
isPrime k = if k > 1 then null [ x | x <- [2..k - 1], k `mod` x == 0] else False
The very first divisor of k will cause the list to be non-empty, and the Haskell implementation of null will only look at the first element of the list.
You should only need to check up to sqrt(k) however:
isPrime k = if k > 1 then null [ x | x <- [2..isqrt k], k `mod` x == 0] else False
Of course, if you are looking to do high-performance primality testing, a library is preferred.

Here is the best resource for prime numbers in haskell in haskell.org
and here prime.hs github project

I like this approach:
First make function to get all factors of n:
factors n = [x | x <- [1..n], mod n x == 0]
Then check if factors are only the given number and 1, if so, the number is prime:
prime n = factors n == [1,n]

It's perhaps not directly relevant, but on the topic of finding primes in functional languages I found Melissa E. O'Neill's The Genuine Sieve of Eratosthenes very interesting.

Ignoring the primes issue, and focusing on the narrow point of a more efficient method of length xs == n:
hasLength :: Integral count => [a] -> count -> Bool
_ `hasLength` n | n < 0 = False
[] `hasLength` n = n == 0
(_ : xs) `hasLength` n = xs `hasLength` (pred n)
isPrime k = [ x | x <- [2..k], k `mod` x == 0)] `hasLength` 1

This may be silly and inefficient (I'm a complete Haskell newby), but the function isMyNumberPrime (in ghci) seems to tell you if a number is prime or not.
factors n = [x | x <- [2..(n`div` 2)], mod n x == 0]
factormap n = fmap factors $ factors n
isMyNumberPrime n = case factormap n of [] -> True; _ -> False

Related

Sum Of Two Squares: Where's My Error?

I'm trying to calculate the number of ways to write a natural number as the sum of two squares. I'm working from the definition:
So, here is my code. Below where I test it, I find what I think is an error in the result.
sumOfSquares :: Integer -> Int
sumOfSquares k = 4 * (d1 - d3)
where
divs = divisors k
d1 = congruents d1_test divs
d3 = congruents d3_test divs
d1_test n = (n - 1) `mod` 4 == 0
d3_test n = (n - 3) `mod` 4 == 0
congruents :: (Integer -> Bool) -> [Integer] -> Int
congruents f divs = length $ filter f divs
divisors :: Integer -> [Integer]
divisors k = divisors' 2 k
where
divisors' n k' | n*n > k' = [k']
| n*n == k' = [n, k']
| k' `mod` n == 0 = (n:(k' `div` n):result)
| otherwise = result
where result = divisors' (n+1) k'
And when I run it, it generates:
*Main Numbers.SumOfSquares> sumOfSquares 10
4
I calculated that there is only one way to express 10 as a sum of two squares
1^2 + 3^2. Note that the intermediate result (d1 - d3) equals 1.
I'm missing something important but don't know what.
I think you misread the semantics of the formula. The Wikipedia article states the following equation:
There are two important remarks here:
the domain is Z, not N, therefore (-1), (-3), 0, etc. are also valid elements for the squares; and
we count the number of tuples, not sets so the order is important (and (1,2,2) is not equal to (1,2)): if (1,3) is a solution, so is (3,1) and we count these as two separate ones.
Now 10 has the following divisors: 1, 2, 5, 10 (your program forgot about 1 and 10). Two are congruent with 1 modulo 4: 1 and 5. Furthermore there are no divisors congruent with 3 modulo 4. So d1 = 2 and d3 = 0. Therefore there are eight (4×(2-0) = 8) possibilities:
(1,3): 12+32=10
(3,1): 32+12=10
(1,-3): 12+(-3)2=10
(3,-1): 32+(-1)2=10
(-1,3): (-1)2+32=10
(-3,1): (-3)2+12=10
(-1,-3): (-1)2+(-3)2=10
(-3,-1): (-3)2+(-1)2=10
Now we only have to resolve the issue with your program. You simply need to start counting from 1 instead of 2:
divisors :: Integer -> [Integer]
divisors k = divisors' 1
where
divisors' i | i2 > k = []
| i2 == k = [i]
| k `mod` i == 0 = (i:(k `div` i):result)
| otherwise = result
where i2 = i*i
result = divisors' (i+1)
I also simplified the program a bit and solved some other semantical errors. Now it should at least be sound with rk(n).

My Haskell Solution to Euler #3 is Inefficient

I am attempting to solve Euler problem 3 in Haskell, which involves finding the largest prime factor of a number. My code runs for a long time and seems to hang. What is causing my code to be so grossly inefficient?
primes = sieve (2:[3,5..])
where sieve (x:xs) = x:[y | y <- (sieve xs), mod y x /= 0]
sieve [] = []
primefactors n = filter (\x -> mod n x == 0) (primesUnder n)
where primesUnder z = reverse (takeWhile (< z) primes)
solve3 = head (primefactors 600851475143)
Your main problem is you're checking for enormous primes -- all the way up to 600851475143. You can improve things a lot by observing two things:
Every time you find a prime, you can decrease the maximum prime you look at by dividing away that factor.
You only have to look for primes until you reach the square root of the target. If your primes are bigger than that, and you know there are no smaller factors, you're done.
Using these two improvements together, even without the nicety that you used of only checking primes for divisibility, makes the program run in a snap:
factor = go (2:[3,5..]) where
go (p:ps) n
| p*p > n = [n]
| n `mod` p == 0 = p : go (p:ps) (n `div` p)
| otherwise = go ps n
main = print . last . factor $ 600851475143
In ghci:
*Main> main
6857
(0.00 secs, 0 bytes)
You can see that we only had to inspect numbers up to 6857 -- eight orders of magnitude smaller than what you would have to do with your approach.
Independently, your sieve is dog slow. You could have a look at the wiki for ideas about how to find primes quickly.

Fast obtention of all the subsets of size N in Haskell

The following (unoptimal) code generates all the subsets of size N for certain subset.
This code works but, as I said, is highly unoptimal. Using an intermediate list to avoid the O(log(n)) of Set.insert doesn't seem help due to the large cost of later reconverting the list to a Set
Can anybody suggest how to optimize the code?
import qualified Data.Set as Set
subsetsOfSizeN :: Ord a => Int -> Set.Set a -> Set.Set (Set.Set a)
subsetsOfSizeN n s
| Set.size s < n || n < 0 = error "subsetOfSizeN: wrong parameters"
| otherwise = doSubsetsOfSizeN n s
where doSubsetsOfSizeN n s
| n == 0 = Set.singleton Set.empty
| Set.size s == n = Set.singleton s
| otherwise =
case Set.minView s of
Nothing -> Set.empty
Just (firstS, restS) ->
let partialN n = doSubsetsOfSizeN n restS in
Set.map (Set.insert firstS) (partialN (n-1)) `Set.union` partialN n
This is inspired by Pascal's triangle.
choose :: [b] -> Int -> [[b]]
_ `choose` 0 = [[]]
[] `choose` _ = []
(x:xs) `choose` k = (x:) `fmap` (xs `choose` (k-1)) ++ xs `choose` k
This code works but, as I said, is highly unoptimal.
Doesn't seem so terribly bad to me. The number of subsets of size k of a set of size n is n `choose` k which grows rather fast for k ~ n/2. So creating all the subsets must scale badly.
Using an intermediate list to avoid the O(log(n)) of Set.insert doesn't seem help due to the large cost of later reconverting the list to a Set.
Hmm, I found using lists to give better performance. Not asymptotically, I think, but a not negligible more-or-less constant factor.
But first, there is an inefficiency in your code that is simple to repair:
Set.map (Set.insert firstS) (partialN (n-1))
Note that Set.map must rebuild a tree from scratch. But we know that firstS is always smaller than any element in any of the sets in partialN (n-1), so we can use Set.mapMonotonic that can reuse the spine of the set.
And that principle is also what makes lists attractive, the subsets are generated in lexicographic order, so instead of Set.fromList we can use the more efficient Set.fromDistinctAscList. Transcribing the algorithm yields
onlyLists :: Ord a => Int -> Set.Set a -> Set.Set (Set.Set a)
onlyLists n s
| n == 0 = Set.singleton Set.empty
| Set.size s < n || n < 0 = error "onlyLists: out of range n"
| Set.size s == n = Set.singleton s
| otherwise = Set.fromDistinctAscList . map Set.fromDistinctAscList $
go n (Set.size s) (Set.toList s)
where
go 1 _ xs = map return xs
go k l (x:xs)
| k == l = [x:xs]
| otherwise = map (x:) (go (k-1) (l-1) xs) ++ go k (l-1) xs
which in the few benchmarks I've run is between 1.5 and 2× faster than the amended algorithm using Sets.
And that is in turn, in my criterion benchmarks, nearly twice as fast as dave4420's.
subsets :: Int -> [a] -> [[a]]
subsets 0 _ = [[]]
subsets _ [] = []
subsets k (x:xs) = map (x:) (subsets (k - 1) xs) ++ subsets k xs
First, use a better algorithm.
Look at your final line:
Set.map (Set.insert firstS) (partialN (n-1)) `Set.union` partialN n
Evaluating doSubsetsOfSizeN k (Set.fromList $ 1:2:xs) will involve evaluating doSubsetsOfSizeN (k-1) (Set.fromList xs) twice (once when inserting 1, and once when inserting 2). This duplication is wasteful.
Enter a better algorithm.
mine :: Ord a => Int -> Set.Set a -> Set.Set (Set.Set a)
mine n s | Set.size s < n || n < 0 = Set.empty
| otherwise = Set.foldr cons nil s !! n
where
nil :: Ord a => [Set.Set (Set.Set a)]
nil = Set.singleton Set.empty : repeat Set.empty
cons :: Ord a => a -> [Set.Set (Set.Set a)] -> [Set.Set (Set.Set a)]
cons x sets = zipWith Set.union sets
(Set.empty : map (Set.map $ Set.insert x) sets)
mine 9 (Data.Set.fromList [0..18]) `seq` () is faster than subsetsOfSizeN 9 (Data.Set.fromList [0..18]) `seq` () and should have better asymptotic performance.
I haven't tried optimising this any further. There may be a better algorithm still.
(If the cost of insert and fromList are issues, you should consider giving back a list of lists instead of a set of sets.)
I found this, may be it can help you
f [] = [[1]]
f l = (:) [u] l'
where
u = succ (head (head l))
l' = (++) l (map(\x->(:) u x) l)
fix f n = if (n==0) then [] else f (fix f (n-1))
To test it
$ length $ (fix f 10) => 1023 -- The empty set is always include then == 1024

List of divisors of an integer n (Haskell)

I currently have the following function to get the divisors of an integer:
-- All divisors of a number
divisors :: Integer -> [Integer]
divisors 1 = [1]
divisors n = firstHalf ++ secondHalf
where firstHalf = filter (divides n) (candidates n)
secondHalf = filter (\d -> n `div` d /= d) (map (n `div`) (reverse firstHalf))
candidates n = takeWhile (\d -> d * d <= n) [1..n]
I ended up adding the filter to secondHalf because a divisor was repeating when n is a square of a prime number. This seems like a very inefficient way to solve this problem.
So I have two questions: How do I measure if this really is a bottle neck in my algorithm? And if it is, how do I go about finding a better way to avoid repetitions when n is a square of a prime?
To mesure where the bottleneck is, put the three auxiliary definitions (firstHalf, secondHalf, candidates) at the top level, and run your code with the profiler on: ghc -prof --make divisors.hs ./divisors 100 +RTS -p -RTS
Also, you know that the biggest candidate is sqrt n, so instead of doing that many multiplications d*d, just consider [1..floor (sqrt n)]
For better algorithms, you should take a maths book, for it's not a haskell related question… Things you can consider: if "a divides b", then for all divisor d of a, d divides b as well.
You'll want to use memoization or dynamic programming to avoid checking multiple times if a given d divides b (for example, if 15 and 27 divide b, then you need to mathematically check only once that 3 divides b. The other times, you just see if 3 is in your table of divisors of b).
You needn't test all the elements of reversed second half. You know that if the square root is present, it is the head element there:
secondHalf = let (r:ds) = [n `div` d | d <- reverse firstHalf]
in [r | n `div` r /= r] ++ ds
This assumes n is positive.
A simpler way to handle the sqrt of a number differently is to handle it separately:
divs n =
let
r = floor $ sqrt $ fromIntegral n
(a,b) = unzip $ (1,n) : [(d, q) | d<-[2..r-1], let (q,r)=quotRem n d, r==0]
in
if r*r==n
then a ++ r : reverse b
else a ++ reverse b
That way we get the second half for free, as a part of producing the first half.
But this could hardly be a bottleneck in your application because the algorithm itself is inefficient. It is usually much faster to generate the divisors from a number's prime factorization. Prime factorization by trial division can be much quicker because we divide out each divisor as it is found, reducing the number being factorized and thus the amount of divisors that are tried (up to the reduced number's square root). For example, 12348 = 2*2*3*3*7*7*7 and no factor above 7 is tried in the process of factorization, whereas in divs 12348 the number 12348 is divided by all numbers from 2 to 110:
factorize n = go n (2:[3,5..]) -- or: (go n primes) where
where -- primes = 2 :
go n ds#(d:t) -- filter (null.tail.factorize) [3,5..]
| d*d > n = [n]
| r == 0 = d : go q ds
| otherwise = go n t
where (q,r) = quotRem n d

Building an expression with maximum value

Given n integers, is there an O(n) or O(n log n) algorithm that can compute the maximum value of a mathematical expression that can be obtained by inserting the operators -, +, * and parentheses between the given numbers? Assume only binary variants of the operators, so no unary minus, except before the first element if needed.
For example, given -3 -4 5, we can build the expression (-3) * (-4) * 5, whose value is 60, and maximum possible.
Background:
I stumbled upon this problem some time ago when studying genetic algorithms, and learned that it can be solved pretty simply with a classical genetic algorithm. This runs slowly however, and it's only simple in theory, as the code gets rather ugly in practice (evaluate the expression, check for correct placement of brackets etc.). What's more, we're not guaranteed to find the absolute maximum either.
All these shortcomings of genetic algorithms got me wondering: since we can don't have to worry about division, is there a way to do this efficiently with a more classic approach, such as dynamic programming or a greedy strategy?
Update:
Here's an F# program that implements the DP solution proposed by #Keith Randall together with my improvement, which I wrote in a comment to his post. This is very inefficient, but I maintain that it's polynomial and has cubic complexity. It runs in a few seconds for ~50 element arrays. It would probably be faster if written in a fully imperative manner, as a lot of time is probably wasted on building and traversing lists.
open System
open System.IO
open System.Collections.Generic
let Solve (arr : int array) =
let memo = new Dictionary<int * int * int, int>()
let rec Inner st dr last =
if st = dr then
arr.[st]
else
if memo.ContainsKey(st, dr, last) then
memo.Item(st, dr, last)
else
match last with
| 0 -> memo.Add((st, dr, last),
[
for i in [st .. dr - 1] do
for j in 0 .. 2 do
for k in 0 .. 2 do
yield (Inner st i j) * (Inner (i + 1) dr k)
] |> List.max)
memo.Item(st, dr, last)
| 1 -> memo.Add((st, dr, last),
[
for i in [st .. dr - 1] do
for j in 0 .. 2 do
for k in 0 .. 2 do
yield (Inner st i j) + (Inner (i + 1) dr k)
] |> List.max)
memo.Item(st, dr, last)
| 2 -> memo.Add((st, dr, last),
[
for i in [st .. dr - 1] do
for j in 0 .. 2 do
for k in 0 .. 2 do
yield (Inner st i j) - (Inner (i + 1) dr k)
] |> List.max)
memo.Item(st, dr, last)
let noFirst = [ for i in 0 .. 2 do yield Inner 0 (arr.Length - 1) i ] |> List.max
arr.[0] <- -1 * arr.[0]
memo.Clear()
let yesFirst = [ for i in 0 .. 2 do yield Inner 0 (arr.Length - 1) i ] |> List.max
[noFirst; yesFirst] |> List.max
let _ =
printfn "%d" <| Solve [|-10; 10; -10|]
printfn "%d" <| Solve [|2; -2; -1|]
printfn "%d" <| Solve [|-5; -3; -2; 0; 1; -1; -1; 6|]
printfn "%d" <| Solve [|-5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6;|]
Results:
1000
6
540
2147376354
The last one is most likely an error due to overflow, I'm just trying to show that a relatively big test runs too fast for this to be exponential.
Here's a proposed solution:
def max_result(a_):
memo = {}
a = list(a_)
a.insert(0, 0)
return min_and_max(a, 0, len(a)-1, memo)[1]
def min_and_max(a, i, j, memo):
if (i, j) in memo:
return memo[i, j]
if i == j:
return (a[i], a[i])
min_val = max_val = None
for k in range(i, j):
left = min_and_max(a, i, k, memo)
right = min_and_max(a, k+1, j, memo)
for op in "*-+":
for x in left:
for y in right:
val = apply(x, y, op)
if min_val == None or val < min_val: min_val = val
if max_val == None or val > max_val: max_val = val
ret = (min_val, max_val)
memo[i, j] = ret
return ret
def apply(x, y, op):
if op == '*': return x*y
if op == '+': return x+y
return x-y
max_result is the main function, and min_and_max is auxiliary. The latter returns the minimum and maximum results that can be achieved by sub-sequence a[i..j].
It assumes that maximum and minimum results of sequences are composed by maximum and minimum results of sub-sequences. Under this assumption, the problem has optimal substructure and can be solved with dynamic programming (or memoization). Run time is O(n^3).
I haven't proved correctness, but I have verified its output against a brute force solution with thousands of small randomly generated inputs.
It handles the possibility of a leading unary minus by inserting a zero at the beginning of the sequence.
EDIT
Been thinking a bit more about this problem, and I believe it can be reduced to a simpler problem in which all values are (strictly) positive and only operators * and + are allowed.
Just remove all zeroes from the sequence and replace negative numbers by their absolute value.
Furthermore, if there are no ones in the resulting sequence, the result is simply the product of all numbers.
After this reduction, the simple dynamic programming algorithm would work.
EDIT 2
Based on the previous insights I think I found a linear solution:
def reduce(a):
return filter(lambda x: x > 0, map(abs, a))
def max_result(a):
b = reduce(a)
if len(b) == 0: return 0
return max_result_aux(b)
def max_result_aux(b):
best = [1] * (len(b) + 1)
for i in range(len(b)):
j = i
sum = 0
while j >= 0 and i-j <= 2:
sum += b[j]
best[i+1] = max(best[i+1], best[j] * sum)
j -= 1
return best[len(b)]
best[i] is the maximum result that can be achieved by sub-sequence b[0..(i-1)].
EDIT 3
Here's an argument in favor of the O(n) algorithm based on the following assumption:
You can always achieve the maximum result with an expression of the form
+/- (a_1 +/- ... +/- a_i) * ... * (a_j +/- ... +/- a_n)
That is: a product of factors composed of an algebraic sum of terms (including the case of only one factor).
I will also use the following lemmas which are easy to prove:
Lemma 1: x*y >= x+y for all x,y such that x,y >= 2
Lemma 2: abs(x_1) + ... + abs(x_n) >= abs(x_1 +/- ... +/- x_n)
Here it goes.
The sign of each factor doesn't matter, since you can always make the product positive by using the leading unary minus. Hence, to maximize the product we need to maximize the absolute value of each factor.
Setting aside the trivial case in which all numbers are zeroes, in an optimal solution no factor will be composed only of zeroes. Therefore, since zeroes have no effect inside each sum of terms, and each factor will have at least one non-zero number, we can remove all zeroes. From now on, let's assume there are no zeroes.
Let's concentrate in each sum of terms separately:
(x_1 +/- x_2 +/- ... +/- x_n)
By Lemma 2, the maximum absolute value each factor can achieve is the sum of the absolute values of each term. This can be achieved in the following way:
If x_1 is positive, add all positive terms and subtract all negative terms. If x_1 is negative, subtract all positive terms and add all negative terms.
This implies that the sign of each term does not matter, we can consider the absolute value of each number and only use operator + inside factors. From now on, let's consider all numbers are positive.
The crucial step, that leads to an O(n) algorithm, is to prove that the maximum result can always be achieved with factors that have at most 3 terms.
Suppose we have a factor of more than 3 terms, by Lemma 1 we can break it into two smaller factors of 2 or more terms each (hence, each add up to 2 or more), without reducing the total result. We can break it down repeatedly until no factors of more than 3 terms are left.
That completes the argument. I still haven't found a complete justification of the initial assumption. But I tested my code with millions of randomly generated cases and couldn't break it.
A reasonable big value can be found in O(N). Consider this a greedy algorithm.
Find all positive numbers ≥ 2. Store the result as A.
Count all "-1"s . Store the result as B.
Find all negative numbers ≤ -2. Store the result as C.
Count all "1"s. Store the result as D.
Initialize Product to 1.
If A is not empty, multiply Product by the product of A.
If C is not empty and has even count, multiply Product by the product of C.
If C is has odd count, take the smallest number in magnitude of C away (store it as x), and multiply Product by the product of the rest of C.
If x is set and B is nonzero, compare Product × -x with Product − x + 1.
If the former is strictly larger, decrease B by 1 and multiply Product by -x, then remove x.
If the latter is larger, do nothing.
Set Result to 0. If Product ≠ 1, add it to Result.
Add D to Result, representing addition of D "1"s.
Add B to Result, representing subtraction of B "-1"s.
If x is set, substract x from Result.
The time complexities are:
1. O(N), 2. O(N), 3. O(N), 4. O(N), 5. O(1), 6. O(N), 7. O(N), 8. O(N), 9. O(1), 10. O(1), 11. O(1), 12. O(1), 13. O(1),
so the whole algorithm runs in O(N) time.
An example session:
-3 -4 5
A = [5]
B = 0
C = [-3, -4]
D = 1
Product = 1
A is not empty, so Product = 5.
C is even, so Product = 5 × -3 × -4 = 60
-
-
Product ≠ 1, so Result = 60.
-
-
-
5 × -3 × -4 = 60
-5 -3 -2 0 1 -1 -1 6
A = [6]
B = 2
C = [-5, -3, -2]
D = 1
Product = 1
A is not empty, so Product = 6
-
C is odd, so x = -2, and Product = 6 × -5 × -3 = 90.
x is set and B is nonzero. Compare Product × -x = 180 and Product − x + 1 = 93. Since the former is larger, we reset B to 1, Product to 180 and remove x.
Result = 180.
Result = 180 + 1 = 181
Result = 181 + 1 = 182
-
6 × -5 × -3 × -2 × -1 + 1 − (-1) + 0 = 182
2 -2 -1
A = [2]
B = 1
C = [-2]
D = 0
Product = 1
Product = 2
-
x = -2, Product is unchanged.
B is nonzero. Compare Product × -x = 4 and Product − x + 1 = 5. Since the latter is larger, we do nothing.
Result = 2
-
Result = 2 + 1 = 3
Result = 3 − (-2) = 5.
2 − (-1) − (-2) = 5.
You should be able to do this with dynamic programming. Let x_i be your input numbers. Then let M(a,b) be the maximum value you can get with the subsequence x_a through x_b. You can then compute:
M(a,a) = x_a
M(a,b) = max_i(max(M(a,i)*M(i+1,b), M(a,i)+M(i+1,b), M(a,i)-M(i+1,b))
edit:
I think you need to compute both the max and min computable value using each subsequence. So
Max(a,a) = Min(a,a) = x_a
Max(a,b) = max_i(max(Max(a,i)*Max(i+1,b),
Max(a,i)*Min(i+1,b),
Min(a,i)*Max(i+1,b),
Min(a,i)*Min(i+1,b),
Max(a,i)+Max(i+1,b),
Max(a,i)-Min(i+1,b))
...similarly for Min(a,b)...
Work this in reverse polish - that way you don't have to deal with parentheses. Next put a - in front of every -ve number (thereby making it positive). Finally multiply them all together. Not sure about the complexity, probably about O(N).
EDIT: forgot about 0. If it occurs in your input set, add it to the result.
This feels NP Complete to me, though I haven't yet figured out how to do a reduction. If I'm right, then I could say
Nobody in the world knows if any polynomial algorithm exists, let alone O(n log n), but most computer scientists suspect there isn't.
There are poly time algorithms to estimate the answer, such as the genetic algorithm you describe.
In fact, I think the question you mean to ask is, "Is there a reasonably useful O(n) or O(n log n) algorithm to estimate the maximum value?"
This is my first post on stackoverflow, so I apologize in advance for missing any preliminary etiquette. Also, in the interest of full disclosure, Dave brought this problem to my attention.
Here's an O(N^2logN) solution, mostly because of the the repeated sorting step in the for loop.
Absolute values: Remove zero elements and sort by absolute value. Since you are allowed to place a negative sign in front of your final result, it does not matter whether your answer is negative or positive. Only the absolute values of all numbers in the set matter.
Multiplication only for numbers > 1: We make the observation that for any set of positive integers greater than 1, (e.g. {2,3,4}), the largest result comes from a multiplication. This can be shown by an enumerative technique or a contradiction argument over permitted operations + and -. e.g. (2+3)*4 = 2*4 + 3*4 < 3*4 + 3*4 = 2*(3*4). In other words, multiplication is the most "powerful" operation (except for the 1s).
Addition of the 1s to the smallest non-1 numbers: For the 1s, since multiplication is a useless operation, we are better off adding. Here again we show a complete ordering on the result of an addition. For rhetoric sake, consider again the set {2,3,4}. We note that: 2*3*(4+1) <= 2*(3+1)*4 <= (2+1)*3*4. In other words, we get the most "mileage" from a 1 by adding it to the smallest existing non-1 element in the set. Given a sorted set, this can be done in O(N^2logN).
Here's what the pseudo-code looks like:
S = input set of integers;
S.absolute();
S.sort();
//delete all the 0 elements
S.removeZeros();
//remove all 1 elements from the sorted list, and store them
ones = S.removeOnes();
//now S contains only integers > 1, in ascending order S[0] ... S[end]
for each 1 in ones:
S[0] = S[0] + 1;
S.sort();
end
max_result = Product(S);
I know I'm late to the party, but I took this on as a challenge to myself. Here is the solution I came up with.
type Operation =
| Add
| Sub
| Mult
type 'a Expr =
| Op of 'a Expr * Operation * 'a Expr
| Value of 'a
let rec eval = function
| Op (a, Add, b) -> (eval a) + (eval b)
| Op (a, Sub, b) -> (eval a) - (eval b)
| Op (a, Mult, b) -> (eval a) * (eval b)
| Value x -> x
let rec toString : int Expr -> string = function
| Op (a, Add, b) -> (toString a) + " + " + (toString b)
| Op (a, Sub, b) -> (toString a) + " - " + (toString b)
| Op (a, Mult, b) -> (toString a) + " * " + (toString b)
| Value x -> string x
let appendExpr (a:'a Expr) (o:Operation) (v:'a) =
match o, a with
| Mult, Op(x, o2, y) -> Op(x, o2, Op(y, o, Value v))
| _ -> Op(a, o, Value v)
let genExprs (xs:'a list) : 'a Expr seq =
let rec permute xs e =
match xs with
| x::xs ->
[Add; Sub; Mult]
|> Seq.map (fun o -> appendExpr e o x)
|> Seq.map (permute xs)
|> Seq.concat
| [] -> seq [e]
match xs with
| x::xs -> permute xs (Value x)
| [] -> Seq.empty
let findBest xs =
let best,result =
genExprs xs
|> Seq.map (fun e -> e,eval e)
|> Seq.maxBy snd
toString best + " = " + string result
findBest [-3; -4; 5]
returns "-3 * -4 * 5 = 60"
findBest [0; 10; -4; 0; 52; -2; -40]
returns "0 - 10 * -4 + 0 + 52 * -2 * -40 = 4200"
It should work with any type supporting comparison and the basic mathmatical operators, but FSI will constrain it to ints.

Resources