Improvement of the Greedy Algorithm - algorithm

I've been working on an abstract chess algorithm using Haskell (trying to expand my understanding of different paradigms), and I've hit a challenge that I've been pondering about for weeks.
Here's the problem:
Given a board (represented by a list of lists of integers; each
integer represents a subsequent point value), with dimensions n x n,
determine the path that provides the most points. If there is a tie
for best path, return either of them.
Here are the specifics:
A = [[5,4,3,1],[10,2,1,0],[0,1,2,0],[2,3,4,20]]
which renders as:
R1: 5 4 3 1, R2: 10 2 1 0, R3: 0 1 2 0, R4: 2 3 4 20.
The rules are:
You may start anywhere on the top row
You may move one square at a time, either straight down, down-left (diagonal), or down-right (diagonal).
The output must be a tuple.
The first element is a list giving the column visited in each row, and the second element is the total number of points. E.g., for the above board, the best solution is to start at the top-left (5) and move diagonally down-right for the remaining steps (until the 20-point square). This results in the tuple ([1,2,3,4], 29).
Remember, this is all in Haskell, so it is a functional-paradigm recursive problem. At first, I was thinking about using the greedy algorithm, that is, choosing the highest value in r1 and then recursing by comparing the next 3 possibilities and choosing the highest of the 3. However, the drawback is that the greedy algorithm can't see potential beyond the next row.
How would I go about this? I'm not looking for code per se, since I enjoy solving things on my own. However, pseudocode or some algorithmic guidance would be much appreciated!

I saw your previous question on the same topic and started working on it.
Since you don't want a direct solution, I can share my thoughts on the problem; I hope they help.
Some basic properties:
1. The number of moves always equals the length of the list: m = length A
2. The number of starting points equals the length of the head of the list: n = length (head A)
3. The current position can never be negative, so:
- if the current position is 0, you can go either down or right
- otherwise, you can go left, down, or right
This leads us to this pseudocode:
generate_path :: [[Int]] -> [[Int]]
generate_path [] = [[]]
generate_path A = ... -- You have to put something here
  where
    m = length A
    n = length (head A)
It should look something like this:
move pos0 count0
  | count0 == 0 =
  | pos0 == 0 = move (down count) ++ move (right count)
  | otherwise = move (left count) ++ move (down count) ++ move (right count)
  where
    count = count0 - 1
    down = pos0
    left = pos0 - 1
    right = pos0 + 1
In fact, keeping all of this in mind and adding the (!!) operator, we shouldn't be far from the solution. To convince yourself, play with A, list comprehensions, and (!!), for example:
[A !! x !! y | x <- [1..2], y <- [0..2]] -- arbitrary ranges
Or play with another version:
[[A !! x !! y | x <- [1..2]] | y <- [0..2]] -- arbitrary ranges
In fact, you have two nested recursions. The outer one works on n = length (head A): you repeat the same action for each starting column from 0 to (n-1) and collect the results. It embeds an inner recursion that works on m, repeating the same action from 0 to (m-1).
Hope it helps.
Good luck.

Keep a list of the paths to each column in the row just reached with the highest score to that cell.
You'd start (in your example), with the list
[([1],5), ([2],4), ([3],3), ([4],1)]
Then, when checking the next row, for each column, you pick the path with the highest score in the previous row that can reach that column. Here, for the second row, in columns 1 and 2 you'd pick the path ending in column 1 on the row above; in column 3, the path ending in column 2; and in column 4, the path ending in column 3. That gives you
[([1,1],15), ([1,2],7), ([2,3],5), ([3,4],3)]
for the third row, [0,1,2,0], you'd again pick the path ending in column 1 for the first two columns, the path ending in column 2 for the third, and the path ending in column 3 for the fourth,
[([1,1,1],15), ([1,1,2],16), ([1,2,3],9), ([2,3,4],5)]
for the fourth row, [2,3,4,20], you'd pick the path ending in column 2 for the first three columns, and the path ending in column 3 for the last,
[([1,1,2,1],18), ([1,1,2,2],19), ([1,1,2,3],20), ([1,2,3,4],29)]
Then, when you've reached the last row, you pick the path with the highest total.
Why it works:
Let the highest-scoring path end in column c of the last row. The part of it above the last row must be the highest-scoring path ending in one of the columns c-1, c, c+1 on the penultimate row, since column c in the last row can only be reached from those.
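The row-by-row bookkeeping described above can be sketched in Haskell as follows. This is only a minimal sketch, not the answerer's code: the names Path, neighbours, step and bestPath are made up for illustration, the board is assumed to be rectangular, and columns are numbered from 1 as in the example.

```haskell
import Data.List (foldl', maximumBy)
import Data.Ord (comparing)

-- (columns visited so far, 1-based; running total)
type Path = ([Int], Int)

-- For each position i, the elements at positions i-1, i and i+1 that exist.
-- Quadratic in the row length, which keeps the sketch short.
neighbours :: [a] -> [[a]]
neighbours xs =
  [ [x | (j, x) <- zip [0 ..] xs, abs (i - j) <= 1]
  | i <- [0 .. length xs - 1]
  ]

-- Extend the best paths into the previous row by one more row of values.
step :: [Path] -> [Int] -> [Path]
step prev row = zipWith3 extend [1 ..] row (neighbours prev)
  where
    extend c v reachable =
      let (cols, total) = maximumBy (comparing snd) reachable
       in (cols ++ [c], total + v)

-- Best path over the whole board, or Nothing for an empty board.
bestPath :: [[Int]] -> Maybe Path
bestPath [] = Nothing
bestPath ([] : _) = Nothing
bestPath (r : rs) = Just (maximumBy (comparing snd) (foldl' step start rs))
  where
    start = zipWith (\c v -> ([c], v)) [1 ..] r
```

For the board in the question, bestPath should give Just ([1,2,3,4],29), and a single call to step on the first-row list reproduces the second-row list shown above.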

The best solution is not a greedy algorithm from the top down, but rather an approach that starts with the last row and works up:
import Data.Function
import Data.List
-- All elements of Board are lists of equal lengths
-- valid b = 1 == length (group (map length b))
type Value = Int
type Board = [[Value]]
type Index = Int
type Result = ([Index], Value)
p :: Board
p = [[5,4,3,1],[10,2,1,0],[0,1,2,0],[2,3,4,20]]
best_from :: Board -> Result
best_from [] = undefined
best_from xs | any null xs = undefined
best_from b = best_of . best_list $ b
best_list :: Board -> [Result]
best_list b = foldr1 layer (map label b)
  where label = zipWith (\index value -> ([index],value)) [1..]
        layer new rest = zipWith (\(i1,v1) (i2,v2) -> (i1++i2, v1+v2)) new best
          where temp = head rest : map best_pair (zip rest (tail rest))
                best = map best_pair (zip temp (tail rest)) ++ [last temp]
best_pair :: (Result,Result) -> Result
best_pair (a@(_,a1), b@(_,b1)) | a1 >=b1 = a
                               | otherwise = b
best_of :: [Result] -> Result
best_of = maximumBy (compare `on` snd)
main = do
  print (best_from p)
It is easy to solve if there is one row. So this converts each row into a list of Result with a simple [#] solution path.
Given the rest for the puzzle below a new row, adding the new row is a matter of finding the best solution from rest (by checking down, down-left, down-right) and combining it with the new row.
This makes foldr, or here foldr1 the natural structure.

I chose a different path, no pun intended. I listed the allowed index combinations and mapped the board to them. Perhaps someone can find a way to generalize it to a board of any size.
import Data.List
import Data.Ord
import Data.Maybe
a = [[5,4,3,1],[10,2,1,0],[0,1,2,0],[2,3,4,20]]
r1 = a !! 0
r2 = a !! 1
r3 = a !! 2
r4 = a !! 3
i = [0,1,2,3]
index_combinations = [[a,b,c,d] | a <- i, b <- i, c <- i, d <- i,
                      abs (b-a) < 2, abs (c-b) < 2, abs (d-c) < 2]
mapR xs = [r1 !! (xs !! 0), r2 !! (xs !! 1),
           r3 !! (xs !! 2), r4 !! (xs !! 3)]
r_combinations = map mapR index_combinations
r_combinations_summed = zip r_combinations $ map (foldr (+) 0) r_combinations
result = maximumBy (comparing snd) r_combinations_summed
path = index_combinations !! fromJust (elemIndex result r_combinations_summed)
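One possible way to generalize this enumeration to a board of any size is to build the index combinations recursively instead of writing one generator per row. The sketch below is an illustration under assumptions, not part of the answer above: allPaths and bruteForce are hypothetical names, columns are 0-based as in the code above, and the board is assumed to be rectangular.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Every column sequence (0-based, one column per row) in which
-- neighbouring rows differ by at most one column.
allPaths :: Int -> Int -> [[Int]]
allPaths rows cols
  | rows <= 0 = [[]]
  | otherwise = concatMap (pathsFrom rows) [0 .. cols - 1]
  where
    -- Paths covering k rows that start in column c.
    pathsFrom 1 c = [[c]]
    pathsFrom k c =
      [ c : rest
      | c' <- [c - 1, c, c + 1]
      , c' >= 0
      , c' < cols
      , rest <- pathsFrom (k - 1) c'
      ]

-- Score every path and keep the best; Nothing if there is no path at all.
bruteForce :: [[Int]] -> Maybe ([Int], Int)
bruteForce board =
  case [(p, sum (zipWith (!!) board p)) | p <- allPaths (length board) width] of
    [] -> Nothing
    scored -> Just (maximumBy (comparing snd) scored)
  where
    width = case board of
      [] -> 0
      (r : _) -> length r
```

The number of paths grows roughly like 3 to the power of the number of rows, so this is only practical for small boards.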

Related

Algorithm to find list given dot product and another list

I need to write a function findL that takes a list L1 of integers and a desired dot product n, and returns a list L2 of nonnegative integers such that L1 · L2 = n. (By "dot product" I mean the sum of the pairwise products; for example, [1,2] · [3,4] = 1·3+2·4 = 11.)
So, for example, findL(11, [1,2]) might return SOME [3,4]. If there's no possible list, I return NONE.
I'm using a functional language. (Specifically Standard ML, but the exact language isn't so important, I'm just trying to think of an FP algorithm.) What I have written so far:
Let's say I have findL(n, L1):
1. If L1 = [], I return NONE.
2. If L1 = [x] (a list of length 1):
   if (n >= 0 and x > 0 and n mod x = 0), return SOME [n div x]
   else return NONE
3. If L1 has length greater than 1, I recurse on findL (n, L[1:]). If that returns a list L2, I return [1] concatenated to L2. If the recursive call returns NONE, I make another recursive call, findL (0, L[1:]), and prepend [n div x] to the result if it isn't NONE. This works on many inputs but fails on others.
I need to change part 3, but I'm not sure if I have the right idea. I would appreciate any tips!
Unless you need to say that empty lists in the input are always bad (even n = 0 with the list []), I'd recommend returning something different for an empty list based on whether you've reached 0 at the end (everything has been subtracted away) or not, then recurse when receiving any nonempty list rather than special-casing a one-element list.
As far as step three, you need to test every possible positive integer multiple of the first element of your input list until they exceed n, not just the first and last. The first non-None value you get is good enough, so you just prepend the multiplier (not the multiple) to the return list. If everything gives you Nones, you return None.
I don't know SML, but here's how I'd do it in Haskell:
import Data.Maybe (isJust, listToMaybe)
-- Find linear combinations of positive integers
solve :: Integer -> [Integer] -> Maybe [Integer]
-- If we've made it to the end with zero left, good!
solve 0 [] = Just []
-- Otherwise, this way isn't the way to go.
solve _ [] = Nothing
-- If one of the elements of the input list is zero, just multiply that element by one.
solve n (0:xs) = case solve n xs of
                   Nothing -> Nothing
                   Just ys -> Just (1:ys)
solve n (x:xs) = listToMaybe                     -- take first solution if it exists
               . map (\ (m, Just ys) -> m:ys)    -- put multiplier at front of list
               . filter (isJust . snd)           -- remove nonsolutions
               . zip [1 ..]                      -- tuple in the multiplier
               . map (\ m -> solve (n - m) xs)   -- use each multiple
               $ [x, x + x .. n]                 -- the multiples of x up to n
Here it is solving 11 with [1, 2] and 1 with [1, 2].
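The code above only tries positive multipliers, while the question asks for nonnegative ones. A variant that also allows a multiplier of 0 might look like the sketch below. It is an illustration rather than the answerer's code: solveNonNeg is a hypothetical name, and every element of the input list is assumed to be nonnegative.

```haskell
import Data.Maybe (listToMaybe)

-- Find nonnegative multipliers ms with sum (zipWith (*) xs ms) == n.
-- Assumes every element of the input list is a nonnegative integer.
solveNonNeg :: Integer -> [Integer] -> Maybe [Integer]
solveNonNeg 0 [] = Just []
solveNonNeg _ [] = Nothing
solveNonNeg n (x : xs)
  | n < 0 = Nothing -- nothing nonnegative can sum to a negative target
  | x == 0 = (0 :) <$> solveNonNeg n xs -- any multiplier works for 0; pick 0
  | otherwise =
      listToMaybe
        [ m : ys
        | m <- [0 .. n `div` x] -- every multiplier that does not overshoot n
        , Just ys <- [solveNonNeg (n - m * x) xs]
        ]
```

With this variant, a target of 1 with the list [1, 2] has the solution [1, 0], which a search restricted to positive multipliers cannot find.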

Not able to sort a list properly in ocaml

So I'm trying to sort this list of integers so that all the even numbers are in the front and the odds are all in the back. I have my program now, which works for the most part, but it keeps reversing the order of my odd numbers, which I don't want it to do. E.g. given the input [1;2;3;4;5;6] I would like to get [2;4;6;1;3;5], but I'm getting [2;4;6;5;3;1]. Any help is greatly appreciated!
let rec evens (xl:int list) (odd:int list) : int list =
  match xl with
  | [] -> []
  | h::t ->
    if h mod 2 = 0
    then (h)::evens t odd
    else
      evens t odd@[(h)]
The main part of your current code parses like this:
if h mod 2 = 0 then
  h :: (evens t odd)
else
  (evens t odd) @ [h]
It says this: if the next number h is even, sort out the rest of the list, then add h to the front. If the next number h is odd, sort out the rest of the list, then add h to the end. So it follows that the odd numbers will be reversed at the end.
It's worth noting that your parameter named odd is always passed along unchanged, and hence will always be an empty list (or whatever you pass as the second parameter of evens).
When I first looked at your code, I assumed you were planning to accumulate the odd numbers in the odd parameter. If you want to do that, you need to make two changes. First you need to rewrite like this:
if h mod 2 = 0 then
  h :: evens t odd
else
  evens t (odd @ [h])
The precedence rules of OCaml require the parentheses if you want to add h to the odd parameter. Your current code adds h to the returned result of evens (as above).
This rewrite will accumulate the odd numbers, in order, in the odd parameter.
Then you need to actually use the odd parameter at the end of the recursion. I.e., you need to use it when xl is empty.
The standard library has a neat solution to your problem.
List.partition (fun x -> x mod 2 = 0) [1;2;3;4;5;6]
- : int list * int list = ([2; 4; 6], [1; 3; 5])
The partition function splits your list into a tuple of two lists:
The list of elements that validate a predicate;
The list of elements that don't.
All you have to do is combine those lists together.
let even_first l =
  let evens, odds = List.partition (fun x -> x mod 2 = 0) l in
  evens @ odds
If you want to make it more generic, let the predicate be an argument:
let order_by_predicate ~f l =
  let valid, invalid = List.partition f l in
  valid @ invalid

Is a list comprehension or a sequential filter more optimized?

Let's say you have to return the sum of all the multiples of 2 and 3 in a set of integers from 1-100. In Haskell, the code I would write would look something like this:
sum ([x*2 | x<-[1..100], x*2 < 100] `union` [x*3 | x<-[1..100], x*3 < 100])
This uses 2 list comprehensions with a union. Another solution would be to step through each item in the list and evaluate it (using a modulus), then add it to a separate list, which you would later add together.
Both of these solutions come out with the same answer, but which one is more optimized if you had to do the same for, say, a list from 1..1000000?
The answer to the original question is 3317 if you want to create your own algorithm.
If you are looking for performance, you can simplify this problem to the point where you don't even need a computer....
Numbers divisible by 2 or 3 fall into a pattern
0 (1) 2 3 4 (5).... 6 (7) 8 9 10 (11).... etc
or
TFTTTF.... TFTTTF....
Assume that the max bound is divisible by 6 (if not, you can just choose the highest multiple of 6 below the real bound and add the remaining few values by hand). Let maxBound=6*N.
For each block n = 0, 1, ..., N-1 of six consecutive numbers, you add the following values
6*n, 0, 6*n+2, 6*n+3, 6*n+4, 0
which sums to
24*n+9
so all you need to do is sum up
sum from n=0 to N-1 of (24*n+9)
=24*(sum from n=0 to N-1 of n) + 9*N
=24*N*(N-1)/2 + 9*N
=12*N^2-3*N
so a very fast Haskell program that would solve this problem would look something like this
f maxBound = 12*n^2-3*n + remainingStuff
  where n = maxBound `quot` 6
        remainingStuff = sum $ filter (<= maxBound) [6*n, 6*n+2, 6*n+3, 6*n+4]
The union function is a "quadratic" algorithm, so using one list comprehension will be faster.
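As an illustration of the single-comprehension approach, here is a minimal sketch. The name sumMultiples and the choice to exclude the limit itself (to match the strict `< 100` test in the question) are assumptions made for this example.

```haskell
-- Sum of all multiples of 2 or 3 strictly below `limit`,
-- using one pass over the range and a modulus test.
sumMultiples :: Int -> Int
sumMultiples limit =
  sum [x | x <- [1 .. limit - 1], x `mod` 2 == 0 || x `mod` 3 == 0]
```

sumMultiples 100 gives 3317, the value quoted in the question. Each candidate is tested once, so the work is linear in the size of the range and no duplicates have to be removed afterwards.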
A better way which is useful for generating these kinds of sequences is to take advantage of the fact that they are ordered and merge them together with a function like:
merge :: [Int] -> [Int] -> [Int]
merge as [] = as
merge [] bs = bs
merge as@(a:at) bs@(b:bt) =
  case compare a b of
    LT -> a : merge at bs
    EQ -> a : merge at bt
    GT -> b : merge as bt
and then generate your sequence with:
[ x | x <- merge [2,4..100] [3,6..100] ]
One last tip for writing combinatorial loops... replace expressions like x <- [1..100], 2*x < 100 with x <- [1..49], or if you can't compute the upper bound explicitly, use x <- takeWhile (\x -> 2*x < 100) [1..100]. The latter form generates only as many items as needed.
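A small sketch of the takeWhile form follows. The name evensBelow is made up for illustration, and the source list is taken to be the infinite [1 ..] rather than [1..100], which is an assumption that works because takeWhile stops at the first element failing the test.

```haskell
-- Even numbers strictly below `limit`, generated without scanning
-- candidates that are known to be too large.
evensBelow :: Int -> [Int]
evensBelow limit = [2 * x | x <- takeWhile (\x -> 2 * x < limit) [1 ..]]
```

Because the predicate is monotone (once 2*x reaches the limit it stays above it), stopping at the first failure loses nothing; a plain guard in the comprehension would keep testing every remaining candidate.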

Wine Tasting problem

I spent almost all of the competition time (3 h) trying to solve this problem, in vain :( Maybe you could help me find the solution.
A group of Facebook employees just had a very successful product launch. To celebrate, they have decided to go wine tasting. At the vineyard, they decide to play a game. One person is given some glasses of wine, each containing a different wine. Every glass of wine is labelled to indicate the kind of wine the glass contains. After tasting each of the wines, the labelled glasses are removed and the same person is given glasses containing the same wines, but unlabelled. The person then needs to determine which of the unlabelled glasses contains which wine. Sadly, nobody in the group can tell wines apart, so they just guess randomly. They will always guess a different type of wine for each glass. If they get enough right, they win the game. You must find the number of ways that the person can win, modulo 1051962371.
Input
The first line of the input is the number of test cases, N. The next N lines each contain a test case, which consists of two integers, G and C, separated by a single space. G is the total number of glasses of wine and C is the minimum number that the person must correctly identify to win.
Constraints
N = 20
1 ≤ G ≤ 100
1 ≤ C ≤ G
Output
For each test case, output a line containing a single integer, the number of ways that the person can win the game modulo 1051962371.
Example input
5
1 1
4 2
5 5
13 10
14 1
Example output
1
7
1
651
405146859
Here's one that doesn't need prior knowledge of Rencontres numbers. (Well, it's basically a proof of a formula from the wiki, but I thought I'd share it anyway.)
First find f(n): the number of permutations of n elements that don't have a fixed point. It's simple by inclusion-exclusion formula: the number of permutations that fix k given points is (n-k)!, and these k points can be chosen in C(n,k) ways. So, f(n) = n! - C(n,1)(n-1)! + C(n,2)(n-2)! - C(n,3)(n-3)! + ...
Now find the number of permutations that have exactly k fixed points. These points can be chosen in C(n,k) ways and the rest n-k points can be rearranged in f(n-k) ways. So, it's C(n,k)f(n-k).
Finally, the answer to the problem is the sum of C(g,k)f(g-k) over k = c, c+1, ..., g.
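The three steps above translate fairly directly into code. The following is a minimal sketch using exact Integer arithmetic, which is fine for G up to 100; the names factorial, choose, derangements and waysToWin are chosen for this example and do not come from the answer.

```haskell
modulus :: Integer
modulus = 1051962371

factorial :: Integer -> Integer
factorial n = product [1 .. n]

-- C(n,k), the number of ways to choose k of n points.
choose :: Integer -> Integer -> Integer
choose n k = factorial n `div` (factorial k * factorial (n - k))

-- f(n): permutations of n elements without a fixed point,
-- by inclusion-exclusion over the k points forced to be fixed.
derangements :: Integer -> Integer
derangements n = sum [(-1) ^ k * choose n k * factorial (n - k) | k <- [0 .. n]]

-- Ways to identify at least c of g glasses correctly, modulo `modulus`.
waysToWin :: Integer -> Integer -> Integer
waysToWin g c =
  sum [choose g k * derangements (g - k) | k <- [c .. g]] `mod` modulus
```

On the example input from the problem statement this reproduces the example output, for instance waysToWin 4 2 is 7 and waysToWin 13 10 is 651.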
My solution involved the use of Rencontres Numbers.
A Rencontres Number D(n,k) is the number of permutations of n elements where exactly k elements are in their original places. The problem asks for at least k elements, so I just took the sum over k, k+1,...,n.
Here's my Python submission (after cleaning up):
from sys import stdin, stderr, setrecursionlimit as recdepth
from math import factorial as fact
recdepth(100000)
MOD=1051962371
cache=[[-1 for i in xrange(101)] for j in xrange(101)]
def ncr(n,k):
    return fact(n)/fact(k)/fact(n-k)
def D(n,k):
    if cache[n][k]==-1:
        if k==0:
            if n==0:
                cache[n][k]=1
            elif n==1:
                cache[n][k]=0
            else:
                cache[n][k]= (n-1)*(D(n-1,0)+D(n-2,0))
        else:
            cache[n][k]=ncr(n,k)*D(n-k,0)
        return cache[n][k]
    return cache[n][k]
def answer(total, match):
    return sum(D(total,i) for i in xrange(match,total+1))%MOD
if __name__=='__main__':
    cases=int(stdin.readline())
    for case in xrange(cases):
        stderr.write("case %d:\n"%case)
        G,C=map(int,stdin.readline().split())
        print answer(G,C)
Like everyone else, I computed the function that I now know is Rencontres Numbers, but I derived the recursive equation myself in the contest. Without loss of generality, we simply assume the correct labels of wines are 1, 2, .., g, i.e., not permuted at all.
Let's denote the function as f(g,c). Given g glasses, we look at the first glass, and we could either label it right, or label it wrong.
If we label it right, we reduce the problem to getting c-1 right out of g-1 glasses, i.e., f(g-1, c-1).
If we label it wrong, we have g-1 choices for the first glass. For the remaining g-1 glasses, we must get c glasses correct, but this subproblem is different from the f we're computing, because out of the g-1 glasses, there's already a mismatching glass. To be more precise, for the first glass, our answer is j instead of the correct label 1. Let's assume there's another function h that computes it for us.
So we have f(g,c) = f(g-1,c-1) + (g-1) * h(g-1, c).
Now to compute h(g,c), we need to consider two cases at the jth glass.
If we label it 1, we reduce the problem to f(g-1,c).
If we label it k, we have g-1 choices, and the problem is reduced to h(g-1,c).
So we have h(g,c) = f(g-1,c) + (g-1) * h(g-1,c).
Here's the complete program in Haskell, with memoization and some debugging support.
import Control.Monad
import Data.MemoTrie
--import Debug.Trace
trace = flip const
add a b = mod (a+b) 1051962371
mul a b = mod (a*b) 1051962371
main = do
  (_:input) <- liftM words getContents
  let map' f [] = []
      map' f (a:c:xs) = f (read a) (read c) : map' f xs
  mapM print $ map' ans input
ans :: Integer -> Integer -> Integer
ans g c = foldr add 0 $ map (f g) [c..g]
memoF = memo2 f
memoH = memo2 h
-- Exactly c correct in g
f :: Integer -> Integer -> Integer
f g c = trace ("f " ++ show (g,c) ++ " = " ++ show x) x
  where x = if c < 0 || g < c then 0
            else if g == c then 1
            else add (memoF (g-1) (c-1)) (mul (g-1) (memoH (g-1) c))
-- There's one mismatching position in g positions
h :: Integer -> Integer -> Integer
h g c = trace ("h " ++ show (g,c) ++ " = " ++ show x) x
  where x = if c < 0 || g < c then 0
            else add (memoF (g-1) c) (mul (g-1) (memoH (g-1) c))

variant of pascal's triangle in haskell - problem with lazy evaluation

To solve a problem, I need to compute a variant of Pascal's triangle, which is defined like this:
f(1,1) = 1,
f(n,k) = f(n-1,k-1) + f(n-1,k) + 1 for 1 <= k < n,
f(n,0) = 0,
f(n,n) = 2*f(n-1,n-1) + 1.
For n given I want to efficiently get the n-th line (f(n,1) .. f(n,n)). One further restriction: f(n,k) should be -1 if it would be >= 2^32.
My implementation:
import Data.Int (Int64)
-- Bound taken from the restriction stated above (assumed to be what `limit` held)
limit :: Int64
limit = 2^32
next :: [Int64] -> [Int64]
next list@(x:_) = x+1 : takeWhile (/= -1) (nextRec list)
nextRec (a:rest@(b:_)) = boundAdd a b : nextRec rest
nextRec [a] = [boundAdd a a]
boundAdd x y
  | x < 0 || y < 0 = -1
  | x + y + 1 >= limit = -1
  | otherwise = (x+y+1)
-- start should be [1]
fLine d start = until ((== d) . head) next start
The problem: for very large numbers I get a stack overflow. Is there a way to force Haskell to evaluate the whole list? It's clear that each line can't contain more elements than an upper bound, because they eventually become -1 and don't get stored, and each line only depends on the previous one. Due to lazy evaluation, only the head of each line is computed until the last line needs its second element, and all the thunks along the way are stored...
I have a very efficient implementation in c++ but I am really wondering if there is a way to get it done in haskell, too.
Works for me: What Haskell implementation are you using? A naive program to calculate this triangle works fine for me in GHC 6.10.4. I can print the 1000th row just fine:
nextRow :: [Integer] -> [Integer]
nextRow row = 0 : [a + b + 1 | (a, b) <- zip row (tail row ++ [last row])]
tri = iterate nextRow [0]
main = putStrLn $ show $ tri !! 1000 -- print 1000th row
I can even print the first 10 numbers in row 100000 without overflowing the stack. I'm not sure what's going wrong for you. The global name tri might be keeping the whole triangle of results alive, but even if it is, that seems relatively harmless.
How to force order of evaluation: You can force thunks to be evaluated in a certain order using the Prelude function seq (which is a magic function that can't be implemented in terms of Haskell's other basic features). If you tell Haskell to print a `seq` b, it first evaluates the thunk for a, then evaluates and prints b.
Note that seq is shallow: it only does enough evaluation to force a to no longer be a thunk. If a is of a tuple type, the result might still be a tuple of thunks. If it's a list, the result might be a cons cell having thunks for both the head and the tail.
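The shallowness can be seen in a tiny experiment. In the sketch below (all names are made up for illustration), the components are undefined and would crash if they were ever evaluated, yet seq succeeds because it stops at the outermost constructor.

```haskell
-- A pair and a list whose components would crash if they were ever evaluated.
pairOfThunks :: (Int, Int)
pairOfThunks = (undefined, undefined)

consOfThunks :: [Int]
consOfThunks = undefined : undefined

-- `seq` only evaluates as far as the tuple constructor or the first cons cell,
-- so neither expression touches an `undefined` component.
pairSurvives :: Bool
pairSurvives = pairOfThunks `seq` True

consSurvives :: Bool
consSurvives = consOfThunks `seq` True
```

Both pairSurvives and consSurvives evaluate to True. Forcing a component directly, as in fst pairOfThunks `seq` True, would raise an exception instead.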
It seems like you shouldn't need to do this for such a simple problem; a few thousand thunks shouldn't be too much for any reasonable implementation. But it would go like this:
-- Evaluate a whole list of thunks before calculating `result`.
-- This returns `result`.
seqList :: [b] -> a -> a
seqList lst result = foldr seq result lst
-- Exactly the same as `nextRow`, but compute every element of `row`
-- before calculating any element of the next row.
nextRow' :: [Integer] -> [Integer]
nextRow' row = row `seqList` nextRow row
tri = iterate nextRow' [0]
The fold in seqList basically expands to lst!!0 `seq` lst!!1 `seq` lst!!2 `seq` ... `seq` result.
This is much slower for me when printing just the first 10 elements of row 100,000. I think that's because it requires computing 99,999 complete rows of the triangle.

Resources