Code Golf: Countdown Number Game - algorithm

Challenge
Here is the task, inspired by the well-known British TV game show Countdown. The challenge should be pretty clear even without any knowledge of the game, but feel free to ask for clarifications.
And if you fancy seeing a clip of this game in action, check out this YouTube clip. It features the wonderful late Richard Whiteley in 1997.
You are given 6 numbers, chosen at random from the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 50, 75, 100}, and a random target number between 100 and 999. The aim is to use the six given numbers and the four common arithmetic operations (addition, subtraction, multiplication, division; all over the rational numbers) to generate the target, or to get as close as possible on either side. Each number may be used at most once, while each arithmetic operator may be used any number of times (including zero). Note that it does not matter how many numbers are used.
Write a function that takes the target number and set of 6 numbers (can be represented as list/collection/array/sequence) and returns the solution in any standard numerical notation (e.g. infix, prefix, postfix). The function must always return the closest-possible result to the target, and must run in at most 1 minute on a standard PC. Note that in the case where more than one solution exists, any single solution is sufficient.
Examples:
{50, 100, 4, 2, 2, 4}, target 203
e.g. 100 * 2 + 2 + (4 / 4) (exact)
e.g. (100 + 50) * 4 * 2 / (4 + 2) (exact)
{25, 4, 9, 2, 3, 10}, target 465
e.g. (25 + 10 - 4) * (9 * 2 - 3) (exact)
{9, 8, 10, 5, 9, 7}, target 241
e.g. (((10 + 9) * 9 * 7) + 8) / 5 (exact)
{3, 7, 6, 2, 1, 7}, target 824
e.g. (((7 * 3) - 1) * 6 - 2) * 7 (= 826; off by 2)
Rules
Other than mentioned in the problem statement, there are no further restrictions. You may write the function in any standard language (standard I/O is not necessary). The aim as always is to solve the task with the smallest number of characters of code.
Saying that, I may not simply accept the answer with the shortest code. I'll also be looking at elegance of the code and time complexity of the algorithm!
My Solution
I'm attempting an F# solution when I find the free time - will post it here when I have something!
Format
Please post all answers in the following format for the purpose of easy comparison:
Language
Number of characters: ???
Fully obfuscated function:
(code here)
Clear (ideally commented) function:
(code here)
Any notes on the algorithm/clever shortcuts it takes.

Python
Number of characters: 548 482 425 421 416 413 408
from operator import *
n=len
def C(N,T):
 R=range(1<<n(N));M=[{}for i in R];p=1
 for i in range(n(N)):M[1<<i][1.*N[i]]="%d"%N[i]
 while p:
  p=0
  for i in R:
   for j in R:
    m=M[i|j];l=n(m)
    if not i&j:m.update((f(x,y),"("+s+o+t+")")for(y,t)in M[j].items()if y for(x,s)in M[i].items() for(o,f)in zip('+-*/',(add,sub,mul,div)))
    p|=l<n(m)
 return min((abs(x-T),e)for t in M for(x,e)in t.items())[1]
you can call it like this:
>>> print C([50, 100, 4, 2, 2, 4], 203)
((((4+2)*(2+100))/4)+50)
Takes about half a minute on the given examples on an oldish PC.
Here's the commented version:
def countdown(N,T):
    # M is a map: (bitmask of used input numbers -> (expression value -> expression text))
    M=[{} for i in range(1<<len(N))]
    # initialize M with single-number expressions
    for i in range(len(N)):
        M[1<<i][1.0*N[i]] = "%d" % N[i]
    # allowed operators
    ops = (("+",lambda x,y:x+y),("-",lambda x,y:x-y),("*",lambda x,y:x*y),("/",lambda x,y:x/y))
    # enumerate all expressions
    n=0
    while 1:
        # test to see if we're done (last iteration didn't change anything)
        c=0
        for x in M: c +=len(x)
        if c==n: break
        n=c
        # loop over all values we have so far, indexed by bitmask of used input numbers
        for i in range(len(M)):
            for j in range(len(M)):
                if i & j: continue # skip if both expressions used the same input number
                for (x,s) in M[i].items():
                    for (y,t) in M[j].items():
                        if y: # avoid /0 (and +0,-0,*0 while we're at it)
                            for (o,f) in ops:
                                M[i|j][f(x,y)]="(%s%s%s)"%(s,o,t)
    # pick best expression
    L=[]
    for t in M:
        for(x,e) in t.items():
            L+=[(abs(x-T),e)]
    L.sort();return L[0][1]
It works through exhaustive enumeration of all possibilities. It is a bit smart in that if there are two expressions with the same value that use the same input numbers, it discards one of them. It is also smart in how it considers new combinations, using the index into M to prune quickly all the potential combinations that share input numbers.
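For readers on Python 3, the same idea (one value-to-expression table per bitmask of used inputs) might be sketched roughly as below. This is not the golfed answer itself: as assumptions of this sketch, it uses exact `fractions.Fraction` values instead of floats, and it fills each mask once from its submask splits rather than looping until nothing changes. The names `countdown_sketch` and `tables` are made up for the illustration.

```python
from fractions import Fraction
from operator import add, sub, mul, truediv

def countdown_sketch(numbers, target):
    """Return (value, expression) closest to target (hypothetical helper)."""
    n = len(numbers)
    # tables[mask]: value -> one expression using exactly the inputs in mask
    tables = [dict() for _ in range(1 << n)]
    for i, x in enumerate(numbers):
        tables[1 << i][Fraction(x)] = str(x)
    ops = (("+", add), ("-", sub), ("*", mul), ("/", truediv))
    # Every proper submask is numerically smaller than its mask, so when we
    # reach a mask both halves of each split are already complete.
    for mask in range(1, 1 << n):
        left = (mask - 1) & mask
        while left:
            right = mask ^ left
            for x, s in tables[left].items():
                for y, t in tables[right].items():
                    if y == 0:
                        continue  # skip /0 (and the pointless +0, -0, *0)
                    for symbol, op in ops:
                        # keep only the first expression found for each value
                        tables[mask].setdefault(op(x, y), "(%s%s%s)" % (s, symbol, t))
            left = (left - 1) & mask
    candidates = ((abs(v - target), v, e) for tab in tables for v, e in tab.items())
    _, value, expr = min(candidates)
    return value, expr
```

Calling `countdown_sketch([1, 2, 3, 4], 24)` returns an exact value of 24 together with one expression that produces it. With six inputs the tables get large, so expect it to be slow.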

Haskell
Number of characters: 361 350 338 322
Fully obfuscated function:
m=map
f=toRational
a%w=m(\(b,v)->(b,a:v))w
p[]=[];p(a:w)=(a,w):a%p w
q[]=[];q(a:w)=[((a,b),v)|(b,v)<-p w]++a%q w
z(o,p)(a,w)(b,v)=[(a`o`b,'(':w++p:v++")")|b/=0]
y=m z(zip[(-),(/),(+),(*)]"-/+*")++m flip(take 2 y)
r w=do{((a,b),v)<-q w;o<-y;c<-o a b;c:r(c:v)}
c t=snd.minimum.m(\a->(abs(fst a-f t),a)).r.m(\a->(f a,show a))
Clear function:
-- | add an element on to the front of the remainder list
onRemainder :: a -> [(b,[a])] -> [(b,[a])]
a`onRemainder`w = map (\(b,as)->(b,a:as)) w
-- | all ways to pick one item from a list, returns item and remainder of list
pick :: [a] -> [(a,[a])]
pick [] = []
pick (a:as) = (a,as) : a `onRemainder` (pick as)
-- | all ways to pick two items from a list, returns items and remainder of list
pick2 :: [a] -> [((a,a),[a])]
pick2 [] = []
pick2 (a:as) = [((a,b),cs) | (b,cs) <- pick as] ++ a `onRemainder` (pick2 as)
-- | a value, and how it was computed
type Item = (Rational, String)
-- | a specification of a binary operation
type OpSpec = (Rational -> Rational -> Rational, String)
-- | a binary operation on Items
type Op = Item -> Item -> Maybe Item
-- | turn an OpSpec into a operation
-- applies the operator to the values, and builds up an expression string
-- in this context there is no point to doing +0, -0, *0, or /0
combine :: OpSpec -> Op
combine (op,os) (ar,as) (br,bs)
    | br == 0   = Nothing
    | otherwise = Just (ar`op`br,"("++as++os++bs++")")
-- | the operators we can use
ops :: [Op]
ops = map combine [ ((+),"+"), ((-), "-"), ((*), "*"), ((/), "/") ]
        ++ map (flip . combine) [((-), "-"), ((/), "/")]
-- | recursive reduction of a list of items to a list of all possible values
-- includes values that don't use all the items, includes multiple copies of
-- some results
reduce :: [Item] -> [Item]
reduce is = do
    ((a,b),js) <- pick2 is
    op <- ops
    c <- maybe [] (:[]) $ op a b
    c : reduce (c : js)
-- | convert a list of real numbers to a list of items
items :: (Real a, Show a) => [a] -> [Item]
items = map (\a -> (toRational a, show a))
-- | return the first reduction of a list of real numbers closest to some target
countDown:: (Real a, Show a) => a -> [a] -> Item
countDown t is = snd $ minimum $ map dist $ reduce $ items is
    where dist is = (abs . subtract t' . fst $ is, is)
          t' = toRational t
Any notes on the algorithm/clever shortcuts it takes:
In the golf'd version, z returns in the list monad, rather than Maybe as ops does.
While the algorithm here is brute force, it operates in small, fixed, linear space due to Haskell's laziness. I coded the wonderful @keith-randall algorithm, but it ran in about the same time and took over 1.5G of memory in Haskell.
reduce generates some answers multiple times, in order to easily include solutions with fewer terms.
In the golf'd version, y is defined partially in terms of itself.
Results are computed with Rational values. Golf'd code would be 17 characters shorter, and faster if computed with Double.
Notice how the function onRemainder factors out the structural similarity between pick and pick2.
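For comparison, the pick-two-and-reduce recursion could be sketched in Python 3 as a generator, which keeps the lazy, low-memory flavour. This is only an illustration under a few assumptions: the names `reductions` and `count_down_sketch` are invented here, and unlike `combine` above it only refuses division by zero instead of skipping every operation with a zero right operand.

```python
from fractions import Fraction
from itertools import combinations

def reductions(items):
    """Yield every (value, text) obtainable by repeatedly merging two items."""
    for i, j in combinations(range(len(items)), 2):
        (a, sa), (b, sb) = items[i], items[j]
        rest = [it for k, it in enumerate(items) if k != i and k != j]
        merged = [
            (a + b, "(%s+%s)" % (sa, sb)),
            (a * b, "(%s*%s)" % (sa, sb)),
            (a - b, "(%s-%s)" % (sa, sb)),
            (b - a, "(%s-%s)" % (sb, sa)),
        ]
        if b != 0:
            merged.append((a / b, "(%s/%s)" % (sa, sb)))
        if a != 0:
            merged.append((b / a, "(%s/%s)" % (sb, sa)))
        for c in merged:
            yield c                            # a result that ignores `rest`
            yield from reductions([c] + rest)  # keep combining with the rest

def count_down_sketch(target, numbers):
    """Return the first (value, text) closest to target; needs >= 2 numbers."""
    items = [(Fraction(x), str(x)) for x in numbers]
    return min(reductions(items), key=lambda item: abs(item[0] - target))
```

As in `reduce`, the same value is produced many times; `min` simply keeps the first closest one it sees.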
Driver for golf'd version:
main = do
    print $ c 203 [50, 100, 4, 2, 2, 4]
    print $ c 465 [25, 4, 9, 2, 3, 10]
    print $ c 241 [9, 8, 10, 5, 9, 7]
    print $ c 824 [3, 7, 6, 2, 1, 7]
Run, with timing (still under one minute per result):
[1076] : time ./Countdown
(203 % 1,"(((((2*4)-2)/100)+4)*50)")
(465 % 1,"(((((10-4)*25)+2)*3)+9)")
(241 % 1,"(((((10*9)/5)+8)*9)+7)")
(826 % 1,"(((((3*7)-1)*6)-2)*7)")
real 2m24.213s
user 2m22.063s
sys 0m 0.913s

Ruby 1.9.2
Number of characters: 404
I give up for now, it works as long as there is an exact answer. If there isn't it takes way too long to enumerate all possibilities.
Fully Obfuscated
def b a,o,c,p,r
o+c==2*p ?r<<a :o<p ?b(a+['('],o+1,c,p,r):0;c<o ?b(a+[')'],o,c+1,p,r):0
end
w=a=%w{+ - * /}
4.times{w=w.product a}
b [],0,0,3,g=[]
*n,l=$<.read.split.map(&:to_f)
h={}
catch(0){w.product(g).each{|c,f|k=f.zip(c.flatten).each{|o|o.reverse! if o[0]=='('};n.permutation{|m|h[x=eval(d=m.zip(k)*'')]=d;throw 0 if x==l}}}
c=h[k=h.keys.min_by{|i|(i-l).abs}]
puts c.gsub(/(\d*)\.\d*/,'\1')+"=#{k}"
Decoded
Coming soon
Test script
#!/usr/bin/env ruby
[
[[50,100,4,2,2,4],203],
[[25,4,9,2,3,10],465],
[[9,8,10,5,9,7],241],
[[3,7,6,2,1,7],824]
].each do |b|
start = Time.now
puts "{[#{b[0]*', '}] #{b[1]}} gives #{`echo "#{b[0]*' '} #{b[1]}" | ruby count-golf.rb`.strip} in #{Time.now-start}"
end
Output
→ ./test.rb
{[50, 100, 4, 2, 2, 4] 203} gives 100+(4+(50-(2)/4)*2)=203.0 in 3.968534736
{[25, 4, 9, 2, 3, 10] 465} gives 2+(3+(25+(9)*10)*4)=465.0 in 1.430715549
{[9, 8, 10, 5, 9, 7] 241} gives 5+(9+(8)+10)*9-(7)=241.0 in 1.20045702
{[3, 7, 6, 2, 1, 7] 824} gives 7*(6*(7*(3)-1)-2)=826.0 in 193.040054095
Details
The function used for generating the bracket pairs (b) is based off this one: Finding all combinations of well-formed brackets

Ruby 1.9.2 second attempt
Number of characters: 492 440(426)
Again there is a problem with the non-exact answer. This time it is easily fast enough, but for some reason the closest it gets to 824 is 819 instead of 826.
I decided to put this in a new answer since it uses a very different method from my last attempt.
Removing the total from the output (as it's not required by the spec) saves 14 characters.
Fully Obfuscated
def r d,c;d>4?[0]:(k=c.pop;a=[];r(d+1,c).each{|b|a<<[b,k,nil];a<<[nil,k,b]};a)end
def f t,n;[0,2].each{|a|Array===t[a] ?f(t[a],n): t[a]=n.pop}end
def d t;Float===t ?t:d(t[0]).send(t[1],d(t[2]))end
def o c;Float===c ?c.round: "(#{o c[0]}#{c[1]}#{o c[2]})"end
w=a=%w{+ - * /}
4.times{w=w.product a}
*n,l=$<.each(' ').map(&:to_f)
h={}
w.each{|y|r(0,y.flatten).each{|t|f t,n.dup;h[d t]=o t}}
puts h[k=h.keys.min_by{|i|(l-i).abs}]+"=#{k.round}"
Decoded
Coming soon
Test script
#!/usr/bin/env ruby
[
[[50,100,4,2,2,4],203],
[[25,4,9,2,3,10],465],
[[9,8,10,5,9,7],241],
[[3,7,6,2,1,7],824]
].each do |b|
start = Time.now
puts "{[#{b[0]*', '}] #{b[1]}} gives #{`echo "#{b[0]*' '} #{b[1]}" | ruby count-golf.rb`.strip} in #{Time.now-start}"
end
Output
→ ./test.rb
{[50, 100, 4, 2, 2, 4] 203} gives ((4-((2-(2*4))/100))*50)=203 in 1.089726252
{[25, 4, 9, 2, 3, 10] 465} gives ((10*(((3+2)*9)+4))-25)=465 in 1.039455671
{[9, 8, 10, 5, 9, 7] 241} gives (7+(((9/(5/10))+8)*9))=241 in 1.045774539
{[3, 7, 6, 2, 1, 7] 824} gives ((((7-(1/2))*6)*7)*3)=819 in 1.012330419
Details
This constructs the set of ternary trees representing all possible combinations of 5 operators. It then goes through and inserts all permutations of the input numbers into the leaves of these trees. Finally it simply iterates through these possible equations storing them into a hash with the result as index. Then it's easy enough to pick the closest value to the required answer from the hash and display it.

Related

Coin change with split into two sets

I'm trying to figure out how to solve a problem that seems to be a tricky variation of a common algorithmic problem but requires additional logic to handle specific requirements.
Given a list of coins and an amount, I need to count the total number of possible ways to extract the given amount using an unlimited supply of available coins (and this is a classical change making problem https://en.wikipedia.org/wiki/Change-making_problem easily solved using dynamic programming) that also satisfy some additional requirements:
extracted coins are splittable into two sets of equal size (but not necessarily of equal sum)
the order of elements inside the set doesn't matter but the order of set does.
Examples
Amount of 6 euros and coins [1, 2]: solutions are 4
[(1,1), (2,2)]
[(1,1,1), (1,1,1)]
[(2,2), (1,1)]
[(1,2), (1,2)]
Amount of 8 euros and coins [1, 2, 6]: solutions are 7
[(1,1,2), (1,1,2)]
[(1,2,2), (1,1,1)]
[(1,1,1,1), (1,1,1,1)]
[(2), (6)]
[(1,1,1), (1,2,2)]
[(2,2), (2,2)]
[(6), (2)]
So far I have tried different approaches, but the only way I found was to collect all the possible solutions (using dynamic programming) and then filter out non-splittable solutions (with an odd number of coins) and duplicates. I'm quite sure there is a combinatorial way to calculate the total number of duplicates, but I can't figure out how.
(The following method first enumerates partitions. My other answer generates the assignments in a bottom-up fashion.) If you'd like to count splits of the coin exchange according to coin count, and exclude redundant assignments of coins to each party (for example, where splitting 1 + 2 + 2 + 1 into two parts of equal cardinality is only either (1,1) | (2,2), (2,2) | (1,1) or (1,2) | (1,2) and element order in each part does not matter), we could rely on enumeration of partitions where order is disregarded.
However, we would need to know the multiset of elements in each partition (or an aggregate of similar ones) in order to count the possibilities of dividing them in two. For example, to count the ways to split 1 + 2 + 2 + 1, we would first count how many of each coin we have:
Python code:
def partitions_with_even_number_of_parts_as_multiset(n, coins):
    results = []
    def C(m, n, s, p):
        if n < 0 or m <= 0:
            return
        if n == 0:
            if not p:
                results.append(s)
            return
        C(m - 1, n, s, p)
        _s = s[:]
        _s[m - 1] += 1
        C(m, n - coins[m - 1], _s, not p)
    C(len(coins), n, [0] * len(coins), False)
    return results
Output:
=> partitions_with_even_number_of_parts_as_multiset(6, [1,2,6])
=> [[6, 0, 0], [2, 2, 0]]
(The second entry, [2, 2, 0], represents two 1's and two 2's.)
Now since we are counting the ways to choose half of these, we need to find the coefficient of x^2 in the polynomial multiplication
(x^2 + x + 1) * (x^2 + x + 1) = ... 3x^2 ...
which represents the three ways to choose two from the multiset count [2,2]:
2,0 => 1,1
0,2 => 2,2
1,1 => 1,2
In Python, we can use numpy.polymul to multiply polynomial coefficients. Then we lookup the appropriate coefficient in the result.
For example:
import numpy
def count_split_partitions_by_multiset_count(multiset):
    coefficients = (multiset[0] + 1) * [1]
    for i in xrange(1, len(multiset)):
        coefficients = numpy.polymul(coefficients, (multiset[i] + 1) * [1])
    return coefficients[ sum(multiset) / 2 ]
Output:
=> count_split_partitions_by_multiset_count([2,2,0])
=> 3
(Posted a similar answer here.)
Here is a table implementation and a little elaboration on algrid's beautiful answer. This produces an answer for f(500, [1, 2, 6, 12, 24, 48, 60]) in about 2 seconds.
The simple declaration of C(n, k, S) = sum(C(n - s_i, k - 1, S[i:])) means adding all the ways to get to the current sum, n using k coins. Then if we split n into all ways it can be partitioned in two, we can just add all the ways each of those parts can be made from the same number, k, of coins.
The beauty of fixing the subset of coins we choose from to a diminishing list means that any arbitrary combination of coins will only be counted once - it will be counted in the calculation where the leftmost coin in the combination is the first coin in our diminishing subset (assuming we order them in the same way). For example, the arbitrary subset [6, 24, 48], taken from [1, 2, 6, 12, 24, 48, 60], would only be counted in the summation for the subset [6, 12, 24, 48, 60] since the next subset, [12, 24, 48, 60] would not include 6 and the previous subset [2, 6, 12, 24, 48, 60] has at least one 2 coin.
Python code (see it here; confirm here):
import time
def f(n, coins):
    t0 = time.time()
    min_coins = min(coins)
    m = [[[0] * len(coins) for k in xrange(n / min_coins + 1)] for _n in xrange(n + 1)]
    # Initialize base case
    for i in xrange(len(coins)):
        m[0][0][i] = 1
    for i in xrange(len(coins)):
        for _i in xrange(i + 1):
            for _n in xrange(coins[_i], n + 1):
                for k in xrange(1, _n / min_coins + 1):
                    m[_n][k][i] += m[_n - coins[_i]][k - 1][_i]
    result = 0
    for a in xrange(1, n + 1):
        b = n - a
        for k in xrange(1, n / min_coins + 1):
            result = result + m[a][k][len(coins) - 1] * m[b][k][len(coins) - 1]
    total_time = time.time() - t0
    return (result, total_time)
print f(500, [1, 2, 6, 12, 24, 48, 60])
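For small inputs, the recurrence C(n, k, S) can also be written top-down with memoization, which makes it easy to check against the examples in the question. The following Python 3 sketch is only an illustration; the names `count_equal_size_splits` and `ways` are made up here, and it assumes all coins are positive integers.

```python
from functools import lru_cache

def count_equal_size_splits(amount, coins):
    """Count ordered pairs of equal-size coin multisets summing to amount."""
    coins = tuple(sorted(coins))

    @lru_cache(maxsize=None)
    def ways(n, k, start):
        # number of multisets of exactly k coins from coins[start:] summing to n
        if k == 0:
            return 1 if n == 0 else 0
        if n <= 0:
            return 0
        return sum(ways(n - coins[i], k - 1, i)
                   for i in range(start, len(coins)) if coins[i] <= n)

    max_k = amount // coins[0]  # the smallest coin bounds the coin count
    # split the amount as a + (amount - a) and pair up equal coin counts
    return sum(ways(a, k, 0) * ways(amount - a, k, 0)
               for a in range(1, amount)
               for k in range(1, max_k + 1))
```

It gives 4 for `count_equal_size_splits(6, [1, 2])` and 7 for `count_equal_size_splits(8, [1, 2, 6])`, matching the question. The recursion depth grows with the number of coins, so the table version above is the better fit for large amounts.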

permutations without repetition

I would like to know, what is the best approach to solve this problem:
Given x, y and y integers: a1, a2, a3 .. ay find all combinations of
a1 ± a2 ± ... ± ay = x, y < 20.
My recent approach is to find all permutations of 1 and 0 stored in table T and then, depending on whether T[i] is 1 or 0, add or subtract ai from the sum. The problem is that there are n! permutations of an n-element array. Hence, for a 20-element array, I have to check 20! possibilities, most of which are repeated. Could you please suggest any approach to solving my problem?
There are only 2^20 (just over a million) binary vectors of length 20 rather than the infeasible 20!. You should be able to brute-force that few in less than a second, especially if you use a Gray code, which would allow you to pass from one candidate sum to another in a single step (e.g. to go from a + b - c - d to a + b - c + d just add 2*d).
The excellent branch and bound idea of @MikeWise would be good if y gets much larger. Generate a tree starting with a root node of 0. Give it children of -a1 and +a1. Then 4 grandchildren by adding and subtracting a2, etc. If you ever get farther than the sum of the remaining ai from the target x, you can prune that branch. In the worst case, this might be slightly worse than the Gray-code based brute force (because you need to do so much more processing at each node), but in the best case you might be able to prune away most possibilities.
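The pruning idea could be sketched in Python 3 roughly as follows. This is a minimal illustration rather than tuned code, and `signed_sums_pruned`, `remaining` and `visit` are names invented for it.

```python
def signed_sums_pruned(nums, target):
    """List all sign assignments of nums that sum to target, with pruning."""
    n = len(nums)
    # remaining[i] = sum of |nums[i:]|: the most the partial sum can still move
    remaining = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        remaining[i] = remaining[i + 1] + abs(nums[i])
    solutions = []
    signs = []

    def visit(i, partial):
        if abs(target - partial) > remaining[i]:
            return  # too far from the target to recover: prune this branch
        if i == n:
            # remaining[n] is 0, so reaching here means partial == target
            solutions.append([s * a for s, a in zip(signs, nums)])
            return
        for s in (1, -1):
            signs.append(s)
            visit(i + 1, partial + s * nums[i])
            signs.pop()

    visit(0, 0)
    return solutions
```

For `signed_sums_pruned([1, 2, 3, 4, 5, 9], 6)` this finds four sign patterns.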
On Edit: Here is some Python code. First I define a generator which, given an integer n, successively returns which bit position needs to flip to step through a Gray code:
def grayBit(n):
    code = [0]*n
    odd = True
    done = False
    while not done:
        if odd:
            code[0] = 1 - code[0] #flip bit
            odd = False
            yield 0
        else:
            i = code.index(1)
            if i == n-1:
                done = True
            else:
                code[i+1] = 1 - code[i+1]
                odd = True
                yield i+1
(This uses an algorithm which I learned years ago in the excellent book "Constructive Combinatorics" by Stanton and White).
Then -- I use this to return all solutions (as lists consisting of the input list of numbers with negative signs inserted as needed). The key point is that I can take the current bit-to-flip and either add or subtract twice the corresponding number:
def signedSums(nums, target):
    n = len(nums)
    patterns = []
    total = sum(nums)
    pattern = [1]*n
    if target == total: patterns.append([x*y for x,y in zip(nums,pattern)])
    deltas = [2*i for i in nums]
    for i in grayBit(n):
        if pattern[i] == 1:
            total -= deltas[i]
        else:
            total += deltas[i]
        pattern[i] = -1 * pattern[i]
        if target == total: patterns.append([x*y for x,y in zip(nums,pattern)])
    return patterns
Typical output:
>>> signedSums([1,2,3,4,5,9],6)
[[1, -2, -3, -4, 5, 9], [1, 2, 3, -4, -5, 9], [-1, 2, -3, 4, -5, 9], [1, 2, 3, 4, 5, -9]]
It only takes about a second to evaluate:
>>> len(signedSums([i for i in range(1,21)],100))
2865
Hence there are 2865 ways to add or subtract the integers in the range 1,2,..,20 to get a net sum of 100.
I assumed that a1 can be either added or subtracted (instead of just added, which is what your question implies if taken literally). Note that if you really want to insist that a1 occurs positively, then you could just subtract it from x and apply the above algorithm to the rest of the list and the adjusted target.
Finally, it is not too hard to see that if you solve the subset sum problem with the set of weights {2*a1, 2*a2, 2*a3, ..., 2*ay} and with a target sum of x + a1 + a2 + ... + ay, then the subsets selected will correspond exactly to the subsets where the positive signs occur in the solution to the original problem. Thus your problem is easily reducible to the subset-sum problem and it is thus NP-complete to determine if it has any solutions (and NP-hard to list them all).
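To make the correspondence concrete, here is a small Python 3 sketch. The brute-force loop over `itertools.combinations` merely stands in for whatever subset-sum solver you prefer, and the function name `signed_sums_via_subset_sum` is hypothetical.

```python
from itertools import combinations

def signed_sums_via_subset_sum(nums, target):
    """Solve the signed-sum problem through the doubled-weights subset sum."""
    weights = [2 * a for a in nums]
    goal = target + sum(nums)  # x + a1 + a2 + ... + ay
    results = []
    for size in range(len(nums) + 1):
        for chosen in combinations(range(len(nums)), size):
            if sum(weights[i] for i in chosen) == goal:
                plus = set(chosen)  # chosen indices are the ones that get '+'
                results.append([a if i in plus else -a
                                for i, a in enumerate(nums)])
    return results
```

Each subset of doubled weights that hits the goal maps back to one sign pattern, so for `[1, 2, 3, 4, 5, 9]` and target 6 it returns the same four patterns as `signedSums` (possibly in a different order).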
We have conditions:
a1 ± a2 ± ... ± ay = x, y<20 [1]
First of all, I would generalize the condition [1], allowing all 'a' including 'a1' to be ±:
±a1 ± a2 ± ... ± ay = x [2]
If we have solution for [2], we can easily get solution for [1]
To solve [2] we can use the following approach:
combinations list x
    | x == 0 && null list = [[]]
    | null list = []
    | otherwise = plusCombinations ++ minusCombinations where
        a = head list
        rest = tail list
        plusCombinations = map (\c -> a:c) $ combinations rest (x-a)
        minusCombinations = map (\c -> -a:c) $ combinations rest (x+a)
Explanation:
The first condition checks whether x has reached zero and all numbers from the list have been used. This means that a solution has been found, and we return a single solution: [[]]
The second condition checks whether the list is empty; since x is not 0 at this point, no solution can be found, so we return the empty result: []
The third branch means that we have two alternatives: to use ai with '+' or with '-', so we concatenate the plus and minus combinations
Example output:
*Main> combinations [1,2,3,4] 2
[[1,2,3,-4],[-1,2,-3,4]]
*Main> combinations [1,2,3,4] 3
[]
*Main> combinations [1,2,3,4] 4
[[1,2,-3,4],[-1,-2,3,4]]

Knuth-Morris-Pratt algorithm corner-case

In the Knuth-Morris-Pratt algorithm, when the "substring" word is a sequence of the same letter, e.g. "AAAAAAAA...", the failure table is something like this: "-1, 0, 1, 2, 3, 4, 5,...".
This means that if the text is something like "AAAAAAAB", when we reach "B" we will go back X characters and start trying to match again, although we know we should start after B.
Am I missing something?
Edit (making the example specific):
Let's say the text is: AAAAAAAABAAAAAAAAA, that is (8 As, B, 9 As) and the word we are looking for is AAAAAAAAA, that is (9 As).
The fail table will be: -1, 0, 1, 2, 3, 4, 5, 6, 7.
At some point it will be m = 0, i = 8. This will fail, so m will become m = m + i - T[8] = 0 + 8 - 7 => m = 1 and i will be T[8] = 7.
This will fail again, so now we will have m = 2, i = 6, and then m = 3, i = 5, etc.
You will go back length(needle) characters, but you will only start matching at the offset given by the failure table. In this case, if there are 7 A's and you fail on the B, T[7] will say "skip 7 characters", so you check needle[7] vs. haystack[failed-length(needle)+7] (where failed is the index of haystack where the B in the needle compared to an A). Hence it'll run in linear time, always skipping comparison for the 6 of the 7 A's that you already matched. A smarter algorithm could probably skip ahead a little more, but only constants worth of more, as it can't be better than linear.
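To see the linear behaviour on this exact example, here is a Python 3 sketch of the search using the same table convention as the question (T[0] = -1), with a counter for character comparisons. The function names and the counter are additions for illustration, not part of any library.

```python
def failure_table(word):
    """T[i] = length of the longest proper border of word[:i]; T[0] = -1."""
    table = [-1] * len(word)
    if len(word) > 1:
        table[1] = 0
    pos, cnd = 2, 0
    while pos < len(word):
        if word[pos - 1] == word[cnd]:
            cnd += 1
            table[pos] = cnd
            pos += 1
        elif cnd > 0:
            cnd = table[cnd]      # fall back to a shorter border
        else:
            table[pos] = 0
            pos += 1
    return table

def kmp_search(text, word):
    """Return (index of first match or -1, number of character comparisons)."""
    if not word:
        return 0, 0
    table = failure_table(word)
    m = i = 0                     # m: match start in text, i: index in word
    comparisons = 0
    while m + i < len(text):
        comparisons += 1
        if text[m + i] == word[i]:
            if i == len(word) - 1:
                return m, comparisons
            i += 1
        elif table[i] > -1:
            m = m + i - table[i]  # shift the word; text position m+i is kept
            i = table[i]
        else:
            m += 1
            i = 0
    return -1, comparisons
```

With the text of 8 As, B, 9 As and the word of 9 As, `kmp_search` returns `(9, 26)`: the B is compared against the word nine times in a row, but the text position never moves backwards, so the total stays below twice the text length.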

Allocate an array of integers proportionally compensating for rounding errors

I have an array of non-negative values. I want to build an array of values whose sum is 20 so that they are proportional to the first array.
This would be an easy problem, except that I want the proportional array to sum to exactly 20, compensating for any rounding error.
For example, the array
input = [400, 400, 0, 0, 100, 50, 50]
would yield
output = [8, 8, 0, 0, 2, 1, 1]
sum(output) = 20
However, most cases are going to have a lot of rounding errors, like
input = [3, 3, 3, 3, 3, 3, 18]
naively yields
output = [1, 1, 1, 1, 1, 1, 10]
sum(output) = 16 (ouch)
Is there a good way to apportion the output array so that it adds up to 20 every time?
There's a very simple answer to this question: I've done it many times. After each assignment into the new array, you reduce the values you're working with as follows:
Call the first array A, and the new, proportional array B (which starts out empty).
Call the sum of A elements T
Call the desired sum S.
For each element of the array (i) do the following:
a. B[i] = round(A[i] / T * S). (rounding to nearest integer, penny or whatever is required)
b. T = T - A[i]
c. S = S - B[i]
That's it! Easy to implement in any programming language or in a spreadsheet.
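The steps above can be sketched in Python 3 as follows. The function name `apportion` is made up for this example, and it assumes non-negative integer inputs so that the rounding can be done in exact integer arithmetic (halves round up).

```python
def apportion(values, total):
    """Scale values to integers summing to total, carrying rounding forward."""
    remaining_weight = sum(values)   # T in the steps above
    remaining_total = total          # S in the steps above
    result = []
    for a in values:
        if remaining_weight == 0:
            share = 0                # only zeros are left to process
        else:
            # round(a / T * S) to the nearest integer, done with integers
            share = (2 * a * remaining_total + remaining_weight) // (2 * remaining_weight)
        result.append(share)
        remaining_weight -= a        # T = T - A[i]
        remaining_total -= share     # S = S - B[i]
    return result
```

`apportion([3, 3, 3, 3, 3, 3, 18], 20)` gives `[2, 2, 2, 2, 2, 1, 9]`, and `apportion([400, 400, 0, 0, 100, 50, 50], 20)` gives `[8, 8, 0, 0, 2, 1, 1]`. The last non-zero element always absorbs whatever is left, which is why the sum comes out exact.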
The solution is optimal in that the resulting array's elements will never be more than 1 away from their ideal, non-rounded values. Let's demonstrate with your example:
T = 36, S = 20. B[1] = round(A[1] / T * S) = 2. (ideally, 1.666....)
T = 33, S = 18. B[2] = round(A[2] / T * S) = 2. (ideally, 1.666....)
T = 30, S = 16. B[3] = round(A[3] / T * S) = 2. (ideally, 1.666....)
T = 27, S = 14. B[4] = round(A[4] / T * S) = 2. (ideally, 1.666....)
T = 24, S = 12. B[5] = round(A[5] / T * S) = 2. (ideally, 1.666....)
T = 21, S = 10. B[6] = round(A[6] / T * S) = 1. (ideally, 1.666....)
T = 18, S = 9. B[7] = round(A[7] / T * S) = 9. (ideally, 10)
Notice that, comparing every value in B with its ideal value in parentheses, the difference is never more than 1.
It's also interesting to note that rearranging the elements in the array can result in different corresponding values in the resulting array. I've found that arranging the elements in ascending order is best, because it results in the smallest average percentage difference between actual and ideal.
Your problem is similar to proportional representation, where you want to share N seats (in your case 20) among parties proportionally to the votes they obtain, in your case [3, 3, 3, 3, 3, 3, 18].
There are several methods used in different countries to handle the rounding problem. My code below uses the Hagenbach-Bischoff quota method used in Switzerland, which basically allocates the seats remaining after an integer division by (N+1) to parties which have the highest remainder:
def proportional(nseats,votes):
    """assign n seats proportionally to votes using Hagenbach-Bischoff quota
    :param nseats: int number of seats to assign
    :param votes: iterable of int or float weighting each party
    :result: list of ints seats allocated to each party
    """
    quota=sum(votes)/(1.+nseats) #force float
    frac=[vote/quota for vote in votes]
    res=[int(f) for f in frac]
    n=nseats-sum(res) #number of seats remaining to allocate
    if n==0: return res #done
    if n<0: return [min(x,nseats) for x in res] # see siamii's comment
    #give the remaining seats to the n parties with the largest remainder
    remainders=[ai-bi for ai,bi in zip(frac,res)]
    limit=sorted(remainders,reverse=True)[n-1]
    #n parties with remainder larger than limit get an extra seat
    for i,r in enumerate(remainders):
        if r>=limit:
            res[i]+=1
            n-=1 # attempt to handle perfect equality
            if n==0: return res #done
    raise RuntimeError("seats left unallocated") #should never happen
However this method doesn't always give the same number of seats to parties with perfect equality as in your case:
proportional(20,[3, 3, 3, 3, 3, 3, 18])
[2,2,2,2,1,1,10]
You have set 3 incompatible requirements. An integer-valued array proportional to [1,1,1] cannot be made to sum to exactly 20. You must choose to break one of the "sum to exactly 20", "proportional to input", and "integer values" requirements.
If you choose to break the requirement for integer values, then use floating point or rational numbers. If you choose to break the exact sum requirement, then you've already solved the problem. Choosing to break proportionality is a little trickier. One approach you might take is to figure out how far off your sum is, and then distribute corrections randomly through the output array. For example, if your input is:
[1, 1, 1]
then you could first make it sum as well as possible while still being proportional:
[7, 7, 7]
and since 20 - (7+7+7) = -1, choose one element to decrement at random:
[7, 6, 7]
If the error was 4, you would choose four elements to increment.
A naïve solution that doesn't perform well, but will provide the right result...
Write an iterator that given an array with eight integers (candidate) and the input array, output the index of the element that is farthest away from being proportional to the others (pseudocode):
function next_index(candidate, input)
    // Calculate weights
    for i in 1 .. 8
        w[i] = candidate[i] / input[i]
    end for
    // find the smallest weight
    min = infinity
    min_index = 0
    for i in 1 .. 8
        if w[i] < min then
            min = w[i]
            min_index = i
        end if
    end for
    return min_index
end function
Then just do this
result = [0, 0, 0, 0, 0, 0, 0, 0]
result[next_index(result, input)]++ for 1 .. 20
If there is no optimal solution, it'll skew towards the beginning of the array.
Using the approach above, you can reduce the number of iterations by rounding down (as you did in your example) and then just use the approach above to add what has been left out due to rounding errors:
result = <<approach using rounding down>>
while sum(result) < 20
    result[next_index(result, input)]++
So the answers and comments above were helpful... particularly the decreasing sum comment from @Frederik.
The solution I came up with takes advantage of the fact that for an input array v, sum(v_i * 20) is divisible by sum(v). So for each value in v, I multiply by 20 and divide by the sum. I keep the quotient, and accumulate the remainder. Whenever the accumulator reaches or exceeds sum(v), I add one to the value. That way I'm guaranteed that all the remainders get rolled into the results.
Is that legible? Here's the implementation in Python:
def proportion(values, total):
    # set up by getting the sum of the values and starting
    # with an empty result list and accumulator
    sum_values = sum(values)
    new_values = []
    acc = 0
    for v in values:
        # for each value, find quotient and remainder
        q, r = divmod(v * total, sum_values)
        if acc + r < sum_values:
            # if the accumulator plus remainder is too small, just add and move on
            acc += r
        else:
            # we've accumulated enough to go over sum(values), so add 1 to result
            if acc > r:
                # add to previous
                new_values[-1] += 1
            else:
                # add to current
                q += 1
            acc -= sum_values - r
        # save the new value
        new_values.append(q)
    # accumulator is guaranteed to be zero at the end
    print new_values, sum_values, acc
    return new_values
(I added an enhancement that if the accumulator > remainder, I increment the previous value instead of the current value)

Recursive interlacing permutation

I have a program (a fractal) that draws lines in an interlaced order. Originally, given H lines to draw, it determines the number of frames N, and draws every Nth frame, then every N+1'th frame, etc.
For example, if H = 10 and N = 3, it draws them in order:
0, 3, 6, 9,
1, 4, 7,
2, 5, 8.
However, I didn't like the way bands would gradually thicken, leaving large undrawn areas in between for a long time. So the method was enhanced to recursively draw midpoint lines in each group instead of the immediately subsequent lines, for example:
0, (32) # S (step size) = 32
8, (24) # S = 16
4, (12) # S = 8
2, 6, (10) # S = 4
1, 3, 5, 7, 9. # S = 2
(The numbers in parentheses are out of range and not drawn.) The algorithm's pretty simple:
Set S to a power of 2 greater than N*2, set F = 0.
While S > 1:
    Draw frame F.
    Set F = F + S.
    If F >= H, then set S = S / 2; set F = S / 2.
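The loop above might look like this in Python 3. Two assumptions are baked into this sketch: the initial step is taken as the smallest power of 2 greater than twice the line count H (which reproduces S = 32 for H = 10 in the example), and out-of-range frames are skipped explicitly. The name `interlaced_order` is invented for the illustration.

```python
def interlaced_order(h):
    """Return the order in which lines 0..h-1 are drawn by step halving."""
    s = 1
    while s <= 2 * h:        # assumed reading: first power of 2 above 2*H
        s *= 2
    f = 0
    order = []
    while s > 1:
        if f < h:            # frames at or beyond H are out of range
            order.append(f)
        f += s
        if f >= h:
            s //= 2          # halve the step ...
            f = s // 2       # ... and restart at the new midpoint
    return order
```

`interlaced_order(10)` yields `[0, 8, 4, 2, 6, 1, 3, 5, 7, 9]`, the same order as the table above.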
When the odd-numbered frames are drawn on the last step size, they are drawn in simple order, just as in the initial (annoying) method. The same goes for every fourth frame, etc. It's not as bad because some intermediate frames have already been drawn.
But the same permutation could recursively be applied to the elements for each step size. In the example above, the last line would change to:
1, # the 0th element, S' = 16
9, # 4th, S' = 8
5, # 2nd, S' = 4
3, 7. # 1st and 3rd, S' = 2
The previous lines have too few elements for the recursion to take effect. But if N was large enough, some lines might require multiple levels of recursion. Any step size with 3 or more corresponding elements can be recursively permutated.
Question 1. Is there a common name for this permutation on N elements, that I could use to find additional material on it? I am also interested in any similar examples that may exist. I would be surprised if I'm the first person to want to do this.
Question 2. Are there some techniques I could use to compute it? I'm working in C but I'm more interested in the algorithm level at this stage; I'm happy to read code in other languages (within reason).
I have not yet tackled its implementation. I expect I will precompute the permutation first (contrary to the algorithm for the previous method, above). But I'm also interested in whether there is a simple way to get the next frame to draw without having to precompute it, similar in complexity to the previous method.
It sounds as though you're trying to construct one-dimensional low-discrepancy sequences. Your permutation can be computed by reversing the binary representation of the index.
def rev(num_bits, i):
    j = 0
    for k in xrange(num_bits):
        j = (j << 1) | (i & 1)
        i >>= 1
    return j
Example usage:
>>> [rev(4,i) for i in xrange(16)]
[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
A variant that works on general n:
def rev(n, i):
    j = 0
    while n >= 2:
        m = i & 1
        if m:
            j += (n + 1) >> 1
        n = (n + 1 - m) >> 1
        i >>= 1
    return j
>>> [rev(10,i) for i in xrange(10)]
[0, 5, 3, 8, 2, 7, 4, 9, 1, 6]
