Knuth-Morris-Pratt algorithm corner-case - algorithm

In the Knuth-Morris-Pratt algorithm when the "substring" word is a sequence of the same letter, eg. "AAAAAAAA...", the failure table it something like this: "-1, 0, 1, 2, 3, 4, 5,...".
This means if the test is something like "AAAAAAAB" When we will reach "B" we will go back X characters and start trying to match again, although we know we should start after B.
Am I missing something?
Edit (making the example specific):
Let's say the test is: AAAAAAAABAAAAAAAAA, that is (8 As, B, 9 As) and the word looking for is AAAAAAAAA, that is (9 As).
The fail table will be: -1, 0, 1, 2, 3, 4, 5, 6, 7.
At some point it will be m = 0, i = 8. This will fail, so m will become m = m + i - T[8] = 0 + 8 - 7 => m = 1 and i will be T[8] = 7.
This will fail again, so now we will have m = 2, i = 6, and then m = 3, i = 7, etc..

You will go back length(needle) characters, but you will only start matching at the offset given by the failure table. In this case, if there are 7 A's and you fail on the B, T[7] will say "skip 7 characters", so you check needle[7] vs. haystack[failed-length(needle)+7] (where failed is the index of haystack where the B in the needle compared to an A). Hence it'll run in linear time, always skipping comparison for the 6 of the 7 A's that you already matched. A smarter algorithm could probably skip ahead a little more, but only constants worth of more, as it can't be better than linear.

Related

Minimum Delete operations to empty the vector

My friend was asked this question in an interview:
We have a vector of integers consisting only of 0s and 1s. A delete consists of selecting consecutive equal numbers and removing them. The remaining parts are then attached to each other. For e.g., if the vector is [0,1,1,0] then after removing [1,1] we get [0,0]. We need one delete to remove an element from the vector, if no consecutive elements are found.
We need to write a function that returns the minimum number of deletes to make the vector empty.
Examples 1:
Input: [0,1,1,0]
Output: 2
Explanation: [0,1,1,0] -> [0,0] -> []
Examples 2:
Input: [1,0,1,0]
Output: 3
Explanation: [1,0,1,0] -> [0,1,0] -> [0,0] -> [].
Examples 3:
Input: [1,1,1]
Output: 1
Explanation: [1,1,1] -> []
I am unsure of how to solve this question. I feel that we can use a greedy approach:
Remove all consecutive equal elements and increment the delete counter for each;
Remove elements of the form <a, b, c> where a==c and a!=b, because of we had multiple consecutive bs, it would have been deleted in step (1) above. Increment the delete counter once as we delete one b.
Repeat steps (1) and (2) as long as we can.
Increment delete counter once for each of the remaining elements in the vector.
But I am not sure if this would work. Could someone please confirm if this is the right approach? If not, how do we solve this?
Hint
You can simplify this problem greatly by noticing the following fact: a chain of consecutive zeros or ones can be shortened or lengthened without changing the final solution. By example, the two vectors have the same solution:
[1, 0, 1]
[1, 0, 0, 0, 0, 0, 0, 1]
With that in mind, the solution becomes simpler. So I encourage you to pause and try to figure it out!
Solution
With the previous remark, we can reduce the problem to vectors of alternating zeros and ones. In fact, since zero and one have no special meaning here, it suffices to solve for all such vector which start by... say a one.
[] # number of steps: 0
[1] # number of steps: 1
[1, 0] # number of steps: 2
[1, 0, 1] # number of steps: 2
[1, 0, 1, 0] # number of steps: 3
[1, 0, 1, 0, 1] # number of steps: 3
[1, 0, 1, 0, 1, 0] # number of steps: 4
[1, 0, 1, 0, 1, 0, 1] # number of steps: 4
We notice a pattern, the solution seems to be floor(n / 2) + 1 for n > 1 where n is the length of those sequences. But can we prove it..?
Proof
We will proceed by induction. Suppose you have a solution for a vector of length n - 2, then any move you do (except for deleting the two characters on the edges of the vector) will have the following result.
[..., 0, 1, 0, 1, 0 ...]
^------------ delete this one
Result:
[..., 0, 1, 1, 0, ...]
But we already mentioned that a chain of consecutive zeros or ones can be shortened or lengthened without changing the final solution. So the result of the deletion is in fact equivalent to now having to solve for:
[..., 0, 1, 0, ...]
What we did is one deletion in n elements and arrived to a case which is equivalent to having to solve for n - 2 elements. So the solution for a vector of size n is...
Solution(n) = Solution(n - 2) + 1
= [floor((n - 2) / 2) + 1] + 1
= floor(n / 2) + 1
Keeping in mind that the solutions for [1] and [1, 0] are respectively 1 and 2, this concludes our proof. Notice here, that [] turns out to be an edge case.
Interestingly enough, this proof also shows us that the optimal sequence of deletions for a given vector is highly non-unique. You can simply delete any block of ones or zeros, except for the first and last ones, and you will end up with an optimal solution.
Conclusion
In conclusion, given an arbitrary vector of ones and zeros, the smallest number of deletions you will need can be computed by counting the number of groups of consecutive ones or zeros. The answer is then floor(n / 2) + 1 for n > 1.
Just for fun, here is a Python implementation to solve this problem.
from itertools import groupby
def solution(vector):
n = 0
for group in groupby(vector):
n += 1
return n // 2 + 1 if n > 1 else n
Intuition: If we remove the subsegments of one integer, then all the remaining integers are of one type leads to only one operation.
Choosing the integer which is not the starting one to remove subsegments leads to optimal results.
Solution:
Take the integer other than the one that is starting as a flag.
Count the number of contiguous segments of the flag in a vector.
The answer will be the above count + 1(one operation for removing a segment of starting integer)
So, the answer is:
answer = Count of contiguous segments of flag + 1
Example 1:
[0,1,1,0]
flag = 1
Count of subsegments with flag = 1
So, answer = 1 + 1 = 2
Example 2:
[1,0,1,0]
flag = 0
Count of subsegments with flag = 2
So, answer = 2 + 1 = 3
Example 3:
[1,1,1]
flag = 0
Count of subsegments with flag = 0
So, answer = 0 + 1 = 1

previous most recent bigger element in array

I try to solve the following problem. Given an array of real numbers [7, 2, 4, 8, 1, 1, 6, 7, 4, 3, 1] for every element I need to find most recent previous bigger element in the array.
For example there is nothing bigger then first element (7) so it has NaN. For the second element (2) 7 is bigger. So in the end the answer looks like:
[NaN, 7, 7, NaN, 8, 8, 8, 8, 7, 4, 3, 1]. Of course I can just check all the previous elements for every element, but this is quadratic in terms of the number of elements of the array.
My another approach was to maintain the sorted list of previous elements and then select the first element bigger then current. This sounds like a log linear to me (am not sure). Is there any better way to approach this problem?
Here's one way to do it
create a stack which is initially empty
for each number N in the array
{
while the stack is not empty
{
if the top item on the stack T is greater than N
{
output T (leaving it on the stack)
break
}
else
{
pop T off of the stack
}
}
if the stack is empty
{
output NAN
}
push N onto the stack
}
Taking your sample array [7, 2, 4, 8, 1, 1, 6, 7, 4, 3, 1], here's how the algorithm would solve it.
stack N output
- 7 NAN
7 2 7
7 2 4 7
7 4 8 NAN
8 1 8
8 1 1 8
8 1 6 8
8 6 7 8
8 7 4 7
8 7 4 3 4
8 7 4 3 1 3
The theory is that the stack doesn't need to keep small numbers since they will never be part of the output. For example, in the sequence 7, 2, 4, the 2 is not needed, because any number less than 2 will also be less than 4. Hence the stack only needs to keep the 7 and the 4.
Complexity Analysis
The time complexity of the algorithm can be shown to be O(n) as follows:
there are exactly n pushes (each number in the input array is
pushed onto the stack once and only once)
there are at most n pops (once a number is popped from the stack,
it is discarded)
there are at most n failed comparisons (since the number is popped
and discarded after a failed comparison)
there are at most n successful comparisons (since the algorithm
moves to the next number in the input array after a successful
comparison)
there are exactly n output operations (since the algorithm
generates one output for each number in the input array)
Hence we conclude that the algorithm executes at most 5n operations to complete the task, which is a time complexity of O(n).
We can keep for each array element the index of the its most recent bigger element. When we process a new element x, we check the previous element y. If y is greater then we found what we want. If not, we check which is the index of the most recent bigger element of y. We continue until we find our needed element and its index. Using python:
a = [7, 2, 4, 8, 1, 1, 6, 7, 4, 3, 1]
idx, result = [], []
for i, v in enumerate(a, -1):
while i >= 0 and v >= a[i]:
i = idx[i]
idx.append(i)
result.append(a[i] if i >= 0 else None)
Result:
[None, 7, 7, None, 8, 8, 8, 8, 7, 4, 3]
The algorithm is linear. When an index j is unsuccessfully checked because we are looking for the most recent bigger element of index i > j then from now on i will point to a smaller index than j and j won't be checked again.
Why not just define a variable 'current_largest' and iterate through your array from left to right? At each element, current largest is largest previous, and if the current element is larger, assign current_largest to the current element. Then move to the next element.
EDIT:
I just re-read your question and I may have misunderstood it. Do you want to find ALL larger previous elements?
EDIT2:
It seems to me like the current largest method will work. You just need to record current_largest before you assign it a new value. For example, in python:
current_largest = 0
for current_element in elements:
print("Largest previous is "+current_largest)
if(current_element>current_largest):
current_largest = current_element
If you want an array of these, then just push the value to an array in place of the print statement.
As per my best understanding of your question. Below is a solution.
Working Example : JSFIDDLE
var item = document.getElementById("myButton");
item.addEventListener("click", myFunction);
function myFunction() {
var myItems = [7, 2, 4, 8, 1, 1, 6, 7, 4, 3, 1];
var previousItem;
var currentItem;
var currentLargest;
for (var i = 0; i < myItems.length; i++) {
currentItem = myItems[i];
if (i == 0) {
previousItem = myItems[0];
currentItem = myItems[0];
myItems[i] = NaN;
}
else {
if (currentItem < previousItem) {
myItems[i] = previousItem;
currentLargest = previousItem;
}
if (currentItem > currentLargest) {
currentLargest = currentItem;
myItems[i] = NaN;
}
else {
myItems[i] = currentLargest;
}
previousItem = currentItem;
}
}
var stringItems = myItems.join(",");
document.getElementById("arrayAnswer").innerHTML = stringItems;
}

Maximum continuous achievable number

The problem
Definitions
Let's define a natural number N as a writable number (WN) for number set in M numeral system, if it can be written in this numeral system from members of U using each member no more than once. More strict definition of 'written': - here CONCAT means concatenation.
Let's define a natural number N as a continuous achievable number (CAN) for symbol set in M numeral system if it is a WN-number for U and M and also N-1 is a CAN-number for U and M (Another definition may be N is CAN for U and M if all 0 .. N numbers are WN for U and M). More strict:
Issue
Let we have a set of S natural numbers: (we are treating zero as a natural number) and natural number M, M>1. The problem is to find maximum CAN (MCAN) for given U and M. Given set U may contain duplicates - but each duplicate could not be used more than once, of cause (i.e. if U contains {x, y, y, z} - then each y could be used 0 or 1 time, so y could be used 0..2 times total). Also U expected to be valid in M-numeral system (i.e. can not contain symbols 8 or 9 in any member if M=8). And, of cause, members of U are numbers, not symbols for M (so 11 is valid for M=10) - otherwise the problem will be trivial.
My approach
I have in mind a simple algorithm now, which is simply checking if current number is CAN via:
Check if 0 is WN for given U and M? Go to 2: We're done, MCAN is null
Check if 1 is WN for given U and M? Go to 3: We're done, MCAN is 0
...
So, this algorithm is trying to build all this sequence. I doubt this part can be improved, but may be it can? Now, how to check if number is a WN. This is also some kind of 'substitution brute-force'. I have a realization of that for M=10 (in fact, since we're dealing with strings, any other M is not a problem) with PHP function:
//$mNumber is our N, $rgNumbers is our U
function isWriteable($mNumber, $rgNumbers)
{
if(in_array((string)$mNumber, $rgNumbers=array_map('strval', $rgNumbers), true))
{
return true;
}
for($i=1; $i<=strlen((string)$mNumber); $i++)
{
foreach($rgKeys = array_keys(array_filter($rgNumbers, function($sX) use ($mNumber, $i)
{
return $sX==substr((string)$mNumber, 0, $i);
})) as $iKey)
{
$rgTemp = $rgNumbers;
unset($rgTemp[$iKey]);
if(isWriteable(substr((string)$mNumber, $i), $rgTemp))
{
return true;
}
}
}
return false;
}
-so we're trying one piece and then check if the rest part could be written with recursion. If it can not be written, we're trying next member of U. I think this is a point which can be improved.
Specifics
As you see, an algorithm is trying to build all numbers before N and check if they are WN. But the only question is - to find MCAN, so, question is:
May be constructive algorithm is excessive here? And, if yes, what other options could be used?
Is there more quick way to determine if number is WN for given U and M? (this point may have no sense if previous point has positive answer and we'll not build and check all numbers before N).
Samples
U = {4, 1, 5, 2, 0}
M = 10
then MCAN = 2 (3 couldn't be reached)
U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11}
M = 10
then MCAN = 21 (all before could be reached, for 22 there are no two 2 symbols total).
Hash the digit count for digits from 0 to m-1. Hash the numbers greater than m that are composed of one repeated digit.
MCAN is bound by the smallest digit for which all combinations of that digit for a given digit count cannot be constructed (e.g., X000,X00X,X0XX,XX0X,XXX0,XXXX), or (digit count - 1) in the case of zero (for example, for all combinations of four digits, combinations are needed for only three zeros; for a zero count of zero, MCAN is null). Digit counts are evaluated in ascending order.
Examples:
1. MCAN (10, {4, 1, 5, 2, 0})
3 is the smallest digit for which a digit-count of one cannot be constructed.
MCAN = 2
2. MCAN (10, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11})
2 is the smallest digit for which a digit-count of two cannot be constructed.
MCAN = 21
3. (from Alma Do Mundo's comment below) MCAN (2, {0,0,0,1,1,1})
1 is the smallest digit for which all combinations for a digit-count of four
cannot be constructed.
MCAN = 1110
4. (example from No One in Particular's answer)
MCAN (2, {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1111,11111111})
1 is the smallest digit for which all combinations for a digit-count of five
cannot be constructed.
MCAN = 10101
The recursion steps I've made are:
If the digit string is available in your alphabet, mark it used and return immediately
If the digit string is of length 1, return failure
Split the string in two and try each part
This is my code:
$u = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11];
echo ncan($u), "\n"; // 21
// the functions
function satisfy($n, array $u)
{
if (!empty($u[$n])) { // step 1
--$u[$n];
return $u;
} elseif (strlen($n) == 1) { // step 2
return false;
}
// step 3
for ($i = 1; $i < strlen($n); ++$i) {
$u2 = satisfy(substr($n, 0, $i), $u);
if ($u2 && satisfy(substr($n, $i), $u2)) {
return true;
}
}
return false;
}
function is_can($n, $u)
{
return satisfy($n, $u) !== false;
}
function ncan($u)
{
$umap = array_reduce($u, function(&$result, $item) {
#$result[$item]++;
return $result;
}, []);
$i = -1;
while (is_can($i + 1, $umap)) {
++$i;
}
return $i;
}
Here is another approach:
1) Order the set U with regards to the usual numerical ordering for base M.
2) If there is a symbol between 0 and (M-1) which is missing, then that is the first number which is NOT MCAN.
3) Find the fist symbol which has the least number of entries in the set U. From this we have an upper bound on the first number which is NOT MCAN. That number would be {xxxx} N times. For example, if M = 4 and U = { 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3}, then the number 333 is not MCAN. This gives us our upper bound.
4) So, if the first element of the set U which has the small number of occurences is x and it has C occurences, then we can clearly represent any number with C digits. (Since every element has at least C entries).
5) Now we ask if there is any number less than (C+1)x which can't be MCAN? Well, any (C+1) digit number can have either (C+1) of the same symbol or only at most (C) of the same symbol. Since x is minimal from step 3, (C+1)y for y < x can be done and (C)a + b can be done for any distinct a, b since they have (C) copies at least.
The above method works for set elements of only 1 symbol. However, we now see that it becomes more complex if multi-symbol elements are allowed. Consider the following case:
U = { 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1111,11111111}
Define c(A,B) = the number of 'A' symbols of 'B' length.
So for our example, c(0,1) = 15, c(0,2) = 0, c(0,3) = 0, c(0,4) = 0, ...
c(1,1) = 3, c(1,2) = 0, c(1,3) = 0, c(1,4) = 1, c(0,5) = 0, ..., c(1,8) = 1
The maximal 0 string we can't do is 16. The maximal 1 string we can't do is also 16.
1 = 1
11 = 1+1
111 = 1+1+1
1111 = 1111
11111 = 1+1111
111111 = 1+1+1111
1111111 = 1+1+1+1111
11111111 = 11111111
111111111 = 1+11111111
1111111111 = 1+1+11111111
11111111111 = 1+1+1+11111111
111111111111 = 1111+11111111
1111111111111 = 1+1111+11111111
11111111111111 = 1+1+1111+11111111
111111111111111 = 1+1+1+1111+11111111
But can we make the string 11111101111? We can't because the last 1 string (1111) needs the only set of 1's with the 4 in a row. Once we take that, we can't make the first 1 string (111111) because we only have an 8 (which is too big) or 3 1-lengths which are too small.
So for multi-symbols, we need another approach.
We know from sorting and ordering our strings what is the minimum length we can't do for a given symbol. (In the example above, it would be 16 zeros or 16 ones.) So this is our upper bound for an answer.
What we have to do now is start a 1 and count up in base M. For each number we write it in base M and then determine if we can make it from our set U. We do this by using the same approach used in the coin change problem: dynamic programming. (See for example http://www.geeksforgeeks.org/dynamic-programming-set-7-coin-change/ for the algorithm.) The only difference is that in our case we only have finite number of each elements, not an infinite supply.
Instead of subtracting the amount we are using like in the coin change problem, we strip the matching symbol off of the front of the string we are trying to match. (This is the opposite of our addition - concatenation.)

Recursive interlacing permutation

I have a program (a fractal) that draws lines in an interlaced order. Originally, given H lines to draw, it determines the number of frames N, and draws every Nth frame, then every N+1'th frame, etc.
For example, if H = 10 and N = 3, it draws them in order:
0, 3, 6, 9,
1, 4, 7,
2, 5, 8.
However I didn't like the way bands would gradually thicken, leaving large areas between undrawn for a long time. So the method was enhanced to recursively draw midpoint lines in each group instead of the immediately sebsequent lines, for example:
0, (32) # S (step size) = 32
8, (24) # S = 16
4, (12) # S = 8
2, 6, (10) # S = 4
1, 3, 5, 7, 9. # S = 2
(The numbers in parentheses are out of range and not drawn.) The algorithm's pretty simple:
Set S to a power of 2 greater than N*2, set F = 0.
While S > 1:
Draw frame F.
Set F = F + S.
If F >= H, then set S = S / 2; set F = S / 2.
When the odd numbered frames are drawn on the last step size, they are drawn in simple order just as an the initial (annoying) method. The same with every fourth frame, etc. It's not as bad because some intermediate frames have already been drawn.
But the same permutation could recursively be applied to the elements for each step size. In the example above, the last line would change to:
1, # the 0th element, S' = 16
9, # 4th, S' = 8
5, # 2nd, S' = 4
3, 7. # 1st and 3rd, S' = 2
The previous lines have too few elements for the recursion to take effect. But if N was large enough, some lines might require multiple levels of recursion. Any step size with 3 or more corresponding elements can be recursively permutated.
Question 1. Is there a common name for this permutation on N elements, that I could use to find additional material on it? I am also interested in any similar examples that may exist. I would be surprised if I'm the first person to want to do this.
Question 2. Are there some techniques I could use to compute it? I'm working in C but I'm more interested at the algorithm level at this stage; I'm happy to read code other language (within reason).
I have not yet tackled its implemention. I expect I will precompute the permutation first (contrary to the algorithm for the previous method, above). But I'm also interested if there is a simple way to get the next frame to draw without having to precomputing it, similar in complexity to the previous method.
It sounds as though you're trying to construct one-dimensional low-discrepancy sequences. Your permutation can be computed by reversing the binary representation of the index.
def rev(num_bits, i):
j = 0
for k in xrange(num_bits):
j = (j << 1) | (i & 1)
i >>= 1
return j
Example usage:
>>> [rev(4,i) for i in xrange(16)]
[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
A variant that works on general n:
def rev(n, i):
j = 0
while n >= 2:
m = i & 1
if m:
j += (n + 1) >> 1
n = (n + 1 - m) >> 1
i >>= 1
return j
>>> [rev(10,i) for i in xrange(10)]
[0, 5, 3, 8, 2, 7, 4, 9, 1, 6]

Code Golf: Countdown Number Game

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Challenge
Here is the task, inspired by the well-known British TV game show Countdown. The challenge should be pretty clear even without any knowledge of the game, but feel free to ask for clarifications.
And if you fancy seeing a clip of this game in action, check out this YouTube clip. It features the wonderful late Richard Whitely in 1997.
You are given 6 numbers, chosen at random from the set {1, 2, 3, 4, 5, 6, 8, 9, 10, 25, 50, 75, 100}, and a random target number between 100 and 999. The aim is to use the six given numbers and the four common arithmetic operations (addition, subtraction, multiplication, division; all over the rational numbers) to generate the target - or as close as possible either side. Each number may only be used once at most, while each arithmetic operator may be used any number of times (including zero.) Note that it does not matter how many numbers are used.
Write a function that takes the target number and set of 6 numbers (can be represented as list/collection/array/sequence) and returns the solution in any standard numerical notation (e.g. infix, prefix, postfix). The function must always return the closest-possible result to the target, and must run in at most 1 minute on a standard PC. Note that in the case where more than one solution exists, any single solution is sufficient.
Examples:
{50, 100, 4, 2, 2, 4}, target 203
e.g. 100 * 2 + 2 + (4 / 4) (exact)
e.g. (100 + 50) * 4 * 2 / (4 + 2) (exact)
{25, 4, 9, 2, 3, 10}, target 465
e.g. (25 + 10 - 4) * (9 * 2 - 3) (exact)
{9, 8, 10, 5, 9, 7}, target 241
e.g. ((10 + 9) * 9 * 7) + 8) / 5 (exact)
{3, 7, 6, 2, 1, 7}, target 824
e.g. ((7 * 3) - 1) * 6 - 2) * 7 (= 826; off by 2)
Rules
Other than mentioned in the problem statement, there are no further restrictions. You may write the function in any standard language (standard I/O is not necessary). The aim as always is to solve the task with the smallest number of characters of code.
Saying that, I may not simply accept the answer with the shortest code. I'll also be looking at elegance of the code and time complexity of the algorithm!
My Solution
I'm attempting an F# solution when I find the free time - will post it here when I have something!
Format
Please post all answers in the following format for the purpose of easy comparison:
Language
Number of characters: ???
Fully obfuscated function:
(code here)
Clear (ideally commented) function:
(code here)
Any notes on the algorithm/clever shortcuts it takes.
Python
Number of characters: 548 482 425 421 416 413 408
from operator import *
n=len
def C(N,T):
R=range(1<<n(N));M=[{}for i in R];p=1
for i in range(n(N)):M[1<<i][1.*N[i]]="%d"%N[i]
while p:
p=0
for i in R:
for j in R:
m=M[i|j];l=n(m)
if not i&j:m.update((f(x,y),"("+s+o+t+")")for(y,t)in M[j].items()if y for(x,s)in M[i].items() for(o,f)in zip('+-*/',(add,sub,mul,div)))
p|=l<n(m)
return min((abs(x-T),e)for t in M for(x,e)in t.items())[1]
you can call it like this:
>>> print C([50, 100, 4, 2, 2, 4], 203)
((((4+2)*(2+100))/4)+50)
Takes about half a minute on the given examples on an oldish PC.
Here's the commented version:
def countdown(N,T):
# M is a map: (bitmask of used input numbers -> (expression value -> expression text))
M=[{} for i in range(1<<len(N))]
# initialize M with single-number expressions
for i in range(len(N)):
M[1<<i][1.0*N[i]] = "%d" % N[i]
# allowed operators
ops = (("+",lambda x,y:x+y),("-",lambda x,y:x-y),("*",lambda x,y:x*y),("/",lambda x,y:x/y))
# enumerate all expressions
n=0
while 1:
# test to see if we're done (last iteration didn't change anything)
c=0
for x in M: c +=len(x)
if c==n: break
n=c
# loop over all values we have so far, indexed by bitmask of used input numbers
for i in range(len(M)):
for j in range(len(M)):
if i & j: continue # skip if both expressions used the same input number
for (x,s) in M[i].items():
for (y,t) in M[j].items():
if y: # avoid /0 (and +0,-0,*0 while we're at it)
for (o,f) in ops:
M[i|j][f(x,y)]="(%s%s%s)"%(s,o,t)
# pick best expression
L=[]
for t in M:
for(x,e) in t.items():
L+=[(abs(x-T),e)]
L.sort();return L[0][1]
It works through exhaustive enumeration of all possibilities. It is a bit smart in that if there are two expressions with the same value that use the same input numbers, it discards one of them. It is also smart in how it considers new combinations, using the index into M to prune quickly all the potential combinations that share input numbers.
Haskell
Number of characters: 361 350 338 322
Fully obfuscated function:
m=map
f=toRational
a%w=m(\(b,v)->(b,a:v))w
p[]=[];p(a:w)=(a,w):a%p w
q[]=[];q(a:w)=[((a,b),v)|(b,v)<-p w]++a%q w
z(o,p)(a,w)(b,v)=[(a`o`b,'(':w++p:v++")")|b/=0]
y=m z(zip[(-),(/),(+),(*)]"-/+*")++m flip(take 2 y)
r w=do{((a,b),v)<-q w;o<-y;c<-o a b;c:r(c:v)}
c t=snd.minimum.m(\a->(abs(fst a-f t),a)).r.m(\a->(f a,show a))
Clear function:
-- | add an element on to the front of the remainder list
onRemainder :: a -> [(b,[a])] -> [(b,[a])]
a`onRemainder`w = map (\(b,as)->(b,a:as)) w
-- | all ways to pick one item from a list, returns item and remainder of list
pick :: [a] -> [(a,[a])]
pick [] = []
pick (a:as) = (a,as) : a `onRemainder` (pick as)
-- | all ways to pick two items from a list, returns items and remainder of list
pick2 :: [a] -> [((a,a),[a])]
pick2 [] = []
pick2 (a:as) = [((a,b),cs) | (b,cs) <- pick as] ++ a `onRemainder` (pick2 as)
-- | a value, and how it was computed
type Item = (Rational, String)
-- | a specification of a binary operation
type OpSpec = (Rational -> Rational -> Rational, String)
-- | a binary operation on Items
type Op = Item -> Item -> Maybe Item
-- | turn an OpSpec into a operation
-- applies the operator to the values, and builds up an expression string
-- in this context there is no point to doing +0, -0, *0, or /0
combine :: OpSpec -> Op
combine (op,os) (ar,as) (br,bs)
| br == 0 = Nothing
| otherwise = Just (ar`op`br,"("++as++os++bs++")")
-- | the operators we can use
ops :: [Op]
ops = map combine [ ((+),"+"), ((-), "-"), ((*), "*"), ((/), "/") ]
++ map (flip . combine) [((-), "-"), ((/), "/")]
-- | recursive reduction of a list of items to a list of all possible values
-- includes values that don't use all the items, includes multiple copies of
-- some results
reduce :: [Item] -> [Item]
reduce is = do
((a,b),js) <- pick2 is
op <- ops
c <- maybe [] (:[]) $ op a b
c : reduce (c : js)
-- | convert a list of real numbers to a list of items
items :: (Real a, Show a) => [a] -> [Item]
items = map (\a -> (toRational a, show a))
-- | return the first reduction of a list of real numbers closest to some target
countDown:: (Real a, Show a) => a -> [a] -> Item
countDown t is = snd $ minimum $ map dist $ reduce $ items is
where dist is = (abs . subtract t' . fst $ is, is)
t' = toRational t
Any notes on the algorithm/clever shortcuts it takes:
In the golf'd version, z returns in the list monad, rather than Maybe as ops does.
While the algorithm here is brute force, it operates in small, fixed, linear space due to Haskell's laziness. I coded the wonderful #keith-randall algorithm, but it ran in about the same time and took over 1.5G of memory in Haskell.
reduce generates some answers multiple times, in order to easily include solutions with fewer terms.
In the golf'd version, y is defined partially in terms of itself.
Results are computed with Rational values. Golf'd code would be 17 characters shorter, and faster if computed with Double.
Notice how the function onRemainder factors out the structural similarity between pick and pick2.
Driver for golf'd version:
main = do
print $ c 203 [50, 100, 4, 2, 2, 4]
print $ c 465 [25, 4, 9, 2, 3, 10]
print $ c 241 [9, 8, 10, 5, 9, 7]
print $ c 824 [3, 7, 6, 2, 1, 7]
Run, with timing (still under one minute per result):
[1076] : time ./Countdown
(203 % 1,"(((((2*4)-2)/100)+4)*50)")
(465 % 1,"(((((10-4)*25)+2)*3)+9)")
(241 % 1,"(((((10*9)/5)+8)*9)+7)")
(826 % 1,"(((((3*7)-1)*6)-2)*7)")
real 2m24.213s
user 2m22.063s
sys 0m 0.913s
Ruby 1.9.2
Number of characters: 404
I give up for now, it works as long as there is an exact answer. If there isn't it takes way too long to enumerate all possibilities.
Fully Obfuscated
def b a,o,c,p,r
o+c==2*p ?r<<a :o<p ?b(a+['('],o+1,c,p,r):0;c<o ?b(a+[')'],o,c+1,p,r):0
end
w=a=%w{+ - * /}
4.times{w=w.product a}
b [],0,0,3,g=[]
*n,l=$<.read.split.map(&:to_f)
h={}
catch(0){w.product(g).each{|c,f|k=f.zip(c.flatten).each{|o|o.reverse! if o[0]=='('};n.permutation{|m|h[x=eval(d=m.zip(k)*'')]=d;throw 0 if x==l}}}
c=h[k=h.keys.min_by{|i|(i-l).abs}]
puts c.gsub(/(\d*)\.\d*/,'\1')+"=#{k}"
Decoded
Coming soon
Test script
#!/usr/bin/env ruby
[
[[50,100,4,2,2,4],203],
[[25,4,9,2,3,10],465],
[[9,8,10,5,9,7],241],
[[3,7,6,2,1,7],824]
].each do |b|
start = Time.now
puts "{[#{b[0]*', '}] #{b[1]}} gives #{`echo "#{b[0]*' '} #{b[1]}" | ruby count-golf.rb`.strip} in #{Time.now-start}"
end
Output
→ ./test.rb
{[50, 100, 4, 2, 2, 4] 203} gives 100+(4+(50-(2)/4)*2)=203.0 in 3.968534736
{[25, 4, 9, 2, 3, 10] 465} gives 2+(3+(25+(9)*10)*4)=465.0 in 1.430715549
{[9, 8, 10, 5, 9, 7] 241} gives 5+(9+(8)+10)*9-(7)=241.0 in 1.20045702
{[3, 7, 6, 2, 1, 7] 824} gives 7*(6*(7*(3)-1)-2)=826.0 in 193.040054095
Details
The function used for generating the bracket pairs (b) is based off this one: Finding all combinations of well-formed brackets
Ruby 1.9.2 second attempt
Number of characters: 492 440(426)
Again there is a problem with the non-exact answer. This time this is easily fast enough but for some reason the closest it gets to 824 is 819 instead of 826.
I decided to put this in a new answer since it is using a very different method to my last attempt.
Removing the total of the output (as its not required by spec) is -14 characters.
Fully Obfuscated
def r d,c;d>4?[0]:(k=c.pop;a=[];r(d+1,c).each{|b|a<<[b,k,nil];a<<[nil,k,b]};a)end
def f t,n;[0,2].each{|a|Array===t[a] ?f(t[a],n): t[a]=n.pop}end
def d t;Float===t ?t:d(t[0]).send(t[1],d(t[2]))end
def o c;Float===c ?c.round: "(#{o c[0]}#{c[1]}#{o c[2]})"end
w=a=%w{+ - * /}
4.times{w=w.product a}
*n,l=$<.each(' ').map(&:to_f)
h={}
w.each{|y|r(0,y.flatten).each{|t|f t,n.dup;h[d t]=o t}}
puts h[k=h.keys.min_by{|i|(l-i).abs}]+"=#{k.round}"
Decoded
Coming soon
Test script
#!/usr/bin/env ruby
[
[[50,100,4,2,2,4],203],
[[25,4,9,2,3,10],465],
[[9,8,10,5,9,7],241],
[[3,7,6,2,1,7],824]
].each do |b|
start = Time.now
puts "{[#{b[0]*', '}] #{b[1]}} gives #{`echo "#{b[0]*' '} #{b[1]}" | ruby count-golf.rb`.strip} in #{Time.now-start}"
end
Output
→ ./test.rb
{[50, 100, 4, 2, 2, 4] 203} gives ((4-((2-(2*4))/100))*50)=203 in 1.089726252
{[25, 4, 9, 2, 3, 10] 465} gives ((10*(((3+2)*9)+4))-25)=465 in 1.039455671
{[9, 8, 10, 5, 9, 7] 241} gives (7+(((9/(5/10))+8)*9))=241 in 1.045774539
{[3, 7, 6, 2, 1, 7] 824} gives ((((7-(1/2))*6)*7)*3)=819 in 1.012330419
Details
This constructs the set of ternary trees representing all possible combinations of 5 operators. It then goes through and inserts all permutations of the input numbers into the leaves of these trees. Finally it simply iterates through these possible equations storing them into a hash with the result as index. Then it's easy enough to pick the closest value to the required answer from the hash and display it.

Resources