Calculating slot machine payout - algorithm

A slot machine has 5 reels and displays 3 symbols per reel (no spaces or "empty" symbols).
Payout can occur in a number of ways. Some examples...
A special diamond symbol appears
3 Lucky 7's appear
All five symbols in the payline are identical
All five symbols are the same number, but different color
Etc.
There are also multiple paylines that need to be checked for a payout.
What is the most efficient way to calculate winnings for each spin? Or, is there a more efficient way than brute force to apply each payout scenario to each payline?

Every payout besides the paylines seems trivial. For the three lucky 7s, just iterate over the visible squares and count the 7s. Same for checking for a diamond. If we let h be the number of rows and w be the number of columns, this operation is O(h * w), which for practically sized slot machines is pretty low.
The paylines, though, are more interesting. Theoretically the number of paylines (m from here on out) can be much, much larger than h * w; before throwing out illegal paylines that jump, m = h^w. More importantly, they appear to share a lot of similarity. For example, line 2 and line 6 in your example both require matching top left and top middle-left squares. If these two don't match, then you can't win on either line 2 or line 6.
To represent paylines, I'm going to use length w arrays of integers in the range [1, h], such that payline[i] = the row index (1-indexed) that the payline occupies in column i. For example, payline 1 is [1, 1, 1, 1, 1], and payline 17 is [3, 3, 2, 1, 2].
To this end, a suffix tree seems like an applicable data structure that can vastly improve your running time of checking all of the paylines against a given board state. Consider the following algorithm to construct a suffix tree that encapsulates all paylines.
Initialize:
    Create a root node at column 0 (off-screen, non-column part of all solutions)
    root node.length = 0
    root node.terminal = false
    Add all paylines (in the form of length w arrays of integers ranging from 1 to h) to the root node's "toDistribute" set
    Create a toWork queue, add the root node to it
Iterate: while toWork not empty:
    let node n = toWork.pop()
    if n.length < w
        create children of n with length n.length + 1 and terminal = (n.length + 1 == w).
        for payline p in n.toDistribute
            remove p from n.toDistribute
            if (p.length > 1)
                add p.subArray(1, end) to child of n as applicable.
        add children of n to toWork
Running this construction algorithm on your example for lines 1-11 gives a tree that looks like this:
The computation of this tree is fairly intensive; it involves the creation of sum i = 1 to w of h ^ i nodes. The size of the tree depends only on the size of the board (height and width), not the number of paylines, which is the main advantage of this approach. Another advantage is that it's all pre-processing; you can have this tree built long before a player ever sits down to pull the lever.
Once the tree is built, you can give each node a field for each match criterion (same symbol, same color, etc). Then, when evaluating a board state, you can dfs the tree, and at every new node, ask (for each criterion) if it matches its parent node. If so, mark that criterion as true and continue. Else, mark it as false and do not search the children for that criterion. For example, if you're looking specifically for identical tokens on the sub array [1, 1, ...] and find that column 1's row 1 and column 2's row 1 don't match, then none of the paylines that include [1, 1, ...] (2, 6, 16, 20) can be won, and you don't have to dfs that part of the tree.
It's hard to have a thorough algorithmic analysis of how much more efficient this dfs approach is than individually checking each payline, because such an analysis would require knowing how much left-side overlap (on average) there is between paylines. It's certainly no worse, and at least for your example is a good deal better. Moreover, the more paylines you add to a board, the greater the overlap, and the greater the time savings for checking all paylines by using this method.
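As a rough illustration of the pruning idea, here is a minimal Python sketch. It uses a plain prefix tree (trie) keyed on row indices and, unlike the construction above, only creates nodes for paylines that actually exist. The names build_payline_trie and winning_lines, the 0-indexed rows, the board[column][row] layout and the single "identical symbols" criterion are all assumptions made for the example.

```python
def build_payline_trie(paylines):
    """Build a prefix tree over paylines (lists of 0-indexed rows, one per column)."""
    root = {"children": {}, "lines": []}
    for index, line in enumerate(paylines):
        node = root
        for row in line:
            node = node["children"].setdefault(row, {"children": {}, "lines": []})
        node["lines"].append(index)  # payline `index` ends at this node
    return root


def winning_lines(trie, board):
    """Return indices of paylines whose symbols are all identical.

    board[col][row] is the symbol shown in that cell (assumed layout).
    """
    wins = []

    def dfs(node, col, symbol):
        wins.extend(node["lines"])
        for row, child in node["children"].items():
            current = board[col][row]
            # Prune: once a prefix stops matching, no payline below it can win.
            if symbol is None or current == symbol:
                dfs(child, col + 1, current)

    dfs(trie, 0, None)
    return wins
```

Paylines that share a left-hand prefix share trie nodes, so a mismatch in the first two columns rejects every such payline with a single comparison.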

In order to calculate RTP you need the full slot machine definition. The most important part is the reel strips. A Monte Carlo simulation is usually run to gather the statistics needed. For example: https://raw.githubusercontent.com/VelbazhdSoftwareLLC/BugABoomSimulator/master/Main.cs
Paytable info:
private static int[][] paytable = {
new int[]{0,0,0,0,0,0,0,0,0,0,0,0,0},
new int[]{0,0,0,0,0,0,0,0,0,0,0,0,0},
new int[]{0,0,0,0,0,0,0,0,2,2,2,10,2},
new int[]{5,5,5,10,10,10,15,15,25,25,50,250,5},
new int[]{25,25,25,50,50,50,75,75,125,125,250,2500,0},
new int[]{125,125,125,250,250,250,500,500,750,750,1250,10000,0},
};
Betting lines:
private static int[][] lines = {
new int[]{1,1,1,1,1},
new int[]{0,0,0,0,0},
new int[]{2,2,2,2,2},
new int[]{0,1,2,1,0},
new int[]{2,1,0,1,2},
new int[]{0,0,1,2,2},
new int[]{2,2,1,0,0},
new int[]{1,0,1,2,1},
new int[]{1,2,1,0,1},
new int[]{1,0,0,1,0},
new int[]{1,2,2,1,2},
new int[]{0,1,0,0,1},
new int[]{2,1,2,2,1},
new int[]{0,2,0,2,0},
new int[]{2,0,2,0,2},
new int[]{1,0,2,0,1},
new int[]{1,2,0,2,1},
new int[]{0,1,1,1,0},
new int[]{2,1,1,1,2},
new int[]{0,2,2,2,0},
};
Reel strips:
private static int[][] baseReels = {
new int[]{0,4,11,1,3,2,5,9,0,4,2,7,8,0,5,2,6,10,0,5,1,3,9,4,2,7,8,0,5,2,6,9,0,5,2,4,10,0,5,1,7,9,2,5},
new int[]{4,1,11,2,7,0,9,5,1,3,8,4,2,6,12,4,0,3,1,8,4,2,6,0,10,4,1,3,2,12,4,0,7,1,8,2,4,0,9,1,6,2,8,0},
new int[]{1,7,11,5,1,7,8,6,0,3,12,4,1,6,9,5,2,7,10,1,3,2,8,1,3,0,9,5,1,3,10,6,0,3,8,7,1,6,12,3,2,5,9,3},
new int[]{5,2,11,3,0,6,1,5,12,2,4,0,10,3,1,7,3,2,11,5,4,6,0,5,12,1,3,7,2,4,8,0,3,6,1,4,12,2,5,7,0,4,9,1},
new int[]{7,0,11,4,6,1,9,5,10,2,7,3,8,0,4,9,1,6,5,10,2,8,3},
};
private static int[][] freeReels = {
new int[]{2,4,11,0,3,7,1,4,8,2,5,6,0,5,9,1,3,7,2,4,10,0,3,1,8,4,2,5,6,0,4,1,10,5,2,3,7,0,5,9,1,3,6},
new int[]{4,2,11,0,5,2,12,1,7,0,9,2,3,0,12,2,4,0,5,8,2,6,0,12,2,7,1,3,10,6,0},
new int[]{1,4,11,2,7,8,1,5,12,0,3,9,1,7,8,1,5,12,2,6,10,1,4,9,3,1,8,0,12,6,9},
new int[]{6,4,11,2,7,3,9,1,6,5,12,0,4,10,2,3,8,1,7,5,12,0},
new int[]{3,4,11,0,6,5,3,8,1,7,4,9,2,5,10,0,3,8,1,4,10,2,5,9},
};
The spin function, which should be called many times in order to calculate RTP:
private static void spin(int[][] reels) {
for (int i = 0, r, u, d; i < view.Length && i < reels.Length; i++) {
if (bruteForce == true) {
u = reelsStops [i];
r = u + 1;
d = u + 2;
} else {
u = prng.Next (reels [i].Length);
r = u + 1;
d = u + 2;
}
r = r % reels[i].Length;
d = d % reels[i].Length;
view[i][0] = reels[i][u];
view[i][1] = reels[i][r];
view[i][2] = reels[i][d];
}
}
After each spin all wins should be calculated.
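To make the spin-then-evaluate loop concrete, here is a minimal Python sketch. It is not the simulator's own code: the function names are made up, and the win rule is an assumption (count identical symbols from the leftmost reel along a line and look the result up as paytable[count][symbol], which is how the shape of the paytable above reads). Wilds, scatters and free spins are ignored.

```python
import random


def spin(reels, rng, rows=3):
    """Pick a random stop on every reel and return the visible window.

    view[col][row] holds `rows` consecutive symbols, wrapping around the strip.
    """
    view = []
    for strip in reels:
        stop = rng.randrange(len(strip))
        view.append([strip[(stop + k) % len(strip)] for k in range(rows)])
    return view


def line_win(view, line, paytable):
    """Payout for one betting line under the assumed left-to-right rule."""
    first = view[0][line[0]]
    count = 1
    for col in range(1, len(line)):
        if view[col][line[col]] != first:
            break
        count += 1
    return paytable[count][first]


def estimate_rtp(reels, lines, paytable, spins, seed=0):
    """Monte Carlo estimate: total won divided by total bet (1 unit per line)."""
    rng = random.Random(seed)
    won = 0
    for _ in range(spins):
        view = spin(reels, rng)
        won += sum(line_win(view, line, paytable) for line in lines)
    return won / (spins * len(lines))
```

The estimate only converges slowly, so a large number of spins (or a full enumeration of reel stops, as the bruteForce flag in the C# code suggests) is needed for a stable figure.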


Checking the validity of a pyramid of dominoes

I came across this question in a coding interview and couldn't figure out a good solution.
You are given 6 dominoes. A domino has 2 halves each with a number of spots. You are building a 3-level pyramid of dominoes. The bottom level has 3 dominoes, the middle level has 2, and the top has 1.
The arrangement is such that each level is positioned over the center of the level below it. Here is a visual:
[ 3 | 4 ]
[ 2 | 3 ] [ 4 | 5 ]
[ 1 | 2 ][ 3 | 4 ][ 5 | 6 ]
The pyramid must be set up such that the number of spots on each domino half should be the same as the number on the half beneath it. This doesn't apply to neighboring dominoes on the same level.
Is it possible to build a pyramid from 6 dominoes in the arrangement described above? Dominoes can be freely arranged and rotated.
Write a function that takes an array of 12 ints (such that arr[0], arr[1] are the first domino, arr[2], arr[3] are the second domino, etc.) and return "YES" or "NO" if it is possible or not to create a pyramid with the given 6 dominoes.
Thank you.
You can do better than brute-forcing. I don't have the time for a complete answer. So this is more like a hint.
Count the number of occurrences of each number. It should be at least 3 for at least two numbers and so on. If these conditions are not met, there is no solution. In the next steps, you need to consider the positioning of numbers on the tiles.
Just iterate every permutation and check each one. If you find a solution, then you can stop and return "YES". If you get through all permutations then return "NO". There are 6 positions and each domino has 2 rotations, so a total of 12*10*8*6*4*2 = 46080 permutations. Half of these are mirrors of each other so we're doubling our necessary workload, but I don't think that's going to trouble the user. I'd fix the domino orientations, then iterate through all the position permutations, then iterate the orientation permutations and repeat.
So I'd present the algorithm as:
For each permutation of domino orientations
    For each permutation of domino positions
        if arr[0] == arr[3] && arr[1] == arr[4] && arr[2] == arr[7] && arr[3] == arr[8] && arr[4] == arr[9] && arr[5] == arr[10] then return "YES"
return "NO"
At that point I'd ask the interviewer where they wanted to go from there. We could look at optimisations, equivalences, implementations or move on to something else.
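A minimal Python sketch of that brute force, assuming the input layout from the question (12 ints, two per domino) and the same index checks as the pseudocode above. The function name can_build_pyramid is made up for the example.

```python
from itertools import permutations, product


def can_build_pyramid(arr):
    """Return "YES" if the six dominoes in `arr` can form a valid pyramid."""
    dominoes = [(arr[i], arr[i + 1]) for i in range(0, 12, 2)]
    for order in permutations(dominoes):                 # 6! position choices
        for flips in product((False, True), repeat=6):   # 2^6 orientations
            a = []
            for (x, y), flip in zip(order, flips):
                a.extend((y, x) if flip else (x, y))
            # a[0:2] is the top row, a[2:6] the middle row, a[6:12] the bottom row.
            if (a[0] == a[3] and a[1] == a[4]
                    and a[2] == a[7] and a[3] == a[8]
                    and a[4] == a[9] and a[5] == a[10]):
                return "YES"
    return "NO"
```

With at most 46080 candidate layouts this finishes quickly, so the pruning ideas in the other answers matter mainly for larger pyramids.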
We can formulate a recursive solution:
valid_row:
    if row_index < N - 1:
        copy of row must exist two rows below
    if row_index > 2:
        matching left and right must exist
        on the row above, around a center
        of size N - 3, together forming
        a valid_row
    if row_index == N - 1:
        additional matching below must
        exist for the last number on each side
One way to solve it could be backtracking while tracking chosen dominoes along the path. Given the constraints on matching, a six domino pyramid ought to go pretty quick.
Before I start... There is an ambiguity in the question, which may be what the interviewer was more interested in than the answer. This would appear to be a question asking for a method to validate one particular arrangement of the values, except for the bit which says "Is it possible to build a pyramid from 6 dominoes in the arrangement described above? Dominoes can be freely arranged and rotated." which implies that they might want you to also move the dominoes around to find a solution. I'm going to ignore that, and stick with the simple validation of whether it is a valid arrangement. (If it is required, I'd split the array into pairs, and then brute force the permutations of the possible arrangements against this code to find the first one that is valid.)
I've selected C# as a language for my solution, but I have intentionally avoided any language features which might make this more readable to a C# person, or perform faster, since the question is not language-specific, so I wanted this to be readable/convertible for people who prefer other languages. That's also the reason why I've used lots of named variables.
Basically check that each row is duplicated in the row below (offset by one), and stop when you reach the last row.
The algorithm drops out as soon as it finds a failure. This algorithm is extensible to larger pyramids; but does no validation of the size of the input array: it will work if the array is sensible.
using System;
public static void Main()
{
int[] values = new int[] { 3, 4, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6 };
bool result = IsDominoPyramidValid(values);
Console.WriteLine(result ? "YES" : "NO");
}
private const int DominoLength = 2;
public static bool IsDominoPyramidValid(int[] values)
{
int arrayLength = values.Length;
int offset = 0;
int currentRow = 1; // Note: I'm using a 1-based value here as it helps the maths
bool result = true;
while (result)
{
int currentRowLength = currentRow * DominoLength;
// Avoid checking final row: there is no row below it
if (offset + currentRowLength >= arrayLength)
{
break;
}
result = CheckValuesOnRowAgainstRowBelow(values, offset, currentRowLength);
offset += currentRowLength;
currentRow++;
}
return result;
}
private static bool CheckValuesOnRowAgainstRowBelow(int[] values, int startOfCurrentRow, int currentRowLength)
{
int startOfNextRow = startOfCurrentRow + currentRowLength;
int comparablePointOnNextRow = startOfNextRow + 1;
for (int i = 0; i < currentRowLength; i++)
{
if (values[startOfCurrentRow + i] != values[comparablePointOnNextRow + i])
{
return false;
}
}
return true;
}

How to search for the largest subset where every pair meets criteria?

I hope this isn't more of a statistics question...
Suppose I have an interface:
public interface PairValidatable<T>
{
    boolean isValidWith(T other);
}
Now if I have a large array of PairValidatables, how do I find the largest subset of that array where every pair passes the isValidWith test?
To clarify, if there are three entries in a subset, then elements 0 and 1 should pass isValidWith, elements 1 and 2 should pass isValidWith, and elements 0 and 2 should pass isValidWith.
Example,
public class Point implements PairValidatable<Point>
{
int x;
int y;
public Point(int xIn, int yIn)
{
x = xIn;
y = yIn;
}
public boolean isValidWith(Point other)
{
//whichever has the greater x must have the lesser (or equal) y
return x > other.x != y > other.y;
}
}
The intuitive idea is to keep a vector of Points, add array element 0, and then compare each remaining array element against the vector, adding it only if it passes the validation with every element already in the vector. The problem is that element 0 might be very restrictive. For example,
Point[] arr = new Point[5];
arr[0] = new Point(1000, 1000);
arr[1] = new Point(10, 10);
arr[2] = new Point(15, 7);
arr[3] = new Point(3, 6);
arr[4] = new Point(18, 6);
Iterating through as above would give us a subset containing only element 0, but the subset of elements 1, 2 and 4 is a larger subset where every pair passes the validation. The algorithm should then return the points stored in elements 1, 2 and 4. Though elements 3 and 4 are valid with each other and elements 1 and 4 are valid with each other, elements 2 and 3 are not, nor elements 1 and 3. The subset containing 1, 2 and 4 is a larger subset than 3 and 4.
I would guess some tree or graph algorithm would be best for solving this but I'm not sure how to set it up.
The solution doesn't have to be Java-specific, and preferably could be implemented in any language instead of relying on Java built-ins. I just used Java-like pseudocode above for familiarity reasons.
Presumably isValidWith is commutative -- that is, if x.isValidWith(y) then y.isValidWith(x). If you know nothing more than that, you have an instance of the maximum clique problem, which is known to be NP-complete:
Skiena, S. S. "Clique and Independent Set" and "Clique." §6.2.3 and 8.5.1 in The Algorithm Design Manual. New York: Springer-Verlag, pp. 144 and 312-314, 1997.
Therefore, if you want an efficient algorithm, you will have to hope that your specific isValidWith function has more structure than mere commutativity, and you will have to exploit that structure.
For your specific problem, you should be able to do the following:
Sort your points in increasing order of x coordinate.
Find the longest decreasing subsequence of the y coordinates in the sorted list.
Each operation can be performed in O(n*log(n)) time, so your particular problem is efficiently solvable.
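The two steps can be sketched in Python as below. This is an illustration under assumptions: points are plain (x, y) tuples, the function name is made up, and points that tie on x or on y are treated as incompatible (the question's isValidWith is not symmetric in those edge cases, so the sketch uses a strictly decreasing subsequence).

```python
from bisect import bisect_left


def largest_valid_subset(points):
    """Sort by x, then take the longest strictly decreasing run of y values."""
    pts = sorted(points)          # by x, ties broken by y ascending
    tails = []                    # tails[k]: smallest -y ending a run of length k + 1
    tails_idx = []                # index in pts of that run's last point
    prev = [-1] * len(pts)        # back-pointers for reconstruction
    for i, (_, y) in enumerate(pts):
        key = -y                  # decreasing y == increasing -y
        k = bisect_left(tails, key)
        if k == len(tails):
            tails.append(key)
            tails_idx.append(i)
        else:
            tails[k] = key
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1
    subset = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        subset.append(pts[i])
        i = prev[i]
    return subset[::-1]
```

On the five points from the question this returns [(10, 10), (15, 7), (18, 6)], the subset made of elements 1, 2 and 4.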

Least distance between two values in a large binary tree with duplicate values

Given a binary tree that might contain duplicate values, you need to find minimum distance between two given values. Note that the binary tree can be large.
For example:
          5
        /   \
       1     7
      / \   / \
     4   3 8   2
    / \
   1   2
The function should return 2 for (1 and 2 as input).
(If duplicates are not present, we can find LCA and then calculate the distance.)
I've written the following code but I couldn't handle cases when the values are present in different subtrees and in the below cases:
root = 1, root.left = 4, root.left.left = 3, root.left.right = 2, root.left.left.left = 1
root = 1, root.left = 4, root.left.left = 3, root.left.left.left = 1, root.left.left.right = 2
void dist(struct node* root,int& min,int n1,int n2,int pos1,int pos2,int level) {
if(!root)
return;
if(root->data==n1){
pos1 = level;
if(pos2>=0)
if(pos1-pos2 < min)
min = pos1-pos2;
}
else if(root->data==n2){
pos2 = level;
if(pos1>=0)
if(pos2-pos1 < min)
min = pos2-pos1;
}
dist(root->left,min,n1,n2,pos1,pos2,level+1);
dist(root->right,min,n1,n2,pos1,pos2,level+1);
}
I think at each node we can find if that node is the LCA of the values or not. If that node is LCA then find the distance and update min accordingly, but this would take O(n^2).
Following is an algorithm to solve the problem:-
traverse all of the tree and calculate paths for each node using binary strings representation and store into hash map
eg. For your tree the hashmap will be
1 => 0,000
2 => 001,11
3 => 01
...
When query for distance between (u,v) check for each pair and calculate distance between them. Remove common prefix from strings and then sum the remaining lengths
eg. u=1 and v=2
distance(0,001) = 2
distance(0,11) = 3
distance(000,001) = 2
distance(000,11) = 5
min = 2
Note: I think the second step can be made more efficient but need to do more research
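A minimal Python sketch of this path-string approach follows. The Node class and function names are assumptions for illustration, with "0" meaning "go left" and "1" meaning "go right". It compares every pair of occurrences, so it shares the inefficiency of the second step noted above.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def collect_paths(root):
    """Map each value to the list of root-to-node paths where it occurs."""
    paths = {}
    stack = [(root, "")]
    while stack:
        node, path = stack.pop()
        if node is None:
            continue
        paths.setdefault(node.data, []).append(path)
        stack.append((node.left, path + "0"))
        stack.append((node.right, path + "1"))
    return paths


def path_distance(p, q):
    """Drop the common prefix, then add the lengths of what remains."""
    common = 0
    for a, b in zip(p, q):
        if a != b:
            break
        common += 1
    return len(p) + len(q) - 2 * common


def min_distance(root, u, v):
    """Smallest distance between any node valued u and any node valued v."""
    paths = collect_paths(root)
    return min(
        (path_distance(p, q) for p in paths.get(u, []) for q in paths.get(v, [])),
        default=None,
    )
```

Storing a string per node costs memory proportional to the node's depth, which may matter for the large trees mentioned in the question.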
You can compute the LCA of a set of nodes by computing LCA(x1, LCA(x2, LCA(x3... and all the nodes in the set will be somewhere below this LCA. If you compare the LCAs of two sets of nodes and one is not directly beneath the other then the minimum distance between any two nodes in different sets will be at least the distance between the LCAs. If one LCA is above the other then the minimum distance could be zero.
This allows a sort of branch and bound approach. At each point you have a best minimum distance so far (initialized as infinity). Given two sets of nodes, use their LCAs to work out a lower bound on their minimum distance and discard them if this is no better than the best answer so far. If not discarded, split each set into two plus a possible single depending on whether each node in the set is to the left of the LCA, to the right of the LCA, or is the LCA. Recursively check for the minimum distance in the (up to nine) pairs of split sets. If both splits in a pair are below some minimum size, just work out the LCAs and minimum distances of each pair of nodes across the two sets - at this point may find out that you have a new best answer and can update the best answer so far.
Looking at the example at the top of the question, the LCA of the 2s is the root of the tree, and the LCA of the 1s is the highest 1. So the minimum distance between these two sets could be close to zero. Now split each set in two. The left hand 2 is distance two from both of the two 1s. The LCA of the right hand 2 is itself, on the right hand branch of the tree, and the LCA of each of the two 1s is down on the left hand branch of the tree. So the distance between the two is at least two, and we could tell this even if we had a large number of 2s anywhere below the position of the existing right-hand two, and a large number of 1s anywhere on the left hand subtree.
Do a pre-order traversal of the tree (or any traversal should work).
During this process, simply keep track of the closest 1 and 2, and update the distance whenever you find a 2 and the closest 1 is closer than the closest distance so far, or vice versa.
Code (C++, untested first draft): (hardcoded 1 and 2 for simplicity)
int getLeastDistance(Node *n, int *distTo1, int *distTo2)
{
    if (n == NULL)
        return LARGE_VALUE; // nothing can be found in an empty subtree
    int dist = LARGE_VALUE;
    // process current node
    if (n->data == 1)
    {
        dist = *distTo2;
        *distTo1 = 0;
    }
    else if (n->data == 2)
    {
        dist = *distTo1;
        *distTo2 = 0;
    }
    // go left
    int newDistTo1 = *distTo1 + 1,
        newDistTo2 = *distTo2 + 1;
    dist = min(dist, getLeastDistance(n->left, &newDistTo1, &newDistTo2));
    // update distances
    *distTo1 = min(*distTo1, newDistTo1 + 1);
    *distTo2 = min(*distTo2, newDistTo2 + 1);
    // go right
    newDistTo1 = *distTo1 + 1;
    newDistTo2 = *distTo2 + 1;
    dist = min(dist, getLeastDistance(n->right, &newDistTo1, &newDistTo2));
    // update distances again so the caller also sees matches from the right subtree
    *distTo1 = min(*distTo1, newDistTo1 + 1);
    *distTo2 = min(*distTo2, newDistTo2 + 1);
    return dist;
}
Caller:
Node root = ...;
int distTo1 = LARGE_VALUE, distTo2 = LARGE_VALUE;
int dist = getLeastDistance(&root, &distTo1, &distTo2);
Just be sure to make LARGE_VALUE far enough from the maximum value for int such that it won't overflow if incremented (-1 is probably safer, but it requires more complex code).

24 Game/Countdown/Number Game solver, but without parentheses in the answer

I've been browsing the internet all day for an existing solution to creating an equation out of a list of numbers and operators for a specified target number.
I came across a lot of 24 Game solvers, Countdown solvers and the like, but they are all built around the concept of allowing parentheses in the answers.
For example, for a target 42, using the numbers 1 2 3 4 5 6, a solution could be:
6 * 5 = 30
4 * 3 = 12
30 + 12 = 42
Note how the algorithm remembers the outcome of a sub-equation and later re-uses it to form the solution (in this case 30 and 12), essentially using parentheses to form the solution (6 * 5) + (4 * 3) = 42.
Whereas I'd like a solution WITHOUT the use of parentheses, which is solved from left to right, for example 6 - 1 + 5 * 4 + 2 = 42, if I'd write it out, it would be:
6 - 1 = 5
5 + 5 = 10
10 * 4 = 40
40 + 2 = 42
I have a list of about 55 numbers (random numbers ranging from 2 to 12), 9 operators (2 of each basic operator + 1 random operator) and a target value (a random number between 0 and 1000). I need an algorithm to check whether or not my target value is solvable (and optionally, if it isn't, how close we can get to the actual value). Each number and operator can only be used once, which means there will be a maximum of 10 numbers you can use to get to the target value.
I found a brute-force algorithm which can be easily adjusted to do what I want (How to design an algorithm to calculate countdown style maths number puzzle), and that works, but I was hoping to find something which generates more sophisticated "solutions", like on this page: http://incoherency.co.uk/countdown/
I wrote the solver you mentioned at the end of your post, and I apologise in advance that the code isn't very readable.
At its heart the code for any solver to this sort of problem is simply a depth-first search, which you imply you already have working.
Note that if you go with your "solution WITHOUT the use of parentheses, which is solved from left to right" then there are input sets which are not solvable. For example, 11,11,11,11,11,11 with a target of 144. The solution is ((11/11)+11)*((11/11)+11). My solver makes this easier for humans to understand by breaking the parentheses up into different lines, but it is still effectively using parentheses rather than evaluating from left to right.
The way to "use parentheses" is to apply an operation to your inputs and put the result back in the input bag, rather than to apply an operation to one of the inputs and an accumulator. For example, if your input bag is 1,2,3,4,5,6 and you decide to multiply 3 and 4, the bag becomes 1,2,12,5,6. In this way, when you recurse, that step can use the result of the previous step. Preparing this for output is just a case of storing the history of operations along with each number in the bag.
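A minimal Python sketch of the "result goes back into the bag" idea, for illustration only. It is not the code of the JavaScript solver; the function name, the restriction to positive results and exact divisions, and the "first solution found" behaviour are assumptions made to keep the example short.

```python
def solve(bag, target, history=()):
    """Depth-first search; returns a list of steps reaching `target`, or None."""
    if target in bag:
        return list(history)
    for i in range(len(bag)):
        for j in range(len(bag)):
            if i == j:
                continue
            a, b = bag[i], bag[j]
            rest = [bag[k] for k in range(len(bag)) if k != i and k != j]
            candidates = []
            if i < j:                      # + and * are commutative: try each pair once
                candidates += [("+", a + b), ("*", a * b)]
            if a > b:                      # keep intermediate results positive
                candidates.append(("-", a - b))
            if b != 0 and a % b == 0:      # only exact divisions
                candidates.append(("/", a // b))
            for op, value in candidates:
                step = f"{a} {op} {b} = {value}"
                # The result rejoins the bag, so later steps can combine it freely.
                found = solve(rest + [value], target, history + (step,))
                if found is not None:
                    return found
    return None
```

Because each step records its operands and result, the returned history already reads like the line-by-line output shown in the question.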
I imagine what you mean about more "sophisticated" solutions is just the simplicity heuristic used in my javascript solver. The solver works by doing a depth-first search of the entire search space, and then choosing the solution that is "best" rather than just the one that uses the fewest steps.
A solution is considered "better" than a previous solution (i.e. replaces it as the "answer" solution) if it is closer to the target (note that any state in the solver is a candidate solution, it's just that most are further away from the target than the previous best candidate solution), or if it is equally distant from the target and has a lower heuristic score.
The heuristic score is the sum of the "intermediate values" (i.e. the values on the right-hand-side of the "=" signs), with trailing 0's removed. For example, if the intermediate values are 1, 4, 10, 150, the heuristic score is 1+4+1+15: the 10 and the 150 only count for 1 and 15 because they end in zeroes. This is done because humans find it easier to deal with numbers that are divisible by 10, and so the solution appears "simpler".
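The scoring rule is small enough to sketch directly. This Python version is an illustration of the description above, not a port of the JavaScript; the function names are made up.

```python
def strip_trailing_zeros(value):
    """150 -> 15, 10 -> 1; zero is left alone to avoid looping forever."""
    while value != 0 and value % 10 == 0:
        value //= 10
    return value


def heuristic_score(intermediate_values):
    """Sum of the intermediate results with trailing zeros removed (lower is simpler)."""
    return sum(strip_trailing_zeros(v) for v in intermediate_values)
```

For the intermediate values 1, 4, 10, 150 this gives 1 + 4 + 1 + 15 = 21.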
The other part that could be considered "sophisticated" is the way that some lines are joined together. This simply joins the result of "5 + 3 = 8" and "8 + 2 = 10" in to "5 + 3 + 2 = 10". The code to do this is absolutely horrible, but in case you're interested it's all in the Javascript at https://github.com/jes/cntdn/blob/master/js/cntdn.js - the gist is that after finding the solution which is stored in array form (with information about how each number was made) a bunch of post-processing happens. Roughly:
convert the "solution list" generated from the DFS to a (rudimentary, nested-array-based) expression tree - this is to cope with the multi-argument case (i.e. "5 + 3 + 2" is not 2 addition operations, it's just one addition that has 3 arguments)
convert the expression tree to an array of steps, including sorting the arguments so that they're presented more consistently
convert the array of steps into a string representation for display to the user, including an explanation of how distant from the target number the result is, if it's not equal
Apologies for the length of that. Hopefully some of it is of use.
James
EDIT: If you're interested in Countdown solvers in general, you may want to take a look at my letters solver as it is far more elegant than the numbers one. It's the top two functions at https://github.com/jes/cntdn/blob/master/js/cntdn.js - to use call solve_letters() with a string of letters and a function to get called for every matching word. This solver works by traversing a trie representing the dictionary (generated by https://github.com/jes/cntdn/blob/master/js/mk-js-dict), and calling the callback at every end node.
I use recursion in Java to generate the combinations. The main idea is to use DFS over both the choice of array elements and the choice of operations.
I use a boolean array to store the visited positions, which prevents the same element from being used twice. The temp StringBuilder stores the current equation; if the corresponding result is equal to the target, I put the equation into the result list. Do not forget to restore temp and the visited array to their previous state before moving on to the next array element.
This algorithm will produce some duplicate answers, so it needs to be optimized later.
import java.util.ArrayList;
import java.util.List;

public class LeftToRightSolver {
    public static void main(String[] args) {
        List<StringBuilder> res = new ArrayList<StringBuilder>();
        int[] arr = {1, 2, 3, 4, 5, 6};
        int target = 42;
        // Try every element as the first operand of the equation.
        for (int i = 0; i < arr.length; i++) {
            boolean[] visited = new boolean[arr.length];
            visited[i] = true;
            StringBuilder sb = new StringBuilder();
            sb.append(arr[i]);
            findMatch(res, sb, arr, visited, arr[i], "+-*/", target);
        }
        for (StringBuilder sb : res) {
            System.out.println(sb.toString());
        }
    }

    public static void findMatch(List<StringBuilder> res, StringBuilder temp, int[] nums, boolean[] visited, int current, String operations, int target) {
        if (current == target) {
            res.add(new StringBuilder(temp));
        }
        for (int i = 0; i < nums.length; i++) {
            if (visited[i]) continue;
            for (char c : operations.toCharArray()) {
                // Skip divisions that are undefined or that integer division would truncate.
                if (c == '/' && (nums[i] == 0 || current % nums[i] != 0)) continue;
                visited[i] = true;
                int mark = temp.length(); // where this step starts, so it can be undone
                temp.append(c).append(nums[i]);
                if (c == '+') {
                    findMatch(res, temp, nums, visited, current + nums[i], operations, target);
                } else if (c == '-') {
                    findMatch(res, temp, nums, visited, current - nums[i], operations, target);
                } else if (c == '*') {
                    findMatch(res, temp, nums, visited, current * nums[i], operations, target);
                } else if (c == '/') {
                    findMatch(res, temp, nums, visited, current / nums[i], operations, target);
                }
                temp.setLength(mark); // undo the append, also for multi-digit operands
                visited[i] = false;
            }
        }
    }
}

algorithm to find longest non-overlapping sequences

I am trying to find the best way to solve the following problem. By best way I mean less complex.
The input is a list of tuples (start, length) such as:
[(0,5),(0,1),(1,9),(5,5),(5,7),(10,1)]
Each element represents a sequence by its start and length; for example, (5,7) is equivalent to the sequence (5,6,7,8,9,10,11), a list of 7 elements starting with 5. One can assume that the tuples are sorted by the start element.
The output should return a non-overlapping combination of tuples that represents the longest continuous sequence(s). This means that a solution is a subset of ranges with no overlaps and no gaps that is the longest possible; there could be more than one, though.
For example for the given input the solution is:
[(0,5),(5,7)] equivalent to (0,1,2,3,4,5,6,7,8,9,10,11)
Is backtracking the best approach to solve this problem?
I'm interested in any different approaches that people could suggest.
Also if anyone knows a formal reference of this problem or another one that is similar I'd like to get references.
BTW - this is not homework.
Edit
Just to avoid some mistakes this is another example of expected behaviour
For an input like [(0,1),(1,7),(3,20),(8,5)] the right answer is [(3,20)], equivalent to (3,4,5,..,22) with length 20. Some of the answers received would give [(0,1),(1,7),(8,5)], equivalent to (0,1,2,...,11,12), as the right answer. But this last answer is not correct because it is shorter than [(3,20)].
Iterate over the list of tuples using the given ordering (by start element), while using a hashmap to keep track of the length of the longest continuous sequence ending on a certain index.
pseudo-code, skipping details like items not found in a hashmap (assume 0 returned if not found):
int bestEnd = 0;
hashmap<int,int> seq // seq[key] = length of the longest sequence ending on key-1, or 0 if not found
foreach (tuple in orderedTuples) {
int seqLength = seq[tuple.start] + tuple.length
int tupleEnd = tuple.start+tuple.length;
seq[tupleEnd] = max(seq[tupleEnd], seqLength)
if (seqLength > seq[bestEnd]) bestEnd = tupleEnd
}
return new tuple(bestEnd-seq[bestEnd], seq[bestEnd])
This is an O(N) algorithm.
If you need the actual tuples making up this sequence, you'd need to keep a linked list of tuples hashed by end index as well, updating this whenever the max length is updated for this end-point.
UPDATE: My knowledge of python is rather limited, but based on the python code you pasted, I created this code that returns the actual sequence instead of just the length:
def get_longest(arr):
    bestEnd = 0
    seqLengths = dict()  # seqLengths[key] = length of the longest sequence ending on key-1, or 0 if not found
    seqTuples = dict()  # seqTuples[key] = the last tuple used in this longest sequence
    for t in arr:
        seqLength = seqLengths.get(t[0], 0) + t[1]
        tupleEnd = t[0] + t[1]
        if seqLength > seqLengths.get(tupleEnd, 0):
            seqLengths[tupleEnd] = seqLength
            seqTuples[tupleEnd] = t
        if seqLength > seqLengths.get(bestEnd, 0):
            bestEnd = tupleEnd
    longestSeq = []
    while bestEnd in seqTuples:
        longestSeq.append(seqTuples[bestEnd])
        bestEnd -= seqTuples[bestEnd][1]
    longestSeq.reverse()
    return longestSeq

if __name__ == "__main__":
    a = [(0,3),(1,4),(1,1),(1,8),(5,2),(5,5),(5,6),(10,2)]
    print(get_longest(a))
Revised algorithm:
create a hashtable of start->list of tuples that start there
put all tuples in a queue of tupleSets
set the longestTupleSet to the first tuple
while the queue is not empty
    take tupleSet from the queue
    if any tuples start where the tupleSet ends
        foreach tuple that starts where the tupleSet ends
            enqueue new tupleSet of tupleSet + tuple
        continue
    if tupleSet is longer than longestTupleSet
        replace longestTupleSet with tupleSet
return longestTupleSet
c# implementation
public static IList<Pair<int, int>> FindLongestNonOverlappingRangeSet(IList<Pair<int, int>> input)
{
var rangeStarts = input.ToLookup(x => x.First, x => x);
var adjacentTuples = new Queue<List<Pair<int, int>>>(
input.Select(x => new List<Pair<int, int>>
{
x
}));
var longest = new List<Pair<int, int>>
{
input[0]
};
int longestLength = input[0].Second; // Second already holds the length of the tuple
while (adjacentTuples.Count > 0)
{
var tupleSet = adjacentTuples.Dequeue();
var last = tupleSet.Last();
int end = last.First + last.Second;
var sameStart = rangeStarts[end];
if (sameStart.Any())
{
foreach (var nextTuple in sameStart)
{
adjacentTuples.Enqueue(tupleSet.Concat(new[] { nextTuple }).ToList());
}
continue;
}
int length = end - tupleSet.First().First;
if (length > longestLength)
{
longestLength = length;
longest = tupleSet;
}
}
return longest;
}
Tests:
[Test]
public void Given_the_first_problem_sample()
{
    var input = new[]
    {
        new Pair<int, int>(0, 5),
        new Pair<int, int>(0, 1),
        new Pair<int, int>(1, 9),
        new Pair<int, int>(5, 5),
        new Pair<int, int>(5, 7),
        new Pair<int, int>(10, 1)
    };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(2);
    result.First().ShouldBeSameInstanceAs(input[0]);
    result.Last().ShouldBeSameInstanceAs(input[4]);
}

[Test]
public void Given_the_second_problem_sample()
{
    var input = new[]
    {
        new Pair<int, int>(0, 1),
        new Pair<int, int>(1, 7),
        new Pair<int, int>(3, 20),
        new Pair<int, int>(8, 5)
    };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(1);
    result.First().ShouldBeSameInstanceAs(input[2]);
}
This is a special case of the longest path problem for weighted directed acyclic graphs.
The nodes in the graph are the start points and the points after the last element in a sequence, where the next sequence could start.
The problem is special because the distance between two nodes must be the same independently of the path.
Just thinking about the algorithm in basic terms, would this work?
(apologies for horrible syntax but I'm trying to stay language-independent here)
First the simplest form: Find the longest contiguous pair.
Cycle through every member and compare it to every other member with a higher startpos. If the startpos of the second member is equal to the sum of the startpos and length of the first member, they are contiguous. If so, form a new member in a new set with the lower startpos and combined length to represent this.
Then, take each of these pairs and compare them to all of the single members with a higher startpos and repeat, forming a new set of contiguous triples (if any exist).
Continue this pattern until you have no new sets.
The tricky part then is you have to compare the length of every member of each of your sets to find the real longest chain.
I'm pretty sure this is not as efficient as other methods, but I believe this is a viable approach to brute forcing this solution.
I'd appreciate feedback on this and any errors I may have overlooked.
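To make the pairs-then-triples idea concrete, here is a rough Python sketch of one way it might look. It is an assumption-laden illustration rather than a tested answer: the function name `longest_chain` is invented, members are (startpos, length) pairs, and lengths are assumed positive so that each round only links to members with a higher startpos and the loop eventually stops.

```python
def longest_chain(members):
    """Grow chains one member at a time: singles, pairs, triples, ...

    Each chain is represented as (startpos, combined_length).
    Assumption: all lengths are positive.
    """
    if not members:
        return None
    current = set(members)                      # chains made of one member
    best = max(current, key=lambda m: m[1])
    while current:
        extended = set()
        for start, length in current:
            for s, l in members:
                # contiguous if the member starts where the chain ends
                if s == start + length:
                    extended.add((start, length + l))
        for chain in extended:
            if chain[1] > best[1]:
                best = chain
        current = extended                      # next round: one member longer
    return best

print(longest_chain([(0, 5), (0, 1), (1, 9), (5, 5), (5, 7), (10, 1)]))
```

As the answer suspects, this is brute force: every round compares each chain against every member, so it is mainly useful for checking faster solutions on small inputs. It reports only the start and combined length of the best chain, not the members that form it.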
Edited to replace pseudocode with actual Python code.
Edited AGAIN to change the code: the original algorithm was on the right track, but I misunderstood what the second value in the pairs was! Fortunately the basic algorithm is the same, and I was able to adapt it.
Here's an idea that solves the problem in O(N log N) and doesn't use a hash map (so no hidden times). For memory we're going to use N * 2 "things".
We're going to add two more values to each tuple: (BackCount, BackLink). In the successful combination, BackLink links from right to left, from the right-most tuple to the left-most tuple. BackCount is the accumulated length of the chain that the given BackLink points back along.
Here's some python code:
def FindTuplesStartingWith(tuples, frm):
    # The Log(N) algorithm is left as an exercise for the user
    ret = []
    for i in range(len(tuples)):
        if tuples[i][0] == frm:
            ret.append(i)
    return ret

def FindLongestSequence(tuples):
    # Prepare (BackCount, BackLink) array; tuples must be sorted by start
    bb = []  # (BackCount, BackLink)
    for OneTuple in tuples:
        bb.append((-1, -1))
    # Prepare
    LongestSequenceLen = -1
    LongestSequenceTail = -1
    # Algorithm
    for i in range(len(tuples)):
        if bb[i][0] == -1:
            bb[i] = (0, bb[i][1])
        # Is this single pair the longest possible pair all by itself?
        if (tuples[i][1] + bb[i][0]) > LongestSequenceLen:
            LongestSequenceLen = tuples[i][1] + bb[i][0]
            LongestSequenceTail = i
        # Find next segment
        for j in FindTuplesStartingWith(tuples, tuples[i][0] + tuples[i][1]):
            if (bb[j][0] == -1) or (bb[j][0] < (bb[i][0] + tuples[i][1])):
                # can be linked
                bb[j] = (bb[i][0] + tuples[i][1], i)
                if (bb[j][0] + tuples[j][1]) > LongestSequenceLen:
                    LongestSequenceLen = bb[j][0] + tuples[j][1]
                    LongestSequenceTail = j
    # Done! I'll now build up the solution
    ret = []
    while LongestSequenceTail > -1:
        ret.insert(0, tuples[LongestSequenceTail])
        LongestSequenceTail = bb[LongestSequenceTail][1]
    return ret

# Call the algorithm
print(FindLongestSequence([(0,5), (0,1), (1,9), (5,5), (5,7), (10,1)]))
# prints [(0, 5), (5, 7)]
print(FindLongestSequence([(0,1), (1,7), (3,20), (8,5)]))
# prints [(3, 20)]
The key to the whole algorithm is the step marked with the "can be linked" comment in the code. We know the current tuple, tuples[i], can be linked to tuples[j]. If a longer sequence leading up to tuples[i] exists, it was already found by the time we got to this point, because it had to start at a smaller start value, and the array is sorted on the start value!
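The logarithmic lookup that the code leaves as an exercise could be sketched with the standard `bisect` module, relying on the same assumption that the tuples are sorted by start. The helper name `make_start_lookup` is hypothetical; it precomputes the list of starts once so that each query is a pair of binary searches.

```python
from bisect import bisect_left, bisect_right

def make_start_lookup(tuples):
    """Return a function mapping a start value to the indices of tuples with that start.

    Assumption: `tuples` is sorted by its first element (the start).
    """
    starts = [t[0] for t in tuples]      # built once, reused by every query

    def find(frm):
        lo = bisect_left(starts, frm)    # first index with start >= frm
        hi = bisect_right(starts, frm)   # first index with start > frm
        return range(lo, hi)             # empty when no tuple starts at frm

    return find

find = make_start_lookup([(0, 5), (0, 1), (1, 9), (5, 5), (5, 7), (10, 1)])
print(list(find(5)))
```

Each query then costs O(log N) plus the number of matches returned, so the total running time still depends on how many tuples share the same start.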
I removed the previous solution because it was not tested.
The problem is finding the longest path in a "weighted directed acyclic graph", it can be solved in linear time:
http://en.wikipedia.org/wiki/Longest_path_problem#Weighted_directed_acyclic_graphs
Use the set {start positions} union {(start position + length)} as the vertices. For your example that is {0, 1, 5, 10, 11, 12}.
For vertices v0 and v1, if there is a tuple starting at v0 with length w such that v0 + w = v1, add a directed edge from v0 to v1 with weight w.
Now follow the pseudocode on the Wikipedia page. Since there are at most 2n vertices (n being the number of tuples), the problem can still be solved in linear time once the vertices are in topological order, which here is simply ascending order of position.
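A minimal sketch of that construction in Python, under the assumptions that tuples are (start, length) pairs with positive lengths, so every edge points to a larger position and ascending vertex order is a valid topological order. The function name `longest_path_tuples` is illustrative, and the sort makes this version O(n log n) rather than strictly linear.

```python
from collections import defaultdict

def longest_path_tuples(tuples):
    """Longest chain of contiguous (start, length) tuples via DAG longest path."""
    edges = defaultdict(list)            # v0 -> list of (v1, tuple used)
    vertices = set()
    for t in tuples:
        start, length = t
        vertices.update((start, start + length))
        edges[start].append((start + length, t))

    dist = {v: 0 for v in vertices}      # best path weight ending at v
    prev = {}                            # v -> (previous vertex, tuple used)
    for v in sorted(vertices):           # ascending positions = topological order
        for v1, t in edges.get(v, ()):
            if dist[v] + t[1] > dist[v1]:
                dist[v1] = dist[v] + t[1]
                prev[v1] = (v, t)

    if not prev:
        return []
    # walk back from the vertex with the largest accumulated weight
    end = max(dist, key=dist.get)
    path = []
    while end in prev:
        end, t = prev[end]
        path.append(t)
    path.reverse()
    return path

print(longest_path_tuples([(0, 5), (0, 1), (1, 9), (5, 5), (5, 7), (10, 1)]))
```

Storing the tuple on each edge is a convenience for reconstructing the answer; the Wikipedia pseudocode only tracks path lengths.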
This is a simple reduce operation. Given a pair of consecutive tuples, they either can or can't be combined. So define the pairwise combination function:
def combo(first,second):
    if first[0]+first[1] == second[0]:
        return [(first[0],first[1]+second[1])]
    else:
        return [first,second]
This just returns a list of either one element combining the two arguments, or the original two elements.
Then define a function to iterate over the first list and combine pairs:
def collapse(tupleList):
    first = tupleList.pop(0)
    newList = []
    for item in tupleList:
        collapsed = combo(first,item)
        if len(collapsed)==2:
            newList.append(collapsed[0])
        first = collapsed.pop()
    newList.append(first)
    return newList
This keeps a first element to compare with the current item in the list (starting at the second item), and when it can't combine them it drops the first into a new list and replaces first with the second of the two.
Then just call collapse with the list of tuples:
>>> collapse( [(5, 7), (12, 3), (0, 5), (0, 7), (7, 2), (9, 3)] )
[(5, 10), (0, 5), (0, 12)]
[Edit] Finally, iterate over the result to get the longest sequence.
def longest(seqs):
    collapsed = collapse(seqs)
    return max(collapsed, key=lambda x: x[1])
[/Edit]
Complexity O(N). For bonus marks, do it in reverse so that the initial pop(0) becomes a pop() and you don't have to reindex the array, or move the iterator instead. For top marks make it run as a pairwise reduce operation for multithreaded goodness.
This sounds like a perfect "dynamic programming" problem...
The simplest program would be to do it brute force (e.g. recursive), but this has exponential complexity.
With dynamic programming you can set up an array a of length n+1, where n is the maximum of all (start+length) values of your problem, and where a[i] denotes the length of the longest non-overlapping sequence ending at position i. You can then step through all tuples, updating a. The complexity of this algorithm would be O(n*k), where k is the number of input values.
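One possible reading of that array-based approach, sketched in Python. The assumptions here are mine: tuples are (start, length) pairs with non-negative integer starts and positive lengths, a[i] holds the total length of the best contiguous chain ending exactly at position i, and the tuples are processed in order of start so that a[start] is final before it is used. The name `longest_chain_dp` is illustrative.

```python
def longest_chain_dp(tuples):
    """Length of the longest contiguous chain, using an array indexed by position."""
    if not tuples:
        return 0
    n = max(start + length for start, length in tuples)
    a = [0] * (n + 1)                    # a[i]: best chain length ending at i
    for start, length in sorted(tuples):
        end = start + length
        # either keep the best chain already ending here,
        # or extend the best chain that ends where this tuple starts
        a[end] = max(a[end], a[start] + length)
    return max(a)

print(longest_chain_dp([(0, 5), (0, 1), (1, 9), (5, 5), (5, 7), (10, 1)]))
```

This variant returns only the length; recovering the tuples themselves would need a second array of back-pointers, along the lines of the other answers.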
Create an ordered array of all start and end points and initialise all of them to one
For each tuple, compare its two end points (start and start + length) with the ordered points in your array; if any point lies between them (e.g. a point in the array is 5 and you have start 2 with length 4), change that point's value to zero.
After finishing the loop, walk across the ordered array: open a strip when you see a 1, keep extending it while you see 1s, and close the strip at any zero.
At the end, compare the lengths of the strips.
I think the complexity is around O(4-5*N), with N being the number of items in the tuple (SEE UPDATE).
UPDATE
As you figured out, the complexity is not accurate but definitely very small since it is a function of number of line stretches (tuple items).
So if N is number of line stretches, sorting is O(2N * log2N). Comparison is O(2N). Finding line stretches is also O(2N). So all in all O(2N(log2N + 2)).

Resources