Computing number of sequences - algorithm

I saw the following problem that I was unable to solve. What kind of algorithm will solve it?
We are given a positive integer n. Let A be the set of all possible strings of length n whose characters are from the set {1,2,3,4,5,6}, i.e. the results of throwing a die n times. How many elements of A contain at least one of the following strings as a substring:
1, 2, 3, 4, 5, 6
1, 1, 2, 2, 3, 3
4, 4, 5, 5, 6, 6
1, 1, 1, 2, 2, 2
3, 3, 3, 4, 4, 4
5, 5, 5, 6, 6, 6
1, 1, 1, 1, 1, 1
2, 2, 2, 2, 2, 2
3, 3, 3, 3, 3, 3
4, 4, 4, 4, 4, 4
5, 5, 5, 5, 5, 5
6, 6, 6, 6, 6, 6
I was considering some kind of recursive approach, but I only ended up with a mess when I tried to solve the problem.

I suggest reading up on the Aho-Corasick algorithm. It constructs a finite state machine from a set of strings. (Since your list of strings is fixed, you could even do this by hand.)
Once you have the finite state machine (roughly 60 states for this list), add an extra absorbing state that is entered as soon as any of the strings has been detected.
Now your problem is reduced to counting how many of the 6**n strings end up in the absorbing state after being pushed through the state machine.
You can do this by expressing the state machine as a matrix M. Entry M[i,j] gives the number of ways of getting to state i from state j when one letter is added.
Finally, raise the matrix to the power n and apply it to an input vector that is all zeros except for a 1 in the position corresponding to the initial state. The entry in the absorbing-state position is the total number of strings.
(With the standard exponentiation-by-squaring algorithm this takes O(log n) matrix multiplications.)
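A minimal sketch of this idea in Python, under a few assumptions: the states are the proper prefixes of the patterns (the same state set Aho-Corasick would produce), and the transition matrix is applied n times in a plain loop rather than through fast exponentiation. The names build_automaton and count_containing are made up for illustration.

```python
PATTERNS = ["123456", "112233", "445566", "111222", "333444", "555666",
            "111111", "222222", "333333", "444444", "555555", "666666"]
ALPHABET = "123456"


def build_automaton(patterns, alphabet):
    """Return (transitions, start_state, absorbing_state)."""
    # One state per proper prefix of a pattern, including the empty prefix.
    prefixes = sorted({p[:k] for p in patterns for k in range(len(p))})
    index = {s: i for i, s in enumerate(prefixes)}
    absorbing = len(index)  # extra state: "some pattern has been seen"
    trans = [[0] * len(alphabet) for _ in index]
    for s, i in index.items():
        for a, c in enumerate(alphabet):
            t = s + c
            if any(t.endswith(p) for p in patterns):
                trans[i][a] = absorbing
            else:
                # Fall back to the longest suffix that is still a prefix.
                while t not in index:
                    t = t[1:]
                trans[i][a] = index[t]
    return trans, index[""], absorbing


def count_containing(n):
    """Count strings of length n that contain at least one pattern."""
    trans, start, absorbing = build_automaton(PATTERNS, ALPHABET)
    counts = [0] * (absorbing + 1)
    counts[start] = 1
    for _ in range(n):  # one "matrix times vector" step per letter
        nxt = [0] * (absorbing + 1)
        nxt[absorbing] = counts[absorbing] * len(ALPHABET)
        for i, row in enumerate(trans):
            for j in row:
                nxt[j] += counts[i]
        counts = nxt
    return counts[absorbing]
```

For example, count_containing(6) returns 12, one string per pattern. For very large n the loop could be replaced by repeated squaring of the transition matrix.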

What went wrong with your recursive approach? Can you elaborate on that? In any case, this can be solved with plain recursion in O(6^n) time, and it can be optimized with DP: the result depends only on the last 6 characters, on whether a pattern has already been seen, and on the step, so there are at most 2 * 6^6 states per step and the memoized version runs in O(6 * 2 * 6^6 * n).
// call as rec("", 0, false)
rec (String cur, int step, boolean found) {
    if(step == n) return found ? 1 : 0; // count each complete string once
    int ans = 0;
    for(char c in { '1', '2', '3', '4', '5', '6' }) {
        String next = cur // work on a copy so every branch starts from cur
        if(next.length < 6) next += c
        else {
            shift(next,1) // shift the string to the left by 1 step
            next[5] = c // add the new element to the end of the string
        }
        ans += rec(next, step+1, found || next in list) // list described in the question
    }
    return ans;
}

Related

Find all combinations that include a specific value where all values are next to each other in array

I have an array of variable length containing all unique values and I need to find all combinations of values whose indices are next to each other and always include a specified value. The order of values in each resulting combination doesn't matter (However I kept them in order in my example to better illustrate).
As an example: [5, 4, 2, 0, 1, 3]
If the specific value chosen is 0, we would end up with the following 12 combinations:
0
0, 1
2, 0
0, 1, 3
2, 0, 1
4, 2, 0
2, 0, 1, 3
4, 2, 0, 1
5, 4, 2, 0
4, 2, 0, 1, 3
5, 4, 2, 0, 1
5, 4, 2, 0, 1, 3
If the specific value chosen is 3, we would end up with the following 6 combinations:
3
1, 3
0, 1, 3
2, 0, 1, 3
4, 2, 0, 1, 3
5, 4, 2, 0, 1, 3
Answers in any programming language will work.
EDIT: I believe this can be brute forced by finding all combinations of all numbers and then narrowing that list to make sure each combination meets the requirements... it's not ideal but should work.
This problem can be solved in O(n^3) time using the following algorithm:
Step-1: Find the index of the target element.
Step-2: Iterate from the rightmost index down to the index of the target. Call this iterator idx.
Step-3: For each idx, iterate from the target index down to the leftmost index. Call this iterator i.
Step-4: Print all the elements between the indices i and idx (inclusive).
Following the above steps prints all the combinations.
The algorithm is implemented in Python below.
def solution(array, target):
    # Step 1: find the index of the target element
    index = -1
    for idx, element in enumerate(array):
        if element == target:
            index = idx
    n = len(array)
    # Steps 2-4: every window [i, idx] with i <= index <= idx
    for idx in range(n - 1, index - 1, -1):
        for i in range(index, -1, -1):
            for j in range(i, idx + 1):
                print(array[j], end=",")
            print()

arr = [5, 4, 2, 0, 1, 3]
target = 0
solution(arr, target)
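As a variation, here is a small sketch that returns the combinations as lists instead of printing them. It assumes the values are unique (as stated in the question) and that the target is present; the function name windows_containing is made up for illustration.

```python
def windows_containing(values, target):
    """Return every contiguous slice of values that includes target."""
    pos = values.index(target)  # raises ValueError if target is absent
    return [values[lo:hi + 1]
            for lo in range(pos + 1)            # left edge at or before target
            for hi in range(pos, len(values))]  # right edge at or after target
```

With the example array, windows_containing([5, 4, 2, 0, 1, 3], 0) yields 12 slices and windows_containing([5, 4, 2, 0, 1, 3], 3) yields 6, matching the counts in the question.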

Longest Increasing subsequence length in NlogN.[Understanding the Algo]

Problem Statement: The aim is to find the length of the longest increasing subsequence (not necessarily contiguous) in O(n log n) time.
Algorithm: I understood the algorithm as explained here:
http://www.geeksforgeeks.org/longest-monotonically-increasing-subsequence-size-n-log-n/.
What I did not understand is what is being stored in tail in the following code.
int LongestIncreasingSubsequenceLength(std::vector<int> &v) {
    if (v.size() == 0)
        return 0;
    std::vector<int> tail(v.size(), 0);
    int length = 1; // always points to the first empty slot in tail
    tail[0] = v[0];
    for (size_t i = 1; i < v.size(); i++) {
        if (v[i] < tail[0])
            // new smallest value
            tail[0] = v[i];
        else if (v[i] > tail[length-1])
            // v[i] extends the largest subsequence
            tail[length++] = v[i];
        else
            // v[i] will become the end candidate of an existing subsequence, or
            // throw away larger elements in all LIS to make room for upcoming elements greater than v[i]
            // (and also, v[i] would have already appeared in one of the LIS; identify the location and replace it).
            // CeilIndex comes from the linked article: a binary search for the first index in tail whose value is >= v[i].
            tail[CeilIndex(tail, -1, length-1, v[i])] = v[i];
    }
    return length;
}
For example, if the input is {2,5,3,7,11,8,10,13,6},
the code gives the correct length, 6.
But tail will be storing 2,3,6,8,10,13.
So I want to understand what is stored in tail. This will help me understand the correctness of this algorithm.
tail[i] is the minimal end value among all increasing subsequences (IS) of length i+1.
That's why tail[0] is the 'smallest value' and why we can increase the length of the LIS (length++) when the current value is bigger than the end value of the current longest sequence.
Let's assume that your example is the starting values of the input:
input = 2, 5, 3, 7, 11, 8, 10, 13, 6, ...
After 9 steps of our algorithm tail looks like this:
tail = 2, 3, 6, 8, 10, 13, ...
What does tail[2] mean? It means that the best IS of length 3 ends with tail[2], and we could build an IS of length 4 by extending it with a number that is bigger than tail[2].
For each i, here is an IS of length i+1 from the input 2, 5, 3, 7, 11, 8, 10, 13, 6 that ends in tail[i]:
tail[0] = 2, IS length = 1: 2
tail[1] = 3, IS length = 2: 2, 3
tail[2] = 6, IS length = 3: 2, 3, 6
tail[3] = 8, IS length = 4: 2, 3, 7, 8
tail[4] = 10, IS length = 5: 2, 3, 7, 8, 10
tail[5] = 13, IS length = 6: 2, 3, 7, 8, 10, 13
This representation allows you to use binary search (note that the defined part of tail is always sorted) to update tail and to read off the result at the end of the algorithm.
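To watch tail evolve, here is a short Python sketch of the same update rule. It uses bisect_left from the standard library in place of the article's CeilIndex; the function name lis_tails is made up for illustration.

```python
from bisect import bisect_left


def lis_tails(values):
    """Return tail, where tail[i] is the smallest end value of any
    strictly increasing subsequence of length i + 1."""
    tail = []
    for x in values:
        pos = bisect_left(tail, x)  # first index with tail[pos] >= x
        if pos == len(tail):
            tail.append(x)          # x extends the longest subsequence
        else:
            tail[pos] = x           # x is a smaller end for length pos + 1
    return tail
```

For the example input, lis_tails([2, 5, 3, 7, 11, 8, 10, 13, 6]) returns [2, 3, 6, 8, 10, 13]; its length, 6, is the LIS length.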
tail has the same length as the Longest Increasing Subsequence (LIS), but its contents are not necessarily an actual LIS: in your example, 6 comes after 13 in the input, so 2,3,6,8,10,13 is not a subsequence of it.
It is updated following the explanation given in the link you provided. Check the example there.
You want the minimum value at the first element of tail, which explains the first if statement.
The second if statement is there to allow the LIS to grow, since we want to maximize its length.

Find the maximum number of points per game

The input is an array of cards. In one move, you can remove any group of consecutive identical cards. For removing k cards, you get k * k points. Find the maximum number of points you can get per game.
Time limit: O(n^4)
Example:
Input: [1, 8, 7, 7, 7, 8, 4, 8, 1]
Output: 23
Does anyone have an idea how to solve this?
To clarify, in the given example, one path to the best solution is
Remove   Points   Total   New hand
3 7s     9        9       [1, 8, 8, 4, 8, 1]
1 4      1        10      [1, 8, 8, 8, 1]
3 8s     9        19      [1, 1]
2 1s     4        23      []
Approach
Recursion would fit well here.
First, identify the contiguous sequences in the array -- one lemma of this problem is that if you decide to remove at least one 7, you want to remove the entire sequence of three. From here on, you'll work with both cards and quantities. For instance,
card = [1, 8, 7, 8, 4, 8, 1]
quant = [1, 1, 3, 1, 1, 1, 1]
Now you're ready for the actual solving. Iterate through the array. For each element, remove that element, and add the score for that move.
Check to see whether the elements on either side match; if so, merge those entries. Recur on the remaining array.
For instance, here's the first turn of what will prove to be the optimal solution for the given input:
Choose and remove the three 7's
card = [1, 8, 8, 4, 8, 1]
quant = [1, 1, 1, 1, 1, 1]
score = score + 3*3
Merge the adjacent 8 entries:
card = [1, 8, 4, 8, 1]
quant = [1, 2, 1, 1, 1]
Recur on this game.
Improvement
Use dynamic programming: memoize the solution for every sub game.
Any card value that appears only once in the card array (that is, in a single run) can be removed first, without loss of generality. In the given example, you can remove the 7's and the single 4 to shrink the remaining search tree.
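A minimal sketch of the memoized recursion described above, in Python. It follows the card/quant representation as a tuple of (value, count) runs. Note the assumptions: this is a plain memoized search over remaining hands, so it can visit exponentially many states in the worst case and is not the O(n^4) DP the time limit hints at. The function name max_points is made up for illustration.

```python
from functools import lru_cache
from itertools import groupby


def max_points(cards):
    """Best total score when removing a run of k equal cards scores k * k."""
    # Compress the hand into runs: (card value, quantity).
    runs = tuple((value, len(list(group))) for value, group in groupby(cards))

    @lru_cache(maxsize=None)
    def best(runs):
        if not runs:
            return 0
        result = 0
        for i, (_, count) in enumerate(runs):
            left, right = runs[:i], runs[i + 1:]
            # Merge the neighbours if removing run i makes them adjacent
            # and they hold the same card value.
            if left and right and left[-1][0] == right[0][0]:
                merged = (left[-1][0], left[-1][1] + right[0][1])
                rest = left[:-1] + (merged,) + right[1:]
            else:
                rest = left + right
            result = max(result, count * count + best(rest))
        return result

    return best(runs)
```

For the example, max_points([1, 8, 7, 7, 7, 8, 4, 8, 1]) returns 23.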

N non-overlapping optimal partition

Here is a problem I ran into a few days ago.
Given a list of integer items, we want to partition the items into at most N non-overlapping, consecutive bins, in a way that minimizes the maximum number of items in any bin.
For example, suppose we are given the items (5, 2, 3, 6, 1, 6), and we want 3 bins. We can optimally partition these as follows:
n < 3: 1, 2 (2 items)
3 <= n < 6: 3, 5 (2 items)
6 <= n: 6, 6 (2 items)
Every bin has 2 items, so we can’t do any better than that.
Can anyone share your idea about this question?
Given n bins and an array with p items, here is one greedy algorithm you could use.
To minimize the max number of items in a bin:
Case p <= n: try to use p bins.
  Simply try to put each item in its own bin. If you have duplicate numbers then your average will be unavoidably worse.
Case p > n: greedily use all bins, but try to keep each one's member count near floor(p / n).
  1. Group duplicate numbers.
  2. Pad the largest duplicate bins that fall short of floor(p / n) with unique numbers to the left and right (if they exist).
  3. Count the number of bins you have and determine the number of mergers you need to make; let's call it r.
  4. Repeat the following r times: check each possible neighbouring bin pairing; find and perform the minimum merger.
Example
{1,5,6,9,8,8,6,2,5,4,7,5,2,4,5,3,2,8,7,5}                       20 items to 4 bins
{1}{2, 2, 2}{3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8}{9}    1. sorted and grouped
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8, 9}    2. greedy capture by largest groups
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8, 9}    3. 6 bins but we want 4, so 2 mergers need to be made
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6, 7, 7}{8, 8, 8, 9}    4. first merger
{1, 2, 2, 2, 3, 4, 4}{5, 5, 5, 5, 5}{6, 6, 7, 7}{8, 8, 8, 9}    4. second merger
So the minimum achievable max was 7.
Here is some pseudocode that will give you just one solution with the minimum possible maximum bin size:
Sort the list into "Elements", where an Element is a pair {Value, Quantity}.
So for example {5,2,3,6,1,6} becomes an ordered set:
Let S = {{1,1},{2,1},{3,1},{5,1},{6,2}}
Let A = the largest quantity of any particular value in the set
Let X = number of items in the list
Let N = number of bins
Let MinNum = ceiling ( X / N )
if A > MinNum then Let MinNum = A
Create an array BIN(1 to N+1) of pointers to linked lists of elements.
For I from 1 to N
  Remove as many elements from the front of S as fit while Bin(I) holds at most MinNum items
  and Add them to Bin(I)
Next I
Let Bin(N+1) = any elements remaining in S
LOOP while Bin(N+1) not empty
  Let MinNum = MinNum + 1
  Move all elements from Bin(1) to Bin(N+1) back into S, in sorted order
  For I from 1 to N
    Remove as many elements from the front of S as fit while Bin(I) holds at most MinNum items
    and Add them to Bin(I)
  Next I
  Let Bin(N+1) = any elements remaining in S
END LOOP
The minimum possible maximum bin size will be MinNum, and BIN(1) to Bin(N) will contain the distribution of values.
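The pseudocode can be sketched in Python as follows. This is only one possible reading of it: equal values are kept together as a group, groups are packed greedily in sorted order under a cap, and the cap is raised until at most N bins are needed. The function name min_max_bin and the helper bins_needed are made up for illustration, and only the optimal maximum is returned, not the bins themselves.

```python
from collections import Counter


def min_max_bin(items, n_bins):
    """Smallest possible maximum bin size when sorted items are split into
    at most n_bins consecutive value ranges (equal values stay together)."""
    # Quantities per distinct value, in increasing order of value.
    groups = [count for _, count in sorted(Counter(items).items())]
    if not groups:
        return 0

    def bins_needed(cap):
        # Greedy packing: start a new bin whenever the next group would
        # push the current bin past cap.
        used, load = 1, 0
        for count in groups:
            if load + count > cap:
                used += 1
                load = 0
            load += count
        return used

    # Lower bound: the largest group, or the ceiling of the average load.
    cap = max(max(groups), -(-len(items) // n_bins))
    while bins_needed(cap) > n_bins:
        cap += 1
    return cap
```

For the question's example, min_max_bin([5, 2, 3, 6, 1, 6], 3) returns 2, and for the 20-item example with 4 bins it returns 7.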

Algorithm to generate Diagonal Latin Square matrix

For a given N, I need to create an N*N matrix which has no repetitions in its rows, columns, minor and major diagonals, and whose values are 1, 2, 3, ..., N.
For N = 4 one of matrices is the following:
1 2 3 4
3 4 1 2
4 3 2 1
2 1 4 3
Problem overview
The mathematical structure you described is a Diagonal Latin Square. Constructing one is more of a mathematical problem than an algorithmic or programming one.
To understand correctly what it is and how to create one, you should read the following articles:
Latin squares definition
Magic squares definition
Diagonal Latin square construction <-- p.2 is answer to your question with proof and with other interesting properties
Short answer
One of the possible ways to construct Diagonal Latin Square:
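Let N be the order of the required matrix L.
If there exist numbers A and B in the range [0; N-1] which satisfy these properties:
A is relatively prime to N
B is relatively prime to N
(A + B) is relatively prime to N
(A - B) is relatively prime to N
Then you can create the required matrix with the following rule:
L[i][j] = (A * i + B * j) mod N
This gives values 0..N-1; add 1 to every entry to get values 1..N. Note that no such pair A, B exists when N is even or divisible by 3, so this rule does not cover every N (N = 4, for instance).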
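A small Python sketch of this rule, searching for a suitable pair A, B by trial. The function name diagonal_latin_square is made up for illustration, returning None when no pair exists is a choice made here, and the entries are shifted by 1 so the values are 1..N as in the question.

```python
from math import gcd


def diagonal_latin_square(n):
    """Build an n x n diagonal Latin square from L[i][j] = (A*i + B*j) mod n,
    or return None if no suitable A and B exist for this n."""
    for a in range(1, n):
        for b in range(1, n):
            # A, B, A + B and A - B must all be relatively prime to n.
            if all(gcd(x % n, n) == 1 for x in (a, b, a + b, a - b)):
                return [[(a * i + b * j) % n + 1 for j in range(n)]
                        for i in range(n)]
    return None
```

For n = 5 the search finds A = 1, B = 2, while for n = 4 it returns None, so another method (such as backtracking) is needed there.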
It would be nice to do this mathematically, but I'll propose the simplest algorithm that I can think of: brute force.
At a high level
we can represent a matrix as an array of arrays
for a given N, construct S, a set of arrays which contains every permutation of [1..N]. There will be N! of these.
using a recursive and iterative selection process (e.g. a search tree), search through all orders of these arrays, backtracking whenever one of the 'uniqueness' rules is broken
For example, in your N = 4 problem, I'd construct
S = [
[1,2,3,4], [1,2,4,3]
[1,3,2,4], [1,3,4,2]
[1,4,2,3], [1,4,3,2]
[2,1,3,4], [2,1,4,3]
[2,3,1,4], [2,3,4,1]
[2,4,1,3], [2,4,3,1]
[3,1,2,4], [3,1,4,2]
// etc
]
R = new int[4][4]
Then the algorithm is something like
1. If R is 'full', you're done.
2. Evaluate whether the next row from S fits into R:
   if yes, insert it into R, reset the iterator on S, and go to 1;
   if no, increment the iterator on S.
3. If there are more rows to check in S, go to 2.
4. Else you've iterated across S and none of the rows fit, so remove the most recent row added to R and go to 1. In other words, explore another branch.
To improve the efficiency of this algorithm, implement a better data structure. Rather than a flat array of all combinations, use a prefix tree / Trie of some sort to both reduce the storage size of the 'options' and reduce the search area within each iteration.
Here's a method (in Python) which is fast for N <= 9:
import random

def generate(n):
    a = [[0] * n for _ in range(n)]

    def rec(i, j):
        # Fill cell (i, j); cells are visited row by row.
        if i == n - 1 and j == n:
            return True
        if j == n:
            return rec(i + 1, 0)
        candidate = set(range(1, n + 1))
        for k in range(i):
            candidate.discard(a[k][j])  # values already used in this column
        for k in range(j):
            candidate.discard(a[i][k])  # values already used in this row
        if i == j:
            for k in range(i):
                candidate.discard(a[k][k])  # major diagonal
        if i + j == n - 1:
            for k in range(i):
                candidate.discard(a[k][n - 1 - k])  # minor diagonal
        candidate_list = list(candidate)
        random.shuffle(candidate_list)
        for e in candidate_list:
            a[i][j] = e
            if rec(i, j + 1):
                return True
        a[i][j] = 0  # nothing fits here: undo and backtrack
        return False

    rec(0, 0)
    return a

for row in generate(9):
    print(row)
Output (one possible result, since the candidates are shuffled randomly):
[8, 5, 4, 7, 1, 6, 2, 9, 3]
[2, 7, 5, 8, 4, 1, 3, 6, 9]
[9, 1, 2, 3, 6, 4, 8, 7, 5]
[3, 9, 7, 6, 2, 5, 1, 4, 8]
[5, 8, 3, 1, 9, 7, 6, 2, 4]
[4, 6, 9, 2, 8, 3, 5, 1, 7]
[6, 3, 1, 5, 7, 9, 4, 8, 2]
[1, 4, 8, 9, 3, 2, 7, 5, 6]
[7, 2, 6, 4, 5, 8, 9, 3, 1]

Resources