Why is an ancestral array needed when recovering longest increasing subsequence? - algorithm

I looked at the following website describing the longest increasing subsequence algorithm: https://www.fyears.org/2016/12/LIS.html
In the section "how to reconstruct the subsequence?", it says that
"We should pay attention that the dp in the end is NOT the LIS."
Somehow I don't see how dp is not LIS?
We know that dp is sorted and that it contains as many entries modified by the algorithm as the length of the LIS. An element at index i cannot be equal to an element at i-1, since for every index dp[i] contains the smallest possible ending value in all increasing subsequences with length i + 1. So, if there is a subsequence of length i + 1, this implies that there is also a subsequence of length i, which consequently must end at a smaller value, right?

The LIS is a subsequence, so its elements must appear in the same order as in the input, but the dp array does not preserve element order. Check the array [2, 3, 1]: dp will be [1, 3] after all iterations, but [1, 3] is not a subsequence of the original array.
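A minimal sketch of why the ancestral (parent) array helps, assuming a strictly increasing subsequence is wanted. The names `tails`, `tails_idx` and `parent` are illustrative and not taken from the linked article: `tails` plays the role of dp, while `parent[i]` remembers which earlier element precedes `seq[i]` in the best subsequence ending there.

```python
from bisect import bisect_left

def lis_with_parents(seq):
    """Return (dp array, one actual longest increasing subsequence)."""
    tails = []       # tails[k]: smallest ending value of a subsequence of length k+1
    tails_idx = []   # index in seq of the element stored in tails[k]
    parent = [-1] * len(seq)  # predecessor index of seq[i], or -1 if none
    for i, x in enumerate(seq):
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
            tails_idx.append(i)
        else:
            tails[k] = x
            tails_idx[k] = i
        parent[i] = tails_idx[k - 1] if k > 0 else -1
    # Walk the parent links back from the end of the longest subsequence.
    out = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        out.append(seq[i])
        i = parent[i]
    return tails, out[::-1]
```

For [2, 3, 1] this returns the dp array [1, 3] together with the subsequence [2, 3], which shows the two are different things.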

Related

Is there any algorithm to address the longest common subsequence problem with different weights for each character?

I'm looking for an algorithm that addresses the LCS problem for two strings with the following conditions:
Each string consists of English characters and each character has a weight. For example:
sequence 1 (S1): "ABBCD" with weights [1, 2, 4, 1, 3]
sequence 2 (S2): "TBDC" with weights [7, 5, 1, 2]
Suppose that MW(s, S) is defined as the maximum weight of the sub-sequence s in string S with respect to the associated weights. The heaviest common sub-sequence (HCS) is defined as:
HCS = argmax_s min(MW(s, S1), MW(s, S2))
The algorithm output should be the indexes of HCS in both strings and the weight. In this case, the indexes will be:
I_S1 = [2, 4] --> MW("BD", "ABBCD") = 7
I_S2 = [1, 2] --> MW("BD", "TBDC") = 6
Therefore HCS = "BD", and weight = min(MW(s, S1), MW(s, S2)) = 6.
The table that you need to build will have this structure:
    for each position in sequence 1
        for each position in sequence 2
            for each extreme pair of (weight1, weight2)
                (last_position1, last_position2)
Here an extreme pair is one that is not dominated: it is not possible to find a common subsequence up to that point whose weights in sequence 1 and in sequence 2 are both >= and at least one is >.
There may be multiple extreme pairs, where one sequence is higher than the other.
The rule is that at the (i, -1) or (-1, j) positions, the only extreme pair is the empty set with weight 0. At any other we merge the extreme pairs for (i-1, j) and (i, j-1). And then if seq1[i] = seq2[j], then add the options where you went to (i-1, j-1) and then included the i and j in the respective subsequences. (So add weight1[i] and weight2[j] to the weights then do a merge.)
For that merge you can sort all of the extreme pairs from the previous points by weight1 descending (breaking ties by weight2 descending), then throw away every pair whose weight2 is less than or equal to the best weight2 already seen earlier in the sorted order.
When you reach the end you can find the extreme pair with the highest min, and that is your answer. You can then walk the data structure back to find the subsequences in question.
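A rough sketch of the table described above, computing only the final weight and not the indexes. The function names are made up for illustration, and the merge sorts by weight1 descending so that a dominated pair is always visited after a pair that dominates it.

```python
def pareto_front(pairs):
    """Keep only the (weight1, weight2) pairs that no other pair dominates."""
    front = []
    best_w2 = float("-inf")
    for w1, w2 in sorted(set(pairs), key=lambda p: (-p[0], -p[1])):
        if w2 > best_w2:          # strictly better weight2 than anything heavier in weight1
            front.append((w1, w2))
            best_w2 = w2
    return front

def heaviest_common_subsequence_weight(s1, w1, s2, w2):
    n, m = len(s1), len(s2)
    # front[i][j]: extreme pairs for common subsequences of s1[:i] and s2[:j].
    # Row 0 and column 0 hold only the empty subsequence with weight (0, 0).
    front = [[[(0, 0)] for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = front[i - 1][j] + front[i][j - 1]
            if s1[i - 1] == s2[j - 1]:
                candidates += [(a + w1[i - 1], b + w2[j - 1])
                               for a, b in front[i - 1][j - 1]]
            front[i][j] = pareto_front(candidates)
    # The answer is the extreme pair with the highest min.
    return max(min(a, b) for a, b in front[n][m])
```

On the example from the question, "ABBCD" with weights [1, 2, 4, 1, 3] and "TBDC" with weights [7, 5, 1, 2], this gives 6. Recovering the indexes would need the (last_position1, last_position2) back-pointers, which are omitted here.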

Circular Longest Increasing Subsequence

How can I find the length of the longest increasing subsequence if the numbers are arranged in a circular fashion? For example:
LIS of 3, 2, 1 is 3 [1, 2, 3].
P.S I know how to solve Linear LIS in O(nlogn).
Problem Source: https://www.codechef.com/problems/D2/
Update: The LIS has to be calculated by going through the circle only once.
Example 2: LIS of 1, 4, 3 is 2 and that could be either of 1, 3 or 1, 4 or 3, 4.
Thanks
The example in the question is wrong: a circular rotation of [1, 2, 3] would be [2, 3, 1] or [3, 1, 2].
In that case, we can solve it in a similar way to the longest increasing subsequence:
Sort the list in ascending order.
Find the min element in the original list.
Start iterating from min_index in the original list, compare it with the sorted list, and build an intermediate array L[i][j] with the same logic as longest common subsequence. i varies from min_index to (min_index+n-1)%n.
Finally return L[max_index][n].
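For small inputs, a brute-force cross-check can be handy. The sketch below is not the LCS-based procedure from the steps above: it simply tries every starting point on the circle and runs the usual O(n log n) LIS on each rotation, so it costs O(n^2 log n). It assumes "strictly increasing" and that going through the circle once means reading n consecutive elements from some start.

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of the longest strictly increasing subsequence of a linear list."""
    tails = []
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)

def circular_lis_length(seq):
    """Best linear LIS over all rotations of seq (brute force)."""
    n = len(seq)
    return max((lis_length(seq[s:] + seq[:s]) for s in range(n)), default=0)
```

For 1, 4, 3 this returns 2, matching Example 2 in the question.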

Find minimum length interval that has K as its factor

Given an integer K and a list of N integers, we need to find all the shortest intervals in the list such that the product of the integers in each interval is a multiple of K.
Example: let N=5, K=6 and the array be [2,9,4,3,16]. Here the minimum interval length is 2, and the product of each interval below is a multiple of K.
Intervals: [1, 2] , [2, 3] , [3, 4] , [4, 5].
I need to find both the minimum length and the start and end of every such interval.
The problem is that the constraints are large: 1≤N≤2×10^5, 1≤K≤10^17, and array elements are up to 10^15.
You can use a segment tree to be able to compute product(a[i...j])%K in O(log N).
From the principle that if product(a[i...j])%K==0, then product(a[i...j+k])%K==0, you can, for each i, perform a binary search to find the first j where product(a[i..j])%K==0.
In the first pass, find what's the minimum length. Then do another pass finding and printing which i's have that length.
That's O(n log^2 n). For 2*10^5 that should be fast enough, especially given that a naive listing of all qualifying intervals can have O(n^2) items (e.g. n/2 subarrays with n/2 items each).
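A sketch of the segment tree plus binary search idea, under a few assumptions: intervals are reported 1-based as in the question, the helper names are invented here, and the two passes are folded into one by remembering the first valid end for every start. Python integers do not overflow; in a fixed-width language the products modulo K (K up to 10^17) would need 128-bit multiplication.

```python
def shortest_multiple_intervals(a, k):
    """Return (min_length, [(start, end), ...]) with 1-based inclusive bounds."""
    n = len(a)
    # Iterative segment tree over products modulo k; leaves live at n..2n-1.
    tree = [1] * (2 * n)
    for i, x in enumerate(a):
        tree[n + i] = x % k
    for i in range(n - 1, 0, -1):
        tree[i] = tree[2 * i] * tree[2 * i + 1] % k

    def product_mod(lo, hi):
        """Product of a[lo:hi] modulo k (half-open range)."""
        res = 1
        lo += n
        hi += n
        while lo < hi:
            if lo & 1:
                res = res * tree[lo] % k
                lo += 1
            if hi & 1:
                hi -= 1
                res = res * tree[hi] % k
            lo //= 2
            hi //= 2
        return res % k

    first_end = []  # (start, smallest end) for every start that has a valid end
    for i in range(n):
        if product_mod(i, n) != 0:
            continue  # even the longest interval from i is not a multiple of k
        lo, hi = i, n - 1
        while lo < hi:  # binary search on the monotone predicate
            mid = (lo + hi) // 2
            if product_mod(i, mid + 1) == 0:
                hi = mid
            else:
                lo = mid + 1
        first_end.append((i, lo))
    if not first_end:
        return None, []
    best = min(j - i + 1 for i, j in first_end)
    return best, [(i + 1, j + 1) for i, j in first_end if j - i + 1 == best]
```

With the array [2, 9, 4, 3, 16] and K = 6 the sketch returns length 2 and the intervals (1, 2), (2, 3), (3, 4), (4, 5).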

Summing a given series of numbers in order to reset the summation as many times as possible algorithm

I'm looking for an efficient algorithm (not necessarily code) for solving the following question:
Given n positive and negative numbers that sum up to zero, we would like to find a starting index that will cause the cumulative sum to hit zero as many times as possible.
It doesn't have to be done in a specific manner, but efficiency is what matters here: we want the algorithm/idea to do this in less than quadratic time complexity.
An example:
Given the numbers: 2, -1, 3, 1, -3, -2:
If we start summing with 2 (first index), the sum will be zero only once (at the end of the summation), but starting with -1 will yield zero twice during the summation.
The given numbers may have more than one "best index", but we would like to find at least one of these indexes.
I've tried doing it with binary search, but didn't make much progress, so any hints/help will be appreciated.
You can compute prefix sums. In terms of prefix sums, zeros are positions that have the same value of a prefix sum as the start position. So the problem is reduced to finding the most frequent element in the array of prefix sums. It can be solved efficiently using sorting or hash tables.
Here is an example:
Input: {2, -1, 3, 1, -3, -2}
Prefix sums: {0, 2, 1, 4, 5, 2, 0}
The most frequent element is 2 (the leading and trailing 0 mark the same point on the circle, so they count only once). The first occurrence of 2 is right after the first element. Thus, starting from the second element yields an optimal answer.
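The prefix-sum idea can be sketched with a hash table as follows. The function name is illustrative, the returned index is 0-based, and the final prefix sum is dropped because it describes the same point on the circle as the initial 0.

```python
from collections import Counter
from itertools import accumulate

def best_start(nums):
    """Return (0-based start index, number of times the running sum hits zero)."""
    # prefix[i] is the sum of nums[:i], for i = 0 .. n-1
    prefix = [0] + list(accumulate(nums))[:-1]
    counts = Counter(prefix)
    value, freq = max(counts.items(), key=lambda kv: kv[1])
    # Starting right after the first position where this prefix value occurs
    # makes the running sum return to zero once per occurrence.
    return prefix.index(value), freq
```

For 2, -1, 3, 1, -3, -2 this returns start index 1 (the element -1) with two zeros, which agrees with the example in the question. Building the counter is O(n) expected time.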

Finding the Longest Palindrome Subsequence with less memory

I am trying to solve a dynamic programming problem from Cormen's Introduction to Algorithms 3rd edition (pg 405) which asks the following:
"A palindrome is a nonempty string over some alphabet that reads the same forward and backward. Examples of palindromes are all strings of length 1, civic, racecar, and aibohphobia (fear of palindromes).
Give an efficient algorithm to find the longest palindrome that is a subsequence of a given input string. For example, given the input character, your algorithm should return carac."
Well, I could solve it in two ways:
First solution:
The Longest Palindrome Subsequence (LPS) of a string is simply the Longest Common Subsequence of itself and its reverse. (I've built this solution after solving another related question which asks for the Longest Increasing Subsequence of a sequence).
Since it's simply a LCS variant, it also takes O(n²) time and O(n²) memory.
Second solution:
The second solution is a bit more elaborate, but also follows the general LCS template. It comes from the following recurrence:
lps(s[i..j]) =
    s[i] + lps(s[i+1..j-1]) + s[j],           if s[i] == s[j];
    max(lps(s[i+1..j]), lps(s[i..j-1]))       otherwise
The pseudocode for calculating the length of the lps is the following:
compute-lps(s, n):
    // palindromes with length 1
    for i = 1 to n:
        c[i, i] = 1
    // palindromes with length up to 2
    for i = 1 to n-1:
        c[i, i+1] = (s[i] == s[i+1]) ? 2 : 1
    // palindromes with length up to j+1
    for j = 2 to n-1:
        for i = 1 to n-j:
            if s[i] == s[i+j]:
                c[i, i+j] = 2 + c[i+1, i+j-1]
            else:
                c[i, i+j] = max( c[i+1, i+j] , c[i, i+j-1] )
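The pseudocode translates fairly directly into Python. The version below is only a sketch with 0-based indexes: it fills the same table c and then walks it back to build one longest palindromic subsequence, which is the step that needs the full O(n²) table.

```python
def longest_palindromic_subsequence(s):
    n = len(s)
    if n == 0:
        return ""
    # c[i][j]: length of the longest palindromic subsequence of s[i..j]
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        c[i][i] = 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            if s[i] == s[j]:
                inner = c[i + 1][j - 1] if i + 1 <= j - 1 else 0
                c[i][j] = 2 + inner
            else:
                c[i][j] = max(c[i + 1][j], c[i][j - 1])
    # Walk the table from (0, n-1) inward, collecting matched end characters.
    left, right = [], []
    i, j = 0, n - 1
    while i < j:
        if s[i] == s[j]:
            left.append(s[i])
            right.append(s[j])
            i += 1
            j -= 1
        elif c[i + 1][j] >= c[i][j - 1]:
            i += 1
        else:
            j -= 1
    middle = s[i] if i == j else ""
    return "".join(left) + middle + "".join(reversed(right))
```

For the input character this returns carac.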
It still takes O(n²) time and memory if I want to actually construct the lps (because I'll need all the cells of the table). Looking at related problems such as LIS, which can be solved with less memory by approaches other than the LCS-like one (LIS is solvable with O(n) memory), I was wondering if it's possible to solve this with O(n) memory, too.
LIS achieves this bound by linking the candidate subsequences, but with palindromes it's harder because what matters here is not the previous element in the subsequence, but the first. Does anyone know if it is possible to do it, or are the previous solutions memory optimal?
Here is a very memory-efficient version, but I haven't demonstrated that it is always O(n) memory. (With a preprocessing step it can be better than O(n^2) CPU, though O(n^2) is the worst case.)
Start from the left-most position. For each position, keep track of a table of the farthest out points at which you can generate reflected subsequences of length 1, 2, 3, etc. (Meaning that a subsequence to the left of our point is reflected to the right.) For each reflected subsequence we store a pointer to the next part of the subsequence.
As we work our way right, we search from the RHS of the string to the position for any occurrences of the current element, and try to use those matches to improve the bounds we previously had. When we finish, we look at the longest mirrored subsequence and we can easily construct the best palindrome.
Let's consider this for character.
We start with our best palindrome being the letter 'c', and our mirrored subsequence being reached with the pair (0, 10), which are off the ends of the string.
Next consider the 'c' at position 1. Our best mirrored subsequences in the form (length, end, start) are now [(0, 10, 0), (1, 6, 1)]. (I'll leave out the linked list you need to generate to actually find the palindrome.)
Next consider the h at position 2. We do not improve the bounds [(0, 10, 0), (1, 6, 1)].
Next consider the a at position 3. We improve the bounds to [(0, 10, 0), (1, 6, 1), (2, 5, 3)].
Next consider the r at position 4. We improve the bounds to [(0, 10, 0), (1, 9, 4), (2, 5, 3)]. (This is where the linked list would be useful.)
Working through the rest of the list we do not improve that set of bounds.
So we wind up with the longest mirrored list being of length 2. We'd follow the linked list (that I didn't record in this description) to find that it is ac. Since the ends of that list are at positions (5, 3) we can flip the list, insert character 4, then append the list to get carac.
In general the maximum memory that it will require is to store all of the lengths of the maximal mirrored subsequences plus the memory to store the linked lists of said subsequences. Typically this will be a very small amount of memory.
As a classic memory/CPU tradeoff, you can preprocess the list once in time O(n) to generate an O(n)-sized hash of arrays recording where specific sequence elements appear. This lets you scan for "improve mirrored subsequence with this pairing" without having to consider the whole string, which should generally be a major saving on CPU for longer strings.
The first solution in @Luiz Rodrigo's question is wrong: the Longest Common Subsequence (LCS) of a string and its reverse is not necessarily a palindrome.
Example: for string CBACB, CAB is an LCS of the string and its reverse and it's obviously not a palindrome.
There is a way, however, to make it work. After the LCS of a string and its reverse is built, take the left half of it (including the mid-character if the LCS has odd length) and complement it on the right with the reversed left half (not including the mid-character if the LCS has odd length).
It will obviously be a palindrome and it can be trivially proven that it will be a subsequence of the string.
For above LCS, the palindrome built this way will be CAC.
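A small sketch of this repair, with hypothetical helper names. The `lcs` function is a textbook O(n²) table with backtracking, and which of several equally long common subsequences it returns depends on its tie-breaking, so only the mirroring step is specific to this answer.

```python
def lcs(a, b):
    """One longest common subsequence of a and b (standard DP table)."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if a[i] == b[j]:
                t[i + 1][j + 1] = t[i][j] + 1
            else:
                t[i + 1][j + 1] = max(t[i][j + 1], t[i + 1][j])
    out = []
    i, j = n, m
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif t[i - 1][j] >= t[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

def mirror_left_half(common):
    """Left half (with the middle character if odd) plus its mirror image."""
    half = common[:(len(common) + 1) // 2]
    return half + half[:len(common) // 2][::-1]

def palindrome_from_lcs(s):
    return mirror_left_half(lcs(s, s[::-1]))
```

Here mirror_left_half("CAB") gives CAC, as in the example above.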
