Longest odd Palindromic Substring with middle index i - algorithm

Longest Odd Palindromes
Problem Description
Given a string S (consisting of only lowercase characters) and Q queries.
In each query you will be given an integer i, and your task is to find the length of the longest odd palindromic substring whose middle index is i. Note:
1.) Assume 1 based indexing.
2.) Longest odd palindrome: A palindrome substring whose length is odd.
Problem Constraints
1 <= |S|, Q <= 1e5
1 <= i <= |S|
Input Format
First argument A is string S.
Second argument B is an array of integers where B[i] denotes the query index of ith query.
Output Format
Return an array of integers where ith integer denotes the answer of ith query.
Is there any better way to solve this question other than brute force, i.e., generating all the palindromic substrings and checking each one?

There is Manacher's algorithm, which calculates the number of palindromes centered at the i-th index in linear time.
After the precalculation stage you can answer each query in O(1). I changed the result array to contain the lengths of the longest palindromes centered at every position.
Python code (link contains C++ one)
def manacher_odd(s):
    n = len(s)
    odds = []
    l, r = 0, -1  # bounds of the rightmost palindrome found so far
    for i in range(n):
        # start from the mirrored radius if i lies inside [l, r], else from 1
        k = min(odds[l+r-i], r-i+1) if i <= r else 1
        # extend the palindrome around i as far as possible
        while (i+k < n) and (i-k >= 0) and (s[i+k] == s[i-k]):
            k += 1
        odds.append(k)
        if i+k-1 > r:
            l, r = i-k+1, i+k-1
    # convert radii to lengths
    for i in range(n):
        odds[i] = 2 * odds[i] - 1
    return odds
print(manacher_odd("abaaaba"))
[1, 3, 1, 7, 1, 3, 1]
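With the lengths precomputed, each query becomes a single array lookup. A minimal sketch of the query step (my code; B holds the 1-based indices from the problem statement):

def solve(S, B):
    odds = manacher_odd(S)
    # odds is 0-based, the queries are 1-based
    return [odds[i - 1] for i in B]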

There are 2 possible optimizations they might be looking for.
First, you can do an initial run over S first, cleverly building a lookup table, and then your query will just use that, which I think would be faster if B is long.
Alternatively, if not doing a look up, then while you're searching at index i, you'll potentially search neighboring indexes at the same time. As you check i, you can also be checking i+1, i-1, i+2, i-2, etc... as you go, and save that answer for later. This seems the less promising route to me, so I want to dive into the first idea more.
Finally, if B is quite short, then the best answer might be brute force, actually. It's good to know when to keep it simple.
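If you do go the brute-force route, each query is just an expansion around the given center, worst case O(|S|) per query. A minimal sketch (my code; i is the 1-based index from the problem statement):

def brute_query(S, i):
    c = i - 1  # convert to 0-based
    k = 0
    while c - k - 1 >= 0 and c + k + 1 < len(S) and S[c - k - 1] == S[c + k + 1]:
        k += 1
    return 2 * k + 1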
Initial run method
One optimization that comes to mind for a pre-process run is as follows:
1. Search the next unknown index: brute-force it by looking forwards and back, while recording the frequency of each letter (including the middle one). If a palindrome of length 1 or 3 was found, move to the next index and repeat.
2. If a palindrome of length 5 or longer was found, calculate the midpoints of any letters that showed up more than twice which are to the right of the current index.
3. Any point between the current index and the index of the last palindrome letter that isn't in the midpoints list is a 1 for length.
4. This means you'll only search the midpoints found in (2). After that, you'll continue searching from the index of the last letter of the palindrome found in (2).
An example
Let's say S starts with: ```a, b, c, d, a, f, g, f, g, f, a, d, c, b, a, ...``` (1-based indexing, as in the problem) and you have checked from ```i = 2``` up to ```i = 7``` but found nothing except a run of 3 at ```i = 7```. Now, you check index ```i = 8```. You will find a palindrome extending out 7 letters in each direction, for a total of 15, but as you check, note any letters that show up more than twice. In this case, there are 3 ```f```s and 4 ```a```s. Find any midpoints these pairs have that are right of the current index (8). In this case, 2 ```f```s have a midpoint of i=9, and the 2 right-most ```a```s have a midpoint of i=13. Once you're done looking at i=8, you can skip any index not on your list, all the way up to the last letter you found in i=8. For example, we only have to check i=9 and i=13, and then start from i=15, checking every step. We've been able to skip checking i=10, 11, 12, and 14.


Substring search with max 1's in a binary sequence

Problem
The task is to find a substring of the given binary string with the highest score. The substring must be at least the given minimum length.
score = number of 1s / substring length, where the score can range from 0 to 1.
Inputs:
1. min length of substring
2. binary sequence
Outputs:
1. index of first char of substring
2. index of last char of substring
Example 1:
input
-----
5
01010101111100
output
------
7
11
explanation
-----------
1. start with minimum window = 5
2. start_ind = 0, end_index = 4, score = 2/5 (0.4)
3. start_ind = 1, end_index = 5, score = 3/5 (0.6)
4. and so on...
5. start_ind = 7, end_index = 11, score = 5/5 (1) [max possible]
Example 2:
input
-----
5
10110011100
output
------
2
8
explanation
-----------
1. while calculating all scores for windows 5 to len(sequence)
2. max score occurs in the case: start_ind=2, end_ind=8, score=5/7 (0.7143) [max possible]
Example 3:
input
-----
4
00110011100
output
------
5
8
What I attempted
The only technique I could come up with was a brute-force technique, with nested for loops:
for window_size in (min to max)
for ind 0 to end
calculate score
save max score
Can someone suggest a better algorithm to solve this problem?
There are a few observations to make before we start talking about an algorithm; some of these observations have already been pointed out in the comments.
Maths
Take the minimum length to be M, the length of the entire string to be L, and a substring from the ith char to the jth char (inclusive-exclusive) to be S[i:j].
All optimal substrings will satisfy at least one of two conditions:
It is exactly M characters in length
It starts and ends with a 1 character
The reason for the latter being if it were longer than M characters and started/ended with a 0, we could just drop that 0 resulting in a higher ratio.
In the same spirit (again, for the 2nd case), there exists an optimal substring which is not preceded by a 1. Otherwise, if it were, we could include that 1, resulting in an equal or higher ratio. The same logic applies to the end of S and a following 1.
Building on the above- such a substring being preceded or followed by another 1 will NOT be optimal, unless the substring contains no 0s. In the case where it doesn't contain 0s, there will exist an optimal substring of length M as well anyways.
Again, that all only applies to the length greater than M case substrings.
Finally, there exists an optimal substring that has length at least M (by definition) and at most 2 * M - 1. Suppose an optimal substring had length K >= 2 * M; we could split it into two substrings of length floor(K/2) and ceil(K/2) - S[i:i+floor(K/2)] and S[i+floor(K/2):i+K] - both still of length at least M. If the substring has the score (ratio) R, and its halves R0 and R1, we would have one of two scenarios:
R = R0 = R1, meaning we could pick either half and get the same score as the combined substring, giving us a shorter substring.
If this substring has length less than 2 * M, we are done- we have an optimal substring of length [M, 2*M).
Otherwise, recurse on the new substring.
R0 != R1, so (without loss of generality) R0 < R < R1, meaning the combined substring would not be optimal in the first place.
Note that I say "there exists an optimal" as opposed to "the optimal". This is because there may be multiple optimal solutions, and the observations above may refer to different instances.
Algorithm
You could search every window size [M, 2*M) at every offset, which would already be better than a full search for small M. You can also try a two-phase approach:
search every M sized window, find the max score
search from the beginning of every run of 1s forward through a special list of ends of runs of 1s, implicitly skipping over 0s and irrelevant 1s, breaking when out of the [M, 2 * M) bound.
For random data, I only expect this to save a small factor- skipping 15/16 of the windows (ignoring the added overhead). For less-random data, you could potentially see huge benefits, particularly if there's LOTS of LARGE runs of 1s and 0s.
The biggest speedup you'll be able to do (besides limiting the window max to 2 * M) is computing a cumulative sum of the bit array. This lets you query "how many 1s were seen up to this point". You can then take the difference of two elements in this array to query "how many 1s occurred between these offsets" in constant time. This allows for very quick calculation of the score.
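Here's a minimal sketch of the simpler combination in Python (my code, names are mine): a cumulative-sum array plus a scan over every window size in [M, 2*M):

def best_window(bits, m):
    n = len(bits)
    # prefix[i] = number of 1s in bits[:i]
    prefix = [0] * (n + 1)
    for i, b in enumerate(bits):
        prefix[i + 1] = prefix[i] + (b == '1')
    best = (-1.0, 0, 0)
    for w in range(m, min(2 * m, n + 1)):          # window sizes [M, 2*M)
        for i in range(n - w + 1):                 # every offset
            score = (prefix[i + w] - prefix[i]) / w
            if score > best[0]:
                best = (score, i, i + w - 1)
    return best[1], best[2]  # first and last index of the best substring

print(best_window("01010101111100", 5))  # (7, 11)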
You can use the 2-pointer method, starting from both the left-most and right-most ends, then adjust them searching for the highest score.
We can add a cache to optimize time.
Example: (Python)
binary = "01010101111100"
length = 5

def get_score(binary, left, right):
    # fraction of 1s in binary[left..right] (inclusive)
    ones = 0
    for i in range(left, right + 1):
        if binary[i] == "1":
            ones += 1
    score = ones / (right - left + 1)
    return score

cache = {}

def get_sub(binary, length, left, right):
    # best (score, set of (left, right) pairs) over valid windows inside [left, right]
    if (left, right) in cache:
        return cache[(left, right)]
    table = [0, set()]
    if right - left + 1 >= length:
        scores = [[get_score(binary, left, right), set([(left, right)])],
                  get_sub(binary, length, left + 1, right),
                  get_sub(binary, length, left, right - 1),
                  get_sub(binary, length, left + 1, right - 1)]
        for s in scores:
            if s[0] > table[0]:
                table[0] = s[0]
                table[1] = set(s[1])  # copy so cached sets aren't mutated later
            elif s[0] == table[0]:
                for e in s[1]:
                    table[1].add(e)
    cache[(left, right)] = table
    return table

result = get_sub(binary, length, 0, len(binary) - 1)
print("Score: %f" % result[0])
print("Index: %s" % result[1])
Output
Score: 1.000000
Index: {(7, 11)}

Judgecode -- Sort with swap (2)

The problem I've seen is as below; does anyone have an idea on it?
http://judgecode.com/problems/1011
Given a permutation of integers from 0 to n - 1, sorting them is easy. But what if you can only swap a pair of integers every time?
Please calculate the minimal number of swaps
One classic approach is permutation cycles (https://en.wikipedia.org/wiki/Cycle_notation#Cycle_notation). The number of swaps needed equals the total number of elements minus the number of cycles.
For example:
1 2 3 4 5
2 5 4 3 1
Start with 1 and follow the cycle:
1 down to 2, 2 down to 5, 5 down to 1.
1 -> 2 -> 5 -> 1
3 -> 4 -> 3
We would need to swap index 1 with 5, then index 5 with 2, as well as index 3 with index 4. Altogether 3 swaps, or n - 2. We subtract the number of cycles from n because the cycle elements together total n, and each cycle needs one swap fewer than the number of elements in it.
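To make the counting concrete, here is a short sketch in Python (my code; it takes the permutation 0-indexed as in the problem, so the example above becomes [1, 4, 3, 2, 0]):

def min_swaps(perm):
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:  # walk the whole cycle containing i
                seen[j] = True
                j = perm[j]
    return n - cycles  # each cycle of length k costs k - 1 swaps

print(min_swaps([1, 4, 3, 2, 0]))  # 3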
Here is a simple implementation in C for the above problem. The algorithm is similar to user גלעד ברקן's:
Store the position of every element of a[] in b[]. So, b[a[i]] = i
Iterate over the initial array a[] from left to right.
At position i, check if a[i] is equal to i. If yes, then keep iterating.
If no, then it's time to swap. Look at the logic in the code minutely to see how the swapping takes place. This is the most important step, as both arrays a[] and b[] need to be modified. Increase the count of swaps.
Here is the implementation:
long long sortWithSwap(int n, int *a) {
    // temporary array keeping track of the position of every element: b[a[i]] = i
    int *b = (int*)malloc(sizeof(int) * n);
    int i, valai, posi;
    for (i = 0; i < n; i++) {
        b[a[i]] = i;
    }
    long long ans = 0;
    for (i = 0; i < n; i++) {
        if (a[i] != i) {
            valai = a[i];    // value currently sitting at position i
            posi = b[i];     // position where value i currently sits
            a[b[i]] = a[i];  // move that value to where i was
            a[i] = i;        // place i at its home position
            b[i] = i;        // update the positions of both moved values
            b[valai] = posi;
            ans++;
        }
    }
    free(b);
    return ans;
}
The essence of solving this problem lies in the following observations:
1. The elements in the array do not repeat.
2. The range of elements is from 0 to n-1, where n is the size of the array.
The way to approach
Once you have understood the approach, you can solve the problem in linear time.
Imagine how the array would look after sorting all the entries: it will satisfy arr[i] == i for all entries. Is that convincing?
First create a bool array named FIX, where FIX[i] == true if the ith location is fixed; initialize this array with false.
Start checking the original array for the match arr[i] == i. As long as this condition holds true, everything is okay, and while traversing the array you also update FIX[i] = true. The moment you find arr[i] != i you need to do something: arr[i] must hold some value x such that x > i. How do we guarantee that? The guarantee comes from the fact that the elements in the array do not repeat: if the array is sorted up to index i, the element at position i cannot come from the left, only from the right.
Now the value x essentially names some index, because the array only holds elements from 0 to n-1, and in the sorted array every element i must be at location i.
arr[i] == x means that not only is element i not at its correct position, but element x is also missing from its place.
Now, to fix the ith location you need to look at the xth location, because maybe the xth location holds i, in which case you swap the elements at indices i and x and the job is done. But wait: it's not necessary that index x holds i (letting you fix both locations in just 1 swap). Rather, index x may hold some value y, which again will be greater than i, because the array is only sorted up to location i.
Now, before you can fix position i, you need to fix x. Why? We will see later.
So you try to fix position x in the same way, and you keep fixing positions like this until you find element i at some location.
The procedure is to follow the links from arr[i] until you hit element i at some index.
It is guaranteed that you will eventually hit i while following the links this way. Why? Try proving it; work through some examples and you will see it.
Now you start fixing all the indices you saw on the path from index i to this index (say it is j). What you see is that the path you followed is a cycle, and for every index i on it, arr[i] is stored at its previous index (the index from which you reached it). Once you see that, you can fix the indices and mark all of them true in the FIX array. Then go ahead with the next index of the array and do the same thing until the whole array is fixed.
That is the complete idea. To only count the number of swaps, note that once you have found a cycle of k elements you need k - 1 swaps to fix it; after doing that you continue with the rest of the array. That is how you count the number of swaps.
Please let me know if you have any doubts about the approach.
You may also ask for C/C++ code help. Happy to help :-)

Implementing Parallel Algorithm for Longest Common Subsequence

I am trying to implement the parallel algorithm for the Longest Common Subsequence problem described in http://www.iaeng.org/publication/WCE2010/WCE2010_pp499-504.pdf
But I am having a problem with the variable C in Equation 6 on page 4.
The paper refers to C at the end of page 3 as:
Let C[1 : l] be the finite alphabet
I am not sure what is meant by this; I guess for the 2 strings ABCDEF and ABQXYEF it would be ABCDEFQXY. But what if my 2 strings are lists of objects (where my match test is, for example, obj1.Name = obj2.Name)? What would my C be here? Just a union of the 2 arrays?
Having read and studied the paper, I can say that C is supposed to be an array holding the alphabet of your strings, where the alphabet size (and, thus, the size of C) is l.
By the looks of your question, however, I feel the need to go deeper on this, because it looks like you didn't get the whole picture yet. What is P[i,j], and why do you need it? The answer is that you don't really need it, but it's an elegant optimization. In page 3, a little bit before Theorem 1, it is said that:
[...] This process ends when j-k = 0 at the k-th step, or a(i) =
b(j-k) at the k-th step. Assume that the process stops at the k-th
step, and k must be the minimum number that makes a(i) = b(j-k) or j-k
= 0. [...]
The recurrence relation in (3) is equivalent to (2), but the fundamental difference is that (2) expands recursively, whereas with (3) you never have recursive calls, provided that you know k. In other words, the magic behind (3) not expanding recursively is that you somehow know the spot where the recursion on (2) would stop, so you look at that cell immediately, rather than recursively approaching it.
Ok then, but how do you find out the value for k? Since k is the spot where (2) reaches a base case, it can be seen that k is the amount of columns that you have to "go back" on B until you are either off the limits (i.e., the first column that is filled with 0's) OR you find a match between a character in B and a character in A (which corresponds to the base case conditions in (2)). Remember that you will be matching the character a(i-1), where i is the current row.
So, what you really want is to find the last position in B before j where the character a(i-1) appears. If no such character ever appears in B before j, then that would be equivalent to reaching the case i = 0 or j-1 = 0 in (2); otherwise, it's the same as reaching a(i) = b(j-1) in (2).
Let's look at an example:
Consider that the algorithm is working on computing the values for i = 2 and j = 3 (in the paper's figure, the row and column are highlighted in gray). Imagine that the algorithm is working on the cell highlighted in black and is applying (2) to determine the value of S[2,2] (the position to the left of the black one). By applying (2), it would then start by looking at a(2) and b(2). a(2) is C, b(2) is G, so there's no match (this is the same procedure as the original, well-known algorithm). The algorithm now wants to find the value of S[2,2], because it is needed to compute S[2,3] (where we are). S[2,2] is not known yet, but the paper shows that it is possible to determine that value without referring to the row with i = 2. In (2), the 3rd case is chosen: S[2,2] = max(S[1,2], S[2,1]). Notice, if you will, that all this formula is doing is looking at the positions that would have been used to calculate S[2,2]. So, to rephrase that: we're computing S[2,3], we need S[2,2] for that, we don't know it yet, so we're going back on the table to see what the value of S[2,2] is, in pretty much the same way we did in the original, non-parallel algorithm.
When will this stop? In this example, it will stop when we find the letter C (this is our a(i)) in TGTTCGACA before the second T (the letter on the current column) OR when we reach column 0. Because there is no C before T, we reach column 0. Another example:
Here, (2) would stop with j-1 = 5, because that is the last position in TGTTCGACA where C shows up. Thus, the recursion reaches the base case a(i) = b(j-1) when j-1 = 5.
With this in mind, we can see a shortcut here: if you could somehow know the amount k such that j-1-k is a base case in (2), then you wouldn't have to go through the score table to find the base case.
That's the whole idea behind P[i,j]. P is a table where you lay down the whole alphabet vertically (on the left side); the string B is, once again, placed horizontally in the upper side. This table is computed as part of a preprocessing step, and it will tell you exactly what you will need to know ahead of time: for each position j in B, it says, for each character C[i] in C (the alphabet), what is the last position in B before j where C[i] is found (note that i is used to index C, the alphabet, and not the string A. Maybe the authors should have used another index variable to avoid confusion).
So, you can think of the semantics of an entry P[i,j] as something along the lines of: the last position in B where I saw C[i] before position j. For example, if your alphabet is sigma = {A, E, I, O, U}, and B = "AOOIUEI", then P is:
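(The table image is missing from this copy; reconstructing it from the definition above, with B's 1-based positions as columns, P would be:)

 j:  1  2  3  4  5  6  7
 B:  A  O  O  I  U  E  I
 A:  0  1  1  1  1  1  1
 E:  0  0  0  0  0  0  6
 I:  0  0  0  0  4  4  4
 O:  0  0  2  3  3  3  3
 U:  0  0  0  0  0  5  5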
Take the time to understand this table. Note the row for O. Remember: this row lists, for every position in B, where the last known "O" is. Only when j = 3 will we have a value that is not zero (it's 2), because that's the position after the first O in AOOIUEI. This entry says that the last position in B where O was seen before is position 2 (and, indeed, B[2] is an O, the one that follows A). Notice, in that same row, that for j = 4, we have the value 3, because now the last position for O is the one that corresponds to the second O in B (and since no more O's exist, the rest of the row will be 3).
Recall that building P is a preprocessing step necessary if you want to easily find the value of k that makes the recursion from equation (2) stop. It should make sense by now that P[i,j] is the k you're looking for in (3). With P, you can determine that value in O(1) time.
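A minimal sketch of that preprocessing step in Python (my code, not the paper's; positions are 1-based as in the paper):

def build_p(alphabet, b):
    # p[i][j] = last position (1-based) in b strictly before j
    # where alphabet[i] occurs, or 0 if it hasn't occurred yet
    m = len(b)
    p = [[0] * (m + 1) for _ in alphabet]
    for i, ch in enumerate(alphabet):
        for j in range(2, m + 1):
            p[i][j] = j - 1 if b[j - 2] == ch else p[i][j - 1]
    return p

p = build_p("AEIOU", "AOOIUEI")
print(p[3][1:])  # row for O: [0, 0, 2, 3, 3, 3, 3]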
Thus, the C[i] in (6) is a letter of the alphabet - the letter that we are currently considering. In the example above, C = [A,E,I,O,U], and C[1] = A, C[2] = E, etc. In equation (7), c is the position in C where a(i) (the current letter of string A being considered) lives. It makes sense: after all, when building the score table position S[i,j], we want to use P to find the value of k - we want to know where the last time we saw an a(i) in B before j was. We do that by reading P[index_of(a(i)), j].
Ok, now that you understand the use of P, let's see what's happening with your implementation.
About your specific case
In the paper, P is shown as a table that lists the whole alphabet. It is a good idea to iterate through the alphabet because the typical uses of this algorithm are in bioinformatics, where the alphabet is much, much smaller than the string A, making the iteration through the alphabet cheaper.
Because your strings are sequences of objects, your C would be the set of all possible objects, so you'd have to build a table P with the set of all possible object instances (nonsense, of course). This is definitely a case where the alphabet size is huge when compared to your string size. However, note that you will only be indexing P in those rows that correspond to letters from A: any row in P for a letter C[i] that is not in A is useless and will never be used. This makes your life easier, because it means you can build P with the string A instead of using the alphabet of every possible object.
Again, an example: if your alphabet is AEIOU, A is EEI and B is AOOIUEI, you will only be indexing P in the rows for E and I, so that's all you need in P:
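(Reconstructing that reduced table the same way, keeping only the rows for E and I:)

 j:  1  2  3  4  5  6  7
 B:  A  O  O  I  U  E  I
 E:  0  0  0  0  0  0  6
 I:  0  0  0  0  4  4  4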
This works and suffices, because in (7), P[c,j] is the entry in P for the character c, and c is the index of a(i). In other words: C[c] always belongs to A, so it makes perfect sense to build P for the characters of A instead of using the whole alphabet for the cases where the size of A is considerably smaller than the size of C.
All you have to do now is to apply the same principle to whatever your objects are.
I really don't know how to explain it any better. This may be a little dense at first. Make sure to re-read it until you really get it - and I mean every little detail. You have to master this before thinking about implementing it.
NOTE: You said you were looking for a credible and / or official source. I'm just another CS student, so I'm not an official source, but I think I can be considered "credible". I've studied this before and I know the subject. Happy coding!

Disperse Duplicates in an Array

Source : Google Interview Question
Write a routine to ensure that identical elements in the input are maximally spread in the output.
Basically, we need to place the same elements in such a way that the TOTAL spreading is as maximal as possible.
Example:
Input: {1,1,2,3,2,3}
Possible Output: {1,2,3,1,2,3}
Total dispersion = difference between positions of 1's + 2's + 3's = (4-1) + (5-2) + (6-3) = 9.
I am NOT AT ALL sure if there's an optimal polynomial-time algorithm available for this. Also, no other detail is provided for the question other than this.
What I thought is: calculate the frequency of each element in the input, then arrange them in the output, one distinct element at a time, until all the frequencies are exhausted.
I am not sure of my approach.
Any approaches/ideas, people?
I believe this simple algorithm would work:
count the number of occurrences of each distinct element.
make a new list
add one instance of all elements that occur more than once to the list (order within each group does not matter)
add one instance of all unique elements to the list
add one instance of all elements that occur more than once to the list
add one instance of all elements that occur more than twice to the list
add one instance of all elements that occur more than thrice to the list
...
Now, this will intuitively not give a good spread:
for {1, 1, 1, 1, 2, 3, 4} ==> {1, 2, 3, 4, 1, 1, 1}
for {1, 1, 1, 2, 2, 2, 3, 4} ==> {1, 2, 3, 4, 1, 2, 1, 2}
However, I think this is the best spread you can get given the scoring function provided.
Since the dispersion score counts the sum of the distances instead of the squared sum of the distances, you can have several duplicates close together, as long as you have a large gap somewhere else to compensate.
for a sum-of-squared-distances score, the problem becomes harder.
Perhaps the interview question hinged on the candidate recognizing this weakness in the scoring function?
In Perl:
@a = (9,9,9,2,2,2,1,1,1);
then make a hash table of the counts of the different numbers in the list, like a frequency table
map { $x{$_}++ } @a;
then repeatedly walk through all the keys found, with the keys in a known order, and add one of each remaining number to an output list until all the keys are exhausted
@r = ();
$g = 1;
while ( $g == 1 ) {
    $g = 0;
    for my $n (sort keys %x) {
        if ($x{$n} > 0) {
            push @r, $n;
            $x{$n}--;
            $g = 1;
        }
    }
}
I'm sure that this could be adapted to any programming language that supports hash tables
Python code for the algorithm suggested by Vorsprung and HugoRune:

from collections import Counter, defaultdict

def max_spread(data):
    cnt = Counter(data)
    res, num = [], list(cnt)
    while len(cnt) > 0:
        for i in num:
            if cnt[i] > 0:  # still copies of i left to place
                res.append(i)
                cnt[i] -= 1
                if cnt[i] == 0:
                    del cnt[i]
    return res

def calc_spread(data):
    # dispersion score: sum over values of (last index - first index)
    d = defaultdict(list)
    for i, v in enumerate(data):
        d[v].append(i)
    return sum(max(x) - min(x) for x in d.values())
HugoRune's answer takes some advantage of the unusual scoring function but we can actually do even better: suppose there are d distinct non-unique values, then the only thing that is required for a solution to be optimal is that the first d values in the output must consist of these in any order, and likewise the last d values in the output must consist of these values in any (i.e. possibly a different) order. (This implies that all unique numbers appear between the first and last instance of every non-unique number.)
The relative order of the first copies of non-unique numbers doesn't matter, and likewise nor does the relative order of their last copies. Suppose the values 1 and 2 both appear multiple times in the input, and that we have built a candidate solution obeying the condition I gave in the first paragraph that has the first copy of 1 at position i and the first copy of 2 at position j > i. Now suppose we swap these two elements. Element 1 has been pushed j - i positions to the right, so its score contribution will drop by j - i. But element 2 has been pushed j - i positions to the left, so its score contribution will increase by j - i. These cancel out, leaving the total score unchanged.
Now, any permutation of elements can be achieved by swapping elements in the following way: swap the element in position 1 with the element that should be at position 1, then do the same for position 2, and so on. After the ith step, the first i elements of the permutation are correct. We know that every swap leaves the scoring function unchanged, and a permutation is just a sequence of swaps, so every permutation also leaves the scoring function unchanged! This is true for the d elements at both ends of the output array.
When 3 or more copies of a number exist, only the position of the first and last copy contribute to the distance for that number. It doesn't matter where the middle ones go. I'll call the elements between the 2 blocks of d elements at either end the "central" elements. They consist of the unique elements, as well as some number of copies of all those non-unique elements that appear at least 3 times. As before, it's easy to see that any permutation of these "central" elements corresponds to a sequence of swaps, and that any such swap will leave the overall score unchanged (in fact it's even simpler than before, since swapping two central elements does not even change the score contribution of either of these elements).
This leads to a simple O(n log n) algorithm (or O(n) if you use bucket sort for the first step) to generate a solution array Y from a length-n input array X:
1. Sort the input array X.
2. Use a single pass through X to count the number of distinct non-unique elements. Call this d.
3. Set i, j and k to 0.
4. While i < n:
   - If X[i+1] == X[i], we have a non-unique element:
     - Set Y[j] = Y[n-j-1] = X[i].
     - Increment i twice, and increment j once.
     - While X[i] == X[i-1]:
       - Set Y[d+k] = X[i].
       - Increment i and k.
   - Otherwise we have a unique element:
     - Set Y[d+k] = X[i].
     - Increment i and k.
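A direct Python translation of those steps (my code; I added the end-of-array bounds checks the pseudocode leaves implicit):

from collections import Counter

def disperse(x):
    x = sorted(x)                              # step 1
    n = len(x)
    cnt = Counter(x)
    d = sum(1 for v in cnt.values() if v > 1)  # step 2: distinct non-unique elements
    y = [None] * n
    i = j = k = 0
    while i < n:
        if i + 1 < n and x[i + 1] == x[i]:     # non-unique element
            y[j] = y[n - j - 1] = x[i]         # first copy at the front, last at the back
            i += 2
            j += 1
            while i < n and x[i] == x[i - 1]:  # extra copies go in the middle
                y[d + k] = x[i]
                i += 1
                k += 1
        else:                                  # unique element goes in the middle
            y[d + k] = x[i]
            i += 1
            k += 1
    return y

print(disperse([1, 1, 2, 3, 2, 3]))  # e.g. [1, 2, 3, 3, 2, 1], dispersion 9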

Interview puzzle: Jump Game

Jump Game:
Given an array, start from the first element and reach the last by jumping. The jump length can be at most the value at the current position in the array. The optimum result is when you reach the goal in minimum number of jumps.
What is an algorithm for finding the optimum result?
An example: given array A = {2,3,1,1,4} the possible ways to reach the end (index list) are
0,2,3,4 (jump 2 to index 2, then jump 1 to index 3 then 1 to index 4)
0,1,4 (jump 1 to index 1, then jump 3 to index 4)
Since second solution has only 2 jumps it is the optimum result.
Overview
Given your array a and the index of your current position i, repeat the following until you reach the last element.
Consider all candidate "jump-to elements" in a[i+1] to a[a[i] + i]. For each such element at index e, calculate v = a[e] + e. If one of the elements is the last element, jump to the last element. Otherwise, jump to the element with the maximal v.
More simply put, of the elements within reach, look for the one that will get you furthest on the next jump. We know this selection, x, is the right one because compared to every other element y you can jump to, the elements reachable from y are a subset of the elements reachable from x (except for elements from a backward jump, which are obviously bad choices).
This algorithm runs in O(n) because each element need be considered only once (elements that would be considered a second time can be skipped).
Example
Consider the array of values a, indices i, and sums of index and value v.
i -> 0 1 2 3 4 5 6 7 8 9 10 11 12
a -> [4, 11, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
v -> 4 12 3 4 5 6 7 8 9 10 11 12 13
Start at index 0 and consider the next 4 elements. Find the one with maximal v. That element is at index 1, so jump to 1. Now consider the next 11 elements. The goal is within reach, so jump to the goal.
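A compact way to code this greedy in Python (a sketch, names are mine; it tracks the farthest reach of the current jump rather than scanning candidates explicitly, which amounts to the same selection):

def min_jumps(a):
    jumps, cur_end, far = 0, 0, 0
    for i in range(len(a) - 1):
        far = max(far, i + a[i])  # best "v" among elements seen so far
        if i == cur_end:          # end of the current jump's reach: jump again
            jumps += 1
            cur_end = far
    return jumps

print(min_jumps([2, 3, 1, 1, 4]))  # 2
print(min_jumps([4, 11, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]))  # 2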
Dynamic programming.
Imagine you have an array B where B[i] shows the minimum number of steps needed to reach index i in your array A. Your answer, of course, is in B[n], given A has n elements and indices start from 1. Assume C[i] = j means that you jumped from index j to index i (this is to recover the path taken later).
So, the algorithm is the following:
set B[i] to infinity for all i
B[1] = 0; <-- zero steps to reach B[1]
for i = 1 to n-1 <-- Each step updates possible jumps from A[i]
for j = 1 to A[i] <-- Possible jump sizes are 1, 2, ..., A[i]
if i+j > n <-- Array boundary check
break
if B[i+j] > B[i]+1 <-- If this path to B[i+j] was shorter than previous
B[i+j] = B[i]+1 <-- Keep the shortest path value
C[i+j] = i <-- Keep the path itself
The number of jumps needed is B[n]. The path that needs to be taken is recovered backwards:
n -> C[n] -> C[C[n]] -> ... -> 1
which can be restored (and reversed) by a simple loop.
The algorithm is of O(min(k,n)*n) time complexity and O(n) space complexity. n is the number of elements in A and k is the maximum value inside the array.
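For reference, a 0-indexed Python version of this DP with the path-restoration loop (my translation of the pseudocode above):

import math

def min_jumps_dp(a):
    n = len(a)
    b = [math.inf] * n  # b[i]: fewest jumps needed to reach index i
    c = [0] * n         # c[i]: index we jumped from to reach i
    b[0] = 0
    for i in range(n - 1):
        for j in range(1, a[i] + 1):
            if i + j >= n:           # array boundary check
                break
            if b[i + j] > b[i] + 1:  # found a shorter path to i+j
                b[i + j] = b[i] + 1
                c[i + j] = i
    # restore the path by walking predecessors back from the last index
    path, i = [n - 1], n - 1
    while i != 0:
        i = c[i]
        path.append(i)
    return b[n - 1], path[::-1]

print(min_jumps_dp([2, 3, 1, 1, 4]))  # (2, [0, 1, 4])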
Note
I am keeping this answer, but cheeken's greedy algorithm is correct and more efficient.
Construct a directed graph from the array, e.g., i -> j if |i-j| <= x[i] (basically, if you can move from i to j in one hop, have i -> j as an edge in the graph). Now, find the shortest path from the first node to the last.
FWIW, you can use Dijkstra's algorithm to find the shortest route. The complexity is O(|E| + |V| log |V|); since |E| < n^2, this becomes O(n^2). (Since every edge has unit weight, a plain BFS would also find the shortest path.)
We can track the farthest index reachable so far ("far") and, whenever an index within reach extends beyond it, update far.
Simple O(n) time complexity solution (note that this variant answers whether the end is reachable at all, not the minimum number of jumps):
public boolean canJump(int[] nums) {
    int far = 0;
    for (int i = 0; i < nums.length; i++) {
        if (i <= far) {
            far = Math.max(far, i + nums[i]);
        } else {
            return false;
        }
    }
    return true;
}
Start from the left end and traverse until a number is the same as its index; use the maximum of such numbers. For example, if the list is
list:  2738|4|6927
index: 0123|4|5678
once you've got this, repeat the above step from this number until you reach the extreme right.
273846927
000001234
In case you don't find anything matching the index, use the digit with the farthest index whose value is greater than its index, in this case 7 (because pretty soon the index will be greater than the number; you can probably just count for 9 indices).
basic idea:
start building the path from the end to the start by finding all array elements from which it is possible to make the last jump to the target element (all i such that A[i] >= target - i).
treat each such i as the new target and find a path to it (recursively).
choose the minimal length path found, append the target, return.
A simple example in Python:

ls1 = [2,3,1,1,4]
ls2 = [4,11,1,1,1,1,1,1,1,1,1,1,1]

# finds the shortest path in ls to the target index tgti
def find_path(ls, tgti):
    # if the target is the first element in the array, return its index
    if tgti <= 0:
        return [0]
    # for each 0 <= i < tgti, if it is possible to reach
    # tgti from i (ls[i] >= tgti-i) then find the path to i
    sub_paths = [find_path(ls, i) for i in range(tgti - 1, -1, -1) if ls[i] >= tgti - i]
    # find the minimum length path in sub_paths
    min_res = sub_paths[0]
    for p in sub_paths:
        if len(p) < len(min_res):
            min_res = p
    # add the current target to the chosen path
    min_res.append(tgti)
    return min_res

print(find_path(ls1, len(ls1) - 1))
print(find_path(ls2, len(ls2) - 1))

>>> [0, 1, 4]
>>> [0, 1, 12]
