Permutations unrank - algorithm

I know of an algorithm (it can be found online) to rank a permutation, i.e. given a permutation return the integer index into the list of lexicographically-sorted permutations, but I don't know any unrank algorithm that does the opposite: given an index i, return the i-th permutation in that lexicographic order.
Since I couldn't find any, can somebody please shed some light?

Let's say you are permuting the letters (a, b, c).
There are 3×2×1=6 permutations. Out of these, a third starts with a and lexicographically precedes another third starting with b, preceding the last third starting with c.
For each of these thirds there are two halves, one starting with the first letter left after choosing the first, and the other with the second.
Each of these halves has only one element (the last letter).
So, given a set of three elements and an index between zero and five (let's say 3), we can divide (with remainder) by the size of each "third" to get the first letter. Now:
the set has size n=3
there are n!=6 permutations
there are n=3 groups of permutations which start with each of the n elements
each group has size n!/n = (n-1)! = 6/3 = 2 elements
To determine the index of the first element, we divide by 2 with remainder:
3÷2 = 1 rem 1
Since our set is (a,b,c), this tells us that the first letter is b.
Now, we can remove the letter b from the set, and use the remainder as the new index. We get the set (a, c) and index 1. Re-applying the algorithm,
the set has size n=2
there are n!=2 permutations
there are n=2 groups of permutations which start with each of the n elements
each group has size n!/n = (n-1)! = 2/2 = 1 element
To determine the index of the first element, we divide by 1 with remainder:
1÷1 = 1 rem 0
Since our set is (a,c), this tells us that the second letter is c.
The third set is reduced to the singleton a and that's our third letter.
The permutation with index 3 is b,c,a.
Let's check it:
0 abc
1 acb
2 bac
3 bca <-- correct!
4 cab
5 cba
So, putting this in a real algorithm and generalizing:
public string NthPerm(string set, int n)
{
    // assumes the characters of set are sorted and 0 <= n < set.Length!
    var res = "";
    while (set.Length > 0)
    {
        // size of each group sharing a first character: (set.Length-1)!
        var setSize = 1;
        for (var k = 2; k < set.Length; k++) setSize *= k;
        var index = n / setSize;
        res += set[index];
        // drop the chosen character from the remaining set
        set = set.Remove(index, 1);
        n = n % setSize;
    }
    return res;
}
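For comparison, here is a minimal Python sketch of the same divide-with-remainder idea. The name nth_perm is made up for this illustration, and the sketch assumes the items are single characters and that the index is zero-based.

```python
from math import factorial

def nth_perm(items, n):
    """Return the n-th (zero-based) lexicographic permutation of items."""
    pool = sorted(items)                      # remaining elements, in order
    if not 0 <= n < factorial(len(pool)):
        raise ValueError("index out of range")
    out = []
    while pool:
        group = factorial(len(pool) - 1)      # permutations per first element
        index, n = divmod(n, group)           # which group, and offset inside it
        out.append(pool.pop(index))
    return "".join(out)

print(nth_perm("abc", 3))  # bca, matching the table above
```

Calling it for every index from 0 to 5 should reproduce the six rows of the check table in order.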

Related

Substring search with max 1's in a binary sequence

Problem
The task is to find a substring from the given binary string with highest score. The substring should be at least of given min length.
score = number of 1s / substring length where score can range from 0 to 1.
Inputs:
1. min length of substring
2. binary sequence
Outputs:
1. index of first char of substring
2. index of last char of substring
Example 1:
input
-----
5
01010101111100
output
------
7
11
explanation
-----------
1. start with minimum window = 5
2. start_ind = 0, end_index = 4, score = 2/5 (0.4)
3. start_ind = 1, end_index = 5, score = 3/5 (0.6)
4. and so on...
5. start_ind = 7, end_index = 11, score = 5/5 (1) [max possible]
Example 2:
input
-----
5
10110011100
output
------
2
8
explanation
-----------
1. while calculating all scores for windows 5 to len(sequence)
2. max score occurs in the case: start_ind=2, end_ind=8, score=5/7 (0.7143) [max possible]
Example 3:
input
-----
4
00110011100
output
------
5
8
What I attempted
The only technique I could come up with was brute force, with nested for loops:
for window_size in (min to max)
    for ind 0 to end
        calculate score
        save max score
Can someone suggest a better algorithm to solve this problem?
There are a few observations to make before we start talking about an algorithm. Some of these observations have already been pointed out in the comments.
Maths
Take the minimum length to be M, the length of the entire string to be L, and a substring from the ith char to the jth char (inclusive-exclusive) to be S[i:j].
All optimal substrings will satisfy at least one of two conditions:
It is exactly M characters in length
It starts and ends with a 1 character
The reason for the latter is that if it were longer than M characters and started or ended with a 0, we could just drop that 0, resulting in a higher ratio.
In the same spirit (again, for the 2nd case), there exists an optimal substring which is not preceded by a 1. Otherwise, if it were, we could include that 1, resulting in an equal or higher ratio. The same logic applies to the end of S and a following 1.
Building on the above- such a substring being preceded or followed by another 1 will NOT be optimal, unless the substring contains no 0s. In the case where it doesn't contain 0s, there will exist an optimal substring of length M as well anyways.
Again, that all only applies to the length greater than M case substrings.
Finally, there exists an optimal substring that has length at least M (by definition) and at most 2 * M - 1. Suppose an optimal substring had length K >= 2 * M. We could split it into two substrings of length floor(K/2) and ceil(K/2): S[i:i+floor(K/2)] and S[i+floor(K/2):i+K]. Both halves have length at least M, so both are valid candidates. If the substring has the score (ratio) R, and its halves have scores R0 and R1, we would have one of two scenarios:
R = R0 = R1, meaning we could pick either half and get the same score as the combined substring, giving us a shorter substring.
    If this substring has length less than 2 * M, we are done: we have an optimal substring with length in [M, 2*M).
    Otherwise, recurse on the new substring.
R0 != R1, so (without loss of generality) R0 < R < R1, because R is a length-weighted average of R0 and R1. The second half alone would then beat the combined substring, meaning the combined substring would not be optimal in the first place.
Note that I say "there exists an optimal" as opposed to "the optimal". This is because there may be multiple optimal solutions, and the observations above may refer to different instances.
Algorithm
You could search every window size [M, 2*M) at every offset, which would already be better than a full search for small M. You can also try a two-phase approach:
search every M sized window, find the max score
search from the beginning of every run of 1s forward through a special list of ends of runs of 1s, implicitly skipping over 0s and irrelevant 1s, breaking when out of the [M, 2 * M) bound.
For random data, I only expect this to save a small factor- skipping 15/16 of the windows (ignoring the added overhead). For less-random data, you could potentially see huge benefits, particularly if there's LOTS of LARGE runs of 1s and 0s.
The biggest speedup you'll be able to do (besides limiting the window max to 2 * M) is computing a cumulative sum of the bit array. This lets you query "how many 1s were seen up to this point". You can then take the difference of two elements in this array to query "how many 1s occurred between these offsets" in constant time. This allows for very quick calculation of the score.
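A minimal Python sketch of the cumulative-sum idea combined with the [M, 2*M) length bound might look like this. The name best_window is made up for this illustration, Fraction is used only to compare ratios exactly, and ties are resolved in favour of the first window found, which is an assumption since the question does not say how to break ties.

```python
from fractions import Fraction

def best_window(bits, m):
    """Return ((start, end), score) of a best window with length in [m, 2*m)."""
    n = len(bits)
    # prefix[i] = number of 1s in bits[0:i]
    prefix = [0] * (n + 1)
    for i, ch in enumerate(bits):
        prefix[i + 1] = prefix[i] + (ch == "1")
    best_score, best = Fraction(-1), None
    for start in range(n - m + 1):
        max_len = min(2 * m - 1, n - start)
        for length in range(m, max_len + 1):
            ones = prefix[start + length] - prefix[start]   # O(1) range count
            score = Fraction(ones, length)
            if score > best_score:
                best_score, best = score, (start, start + length - 1)
    return best, best_score

print(best_window("01010101111100", 5))  # ((7, 11), Fraction(1, 1))
```

This does O(L * M) work in total, since each of the L offsets tries at most M window lengths.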
You can use the two-pointer method, starting from the left-most and right-most ends and then adjusting them while searching for the highest score.
We can add a cache to save time.
Example (Python):
binary="01010101111100"
length=5

def get_score(binary,left,right):
    # fraction of 1s in binary[left..right], both ends inclusive
    ones=0
    for i in range(left,right+1):
        if binary[i]=="1":
            ones+=1
    score= ones/(right-left+1)
    return score

cache={}

def get_sub(binary,length,left,right):
    # returns [best score, set of (left, right) windows achieving it]
    if (left,right) in cache:
        return cache[(left,right)]
    table=[0,set()]
    if right-left+1<length:
        pass
    else:
        scores=[[get_score(binary,left,right),set([(left,right)])],
                get_sub(binary,length,left+1,right),
                get_sub(binary,length,left,right-1),
                get_sub(binary,length,left+1,right-1)]
        for s in scores:
            if s[0]>table[0]:
                table[0]=s[0]
                table[1]=set(s[1])  # copy so cached child sets are not mutated
            elif s[0]==table[0]:
                for e in s[1]:
                    table[1].add(e)
    cache[(left,right)]=table
    return table

result=get_sub(binary,length,0,len(binary)-1)
print("Score: %f"%result[0])
print("Index: %s"%result[1])
Output
Score: 1.000000
Index: {(7, 11)}

Number of distinct sequences of fixed length which can be generated using a given set of numbers

I am trying to find different sequences of fixed length which can be generated using the numbers from a given set (distinct elements) such that each element from set should appear in the sequence. Below is my logic:
eg. Let the set consists of S elements, and we have to generate sequences of length K (K >= S)
1) First we have to choose S places out of K and place each element from the set in random order. So, C(K,S)*S!
2) After that, remaining places can be filled from any values from the set. So, the factor
(K-S)^S should be multiplied.
So, overall result is
C(K,S)S!((K-S)^S)
But, I am getting wrong answer. Please help.
PS: C(K,S) : No. of ways selecting S elements out of K elements (K>=S) irrespective of order. Also, ^ : power symbol i.e 2^3 = 8.
Here is my code in python:
# m is the no. of elements to select from a set of n elements
# fact is a list containing factorial values, i.e. fact[0] = 1, fact[3] = 6 and so on.
def ways(m,n):
    res = fact[n]/fact[n-m+1]*((n-m)**m)
    return res
What you are looking for is the number of surjective functions whose domain is a set of K elements (the K positions that we are filling out in the output sequence) and the image is a set of S elements (your input set). I think this should work:
static int Count(int K, int S)
{
    int sum = 0;
    for (int i = 1; i <= S; i++)
    {
        sum += Pow(-1, (S-i)) * Fact(S) / (Fact(i) * Fact(S - i)) * Pow(i, K);
    }
    return sum;
}
...where Pow and Fact are what you would expect.
Check out this math.se question.
Here's why your approach won't work. I didn't check the code, just your explanation of the logic behind it, but I'm pretty sure I understand what you're trying to do. Let's take for example K = 4, S = {7,8,9}. Let's examine the sequence 7,8,9,7. It is a unique sequence, but you can get to it by:
Randomly choosing positions 1,2,3, filling them randomly with 7,8,9 (your step 1), then randomly choosing 7 for the remaining position 4 (your step 2).
Randomly choosing positions 2,3,4, filling them randomly with 8,9,7 (your step 1), then randomly choosing 7 for the remaining position 1 (your step 2).
By your logic, you will count it both ways, even though it should be counted only once as the end result is the same. And so on...
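As a sanity check on the surjection count, here is a small Python sketch that evaluates the same inclusion-exclusion sum and compares it against brute-force enumeration. The function names are made up for this illustration, and math.comb assumes Python 3.8 or later.

```python
from itertools import product
from math import comb

def count_surjections(k, s):
    """Sequences of length k over s symbols in which every symbol appears."""
    return sum((-1) ** (s - i) * comb(s, i) * i ** k for i in range(1, s + 1))

def brute_force(k, s):
    """Same count by enumerating all s**k sequences (small inputs only)."""
    return sum(1 for seq in product(range(s), repeat=k) if len(set(seq)) == s)

print(count_surjections(4, 3), brute_force(4, 3))  # 36 36
```

For K = 4 and S = 3 the sum is 81 - 48 + 3 = 36, and the sequence 7,8,9,7 from the example is counted exactly once.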

Disperse Duplicates in an Array

Source : Google Interview Question
Write a routine to ensure that identical elements in the input are maximally spread in the output.
Basically, we need to place identical elements in such a way that the TOTAL spreading is as large as possible.
Example:
Input: {1,1,2,3,2,3}
Possible Output: {1,2,3,1,2,3}
Total dispersion = difference between the positions of the 1's + 2's + 3's = (4-1) + (5-2) + (6-3) = 9.
I am not at all sure whether there's an optimal polynomial-time algorithm for this. Also, no other detail is provided for the question.
My idea is to calculate the frequency of each element in the input, then arrange them in the output, one distinct element at a time, until all the frequencies are exhausted.
I am not sure of my approach.
Any approaches or ideas?
I believe this simple algorithm would work:
count the number of occurrences of each distinct element.
make a new list
add one instance of all elements that occur more than once to the list (order within each group does not matter)
add one instance of all unique elements to the list
add one instance of all elements that occur more than once to the list
add one instance of all elements that occur more than twice to the list
add one instance of all elements that occur more than thrice to the list
...
Now, this will intuitively not give a good spread:
for {1, 1, 1, 1, 2, 3, 4} ==> {1, 2, 3, 4, 1, 1, 1}
for {1, 1, 1, 2, 2, 2, 3, 4} ==> {1, 2, 3, 4, 1, 2, 1, 2}
However, I think this is the best spread you can get given the scoring function provided.
Since the dispersion score counts the sum of the distances instead of the squared sum of the distances, you can have several duplicates close together, as long as you have a large gap somewhere else to compensate.
For a sum-of-squared-distances score, the problem becomes harder.
Perhaps the interview question hinged on the candidate recognizing this weakness in the scoring function?
In perl
@a=(9,9,9,2,2,2,1,1,1);
then make a hash table of the counts of different numbers in the list, like a frequency table
map { $x{$_}++ } @a;
then repeatedly walk through all the keys found, with the keys in a known order and add the appropriate number of individual numbers to an output list until all the keys are exhausted
@r=();
$g=1;
while( $g == 1 ) {
    $g=0;
    for my $n (sort keys %x)
    {
        # emit one copy of every value that still has copies left
        if ($x{$n}>0) {
            push @r, $n;
            $x{$n}--;
            $g=1
        }
    }
}
I'm sure that this could be adapted to any programming language that supports hash tables
Python code for the algorithm suggested by Vorsprung and HugoRune:
from collections import Counter, defaultdict

def max_spread(data):
    cnt = Counter(data)
    res, num = [], list(cnt)
    # round-robin: emit one copy of each remaining value per pass
    while len(cnt) > 0:
        for i in num:
            if cnt[i] > 0:
                res.append(i)
                cnt[i] -= 1
                if cnt[i] == 0: del cnt[i]
    return res

def calc_spread(data):
    # sum over values of (last position - first position)
    d = defaultdict(list)
    for i, v in enumerate(data):
        d[v].append(i)
    return sum(max(x) - min(x) for x in d.values())
HugoRune's answer takes some advantage of the unusual scoring function but we can actually do even better: suppose there are d distinct non-unique values, then the only thing that is required for a solution to be optimal is that the first d values in the output must consist of these in any order, and likewise the last d values in the output must consist of these values in any (i.e. possibly a different) order. (This implies that all unique numbers appear between the first and last instance of every non-unique number.)
The relative order of the first copies of non-unique numbers doesn't matter, and likewise nor does the relative order of their last copies. Suppose the values 1 and 2 both appear multiple times in the input, and that we have built a candidate solution obeying the condition I gave in the first paragraph that has the first copy of 1 at position i and the first copy of 2 at position j > i. Now suppose we swap these two elements. Element 1 has been pushed j - i positions to the right, so its score contribution will drop by j - i. But element 2 has been pushed j - i positions to the left, so its score contribution will increase by j - i. These cancel out, leaving the total score unchanged.
Now, any permutation of elements can be achieved by swapping elements in the following way: swap the element in position 1 with the element that should be at position 1, then do the same for position 2, and so on. After the ith step, the first i elements of the permutation are correct. We know that every swap leaves the scoring function unchanged, and a permutation is just a sequence of swaps, so every permutation also leaves the scoring function unchanged! This is true for the d elements at both ends of the output array.
When 3 or more copies of a number exist, only the position of the first and last copy contribute to the distance for that number. It doesn't matter where the middle ones go. I'll call the elements between the 2 blocks of d elements at either end the "central" elements. They consist of the unique elements, as well as some number of copies of all those non-unique elements that appear at least 3 times. As before, it's easy to see that any permutation of these "central" elements corresponds to a sequence of swaps, and that any such swap will leave the overall score unchanged (in fact it's even simpler than before, since swapping two central elements does not even change the score contribution of either of these elements).
This leads to a simple O(n log n) algorithm (or O(n) if you use bucket sort for the first step) to generate a solution array Y from a length-n input array X:
Sort the input array X.
Use a single pass through X to count the number of distinct non-unique elements. Call this d.
Set i, j and k to 0.
While i < n:
    If i+1 < n and X[i+1] == X[i], we have a non-unique element:
        Set Y[j] = Y[n-j-1] = X[i].
        Increment i twice, and increment j once.
        While i < n and X[i] == X[i-1]:
            Set Y[d+k] = X[i].
            Increment i and k.
    Otherwise we have a unique element:
        Set Y[d+k] = X[i].
        Increment i and k.
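The construction above can be sketched in Python as follows. This is a simplified variant rather than a line-by-line transcription: it counts occurrences with a Counter instead of walking the sorted array, the names disperse and total_dispersion are made up for this illustration, and it assumes the values can be sorted.

```python
from collections import Counter

def disperse(xs):
    """Place one copy of every repeated value at each end, the rest in the middle."""
    counts = Counter(xs)
    repeated = sorted(v for v, c in counts.items() if c > 1)
    central = []
    for v in sorted(counts):
        c = counts[v]
        # unique values go to the middle, as do third and later copies
        central.extend([v] * (c - 2 if c > 1 else 1))
    return repeated + central + repeated[::-1]

def total_dispersion(ys):
    """Sum over values of (last position - first position)."""
    first, last = {}, {}
    for i, v in enumerate(ys):
        first.setdefault(v, i)
        last[v] = i
    return sum(last[v] - first[v] for v in first)

ys = disperse([1, 1, 2, 3, 2, 3])
print(ys, total_dispersion(ys))  # [1, 2, 3, 3, 2, 1] 9
```

The result differs from the {1,2,3,1,2,3} given in the question but reaches the same total of 9, which fits the observation that the order within each end block does not affect the score.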

number to unique permutation mapping of a sequence containing duplicates

I am looking for an algorithm that can map a number to a unique permutation of a sequence. I have found out about Lehmer codes and the factorial number system thanks to a similar question, Fast permutation -> number -> permutation mapping algorithms, but that question doesn't deal with the case where there are duplicate elements in the sequence.
For example, take the sequence 'AAABBC'. There are 6! = 720 ways that could be arranged, but I believe there are only 6! / (3! * 2! * 1!) = 60 unique permutation of this sequence. How can I map a number to a permutation in these cases?
Edit: changed the term 'set' to 'sequence'.
From Permutation to Number:
Let K be the number of character classes (example: AAABBC has three character classes)
Let N[K] be the number of elements in each character class (example: for AAABBC, we have N[K]=[3,2,1]), and let N = sum(N[K]).
Every legal permutation of the sequence then uniquely corresponds to a path in an incomplete K-way tree.
The unique number of the permutation then corresponds to the index of the tree-node in a post-order traversal of the K-ary tree terminal nodes.
Luckily, we don't actually have to perform the tree traversal -- we just need to know how many terminal nodes in the tree are lexicographically less than our node. This is very easy to compute, as at any node in the tree, the number terminal nodes below the current node is equal to the number of permutations using the unused elements in the sequence, which has a closed form solution that is a simple multiplication of factorials.
So given our 6 original letters, if the first element of our permutation is a 'B', we determine that there are 5!/(2!2!1!) = 30 permutations that start with 'A', so our permutation number has to be at least 30. Had our first letter been a 'C', we could have calculated it as 5!/(2!2!1!) (first letter A) + 5!/(3!1!1!) (first letter B) = 30 + 20, or alternatively as
60 (total) - 5!/(3!2!0!) (first letter C) = 50
Using this, we can take a permutation (e.g. 'BAABCA') and perform the following computations:
Permutation # = 5!/(2!2!1!) ('B') + 0 ('A') + 0 ('A') + 2!/(1!1!) ('B') + 1!/1! ('C') + 0 ('A')
= 30 + 2 + 1 = 33
Checking that this works: CBBAAA corresponds to
(5!/(2!2!1!) + 5!/(3!1!1!)) ('C') + 4!/(2!2!0!) ('B') + 3!/(2!1!0!) ('B') + 0 ('A') + 0 ('A') + 0 ('A') = (30 + 20) + 6 + 3 = 59
Likewise, AAABBC =
0 ('A') + 0 ('A') + 0 ('A') + 0 ('B') + 0 ('B') + 0 ('C') = 0
Sample implementation:
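import math
import copy
from functools import reduce
from operator import mul

def computePermutationNumber(inPerm, inCharClasses):
    # inPerm: list of character-class indices (e.g. A=0, B=1, C=2)
    # inCharClasses: number of elements in each class, e.g. [3, 2, 1]
    permutation=copy.copy(inPerm)
    charClasses=copy.copy(inCharClasses)
    n=len(permutation)
    permNumber=0
    for i,x in enumerate(permutation):
        # count the permutations that put a smaller class at position i
        for j in range(x):
            if( charClasses[j]>0):
                charClasses[j]-=1
                permNumber+=multiFactorial(n-i-1, charClasses)
                charClasses[j]+=1
        if charClasses[x]>0:
            charClasses[x]-=1
    return permNumber

def multiFactorial(n, charClasses):
    # multinomial coefficient n! / (c_0! * c_1! * ...), using integer division
    val= math.factorial(n)// reduce(mul, (map(lambda x: math.factorial(x), charClasses)))
    return val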
From Number to Permutation:
This process can be done in reverse, though I'm not sure how efficiently:
Given a permutation number, and the alphabet that it was generated from, recursively subtract the largest number of nodes less than or equal to the remaining permutation number.
E.g. Given a permutation number of 59, we first can subtract 30 + 20 = 50 ('C') leaving 9. Then we can subtract 'B' (6) and a second 'B'(3), re-generating our original permutation.
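The reverse direction can be sketched in Python along these lines. The names arrangements and unrank are made up for this illustration; the sketch assumes zero-based ranks and that symbols lists one character per class in lexicographic order.

```python
from math import factorial

def arrangements(counts):
    """Number of distinct sequences using counts[p] copies of symbol p."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total

def unrank(rank, counts, symbols):
    """Return the permutation with the given zero-based lexicographic rank."""
    counts = list(counts)
    out = []
    for _ in range(sum(counts)):
        for p, c in enumerate(counts):
            if c == 0:
                continue
            counts[p] -= 1                  # tentatively place symbol p here
            block = arrangements(counts)    # sequences that start with it
            if rank < block:
                out.append(symbols[p])
                break
            rank -= block                   # skip that whole block
            counts[p] += 1
        else:
            raise ValueError("rank out of range")
    return "".join(out)

print(unrank(59, [3, 2, 1], "ABC"))  # CBBAAA
```

Each position tries at most K symbols, so the work is roughly N * K multinomial evaluations for a sequence of length N with K character classes.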
Here is an algorithm in Java that enumerates the possible sequences by mapping an integer to the sequence.
public class Main {

    private int[] counts = { 3, 2, 1 }; // 3 Symbols A, 2 Symbols B, 1 Symbol C
    private int n = sum(counts);

    public static void main(String[] args) {
        new Main().enumerate();
    }

    private void enumerate() {
        int s = size(counts);
        for (int i = 0; i < s; ++i) {
            String p = perm(i);
            System.out.printf("%4d -> %s\n", i, p);
        }
    }

    // calculates the total number of symbols still to be placed
    private int sum(int[] counts) {
        int n = 0;
        for (int i = 0; i < counts.length; i++) {
            n += counts[i];
        }
        return n;
    }

    // calculates the number of different sequences with the symbol configuration in counts
    private int size(int[] counts) {
        int res = 1;
        int num = 0;
        for (int pos = 0; pos < counts.length; pos++) {
            for (int den = 1; den <= counts[pos]; den++) {
                res *= ++num;
                res /= den;
            }
        }
        return res;
    }

    // maps the sequence number to a sequence
    private String perm(int num) {
        int[] counts = this.counts.clone();
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; ++i) {
            int p = 0;
            for (;;) {
                while (counts[p] == 0) {
                    p++;
                }
                counts[p]--;
                int c = size(counts);
                if (c > num) {
                    sb.append((char) ('A' + p));
                    break;
                }
                counts[p]++;
                num -= c;
                p++;
            }
        }
        return sb.toString();
    }
}
The mapping used by the algorithm is as follows. I use the example given in the question (3 x A, 2 x B, 1 x C) to illustrate it.
There are 60 (=6!/3!/2!/1!) possible sequences in total, 30 (=5!/2!/2!/1!) of them have an A at the first place, 20 (=5!/3!/1!/1!) have a B at the first place, and 10 (=5!/3!/2!/0!) have a C at the first place.
The numbers 0..29 are mapped to all sequences starting with an A, 30..49 are mapped to the sequences starting with B, and 50..59 are mapped to the sequences starting with C.
The same process is repeated for the next place in the sequence, for example if we take the sequences starting with B we have now to map numbers 0 (=30-30) .. 19 (=49-30) to the sequences with configuration (3 x A, 1 x B, 1 x C)
A very simple algorithm for mapping a permutation consisting of n digits to a number is
number <- digit[0]*10^(n-1) + digit[1]*10^(n-2) + ... + digit[n-1]*10^0
You can find plenty of resources for algorithms to generate permutations. I guess you want to use this algorithm in bioinformatics. For example you can use itertools.permutations from Python.
Assuming the resulting number fits inside a word (e.g. 32 or 64 bit integer) relatively easily, then much of the linked article still applies. Encoding and decoding from a variable base remains the same. What changes is how the base varies.
If you're creating a permutation of a sequence, you pick an item out of your bucket of symbols (from the original sequence) and put it at the start. Then you pick out another item from your bucket of symbols and put it on the end of that. You'll keep picking and placing symbols at the end until you've run out of symbols in your bucket.
What's significant is which item you picked out of the bucket of the remaining symbols each time. The number of remaining symbols is something you don't have to record because you can compute that as you build the permutation -- that's a result of your choices, not the choices themselves.
The strategy here is to record what you chose, and then present an array of what's left to be chosen. Then choose, record which index you chose (packing it via the variable base method), and repeat until there's nothing left to choose. (Just as above when you were building a permuted sequence.)
In the case of duplicate symbols it doesn't matter which one you picked, so you can treat them as the same symbol. The difference is that when you pick a symbol which still has a duplicate left, you didn't reduce the number of symbols in the bucket to pick from next time.
Let's adopt a notation that makes this clear:
Instead of listing duplicate symbols left in our bucket to choose from like c a b c a a we'll list them along with how many are still in the bucket: c-2 a-3 b-1.
Note that if you pick c from the list, the bucket has c-1 a-3 b-1 left in it. That means next time we pick something, we have three choices.
But on the other hand, if I picked b from the list, the bucket has c-2 a-3 left in it. That means next time we pick something, we only have two choices.
When reconstructing the permuted sequence we just maintain the bucket the same way as when we were computing the permutation number.
The implementation details aren't trivial, but they're straightforward with standard algorithms. The only thing that might heckle you is what to do when a symbol in your bucket is no longer available.
Suppose your bucket was represented by a list of pairs (like above): c-1 a-3 b-1 and you choose c. Your resulting bucket is c-0 a-3 b-1. But c-0 is no longer a choice, so your list should only have two entries, not three. You could move the entire list down by 1 resulting in a-3 b-1, but if your list is long this is expensive. A fast and easy solution: move the last element of the bucket into the removed location and decrease your bucket size: c-0 a-3 b-1 becomes b-1 a-3 <empty> or just b-1 a-3.
Note that we can do the above because it doesn't matter what order the symbols in the bucket are listed in, as long as it's the same way when we encode or decode the number.
As I was unsure of the code in gbronner's answer (or of my understanding), I recoded it in R as follows
ritpermz=function(n, parclass){
  return(factorial(n) / prod(factorial(parclass)))
}
rankum <- function(confg, parclass){
  n=length(confg)
  permdex=1
  for (i in 1:(n-1)){
    x=confg[i]
    if (x > 1){
      for (j in 1:(x-1)){
        if(parclass[j] > 0){
          parclass[j]=parclass[j]-1
          permdex=permdex + ritpermz(n-i, parclass)
          parclass[j]=parclass[j]+1
        }
      }
    }
    parclass[x]=parclass[x]-1
  }
  return(permdex)
}
which does produce a ranking with the right range of integers

Interview puzzle: Jump Game

Jump Game:
Given an array, start from the first element and reach the last by jumping. The jump length can be at most the value at the current position in the array. The optimum result is when you reach the goal in minimum number of jumps.
What is an algorithm for finding the optimum result?
An example: given array A = {2,3,1,1,4} the possible ways to reach the end (index list) are
0,2,3,4 (jump 2 to index 2, then jump 1 to index 3 then 1 to index 4)
0,1,4 (jump 1 to index 1, then jump 3 to index 4)
Since second solution has only 2 jumps it is the optimum result.
Overview
Given your array a and the index of your current position i, repeat the following until you reach the last element.
Consider all candidate "jump-to elements" in a[i+1] to a[a[i] + i]. For each such element at index e, calculate v = a[e] + e. If one of the elements is the last element, jump to the last element. Otherwise, jump to the element with the maximal v.
More simply put, of the elements within reach, look for the one that will get you furthest on the next jump. We know this selection, x, is the right one because compared to every other element y you can jump to, the elements reachable from y are a subset of the elements reachable from x (except for elements from a backward jump, which are obviously bad choices).
This algorithm runs in O(n) because each element need be considered only once (elements that would be considered a second time can be skipped).
Example
Consider the array of values a, indices i, and sums of index and value v.
i ->   0   1   2   3   4   5   6   7   8   9  10  11  12
a -> [ 4, 11,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1]
v ->   4  12   3   4   5   6   7   8   9  10  11  12  13
Start at index 0 and consider the next 4 elements. Find the one with maximal v. That element is at index 1, so jump to 1. Now consider the next 11 elements. The goal is within reach, so jump to the goal.
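A minimal Python sketch of this greedy selection is shown below. The name greedy_jump_path is made up for this illustration, and the check for an unreachable goal is an addition of the sketch rather than part of the description above. For simplicity it rescans some elements that an optimized version would skip.

```python
def greedy_jump_path(a):
    """Return the list of indices visited by the greedy strategy."""
    path = [0]
    i, last = 0, len(a) - 1
    while i < last:
        reach = i + a[i]
        if reach >= last:                 # goal within reach: jump to it
            path.append(last)
            break
        candidates = range(i + 1, reach + 1)
        # pick the candidate e that maximizes v = e + a[e]
        best = max(candidates, key=lambda e: e + a[e], default=None)
        if best is None or best + a[best] <= reach:
            raise ValueError("the last element cannot be reached")
        path.append(best)
        i = best
    return path

print(greedy_jump_path([4, 11, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]))  # [0, 1, 12]
```

On the array from the question, {2,3,1,1,4}, it returns [0, 1, 4], the two-jump solution.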
Demo
See here or here with code.
Dynamic programming.
Imagine you have an array B where B[i] shows the minimum number of steps needed to reach index i in your array A. Your answer of course is in B[n], given A has n elements and indices start from 1. Assume C[i]=j means that you jumped from index j to index i (this is to recover the path taken later).
So, the algorithm is the following:
set B[i] to infinity for all i
B[1] = 0;                    <-- zero steps to reach B[1]
for i = 1 to n-1             <-- Each step updates possible jumps from A[i]
    for j = 1 to A[i]        <-- Possible jump sizes are 1, 2, ..., A[i]
        if i+j > n           <-- Array boundary check
            break
        if B[i+j] > B[i]+1   <-- If this path to B[i+j] is shorter than the previous one
            B[i+j] = B[i]+1  <-- Keep the shortest path value
            C[i+j] = i       <-- Keep the path itself
The number of jumps needed is B[n]. The path that needs to be taken, read backwards from the goal, is:
n -> C[n] -> C[C[n]] -> C[C[C[n]]] -> ... -> 1
Which can be restored by a simple loop.
The algorithm is of O(min(k,n)*n) time complexity and O(n) space complexity. n is the number of elements in A and k is the maximum value inside the array.
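The pseudocode translates fairly directly into Python. The sketch below uses zero-based indices instead of the one-based indices above, the name min_jumps_dp is made up for this illustration, and returning None for an unreachable goal is an assumption of the sketch.

```python
def min_jumps_dp(a):
    """Return a shortest path (list of indices) to the last element, or None."""
    n = len(a)
    INF = float("inf")
    best = [INF] * n      # best[i]: fewest jumps needed to reach index i
    prev = [None] * n     # prev[i]: index we jumped from to reach i
    best[0] = 0
    for i in range(n - 1):
        if best[i] == INF:
            continue      # index i itself is unreachable
        for j in range(1, a[i] + 1):
            if i + j >= n:
                break
            if best[i + j] > best[i] + 1:
                best[i + j] = best[i] + 1
                prev[i + j] = i
    if best[-1] == INF:
        return None
    # walk the predecessor links back from the goal, then reverse
    path = [n - 1]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    return path[::-1]

print(min_jumps_dp([2, 3, 1, 1, 4]))  # [0, 1, 4]
```

The number of jumps is the length of the returned path minus one.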
Note
I am keeping this answer, but cheeken's greedy algorithm is correct and more efficient.
Construct a directed graph from the array. eg: i->j if |i-j|<=x[i] (Basically, if you can move from i to j in one hop have i->j as an edge in the graph). Now, find the shortest path from first node to last.
FWIW, you can use Dijkstra's algorithm to find the shortest route. Complexity is O( | E | + | V | log | V | ). Since | E | < n^2, this becomes O(n^2).
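Because every edge in this graph has the same cost, a breadth-first search finds the same shortest path as Dijkstra's algorithm. Here is a minimal Python sketch that swaps in BFS; the name bfs_min_jumps is made up for this illustration, and the edges are generated on the fly from the rule |i-j| <= x[i] rather than stored.

```python
from collections import deque

def bfs_min_jumps(x):
    """Return the fewest jumps from index 0 to the last index, or None."""
    n = len(x)
    dist = [None] * n
    dist[0] = 0
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:
            return dist[i]
        # neighbours of i: every j with |i - j| <= x[i]
        for j in range(max(0, i - x[i]), min(n - 1, i + x[i]) + 1):
            if dist[j] is None:
                dist[j] = dist[i] + 1
                queue.append(j)
    return None

print(bfs_min_jumps([2, 3, 1, 1, 4]))  # 2
```

Enumerating neighbours this way is still O(n^2) in the worst case, the same bound as given above.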
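We can track the farthest index reachable so far (far). While scanning from left to right, whenever the current index is within reach and i + nums[i] is larger than far, we update far. Note that this only checks whether the last index can be reached; it does not count the jumps.
Simple O(n) time complexity solution: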
public boolean canJump(int[] nums) {
    int far = 0;
    for(int i = 0; i<nums.length; i++){
        if(i <= far){
            far = Math.max(far, i+nums[i]);
        }
        else{
            return false;
        }
    }
    return true;
}
Start from the left end and traverse until a number is the same as its index; use the maximum of such numbers. For example, if the list is
list: 2738|4|6927
index: 0123|4|5678
once you've got this, repeat the above step from this number until you reach the extreme right.
273846927
000001234
In case you don't find anything matching the index, use the digit with the farthest index and a value greater than the index, in this case 7. (Because pretty soon the index will be greater than the number, you can probably just count for 9 indices.)
basic idea:
start building the path from the end to the start by finding all array elements from which it is possible to make the last jump to the target element (all i such that A[i] >= target - i).
treat each such i as the new target and find a path to it (recursively).
choose the minimal length path found, append the target, return.
simple example in python:
ls1 = [2,3,1,1,4]
ls2 = [4,11,1,1,1,1,1,1,1,1,1,1,1]

# finds the shortest path in ls to the target index tgti
def find_path(ls,tgti):
    # if the target is the first element in the array, return its index.
    if tgti<= 0:
        return [0]
    # for each 0 <= i < tgti, if it is possible to reach
    # tgti from i (ls[i] >= tgti-i) then find the path to i
    sub_paths = [find_path(ls,i) for i in range(tgti-1,-1,-1) if ls[i] >= tgti-i]
    # find the minimum length path in sub_paths
    min_res = sub_paths[0]
    for p in sub_paths:
        if len(p) < len(min_res):
            min_res = p
    # add current target to the chosen path
    min_res.append(tgti)
    return min_res

print(find_path(ls1,len(ls1)-1))
print(find_path(ls2,len(ls2)-1))
>>>[0, 1, 4]
>>>[0, 1, 12]
