I am trying to solve this problem. The problem can be summarized as:
Given a sequence of integers, find the number of safe partitions, where safe partitions are defined as:
A safe partition is a partition into subsequences S1,S2,…,SK such that for each valid i, min(Si)≤|Si|≤max(Si). That is, for each subsequence in this partition, its length is greater than or equal to its smallest element and smaller than or equal to its largest element.
Ex:
Input => 1 6 2 3 4 3 4
Output => 6 partitions
[1],[6,2,3,4,3,4]
[1,6,2],[3,4,3,4]
[1,6,2,3],[4,3,4]
[1],[6,2],[3,4,3,4]
[1],[6,2,3],[4,3,4]
[1,6],[2,3],[4,3,4]
I can probably find a solution with code somewhere on the internet, but I am more interested in the approach to this problem, so I am asking here: what am I missing in my observations?
These are the things that come to mind when I read this problem:
If an element at index i extends a sequence safely, it is quite possible that it could also be the start of a new sequence. So at every element I am left with two choices: whether it extends the sequence or not.
So I think it can be represented mathematically as:
P(0..N)=1+P(i..N)+P(i+1..N), if A[i] is safe to extend the current partition
P(0..N)=1+P(i..N), if A[i] can't be used to extend
where P is the partition function.
Is this reasoning valid? Am I missing something?
[I'm having trouble giving a direction without actually giving the solution, because once a person thinks in the right direction then the solution becomes evident. I'll try to highlight some facts which may put a person on the right track.]
Explicitly enumerating safe partitions is problematic, since the number of safe partitions can be exponential in n (O(2^n)). For example in:
1,N,1,N,1,N ... [N elements]
In this sequence, every contiguous subsequence of length > 1 satisfies the criteria, and so does the single-element subsequence [1] (but not [N]). The number of safe partitions for such a sequence of length n = 2k is 3^(k-1). To prove that, look at the following:
Base (k = 1): f(1) = f(2) = 1
Induction hypothesis: f(2j-1) = f(2j) = 3^(j-1) for every j <= k.
Step:
f(2k+1) = f(2k+2) = (f(2k) + f(2k-1)) + (f(2k-2) + f(2k-3)) + ... + (f(2) + f(1)) + 1
        = 2 * (f(2k) + f(2k-2) + ... + f(2)) + 1
        = 2 * (3^(k-1) + 3^(k-2) + ... + 1) + 1
        = 2 * (3^k - 1) / 2 + 1
        = 3^k
Since enumeration is out of the question for any reasonable performance, the solution must somehow count without enumerating. Since the proof that 1,N,...,1,N has 3^(k-1) safe partitions did not have to explicitly enumerate them, its principles can be generalized to any sequence.
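To make "counting without enumerating" concrete, here is a minimal sketch that counts safe partitions with a prefix table instead of listing them. It is a straightforward quadratic approach, not necessarily the intended solution or fast enough for the original problem's limits (which are not given here). The class and method names are made up for illustration.

```java
// Illustrative sketch; class and method names are hypothetical.
class SafePartitionCount {
    // ways[i] = number of safe partitions of the first i elements.
    static long countSafePartitions(int[] a) {
        int n = a.length;
        long[] ways = new long[n + 1];
        ways[0] = 1; // the empty prefix has exactly one (empty) partition
        for (int end = 1; end <= n; end++) {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            // Grow the last block a[start..end-1] leftwards, tracking its min and max.
            for (int start = end - 1; start >= 0; start--) {
                min = Math.min(min, a[start]);
                max = Math.max(max, a[start]);
                int len = end - start;
                if (min <= len && len <= max) {
                    ways[end] += ways[start]; // every safe partition of the prefix extends
                }
            }
        }
        return ways[n];
    }

    public static void main(String[] args) {
        System.out.println(countSafePartitions(new int[] {1, 6, 2, 3, 4, 3, 4}));    // 6
        System.out.println(countSafePartitions(new int[] {1, 8, 1, 8, 1, 8, 1, 8})); // 27
    }
}
```

The second call uses the 1,N,1,N,... sequence with N = 8 (so k = 4), and the result agrees with 3^(k-1) = 27.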
NOTES:
I have solved similar problems before, so the direction was clear to me. For this question I tried to break my thoughts into something manageable and came up with the thought about complexity. I had a very strong feeling that this is exponential even before writing it down and trying to prove it. This comes from experience and from seeing other problems. The complexity function felt worse than Fibonacci because adding an element to a sequence seemed to add at least two elements of smaller sizes (similar to the Fibonacci sequence). Since Fibonacci grows exponentially, the 1,...,1 partitioning must be exponential too. From there I went on and analyzed it with a recurrence relation.
The exact way I reached the solution matches my way of thought. Everybody has a different way of thought that works for them, and they need to develop and find it.
This is how I came to suspect that the number of safe partitions in the example was 3^(k-1):
I recursively calculated f(2k), with base condition f(1)=f(2)=1. Then for 3:
[1,N,1]
[1],[N,1]
[1,N],[1]
And for 4:
[1,N,1,N]
[1],[N,1,N]
[1,N],[1,N]
Meaning f(3)=f(4)=3. Then I recursively applied
f(2k+2)=2*(f(2k) + f(2k-2) + .. + f(2)) + 1
resulting in f(2)=1, f(4)=3, f(6)=9, f(8)=27. This looks suspiciously like 3^(k-1). Then I simply had to prove it by induction.
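The recurrence for the even-length values can be applied mechanically, as in this small sketch. The class and method names are hypothetical, and a running sum stands in for f(2) + f(4) + ... + f(2j).

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are hypothetical.
class AlternatingCounts {
    // Returns {f(2), f(4), ..., f(2k)} using f(2j+2) = 2*(f(2j) + ... + f(2)) + 1.
    static long[] evenCounts(int k) {
        long[] f = new long[k];
        long sum = 0; // f(2) + f(4) + ... over the values computed so far
        for (int j = 0; j < k; j++) {
            f[j] = (j == 0) ? 1 : 2 * sum + 1; // base case f(2) = 1
            sum += f[j];
        }
        return f;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(evenCounts(4))); // [1, 3, 9, 27]
    }
}
```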
The Z algorithm is a string matching algorithm with O(n) complexity.
One use case is finding the longest occurrence of string A from string B. For example, the longest occurrence of "overdose" from "stackoverflow" would be "over". You could discover this by calling the Z algorithm with a combined string "overdose#stackoverflow" (where # is some character not present in either string). The Z algorithm would then try to match the combined string with itself and create an array z[] where z[i] gives you the length of the longest match with the start of the string, starting from index i. In our example:
index     0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
string    o  v  e  r  d  o  s  e  #  s  t  a  c  k  o  v  e  r  f  l  o  w
z      (21)  0  0  0  0  1  0  0  0  0  0  0  0  0  4  0  0  0  0  0  1  0
There are plenty of code implementations and mathematically oriented explanations of the algorithm. Here are some good ones:
http://www.geeksforgeeks.org/z-algorithm-linear-time-pattern-searching-algorithm/
http://codeforces.com/blog/entry/3107
I can see how it works, but I don't understand why. It seems almost like black magic. I have a very strong intuition that this task is supposed to take O(n^2), yet here is an algorithm that does it in O(n)
I don't find it completely intuitive either, so I think that I qualify for answering. Otherwise I'd just say that you don't understand because you're an idiot, and surely that's not the answer you're hoping for :-)
Case in point (citation from an explanation):
Correctness is inherent in the algorithm and is pretty intuitively clear.
So, let's try to be even more intuitive...
First, I'd guess that the common intuition for O(n^2) is this: for a string of length N, if you're dropped at a random place i in the string with no other information, you have to match x (< N) characters to compute Z[i]. If you're dropped N times, you have to do up to N(N-1) tests, so that's O(n^2).
The Z algorithm, however, makes good use of the information you've gained from past computations.
Let's see.
First, as long as you don't have a match (Z[i]=0), you progress along the string with one comparison per character, so that's O(N).
Second, when you find a range where there's a match (at index i), the trick is to use clever deductions using the previous Z[0...i-1] to compute all the Z values in that range in constant time, without other comparisons inside that range. The next matches will only be done on the right of the range.
That's how I understand it anyway, hope this helps.
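For comparison, the O(n^2) intuition described above corresponds to the following naive sketch, which restarts the comparison from scratch at every index. Names are hypothetical, and leaving z[0] as 0 is just one convention.

```java
// Illustrative sketch; class and method names are hypothetical.
class NaiveZ {
    // z[i] = length of the longest common prefix of s and s.substring(i), for i >= 1.
    static int[] zNaive(String s) {
        int n = s.length();
        int[] z = new int[n]; // z[0] is left as 0 here; conventions differ
        for (int i = 1; i < n; i++) {
            int len = 0;
            // Compare s[0..] with s[i..] character by character, reusing nothing.
            while (i + len < n && s.charAt(len) == s.charAt(i + len)) {
                len++;
            }
            z[i] = len;
        }
        return z;
    }

    public static void main(String[] args) {
        int[] z = zNaive("overdose#stackoverflow");
        System.out.println(z[14]); // 4, the match "over"
    }
}
```

On an input like "aaaa...a" every index rescans almost the whole remainder, which is where the quadratic cost comes from.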
I was looking for a deeper understanding for this algorithm hence I found this question.
I didn't understand the codeforces post initially, but later I found it is good enough for understanding, and I noticed that the post was not entirely accurate, and it omitted some steps in the thinking process, making it a bit confusing.
Let me try to correct the inaccuracy in that post, and clarify some of the steps I think may help people connect the dots to a line. In this process, I hope we can learn some intuition from the original author. In the explanation, I'll mix some quoted blocks from codeforces and my own notes so we can keep the original post close to our discussion.
The Z algorithm starts as:
As we iterate over the letters in the string (index i from 1 to n - 1), we maintain an interval [L, R] which is the interval with maximum R such that 1 ≤ L ≤ i ≤ R and S[L...R] is a prefix-substring (if no such interval exists, just let L = R = - 1). For i = 1, we can simply compute L and R by comparing S[0...] to S[1...]. Moreover, we also get Z[1] during this.
This is simple and straightforward.
Now suppose we have the correct interval [L, R] for i - 1 and all of the Z values up to i - 1. We will compute Z[i] and the new [L, R] by the following steps:
If i > R, then there does not exist a prefix-substring of S that starts before i and ends at or after i. If such a substring existed, [L, R] would have been the interval for that substring rather than its current value. Thus we "reset" and compute a new [L, R] by comparing S[0...] to S[i...] and get Z[i] at the same time (Z[i] = R - L + 1).
The bold part in the bullet point might be confusing, but if you read it twice, it's really just repeating the definition of R.
Otherwise, i ≤ R, so the current [L, R] extends at least to i. Let k = i - L. We know that Z[i] ≥ min(Z[k], R - i + 1) because S[i...] matches S[k...] for at least R - i + 1 characters (they are in the [L, R] interval which we know to be a prefix-substring). Now we have a few more cases to consider.
The bold part is not completely accurate, because R - i + 1 can be greater than Z[k], in which case Z[i] would be Z[k].
Let's focus on the key now: Z[i] ≥ min(Z[k], R - i + 1). Why is this true? Because of the following:
Based on the definition of interval [L, R] and i ≤ R, we already confirmed that S[0...R - L] == S[L...R], hence S[0...k] == S[L...i], and S[k...R - L] == S[i...R];
Say Z[k] = x. By the definition of Z, S[0...x - 1] == S[k...k + x - 1];
Combining the equations above, when x < R - i + 1 we get S[0...x - 1] == S[k...k + x - 1] == S[i...i + x - 1], and the mismatch that ends the match at k (S[x] != S[k + x]) is copied to position i as well (S[k + x] == S[i + x], because i + x ≤ R). So Z[i] = Z[k] when Z[k] < R - i + 1.
These are the missing dots I mentioned in the beginning, and they explain both the second and the third bullet points, and partially the last bullet point. This wasn't straightforward when I read the codeforces post. To me this is the most important part of this algorithm.
For the last bullet point, if Z[k] ≥ R - i + 1, we would refresh [L, R], using i as the new L, and extending R to a bigger R'.
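In the whole process, every successful character comparison pushes R to the right, and R never moves back, so there are at most n successful comparisons; each index i also causes at most one failed comparison. The time complexity is therefore O(n).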
As Ilya answered, the intuition in this algorithm is to carefully reuse every piece of information we gathered so far. I just explained it in another way. Hope it helps.
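The steps discussed above can be sketched as follows. This is one common way to write the Z function with an inclusive [L, R] window, not a transcription of the codeforces code. The names are hypothetical, and the helper assumes '#' occurs in neither input string.

```java
// Illustrative sketch; class and method names are hypothetical.
class ZAlgorithm {
    // z[i] = length of the longest common prefix of s and s.substring(i), for i >= 1.
    // z[0] is left as 0 here; conventions for it differ.
    static int[] zFunction(String s) {
        int n = s.length();
        int[] z = new int[n];
        int l = -1, r = -1; // inclusive window [l, r] with maximum r that matches a prefix
        for (int i = 1; i < n; i++) {
            if (i <= r) {
                // Reuse the value at the mirrored index k = i - l, capped at the window end.
                z[i] = Math.min(z[i - l], r - i + 1);
            }
            // Extend by direct comparison; this only ever looks at characters past r.
            while (i + z[i] < n && s.charAt(z[i]) == s.charAt(i + z[i])) {
                z[i]++;
            }
            if (z[i] > 0 && i + z[i] - 1 > r) {
                l = i;
                r = i + z[i] - 1;
            }
        }
        return z;
    }

    // Longest prefix of a that occurs somewhere in b (assumes '#' is in neither string).
    static String longestPrefixOccurrence(String a, String b) {
        int[] z = zFunction(a + "#" + b);
        int best = 0;
        for (int i = a.length() + 1; i < z.length; i++) {
            best = Math.max(best, z[i]);
        }
        return a.substring(0, best);
    }

    public static void main(String[] args) {
        System.out.println(longestPrefixOccurrence("overdose", "stackoverflow")); // over
    }
}
```

When Z[k] < R - i + 1 the while loop stops after a single failed comparison, which is the case where Z[i] = Z[k].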
I've been asked to build an algorithm that involves a permutation and I'm a little stumped and looking for a starting place. The details are this...
You are climbing a staircase of n stairs. Each time you can either climb 1 or 2 steps at a time. How many distinct ways can you climb to the top?
Any suggestions how I can tackle this challenge?
You're wrong about permutations. Permutations involve orderings of a set. This problem is something else.
Problems involving simple decisions that produce another instance of the same problem are often solvable by dynamic programming. This is such a problem.
When you have n steps to climb, you can choose a hop of either 1 or 2 steps, then solve the smaller problems for n-1 and n-2 steps respectively. In this case you want to add the numbers of possibilities. (Many DPs are to find minimums or maximums instead, so this is a bit unusual.)
The "base cases" are when you have either 0 or 1 step. There's exactly 1 way to traverse each of these.
With all that in mind, we can write this dynamic program for the number of ways to climb n steps as a recursive expression:
W(n) = W(n - 1) + W(n - 2)    if n > 1
W(n) = 1                      if n = 0 or n = 1
Now you don't want to implement this as a simple recursive function. It will take time exponential in n to compute because each call to W calls itself twice. Yet most of those calls are unnecessary repeats. What to do?
One way to get the job done is find a way to compute the W(i) values in sequence. For a 1-valued DP it's usually quite simple, and so it is here:
W(0) = 1
W(1) = 1 (from the base case)
W(2) = W(1) + W(0) = 1 + 1 = 2
W(3) = W(2) + W(1) = 2 + 1 = 3
You get the idea. This is a very simple DP indeed. To compute W(i), we need only two previous values, W(i-1) and W(i-2). A simple O(n) loop will do the trick.
As a sanity check, look at W(3)=3. Indeed, to go up 3 steps, we can take hops of 1 then 2, 2 then 1, or three hops of 1. That's 3 ways!
Can't resist one more? W(4)=2+3=5. The hop sequences are (2,2), (2,1,1), (1,2,1), (1,1,2), and (1,1,1,1): 5 ways.
In fact, the chart above will look familiar to many. The number of ways to climb n steps is the (n+1)th Fibonacci number. You should code the loop yourself. If stuck, you can look up any of the hundreds of posted examples.
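For readers who do get stuck, a minimal sketch of that loop might look like this. The class and method names are hypothetical. It keeps only the two previous values, and with long arithmetic it overflows for n above roughly 90.

```java
// Illustrative sketch; class and method names are hypothetical.
class Staircase {
    // Number of ways to climb n steps with hops of 1 or 2, i.e. W(n) from the recurrence.
    static long countWays(int n) {
        long prev = 1; // W(0)
        long curr = 1; // W(1)
        for (int i = 2; i <= n; i++) {
            long next = prev + curr; // W(i) = W(i-1) + W(i-2)
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        System.out.println(countWays(3)); // 3
        System.out.println(countWays(4)); // 5
    }
}
```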
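// The wrapper class name is arbitrary.
public class StairPaths {
    // Prints every sequence of 1- and 2-step hops that adds up to n, one per line.
    private static void printPath(int n, String path) {
        if (n == 0) {
            // Reached the top: drop the leading comma (if any) and print the path.
            System.out.println(path.isEmpty() ? "" : path.substring(1));
            return;
        }
        if (n - 1 >= 0) {
            printPath(n - 1, path + ",1");
        }
        if (n - 2 >= 0) {
            printPath(n - 2, path + ",2");
        }
    }

    public static void main(String[] args) {
        printPath(4, "");
    }
}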
Since it looks like a programming assignment, I will give you the steps to get you started instead of giving the actual code:
You can write a recursive function which keeps track of the step you are on.
If you have reached step n then you found one permutation or if you are past n then you should discard it.
At every step you can call the function by incrementing the steps by 1 and 2
The function will return the number of permutations possible once finished
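Those steps translate almost line for line into the sketch below. The names are hypothetical. Note that this plain recursion takes exponential time, so it is only practical for small n.

```java
// Illustrative sketch; class and method names are hypothetical.
class StairRecursion {
    // Counts the ways to get from the current step to exactly step n.
    static int countFrom(int step, int n) {
        if (step == n) {
            return 1; // reached the top: one complete way found
        }
        if (step > n) {
            return 0; // overshot the top: discard
        }
        // Try a hop of 1 and a hop of 2 from here.
        return countFrom(step + 1, n) + countFrom(step + 2, n);
    }

    public static void main(String[] args) {
        System.out.println(countFrom(0, 4)); // 5
    }
}
```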
There are N sticks placed in a straight line. Bob is planning to take a few of these sticks. But whatever number of sticks he takes, he will not take two successive sticks (i.e. if he takes stick i, he will not take sticks i-1 and i+1).
So given N, we need to calculate how many different sets of sticks he could select. He needs to take at least one stick.
Example: Let N=3, then the answer is 4.
The 4 sets are: (1, 3), (1), (2), and (3)
The main problem is that I want a solution better than simple recursion. Can there be a formula for it? I am not able to crack it.
It's almost identical to Fibonacci. The final solution is actually a Fibonacci number minus 1 (F(N+2) - 1 if F(1) = F(2) = 1), but let's explain it in terms of actual sticks.
To begin with, we disregard the fact that he needs to pick up at least 1 stick. The solution in this case looks as follows:
If N = 0, there is 1 solution (the solution where he picks up 0 sticks)
If N = 1, there are 2 solutions (pick up the stick, or don't)
Otherwise he can choose to either
pick up the first stick and recurse on N-2 (since the second stick needs to be discarded), or
leave the first stick and recurse on N-1
After this computation is finished, we remove 1 from the result to avoid counting the case where he picks up 0 sticks in total.
Final solution in pseudo code:
int numSticks(int N) {
    return N == 0 ? 1
         : N == 1 ? 2
         : numSticks(N-2) + numSticks(N-1);
}

solution = numSticks(N) - 1;
As you can see, numSticks is actually the Fibonacci recurrence (with shifted base cases), which can be computed efficiently using, for instance, memoization.
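A memoized version of the pseudo code could look like the sketch below. The wrapper class, the memo array, and countSelections are illustrative additions, not part of the answer above.

```java
// Illustrative sketch; class name, memo array and countSelections are hypothetical.
class Sticks {
    // Number of ways to pick non-adjacent sticks out of n, including picking none.
    static long numSticks(int n, long[] memo) {
        if (n == 0) return 1;
        if (n == 1) return 2;
        if (memo[n] != 0) return memo[n]; // already computed (real values are never 0)
        memo[n] = numSticks(n - 2, memo) + numSticks(n - 1, memo);
        return memo[n];
    }

    // Subtract 1 to exclude the case where no stick is taken at all.
    static long countSelections(int n) {
        return numSticks(n, new long[n + 1]) - 1;
    }

    public static void main(String[] args) {
        System.out.println(countSelections(3)); // 4
    }
}
```

Each value is computed once, so the running time is linear in N instead of exponential.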
Let the number of sticks taken by Bob be r.
The problem has a bijection to the number of binary vectors with exactly r 1's, and no two adjacent 1's.
This is solvable by first placing the r 1's; you are then left with exactly n-r 0's to place between them and at the sides. However, you must place r-1 0's between the 1's, so you are left with exactly n-r-(r-1) = n-2r+1 "free" 0's.
The number of ways to arrange such vectors is now given as:
(1) = Choose(n-2r+1 + (r+1) - 1, n-2r+1) = Choose(n-r+1, n-2r+1)
Formula (1) derives from the number of ways of choosing n-2r+1 elements from r+1 distinct possibilities with replacement.
Since we solved it for a specific value of r, and you are interested in all r>=1, you need to sum for each 1<=r<=n
So, the solution of the problem is given by the closed formula:
(2) = Sum{ Choose(n-r+1, n-2r+1) | for each 1<=r<=n }
Disclaimer:
(A close variant of the problem with fixed r was given as HW in the course I am TAing this semester; the main difference is the need to sum over the various values of r.)
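As a quick check of formula (2), the sum can be evaluated directly, as in this sketch. The names are hypothetical, and Choose is treated as 0 whenever its second argument is negative, which happens once 2r > n+1.

```java
// Illustrative sketch; class and method names are hypothetical.
class StickBinomialSum {
    // Binomial coefficient, defined as 0 when k is out of range.
    static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;
        k = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i; // stays an exact integer at every step
        }
        return result;
    }

    // Formula (2): Sum of Choose(n-r+1, n-2r+1) for 1 <= r <= n.
    static long countSelections(int n) {
        long total = 0;
        for (int r = 1; r <= n; r++) {
            total += choose(n - r + 1, n - 2 * r + 1);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countSelections(3)); // 4
    }
}
```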
This is something that I routinely get wrong while solving problems: how do we decide the value of a recursive function when the argument is at the lowest extreme? An example will help:
Given n, find the number of ways to tile a 3xN grid using 2x1 blocks only. Rotation of blocks is allowed.
The DP solution is easily found as
f(n): the number of ways of tiling a 3xN grid
g(n): the number of ways of tiling a 3xN grid with a 1x1 block cut off at the rightmost column
f(n) = f(n-2) + 2*g(n-1)
g(n) = f(n-1) + g(n-2)
I initially thought that the base cases would be f(0)=0, g(0)=0, f(1)=0, g(1)=1. However, this yields a wrong answer. I then read somewhere that f(0)=1 and reasoned it out as
The number of ways of tiling a 3x0 grid is one because there is exactly one way to use no tiles (2x1 blocks) at all.
My question is: by that logic, shouldn't g(0) also be one? But in the correct solution, g(0)=0. In general, when can we say that the number of ways of using nothing is one?
About your specific question of tiling, think this way:
How many ways are there to "tile a 3*0 grid"?
I would say: Just one way, don't do anything! and you can't "do nothing" any other way. (f(0) = 1)
How many ways are there to "tile a 3*0 grid, cutting that specific block off"?
I would say: Hey! That's impossible! You can't cut the specific block off since there is nothing. So, there's no way one can solve the task anyhow. (g(0) = 0)
Now, let's get to the general case:
There's no "general" rule about zero cases.
Depending on your problem, you may be able to somehow "interpret" the situation, and find the reasonable value. Most of the times (depending on your definition of "ways") number of ways of doing "nothing" is 1, and number of ways of doing something impossible is 0!
Warning! Being able to somehow "interpret" the zero case is not enough for the relation to be correct! You should recheck your recursive relation (i.e. the way you get the n-th value from the previous ones) to be applicable for the zero-to-one case as well, since most of the time this would be a "tricky" case.
You may find it easier to base your recursive relation on some non-zero case, if you find the zero-case being tricky, or confusing.
The way I see it, g(0) is invalid, since there is no way to cut a 1x1 block out of a 3x0 grid.
Invalid values are typically represented as 0, -∞ or ∞, but this largely depends on the problem. The number of ways to place something would make sense to be 0 to represent invalid values (otherwise your counts will be off). When working with min, you'd generally use ∞. When working with max, you'd generally use -∞ (or possibly 0).
Generally, the number of ways to place 0 objects or objects in a 0-sized space makes sense to be 1 (i.e. placing no objects) (so f(0) = 1). In a lot of other cases valid values would be 0.
But these are far from rules (and avoid blindly following rules, because you'll get hurt with exceptions); the best advice I can give - when in doubt, throw a few values in and see what happens.
In this case you can easily determine what the actual values for g(1), f(1), g(2) and f(2) should be, and use these to calculate g(0) and f(0):
g(1) = 1
f(1) = 0
g(2): (all invalid, since ? is not populated)
|X |X ?X
|? || --
-- ?| --
g(2) = 0, thus g(0) = 0 - f(1) = 0 - 0 = 0
f(2):
|| -- --
|| -- ||
-- -- ||
f(2) = 3, thus f(0) = 3 - 2*g(1) = 3 - 2 = 1
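Following the advice to throw a few values in, here is a small sketch of the recurrences from the question with the base cases f(0)=1, g(0)=0, f(1)=0, g(1)=1. The class and method names are hypothetical.

```java
// Illustrative sketch; class and method names are hypothetical.
class Tiling3xN {
    // f[i]: tilings of a 3 x i grid; g[i]: tilings of a 3 x i grid with one corner cell cut off.
    static long countTilings(int n) {
        int size = Math.max(n + 1, 2);
        long[] f = new long[size];
        long[] g = new long[size];
        f[0] = 1; // one way to tile nothing: place no tiles
        g[0] = 0; // impossible: there is no cell to cut off
        f[1] = 0;
        g[1] = 1;
        for (int i = 2; i <= n; i++) {
            f[i] = f[i - 2] + 2 * g[i - 1];
            g[i] = f[i - 1] + g[i - 2];
        }
        return f[n];
    }

    public static void main(String[] args) {
        System.out.println(countTilings(2)); // 3
        System.out.println(countTilings(4)); // 11
    }
}
```

Changing g[0] to 1 makes g(2) come out as 1 instead of 0, which contradicts the hand count above.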
Say S = 5 and N = 3 the solutions would look like - <0,0,5> <0,1,4> <0,2,3> <0,3,2> <5,0,0> <2,3,0> <3,2,0> <1,2,2> etc etc.
In the general case, N nested loops can be used to solve the problem. Run N nested loops, and inside them check whether the loop variables add up to S.
If we do not know N ahead of time, we can use a recursive solution. In each level, run a loop from 0 to S, and then call the function itself again. When we reach a depth of N, see if the numbers obtained add up to S.
Any other dynamic programming solution?
Try this recursive function:
f(s, n) = 1                                          if s = 0
        = 0                                          if s != 0 and n = 0
        = sum of f(s - i, n - 1) over i in [0, s]    otherwise
To use dynamic programming you can cache the value of f after evaluating it, and check if the value already exists in the cache before evaluating it.
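A minimal sketch of that cached recursion follows. The class name and the table layout are illustrative; -1 marks entries that have not been computed yet.

```java
import java.util.Arrays;

// Illustrative sketch; class name and memo layout are hypothetical.
class SumWays {
    // f(s, n): number of ordered n-tuples of non-negative integers that add up to s.
    static long count(int s, int n, long[][] memo) {
        if (s == 0) return 1;
        if (n == 0) return 0;
        if (memo[s][n] >= 0) return memo[s][n]; // cached
        long total = 0;
        for (int i = 0; i <= s; i++) {
            total += count(s - i, n - 1, memo); // the first number is i
        }
        memo[s][n] = total;
        return total;
    }

    static long count(int s, int n) {
        long[][] memo = new long[s + 1][n + 1];
        for (long[] row : memo) {
            Arrays.fill(row, -1);
        }
        return count(s, n, memo);
    }

    public static void main(String[] args) {
        System.out.println(count(5, 3)); // 21
    }
}
```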
There is a closed-form formula: binomial(s + n - 1, s), or equivalently binomial(s + n - 1, n - 1).
Those numbers are the simplex numbers.
If you want to compute them, use the log gamma function or arbitrary precision arithmetic.
See https://math.stackexchange.com/questions/2455/geometric-proof-of-the-formula-for-simplex-numbers
I have my own formula for this. My friend Gio and I wrote an investigative report on it. The formula we got is 2^(n-1) - 1, where n is the number whose addends we are counting.
Let's try.
If n is 1, it has 0 ways: there are no two or more numbers that we can add to get a sum of 1 (excluding 0). Let's try a higher number.
Let's try 4. The sums for 4 are: 1+1+1+1, 1+2+1, 1+1+2, 2+1+1, 1+3, 2+2, 3+1. That's 7 in total.
Let's check with the formula: 2^(4-1) - 1 = 2^3 - 1 = 8 - 1 = 7.
Let's try 15: 2^(15-1) - 1 = 2^14 - 1 = 16384 - 1 = 16383. Therefore, there are 16383 ways to add numbers that will equal 15.
(Note: Addends are positive numbers only.)
(You can try other numbers, to check whether our formula is correct or not.)
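One way to try other numbers is a brute-force count, sketched below with hypothetical names. It counts ordered sums of positive integers and then drops the single-part sum "n" itself, which is what this answer's formula describes. Note that this is a different quantity from the fixed-length sums with zeros allowed that the question asks about.

```java
// Illustrative sketch; class and method names are hypothetical.
class Compositions {
    // Ordered sums of positive integers equal to n, with any number of parts (including one).
    static long countAll(int n) {
        if (n == 0) return 1; // nothing left to split: one complete sum
        long total = 0;
        for (int first = 1; first <= n; first++) {
            total += countAll(n - first); // choose the first addend, recurse on the rest
        }
        return total;
    }

    // Drop the single-part sum so that at least two addends are used.
    static long countWithAtLeastTwoParts(int n) {
        return countAll(n) - 1;
    }

    public static void main(String[] args) {
        System.out.println(countWithAtLeastTwoParts(4));  // 7
        System.out.println(countWithAtLeastTwoParts(15)); // 16383
    }
}
```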
This can be calculated in O(s+n) (or O(1) if you don't mind an approximation) in the following way:
Imagine we have a string with n-1 X's in it and s o's. So for your example of s=5, n=3, one example string would be
oXooXoo
Notice that the X's divide the o's into three distinct groupings: one of length 1, length 2, and length 2. This corresponds to your solution of <1,2,2>. Every possible string gives us a different solution, by counting the number of o's in a row (a 0 is possible: for example, XoooooX would correspond to <0,5,0>). So by counting the number of possible strings of this form, we get the answer to your question.
There are s+(n-1) positions to choose for s o's, so the answer is Choose(s+n-1, s).
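The counting argument above reduces to a single binomial coefficient, which the following sketch evaluates with exact integer arithmetic. The names are hypothetical, it assumes n >= 1, and long overflows for large inputs.

```java
// Illustrative sketch; class and method names are hypothetical.
class StarsAndBars {
    // Binomial coefficient, defined as 0 when k is out of range.
    static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;
        k = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i; // stays an exact integer at every step
        }
        return result;
    }

    // Number of strings with s o's and n-1 X's, i.e. Choose(s+n-1, s). Assumes n >= 1.
    static long countSolutions(int s, int n) {
        return choose(s + n - 1, s);
    }

    public static void main(String[] args) {
        System.out.println(countSolutions(5, 3)); // 21
    }
}
```

The loop runs min(s, n-1) times, which is within the O(s+n) bound mentioned above.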
There is a fixed formula for the answer. If you want the number of ways to get N as the sum of R elements, the answer is always:
(N+R-1)!/((R-1)!*(N)!)
or in other words:
(N+R-1) C (R-1)
This actually looks a lot like a Towers of Hanoi problem, without the constraint of stacking disks only on larger disks. You have S disks that can be in any combination on N towers. So that's what got me thinking about it.
What I suspect is that there is a formula we can deduce that doesn't require the recursive programming. I'll need a bit more time though.