How to prove correctness of this algorithm?

I am solving a problem from codeforces.
Our job is to find the minimum cost of turning a given integer sequence into a non-decreasing sequence. At each step we can increase or decrease any element of the sequence by 1, and each such step costs 1.
For example, given the sequence 3, 2, -1, 2, 11, we can make it non-decreasing with cost 4 (decrease 3 to 2 and increase -1 to 2, giving the non-decreasing sequence 2, 2, 2, 2, 11).
According to the editorial of this problem, we can solve it using dynamic programming over two sequences (the given sequence and a sorted copy of it).
The outline of solution:
Let a be the original sequence and b the sorted version of a, and let f(i,j) be the minimal number of moves required to make the first i elements non-decreasing with the i-th element at most bj. Then we get the following recurrence. (This is from the editorial of the problem.)
f(1,1)=|a1-b1|
f(1,j)=min{|a1-bj|,f(1,j-1)}, j>1
f(i,1)=|ai-b1|+f(i-1,1), i>1
f(i,j)=min{f(i,j-1),f(i-1,j)+|ai-bj|}, i>1, j>1
I understand this recurrence. However, I can't figure out why we should compare the original sequence with its own sorted version, and I am not sure whether a sequence other than the sorted one could also yield the correct minimum cost.
How can we prove the correctness of this solution? And how can we guarantee that the answer obtained with the sorted sequence is the minimum cost?

The point of the exercise is that this recurrence can be proven by induction. Once it is proven, we have proven that f(n,n) is the minimum cost of winding up with a solution whose nth value is at most bn.
To finish proving the result there is one more step: proving that any solution whose nth value exceeds bn can be improved without increasing that maximum value. But that's easy. Since bn is the largest value in the original sequence, any element that ends up above bn must have been increased, so just omit one of the +1s from the first value to exceed bn and you have a strictly cheaper solution that is still non-decreasing and has no larger maximum. Therefore no solution winding up with a maximum greater than bn can be better than the best one with a maximum at most bn.
Therefore we have the optimal solution.
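For reference, here is a minimal sketch of the editorial's recurrence in Python, computed row by row in O(n^2) (the function and variable names are my own):

```python
def min_cost_non_decreasing(a):
    """Minimum total cost to make `a` non-decreasing when changing any
    element by 1 costs 1, following the f(i,j) recurrence above."""
    n = len(a)
    b = sorted(a)
    f = [0] * n                # f[j] holds f(i, j) for the current row i
    for i in range(n):
        prev = f[:]            # row i-1
        for j in range(n):
            cost = abs(a[i] - b[j])
            if i == 0:
                f[j] = cost if j == 0 else min(cost, f[j - 1])
            else:
                f[j] = prev[j] + cost if j == 0 else min(f[j - 1], prev[j] + cost)
    return f[n - 1]

print(min_cost_non_decreasing([3, 2, -1, 2, 11]))  # 4, matching the example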

Related

How does Dynamic Programming help in the subset sum problem to reduce time complexity?

In the Subset Sum problem, if we don't use the dynamic programming approach, we get exponential time complexity. But if we draw the recursion tree, it seems that all 2^n branches are unique. If we use dynamic programming, how can we be sure that all the unique branches are explored? If there really exist 2^n possible solutions, how does dynamic programming reduce this to polynomial time while also ensuring all 2^n solutions are explored?
How does dynamic programming reduce it to polynomial time while also ensuring all 2^n solutions are explored?
It is pseudo-polynomial time, not polynomial time. It's a very important distinction. According to Wikipedia, "a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the numeric value of the input, but not necessarily in the length of the input, which is the case for polynomial time algorithms."
What does it matter?
Consider an example [1, 2, 3, 4], sum = 1 + 2 + 3 + 4 = 10.
There do in fact exist 2^4 = 16 subsequences; however, do we need to check them all? The answer is no, since we are only concerned with the sums of subsequences. To illustrate this, let's say we're iterating from the 1st element to the 4th element:
1st element:
We can choose to take or not take the 1st element, so the possible sums are [0, 1].
2nd element:
We can choose to take or not take the 2nd element. Same idea; the possible sums are now [0, 1, 2, 3].
3rd element:
We have [0, 1, 2, 3] so far. We now consider taking the third element. But wait... if we take the third element and add it to 0, we get 3, which is already present in the list. Do we need to store this piece of information? Apparently not. In fact, we only need to know whether each sum is achievable at any stage; if multiple subsequences sum to the same value, we record that value only once. This is the key to the reduction in complexity, if you consider it a reduction.
With that said, a truly polynomial-time algorithm for subset sum is not known, since the problem is NP-complete.
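A minimal sketch of this idea in Python (the set of reachable sums replaces the 2^n individual branches; the names are mine):

```python
def reachable_subset_sums(nums):
    """Pseudo-polynomial subset-sum DP: track the set of achievable sums
    instead of enumerating all 2^n subsequences."""
    sums = {0}  # the empty subsequence sums to 0
    for x in nums:
        # For each element, either skip it (keep `sums`) or take it
        # (shift every known sum by x); duplicate sums collapse for free.
        sums |= {s + x for s in sums}
    return sums

print(sorted(reachable_subset_sums([1, 2, 3, 4])))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] -- 11 distinct sums, not 16 branches
```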

Greedy algorithm to minimize the sum of a permutation

I have an array {a1, a2, ..., an} of natural numbers. I need to build a greedy algorithm that finds a permutation (i1, ..., in) of 1..n that minimizes the sum 1·a_{i1} + 2·a_{i2} + ... + (n−1)·a_{i(n−1)} + n·a_{in}.
Obviously I can just try all permutations and select the one which gives the smallest sum (this gives the correct result in O(n!)).
The greedy choice I thought of is to take the numbers in decreasing order, but I don't know how to prove that this works.
P.S.: this is just for study and training; I'm not yet able to think "greedily".
Choosing the numbers in decreasing order is optimal.
Proof is by induction on n: suppose there is an optimal permutation in which the smallest number is not in the last place. Say the smallest value a_s sits at position p < n and a value a_t > a_s sits at position n. Swapping them changes the total sum by (p − n)(a_t − a_s) < 0, i.e. strictly decreases it. That contradicts the assumption of optimality, so the smallest element must be in the last place (if several elements tie for smallest, swapping equal values changes nothing, so we may assume one of them is last). By the induction hypothesis, the other elements are in decreasing order in the first n−1 places.
The base case of n=1 is trivial.
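A short sketch of this greedy in Python (the largest values get the smallest multipliers; the function name is mine):

```python
def min_weighted_sum(a):
    """Minimal value of 1*a[i1] + 2*a[i2] + ... + n*a[in]:
    sort descending so larger values receive smaller multipliers."""
    return sum(k * x for k, x in enumerate(sorted(a, reverse=True), start=1))

print(min_weighted_sum([1, 5, 3]))  # 1*5 + 2*3 + 3*1 = 14
```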

Constrained Longest Increasing Subsequence

Consider an array which has N integers. Now we are given an index i, which can take values from 1 through N. This particular index must always be present in the LIS that we generate. Calculate the LIS for each value of i.
How can we solve the above problem efficiently? My straightforward solution is to vary the index i over all of its values and calculate the LIS each time. The time complexity comes to O(N^2 log N). Can it be beaten?
Example:
N = 2. i = 1
Say the given array is [1, 2] or [2, 2].
The longest (strictly) increasing subsequence containing index 1 has length 2 in the first case and length 1 in the second.
The canonical dynamic program for LIS computes, for each k, the length of the longest increasing subsequence of the elements at indices 1..k that ends with the element at index k. Using this data and the mirror-image data for longest increasing subsequences over k..n, the LIS that includes index k is obtained by joining the longest one ending at k with the longest one starting at k (counting the element at k only once).
O(n log n)
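A sketch of this approach in Python using the standard patience-sorting trick with bisect (names are mine; strict increase assumed, as in the example):

```python
import bisect

def lis_through_each_index(a):
    """For every index k, the length of the longest strictly increasing
    subsequence passing through a[k], in O(n log n) total."""
    def ending_lengths(seq):
        # out[k] = length of the LIS of seq[0..k] that ends exactly at seq[k]
        tails, out = [], []
        for x in seq:
            pos = bisect.bisect_left(tails, x)
            if pos == len(tails):
                tails.append(x)
            else:
                tails[pos] = x
            out.append(pos + 1)
        return out

    left = ending_lengths(a)
    # Mirror image: an increasing run starting at k in `a` is an increasing
    # run ending at k in the reversed, negated array.
    right = ending_lengths([-x for x in reversed(a)])[::-1]
    return [l + r - 1 for l, r in zip(left, right)]  # a[k] counted once

print(lis_through_each_index([3, 1, 2, 5, 4]))  # [2, 3, 3, 3, 3]
```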
Having an index i that must be in the subsequence makes it an easy task to look to the left and right and see how far you can go to remain strictly increasing. This will take at most O(N) steps.
The straightforward solution now just repeats this for all N values at the index i, which gives a total effort of O(N^2).
But note that when changing the value at index i, the calculations done earlier can be reused. It is only necessary to check whether the sequence can be extended beyond i in either direction. If yes, you already know how far (or can calculate it now, once and for all).
This brings the total effort down to O(N).

Count the total number of subsets that don't have consecutive elements

I'm trying to solve a pretty complex problem involving combinatorics and counting subsets. First of all, let's say we are given the set A = {1, 2, 3, ..., N} where N <= 10^18. Now we want to count the subsets that don't contain consecutive numbers.
Example
Let's say N = 3 and A = {1, 2, 3}. There are 2^3 = 8 subsets in total, but we don't want to count the subsets {1,2}, {2,3}, and {1,2,3}. So for this instance the answer is 5, because we count only the remaining 5 subsets: the empty subset, {1}, {2}, {3}, and {1,3}. We also want to print the result modulo 10^9 + 7.
What I've done so far
I was thinking that this could be solved using dynamic programming with two states (whether or not we take the i-th element), but then I saw that N can go up to 10^18, so I now think it should be solved with a mathematical formula. Can you please give me some hints on where to start deriving the formula?
Thanks in advance.
Take a look at How many subsets contain no consecutive elements? on the Mathematics Stack Exchange.
They come to the conclusion that the number of non-consecutive subsets of the set {1, 2, 3, ..., n} is fib(n+2), where fib(k) is the k-th Fibonacci number. Your answer for n = 3 conforms to this: fib(5) = 5. If you can implement the Fibonacci algorithm, then you can solve this problem, but solving it for a number as large as 10^18 will still be a challenge.
As mentioned in the comments here, you can check out the fast doubling algorithm on Hacker Earth.
It will find Fibonacci numbers in O(log n).
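A sketch of fast doubling in Python, using the identities F(2k) = F(k)·(2F(k+1) − F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2 (function names are mine):

```python
MOD = 10**9 + 7

def fib_pair(n):
    """Fast doubling: return (F(n), F(n+1)) modulo MOD in O(log n)."""
    if n == 0:
        return (0, 1)
    a, b = fib_pair(n >> 1)               # a = F(k), b = F(k+1), k = n // 2
    c = a * ((2 * b - a) % MOD) % MOD     # F(2k)   = F(k) * (2*F(k+1) - F(k))
    d = (a * a + b * b) % MOD             # F(2k+1) = F(k)^2 + F(k+1)^2
    return (d, (c + d) % MOD) if n & 1 else (c, d)

def count_non_consecutive_subsets(n):
    """Subsets of {1..n} with no two consecutive elements: F(n+2) mod 10^9+7."""
    return fib_pair(n + 2)[0]

print(count_non_consecutive_subsets(3))       # 5
print(count_non_consecutive_subsets(10**18))  # instant: ~60 recursive calls
```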

Difficult algorithm: Optimal Solution to 1

This is one of those difficult algorithm problems, just because there are so many options. Imagine a number N and the set of primes under 10, i.e. {2, 3, 5, 7}. The goal is to keep dividing N till we reach 1. If at any step N is not divisible by any of the given primes, then you can perform one of the following operations:
i) N = N - 1
OR ii) N = N + 1
This will ensure that N is even (it must be odd if 2 does not divide it) and we can continue.
The goal should be achieved using the minimum number of operations.
Please note that this may sound trivial, i.e. you could implement a step in your algorithm that says "if N is divisible by any prime, then divide by it". But this does not always produce the optimal solution.
E.g. if N = 134: 134 is divisible by 2; if you divide by 2, you get 67. 67 is not divisible by any of the primes, so you do an operation, and N will be 66 or 68, both of which require yet another operation further on. So that is 2 operations in total.
Alternatively, if N = 134 and you do the operation N = N + 1, i.e. N = 135 = 3^3 · 5, then only divisions remain, and the total number of operations needed to reach 1 is 1. So this is the optimal solution.
Unless there is some closed-form mathematical solution for this problem (if you are looking for one, Math.SE is a better fit for this question), you can reduce the problem to a shortest-path problem.
Represent the problem as a graph G = (V, E) where V = N (all natural numbers) and E = {(u, v) | you can get from u to v in a single step}(1).
Now, you need to run a classic search algorithm from your source (the input number) to your target (the number 1). Some of the choices to get an optimal solution are:
BFS - since the reduced graph is not weighted, BFS is guaranteed to be both complete (find a solution if one exists) and optimal (finds the shortest solution).
heuristic A* - which is also complete and optimal(2), and, if you have a good heuristic function, should be faster than an uninformed BFS.
Optimization note:
The graph can be constructed "on the fly"; there is no need to create it as pre-processing. To do so, you will need a successor function next: V -> 2^V (from a node to a set of nodes) such that next(v) = {u | (v,u) is in E}.
P.S. complexity comment: the BFS solution is pseudo-polynomial (linear in the input number in the worst case), since the "highest" vertex you will ever expand is n+1, so the solution is basically O(n) in the worst case - though I believe a deeper analysis can establish a better bound.
(1) If you are interested only in +1/-1 to be counted as ops, you can create the edges based on the target after finishing divisions.
(2) If an admissible heuristic function is used.
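A minimal BFS sketch along these lines in Python, building the graph on the fly via next(v). Following the worked example (where +1 is applied to 134 even though 134 is even), the sketch allows ±1 at any time, and it counts every step (division or ±1) as one edge; per footnote (1), counting only the ±1 steps would instead need zero-weight division edges, e.g. 0-1 BFS:

```python
from collections import deque

PRIMES = (2, 3, 5, 7)

def next_states(v):
    """Successor function next(v): divisions by any prime dividing v,
    plus the +1/-1 adjustments (allowed at any time, as in the example)."""
    succ = [v // p for p in PRIMES if v % p == 0]
    succ += [v - 1, v + 1]
    return succ

def min_steps_to_one(n):
    """Uninformed BFS from n down to 1; the graph is generated lazily."""
    dist = {n: 0}
    q = deque([n])
    while q:
        v = q.popleft()
        if v == 1:
            return dist[v]
        for u in next_states(v):
            if u >= 1 and u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)

print(min_steps_to_one(134))  # 5 edges: 134 -> 135 -> 27 -> 9 -> 3 -> 1
```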
