How do I solve this dynamic programming problem?

I am using C++.
Mark has N days. Initially he is at position (h1,0) on the X-axis. On each day he moves from his current position by exactly one of +a, +b or +c along the X-axis. He can pick whichever of the three choices he wants each day.
At the end of the N-th day, he has to be at position (h2,0).
Count the number of ways in which Mark can reach (h2,0) in N days.
The values of N, h1, h2, a, b, c are large (the co-ordinates and the values of a, b, c can be negative as well, and in some cases a=b or b=c or c=a or a=b=c).
My approach: for each day, I store the positions he can reach on that day together with the count (number of ways) of reaching each position. I am using a map to do this, and this approach is not efficient.
Can somebody share a much more efficient approach?
My second approach, which should work, is a variation of the coin-exchange problem :-)
Example:-
N=3,h1=0,h2=6,a=1,b=2,c=3
Answer: 7 (number of ways)
1st way:-(1+2+3)
2nd way:-(1+3+2)
3rd way:-(2+1+3)
4th way:-(2+3+1)
5th way:-(3+1+2)
6th way:-(3+2+1)
7th way:-(2+2+2)
Format:-(Choice on 1st day+Choice on 2nd day+Choice on 3rd day)
Constraints:-
1<=N<=10^5 .
-10^9<=h1,h2,a,b,c<=10^9 .

If you know how many times each of a, b and c is used, you can easily compute the result.
For example, if we use a x times, b y times and c z times to get from h1 to h2, the number of ways to order those moves is
factorial(x+y+z)/(factorial(x)*factorial(y)*factorial(z))
Now, how can we find the values of x, y, z? There can be many triplets (x, y, z).
We can try every number from 0 to n as x.
So for each x from 0 to n,
x*a + y*b + z*c = h2 - h1
y + z = n - x
As we know the values of x, a, b, c, h2, h1, we can rewrite the first equation as
y*b + z*c = k, where k = h2 - h1 - x*a
Now solve the following pair of equations for y and z:
y*b + z*c = k
y + z = n - x
This system may or may not have a solution in which y and z are non-negative integers. When b != c, substituting y = n - x - z gives z = (k - b*(n-x))/(c-b), so there is at most one candidate per x.
If there is a solution, the number of orderings for this x is
factorial(x+y+z)/(factorial(x)*factorial(y)*factorial(z))
If there is no solution, skip the current x.
In this way, calculate y and z for each x from 0 to n and sum the resulting counts.
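The loop over x described above could be sketched roughly as follows. This is only an illustration in Python, not the answerer's code: the name count_ways is made up, it assumes b != c so that the two equations have at most one solution per x, it treats the three choices as distinct even if two of the values are equal, and it uses exact big integers instead of the modulus a real judge would probably ask for.

```python
from math import factorial

def count_ways(n, h1, h2, a, b, c):
    """Count orderings of n moves (+a, +b or +c) that lead from h1 to h2.

    Assumption (not from the original post): b != c.
    """
    if b == c:
        raise ValueError("this sketch assumes b != c")
    target = h2 - h1
    total = 0
    for x in range(n + 1):          # x = number of times a is used
        rest = n - x                # y + z must equal rest
        k = target - x * a          # y*b + z*c must equal k
        # Substituting y = rest - z gives z*(c - b) = k - rest*b.
        numerator = k - rest * b
        if numerator % (c - b) != 0:
            continue                # z would not be an integer
        z = numerator // (c - b)
        y = rest - z
        if y < 0 or z < 0:
            continue                # counts cannot be negative
        # Multinomial coefficient: orderings of x a's, y b's and z c's.
        total += factorial(n) // (factorial(x) * factorial(y) * factorial(z))
    return total

print(count_ways(3, 0, 6, 1, 2, 3))  # the example from the question
```

For the example in the question only x=0 (2+2+2) and x=1 (the six orderings of 1, 2, 3) contribute, which matches the 7 ways listed there.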

Related

Finding algorithm that minimizes a cost function

I have a restroom that I need to place at some point. I want the restroom's placement to minimize the total distance people have to travel to get there.
I have x apartments, located at a_1, a_2, a_3, ... a_x, and the number of people living in a_1 is n_1, in a_2 is n_2, etc. No two apartments can be in the same place and each apartment has a positive number of people.
So I know the distance between an apartment a_1 and the proposed restroom, placed at a, would be |a_1 - a|.
MY WORKING:
I defined a cost function, C(a) = SUM[from i = 1 to x] (n_i)|a_i - a|. I want to find the location a that minimizes this cost function, given two arrays - one for the location of the apartments and one for the number of people in each apartment. I want my algorithm to be in O(n) time.
I was thinking of representing this as a graph and using MSTs or Dijkstra's, but that would not meet the O(n) runtime. Clearly, there must be something I can do without graphs, but I am unsure.

My understanding of your problem:
You have a one-dimensional line with points a1,...,an. Each point has a value n1,...,nn, and you need to pick a point a that minimizes the cost function
SUM[from i = 1 to n] (n_i)|a_i - a|.
Let's assume our input a1...an is sorted.
Our strategy will be a sweep from left to right, calculating possible a on the way.
Things we will keep track of:
total_n: the total number of people
left_n: the number of people living to the left of or at our current position
right_n: the number of people living to the right of our current position
a: our current position
Starting at a = a1, calculate:
C(a)
left_n = n1
right_n = total_n - left_n
Now we consider what happens to the sum if we move our restroom to the right 1 step. The people on the left get 1 step further away, but the people on the right get 1 step closer.
As long as we do not pass an apartment, we can say that C(a+1) = C(a) + left_n - right_n
If the range an-a1 is fairly small, we can use this and just step through the range using this formula to update the sum. Note that when this sum starts increasing we have gone too far and can safely stop.
However, if the apartments are very far apart we cannot step 1 unit at a time. We need instead to step apartment by apartment. Note
C(a[i]) = C(a[i-1]) + left_n*(a[i]-a[i-1]) - right_n*(a[i]-a[i-1])
where left_n and right_n are the counts at a[i-1].
If at any point C(a[i]) > C(a[i-1]) we know that the correct position of the restroom is somewhere between a[i-1] and a[i].
We can calculate that position; let's call it x.
The sum at x is C(a[i-1]) + left_n*(x-a[i-1]) - right_n*(x-a[i-1]) and we want to minimize this. Note that everything but x is a known value.
We can expand this to
f(x) = C(a[i-1]) + left_n*x - left_n*a[i-1] - right_n*x + right_n*a[i-1]
Constant terms cannot affect our decision, so we are actually looking to minimize
f(x) = x*(left_n-right_n)
We see that if left_n < right_n we want the restroom to be at a[i], but if left_n > right_n we want the restroom to be at a[i-1].
We need to do this calculation at most once per apartment, so the running time is O(n).
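A minimal sketch of the apartment-by-apartment sweep, in Python. The function name best_restroom and its return shape are assumptions made for illustration, the positions are assumed to be already sorted, and the sketch only considers apartment locations as candidates, since the argument above shows the minimum is attained at one of them.

```python
def best_restroom(positions, people):
    """Return (location, cost) minimizing sum(n_i * |a_i - a|).

    positions: apartment coordinates, assumed sorted ascending.
    people:    number of residents per apartment, same order.
    """
    total_n = sum(people)
    # Cost of placing the restroom at the leftmost apartment.
    cost = sum(n * (p - positions[0]) for p, n in zip(positions, people))
    best_pos, best_cost = positions[0], cost
    left_n = people[0]              # people at or left of the current spot
    for i in range(1, len(positions)):
        right_n = total_n - left_n  # people strictly to the right
        step = positions[i] - positions[i - 1]
        # Everyone on the left gets `step` further, everyone on the right closer.
        cost += (left_n - right_n) * step
        if cost < best_cost:
            best_pos, best_cost = positions[i], cost
        left_n += people[i]
    return best_pos, best_cost

print(best_restroom([1, 2, 10], [1, 1, 5]))
```

The sweep itself is one pass over the apartments. If the input is not sorted, sorting it first would add an O(n log n) step.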

If Random variable is a function

The basic definition of a random variable is that it is a function defined on the outcomes of a random experiment. The question is: if it is a function, say f, then how can it take numerical values?
Suppose we toss two coins and let X be the random variable giving the number of heads, with values (0,1,2). For the event of two heads, say w, we have X(w)=2, which is the value of the function X at w, and not of X itself.
But sometimes it is written that X is a r.v. taking values 0,1,2,...
Doesn't it sound wrong to say that a function takes values?

A random variable is a well defined function X: E -> R, whose domain E is a probability space and its codomain is (generally speaking) the set of real numbers.
Intuitively, X is some kind of metric or measurement on the elements of E.
Example 1
Let E be the set of users of Stack Overflow at a given point in time, say right now. And let X be the function that assigns their reputation to every SO user. For example, you could calculate P(X >= 5000) which is the percent of SO users with a reputation of 5000 or more.
Notice that X >= 5000 inside P(X >= 5000) is nothing but a compact notation for the subset of E defined as:
{u in E | X(u) >= 5000}
meaning the subset of SO users u with a reputation of 5000 or more.
Example 2
Let E be the set of questions in SO and X the function that assigns the number of votes (at certain point in time) to each question. If you pick one question q at random, X(q) would be its number of votes and we could ask for the probability of, say, X < 0 (down-voted questions.)
Here the subset of such questions is
{q in E | X(q) < 0}
i.e., the subset of questions q having a negative vote count.
Conclusion
There is nothing random in a Random variable. The randomness is in the way we pick elements (or subsets) from its domain.
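The two-coin example from the question can be written out the same way. The following Python sketch is only an illustration: the names outcomes, X and prob are made up, and it assumes a fair coin so that all four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space E: all outcomes of tossing two coins (assumed equally likely).
outcomes = list(product("HT", repeat=2))

def X(w):
    """The random variable: a plain function from an outcome to a number."""
    return w.count("H")

def prob(condition):
    """P(condition on X) = share of outcomes w whose value X(w) satisfies it."""
    favourable = [w for w in outcomes if condition(X(w))]
    return Fraction(len(favourable), len(outcomes))

print(sorted({X(w) for w in outcomes}))   # the values X "takes"
print(prob(lambda v: v == 2))             # probability of two heads
```

X itself never changes. "X takes the values 0, 1, 2" just describes the set of outputs of this function over the sample space.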

Speaking of functions - Yes, it is safe to say that a function can take certain values. Speaking of random variables and probability, the definition I know is:
A random variable assigns a numerical value to each possible outcome of a random experiment
This definition does indeed say that X (aka the random variable) is a function. In your case, saying that X (as a function) can take the values 0,1,2 is basically saying that the image of X (a subset of its codomain, or even the codomain or target set itself) is the set {0,1,2}, i.e. the interval
[0,2] ⊂ ℕ.

How to solve SPOJ : SCALE using binary search?

http://www.spoj.com/problems/SCALE/
I am trying to do it using recursion but I am getting TLE.
The tags of the problem say BINARY SEARCH.
How can one do it using binary search?
Thanks in advance.

The first thing to notice here is that if you had two weights of each size instead of one, the problem would be quite trivial, as we would only need to write X in base 3 and take the corresponding number of weights. For example, if X=21 then we could take P_3 twice and P_2 once, and put those on the other scale.
Now let's try to do something similar using the fact that we can add weights to both scales (including the one where X is placed):
Assume that X <= P_1+P_2+...+P_n. That means X <= P_n + (P_n - 1)/2, because P_1+...+P_(n-1) = (P_n - 1)/2. Therefore, X + P_(n-1) + P_(n-2)+...+P_1 < 2*P_n.
(*) What that means is that if we add some of the weights from 1 to n-1 to the same scale as X, then the number on that scale still does not have a 2 as its n-th digit from the right (that digit is either 0 or 1).
From now on, "digit" means a digit of a number in its base-3 representation (but it can temporarily become larger than 2 :P ). Let's denote the total weight of the first scale (where X is placed) as A=X and of the other scale as B=0. Our goal is to make them equal (both A and B will change as we make progress).
Let's iterate through all digits of A from the least significant (rightmost) to the most significant (leftmost). If the current digit index is i and the digit:
Equals 0, then just ignore it and proceed further.
Equals 1, then we place weight P_i=3^(i-1) on scale B.
Equals 2, then we add P_i=3^(i-1) to scale A. Note that this increases digit (i+1).
Equals 3 (yes, this case is possible, if both the current and the previous digit were 2), then add 1 to the digit at index i+1 and go further (no weights are added to either scale).
Due to (*), the procedure runs correctly (the last digit of A will be equal to 1): we use each weight at most once and place it correctly, and the numbers A and B will be equal after the procedure is complete.
Now the second case, X > P_1+P_2+...+P_n. Obviously we cannot balance the scales even if we place all weights on the second scale.
This completes the proof and shows when it is possible and how to place the weights on both scales to equalise them.
EDIT:
C++ code which I successfully submitted on SPOJ just now https://ideone.com/tbB7Ve

The solution to this problem is quite simple. The idea is the same as @Yerken's answer, but expressed in a slightly different way:
Only the first weight has a mass not divisible by 3, so the first weight is the only one that affects whether the two scales balance mod 3:
If X mod 3 == 0, the first weight must not be used
If X mod 3 == 1, the first weight must be on scale B (the currently empty one)
If X mod 3 == 2, the first weight must be on scale A
Subtract weight(B) from both scales --> the solution doesn't change, and now weight(A) is divisible by 3 while weight(B) == 0
Set X' = weight(A)/3 and divide every weight Pi by 3 ==> the solution doesn't change, and now it's the same problem with N' = N-1 and X' = (X+1)/3 (integer division)
pseudo-code:
listA <- empty
listB <- empty
for i = 1 to N {
    if (X == 0) break for loop; // done!
    if (X mod 3 == 1) then push i to listB;
    if (X mod 3 == 2) then push i to listA;
    X = (X + 1)/3; // integer division
}
hasSolution <- (X == 0)
C++ code: http://ideone.com/LXLGmE
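For readers who prefer something runnable, the pseudo-code translates almost line by line into Python. This is a sketch and not the linked submission: the function name balance and its return convention (None when no solution exists) are assumptions, and weight i is taken to be 3^(i-1) as in the first answer.

```python
def balance(n, x):
    """Place weights 3^0 .. 3^(n-1) so that both scales are equal.

    Returns (list_a, list_b): indices (1-based) of the weights added to
    scale A (the one holding X) and to scale B, or None if impossible.
    """
    list_a, list_b = [], []
    for i in range(1, n + 1):
        if x == 0:
            break                    # done, nothing left to balance
        remainder = x % 3
        if remainder == 1:
            list_b.append(i)         # weight i goes opposite to X
        elif remainder == 2:
            list_a.append(i)         # weight i goes next to X
        x = (x + 1) // 3             # same problem with one weight fewer
    if x != 0:
        return None                  # ran out of weights
    return list_a, list_b

print(balance(4, 21))
```

For n=4 and X=21 this puts weight 3 (mass 9) next to X and weights 2 and 4 (masses 3 and 27) on the other scale, so both sides weigh 30.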

Intuition behind the Z algorithm

The Z algorithm is a string matching algorithm with O(n) complexity.
One use case is finding the longest occurrence of a prefix of string A in string B. For example, the longest occurrence of "overdose" in "stackoverflow" would be "over". You could discover this by calling the Z algorithm with a combined string "overdose#stackoverflow" (where # is some character not present in either string). The Z algorithm would then try to match the combined string with itself, and create an array z[] where z[i] gives you the length of the longest match with the start of the string, starting from index i. In our example:
index   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
string  o  v  e  r  d  o  s  e  #  s  t  a  c  k  o  v  e  r  f  l  o  w
z    (21)  0  0  0  0  1  0  0  0  0  0  0  0  0  4  0  0  0  0  0  1  0
There are plenty of code implementations and mathematically oriented explanations of the algorithm, here are some good ones:
http://www.geeksforgeeks.org/z-algorithm-linear-time-pattern-searching-algorithm/
http://codeforces.com/blog/entry/3107
I can see how it works, but I don't understand why. It seems almost like black magic. I have a very strong intuition that this task is supposed to take O(n^2), yet here is an algorithm that does it in O(n).

I don't find it completely intuitive either, so I think that I qualify for answering. Otherwise I'd just say that you don't understand because you're an idiot, and surely that's not the answer you're hoping for :-)
Case in point (citation from an explanation):
Correctness is inherent in the algorithm and is pretty intuitively clear.
So, let's try to be even more intuitive...
First, I'd guess that the common intuition for O(n^2) is this: for a string of length N, if you're dropped at a random place i in the string with no other information, you have to match x (< N) characters to compute Z[i]. If you're dropped N times, you have to do up to N(N-1) tests, so that's O(n^2).
The Z algorithm, however, makes good use of the information you've gained from past computations.
Let's see.
First, as long as you don't have a match (Z[i]=0), you progress along the string with one comparison per character, so that's O(N).
Second, when you find a range where there's a match (at index i), the trick is to use clever deductions from the previous Z[0...i-1] to compute the Z values in that range without comparing the characters inside that range again. New comparisons only happen to the right of the range.
That's how I understand it anyway, hope this helps.

I was looking for a deeper understanding for this algorithm hence I found this question.
I didn't understand the codeforces post initially, but later I found it is good enough for understanding, and I noticed that the post was not entirely accurate, and it omitted some steps in the thinking process, making it a bit confusing.
Let me try to correct the inaccuracy in that post, and clarify some of the steps I think may help people connect the dots to a line. In this process, I hope we can learn some intuition from the original author. In the explanation, I'll mix some quoted blocks from codeforces and my own notes so we can keep the original post close to our discussion.
The Z algorithm starts as:
As we iterate over the letters in the string (index i from 1 to n - 1), we maintain an interval [L, R] which is the interval with maximum R such that 1 ≤ L ≤ i ≤ R and S[L...R] is a prefix-substring (if no such interval exists, just let L = R = -1). For i = 1, we can simply compute L and R by comparing S[0...] to S[1...]. Moreover, we also get Z[1] during this.
This is simple and straightforward.
Now suppose we have the correct interval [L, R] for i - 1 and all of the Z values up to i - 1. We will compute Z[i] and the new [L, R] by the following steps:
If i > R, then there does not exist a prefix-substring of S that starts before i and ends at or after i. If such a substring existed, [L, R] would have been the interval for that substring rather than its current value. Thus we "reset" and compute a new [L, R] by comparing S[0...] to S[i...] and get Z[i] at the same time (Z[i] = R - L + 1).
The bold part in the bullet point might be confusing, but if you read it twice, it's really just repeating the definition of R.
Otherwise, i ≤ R, so the current [L, R] extends at least to i. Let k = i - L. We know that Z[i] ≥ min(Z[k], R - i + 1) because S[i...] matches S[k...] for at least R - i + 1 characters (they are in the [L, R] interval which we know to be a prefix-substring). Now we have a few more cases to consider.
The bold part is not completely accurate, because R - i + 1 can be greater than Z[k], in which case Z[i] would be Z[k].
Let's focus on the key now: Z[i] ≥ min(Z[k], R - i + 1). Why is this true? Because of the following:
Based on the definition of the interval [L, R] and i ≤ R, we already know that S[0...R - L] == S[L...R], hence S[0...k] == S[L...i], and S[k...R - L] == S[i...R];
Say Z[k] = x. Based on the definition of Z, we know S[0...x - 1] == S[k...k + x - 1], and S[x] != S[k + x];
Combining the equations above, when x < R - i + 1 we know S[0...x - 1] == S[k...k + x - 1] == S[i...i + x - 1]. Moreover i + x ≤ R, so S[i + x] == S[k + x] != S[x], and the match at i stops exactly where the match at k does. The point is, Z[i] = Z[k] when Z[k] < R - i + 1.
These are the missing dots I mentioned in the beginning, and they explain both the second and the third bullet points, and partially the last bullet point. This wasn't straightforward when I read the codeforces post. To me this is the most important part of this algorithm.
For the last bullet point, if Z[k] ≥ R - i + 1, we would refresh [L, R], using i as the new L, and extending R to a bigger R'.
In the whole process, every successful character comparison moves R to the right and R never moves back, so the Z algorithm compares each character successfully only once and the time complexity is O(n).
As Ilya answered, the intuition in this algorithm is to carefully reuse every piece of information we gathered so far. I just explained it in another way. Hope it helps.
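To make the case analysis concrete, here is a compact Python sketch of the Z array computation. It is one common way to write it, not the code from the linked posts: the name z_array is made up, the window is kept half-open as s[left:right] instead of the inclusive [L, R] used above, and z[0] is simply left at 0.

```python
def z_array(s):
    """z[i] = length of the longest common prefix of s and s[i:] (z[0] left as 0)."""
    n = len(s)
    z = [0] * n
    left, right = 0, 0   # s[left:right] is the rightmost known prefix match
    for i in range(1, n):
        if i < right:
            # Inside the window: reuse the value computed for the mirrored
            # position i - left, capped at what the window guarantees.
            z[i] = min(z[i - left], right - i)
        # Extend by direct comparison. This only runs past the window's edge,
        # so every successful comparison pushes `right` further.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

pattern, text = "overdose", "stackoverflow"
z = z_array(pattern + "#" + text)
best = max(z[len(pattern) + 1:])
print(pattern[:best])  # longest prefix of the pattern found in the text
```

The only place characters are compared is the while loop, and each successful comparison there advances right, which can happen at most n times in total. Each index also costs at most one failed comparison, so the whole run stays linear.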

Count ways to take atleast one stick

There are N sticks placed in a straight line. Bob is planning to take a few of these sticks. But whatever number of sticks he is going to take, he will take no two successive sticks (i.e. if he is taking stick i, he will not take sticks i-1 and i+1).
So given N, we need to calculate how many different sets of sticks he could select. He needs to take at least one stick.
Example: Let N=3, then the answer is 4.
The 4 sets are: (1, 3), (1), (2), and (3)
The main problem is that I want a solution better than simple recursion. Can there be any formula for it? I am not able to crack it.

It's almost identical to Fibonacci. The final solution is actually fibonacci(N+2)-1 (taking fibonacci(1) = fibonacci(2) = 1), but let's explain it in terms of actual sticks.
To begin with, we disregard the fact that he needs to pick up at least 1 stick. The solution in this case looks as follows:
If N = 0, there is 1 solution (the solution where he picks up 0 sticks)
If N = 1, there are 2 solutions (pick up the stick, or don't)
Otherwise he can choose to either
pick up the first stick and recurse on N-2 (since the second stick needs to be discarded), or
leave the first stick and recurse on N-1
After this computation is finished, we subtract 1 from the result to avoid counting the case where he picks up 0 sticks in total.
Final solution in pseudo code:
int numSticks(int N) {
    return N == 0 ? 1
         : N == 1 ? 2
         : numSticks(N-2) + numSticks(N-1);
}
solution = numSticks(N) - 1;
As you can see, numSticks is actually Fibonacci, which can be solved efficiently using, for instance, memoization.
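As an alternative to memoization, the same recurrence can be evaluated bottom-up in a simple loop. The Python sketch below is only an illustration and the name count_selections is made up.

```python
def count_selections(n):
    """Number of non-empty sets of sticks with no two adjacent, for n sticks."""
    # a = numSticks(0), b = numSticks(1) from the recurrence above.
    a, b = 1, 2
    for _ in range(n):
        # Slide the pair one step: (numSticks(k), numSticks(k+1)).
        a, b = b, a + b
    # a is now numSticks(n); drop the empty selection.
    return a - 1

print(count_selections(3))  # the example from the question
```

This takes O(N) additions and constant extra memory, and avoids the exponential blow-up of the plain recursion.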

Let the number of sticks taken by Bob be r.
For a fixed r, the problem is in bijection with the binary vectors of length n with exactly r 1's and no two adjacent 1's.
This is solvable by first placing the r 1's, which leaves exactly n-r 0's to place between them and on the sides. However, you must place r-1 0's between the 1's, so you are left with exactly n-r-(r-1) = n-2r+1 "free" 0's.
The number of ways to arrange such vectors is then given by:
(1) = Choose(n-2r+1 + (r+1) - 1, n-2r+1) = Choose(n-r+1, n-2r+1)
Formula (1) is derived from the number of ways of choosing n-2r+1 elements from r+1 distinct possibilities with replacement.
Since we solved it for a specific value of r, and you are interested in all r>=1, you need to sum over each 1<=r<=n (terms with n-2r+1 < 0 are zero).
So, the solution of the problem is given by the closed formula:
(2) = Sum{ Choose(n-r+1, n-2r+1) | for each 1<=r<=n }
Disclaimer:
A close variant of the problem with fixed r was given as HW in the course I am TAing this semester. The main difference is the need to sum over the various values of r.
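Formula (2) can be checked numerically with a few lines of Python. This is a sketch: the name count_by_formula is made up, and the sum stops at r = (n+1)//2 because for larger r the count n-2r+1 of free 0's would be negative and the term is zero.

```python
from math import comb

def count_by_formula(n):
    """Sum over r of Choose(n-r+1, n-2r+1), for r = 1 .. (n+1)//2."""
    total = 0
    for r in range(1, (n + 1) // 2 + 1):
        free_zeros = n - 2 * r + 1   # 0's left after separating the 1's
        total += comb(n - r + 1, free_zeros)
    return total

print([count_by_formula(n) for n in range(1, 7)])
```

For n=3 the terms are Choose(3,2)=3 for r=1 and Choose(2,0)=1 for r=2, giving the 4 sets from the question.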
