I'm working on an algorithms problem, and I'm hitting a wall in speeding it up.
I have a function f(i,j), where i and j are integers such that 1 <= i <= j <= n for some upper bound n. This function is already written.
Furthermore, this function satisfies the equality f(i, j) + f(j, k) = f(i, k).
I need to compute f(x, y) for many different pairs x, y. Assume n is big enough that storing f(x,y) for every possible pair x,y will take up too much space.
Is there a known algorithm for this type of question? The one I'm using right now memoizes f and tries to reduce x,y to a previously computed pair of numbers by using the equality mentioned above, but my guess is that I'm not reducing in a smart way, and it's costing me time.
Edit: Assume that f(i, j) takes time proportional to j-i when computed the naive way.
You can use an implicit tree of power-of-two-sized intervals:
Store f(i,i+1) for every i
Store f(i,i+2) for every even i
Store f(i,i+4) for every i divisible by four
...
There will be O(log n) tables (floor(log_2(n)), to be exact), with a total size of O(n) (~2*n).
To retrieve f(i,j) where i<=j:
find the highest bit where i, j differ.
Let m be the value with this bit set and all lower bits cleared (a different name than the upper bound n). This guarantees the following steps will always succeed:
find f(i, m) by repeatedly cutting off a chunk as large as possible from the right
find f(m, j) by repeatedly cutting off a chunk as large as possible from the left
The retrieval accesses each table at most twice, and thus runs in O(log n).
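A minimal sketch of this layout in Python, assuming f is the already-written function from the question and that f(i, i) = 0 (which follows from the additivity rule with j = k = i); the names build_tables and query are mine:

def build_tables(f, n):
    # tables[t] maps a start index s (divisible by 2**t) to the value f(s, s + 2**t)
    tables = []
    t = 0
    while (1 << t) <= n - 1:
        step = 1 << t
        level = {}
        start = 1 if t == 0 else step          # smallest valid start index on this level
        for s in range(start, n - step + 1, step):
            if t == 0:
                level[s] = f(s, s + 1)
            else:
                # combine the two half-intervals stored in the previous level
                level[s] = tables[t - 1][s] + tables[t - 1][s + step // 2]
        tables.append(level)
        t += 1
    return tables

def query(tables, i, j):
    # returns f(i, j) for 1 <= i <= j <= n using O(log n) table lookups
    if i == j:
        return 0                               # f(i, i) = 0 by the additivity rule
    total = 0
    b = (i ^ j).bit_length() - 1               # highest bit where i and j differ
    m = (j >> b) << b                          # that bit set, all lower bits cleared
    x = m                                      # i -> m: cut aligned chunks off the right end
    while x > i:
        t = b
        while (1 << t) > x - i or (x - (1 << t)) % (1 << t) != 0:
            t -= 1
        total += tables[t][x - (1 << t)]
        x -= 1 << t
    x = m                                      # m -> j: cut aligned chunks off the left end
    while x < j:
        t = b
        while (1 << t) > j - x or x % (1 << t) != 0:
            t -= 1
        total += tables[t][x]
        x += 1 << t
    return total

Building the tables level by level needs only n - 1 evaluations of f on adjacent indices; every other stored value is the sum of two values from the previous level, and each query then touches the tables O(log n) times without calling f at all.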
As you say, the function satisfies the rule

f(i, j) + f(j, k) = f(i, k)

So you can rewrite the function as f(i, j) = g(j) - g(i), where g(x) = f(1, x). This is consistent with the rule, since

f(i, k) = g(k) - g(i)
        = g(k) - g(j) + g(j) - g(i)
        = f(i, j) + f(j, k)

Storing all combinations of f(i, j) would cost you around O(n^2) space, so it is better to store the values g(i) for all i, which takes only O(n) space.

Whenever you need f(i, j), you can then compute it as

f(i, j) = g(j) - g(i)   // using the precomputed g values
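A minimal sketch of this in Python, assuming f is the existing function and f(1, 1) = 0; the names precompute_g and query_f are mine. Under the cost model in the question, building g incrementally from adjacent calls f(x - 1, x) takes O(n) total time, whereas calling f(1, x) directly for every x would take O(n^2).

def precompute_g(f, n):
    # g[x] = f(1, x); built incrementally so each step only evaluates an adjacent pair
    g = [0] * (n + 1)
    for x in range(2, n + 1):
        g[x] = g[x - 1] + f(x - 1, x)     # f(1, x) = f(1, x - 1) + f(x - 1, x)
    return g

def query_f(g, i, j):
    # f(i, j) = g(j) - g(i), an O(1) lookup per query
    return g[j] - g[i]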
This is a solution that requires O(n) space, O(n^2) setup time and O(1) time per evaluation.
We have that f(i, j) = -f(j, i): take k = i in the given rule and note that f(i, i) = 0.
We are given f(i, k) = f(i, j) + f(j, k). Therefore, f(i, k) = f(i, j) + f(j, k) = -f(j, i) + f(j, k). In a setup phase, fix j = 1 arbitrarily. Then, compute f(1, i) for every i and store the result. This takes O(n) space and O(n^2) time: n evaluations with running times proportional to 0, 1, 2, ..., n - 1.
For a query f(i, k), we need two constant-time lookups of the stored values f(1, i) and f(1, k), since f(i, k) = f(1, k) - f(1, i).
I have found the following algorithm:
for i in range(1, n):
    for j in range(i + 1, n + 1):
        for k in range(1, j + 1):
            # some instructions
and I would like to determine its complexity. So far, what I have done is the following:
I have converted the three loops into summations, so I have sum(i = 1 to n-1) sum(j = i+1 to n) sum(k = 1 to j) c.
When I analyze the loops j and k, it is easy to see that when j starts at 2 then k makes 2 iterations, when j starts at 3 then k makes 3 iterations, and so on. At this point I have something like sum(j = i+1 to n) j for the two inner loops.
I am considering c to be the cost of the instructions inside the k loop. To finish, I can say that I have c * sum(i = 1 to n-1) sum(j = i+1 to n) j, which is on the order of n^3.
Is this analysis correct or am I missing something?
Thanks
There are O(n^3) steps here, as you believe. Your heuristic has shown that O(n^3) is an upper bound --- but one might ask if it is a tight upper bound. In fact, it is (assuming that the content of each loop is essentially a constant-time operation).
One way to see this is to make some loose upper and lower bounds.
For an upper bound, note that i ranges from 1 to n, j ranges over a subset of 1 to n+1, and k ranges over a subset of 1 to n+2. There are then fewer steps than if i, j, and k ran over (1,n), (1,n+1), and (1,n+2), respectively. And this is O(n^3) steps. Thus there are at most O(n^3) steps. I note that this is roughly the heuristic that you used to produce your answer.
For a lower bound, we can note that when i is in (n/3, 2n/3), the j index will always include the range (2n/3 + 1, n + 1). And for i and j in these ranges, the k index will always include the range (1, 2n/3 + 2). The lengths of these ranges are n/3 (for i), n/3 (for j), and 2n/3 + 1 (for k). This is also of order n^3, and so O(n^3) is the right estimate. Some people would then say that this is BigTheta(n^3).
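A quick empirical check in Python (the counter and the sample values of n are my additions) confirms the cubic growth and shows the constant that the loose bounds above bracket:

def count_iterations(n):
    # count how many times the innermost body runs
    count = 0
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            for k in range(1, j + 1):
                count += 1
    return count

for n in (10, 100, 200):
    c = count_iterations(n)
    print(n, c, c / n ** 3)   # the ratio settles near 1/3, i.e. the count is ~n^3 / 3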
Consider the following algorithm:
def sum(v, i, j):
    if i == j:
        return v[i]
    else:
        k = (i + j) // 2                    # midpoint of the interval
        return sum(v, i, k) + sum(v, k + 1, j)
The time complexity of this algorithm is O(n), but how can I prove (in natural language) its complexity? The problem always gets divided into two new subproblems, which suggests O(log n), but where does the rest of the complexity come from?
Applying the master theorem yields the expected result, O(n).
Thanks.
From a high-level perspective, your algorithm acts as if it is traversing a balanced binary tree, where each node covers a specific interval [i, j]. Its children divide the interval into two roughly equal parts, namely [i, (i+j)/2] and [(i+j)/2 + 1, j].
Let's assume that the two parts are exactly equal (in other words, for the sake of the proof, the length of the array n is a power of 2).
Think of it in the following way. There are n leaves in this balanced binary tree your algorithm is traversing, each responsible for an interval of length 1. There are n/2 nodes of the tree that are the parents of these n leaves, and those n/2 nodes have n/4 parents. This goes all the way up until you reach the root node of the tree, which covers the entire interval.
Think of how many nodes there are in this tree: n + n/2 + n/4 + n/8 + ... + 2 + 1. Since we initially assumed that n = 2^k, we can write this as a sum of powers of two, for which the summation formula is well known. It turns out that there are 2^(k+1) - 1 = 2 * 2^k - 1 = 2n - 1 nodes in the tree, so traversing all of them clearly takes O(n) time.
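A small check in Python that counts the recursive calls directly; the function is renamed sum_range here to avoid shadowing Python's built-in sum, but it is otherwise the algorithm from the question:

calls = 0

def sum_range(v, i, j):
    global calls
    calls += 1
    if i == j:
        return v[i]
    k = (i + j) // 2
    return sum_range(v, i, k) + sum_range(v, k + 1, j)

v = list(range(16))              # n = 16, a power of two
sum_range(v, 0, len(v) - 1)
print(calls)                     # prints 31 = 2 * 16 - 1, matching the 2n - 1 node count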
Dividing the problem into two parts does not necessarily mean that the complexity is log(n).
I guess you are referring to the binary search algorithm, but there, at every division, one half is skipped entirely because we know the search key lies in the other half.
Just by looking at the code, a recursive call is made for every division and nothing is skipped. Why would it be log(n)?
O(n) is correct complexity.
I need an algorithm which can help me divide an N-element array into pairs; the elements in each pair must have minimum difference.
So, I assume that we may throw away at most one element (in the case when N is odd). Let a_1 ≤ a_2 ≤ … ≤ a_N be our set of numbers sorted in non-decreasing order. Let f(k) be the minimum possible sum of differences inside pairs formed by the first k numbers. (So the minimum is taken over all partitions of the first k numbers into pairs.) As mentioned in the comments, for even N we just pair up consecutive elements. Then
f(0) = f(1) = 0,
f(2·k) = f(2·k − 2) + (a_{2·k} − a_{2·k − 1}) for k ≥ 1, and
f(2·k + 1) = min { f(2·k), f(2·k − 1) + (a_{2·k + 1} − a_{2·k}) } for k ≥ 1.
The last formula means that we can either throw away the (2·k + 1)-th element or throw away one element among the first (2·k − 1). On the remaining 2·k elements we apply the known solution for even N.
This is a way to find the numerical answer for the problem in O(N) time after sorting the numbers, i.e. in general it takes O(N log N) time. After that, if you need the partition of all numbers into pairs, just use the backward step of the dynamic programming. For even N just make the pairs trivially. For odd N, throw away a_N if f(N) = f(N − 1); otherwise take the pair (a_N, a_{N − 1}) and decrease N by 2.
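A minimal sketch of this recurrence in Python; it returns only the optimal value, and recovering the actual pairs would use the backward step described above. The function name is mine:

def min_sum_of_pair_differences(nums):
    a = sorted(nums)
    n = len(a)
    # f[k] = minimum sum of in-pair differences over the first k sorted numbers,
    # throwing away at most one of them when k is odd
    f = [0] * (n + 1)
    for k in range(2, n + 1):
        paired = f[k - 2] + (a[k - 1] - a[k - 2])   # pair the k-th number with the (k-1)-th
        if k % 2 == 0:
            f[k] = paired
        else:
            f[k] = min(f[k - 1], paired)            # or throw one number away instead
    return f[n]

print(min_sum_of_pair_differences([5, 2, 9, 1, 7]))   # sorted: 1 2 5 7 9 -> pairs (1,2),(7,9), value 3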
Suppose an array = {2, 5, 7, 8, 10}. You need to find the length of the Longest Increasing Subsequence such that each element is not less than the sum of all elements before it.
In this case the answer can be {2, 5, 7}, {2, 5, 8} or {2, 8, 10}, so the length = 3.
This is easily solvable in O(n^2). The length of an ordinary LIS can be found in O(n log n), and since the problem asks only for the length, I think this problem is also solvable in O(n log n). But how can I do that?
Actually you don't need a DP solution at all.
First sort the numbers in non-decreasing order and loop from left to right, keeping track of the current sum of the picked elements.
If the next number is not less than that sum, add it to the subsequence (and to the sum); otherwise proceed to the next number.
It can be proven that this greedy solution is optimal. Prove it yourself ;)
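A minimal sketch of that greedy in Python; the function name is mine, and the running sum is taken over the elements already picked, matching the condition in the question:

def greedy_length(arr):
    total = 0                  # sum of the elements picked so far
    length = 0
    for x in sorted(arr):
        if x >= total:         # "not less than" the sum of everything before it
            total += x
            length += 1
    return length

print(greedy_length([2, 5, 7, 8, 10]))   # prints 3, matching the example above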
There's an O(N^2) dynamic programming solution which goes like:
Let f(i, j) be the smallest sum that a "correct" subsequence ending in one of the first i elements and consisting of exactly j elements can have.
The base case is f(0, 0) = 0 (empty prefix, no elements)
The transitions are f(i, j) -> f(i + 1, j) (not adding a new element) and
f(i, j) -> f(i + 1, j + 1) if a[i] >= f(i, j) (adding the i-th element to the end of the subsequence if we can).
The correctness of this solution is self-evident.
A cool fact: let A be a "correct" subsequence of k elements. Then the last element of A is not less than max(1, 2^(k-2)). (Proof: it's the case for k = 1 and k = 2; now use induction and the fact that 1 + sum(i = 0 to k) 2^i = 2^(k+1).)
Thus, j ranges over 0..log MAX_A + C in the dynamic programming solution described above, so it works in O(N * log MAX_A).
O(N * log MAX_A) is not O(N log N), but this solution can be good for practical purposes.
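A minimal sketch in Python of the bounded DP described above, with f(i, j) rolled into a one-dimensional array indexed by the length j. It assumes the elements are positive integers, which is what the 2^(k-2) bound relies on; the names are mine:

def longest_valid_subsequence_length(a):
    if not a:
        return 0
    max_len = max(a).bit_length() + 2        # j never exceeds about log2(MAX_A) + C
    INF = float('inf')
    # best[j] = smallest sum of a "correct" subsequence of exactly j elements seen so far
    best = [INF] * (max_len + 1)
    best[0] = 0
    answer = 0
    for x in a:                              # scan the array left to right
        for j in range(max_len - 1, -1, -1): # go downwards so x is used at most once
            if best[j] <= x and best[j] + x < best[j + 1]:
                best[j + 1] = best[j] + x
                answer = max(answer, j + 1)
    return answer

print(longest_valid_subsequence_length([2, 5, 7, 8, 10]))   # prints 3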
Consider the task of finding the top-k elements in a set of N independent and identically distributed floating point values. By using a priority queue / heap, we can iterate once over all N elements and maintain a top-k set by the following operations:
if the element x is "worse" than the heap's head: discard x ⇒ complexity O(1)
if the element x is "better" than the heap's head: remove the head and insert x ⇒ complexity O(log k)
The worst case time complexity of this approach is obviously O(N log k), but what about the average time complexity? Due to the iid-assumption, the probability of the O(1) operation increases over time, and we rarely have to perform the costly O(log k), especially for k << N.
Is this average time complexity documented in any citable reference? What's the average time complexity? If you have a citeable reference for your answer please include it.
Consider the i'th largest element, and a particular permutation. It'll be inserted into the k-sized heap if it appears before no more than k - 1 of the (i - 1) larger elements in the permutation.
The probability of that heap-insertion happening is 1 if i <= k, and k/i if i > k.
From this, you can compute the expectation of the number of heap adjustments, using linearity of expectation. It's sum(i = 1 to k) 1 + sum(i = k+1 to n) k/i = k + sum(i = k+1 to n) k/i = k * (1 + H(n) - H(k)), where H(n) is the n'th harmonic number.
This is approximately k log(n) (for k << n), and you can compute your average cost from there.
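A small simulation in Python of this estimate; the values of n, k, and the trial count are arbitrary choices of mine. It counts heap insertions, including the first k, and compares them against k * (1 + H(n) - H(k)) ≈ k * (1 + ln(n/k)):

import heapq
import random
from math import log

def count_heap_updates(values, k):
    heap = []                      # min-heap holding the current top-k
    updates = 0
    for x in values:
        if len(heap) < k:
            heapq.heappush(heap, x)
            updates += 1
        elif x > heap[0]:          # better than the worst element of the top-k
            heapq.heapreplace(heap, x)
            updates += 1
    return updates

n, k, trials = 10000, 20, 200
avg = sum(count_heap_updates([random.random() for _ in range(n)], k)
          for _ in range(trials)) / trials
print(avg, k * (1 + log(n / k)))   # the two numbers should be close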