Big O for exponential complexity specific case - algorithm

Let's take an algorithm that finds all paths between two nodes in a directed, acyclic, unweighted graph that may contain more than one edge between the same two vertices. (This DAG is just an example; I'm not discussing this specific case, so please disregard its correctness, though I believe it is correct.)
We have two factors affecting the complexity:
mc: the maximum number of outgoing edges from a vertex.
ml: the length of the longest path, measured in number of edges.
The problem is solved in an iterative fashion, where "complexity" below stands for the count of processing operations done:
for the first iteration, the complexity = mc
for the second iteration, the complexity = mc*mc
for the third iteration, the complexity = mc*mc*mc
...
for the ml-th (longest-path) iteration, the complexity = mc^ml
The total worst-case complexity is (mc + mc*mc + ... + mc^ml).
1- Can we say it's O(mc^ml)?
2- Is this exponential complexity? As far as I know, in exponential complexity the variable appears only in the exponent, not in the base.
3- Are mc and ml both variables in my algorithm's complexity?
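For concreteness, here is a minimal Java sketch of the kind of iterative enumeration I have in mind (the adjacency-list representation and all identifiers are illustrative assumptions, not the actual code):

import java.util.*;

class PathEnumeration {
    // graph.get(u) lists the endpoints of u's outgoing edges; parallel edges
    // simply appear more than once in the list.
    static List<List<Integer>> allPaths(Map<Integer, List<Integer>> graph,
                                        int source, int target) {
        List<List<Integer>> finished = new ArrayList<>();
        Deque<List<Integer>> frontier = new ArrayDeque<>();
        List<Integer> start = new ArrayList<>();
        start.add(source);
        frontier.add(start);
        // Round k of extensions creates at most mc^k partial paths, so the total
        // work over ml rounds is at most mc + mc*mc + ... + mc^ml.
        while (!frontier.isEmpty()) {
            List<Integer> path = frontier.poll();
            int last = path.get(path.size() - 1);
            if (last == target) {          // complete path found
                finished.add(path);
                continue;
            }
            // extend the partial path by every outgoing edge of its last vertex
            for (int next : graph.getOrDefault(last, Collections.<Integer>emptyList())) {
                List<Integer> extended = new ArrayList<>(path);
                extended.add(next);
                frontier.add(extended);
            }
        }
        return finished;
    }
}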

There's a faster way to get the answer in O(V + E), but it seems like your question is about calculating complexity, not about optimizing the algorithm.
Yes, it seems like it's O(mc^ml): for mc >= 2, the geometric sum mc + mc*mc + ... + mc^ml is at most mc^ml * mc/(mc - 1), i.e. within a constant factor of the last term.
Yes, they both can be variables in your algorithm's complexity.
As for the complexity of your algorithm, let's do some transformation, using the fact that a^b = e^(b*ln(a)):
mc^ml = (e^ln(mc))^ml = e^(ml*ln(mc)) <= e^(ml*mc), since ln(mc) <= mc.
So, basically, an upper bound on your algorithm's complexity is O(e^(ml*mc)), but we can rewrite it a bit further to see whether it is really an exponential complexity. Assume that ml, mc <= N, where N is, let's say, max(ml, mc). Then:
e^(ml*mc) <= e^(N^2) = C^(N^2), where C = e is a constant.
So your algorithm's complexity is bounded by a constant base raised to a polynomial in N, where N grows no faster than linearly in your parameters. A constant base with the variable in the exponent is exactly what exponential complexity means, so basically - yes, it is exponential complexity.

Related

Big O Notation For An Algorithm That Contains A Shrinking List Inside While Loop

I was trying to estimate the worst-case scenario for an algorithm that looks like this (the estimated complexities in the comments are mine, where V is the number of vertices and E is the number of edges in a graph):
while (nodes.size() != 0) { // O(V) in which nodes is a LinkedList
    Vertex u = Collections.min(nodes); // O(V)
    nodes.remove(u); // O(1)
    for (Map.Entry<Vertex, Integer> adjacency : u.adj.entrySet()) { // O(E)
        Vertex v = adjacency.getKey(); // Some O(1) Statement
        if (nodes.contains(v)) { // O(V)
            // Some O(1) Statement
        }
    }
}
My question is very straightforward:
After every round of the while loop, the nodes LinkedList gets smaller and smaller.
As a result, both the Collections.min() and nodes.contains() operations will take less time every round.
My understanding is that Big O notation always considers the worst case, so the above complexities should be correct.
Otherwise, would you please explain how to figure out the correct complexity in the above scenario?
nodes.contains has worst-case time complexity in Θ(V), and the for-loop runs a number of times in Θ(E), so the for-loop has worst-case time complexity in Θ(V*E). Collections.min has worst-case time complexity in Θ(V), so the body of the while loop has worst-case time complexity in Θ(V + V*E); but V + V*E is itself Θ(V*E) (see below), so the body of the while loop has worst-case time complexity in Θ(V*E). The while loop executes V times. So the worst-case time to execute the algorithm is in Θ(V^2*E).
The simplification there -- replacing Θ(V + V*E) with Θ(V*E) -- is acceptable because we are looking at the general case of E >= 1. That is, V*E will always be at least as big as V, so we can absorb the V term into a bounded constant factor. It would also be correct to say the worst-case time is in Θ(V^2 + E*V^2), but one wouldn't use that since the simplified form is more useful.
Incidentally, as a matter of intuition, you can generally ignore the effect of containers being "used up" during an algorithm, such as with insertion sort having fewer and fewer items to look through, or this algorithm having fewer and fewer nodes to scan. Those effects turn into constant factors and disappear. It's only when you're eliminating an interesting number of elements each time, such as with the quickselect algorithm or binary search, that that sort of thing starts to affect the asymptotic runtime.
You can take the largest possible value at each step, but this can give a final value that is too much of an overestimate. To make sure the value is accurate, you can leave taking the upper bound until the end, though it often ends up being the same anyway.
Since the number of remaining nodes changes, introduce another variable v for its value in one specific iteration. The complexity of that iteration is then v + (E*v), and the total complexity is the sum over all iterations:
sum(v = 1...V) v+(E*v)
= 1+1E + 2+2E + 3+3E + ... + V+VE - Expand the sum
= (1 + 2 + 3 + ... + V) + (1 + 2 + 3 + ... + V)E - Regroup terms
= (V^2 + V)/2 + (V^2 + V)E/2 - Sum of arithmetic series
= (V^2 + V + EV^2 + EV)/2 - Addition of fractions
= O(EV^2) - EV^2 greater than all other terms
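As a quick sanity check on this algebra (a small sketch of my own, with arbitrary example values for V and E), you can compare the exact sum with the closed form and the dominant E*V^2 term:

public class SumCheck {
    public static void main(String[] args) {
        long V = 1000, E = 5000;               // arbitrary example sizes
        long exact = 0;
        for (long v = 1; v <= V; v++) {
            exact += v + E * v;                // cost of the iteration with v nodes left
        }
        long closedForm = (V * V + V) / 2 + E * (V * V + V) / 2;
        System.out.println("exact sum   = " + exact);
        System.out.println("closed form = " + closedForm);   // matches the sum
        System.out.println("E*V^2       = " + E * V * V);    // the dominant term
    }
}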
Yes, these look correct. And putting them together you will get time O(V*(V+E)). (Correction, O((1+E)*V^2) - I had missed the O(V) inside of an O(E) inner loop.)
However, there is an important correction to your understanding: Big O notation is not always about the worst case. The notation is a way of estimating the growth of mathematical functions; whether those functions describe the worst case, the average case, or something else entirely is up to the problem at hand. For example, quicksort can be implemented with O(n^2) worst-case running time and O(n*log(n)) average running time, using O(log(n)) extra memory on average and O(n) extra memory in the worst case.

How do I prove that this algorithm is O(loglogn)

How do I prove that the following algorithm is O(log log n)?
i <-- 2
while i < n
    i <-- i*i
Well, I believe we should first start with n / 2^k < 1, but that will yield O(log n). Any ideas?
I want to look at this in a simple way: what happens after one iteration, after two iterations, and after k iterations? I think this way I'll be able to understand better how to compute this correctly. What do you think about this approach? I'm new to this, so excuse me.
Let us use the name A for the presented algorithm. Let us further assume that the input variable is n.
Then, strictly speaking, A is not in the runtime complexity class O(log log n). A must be in Omega(n), i.e., in terms of runtime complexity it is at least linear. Why? There is i*i, a multiplication that depends on i, which depends on n. A naive multiplication approach might require quadratic runtime complexity. More sophisticated approaches will reduce the exponent, but not below linear in terms of n.
For the sake of completeness, the comparison < is also a linear operation.
For the purpose of the question, we could assume that multiplication and comparison are done in constant time. Then we can formulate the question: how often do we have to apply the constant-time operations < and * until A terminates for a given n?
Simply speaking, the squaring reduces the effort logarithmically, and its iterated application leads to a further logarithmic reduction. How can we show this? Thanks to the simple structure of A, we can transform A into an equation that we can solve directly.
A squares i and does this repeatedly, so after k iterations i = 2^(2^k). When is 2^(2^k) = n? To solve this for k, we apply the logarithm (base 2) twice and, ignoring the bases, get k = log log n. The < (rather than =) can be ignored due to the O notation.
To answer the very last part of the original question, we can also look at examples for each iteration. We can note the state of i at the end of the while loop body for each iteration of the while loop:
1: i = 4 = 2^2 = 2^(2^1)
2: i = 16 = 4*4 = (2^2)*(2^2) = 2^(2^2)
3: i = 256 = 16*16 = (4*4)*(4*4) = (2^2)*(2^2)*(2^2)*(2^2) = 2^(2^3)
4: i = 65536 = 256*256 = 16*16*16*16 = ... = 2^(2^4)
...
k: i = ... = 2^(2^k)
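If you want to see this empirically, here is a small self-contained sketch (my own illustration, not part of the proof) that counts how many times the loop body runs and compares it with log2(log2 n); BigInteger is used only so the repeated squaring cannot overflow:

import java.math.BigInteger;

public class DoubleLogDemo {
    // Count how many times the loop body "i <-- i*i" runs for a given n.
    static int iterations(long n) {
        int count = 0;
        BigInteger i = BigInteger.valueOf(2);
        BigInteger limit = BigInteger.valueOf(n);
        while (i.compareTo(limit) < 0) {
            i = i.multiply(i);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (long n : new long[]{10L, 1000L, 1_000_000L, 1_000_000_000_000L}) {
            double logLog = Math.log(Math.log(n) / Math.log(2)) / Math.log(2);
            System.out.printf("n = %d: iterations = %d, log2(log2 n) = %.2f%n",
                    n, iterations(n), logLog);
        }
    }
}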

Big O and Big Omega Notation Algorithms

There is a comparison-based sorting algorithm that runs in O(n*log(sqrt(n))).
Given the existence of an Omega(n*log(n)) lower bound for sorting, how can this be possible?
Basically, this problem is asking you to show that O(n*log(n)) and O(n*log(√n)) are the same class, i.e., that n*log(√n) differs from n*log(n) by only a constant factor. Remember that √n = n^(1/2) and that log(n^(1/2)) = (1/2)*log(n). So n*log(√n) = (1/2)*n*log(n). Since asymptotic notation ignores constant multipliers, O(n*log(√n)) = O(n*log(n)). Voila, proof positive that it is possible.
For a comparison-based sorting algorithm you can draw a decision tree. It is a binary tree representing the comparisons done by the algorithm, and every leaf of this tree is a permutation of the elements from the given set.
There are n! possible permutations, where n is the size of the set, and only one of them represents the sorted set. The path leading to each leaf represents the comparisons necessary to produce the permutation represented by that leaf.
Now let h be the height of our decision tree and l the number of leaves. Every possible permutation of the input set must appear in one of the leaves, so n! <= l. A binary tree of height h can have at most 2^h leaves, therefore n! <= l <= 2^h. Applying a logarithm to both sides gives h >= log(n!), and log(n!) is Omega(n*log(n)).
Because the height of the decision tree represents the number of comparisons necessary to reach a leaf, this proves that the lower bound for comparison-based sorting algorithms is n*log(n); it can't be done faster. So the only way the task statement can be correct is if Omega(n*log(n)) is also Omega(n*log(sqrt(n))). Indeed, log(sqrt(n)) = log(n^(1/2)) = (1/2)*log(n), so n*log(sqrt(n)) = n*((1/2)*log(n)) = (1/2)*n*log(n). Ignoring the constant 1/2 (as we are interested in asymptotic complexity), you get that n*log(sqrt(n)) and n*log(n) are the same in terms of complexity.
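If the fact that log(n!) is Omega(n*log(n)) feels non-obvious, here is a small numerical illustration (my own sketch, not a proof): it computes log2(n!) as a sum of logarithms and compares it with n*log2(n); the ratio stays bounded below by a constant.

public class LogFactorial {
    public static void main(String[] args) {
        for (int n : new int[]{10, 100, 1000, 10_000}) {
            double logFactorial = 0.0;              // log2(n!) = log2(2) + ... + log2(n)
            for (int i = 2; i <= n; i++) {
                logFactorial += Math.log(i) / Math.log(2);
            }
            double nLogN = n * (Math.log(n) / Math.log(2));
            System.out.printf("n=%d  log2(n!)=%.1f  n*log2(n)=%.1f  ratio=%.2f%n",
                    n, logFactorial, nLogN, logFactorial / nLogN);
        }
    }
}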

Maximum non-overlapping intervals in an interval tree

Given a list of time intervals, I need to find a maximum-size set of non-overlapping intervals.
For example,
if we have the following intervals:
[0600, 0830], [0800, 0900], [0900, 1100], [0900, 1130],
[1030, 1400], [1230, 1400]
It is also given that the times have to be in the range [0000, 2400].
The maximum non-overlapping set of intervals is [0600, 0830], [0900, 1130], [1230, 1400].
I understand that maximum set packing is NP-complete. I want to confirm whether my problem (with intervals containing only a start and end time) is also NP-complete.
And if so, is there a way to find an optimal solution in exponential time, but with smarter preprocessing and pruning of the data? Or is there a relatively easy-to-implement fixed-parameter tractable algorithm? I don't want to go for an approximation algorithm.
This is not an NP-complete problem. I can think of an O(n * log(n)) algorithm using dynamic programming to solve it.
Suppose we have n intervals. Suppose the given range is S (in your case, S = [0000, 2400]). Either suppose all intervals are within S, or eliminate all intervals not within S in linear time.
Pre-process:
Sort all intervals by their begin points. Suppose we get an array A[n] of n intervals.
This step takes O(n * log(n)) time
For the end point of each interval, find the index of the first interval (in sorted order) whose begin point is not earlier than that end point. Suppose we get an array Next[n] of n integers.
If no such begin point exists for the end point of interval i, we assign n to Next[i].
We can do this in O(n * log(n)) time by enumerating the n end points and using a binary search over the sorted begin points. Maybe a linear approach exists, but it doesn't matter, because the previous step already takes O(n * log(n)) time.
DP:
Let f[i] be the maximum number of non-overlapping intervals we can choose in the range [A[i].begin, S.end], i.e., from A[i], ..., A[n-1]. Then f[0] is the answer we want.
Also define f[n] = 0.
State transition equation:
f[i] = max{f[i+1], 1 + f[Next[i]]}
It is quite obvious that the DP step takes linear time.
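Here is a minimal Java sketch of this DP (my own code; a plain int[][] of {begin, end} pairs stands in for the interval type, and the helper names are illustrative):

import java.util.*;

class IntervalDP {
    // intervals[i] = {begin, end}; returns the maximum number of non-overlapping intervals.
    static int maxNonOverlapping(int[][] intervals) {
        int n = intervals.length;
        int[][] a = intervals.clone();
        Arrays.sort(a, (x, y) -> Integer.compare(x[0], y[0]));   // A[n]: sorted by begin point

        // next[i] = index of the first interval whose begin point is >= a[i]'s end point
        int[] next = new int[n];
        for (int i = 0; i < n; i++) {
            int lo = 0, hi = n;                                   // binary search over begin points
            while (lo < hi) {
                int mid = (lo + hi) / 2;
                if (a[mid][0] >= a[i][1]) hi = mid; else lo = mid + 1;
            }
            next[i] = lo;                                         // equals n if no such interval exists
        }

        int[] f = new int[n + 1];                                 // f[n] = 0
        for (int i = n - 1; i >= 0; i--) {
            f[i] = Math.max(f[i + 1], 1 + f[next[i]]);            // skip a[i] or take it
        }
        return f[0];
    }
}

For the intervals from the question (written as plain integers such as 600 for 0600), this returns 3.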
The above solution is the one I came up with at first glance. After that, I also thought of a greedy approach, which is simpler (but not faster in the sense of big O notation, since the sort still dominates):
(With the same notation and assumptions as the DP approach above)
Pre-process: Sort all intervals by their end points. Suppose we get an array B[n] of n intervals.
Greedy:
int ans = 0, cursor = S.begin;
for (int i = 0; i < n; i++) {
    if (B[i].begin >= cursor) {
        ans++;
        cursor = B[i].end;
    }
}
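Here is a runnable version of that greedy sketch (my own code), using the intervals from the question with times written as plain integers; note that it picks [900, 1100] instead of [900, 1130], a different but equally large set:

import java.util.*;

public class GreedyIntervals {
    public static void main(String[] args) {
        int[][] intervals = {{600, 830}, {800, 900}, {900, 1100},
                             {900, 1130}, {1030, 1400}, {1230, 1400}};
        Arrays.sort(intervals, (x, y) -> Integer.compare(x[1], y[1])); // sort by end point

        int ans = 0, cursor = 0;                    // S.begin == 0000
        List<int[]> chosen = new ArrayList<>();
        for (int[] interval : intervals) {
            if (interval[0] >= cursor) {            // starts after the last chosen interval ends
                ans++;
                cursor = interval[1];
                chosen.add(interval);
            }
        }
        System.out.println("maximum size = " + ans); // prints 3
        for (int[] interval : chosen) {
            System.out.println(Arrays.toString(interval));
        }
    }
}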
The above two solutions came out of my own head, but your problem is also referred to as the activity selection problem, which can be found on Wikipedia: http://en.wikipedia.org/wiki/Activity_selection_problem.
Also, Introduction to Algorithms discusses this problem in depth in Section 16.1.

Finding time complexity of partition by quicksort method

Here is an algorithm for finding the kth smallest number in an n-element array, using the partition routine from quicksort.
small(a, i, j, k)
{
    if (i == j) return a[i];
    else
    {
        m = partition(a, i, j);
        if (m == k) return a[m];
        else
        {
            if (m > k) return small(a, i, m-1, k);
            else return small(a, m+1, j, k);
        }
    }
}
where i and j are the starting and ending indices of the array (j - i = n, the number of elements in the array), and k is the kth smallest number to be found.
I want to know the best case and average case of the above algorithm, and briefly how they are derived. I know we should not count the termination condition in the best case, and also that the partition algorithm takes O(n). I do not want asymptotic notation but an exact mathematical result if possible.
First of all, I'm assuming the array is sorted - something you didn't mention - because that code wouldn't otherwise work. And, well, this looks to me like a regular binary search.
Anyway...
The best-case scenario is when either the array is one element long (you return immediately because i == j), or, for large values of n, the middle position m is the same as k; in that case, no recursive calls are made and it returns immediately as well. That makes it O(1) in the best case.
For the general case, consider that T(n) denotes the time taken to solve a problem of size n using your algorithm. We know that:
T(1) = c
T(n) = T(n/2) + c
where c is the cost of a constant-time operation (for example, the time to compare whether i is the same as j, etc.). The general idea is that to solve a problem of size n, we consume some constant time c (to decide whether m == k, whether m > k, to calculate m, etc.), and then we consume the time taken to solve a problem of half the size.
Expanding the recurrence can help you derive a general formula, although it is pretty intuitive that this is O(log(n)):
T(n) = T(n/2) + c = T(n/4) + c + c = T(n/8) + c + c + c = ... = T(1) + c*log(n) = c*(log(n) + 1)
That should be the exact mathematical result. The algorithm runs in O(log(n)) time. An average-case analysis is harder because you need to know the conditions in which the algorithm will be used. What is the typical size of the array? The typical value of k? What is the most likely position for k in the array? If it's in the middle, for example, the average case may be O(1). It really depends on how you use this.
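As a quick sanity check on the c*(log(n) + 1) count (a small sketch of my own, not part of the answer), you can count the '+ c' terms produced by repeatedly halving n and compare the result with log2(n) + 1:

public class HalvingRecurrence {
    // Count the "+ c" terms when repeatedly halving n down to 1,
    // i.e. the depth of the T(n) = T(n/2) + c recurrence.
    static int steps(int n) {
        int count = 1;            // the T(1) = c base case
        while (n > 1) {
            n /= 2;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1, 16, 1024, 1 << 20}) {
            System.out.printf("n=%d  steps=%d  log2(n)+1=%.0f%n",
                    n, steps(n), Math.log(n) / Math.log(2) + 1);
        }
    }
}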

Resources