I'm searching for the Big-O complexity of the PageRank algorithm.
I could hardly find anything. I only found O(n+m) (n is the number of nodes, m the number of arcs/edges), but I don't believe this complexity yet.
I think it is missing the convergence criterion. I don't think the number of iterations is a constant; I think convergence depends on the graph diameter. It might be enough to have the Big-O for one iteration, in which case convergence is not important.
Nevertheless, PageRank needs to touch every node and aggregate every incoming rank, so I expected a runtime of O(n * m).
Did I miss something? Does anyone know a reliable source for the Big-O complexity of PageRank?
Thanks in advance.
After some research and further thinking, I have come to the conclusion that O(n+m) is the real thing.
Even in a complete graph, one has to touch each edge only twice. One does not touch every edge for every node; that was the mistake in my thinking. Therefore one has to touch every node, which is n operations, and each edge twice, which is O(m).
So the correct answer is O(n+m) per iteration.
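To make the per-iteration count concrete, here is a minimal sketch of one PageRank power-iteration step. The adjacency-list layout (`out_edges`), the function name, and the damping factor of 0.85 are assumptions for illustration, not something taken from the question. Each node is visited once and each arc is followed once, which gives O(n+m) for a single iteration; how many iterations are needed to converge is a separate question.

```python
def pagerank_iteration(out_edges, rank, damping=0.85):
    """One power-iteration step; out_edges[u] lists the targets of u's arcs."""
    n = len(out_edges)
    new_rank = [(1.0 - damping) / n] * n       # O(n)
    dangling = 0.0
    for u, targets in enumerate(out_edges):    # every node once: O(n)
        if not targets:
            dangling += rank[u]                # node without outgoing arcs
            continue
        share = damping * rank[u] / len(targets)
        for v in targets:                      # every arc once: O(m) in total
            new_rank[v] += share
    extra = damping * dangling / n             # spread dangling mass: O(n)
    return [r + extra for r in new_rank]


graph = [[1, 2], [2], [0]]                     # hypothetical 3-node graph
ranks = pagerank_iteration(graph, [1 / 3] * 3)
```

The total rank stays at 1 after the step, because the teleport term contributes `1 - damping` and the arcs plus the dangling mass contribute `damping`.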
I'm watching the Coursera lectures on algorithms, and the professor says that the naive implementation of Dijkstra's shortest path algorithm, without using heaps, takes O(nm) time (n is the number of vertices, and m is the number of edges).
He claims that the main loop goes through the remaining vertices besides the source, which are n-1 vertices. This I can understand. But inside the loop, the algorithm goes through the edges with their tail in the processed vertices and their head in the unprocessed vertices, to pick the next shortest path. Why does that mean there are m edges to go through? Even in a naive implementation, we can just go through the edges that satisfy the criterion (tail in the processed vertices and head in the unprocessed vertices), right?
Could anyone please help me understand this? Thanks.
When you consider big-O time complexity, you should think of it as a form of upper bound. That is, if some input can make your program run in O(N) while some other input can make it run in O(NM), we generally say that this is an O(NM) program (especially when dealing with algorithms). This is called worst-case analysis.
There are rare cases where we don't consider the worst case (amortized time complexity analysis, or algorithms with a random element such as quicksort). In the case of quicksort, for example, the worst-case time complexity is O(N^2), but the chance of that happening is so small that in practice, if we select a random pivot, we can expect O(N log N) time complexity. That being said, unless you can guarantee (or have high confidence) that your program runs faster than O(NM), it is an O(NM) program.
Let's consider the case of your naive implementation of Dijkstra's algorithm. You didn't specify whether you were considering directed or undirected graphs, so I'll assume that the graph is undirected (the case for a directed graph is extremely similar).
We're going through all the nodes besides the first node. This means we already have an O(N) loop.
In each iteration of the loop, we're considering all the edges that lead from a processed node to an unprocessed node.
However, in the worst case, there are on the order of M of these in each iteration of the loop; not O(1) or anything less than O(M).
There's an easy way to see this. If each edge were visited only a single time, then we could say that the algorithm runs in O(M). However, in your naive implementation of Dijkstra's algorithm, the same edge can be considered multiple times. In fact, asymptotically speaking, O(N) times.
You can try creating a graph yourself, then dry-running Dijkstra's algorithm on paper. You'll notice that each edge can be considered up to N times: once in each iteration of the O(N) loop.
In other words, the reason we can't say the program is faster than O(NM) is that there is no guarantee that each edge is processed fewer than N times, say at most O(log N) or O(sqrt N) times. Therefore, the best upper bound we can give on the number of times an edge is processed is N, although in practice it may be smaller than N by a constant factor. However, we do not consider constant factors in big-O time complexity.
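The dry run suggested above can also be done in code. The following is a minimal sketch of one possible naive implementation, assuming a directed graph given as a plain edge list; the function name and the `scans` counter are hypothetical and only there to make the repeated edge inspections visible. To find the crossing edges, this version looks at every edge in every round.

```python
def naive_dijkstra(n, edges, source=0):
    """edges: list of (tail, head, weight) with non-negative weights."""
    dist = {source: 0}                     # processed vertices and their distances
    scans = 0                              # how many times an edge was inspected
    for _ in range(n - 1):                 # O(N) rounds
        best = None
        for tail, head, weight in edges:   # full O(M) scan in every round
            scans += 1
            if tail in dist and head not in dist:
                candidate = dist[tail] + weight
                if best is None or candidate < best[0]:
                    best = (candidate, head)
        if best is None:
            break                          # remaining vertices are unreachable
        dist[best[1]] = best[0]
    return dist, scans


edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 3, 10)]
dist, scans = naive_dijkstra(4, edges)
```

On this small example the edge (0, 3, 10) qualifies as a crossing edge in all three rounds, so it is examined again and again without ever being chosen.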
However, your thought process may lead to a better implementation of Dijkstra's algorithm. When considering the algorithm, you may realize that instead of considering every single edge that goes from processed to unprocessed vertices in every iteration of the main loop, we only have to consider the edges adjacent to the node we are currently on (I'm talking about something like this or this).
By implementing it like so, you can achieve a complexity of O(N^2).
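A minimal sketch of that O(N^2) variant follows, assuming an adjacency-list input (`adj`) and non-negative weights; the names are illustrative and not taken from the linked examples. Each round spends O(N) finding the closest unprocessed vertex and then relaxes only the edges leaving it, so every edge is looked at once over the whole run.

```python
import math


def dijkstra_dense(adj, source=0):
    """adj[u] is a list of (v, weight) pairs; weights must be non-negative."""
    n = len(adj)
    dist = [math.inf] * n
    dist[source] = 0
    done = [False] * n
    for _ in range(n):
        # O(N) scan for the closest vertex that is not processed yet
        u = -1
        for v in range(n):
            if not done[v] and (u == -1 or dist[v] < dist[u]):
                u = v
        if u == -1 or dist[u] == math.inf:
            break                          # the rest is unreachable
        done[u] = True
        for v, weight in adj[u]:           # only edges leaving u
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
    return dist


adj = [[(1, 1), (3, 10)], [(2, 1)], [(3, 1)], []]
distances = dijkstra_dense(adj)
```

The total work is O(N^2 + M), which is O(N^2) for a graph without parallel edges.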
First, a time complexity analysis is given here:
Door in an infinite wall algorithm
My question is: we can describe the exact same algorithm differently and get O(n^2). Why?
If n=2^k, then in the worst case we would have to walk 2^(2k+1) steps by moving in exactly the same way as in the above algorithm. After some algebra that becomes 8*(2^(k-1))^2, which is less than 8*(n^2). Therefore O(n^2).
How can the same algorithm have two different time complexities?
Your error is in this claim:
...we would have to walk 2^(2k+1) steps
This is not true. It seems you wrongly copied this formula:
2 x 2^(k-1)
Nowhere in the referenced question or answer does k occur with a coefficient different from 1, and there is no reason to introduce a factor of 2 to turn it into 2k.
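The step count can be checked with a small simulation. The referenced algorithm is not reproduced in this thread, so the sketch below assumes the usual doubling strategy: walk 1 step one way and return, 2 steps the other way and return, then 4, 8, and so on. The function name and the integer position of the door are assumptions for illustration. Each sweep costs 2 x (its reach), so the sweeps add up to a geometric sum that is linear in the distance to the door, not quadratic.

```python
def steps_to_find_door(door):
    """door: nonzero integer position of the door; the walker starts at 0."""
    steps = 0
    reach = 1
    direction = 1
    while True:
        # Is the door on this side and within reach of the current sweep?
        if door * direction > 0 and abs(door) <= reach:
            return steps + abs(door)
        steps += 2 * reach        # walk out and come back to the start
        reach *= 2
        direction = -direction


worst_ratio = max(steps_to_find_door(d) / abs(d)
                  for d in range(-1000, 1001) if d != 0)
```

Under these assumptions the ratio of steps walked to the door's distance stays below a constant, which is what O(n) means here.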
I would like to quote from Wikipedia:
In mathematics, the minimum k-cut, is a combinatorial optimization
problem that requires finding a set of edges whose removal would
partition the graph to k connected components.
It is said to be the minimum cut if the set of edges is minimal.
For k = 2, it would mean finding the set of edges whose removal would disconnect the graph into 2 connected components.
However, the same Wikipedia article says that:
For a fixed k, the problem is polynomial time solvable in O(|V|^(k^2))
My question is: does this mean that minimum 2-cut is a problem that belongs to complexity class P?
The min-cut problem is solvable in polynomial time, so yes, it belongs to complexity class P. Another article related to this particular problem is the Max-flow min-cut theorem.
First of all, the time complexity of an algorithm should be evaluated by expressing the number of steps the algorithm requires to finish as a function of the length of the input (see Time complexity). More or less formally: if you vary the length of the input, how does the number of steps required by the algorithm vary?
Second of all, the time complexity of an algorithm is not the same thing as the complexity class of the problem that the algorithm solves. For one problem there can be multiple algorithms to solve it. The primality test problem (i.e. testing whether a number is prime or not) is in P, but some (most) of the algorithms used in practice are actually not polynomial.
Third of all, for most algorithms you'll find on the Internet, the time complexity is not evaluated by definition (i.e. not as a function of the length of the input, at least not expressed directly as such). Let's take the good old naive primality test algorithm (the one in which you take n as input and check for divisibility by 2, 3, ..., n-1). How many steps does this algorithm take? One way to put it is O(n) steps. This is correct. So is this algorithm polynomial? Well, it is linear in n, so it is polynomial in n. But if you take a look at what time complexity means, the algorithm is actually exponential. First, what is the length of the input to your problem? Well, if you provide the input n as an array of bits (the usual in practice), then the length of the input is, roughly speaking, L = log n. Your algorithm thus takes O(n) = O(2^(log n)) = O(2^L) steps, so it is exponential in L. So the naive primality test is at the same time linear in n and exponential in the length of the input L. Both are correct. By the way, the AKS primality test algorithm is polynomial in the size of the input (thus, the primality test problem is in P).
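The gap between "linear in n" and "exponential in L" shows up when you count the divisions. Here is a minimal sketch of the naive test described above; the function name and the division counter are illustrative additions. The sample inputs are primes just below 2^4, 2^8, 2^12 and 2^16, so the loop runs to completion and performs n - 2 divisions.

```python
def naive_is_prime(n):
    """Return (is_prime, divisions) using trial division by 2 .. n-1."""
    divisions = 0
    if n < 2:
        return False, divisions
    for d in range(2, n):
        divisions += 1
        if n % d == 0:
            return False, divisions
    return True, divisions


for n in (13, 251, 4093, 65521):
    is_prime, divisions = naive_is_prime(n)
    # n.bit_length() is the input length L; divisions grows roughly like 2^L
    print(n.bit_length(), divisions)
```

Adding 4 bits to the input multiplies the number of divisions by roughly 16, which is the exponential growth in L that the paragraph above describes.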
Fourth of all, what is P in the first place? Well, it is a class of problems that contains all decision problems that can be solved in polynomial time. What is a decision problem? A problem that can be answered with yes or no. Check these two Wikipedia pages for more details: P (complexity) and decision problems.
Coming back to your question, the answer is no (but pretty close to yes :p). The minimum 2-cut problem is in P if formulated as a decision problem (your formulation requires an answer that is not just a yes or no). At the same time, the algorithm that solves the problem in O(|V|^4) steps is polynomial in the size of the input. Why? Well, the input to the problem is the graph (i.e. vertices, edges and weights). To keep it simple, let's assume we use an adjacency/weight matrix (i.e. the length of the input is at least quadratic in |V|). So solving the problem in O(|V|^4) steps means polynomial in the size of the input. The algorithm that accomplishes this is a proof that the minimum 2-cut problem (if formulated as a decision problem) is in P.
A class related to P is FP and your problem (as you formulated it) belongs to this class.
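The difference between the two formulations can be sketched in code. The example below is an illustration under stated assumptions, not the O(|V|^4) algorithm from the quoted article: it swaps in the Stoer-Wagner algorithm, which runs in O(|V|^3) on a symmetric weight matrix with non-negative weights. The function names are hypothetical. `min_cut_weight` answers the function problem (the kind that belongs to FP), and `has_cut_of_weight_at_most` is the yes-or-no decision version (the kind that belongs to P).

```python
def min_cut_weight(weights):
    """Weight of a minimum 2-cut of an undirected graph (symmetric matrix)."""
    w = [row[:] for row in weights]        # work on a copy
    active = list(range(len(w)))
    best = float("inf")
    while len(active) > 1:
        start = active[0]
        conn = {v: w[start][v] for v in active[1:]}
        s = t = start
        cut = 0
        while conn:                        # one "minimum cut phase"
            nxt = max(conn, key=conn.get)  # most tightly connected vertex
            cut = conn.pop(nxt)
            for v in conn:
                conn[v] += w[nxt][v]
            s, t = t, nxt
        best = min(best, cut)              # cut that separates t from the rest
        for v in active:                   # merge t into s
            if v != s and v != t:
                w[s][v] += w[t][v]
                w[v][s] = w[s][v]
        active.remove(t)
    return best


def has_cut_of_weight_at_most(weights, threshold):
    """Decision version: is there a 2-cut of total weight <= threshold?"""
    return min_cut_weight(weights) <= threshold


triangle = [[0, 1, 3],
            [1, 0, 2],
            [3, 2, 0]]
```

For the triangle, the cheapest way to split the graph is to isolate vertex 1, which cuts the edges of weight 1 and 2.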
I stumbled upon this question:
Given a binary search tree with 2^n-1 nodes, give an efficient algorithm to convert it to a self-balancing tree (like an AVL or red-black tree), and analyze its worst-case running time as a function of n.
Well, I think the most efficient algorithm runs in O(n) time for n nodes, but the 2^n-1 nodes is the tricky part. Any idea what the running time will be then?
Any help will be greatly appreciated.
If you've already got a linear-time algorithm for solving this problem, great! Think of it this way. Let m = 2^n - 1. If you have an algorithm that balances the tree and runs in time linear in the number of nodes, then your algorithm runs in time O(m) in this case, which is great. Don't let the exponential time scare you; if the runtime is O(2^n) on inputs of size 2^n - 1, then you're running efficiently.
As for particular algorithms, you seem to already know one, but if you haven't heard of it already, check out the Day-Stout-Warren algorithm, which optimally rebuilds a tree and does so in linear time and constant space.
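For comparison, here is a minimal sketch of a simpler linear-time approach. It is not Day-Stout-Warren: it uses O(m) extra space by collecting the nodes in sorted order and relinking them around the middle element. The `Node` class and function names are assumptions for illustration. The traversal is iterative so that a degenerate, chain-shaped input does not overflow the recursion stack.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def in_order(root):
    """Iterative in-order traversal; returns the nodes in sorted order."""
    nodes, stack, cur = [], [], root
    while stack or cur:
        while cur:
            stack.append(cur)
            cur = cur.left
        cur = stack.pop()
        nodes.append(cur)
        cur = cur.right
    return nodes


def build(nodes, lo, hi):
    """Relink nodes[lo:hi] into a balanced subtree and return its root."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    root = nodes[mid]
    root.left = build(nodes, lo, mid)
    root.right = build(nodes, mid + 1, hi)
    return root


def rebalance(root):
    nodes = in_order(root)                 # O(m)
    return build(nodes, 0, len(nodes))     # O(m), recursion depth O(log m)


def height(root):
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


chain = None
for key in reversed(range(7)):             # right-leaning chain of 2^3 - 1 nodes
    chain = Node(key, None, chain)
balanced = rebalance(chain)
```

With m = 2^n - 1 nodes, both passes touch each node a constant number of times, so the running time is O(m) = O(2^n), and the resulting tree has height n.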
Sorry for the stupid question. I cannot jog my memory, and googling did not help me answer it.
So basically, given a graph G(V,E), I know that O(|V|^2) or O(|E|^2 + |V|^2) is considered to be polynomial complexity, so is O(|E|*|V|) polynomial as well? If not, what kind of complexity is it? I believe it's not pseudo-polynomial either.
Another question is: is O(m*n) considered polynomial as well, given that m and n are the sizes of two INDEPENDENT inputs to a problem? I just want to clarify the concept of polynomial time here, and I want to know whether O(m*n) has a different name for its type of complexity.
It is polynomial: O(|E|*|V|) is O(|V|^3), since the number of edges is bounded by O(|V|^2).
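Written out as a short derivation, assuming a graph without parallel edges (so that there is at most one edge per ordered pair of vertices):

```latex
|E| \le |V|^2
\quad\Longrightarrow\quad
|E| \cdot |V| \le |V|^2 \cdot |V| = |V|^3
\quad\Longrightarrow\quad
O(|E| \cdot |V|) \subseteq O(|V|^3)
```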