Dijkstra's: where does the equation m < n^2 / log n come from?

In this passage from my textbook:
where are the inequalities from? (The ones that I've marked with red rectangles.) I feel that they describe a relationship between vertices and edges in a graph, but I don't understand it.

You have two implementations of Dijkstra’s algorithm to choose from. One runs in time O((m + n) log n) = O(m log n), assuming the graph is connected. The other runs in time O(n^2). The question is where the crossover point is between these two runtimes. Equating and simplifying gives that
m log n = n^2
m = n^2 / log n
So if m is asymptotically smaller than n^2 / log n, you’d prefer the heap implementation, and if m is asymptotically bigger than n^2 / log n you’d prefer the unsorted sequence approach.
(Note that, with a Fibonacci heap, the runtime of Dijkstra’s algorithm is O(m + n log n), which is never asymptotically worse than O(n^2).)
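As a rough illustration (not from the textbook), the crossover rule could be written as a one-line check; the cutoff is purely asymptotic, and in practice the hidden constant factors of the two implementations shift it:

    import math

    def prefer_heap_dijkstra(n, m):
        # Heap-based Dijkstra runs in O(m log n); the unsorted-sequence version in O(n^2).
        # Asymptotically, prefer the heap whenever m log n < n^2, i.e. m < n^2 / log n.
        return m * math.log2(n) < n * n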

Related

O(n log n) vs O(m) for algorithm

I am looking for an algorithm for a problem where I have two sets of points, A and B, with n and m points respectively. I have two algorithms for the sets, with complexities O(n log n) and O(m), and I am now wondering whether the complexity of both algorithms combined is O(n log n) or O(m).
Basically, I am wondering whether there is some relation between m and n which would result in O(m).
If m and n are truly independent of one another and neither quantity influences the other, then the runtime of running an O(n log n)-time algorithm and then an O(m)-time algorithm will be O(n log n + m). Neither term dominates the other - if n gets huge compared to m then the n log n part dominates, and if m is huge relative to n then the m term dominates.
This gets more complicated if you know how m and n relate to one another in some way. Many graph algorithms, for example, use m to denote the number of edges and n to denote the number of nodes. In those cases, you can sometimes simplify these expressions, but sometimes cannot. For example, the cost of implementing Dijkstra’s algorithm with a Fibonacci heap is O(m + n log n), the same as what we have above.
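As a toy sketch of what "run the O(n log n)-time algorithm, then the O(m)-time algorithm" looks like (the function name and the two steps are purely illustrative):

    def process_points(A, B):
        # len(A) = n, len(B) = m; the two inputs are independent of one another.
        A_sorted = sorted(A)   # the O(n log n) step on A
        b_total = sum(B)       # the O(m) linear pass over B
        return A_sorted, b_total

The total work is O(n log n + m), and neither term can be dropped without knowing how n and m relate.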
The size of your input is x := m + n.
The complexity of the combined algorithm (if each algorithm is performed at most a constant number of times) is:
O(n log n) + O(m) = O(x log x) + O(x) = O(x log x).
Yes, if m ~ n^n, then O(log m) = O(n log n).
There is a log formula:
log(b^c) = c*log(b)
EDIT:
For both algorithms combined, the Big O is always the larger one, because we are concerned with the asymptotic upper bound.
So it will depend on the values of n and m. E.g. while n^n < m, the complexity is O(log m); after that it becomes O(n log n).
For Big-O notation we are only concerned with the larger term, so if n^n >>>> m then it is O(n log n), else if m >>>> n^n then it is O(log m).
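As a quick check of that log formula with m ~ n^n: log m = log(n^n) = n*log(n), which is why O(log m) and O(n log n) coincide in that case.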

Master theorem solution for the case d=log_b(a)

From Dasgupta's Algorithms: if the running time of a divide and conquer algorithm is described by the recurrence T(n)=aT(n/b)+O(n^d), then its solution is:
T(n)=O(n^d) if d>log_b(a)
T(n)=O(n^log_b(a)) if d<log_b(a)
T(n)=O(n^d*log_2(n)) if d=log_b(a)
where each subproblem's size is divided by b at the next level of recursion, a is the branching factor, and O((n/b^k)^d) is the time for dividing and combining the subproblems at level k, per subproblem.
Cases 1 and 2 are straightforward - they are taken from the geometric series formed when summing the work done at each level of the recursion tree, which is a^k*O((n/b^k)^d) = O(n^d)*(a/b^d)^k.
Where does the log_2(n) come from in case 3? When d = log_b(a), the ratio a/b^d equals 1, hence the sum of the series is n^d*log_b(n), not n^d*log_2(n).
As a simpler example, first note that O(log n), O(log_137 n), and O(log_16 n) mean the same thing. The reason for this is that, by the change of basis formula for logarithms, for any fixed constant m we have
log_m n = log n / log m = (1 / log m) · log n = O(log n).
The Master Theorem assumes that a, b, and d are constants. From the change of basis formula for logarithms, we have that
log_b n = log n / log b = (1 / log b) · log n = O(log n).
In that sense, O(n^d log_b n) = O(n^d log n), since b is a constant here.
As a note, it’s unusual to see something written out as O(n^d log_2 n), since the log base here doesn’t matter and just contributes to the (already hidden) constant factor.
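To make the counting in case 3 explicit: when d = log_b(a) the ratio a/b^d is 1, so every one of the roughly log_b(n) levels of the recursion tree contributes the same amount of work, and

    sum over k from 0 to log_b(n) of a^k * O((n/b^k)^d) = O(n^d) * (log_b(n) + 1) = O(n^d log n),

where the last step uses the change of basis above: log_b(n) and log_2(n) differ only by the constant factor 1/log_2(b).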

Find 3 elements in each of 3 arrays that sum to a given value

Let A, B, C be 3 arrays of n elements each. Find an algorithm for determining whether there exist an a in A, b in B, c in C such that a + b + c = k.
I have tried the following algorithm, but it takes O(n²):
Sort all 3 arrays. - O(n log n)
Temporary array h = k - (a+b) - O(n)
For every h, find c' in B such that c' = h - B[i] - O(n)
Search c' in C using binary search - O(log n)
Total = O(n log n) + O(n) + O(n² log n)
Can we solve it in O(n log n)?
Your question asks about solving the problem 3SUMx1, in linearithmic time, which is shown to reduce to 3SUMx3 in randomized linear time. See here for the reduction.
Unless you're about to publish something very big, I doubt that there can be such a fast algorithm for your problem, which is at least as hard as 3SUM (you can also show the reduction in the opposite direction with some work, too).
Edit: To make the above paragraph clear, the linear-time reduction from 3SUM proves that OP's problem is $\Omega(n^{1.5})$.
This is just a variation of the 3SUM problem. You cannot solve it in O(n log n);
it can be solved in O(n^2). The algorithm you described is wrong - it does not consider combinations of various indexes from A and B... see https://en.wikipedia.org/wiki/3SUM
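For completeness, here is a sketch of the standard O(n^2) approach hinted at in the answers (sort two of the arrays, then for each element of the third do a two-pointer scan); the function name and details are illustrative:

    def three_array_sum(A, B, C, k):
        B = sorted(B)                        # O(n log n)
        C = sorted(C)                        # O(n log n)
        for a in A:                          # n iterations
            target = k - a
            i, j = 0, len(C) - 1
            # Two-pointer scan over B (ascending) and C (descending): O(n) per a.
            while i < len(B) and j >= 0:
                s = B[i] + C[j]
                if s == target:
                    return a, B[i], C[j]     # found a + b + c = k
                elif s < target:
                    i += 1
                else:
                    j -= 1
        return None                          # no such triple exists

This considers the necessary combinations of indexes from A and B and runs in O(n^2) overall.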

Lower bounds on comparison sorts for a small fraction of inputs?

Can someone please walk me through mathematical part of the solution of the following problem.
Show that there is no comparison sort whose running time is linear for at least half
of the n! inputs of length n. What about a fraction of 1/n of the inputs of length n?
What about a fraction of 1/2^n?
Solution:
If the sort runs in linear time for m input permutations, then the height h of the
portion of the decision tree consisting of the m corresponding leaves and their
ancestors is linear.
Use the same argument as in the proof of Theorem 8.1 to show that this is impossible
for m = n!/2, n!/n, or n!/2^n.
We have 2^h ≥ m, which gives us h ≥ lg m. For all the possible values of m given here,
lg m = Ω(n lg n), hence h = Ω(n lg n).
In particular,
lg(n!/2) = lg n! − 1 ≥ n lg n − n lg e − 1
lg(n!/n) = lg n! − lg n ≥ n lg n − n lg e − lg n
lg(n!/2^n) = lg n! − n ≥ n lg n − n lg e − n
Each of these proofs are a straightforward modification of the more general proof that you can't have a comparison sort that sorts any faster than Ω(n log n) (you can see this proof in this earlier answer). Intuitively, the argument goes as follows. In order for a sorting algorithm to work correctly, it has to be able to determine what the initial ordering of the elements is. Otherwise, it can't reorder the values to put them in ascending order. Given n elements, there are n! different permutations of those elements, meaning that there are n! different inputs to the sorting algorithm.
Initially, the algorithm knows nothing about the input sequence, and it can't distinguish between any of the n! different permutations. Every time the algorithm makes a comparison, it gains a bit more information about how the elements are ordered. Specifically, it can tell whether the input permutation is in the group of permutations where the comparison yields true or in the group of permutations where the comparison yields false. You can visualize how the algorithm works as a binary tree, where each node corresponds to some state of the algorithm, and the (up to) two children of a particular node indicate the states of the algorithm that would be entered if the comparison yields true or false.
In order for the sorting algorithm to be able to sort correctly, it has to be able to enter a unique state for each possible input, since otherwise the algorithm couldn't distinguish between two different input sequences and would therefore sort at least one of them incorrectly. This means that if you consider the number of leaf nodes in the tree (parts where the algorithm has finished comparing and is going to sort), there must be at least one leaf node per input permutation. In the general proof, there are n! permutations, so there must be at least n! leaf nodes. In a binary tree, the only way to have k leaf nodes is to have height at least Ω(log k), meaning that you have to do at least Ω(log k) comparisons. Thus the general sorting lower bound is Ω(log n!) = Ω(n log n) by Stirling's approximation.
In the cases that you're considering, we're restricting ourselves to a subset of those possible permutations. For example, suppose that we want to be able to sort n! / 2 of the permutations. This means that our tree must have height at least lg(n! / 2) = lg n! - 1 = Ω(n log n). As a result, you can't sort in time O(n), because no linear function grows at the rate Ω(n log n). For the second part, seeing if you can get n! / n sorted in linear time, again the decision tree would have to have height lg(n! / n) = lg n! - lg n = Ω(n log n), so you can't sort in O(n) comparisons. For the final one, we have that lg(n! / 2^n) = lg n! - n = Ω(n log n) as well, so again it can't be sorted in O(n) time.
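To spell out the one estimate these bounds rely on: by Stirling's approximation n! ≥ (n/e)^n, so lg n! ≥ n lg(n/e) = n lg n − n lg e. Subtracting 1, lg n, or n from a quantity that is at least n lg n − n lg e still leaves something that is Ω(n lg n), because each of those correction terms (like n lg e itself) grows strictly slower than n lg n.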
However, you can sort 2^n permutations in linear time, since lg 2^n = n = O(n).
Hope this helps!

Disjoint sets with forest implementation without path compression

Consider a forest implementation of disjoint sets with only the weighted union heuristics (NO PATH COMPRESSION!) with n distinct elements. Define T(n,m) to be the worst case time complexity of executing a sequence of n-1 unions and m finds in any order, where m is any positive integer greater than n.
I took the worst-case sequence for T(n,m) to be doing the n-1 unions and then the m finds afterwards, because doing the find operation on the biggest tree possible would take the longest. Accordingly, T(n,m) = m*log(n) + n - 1, because each union takes O(1), so n-1 unions take n-1 steps, and each find takes log(n) steps, as the height of the resulting tree after n-1 unions is bounded by log_2(n).
My problem now is, does the T(n,m) chosen look fine?
Secondly, is T(n,m) Big Omega of m*log(n)? My claim is that it is, with c = 1 and n >= 2, given that the smallest possible T(n,m) is m*log(2) + 1, which is obviously greater than m*log(2). It seems rather stupid to ask this, but it seemed rather too easy a solution, so I have my suspicions regarding my correctness.
Thanks in advance.
Yes to T(n, m) looking fine, though I suppose you could give a formal induction proof that the worst-case is unions followed by finds.
As for proving that T(n, m) is Ω(m log(n)), you need to show that there exist n0 and m0 and c such that for all n ≥ n0 and all m ≥ m0, it holds that T(n, m) ≥ c m log(n). What you've written arguably shows this only for n = 2.
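For reference, a minimal sketch of the structure being analyzed (union by size, no path compression); the names are illustrative, and link assumes it is handed two roots, which matches the O(1)-per-union accounting in the question:

    class DisjointSets:
        def __init__(self, n):
            self.parent = list(range(n))   # every element starts as its own root
            self.size = [1] * n

        def find(self, x):
            # Walk up to the root; no path compression is performed.
            # Union by size keeps every tree's height at most log2(n), so this is O(log n).
            while self.parent[x] != x:
                x = self.parent[x]
            return x

        def link(self, rx, ry):
            # rx and ry must already be roots; attach the smaller tree under the larger.
            if self.size[rx] < self.size[ry]:
                rx, ry = ry, rx
            self.parent[ry] = rx
            self.size[rx] += self.size[ry]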

Resources