Merge sort - recursion tree - sorting

So I've been studying sorting algorithms.
I am stuck on finding the complexity of merge sort.
Can someone please explain to me how h=1+lg(n)

If you keep dividing n by 2, you'll eventually get to 1.
Namely, it takes log2(n) divisions by 2 to make this happen, by definition of the logarithm.
Every time we divide by 2, we add a new level to the recursion tree.
Add that to the root level (which didn't require any divisions), and we have log2(n) + 1 levels total.
Here's a cooler proof. Notice that, rearranging, we have T(2n) - 2 T(n) = 2 c n.
If n = 2k, then we have T(2k + 1) - 2 T(2k) = 2 c 2k.
Let's simplify the mess. Let's define U(k) = T(2k) / (2 c).
Then we have U(k + 1) - 2 U(k) = 2k, or, if we define U'(k) = U(k + 1) - U(k):
U'(k) - U(k) = 2k
k is discrete here, but we can let it be continuous, and if we do, then U' is the derivative of U.
At that point the solution is obvious: if you've ever taken derivatives, then you know that if the difference of a function and its derivative is exponential, then the function itself has to be exponential (since only in that case will the derivative be some multiple of itself).
At that point you know U(k) is exponential, so you can just plug in an exponential for the unknown coefficients in the exponential, and plug it back in to solve for T.

Related

Counting primitive operations on recursive functions

I'm reading Algorithm Design and Applications, by Michael T. Goodrich and Roberto Tamassia, published by Wiley. They teach the concept of primitive operations and how to count then in a given algorithm. Everything was clear to me until the moment they showed a recursive function (a simple recursive way to calculate the maximum value of an array) and its primitive operation count.
The function (in pseudo-code) is this:
Algorithm recursiveMax(A, n):
Input: An array A storing n ≥ 1 integers.
Output: The maximum element in A.
if n = 1 then
return A[0]
return max{recursiveMax(A, n − 1), A[n − 1]}
where A is an array and n its length. The author states what follows concerning how we calculate the number of primitive operations this function has:
As with this example, recursive algorithms are often quite elegant. Analyzing
the running time of a recursive algorithm takes a bit of additional work, however.
In particular, to analyze such a running time, we use a recurrence equation, which
defines mathematical statements that the running time of a recursive algorithm must
satisfy. We introduce a function T (n) that denotes the running time of the algorithm
on an input of size n, and we write equations that T (n) must satisfy. For example,
we can characterize the running time, T (n), of the recursiveMax algorithm as T(n) = 3 if n = 1 or T(n - 1) + 7 otherwise, assuming that we count each comparison, array reference, recursive call, max calculation, or return as a single primitive operation. Ideally, we would like to characterize a recurrence equation like that above in closed form, where no references to the function T appear on the righthand side. For the recursiveMax algorithm, it isn’t too hard to see that a closed form would be T (n) = 7(n − 1) + 3 = 7n − 4.
I can clearly understand that in the case of a single item array, our T(n) would be just 3 (only 3 primitive operations will occur, i.e. the comparision n = 1, the array index A[0] and the return operation), but I cannot understand why in the case where n is not 1 we have T(n-1) + 7. Why + 7? From where did we get this constant?
Also, I cannot comprehend this closed form: how did he get that T(n) = 7(n - 1) + 3?
I appreciate any help.

Solving a recurrence: T(n)=3T(n/2)+n

I need to Find the solution of the recurrence for n, a power of two if T(n)=3T(n/2)+n for n>1 and T(n)=1 otherwise.
using substitution of n=2^m,S(m)=T(2^(m-1)) I can get down to:
S(m)=2^m+3*2^(m-1)+3^2*2^(m-2)+⋯+3^(m-1) 2^1+3^m
But I have no idea how to simply that.
These types of recurrences are most easily solved by Master Theorem for analysis of algorithms which is explained as follows:
Let a be an integer greater than or equal to 1, b be a real number greater than 1, and c be a positive real number. Given a recurrence of the form -
T (n) = a * T(n/b) + nc where n > 1, then for n a power of b, if
Logba < c, T (n) = Θ(nc);
Logba = c, T (n) = Θ(nc * Log n);
Logba > c, T (n) = Θ(nlogba).
English translation of your recurrence
The most critical thing to understand in Master Theorem is the constants a, b, and c mentioned in the recurrence. Let's take your own recurrence - T(n) = 3T(n/2) + n - for example.
This recurrence is actually saying that the algorithm represented by it is such that,
(Time to solve a problem of size n) = (Time taken to solve 3 problems of size n/2) + n
The n at the end is the cost of merging the results of those 3 n/2 sized problems.
Now, intuitively you can understand that:
if the cost of "solving 3 problems of size n/2" has more weight than "n" then the first item will determine the overall complexity;
if the cost "n" has more weight than "solving 3 problems of size n/2" then the second item will determine the overall complexity; and,
if both parts are of same weight then solving the sub-problems and merging their results will have an overall compounded weight.
From the above three intuitive understanding, only the three cases of Master Theorem arise.
In your example, a = 3, b = 2 and c = 1. So it falls in case-3 as Logba = Log23 which is greater than 1 (the value of c).
The complexity therefore is straightforward - Θ(nlogba) = Θ(nlog23).
You can solve this using Masters theorem, but also by opening the recursion tree in the following way:
At the root of the recursion tree, you will have a work of n.
In the second stage, the tree splits into three parts, and in each part, the work will be n / 2.
Keep going until you reach the leaves. The entire work leaf will be: O (1) = O (n / 2 ^ k) when: n = 2 ^ k.
Note that at each step m have 3 ^ m splits.
Now we'll combine all the steps together, using the geometric progression and logarithms rules. In the end, you will get:
T(n) = 3T(n/2)+n = 2n^(log3)-2n
the calculation
Have a look here at page 60 http://www.cs.columbia.edu/~cs4205/files/CM2.pdf.
And maybe you should have asked here https://math.stackexchange.com/
The problems like this can be solved using Masters theorem.
In your case a = 3, b = 2 and f(n) = n.
So c = log_b(a) = log_2(3), which is bigger than 1, and therefore you fall into the first case. So your complexity is:
O(n^{log_2(3)}) = O(n^{1.58})

Choosing minimum length k of array for merge sort where use of insertion sort to sort the subarrays is more optimal than standard merge sort

This is a question from Introduction to Algorithms By Cormen. But this isn't a homework problem instead self-study.
There is an array of length n. Consider a modification to merge sort in which n/k sublists each of length k are sorted using insertion sort and then merged using merging mechanism, where k is a value to be determined.
The relationship between n and k isn't known. The length of array is n. k sublists of n/k means n * (n/k) equals n elements of the array. Hence k is simply a limit at which the splitting of array for use with merge-sort is stopped and instead insertion-sort is used because of its smaller constant factor.
I was able to do the mathematical proof that the modified algorithm works in Θ(n*k + n*lg(n/k)) worst-case time. Now the book went on to say to
find the largest value of k as a function of n for which this modified algorithm has the same running time as standard merge sort, in terms of Θ notation. How should we choose k in practice?
Now this got me thinking for a lot of time but I couldn't come up with anything. I tried to solve
n*k + n*lg(n/k) = n*lg(n) for a relationship. I thought that finding an equality for the 2 running times would give me the limit and greater can be checked using simple hit-and-trial.
I solved it like this
n k + n lg(n/k) = n lg(n)
k + lg(n/k) = lg(n)
lg(2^k) + lg(n/k) = lg(n)
(2^k * n)/k = n
2^k = k
But it gave me 2 ^ k = k which doesn't show any relationship. What is the relationship? I think I might have taken the wrong equation for finding the relationship.
I can implement the algorithm and I suppose adding an if (length_Array < k) statement in the merge_sort function here(Github link of merge sort implementation) for calling insertion sort would be good enough. But how do I choose k in real life?
Well, this is a mathematical minimization problem, and to solve it, we need some basic calculus.
We need to find the value of k for which d[n*k + n*lg(n/k)] / dk == 0.
We should also check for the edge cases, which are k == n, and k == 1.
The candidate for the value of k that will give the minimal result for n*k + n*lg(n/k) is the minimum in the required range, and is thus the optimal value of k.
Attachment, solving the derivitives equation:
d[n*k + n*lg(n/k)] / dk = d[n*k + nlg(n) - nlg(k)] / dk
= n + 0 - n*1/k = n - n/k
=>
n - n/k = 0 => n = n/k => 1/k = 1 => k = 1
Now, we have the candidates: k=n, k=1. For k=n we get O(n^2), thus we conclude optimal k is k == 1.
Note that we found the derivitives on the function from the big Theta, and not on the exact complexity function that uses the needed constants.
Doing this on the exact complexity function, with all the constants might yield a bit different end result - but the way to solve it is pretty much the same, only take derivitives from a different function.
maybe k should be lg(n)
theta(nk + nlog(n/k)) have two terms, we have the assumption that k>=1, so the second term is less than nlog(n).
only when k=lg(n), the whole result is theta(nlog(n))

Multipling two polynomials - Not exactly FFT

I'm currently studying FFT and I have trouble solving the following question:
Write a "divide and conquer" algorithm which multiplies two polynomials (max N degree) in complexity:
Theta of n^log3 (base 2 log ofc)
The algorithm should divide the two given polynomials coefficients into two groups:
Group1) Coefficients with even indexes.
Group2) Coefficients with odd indexes.
ehm, I don't even know how to start thinking about the solution..
The guidelines seem to be similar to FFT algorithm but I still can't see the solution.
Would love to get some assistance! even just a way to think about it..
please note that no code should be supplied.. only explanations and maybe pseudo code about how to get it done.
thanks !
Here are a few hints, then the solution.
1 -
First of all, you should make sure that you can perform the multiplication in less than n^2 coefficient multiplications, on a simple example :
(aX + b)*(cX + d)
One of your multiplications should be (a+b)*(c+d)
2 -
Haven't found how to do it ?
Here are the operations for each power :
X^2 : ac
X : (a+b)*(c+d) - ac - bd
1 : bd
You just have to perform 3 multiplications instead of 4. Additions do not cost that much compared to multiplications.
3 -
You are asked to find a solution in Theta(n^lg(3)). Here is a quick reminder of the 'Théorème Général' :
Let T(n) the cost of your algorithme for the polynoms with degree n.
With a 'divide to conquer strategy' which leads to :
T(n) = aT(n/b)+f(n)
If f(n)~O(n^lg_b(a)) then T(n) = Theta(n^lg_b(a))
You are looking for T(n) = Theta(n^lg_2(3)). This could mean that :
T(n)=3.T(n/2) + epsilon
If you split your polynoms in even and odd polynoms, they have half of the initial coefficients amount : n/2.
The formula shows you that you will perform three multiplications between the odd and even polynoms...
4 -
Consider to represent your polynom P(x) with degree n this way :
P(X) = X.A(X) + B(X)
A(X) and B(X) contain n/2 coefficients.
5 - Solution
P(X) = X.A(X) + B(X)
P'(X) = X.A'(X) + B'(X)
The coefficients of P*P'(X) is the sum of the coefficients of :
X^2.A.A'
X.(A.B'+A'.B) = X.[(A+B)(A'+B') - A.A' - B.B']
B.B'
So you have to call your multiplication algorithm on :
A and A'
A+B and A'+B'
B and B'
Then you can recombine coefficients with shifts and additions.
Cheers

What does Logn actually mean?

I am just studying for my class in Algorithms and have been looking over QuickSort. I understand the algorithm and how it works, but not how to get the number of comparisons it does, or what logn actually means, at the end of the day.
I understand the basics, to the extent of :
x=logb(Y) then
b^x = Y
But what does this mean in terms of algorithm performance? It's the number of comparisons you need to do, I understand that...the whole idea just seems so unintelligible though. Like, for QuickSort, each level K invocation involves 2^k invocations each involving sublists of length n/2^K.
So, summing to find the number of comparisons :
log n
Σ 2^k. 2(n/2^k) = 2n(1+logn)
k=0
Why are we summing up to log n ? Where did 2n(1+logn) come from? Sorry for the vagueness of my descriptions, I am just so confused.
If you consider a full, balanced binary tree, then layer by layer you have 1 + 2 + 4 + 8 + ... vertices. If the total number of vertices in the tree is 2^n - 1 then you have 1 + 2 + 4 + 8 + ... + 2^(n-1) vertices, counting layer by layer. Now, let N = 2^n (the size of the tree), then the height of the tree is n, and n = log2(N) (the height of the tree). That's what the log(n) means in these Big O expressions.
below is a sample tree:
1
/ \
2 3
/ \ / \
4 5 6 7
number of nodes in tree is 7 but high of tree is log 7 = 3, log comes when you have divide and conquer methods, in quick sort you divide list into 2 sublist, and continue this until rich small lists, divisions takes logn time (in average case), because the high of division is log n, partitioning in each level takes O(n), because in each level in average you partition N numbers, (may be there are too many list for partitioning, but average number of numbers is N in each level, in fact some of count of lists is N). So for simple observation if you have balanced partition tree you have log n time for partitioning, which means high of tree.
1 forget about b-trees for sec
here' math : log2 N = k is same 2^k=N .. its the definion of log
, it could be natural log(e) N = k aka e^k = n,, or decimal log10 N = k is 10^k = n
2 see perfect , balanced tree
1
1+ 1
1 + 1 + 1+ 1
8 ones
16 ones
etc
how many elements? 1+2+4+8..etc , so for 2 level b-tree there are 2^2-1 elements, for 3 level tree 2^3-1 and etc.. SO HERE'S MAGIC FORMULA: N_TREE_ELEMENTS= number OF levels^ 2 -1 ,or using definition of log : log2 number OF levels= number_of_tree_elements (Can forget about -1 )
3 lets say there's a task to find element in N elements b-tree, w/ K levels (aka height)
where how b-tree is constructed log2 height = number_of_tree elements
MOST IMPORTANT
so by how b-tree is constructed you need no more then 'height' OPERATIONS to find element in all N elements , or less.. so WHAT IS HEIGHT equals : log2 number_of_tree_elements..
so you need log2 N_number_of_tree_elements.. or log(N) for shorter
To understand what O(log(n)) means you might wanna read up on Big O notaion. In shot it means, that if your data set gets 1024 times bigger you runtime will only be 10 times longer (or less)(for base 2).
MergeSort runs in O(n*log(n)), which means it will take 10 240 times longer. Bubble sort runs in O(n^2), which means it will take 1024^2 = 1 048 576 times longer. So there are really some time to safe :)
To understand your sum, you must look at the mergesort algorithm as a tree:
sort(3,1,2,4)
/ \
sort(3,1) sort(2,4)
/ \ / \
sort(3) sort(1) sort(2) sort(4)
The sum iterates over each level of the tree. k=0 it the top, k= log(n) is the buttom. The tree will always be of height log2(n) (as it a balanced binary tree).
To do a little math:
Σ 2^k * 2(n/2^k) =
2 * Σ 2^k * (n/2^k) =
2 * Σ n*2^k/2^k =
2 * Σ n =
2 * n * (1+log(n)) //As there are log(n)+1 steps from 0 to log(n) inclusive
This is of course a lot of work to do, especially if you have more complex algoritms. In those situations you get really happy for the Master Theorem, but for the moment it might just get you more confused. It's very theoretical so don't worry if you don't understand it right away.
For me, to understand issues like this, this is a good way to think about it.

Resources