Asymptotic analysis - order functions - asymptotic-complexity

Can you please help to answer the following question:
Arrange the following functions in increasing order of growth rate
(with g(n) following f(n) in your list if and only if
f(n)=O(g(n))).
1. sqrt(n)
2. 10^n
3. n^1.5
4. 2^sqrt(log(n))
5. n^(5/3)
I used a logarithmic approach for each option - my answer is 13542. Am I on the right track?

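Taking logs is the right approach, but the ordering needs a second look. Taking the logs of all of these gives
1. 0.5 log n
2. n lg 10
3. 1.5 log n
4. √(log n)
5. (5/3) log n ≈ 1.67 log n
Since √(log n) grows more slowly than any constant multiple of log n, function 4 comes first. This would be ordered 4, 1, 3, 5, 2.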
Hope this helps!
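A quick numeric sanity check of the log comparison, sketched in Python. It assumes base-2 logs and evaluates at a single large sample point n = 2^k, so it is a spot check rather than a proof; the function names are hypothetical.

```python
import math

def log2_values(k):
    """Return log2 of each function at n = 2**k, keyed by its number in the question."""
    n = 2.0 ** k
    return {
        1: 0.5 * k,              # log2 of sqrt(n)
        2: n * math.log2(10),    # log2 of 10^n
        3: 1.5 * k,              # log2 of n^1.5
        4: math.sqrt(k),         # log2 of 2^sqrt(log2 n)
        5: (5 / 3) * k,          # log2 of n^(5/3)
    }

def growth_order(k=100):
    """Function numbers sorted by increasing value of their log at n = 2**k."""
    values = log2_values(k)
    return sorted(values, key=values.get)

print(growth_order())
```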

Related

Solve using either master theorem or by expansion

I have two questions, which I have been trying to solve but have been unable to figure out.
1) 𝑇(𝑛) = 𝑇(𝑛 − 1) + 𝑛^4
2) 𝑇(𝑛) = 2𝑇 (𝑛/2) + 𝑛 lg 𝑛
For the first one, I am assuming substitution (am I correct?), and got kb + T(n-k). Pretty sure that's wrong, so I need help with it.
For the second one, I have no idea at all...
Any help would be great! Thanks!
1) So you got
...? I don't know how you obtained this but it's certainly incorrect.
This is basically the summation of the 4th power of all integers up to n. The standard formula for this is:
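For reference, a sketch of the unrolled recurrence and the closed form for the sum of fourth powers, assuming the recursion bottoms out at a constant T(0):

```latex
T(n) = T(0) + \sum_{i=1}^{n} i^4,
\qquad
\sum_{i=1}^{n} i^4 = \frac{n(n+1)(2n+1)(3n^2+3n-1)}{30} = \Theta(n^5)
```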
2) We can find a pattern if we keep expanding this:
The limit log n - 1 is because we keep dividing the parameter to T by 2, so the substitution as above can continue for log n lines until, say T(1) or wherever the stopping condition is. Continuing using the logarithm rules (google them if you don't know):
Both summations have log n terms. Since the 1st summation does not depend on i at all, we simply multiply by log n. The 2nd summation is given by a standard formula for summation of integers from 1 (or 0, doesn't matter in this case):
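The steps described above can be sketched as follows, assuming n is a power of 2, logs are base 2, and T(1) is a constant:

```latex
\begin{aligned}
T(n) &= 2T(n/2) + n\log n \\
     &= 4T(n/4) + n\log(n/2) + n\log n \\
     &= 2^k\,T(n/2^k) + \sum_{i=0}^{k-1} n\log\!\left(\frac{n}{2^i}\right) \\
     &= n\,T(1) + \sum_{i=0}^{\log n - 1} n\log n \;-\; n\sum_{i=0}^{\log n - 1} i
        \qquad (k = \log n) \\
     &= n\,T(1) + n\log^2 n - n\,\frac{(\log n - 1)\log n}{2} \\
     &= n\,T(1) + \frac{n\log n\,(\log n + 1)}{2} = \Theta(n\log^2 n)
\end{aligned}
```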

Time complexity exercise

I'm having trouble solving an exercise. The context is an asymptotic analysis of running time. We are given algorithms such as Insertion Sort. The result should be in Theta notation (an asymptotically tight bound) for the input {N, N-1, ..., N/2, 1, 1, 2, 3, ..., N/2}. The problem is: how can I calculate the running time? I mean, it's no problem to calculate the worst-case or best-case scenario. My problem is how to handle the inputs and how to consider them in the calculation.
Thanks for your help!
Greetings
GR
See comments:
Have you tried listing the steps the program actually will take for some simple input like (4, 3, 2, 1, 1, 2) or (6, 5, 4, 3, 1, 1, 2, 3)? Can you "list" the steps for the general case N? – David K Oct 23 '14 at 16:32
First of all, thanks for your answer. :-) I simply count the additions and comparisons made. So in Insertion Sort there are n(n-1)/2 operations. In this case the Theta is Theta(n*n). My problem now is, how can I map this to a real input? – GR_ Oct 23 '14 at 18:31
If you actually have counted operations for the worst-case complexity of insertion sort, then you can tell what two numbers are compared by the 10th operation for sorting the numbers 1 through 100. That is, counting operations is mapping the operations to real input. It is actually a harder problem because you must also determine what input is the worst case, whereas here the input is already described for you. – David K Oct 23 '14 at 19:01
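One way to act on the suggestion in the comments is to count the steps empirically. The sketch below builds the described input and counts element shifts in a textbook insertion sort. The names `build_input` and `insertion_sort_shifts` are hypothetical, N is assumed to be even, and shifts are used as the cost measure (each shift follows one successful comparison).

```python
def build_input(n):
    """Build N, N-1, ..., N/2, 1, 1, 2, ..., N/2 (n is assumed to be even)."""
    half = n // 2
    return list(range(n, half - 1, -1)) + [1] + list(range(1, half + 1))

def insertion_sort_shifts(values):
    """Sort a copy of values with insertion sort; return (sorted list, shift count)."""
    a = list(values)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Move every larger element one slot to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return a, shifts

for n in (100, 200, 400):
    print(n, insertion_sort_shifts(build_input(n))[1])
```

Doubling N roughly quadruples the shift count, which is consistent with Theta(N^2) for this input.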

algorithm - Sort an array with LogLogN distinct elements

This is not my school homework. This is my own homework and I am self-learning algorithms.
In The Algorithm Design Manual, there is this exercise:
4-25 Assume that the array A[1..n] only has numbers from {1, . . . , n^2} but that at most log log n of these numbers ever appear. Devise an algorithm that sorts A in substantially less than O(n log n).
I have two approaches:
The first approach:
Basically I want to do counting sort for this problem. I can first scan the whole array (O(N)) and put all distinct numbers into a loglogN size array (int[] K).
Then apply counting sort. However, when setting up the counting array (int[] C), I don't need to set its size as N^2, instead, I set the size as loglogN too.
But in this way, when counting the frequencies of each distinct number, I have to scan array K to get that element's index (O(NloglogN)) and then update array C.
The second approach:
Again, I have to scan the whole array to get a distinct number array K with size loglogN.
Then I do a quicksort-like sort, but the partition is based on the median of the K array (i.e., each time the pivot is an element of the K array), recursively.
I think this approach will be best, with O(NlogloglogN).
Am I right? Or are there better solutions?
Similar exercises exist in The Algorithm Design Manual, such as
4-22 Show that n positive integers in the range 1 to k can be sorted in O(n log k) time. The interesting case is when k << n.
4-23 We seek to sort a sequence S of n integers with many duplications, such that the number of distinct integers in S is O(log n). Give an O(n log log n) worst-case time algorithm to sort such sequences.
But for all these exercises, my intuition was always to think of counting sort, since we know the range of the elements and the range is short enough compared to the length of the whole array. But after thinking more deeply, I guess what the exercises are looking for is the 2nd approach, right?
Thanks
We can just create a hash map storing each element as key and its frequency as value.
Sort this map in log(n)*log(log(n)) time, i.e. O(k log k) for k distinct keys, using any sorting algorithm.
Now scan the hash map and add elements to the new array frequency number of times. Like so:
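A minimal Python sketch of the three steps above. The function name `sort_few_distinct` is hypothetical, and `collections.Counter` stands in for the hash map.

```python
from collections import Counter

def sort_few_distinct(a):
    """Sort a list that contains only a few distinct values."""
    counts = Counter(a)               # one pass: value -> frequency
    result = []
    for value in sorted(counts):      # sorts only the k distinct keys
        result.extend([value] * counts[value])  # emit each key `frequency` times
    return result

print(sort_few_distinct([2, 8, 1, 6, 7, 1, 6]))
```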
total time = 2n+log(n)*log(log(n)) = O(n)
Counting sort is one of possible ways:
I will demonstrate this solution on the example 2, 8, 1, 6, 7, 1, 6, where all numbers are <= 3^2 = 9. I use more elements to make my idea clearer.
First, for each number A[i], compute A[i] / N (integer division). Let's call this number first_part_of_number.
Sort this array using counting sort by first_part_of_number.
Results are in form (example for N = 3)
(0, 2)
(0, 1)
(0, 1)
(2, 8)
(2, 6)
(2, 7)
(2, 6)
Divide them into groups by first_part_of_number.
In this example you will have groups
(0, 2)
(0, 1)
(0, 1)
and
(2, 8)
(2, 6)
(2, 7)
(2, 6)
For each number X, compute X modulo N. Let's call it second_part_of_number. Add this number to each element
(0, 2, 2)
(0, 1, 1)
(0, 1, 1)
and
(2, 8, 2)
(2, 6, 0)
(2, 7, 1)
(2, 6, 0)
Sort each group using counting sort by second_part_of_number
(0, 1, 1)
(0, 1, 1)
(0, 2, 2)
and
(2, 6, 0)
(2, 6, 0)
(2, 7, 1)
(2, 8, 2)
Now combine all groups and you have result 1, 1, 2, 6, 6, 7, 8.
Complexity:
You were using only counting sort on elements <= N.
Each element took part in exactly 2 "sorts". So overall complexity is O(N).
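The two-digit idea can be sketched in Python as below. Note the swap: instead of grouping by first_part_of_number and then sorting each group separately, this sketch uses the least-significant-digit variant (a stable pass by second_part_of_number, then a stable pass by first_part_of_number), which gives the same result with exactly two whole-array passes. The name `two_pass_radix_sort` and the `base` parameter (playing the role of N) are hypothetical, and simple bucket lists stand in for the counting arrays.

```python
def two_pass_radix_sort(a, base):
    """Sort integers in the range 1..base**2 using two stable bucket passes."""
    digit_functions = (
        lambda x: x % base,    # second_part_of_number, in 0..base-1
        lambda x: x // base,   # first_part_of_number, in 0..base
    )
    for digit in digit_functions:
        buckets = [[] for _ in range(base + 1)]
        for x in a:
            buckets[digit(x)].append(x)   # appending keeps each pass stable
        a = [x for bucket in buckets for x in bucket]
    return a

print(two_pass_radix_sort([2, 8, 1, 6, 7, 1, 6], 3))
```

Each pass costs O(len(a) + base), so the total is linear when base is no larger than the array length.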
I'm going to betray my limited knowledge of algorithmic complexity here, but:
Wouldn't it make sense to scan the array once and build something like a self-balancing tree? As we know the number of nodes in the tree will only grow to (log log n) it is relatively cheap (?) to find a number each time. If a repeat number is found (likely) a counter in that node is incremented.
Then to construct the sorted array, read the tree in order.
Maybe someone can comment on the complexity of this and any flaws.
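As a rough way to explore this idea, here is a Python sketch in which a sorted list searched with `bisect` stands in for the self-balancing tree (Python's standard library has no balanced tree). The function name is hypothetical. With k distinct values, each lookup costs O(log k), and the O(k) list insertion happens only k times, so the total is O(n log k + k^2), which for k = log log n is O(n log log log n).

```python
from bisect import bisect_left

def sort_with_sorted_keys(a):
    """Sort a list with few distinct values using a sorted key list plus counters."""
    keys = []     # distinct values seen so far, kept in sorted order
    counts = []   # counts[i] is the frequency of keys[i]
    for x in a:
        i = bisect_left(keys, x)          # binary search among the distinct keys
        if i < len(keys) and keys[i] == x:
            counts[i] += 1                # repeat value: bump its counter
        else:
            keys.insert(i, x)             # new value: happens at most k times
            counts.insert(i, 1)
    result = []
    for key, count in zip(keys, counts):  # in-order read of the "tree"
        result.extend([key] * count)
    return result

print(sort_with_sorted_keys([2, 8, 1, 6, 7, 1, 6]))
```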
Update: After I wrote the answer below, #Nabb showed me why it was incorrect. For more information, see Wikipedia's brief entry on Õ, and the links therefrom. At least because it is still needed to lend context to #Nabb's and #Blueshift's comments, and because the whole discussion remains interesting, my original answer is retained, as follows.
ORIGINAL ANSWER (INCORRECT)
Let me offer an unconventional answer: though there is indeed a difference between O(n*n) and O(n), there is no difference between O(n) and O(n*log(n)).
Now, of course, we all know that what I just said is wrong, don't we? After all, various authors concur that O(n) and O(n*log(n)) differ.
Except that they don't differ.
So radical-seeming a position naturally demands justification, so consider the following, then make up your own mind.
Mathematically, essentially, the order m of a function f(z) is such that f(z)/(z^(m+epsilon)) converges while f(z)/(z^(m-epsilon)) diverges for z of large magnitude and real, positive epsilon of arbitrarily small magnitude. The z can be real or complex, though as we said epsilon must be real. With this understanding, apply L'Hospital's rule to a function of O(n*log(n)) to see that it does not differ in order from a function of O(n).
I would contend that the accepted computer-science literature at the present time is slightly mistaken on this point. This literature will eventually refine its position in the matter, but it hasn't done, yet.
Now, I do not expect you to agree with me today. This, after all, is merely an answer on Stack Overflow -- and what is that compared to an edited, formally peer-reviewed, published computer-science book -- not to mention a shelfful of such books? You should not agree with me today, only take what I have written under advisement, mull it over in your mind these coming weeks, consult one or two of the aforementioned computer-science books that take the other position, and make up your own mind.
Incidentally, a counterintuitive implication of this answer's position is that one can access a balanced binary tree in O(1) time. Again, we all know that that's false, right? It's supposed to be O(log(n)). But remember: the O() notation was never meant to give a precise measure of computational demands. Unless n is very large, other factors can be more important than a function's order. But, even for n = 1 million, log(n) is only 20, compared, say, to sqrt(n), which is 1000. And I could go on in this vein.
Anyway, give it some thought. Even if, eventually, you decide that you disagree with me, you may find the position interesting nonetheless. For my part, I am not sure how useful the O() notation really is when it comes to O(log something).
#Blueshift asks some interesting questions and raises some valid points in the comments below. I recommend that you read his words. I don't really have a lot to add to what he has to say, except to observe that, because few programmers have (or need) a solid grounding in the mathematical theory of the complex variable, the O(log(n)) notation has misled probably, literally hundreds of thousands of programmers to believe that they were achieving mostly illusory gains in computational efficiency. Seldom in practice does reducing O(n*log(n)) to O(n) really buy you what you might think that it buys you, unless you have a clear mental image of how incredibly slow a function the logarithm truly is -- whereas reducing O(n) even to O(sqrt(n)) can buy you a lot. A mathematician would have told the computer scientist this decades ago, but the computer scientist wasn't listening, was in a hurry, or didn't understand the point. And that's all right. I don't mind. There are lots and lots of points on other subjects I don't understand, even when the points are carefully explained to me. But this is a point I believe that I do happen to understand. Fundamentally, it is a mathematical point not a computer point, and it is a point on which I happen to side with Lebedev and the mathematicians rather than with Knuth and the computer scientists. This is all.

Showing that a recurrence relation is O(n log n)

T(n) = T(xn) + T((1 − x)n) + n = O(n log n)
where x is a constant in the range 0 < x < 1. Is the asymptotic complexity the same when x = 0.5, 0.1 and 0.001?
What happens to the constant hidden in the O() notation? Use the substitution method.
I'm trying to use the example on ~page 15 but I find it weird that in that example the logs change from default base to base 2.
I also do not really understand why it needed to be simplified so much just to remove cnlog2n from the left side. Could that not be done in the first step, so that the left side would just have "stuff-cnlog2n<=0", then evaluated for any c and n, like so?
From what I tried it could not prove that T(n)=O(n)
Well, if you break this up into a tree using Master's theorem, then this will have a constant "amount" to calculate each time. You know this because x + 1 - x = 1.
Thus, the time depends on the level of the tree, which is logarithmic since the pieces are reducing each time by some constant amount. Since you do O(n) calcs each level, your overall complexity is O( n log n ).
I expect this will be a little more complicated to "prove". Remember that it doesn't matter what base your logs are in, they're all just some constant factor. Refer to logarithmic relations for this.
PS: Looks like homework. Think harder yourself!
This seems to me exactly the recurrence equation for the average case in Quicksort.
You should look at CLRS explanation of "Balanced Partitioning".
but I find it weird that in that example the logs change from default base to base 2
That is indeed weird. The simple fact is that it's easier to prove with base 2 than with an unknown base x. For example, log(2n) = 1+log(n) in base 2, which is a bit easier. You don't have to use base 2; you can pick any base you want, or use base x. To be absolutely correct, the induction hypothesis (IH) must have the base in it: T(K) <= c K log_2(K). You can't change the IH later on, so what is happening now is not correct in a strict sense. You're free to pick any IH you like, so just pick one that makes the proof easier: in this case, logs with base 2.
the left side would just have "stuff-cnlog2n<=0" then evaluated for any c and n
What do you mean by 'evaluated for any c and n'? stuff-cnlog2n<=0 is correct, but how do you prove that there is a c so that it holds for all n? Okay, c=2 is a good guess. To prove it like that in WolframAlpha, you need to do stuff <= 0 where c=2, n=1 OK!, stuff <= 0 where c=2, n=2 OK!, stuff <= 0 where c=2, n=3 OK!, ..., etc., taking n all the way to infinity. Hmm, it will take you an infinite amount of time to check all of them... The only practical way (I can think of right now) for solving this is to simplify stuff-cnlog2n<=0. Or maybe you prefer this argument: you don't have WolframAlpha at your exam, so you must simplify.
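For what it's worth, here is a sketch of how the inductive step of the substitution method could go. It assumes the induction hypothesis T(k) ≤ c k log_2 k for all k < n and leaves the base cases aside; the symbol H(x) is introduced here only for illustration.

```latex
\begin{aligned}
T(n) &\le c\,xn\log_2(xn) + c\,(1-x)n\log_2\bigl((1-x)n\bigr) + n \\
     &= c\,n\log_2 n + c\,n\bigl[x\log_2 x + (1-x)\log_2(1-x)\bigr] + n \\
     &= c\,n\log_2 n - c\,n\,H(x) + n,
        \qquad H(x) = -x\log_2 x - (1-x)\log_2(1-x) > 0 \\
     &\le c\,n\log_2 n \quad\text{whenever } c \ge \frac{1}{H(x)}
\end{aligned}
```

So the bound is O(n log n) for every constant x in (0, 1), but the hidden constant grows as the split becomes more unbalanced: c ≥ 1 suffices for x = 0.5, roughly c ≥ 2.13 for x = 0.1, and roughly c ≥ 88 for x = 0.001.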

Tips for Project Euler Problem #78

This is the problem in question: Problem #78
This is driving me crazy. I've been working on this for a few hours now and I've been able to reduce the complexity of finding the number of ways to stack n coins to O(n/2), but even with those improvements and starting from an n for which p(n) is close to one-million, I still can't reach the answer in under a minute. Not at all, actually.
Are there any hints that could help me with this?
Keep in mind that I don't want a full solution and there shouldn't be any functional solutions posted here, so as not to spoil the problem for other people. This is why I haven't included any code either.
Wikipedia can help you here. I assume that the solution you already have is a recursion such as the one in the section "intermediate function". This can be used to find the solution to the Euler problem, but isn't fast.
A much better way is to use the recursion based on the pentagonal number theorem in the next section. The proof of this theorem isn't straight forward, so I don't think the authors of the problem expect that you come up with the theorem by yourself. Rather it is one of the problems, where they expect some literature search.
This problem is really asking to find the first term in the sequence of partition numbers p(n) that’s divisible by 1,000,000.
A partition of an integer n is a way of writing n as a sum of positive integers ≤ n, regardless of order. The function p(n) denotes the number of partitions of n. Below we show our 5 “coins” as addends to enumerate the 7 partitions, that is, p(5)=7.
5 = 5
= 4+1
= 3+2
= 3+1+1
= 2+2+1
= 2+1+1+1
= 1+1+1+1+1
We use a generating function to create the series until we find the required n.
The generating function requires at most 500 so-called generalized pentagonal numbers, given by n(3n – 1)/2 for n = 0, ±1, ±2, ±3, …, the first few of which are 0, 1, 2, 5, 7, 12, 15, 22, 26, 35, … (Sloane’s A001318).
We have the following generating function which uses our pentagonal numbers as exponents:
1 - q - q^2 + q^5 + q^7 - q^12 - q^15 + q^22 + q^26 + ...
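To illustrate how the pentagonal numbers enter, here is a small Python sketch of the recurrence p(n) = Σ_{k≥1} (−1)^(k+1) [p(n − k(3k−1)/2) + p(n − k(3k+1)/2)], with p(0) = 1 and p(m) = 0 for m < 0. The function name `partition_counts` is hypothetical, and the sketch deliberately stops at tabulating small values, leaving the divisibility search out in keeping with the question's request.

```python
def partition_counts(limit):
    """Return [p(0), p(1), ..., p(limit)] via the pentagonal number recurrence."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while True:
            g1 = k * (3 * k - 1) // 2      # generalized pentagonal number for +k
            if g1 > n:
                break
            sign = 1 if k % 2 else -1      # signs go +, -, +, - as k increases
            total += sign * p[n - g1]
            g2 = k * (3 * k + 1) // 2      # generalized pentagonal number for -k
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p

print(partition_counts(10))
```

Each p(n) needs only about sqrt(n) earlier terms, which is where the speedup over the simpler recursion comes from.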
My blog at blog.dreamshire.com has a Perl program that solves this in under 10 sec.
Have you done problems 31 or 76 yet? They form a nice set that is a generalization of the same base problem each time. Doing the easier questions may give you insight into a solution for 78.
Here are some hints:
Divisibility by one million is not the same thing as just being larger than one million. 1 million = 1,000,000 = 10^6 = 2^6 * 5^6.
So the question is to find the lowest n such that the prime factorization of p(n) contains at least six 2's and six 5's.
