Hi and sorry for my bad English.
I'm studying computer science and I don't understand why this expression (in the image) has this result.
Tmedio is the average cost of a linear search algorithm. As I understand it, and going by the definition of a summation, if for example n = 4, the result should be something like (1/4)*(1+2+3+4)... What am I doing wrong?
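The sum of the first n positive integers is n*(n+1)/2. Hence you get (1/n) * n * (n+1)/2 = (n+1)/2. For n = 4 that is (1/4)*(1+2+3+4) = 10/4 = 5/2, so your expansion is right; the formula is just its closed form.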
See the wiki page related to this identity here: http://en.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_%E2%8B%AF
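If it helps, here is a small sketch that checks the identity numerically. The function name is made up for illustration, and it assumes the cost of finding the element at position i is i comparisons, with every position equally likely:

```python
from fractions import Fraction

def average_linear_search_cost(n):
    """Average of the costs 1, 2, ..., n, computed exactly as a fraction."""
    return Fraction(sum(range(1, n + 1)), n)

# Compare the explicit sum with the closed form (n + 1) / 2.
for n in (1, 4, 10):
    print(n, average_linear_search_cost(n), Fraction(n + 1, 2))
```

For n = 4 both columns come out as 5/2.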
I have learned that finding the minimum/maximum of a unimodal function can be done with ternary search, an algorithm that runs in O(log N) time (where N is the size of the given search range).
However, I recently had the thought that maybe we could find the max/min of a unimodal function with binary search as well. Specifically, for functions that have a definite maximum, we could find and return the first x-value for which f(x) >= f(x+Epsilon) (where Epsilon is the allowed error bound). On the other hand, for functions that have a definite minimum, we could find and return the first x-value such that f(x) <= f(x+Epsilon).
Overall, my question is, why is ternary search used at all for this search operation if binary search can accomplish the same applications? Am I missing something here, or have I made some other logical mistake?
There are some scenarios where binary search won't be helpful but ternary search can yield the desired result.
Here is a resource that explains it in detail.
Let f(x) = -x^2 + 3. Let Epsilon = 2.
The first x-value for which f(x) >= f(x+Epsilon) is -1:
f(-1) = 2 = f(-1 + 2) = f(1) = 2
But the maximum is at x = 0, so the answer should be x = 0.
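To make the counterexample concrete, here is a minimal sketch. Both helper names are made up for illustration, the scan only looks at integer x-values in [-10, 10], and the ternary search assumes f is strictly unimodal on the interval:

```python
def f(x):
    return -x * x + 3

def first_x_not_increasing(f, xs, eps):
    """First x in xs with f(x) >= f(x + eps), i.e. the proposed stopping rule."""
    for x in xs:
        if f(x) >= f(x + eps):
            return x
    return None

def ternary_search_max(f, lo, hi, tol=1e-9):
    """Approximate maximizer of a strictly unimodal f on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the maximum cannot lie to the left of m1
        else:
            hi = m2  # the maximum cannot lie to the right of m2
    return (lo + hi) / 2

print(first_x_not_increasing(f, range(-10, 11), 2))  # stops at -1
print(ternary_search_max(f, -10, 10))                # very close to 0
```

With Epsilon = 2 the stopping rule halts a full unit away from the true maximum, while the ternary search narrows in on 0.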
I can't figure out what Knuth meant in his instructions for exercise 8 in Section 1.1.
The task is to write an efficient gcd algorithm for two positive integers m and n using his notation theta[j], phi[j], b[j] and a[j], where theta and phi are strings and a and b are positive integers that represent computational steps in this case.
Let the input be a string of the form a^mb^n.
An excellent explanation of Knuth's algorithm is given by schnaader here.
My question is how this may be connected with the direction given in the exercise to use his Algorithm E given in the book with original r (remainder) substituted by |m-n| and n substituted by min(m,n).
When Knuth says "Let the input be represented by the string a^mb^n", what he means is that the input should consist of m copies of a followed by n copies of b. So the input f((m,n)) where m = 3 and n = 2 would be represented by the string aaabb.
Take a moment to look back at his equation 3 in that chapter, which represents a Markov algorithm. (below)
f((σ,j)) = (σ,a_j) if θ_j does not occur in σ
f((σ,j)) = (αφ_jω,b_j) if α is the shortest string for which σ = αθ_jω
f((σ,N)) = (σ,N)
So the idea is to define the sequence for each variable j, θ_j, φ_j, a_j & b_j. This way, using the above Markov's algorithm you can specify what happens to your input string, depending on the value of j.
Now, on to your main question:
My question is how this may be connected with the direction given in the exercise to use his Algorithm E given in the book with original r (remainder) substituted by |m-n| and n substituted by min(m,n).
Essentially, what Knuth is saying here is that you can't do division with the Markov algorithm above. So what's the closest thing to division? We can subtract the smaller number from the larger number repeatedly until what is left is smaller than the number being subtracted. For example:
10 % 4 = 2 is the same as doing the following:
10 - 4 = 6 Can we remove another 4? Yes. Do it again.
6 - 4 = 2 Can we remove another 4? No. We have our remainder.
This is essentially what he wants you to do with the input string, e.g. aaabb.
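Here is a rough sketch of the same idea in ordinary code, just to show that subtraction alone is enough. The function names are my own, and the second function applies the substitution from the exercise (r replaced by |m-n|, n replaced by min(m,n)) to plain integers rather than to strings:

```python
def remainder_by_subtraction(m, n):
    """Compute m % n for positive integers using only subtraction."""
    while m >= n:
        m -= n
    return m

def gcd_by_subtraction(m, n):
    """gcd of positive integers via m <- |m - n|, n <- min(m, n)."""
    while m != n:
        m, n = abs(m - n), min(m, n)
    return m

print(remainder_by_subtraction(10, 4))  # 10 -> 6 -> 2
print(gcd_by_subtraction(3, 2))         # (3, 2) -> (1, 2) -> (1, 1)
```

The loop in gcd_by_subtraction stops because the larger of the two values strictly decreases on every pass.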
If you read through Knuth's suggested answer in the back of the book and work through a couple of examples, you will see that this is essentially what he is doing by removing the pairs ab and then adding a c until no more pairs ab exist. Taking the example I used at the top, the string is manipulated as follows: aaabb, aab, caab, ca, cca, ccb, aab (then start again).
This is the same as r = m % n, m = n, n = r (then start again). The difference, of course, is that this last form uses the modulus operator, and hence division, whereas the string manipulation uses only subtraction.
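If you want to experiment, here is a small simulator for the three-case definition of f((σ,j)) above. The rule table is an assumption on my part: it is my reconstruction of a table that removes ab pairs, adds a c per pair, then turns the a's into b's and the c's into a's, so check it against the book before relying on it. The names RULES, step and markov_gcd are made up for this sketch:

```python
# j -> (theta_j, phi_j, b_j, a_j); N = 5 is the terminal state.
# ASSUMED table, reconstructed to match the ab-removal idea described above.
RULES = {
    0: ("ab", "", 1, 2),   # remove one a and one b, else go to 2
    1: ("", "c", 0, 0),    # add a c at the far left, go back to 0
    2: ("a", "b", 2, 3),   # change every remaining a into b
    3: ("c", "a", 3, 4),   # change every c into a
    4: ("b", "b", 0, 5),   # if b's remain, repeat; otherwise stop
}
N = 5

def step(sigma, j):
    """One application of f((sigma, j)) as defined in the three cases."""
    if j == N:
        return sigma, N
    theta, phi, b, a = RULES[j]
    pos = sigma.find(theta)          # leftmost occurrence = shortest alpha
    if pos < 0:
        return sigma, a              # theta_j does not occur in sigma
    return sigma[:pos] + phi + sigma[pos + len(theta):], b

def markov_gcd(m, n):
    """Run the rules on a^m b^n and return the length of the final string."""
    sigma, j = "a" * m + "b" * n, 0
    while j != N:
        sigma, j = step(sigma, j)
    return len(sigma)

print(markov_gcd(3, 2), markov_gcd(12, 18))
```

Printing sigma inside the loop is a handy way to follow the string step by step for small inputs such as aaabb.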
I hope this helps. I actually wrote a more in-depth analysis of Knuth's variation on a Markov algorithm on my blog. So if you're still feeling a little stuck have a read through the series.
I don't know if this is the right place to ask, because my question is about how to calculate the complexity of a computer science algorithm using the differential-equation growth and decay method.
The algorithm I would like to analyze is binary search on a sorted array, which has a complexity of log2(n).
The algorithm says: if the target value being searched for is equal to the middle element, return its index. If it's less, search the left sub-array; if it's greater, search the right sub-array.
As you can see, each time N(t) [the number of nodes at time t] is divided in half. Therefore, we can say that it takes O(log2(n)) to find an element.
Now, using the differential-equation growth and decay method:
dN(t)/dt = N(t)/2
dN(t): How fast the number of elements is increasing or decreasing
dt: With respect to time
N(t): Number of elements at time t
The above equation says that the number of cells is being divided by 2 with time.
Solving the above equations gives us:
dN(t)/N(t) = dt/2
ln(N(t)) = t/2 + c
t = ln(N(t))*2 + d
Even though we got t = ln(N(t)) and not log2(N(t)), we can still say that it's logarithmic.
Unfortunately, even though this method seems to make sense for binary search, it turns out not to work for all algorithms. Here's a counterexample:
Searching an array linearly: O(n)
dN(t)/dt = N(t)
dN(t)/N(t) = dt
t = ln(N(t)) + d
So according to this method, searching linearly takes O(ln(n)), which is of course NOT true.
This differential equation method is called growth and decay, and it's very popular. So I would like to know whether this method can be applied to computer science algorithms like the one I picked and, if so, what I did wrong to get an incorrect result for linear search. Thank you.
The time an algorithm takes to execute is proportional to the number of steps covered (reduced, here).
In your linear searching of the array, you have assumed that dN(t)/dt = N(t).
Incorrect assumption:
dN(t)/dt = N(t)
dN(t)/N(t) = dt
t = ln(N(t)) + d
Following your own reasoning, binary search discards half of the remaining terms on each pass, so modelling it as dN(t)/dt = N(t)/2 was fine. But when you search an array linearly, you examine one element per pass, so the number of remaining elements shrinks by one item per pass, not by an amount proportional to N(t). So how can your assumption be true?
Correct assumption:
dN(t)/dt = 1
dN(t)/1 = dt
t = N(t) + d
I hope you get my point. The array elements are accessed sequentially, one per pass (iteration). So the remaining work changes not in proportion to N(t) but by a constant 1 per step, and that is what gives the O(N) result.
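A discrete version of the same argument, as a rough sketch (the function names are made up): count how many steps it takes to exhaust N items when each step removes half of what is left, versus when each step removes exactly one item.

```python
def steps_halving(n):
    """Steps until one item is left when each step halves the remainder."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

def steps_decrementing(n):
    """Steps until nothing is left when each step removes one item."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps

for n in (16, 1024, 1_000_000):
    print(n, steps_halving(n), steps_decrementing(n))
```

The first count grows like log2(n) and the second like n, which matches the two rate assumptions (proportional to N(t) versus constant).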
We have this exercise in school, where we are to calculate the lower bound of an algorithm.
We know that the lower bound is log_6((3*n)! / (n!)^3), and we are to use Stirling's approximation to approximate n!. When applying Stirling's approximation we get:
log_6((sqrt(2*pi*3*n)*((3*n)/e)^(3*n) * e^alpha)/(sqrt(2*pi*n)*(n/e)^n * e^alpha)^3)
Now our problem is that every time we try expanding this formula with simple logarithm properties, such as log(a/b) = log(a)-log(b), log(a*b) = log(a)+log(b), log(a^b) = b*log(a) and, for the square roots, log(sqrt(a)) = log(a^(1/2)) = 1/2 * log(a), we get a result where the dominating term is something like constant * n*log(n). We know from the teacher that we have to find a linear lower bound, so this is wrong.
We have spent 2 days on this and are about to give up. Can anybody help us?
Thanks in advance!
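A numeric sanity check may help locate the slip; this is only a sketch, not a derivation. In the expansion, the 3n*log(3n) from the numerator splits into 3n*log(3) + 3n*log(n), and the 3n*log(n) part cancels against the 3 * n*log(n) from the denominator, leaving roughly 3n*log_6(3). The snippet below compares the exact value with that linear term, using math.lgamma for log(n!); the function name is made up:

```python
import math

def log6_multinomial(n):
    """log_6((3n)! / (n!)^3), computed via log-gamma to avoid huge numbers."""
    ln_value = math.lgamma(3 * n + 1) - 3 * math.lgamma(n + 1)
    return ln_value / math.log(6)

slope = 3 * math.log(3, 6)  # coefficient of the linear term
for n in (10, 100, 1000, 10000):
    print(n, log6_multinomial(n) / n, slope)
```

The ratio in the middle column approaches the slope from below as n grows, which is what a linear leading term with a lower-order logarithmic correction would look like.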
I hope someone can help me answer the following question. Thanks!
Here is the pseudocode for the Permute-By-Sorting algorithm:
Permute-By-Sorting (A)
    n = A.length
    let P[1..n] be a new array
    for i = 1 to n
        P[i] = Random (1,n^3)
    sort A, using P as sort keys
In the above algorithm, the array P holds the priorities of the elements in array A. Line 4 chooses a random number between 1 and n^3.
The question is: what is the probability that all priorities in P are unique, and how do I calculate it?
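For reference, a minimal runnable sketch of the pseudocode in Python. The function name and the optional rng parameter are my own additions, and random.randint stands in for Random(1, n^3) (both endpoints inclusive):

```python
import random

def permute_by_sorting(a, rng=random):
    """Return a permutation of a, ordered by random priorities in 1..n^3."""
    n = len(a)
    # One priority per element, drawn independently from 1..n^3.
    priorities = [rng.randint(1, n ** 3) for _ in range(n)]
    # Sort the positions by priority, then read the elements in that order.
    order = sorted(range(n), key=lambda i: priorities[i])
    return [a[i] for i in order]

print(permute_by_sorting(["a", "b", "c", "d", "e"]))
```

If two priorities collide, the sort simply keeps those two elements in their original relative order, which is exactly the case the question is asking about.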
To reconcile the answers already given: for choice i = 0, ..., n - 1, given that no duplicates have been chosen yet, there are n^3 - i non-duplicate choices of n^3 total for the ith value. Thus the probability is the product for i = 0, ..., n - 1 of (1 - i/n^3).
sdcwc is using a union bound to lowerbound this probability by 1 - O(1/n). This estimate turns out to be basically right. The proof sketch is that (1 - i/n^3) is exp(-i/n^3 + O(i^2/n^6)), so the product is exp(-O(n^2)/n^3 + O(n^-3)), which is greater than or equal to 1 - O(n^2)/n^3 + O(n^-3) = 1 - O(1/n). I'm sure the fine folks on math.SE would be happy to do this derivation "properly" for you.
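A quick exact check of the product and of the 1 - O(1/n) bound, as a sketch. The function name is made up, and math.perm needs Python 3.8 or later:

```python
from fractions import Fraction
import math

def prob_all_unique(n):
    """Product over i = 0..n-1 of (1 - i/n^3), as an exact fraction."""
    total = n ** 3
    p = Fraction(1)
    for i in range(n):
        p *= Fraction(total - i, total)
    return p

for n in (2, 5, 10, 50):
    p = prob_all_unique(n)
    # Same value written as (n^3)! / ((n^3 - n)! * (n^3)^n).
    closed_form = Fraction(math.perm(n ** 3, n), (n ** 3) ** n)
    print(n, float(p), p == closed_form, p >= 1 - Fraction(1, n))
```

For n = 2 the product is 1 * 7/8 = 7/8, and for every n shown the value stays above 1 - 1/n.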
Others have given you the probability calculation, but I think you may be asking the wrong question.
I assume the reason you're asking about the probability of the priorities being unique, and the reason for choosing n^3 in the first place, is because you're hoping they will be unique, and choosing a large range relative to n seems to be a reasonable way of achieving uniqueness.
It is much easier to ensure that the values are unique. Simply populate the array of priorities with the numbers 1 .. n and then shuffle them with the Fisher-Yates algorithm (aka algorithm P from The Art of Computer Programming, volume 2, Seminumerical Algorithms, by Donald Knuth).
The sort would then be carried out with known unique priority values.
(There are also other ways of going about getting a random permutation. It is possible to generate the nth lexicographic permutation of a sequence using factoradic numbers (or, the factorial number system), and so generate the permutation for a randomly chosen value in [1 .. n!].)
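A sketch of the Fisher-Yates approach in Python, assuming random.randint as the source of randomness; the function name is made up. In practice random.shuffle does the same job:

```python
import random

def unique_priorities(n, rng=random):
    """Return the numbers 1..n in random order (Fisher-Yates shuffle)."""
    p = list(range(1, n + 1))
    # Walk from the last index down, swapping each slot with a random
    # slot at or before it (randint includes both endpoints).
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)
        p[i], p[j] = p[j], p[i]
    return p

print(unique_priorities(10))
```

Every value appears exactly once, so sorting by these priorities never has to break a tie.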
You are choosing n numbers from 1...n^3 and asking what is the probability that they are all unique.
There are (n^3) P n = (n^3)!/(n^3-n)! ways to choose the n numbers uniquely, and (n^3)^n ways to choose the n-numbers total.
So the probability of the numbers being unique is just the first equation divided by the second, which gives
(n^3)! / ((n^3 - n)! * (n^3)^n)
Let A_ij be the event that the i-th and j-th elements collide. Obviously P(A_ij) = 1/n^3.
There are at most n^2 pairs, so by the union bound the probability of at least one collision is at most n^2 * (1/n^3) = 1/n.
If you are interested in exact thing, see BlueRaja's answer, but in randomized algorithms it is usually enough to give this type of bound.
So the sort part is irrelevant.
Assuming that "Random" is truly random, the probability is just
(n^3)! / ((n^3 - n)! * n^(3n))