Computationally efficient way to check if large number is divisible by 3 - algorithm

If
M_p = 2^p − 1 is prime ⇒
⇒ 2^p − 2 ⋮ 6 or 2^p ⋮ 6 ⇒
⇒ 2^(p−1) − 1 ⋮ 3 or 2^(p−1) ⋮ 3 ⇒
⇒ 2^n − 1 ⋮ 3 or 2^n ⋮ 3, n = p − 1
In order to pick huge values for p to test if Mp is a prime number, I believe this is a good preliminary test before going with the computationally expensive Lucas-Lehmer test.
But what is the fastest, most efficient way to test whether the two numbers, 2^n − 1 and 2^n, are divisible by 3?
Other info we can use: n always ends in 0, 2, 6 or 8 (because p = n + 1 is prime). Maybe it helps in some way.

It's easy to prove:
2^n mod 3
== 1 if n is even
== 2 if n is odd
using mathematical induction.
So 2^n is never divisible by 3, and 2^n − 1 is divisible by 3 if and only if n is even.
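Since the preliminary test therefore collapses to checking the parity of n, here is a quick empirical sanity check (my own snippet; three-argument pow keeps the numbers small even for huge n):

for n in range(1, 200):
    assert pow(2, n, 3) == (1 if n % 2 == 0 else 2)
    # 2^n - 1 is divisible by 3 exactly when n is even
    assert ((pow(2, n, 3) - 1) % 3 == 0) == (n % 2 == 0)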

Sum the digits; if the result is divisible by 3, then so is the number.
e.g.
25683 → 2+5+6+8+3 = 24 → 2+4 = 6, which is divisible by 3.
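As a rough Python sketch of that rule (my own illustration):

def divisible_by_3(digits):
    # one pass over the decimal digits; the digit sum is divisible by 3
    # exactly when the number itself is
    return sum(int(ch) for ch in digits) % 3 == 0

print(divisible_by_3("25683"))   # True: 2+5+6+8+3 = 24, and 24 % 3 == 0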

If you're dealing with bigints and the input is already in an ASCII string format of some sort, you can simply:
quickly scrub away all instances of 0, 3, 6, and 9 via regex ::gsub() (or equivalent),
measure the string length() / len() at that point,
count the number of instances of 2, 5, and 8,
sum those two counts (the length and the 2/5/8 count), then take the result mod % 3.
This works no matter how large the big-int is.
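A minimal Python rendering of that recipe (my own sketch; it works because, modulo 3, the digits 1, 4, 7 each contribute 1 and the digits 2, 5, 8 each contribute 2):

import re

def mod3_of_digit_string(s):
    s = re.sub(r"[0369]", "", s)              # scrub away 0, 3, 6, 9
    length = len(s)                           # every surviving digit contributes at least 1
    twos = sum(s.count(d) for d in "258")     # 2, 5, 8 each contribute one more
    return (length + twos) % 3

print(mod3_of_digit_string("25683"))          # 0, i.e. divisible by 3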
But yes, like others have mentioned already, no point to run it through this algorithm if you already knew that the input is an integer power of an integer base, and if the base itself isn't divisible by 3, then neither would any of its integer powers.

Related

Find only two numbers in array that evenly divide each other

Find the only two numbers in an array where one evenly divides the other - that is, where the result of the division operation is a whole number
Input Arrays    Output
5 9 2 8         8/2 = 4
9 4 7 3         9/3 = 3
3 8 6 5         6/3 = 2
The brute force approach of having nested loops has time complexity of O(n^2). Is there any better way with less time complexity?
This question is part of advent of code.
Given an array of numbers A, you can identify the denominator by multiplying all the numbers together to give E, then testing each i-th element by dividing E by A_i². If this is a whole number, you have found the denominator, as no other factors can be introduced by multiplication.
Once you have the denominator, it's a simple task to do a second, independent loop searching for the paired numerator.
This eliminates the n² comparisons.
Why does this work? First, we have a collection of n − 2 non-divisors: abcde…
To complete the array, we also have numerator x and denominator y.
However, we know that x and only x has a factor of y, so it can be expressed as x = yz (z being the whole quotient from the division of x by y).
When we multiply out all the numbers, we end up with xyabcde…, but as x = yz, we can also write it as y²zabcde…
When we loop through dividing by the squared i-th element from the array, for most of the elements we create a fraction, e.g. for a:
y²zabcde… / a² = y²zbcde… / a
However, for y and y only:
y²zabcde… / y² = zabcde…
Why doesn't this work? The same is true of the other numbers. There's no guarantee that a and b can't produce another common factor when multiplied. Take the example of [9, 8, 6, 4]: 9 and 8 multiplied equals 72, and since they jointly contain the prime factors 2 and 3, 72 has a factor of 6, which is also in the array. When we multiply it all out to 1728, those factors combine with the original 6, so the product divides evenly by 36.
How might this be fixed? More accurately, if y is a factor of x, then y's prime factors will uniquely be a subset of x's prime factors, so maybe things can be refined along those lines. Obtaining a prime factorization should not scale according to the size of the array, but comparing subsets would, so it's not clear to me if this is at all useful.
I think that O(n^2) is the best time complexity you can get without any assumptions on the data.
If you can't tell anything about the numbers, knowing that x and y do not divide each other tells you nothing about x and z or y and z for any x, y, z. Therefore, in the worst case you must check all pairs of numbers - equal to n Choose 2 = n*(n-1)/2 = O(n^2).
Clearly, we can get O(n * sqrt(m)), where m is the absolute value range, by listing the pairs of divisors of each element against a hash of unique values in the array. This can be more efficient than O(n^2) depending on the input.
5 9 2 8
list divisor pairs (at most sqrt m iterations per element m)
5 (1,5)
9 (1,9), (3,3)
2 (1,2)
8 (1,8), (2,4) BINGO!
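A sketch of that idea in Python (my own; function and variable names are illustrative), assuming positive values and, as in the Advent of Code input, exactly one valid pair:

def find_divisible_pair(arr):
    values = set(arr)
    for x in arr:
        d = 1
        while d * d <= x:                    # at most sqrt(x) iterations per element
            if x % d == 0:
                for y in (d, x // d):        # the divisor pair (d, x // d)
                    if y != x and y in values:
                        return x, y          # x / y is a whole number
            d += 1
    return None

print(find_divisible_pair([5, 9, 2, 8]))     # (8, 2)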
If you prime-factorise all the numbers in the array progressively into a shared tree, then when you discover a completely factored number (a leaf) while factoring another number, you know you've found the divisor.
However, given we don't know which number is the divisor, we need to trial-divide every number. To fully factor a value m by trial division you only need to test primes up to sqrt(m), and the number of primes below a bound x is roughly x / ln(x), so we will make at most about n · sqrt(m) / ln(sqrt(m)) operations with very basic factorization and no optimization.
To be a little more specific, the algorithm should keep track of four things: a common tree of explored prime factors, the original number from the array, its current partial factorization, and its position in the tree.
For each prime number, we should test all numbers in the array (repeatedly to account for repeated factors). If the number divides evenly, we a) update the partial factorization, b) add/navigate to the corresponding child to the tree, c) if the partial factorization is 1, we have found the last factor and can indicate a leaf by adding the terminating '1' child, and d) if not, we can check for other numbers having left a child '1' to indicate they are completely factored.
When we find a child '1', we can identify the other number by multiplying out the partial factorization (e.g. all the parents up the tree) and exit.
For further optimization, we can cache the factorization (both partial and full) of numbers. We can also stop checking further factors of numbers that have a unique factor, narrowing the field of candidates over time.

How many numbers have a maximum number of unique prime factors in a given range

Note that the divisors have to be unique
So 32 has 1 unique prime factor [2], 40 has [2, 5] and so on.
Given a range [a, b], a, b <= 2^31, we should find how many numbers in this range have the maximum number of unique prime divisors.
The best algorithm I can imagine is an improved Sieve of Eratosthenes, with an array counting how many prime factors a number has. But it is not only O(n), which is unacceptable with such a range, but also very inefficient in terms of memory.
What is the best algorithm to solve this question? Is there such an algorithm?
I'll write a first idea in Python-like pseudocode. First find out how many prime factors you may need at most:
p = 1
i = 0
while primes[i] * p <= b:
    p = p * primes[i]
    i = i + 1
This only used b, not a, so you may have to decrease the number of actual prime factors. But since the result of the above is at most 9 (as the product of the first 10 primes already exceeds 2^31), you can conceivably go down from this maximum one step at a time:
cnt = 0
while cnt == 0:
    cnt = count(i, 1, 0)
    i = i - 1
return cnt
So now we need to implement this function count, which I define recursively.
def count(numFactorsToGo, productSoFar, nextPrimeIndex):
    if numFactorsToGo > 0:
        cnt = 0
        while productSoFar * primes[nextPrimeIndex] <= b:
            cnt = cnt + count(numFactorsToGo - 1,
                              productSoFar * primes[nextPrimeIndex],
                              nextPrimeIndex + 1)
            nextPrimeIndex = nextPrimeIndex + 1
        return cnt
    else:
        return floor(b / productSoFar) - ceil(a / productSoFar) + 1
This function has two cases to distinguish. In the first case, you don't have the desired number of prime factors yet. So you multiply in another prime, which has to be larger than the largest prime already included in the product so far. You achieve this by starting at the given index for the next prime. You add the counts for all these recursive calls.
The second case is where you have reached the desired number of prime factors. In this case, you want to count all possible integers k such that a ≤ k∙p ≤ b. Which translates easily into ⌈a/p⌉ ≤ k ≤ ⌊b/p⌋ so the count would be ⌊b/p⌋ − ⌈a/p⌉ + 1. In an actual implementation I'd not use floating-point division and floor or ceil, but instead I'd make use of truncating integer division for the sake of performance. So I'd probably write this line as
return (b // productSoFar) - ((a - 1) // productSoFar + 1) + 1
As it is written now, you'd need the primes array precomputed up to 2^31, which would be a list of 105,097,565 numbers according to Wolfram Alpha. That will cause considerable memory requirements, and will also make the outer loops (where productSoFar is still small) iterate over a large number of primes which won't be needed later on.
One thing you can do is change the end-of-loop condition. Instead of just checking that adding one more prime doesn't make the product exceed b, you can check whether including the next numFactorsToGo primes in the product is possible without exceeding b. This will allow you to end the loop a lot earlier if the total number of prime factors is large.
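Putting the pieces together, here is a hedged, runnable sketch (my own translation of the pseudocode above). It naively sieves every prime up to b, which is only workable for modest b; as noted above, for b near 2^31 you would want the smarter loop bound instead:

def primes_up_to(limit):
    # plain Sieve of Eratosthenes; fine for modest limits only
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = b"\x00" * len(range(p * p, limit + 1, p))
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def count_with_max_distinct_primes(a, b):
    primes = primes_up_to(b)

    # maximum possible number of distinct prime factors for any n <= b
    p, i = 1, 0
    while i < len(primes) and primes[i] * p <= b:
        p *= primes[i]
        i += 1

    def count(numFactorsToGo, productSoFar, nextPrimeIndex):
        if numFactorsToGo == 0:
            # multiples of productSoFar that lie inside [a, b]
            return b // productSoFar - (a - 1) // productSoFar
        cnt = 0
        while (nextPrimeIndex < len(primes)
               and productSoFar * primes[nextPrimeIndex] <= b):
            cnt += count(numFactorsToGo - 1,
                         productSoFar * primes[nextPrimeIndex],
                         nextPrimeIndex + 1)
            nextPrimeIndex += 1
        return cnt

    cnt = 0
    while cnt == 0:
        cnt = count(i, 1, 0)
        i -= 1
    return cnt

print(count_with_max_distinct_primes(1, 100))   # 8: 30, 42, 60, 66, 70, 78, 84, 90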
For a small number of prime factors, things are still tricky. In particular if you have a very narrow range [a, b] then the number with maximal prime factor count might well be a large prime factor times a product of very small primes. Consider for example [2147482781, 2147482793]. This interval contains 4 elements with 4 distinct factors, some of which contain quite large prime factors, namely
3 ∙ 5 ∙ 7 ∙ 20,452,217
2² ∙ 3 ∙ 11 ∙ 16,268,809
2 ∙ 5 ∙ 19 ∙ 11,302,541
2³ ∙ 7 ∙ 13 ∙ 2,949,839
There are only 4,792 primes up to sqrt(2^31), the largest being 46,337 (which fits into a 16-bit unsigned integer). It would be possible to precompute only those, and use them to factor each number in the range. But that would again mean iterating over the range, which makes sense for small ranges but not for large ones.
So perhaps you need to distinguish these cases up front, and then choose the algorithm accordingly. I don't have a good idea of how to combine these ideas – yet. If someone else does, feel free to extend this post or write your own answer building on this.

Number of Fibonacci numbers smaller than number k. Sub O(n)

Interview question: how many Fibonacci numbers exist below a given number k? Can you find a function in terms of k that gives the number of Fibonacci numbers less than k?
Example : n = 6
Answer: 6 as (0, 1, 1, 2, 3, 5)
Easy enough, write a loop or use the recursive definition of Fibonacci. However, that sounds too easy... is there a way to do this using the closed-form definition? (https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression)
Here is a closed-form Python solution which is O(1). It uses Binet's formula (from the Wikipedia article that you linked to):
>>> from math import sqrt,log
>>> def numFibs(n): return int(log(sqrt(5)*n)/log((1+sqrt(5))/2))
>>> numFibs(10)
6
Which tracks with 1,1,2,3,5,8
The point is that the second term in Binet's formula is negligible and it is easy enough to invert the result of neglecting it.
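Spelling out that inversion (my own step, in the same notation): dropping the negligible ψ^n term in Binet's formula and asking which indices k give a Fibonacci number of at most n,

φ^k / √5 ≤ n
φ^k ≤ √5·n
k ≤ log_φ(√5·n) = log(√5·n) / log(φ)

which is exactly the expression inside int(...) in numFibs above.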
The above formula counts the number of Fibonacci numbers which are less than or equal to n. It jumps by 1 with each new Fibonacci number. So, for example, numFibs(12) = 6 and numFibs(13) = 7. 13 is the 7th Fibonacci number, so if you want the number of Fibonacci numbers which are strictly smaller than n, you have to introduce a lag. Something like:
def smallerFibs(n):
    if n <= 1:
        return 0
    else:
        return min(numFibs(n-1), numFibs(n))
Now smallerFibs(13) is still 6 but then smallerFibs(14) = 7. This is of course still O(1).
I think it's fairly easy to see the growth of this number, at least. By the Binet / De-Moivre formula,
f_n = (φ^n − ψ^n) / √5
Since |ψ| < 1 < φ, then
f_n ∼ φ^n / √5.
From this it follows that the number of Fibonacci numbers smaller than x grows like log_φ(√5·x).

Generating a stateless, pseudo-random permutation of integers from 0 to n?

Question spawned from this one. The problem can be formulated as follows:
Given two positive integers n and m, with m <= n, is there a way to find a sequence of numbers which cycles through and covers all possible values from 0 to n?
As a basic example, take n = 3: for any current value between 0 and 3, we can compute the next value as:
next = (current+3) % 4
This will cycle. For instance: 1 -> 0 -> 3 -> 2 -> 1, etc. I found this solution by "chance", and it even generalizes ((i + n) % (n + 1) for any n), but I cannot prove it mathematically. And it is a little too obvious.
Are there better ways to generate such a permutation?
I'm not sure what you intend m in the question to refer to, or how you're defining "a sequence of numbers". However, one way of getting a cycle of numbers is to use a recursion (or iteration) of the form:
next = f(current)
for some function f. For example, linear congruential RNGs use the iteration:
x = ( a · x + c ) mod m where 0 < a, c < m
They don't always produce all values from 0 to m-1, but under certain circumstances they do:
c and m are relatively prime
a - 1 is divisible by every prime factor of m
if m is divisible by 4, a - 1 is divisible by 4.
(This is the Hull-Dobell theorem.)
Note that a = c = 1 satisfies the above criteria for any m. Furthermore, if m is a power of 2, the criteria are satisfied by any a, c such that a ≡ 1 (mod 4) and c is odd. However, for certain values of m (e.g. any prime m, or m = 6), requiring a − 1 to be divisible by every prime factor of m forces a = 1.
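As a small illustration (my own sketch, not from the original answer), here is a full-cycle LCG that checks the Hull-Dobell conditions before stepping through all of 0..m-1 in a scrambled order:

from math import gcd

def prime_factors(m):
    p, out = 2, set()
    while p * p <= m:
        while m % p == 0:
            out.add(p)
            m //= p
        p += 1
    if m > 1:
        out.add(m)
    return out

def full_cycle(m, a, c, start=0):
    # the Hull-Dobell conditions guarantee a full period of m
    assert gcd(c, m) == 1
    assert all((a - 1) % p == 0 for p in prime_factors(m))
    assert m % 4 != 0 or (a - 1) % 4 == 0
    x = start % m
    for _ in range(m):
        yield x
        x = (a * x + c) % m

print(list(full_cycle(16, a=5, c=3)))
# [0, 3, 2, 13, 4, 7, 6, 1, 8, 11, 10, 5, 12, 15, 14, 9] -- every value exactly once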
This might not qualify as "stateless", but I don't think that there is any strictly stateless solution; for example, you might look for some function f such that:
f(0), f(1),... f(m-1)
is a permutation of
0, 1, ..., m-1
so that you could generate the cycle by calling f(i) for successive values of i. But that's still state, since you have to remember the last value of i you used.
Incrementing each subsequent number by any step that does not share a common prime divisor with (n - m + 1) will cover the whole sequence (e.g. for the sequence [2-11] (10 numbers), incrementing by 3, 7, or 9 would work, but 2, 4, 5, 6, and 8 would not, because they share a common divisor with 10 (2 and/or 5)).
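For instance (my own one-liner), stepping by 3 through the 10 offsets 0..9:

>>> [(3 * i) % 10 for i in range(10)]
[0, 3, 6, 9, 2, 5, 8, 1, 4, 7]

Every offset appears exactly once because gcd(3, 10) = 1, whereas a step of 4 would only ever visit 0, 4, 8, 2, 6 before repeating.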
EDIT
I took out the shuffling idea since it seems that you want to increment by the same number each time. If you want a truly "random" sequence that has m at the first element just take m out and place it at the beginning. I'm not sure how that helps you, though.

What would cause an algorithm to have O(log n) complexity?

My knowledge of big-O is limited, and when log terms show up in the equation it throws me off even more.
Can someone maybe explain to me in simple terms what a O(log n) algorithm is? Where does the logarithm come from?
This specifically came up when I was trying to solve this midterm practice question:
Let X(1..n) and Y(1..n) contain two lists of integers, each sorted in nondecreasing order. Give an O(log n)-time algorithm to find the median (that is, the nth smallest integer) of all 2n combined elements. For example, if X = (4, 5, 7, 8, 9) and Y = (3, 5, 8, 9, 10), then 7 is the median of the combined list (3, 4, 5, 5, 7, 8, 8, 9, 9, 10). [Hint: use concepts of binary search]
I have to agree that it's pretty weird the first time you see an O(log n) algorithm... where on earth does that logarithm come from? However, it turns out that there are several different ways that you can get a log term to show up in big-O notation. Here are a few:
Repeatedly dividing by a constant
Take any number n; say, 16. How many times can you divide n by two before you get a number less than or equal to one? For 16, we have that
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
Notice that this ends up taking four steps to complete. Interestingly, we also have that log₂ 16 = 4. Hmmm... what about 128?
128 / 2 = 64
64 / 2 = 32
32 / 2 = 16
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
This took seven steps, and log₂ 128 = 7. Is this a coincidence? Nope! There's a good reason for this. Suppose that we divide a number n by 2 i times. Then we get the number n / 2^i. If we want to solve for the value of i where this value is at most 1, we get
n / 2^i ≤ 1
n ≤ 2^i
log₂ n ≤ i
In other words, if we pick an integer i such that i ≥ log₂ n, then after dividing n in half i times we'll have a value that is at most 1. The smallest i for which this is guaranteed is roughly log₂ n, so if we have an algorithm that divides by 2 until the number gets sufficiently small, then we can say that it terminates in O(log n) steps.
An important detail is that it doesn't matter what constant you're dividing n by (as long as it's greater than one); if you divide by the constant k, it will take log_k n steps to reach 1. Thus any algorithm that repeatedly divides the input size by some fraction will need O(log n) iterations to terminate. Those iterations might take a lot of time and so the net runtime needn't be O(log n), but the number of steps will be logarithmic.
So where does this come up? One classic example is binary search, a fast algorithm for searching a sorted array for a value. The algorithm works like this:
If the array is empty, return that the element isn't present in the array.
Otherwise:
Look at the middle element of the array.
If it's equal to the element we're looking for, return success.
If it's greater than the element we're looking for:
Throw away the second half of the array.
Repeat
If it's less than the element we're looking for:
Throw away the first half of the array.
Repeat
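A minimal sketch of the above in Python (my own code, written iteratively, which is equivalent to repeatedly "throwing away" half the array):

def binary_search(arr, target):
    lo, hi = 0, len(arr)              # search the half-open range [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return True
        elif arr[mid] > target:       # target can only be in the left half
            hi = mid
        else:                         # target can only be in the right half
            lo = mid + 1
    return False

print(binary_search([1, 3, 5, 7, 9, 11, 13], 5))   # True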
For example, to search for 5 in the array
1 3 5 7 9 11 13
We'd first look at the middle element:
1 3 5 7 9 11 13
^
Since 7 > 5, and since the array is sorted, we know for a fact that the number 5 can't be in the back half of the array, so we can just discard it. This leaves
1 3 5
So now we look at the middle element here:
1 3 5
^
Since 3 < 5, we know that 5 can't appear in the first half of the array, so we can throw away the first half of the array to leave
5
Again we look at the middle of this array:
5
^
Since this is exactly the number we're looking for, we can report that 5 is indeed in the array.
So how efficient is this? Well, on each iteration we're throwing away at least half of the remaining array elements. The algorithm stops as soon as the array is empty or we find the value we want. In the worst case, the element isn't there, so we keep halving the size of the array until we run out of elements. How long does this take? Well, since we keep cutting the array in half over and over again, we will be done in at most O(log n) iterations, since we can't cut the array in half more than O(log n) times before we run out of array elements.
Algorithms following the general technique of divide-and-conquer (cutting the problem into pieces, solving those pieces, then putting the problem back together) tend to have logarithmic terms in them for this same reason - you can't keep cutting some object in half more than O(log n) times. You might want to look at merge sort as a great example of this.
Processing values one digit at a time
How many digits are in the base-10 number n? Well, if n has k digits, then n is at least the smallest k-digit number, 10^(k-1), and at most the largest k-digit number, 999...9 (k nines), which equals 10^k - 1. If we want to solve for k in terms of n, we get
10^(k-1) ≤ n < 10^k
k - 1 ≤ log₁₀ n < k
k = floor(log₁₀ n) + 1
From which we get that k is approximately the base-10 logarithm of n. In other words, the number of digits in n is O(log n).
For example, let's think about the complexity of adding two large numbers that are too big to fit into a machine word. Suppose that we have those numbers represented in base 10, and we'll call the numbers m and n. One way to add them is through the grade-school method - write the numbers out one digit at a time, then work from the right to the left. For example, to add 1337 and 2065, we'd start by writing the numbers out as
1 3 3 7
+ 2 0 6 5
==============
We add the last digit and carry the 1:
1
1 3 3 7
+ 2 0 6 5
==============
2
Then we add the second-to-last ("penultimate") digit and carry the 1:
1 1
1 3 3 7
+ 2 0 6 5
==============
0 2
Next, we add the third-to-last ("antepenultimate") digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
4 0 2
Finally, we add the fourth-to-last ("preantepenultimate"... I love English) digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
3 4 0 2
Now, how much work did we do? We do a total of O(1) work per digit (that is, a constant amount of work), and there are O(max{log n, log m}) total digits that need to be processed. This gives a total of O(max{log n, log m}) complexity, because we need to visit each digit in the two numbers.
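A compact sketch of that grade-school addition (my own code, working on decimal strings so the numbers can be arbitrarily large):

def add_decimal_strings(m, n):
    i, j = len(m) - 1, len(n) - 1
    carry, out = 0, []
    while i >= 0 or j >= 0 or carry:
        d = carry
        if i >= 0:
            d += int(m[i])
            i -= 1
        if j >= 0:
            d += int(n[j])
            j -= 1
        out.append(str(d % 10))       # write one output digit...
        carry = d // 10               # ...and carry the rest
    return "".join(reversed(out))

print(add_decimal_strings("1337", "2065"))   # 3402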
Many algorithms get an O(log n) term in them from working one digit at a time in some base. A classic example is radix sort, which sorts integers one digit at a time. There are many flavors of radix sort, but they usually run in time O(n log U), where U is the largest possible integer that's being sorted. The reason for this is that each pass of the sort takes O(n) time, and there are a total of O(log U) iterations required to process each of the O(log U) digits of the largest number being sorted. Many advanced algorithms, such as Gabow's shortest-paths algorithm or the scaling version of the Ford-Fulkerson max-flow algorithm, have a log term in their complexity because they work one digit at a time.
As to your second question about how you solve that problem, you may want to look at this related question which explores a more advanced application. Given the general structure of problems that are described here, you now can have a better sense of how to think about problems when you know there's a log term in the result, so I would advise against looking at the answer until you've given it some thought.
When we talk about big-Oh descriptions, we are usually talking about the time it takes to solve problems of a given size. And usually, for simple problems, that size is just characterized by the number of input elements, and that's usually called n, or N. (Obviously that's not always true-- problems with graphs are often characterized in numbers of vertices, V, and number of edges, E; but for now, we'll talk about lists of objects, with N objects in the lists.)
We say that a problem "is big-Oh of (some function of N)" if and only if:
There is some constant c and some threshold N_0 such that, for all N > N_0, the runtime of the algorithm is less than c times (some function of N).
In other words, don't think about small problems where the "constant overhead" of setting up the problem matters, think about big problems. And when thinking about big problems, big-Oh of (some function of N) means that the run-time is still always less than some constant times that function. Always.
In short, that function is an upper bound, up to a constant factor.
So, "big-Oh of log(n)" means the same thing that I said above, except "some function of N" is replaced with "log(n)."
So, your problem tells you to think about binary search, so let's think about that. Let's assume you have, say, a list of N elements that are sorted in increasing order. You want to find out if some given number exists in that list. One way to do that which is not a binary search is to just scan each element of the list and see if it's your target number. You might get lucky and find it on the first try. But in the worst case, you'll check N different times. This is not binary search, and it is not big-Oh of log(N) because there's no way to force it into the criteria we sketched out above.
You can pick that arbitrary constant to be c=10, and if your list has N=32 elements, you're fine: 10*log(32) = 50, which is greater than the runtime of 32. But if N=64, 10*log(64) = 60, which is less than the runtime of 64. You can pick c=100, or 1000, or a gazillion, and you'll still be able to find some N that violates that requirement. In other words, there is no N_0.
If we do a binary search, though, we pick the middle element, and make a comparison. Then we throw out half the numbers, and do it again, and again, and so on. If your N=32, you can only do that about 5 times, which is log(32). If your N=64, you can only do this about 6 times, etc. Now you can pick that arbitrary constant c, in such a way that the requirement is always met for large values of N.
With all that background, what O(log(N)) usually means is that you have some way to do a simple thing, which cuts your problem size in half. Just like the binary search is doing above. Once you cut the problem in half, you can cut it in half again, and again, and again. But, critically, what you can't do is some preprocessing step that would take longer than that O(log(N)) time. So for instance, you can't shuffle your two lists into one big list, unless you can find a way to do that in O(log(N)) time, too.
(NOTE: Nearly always, Log(N) means log-base-two, which is what I assume above.)
In the following solution, all the lines with a recursive call operate on half of the given sizes of the sub-arrays of X and Y.
The other lines run in constant time.
The recurrence is T(2n) = T(2n/2) + c = T(n) + c, which solves to O(lg(2n)) = O(lg n).
You start with MEDIAN(X, 1, n, Y, 1, n).
MEDIAN(X, p, r, Y, i, k)
    if X[r] < Y[i]
        return X[r]
    if Y[k] < X[p]
        return Y[k]
    q = floor((p+r)/2)
    j = floor((i+k)/2)
    if r-p+1 is even
        if X[q+1] > Y[j] and Y[j+1] > X[q]
            if X[q] > Y[j]
                return X[q]
            else
                return Y[j]
        if X[q+1] < Y[j-1]
            return MEDIAN(X, q+1, r, Y, i, j)
        else
            return MEDIAN(X, p, q, Y, j+1, k)
    else
        if X[q] > Y[j] and Y[j+1] > X[q-1]
            return Y[j]
        if Y[j] > X[q] and X[q+1] > Y[j-1]
            return X[q]
        if X[q+1] < Y[j-1]
            return MEDIAN(X, q, r, Y, i, j)
        else
            return MEDIAN(X, p, q, Y, j, k)
The Log term pops up very often in algorithm complexity analysis. Here are some explanations:
1. How do you represent a number?
Let's take the number X = 245436. The notation “245436” has implicit information in it. Making that information explicit:
X = 2 * 10 ^ 5 + 4 * 10 ^ 4 + 5 * 10 ^ 3 + 4 * 10 ^ 2 + 3 * 10 ^ 1 + 6 * 10 ^ 0
Which is the decimal expansion of the number. So, the minimum amount of information we need to represent this number is 6 digits. This is no coincidence, as any number less than 10^d can be represented in d digits.
So how many digits are required to represent X? That's equal to the largest exponent of 10 in X, plus 1.
==> 10^d > X
==> log(10^d) > log(X)
==> d * log(10) > log(X)
==> d > log(X) // And log appears again...
==> d = floor(log(X)) + 1
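A quick check of that formula against the example (my own snippet):
>>> from math import log10, floor
>>> X = 245436
>>> floor(log10(X)) + 1
6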
Also note that this is the most concise way to denote the number in this range. Any reduction will lead to information loss, as a missing digit can be mapped to 10 other numbers. For example: 12* can be mapped to 120, 121, 122, …, 129.
2. How do you search for a number in (0, N - 1)?
Taking N = 10^d, we use our most important observation:
The minimum amount of information to uniquely identify a value in a range between 0 to N - 1 = log(N) digits.
This implies that, when asked to search for a number on the integer line, ranging from 0 to N - 1, we need at least log(N) tries to find it. Why? Any search algorithm will need to choose one digit after another in its search for the number.
The minimum number of digits it needs to choose is log(N). Hence the minimum number of operations taken to search for a number in a space of size N is log(N).
Can you guess the order complexities of binary search, ternary search or deca search? It's O(log(N))!
3. How do you sort a set of numbers?
When asked to sort a set of numbers A into an array B, here’s what it looks like ->
Permute Elements
Every element in the original array has to be mapped to its corresponding index in the sorted array. So, for the first element, we have n positions. To correctly find the corresponding index in this range from 0 to n - 1, we need... log(n) operations.
The next element needs log(n-1) operations, the next log(n-2) and so on. The total comes to be:
==> log(n) + log(n - 1) + log(n - 2) + … + log(1)
Using log(a) + log(b) = log(a * b), this becomes
==> log(n!)
This can be approximated (via Stirling's approximation) to n*log(n) - n, which is O(n*log(n))!
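A quick numeric check of that approximation (my own snippet, using natural logs):
>>> from math import lgamma, log
>>> n = 1000
>>> round(lgamma(n + 1)), round(n * log(n) - n)   # log(n!) vs n*log(n) - n
(5912, 5908)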
Hence we conclude that no comparison-based sorting algorithm can do better than O(n*log(n)). And some algorithms having this complexity are the popular Merge Sort and Heap Sort!
These are some of the reasons why we see log(n) pop up so often in the complexity analysis of algorithms. The same can be extended to binary numbers. I made a video on that here.
Cheers!
We call the time complexity O(log n) when the solution iterates towards the answer and the work remaining at each iteration is a constant fraction of the work at the previous iteration.
Can't comment yet... necro it is!
Avi Cohen's answer is incorrect, try:
X = 1 3 4 5 8
Y = 2 5 6 7 9
None of the conditions are true, so MEDIAN(X, p, q, Y, j, k) will cut both the fives. These are nondecreasing sequences, not all values are distinct.
Also try this even-length example with distinct values:
X = 1 3 4 7
Y = 2 5 6 8
Now MEDIAN(X, p, q, Y, j+1, k) will cut the four.
Instead I offer this algorithm, call it with MEDIAN(1,n,1,n):
MEDIAN(startx, endx, starty, endy){
    if (startx == endx)
        return min(X[startx], Y[starty])
    odd = (startx + endx) % 2     //0 if even, 1 if odd
    m = (startx + endx - odd) / 2
    n = (starty + endy - odd) / 2
    x = X[m]
    y = Y[n]
    if x == y
        //then there are n-2{+1} total elements smaller than or equal to both x and y
        //so this value is the nth smallest
        //we have found the median.
        return x
    if (x < y)
        //if we remove some numbers smaller than the median,
        //and remove the same amount of numbers bigger than the median,
        //the median will not change.
        //we know the elements before x are smaller than the median,
        //and the elements after y are bigger than the median,
        //so we discard these and continue the search:
        return MEDIAN(m, endx, starty, n + 1 - odd)
    else //(x > y)
        return MEDIAN(startx, m + 1 - odd, n, endy)
}
