Finding the next larger anagram of a given number

What would be an efficient algorithm to find the next larger anagram of a given number?
Examples:
input: 7813 -> output: 7831
input: 3791 -> output: 3917
input: 4321 -> output: (None)

Basically, you scan the number from right to left and stop at the first digit that is smaller than the digit to its right. Compare that digit with all of the digits you have already passed, replace it with the smallest of those that is larger than it, and then sort the rest in increasing order. For example:
take 978654321.
1) move right to left until you get to a decreasing digit:
Stop at the 7, because it is the first digit that is smaller than the digit to its right.
2) find the next largest digit that we have already seen:
out of 1, 2, 3, 4, 5, 6 and 8, the smallest digit larger than 7 is 8, so the 8 takes the 7's place.
3) sort the remaining digits in increasing order and append that to the end.
1234567
Which produces 981234567
complexity:
n is the number of digits.
Step 1) O(n) because in the worst case the numbers increase (or stay the same) until the last digit.
Step 2) O(n) because in the worst case you have to compare that number against all of the n digits.
Step 3) O(n lg n), because you have to sort, and a comparison sort takes O(n lg n). (In fact the remaining digits are already in decreasing order at this point, so reversing them in O(n) would also do.)
So this algorithm runs in O(n lg n). again where n is the number of digits in the number.
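For concreteness, here is a minimal Python sketch of these three steps (the function name and the string-based digit handling are my own choices, not from the answer above):

def next_larger_anagram(num):
    """Return the next larger number with the same digits, or None."""
    digits = list(str(num))
    # Step 1: scan right to left for the first digit smaller than its right neighbor.
    i = len(digits) - 2
    while i >= 0 and digits[i] >= digits[i + 1]:
        i -= 1
    if i < 0:
        return None  # digits are non-increasing: no larger anagram exists
    # Step 2: among the digits to the right, take the smallest one larger than digits[i].
    j = min((k for k in range(i + 1, len(digits)) if digits[k] > digits[i]),
            key=lambda k: digits[k])
    digits[i], digits[j] = digits[j], digits[i]
    # Step 3: sort the suffix in increasing order.
    digits[i + 1:] = sorted(digits[i + 1:])
    return int("".join(digits))

assert next_larger_anagram(7813) == 7831
assert next_larger_anagram(3791) == 3917
assert next_larger_anagram(4321) is None
assert next_larger_anagram(978654321) == 981234567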

Related

Largest Number With Non-Decreasing Digits

What is the largest positive integer (call it k) less than or equal to N such that all digits of k are in non-decreasing order?
Constraints:
1 <= N <= 10^18
1 <= k <= N
Time Limit: 8 sec
One possible solution is to check all values starting from N-1 (i.e., N-1, N-2, N-3, ...) until we find a number with non-decreasing digits.
But this fits in the given time limit only for roughly N <= 10^10; it exceeds the limit for the given constraint N <= 10^18.
A simple greedy approach is to scan the number from the right: if you find a digit that is less than the digit on its left, decrease the digit on the left by 1 and replace all digits from the current position to the rightmost digit with 9.
Ex:
132 -> 1 3 2
2 < 3, so replace the 2 with 9 and decrease the 3 by 1, giving 129.
You can do this because the resulting number is guaranteed to be smaller than the original. We also want to maximize the number, so we replace the tail digits with the largest possible digit, 9, and decrease the left digit by the smallest possible amount, 1.
Repeat this process for the digits further to the left until you reach a valid number. And yes, check the corner cases (leading zeros).
for number 1332
1332 -> 1329 -> 1299 (valid number)
so the answer will be 1299.
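A minimal Python sketch of this greedy, assuming the single right-to-left pass described above (the function name is mine):

def largest_non_decreasing(n):
    # Largest k <= n whose decimal digits are in non-decreasing order.
    d = list(map(int, str(n)))
    # Whenever a digit is smaller than its left neighbor, decrement the
    # neighbor and turn every digit from here to the end into 9.
    for i in range(len(d) - 1, 0, -1):
        if d[i] < d[i - 1]:
            d[i - 1] -= 1
            for j in range(i, len(d)):
                d[j] = 9
    # int() strips any leading zero (e.g. 10 -> "09" -> 9).
    return int("".join(map(str, d)))

assert largest_non_decreasing(1332) == 1299
assert largest_non_decreasing(132) == 129
assert largest_non_decreasing(10) == 9

With at most 19 digits for N <= 10^18, the quadratic inner loop is irrelevant in practice.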

Linear solution of grid constraints

You have a grid n x n with n rows and n columns. For every column j you are given a number Cj and for every row i you are given a number Ri.
You need to mark some points on the grid, in this way:
the number of marked points in every row is at most Ri;
the number of marked points in every column is at most Cj;
you mark the maximum number of points that satisfy the previous two constraints and return this number of points.
The input is: n (the dimension of the grid), the sequence of Ri, and the sequence of Cj.
For example, in the grid shown in the original post the answer is 34. [example grid image]
Find an algorithm in near-linear time, O(n) or O(n log n), with a demonstration of correctness.
I have found a solution with a max-flow algorithm, but its complexity is too high.
Hints
I suggest iterating over the rows in order from greatest Ri to smallest.
Keep track of how many spaces we have for each column. The number of spaces starts at the given Cj values.
For each row, mark as many points in the grid as are allowed based on the current number of spaces in the columns. Make sure to place points in the columns with the greatest number of spaces first.
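A direct (not yet linear-time) Python sketch of this greedy, to make the intended behavior concrete; this heap-based version is O(n^2 log n), and the exercise is to speed the same idea up (the function name is mine):

import heapq

def max_points(n, R, C):
    # n == len(R) == len(C); kept to mirror the problem's input format.
    heap = [-c for c in C]             # max-heap of remaining column spaces
    heapq.heapify(heap)
    total = 0
    for r in sorted(R, reverse=True):  # rows with the greatest Ri first
        popped = []
        while r > 0 and heap and heap[0] < 0:
            space = -heapq.heappop(heap)  # column with the most space left
            popped.append(space - 1)      # mark one point there
            r -= 1
            total += 1
        for s in popped:
            heapq.heappush(heap, -s)
    return total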
@NiklasB.: with an augmented self-balancing binary search tree you can decrement an interval by 1 in O(log n), but how can you count the elements that are not zero in less than O(n) time? For example, with input n = 4, R = 3 3 2 2, C = 3 3 2 2, the column spaces evolve as 3322 -> 2212 -> 1111 -> 0011 -> 0000. You take 3 (the first element of the row indicator) and decrement the column interval [0, 2] by 1, which costs O(log n), and the column indicators are all still > 0. But when you arrive at 0011 you must notice that two of them are zero: if the next row value is 3, you may subtract only 2, otherwise some counter would go negative. If you know the number of zeros, you can restrict the decrement to an interval of n - (number of zeros) columns. But how can you maintain that count in less than O(n) time?

Recreate the Sequence

I came across the following problem, which has been on my mind ever since:
Alice has written N consecutive positive integers on a blackboard, e.g. "99, 100, 101, 102". Bob has erased all but one digit from each number, so the sequence now reads e.g. "9, 0, 0, 1". Notice that the digit he leaves can be a different one for every integer.
Our task is, in O(N log N) time complexity, to find the smallest number that may have started the sequence. In the above example the answer would be 99. For the length-7 sequence "1, 4, 0, 5, 4, 1, 4", the answer would be 1042 (which yields the sequence 1042, 1043, 1044, 1045, 1046, 1047, 1048).
I can show an upper bound of around 1234567890*N, so the output size is bounded. However, I haven't been able to find even an efficient O(N^2) solution.
Any ideas?
UPDATE: For those interested, this problem appeared in the Baltic Olympiad in Informatics (BOI) 2014 (it's the task "Sequence"). Thanks to Codeforces user Fdg, here's an O(N log N) solution: Try every possible last digit of the starting value. Partition contiguous array elements into groups that have the digits 0 to 9 at the end (this can be inferred from the starting value's last digit). We know that all values in the same group have the same prefix after we remove their last digit. Let's eliminate all those values where the digit in the input matches the last digit that it should have according to its position.
Now we have a slightly different subproblem: For every group of 10, we know a set of digits that appear in its prefix. We collapse this into a single array element. This generalized problem only has one tenth the size of the original problem and can be solved using the same algorithm, recursively.
We thus get the recurrence T(N) = 10 * T(N / 10) + O(N), which we solve as T(N) = O(N log N) using the master theorem.
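Unrolling the recurrence shows where the bound comes from (a quick derivation, with c the constant hidden in the O(N) term):

T(N) = 10 T(N/10) + cN
     = 10^2 T(N/10^2) + 2cN
     = ... = 10^k T(N/10^k) + k * cN

After k = log_10 N levels the subproblems are constant-sized, and each level contributes cN work, so T(N) = O(N log N): this is the equal-work-per-level case of the master theorem (a = b = 10, f(N) = Θ(N) = Θ(N^(log_b a))).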
Example:
Let's say the input is [1, 4, 0, 5, 4, 1, 4, 9, 5, 0, 1, 0]. So in the generalized form, we know the following subsets of digits for every position:
{1} {4} {0} {5} {4} {1} {4} {9} {5} {0} {1} {0}
We check the number 2 as the last digit of the starting number (of course we check all the other digits too, but this branch will turn out to contain the minimum solution). We know that the sequence of last digits goes like
2 3 4 5 6 7 8 9 0 1 2 3
So we know the groups that have the same prefix (2-9 and 0-3). We also eliminate those digits from the sets that we already know are at the correct position:
{1} {4} {0} {} {4} {1} {4} {} | {5} {0} {1} {0}
By collecting all the digits of each group, we arrive at the reduced problem
{0,1,4} {0,1,5}
Again we brute-force the second to last digit. Let's say we are checking 4. We get:
4 5
{0,1} {0,1}
Which reduces to
{0,1}
Now that we are down to only one array element, we just have to build the lexicographically smallest number out of those digits that has no leading zeroes, which is 10. So the result is 1042.
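One level of this reduction could be sketched in Python like so (the function name and the set representation are my own; the feasibility checks and the final assemble-smallest-number step are omitted):

def reduce_once(sets, d):
    # sets: one set of known digits per position; d: assumed last digit
    # of the starting value. Returns the collapsed prefix digit-sets.
    groups, cur = [], set()
    digit = d
    for s in sets:
        cur |= s - {digit}   # digits not explained by the last digit belong to the prefix
        if digit == 9:       # wrapping 9 -> 0 starts a new prefix group
            groups.append(cur)
            cur = set()
        digit = (digit + 1) % 10
    groups.append(cur)
    return groups

# The example above: trying d = 2 collapses twelve positions into two sets.
sets = [{1},{4},{0},{5},{4},{1},{4},{9},{5},{0},{1},{0}]
assert reduce_once(sets, 2) == [{0, 1, 4}, {0, 1, 5}]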
Old version
I believe one key observation here is that in a progression like this, of length N, only the last ceil(log_10(N)) digits are changed more than once. So we could brute-force the last ceil(log_10(N)) digits, the number of nines at the end of the prefix and the digit before that in O(N * log N).
So we fix the pattern
P..PX9..9S...S
where the suffix S is known, the number of nines is known, X < 9 is known, but the prefix P is not.
We can now remove those numbers from the sequence that already match one of the digits we already know appears at their respective position. We are left with a set of digits which we know comprise the prefix P. We just form the lexicographically smallest string that has no leading zeroes and contains all the digits.
The runtime is O(N^2 log N).

Given an array of integers, find the LARGEST number using the digits of the array such that it is divisible by 3

E.g.: Array: 4, 3, 0, 1, 5. (Assume all digits are >= 0. Also, each element in the array corresponds to a digit, i.e., each element is between 0 and 9.)
In the above array, the largest such number is 5430 (using the digits 5, 4, 3 and 0 from the array).
My Approach:
For divisibility by 3, we need the sum of digits to be divisible by 3.
So,
Step 1: Remove all the zeroes from the array.
Step 2: These zeroes will go at the end. (They don't affect the sum, and we want the largest number.)
Step 3: Find the subset of the remaining elements with the maximum number of digits and the maximum digit sum such that the sum is divisible by 3.
Step 4: The required number consists of the digits of the subset found above, in decreasing order.
So the main step is Step 3, i.e., how to find the subset containing the maximum possible number of elements such that their sum is maximal and divisible by 3.
I was thinking maybe Step 3 could be done by the greedy choice of taking all the elements and repeatedly removing the smallest element of the set until the sum is divisible by 3.
But I am not convinced that this greedy choice will work.
Please tell me if my approach is correct. If it is, then please suggest how to do Step 3. Also, please suggest any other possible/efficient algorithm.
Observation: if you can get a number that is divisible by 3, you need to remove at most 2 elements to maintain an optimal solution.
A simple O(n^2) solution is to check all possibilities of removing 1 number and, if none is valid, to check all pairs (there are O(n^2) of those).
EDIT:
O(n) solution: Create 3 buckets - bucket0, bucket1, bucket2 - according to each number's value modulo 3. Ignore bucket0 in the next algorithm.
Let the sum of the array be sum.
If sum % 3 == 0: we are done.
else if sum % 3 == 1:
if there is a number in bucket1 - choose the minimal
else: take the 2 minimal elements from bucket2
else if sum % 3 == 2:
if there is a number in bucket2 - choose the minimal
else: take the 2 minimal elements from bucket1
Note: You don't actually need the buckets to achieve O(1) extra space - you only need the 2 minimal values from bucket1 and from bucket2, since those are the only numbers actually used from these buckets.
Example:
arr = { 3, 4, 0, 1, 5 }
bucket0 = {3,0} ; bucket1 = {4,1} bucket2 = { 5 }
sum = 13 ; sum %3 = 1
bucket1 is not empty - choose the minimal element from it (1), and remove it from the array.
result array = { 3, 4, 0, 5 }
proceed to STEP 4 "as planned"
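A Python sketch combining this bucket selection with Step 4's descending concatenation (the function name is mine; it returns None when no non-empty subset works):

def largest_div3(digits):
    digits = sorted(digits, reverse=True)
    r = sum(digits) % 3
    if r != 0:
        # Prefer dropping one smallest digit with remainder r;
        # otherwise drop the two smallest digits with remainder 3 - r.
        drop = [d for d in digits if d % 3 == r][-1:]
        if not drop:
            same = [d for d in digits if d % 3 == 3 - r]
            if len(same) < 2:
                return None
            drop = same[-2:]
        for d in drop:
            digits.remove(d)
    if not digits:
        return None
    return int("".join(map(str, digits)))

assert largest_div3([3, 4, 0, 1, 5]) == 5430
assert largest_div3([1, 1]) is None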
Greedy choice definitely doesn't work: consider the set {5, 2, 1}. You'd remove the 1 first, but you should remove the 2.
I think you should work out the sum of the array modulo 3, which is either 0 (you're finished), or 1, or 2. Then you're looking to remove the minimal subset whose sum modulo 3 is 1 or 2.
I think that's fairly straightforward, so no real need for dynamic programming. Do it by removing one number with that modulus if possible, otherwise do it by removing two numbers with the other modulus. Once you know how many to remove, choose the smallest possible. You'll never need to remove three numbers.
You don't need to treat 0 specially, although if you're going to do that then you can further reduce the set under consideration in step 3 if you temporarily remove all 0, 3, 6, 9 from it.
Putting it all together, I would probably:
Sort the digits, descending.
Calculate the modulus. If 0, we're finished.
Try to remove a digit with that modulus, starting from the end. If successful, we're finished.
Remove two digits with the complementary modulus (3 minus that modulus), starting from the end. This always succeeds, so we're finished.
We might be left with an empty array (e.g. if the input is 1, 1), in which case the problem was impossible. Otherwise, the array contains the digits of our result.
Time complexity is O(n) provided that you do a counting sort in step 1. Which you certainly can since the values are digits.
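The same recipe with a digit-count array in place of an explicit sort, roughly as described above (a sketch; the helper name is mine):

def largest_div3_counting(digits):
    count = [0] * 10                       # counting sort, O(n)
    for d in digits:
        count[d] += 1
    r = sum(d * c for d, c in enumerate(count)) % 3
    if r != 0:
        small = [d for d in range(10) if d % 3 == r and count[d]]
        if small:                          # remove one digit with that modulus
            count[small[0]] -= 1
        else:                              # else remove two with the other modulus
            need = 2
            for d in range(10):
                if d % 3 == 3 - r and count[d]:
                    take = min(need, count[d])
                    count[d] -= take
                    need -= take
            if need:
                return None
    if sum(count) == 0:
        return None                        # e.g. input [1, 1]: impossible
    return int("".join(str(d) * count[d] for d in range(9, -1, -1)))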
What do you think about this:
- first sort the array elements by value
- sum up all the numbers
- if the sum's remainder after division by 3 is equal to 0, just return the sorted array
- otherwise:
  - if the sum of the remainders (mod 3) of all the numbers is smaller than the remainder of their sum, there is no solution
  - otherwise:
    - if it's equal to 1, try to remove the smallest number with remainder equal to 1; if there is none, try the two smallest with remainder equal to 2; if no such two exist (I suppose it can happen), there's no solution
    - if it's equal to 2, try to remove the smallest number with remainder equal to 2; if there is none, try the two smallest with remainder equal to 1; if no such two exist, there's no solution
Alternatively: first sort the array elements by remainder of division by 3, ascending, then sort each subset of equal remainder by value, descending.
First, this problem reduces to maximizing the number of elements selected such that their sum is divisible by 3.
Trivial: Select all numbers divisible by 3 (0,3,6,9).
Let a be the elements that leave 1 as remainder and b the elements that leave 2 as remainder. If (|a|-|b|) % 3 is 0, then select all elements from both a and b. If (|a|-|b|) % 3 is 1, select all elements from b and the |a|-1 largest numbers from a. If the remainder is 2, then select all numbers from a and the |b|-1 largest numbers from b.
Once you have all the numbers, sort them in descending order and concatenate. That is your answer.
Ultimately, if n is the number of elements, this algorithm returns a number that is at least n-1 digits long (except in corner cases; see below).
NOTE: Take care of corner cases (i.e., when |a| = 0 or |b| = 0, etc.). Note that (-1) % 3 = 2 and (-2) % 3 = 1.
If m is the size of the alphabet and n is the number of elements, this algorithm is O(m+n).
Sorting the data is unnecessary, since there are only ten different values.
Just count the number of zeroes, ones, twos, etc., in O(n), where n digits are given.
Calculate the sum of all digits and check whether the remainder modulo 3 is 0, 1 or 2.
If the remainder is 1: remove the first of the following that is possible (one of these is guaranteed to be possible): 1, 4, 7, 2+2, 2+5, 5+5, 2+8, 5+8, 8+8.
If the remainder is 2: remove the first of the following that is possible (one of these is guaranteed to be possible): 2, 5, 8, 1+1, 1+4, 4+4, 1+7, 4+7, 7+7.
If no digits are left, the problem cannot be solved. Otherwise, the solution is created by concatenating 9's, 8's, 7's, and so on, as many of each as remain.
(Sorting n digits would take O(n log n) - unless, of course, you sort by counting how often each digit occurs and generate the sorted result from these counts.)
Amit's answer has a tiny thing missing.
If bucket1 is not empty but holds humongous values, say 79 and 97, while bucket2 is not empty as well and its two minimal values are, say, 2 and 5, then when the modulus of the sum of all digits is 1 we should choose to remove 2 and 5 from bucket2 instead of the minimal element of bucket1, to get the largest concatenated number.
Test case: 8 2 3 5 78 79
If we follow Amit's and Steve's suggested method, the largest number would be 878532, whereas the largest number divisible by 3 formed from this array is 879783.
The solution would be to compare the appropriate bucket's smallest value with the concatenation of the two minimal values of the other bucket, and eliminate the smaller of the two.

What would cause an algorithm to have O(log n) complexity?

My knowledge of big-O is limited, and when log terms show up in the equation it throws me off even more.
Can someone maybe explain to me in simple terms what a O(log n) algorithm is? Where does the logarithm come from?
This specifically came up when I was trying to solve this midterm practice question:
Let X[1..n] and Y[1..n] be two lists of integers, each sorted in nondecreasing order. Give an O(log n)-time algorithm to find the median (i.e., the nth smallest integer) of all 2n combined elements. For example, if X = (4, 5, 7, 8, 9) and Y = (3, 5, 8, 9, 10), then 7 is the median of the combined list (3, 4, 5, 5, 7, 8, 8, 9, 9, 10). [Hint: use concepts of binary search]
I have to agree that it's pretty weird the first time you see an O(log n) algorithm... where on earth does that logarithm come from? However, it turns out that there are several different ways that you can get a log term to show up in big-O notation. Here are a few:
Repeatedly dividing by a constant
Take any number n; say, 16. How many times can you divide n by two before you get a number less than or equal to one? For 16, we have that
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
Notice that this ends up taking four steps to complete. Interestingly, we also have that log_2 16 = 4. Hmmm... what about 128?
128 / 2 = 64
64 / 2 = 32
32 / 2 = 16
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
This took seven steps, and log_2 128 = 7. Is this a coincidence? Nope! There's a good reason for this. Suppose that we divide a number n by 2 a total of i times. Then we get the number n / 2^i. If we want to solve for the value of i where this value is at most 1, we get
n / 2^i ≤ 1
n ≤ 2^i
log_2 n ≤ i
In other words, if we pick an integer i such that i ≥ log_2 n, then after dividing n in half i times we'll have a value that is at most 1. The smallest i for which this is guaranteed is roughly log_2 n, so if we have an algorithm that divides by 2 until the number gets sufficiently small, then we can say that it terminates in O(log n) steps.
An important detail is that it doesn't matter what constant you're dividing n by (as long as it's greater than one); if you divide by the constant k, it will take log_k n steps to reach 1. Thus any algorithm that repeatedly divides the input size by some fraction will need O(log n) iterations to terminate. Those iterations might take a lot of time, so the net runtime needn't be O(log n), but the number of steps will be logarithmic.
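As a quick sanity check of the arithmetic (a toy snippet, not from the original answer):

def halving_steps(n, k=2):
    # How many times can we divide n by k before it is at most 1?
    steps = 0
    while n > 1:
        n /= k
        steps += 1
    return steps

print(halving_steps(16))    # 4, matching log_2 16
print(halving_steps(128))   # 7, matching log_2 128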
So where does this come up? One classic example is binary search, a fast algorithm for searching a sorted array for a value. The algorithm works like this:
If the array is empty, return that the element isn't present in the array.
Otherwise:
Look at the middle element of the array.
If it's equal to the element we're looking for, return success.
If it's greater than the element we're looking for:
Throw away the second half of the array.
Repeat
If it's less than the element we're looking for:
Throw away the first half of the array.
Repeat
For example, to search for 5 in the array
1 3 5 7 9 11 13
We'd first look at the middle element:
1 3 5 7 9 11 13
^
Since 7 > 5, and since the array is sorted, we know for a fact that the number 5 can't be in the back half of the array, so we can just discard it. This leaves
1 3 5
So now we look at the middle element here:
1 3 5
^
Since 3 < 5, we know that 5 can't appear in the first half of the array, so we can throw away the first half, leaving
5
Again we look at the middle of this array:
5
^
Since this is exactly the number we're looking for, we can report that 5 is indeed in the array.
So how efficient is this? Well, on each iteration we're throwing away at least half of the remaining array elements. The algorithm stops as soon as the array is empty or we find the value we want. In the worst case, the element isn't there, so we keep halving the size of the array until we run out of elements. How long does this take? Well, since we keep cutting the array in half over and over again, we will be done in at most O(log n) iterations, since we can't cut the array in half more than O(log n) times before we run out of array elements.
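In code, the same procedure might look like this (an illustrative sketch using closed-interval index arithmetic):

def binary_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:                  # the range [lo, hi] halves each pass
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return True
        elif arr[mid] > target:
            hi = mid - 1             # throw away the second half
        else:
            lo = mid + 1             # throw away the first half
    return False

assert binary_search([1, 3, 5, 7, 9, 11, 13], 5)
assert not binary_search([1, 3, 5, 7, 9, 11, 13], 6)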
Algorithms following the general technique of divide-and-conquer (cutting the problem into pieces, solving those pieces, then putting the problem back together) tend to have logarithmic terms in them for this same reason - you can't keep cutting some object in half more than O(log n) times. You might want to look at merge sort as a great example of this.
Processing values one digit at a time
How many digits are in the base-10 number n? Well, if there are k digits in the number, then the leading digit is some multiple of 10^(k-1). The largest k-digit number is 999...9 (k nines), which equals 10^k - 1. Consequently, if we know that n has k digits in it, then we know that the value of n is at most 10^k - 1. If we want to solve for k in terms of n, we get
n ≤ 10^k - 1
n + 1 ≤ 10^k
log_10 (n + 1) ≤ k
From which we get that k is approximately the base-10 logarithm of n. In other words, the number of digits in n is O(log n).
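A two-line check of this digit-count relationship (safe for the small values used here; floating-point log10 can misbehave near exact powers of ten):

import math

for n in (9, 10, 245436, 10**6):
    assert len(str(n)) == math.floor(math.log10(n)) + 1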
For example, let's think about the complexity of adding two large numbers that are too big to fit into a machine word. Suppose that we have those numbers represented in base 10, and we'll call the numbers m and n. One way to add them is through the grade-school method - write the numbers out one digit at a time, then work from the right to the left. For example, to add 1337 and 2065, we'd start by writing the numbers out as
1 3 3 7
+ 2 0 6 5
==============
We add the last digit and carry the 1:
1
1 3 3 7
+ 2 0 6 5
==============
2
Then we add the second-to-last ("penultimate") digit and carry the 1:
1 1
1 3 3 7
+ 2 0 6 5
==============
0 2
Next, we add the third-to-last ("antepenultimate") digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
4 0 2
Finally, we add the fourth-to-last ("preantepenultimate"... I love English) digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
3 4 0 2
Now, how much work did we do? We do a total of O(1) work per digit (that is, a constant amount of work), and there are O(max{log n, log m}) total digits that need to be processed. This gives a total of O(max{log n, log m}) complexity, because we need to visit each digit in the two numbers.
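A sketch of this digit-by-digit addition, operating on digit lists with the most significant digit first (the names are mine):

def grade_school_add(a, b):
    result, carry = [], 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:   # O(1) work per digit position
        s = carry + (a[i] if i >= 0 else 0) + (b[j] if j >= 0 else 0)
        result.append(s % 10)          # write the digit...
        carry = s // 10                # ...and carry the rest
        i, j = i - 1, j - 1
    return result[::-1]

assert grade_school_add([1, 3, 3, 7], [2, 0, 6, 5]) == [3, 4, 0, 2]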
Many algorithms get an O(log n) term in them from working one digit at a time in some base. A classic example is radix sort, which sorts integers one digit at a time. There are many flavors of radix sort, but they usually run in time O(n log U), where U is the largest possible integer that's being sorted. The reason for this is that each pass of the sort takes O(n) time, and there are a total of O(log U) iterations required to process each of the O(log U) digits of the largest number being sorted. Many advanced algorithms, such as Gabow's shortest-paths algorithm or the scaling version of the Ford-Fulkerson max-flow algorithm, have a log term in their complexity because they work one digit at a time.
As to your second question about how you solve that problem, you may want to look at this related question which explores a more advanced application. Given the general structure of problems that are described here, you now can have a better sense of how to think about problems when you know there's a log term in the result, so I would advise against looking at the answer until you've given it some thought.
When we talk about big-Oh descriptions, we are usually talking about the time it takes to solve problems of a given size. And usually, for simple problems, that size is just characterized by the number of input elements, and that's usually called n, or N. (Obviously that's not always true-- problems with graphs are often characterized in numbers of vertices, V, and number of edges, E; but for now, we'll talk about lists of objects, with N objects in the lists.)
We say that a problem "is big-Oh of (some function of N)" if and only if:
there is some constant c and some threshold N_0 such that, for all N > N_0, the runtime of the algorithm is less than c times (that function of N).
In other words, don't think about small problems where the "constant overhead" of setting up the problem matters, think about big problems. And when thinking about big problems, big-Oh of (some function of N) means that the run-time is still always less than some constant times that function. Always.
In short, that function is an upper bound, up to a constant factor.
So, "big-Oh of log(n)" means the same thing that I said above, except "some function of N" is replaced with "log(n)."
So, your problem tells you to think about binary search, so let's think about that. Let's assume you have, say, a list of N elements that are sorted in increasing order. You want to find out if some given number exists in that list. One way to do that which is not a binary search is to just scan each element of the list and see if it's your target number. You might get lucky and find it on the first try. But in the worst case, you'll check N different times. This is not binary search, and it is not big-Oh of log(N) because there's no way to force it into the criteria we sketched out above.
You can pick that arbitrary constant to be c=10, and if your list has N=32 elements, you're fine: 10*log(32) = 50, which is greater than the runtime of 32. But if N=64, 10*log(64) = 60, which is less than the runtime of 64. You can pick c=100, or 1000, or a gazillion, and you'll still be able to find some N that violates that requirement. In other words, there is no N_0.
If we do a binary search, though, we pick the middle element, and make a comparison. Then we throw out half the numbers, and do it again, and again, and so on. If your N=32, you can only do that about 5 times, which is log(32). If your N=64, you can only do this about 6 times, etc. Now you can pick that arbitrary constant c, in such a way that the requirement is always met for large values of N.
With all that background, what O(log(N)) usually means is that you have some way to do a simple thing, which cuts your problem size in half. Just like the binary search is doing above. Once you cut the problem in half, you can cut it in half again, and again, and again. But, critically, what you can't do is some preprocessing step that would take longer than that O(log(N)) time. So for instance, you can't shuffle your two lists into one big list, unless you can find a way to do that in O(log(N)) time, too.
(NOTE: Nearly always, Log(N) means log-base-two, which is what I assume above.)
In the following solution, every line with a recursive call operates on sub-arrays of X and Y of half the given sizes; all other lines take constant time. The recurrence is T(2n) = T(2n/2) + c = T(n) + c, which solves to O(lg(2n)) = O(lg n).
You start with MEDIAN(X, 1, n, Y, 1, n).
MEDIAN(X, p, r, Y, i, k)
    if X[r] < Y[i]
        return X[r]
    if Y[k] < X[p]
        return Y[k]
    q = floor((p+r)/2)
    j = floor((i+k)/2)
    if r-p+1 is even
        if X[q+1] > Y[j] and Y[j+1] > X[q]
            if X[q] > Y[j]
                return X[q]
            else
                return Y[j]
        if X[q+1] < Y[j-1]
            return MEDIAN(X, q+1, r, Y, i, j)
        else
            return MEDIAN(X, p, q, Y, j+1, k)
    else
        if X[q] > Y[j] and Y[j+1] > X[q-1]
            return Y[j]
        if Y[j] > X[q] and X[q+1] > Y[j-1]
            return X[q]
        if X[q+1] < Y[j-1]
            return MEDIAN(X, q, r, Y, i, j)
        else
            return MEDIAN(X, p, q, Y, j, k)
The Log term pops up very often in algorithm complexity analysis. Here are some explanations:
1. How do you represent a number?
Let's take the number X = 245436. This notation “245436” has implicit information in it. Making that information explicit:
X = 2 * 10 ^ 5 + 4 * 10 ^ 4 + 5 * 10 ^ 3 + 4 * 10 ^ 2 + 3 * 10 ^ 1 + 6 * 10 ^ 0
Which is the decimal expansion of the number. So, the minimum amount of information we need to represent this number is 6 digits. This is no coincidence, as any number less than 10^d can be represented in d digits.
So how many digits are required to represent X? That's equal to the largest exponent of 10 in X, plus 1.
==> 10 ^ d > X
==> log(10 ^ d) > log(X)
==> d * log(10) > log(X)
==> d > log(X) // And log appears again...
==> d = floor(log(X)) + 1
Also note that this is the most concise way to denote the number in this range. Any reduction will lead to information loss, as a missing digit can be mapped to 10 other numbers. For example: 12* can be mapped to 120, 121, 122, …, 129.
2. How do you search for a number in (0, N - 1)?
Taking N = 10^d, we use our most important observation:
The minimum amount of information needed to uniquely identify a value in a range from 0 to N - 1 is log(N) digits.
This implies that, when asked to search for a number on the integer line, ranging from 0 to N - 1, we need at least log(N) tries to find it. Why? Any search algorithm will need to choose one digit after another in its search for the number.
The minimum number of digits it needs to choose is log(N). Hence the minimum number of operations taken to search for a number in a space of size N is log(N).
Can you guess the order complexities of binary search, ternary search or deca search? It's O(log(N))!
3. How do you sort a set of numbers?
When asked to sort a set of numbers A into an array B, here's what it looks like: [diagram: permute elements]
Every element in the original array has to be mapped to its corresponding index in the sorted array. So, for the first element, we have n positions. To correctly find the corresponding index in this range from 0 to n - 1, we need... log(n) operations.
The next element needs log(n-1) operations, the next log(n-2), and so on. The total comes to:
==> log(n) + log(n - 1) + log(n - 2) + ... + log(1)
Using log(a) + log(b) = log(a * b), this is
==> log(n!)
This can be approximated to n*log(n) - n, which is O(n*log(n))!
Hence we conclude that no comparison-based sorting algorithm can do better than O(n*log(n)). And some algorithms having this complexity are the popular Merge Sort and Heap Sort!
These are some of the reasons why we see log(n) pop up so often in the complexity analysis of algorithms. The same reasoning can be extended to binary numbers.
Cheers!
We call the time complexity O(log n) when the solution is based on iterations over n in which the work done in each iteration is a fraction of the previous iteration, as the algorithm works towards the solution.
Can't comment yet... necro it is!
Avi Cohen's answer is incorrect, try:
X = 1 3 4 5 8
Y = 2 5 6 7 9
None of the conditions are true, so MEDIAN(X, p, q, Y, j, k) will cut both the fives. These are nondecreasing sequences, not all values are distinct.
Also try this even-length example with distinct values:
X = 1 3 4 7
Y = 2 5 6 8
Now MEDIAN(X, p, q, Y, j+1, k) will cut the four.
Instead I offer this algorithm, call it with MEDIAN(1,n,1,n):
MEDIAN(startx, endx, starty, endy){
    if (startx == endx)
        return min(X[startx], Y[starty])
    odd = (startx + endx) % 2   //0 if even, 1 if odd
    m = (startx + endx - odd)/2
    n = (starty + endy - odd)/2
    x = X[m]
    y = Y[n]
    if x == y
        //then there are n-2 (+1 in the odd case) total elements smaller than
        //or equal to both x and y, so this value is the nth smallest:
        //we have found the median.
        return x
    if (x < y)
        //if we remove some numbers smaller than the median,
        //and remove the same amount of numbers bigger than the median,
        //the median will not change.
        //we know the elements before x are smaller than the median,
        //and the elements after y are bigger than the median,
        //so we discard these and continue the search:
        return MEDIAN(m, endx, starty, n + 1 - odd)
    else //(x > y)
        return MEDIAN(startx, m + 1 - odd, n, endy)
}
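For reference, a commonly taught O(log n) approach selects the k-th smallest element of the two sorted arrays by discarding roughly half of one array per step. This Python sketch is my own illustration, not the answer above (the list slicing copies data for clarity; index arithmetic avoids that):

def kth(a, b, k):
    # k-th smallest (0-indexed) element of sorted lists a and b.
    if not a:
        return b[k]
    if not b:
        return a[k]
    ia, ib = len(a) // 2, len(b) // 2
    if ia + ib < k:
        # The k-th element cannot lie in the lower half of the list
        # whose middle element is the smaller one.
        if a[ia] > b[ib]:
            return kth(a, b[ib + 1:], k - ib - 1)
        return kth(a[ia + 1:], b, k - ia - 1)
    else:
        # It cannot lie in the upper half of the list whose middle is larger.
        if a[ia] > b[ib]:
            return kth(a[:ia], b, k)
        return kth(a, b[:ib], k)

X, Y = [4, 5, 7, 8, 9], [3, 5, 8, 9, 10]
assert kth(X, Y, len(X) - 1) == 7   # the n-th smallest of the 2n elements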

Resources