Complexity of searching sorted matrix - algorithm

Suppose we have a matrix of size NxN of numbers where all the rows and columns are in increasing order, and we want to find if it contains a value v. One algorithm is to perform a binary search on the middle row, to find the elements closest in value to v: M[row,col] < v < M[row,col+1] (if we find v exactly, the search is complete). Since the matrix is sorted we know that v is larger than all elements in the sub-matrix M[0..row, 0..col] (the top-left quadrant of the matrix), and similarly it's smaller than all elements in the sub-matrix M[row..N-1, col+1..N-1] (the bottom right quadrant). So we can recursively search the top right quadrant M[0..row-1, col+1..N-1] and the bottom left quadrant M[row+1..N-1, 0..col].
The question is: what is the complexity of this algorithm?
Example: Suppose we have the 5x5 matrix shown below and we are searching for the number 25:
0 10 20 30 40
1 11 21 31 41
2 12 22 32 42
3 13 23 33 43
4 14 24 34 44
In the first iteration we perform binary search on the middle row and find that the closest element smaller than 25 is 22 (at row=2 col=2). So now we know 25 is larger than all items in the top-left 3x3 quadrant:
0 10 20
1 11 21
2 12 22
Similarly, we know 25 is smaller than all elements in the bottom right 3x2 quadrant:
32 42
33 43
34 44
So, we recursively search the remaining quadrants - the top right 2x2:
30 40
31 41
and the bottom left 2x3:
3 13 23
4 14 24
And so on. We essentially divided the matrix into 4 quadrants (which might be of different sizes depending on the result of the binary search on the middle row), and then we recursively search two of the quadrants.
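Here is a minimal Python sketch of this quadrant-elimination idea (the function name search_sorted_matrix is mine, and the binary search on the middle row uses the standard bisect module); it returns True if v is present:

from bisect import bisect_left

def search_sorted_matrix(M, v, top=0, bottom=None, left=0, right=None):
    # Search a matrix with sorted rows and columns for v by eliminating quadrants.
    if not M or not M[0]:
        return False
    if bottom is None:
        bottom, right = len(M), len(M[0])
    if top >= bottom or left >= right:
        return False
    mid = (top + bottom) // 2
    row = M[mid]
    # Binary search the middle row within columns [left, right).
    col = bisect_left(row, v, left, right)
    if col < right and row[col] == v:
        return True
    # Everything in M[top..mid][left..col-1] is < v and everything in
    # M[mid..bottom-1][col..right-1] is > v, so only the top-right and
    # bottom-left quadrants remain.
    return (search_sorted_matrix(M, v, top, mid, col, right) or
            search_sorted_matrix(M, v, mid + 1, bottom, left, col))

For the 5x5 example above, search_sorted_matrix(M, 33) returns True and search_sorted_matrix(M, 25) returns False.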

The worst-case running time is Theta(n). Certainly this is as good as it gets for correct algorithms (consider an anti-diagonal, with elements less than v above and elements greater than v below). As far as upper bounds go, the bound for an n-row, m-column matrix is O(n log(2 + m/n)), as evidenced by the correct recurrence
f(n, m) = log m + max_{0 ≤ j ≤ m-1} [f(n/2, j) + f(n/2, m-1-j)],
where there are two sub-problems, not one. This recurrence is solvable by the substitution method.
Hypothesis (c to be chosen later):
f(n, m) ≤ c n log(2 + m/n) - log(m) - 2
Then
f(n, m) = log m + max_{0 ≤ j ≤ m-1} [f(n/2, j) + f(n/2, m-1-j)]
        ≤ log m + max_{0 ≤ j ≤ m-1} [c (n/2) log(2 + j/(n/2)) - log(j) - 2
                                     + c (n/2) log(2 + (m-1-j)/(n/2)) - log(m-1-j) - 2]
          [fixing j = m/2 by the concavity of log]
        ≤ log m + c n log(2 + m/n) - 2 log(m/2) - 4
        = log m + c n log(2 + m/n) - 2 log(m) - 2
        = c n log(2 + m/n) - log(m) - 2.
Set c large enough that, for all n, m,
c n log(2 + m/n) - log(m) - 2 ≥ log(m),
where log(m) is the cost of the base case n = 1.

If you find your element after n steps, then the searchable range has size N = 4^n. The time complexity is therefore O(log_4 N) = O(log N / log 4) = O(0.5 * log N) = O(log N).
In other words, this algorithm is about twice as fast as a plain binary search, which is also O(log N).

A consideration on binary search on matrices:
Binary search on 2D matrices, and ND matrices in general, is nothing different from binary search on sorted 1D vectors. In fact, C for instance stores them in row-major order (as a concatenation of the rows: [[row0],[row1],...,[rowk]]).
This means one can use the well-known binary search on the matrix as follows (with complexity log(n*m)):
// ROWS, COLS and NCELLS (= ROWS * COLS) are assumed to be defined elsewhere.
template<typename T>
bool binarySearch_2D(T target, T** matrix){
    int a = 0;
    int b = NCELLS - 1;          // ROWS*COLS
    bool found = false;
    while(!found && a <= b){
        int half = (a + b) / 2;
        int r = half / COLS;     // row of the flattened index
        int c = half % COLS;     // column of the flattened index
        T v = matrix[r][c];
        if(v == target)
            found = true;
        else if(target > v)
            a = half + 1;
        else                     // target < v
            b = half - 1;
    }
    return found;
}

The complexity of this algorithm will be:
O(log2(n*n)) = O(log2(n))
This is because you are eliminating half of the remaining search space in each iteration.
EDIT:
Recurrence relation:
Assuming n to be the total number of elements in the matrix,
=> T(n) = T(n/2) + log(sqrt(n))
=> T(n) = T(n/2) + log(n^(1/2))
=> T(n) = T(n/2) + 1/2 * log(n)
Here, a = 1, b = 2.
Therefore, c = logb(a) = log2(1) = 0
=> n^c = n^0
Also, f(n) = n^0 * 1/2 * log(n)
According to case 2 of Master Theorem,
T(n) = O((log(n))^2)

You can use a recursive function and apply the master theorem to find the complexity.
Assume n is the number of elements in the matrix.
Cost for one step is a binary search on sqrt(n) elements, and you get two sub-problems, in the worst case of the same size, each with n/4 elements: 2*T(n/4). So we have:
T(n)=2*T(n/4)+log(sqrt(n))
equal to
T(n)=2*T(n/4)+log(n)/2
Now apply master theorem case 1 (a=2, b=4, f(n)=log(n)/2, and f(n) is in O(n^(log_b(a) - ε)) = O(n^(1/2 - ε)) for some ε > 0, therefore we have case 1)
=> Total running time T(n) is in Θ(n^(log_b(a))) = O(n^(1/2))
or equal to
O(sqrt(n))
which is equal to the height or width of the matrix if both sides are the same.

Let's assume that we have the following matrix:
1 2 3
4 5 6
7 8 9
Let's search for value 7 using binary search as you specified:
Search nearest value to 7 in middle row: 4 5 6, which is 6.
Hmm we have a problem, 7 is not in the following submatrix:
6
9
So what to do? One solution would be to apply binary search to all rows, which has a complexity of nlog(n). So walking the matrix is a better solution.
Edit:
Recursion relation:
T(N*N) = T(N*N/2) + log(N)
if we normalize the function to one variable with M = N^2:
T(M) = T(M/2) + log(sqrt(M))
T(M) = T(M/2) + log(M)/2
According to Master Theorem Case #2, complexity is
(log(M))^2
=> (2log(N))^2
=> (log(N))^2
Edit 2:
Sorry I answered your question from my mobile, now when you think about it, M[0...row-1, col+1...N-1] doesn't make much sense right? Consider my example, if you search for a value that is smaller than all values in the middle row, you'll always end up with the leftmost number. Similarly, if you search for a value that is greater than all values in the middle row, you'll end up with the rightmost number. So the algorithm can be reworded as follows:
Search middle row with custom binary search that returns 1 <= idx <= N if found, idx = 0 or idx = N+1 if not found. After binary search if idx = 0, start the search in the upper submatrix: M[0...row][0...N].
If the index is N + 1 start the search in the lower submatrix: M[row+1...N][0...N]. Otherwise, we are done.
You suggest that the complexity should be 2T(M/4) + log(M)/2, but at each step we divide the whole matrix in two and only process one of the halves.
Moreover, if you agree that T(N*N) = T(N*N/2) + log(N) is correct, then you can substitute all N*N expressions with M.

Related

polynomial (in n) time algorithm that decides whether N is a power

I am a computer science student; I am studying the Algorithms course independently.
During the course, I saw this question:
Given an n-bit integer N, find a polynomial (in n) time algorithm that decides whether N is a power (that is, there are integers a and k > 1 so that a^k = N).
I thought of a first option that is exponential in n:
For all k , 1<k<N , try to divide N by k until I get result 1.
For example, if N = 27, I will start with k = 2 , because 2 doesn't divide 27, I will go to next k =3.
I will divide 27 / 3 to get 9, and divide it again until I will get 1. This is not a good solution because it is exponential in n.
My second option is using modular arithmetic, using a^k ≡ 1 (mod k+1) if gcd(a, k+1) = 1 (Euler's theorem). I don't know if a and k are relatively prime.
I am trying to write an algorithm, but I am struggling to do it:
function power(N)
Input: Positive integer N
Output: yes/no

    Pick positive integers a_1, a_2, ..., a_k < N at random
    if (a_i)^(N-1) ≡ 1 (mod N) for all i = 1, 2, ..., k:
        return yes
    else:
        return no
I'm not sure if the algorithm is correct. How can I write this correctly?
Ignoring the cases when N is 0 or 1, you want to know if N is representable as a^b for a>1, b>1.
If you knew b, you could find a in O(log(N)) arithmetic operations (using binary search). Each arithmetic operation including exponentiation runs in polynomial time in log(N), so that would be polynomial too.
It's possible to bound b: it can be at most log_2(N)+1, otherwise a will be less than 2.
So simply try each b from 2 to floor(log_2(N)+1). Each try is polynomial in n (n ~= log_2(N)), and there are O(n) trials, so the resulting time is polynomial in n.
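A minimal Python sketch of this idea (the function name is mine, not from the answer); for each exponent b it binary-searches for a base a with a^b = N:

def is_perfect_power(N):
    # Return True if N = a**b for some integers a > 1, b > 1.
    if N < 4:
        return False
    n = N.bit_length()                      # b can be at most about log2(N)
    for b in range(2, n + 1):
        lo, hi = 2, 1 << (n // b + 1)       # a is at most about 2^(n/b)
        while lo <= hi:                     # binary search for a with a**b == N
            a = (lo + hi) // 2
            p = a ** b
            if p == N:
                return True
            if p < N:
                lo = a + 1
            else:
                hi = a - 1
    return False

print(is_perfect_power(96889010407))        # True: 7**13
print(is_perfect_power(96889010408))        # False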
This looks like a simple math question. Suppose that we are given N = 96889010407 which is much less than Number.MAX_SAFE_INTEGER.
The question tries to figure out if N is a power where a**k === N for a > 1 and k > 1. So we can also write it as
Math.log(a**k) === Math.log(N) yielding k*Math.log(a) === Math.log(N) yielding Math.log(a) === Math.log(N) / k where k is an Integer > 1.
Now remember the inverse logarithm. Math.log(y) = x yields y = Math.E**x.
This means we are looking for an integer a = Math.E**(Math.log(N) / k) for some k, if one exists. So start from k=2 and increment by 1.
k a = Math.E**(Math.log(N) / k)
___ _____________________________
2 311269.99599543784 -> NO
3 4592.947769836504 -> NO
4 557.9157606623403 -> NO
5 157.49069663608586 -> NO
6 67.77129015915592 -> NO
7 37.1080205641031 -> NO
8 23.62024048697092 -> NO
9 16.622531664172815 -> NO
10 12.54952973764698 -> NO
11 9.971310247420734 -> NO
12 8.232332000056601 -> NO
13 6.999999999999999 -> YES a is 7 and 96889010407 = 7^13
So for how long do we have to iterate? As long as Math.E**(Math.log(N) / k) >= 2. In this case at most 36 iterations, since Math.E**(Math.log(96889010407) / 37) is 1.9811909632660634 and a must be an integer > 1.
This algorithm is probably the most efficient one for this job. Its time complexity is O(log2(N)), as we iterate over k (the power). Had we chosen to iterate over a instead, the time complexity would be O(sqrt(N)).
This is OK for Natural numbers but you can extend this to the Rationals as well.
Say, is 10.999671418529301 a perfect power?
All you have to do is convert the decimal into a fraction as best you can to get the rational form 4084101/371293, and apply the algorithm above to both the numerator and the denominator, to see if they both give the same power, which in this case would be 5. 10.999671418529301 is 21^5/13^5.
Note: JS Math object is used in the example.
The number N cannot exceed 2^n. Hence you can initialize i=2, j=n and compute i^j with decreasing j until you arrive at N, then increase i and so on. A power is found in polynomial time.
E.g. with 7776 < 8192 = 2^13, you try 2^12 = 4096, then 3^12, 3^11, 3^10, 3^9, 3^8, then 4^8, 4^7, 4^6, 5^6, 5^5, 6^5 and you are done.

Algorithm-Find sum in matrix

We are given a 2D matrix (let's say with i rows and j columns) and an integer k.
We have to find the size of the smallest rectangle that contains this sum or a greater one.
E.g. k=7
4 1
1 1
1 1
4 4
The answer is 2, because 4+4 = 8 >= 7; if there wasn't a last line, the answer would be 4, since 4+1+1+1 = 7 >= 7.
My idea is to compute prefix sums Pref[k,l] = Tab[k,l] + Pref[k-1,l] + Pref[k,l-1]
and then compare every single rectangle.
Is it possible to make this faster? My approach is T(n) = O(n^2) (where n is the number of elements in the matrix).
I would like to do this in time n or n * log n.
I would be really glad if someone would give me a tip on how to do this :)
First, create an auxiliary matrix: sums, where:
sums[i,j] = A[0,0] + A[0,1] + .... + A[0,j] + A[1,0] + ... + A[1,j] + ... + A[i,j]
I think this is what you meant when you said "prefix matrix".
This can be calculated in linear time with dynamic programming:
sums[0,j] = A[0,0] + ... + A[0,j]
sums[i,0] = A[0,0] + ... + A[i,0]
sums[i,j] = sums[i-1,j] + sums[i,j-1] - sums[i-1,j-1] + A[i,j]
(sums[i-1,j-1] is subtracted because those elements were counted twice)
Now, assuming all elements are non-negative, sums is a non-decreasing matrix in which each row and each column is sorted.
So, iterating the matrix again, for each pair of indices i,j, find the value closest yet smaller than sum[i,j]-k.
This can be done in O(sqrt(n)).
Do it for each such (i,j) pair, and you get O(n*sqrt(n)) solution.
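A minimal Python sketch of the prefix-sum (summed-area table) construction and the inclusion-exclusion rectangle query described above (function names are mine):

def prefix_sums(A):
    # sums[i][j] = A[0][0] + ... + A[i][j] (sum of the top-left rectangle).
    rows, cols = len(A), len(A[0])
    sums = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            sums[i][j] = (A[i][j]
                          + (sums[i - 1][j] if i > 0 else 0)
                          + (sums[i][j - 1] if j > 0 else 0)
                          - (sums[i - 1][j - 1] if i > 0 and j > 0 else 0))
    return sums

def rect_sum(sums, i1, j1, i2, j2):
    # Sum of A[i1..i2][j1..j2], by inclusion-exclusion on the prefix table.
    total = sums[i2][j2]
    if i1 > 0:
        total -= sums[i1 - 1][j2]
    if j1 > 0:
        total -= sums[i2][j1 - 1]
    if i1 > 0 and j1 > 0:
        total += sums[i1 - 1][j1 - 1]
    return total

With this table, any rectangle sum is an O(1) query, so the cost of the solution is dominated by how many candidate rectangles are examined.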

sum of maximum element of sliding window of length K

Recently I got stuck on a problem. Part of the algorithm requires computing the sum of the maximum elements of all sliding windows of length K, where K ranges over 1 <= K <= N (N being the length of the array).
Example: if I have an array A = 5, 3, 12, 4
Sliding window of length 1: 5 + 3 + 12 + 4 = 24
Sliding window of length 2: 5 + 12 + 12 = 29
Sliding window of length 3: 12 + 12 = 24
Sliding window of length 4: 12
Final answer is 24,29,24,12.
I have tried to do this in O(N^2): for each window length K, I can calculate the sum of the window maximums in O(N); since K goes up to N, the overall complexity turns out to be O(N^2).
I am looking for O(N) or O(NlogN) or something similar, as N may be up to 10^5.
Note: Elements in the array can be as large as 10^9, so output the final answer modulo 10^9+7.
EDIT: What I actually want is to find the answer for each and every value of K (i.e. from 1 to N) in overall linear time or in O(NlogN), not in O(KN) or O(KNlogN) where K = {1, 2, 3, ..., N}.
Here's an abbreviated sketch of O(n).
For each element, determine how many contiguous elements to the left are no greater (call this a), and how many contiguous elements to the right are lesser (call this b). This can be done for all elements in time O(n) -- see MBo's answer.
A particular element is the maximum in its window if the window contains the element and only elements among the a to its left and the b to its right. Usefully, the number of such windows of length k (and hence the total contribution of these windows) is piecewise linear in k, with at most five pieces. For example, if a = 5 and b = 3, there are
1 window of size 1
2 windows of size 2
3 windows of size 3
4 windows of size 4
4 windows of size 5
4 windows of size 6
3 windows of size 7
2 windows of size 8
1 window of size 9.
The data structure that we need to encode this contribution efficiently is a Fenwick tree whose values are not numbers but linear functions of k. For each linear piece of the piecewise linear contribution function, we add it to the cell at beginning of its interval and subtract it from the cell at the end (closed beginning, open end). At the end, we retrieve all of the prefix sums and evaluate them at their index k to get the final array.
(OK, have to run for now, but we don't actually need a Fenwick tree for step two, which drops the complexity to O(n) for that, and there may be a way to do step one in linear time as well.)
Python 3, lightly tested:
def left_extents(lst):
    # result[i] = smallest index l such that every element in lst[l:i] is <= lst[i].
    result = []
    stack = [-1]
    for i in range(len(lst)):
        while stack[-1] >= 0 and lst[i] >= lst[stack[-1]]:
            del stack[-1]
        result.append(stack[-1] + 1)
        stack.append(i)
    return result

def right_extents(lst):
    # result[i] = smallest index r > i with lst[r] >= lst[i] (len(lst) if none),
    # so every element in lst[i+1:r] is < lst[i].
    result = []
    stack = [len(lst)]
    for i in range(len(lst) - 1, -1, -1):
        while stack[-1] < len(lst) and lst[i] > lst[stack[-1]]:
            del stack[-1]
        result.append(stack[-1])
        stack.append(i)
    result.reverse()
    return result

def sliding_window_totals(lst):
    # Difference arrays encoding each element's piecewise linear contribution in k.
    delta_constant = [0] * (len(lst) + 2)
    delta_linear = [0] * (len(lst) + 2)
    for l, i, r in zip(left_extents(lst), range(len(lst)), right_extents(lst)):
        a = i - l
        b = r - (i + 1)
        if a > b:
            a, b = b, a
        delta_linear[1] += lst[i]
        delta_linear[a + 1] -= lst[i]
        delta_constant[a + 1] += lst[i] * (a + 1)
        delta_constant[b + 2] += lst[i] * (b + 1)
        delta_linear[b + 2] -= lst[i]
        delta_linear[a + b + 2] += lst[i]
        delta_constant[a + b + 2] -= lst[i] * (a + 1)
        delta_constant[a + b + 2] -= lst[i] * (b + 1)
    result = []
    constant = 0
    linear = 0
    for j in range(1, len(lst) + 1):
        constant += delta_constant[j]
        linear += delta_linear[j]
        result.append(constant + linear * j)
    return result

print(sliding_window_totals([5, 3, 12, 4]))
Let's determine for every element the interval where this element is dominating (i.e., is the maximum). We can do this in linear time with forward and backward runs using a stack. Arrays L and R will contain the indexes just outside the domination interval.
To get right and left indexes:
Stack.Push(0)                       //(1st element index)
for i = 1 to Len - 1 do
    while not Stack.Empty and X[Stack.Peek] < X[i] do
        j = Stack.Pop
        R[j] = i                    //j-th position is dominated by i-th one from the right
    Stack.Push(i)
while not Stack.Empty
    R[Stack.Pop] = Len              //the rest of elements are not dominated from the right

//now right to left
Stack.Push(Len - 1)                 //(last element index)
for i = Len - 2 downto 0 do
    while not Stack.Empty and X[Stack.Peek] < X[i] do
        j = Stack.Pop
        L[j] = i                    //j-th position is dominated by i-th one from the left
    Stack.Push(i)
while not Stack.Empty
    L[Stack.Pop] = -1               //the rest of elements are not dominated from the left
Result for the (5,7,3,9,4) array.
For example, 7 dominates the interval 0..2, and 9 dominates 0..4:
i   0   1   2   3   4
X   5   7   3   9   4
R   1   3   3   5   5
L  -1  -1   1  -1   3
Now for every element we can count its impact in every possible sum.
Element 5 dominates at (0,0) interval, it is summed only in k=1 sum entry
Element 7 dominates at (0,2) interval, it is summed once in k=1 sum entry, twice in k=2 entry, once in k=3 entry.
Element 3 dominates at (2,2) interval, it is summed only in k=1 sum entry
Element 9 dominates at (0,4) interval, it is summed once in k=1 sum entry, twice in k=2, twice in k=3, twice in k=4, once in k=5.
Element 4 dominates at (4,4) interval, it is summed only in k=1 sum entry.
In general, an element with a long domination interval in the center of a long array may give up to k*Value impact to the k-length sum (it depends on its position relative to the array ends and to other dominating elements).
        k=1   k=2   k=3   k=4   k=5
------------------------------------
5        5
7        7    2*7    7
3        3
9        9    2*9   2*9   2*9    9
4        4
------------------------------------
S(k)    28    32    25    18     9
Note that the sum of the coefficients is N*(N+1)/2 (equal to the number of possible windows), and most of the table entries are empty, so the complexity seems better than O(N^2)
(I still have doubts about the exact complexity)
The sum of maximum in sliding windows for a given window size can be computed in linear time using a double ended queue that keeps elements from the current window. We maintain the deque such that the first (index 0, left most) element in the queue is always the maximum of the current window.
This is done by iterating over the array and in each iteration, first we remove the first element in the deque if it is no longer in the current window (we do that by checking its original position, which is also saved in the deque together with its value). Then, we remove any elements from the end of the deque that are smaller than the current element, and finally we add the current element to the end of the deque.
The complexity is O(N) for computing the maximums of all sliding windows of a single size K, and that is the best possible time for one value of K (that is easy to see). To compute the sums for all values of K from 1..N, the simple approach is to repeat the computation for each different value of K, which leads to an overall time of O(N^2). Is there a better way? No, because even if we save the result from the computation for one value of K, we would not be able to use it to compute the result for a different value of K in less than O(N) time. So the best time is O(N^2).
The following is an implementation in python:
from collections import deque
def slide_win(l, k):
dq=deque()
for i in range(len(l)):
if len(dq)>0 and dq[0][1]<=i-k:
dq.popleft()
while len(dq)>0 and l[i]>=dq[-1][0]:
dq.pop()
dq.append((l[i],i))
if i>=k-1:
yield dq[0][0]
def main():
l=[5,3,12,4]
print("l="+str(l))
for k in range(1, len(l)+1):
s=0
for x in slide_win(l,k):
s+=x
print("k="+str(k)+" Sum="+str(s))

Probabilty based on quicksort partition

I have come across this question:
Let 0<α<.5 be some constant (independent of the input array length n). Recall the Partition subroutine employed by the QuickSort algorithm, as explained in lecture. What is the probability that, with a randomly chosen pivot element, the Partition subroutine produces a split in which the size of the smaller of the two subarrays is ≥α times the size of the original array?
Its answer is 1-2*α.
Can anyone explain to me how this answer was arrived at? Please help.
The choice of the pivot element is random, with uniform distribution.
There are N elements in the array, and we will assume that N is large (or we won't get the answer we want).
If 0≤α≤1, the probability that the number of elements smaller than the pivot is less than αN is α. The probability that the number of elements greater than the pivot is less than αN is the same. If α≤ 1/2, then these two possibilities are exclusive.
To say that the smaller subarray is of length ≥αN, is to say that neither of these conditions holds, therefore the probability is 1-2α.
The other answers didn't quite click with me so here's another take:
If the smaller of the two subarrays must have size at least αn, then the pivot must land at least αn positions away from either end of the array. This is obvious by contradiction: if the pivot lands among the first αn positions, the left subarray is smaller than αn, and by the same reasoning it cannot land among the last αn positions either. So the pivot's position must lie between αn and (1 - α)n.
What we want to calculate then is the probability of that event (call it A), i.e. P(αn ≤ pivot position ≤ (1 - α)n).
The way we calculate the probability of an event is to sum the probabilities of the constituent outcomes, i.e. of the pivot landing at each admissible position, each with probability 1/n.
That sum is expressed as:
P(A) = sum over the positions from αn to (1 - α)n of 1/n
Which easily simplifies to:
P(A) = ((1 - α)n - αn) / n
With some cancellation we get:
P(A) = 1 - 2α
Just one more approach for solving the problem (for those who have uneasy time understanding it, like I have).
First.
Since we are talking about "the smaller of the two subarrays", its length is at most 1/2 * n (n being the number of elements in the original array).
Second.
If 0 < a < 0.5, then a * n is also less than 1/2 * n.
And thus from now on we are talking about two randomly chosen integers bounded by 0 at the lowest and 1/2 * n at the highest.
Third.
Let's imagine a die with numbers from 1 to 6 on its sides. Let's choose a number from 1 to 6, for example 4. Now roll the die. Each number has a probability of 1/6 to be the outcome of this roll. Thus for the event "outcome is less than or equal to 4" we have a probability equal to the sum of the probabilities of each of these outcomes, namely of the numbers 1, 2, 3 and 4. Altogether p(x <= 4) = 4 * 1/6 = 4/6 = 2/3. So the probability of the event "outcome is bigger than 4" is p(x > 4) = 1 - p(x <= 4) = 1 - 2/3 = 1/3.
Fourth.
Let's go back to our problem. The "chosen number" is now a * n, and we are going to roll a die with the numbers from 0 to (1/2 * n) on it to get k - the number of elements in the smaller of the subarrays. The probability that the outcome is at most (a * n) equals the sum of the probabilities of all outcomes from 0 to (a * n), and the probability of any particular outcome k is p(k) = 1 / (1/2 * n).
Therefore p(k <= a * n) = (a * n) * (1 / (1/2 * n)) = 2 * a.
From this we can easily conclude that p(k > a * n) = 1 - p(k <= a * n) = 1 - 2 * a.
Array length is n.
For the smaller subarray to have length >= αn, the pivot should be greater than at least αn of the elements. At the same time the pivot should be smaller than at least αn of the elements (else the smaller subarray's size will be less than required).
So out of n elements we have to select one among (1 - 2α)n elements.
The required probability is (1 - 2α)n / n.
Hence 1 - 2α.
The probability would be the number of desired elements / the total number of elements.
In this case, ((1-α)n - αn)/n.
Since α lies between 0 and 0.5, (1-α) must be bigger than α. Hence the number of elements contained between them would be
(1-α-α)n = (1-2α)n
and so the probability would be
(1-2α)n/n = 1-2α
Another approach:
List the "more balanced" options:
αn + 1 to (1 - α)n - 1
αn + 2 to (1 - α)n - 2
...
αn + k to (1 - α)n - k
So k in total. We know that the most balanced is n / 2 to n / 2, so:
αn + k = n / 2 => k = n(1/2 - α)
Similarly, list the "less balanced" options:
αn - 1 to (1 - α)n + 1
αn - 2 to (1 - α)n + 2
...
αn - m to (1 - α)n + m
So m in total. We know that the least balanced is 0 to n so:
αn - m = 0 => m = αn
Since all these options happen with equal probability we can use the frequency definition of probability so:
Pr{More balanced} = (total # of more balanced) / (total # of options) =>
Pr{More balanced} = k / (k + m) = n(1/2 - α) / (n(1/2 - α) + αn) = 1 - 2α
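For anyone who wants to sanity-check the 1 - 2α result numerically, here is a small Python simulation (my addition, not part of the answers above) that draws a uniformly random pivot rank and measures how often the smaller side has size at least αn:

import random

def estimate_probability(n=1000, alpha=0.25, trials=100000):
    # Estimate P(smaller subarray size >= alpha * n) for a uniform random pivot.
    hits = 0
    for _ in range(trials):
        k = random.randrange(n)              # pivot rank: k elements go left, n-1-k go right
        if min(k, n - 1 - k) >= alpha * n:
            hits += 1
    return hits / trials

print(estimate_probability())                # close to 1 - 2*0.25 = 0.5

For alpha = 0.25 the estimate comes out near 0.5, matching 1 - 2α.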

What would cause an algorithm to have O(log n) complexity?

My knowledge of big-O is limited, and when log terms show up in the equation it throws me off even more.
Can someone maybe explain to me in simple terms what a O(log n) algorithm is? Where does the logarithm come from?
This specifically came up when I was trying to solve this midterm practice question:
Let X(1..n) and Y(1..n) contain two lists of integers, each sorted in nondecreasing order. Give an O(log n)-time algorithm to find the median (or the nth smallest integer) of all 2n combined elements. For ex, X = (4, 5, 7, 8, 9) and Y = (3, 5, 8, 9, 10), then 7 is the median of the combined list (3, 4, 5, 5, 7, 8, 8, 9, 9, 10). [Hint: use concepts of binary search]
I have to agree that it's pretty weird the first time you see an O(log n) algorithm... where on earth does that logarithm come from? However, it turns out that there are several different ways that you can get a log term to show up in big-O notation. Here are a few:
Repeatedly dividing by a constant
Take any number n; say, 16. How many times can you divide n by two before you get a number less than or equal to one? For 16, we have that
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
Notice that this ends up taking four steps to complete. Interestingly, we also have that log2 16 = 4. Hmmm... what about 128?
128 / 2 = 64
64 / 2 = 32
32 / 2 = 16
16 / 2 = 8
8 / 2 = 4
4 / 2 = 2
2 / 2 = 1
This took seven steps, and log2 128 = 7. Is this a coincidence? Nope! There's a good reason for this. Suppose that we divide a number n by 2 a total of i times. Then we get the number n / 2^i. If we want to solve for the value of i where this value is at most 1, we get
n / 2^i ≤ 1
n ≤ 2^i
log2 n ≤ i
In other words, if we pick an integer i such that i ≥ log2 n, then after dividing n in half i times we'll have a value that is at most 1. The smallest i for which this is guaranteed is roughly log2 n, so if we have an algorithm that divides by 2 until the number gets sufficiently small, then we can say that it terminates in O(log n) steps.
An important detail is that it doesn't matter what constant you're dividing n by (as long as it's greater than one); if you divide by the constant k, it will take logk n steps to reach 1. Thus any algorithm that repeatedly divides the input size by some fraction will need O(log n) iterations to terminate. Those iterations might take a lot of time and so the net runtime needn't be O(log n), but the number of steps will be logarithmic.
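As a tiny illustration of that claim (my addition, not part of the original answer), this snippet just counts the halvings and reproduces the 16 → 4 and 128 → 7 figures from above:

def halving_steps(n):
    # Count how many times n can be divided by 2 before it is <= 1.
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(halving_steps(16), halving_steps(128))   # prints: 4 7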
So where does this come up? One classic example is binary search, a fast algorithm for searching a sorted array for a value. The algorithm works like this:
If the array is empty, return that the element isn't present in the array.
Otherwise:
Look at the middle element of the array.
If it's equal to the element we're looking for, return success.
If it's greater than the element we're looking for:
Throw away the second half of the array.
Repeat
If it's less than the element we're looking for:
Throw away the first half of the array.
Repeat
For example, to search for 5 in the array
1 3 5 7 9 11 13
We'd first look at the middle element:
1 3 5 7 9 11 13
^
Since 7 > 5, and since the array is sorted, we know for a fact that the number 5 can't be in the back half of the array, so we can just discard it. This leaves
1 3 5
So now we look at the middle element here:
1 3 5
^
Since 3 < 5, we know that 5 can't appear in the first half of the array, so we can throw away the first half of the array to leave
5
Again we look at the middle of this array:
5
^
Since this is exactly the number we're looking for, we can report that 5 is indeed in the array.
So how efficient is this? Well, on each iteration we're throwing away at least half of the remaining array elements. The algorithm stops as soon as the array is empty or we find the value we want. In the worst case, the element isn't there, so we keep halving the size of the array until we run out of elements. How long does this take? Well, since we keep cutting the array in half over and over again, we will be done in at most O(log n) iterations, since we can't cut the array in half more than O(log n) times before we run out of array elements.
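Here is a compact Python version of the binary search just described (iterative rather than recursive, but it discards half of the remaining range on every step in the same way):

def binary_search(arr, target):
    # Return True if target is in the sorted list arr.
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return True
        elif arr[mid] > target:
            hi = mid - 1        # throw away the second half
        else:
            lo = mid + 1        # throw away the first half
    return False

print(binary_search([1, 3, 5, 7, 9, 11, 13], 5))   # True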
Algorithms following the general technique of divide-and-conquer (cutting the problem into pieces, solving those pieces, then putting the problem back together) tend to have logarithmic terms in them for this same reason - you can't keep cutting some object in half more than O(log n) times. You might want to look at merge sort as a great example of this.
Processing values one digit at a time
How many digits are in the base-10 number n? Well, if there are k digits in the number, then its highest-order digit is worth some multiple of 10^(k-1). The largest k-digit number is 999...9 (k nines), and this is equal to 10^k - 1. Consequently, if we know that n has k digits in it, then we know that the value of n is at most 10^k - 1. If we want to solve for k in terms of n, we get
n ≤ 10^k - 1
n + 1 ≤ 10^k
log10 (n + 1) ≤ k
From which we get that k is approximately the base-10 logarithm of n. In other words, the number of digits in n is O(log n).
For example, let's think about the complexity of adding two large numbers that are too big to fit into a machine word. Suppose that we have those numbers represented in base 10, and we'll call the numbers m and n. One way to add them is through the grade-school method - write the numbers out one digit at a time, then work from the right to the left. For example, to add 1337 and 2065, we'd start by writing the numbers out as
1 3 3 7
+ 2 0 6 5
==============
We add the last digit and carry the 1:
1
1 3 3 7
+ 2 0 6 5
==============
2
Then we add the second-to-last ("penultimate") digit and carry the 1:
1 1
1 3 3 7
+ 2 0 6 5
==============
0 2
Next, we add the third-to-last ("antepenultimate") digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
4 0 2
Finally, we add the fourth-to-last ("preantepenultimate"... I love English) digit:
1 1
1 3 3 7
+ 2 0 6 5
==============
3 4 0 2
Now, how much work did we do? We do a total of O(1) work per digit (that is, a constant amount of work), and there are O(max{log n, log m}) total digits that need to be processed. This gives a total of O(max{log n, log m}) complexity, because we need to visit each digit in the two numbers.
Many algorithms get an O(log n) term in them from working one digit at a time in some base. A classic example is radix sort, which sorts integers one digit at a time. There are many flavors of radix sort, but they usually run in time O(n log U), where U is the largest possible integer that's being sorted. The reason for this is that each pass of the sort takes O(n) time, and there are a total of O(log U) iterations required to process each of the O(log U) digits of the largest number being sorted. Many advanced algorithms, such as Gabow's shortest-paths algorithm or the scaling version of the Ford-Fulkerson max-flow algorithm, have a log term in their complexity because they work one digit at a time.
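As a concrete instance of the digit-at-a-time pattern, here is a minimal LSD radix sort sketch in Python (base 10 for readability, and my own naming); it makes one O(n) bucketing pass per digit, so O(log U) passes in total:

def radix_sort(nums, base=10):
    # Sort non-negative integers one digit per pass, least significant digit first.
    if not nums:
        return nums
    max_val = max(nums)
    exp = 1
    while exp <= max_val:
        buckets = [[] for _ in range(base)]
        for x in nums:
            buckets[(x // exp) % base].append(x)   # stable bucketing by the current digit
        nums = [x for bucket in buckets for x in bucket]
        exp *= base
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))   # [2, 24, 45, 66, 75, 90, 170, 802]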
As to your second question about how you solve that problem, you may want to look at this related question which explores a more advanced application. Given the general structure of problems that are described here, you now can have a better sense of how to think about problems when you know there's a log term in the result, so I would advise against looking at the answer until you've given it some thought.
When we talk about big-Oh descriptions, we are usually talking about the time it takes to solve problems of a given size. And usually, for simple problems, that size is just characterized by the number of input elements, and that's usually called n, or N. (Obviously that's not always true-- problems with graphs are often characterized in numbers of vertices, V, and number of edges, E; but for now, we'll talk about lists of objects, with N objects in the lists.)
We say that a problem "is big-Oh of (some function of N)" if and only if:
For all N > some arbitrary N_0, there is some constant c, such that the runtime of the algorithm is less than that constant c times (some function of N.)
In other words, don't think about small problems where the "constant overhead" of setting up the problem matters, think about big problems. And when thinking about big problems, big-Oh of (some function of N) means that the run-time is still always less than some constant times that function. Always.
In short, that function is an upper bound, up to a constant factor.
So, "big-Oh of log(n)" means the same thing that I said above, except "some function of N" is replaced with "log(n)."
So, your problem tells you to think about binary search, so let's think about that. Let's assume you have, say, a list of N elements that are sorted in increasing order. You want to find out if some given number exists in that list. One way to do that which is not a binary search is to just scan each element of the list and see if it's your target number. You might get lucky and find it on the first try. But in the worst case, you'll check N different times. This is not binary search, and it is not big-Oh of log(N) because there's no way to force it into the criteria we sketched out above.
You can pick that arbitrary constant to be c=10, and if your list has N=32 elements, you're fine: 10*log(32) = 50, which is greater than the runtime of 32. But if N=64, 10*log(64) = 60, which is less than the runtime of 64. You can pick c=100, or 1000, or a gazillion, and you'll still be able to find some N that violates that requirement. In other words, there is no N_0.
If we do a binary search, though, we pick the middle element, and make a comparison. Then we throw out half the numbers, and do it again, and again, and so on. If your N=32, you can only do that about 5 times, which is log(32). If your N=64, you can only do this about 6 times, etc. Now you can pick that arbitrary constant c, in such a way that the requirement is always met for large values of N.
With all that background, what O(log(N)) usually means is that you have some way to do a simple thing, which cuts your problem size in half. Just like the binary search is doing above. Once you cut the problem in half, you can cut it in half again, and again, and again. But, critically, what you can't do is some preprocessing step that would take longer than that O(log(N)) time. So for instance, you can't shuffle your two lists into one big list, unless you can find a way to do that in O(log(N)) time, too.
(NOTE: Nearly always, Log(N) means log-base-two, which is what I assume above.)
In the following solution, every line with a recursive call operates on sub-arrays of X and Y of half the given sizes, and all other lines run in constant time.
The recurrence is T(2n) = T(2n/2) + c = T(n) + c, which solves to O(lg(2n)) = O(lg n).
You start with MEDIAN(X, 1, n, Y, 1, n).
MEDIAN(X, p, r, Y, i, k)
    if X[r] < Y[i]
        return X[r]
    if Y[k] < X[p]
        return Y[k]
    q = floor((p+r)/2)
    j = floor((i+k)/2)
    if r-p+1 is even
        if X[q+1] > Y[j] and Y[j+1] > X[q]
            if X[q] > Y[j]
                return X[q]
            else
                return Y[j]
        if X[q+1] < Y[j-1]
            return MEDIAN(X, q+1, r, Y, i, j)
        else
            return MEDIAN(X, p, q, Y, j+1, k)
    else
        if X[q] > Y[j] and Y[j+1] > X[q-1]
            return Y[j]
        if Y[j] > X[q] and X[q+1] > Y[j-1]
            return X[q]
        if X[q+1] < Y[j-1]
            return MEDIAN(X, q, r, Y, i, j)
        else
            return MEDIAN(X, p, q, Y, j, k)
The Log term pops up very often in algorithm complexity analysis. Here are some explanations:
1. How do you represent a number?
Let's take the number X = 245436. This notation of “245436” has implicit information in it. Making that information explicit:
X = 2 * 10 ^ 5 + 4 * 10 ^ 4 + 5 * 10 ^ 3 + 4 * 10 ^ 2 + 3 * 10 ^ 1 + 6 * 10 ^ 0
Which is the decimal expansion of the number. So, the minimum amount of information we need to represent this number is 6 digits. This is no coincidence, as any number less than 10^d can be represented in d digits.
So how many digits are required to represent X? That's equal to the largest exponent of 10 in X plus 1.
==> 10 ^ d > X
==> log (10 ^ d) > log(X)
==> d* log(10) > log(X)
==> d > log(X) // And log appears again...
==> d = floor(log(x)) + 1
Also note that this is the most concise way to denote the number in this range. Any reduction will lead to information loss, as a missing digit can be mapped to 10 other numbers. For example: 12* can be mapped to 120, 121, 122, …, 129.
2. How do you search for a number in (0, N - 1)?
Taking N = 10^d, we use our most important observation:
The minimum amount of information to uniquely identify a value in a range between 0 to N - 1 = log(N) digits.
This implies that, when asked to search for a number on the integer line, ranging from 0 to N - 1, we need at least log(N) tries to find it. Why? Any search algorithm will need to choose one digit after another in its search for the number.
The minimum number of digits it needs to choose is log(N). Hence the minimum number of operations taken to search for a number in a space of size N is log(N).
Can you guess the order complexities of binary search, ternary search or deca search? Its O(log(N))!
3. How do you sort a set of numbers?
When asked to sort a set of numbers A into an array B, here’s what it looks like ->
Permute Elements
Every element in the original array has to be mapped to its corresponding index in the sorted array. So, for the first element, we have n positions. To correctly find the corresponding index in this range from 0 to n - 1, we need…log(n) operations.
The next element needs log(n-1) operations, the next log(n-2) and so on. The total comes to be:
==> log(n) + log(n - 1) + log(n - 2) + … + log(1)
Using log(a) + log(b) = log(a * b), this is
==> log(n!)
This can be approximated to n*log(n) - n, which is O(n*log(n))!
Hence we conclude that no comparison-based sorting algorithm can do better than O(n*log(n)). Some algorithms having this complexity are the popular Merge Sort and Heap Sort!
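As a quick numerical sanity check of that approximation (my addition; math.lgamma(n + 1) computes ln(n!)):

import math

n = 1000000
log_factorial = math.lgamma(n + 1)     # ln(n!)
stirling = n * math.log(n) - n         # n*ln(n) - n
print(log_factorial, stirling)         # the two agree up to a lower-order 0.5*ln(2*pi*n) term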
These are some of the reasons why we see log(n) pop up so often in the complexity analysis of algorithms. The same reasoning extends to binary numbers; I made a video on that: "Why does log(n) appear so often during algorithm complexity analysis?"
Cheers!
We call the time complexity O(log n) when the solution is based on iterations over n in which the portion of the problem left to process in each iteration is a fraction of what it was in the previous iteration, as the algorithm works towards the solution.
Can't comment yet... necro it is!
Avi Cohen's answer is incorrect, try:
X = 1 3 4 5 8
Y = 2 5 6 7 9
None of the conditions are true, so MEDIAN(X, p, q, Y, j, k) will cut both the fives. These are nondecreasing sequences, not all values are distinct.
Also try this even-length example with distinct values:
X = 1 3 4 7
Y = 2 5 6 8
Now MEDIAN(X, p, q, Y, j+1, k) will cut the four.
Instead I offer this algorithm, call it with MEDIAN(1,n,1,n):
MEDIAN(startx, endx, starty, endy){
if (startx == endx)
return min(X[startx], y[starty])
odd = (startx + endx) % 2 //0 if even, 1 if odd
m = (startx+endx - odd)/2
n = (starty+endy - odd)/2
x = X[m]
y = Y[n]
if x == y
//then there are n-2{+1} total elements smaller than or equal to both x and y
//so this value is the nth smallest
//we have found the median.
return x
if (x < y)
//if we remove some numbers smaller then the median,
//and remove the same amount of numbers bigger than the median,
//the median will not change
//we know the elements before x are smaller than the median,
//and the elements after y are bigger than the median,
//so we discard these and continue the search:
return MEDIAN(m, endx, starty, n + 1 - odd)
else (x > y)
return MEDIAN(startx, m + 1 - odd, n, endy)
}
