Why is binary search O(log n), when it runs 4 times?

I saw some videos about O(log n) time complexity,
then I tried a couple of binary search methods from Internet tutorials,
and I am more confused now.
In computer science, big O notation is used to classify algorithms according to how their running time or space requirements grow as the input size grows.
An example binary search:
https://jsfiddle.net/Hoyly/mhynk7gp/3/
function binarySearch(sortedArray, key) {
  let start = 0;
  let end = sortedArray.length - 1;
  while (start <= end) {
    let middle = Math.floor((start + end) / 2);
    console.log('count', ++count, sortedArray[middle]);
    if (sortedArray[middle] === key) {
      return middle;
    } else if (sortedArray[middle] < key) {
      start = middle + 1;
    } else {
      end = middle - 1;
    }
  }
  return -1;
}
let count= 0;
console.log('first 1:');
let res1 = binarySearch([1,2,3,4,5,6,7,8],8);
console.log('first 2:');
let res2 = binarySearch([1,2,3,4,5,6,7,8],1);
console.log('answer:',res1,res2);
As you can see in the jsfiddle, if I try to find 1 in an 8-element array, the logged call count is 3, and 2^3 = 8; that is why people call this an O(log n) function.
But if I try to find 8, the logged count is 4, and 2^4 != 8, so from the worst case it does not seem to match the definition of O(log n).

The time complexity is O(log n), not log n without the big O. I won't explain the full meaning of big-O here; see the definition on Wikipedia for that.
Suffice it to say that it only gives an upper bound on the growth rate of the runtime as n grows, and only when n is big enough. Even if n = 8 resulted in 1000 calls, the algorithm could still be O(log n).

The binary search here can do one extra step depending on which half of the array you are searching in. If it used Math.ceil instead of Math.floor then 8 would be found in three steps, while 1 would be found in four.
If we expand this to 128 items, then the last item would be found in 7 or 8 steps (again, depending on which half). In general, the real worst case for the number of steps is about log2(n) + 1. However, for big O we do not consider constants, only the growth rate of the function: O(log n + 1) simplifies to O(log n), the same way that O(2n) is still O(n).
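To see the off-by-one behaviour concretely, here is a small sketch (Python rather than the question's JavaScript; count_steps is a made-up helper, not the jsfiddle code) that counts the iterations for a floor and a ceil midpoint and compares them with floor(log2(n)) + 1:
import math

def count_steps(sorted_list, key, use_ceil=False):
    # Same binary search as in the question, but returning how many times the loop ran.
    start, end, steps = 0, len(sorted_list) - 1, 0
    while start <= end:
        mid = (start + end + 1) // 2 if use_ceil else (start + end) // 2
        steps += 1
        if sorted_list[mid] == key:
            return steps
        elif sorted_list[mid] < key:
            start = mid + 1
        else:
            end = mid - 1
    return steps  # key not found; still reports the work done

for n in (8, 128):
    arr = list(range(1, n + 1))
    print(n,
          count_steps(arr, 1), count_steps(arr, n),              # floor midpoint: 3, 4 for n = 8
          count_steps(arr, 1, True), count_steps(arr, n, True),  # ceil midpoint: 4, 3 for n = 8
          math.floor(math.log2(n)) + 1)                          # the log2(n) + 1 worst case
Either way, the worst case only shifts by a constant step, which is exactly what big O ignores.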


Does manipulating n have any impact on the O of an algorithm?
recursive code for example:
public void Foo(int n)
{
    n -= 1;
    if (n <= 0) return;
    n -= 1;
    if (n <= 0) return;
    Foo(n);
}
Does the reassignment of n impact O(N)? Sounds intuitive to me...
Does this algorithm have O(N) by dropping the constant? Technically, since it's decrementing n by 2, it would not have the same mathematical effect as this:
public void Foo(int n) // O(Log n)
{
    if (n <= 0) return;
    Console.WriteLine(n);
    Foo(n / 2);
}
But wouldn't the halving of n contribute to O(N), since you are only touching half of the amount of n? To be clear, I am learning O notation and its subtleties. I have been looking for cases like the first example, but I am having a hard time finding such a specific answer.
The reassignment of n itself is not really what matters when talking about O notation. As an example consider a simple for-loop:
for i in range(n):
    do_something()
In this algorithm, we do something n times. This would be equivalent to the following algorithm
while n > 0:
    do_something()
    n -= 1
And it is equivalent to the first recursive function you presented. So what really matters is how many computations are done compared to the input size, which is the original value of n.
For this reason, all three of these algorithms are O(n) algorithms, since all three of them decrease the 'input size' by one each time. Even if they had decreased it by 2, it would still be an O(n) algorithm, since constants don't matter when using O notation. Thus the following algorithm is also an O(n) algorithm.
while n > 0:
    do_something()
    n -= 2
or
while n > 0:
    do_something()
    n -= 100000
However, the second recursive function you presented is an O(log n) algorithm (even though it does not have a base case and would technically run until the stack overflows), as you've written in the comments. Intuitively, what happens is that when you halve the input size every time, the number of halvings corresponds exactly to taking the logarithm, in base two, of the original input number. Consider the following:
n = 32. The algorithm halves every time: 32 -> 16 -> 8 -> 4 -> 2 -> 1.
In total, we did 5 computations. Equivalently, log2(32) = 5.
So to recap, what matters is the original input size and how many computations are done compared to this input size. Whatever constant may affect the computations does not matter.
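To make the recap concrete, here is a small sketch (a counter stands in for do_something(), which is assumed to be O(1)) that counts how many times each loop body runs for the same n:
def steps_minus(n, step):
    # while n > 0: do_something(); n -= step
    count = 0
    while n > 0:
        count += 1
        n -= step
    return count

def steps_halving(n):
    # while n > 1: do_something(); n //= 2
    count = 0
    while n > 1:
        count += 1
        n //= 2
    return count

n = 1_000_000
print(steps_minus(n, 1))   # n iterations                        -> O(n)
print(steps_minus(n, 2))   # n/2 iterations                      -> still O(n)
print(steps_halving(n))    # about log2(n) iterations (19 here)  -> O(log n)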
If I misunderstood your question, or you have follow-up questions, feel free to comment on this answer.

Proper time complexity with the two loops

1) int p = 0;
2) for (int i = 1; i < n; i*=2) p++;
3) for (int j = 1; j < p; j*=2) stmt;
In my analysis, line #1 is O(1), line #2 is O(lg(n)), and line #3 is O(lg(p)). I believe that the second and third lines are independent. Therefore, the asymptotic time complexity should be O(lg(n) + lg(p)). By the way, the lecturer said O(lglg(n)) because p = lg(n). At this point, I have three questions.
How does the second line relate to the third line? Could you please explain it in detail with some examples? I don't understand how p = lg(n) is obtained.
Is O(lg(n) + lg(p)) wrong? Would you please explain why, if I am wrong?
If the complexity in my second question is correct, I don't understand how O(lglg(n)) can be the answer, because I think O(lg(n) + lg(p)) > O(lglg(n)).
Please comment if you cannot catch the point of my question.
It can be shown that p will be O(log n) after line 2 is finished. Therefore, the overall time complexity is O(O(stmt) * log(p) + log(n)) and since we know p, this can be reduced to O(O(stmt) * log(log(n)) + log(n)). I assume stmt is O(1), so the real runtime would be O(log(n) + log(log(n))). This can be further reduced to O(log(n)) since it can be shown that for any non-trivial n, log(n) > log(log(n)).
Why is p O(log n)? Well, consider what p evaluates to after line 2 is complete when n is 2, 4, 8, 16. Each time, p will end up being 1, 2, 3, 4. Thus, to increase p by one, you need to double n; in other words, n = 2^p, so p = log(n). This same logic must be carried to line 3, and the final construction of the runtime is detailed in the first paragraph of this post.
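To check this numerically, here is a short sketch (a Python translation of line 2 of the snippet, not the lecturer's code) that prints p after the first loop for doubling values of n:
import math

def p_after_first_loop(n):
    # line 2: i doubles until it reaches n, and p counts the doublings
    p, i = 0, 1
    while i < n:
        p += 1
        i *= 2
    return p

for n in (2, 4, 8, 16, 1024):
    print(n, p_after_first_loop(n), math.ceil(math.log2(n)))  # p matches ceil(log2(n))
So the third loop runs O(log p) = O(log log n) times, and the first loop's O(log n) dominates.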
Regarding your question, I made this C program and tried to do the complexity analysis step by step so you can understand it:
#include <stdio.h>

int main() {
    //-------------first line to analyse-------------//
    // O(1), since the input size is siz(p) = 1
    int p = 0;
    int i = 1, j = 1, n = 100;

    //-------------second line to analyse-------------//
    // O(log(n)), since the input size is siz(loop1) = n
    for (i = 1; i < n; i = i * 2) {
        p++;              /* mirrors the p++ from the analysed snippet */
        printf("%d ", i);
    }

    //-------------third line to analyse-------------//
    // O(log(p)), since the input size is siz(loop2) = p
    // we would get O(log(n)) if we assumed that the input size siz(loop2) = p = n
    for (j = 1; j < p; j = j * 2)
        printf("%d ", j);

    return 0;
}
As for the first line, there is one variable p and it can take only one input at a time, so the time complexity is constant.
We can say that int p = 0 is O(1), and we take the function f(n) = O(1).
After that we have the first loop, whose counter grows on a logarithmic scale (log with base 2), so it is O(log(n)), since the input size depends on the variable n.
So the worst-case time complexity is now f(n) = O(1) + O(log(n)).
The third line is the same kind of loop, so its time complexity is O(log(p)), since its input size is p; the 3rd line of code (the 2nd loop) is an independent part of the source code. If it were a nested loop, it would depend on the first loop.
So the time complexity is now f(n) = O(1) + O(log(n)) + O(log(p)).
Now we have the time complexity formula and need to pick the dominant term from it.
** O(log log n): the time complexity of a loop is considered to be O(log log n) if the loop variable is reduced/increased exponentially by a constant amount.
// Here c is a constant greater than 1
for (int i = 2; i <= n; i = pow(i, c)) {
    // some O(1) expressions
}
// Here fun is sqrt or cuberoot or any other constant root
for (int i = n; i > 0; i = fun(i)) {
    // some O(1) expressions
}
So by the reference marked ** we can easily understand that the time complexity will be O(log(log(n))) if the input size p = n. This is the answer to your 3rd question.
reference: time complexity analysis

Example of Big O of 2^n

So I can picture what an algorithm with a complexity of n^c looks like: c is just the number of nested for loops.
for (var i = 0; i < dataset.len; i++) {
    for (var j = 0; j < dataset.len; j++) {
        //do stuff with i and j
    }
}
Log is something that splits the data set in half every time; binary search does this (I'm not entirely sure what the code for this looks like).
But what is a simple example of an algorithm that is c^n, or more specifically 2^n? Is O(2^n) based on loops through data? Or how data is split? Or something else entirely?
Algorithms with running time O(2^N) are often recursive algorithms that solve a problem of size N by recursively solving two smaller problems of size N-1.
This program, for instance, prints out (in pseudo-code) all the moves necessary to solve the famous "Towers of Hanoi" problem for N disks:
void solve_hanoi(int N, string from_peg, string to_peg, string spare_peg)
{
    if (N < 1) {
        return;
    }
    if (N > 1) {
        solve_hanoi(N-1, from_peg, spare_peg, to_peg);
    }
    print "move from " + from_peg + " to " + to_peg;
    if (N > 1) {
        solve_hanoi(N-1, spare_peg, to_peg, from_peg);
    }
}
Let T(N) be the time it takes for N disks.
We have:
T(1) = O(1)
and
T(N) = O(1) + 2*T(N-1) when N>1
If you repeatedly expand the last term, you get:
T(N) = 3*O(1) + 4*T(N-2)
T(N) = 7*O(1) + 8*T(N-3)
...
T(N) = (2^(N-1)-1)*O(1) + (2^(N-1))*T(1)
T(N) = (2^N - 1)*O(1)
T(N) = O(2^N)
To actually figure this out, you just have to know that certain patterns in the recurrence relation lead to exponential results. Generally, T(N) = ... + C*T(N-1) with C > 1 means O(C^N). See:
https://en.wikipedia.org/wiki/Recurrence_relation
Think about, e.g., iterating over all possible subsets of a set. This kind of algorithm is used, for instance, for a generalized knapsack problem.
If you find it hard to understand how iterating over subsets translates to O(2^n), imagine a set of n switches, each of them corresponding to one element of the set. Now, each of the switches can be turned on or off. Think of "on" as meaning the element is in the subset. Note how many combinations are possible: 2^n.
If you want to see an example in code, it's usually easier to think about recursion here, but I can't think of any other nice and understandable example right now.
Consider that you want to guess the PIN of a smartphone; this PIN is a 4-digit integer. You know that the maximum number of bits needed to hold a 4-digit number is 14 bits. So you will have to guess the value, the correct 14-bit combination let's say, of this PIN out of the 2^14 = 16384 possible values!
The only way is to brute force. So, for simplicity, consider this simple 2-bit word that you want to guess right, each bit has 2 possible values, 0 or 1. So, all the possibilities are:
00
01
10
11
We know that all possibilities of an n-bit word will be 2^n possible combinations. So, 2^2 is 4 possible combinations as we saw earlier.
The same applies to the 14-bit integer PIN, so guessing the PIN would require you to solve a 2^14 possible outcome puzzle, hence an algorithm of time complexity O(2^n).
So, those types of problems, where the combinations of elements in a set S differ, and you have to try all possible combinations to solve the problem, will have this O(2^n) time complexity. But the exponentiation base does not have to be 2. In the example above it is base 2 because each element, each bit, has two possible values, which will not be the case in other problems.
Another good example of O(2^n) algorithms is the recursive knapsack. Where you have to try different combinations to maximize the value, where each element in the set, has two possible values, whether we take it or not.
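For illustration, a minimal sketch of that naive recursive knapsack (the weights, values and capacity below are made up just to show the take-or-skip branching):
def knapsack(weights, values, capacity, i=0):
    # Each item is either skipped or taken (if it fits): two branches per item,
    # so the naive recursion explores O(2^n) combinations.
    if i == len(weights):
        return 0
    best = knapsack(weights, values, capacity, i + 1)  # skip item i
    if weights[i] <= capacity:                         # take item i
        best = max(best, values[i] + knapsack(weights, values, capacity - weights[i], i + 1))
    return best

print(knapsack([2, 3, 4], [3, 4, 5], 5))  # -> 7 (take the first two items)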
The Edit Distance problem has O(3^n) time complexity, since you have 3 decisions to choose from for each of the n characters of the string: deletion, insertion, or replacement.
int Fibonacci(int number)
{
    if (number <= 1) return number;
    return Fibonacci(number - 2) + Fibonacci(number - 1);
}
Growth doubles with each addition to the input data set. The growth curve of an O(2^N) function is exponential - starting off very shallow, then rising meteorically.
My example of big O(2^n), and a much better one, is this:
public void solve(int n, String start, String auxiliary, String end) {
    if (n == 1) {
        System.out.println(start + " -> " + end);
    } else {
        solve(n - 1, start, end, auxiliary);
        System.out.println(start + " -> " + end);
        solve(n - 1, auxiliary, start, end);
    }
}
In this method the program prints all the moves needed to solve the "Tower of Hanoi" problem.
Both examples use recursion to solve the problem and have O(2^n) running time.
c^N = all combinations of N elements from a c-sized alphabet.
More specifically, 2^N is all the numbers representable with N bits.
The common cases are implemented recursively, something like:
vector<int> bits;
int N;

void find_solution(int pos) {
    if (pos == N) {
        check_solution();  // examine one of the 2^N bit assignments
        return;
    }
    bits[pos] = 0;
    find_solution(pos + 1);
    bits[pos] = 1;
    find_solution(pos + 1);
}
Here is a code snippet that computes the value sum of every combination of values in a goods array (values is a global array variable):
fun boom(idx: Int, pre: Int, include: Boolean) {
    if (idx < 0) return
    boom(idx - 1, pre + if (include) values[idx] else 0, true)
    boom(idx - 1, pre + if (include) values[idx] else 0, false)
    println(pre + if (include) values[idx] else 0)
}
As you can see, it's recursive. We can nest loops to get polynomial complexity, and use recursion to get exponential complexity.
Here are two simple examples in python with Big O/Landau (2^N):
#fibonacci
def fib(num):
    if num == 0 or num == 1:
        return num
    else:
        return fib(num - 1) + fib(num - 2)

num = 10
for i in range(0, num):
    print(fib(i))
#tower of Hanoi
def move(disk, from_rod, to_rod, aux_rod):
    if disk >= 1:
        # move the top disk-1 disks from the source rod to the auxiliary rod
        move(disk - 1, from_rod, aux_rod, to_rod)
        print("Move disk", disk, "from rod", from_rod, "to rod", to_rod)
        move(disk - 1, aux_rod, to_rod, from_rod)

n = 3
move(n, 'A', 'B', 'C')
Assuming that a set is a subset of itself, there are 2ⁿ possible subsets of a set with n elements.
Think of it this way: to make a subset, let's take one element. This element has two possibilities in the subset you're creating: present or absent. The same applies for all the other elements in the set. Multiplying all these possibilities, you arrive at 2ⁿ.
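For illustration, a minimal recursive sketch of that present/absent branching (Python, not taken from any of the answers above):
def all_subsets(elements, chosen=()):
    # Each element acts like a switch: it is either in the subset or not,
    # so the recursion branches twice per element and reaches 2^n leaves.
    if not elements:
        print(chosen)  # one of the 2^n subsets
        return
    first, rest = elements[0], elements[1:]
    all_subsets(rest, chosen)             # switch off: leave first out
    all_subsets(rest, chosen + (first,))  # switch on: include first

all_subsets(('a', 'b', 'c'))  # prints 2^3 = 8 subsets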

Algorithm time complexity with binary search

I am trying to figure out what the time complexity of my algorithm is. I have an algorithm with binary search, which is in general O(log n), I know. But I search between two constants, namely x = 1 and x = 2^31 - 1 (the maximum integer value). I think that in the worst case my time complexity is log2(2^31) = 31, so the binary search takes 31 steps in the worst case. However, in every step of the binary search I call a function which has O(n) runtime (just one loop over the size of the input). Will my algorithm simply be of order O(31n) = O(n)?
The input of my algorithm: a key, two arrays a and b of size n.
In code it will look something like this:
binarySearch(key, a, b)
    min = 0, max = 2^31 - 1
    mid = (min + max) / 2
    while (min <= max) {
        x = function(mid, a, b); //this function has O(n)
        if (x == key) {
            return mid;
        } else if (x < key) {
            min = mid + 1
        } else {
            max = mid - 1
        }
        mid = (min + max) / 2
    }
    return KEY_NOT_FOUND
I just want to be sure, please if you come with a time complexity (reduced ones) explain your answer.
Update
Yes, you are absolutely right.
In the worst case function() will be invoked 31 times, and each invocation requires time O(n), hence the running time of your algorithm is simply given by 31 * O(n) = O(n).
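A rough sketch of that bound (Python; the f argument below stands in for your O(n) function, replaced here by a trivial monotone function just so the loop terminates):
def search_constant_range(key, f, lo=0, hi=2**31 - 1):
    # Binary search over a fixed range of constants, counting calls to f.
    calls = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        calls += 1
        x = f(mid)  # the O(n) helper
        if x == key:
            break
        elif x < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return calls

print(search_constant_range(123456, lambda mid: mid))  # never more than 31 calls
Since the loop runs at most ceil(log2(2^31)) = 31 times regardless of n, the total work is at most 31 * O(n) = O(n).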
Solution for the original question where x = function(mid)
Your question is a bit fishy: the time complexity of your algorithm should be O(1).
One important point when we talk about the time complexity of an algorithm is that:
We always consider the time that the algorithm requires with respect to the size of its input.
In the following snippet:
x = function(mid); //this function has O(n)
While function() may be a linear-time function, in your case function() only takes its input (mid) from the set {0, 1, ..., 2^30}, so in the worst case function() computes in time max{T(0), T(1), ..., T(2^30)} = T, which is a constant!
So in the worst case, your while loop will invoke function() 31 times, so in the worst case your algorithm runs in time 31 * T, which is a constant.
Note that the input of your algorithm is key, and the worst case running time of your algorithm is 31 * T, which is actually independent of the size of your input key! So the time complexity is O(1).
In your case, I don't think talking about time complexity in terms of big-O notation is appropriate. I would suggest you to talk about the numbers of computation steps required in the worst case.

Time complexity

The problem is finding the majority element in an array.
I understand how this algorithm works, but I don't know why it has O(n log n) as its time complexity.
a. Both return "no majority." Then neither half of the array has a majority element, and the combined array cannot have a majority element. Therefore, the call returns "no majority."
b. The right side is a majority, and the left isn't. The only possible majority for this level is with the value that formed a majority on the right half, therefore, just compare every element in the combined array and count the number of elements that are equal to this value. If it is a majority element then return that element, else return "no majority."
c. Same as above, but with the left returning a majority, and the right returning "no majority."
d. Both sub-calls return a majority element. Count the number of elements equal to both of the candidates for majority element. If either is a majority element in the combined array, then return it. Otherwise, return "no majority."
The top level simply returns either a majority element or that no majority element exists in the same way.
Therefore, T(1) = 0 and T(n) = 2T(n/2) + 2n = O(n log n)
I think that in every recursion it compares the candidate majority element to the whole array, which takes 2n:
T(n) = 2T(n/2) + 2n = 2(2T(n/4) + 2n) + 2n = ... = 2^k T(n/2^k) + 2n + 4n + 8n + ... + 2^k n = O(n^2)
T(n) = 2T(n/2) + 2n
The question is how many iterations does it take for n to get to 1.
We divide by 2 in each iteration, so we get the series: n, n/2, n/4, n/8, ..., n/(2^k)
So, let's find k that will bring us to 1 (last iteration):
n/(2^k)=1 .. n=2^k ... k=log(n)
So we got log(n) iterations.
Now, in each iteration we do 2n operations (fewer, actually, because we divide n by 2 each time), but for the worst-case scenario let's say 2n.
So in total, we get log(n) iterations with O(n) operations each: O(n log n).
I'm not sure if I understand, but couldn't you just create a hash map, walk over the array, incrementing hash[value] at every step, then sort the hash map (O(m log m) time complexity) and compare the top two elements? This would cost you O(n) + O(m log m) + O(1) = O(n + m log m), with n the size of the array and m the number of different elements in the array.
Am I mistaken here? Or ...?
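For what it's worth, a quick sketch of that counting idea (using Python's collections.Counter; this is a different algorithm from the divide-and-conquer exercise being analysed here):
from collections import Counter

def majority_via_counts(arr):
    # O(n) to count; picking the most common of m distinct values is O(m).
    counts = Counter(arr)
    candidate, freq = counts.most_common(1)[0]
    return candidate if freq * 2 > len(arr) else None

print(majority_via_counts([3, 3, 4, 2, 3, 3, 3]))  # -> 3
print(majority_via_counts([1, 2, 3]))              # -> None (no majority)
You don't even need the sort: tracking the most frequent value while counting keeps the whole thing linear in practice.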
When you do this recursively, you split the array in two at each level, make a call for each half, then perform one of the tests a-d. Test a requires no looping; the other tests require looping through the entire array. On average you will loop through (0 + 1 + 1 + 1) / 4 = 3/4 of the array for each level in the recursion.
The number of levels in the recursion is based on the size of the array. As you split the array in half each level, the number of levels will be log2(n).
So, the total work is (n * 3/4) * log2(n). As constants are irrelevant to the time complexity, and all logarithms are the same, the complexity is O(n * log n).
Edit:
If someone is wondering about the algorithm, here's a C# implementation. :)
private int? FindMajority(int[] arr, int start, int len) {
    if (len == 1) return arr[start];
    int len1 = len / 2, len2 = len - len1;
    int? m1 = FindMajority(arr, start, len1);
    int? m2 = FindMajority(arr, start + len1, len2);
    int cnt1 = m1.HasValue ? arr.Skip(start).Take(len).Count(n => n == m1.Value) : 0;
    if (cnt1 * 2 >= len) return m1;
    int cnt2 = m2.HasValue ? arr.Skip(start).Take(len).Count(n => n == m2.Value) : 0;
    if (cnt2 * 2 >= len) return m2;
    return null;
}
This guy has a lot of videos on recurrence relations and the different techniques you can use to solve them:
https://www.youtube.com/watch?v=TEzbkIggJfo&list=PLj68PAxAKGoyyBwi6qrfcsqE_4trSO1yL
Basically for this problem I would use the Master Theorem:
https://youtu.be/i5kTZof1LRY
T(1) = 0 and T(n) = 2T(n/2) + 2n
Master Theorem ==> T(n) = A*T(n/B) + O(n^D), so in this case A = 2, B = 2, D = 1
Since log_B(A) = log_2(2) = 1 = D, the Master Theorem gives O(n^D * log n), so this is O(n log n)
You can also use another method to solve this (below) it would just take a little bit more time:
https://youtu.be/TEzbkIggJfo?list=PLj68PAxAKGoyyBwi6qrfcsqE_4trSO1yL
I hope this helps you out !

Resources