What is the complexity of this algorithm?

procedure max(a[1..n]: integers)
    max := a[1]
    for i := 2 to n
        if max < a[i] then max := a[i]
    return max
Is the complexity O(1) or O(n) in the best case? The sequence contains n elements, and the above is pseudocode.

There's no difference between the best case and worst case asymptotic running times for this algorithm. In all cases, you have to traverse the whole array (n elements) and your algorithm would be O(n).
Theoretically, there's no way you can find the maximum element of an arbitrary array in less than O(n), since you must visit each element at least once.
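For concreteness, the pseudocode above translates directly into a single linear scan. Here is a minimal C++ sketch (the function name findMax and the use of std::vector are my own choices, not part of the original question):

#include <cassert>
#include <vector>

// Returns the largest element of a non-empty array.
// Every element is visited exactly once, so the running time is Theta(n)
// in the best, average, and worst case.
int findMax(const std::vector<int>& a)
{
    assert(!a.empty());
    int max = a[0];                                 // max := a[1]
    for (std::size_t i = 1; i < a.size(); ++i)      // for i := 2 to n
        if (max < a[i]) max = a[i];                 // if max < a[i] then max := a[i]
    return max;
}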

The algorithm is O(n) in the best case, average case, and worst case. (The pseudocode itself is fine, by the way: max < a[i] is the right test, since max should be replaced whenever a strictly larger element is found.)

O(n) - you have to scan n elements, or at least a constant fraction of them (n/2, n/4, etc.), which is still O(n).

Roughly, O(1) means that whatever the size of the input, you can compute the solution in a fixed number of steps.
O(n) means that if you have n inputs, the solution takes about A·n steps (where A is a constant, not another variable). If you have a for loop that counts from 2 to n, that is n − 1 iterations, so the work grows at the same rate as the input, which means it's O(n). But that's just what a linear scan is. :)

If you have code that reads:
for i := 2 to n
Then that code will be O(n) even in the best case, since the loop alone runs n − 1 times.
I'm curious why you think it might be constant time?

You have to traverse the whole array.
So the complexity would be O(n).

O(n)
You could achieve O(1) if the array were sorted: you would just return the last element.
But for arbitrarily arranged elements, the best you can do for this problem is O(n).

Related

Counting Sort has a lower bound of O(n)

The running time of counting sort is Θ(n + k), where k represents the range of the input elements. If k = O(n), the algorithm is O(n).
Can I say that counting sort has a lower bound of Ω(n) because the algorithm takes O(n) time to solve the problem, and that this lower bound of Ω(n) shows that there is no hope of solving this specific computational problem in time better than Ω(n)?
Well, yes: since T(n, k) = Θ(n + k), we have T(n, k) = Ω(n + k). Since k is nonnegative, we know that n + k = Ω(n), and so T(n, k) = Ω(n) as required.
Another perspective on why the lower bound is indeed Ω(n): if you want to sort an array of n elements, you need to at least look at all the array elements. If you don’t, you can’t form a sorted list of all the elements of the array because you won’t know what those array elements are. :-)
That gives an immediate Ω(n) lower bound for sorting any sequence of n elements, unless you can read multiple elements of the sequence at once (say, using parallelism or if the array elements are so small that you can read several with a single machine instruction.)
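To make the Θ(n + k) bound concrete, here is a hedged counting-sort sketch in C++ (the function name and the explicit range parameter k are my own; the answers above only discuss the bound):

#include <vector>

// Counting sort for n values in the range [0, k).
// The tally loop is Theta(n) and the output loop is Theta(n + k),
// so the total running time is Theta(n + k); if k = O(n), this is Theta(n).
std::vector<int> countingSort(const std::vector<int>& a, int k)
{
    std::vector<int> count(k, 0);
    for (int x : a)                     // tally each value
        ++count[x];

    std::vector<int> out;
    out.reserve(a.size());
    for (int v = 0; v < k; ++v)         // walk the whole range [0, k)
        out.insert(out.end(), count[v], v);
    return out;
}

Even on the easiest input, the first loop touches all n elements and the second walks the whole range [0, k), which is why the Ω(n) (indeed Ω(n + k)) bound on this particular algorithm holds in every case.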

Quicksort time and space complexity?

Quicksort is an in-place algorithm that does not use any auxiliary array, so why is its memory complexity given as O(n log(n))?
Similarly, I understand that its worst-case time complexity is O(n^2), but I don't see why the average-case time complexity is O(n log(n)). Basically, I am not sure what we mean when we say average-case complexity.
To your second point, an excerpt from Wikipedia:
The most unbalanced partition occurs when the partitioning routine returns one of the sublists with size n − 1. This may occur if the pivot happens to be the smallest or largest element in the list, or in some implementations (e.g., the Lomuto partition scheme as described above) when all the elements are equal.
If this happens repeatedly in every partition, then each recursive call processes a list one element smaller than the previous one. Consequently, we can make n − 1 nested calls before we reach a list of size 1. This means that the call tree is a linear chain of n − 1 nested calls. The ith call does O(n − i) work to do the partition, and ∑_{i=0}^{n} (n − i) = O(n²), so in that case quicksort takes O(n²) time.
Because you usually don't know the exact numbers you have to sort, and you don't know which pivot element you will end up choosing, there is a good chance that your pivot isn't the smallest or biggest number in the array you sort. If you have an array of n distinct numbers, the probability that the chosen pivot is neither the smallest nor the largest, and therefore doesn't trigger this worst case, is (n − 2)/n.
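To see where the chain of n − 1 calls comes from, here is a hedged C++ sketch of quicksort with the Lomuto partition scheme (last element as pivot); the function names are mine. On an already-sorted array the pivot is always the largest element, so each partition peels off a single element:

#include <utility>
#include <vector>

// Lomuto partition: uses the last element as the pivot and returns its
// final position. On sorted input the pivot is the maximum, so the
// right-hand side of the split is always empty.
int lomutoPartition(std::vector<int>& a, int lo, int hi)
{
    int pivot = a[hi];
    int i = lo;
    for (int j = lo; j < hi; ++j)
        if (a[j] < pivot)
            std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);
    return i;
}

// Worst case (e.g. sorted input): recursion depth n - 1, and the ith call
// does O(n - i) work, giving O(n^2) total. With a random pivot the expected
// total work is O(n log n).
void quicksort(std::vector<int>& a, int lo, int hi)
{
    if (lo >= hi) return;
    int p = lomutoPartition(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}

This also bears on the space question: the sort is in place, but the recursion stack is O(log n) deep on average and O(n) deep in this worst case; recursing into the smaller partition first (and looping over the larger one) bounds the stack at O(log n).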

Algorithm that sorts n numbers from 0 to n^m in O(n), where m is a constant?

So I came upon this question:
We have to sort n numbers between 0 and n^3, the required time complexity is O(n), and the author solved it this way:
First we convert these numbers to base n in O(n); now each number has at most 3 digits (because the values are bounded by n^3).
Now we use radix sort, and therefore the total time is O(n).
So I have three questions:
1. Is this correct? And is it the best time possible?
2. How is it possible to convert the base of n numbers in O(n), i.e. O(1) per number? Some previous topics on this website said it's O(M(n) log(n))?!
3. And if this is true, does it mean we can sort any n numbers from 0 to n^m in O(n)?!
(I searched about converting the base of n numbers; some said it's O(log n) per number and some said it's O(n) for n numbers, so I got confused about this too.)
1) Yes, it's correct, and it is the best complexity possible, because any sort would have to at least look at all the numbers, and that alone takes Ω(n).
2) Yes, each number is converted to base n in O(1). Simple ways to do this take O(m^2) time, where m is the number of digits, under the usual assumption that you can do arithmetic operations on numbers up to O(n) in O(1) time. m is constant, so O(m^2) is O(1)... But really this step just says that the radix you use in the radix sort is O(n). If you implemented this for real, you'd use the smallest power of 2 >= n, so you wouldn't need these conversions at all.
3) Yes, if m is constant. The simplest way takes m passes in an LSB-first radix sort with a radix of around n. Each pass takes O(n) time, and the algorithm requires O(n) extra memory (measured in words that can hold n).
So the author is correct. In practice, though, this is usually approached from the other direction. If you're going to write a function that sorts machine integers, then at some large input size it's going to be faster if you switch to a radix sort. If W is the maximum integer size, then this tradeoff point will be when n >= 2^(W/m) for some constant m. This says the same thing as your constraint, but makes it clear that we're thinking about large-sized inputs only.
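A hedged sketch of the approach described here: an LSD radix sort that uses n itself as the radix, so values below n^3 have at most three base-n digits and three counting-sort passes suffice. The function name is mine, and the sketch assumes the values fit in a 64-bit integer:

#include <vector>

// Sort n values in the range [0, n^3) using LSD radix sort with radix n.
// Each of the 3 passes is a counting sort over n buckets, i.e. Theta(n),
// so the whole sort is Theta(n) when the exponent m = 3 is a constant.
void radixSortBaseN(std::vector<long long>& a)
{
    const long long n = static_cast<long long>(a.size());
    if (n <= 1) return;

    std::vector<long long> out(a.size());
    long long divisor = 1;                           // digit d of x is (x / n^d) % n
    for (int pass = 0; pass < 3; ++pass, divisor *= n)
    {
        std::vector<long long> count(n + 1, 0);
        for (long long x : a)                        // count occurrences of each digit
            ++count[(x / divisor) % n + 1];
        for (long long b = 1; b <= n; ++b)           // prefix sums give bucket offsets
            count[b] += count[b - 1];
        for (long long x : a)                        // stable scatter into the output
            out[count[(x / divisor) % n]++] = x;
        a.swap(out);
    }
}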
There is a wrong assumption here, namely that radix sort is O(n); it is not.
As described on e.g. Wikipedia:
if all n keys are distinct, then w has to be at least log n for a random-access machine to be able to store them in memory, which gives at best a time complexity O(n log n).
So the answer is no: the author's implementation is (at best) O(n log n), and converting these numbers can probably take more than O(n) as well.
is this correct?
Yes it's correct. If n is used as the base, then it will take 3 radix sort passes, where 3 is a constant, and since time complexity ignores constant factors, it's O(n).
and the best time possible?
Not always. Depending on the maximum value of n, a larger base could be used so that the sort is done in 2 radix sort passes or 1 counting sort pass.
how is it possible to convert the base of n numbers in O(n)? like O(1) for each number?
O(1) just means constant time complexity, i.e. a fixed number of operations per number. It doesn't matter that the chosen method is not the fastest one when only time complexity is being considered. For example, using a, b, c to represent the most to least significant digits and x as the number, then using integer math: a = x/(n^2), b = (x - a*n^2)/n, c = x%n (assuming x >= 0). (Side note: if n is a constant, an optimizing compiler may convert the divisions into a multiply-and-shift sequence.)
and if this is true, then it means we can sort any n numbers from 0 to n^m in O(n) ?!
Only if m is considered a constant. Otherwise it's O(m n).
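For illustration, the digit-extraction formulas above might look like this in C++, assuming 0 <= x < n^3 and that x fits in a 64-bit integer (the function name is mine; the variable names follow the answer):

#include <cstdint>

// Split x (0 <= x < n^3) into its three base-n digits a, b, c,
// from most to least significant, using only integer arithmetic.
void toBaseN(std::int64_t x, std::int64_t n,
             std::int64_t& a, std::int64_t& b, std::int64_t& c)
{
    a = x / (n * n);            // a = x / n^2
    b = (x - a * n * n) / n;    // b = (x - a*n^2) / n
    c = x % n;                  // c = x mod n
}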

Given an unsorted array A, check if A[i] = i exists efficiently

Given an array A, check whether A[i] = i holds for any i.
I'm supposed to solve this faster than linear time, which to me seems impossible. The solution I came up with is to first sort the array in O(n log n) time, and then you can easily check in less than linear time. However, since the array is given unsorted, I can't see an "efficient" solution.
You can't have a correct algorithm with better than O(N) complexity for an arbitrary (unsorted) array.
Suppose you have the solution better than O(N). It means that the algorithm has to omit some items of the array since scanning all the items is O(N).
Construct an array A such that A[i] != i for all i, and run the algorithm.
Let A[k] be an item which the algorithm omitted. Now assign k to A[k] and
run the algorithm again: it will report that no such item exists, even though A[k] = k.
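For reference, the plain linear scan that this argument shows to be asymptotically optimal for an arbitrary array is trivial to write; a minimal sketch (the function name hasFixedPoint is mine):

#include <vector>

// Returns true if some index i satisfies A[i] == i.
// Any correct algorithm for arbitrary arrays must inspect every element
// in the worst case, so Omega(n) is unavoidable.
bool hasFixedPoint(const std::vector<int>& A)
{
    for (int i = 0; i < static_cast<int>(A.size()); ++i)
        if (A[i] == i)
            return true;
    return false;
}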
You can get O(log n) with a parallel algorithm (you didn't rule that out): just spawn n processors in log2(n) steps and let them check the array items in parallel.

Testing if unsorted sets are disjoint in linear time. (homework problem)

Problem:
Two sets A and B have n elements each. Assume that each element is an integer in the range [0, n^100]. These sets are not necessarily sorted. Show how to check whether these two sets are disjoint in O(n) time. Your algorithm should use O(n) space.
My original idea for this problem was to create a hash table of set A and search this hash table for each of the elements in B. However, I'm not aware of any way to create a hash table of a data set with this range that only takes O(n) space. Should I be considering a completely different approach?
UPDATE:
I contacted the professor regarding this problem asking about implementing a hash table and his response was:
Please note that hashing takes O(1) time per operation only on average. We need a worst-case O(n) time algorithm for this problem.
So it seems the problem is looking for a different approach...
Input: Arrays A[m], B[n]
Output: True if they are disjoint, False otherwise
1. Brute Force: O(m*n) time, O(1) space
1. Search for each element of A in B
2. As soon as you get a match, break and return false
3. If you reach the end, return true
Advantage: Doesn't modify the input
2. Sort both: O(m log m + n log n + m + n)
1. Sort both arrays
2. Scan both linearly together (see the sketch after this list)
Disadvantage: Modifies the input
3. Sort the smaller: O((m + n) log m)
1. Say m < n; sort A
2. Binary search for each element of B in A
Disadvantage: Modifies the input
4. Sort the larger: O((m + n) log n)
1. Say n > m; sort B
2. Binary search for each element of A in B
Disadvantage: Modifies the input
5. Hashing: O(m + n) time, O(m) or O(n) space
Advantage: Doesn't modify the input
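As referenced in approach 2 above, here is a hedged sketch of the sort-then-scan version: after sorting, a merge-style two-pointer pass finds a common element in O(m + n). The function name is mine, and passing by reference reflects the listed disadvantage that the inputs get modified:

#include <algorithm>
#include <vector>

// Approach 2: sort both arrays in place, then walk them together.
// The scan is O(m + n); the sorts dominate at O(m log m + n log n).
bool disjointBySorting(std::vector<int>& a, std::vector<int>& b)
{
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
    {
        if (a[i] == b[j]) return false;   // common element found
        if (a[i] < b[j]) ++i; else ++j;
    }
    return true;                          // no common element
}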
Why not use a hash table? Aren't they O(n) to create (assuming the elements are all unique), then O(n) to search, which gives O(2n) = O(n)?
A hash set will work fine. It's extremely common to assume hash sets/tables are constant time per operation even though that's not strictly true.
Note that hash sets/tables absolutely only use space proportional to the elements inserted, not the potential total number of elements. You seem to have misunderstood that.
If "commonly assumed to be good enough" is unacceptable for some reason, you can use radix sort. It's linear in the total representation size of the input elements. (Caveat: that's slightly different from being linear in the number of elements.)
Honestly, I didn't expect such answers from the SO community, but never mind. The question explicitly states that the algorithm should take O(n) space and O(n) time, therefore we can rule out algorithms involving hashing, since in the worst case hashing is not O(n).
Now I was going through some texts and found that the problem of deciding whether two sets are disjoint is reducible to the sorting problem. This is very standard when studying the lower bounds of many algorithms.
Actual lines from the book Design Methods and Analysis of Algorithms by S. K. Basu (2013):
Here the author clearly states that set disjointness is Ω(n log n).
#include <bits/stdc++.h>
using namespace std;

int main()
{
    // Elements can be as large as n^100, so they are read as strings
    // rather than as built-in integers.
    unordered_map<string, int> m;
    int n;
    cin >> n;

    string a, b;
    // Insert every element of set A into the hash map.
    for (int i = 0; i < n; i++)
    {
        cin >> a;
        m[a] = 1;
    }
    // Check every element of set B against the map.
    for (int i = 0; i < n; i++)
    {
        cin >> b;
        if (m.count(b))
        {
            cout << "Not disjoint";
            return 0;
        }
    }
    cout << "Disjoint";
    return 0;
}
Time complexity: O(n)
Auxiliary space: O(n)
You can radix sort the inputs, in base n.
This will take 101 iterations through each array (because the input numbers are in the range 0 to n^100).
Once you've sorted the inputs, you can compare them in the obvious way in O(n) time.
Note: for the radix sort to run in O(n) time, you need to check that extracting the k'th digit (base n) of an input number is O(1). You can do that with (k-1) divisions by n and a modulo operation. Since k is at most 101, this is O(1).
Side note:
I note that kennytm gave a similar answer in 2010, but that answer was deleted after commenters noted that "Radix sort is O(nk) time, where n is the number of keys, and k is the average key length. Since the max key value is n^100, the max key length would be 100 log n. So, this would still be O(n log n), same as all of the best sorting algorithms."
Note that this comment is incorrect: the maximum key length is 101, because the key is a sequence of base-n digits, not a sequence of bits.

Resources