Time complexity for n-ary search

I am studying time complexity for binary search, ternary search and k-ary search in N elements and have come up with their respective asymptotic worst-case run-times. However, I started to wonder what would happen if I divide the N elements into N ranges (i.e., an N-ary search in N elements). Would that be a sorted linear search in an array, which would result in a run-time of O(N)? This is a little confusing. Please help me out!

What you say is right.
For a k-ary search we have:
Do k-1 boundary checks to isolate one of the k ranges.
Jump into the range obtained from above.
Hence the time complexity is essentially O((k-1)*log_k(N)) where log_k(N) means 'log(N) to base k'. This has a minimum when k=2.
If k = N, the time complexity will be: O((N-1) * log_N(N)) = O(N-1) = O(N), which is the same algorithmically and complexity-wise as linear search.
Translated to the algorithm above, it is:
Do N-1 boundary checks (one against each of the first N-1 elements) to isolate one of the N ranges. This is the same as a linear search through the first N-1 elements.
Jump into the range obtained from above. This is the same as checking the last element (in constant time).
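As a concrete illustration of the two steps above, here is a minimal C++ sketch of a k-ary search (my own code, not from the question; it assumes a sorted array and k >= 2):

#include <cstddef>
#include <vector>

// k-ary search in a sorted vector: each round does up to k-1 boundary
// checks to isolate one of k sub-ranges, then descends into it.
// Requires k >= 2. Returns the index of key, or -1 if it is absent.
long kary_search(const std::vector<int>& a, int key, std::size_t k) {
    std::size_t lo = 0, hi = a.size();            // current range [lo, hi)
    while (hi - lo > 1) {
        std::size_t step = (hi - lo + k - 1) / k; // sub-range size, rounded up
        std::size_t next = lo;
        for (std::size_t b = lo + step; b < hi; b += step) {
            if (a[b] <= key) next = b;            // key lies at or after b
            else break;                           // key lies before b
        }
        lo = next;
        hi = (lo + step < hi) ? lo + step : hi;   // shrink to one sub-range
    }
    return (lo < a.size() && a[lo] == key) ? static_cast<long>(lo) : -1;
}

With k = 2 this behaves like an ordinary binary search; with k = N the boundary loop visits up to N-1 elements, i.e. exactly the linear scan described above.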

Related

Are there any computational problems with a ϴ((log n)^2) algorithm?

Are there any real computational problems which can be solved with a time complexity of log(n) * log(n)?
This is different from finding the smallest element in a sorted matrix, which is log(n) + log(n), or 2 log(n).
There may be some kind of pattern-printing algorithm that can be made ϴ((log n)^2), but I'm not sure whether those are classified as computational problems.
A range query on a d-dimensional range tree with k results runs in O(log^d(n) + k) time. So a query that you know will result in a bounded number of results on a 2-d range tree runs in O(log^2(n)) time.
See https://en.wikipedia.org/wiki/Range_tree
Dichotomic search in a sorted array when the indexes are processed as binary strings (bignums): the search takes O(log n) steps, and each index manipulation on a (log n)-bit bignum itself costs O(log n), giving O(log^2 n) overall.

A linear algorithm for this specification?

This is a question I came across somewhere.
Given a list of numbers in random order, write a linear-time algorithm to find the kth smallest number in the list. Explain why your algorithm is linear.
I have searched almost half the web, and what I got to know is that a linear-time algorithm is one whose time complexity is O(n). (I may be wrong somewhere.)
We can solve the above question with different algorithms, e.g.:
Sort the array and select the element at index k-1 [O(n log n)]
Use a min-heap [O(n + k log n)]
etc.
Now the problem is that I couldn't find any algorithm with O(n) time complexity, i.e., one that is actually linear.
What can be the solution for this problem?
This is std::nth_element.
From cppreference:
Notes
The algorithm used is typically introselect although other selection algorithms with suitable average-case complexity are allowed.
Regarding "Given a list of numbers": note that std::nth_element is not compatible with std::list, only with std::vector, std::deque and std::array, as it requires random-access iterators.
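A minimal usage sketch (the data and k are just an illustration):

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v{4, 2, 5, 1, 8, 2, 7}; // example data
    std::size_t k = 3;                       // k-th smallest, 1-based
    // Partially reorders v so that v[k-1] holds the k-th smallest element;
    // expected linear time (typically introselect, as noted above).
    std::nth_element(v.begin(), v.begin() + (k - 1), v.end());
    std::cout << k << "-th smallest: " << v[k - 1] << '\n'; // prints 2
}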
A linear search remembering the k smallest values seen so far is O(n*k), but if k is considered constant then it is O(n) time.
However, if k is not considered constant, then using a histogram leads to O(n + m.log(m)) time and O(m) space complexity, where m is the number of possible distinct values (the value range) of your input data. The algorithm goes like this:
create a histogram counter for each possible value and set them all to zero O(m)
process all the data and count the values O(n)
sort the histogram O(m.log(m))
pick the k-th element from the histogram O(1)
In case we are talking about unsigned integers from 0 to m-1, the histogram is computed like this:
int data[n] = { /* your data, each value in 0 .. m-1 */ };
int cnt[m], i;                          // histogram counters
for (i = 0; i < m; i++) cnt[i] = 0;     // clear the histogram
for (i = 0; i < n; i++) cnt[data[i]]++; // count the occurrences
However, if your input data does not satisfy this condition, you need to remap the range by interpolation or hashing. And if m is huge (or the range contains huge gaps), this is a no-go, as such a histogram would either need buckets (which is not usable for your problem) or a list of values, which would no longer give linear complexity.
So, putting all this together, your problem is solvable with linear complexity when:
n >= m.log(m)
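Putting the direct case together (unsigned integers in 0 .. m-1, no interpolation or hashing), a minimal C++ sketch could look like the following. In this direct case the counters are already ordered by value, so no sorting step is needed; picking the k-th element becomes a prefix-sum walk over the counters, which is O(m) in the worst case:

#include <cstddef>
#include <vector>

// k-th smallest (1-based) of values known to lie in 0 .. m-1.
// O(n + m) time, O(m) extra space.
int kth_smallest_histogram(const std::vector<int>& data, int m, std::size_t k) {
    std::vector<std::size_t> cnt(m, 0);  // histogram, zero-initialized: O(m)
    for (int v : data) cnt[v]++;         // count the values: O(n)
    for (int v = 0; v < m; v++) {        // prefix-sum walk: O(m)
        if (k <= cnt[v]) return v;       // the k-th value falls in bucket v
        k -= cnt[v];
    }
    return -1;                           // k exceeds the number of elements
}

For example, kth_smallest_histogram({4, 2, 5, 1, 8, 2, 7}, 9, 3) returns 2.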

Binary search with Random element

I know that binary search has O(log n) time complexity for searching an element in a sorted array. But let's say that instead of selecting the middle element, we select a random element. How would that impact the time complexity? Will it still be O(log n), or will it be something else?
For example:
A traditional binary search in an array of size 18 will narrow it down like 18 -> 9 -> 4 ...
My modified binary search picks a random element and discards the right part or the left part based on its value.
My attempt:
Let C(N) be the average number of comparisons required by a search among N elements. For simplicity, we assume that the algorithm only terminates when there is a single element left (no early termination on strict equality with the key).
As the pivot is chosen at random, the probabilities of the remaining sizes are uniform and we can write the recurrence
C(N) = 1 + (1/N) * Sum(1 <= i <= N : C(i))
Multiplying by N and subtracting the same relation for N-1 gives
N*C(N) - (N-1)*C(N-1) = 1 + C(N)
and
C(N) - C(N-1) = 1/(N-1)
The solution of this recurrence is the harmonic series, hence the behavior is indeed logarithmic:
C(N) ~ ln(N-1) + gamma
Note that this is the natural logarithm, which is better than the base-2 logarithm by a factor of 1.44!
My bet is that adding the early-termination test would further improve the base of the logarithm (and keep the logarithmic behavior), but at the same time double the number of comparisons, so that globally it would be worse in terms of comparisons.
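To make the model concrete, here is a small C++ simulation of the random-pivot search (my own sketch; like the analysis above, it assumes the key is present, terminates only when a single element is left, and counts one comparison per round):

#include <iostream>
#include <random>
#include <vector>

// Binary search with a uniformly random pivot instead of the midpoint.
std::size_t random_pivot_search(const std::vector<int>& a, int key,
                                long& comparisons, std::mt19937& rng) {
    std::size_t lo = 0, hi = a.size();  // current range [lo, hi)
    while (hi - lo > 1) {
        std::size_t p = std::uniform_int_distribution<std::size_t>(lo, hi - 1)(rng);
        ++comparisons;
        if (a[p] <= key) lo = p;        // key is at p or to its right
        else hi = p;                    // key is strictly to the left of p
    }
    return lo;
}

int main() {
    std::mt19937 rng(42);
    std::vector<int> a(1 << 16);
    for (std::size_t i = 0; i < a.size(); i++) a[i] = static_cast<int>(i);
    std::uniform_int_distribution<std::size_t> pick(0, a.size() - 1);
    long comparisons = 0;
    const int trials = 10000;
    for (int t = 0; t < trials; t++)
        random_pivot_search(a, a[pick(rng)], comparisons, rng);
    // The average should grow like log(a.size()), not linearly.
    std::cout << "average comparisons: " << double(comparisons) / trials << '\n';
}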
Let us assume we have an array of size 18. The number I am looking for is in the first spot. In the worst case, I always randomly pick the highest element (18 -> 17 -> 16 ...), effectively eliminating only one element in every iteration. So it becomes a linear search: O(n) time.
The recursion in the answer of @Yves Daoust relies on the assumption that the target element is located either at the beginning or the end of the array. In general, where the element lies in the array changes after each recursive call, making it difficult to write and solve the recursion. Here is another solution that proves an O(log n) bound on the expected number of recursive calls.
Let T be the (random) number of elements checked by the randomized version of binary search. We can write T = Sum_i I{element i is checked}, where we sum over i from 1 to n and I{element i is checked} is an indicator variable. Our goal is to asymptotically bound E[T] = Sum_i Pr{element i is checked}. For the algorithm to check element i, it must be the case that this element is selected uniformly at random from an array of size at least |j-i|+1, where j is the index of the element that we are searching for. This is because arrays of smaller size simply won't contain the element at index i, while the element at index j is always contained in the array during each recursive call. Thus, the probability that the algorithm checks the element at index i is at most 1/(|j-i|+1). In fact, with a bit more effort one can show that this probability is exactly equal to 1/(|j-i|+1). Thus, we have
E[T] = Sum_i Pr{element i is checked} <= Sum_i 1/(|j-i|+1) = O(log n),
where the last equation follows from the summation of harmonic series.

Efficiently find order statistics of unsorted list prefixes?

A is an array of the integers from 1 to n in random order.
I need random access to the ith largest element of the first j elements, in logarithmic time or better.
What I've come up with so far is an n x n matrix M, where the element in the (i, j) position is the ith largest of the first j. This gives me constant-time random access, but requires n^2 storage.
By construction, M is sorted by row and column. Further, each column differs from its neighbors by a single value.
Can anyone suggest a way to compress M down to n log(n) space or better, with log(n) or better random access time?
I believe you can perform the access in O(log(N)) time, given O(N log(N)) preprocessing time and O(N log(N)) extra space. Here's how.
You can augment a red-black tree to support a select(i) operation which retrieves the element at rank i in O(log(N)) time. For example, see this PDF or the appropriate chapter of Introduction to Algorithms.
You can implement a red-black tree (even one augmented to support select(i)) in a functional manner, such that the insert operation returns a new tree which shares all but O(log(N)) nodes with the old tree. See for example Purely Functional Data Structures by Chris Okasaki.
We will build an array T of purely functional augmented red-black trees, such that the tree T[j] stores the indexes 0 ... j-1 of the first j elements of A sorted largest to smallest.
Base case: At T[0] create an augmented red-black tree with just one node, whose data is the number 0, which is the index of the 0th largest element in the first 1 elements of your array A.
Inductive step: For each j from 1 to N-1, at T[j] create an augmented red-black tree by purely functionally inserting a new node with index j into the tree T[j-1]. This creates at most O(log(j)) new nodes; the remaining nodes are shared with T[j-1]. This takes O(log(j)) time.
The total time to construct the array T is O(N log(N)) and the total space used is also O(N log(N)).
Once T[j-1] is created, you can access the ith largest element of the first j elements of A by performing T[j-1].select(i). This takes O(log(j)) time. Note that you can create T[j-1] lazily the first time it is needed. If A is very large and j is always relatively small, this will save a lot of time and space.
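To make the construction concrete, here is a compact sketch that substitutes a persistent treap (augmented with subtree sizes) for the augmented red-black tree; a treap keeps the code short, and the path-copying node sharing and O(log n) expected bounds carry over. As a simplification, the keys here are the values of A themselves rather than indices, which suffices because A is a permutation. The ith largest of the first j elements is then select(T[j-1], j-i):

#include <cstdlib>
#include <iostream>
#include <memory>
#include <utility>
#include <vector>

// Persistent (path-copying) treap augmented with subtree sizes.
struct Node;
using P = std::shared_ptr<const Node>;
struct Node { int key, prio, size; P left, right; };

int sz(const P& t) { return t ? t->size : 0; }

P mk(int key, int prio, P l, P r) {
    return std::make_shared<Node>(Node{key, prio, sz(l) + sz(r) + 1,
                                       std::move(l), std::move(r)});
}

// Split t into (keys < k, keys >= k), copying only O(log n) nodes.
std::pair<P, P> split(const P& t, int k) {
    if (!t) return {nullptr, nullptr};
    if (t->key < k) {
        auto [l, r] = split(t->right, k);
        return {mk(t->key, t->prio, t->left, l), r};
    }
    auto [l, r] = split(t->left, k);
    return {l, mk(t->key, t->prio, r, t->right)};
}

// Merge two treaps; every key in l must be smaller than every key in r.
P merge(const P& l, const P& r) {
    if (!l) return r;
    if (!r) return l;
    if (l->prio > r->prio) return mk(l->key, l->prio, l->left, merge(l->right, r));
    return mk(r->key, r->prio, merge(l, r->left), r->right);
}

P insert(const P& t, int key) {  // returns a NEW version; t stays valid
    auto [l, r] = split(t, key);
    return merge(merge(l, mk(key, std::rand(), nullptr, nullptr)), r);
}

int select(const P& t, int i) {  // i-th smallest key, 0-based, i < sz(t)
    if (i < sz(t->left)) return select(t->left, i);
    if (i == sz(t->left)) return t->key;
    return select(t->right, i - sz(t->left) - 1);
}

int main() {
    std::vector<int> A{4, 2, 5, 1, 6, 3, 7};  // a permutation of 1..7
    std::vector<P> T(A.size());               // T[j] holds the first j+1 elements
    T[0] = insert(nullptr, A[0]);
    for (std::size_t j = 1; j < A.size(); j++) T[j] = insert(T[j - 1], A[j]);
    int j = 4, i = 2;  // 2nd largest of the first 4 elements {4,2,5,1}
    std::cout << select(T[j - 1], j - i) << '\n';  // prints 4
}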
Unless I misunderstand, you are just finding the k-th order statistic of an array which is the prefix of another array.
This can be done using an algorithm that I think is called 'quickselect' or something along those lines. Basically, it's like quicksort:
Take a random pivot
Swap around array elements so all the smaller ones are on one side
You know the pivot is the (p+1)-th smallest element, where p is the number of smaller array elements
If p+1 = k, it's the solution! If p+1 > k, repeat on the 'smaller' subarray. If p+1 < k, repeat on the 'larger' subarray.
There's a (much) better description here under the Quickselect and Quicker Select headings, and also just generally on the internet if you search for k-th order quicksort solutions.
Although the worst-case time for this algorithm is O(n^2), like quicksort, its expected time is much better (also like quicksort) if you properly select your random pivots. I think the space complexity would just be O(n); you only need one copy of your prefix whose ordering you are free to muck up.
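For illustration, here is a minimal quickselect sketch along the lines described above (my own code; k is 1-based, so k = 1 returns the smallest element):

#include <iostream>
#include <random>
#include <vector>

// Quickselect: k-th smallest (1-based) element, expected O(n) time.
// Partitions around a random pivot and recurses into one side only.
int quickselect(std::vector<int> v, std::size_t k) {
    std::mt19937 rng(std::random_device{}());
    std::size_t lo = 0, hi = v.size();  // active range [lo, hi)
    while (true) {
        std::swap(v[lo], v[std::uniform_int_distribution<std::size_t>(lo, hi - 1)(rng)]);
        const int pivot = v[lo];
        std::size_t p = lo;             // will become the pivot's position
        for (std::size_t i = lo + 1; i < hi; i++)
            if (v[i] < pivot) std::swap(v[++p], v[i]);
        std::swap(v[lo], v[p]);         // pivot is the (p+1)-th smallest
        if (p + 1 == k) return v[p];
        if (p + 1 > k) hi = p;          // answer is among the smaller side
        else lo = p + 1;                // answer is among the larger side
    }
}

int main() {
    std::cout << quickselect({4, 2, 5, 1, 8, 2, 7}, 3) << '\n'; // prints 2
}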

Find rank of an element in an unsorted array in O(lg n) complexity. Any better approach than this?

Given numbers in an unsorted way, say X: {4, 2, 5, 1, 8, 2, 7}.
How do you find the rank of a number?
E.g.: rank of 4: 4
rank of 5: 5
The complexity has to be O(lg n).
It can be done in O(lg n) complexity with the help of red-black trees and the augmented data structure approach (one of the fascinating topics nowadays).
Let's make use of an order-statistic tree:
Algorithm:
RANK(T, x)
// T: order-statistic tree, x: node (to find the rank of this node)
r = x.left.size + 1
y = x
while y != T.root
    if y == y.p.right
        r = r + y.p.left.size + 1
    y = y.p
return r
Any help is appreciated. Is there any better approach than this?
Given numbers in an unsorted way, say X: {4, 2, 5, 1, 8, 2, 7}.
How do you find the rank of a number?
Rank is the position of the element when it is sorted.
Complexity has to be O(lg n).
That's impossible. You have to look at each element at least once. Thus, you can't get better than O(n), and it's trivial in O(n):
set found to false
set smaller to 0
for each number in array
    if the number is smaller than needle
        increment the smaller counter
    if the number is equal to the needle
        set found to true
if found, return smaller+1, else return error
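For completeness, here is the same scan as a small C++ sketch (the function name is my own):

#include <optional>
#include <vector>

// Rank of needle in an unsorted array: one pass, O(n) time, O(1) space.
// Returns std::nullopt when the needle does not occur in the array.
std::optional<int> rank_of(const std::vector<int>& a, int needle) {
    bool found = false;
    int smaller = 0;                 // elements that sort before the needle
    for (int x : a) {
        if (x < needle) smaller++;
        if (x == needle) found = true;
    }
    if (found) return smaller + 1;   // ranks are 1-based
    return std::nullopt;
}

With X = {4, 2, 5, 1, 8, 2, 7}, rank_of(X, 4) returns 4, matching the example above.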
It can be done in O(lg n) complexity with the help of red-black trees and the augmented data structure approach (one of the fascinating topics nowadays). Let's make use of an order-statistic tree
The problem is you don't have an order-statistic tree, and you don't have the time to build one. Building an order-statistic tree takes more than O(lg n) time*.
But let's say you have the time to build an order-statistic tree. Since extracting the sorted list of nodes in a binary search tree takes linear time, building an order-statistic tree cannot be faster than sorting an array directly.
So, let's sort the array directly. Then, finding the rank of an element is equivalent to finding the element in a sorted array. This is a well known task that can be solved in O(lg n) via binary search (repeatedly split the array in half until you find the element). It turns out that the order-statistic tree does not, quite, help. In fact, you can imagine the binary search as a lookup in an order-statistic tree (except the tree doesn't actually exist).
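As a quick illustration (my own sketch, assuming the element is present): with a sorted array, the rank query is a single binary search, e.g. via std::lower_bound:

#include <algorithm>
#include <vector>

// Rank of x in a sorted array: one binary search, O(lg n).
// The rank is the number of smaller elements plus one.
int rank_sorted(const std::vector<int>& a, int x) {
    return int(std::lower_bound(a.begin(), a.end(), x) - a.begin()) + 1;
}
// e.g. rank_sorted on {1, 2, 2, 4, 5, 7, 8} with x = 4 yields 4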
If x could change at runtime, then order-statistic trees do help. Then element removal/addition takes Th(lg n) (worst-case) time, while it takes Th(n)* (average-case) in an ordinary sorted array, because you need to shift the elements around. With x immutable, order-statistic trees don't speed up anything over plain arrays.
* Technically, O(lg n) is a set of functions that grow asymptotically no more than lg n. When I say "more than O(lg n)", the correct interpretation is "more than every function in O(lg n)". Incidentally, this is equivalent to saying the run time is omega(lg n) (note the omega is lowercase).
Th(lg n) is the set of functions that are asymptotically equal to lg n, up to a constant. Expressing the same using O(lg n) and English while staying technically correct would be awkward.
