An array is given such that its values increase from the 0th index through some (k-1)th index. At index k the value is the minimum, and then it starts increasing again through the nth element. Find the minimum element.
Essentially, it's one sorted list appended to another; example: (1, 2, 3, 4, 0, 1, 2, 3).
I have tried all sorts of algorithms, like building a min-heap, quickselect, or just plain traversal, but I can't get it below O(n). Still, there is a pattern in this array, something that suggests a binary-search kind of approach should be possible, with complexity around O(log n), but I can't find anything.
Thoughts?
Thanks
No. The drop can be anywhere; there is no structure to this.
Consider the extremes:
1234567890
9012345678
1234056789
1357024689
It reduces to finding the minimum element.
Do a breadth-wise binary search for a decreasing range, with a one-element overlap at the binary splits. In other words, if you had, say, 17 elements, compare elements
0,8
8,16
0,4
4,8
8,12
12,16
0,2
2,4
etc., looking for a case where the left element is greater than the right.
Once you find such a range, recurse, doing the same binary search within that range.
Repeat until you've found the decreasing adjacent pair.
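A rough Python sketch of this idea (the function name find_min_index is mine; note that inside a range whose endpoints already decrease, the breadth-wise search reduces to an ordinary binary search, so the second phase is written that way):

```
def find_min_index(a):
    # Phase 1: breadth-first scan of the binary-split ranges, looking for a
    # range whose left endpoint is greater than its right endpoint.
    n = len(a)
    found = None
    level = [(0, n - 1)]
    while level and found is None:
        nxt = []
        for l, r in level:
            if a[l] > a[r]:
                found = (l, r)
                break
            if r - l > 1:
                m = (l + r) // 2
                nxt += [(l, m), (m, r)]
        level = nxt
    if found is None:
        return 0                      # no drop anywhere: array is non-decreasing

    # Phase 2: binary search inside the found range. Since a[l] > a[r],
    # the drop (and hence the minimum) lies inside [l, r].
    l, r = found
    while r - l > 1:
        m = (l + r) // 2
        if a[l] > a[m]:
            r = m                     # the drop is in the left half
        else:
            l = m                     # a[l] <= a[m] but a[l] > a[r]: drop is on the right
    return r                          # index of the minimum element

print(find_min_index([1, 2, 3, 4, 0, 1, 2, 3]))   # -> 4 (value 0)
```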
The average complexity is not less than O(log n), with a worst-case of O(n). Can anyone get a tighter average-complexity estimate? It seems roughly "halfway between" O(log n) and O(n), but I don't see how to evaluate it. It also depends on any additional constraints on the ranges of values and size of increment from one member to the next.
If the increment between elements is always 1, there's an O(log n) solution.
It cannot be done in less than O(n).
The worst case of this kind will always keep troubling us: an increasing list
a1, a2, a3, ..., ak, ak+1, ..., an
with just one deviation ak < ak-1, e.g. 1, 2, 3, 4, 5, 6, 4, 7, 8, 9, 10,
and all the other numbers hold absolutely zero information about the value of k or ak.
The simplest solution is to just look forward through the list until the next value is less than the current one, or backward to find a value that is greater than the current one. That is O(n).
Doing both concurrently would still be O(n) but the running time would probably be faster (depending on complicated processor/cache factors).
I don't think you can get it much faster algorithmically than O(n) since a lot of the divide-and-conquer search algorithms rely on having a sorted data set.
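For completeness, the forward scan from the answer above as a couple of lines of Python (the function name is mine):

```
def find_min_index_linear(a):
    # O(n): the first position where the value drops is the minimum;
    # if it never drops, the array is non-decreasing and a[0] is smallest.
    for i in range(1, len(a)):
        if a[i] < a[i - 1]:
            return i
    return 0
```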
Related
Okay, so I know this has been asked countless times because I googled it in every form possible but could not get an answer.
I have an array, say A = {10, 9, 6, 11, 22}. I have to find the number of elements greater than 11.
I know this can be done using Modified Binary Search but I need to do it in O(1) time. Is this possible?
(Keeping in mind we are taking the elements as input, so maybe some pre-computation can be done while taking the input.)
Remove all the 0s from the array and count them. Now you know the result for input 0: n - count. Afterwards subtract 1 from all the remaining elements in the array; the goal of this step is to bring the numbers into the range [0, 999999999]. If the query input is greater than 0, subtract one from it too; otherwise return the result immediately.
Sort the numbers and think of them as 9-digit strings (padded with leading 0s).
Build the tree. Each node represents a digit. Each leaf has to store the amount of numbers greater than itself. I don't think the number of nodes will be too high: for the maximum n = 10^5 we get about 5*10^5 nodes (10^5 different prefixes bring us down to about level 5; after that we have linked lists to the leaves, 10^5 existing + 4*10^5 for the linked lists).
Now you have to go through all non-leaf nodes and for all the missing digits in the children create direct links to the next smaller leaf. About an additional 9*4*10^5 nodes if you represent the links as leaves with the same count as the next lower leaf.
I think you can now theoretically get O(1), because the complexity of a query doesn't depend on n, and you have to store much less than when creating a hash map. In the worst case you go down 9 nodes, which is a constant independent of n.
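A simplified Python sketch in the spirit of this answer: it stores subtree counts rather than precomputed links to the next smaller leaf, so each query still inspects at most 9 levels and 10 children per level, a constant amount of work (the class name DigitTrie is mine):

```
class DigitTrie:
    WIDTH = 9                                   # values padded to 9 digits

    def __init__(self, values):
        self.root = {}                          # digit -> (subtree count, child node)
        for v in values:
            node = self.root
            for d in f"{v:0{self.WIDTH}d}":
                cnt, child = node.get(d, (0, {}))
                node[d] = (cnt + 1, child)
                node = child

    def count_greater(self, q):
        # Walk q's digits; at each level add the counts of siblings with a larger digit.
        node, total = self.root, 0
        for d in f"{q:0{self.WIDTH}d}":
            total += sum(cnt for digit, (cnt, _) in node.items() if digit > d)
            if d not in node:
                break
            node = node[d][1]
        return total

t = DigitTrie([10, 9, 6, 11, 22])
print(t.count_greater(11))                      # -> 1 (only 22 is greater)
```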
You might also consider first sorting the input and then inserting it in a Y-fast trie (https://en.wikipedia.org/wiki/Y-fast_trie), where each element will also point to its index in the sorted input, and thus the number of elements greater and lower than it. Y-fast tries support successor and predecessor lookup in O(log log M) time using O(n) space, where M is the range.
This answer makes the assumption that building the data structure itself does not have to be constant time, but only the retrieval part.
You can iterate through your array of numbers and build a binary tree. Each node in this tree will contain, in addition to the numerical value, two more fields: the number of elements the node is greater than and the number it is less than. The insertion logic would be tricky, because this state would need to be maintained.
During insertion, while updating the counters for each node, we can also maintain a hashmap indexed by value. The keys would be the numbers in your array, and the value could be a wrapper containing the number of elements this number is greater than and less than. Since hashmaps have O(1) lookup time, this would satisfy your requirement.
If you need O(1) lookup time, only a hashmap comes to mind as an option. Note that traversing a binary tree, even if balanced, would still be a lg(N) operation in general. This is potentially quite fast, but still not constant.
The only way to decrease time complexity beyond this is to increase the space complexity.
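One simple way to precompute such a value-to-count map, using a sort instead of tree insertion just to show the O(1) lookup side (build_greater_count_map is a hypothetical helper name):

```
from bisect import bisect_right

def build_greater_count_map(a):
    s = sorted(a)                               # O(n log n) preprocessing
    n = len(s)
    # For every value in the array: how many elements are strictly greater.
    return {v: n - bisect_right(s, v) for v in s}

counts = build_greater_count_map([10, 9, 6, 11, 22])
print(counts[11])                               # -> 1 (only 22 is greater), O(1) lookup
```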
If the range of the array's elements is limited, let's say to [-R1, R2], then you can build a hashmap over this range, pointing to linked lists. You can precompute this hashmap and then return results in O(1).
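A small sketch of that precomputation, assuming every value is known to lie in a limited range [lo, hi] (the name precompute_greater_counts is mine):

```
def precompute_greater_counts(a, lo, hi):
    # freq[v - lo] = occurrences of v; suffix-sum it so that
    # greater[v - lo] = number of elements strictly greater than v.
    freq = [0] * (hi - lo + 1)
    for v in a:
        freq[v - lo] += 1
    greater = [0] * (hi - lo + 1)
    running = 0
    for i in range(hi - lo, -1, -1):
        greater[i] = running
        running += freq[i]
    return greater

g = precompute_greater_counts([10, 9, 6, 11, 22], 0, 30)
print(g[11 - 0])                                # -> 1, answered in O(1)
```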
I'm trying to balance a set of (a million plus) 3D points using a KD-tree, and I have two ways of doing it.
Way 1:
Use an O(n) algorithm to find the arraysize/2-th largest element along a given axis and store it at the current node
Iterate over all the elements in the vector and for each, compare them to the element I just found and put those smaller in newArray1, and those larger in newArray2
Recurse
Way 2:
Use quicksort O(nlogn) to sort all the elements in the array along a given axis, take the element at position arraysize/2 and store it in the current node.
Then put all the elements from index 0 to arraysize/2-1 in newArray1, and those from arraysize/2 to arraysize-1 in newArray2
Recurse
Way 2 seems more "elegant", but way 1 seems faster, since the median search and the iteration are both O(n), so I get O(2n), which just reduces to O(n). On the other hand, even though way 2 takes O(n log n) time to sort, splitting the array in two can then be done in constant time; does that make up for the O(n log n) sorting?
What should I do? Or is there an even better way to do this that I'm not even seeing?
How about Way 3:
Use an O(n) algorithm such as QuickSelect to ensure that the element at position length/2 is the correct element, all elements before are less, and all afterwards are larger than it (without sorting them completely!) - this is probably the algorithm you used in your Way 1 step 1 anyway...
Recurse into each half (except middle element) and repeat with next axis.
Note that you actually do not need to make "node" objects. You can actually keep the tree in a large array. When searching, start at length/2 with the first axis.
I've seen this trick being used by ELKI. It uses very little memory and code, which makes the tree quite fast.
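A rough Python sketch of Way 3, using a Hoare-style quickselect for the O(n) selection step (names are mine; for the implicit flat-array layout mentioned above, the in-place partitioning already leaves the array in the right order, so you can skip building node tuples and just start searching at position length/2):

```
import random

def partition_around_median(pts, lo, hi, mid, axis):
    # Rearrange pts[lo..hi] in place so pts[mid] is the median along `axis`,
    # everything before it is <= and everything after it is >= (expected O(n)).
    while lo < hi:
        pivot = pts[random.randint(lo, hi)][axis]
        i, j = lo, hi
        while i <= j:
            while pts[i][axis] < pivot:
                i += 1
            while pts[j][axis] > pivot:
                j -= 1
            if i <= j:
                pts[i], pts[j] = pts[j], pts[i]
                i += 1
                j -= 1
        if mid <= j:
            hi = j
        elif mid >= i:
            lo = i
        else:
            return

def build_kdtree(pts, lo=0, hi=None, depth=0):
    # Node = (point, left subtree, right subtree); 3D points, cycling axes.
    if hi is None:
        hi = len(pts) - 1
    if lo > hi:
        return None
    axis = depth % 3
    mid = (lo + hi) // 2
    partition_around_median(pts, lo, hi, mid, axis)
    return (pts[mid],
            build_kdtree(pts, lo, mid - 1, depth + 1),
            build_kdtree(pts, mid + 1, hi, depth + 1))
```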
Another way:
Sort by each of the dimensions: O(K N log N). This is performed only once; we will reuse these sorted lists for the dimensions.
For the current dimension, find the median in O(1) time, split around the median in O(N) time, also split the sorted arrays for each of the dimensions in O(KN) time, and recurse on the next dimension.
In that way, you perform the sorts only at the beginning, and perform (K+1) splits/filterings for each subtree around a known value. For small K, this approach should be faster than the other approaches.
Note: The additional space needed for the algorithm can be decreased by the tricks pointed out by Anony-Mousse.
Notice that if the query hyper-rectangle contains many points (all of them for example) it does not matter if the tree is balanced or not. A balanced tree is useful if the query hyper-rects are small.
This is a homework assignment.
The goal is to present an algorithm in pseudocode that will search an array of numbers (it doesn't specify whether they are integers or > 0) and check if the ratio of any two numbers equals a given x. Time complexity must be under O(n log n).
My idea was to mergesort the array (O(n log n) time) and then, if |x| > 1, start checking every number in descending order (using a binary search). Each check should also take O(log n) time, and with a worst case of n checks that gives a total of O(n log n). If I am not missing anything, this should give a worst case of O(n log n) + O(n log n) = O(n log n), within the parameters of the assignment.
I realize that it doesn't really matter where I start checking the ratios after sorting, but the time cost is amortized by 1/2.
Is my logic correct? Is there a faster algorithm?
An example in case it isn't clear:
Given an array { 4, 9, 2, 1, 8, 6 }
If we want to search for a ratio of 2:
Mergesort { 9, 8, 6, 4, 2, 1 }
Since the given ratio is >1 we will search from left to right.
2a. First number is 9. Check 9/4 > 2; check 9/6 < 2. Next number.
2b. Second number is 8. Check 8/4 = 2. DONE
The analysis you have presented is correct and is a perfectly good way to solve this problem. Sorting does work in time O(n log n), and 2n binary searches also take O(n log n) time. That said, I don't think you want to use the term "amortized" here, since that refers to a different type of analysis.
As a hint for how to speed up your solution a bit, the general idea of your solution is to make it possible to efficiently query, for any number, whether that number exists in the array. That way, you can just loop over all numbers and look for anything that would make the ratio work. However, if you use an auxiliary data structure outside the array that supports fast access, you can possibly whittle down your runtime at the cost of increasing the memory usage. Try thinking about what data structures support very fast access (say, O(1) lookups) and see if you can use any of them here.
Hope this helps!
To solve this problem, O(n lg n) is enough.
Step 1: sort the array. That costs O(n lg n).
Step 2: check whether the ratio exists; this step only needs O(n).
You just need two pointers: one points to the first element (the smallest), the other points to the last element (the biggest).
Calculate the ratio.
If the ratio is bigger than the specified one, move the second pointer to its previous element.
If the ratio is smaller than the specified one, move the first pointer to its next element.
Repeat the above steps until:
you find the exact ratio, or
the first pointer reaches the end, or the second pointer reaches the beginning.
The complexity of your algorithm is O(n²), because after sorting the array, you iterate over each element (up to n times) and in each iteration you execute up to n - 1 divisions.
Instead, after sorting the array, iterate over each element, and in each iteration divide the element by the ratio, then see if the result is contained in the array:
division: O(1)
search in sorted list: O(log n)
repeat for each element: n times
Results in time complexity O(n log n)
In your example:
9/2 = 4.5 (not found)
8/2 = 4 (found)
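A sketch of that approach, assuming x != 0 (the division produces floats, so for exact arithmetic you might prefer fractions.Fraction; has_pair_with_ratio is a hypothetical name):

```
from bisect import bisect_left, bisect_right

def has_pair_with_ratio(a, x):
    s = sorted(a)                               # O(n log n)
    for v in s:
        target = v / x                          # partner that would give ratio x
        lo = bisect_left(s, target)             # O(log n) membership test
        count = bisect_right(s, target) - lo
        if count > 0 and (target != v or count > 1):
            return True
    return False

print(has_pair_with_ratio([4, 9, 2, 1, 8, 6], 2))   # True (e.g. 8/4 == 2)
```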
(1) Build a hashmap of this array. Time Cost: O(n)
(2) For every element a[i], search a[i]*x in HashMap. Time Cost: O(n).
Total Cost: O(n)
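A sketch of this hash-based variant; Counter is used so that a ratio of 1 requires two equal elements (the function name is mine):

```
from collections import Counter

def has_ratio_pair_hash(a, x):
    counts = Counter(a)                         # O(n) expected to build
    for v in a:
        partner = v * x                         # element that would pair with v
        if counts[partner] > (1 if partner == v else 0):
            return True
    return False

print(has_ratio_pair_hash([4, 9, 2, 1, 8, 6], 2))   # True
```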
This is a homework question, and I'm not that good at finding the complexity, but I'm trying my best!
Three-way partitioning is a modification of quicksort that partitions elements into groups smaller than, equal to, and larger than the pivot. Only the groups of smaller and larger elements need to be recursively sorted. Show that if there are N items but only k unique values (in other words there are many duplicates), then the running time of this modification to quicksort is O(Nk).
My try:
In the average case, the three subranges will be at these indices (I assume the subrange holding the duplicated items has (n-k) elements):
first: 0 to (i-1)
second: i to (i+(n-k-1))
third: (i+n-k) to (n-1)
number of comparisons = (n-k) - 1
So,
T(n) = (n-k) - 1 + sum from i = 0 to (n-k-1) of [T(i) + T(i-k)]
Then I'm not sure how I'm going to continue :S
It might be a very bad start though :$
Hope to find some help.
First of all, you shouldn't look at the average case since the upper bound of O(nk) can be proved for the worst case, which is a stronger statement.
You should look at the maximum possible depth of recursion. In normal quicksort, the maximum depth is n. For each level, the total number of operations done is O(n), which gives O(n^2) total in the worst case.
Here, it's not hard to prove that the maximum possible depth is k (since one unique value will be removed at each level), which leads to O(nk) total.
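For reference, a short sketch of the three-way (Dutch national flag) partitioning quicksort being analysed, with a naive pivot choice kept for brevity:

```
def quicksort_3way(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[lo]
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if a[i] < pivot:                # grow the "smaller" group
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:              # grow the "larger" group
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:                           # equal to the pivot: stays in the middle
            i += 1
    # Only the strictly-smaller and strictly-larger groups are recursed into,
    # so each pivot value is eliminated from all further levels.
    quicksort_3way(a, lo, lt - 1)
    quicksort_3way(a, gt + 1, hi)
```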
I don't have a formal education in complexity, but if you think about it as a mathematical problem, you can prove it like a mathematical proof.
For all sorting algorithms, the best-case scenario will always be O(n) for n elements, because to sort n elements you have to consider each one at least once. Now, for your particular optimisation of quicksort, what you have done is simplify the issue, because now you are only sorting unique values: all the values that are the same as the pivot are already considered sorted, and by its nature quicksort guarantees that every unique value will feature as the pivot at some point in the operation, so this eliminates duplicates.
This means that for an N-sized list, quicksort must perform some operation N times (once for every position in the list). Because it is trying to sort the list, that operation is finding the position of the value relative to the other values; but since you are effectively dealing with just unique values, and there are k of those, the algorithm performs k comparisons for each element. So it performs Nk operations for an N-sized list with k unique elements.
To summarise:
This algorithm eliminates checking against duplicate values.
But all sorting algorithms must look at every value in the list at least once: N operations.
For every value in the list the operation is to find its position relative to other values in the list.
Because duplicates get removed, this leaves only k values to check against.
O(Nk)
Given an unsorted integer array, and without making any assumptions on the numbers in the array: is it possible to find two numbers whose difference is minimum in O(n) time?
Edit: Difference between two numbers a, b is defined as abs(a-b)
Find the smallest and largest elements in the list. The difference smallest - largest will be the minimum.
If you're looking for the nonnegative difference, then this is of course at least as hard as checking whether the array has two equal elements. This is called the element uniqueness problem, and without any additional assumptions (like limiting the size of the integers, or allowing operations other than comparison) it requires Ω(n log n) time. It is the 1-dimensional case of finding the closest pair of points.
I don't think you can do it in O(n). The best I can come up with off the top of my head is to sort them (which is O(n log n)) and find the minimum difference of adjacent pairs in the sorted list (which adds another O(n)).
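As a minimal sketch (assumes at least two elements; the function name is mine):

```
def min_abs_difference(a):
    s = sorted(a)                                   # O(n log n)
    return min(y - x for x, y in zip(s, s[1:]))     # adjacent pairs suffice
```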
I think it is possible. The secret is that you don't actually have to sort the list, you just need to create a tally of which numbers exist. This may count as "making an assumption" from an algorithmic perspective, but not from a practical perspective. We know the ints are bounded by a min and a max.
So, create an array of 2 bit elements, 1 pair for each int from INT_MIN to INT_MAX inclusive, set all of them to 00.
Iterate through the entire list of numbers. For each number in the list, if the corresponding 2 bits are 00 set them to 01. If they're 01 set them to 10. Otherwise ignore. This is obviously O(n).
Next, if any of the 2 bits is set to 10, that is your answer. The minimum distance is 0 because the list contains a repeated number. If not, scan through the list and find the minimum distance. Many people have already pointed out there are simple O(n) algorithms for this.
So O(n) + O(n) = O(n).
Edit: responding to comments.
Interesting points. I think you could achieve the same results without making any assumptions by finding the min/max of the list first and using a sparse array ranging from min to max to hold the data. Takes care of the INT_MIN/MAX assumption, the space complexity and the O(m) time complexity of scanning the array.
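A sketch of the min-to-max tally variant described in the edit (the name min_diff_tally is mine; it takes O(n + M) time and space, where M = max - min):

```
def min_diff_tally(a):
    lo, hi = min(a), max(a)                 # assumes at least two elements
    seen = [0] * (hi - lo + 1)              # 0 = absent, 1 = once, 2 = repeated
    for v in a:
        if seen[v - lo] < 2:
            seen[v - lo] += 1
    if 2 in seen:
        return 0                            # a repeated value: minimum distance is 0
    best, prev = None, None
    for i, c in enumerate(seen):
        if c:
            if prev is not None:
                best = i - prev if best is None else min(best, i - prev)
            prev = i
    return best
```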
The best I can think of is to counting sort the array (possibly combining equal values) and then do the sorted comparisons -- bin sort is O(n + M) (M being the number of distinct values). This has a heavy memory requirement, however. Some form of bucket or radix sort would be intermediate in time and more efficient in space.
Sort the list with radixsort (which is O(n) for integers), then iterate and keep track of the smallest distance so far.
(I assume your integer is a fixed-bit type. If they can hold arbitrarily large mathematical integers, radixsort will be O(n log n) as well.)
It seems to be possible to sort an unbounded set of integers in O(n*sqrt(log(log(n)))) time. After sorting, it is of course trivial to find the minimal difference in linear time.
But I can't think of any algorithm to make it faster than this.
No, not without making assumptions about the numbers/ordering.
It would be possible given a sorted list though.
I think the answer is no, and the proof is similar to the proof that you cannot sort faster than n lg n: you have to compare all of the elements, i.e. create a comparison tree, which implies an Omega(n lg n) bound.
EDIT. OK, if you really want to argue, then the question does not say whether it should be a Turing machine or not. With quantum computers, you can do it in linear time :)