Given an array of positive integers, how can I find the number of increasing (or decreasing) subsequences of length 3? E.g. [1,6,3,7,5,2,9,4,8] has 24 of these, such as [3,4,8] and [6,7,9].
I've found solutions for length k, but I believe those solutions can be made more efficient since we're only looking at k = 3.
For example, a naive O(n^3) solution can be made faster by looping over the elements, counting for each how many elements to its left are smaller and how many to its right are greater, multiplying these two counts, and adding the product to a running sum. This is O(n^2), which obviously doesn't translate easily to k > 3.
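The O(n^2) idea above can be sketched directly (the function name is mine; the example array is the one from the question, which indeed has 24 increasing triplets):

```python
def count_increasing_triplets(a):
    """Count subsequences i < j < k with a[i] < a[j] < a[k], in O(n^2)."""
    n = len(a)
    total = 0
    for j in range(n):
        # For each middle element, count smaller elements to its left
        # and greater elements to its right.
        smaller_left = sum(1 for i in range(j) if a[i] < a[j])
        greater_right = sum(1 for k in range(j + 1, n) if a[k] > a[j])
        total += smaller_left * greater_right
    return total

print(count_increasing_triplets([1, 6, 3, 7, 5, 2, 9, 4, 8]))  # 24
```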
The O(n^2) solution can be improved by looping over the elements and, for every element, counting how many elements to its left are smaller using a segment tree, which answers such counts in O(log n). In the same way you can count how many elements to its right are greater. Multiply these two counts and add the product to the sum. This is O(n log n).
You can learn more about segment tree algorithm over here:
Segment Tree Tutorial
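As a sketch of the O(n log n) version, here is the same counting done with a Fenwick (binary indexed) tree, which supports the same prefix counts a segment tree would; all names are mine, and values are compressed to ranks first:

```python
class BIT:
    """Fenwick tree over 1..n supporting point add and prefix sum, O(log n) each."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def add(self, i, v=1):
        while i <= self.n:
            self.t[i] += v
            i += i & -i

    def prefix(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

def count_increasing_triplets(a):
    """O(n log n): for each middle element, multiply left-smaller by right-greater."""
    ranks = {v: r + 1 for r, v in enumerate(sorted(set(a)))}  # 1-based ranks
    m = len(ranks)
    left = BIT(m)
    smaller_left = []
    for v in a:  # left-to-right pass: elements before v that are smaller
        smaller_left.append(left.prefix(ranks[v] - 1))
        left.add(ranks[v])
    right = BIT(m)
    greater_right = []
    for v in reversed(a):  # right-to-left pass: elements after v that are greater
        greater_right.append(right.prefix(m) - right.prefix(ranks[v]))
        right.add(ranks[v])
    greater_right.reverse()
    return sum(l * r for l, r in zip(smaller_left, greater_right))

print(count_increasing_triplets([1, 6, 3, 7, 5, 2, 9, 4, 8]))  # 24
```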
For each curr element, count how many elements on its left and right have smaller and greater values.
This curr element can form less[left] * greater[right] increasing triplets plus greater[left] * less[right] decreasing triplets.
Complexity Considerations
The straightforward approach of counting elements on the left and right yields a quadratic solution. You might be tempted to use a set or something similar to count soldiers in O(log n) time.
You can find a soldier's rating in a set in O(log n); however, counting the elements before and after it will still be linear, unless you implement a BST where each node tracks the count of its left children.
Check the solution here:
https://leetcode.com/problems/count-number-of-teams/discuss/554795/C%2B%2BJava-O(n-*-n)
Related
Given an array of integers and some query operations.
The query operations are of 2 types
1. Update the value of the ith index to x.
2. Given 2 integers i and j, find the kth minimum in the range between i and j, both inclusive.
I can find the range minimum query using a segment tree but could not do so for the kth minimum.
Can anyone help me?
Here is an O(polylog n) per query solution that does not actually assume a constant k, so k can vary between queries. The main idea is to use a segment tree where every node represents an interval of array indices and contains a multiset (a balanced binary search tree) of the values in the represented array segment. The update operation is pretty straightforward:
Walk up the segment tree from the leaf (the array index you're updating). You will encounter all nodes that represent an interval of array indices that contain the updated index. At every node, remove the old value from the multiset and insert the new value into the multiset. Complexity: O(log^2 n)
Update the array itself.
We notice that every array element will be in O(log n) multisets, so the total space usage is O(n log n). With linear-time merging of multisets we can build the initial segment tree in O(n log n) as well (there's O(n) work per level).
What about queries? We are given a range [i, j] and a rank k and want to find the k-th smallest element in a[i..j]. How do we do that?
Find a disjoint coverage of the query range using the standard segment tree query procedure. We get O(log n) disjoint nodes, the union of whose multisets is exactly the multiset of values in the query range. Let's call those multisets s_1, ..., s_m (with m <= ceil(log_2 n)). Finding the s_i takes O(log n) time.
Do a select(k) query on the union of s_1, ..., s_m. See below.
So how does the selection algorithm work? There is one really simple algorithm to do this.
We have s_1, ..., s_m and k given, and want to find the smallest x in a such that s_1.rank(x) + ... + s_m.rank(x) >= k - 1, where rank returns the number of elements smaller than x in the respective BBST (rank can be implemented in O(log n) if we store subtree sizes).
Let's just use binary search to find x! We walk through the BBST of the root, do a couple of rank queries and check whether their sum is larger than or equal to k. It's a predicate monotone in x, so binary search works. The answer is then the minimum of the successors of x in any of the s_i.
Complexity: O(n log n) preprocessing and O(log^3 n) per query.
So in total we get a runtime of O(n log n + q log^3 n) for q queries. I'm sure we could get it down to O(q log^2 n) with a cleverer selection algorithm.
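A rough sketch of this structure in Python, with sorted lists plus `bisect` standing in for the balanced multisets (so updates here are not truly O(log^2 n) per operation), but with the range decomposition and the rank-based binary search working as described; all names are mine:

```python
import bisect

class RangeKth:
    """Segment tree whose nodes hold sorted lists of their segment's values.
    Sorted Python lists stand in for the answer's balanced multisets, so an
    update costs O(n) per node here; the query follows the described scheme:
    decompose the range, then binary-search on the value using rank queries."""

    def __init__(self, a):
        self.n = len(a)
        self.a = list(a)
        size = 1
        while size < self.n:
            size *= 2
        self.size = size
        self.tree = [[] for _ in range(2 * size)]
        for i, v in enumerate(a):
            self.tree[size + i] = [v]
        for node in range(size - 1, 0, -1):  # merge children bottom-up
            self.tree[node] = sorted(self.tree[2 * node] + self.tree[2 * node + 1])

    def update(self, i, v):
        old, self.a[i] = self.a[i], v
        node = self.size + i
        while node >= 1:  # fix every multiset on the leaf-to-root path
            s = self.tree[node]
            s.pop(bisect.bisect_left(s, old))
            bisect.insort(s, v)
            node //= 2

    def _cover(self, l, r):
        """Sorted lists of the O(log n) nodes that disjointly cover [l, r]."""
        res = []
        lo, hi = l + self.size, r + self.size + 1
        while lo < hi:
            if lo & 1:
                res.append(self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                res.append(self.tree[hi])
            lo //= 2
            hi //= 2
        return res

    def kth(self, l, r, k):
        """k-th smallest (1-based) value among a[l..r], inclusive."""
        lists = self._cover(l, r)
        values = self.tree[1]  # all values, sorted: the binary-search domain
        lo, hi = 0, len(values) - 1
        while lo < hi:  # monotone predicate: rank(values[mid]) >= k
            mid = (lo + hi) // 2
            rank = sum(bisect.bisect_right(s, values[mid]) for s in lists)
            if rank >= k:
                hi = mid
            else:
                lo = mid + 1
        return values[lo]

t = RangeKth([3, 1, 4, 1, 5, 9, 2, 6])
print(t.kth(2, 5, 2))  # 4  (a[2..5] = [4, 1, 5, 9])
```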
UPDATE: If we are looking for an offline algorithm that can process all queries at once, we can get O((n + q) * log n * log (q + n)) using the following algorithm:
Preprocess all queries, create a set of all values that ever occured in the array. The number of those will be at most q + n.
Build a segment tree, but this time not on the array, but on the set of possible values.
Every node in the segment tree represents an interval of values and maintains a set of the positions where these values occur.
To answer a query, start at the root of the segment tree. Check how many positions in the left child of the root lie in the query interval (we can do that by doing two searches in the BBST of positions). Let that number be m. If k <= m, recurse into the left child. Otherwise recurse into the right child, with k decremented by m.
For updates, remove the position from the O(log (q + n)) nodes that cover the old value and insert it into the nodes that cover the new value.
The advantage of this approach is that we don't need subtree sizes, so we can implement this with most standard library implementations of balanced binary search trees (e.g. set<int> in C++).
We can turn this into an online algorithm by changing the segment tree out for a weight-balanced tree such as a BB[α] tree. It has logarithmic operations like other balanced binary search trees, but allows us to rebuild an entire subtree from scratch when it becomes unbalanced by charging the rebuilding cost to the operations that must have caused the imbalance.
If this is a programming contest problem, then you might be able to get away with the following O(n log(n) + q n^0.5 log(n)^1.5)-time algorithm. It is set up to use the C++ STL well and has a much better big-O constant than Niklas's (previous?) answer on account of using much less space and indirection.
Divide the array into k chunks of length n/k. Copy each chunk into the corresponding locations of a second array and sort it. To update: copy the chunk that changed into the second array and sort it again (time O((n/k) log(n/k))). To query: copy to a scratch array the at most 2(n/k - 1) elements that belong to a chunk partially overlapping the query interval. Sort them. Use one of the answers to this question to select the element of the requested rank out of the union of the sorted scratch array and the fully overlapping chunks, in time O(k log(n/k)^2). The optimum setting of k in theory is (n/log(n))^0.5. It's possible to shave another log(n)^0.5 factor using the complicated algorithm of Frederickson and Johnson.
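A sketch of the chunking scheme, with the caveat that the selection step below is a plain heap merge rather than the faster multi-array selection the answer refers to, so the query bound is weaker; all names are mine:

```python
import heapq
import math

class ChunkedKth:
    """Sqrt-decomposition sketch: keep the raw array plus a sorted copy of
    each chunk. Queries gather the sorted fully-covered chunks and a sorted
    scratch array of the partially-covered elements, then select rank k
    from the union with a simple heap merge."""

    def __init__(self, a):
        self.a = list(a)
        self.c = max(1, math.isqrt(len(a)))  # chunk length ~ sqrt(n)
        self.chunks = [sorted(self.a[i:i + self.c])
                       for i in range(0, len(self.a), self.c)]

    def update(self, i, v):
        self.a[i] = v
        j = i // self.c
        # Re-sort only the chunk that changed: O((n/k) log(n/k)).
        self.chunks[j] = sorted(self.a[j * self.c:(j + 1) * self.c])

    def kth(self, l, r, k):
        """k-th smallest (1-based) element of a[l..r], inclusive."""
        lists = []    # sorted chunks fully inside [l, r]
        scratch = []  # elements of chunks that only partially overlap
        for j, s in enumerate(self.chunks):
            lo = j * self.c
            hi = min(lo + self.c, len(self.a)) - 1
            if l <= lo and hi <= r:
                lists.append(s)
            elif lo <= r and hi >= l:
                scratch.extend(self.a[i] for i in range(max(l, lo), min(r, hi) + 1))
        scratch.sort()
        if scratch:
            lists.append(scratch)
        # Select rank k from the union of the sorted lists via a heap merge.
        heap = [(s[0], idx, 0) for idx, s in enumerate(lists) if s]
        heapq.heapify(heap)
        for _ in range(k - 1):
            _, idx, pos = heapq.heappop(heap)
            if pos + 1 < len(lists[idx]):
                heapq.heappush(heap, (lists[idx][pos + 1], idx, pos + 1))
        return heap[0][0]

c = ChunkedKth([3, 1, 4, 1, 5, 9, 2, 6])
print(c.kth(2, 5, 2))  # 4
```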
Perform a modification of bucket sort: create a bucket that contains the numbers in the range you want, then sort only this bucket and find the kth minimum.
Damn, this solution can't update an element, but at least it finds the k-th element; maybe it gives you some ideas so you can think of a solution that supports updates. Try pointer-based B-trees.
This is O(n log n) space and O(q log^2 n) time for q queries. Below, I explain how to reduce it to O(log n) per query.
So, you'll need to do the next:
1) Make a "segment tree" over given array.
2) For every node, instead of storing one number, you store a whole array containing the values of all leaves beneath it (the numbers from that node's segment) in sorted order. The size of that array equals the number of leaves in the node's subtree.
3) To build such an array, you merge the arrays of the node's two children in the segment tree. But not only that: for every element of the merged array, you need to remember where it came from (the child array it was taken from, and its position there), plus a pointer to the first following element that was not inserted from the same child array.
4) With this structure, you can check how many numbers in some segment S are lower than a given value x. Find (with binary search) the first number in the array of the root node that is >= x. Then, using the pointers you have made, you can get the answer to the same question for the two children's arrays in O(1). You stop descending at each node whose segment lies entirely inside or entirely outside the given segment S. The time complexity is O(log n): O(log n) to find the first element that is >= x, and O(1) for each of the O(log n) segments in the decomposition of S.
5) Do a binary search over the solution value.
This was the solution with O(log^2 n) per query. But you can reduce it to O(log n):
1) Before doing everything I wrote above, you need to transform the problem. Sort all the numbers and remember the position of each in the original array. These positions now form the array you are working on; call that array P.
If the bounds of the query segment are a and b, you need to find the k-th element of P that is between a and b by value (not by index). That element is the index of your result in the original array.
2) To find that k-th element, you do a kind of backtracking with complexity O(log n). You will be asking for the number of elements between index 0 and some other index that are between a and b by value.
3) Suppose you know the answer to such a question for some segment (0, h). Get answers to the same type of question for all segments in the tree that begin at h, starting from the largest one. Keep getting those answers as long as the current answer (for segment (0, h)) plus the answer you just got is greater than k. Then update h. Keep updating h until there is only one segment in the tree that begins at h. That h is the index of the number you are looking for in the problem you have stated.
Answering such a question for a segment from the tree takes exactly O(1) time, because you already know the answer for its parent's segment, and using the pointers I explained in the first algorithm you can get the answer for the current segment in O(1).
I'm looking to implement an algorithm, which is given an array of integers and a list of ranges (intervals) in that array, returns the number of distinct elements in each interval. That is, given the array A and a range [i,j] returns the size of the set {A[i],A[i+1],...,A[j]}.
Obviously, the naive approach (iterate from i to j and count, ignoring duplicates) is too slow. Range-sum tricks seem inapplicable, since (A ∪ B) \ B isn't always equal to A; set union has no inverse operation.
I've looked up Range Queries in Wikipedia, and it hints that Yao (in '82) showed an algorithm that does this for semigroup operators (which union seems to be) with linear preprocessing time and space and almost constant query time. The article, unfortunately, is not available freely.
Edit: it appears this exact problem is available at http://www.spoj.com/problems/DQUERY/
There's a rather simple algorithm which uses O(N log N) time and space for preprocessing and O(log N) time per query. First, create a persistent segment tree for answering range sum queries (initially, it should contain zeroes at all positions). Then iterate through all the elements of the given array, keeping track of the latest position of each number. At each iteration, create a new version of the persistent segment tree by putting 1 at the latest position of the current element (at each iteration the position of only one element changes, so only one position's value in the segment tree changes, and the update can be done in O(log N)). To answer a query (l, r), you just need to find the sum on the segment (l, r) in the version of the tree that was created when iterating through the r-th element of the initial array.
Hope this algorithm is fast enough.
Upd. There's a little mistake in my explanation: at each step, at most two positions' values in the segment tree might change (because it's necessary to put 0 at the previous latest position of a number when it's updated). However, this doesn't change the complexity.
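A sketch of this persistent-segment-tree approach in Python (all names are mine), including the step of clearing the previous latest position of a repeated value:

```python
class Node:
    """Node of a persistent segment tree over positions, storing a subtree sum."""
    __slots__ = ("left", "right", "val")
    def __init__(self, left=None, right=None, val=0):
        self.left, self.right, self.val = left, right, val

def build(lo, hi):
    if lo == hi:
        return Node()
    mid = (lo + hi) // 2
    return Node(build(lo, mid), build(mid + 1, hi))

def update(node, lo, hi, pos, delta):
    """Path copying: return a new root, sharing all untouched subtrees."""
    if lo == hi:
        return Node(val=node.val + delta)
    mid = (lo + hi) // 2
    if pos <= mid:
        return Node(update(node.left, lo, mid, pos, delta), node.right,
                    node.val + delta)
    return Node(node.left, update(node.right, mid + 1, hi, pos, delta),
                node.val + delta)

def query(node, lo, hi, l, r):
    if r < lo or hi < l:
        return 0
    if l <= lo and hi <= r:
        return node.val
    mid = (lo + hi) // 2
    return query(node.left, lo, mid, l, r) + query(node.right, mid + 1, hi, l, r)

def distinct_counts(a, queries):
    """For each (l, r), the number of distinct values in a[l..r]."""
    n = len(a)
    root = build(0, n - 1)
    versions = []
    last = {}  # value -> latest position seen so far
    for i, v in enumerate(a):
        if v in last:  # clear the previous latest position of this value
            root = update(root, 0, n - 1, last[v], -1)
        root = update(root, 0, n - 1, i, +1)
        last[v] = i
        versions.append(root)
    return [query(versions[r], 0, n - 1, l, r) for l, r in queries]

print(distinct_counts([1, 1, 2, 1, 3], [(0, 4), (1, 2), (0, 1)]))  # [3, 2, 1]
```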
You can answer any of your queries in constant time by performing a quadratic-time precomputation:
For every i from 0 to n-1:
    S <- new empty set backed by a hashtable
    C <- 0
    For every j from i to n-1:
        If A[j] does not belong to S, increment C and add A[j] to S
        Store C as the answer for the query associated with interval i..j
This algorithm takes quadratic time since for each interval we perform a bounded number of operations, each one taking constant time (note that the set S is backed by a hashtable), and there's a quadratic number of intervals.
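The precomputation above can be written directly; a minimal sketch (names are mine):

```python
def precompute_distinct(a):
    """answers[i][j - i] = number of distinct values in a[i..j], O(n^2) total."""
    n = len(a)
    answers = [[0] * (n - i) for i in range(n)]
    for i in range(n):
        seen = set()  # hash-backed set: O(1) average membership tests
        count = 0
        for j in range(i, n):
            if a[j] not in seen:
                seen.add(a[j])
                count += 1
            answers[i][j - i] = count
    return answers

ans = precompute_distinct([1, 2, 1, 3, 2])
print(ans[0][4])  # 3  (distinct values in the whole array)
print(ans[1][2])  # 3  (a[1..3] = [2, 1, 3])
```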
If you don't have additional information about the queries (total number of queries, distribution of intervals), you cannot do essentially better, since the total number of intervals is already quadratic.
You can trade the quadratic precomputation for n linear on-the-fly computations: after receiving a query of the form A[i..j], precompute (in O(n) time) the answers for all intervals A[i..k], k >= i. This guarantees that the amortized complexity remains quadratic, and you are not forced to perform the complete quadratic precomputation at the beginning.
Note that the obvious algorithm (the one you call obvious in the statement) is cubic, since you scan every interval completely.
Here is another approach which might be quite closely related to the segment tree. Think of the elements of the array as leaves of a full binary tree. If there are 2^n elements in the array there are n levels of that full tree. At each internal node of the tree store the union of the points that lie in the leaves beneath it. Each number in the array needs to appear once in each level (less if there are duplicates). So the cost in space is a factor of log n.
Consider a range A..B of length K. You can work out the union of points in this range by forming the union of the sets associated with leaves and internal nodes, picking nodes as high up the tree as possible, as long as the subtree beneath a chosen node is entirely contained in the range. If you step along the range picking subtrees that are as big as possible, you will find that the sizes of the subtrees first increase and then decrease, and the number of subtrees required grows only with the logarithm of the size of the range: at the beginning, if you could only take a subtree of size 2^k, it will end on a boundary divisible by 2^(k+1), and you will have the chance of a subtree of size at least 2^(k+1) as the next step, if your range is big enough.
So the number of semigroup operations required to answer a query is O(log n) - but note that the semigroup operations may be expensive as you may be forming the union of two large sets.
this is a homework question, and I'm not that good at finding the complexity, but I'm trying my best!
Three-way partitioning is a modification of quicksort that partitions elements into groups smaller than, equal to, and larger than the pivot. Only the groups of smaller and larger elements need to be recursively sorted. Show that if there are N items but only k unique values (in other words there are many duplicates), then the running time of this modification to quicksort is O(Nk).
my try:
on the average case:
the three subarrays will be at these indices:
I assume that the subarray that holds the duplicated items has size (n-k)
first: from 0 to (i-1)
second: from i to (i+(n-k-1))
third: from (i+n-k) to (n-1)
number of comparisons = (n-k)-1
So,
T(n) = (n-k)-1 + Sigma from 0 until (n-k-1) [ T(i) + T (i-k)]
then I'm not sure how I'm gonna continue :S
It might be a very bad start though :$
Hope to find a help
First of all, you shouldn't look at the average case since the upper bound of O(nk) can be proved for the worst case, which is a stronger statement.
You should look at the maximum possible depth of recursion. In normal quicksort, the maximum depth is n. For each level, the total number of operations done is O(n), which gives O(n^2) total in the worst case.
Here, it's not hard to prove that the maximum possible depth is k (since one unique value will be removed at each level), which leads to O(nk) total.
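A minimal sketch of the three-way partitioning variant; since each recursive call excludes the pivot's value completely, the recursion depth is at most k, which gives the O(Nk) bound:

```python
import random

def quicksort3(a):
    """Three-way quicksort: elements equal to the pivot are placed once and
    never recursed on, so the recursion depth is bounded by the number k of
    distinct values (each call drops one distinct value)."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort3(less) + equal + quicksort3(greater)

print(quicksort3([4, 1, 4, 2, 2, 4, 1, 3]))  # [1, 1, 2, 2, 3, 4, 4, 4]
```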
I don't have a formal education in complexity. But if you think about it as a mathematical problem, you can prove it as a mathematical proof.
For all sorting algorithms, the best case scenario will always be O(n) for n elements, because to sort n elements you have to consider each one at least once. Now, for your particular optimisation of quicksort, what you have done is simplify the issue, because now you are only sorting unique values: all the values that are the same as the pivot are already considered sorted, and by virtue of its nature, quicksort guarantees that every unique value will feature as the pivot at some point in the operation, so this eliminates duplicates.
This means that for an N-sized list, quicksort must perform some operation N times (once for every position in the list), and because it is trying to sort the list, that operation is finding the position of that value relative to the others. But because you are effectively dealing with just the unique values, and there are k of those, the algorithm must perform at most k comparisons for each element. So it performs Nk operations for an N-sized list with k unique elements.
To summarise:
This algorithm eliminates checking against duplicate values.
But all sorting algorithms must look at every value in the list at least once. N operations
For every value in the list the operation is to find its position relative to other values in the list.
Because duplicates get removed, this leaves only k values to check against.
O(Nk)
An array is given such that its values increase from the 0th index through some (k-1)th index. At k the value is minimum, and then it starts increasing again through the nth element. Find the minimum element.
Essentially, it's one sorted list appended to another; example: (1, 2, 3, 4, 0, 1, 2, 3).
I have tried all sorts of algorithms like building a min-heap, quickselect, or just plain traversing, but can't get it below O(n). But there is a pattern in this array, something that suggests a binary-search kind of approach should be possible, with complexity something like O(log n), but I can't find anything.
Thoughts ??
Thanks
No. The drop can be anywhere; there is no structure to this.
Consider the extremes
1234567890
9012345678
1234056789
1357024689
It reduces to finding the minimum element.
Do a breadth-wise binary search for a decreasing range, with a one-element overlap at the binary splits. In other words, if you had, say, 17 elements, compare elements
0,8
8,16
0,4
4,8
8,12
12,16
0,2
2,4
etc., looking for a case where the left element is greater than the right.
Once you find such a range, recurse, doing the same binary search within that range.
Repeat until you've found the decreasing adjacent pair.
The average complexity is not less than O(log n), with a worst-case of O(n). Can anyone get a tighter average-complexity estimate? It seems roughly "halfway between" O(log n) and O(n), but I don't see how to evaluate it. It also depends on any additional constraints on the ranges of values and size of increment from one member to the next.
If the increment between elements is always 1, there's an O(log n) solution.
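For the special case where the array really is a rotation of a sorted array of distinct values (every element before the drop greater than everything after it), the standard O(log n) binary search works; this is a sketch of that special case only, not a refutation of the counterexamples above:

```python
def find_min_rotated(a):
    """Minimum of a rotated sorted array of distinct values, in O(log n).
    Assumes every element before the drop is greater than every element
    after it; for arrays like 1234056789 this does NOT apply."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] > a[hi]:   # the drop must be to the right of mid
            lo = mid + 1
        else:                # a[mid..hi] is sorted; the minimum is at mid or left
            hi = mid
    return a[lo]

print(find_min_rotated([4, 5, 6, 7, 0, 1, 2]))  # 0
```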
It cannot be done in less than O(n).
The worst case of this kind will always keep troubling us:
an increasing list
a1, a2, a3, ..., ak, ak+1, ..., an
with just one deviation ak < ak-1, e.g. 1,2,3,4,5,6,4,7,8,9,10
All the other numbers hold absolutely zero information about the value of k or ak.
The simplest solution is to just look forward through the list until the next value is less than the current one, or backward to find a value that is greater than the current one. That is O(n).
Doing both concurrently would still be O(n) but the running time would probably be faster (depending on complicated processor/cache factors).
I don't think you can get it much faster algorithmically than O(n) since a lot of the divide-and-conquer search algorithms rely on having a sorted data set.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How to find the kth largest element in an unsorted array of length n in O(n)?
I'm currently sitting in front of an course assignment.
The task is to find the nth-smallest element in an array. (Without sorting it!)
I tried to understand the BFPRT algorithm but from what I got it is only useful if you want to calculate the median and not the "n-th smallest" element.
Another idea I had, was to convert the array into a tree by attaching smaller/bigger nodes to the left/right of the root node. I'm not sure however if this counts as sorting.
To accelerate this I could store the number of subnodes in each node.
The complete assignment also includes that the algorithm has to be recursive.
There is also the hint to think about other data structures.
What do you think about my idea of transforming the array into a balanced tree?
Are there any other options I might have missed?
EDIT: I looked at various similar questions but were not able to completely understand the answers/apply them to my specific task.
The traditional approach to this problem (the order statistic problem) is reminiscent of quicksort. Let's say that you are looking for the k'th smallest element. Pick a (random) pivot element and partition the remaining elements into two groups (without sorting the two groups): L contains all elements that are smaller than or equal to the pivot element (except the pivot element itself), and G contains all elements that are greater than the pivot element. How large is L? If it contains exactly k - 1 elements, then the pivot element must be the k'th smallest element, and you are done. If L contains more than k - 1 elements, then the k'th smallest element must be in L; otherwise, it is in G. Now, apply the same algorithm to either L or G (if you need to use G, you must adjust k, since you are no longer looking for the k'th smallest element overall, but for the (k - |L| - 1)'th smallest element of G).
This algorithm runs in expected O(n) time; however, there exists a clever modification of the algorithm that guarantees O(n) time in worst case.
Edit: As @Ishtar points out, the "clever modification" is the BFPRT algorithm. Its core idea is to make sure that you never select a bad pivot element, such that the two partitions L and G do not become too unbalanced. As long as one can guarantee that one partition will never be more than c times larger than the other (for some arbitrary, but fixed c), the run time will be O(n).
There is a quite complex algorithm that in theory runs in O(n). In practice it is a bit slower. Have a look at this link: link. There is a wikipedia entry about this problem as well: wikilink
EDIT:
A simple pseudocode-algorithm to solve the problem:
k = the rank of the element we are looking for

FindKthSmallest(Array, k)
    pivot = some pivot element of the array
    L = set of all elements smaller than pivot in Array
    R = set of all elements greater than pivot in Array
    if |L| >= k: return FindKthSmallest(L, k)
    else if |L| + 1 == k: return pivot
    else: return FindKthSmallest(R, k - |L| - 1)

(This assumes distinct elements; with duplicates, count the elements equal to the pivot along with the pivot itself.)
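A runnable Python version of this quickselect (names are mine); elements equal to the pivot are grouped so duplicates are handled, and k is adjusted by |L| plus the size of that group when recursing to the right:

```python
import random

def kth_smallest(a, k):
    """k-th smallest (1-based) element of a, via quickselect; expected O(n)."""
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    if k <= len(less):                 # answer lies among the smaller elements
        return kth_smallest(less, k)
    if k <= len(less) + len(equal):    # answer is the pivot value itself
        return pivot
    # answer lies among the greater elements; adjust the rank accordingly
    return kth_smallest(greater, k - len(less) - len(equal))

print(kth_smallest([7, 2, 9, 4, 1, 8, 3], 3))  # 3
```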
I love the tournament algorithm here -- it's very intuitive and easy to understand.
http://en.wikipedia.org/wiki/Tournament_selection