K-th Smallest in Lexicographical Order - algorithm

Given integers n and k, find the lexicographically k-th smallest integer in the range from 1 to n.
Note: 1 ≤ k ≤ n ≤ 10^9.
Example:
Input:
n: 13 k: 2
Output:
10
Explanation:
The lexicographical order is [1, 10, 11, 12, 13, 2, 3, 4, 5, 6, 7, 8, 9], so the second smallest number is 10.
I have written code that works, but for very large inputs it takes too long to execute and times out. Could someone please suggest how I can make it more efficient?
Thanks!
public class Solution {
class MyComp implements Comparator<Integer>{
@Override
public int compare(Integer n1, Integer n2) {
return String.valueOf(n1).compareTo(String.valueOf(n2));
}
}
public int findKthNumber(int n, int k) {
if(n==0 || k ==0 || k > n) return 0;
int[] tracker = new int[9];
Arrays.fill(tracker,0);
Map<Integer,TreeSet<Integer>> map = new HashMap<Integer,TreeSet<Integer>>();
for(int i =1;i<=n;i++){
String prefix = String.valueOf(i);
int currIndex = Integer.parseInt(prefix.substring(0,1));
//Update count
tracker[currIndex-1] = tracker[currIndex-1] + 1;
if(map.containsKey(currIndex)){
TreeSet<Integer> set = map.get(currIndex);
set.add(i);
map.put(currIndex,set);
}else{
TreeSet<Integer> set = new TreeSet<Integer>(new MyComp());
set.add(i);
map.put(currIndex,set);
}
}
// counter to check the if we reach near by K
int count =1;
for(int i=0;i<9 ;i++ ){
int lookUp = i+1;
int val = tracker[i];
if( count + map.get(lookUp).size() > k){
for(int res : map.get(lookUp)){
if(count == k) return res;
count++;
}
}
count = count + map.get(lookUp).size();
}
return 0;
}
}

You can use quickselect, the selection variant of QuickSort, to find the k-th element. It runs in O(n) time on average, which is quicker and simpler than a full O(n log n) sort.
Pseudocode:
-> Pick a random element of the list as the pivot.
-> Put all elements smaller than the pivot on one side and all larger elements on the other side.
-> If the number of smaller elements is k-1, the pivot is the answer.
-> If the number of smaller elements is less than k-1, apply the same algorithm to the right side (adjusting k for the elements you skipped).
-> If the number of smaller elements is greater than k-1, apply the same algorithm to the left side.
You can do this in place.
Best and average time complexity is O(n).
Worst time complexity is O(n*n).
But it gives very stable performance over multiple iterations.
The random pivot works on all sorts of data.
Let me know if any of the steps are not clear :)
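The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function name kth_lexicographic is made up here, the numbers are compared as strings to get lexicographic order, and the partition builds new lists instead of working in place. It still materialises all n strings, so it will not cope with n close to 10^9.

```python
import random


def kth_lexicographic(n, k):
    """Return the k-th (1-based) number of 1..n in lexicographic order."""
    items = [str(i) for i in range(1, n + 1)]  # string comparison = lexicographic order
    target = k - 1  # 0-based rank we are looking for
    while True:
        pivot = random.choice(items)
        smaller = [s for s in items if s < pivot]
        larger = [s for s in items if s > pivot]
        if target < len(smaller):
            items = smaller  # answer is on the "smaller" side
        elif target == len(smaller):
            return int(pivot)  # exactly k-1 elements are smaller than the pivot
        else:
            target -= len(smaller) + 1  # skip the smaller elements and the pivot
            items = larger


print(kth_lexicographic(13, 2))
```

For the example from the question (n = 13, k = 2) the pivot ends up being "10" once exactly one string ("1") is smaller than it.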

For such small numbers you could generate all the strings in an array, sort it, and return the k-th entry:
String[] arr = new String[n];
for (int i = 0; i < n; i++)
    arr[i] = String.valueOf(i + 1);
Arrays.sort(arr);
return Integer.parseInt(arr[k - 1]);
This seems much easier than counting up to the k-th entry. You don't need to sort the whole array, because you only need the k-th smallest entry, but for such small numbers it doesn't matter.
Or, even better, use an Integer array and the comparator you already created:
Integer[] arr = new Integer[n];
for (int i = 0; i < n; i++)
    arr[i] = i + 1;
Arrays.sort(arr, new MyComp());
return arr[k - 1].intValue();

Related

Looking for largest sum inside array

I have a given array [-2 -3 4 -1 -2 1 5 -3], so the largest sum would be 7 (the numbers from the 3rd to the 7th element). This array is just a simple example; the program should take the elements and the length of the array as user input.
My question is, how to determine which sum would be largest?
I computed the sum of all the numbers and the sum of only the positive numbers. The positive-only sum is large, but because of the "if" statement it skips the -1 and -2 after the 3rd element, so my sum is 10 and the solution is wrong.
I assume your question is to find the contiguous subarray (containing at least one number) which has the largest sum. Otherwise, the problem is pretty trivial, as you can just pick all the positive numbers.
There are 3 solutions that are better than the O(N^2) brute force solution. N is the length of the input array.
Dynamic programming. O(N) runtime, O(N) space
Since the subarray contains at least one number, we know that there are only N possible candidates: the best subarray that ends at A[0], at A[1], ..., at A[N - 1].
For the subarray that ends at A[i], we have the following optimal substructure:
maxSum[i] = max of {maxSum[i - 1] + A[i], A[i]};
class Solution {
public int maxSubArray(int[] nums) {
int max = Integer.MIN_VALUE;
if(nums == null || nums.length == 0) {
return max;
}
int[] maxSum = new int[nums.length + 1];
for(int i = 1; i < maxSum.length; i++) {
maxSum[i] = Math.max(maxSum[i - 1] + nums[i - 1], nums[i - 1]);
}
for(int i = 1; i < maxSum.length; i++) {
max = Math.max(maxSum[i], max);
}
return max;
}
}
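Since the recurrence only ever reads maxSum[i - 1], the whole table is not strictly needed. A minimal sketch of the same recurrence in Python with a single running value (usually known as Kadane's algorithm); the function name max_subarray_sum is just an illustrative choice and the input is assumed to be non-empty:

```python
def max_subarray_sum(nums):
    """Largest sum of a non-empty contiguous subarray of nums."""
    if not nums:
        raise ValueError("nums must contain at least one number")
    ending_here = nums[0]  # best sum of a subarray ending at the current element
    best = nums[0]         # best sum seen so far
    for x in nums[1:]:
        # Either extend the previous subarray or start a new one at x.
        ending_here = max(ending_here + x, x)
        best = max(best, ending_here)
    return best


print(max_subarray_sum([-2, -3, 4, -1, -2, 1, 5, -3]))
```

For the array from the question this gives 7, from the slice 4, -1, -2, 1, 5.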
Prefix sum, O(N) runtime, O(1) space
Maintain the minimum prefix sum seen so far as you iterate through the array. When visiting each number in the input array, update the prefix sum variable currSum; the best subarray ending at that number has sum currSum - minSum. Then update the maximum sum and the minimum prefix sum as shown in the following code.
class Solution {
public int maxSubArray(int[] nums) {
if(nums == null || nums.length == 0) {
return 0;
}
int maxSum = Integer.MIN_VALUE, currSum = 0, minSum = 0;
for(int i = 0; i < nums.length; i++) {
currSum += nums[i];
maxSum = Math.max(maxSum, currSum - minSum);
minSum = Math.min(minSum, currSum);
}
return maxSum;
}
}
Divide and conquer, O(N * logN) runtime
Divide the original problem into two subproblems and apply this principle recursively using the following formula.
Let A[0,.... midIdx] be the left half of A, A[midIdx + 1, ..... A.length - 1] be the right half of A. leftSumMax is the answer of the left subproblem, rightSumMax is the answer of the right subproblem.
The final answer will be one of the following 3:
1. only uses numbers from the left half (solved by the left subproblem)
2. only uses numbers from the right half (solved by the right subproblem)
3. uses numbers from both left and right halves (solved in O(n) time)
class Solution {
public int maxSubArray(int[] nums) {
if(nums == null || nums.length == 0)
{
return 0;
}
return maxSubArrayHelper(nums, 0, nums.length - 1);
}
private int maxSubArrayHelper(int[] nums, int startIdx, int endIdx){
if(startIdx == endIdx){
return nums[startIdx];
}
int midIdx = startIdx + (endIdx - startIdx) / 2;
int leftMax = maxSubArrayHelper(nums, startIdx, midIdx);
int rightMax = maxSubArrayHelper(nums, midIdx + 1, endIdx);
int leftIdx = midIdx, rightIdx = midIdx + 1;
int leftSumMax = nums[leftIdx], rightSumMax = nums[rightIdx];
int leftSum = nums[leftIdx], rightSum = nums[rightIdx];
for(int i = leftIdx - 1; i >= startIdx; i--){
leftSum += nums[i];
leftSumMax = Math.max(leftSumMax, leftSum);
}
for(int j = rightIdx + 1; j <= endIdx; j++){
rightSum += nums[j];
rightSumMax = Math.max(rightSumMax, rightSum);
}
return Math.max(Math.max(leftMax, rightMax), leftSumMax + rightSumMax);
}
}
Try this:
1. Locate the first positive number, at offset i.
2. Add the positive numbers that follow, giving a sum of sum; the last offset is j. If this sum is greater than your current best sum, it becomes the current best sum, with offsets i to j.
3. Add the negative numbers that follow until you reach another positive number. If this negative sum is greater in absolute value than sum, start a new sum at this offset; otherwise continue with the current sum.
4. Go back to step 2.
Stop when you get to the end of the array. The best positive sum has been found.
If no positive sum can be found, locate the least negative value; this single entry is your best non-trivial sum.

How to maintain a min sliding window for an unsorted array? [duplicate]

Given an array of size n and k, how do you find the maximum for every contiguous subarray of size k?
For example
arr = 1 5 2 6 3 1 24 7
k = 3
ans = 5 6 6 6 24 24
I was thinking of having an array of size k and each step evict the last element out and add the new element and find maximum among that. It leads to a running time of O(nk). Is there a better way to do this?
You may have heard about doing it in O(n) using a deque.
That is a well-known O(n) algorithm for this question.
The method I am describing is also quite simple and has time complexity O(n).
Your Sample Input:
n=10 , W = 3
10 3
1 -2 5 6 0 9 8 -1 2 0
Answer = 5 6 6 9 9 9 8 2
Concept: Dynamic Programming
Algorithm:
N is the number of elements in the array and W is the window size. So, the number of windows = N-W+1.
Now divide the array into blocks of W, starting from index 1.
Here we divide into blocks of size W = 3.
For the sample input, the blocks are: [1 -2 5] [6 0 9] [8 -1 2] [0]
We divide into blocks because we will calculate the maximum in two ways: A) by traversing from left to right, and B) by traversing from right to left.
But how?
First, traverse from left to right. For each element ai in a block, find the maximum from the START of that block up to ai.
So here, LR = 1 1 5 | 6 6 9 | 8 8 8 | 0
Second, traverse from right to left. For each element ai in a block, find the maximum from the END of that block back to ai.
So here, RL = 5 5 5 | 9 9 9 | 8 2 2 | 0
Now we have to find the maximum for each subarray (window) of size W.
So, for each index from 1 to N-W+1:
max_val[index] = max(RL[index], LR[index+w-1]);
for index=1: max_val[1] = max(RL[1],LR[3]) = max(5,5)= 5
Similarly, for every index i (i <= n-w+1), the values RL[i] and LR[i+w-1]
are compared, and the larger of the two is the answer for that window.
So Final Answer : 5 6 6 9 9 9 8 2
Time Complexity: O(n)
Implementation code:
#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#define LIM 100001
using namespace std;
int arr[LIM]; // Input Array
int LR[LIM]; // maximum from Left to Right
int RL[LIM]; // maximum from Right to left
int max_val[LIM]; // number of subarrays(windows) will be n-k+1
int main(){
int n, w, i, k; // 'n' is number of elements in array
// 'w' is Window's Size
cin >> n >> w;
k = n - w + 1; // 'K' is number of Windows
for(i = 1; i <= n; i++)
cin >> arr[i];
for(i = 1; i <= n; i++){ // for maximum Left to Right
if(i % w == 1) // that means START of a block
LR[i] = arr[i];
else
LR[i] = max(LR[i - 1], arr[i]);
}
for(i = n; i >= 1; i--){ // for maximum Right to Left
if(i == n) // Maybe the last block is not of size 'W'.
RL[i] = arr[i];
else if(i % w == 0) // that means END of a block
RL[i] = arr[i];
else
RL[i] = max(RL[i+1], arr[i]);
}
for(i = 1; i <= k; i++) // maximum
max_val[i] = max(RL[i], LR[i + w - 1]);
for(i = 1; i <= k ; i++)
cout << max_val[i] << " ";
cout << endl;
return 0;
}
I'll try to prove it (by @johnchen902):
If k % w != 1 (k is not the beginning of a block)
    Let k* = the beginning of the block containing k + w - 1
    ans[k] = max( arr[k], arr[k + 1], arr[k + 2], ..., arr[k + w - 1])
           = max( max( arr[k], arr[k + 1], arr[k + 2], ..., arr[k* - 1]),
                  max( arr[k*], arr[k* + 1], arr[k* + 2], ..., arr[k + w - 1]) )
           = max( RL[k], LR[k+w-1] )
Otherwise (k is the beginning of a block)
    ans[k] = max( arr[k], arr[k + 1], arr[k + 2], ..., arr[k + w - 1])
           = RL[k] = LR[k+w-1]
           = max( RL[k], LR[k+w-1] )
The dynamic programming approach is very neatly explained by Shashank Jain. I would like to explain how to do the same using a deque.
The key is to keep the max element at the front of the queue (for a window), discard the useless elements, and also discard the elements that fall outside the current window.
Useless elements: if the current element is greater than the last element of the queue, then the last element of the queue is useless.
Note: we store the index in the queue, not the element itself. This will be clearer from the code.
1. If the current element is greater than the last element of the queue, then the last element of the queue is useless. We need to delete that last element
(and keep deleting until the last element of the queue is no longer smaller than the current element).
2. If current_index - k >= q.front(), the front element has left the window, so we need to delete it from the front of the queue.
vector<int> max_sub_deque(vector<int> &A,int k)
{
deque<int> q;
for(int i=0;i<k;i++)
{
while(!q.empty() && A[i] >= A[q.back()])
q.pop_back();
q.push_back(i);
}
vector<int> res;
for(int i=k;i<A.size();i++)
{
res.push_back(A[q.front()]);
while(!q.empty() && A[i] >= A[q.back()] )
q.pop_back();
while(!q.empty() && q.front() <= i-k)
q.pop_front();
q.push_back(i);
}
res.push_back(A[q.front()]);
return res;
}
Since each element is enqueued and dequeued at most once, the time complexity is O(n+n) = O(2n) = O(n).
And the size of the queue cannot exceed k, so the space complexity is O(k).
An O(n) time solution is possible by combining two classic interview questions:
Make a stack data structure (called MaxStack) which supports push, pop and max in O(1) time.
This can be done using two stacks, where the second one contains the maximum seen so far.
Model a queue with stacks.
This can be done using two stacks. Enqueues go into one stack, and dequeues come from the other.
For this problem, we basically need a queue which supports enqueue, dequeue and max in O(1) (amortized) time.
We combine the two by modelling a queue with two MaxStacks.
To solve the question, we enqueue k elements, query the max, dequeue, enqueue the (k+1)-th element, query the max, and so on. This gives you the max for every k-sized subarray.
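A rough Python sketch of this combination, to make the idea concrete. The class and function names (MaxStack, MaxQueue, sliding_max) are illustrative only, and instead of a second stack each MaxStack entry simply carries the maximum of everything below it, which is an equivalent bookkeeping trick:

```python
class MaxStack:
    """Stack with O(1) push, pop and max."""

    def __init__(self):
        self._items = []  # pairs (value, max of the stack up to and including value)

    def push(self, value):
        current_max = value if not self._items else max(value, self._items[-1][1])
        self._items.append((value, current_max))

    def pop(self):
        return self._items.pop()[0]

    def max(self):
        return self._items[-1][1]

    def __len__(self):
        return len(self._items)


class MaxQueue:
    """Queue built from two MaxStacks; enqueue, dequeue and max are O(1) amortized."""

    def __init__(self):
        self._inbox = MaxStack()   # receives enqueued values
        self._outbox = MaxStack()  # serves dequeues in FIFO order

    def enqueue(self, value):
        self._inbox.push(value)

    def dequeue(self):
        if not self._outbox:
            # Move everything over; this reverses the order, so the oldest is on top.
            while self._inbox:
                self._outbox.push(self._inbox.pop())
        return self._outbox.pop()

    def max(self):
        return max(s.max() for s in (self._inbox, self._outbox) if s)


def sliding_max(arr, k):
    queue, result = MaxQueue(), []
    for i, value in enumerate(arr):
        queue.enqueue(value)
        if i >= k:
            queue.dequeue()  # drop the element that left the window
        if i >= k - 1:
            result.append(queue.max())
    return result


print(sliding_max([1, 5, 2, 6, 3, 1, 24, 7], 3))
```

Each element is pushed and popped at most twice in total, which is where the amortized O(1) per operation comes from.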
I believe there are other solutions too.
1) I believe the queue idea can be simplified. We maintain a queue and a max for every k. We enqueue a new element, and dequeue all elements which are not greater than the new element.
2) Maintain two new arrays which hold the running max for each block of k, one array for each direction (left to right and right to left).
3) Use a hammer: preprocess in O(n) time for range maximum queries.
Solution 1) above might be the best one.
You need a fast data structure that can add, remove and query for the max element in less than O(n) time (you can just use an array if O(n) or O(nlogn) is acceptable). You can use a heap, a balanced binary search tree, a skip list, or any other sorted data structure that performs these operations in O(log(n)).
The good news is that most popular languages have a sorted data structure implemented that supports these operations for you. C++ has std::set and std::multiset (you probably need the latter) and Java has PriorityQueue and TreeSet.
Here is the java implementation
public static Integer[] maxsInEveryWindows(int[] arr, int k) {
Deque<Integer> deque = new ArrayDeque<Integer>();
/* Process first k (or first window) elements of array */
for (int i = 0; i < k; i++) {
// For every element, the previous smaller elements are useless so
// remove them from deque
while (!deque.isEmpty() && arr[i] >= arr[deque.peekLast()]) {
deque.removeLast(); // Remove from rear
}
// Add new element at rear of queue
deque.addLast(i);
}
List<Integer> result = new ArrayList<Integer>();
// Process rest of the elements, i.e., from arr[k] to arr[n-1]
for (int i = k; i < arr.length; i++) {
// The element at the front of the queue is the largest element of
// previous window, so add to result.
result.add(arr[deque.getFirst()]);
// Remove all elements smaller than the currently
// being added element (remove useless elements)
while (!deque.isEmpty() && arr[i] >= arr[deque.peekLast()]) {
deque.removeLast();
}
// Remove the elements which are out of this window
while (!deque.isEmpty() && deque.getFirst() <= i - k) {
deque.removeFirst();
}
// Add current element at the rear of deque
deque.addLast(i);
}
// Print the maximum element of last window
result.add(arr[deque.getFirst()]);
return result.toArray(new Integer[0]);
}
Here is the corresponding test case
@Test
public void maxsInWindowsOfSizeKTest() {
Integer[] result = ArrayUtils.maxsInEveryWindows(new int[]{1, 2, 3, 1, 4, 5, 2, 3, 6}, 3);
assertThat(result, equalTo(new Integer[]{3, 3, 4, 5, 5, 5, 6}));
result = ArrayUtils.maxsInEveryWindows(new int[]{8, 5, 10, 7, 9, 4, 15, 12, 90, 13}, 4);
assertThat(result, equalTo(new Integer[]{10, 10, 10, 15, 15, 90, 90}));
}
Using a heap (or tree), you should be able to do it in O(n * log(k)). I'm not sure if this would be indeed better.
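For what it's worth, here is a small Python sketch of the heap idea using the standard heapq module. The function name sliding_max_heap is made up for illustration. It uses lazy deletion (stale entries are only thrown away when they reach the top), so the heap can temporarily hold more than k entries and the bound is O(n log n) in the worst case rather than a strict O(n log k):

```python
import heapq


def sliding_max_heap(arr, k):
    heap = []    # entries are (-value, index); heapq is a min-heap, so values are negated
    result = []
    for i, value in enumerate(arr):
        heapq.heappush(heap, (-value, i))
        # Discard entries at the top whose index has fallen out of the window.
        while heap[0][1] <= i - k:
            heapq.heappop(heap)
        if i >= k - 1:
            result.append(-heap[0][0])
    return result


print(sliding_max_heap([1, 5, 2, 6, 3, 1, 24, 7], 3))
```

The entry just pushed is always inside the window, so the heap is never empty when its top is read.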
Here is the Python implementation in O(n). Thanks to @Shashank Jain in advance.
from sys import stdin
n, w = map(int, stdin.readline().strip().split())
Arr = list(map(int, stdin.readline().strip().split()))
k = n - w + 1  # number of windows
leftA = [0] * n
rightA = [0] * n
result = [0] * k
for i in range(n):
    if i % w == 0:  # start of a block
        leftA[i] = Arr[i]
    else:
        leftA[i] = max(Arr[i], leftA[i - 1])
for i in range(n - 1, -1, -1):
    if i % w == (w - 1) or i == n - 1:  # end of a block
        rightA[i] = Arr[i]
    else:
        rightA[i] = max(Arr[i], rightA[i + 1])
for i in range(k):
    result[i] = max(rightA[i], leftA[i + w - 1])
print(*result, sep=' ')
Method 1: O(n) time, O(k) space
We use a deque (it is like a list but with constant-time insertion and deletion from both ends) to store the index of useful elements.
The index of the current max is kept at the leftmost element of deque. The rightmost element of deque is the smallest.
In the following, for easier explanation we say an element from the array is in the deque, while in fact the index of that element is in the deque.
Let's say {5, 3, 2} are already in the deque (again, in fact their indexes are).
If the next element we read from the array is bigger than 5 (remember, the leftmost element of deque holds the max), say 7: We delete the deque and create a new one with only 7 in it (we do this because the current elements are useless, we have found a new max).
If the next element is less than 2 (which is the smallest element of deque), say 1: We add it to the right ({5, 3, 2, 1})
If the next element is bigger than 2 but less than 5, say 4: We remove elements from right that are smaller than the element and then add the element from right ({5, 4}).
Also we keep elements of the current window only (we can do this in constant time because we are storing the indexes instead of elements).
from collections import deque
def max_subarray(array, k):
    deq = deque()  # indexes of useful elements; their values decrease from left to right
    for index, item in enumerate(array):
        # the max element is out of the window
        if deq and index - deq[0] >= k:
            deq.popleft()
        # remove elements from the right that are not bigger than the new item
        # (if the item is a new max, this empties the deque)
        while deq and array[deq[-1]] <= item:
            deq.pop()
        deq.append(index)
        if index >= k - 1:  # start printing when the first window is filled
            print(array[deq[0]])
Proof of O(n) time: The only part we need to check is the while loop. In the whole runtime of the code, the while loop can perform at most O(n) operations in total. The reason is that the while loop pops elements from the deque, and since in other parts of the code, we do at most O(n) insertions into the deque, the while loop cannot exceed O(n) operations in total. So the total runtime is O(n) + O(n) = O(n)
Method 2: O(n) time, O(n) space
This is the explanation of the method suggested by S Jain (as mentioned in the comments of his post, this method doesn't work with data streams, which most sliding window questions are designed for).
The reason that method works is explained using the following example:
array = [5, 6, 2, 3, 1, 4, 2, 3]
k = 4
blocks: [5, 6, 2, 3 | 1, 4, 2, 3]
LR:      5  6  6  6 | 1  4  4  4
RL:      6  6  3  3 | 4  4  3  3
max:     6  6  4  4   4
To get the max for the window [2, 3, 1, 4],
we can get the max of [2, 3] and max of [1, 4], and return the bigger of the two.
Max of [2, 3] is calculated in the RL pass and max of [1, 4] is calculated in LR pass.
Using Fibonacci heap, you can do it in O(n + (n-k) log k), which is equal to O(n log k) for small k, for k close to n this becomes O(n).
The algorithm: in fact, you need:
n inserts to the heap
n-k deletions
n-k findmax's
How much do these operations cost in a Fibonacci heap? Insert and findmax are O(1) amortized; deletion is O(log k) amortized, because the heap holds at most about k elements at a time. So, we have
O(n + (n-k) log k + (n-k)) = O(n + (n-k) log k)
Sorry, this should have been a comment but I am not allowed to comment for now.
@leo and @Clay Goddard
You can save yourselves from re-computing the maximum by storing both maximum and 2nd maximum of the window in the beginning
(2nd maximum will be the maximum only if there are two maximums in the initial window). If the maximum slides out of the window you still have the next best candidate to compare with the new entry. So you get O(n) , otherwise if you allowed the whole re-computation again the worst case order would be O(nk), k is the window size.
class MaxFinder
{
// finds the max and its index
static int[] findMaxByIteration(int arr[], int start, int end)
{
int max, max_ndx;
max = arr[start];
max_ndx = start;
for (int i=start; i<end; i++)
{
if (arr[i] > max)
{
max = arr[i];
max_ndx = i;
}
}
int result[] = {max, max_ndx};
return result;
}
// optimized to skip iteration, when previous windows max element
// is present in current window
static void optimizedPrintKMax(int arr[], int n, int k)
{
int i, j, max, max_ndx;
// for first window - find by iteration.
int result[] = findMaxByIteration(arr, 0, k);
System.out.printf("%d ", result[0]);
max = result[0];
max_ndx = result[1];
for (j=1; j <= (n-k); j++)
{
// if previous max has fallen out of current window, iterate and find
if (max_ndx < j)
{
result = findMaxByIteration(arr, j, j+k);
max = result[0];
max_ndx = result[1];
}
// optimized path, just compare max with new_elem that has come into the window
else
{
int new_elem_ndx = j + (k-1);
if (arr[new_elem_ndx] > max)
{
max = arr[new_elem_ndx];
max_ndx = new_elem_ndx;
}
}
System.out.printf("%d ", max);
}
}
public static void main(String[] args)
{
int arr[] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
//int arr[] = {1,5,2,6,3,1,24,7};
int n = arr.length;
int k = 3;
optimizedPrintKMax(arr, n, k);
}
}
package com;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class SlidingWindow {
public static void main(String[] args) {
int[] array = { 1, 5, 2, 6, 3, 1, 24, 7 };
int slide = 3;//say
List<Integer> result = new ArrayList<Integer>();
for (int i = 0; i < array.length - (slide-1); i++) {
result.add(getMax(array, i, slide));
}
System.out.println("MaxList->>>>" + result.toString());
}
private static Integer getMax(int[] array, int i, int slide) {
List<Integer> intermediate = new ArrayList<Integer>();
System.out.println("Initial::" + intermediate.size());
while (intermediate.size() < slide) {
intermediate.add(array[i]);
i++;
}
Collections.sort(intermediate);
return intermediate.get(slide - 1);
}
}
Here is a solution in O(n) time complexity with an auxiliary deque:
import java.util.ArrayDeque;
import java.util.Deque;
public class TestSlidingWindow {
    public static void main(String[] args) {
        int[] arr = { 1, 5, 7, 2, 1, 3, 4 };
        int k = 3;
        printMaxInSlidingWindow(arr, k);
    }
    public static void printMaxInSlidingWindow(int[] arr, int k) {
        Deque<Integer> queue = new ArrayDeque<Integer>();    // elements of the current window
        Deque<Integer> auxQueue = new ArrayDeque<Integer>(); // max candidates, largest first
        int[] resultArr = new int[(arr.length - k) + 1];
        int j = 0;
        for (int i = 0; i < arr.length; i++) {
            queue.addLast(arr[i]);
            /* The auxiliary deque keeps the candidates for the maximum in
               non-increasing order. Smaller elements at the rear can never
               become the maximum once arr[i] has arrived, so remove them
               before inserting arr[i]. */
            while (!auxQueue.isEmpty() && auxQueue.peekLast() < arr[i]) {
                auxQueue.pollLast();
            }
            auxQueue.addLast(arr[i]);
            if (queue.size() > k) {
                int removedEl = queue.removeFirst();
                // If the element leaving the window is the current max, drop it.
                if (removedEl == auxQueue.peekFirst()) {
                    auxQueue.pollFirst();
                }
            }
            if (queue.size() == k) {
                resultArr[j++] = auxQueue.peekFirst();
            }
        }
        for (int i = 0; i < resultArr.length; i++) {
            System.out.println(resultArr[i]);
        }
    }
}
static void countDistinct(int arr[], int n, int k)
{
System.out.print("\nMaximum integer in the window : ");
// Traverse through every window
for (int i = 0; i <= n - k; i++) {
System.out.print(findMaximuminAllWindow(Arrays.copyOfRange(arr, i, arr.length), k)+ " ");
}
}
private static int findMaximuminAllWindow(int[] win, int k) {
// TODO Auto-generated method stub
int max= Integer.MIN_VALUE;
for(int i=0; i<k;i++) {
if(win[i]>max)
max=win[i];
}
return max;
}
arr = 1 5 2 6 3 1 24 7
We have to find the maximum of each subarray, right?
So, what is meant by a subarray?
A subarray is a part of the array whose elements are contiguous and in their original order.
From the above array,
{1,5,2}, {6,3,1} and {1,24,7} are all examples of subarrays.
n = 8 // array length
k = 3 // window size
To find the maximums, we iterate through the array and find the maximum of each window.
With window size k,
{1,5,2} = 5 is the maximum
{5,2,6} = 6 is the maximum
{2,6,3} = 6 is the maximum
and so on..
ans = 5 6 6 6 24 24
The number of windows is n-k+1.
Hence, 8-3+1 = 6,
and the length of the answer is 6, as we have seen.
How can we solve this now?
When data moves through a pipe, the first data structure that comes to mind is the queue.
But rather than discussing that at length here, we jump directly to the deque.
There are two ways to think about it:
The window is fixed and data moves in and out.
The data is fixed and the window is sliding.
Example: a time series database.
For the first k elements (i = 0 .. k-1):
While (Queue is not empty and arr[Queue.back()] < arr[i]) {
    Queue.pop_back();
}
Queue.push_back(i);
For the rest (i = k .. n-1):
Print arr[Queue.front()]
// purge expired elements
While (Queue is not empty and Queue.front() <= i-k) {
    Queue.pop_front();
}
While (Queue is not empty and arr[Queue.back()] < arr[i]) {
    Queue.pop_back();
}
Queue.push_back(i);
Finally, print arr[Queue.front()] for the last window.
arr = [1, 2, 3, 1, 4, 5, 2, 3, 6]
k = 3
for i in range(len(arr) - k + 1):
    print(max(arr[i:i + k]), end=' ')  # 3 3 4 5 5 5 6
Two approaches.
Segment Tree O(n log n)
Build a maximum segment-tree.
Query between [i, i+k)
Something like..
public static void printMaximums(int[] a, int k) {
int n = a.length;
SegmentTree tree = new SegmentTree(a);
for (int i=0; i<=n-k; i++) System.out.print(tree.query(i, i+k) + " ");
}
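The SegmentTree class used above is not shown. A minimal sketch in Python of what such a tree could look like, assuming that query(lo, hi) returns the maximum over the half-open range [lo, hi). The names MaxSegmentTree and window_maximums are hypothetical:

```python
class MaxSegmentTree:
    """Iterative (bottom-up) segment tree answering range-maximum queries."""

    def __init__(self, values):
        self.n = len(values)
        # Leaves live at positions n .. 2n-1; position 0 is unused.
        self.tree = [float("-inf")] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Maximum of values[lo:hi] in O(log n)."""
        best = float("-inf")
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:          # lo is a right child: take it and move right
                best = max(best, self.tree[lo])
                lo += 1
            if hi & 1:          # hi is exclusive: step left and take that node
                hi -= 1
                best = max(best, self.tree[hi])
            lo //= 2
            hi //= 2
        return best


def window_maximums(a, k):
    tree = MaxSegmentTree(a)
    return [tree.query(i, i + k) for i in range(len(a) - k + 1)]


print(window_maximums([1, 5, 2, 6, 3, 1, 24, 7], 3))
```

Building the tree is O(n) and each of the n-k+1 queries is O(log n), which is slower than the deque approach but also works when the window size varies between queries.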
Deque O(n)
If the next element is greater than the rear element, remove the rear element.
If the element in the front of the deque is out of the window, remove the front element.
public static void printMaximums(int[] a, int k) {
int n = a.length;
Deque<int[]> deck = new ArrayDeque<>();
List<Integer> result = new ArrayList<>();
for (int i=0; i<n; i++) {
while (!deck.isEmpty() && a[i] >= deck.peekLast()[0]) deck.pollLast();
deck.offer(new int[] {a[i], i});
while (!deck.isEmpty() && deck.peekFirst()[1] <= i - k) deck.pollFirst();
if (i >= k - 1) result.add(deck.peekFirst()[0]);
}
System.out.println(result);
}
Here is an optimized version of the naive (conditional) nested loop approach I came up with which is much faster and doesn't require any auxiliary storage or data structure.
As the program moves from window to window, the start index and end index moves forward by 1. In other words, two consecutive windows have adjacent start and end indices.
For the first window of size W , the inner loop finds the maximum of elements with index (0 to W-1). (Hence i == 0 in the if in 4th line of the code).
Now instead of computing for the second window which only has one new element, since we have already computed the maximum for elements of indices 0 to W-1, we only need to compare this maximum to the only new element in the new window with the index W.
But if the element at 0 was the maximum which is the only element not part of the new window, we need to compute the maximum using the inner loop from 1 to W again using the inner loop (hence the second condition maxm == arr[i-1] in the if in line 4), otherwise just compare the maximum of the previous window and the only new element in the new window.
void print_max_for_each_subarray(int arr[], int n, int k)
{
int maxm;
for(int i = 0; i < n - k + 1 ; i++)
{
if(i == 0 || maxm == arr[i-1]) {
maxm = arr[i];
for(int j = i+1; j < i+k; j++)
if(maxm < arr[j]) maxm = arr[j];
}
else {
maxm = maxm < arr[i+k-1] ? arr[i+k-1] : maxm;
}
cout << maxm << ' ';
}
cout << '\n';
}
You can use the Deque data structure to implement this. A deque lets you insert and remove elements at both ends, unlike a traditional queue where you can only insert at one end and remove from the other.
Following is the code for the above problem.
public int[] maxSlidingWindow(int[] nums, int k) {
int n = nums.length;
int[] maxInWindow = new int[n - k + 1];
Deque<Integer> dq = new LinkedList<Integer>();
int i = 0;
for(; i<k; i++){
while(!dq.isEmpty() && nums[dq.peekLast()] <= nums[i]){
dq.removeLast();
}
dq.addLast(i);
}
for(; i <n; i++){
maxInWindow[i - k] = nums[dq.peekFirst()];
while(!dq.isEmpty() && dq.peekFirst() <= i - k){
dq.removeFirst();
}
while(!dq.isEmpty() && nums[dq.peekLast()] <= nums[i]){
dq.removeLast();
}
dq.addLast(i);
}
maxInWindow[i - k] = nums[dq.peekFirst()];
return maxInWindow;
}
The resulting array will have n - k + 1 elements, where n is the length of the given array and k is the given window size.
We can solve it in Python by applying slicing.
def sliding_window(a, k, n):
    max_val = []
    for i in range(n - k + 1):
        val = a[i:i + k]  # the window starting at index i
        max_val.append(max(val))
    return max_val
Driver code:
a = [15, 2, 3, 4, 5, 6, 2, 4, 9, 1, 5]
n = len(a)
k = 3
sl = sliding_window(a, k, n)
print(sl)
Create a TreeMap and put the first k elements into it as keys, with the number of occurrences as the value (so that duplicates are handled). A TreeMap keeps its entries sorted by key, so the first key in the map is the min and the last key is the max of the window. Then, for each new element arr[i], remove one occurrence of arr[i-k] from the map and add arr[i]. Here I assume that the input elements are in the array arr and that the map is filled from that array, k elements at a time. Since every insertion and removal in a TreeMap costs O(log k), this approach takes O(n log k) time overall.
100% working Tested (Swift)
func maxOfSubArray(arr:[Int],n:Int,k:Int)->[Int]{
var lenght = arr.count
var resultArray = [Int]()
for i in 0..<arr.count{
if lenght+1 > k{
let tempArray = Array(arr[i..<k+i])
resultArray.append(tempArray.max()!)
}
lenght = lenght - 1
}
print(resultArray)
return resultArray
}
This way we can use:
maxOfSubArray(arr: [1,2,3,1,4,5,2,3,6], n: 9, k: 3)
Result:
[3, 3, 4, 5, 5, 5, 6]
Just notice that you only have to search the new window if:
* The new element in the window is smaller than the previous maximum (if it's bigger, the new element is certainly the new maximum).
AND
* The element that just left the window was the current maximum.
In this case, re-scan the window.
for how big k? for reasonable-sized k. you can create k k-sized buffers and just iterate over the array keeping track of max element pointers in the buffers - needs no data structures and is O(n) k^2 pre-allocation.
A complete working solution in Amortised Constant O(1) Complexity.
https://github.com/varoonverma/code-challenge.git
Compare the first k elements and find the max; this is your first number.
Then compare the next element to the previous max. If the next element is bigger, it is the max of the next subarray; if it's equal or smaller, the max for that subarray stays the same, provided the previous max is still inside the window (otherwise the window has to be re-scanned).
Then move on to the next number.
max(1 5 2) = 5
max(5 6) = 6
max(6 6) = 6
... and so on
max(3 24) = 24
max(24 7) = 24
It's only slightly better than your answer

Finding contiguous subarray of equal sum

Given the array: 8 3 5 2 10 6 7 9 5 2
the output will be Yes,
because {8,3,5}, {10,6} and {9,5,2} all have the same sum, i.e. 16.
But for the array: 1 4 9 6 2 12
the output will be No,
because no contiguous slices have the same sum.
I was thinking of using the subset sum algorithm or Kadane's maximum subarray algorithm, but I gave up on that because all of those algorithms require a predefined target sum.
But here we don't know the target sum.
If the desired sum is given and all subarrays must be contiguous, then it can easily be done in O(n) (assuming all numbers are positive).
Run a loop over the array and maintain the boundaries of the slice (left and right indexes) and currentSum.
Start with an empty slice and a sum of 0, so the boundaries are [0, 0) (the right index is exclusive). Then, in the loop, you have three conditions:
If the sum is less than desired, add the right element to the sum and advance the right index.
If the sum is greater than desired, remove the left element from the sum and advance the left index.
If the sum is equal to the given one, print the slice. To avoid this slice in the next iteration, advance the left index and adjust the sum.
Translated to code:
public static void main(String[] args) {
int givenSum = 16;
int[] a = new int[] {8, 3, 5, 2, 10, 6, 7, 9, 5, 2};
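// boundaries of slice (assumes all elements and givenSum are positive)
int left = 0; // inclusive
int right = 0; // exclusive
int currentSum = 0;
// keep looping after right reaches the end while the slice can still shrink down to givenSum
while (right < a.length || currentSum >= givenSum) {
if (currentSum < givenSum) { // sum is not enough, add from the right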
currentSum += a[right];
right++;
}
if (currentSum > givenSum) { // sum exceeds given, remove from the left
currentSum -= a[left];
left++;
}
if (currentSum == givenSum) { // boundaries of given sum found, print it
System.out.println(Arrays.toString(Arrays.copyOfRange(a, left, right)));
// remove the left element, so we can process next sums
currentSum -= a[left];
left++;
}
}
}
For your case it prints 4 slices which yields sum 16
[8, 3, 5]
[10, 6]
[7, 9]
[9, 5, 2]
EDIT:
As the OP clarified, no sum is given; the goal is to check whether at least two different contiguous subarrays have an equal sum.
The most straightforward algorithm is to generate all possible sums and check whether there are duplicates.
int[] a = new int[] {1, 4, 9, 6, 2, 12};
HashSet<Integer> sums = new HashSet<>();
int numOfSums = 0;
for (int left = 0; left < a.length; left++) { // include the one-element slice at the last index
for (int right = left; right < a.length; right++) {
// sum from left to right
int sum = 0;
for (int k = left; k <= right; k++) {
sum += a[k];
}
numOfSums++;
sums.add(sum);
}
}
System.out.println(sums.size() == numOfSums); // true when all sums are distinct, i.e. no two slices share a sum
The complexity of this is O(n^3), which is not good, but it works.
Hint: one trick can bring it down to O(n^2): you don't need to recompute the sum from scratch for every slice!
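A minimal sketch of that O(n^2) idea, assuming "different subarrays" may overlap (as in the brute-force code above). The function name has_equal_sum_slices is made up for illustration. The running sum for a fixed left index is extended by one element at a time instead of being recomputed:

```python
def has_equal_sum_slices(a):
    """Return True if two different contiguous slices of `a` have the same sum."""
    seen = set()
    for left in range(len(a)):
        running = 0
        for right in range(left, len(a)):
            running += a[right]  # sum of a[left..right], extended by one element
            if running in seen:
                return True      # some earlier slice already produced this sum
            seen.add(running)
    return False
```

For example, has_equal_sum_slices([1, 2, 4]) is False because all six slice sums are distinct, while has_equal_sum_slices([1, 1]) is True.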
You can do it in the following way
You have the total sum = 48
Now the each subset would have a sum which would be equal to a factor of 48. The smaller the factor the more number of subsets you can break it into
For all factors of the sum, check if the answer is possible for that factor or not. This can be done in O(n) by simply traversing the array.
Time Complexity would be O(n * factors(sum))
Use dynamic programming to find all sub-sums of the array, then find the subarrays with the same sum. The complexity should be O(n^2).
void subsum(int n, int* arr, int** sum) {
for (int i = 0; i < n; ++i) {
sum[i][i] = arr[i];
}
for (int l = 2; l <= n; ++l) {
for (int i = 0; i < n - l + 1; ++i) {
sum[i][i + l - 1] = sum[i][i + l - 2] + arr[i + l -1];
}
}
}

Finding maximum for every window of size k in an array

Given an array of size n and k, how do you find the maximum for every contiguous subarray of size k?
For example
arr = 1 5 2 6 3 1 24 7
k = 3
ans = 5 6 6 6 24 24
I was thinking of having an array of size k and each step evict the last element out and add the new element and find maximum among that. It leads to a running time of O(nk). Is there a better way to do this?
You may have heard about doing this in O(n) using a deque.
That is a well-known O(n) algorithm for this question.
The method I describe here is also quite simple and has time complexity O(n).
Your Sample Input:
n=10 , W = 3
10 3
1 -2 5 6 0 9 8 -1 2 0
Answer = 5 6 6 9 9 9 8 2
Concept: Dynamic Programming
Algorithm:
N is the number of elements in the array and W is the window size, so the number of windows is N-W+1.
Now divide the array into blocks of size W, starting from index 1.
Here, divide into blocks of size W = 3.
For your sample input:
[1 -2 5] [6 0 9] [8 -1 2] [0]
We divide into blocks because we will calculate maxima in two ways: A) by traversing from left to right, and B) by traversing from right to left.
But how?
First, traverse from left to right. For each element ai in a block, find the maximum from the START of that block up to ai.
So here,
LR = [1 1 5] [6 6 9] [8 8 8] [0]
Second, traverse from right to left. For each element ai in a block, find the maximum from the END of that block back to ai.
So here,
RL = [5 5 5] [9 9 9] [8 2 2] [0]
Now we have to find maximum for each subarray or window of size 'W'.
So, starting from index = 1 to index = N-W+1 .
max_val[index] = max(RL[index], LR[index+w-1]);
for index=1: max_val[1] = max(RL[1],LR[3]) = max(5,5)= 5
Similarly, for every index i (i <= n-w+1), the values RL[i] and LR[i+w-1]
are compared, and the larger of the two is the answer for that window.
So Final Answer : 5 6 6 9 9 9 8 2
Time Complexity: O(n)
Implementation code:
#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
#define LIM 100001
using namespace std;
int arr[LIM]; // Input Array
int LR[LIM]; // maximum from Left to Right
int RL[LIM]; // maximum from Right to left
int max_val[LIM]; // number of subarrays(windows) will be n-k+1
int main(){
int n, w, i, k; // 'n' is number of elements in array
// 'w' is Window's Size
cin >> n >> w;
k = n - w + 1; // 'K' is number of Windows
for(i = 1; i <= n; i++)
cin >> arr[i];
for(i = 1; i <= n; i++){ // for maximum Left to Right
if(i % w == 1) // that means START of a block
LR[i] = arr[i];
else
LR[i] = max(LR[i - 1], arr[i]);
}
for(i = n; i >= 1; i--){ // for maximum Right to Left
if(i == n) // Maybe the last block is not of size 'W'.
RL[i] = arr[i];
else if(i % w == 0) // that means END of a block
RL[i] = arr[i];
else
RL[i] = max(RL[i+1], arr[i]);
}
for(i = 1; i <= k; i++) // maximum
max_val[i] = max(RL[i], LR[i + w - 1]);
for(i = 1; i <= k ; i++)
cout << max_val[i] << " ";
cout << endl;
return 0;
}
I'll try to prove it (by @johnchen902):
If k % w != 1 (k is not the beginning of a block)
Let k* = the end of the block containing k
ans[k] = max( arr[k], arr[k + 1], arr[k + 2], ..., arr[k + w - 1])
= max( max( arr[k], arr[k + 1], ..., arr[k*]),
max( arr[k* + 1], arr[k* + 2], ..., arr[k + w - 1]) )
= max( RL[k], LR[k+w-1] )
Otherwise (k is the beginning of a block, so the window is exactly one block)
ans[k] = max( arr[k], arr[k + 1], arr[k + 2], ..., arr[k + w - 1])
= RL[k] = LR[k+w-1]
= max( RL[k], LR[k+w-1] )
The dynamic programming approach is very neatly explained by Shashank Jain. I would like to explain how to do the same using a deque.
The key is to keep the max element at the front of the queue (for the current window), discarding useless elements as well as elements whose index is outside the current window.
Useless elements: if the current element is greater than or equal to the last element of the queue, then the last element of the queue is useless.
Note: we store the index in the queue, not the element itself. This will be clearer from the code.
1. If the current element is greater than or equal to the last element of the queue, the last element of the queue is useless and we delete it
(and keep deleting until the last element of the queue is greater than the current element, or the queue is empty).
2. If current_index - k >= q.front(), the front element has left the window, so we delete it from the front of the queue.
vector<int> max_sub_deque(vector<int> &A,int k)
{
deque<int> q;
for(int i=0;i<k;i++)
{
while(!q.empty() && A[i] >= A[q.back()])
q.pop_back();
q.push_back(i);
}
vector<int> res;
for(int i=k;i<A.size();i++)
{
res.push_back(A[q.front()]);
while(!q.empty() && A[i] >= A[q.back()] )
q.pop_back();
while(!q.empty() && q.front() <= i-k)
q.pop_front();
q.push_back(i);
}
res.push_back(A[q.front()]);
return res;
}
Since each element is enqueued and dequeued at most once, the time complexity is O(n+n) = O(2n) = O(n).
The size of the queue cannot exceed k, so the space complexity is O(k).
An O(n) time solution is possible by combining the two classic interview questions:
Make a stack data structure (called MaxStack) which supports push, pop and max in O(1) time.
This can be done using two stacks; the second one contains the maximum seen so far.
Model a queue with a stack.
This can be done using two stacks. Enqueues go into one stack, and dequeues come from the other.
For this problem, we basically need a queue, which supports enqueue, dequeue and max in O(1) (amortized) time.
We combine the above two, by modelling a queue with two MaxStacks.
To solve the question, we queue k elements, query the max, dequeue, enqueue k+1 th element, query the max etc. This will give you the max for every k sized sub-array.
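The combination described above can be sketched roughly as follows. This is an illustrative sketch, not the answerer's code: the class names MaxStack and MaxQueue and the function window_maxima are assumptions, and instead of a second stack each entry stores the running maximum next to its value, which serves the same purpose.

```python
class MaxStack:
    """Stack with O(1) push, pop and max."""

    def __init__(self):
        self._items = []  # pairs of (value, max of the stack up to this value)

    def push(self, x):
        current_max = x if not self._items else max(x, self._items[-1][1])
        self._items.append((x, current_max))

    def pop(self):
        return self._items.pop()[0]

    def max(self):
        return self._items[-1][1]

    def __len__(self):
        return len(self._items)


class MaxQueue:
    """Queue built from two MaxStacks: amortized O(1) enqueue, dequeue and max."""

    def __init__(self):
        self._inbox = MaxStack()   # receives enqueued elements
        self._outbox = MaxStack()  # serves dequeues in FIFO order

    def enqueue(self, x):
        self._inbox.push(x)

    def dequeue(self):
        if not self._outbox:
            # Reverse the inbox into the outbox so the oldest element ends up on top.
            while self._inbox:
                self._outbox.push(self._inbox.pop())
        return self._outbox.pop()

    def max(self):
        if not self._inbox:
            return self._outbox.max()
        if not self._outbox:
            return self._inbox.max()
        return max(self._inbox.max(), self._outbox.max())


def window_maxima(arr, k):
    """Maximum of every contiguous window of size k."""
    queue, result = MaxQueue(), []
    for i, x in enumerate(arr):
        queue.enqueue(x)
        if i >= k:
            queue.dequeue()  # drop the element that left the window
        if i >= k - 1:
            result.append(queue.max())
    return result
```

With the question's input, window_maxima([1, 5, 2, 6, 3, 1, 24, 7], 3) gives [5, 6, 6, 6, 24, 24]. Each element is pushed and popped at most twice, which is where the amortized O(1) per operation comes from.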
I believe there are other solutions too.
1)
I believe the queue idea can be simplified. We maintain a queue and a max for every k. We enqueue a new element, and dequeue all elements which are not greater than the new element.
2) Maintain two new arrays which maintain the running max for each block of k, one array for one direction (left to right/right to left).
3) Use a hammer: Preprocess in O(n) time for range maximum queries.
The 1) solution above might be the most optimal.
You need a fast data structure that can add, remove and query for the max element in less than O(n) time (you can just use an array if O(n) or O(nlogn) is acceptable). You can use a heap, a balanced binary search tree, a skip list, or any other sorted data structure that performs these operations in O(log(n)).
The good news is that most popular languages have a sorted data structure implemented that supports these operations for you. C++ has std::set and std::multiset (you probably need the latter) and Java has PriorityQueue and TreeSet.
Here is the java implementation
public static Integer[] maxsInEveryWindows(int[] arr, int k) {
Deque<Integer> deque = new ArrayDeque<Integer>();
/* Process first k (or first window) elements of array */
for (int i = 0; i < k; i++) {
// For every element, the previous smaller elements are useless so
// remove them from deque
while (!deque.isEmpty() && arr[i] >= arr[deque.peekLast()]) {
deque.removeLast(); // Remove from rear
}
// Add new element at rear of queue
deque.addLast(i);
}
List<Integer> result = new ArrayList<Integer>();
// Process rest of the elements, i.e., from arr[k] to arr[n-1]
for (int i = k; i < arr.length; i++) {
// The element at the front of the queue is the largest element of
// previous window, so add to result.
result.add(arr[deque.getFirst()]);
// Remove all elements smaller than the currently
// being added element (remove useless elements)
while (!deque.isEmpty() && arr[i] >= arr[deque.peekLast()]) {
deque.removeLast();
}
// Remove the elements which are out of this window
while (!deque.isEmpty() && deque.getFirst() <= i - k) {
deque.removeFirst();
}
// Add current element at the rear of deque
deque.addLast(i);
}
// Print the maximum element of last window
result.add(arr[deque.getFirst()]);
return result.toArray(new Integer[0]);
}
Here is the corresponding test case
@Test
public void maxsInWindowsOfSizeKTest() {
Integer[] result = ArrayUtils.maxsInEveryWindows(new int[]{1, 2, 3, 1, 4, 5, 2, 3, 6}, 3);
assertThat(result, equalTo(new Integer[]{3, 3, 4, 5, 5, 5, 6}));
result = ArrayUtils.maxsInEveryWindows(new int[]{8, 5, 10, 7, 9, 4, 15, 12, 90, 13}, 4);
assertThat(result, equalTo(new Integer[]{10, 10, 10, 15, 15, 90, 90}));
}
Using a heap (or tree), you should be able to do it in O(n * log(k)). I'm not sure if this would be indeed better.
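A rough sketch of the heap idea using Python's heapq module. The function name window_maxima_heap is made up for illustration. Note one assumption that changes the bound: this version uses lazy deletion (stale entries are removed only when they reach the top), so the heap can grow to n entries and the worst case is O(n log n). Getting a strict O(n log k) needs a structure that can delete an arbitrary element, such as a balanced tree.

```python
import heapq


def window_maxima_heap(arr, k):
    """Maximum of every contiguous window of size k, using a max-heap."""
    heap = []    # entries are (-value, index), so the largest value sits at heap[0]
    result = []
    for i, x in enumerate(arr):
        heapq.heappush(heap, (-x, i))
        # Lazily discard tops whose index has already left the window.
        # The entry just pushed is always inside the window, so the heap never empties.
        while heap[0][1] <= i - k:
            heapq.heappop(heap)
        if i >= k - 1:
            result.append(-heap[0][0])
    return result
```

For the question's input this returns the same answer as the deque versions; whether it is "indeed better" than O(nk) depends on k, and it is asymptotically worse than the O(n) deque approach.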
Here is the Python implementation in O(n), thanks to @Shashank Jain.
from sys import stdin
n, w = map(int, stdin.readline().strip().split())
Arr = list(map(int, stdin.readline().strip().split()))
k = n - w + 1  # number of windows = k
leftA = [0] * n
rightA = [0] * n
result = [0] * k
for i in range(n):
    if i % w == 0:  # start of a block
        leftA[i] = Arr[i]
    else:
        leftA[i] = max(Arr[i], leftA[i - 1])
for i in range(n - 1, -1, -1):
    if i % w == (w - 1) or i == n - 1:  # end of a block (the last block may be shorter)
        rightA[i] = Arr[i]
    else:
        rightA[i] = max(Arr[i], rightA[i + 1])
for i in range(k):
    result[i] = max(rightA[i], leftA[i + w - 1])
print(*result, sep=' ')
Method 1: O(n) time, O(k) space
We use a deque (it is like a list but with constant-time insertion and deletion from both ends) to store the index of useful elements.
The index of the current max is kept at the leftmost element of deque. The rightmost element of deque is the smallest.
In the following, for easier explanation we say an element from the array is in the deque, while in fact the index of that element is in the deque.
Let's say {5, 3, 2} are already in the deque (again, in fact their indexes are).
If the next element we read from the array is bigger than 5 (remember, the leftmost element of deque holds the max), say 7: We delete the deque and create a new one with only 7 in it (we do this because the current elements are useless, we have found a new max).
If the next element is less than 2 (which is the smallest element of deque), say 1: We add it to the right ({5, 3, 2, 1})
If the next element is bigger than 2 but less than 5, say 4: We remove elements from right that are smaller than the element and then add the element from right ({5, 4}).
Also we keep elements of the current window only (we can do this in constant time because we are storing the indexes instead of elements).
from collections import deque
def max_subarray(array, k):
    deq = deque()  # indexes of candidate maxima; their values decrease from left to right
    for index, item in enumerate(array):
        # the max element is out of the window
        if deq and index - deq[0] >= k:
            deq.popleft()
        # remove elements from the right that are not bigger than the new item;
        # this covers "found a new max" (the deque empties) and the in-between case
        while deq and item >= array[deq[-1]]:
            deq.pop()
        deq.append(index)
        if index >= k - 1:  # start printing when the first window is filled
            print(array[deq[0]])
Proof of O(n) time: The only part we need to check is the while loop. In the whole runtime of the code, the while loop can perform at most O(n) operations in total. The reason is that the while loop pops elements from the deque, and since in other parts of the code, we do at most O(n) insertions into the deque, the while loop cannot exceed O(n) operations in total. So the total runtime is O(n) + O(n) = O(n)
Method 2: O(n) time, O(n) space
This is the explanation of the method suggested by S Jain (as mentioned in the comments of his post, this method doesn't work with data streams, which most sliding window questions are designed for).
The reason that method works is explained using the following example:
array = [5, 6, 2, 3, 1, 4, 2, 3]
k = 4
       [5, 6, 2, 3 | 1, 4, 2, 3]
LR:     5  6  6  6 | 1  4  4  4
RL:     6  6  3  3 | 4  4  3  3
max:    6  6  4  4   4
To get the max for the window [2, 3, 1, 4],
we can get the max of [2, 3] and max of [1, 4], and return the bigger of the two.
Max of [2, 3] is calculated in the RL pass and max of [1, 4] is calculated in LR pass.
Using Fibonacci heap, you can do it in O(n + (n-k) log k), which is equal to O(n log k) for small k, for k close to n this becomes O(n).
The algorithm: in fact, you need:
n inserts to the heap
n-k deletions
n-k findmax's
How much do these operations cost in a Fibonacci heap? Insert and findmax are O(1) amortized; deletion is O(log k) amortized here, because the heap never holds more than k elements. So, we have
O(n + (n-k) log k + (n-k)) = O(n + (n-k) log k)
Sorry, this should have been a comment but I am not allowed to comment for now.
@leo and @Clay Goddard
You can save yourselves from re-computing the maximum by storing both maximum and 2nd maximum of the window in the beginning
(2nd maximum will be the maximum only if there are two maximums in the initial window). If the maximum slides out of the window you still have the next best candidate to compare with the new entry. So you get O(n) , otherwise if you allowed the whole re-computation again the worst case order would be O(nk), k is the window size.
class MaxFinder
{
// finds the max and its index
static int[] findMaxByIteration(int arr[], int start, int end)
{
int max, max_ndx;
max = arr[start];
max_ndx = start;
for (int i=start; i<end; i++)
{
if (arr[i] > max)
{
max = arr[i];
max_ndx = i;
}
}
int result[] = {max, max_ndx};
return result;
}
// optimized to skip iteration, when previous windows max element
// is present in current window
static void optimizedPrintKMax(int arr[], int n, int k)
{
int i, j, max, max_ndx;
// for first window - find by iteration.
int result[] = findMaxByIteration(arr, 0, k);
System.out.printf("%d ", result[0]);
max = result[0];
max_ndx = result[1];
for (j=1; j <= (n-k); j++)
{
// if previous max has fallen out of current window, iterate and find
if (max_ndx < j)
{
result = findMaxByIteration(arr, j, j+k);
max = result[0];
max_ndx = result[1];
}
// optimized path, just compare max with new_elem that has come into the window
else
{
int new_elem_ndx = j + (k-1);
if (arr[new_elem_ndx] > max)
{
max = arr[new_elem_ndx];
max_ndx = new_elem_ndx;
}
}
System.out.printf("%d ", max);
}
}
public static void main(String[] args)
{
int arr[] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
//int arr[] = {1,5,2,6,3,1,24,7};
int n = arr.length;
int k = 3;
optimizedPrintKMax(arr, n, k);
}
}
package com;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class SlidingWindow {
public static void main(String[] args) {
int[] array = { 1, 5, 2, 6, 3, 1, 24, 7 };
int slide = 3;//say
List<Integer> result = new ArrayList<Integer>();
for (int i = 0; i < array.length - (slide-1); i++) {
result.add(getMax(array, i, slide));
}
System.out.println("MaxList->>>>" + result.toString());
}
private static Integer getMax(int[] array, int i, int slide) {
List<Integer> intermediate = new ArrayList<Integer>();
System.out.println("Initial::" + intermediate.size());
while (intermediate.size() < slide) {
intermediate.add(array[i]);
i++;
}
Collections.sort(intermediate);
return intermediate.get(slide - 1);
}
}
Here is the solution in O(n) time complexity with auxiliary deque
import java.util.ArrayDeque;
import java.util.Deque;
public class TestSlidingWindow {
    public static void main(String[] args) {
        int[] arr = { 1, 5, 7, 2, 1, 3, 4 };
        int k = 3;
        printMaxInSlidingWindow(arr, k);
    }
    public static void printMaxInSlidingWindow(int[] arr, int k) {
        Deque<Integer> queue = new ArrayDeque<Integer>();    // current window, oldest element first
        Deque<Integer> auxQueue = new ArrayDeque<Integer>(); // candidate maxima, largest at the front
        int[] resultArr = new int[(arr.length - k) + 1];
        int j = 0;
        for (int i = 0; i < arr.length; i++) {
            queue.addLast(arr[i]);
            /** We maintain the auxiliary deque so that we still know the max when the
             current max element is removed. Elements at the rear that are smaller than
             the new element can never be the max again, so remove them, then insert the
             new element (it may become the max once the elements before it are removed). **/
            while (!auxQueue.isEmpty() && auxQueue.peekLast() < arr[i]) {
                auxQueue.pollLast();
            }
            auxQueue.addLast(arr[i]);
            if (queue.size() > k) {
                int removedEl = queue.removeFirst();
                // the element leaving the window was the max: drop it from the candidates too
                if (removedEl == auxQueue.peekFirst()) {
                    auxQueue.pollFirst();
                }
            }
            if (queue.size() == k) {
                resultArr[j++] = auxQueue.peekFirst();
            }
        }
        for (int i = 0; i < resultArr.length; i++) {
            System.out.println(resultArr[i]);
        }
    }
}
static void countDistinct(int arr[], int n, int k)
{
System.out.print("\nMaximum integer in the window : ");
// Traverse through every window
for (int i = 0; i <= n - k; i++) {
System.out.print(findMaximuminAllWindow(Arrays.copyOfRange(arr, i, arr.length), k)+ " ");
}
}
private static int findMaximuminAllWindow(int[] win, int k) {
// TODO Auto-generated method stub
int max= Integer.MIN_VALUE;
for(int i=0; i<k;i++) {
if(win[i]>max)
max=win[i];
}
return max;
}
arr = 1 5 2 6 3 1 24 7
We have to find the maximum of subarray, Right?
So, What is meant by subarray?
SubArray = Partial set and it should be in order and contiguous.
From the above array
{1,5,2} {6,3,1} {1,24,7} all are the subarray examples
n = 8 // Array length
k = 3 // window size
For finding the maximum, we have to iterate through the array, and find the maximum.
From the window size k,
{1,5,2} = 5 is the maximum
{5,2,6} = 6 is the maximum
{2,6,3} = 6 is the maximum
and so on..
ans = 5 6 6 6 24 24
The number of windows is n-k+1.
Hence, 8-3+1 = 6,
and the length of the answer is 6, as we have seen.
How can we solve this now?
When data moves through a pipe, the first data structure that comes to mind is the queue.
Rather than discussing that at length here, we jump directly to the deque.
There are two ways to think about it:
The window is fixed and the data flows in and out
The data is fixed and the window slides
Example: a time series database
For the first k elements (i = 0 .. k-1):
While (Queue is not empty and arr[Queue.back()] < arr[i]) {
Queue.pop_back();
}
Queue.push_back(i);
For the rest (i = k .. n-1):
Print arr[Queue.front()]
// purge expired elements
While (Queue is not empty and Queue.front() <= i-k) {
Queue.pop_front();
}
While (Queue is not empty and arr[Queue.back()] < arr[i]) {
Queue.pop_back();
}
Queue.push_back(i);
Finally, print arr[Queue.front()] for the last window.
arr = [1, 2, 3, 1, 4, 5, 2, 3, 6]
k = 3
for i in range(len(arr) - k + 1):
    print(max(arr[i:i + k]), end=' ')  # 3 3 4 5 5 5 6
Two approaches.
Segment Tree O(n log n)
Build a maximum segment-tree.
Query between [i, i+k)
Something like..
public static void printMaximums(int[] a, int k) {
int n = a.length;
SegmentTree tree = new SegmentTree(a);
for (int i=0; i<=n-k; i++) System.out.print(tree.query(i, i+k));
}
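The SegmentTree class used above is not shown in the answer. As a hedged illustration of what such a class could look like, here is a small iterative max segment tree; the names MaxSegmentTree and window_maxima_segment_tree are assumptions, not the answerer's API. Building is O(n) and each query over the half-open range [lo, hi) is O(log n).

```python
class MaxSegmentTree:
    """Iterative segment tree answering range-maximum queries."""

    def __init__(self, values):
        self.n = len(values)
        # Leaves live at positions n..2n-1; internal nodes at 1..n-1.
        self.tree = [float("-inf")] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Maximum of values[lo:hi] (hi is exclusive)."""
        best = float("-inf")
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:           # lo is a right child: take it and move right
                best = max(best, self.tree[lo])
                lo += 1
            if hi & 1:           # hi is exclusive: step left and take that node
                hi -= 1
                best = max(best, self.tree[hi])
            lo //= 2
            hi //= 2
        return best


def window_maxima_segment_tree(a, k):
    tree = MaxSegmentTree(a)
    return [tree.query(i, i + k) for i in range(len(a) - k + 1)]
```

A segment tree is more machinery than this problem needs, but it also handles windows of varying size, which the deque approach does not.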
Deque O(n)
If the next element is greater than the rear element, remove the rear element.
If the element in the front of the deque is out of the window, remove the front element.
public static void printMaximums(int[] a, int k) {
int n = a.length;
Deque<int[]> deck = new ArrayDeque<>();
List<Integer> result = new ArrayList<>();
for (int i=0; i<n; i++) {
while (!deck.isEmpty() && a[i] >= deck.peekLast()[0]) deck.pollLast();
deck.offer(new int[] {a[i], i});
while (!deck.isEmpty() && deck.peekFirst()[1] <= i - k) deck.pollFirst();
if (i >= k - 1) result.add(deck.peekFirst()[0]);
}
System.out.println(result);
}
Here is an optimized version of the naive (conditional) nested loop approach I came up with which is much faster and doesn't require any auxiliary storage or data structure.
As the program moves from window to window, the start index and end index moves forward by 1. In other words, two consecutive windows have adjacent start and end indices.
For the first window of size W , the inner loop finds the maximum of elements with index (0 to W-1). (Hence i == 0 in the if in 4th line of the code).
Now instead of computing for the second window which only has one new element, since we have already computed the maximum for elements of indices 0 to W-1, we only need to compare this maximum to the only new element in the new window with the index W.
But if the element at 0 was the maximum which is the only element not part of the new window, we need to compute the maximum using the inner loop from 1 to W again using the inner loop (hence the second condition maxm == arr[i-1] in the if in line 4), otherwise just compare the maximum of the previous window and the only new element in the new window.
void print_max_for_each_subarray(int arr[], int n, int k)
{
int maxm;
for(int i = 0; i < n - k + 1 ; i++)
{
if(i == 0 || maxm == arr[i-1]) {
maxm = arr[i];
for(int j = i+1; j < i+k; j++)
if(maxm < arr[j]) maxm = arr[j];
}
else {
maxm = maxm < arr[i+k-1] ? arr[i+k-1] : maxm;
}
cout << maxm << ' ';
}
cout << '\n';
}
You can use Deque data structure to implement this. Deque has an unique facility that you can insert and remove elements from both the ends of the queue unlike the traditional queue where you can only insert from one end and remove from other.
Following is the code for the above problem.
public int[] maxSlidingWindow(int[] nums, int k) {
int n = nums.length;
int[] maxInWindow = new int[n - k + 1];
Deque<Integer> dq = new LinkedList<Integer>();
int i = 0;
for(; i<k; i++){
while(!dq.isEmpty() && nums[dq.peekLast()] <= nums[i]){
dq.removeLast();
}
dq.addLast(i);
}
for(; i <n; i++){
maxInWindow[i - k] = nums[dq.peekFirst()];
while(!dq.isEmpty() && dq.peekFirst() <= i - k){
dq.removeFirst();
}
while(!dq.isEmpty() && nums[dq.peekLast()] <= nums[i]){
dq.removeLast();
}
dq.addLast(i);
}
maxInWindow[i - k] = nums[dq.peekFirst()];
return maxInWindow;
}
the resultant array will have n - k + 1 elements where n is length of the given array, k is the given window size.
We can solve it in Python by applying slicing.
def sliding_window(a, k, n):
    max_val = []
    for i in range(n - k + 1):
        val = a[i:i + k]  # the window starting at index i
        max_val.append(max(val))
    return max_val
Driver code
a = [15, 2, 3, 4, 5, 6, 2, 4, 9, 1, 5]
n = len(a)
k = 3
sl = sliding_window(a, k, n)
print(sl)
Create a TreeMap that holds the current window of k elements. Put the first k elements in it as keys; the value can be anything, such as 1 (use it as a count if the array can contain duplicates). A TreeMap keeps its keys sorted, so the first key is the min and the last key is the max of the window. To slide the window, remove the element arr[i-k] from the map and add arr[i]. Here I assume the input elements are in an array arr, from which we fill the map of size k. Each TreeMap insertion and removal costs O(log k), so this approach takes O(n log k) time overall.
100% working Tested (Swift)
func maxOfSubArray(arr:[Int],n:Int,k:Int)->[Int]{
var lenght = arr.count
var resultArray = [Int]()
for i in 0..<arr.count{
if lenght+1 > k{
let tempArray = Array(arr[i..<k+i])
resultArray.append(tempArray.max()!)
}
lenght = lenght - 1
}
print(resultArray)
return resultArray
}
This way we can use:
maxOfSubArray(arr: [1,2,3,1,4,5,2,3,6], n: 9, k: 3)
Result:
[3, 3, 4, 5, 5, 5, 6]

How to find the kth largest element in an unsorted array of length n in O(n)?

I believe there's a way to find the kth largest element in an unsorted array of length n in O(n). Or perhaps it's "expected" O(n) or something. How can we do this?
This is called finding the k-th order statistic. There's a very simple randomized algorithm (called quickselect) taking O(n) average time and O(n^2) worst-case time, and a pretty complicated non-randomized algorithm (called median of medians) taking O(n) worst-case time. There's some info on Wikipedia, but it's not very good.
Everything you need is in these powerpoint slides. Just to extract the basic algorithm of the O(n) worst-case algorithm (median of medians):
Select(A,n,i):
Divide input into ⌈n/5⌉ groups of size 5.
/* Partition on median-of-medians */
medians = array of each group’s median.
pivot = Select(medians, ⌈n/5⌉, ⌈n/10⌉)
Left Array L and Right Array G = partition(A, pivot)
/* Find ith element in L, pivot, or G */
k = |L| + 1
If i = k, return pivot
If i < k, return Select(L, k-1, i)
If i > k, return Select(G, n-k, i-k)
It's also very nicely detailed in the Introduction to Algorithms book by Cormen et al.
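The Select pseudocode above can be sketched in Python as follows. This is a simplified sketch, not the slides' code: the function name select is an assumption, it copies sublists instead of partitioning in place, and it uses a three-way split so that duplicate values are handled (the pseudocode implicitly assumes distinct elements).

```python
def select(a, i):
    """Return the i-th smallest element of `a` (i is 1-based), median-of-medians style."""
    if len(a) <= 5:
        return sorted(a)[i - 1]
    # Split into groups of 5 and take each group's median.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # The pivot is the median of the medians, found recursively.
    pivot = select(medians, (len(medians) + 1) // 2)
    lower = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    upper = [x for x in a if x > pivot]
    if i <= len(lower):
        return select(lower, i)
    if i <= len(lower) + len(equal):
        return pivot
    return select(upper, i - len(lower) - len(equal))
```

To get the k-th largest element of a list nums, as the question asks, call select(nums, len(nums) - k + 1).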
If you want a true O(n) algorithm, as opposed to O(kn) or something like that, then you should use quickselect (it's basically quicksort where you throw out the partition that you're not interested in). My prof has a great writeup, with the runtime analysis: (reference)
The QuickSelect algorithm quickly finds the k-th smallest element of an unsorted array of n elements. It is a RandomizedAlgorithm, so we compute the worst-case expected running time.
Here is the algorithm.
QuickSelect(A, k)
let r be chosen uniformly at random in the range 1 to length(A)
let pivot = A[r]
let A1, A2 be new arrays
# split into a pile A1 of small elements and A2 of big elements
for i = 1 to n
if A[i] < pivot then
append A[i] to A1
else if A[i] > pivot then
append A[i] to A2
else
# do nothing
end for
if k <= length(A1):
# it's in the pile of small elements
return QuickSelect(A1, k)
else if k > length(A) - length(A2)
# it's in the pile of big elements
return QuickSelect(A2, k - (length(A) - length(A2)))
else
# it's equal to the pivot
return pivot
What is the running time of this algorithm? If the adversary flips coins for us, we may find that the pivot is always the largest element and k is always 1, giving a running time of
T(n) = Theta(n) + T(n-1) = Theta(n^2)
But if the choices are indeed random, the expected running time is given by
T(n) <= Theta(n) + (1/n) ∑_{i=1}^{n} T(max(i-1, n-i))
where we are making the not entirely reasonable assumption that the recursion always lands in the larger of A1 or A2.
Let's guess that T(n) <= an for some a. Then we get
T(n)
<= cn + (1/n) ∑_{i=1}^{n} T(max(i-1, n-i))
= cn + (1/n) ∑_{i=1}^{floor(n/2)} T(n-i) + (1/n) ∑_{i=floor(n/2)+1}^{n} T(i-1)
<= cn + 2 (1/n) ∑_{i=floor(n/2)}^{n} T(i)
<= cn + 2 (1/n) ∑_{i=floor(n/2)}^{n} ai
and now somehow we have to get the horrendous sum on the right of the plus sign to absorb the cn on the left. If we just bound it as 2(1/n) ∑_{i=n/2}^{n} an, we get roughly 2(1/n)(n/2)an = an. But this is too big - there's no room to squeeze in an extra cn. So let's expand the sum using the arithmetic series formula:
∑_{i=floor(n/2)}^{n} i
= ∑_{i=1}^{n} i - ∑_{i=1}^{floor(n/2)} i
= n(n+1)/2 - floor(n/2)(floor(n/2)+1)/2
<= n^2/2 - (n/4)^2/2
= (15/32)n^2
where we take advantage of n being "sufficiently large" to replace the ugly floor(n/2) factors with the much cleaner (and smaller) n/4. Now we can continue with
cn + 2 (1/n) ∑_{i=floor(n/2)}^{n} ai
<= cn + (2a/n) (15/32) n^2
= n (c + (15/16)a)
<= an
provided a > 16c.
This gives T(n) = O(n). It's clearly Omega(n), so we get T(n) = Theta(n).
A quick Google on that ('kth largest element array') returned this: http://discuss.joelonsoftware.com/default.asp?interview.11.509587.17
"Make one pass through tracking the three largest values so far."
(it was specifically for the 3rd largest)
and this answer:
Build a heap/priority queue. O(n)
Pop top element. O(log n)
Pop top element. O(log n)
Pop top element. O(log n)
Total = O(n) + 3 O(log n) = O(n)
You do like quicksort. Pick an element at random and shove everything either higher or lower. At this point you'll know which element you actually picked, and if it is the kth element you're done, otherwise you repeat with the bin (higher or lower), that the kth element would fall in. Statistically speaking, the time it takes to find the kth element grows with n, O(n).
A Programmer's Companion to Algorithm Analysis gives a version that is O(n), although the author states that the constant factor is so high, you'd probably prefer the naive sort-the-list-then-select method.
I answered the letter of your question :)
The C++ standard library has almost exactly that function call nth_element, although it does modify your data. It has expected linear run-time, O(N), and it also does a partial sort.
const int N = ...;
double a[N];
// ...
const int m = ...; // m < N
nth_element (a, a + m, a + N);
// a[m] contains the mth element in a
You can do it in O(n + kn) = O(n) (for constant k) for time and O(k) for space, by keeping track of the k largest elements you've seen.
For each element in the array you can scan the list of k largest and replace the smallest element with the new one if it is bigger.
Warren's priority heap solution is neater though.
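A short sketch of the "keep track of the k largest" idea with a heap instead of a linear scan, using Python's heapq. The function name kth_largest is made up for illustration, and the sketch assumes 1 <= k <= len(nums). Keeping the k largest values in a min-heap makes each update O(log k), so the total is O(n log k) time and O(k) space.

```python
import heapq


def kth_largest(nums, k):
    """Return the k-th largest element of nums, assuming 1 <= k <= len(nums)."""
    heap = []  # min-heap holding the k largest values seen so far
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current k largest: swap it in.
            heapq.heapreplace(heap, x)
    return heap[0]  # the smallest of the k largest is the k-th largest
```

For constant k this is O(n), like the scan-based version; for k close to n, quickselect or nth_element is the better choice.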
I am not certain about the O(n) bound, but the expected running time is between O(n) and O(n log n), and closer to O(n). The function is written in Java.
public int quickSelect(ArrayList<Integer>list, int nthSmallest){
//Choose random number in range of 0 to array length
Random random = new Random();
//This will give a random index between 0 and length - 1, inclusive
int pivotIndex = random.nextInt(list.size());
int pivot = list.get(pivotIndex);
ArrayList<Integer> smallerNumberList = new ArrayList<Integer>();
ArrayList<Integer> greaterNumberList = new ArrayList<Integer>();
//Split list into two.
//Value smaller than pivot should go to smallerNumberList
//Value greater than pivot should go to greaterNumberList
//Do nothing for value which is equal to pivot
for(int i=0; i<list.size(); i++){
if(list.get(i)<pivot){
smallerNumberList.add(list.get(i));
}
else if(list.get(i)>pivot){
greaterNumberList.add(list.get(i));
}
else{
//Do nothing
}
}
//If smallerNumberList holds at least nthSmallest (1-based) elements, the nthSmallest number must be in this list
if(nthSmallest <= smallerNumberList.size()){
return quickSelect(smallerNumberList, nthSmallest);
}
//If nthSmallest is greater than [ list.size() - greaterNumberList.size() ], nthSmallest number must be in this list
//The step is bit tricky. If confusing, please see the above loop once again for clarification.
else if(nthSmallest > (list.size() - greaterNumberList.size())){
//nthSmallest will have to be changed here. [ list.size() - greaterNumberList.size() ] elements are already in
//smallerNumberList
nthSmallest = nthSmallest - (list.size() - greaterNumberList.size());
return quickSelect(greaterNumberList,nthSmallest);
}
else{
return pivot;
}
}
I implemented finding the kth minimum in n unsorted elements using dynamic programming, specifically the tournament method. The execution time is O(n + k log(n)). The mechanism used is listed as one of the methods on the Wikipedia page about selection algorithms (as indicated in one of the posts above). You can read about the algorithm and also find code (Java) on my blog page Finding Kth Minimum. In addition, the logic can do a partial ordering of the list: it returns the first K min (or max) in O(k log(n)) time.
Though the code provided returns the kth minimum, similar logic can be used to find the kth maximum in O(k log(n)), ignoring the pre-work done to create the tournament tree.
Sexy quickselect in Python
import random

def quickselect(arr, k):
    '''
    k = 1 returns the first element in ascending order.
    Can easily be modified to return the first element in descending order.
    '''
    r = random.randrange(0, len(arr))
    a1 = [i for i in arr if i < arr[r]]  # partition: elements smaller than the pivot
    a2 = [i for i in arr if i > arr[r]]  # elements larger than the pivot
    if k <= len(a1):
        return quickselect(a1, k)
    elif k > len(arr) - len(a2):
        return quickselect(a2, k - (len(arr) - len(a2)))
    else:
        return arr[r]
As per the paper "Finding the Kth largest item in a list of n items", the following algorithm takes O(n) time in the worst case.
Divide the array into n/5 lists of 5 elements each.
Find the median of each sub-array of 5 elements.
Recursively find the median of all the medians; let's call it M.
Partition the array into two sub-arrays: the first, a1, contains the elements larger than M, and the second, a2, contains the elements smaller than M.
If k <= |a1|, return selection(a1, k).
If k − 1 = |a1|, return M.
If k > |a1| + 1, return selection(a2, k − |a1| − 1).
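The steps above can be sketched in Python roughly as follows. This is an illustration rather than the paper's code: the name `select_kth_largest` is made up, k is assumed to be 1-based with 1 <= k <= len(items), and elements equal to M are counted explicitly so that duplicates are handled too:

```python
def select_kth_largest(items, k):
    """Return the k-th largest element using a median-of-medians pivot."""
    if len(items) <= 5:
        return sorted(items, reverse=True)[k - 1]
    # Steps 1-2: split into groups of five and take each group's median.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Step 3: recursively select the median of the medians, M.
    pivot = select_kth_largest(medians, (len(medians) + 1) // 2)
    # Step 4: partition around M.
    larger = [x for x in items if x > pivot]
    equal_count = sum(1 for x in items if x == pivot)
    if k <= len(larger):
        return select_kth_largest(larger, k)
    if k <= len(larger) + equal_count:
        return pivot
    smaller = [x for x in items if x < pivot]
    return select_kth_largest(smaller, k - len(larger) - equal_count)


numbers = [2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17,
           15, 19, 20, 18, 23, 21, 22, 25, 24, 14]
print(select_kth_largest(numbers, 8))  # 18
```

Sorting each group of at most five elements is constant work per group, so everything outside the two recursive calls is linear in the input size.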
Analysis: As suggested in the original paper:
We use the median to partition the list into two halves(the first half,
if k <= n/2 , and the second half otherwise). This algorithm takes
time cn at the first level of recursion for some constant c, cn/2 at
the next level (since we recurse in a list of size n/2), cn/4 at the
third level, and so on. The total time taken is cn + cn/2 + cn/4 +
.... = 2cn = O(n).
Why is the group size taken as 5 and not 3?
As mentioned in the original paper:
Dividing the list by 5 assures a worst-case split of 70 − 30. At least
half of the medians are greater than the median-of-medians, hence at least
half of the n/5 blocks have at least 3 elements greater than it, and this gives a
3n/10 split, which means the other partition is 7n/10 in the worst case.
That gives T(n) = T(n/5) + T(7n/10) + O(n). Since n/5 + 7n/10 < n, the
worst-case running time is O(n).
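To see why this recurrence solves to O(n), one can check a linear bound by induction. A sketch, assuming the O(n) term is at most cn for some constant c and ignoring rounding:

```latex
\begin{align*}
T(n) &\le T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{7n}{10}\right) + cn \\
     &\le 10c \cdot \tfrac{n}{5} + 10c \cdot \tfrac{7n}{10} + cn
        && \text{(induction hypothesis: } T(m) \le 10cm \text{ for } m < n\text{)} \\
     &= 2cn + 7cn + cn = 10cn .
\end{align*}
```

With groups of 3 the corresponding recurrence is T(n) = T(n/3) + T(2n/3) + O(n); the two fractions sum to n, so this argument no longer gives a linear bound.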
Now I have tried to implement the above algorithm as:
public static int findKthLargestUsingMedian(Integer[] array, int k) {
// Step 1: Divide the list into n/5 lists of 5 element each.
int noOfRequiredLists = (int) Math.ceil(array.length / 5.0);
// Step 2: Find pivotal element aka median of medians.
int medianOfMedian = findMedianOfMedians(array, noOfRequiredLists);
//Now we need two lists split using medianOfMedian as pivot: listWithGreaterNumbers holds the elements greater than medianOfMedian and listWithSmallerNumbers holds the elements less than medianOfMedian.
List<Integer> listWithGreaterNumbers = new ArrayList<>(); // elements greater than medianOfMedian
List<Integer> listWithSmallerNumbers = new ArrayList<>(); // elements less than medianOfMedian
for (Integer element : array) {
if (element < medianOfMedian) {
listWithSmallerNumbers.add(element);
} else if (element > medianOfMedian) {
listWithGreaterNumbers.add(element);
}
}
// Next step.
if (k <= listWithGreaterNumbers.size()) return findKthLargestUsingMedian((Integer[]) listWithGreaterNumbers.toArray(new Integer[listWithGreaterNumbers.size()]), k);
else if ((k - 1) == listWithGreaterNumbers.size()) return medianOfMedian;
else if (k > (listWithGreaterNumbers.size() + 1)) return findKthLargestUsingMedian((Integer[]) listWithSmallerNumbers.toArray(new Integer[listWithSmallerNumbers.size()]), k-listWithGreaterNumbers.size()-1);
return -1;
}
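public static int findMedianOfMedians(Integer[] mainList, int noOfRequiredLists) {
    int[] medians = new int[noOfRequiredLists];
    for (int count = 0; count < noOfRequiredLists; count++) {
        int startOfPartialArray = 5 * count;
        // Clamp the end so the last (possibly shorter) group is not padded with nulls
        int endOfPartialArray = Math.min(startOfPartialArray + 5, mainList.length);
        Integer[] partialArray = Arrays.copyOfRange(mainList, startOfPartialArray, endOfPartialArray);
        // Step 2: Find median of each of these sublists (sort first, then take the middle element).
        Arrays.sort(partialArray);
        int medianIndex = partialArray.length / 2;
        medians[count] = partialArray[medianIndex];
    }
    // Step 3: Find median of the medians. Sorting keeps the code short; a strictly
    // linear-time version would select this median recursively instead.
    Arrays.sort(medians);
    return medians[medians.length / 2];
}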
Just for the sake of completeness, another algorithm makes use of a priority queue and takes O(n log n) time.
public static int findKthLargestUsingPriorityQueue(Integer[] nums, int k) {
int p = 0;
int numElements = nums.length;
// create priority queue where all the elements of nums will be stored
PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
// place all the elements of the array to this priority queue
for (int n : nums) {
pq.add(n);
}
// extract the kth largest element
while (numElements - k + 1 > 0) {
p = pq.poll();
k++;
}
return p;
}
Both of these algorithms can be tested as:
public static void main(String[] args) throws IOException {
Integer[] numbers = new Integer[]{2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17, 15, 19, 20, 18, 23, 21, 22, 25, 24, 14};
System.out.println(findKthLargestUsingMedian(numbers, 8));
System.out.println(findKthLargestUsingPriorityQueue(numbers, 8));
}
As expected, the output is:
18
18
Find the median of the array in linear time, then use the partition procedure exactly as in quicksort to divide the array into two parts: values less than ( < ) the median to its left and values greater than ( > ) the median to its right. That too can be done in linear time. Now go to the part of the array where the kth element lies.
Now the recurrence becomes:
T(n) = T(n/2) + cn
which gives O(n) overall.
Below is the link to a full implementation with quite an extensive explanation of how the algorithm for finding the Kth element in an unsorted array works. The basic idea is to partition the array as in QuickSort. But in order to avoid extreme cases (e.g. when the smallest element is chosen as pivot in every step, so that the algorithm degenerates to O(n^2) running time), a special pivot selection is applied, called the median-of-medians algorithm. The whole solution runs in O(n) time in both the worst and the average case.
Here is the link to the full article (it is about finding the Kth smallest element, but the principle is the same for finding the Kth largest):
Finding Kth Smallest Element in an Unsorted Array
How about this kind of approach:
Maintain a buffer of length k and a tmp_max. Getting tmp_max is O(k) and is done n times, so the total is something like O(kn).
Is that right, or am I missing something?
Although it doesn't beat the average case of quickselect or the worst case of the median statistics method, it's pretty easy to understand and implement.
There is also an algorithm that outperforms quickselect. It's called the Floyd-Rivest (FR) algorithm.
Original article: https://doi.org/10.1145/360680.360694
Downloadable version: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.309.7108&rep=rep1&type=pdf
Wikipedia article https://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm
I tried to implement the quickselect and FR algorithms in C++. I also compared them to the standard C++ library implementation std::nth_element (which is basically introselect, a hybrid of quickselect and heapselect). The result was that quickselect and nth_element ran comparably on average, but the FR algorithm ran approximately twice as fast as either of them.
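Sample code that I used for the FR algorithm (it needs <vector>, <algorithm> and <cmath>; on a standard-conforming compiler, sgn and _FRselect must be declared before the functions that call them):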
template <typename T>
T FRselect(std::vector<T>& data, const size_t& n)
{
if (n == 0)
return *(std::min_element(data.begin(), data.end()));
else if (n == data.size() - 1)
return *(std::max_element(data.begin(), data.end()));
else
return _FRselect(data, 0, data.size() - 1, n);
}
template <typename T>
T _FRselect(std::vector<T>& data, const size_t& left, const size_t& right, const size_t& n)
{
size_t leftIdx = left;
size_t rightIdx = right;
while (rightIdx > leftIdx)
{
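if (rightIdx - leftIdx > 600)
{
// Narrow the search to a sampled range around n. Use floating-point
// arithmetic so that negative intermediate values cannot wrap around.
const double range = static_cast<double>(rightIdx - leftIdx + 1);
const double i = static_cast<double>(n) - static_cast<double>(leftIdx) + 1.0;
const double z = std::log(range);
const double s = 0.5 * std::exp(2.0 * z / 3.0);
const double sd = 0.5 * std::sqrt(z * s * (range - s) / range) * sgn(i - range / 2.0);
const size_t newLeft = static_cast<size_t>(std::max(static_cast<double>(leftIdx), n - i * s / range + sd));
const size_t newRight = static_cast<size_t>(std::min(static_cast<double>(rightIdx), n + (range - i) * s / range + sd));
_FRselect(data, newLeft, newRight, n);
}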
T t = data[n];
size_t i = leftIdx;
size_t j = rightIdx;
// arrange pivot and right index
std::swap(data[leftIdx], data[n]);
if (data[rightIdx] > t)
std::swap(data[rightIdx], data[leftIdx]);
while (i < j)
{
std::swap(data[i], data[j]);
++i; --j;
while (data[i] < t) ++i;
while (data[j] > t) --j;
}
if (data[leftIdx] == t)
std::swap(data[leftIdx], data[j]);
else
{
++j;
std::swap(data[j], data[rightIdx]);
}
// adjust left and right towards the boundaries of the subset
// containing the (k - left + 1)th smallest element
if (j <= n)
leftIdx = j + 1;
if (n <= j)
rightIdx = j - 1;
}
// data[n] now holds the element of rank n (leftIdx may have moved past n)
return data[n];
}
template <typename T>
int sgn(T val) {
return (T(0) < val) - (val < T(0));
}
Iterate through the list. If the current value is larger than the stored largest value, store it as the largest value, bump values 1-4 down, and let the 5th drop off the list. If not, compare it to number 2 and do the same thing. Repeat, checking it against all 5 stored values. This should do it in O(n) for a fixed number of stored values.
I would like to suggest one answer.
Take the first k elements and sort them into a linked list of k values.
Then, for each of the remaining n-k values, do an insertion-sort step into that list. Even in the worst case this costs k*(n-k) comparisons, and sorting the first k values costs at most k*(k-1), so the total comes to (nk-k), which is O(n) for constant k.
Cheers
An explanation of the median-of-medians algorithm to find the k-th largest integer out of n can be found here:
http://cs.indstate.edu/~spitla/presentation.pdf
An implementation in C++ is below:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
int findMedian(vector<int> vec){
    // Find median of a vector: vec is a local copy, so sorting it is safe,
    // and after sorting the middle element is the true median
    sort(vec.begin(), vec.end());
    return vec[vec.size() / 2];
}
int findMedianOfMedians(vector<vector<int> > values){
vector<int> medians;
for (int i = 0; i < values.size(); i++) {
int m = findMedian(values[i]);
medians.push_back(m);
}
return findMedian(medians);
}
void selectionByMedianOfMedians(const vector<int> values, int k){
// Divide the list into n/5 lists of 5 elements each
vector<vector<int> > vec2D;
int count = 0;
while (count != values.size()) {
int countRow = 0;
vector<int> row;
while ((countRow < 5) && (count < values.size())) {
row.push_back(values[count]);
count++;
countRow++;
}
vec2D.push_back(row);
}
cout<<endl<<endl<<"Printing 2D vector : "<<endl;
for (int i = 0; i < vec2D.size(); i++) {
for (int j = 0; j < vec2D[i].size(); j++) {
cout<<vec2D[i][j]<<" ";
}
cout<<endl;
}
cout<<endl;
// Calculating a new pivot for making splits
int m = findMedianOfMedians(vec2D);
cout<<"Median of medians is : "<<m<<endl;
// Partition the list into elements larger than 'm' (call this sublist L1) and
// those smaller than 'm' (call this sublist L2)
vector<int> L1, L2;
for (int i = 0; i < vec2D.size(); i++) {
for (int j = 0; j < vec2D[i].size(); j++) {
if (vec2D[i][j] > m) {
L1.push_back(vec2D[i][j]);
}else if (vec2D[i][j] < m){
L2.push_back(vec2D[i][j]);
}
}
}
// Checking the splits as per the new pivot 'm'
cout<<endl<<"Printing L1 : "<<endl;
for (int i = 0; i < L1.size(); i++) {
cout<<L1[i]<<" ";
}
cout<<endl<<endl<<"Printing L2 : "<<endl;
for (int i = 0; i < L2.size(); i++) {
cout<<L2[i]<<" ";
}
// Recursive calls
if ((k - 1) == L1.size()) {
cout<<endl<<endl<<"Answer :"<<m;
}else if (k <= L1.size()) {
return selectionByMedianOfMedians(L1, k);
}else if (k > (L1.size() + 1)){
return selectionByMedianOfMedians(L2, k-((int)L1.size())-1);
}
}
int main()
{
int values[] = {2, 3, 5, 4, 1, 12, 11, 13, 16, 7, 8, 6, 10, 9, 17, 15, 19, 20, 18, 23, 21, 22, 25, 24, 14};
vector<int> vec(values, values + 25);
cout<<"The given array is : "<<endl;
for (int i = 0; i < vec.size(); i++) {
cout<<vec[i]<<" ";
}
selectionByMedianOfMedians(vec, 8);
return 0;
}
There is also Wirth's selection algorithm, which has a simpler implementation than QuickSelect. Wirth's selection algorithm is slower than QuickSelect, but with some improvements it becomes faster.
In more detail: using Vladimir Zabrodsky's MODIFIND optimization and median-of-3 pivot selection, and paying some attention to the final steps of the partitioning part of the algorithm, I came up with the following algorithm (imaginatively named "LefSelect"):
#define F_SWAP(a,b) { float temp=(a);(a)=(b);(b)=temp; }
/* Note: The code needs more than 2 elements to work */
float lefselect(float a[], const int n, const int k) {
int l=0, m = n-1, i=l, j=m;
float x;
while (l<m) {
if( a[k] < a[i] ) F_SWAP(a[i],a[k]);
if( a[j] < a[i] ) F_SWAP(a[i],a[j]);
if( a[j] < a[k] ) F_SWAP(a[k],a[j]);
x=a[k];
while (j>k && i<k) {
do i++; while (a[i]<x);
do j--; while (a[j]>x);
F_SWAP(a[i],a[j]);
}
i++; j--;
if (j<k) {
while (a[i]<x) i++;
l=i; j=m;
}
if (k<i) {
while (x<a[j]) j--;
m=j; i=l;
}
}
return a[k];
}
In benchmarks that I did here, LefSelect is 20-30% faster than QuickSelect.
Haskell Solution:
kthElem index list = sort list !! index
withShape ~[] [] = []
withShape ~(x:xs) (y:ys) = x : withShape xs ys
sort [] = []
sort (x:xs) = (sort ls `withShape` ls) ++ [x] ++ (sort rs `withShape` rs)
where
ls = filter (< x) xs
rs = filter (>= x) xs
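This implements a quickselect-style solution on top of a lazy quicksort (the head of the list is the pivot, so it is not a true median-of-medians), using the withShape function to discover the size of a partition without actually computing it.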
Here is a C++ implementation of Randomized QuickSelect. The idea is to randomly pick a pivot element. To implement randomized partition, we use a random function, rand() to generate index between l and r, swap the element at randomly generated index with the last element, and finally call the standard partition process which uses last element as pivot.
#include<iostream>
#include<climits>
#include<cstdlib>
using namespace std;
int randomPartition(int arr[], int l, int r);
// This function returns k'th smallest element in arr[l..r] using
// QuickSort based method. ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
// If k is smaller than number of elements in array
if (k > 0 && k <= r - l + 1)
{
// Partition the array around a random element and
// get position of pivot element in sorted array
int pos = randomPartition(arr, l, r);
// If position is same as k
if (pos-l == k-1)
return arr[pos];
if (pos-l > k-1) // If position is more, recur for left subarray
return kthSmallest(arr, l, pos-1, k);
// Else recur for right subarray
return kthSmallest(arr, pos+1, r, k-pos+l-1);
}
// If k is more than number of elements in array
return INT_MAX;
}
void swap(int *a, int *b)
{
int temp = *a;
*a = *b;
*b = temp;
}
// Standard partition process of QuickSort(). It considers the last
// element as pivot and moves all smaller element to left of it and
// greater elements to right. This function is used by randomPartition()
int partition(int arr[], int l, int r)
{
int x = arr[r], i = l;
for (int j = l; j <= r - 1; j++)
{
if (arr[j] <= x) // arr[j] is not greater than the pivot, so move it into the left part at index i
{
swap(&arr[i], &arr[j]);
i++;
}
}
swap(&arr[i], &arr[r]); // swap the pivot
return i;
}
// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
int n = r-l+1;
int pivot = rand() % n;
swap(&arr[l + pivot], &arr[r]);
return partition(arr, l, r);
}
// Driver program to test above methods
int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
return 0;
}
The worst-case time complexity of the above solution is still O(n^2). In the worst case, the randomized function may always pick a corner element. The expected time complexity of the above randomized QuickSelect is Θ(n).
Have Priority queue created.
Insert all the elements into heap.
Call poll() k times.
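public static int getKthLargestElements(int[] arr, int k)
{
    // max-heap; Integer.compare avoids the overflow that (y - x) can cause
    PriorityQueue<Integer> pq = new PriorityQueue<>((x, y) -> Integer.compare(y, x));
    //insert all the elements into heap
    for (int ele : arr)
        pq.offer(ele);
    // call poll() k times; the k-th value polled is the k-th largest
    int result = 0;
    for (int i = 0; i < k; i++)
    {
        result = pq.poll();
    }
    return result;
}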
This is an implementation in JavaScript.
If you relax the constraint that you cannot modify the array, you can avoid the extra memory by using two indexes to identify the "current partition" (in classic quicksort style - http://www.nczonline.net/blog/2012/11/27/computer-science-in-javascript-quicksort/).
function kthMax(a, k){
var size = a.length;
var pivot = a[ parseInt(Math.random()*size) ]; //Another choice could have been (size / 2)
//Create an array with all element lower than the pivot and an array with all element higher than the pivot
var i, lowerArray = [], upperArray = [];
for (i = 0; i < size; i++){
var current = a[i];
if (current < pivot) {
lowerArray.push(current);
} else if (current > pivot) {
upperArray.push(current);
}
}
//Which one should I continue with?
if(k <= upperArray.length) {
//Upper
return kthMax(upperArray, k);
} else {
var newK = k - (size - lowerArray.length);
if (newK > 0) {
///Lower
return kthMax(lowerArray, newK);
} else {
//None ... it's the current pivot!
return pivot;
}
}
}
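For comparison, the in-place variant mentioned above (two indexes marking the current partition, no extra arrays) could look roughly like this in Python. This is only a sketch under assumptions: the name `kth_max_in_place` is made up, k is 1-based with 1 <= k <= len(a), and the input list is reordered as a side effect:

```python
import random


def kth_max_in_place(a, k):
    """Return the k-th largest element of a, partitioning a in place."""
    lo, hi = 0, len(a) - 1
    target = k - 1                      # index of the answer in descending order
    while True:
        if lo == hi:
            return a[lo]
        # Move a randomly chosen pivot to the end of the current partition.
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):
            if a[i] > pivot:            # larger values go to the front
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]   # pivot lands at its final index
        if target == store:
            return a[store]
        if target < store:
            hi = store - 1              # answer is among the larger values
        else:
            lo = store + 1              # answer is among the smaller values


print(kth_max_in_place([12, 3, 5, 7, 4, 19, 26], 3))  # 12
```

The trade-off is the one stated above: no extra memory is allocated, but the caller's array is modified.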
If you want to test how it performs, you can use this variation:
function kthMax (a, k, logging) {
var comparisonCount = 0; //Number of comparison that the algorithm uses
var memoryCount = 0; //Number of integers in memory that the algorithm uses
var _log = logging;
if(k < 1 || k > a.length) { // k is 1-based: k = 1 is the maximum, k = a.length the minimum
if (_log) console.log ("k is out of range");
return false;
}
function _kthmax(a, k){
var size = a.length;
var pivot = a[parseInt(Math.random()*size)];
if(_log) console.log("Inputs:", a, "size="+size, "k="+k, "pivot="+pivot);
// This should never happen. Just a nice check in this exercise
// if you are playing with the code to avoid never ending recursion
if(typeof pivot === "undefined") {
if (_log) console.log ("Ops...");
return false;
}
var i, lowerArray = [], upperArray = [];
for (i = 0; i < size; i++){
var current = a[i];
if (current < pivot) {
comparisonCount += 1;
memoryCount++;
lowerArray.push(current);
} else if (current > pivot) {
comparisonCount += 2;
memoryCount++;
upperArray.push(current);
}
}
if(_log) console.log("Pivoting:",lowerArray, "*"+pivot+"*", upperArray);
if(k <= upperArray.length) {
comparisonCount += 1;
return _kthmax(upperArray, k);
} else if (k > size - lowerArray.length) {
comparisonCount += 2;
return _kthmax(lowerArray, k - (size - lowerArray.length));
} else {
comparisonCount += 2;
return pivot;
}
/*
* BTW, this is the logic for kthMin if we want to implement that... ;-)
*
if(k <= lowerArray.length) {
return kthMin(lowerArray, k);
} else if (k > size - upperArray.length) {
return kthMin(upperArray, k - (size - upperArray.length));
} else
return pivot;
*/
}
var result = _kthmax(a, k);
return {result: result, iterations: comparisonCount, memory: memoryCount};
}
The rest of the code is just to create some playground:
function getRandomArray (n){
var ar = [];
for (var i = 0, l = n; i < l; i++) {
ar.push(Math.round(Math.random() * l))
}
return ar;
}
//Create a random array of 50 numbers
var ar = getRandomArray (50);
Now, run your tests a few times.
Because of Math.random(), they will produce different results every time:
kthMax(ar, 2, true);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 2);
kthMax(ar, 34, true);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
kthMax(ar, 34);
If you test it a few times you can see, even empirically, that the number of iterations is on average O(n) ~= constant * n, and that the value of k does not affect the algorithm.
I came up with this algorithm and seems to be O(n):
Let's say k=3 and we want to find the 3rd largest item in the array. I would create three variables and compare each item of the array with the minimum of these three variables. If array item is greater than our minimum, we would replace the min variable with the item value. We continue the same thing until end of the array. The minimum of our three variables is the 3rd largest item in the array.
define variables a=0, b=0, c=0
iterate through the array items
find minimum a,b,c
if item > min then replace the min variable with item value
continue until end of array
the minimum of a,b,c is our answer
And, to find the Kth largest item we need K variables, which makes the running time O(n*K): linear only when K is a constant.
Example: (k=3)
[1,2,4,1,7,3,9,5,6,2,9,8]
Final variable values (following the steps above, with the two 9s counted separately):
a=8 (answer)
b=9
c=9
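A small Python sketch of this "K variables" idea, to make the cost visible. The function name `kth_largest_with_k_slots` is made up, and the slots are filled from the first k items instead of being initialized to 0, which is an assumption added here so that negative values also work:

```python
def kth_largest_with_k_slots(items, k):
    """Keep k slots; replace the smallest slot whenever a larger item appears."""
    slots = []
    for item in items:
        if len(slots) < k:
            slots.append(item)          # fill the slots with the first k items
            continue
        # Finding the smallest slot is an O(k) scan, done once per item.
        smallest = min(range(k), key=slots.__getitem__)
        if item > slots[smallest]:
            slots[smallest] = item
    return min(slots)


print(kth_largest_with_k_slots([1, 2, 4, 1, 7, 3, 9, 5, 6, 2, 9, 8], 3))  # 8
```

Each item triggers an O(k) scan of the slots, so the total is O(n*k); for the example list with k=3 it returns 8 because the repeated 9 occupies two slots.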
Can someone please review this and let me know what I am missing?
Here is an implementation of the algorithm eladv suggested (I also include an implementation with a random pivot):
import java.util.Arrays;
public class Median {
public static void main(String[] s) {
int[] test = {4,18,20,3,7,13,5,8,2,1,15,17,25,30,16};
System.out.println(selectK(test,8));
/*
int n = 100000000;
int[] test = new int[n];
for(int i=0; i<test.length; i++)
test[i] = (int)(Math.random()*test.length);
long start = System.currentTimeMillis();
random_selectK(test, test.length/2);
long end = System.currentTimeMillis();
System.out.println(end - start);
*/
}
public static int random_selectK(int[] a, int k) {
if(a.length <= 1)
return a[0];
int r = (int)(Math.random() * a.length);
int p = a[r];
int small = 0, equal = 0, big = 0;
for(int i=0; i<a.length; i++) {
if(a[i] < p) small++;
else if(a[i] == p) equal++;
else if(a[i] > p) big++;
}
if(k <= small) {
int[] temp = new int[small];
for(int i=0, j=0; i<a.length; i++)
if(a[i] < p)
temp[j++] = a[i];
return random_selectK(temp, k);
}
else if (k <= small+equal)
return p;
else {
int[] temp = new int[big];
for(int i=0, j=0; i<a.length; i++)
if(a[i] > p)
temp[j++] = a[i];
return random_selectK(temp,k-small-equal);
}
}
public static int selectK(int[] a, int k) {
if(a.length <= 5) {
Arrays.sort(a);
return a[k-1];
}
int p = median_of_medians(a);
int small = 0, equal = 0, big = 0;
for(int i=0; i<a.length; i++) {
if(a[i] < p) small++;
else if(a[i] == p) equal++;
else if(a[i] > p) big++;
}
if(k <= small) {
int[] temp = new int[small];
for(int i=0, j=0; i<a.length; i++)
if(a[i] < p)
temp[j++] = a[i];
return selectK(temp, k);
}
else if (k <= small+equal)
return p;
else {
int[] temp = new int[big];
for(int i=0, j=0; i<a.length; i++)
if(a[i] > p)
temp[j++] = a[i];
return selectK(temp,k-small-equal);
}
}
private static int median_of_medians(int[] a) {
int[] b = new int[a.length/5];
int[] temp = new int[5];
for(int i=0; i<b.length; i++) {
for(int j=0; j<5; j++)
temp[j] = a[5*i + j];
Arrays.sort(temp);
b[i] = temp[2];
}
return selectK(b, b.length/2 + 1);
}
}
It is similar to the quicksort strategy, where we pick an arbitrary pivot and bring the smaller elements to its left and the larger ones to its right:
public static int kthElInUnsortedList(List<int> list, int k)
{
if (list.Count == 1)
return list[0];
List<int> left = new List<int>();
List<int> right = new List<int>();
int pivotIndex = list.Count / 2;
int pivot = list[pivotIndex]; //arbitrary
for (int i = 0; i < list.Count; i++)
{
if (i == pivotIndex)
continue; // skip only the pivot itself, not everything after it
int currentEl = list[i];
if (currentEl < pivot)
left.Add(currentEl);
else
right.Add(currentEl);
}
if (k == left.Count + 1)
return pivot;
if (left.Count < k)
return kthElInUnsortedList(right, k - left.Count - 1);
else
return kthElInUnsortedList(left, k);
}
See the end of this article:
http://www.geeksforgeeks.org/kth-smallestlargest-element-unsorted-array-set-3-worst-case-linear-time/
If the array contains only integers, you can find the kth smallest element without any auxiliary arrays. Each step below costs O(n) and the number of steps is logarithmic in the value range, so the total is O(n log(max_value - min_value)), which is effectively O(n) for fixed-width integers.
The approach is to do a binary search on the range of array values. Given a min_value and a max_value, both in integer range, we can binary search that range.
We can write a comparator function that tells us whether a given value is the kth smallest, smaller than the kth smallest, or bigger than the kth smallest.
Do the binary search until you reach the kth smallest number.
Here is the code for that:
class Solution:
    def _iskthsmallest(self, A, val, k):
        # Returns 1 if val is above the kth smallest, -1 if below, 0 if it is the kth smallest
        less_count, equal_count = 0, 0
        for i in range(len(A)):
            if A[i] == val: equal_count += 1
            if A[i] < val: less_count += 1
        if less_count >= k: return 1
        if less_count + equal_count < k: return -1
        return 0

    def kthsmallest_binary(self, A, min_val, max_val, k):
        if min_val == max_val:
            return min_val
        mid = (min_val + max_val) // 2  # integer division, so mid stays an int in Python 3
        iskthsmallest = self._iskthsmallest(A, mid, k)
        if iskthsmallest == 0: return mid
        if iskthsmallest > 0: return self.kthsmallest_binary(A, min_val, mid, k)
        return self.kthsmallest_binary(A, mid+1, max_val, k)

    # @param A : tuple of integers
    # @param k : integer
    # @return an integer
    def kthsmallest(self, A, k):
        if not A: return 0
        if k > len(A): return 0
        min_val, max_val = min(A), max(A)
        return self.kthsmallest_binary(A, min_val, max_val, k)
What I would do is this:
initialize empty doubly linked list l
for each element e in array
    if e larger than head(l)
        make e the new head of l
        if size(l) > k
            remove last element from l
the last element of l should now be the kth largest element
You can simply store pointers to the first and last element in the linked list. They only change when updates to the list are made.
Update:
initialize empty sorted tree l
for each element e in array
    if e between head(l) and tail(l)
        insert e into l // O(log k)
        if size(l) > k
            remove last element from l
the last element of l should now be the kth largest element
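First we can build a balanced BST from the unsorted array, and from the BST we can find the kth smallest element in O(log(n)) if every node stores the size of its subtree. Note, however, that building a BST from unsorted input takes O(n log(n)) rather than O(n), so the overall cost is O(n log(n)).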

Resources