What is Sliding Window Algorithm? Examples?

While solving a geometry problem, I came across an approach called Sliding Window Algorithm.
Couldn't really find any study material/details on it.
What is the algorithm about?

I think of it as more a technique than an algorithm. It's a technique that could be utilized in various algorithms.
I think the technique is best understood with the following example. Imagine we have this array:
[ 5, 7, 1, 4, 3, 6, 2, 9, 2 ]
How would we find the largest sum of five consecutive elements? Well, we'd first look at 5, 7, 1, 4, 3 and see that the sum is 20. Then we'd look at the next set of five consecutive elements, which is 7, 1, 4, 3, 6. The sum of those is 21. This is more than our previous sum, so 7, 1, 4, 3, 6 is currently the best we've got so far.
Let's see if we could improve. 1, 4, 3, 6, 2? No, that sums to 16. 4, 3, 6, 2, 9? That sums to 24, so now that's the best sequence we've got. Now we move along to the next sequence, 3, 6, 2, 9, 2. That one sums to 22, which doesn't beat our current best of 24. And we've reached the end, so we're done.
The brute force approach to implementing this programmatically is as follows:
const getMaxSumOfFiveContiguousElements = (arr) => {
  let maxSum = -Infinity;
  let currSum;
  for (let i = 0; i <= arr.length - 5; i++) {
    currSum = 0;
    for (let j = i; j < i + 5; j++) {
      currSum += arr[j];
    }
    maxSum = Math.max(maxSum, currSum);
  }
  return maxSum;
};
What is the time complexity of this? It's O(n*k), where n is the length of the array and k is the window size (5 here). The outer loop goes through n - k + 1 windows, but when n is much larger than k, we can ignore the "- k + 1" part and just call it n. The inner loop goes through k items, so we have O(n*k).
Can we get this down to just O(n)? Let's return to this array:
[ 5, 7, 1, 4, 3, 6, 2, 9, 2 ]
First we get the sum of 5, 7, 1, 4, 3. Next we need the sum of 7, 1, 4, 3, 6. Picture a "window" surrounding each group of five elements.
What's the difference between the first window and the second window? Well, the second window got rid of the 5 on the left but added a 6 on the right. So since we know the sum of the first window was 20, to get the sum of the second window, we take that 20, subtract out the 5, and add the 6 to get 21. We don't actually have to go through each element in the second window and add them up (7 + 1 + 4 + 3 + 6). That would involve doing repeated and unnecessary work.
Here the sliding window approach ends up being two operations instead of five, since k is 5. That's not a huge improvement, but you can imagine that for larger k (and larger n) it really does help.
Here's how the code would work using the sliding window technique:
const getLargestSumOfFiveConsecutiveElements = (arr) => {
  let currSum = getSum(arr, 0, 4);
  let largestSum = currSum;
  for (let i = 1; i <= arr.length - 5; i++) {
    currSum -= arr[i - 1]; // subtract element to the left of curr window
    currSum += arr[i + 4]; // add last element in curr window
    largestSum = Math.max(largestSum, currSum);
  }
  return largestSum;
};
const getSum = (arr, start, end) => {
  let sum = 0;
  for (let i = start; i <= end; i++) {
    sum += arr[i];
  }
  return sum;
};
And that's the gist of the sliding window technique. In other problems you may be doing something more complicated than getting the sum of the elements inside the window. Or the window itself may be of varying size instead of the fixed size of five that we saw here. But this basic application of the sliding window technique should give you a foundation to build on.
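To give an idea of what a window of varying size can look like, here is a minimal sketch. The function name and the problem itself (shortest run of consecutive elements whose sum reaches a target) are my own choice for illustration, and the code assumes all elements are non-negative; with negative numbers, shrinking the window this way is no longer valid.

```javascript
// Hypothetical helper: length of the shortest contiguous run whose sum is >= target.
// Assumes every element is non-negative; returns 0 if no such run exists.
const shortestRunWithSumAtLeast = (arr, target) => {
  let best = Infinity;
  let windowSum = 0;
  let left = 0; // left edge of the current window
  for (let right = 0; right < arr.length; right++) {
    windowSum += arr[right]; // grow the window on the right
    while (windowSum >= target && left <= right) {
      best = Math.min(best, right - left + 1);
      windowSum -= arr[left]; // shrink the window from the left
      left++;
    }
  }
  return best === Infinity ? 0 : best;
};

console.log(shortestRunWithSumAtLeast([5, 7, 1, 4, 3, 6, 2, 9, 2], 15)); // 3 (6 + 2 + 9)
```

Each element enters the window once and leaves it at most once, so this is still O(n) even though there is a loop inside a loop.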

Generally speaking a sliding window is a sub-list that runs over an underlying collection. I.e., if you have an array like
[a b c d e f g h]
a sliding window of size 3 would run over it like
[a b c]
[b c d]
[c d e]
[d e f]
[e f g]
[f g h]
This is useful if you for instance want to compute a running average, or if you want to create a set of all adjacent pairs etc.
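As a rough sketch of the above (the helper names are made up for this example), a generator can produce the windows, and the running average and the adjacent pairs then fall out of it:

```javascript
// Hypothetical helper: yields each window of `size` consecutive elements.
function* slidingWindows(items, size) {
  for (let start = 0; start + size <= items.length; start++) {
    yield items.slice(start, start + size);
  }
}

const letters = ["a", "b", "c", "d", "e", "f", "g", "h"];
const windows = [...slidingWindows(letters, 3)];
console.log(windows.map((w) => w.join(" "))); // "a b c", "b c d", ..., "f g h"

// Running average over windows of `size` numbers
const runningAverage = (nums, size) =>
  [...slidingWindows(nums, size)].map(
    (w) => w.reduce((sum, x) => sum + x, 0) / size
  );
console.log(runningAverage([3, 6, 9, 12], 3)); // [6, 9]

// All adjacent pairs are just the windows of size 2
console.log([...slidingWindows([1, 2, 3, 4], 2)]); // [[1, 2], [2, 3], [3, 4]]
```

Copying every window with slice costs O(n * size) overall, which is fine for illustration; for large windows you would update a running total instead of re-summing.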

The sliding window is a problem-solving technique for problems that involve arrays or lists. Such problems usually have an easy brute-force solution in O(n^2) or O(n^3); with the sliding window technique the time complexity can often be reduced to O(n).
Great article on this is here: https://medium.com/outco/how-to-solve-sliding-window-problems-28d67601a66
So the first thing you want to be able to do is to identify a problem that uses a sliding window paradigm. Luckily, there are some common giveaways:
The problem will involve a data structure that is ordered and iterable like an array or a string
You are looking for some subrange in that array/string, like the longest, shortest or target value.
There is an apparent naive or brute force solution that runs in O(N²), O(2^N) or some other large time complexity.
But the biggest giveaway is that the thing you are looking for is often some kind of optimal, like the longest sequence or shortest sequence of something that satisfies a given condition exactly.
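A classic problem that ticks all of these boxes is "longest substring without repeating characters". The sketch below is my own illustration, not taken from the article; the function name is made up.

```javascript
// Hypothetical helper: length of the longest run of characters with no repeats.
const longestUniqueRun = (text) => {
  const lastSeen = new Map(); // character -> index of its most recent occurrence
  let best = 0;
  let left = 0; // left edge of the current window
  for (let right = 0; right < text.length; right++) {
    const ch = text[right];
    if (lastSeen.has(ch) && lastSeen.get(ch) >= left) {
      left = lastSeen.get(ch) + 1; // jump past the earlier duplicate
    }
    lastSeen.set(ch, right);
    best = Math.max(best, right - left + 1);
  }
  return best;
};

console.log(longestUniqueRun("abcabcbb")); // 3 ("abc")
```

The brute-force version checks every substring; here the window edges only ever move forward, so the string is scanned once.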

To add to the previous answers, here are some more resources that illustrate this concept very well.
This YouTube video is the best that I have found on this topic.
Here is the list of questions on LeetCode that can be solved using this technique.
The sliding window is one of the most frequently asked topics in coding rounds at top companies, so it is definitely worth spending some time to master it.

Related

Rearrange list to satisfy a condition

I was asked this during a coding interview but wasn't able to solve this. Any pointers would be very helpful.
I was given an integer list (think of it as a number line) which needs to be rearranged so that the difference between elements is equal to M (an integer which is given). The list needs to be rearranged in such a way that the value of the max absolute difference between the elements' new positions and the original positions needs to be minimized. Eventually, this value multiplied by 2 is returned.
Test cases:
//1.
original_list = [1, 2, 3, 4]
M = 2
rearranged_list = [-0.5, 1.5, 3.5, 5.5]
// difference in values of original and rearranged lists
diff = [1.5, 0.5, 0.5, 1.5]
max_of_diff = 1.5 // list is rearranged in such a way so that this value is minimized
return_val = 1.5 * 2 = 3
//2.
original_list = [1, 2, 4, 3]
M = 2
rearranged_list = [-1, 1, 3, 5]
// difference in values of original and rearranged lists
diff = [2, 1, 1, 2]
max_of_diff = 2 // list is rearranged in such a way so that this value is minimized
return_val = 2 * 2 = 4
Constraints:
1 <= list_length <= 10^5
1 <= M <= 10^4
-10^9 <= list[i] <= 10^9
There's a question on leetcode which is very similar to this: https://leetcode.com/problems/minimize-deviation-in-array/ but there, the operations that are performed on the array are mentioned while that's not been mentioned here. I'm really stumped.
Here is how you can think of it:
The "rearranged" list is like a straight line whose slope is M.
Picture a plot of the first example:
The black dots are the input values [1, 2, 3, 4], where the index in the array is the X-coordinate and the value at that index is the Y-coordinate.
The green line is determined by M. Initially this line runs through the origin at (0, 0). The red line segments represent the differences that must be taken into account.
Now the green line has to move vertically to its optimal position. We can see that we only need to look at the difference it makes with the first and with the last point. The other two inputs will never contribute to an extreme. This is generally true: there are only two input elements that need to be taken into account. They are the points that make the greatest (signed -- not absolute) difference and the least difference.
We can see that we need to move the green line in such a way that the signed differences with these two extremes become each other's opposites: i.e. their absolute values become the same, but their signs are opposite.
Twice this absolute difference is what we need to return, and it is actually the difference between the greatest (signed) difference and the least (signed) difference.
So, in conclusion, we must generate the values on the green line, find the least and greatest (signed) difference with the data points (Y-coordinates) and return the difference between those two.
Here is an implementation in JavaScript running the two examples you provided:
function solve(y, slope) {
    let low = Infinity;
    let high = -Infinity;
    for (let x = 0; x < y.length; x++) {
        let dy = y[x] - x * slope;
        low = Math.min(low, dy);
        high = Math.max(high, dy);
    }
    return high - low;
}
console.log(solve([1, 2, 3, 4], 2)); // 3
console.log(solve([1, 2, 4, 3], 2)); // 4
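If you also want to see the rearranged list itself, the same two extremes give it to you: shift the line by the midpoint of the least and greatest signed difference. This is only a sketch building on the reasoning above; the function name and the shape of the returned object are my own.

```javascript
// Hypothetical extension of solve(): also returns the rearranged list.
function rearrange(y, slope) {
    let low = Infinity;
    let high = -Infinity;
    for (let x = 0; x < y.length; x++) {
        const dy = y[x] - x * slope; // signed difference with the line through the origin
        low = Math.min(low, dy);
        high = Math.max(high, dy);
    }
    const offset = (low + high) / 2; // vertical shift that balances the two extremes
    const rearranged = y.map((_, x) => offset + x * slope);
    return { rearranged, maxDiff: (high - low) / 2 };
}

console.log(rearrange([1, 2, 3, 4], 2)); // rearranged: [-0.5, 1.5, 3.5, 5.5], maxDiff: 1.5
console.log(rearrange([1, 2, 4, 3], 2)); // rearranged: [-1, 1, 3, 5], maxDiff: 2
```

Both results match the rearranged lists given in the question's test cases.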

Number of ways to pick the elements of an array?

How to formulate this problem in code?
Problem Statement:
UPDATED:
Find the number of ways to pick the elements of the array that have not been visited.
We start with the elements 1, 2, ..., n, of which some number x (1 <= x <= n) have already been picked/visited; these are given in the input.
Now we need to find the number of ways to pick the remaining (n - x) elements, where picking an element is defined as follows:
On every turn, we can only pick an element that is adjacent (either left or right) to some visited element. For example, in the array 1, 2, 3, 4, 5, 6, let's say we have visited 3 and 6; then we can now pick 2, 4 or 5, as they are unvisited and adjacent to visited elements. Say we pick 2; now we can pick 1, 4 or 5, and so on.
example:
input: N = 6(number of elements: 1, 2, 3, 4, 5, 6)
M = 2(number of visited elements)
visited elements are = 1, 5
Output: 16(number of ways we can pick the unvisited elements)
ways: 4, 6, 2, 3
4, 6, 3, 2
4, 2, 3, 6
4, 2, 6, 3
4, 3, 2, 6
4, 3, 6, 2
6, 4, 2, 3
6, 4, 3, 2
6, 2, 3, 4
6, 2, 4, 3
2, 6, 4, 3
2, 6, 3, 4
2, 4, 6, 3
2, 4, 3, 6
2, 3, 4, 6
2, 3, 6, 4.
Some analysis of the problem:
The actual values in the input array are assumed to be 1...n, but these values do not really play a role. These values just represent indexes that are referenced by the other input array, which lists the visited indexes (1-based)
The list of visited indexes actually cuts the main array into subarrays with smaller sizes. So for example, when n=6 and visited=[1,5], then the original array [1,2,3,4,5,6] is cut into [2,3,4] and [6]. So it cuts it into sizes 3 and 1. At this point the index numbering loses its purpose, so the problem really is fully described with those two sizes: 3 and 1. To illustrate, the solution for (n=6, visited=[1,5]) is necessarily the same as for (n=7, visited=[1,2,6]): the sizes into which the original array is cut are the same in both cases.
Algorithm, based on a list of sizes of subarrays (see above):
The number of ways that one such subarray can be visited is not that difficult to determine: if the subarray's size is 1, there is just one way. If it is greater, then at each pick there are two possibilities: either you pick from the left side or from the right side, except for the last remaining item, where there is no choice. So you get 2*2*...*2*1 possibilities to pick. This is 2^(size-1) possibilities.
The two outer subarrays (those that touch the start or the end of the main array, if there are any) are an exception to this, as you can only pick items from the inside out, so for those the number of ways to visit such a subarray is just 1.
The number of ways that you can pick items from two subarrays can be determined as follows: count the number of ways to pick from just one of those subarrays, and the number of ways to pick from the other one. Then consider that you can alternate when to pick from one subarray or from the other. This comes down to interweaving the two subarrays. Let's say the larger of the two subarrays has j elements, and the smaller k; then there are j+1 positions where an element from the smaller subarray can be injected (merged) into the larger one. There are "j+1 multichoose k" ways to inject all elements from the smaller subarray, which is the same as the number of combinations C(j+k, k).
When you have counted the number of ways to merge two subarrays, you actually have an array with a size that is the sum of those two sizes. The above logic can then be applied with this array and the next subarray in the problem specification. The number of ways just multiplies as you merge more subarrays into this growing array. Of course, you don't really deal with the arrays, just with sizes.
Here is an implementation in JavaScript, which applies the above algorithm:
function getSubArraySizes(n, visited) {
    // Translate the problem into a list of sizes (of subarrays).
    // Assumes visited is sorted in ascending order (1-based indexes).
    // The first and last entries are always the two outer chunks, even when
    // they are empty (size 0), so the caller can tell outer from inner chunks.
    let j = 0;
    let sizes = [];
    for (let i of visited) {
        let size = i - j - 1;
        if (size > 0 || j == 0) sizes.push(size);
        j = i;
    }
    sizes.push(n - j);
    return sizes;
}
function Combi(n, k) {
    // Count combinations: "from n, take k"
    // See Wikipedia on "Combination"
    let c = 1;
    let end = Math.min(k, n - k);
    for (let i = 0; i < end; i++) {
        c = c * (n-i) / (i+1); // After each step c is a binomial coefficient, so it stays an integer
    }
    return c;
}
function getPickCount(sizes) {
    // Main function, based on a list of sizes of subarrays
    // (first and last entries are the outer chunks, possibly of size 0)
    let count = 0;
    let result = 1;
    for (let i = 0; i < sizes.length; i++) {
        let size = sizes[i];
        // Number of ways to take items from this chunk:
        // - when items can only be taken from one side (outer chunk): 1
        // - otherwise: every time we have a choice between 2, except for the last remaining item
        let pickCount = i == 0 || i == sizes.length-1 ? 1 : 2 ** (size-1);
        // Number of ways to merge/weave two arrays, where relative order of elements is not changed
        // = a "k multichoose from n". See
        // https://en.wikipedia.org/wiki/Combination#Number_of_combinations_with_repetition
        let weaveCount = count == 0 ? 1 // Nothing to weave with yet
            : Combi(size+count, Math.min(count, size));
        // Number of possibilities:
        result *= pickCount * weaveCount;
        // Update the size to be the size of the merged/woven array
        count += size;
    }
    return result;
}
// Demo with the example input (n = 6, visited = 1 and 5)
let result = getPickCount(getSubArraySizes(6, [1, 5]));
console.log(result); // 16
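For small inputs it is easy to double-check any counting formula with a brute-force enumeration that simply follows the rules of the problem statement. This is a sketch for verification only (the function name is made up, and the running time is exponential):

```javascript
// Hypothetical checker: counts the valid pick orders by trying all of them.
function countWaysBruteForce(n, visited) {
    const seen = new Array(n + 2).fill(false); // 1-based, with padding at 0 and n + 1
    for (const v of visited) seen[v] = true;
    const count = (remaining) => {
        if (remaining === 0) return 1; // everything has been picked: one complete order
        let ways = 0;
        for (let i = 1; i <= n; i++) {
            // An element can be picked if it is unvisited and next to a visited one
            if (!seen[i] && (seen[i - 1] || seen[i + 1])) {
                seen[i] = true;
                ways += count(remaining - 1);
                seen[i] = false; // backtrack
            }
        }
        return ways;
    };
    return count(n - visited.length);
}

console.log(countWaysBruteForce(6, [1, 5])); // 16
console.log(countWaysBruteForce(7, [1, 2, 6])); // 16
```

Both calls agree with the expected output of 16 from the question, and with the observation that only the chunk sizes matter.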

Minimum common remainder of division

I have n pairs of numbers: ( p[1], s[1] ), ( p[2], s[2] ), ... , ( p[n], s[n] )
Where p[i] is integer greater than 1; s[i] is integer : 0 <= s[i] < p[i]
Is there any way to determine minimum positive integer a , such that for each pair :
( s[i] + a ) mod p[i] != 0
Anything better than brute force ?
It is possible to do better than brute force. Brute force would be O(A·n), where A is the minimum valid value for a that we are looking for.
The approach described below uses a min-heap and achieves O(n·log(n) + A·log(n)) time complexity.
First, notice that replacing a with a value of the form (p[i] - s[i]) + k * p[i] leads to a remainder equal to zero in the ith pair, for any non-negative integer k. Thus, the numbers of that form are invalid a values (the solution that we are looking for is different from all of them).
The proposed algorithm is an efficient way to generate the numbers of that form (for all i and k), i.e. the invalid values for a, in increasing order. As soon as the current value differs from the previous one by more than 1, it means that there was a valid a in-between.
The pseudocode below details this approach.
1. construct a min-heap from all the following pairs (p[i] - s[i], p[i]),
   where the heap comparator is based on the first element of the pairs.
2. a0 = 0; maxA = lcm(p[i])
3. Repeat
     3a. Retrieve and remove the root of the heap, (a, p[i]).
     3b. If a - a0 > 1 then the result is a0 + 1. Exit.
     3c. If a is at least maxA, then no solution exists. Exit.
     3d. Insert into the heap the value (a + p[i], p[i]).
     3e. a0 = a
Remark: it is possible for such an a to not exist. If a valid a is not found below LCM(p[1], p[2], ... p[n]), then it is guaranteed that no valid a exists.
I'll show below an example of how this algorithm works.
Consider the following (p, s) pairs: { (2, 1), (5, 3) }.
The first pair indicates that a should avoid values like 1, 3, 5, 7, ..., whereas the second pair indicates that we should avoid values like 2, 7, 12, 17, ... .
The min-heap initially contains the first element of each sequence (step 1 of the pseudocode), shown in brackets below:
[1], 3, 5, 7, ...
[2], 7, 12, 17, ...
We retrieve and remove the head of the heap, i.e., the minimum of the two bracketed values, which is 1. We add the next element from that sequence to the heap, so the heap now contains the elements 2 and 3:
1, [3], 5, 7, ...
[2], 7, 12, 17, ...
We again retrieve the head of the heap, which this time is the value 2, and add the next element of that sequence to the heap:
1, [3], 5, 7, ...
2, [7], 12, 17, ...
The algorithm continues: next we retrieve the value 3 and add 5 to the heap:
1, 3, [5], 7, ...
2, [7], 12, 17, ...
Finally, now we retrieve value 5. At this point we realize that the value 4 is not among the invalid values for a, thus that is the solution that we are looking for.
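Here is how the heap-based approach might look in JavaScript. Treat it as a sketch: JavaScript has no built-in heap, so a small one is included; the names MinHeap and smallestValidOffset are made up; the previous value starts at 0 because the smallest positive a is wanted; and the LCM is computed with ordinary numbers, so it assumes the p values are small enough not to overflow.

```javascript
// Minimal binary min-heap of [value, step] pairs, ordered by value.
class MinHeap {
  constructor() { this.items = []; }
  get size() { return this.items.length; }
  push(entry) {
    const a = this.items;
    a.push(entry);
    let i = a.length - 1;
    while (i > 0) { // sift up
      const parent = (i - 1) >> 1;
      if (a[parent][0] <= a[i][0]) break;
      [a[parent], a[i]] = [a[i], a[parent]];
      i = parent;
    }
  }
  pop() {
    const a = this.items;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      for (;;) { // sift down
        const l = 2 * i + 1, r = l + 1;
        let m = i;
        if (l < a.length && a[l][0] < a[m][0]) m = l;
        if (r < a.length && a[r][0] < a[m][0]) m = r;
        if (m === i) break;
        [a[m], a[i]] = [a[i], a[m]];
        i = m;
      }
    }
    return top;
  }
}

const gcd = (x, y) => (y === 0 ? x : gcd(y, x % y));
const lcm = (x, y) => (x / gcd(x, y)) * y;

// pairs: array of [p, s] with p > 1 and 0 <= s < p.
// Returns the smallest positive a with (s + a) % p !== 0 for every pair, or null.
function smallestValidOffset(pairs) {
  const heap = new MinHeap();
  let maxA = 1;
  for (const [p, s] of pairs) {
    heap.push([p - s, p]); // first invalid value for this pair
    maxA = lcm(maxA, p);
  }
  let previous = 0; // last invalid value seen (0 acts as a sentinel)
  while (heap.size > 0) {
    const [a, p] = heap.pop();
    if (a - previous > 1) return previous + 1; // gap between invalid values
    if (a >= maxA) return null; // a whole period is invalid: no solution
    heap.push([a + p, p]); // next invalid value of the same pair
    previous = a;
  }
  return 1; // no pairs at all: nothing to avoid
}

console.log(smallestValidOffset([[2, 1], [5, 3]])); // 4
console.log(smallestValidOffset([[2, 0], [2, 1]])); // null
```

The first call reproduces the worked example above; the second shows a case where every a is ruled out.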
I can think of two different solutions. First:
p_max = lcm (p[0],p[1],...,p[n]) - 1;
for a = 0 to p_max:
    zero_found = false;
    for i = 0 to n:
        if ( s[i] + a ) mod p[i] == 0:
            zero_found = true;
            break;
    if !zero_found:
        return a;
return -1;
I suppose this is the one you call "brute force". Notice that p_max represents Least Common Multiple of p[i]s - 1 (solution is either in the closed interval [0, p_max], or it does not exist). Complexity of this solution is O(n * p_max) in the worst case (plus the running time for calculating lcm!). There is a better solution regarding the time complexity, but it uses an additional binary array - classical time-space tradeoff. Its idea is similar to the Sieve of Eratosthenes, but for remainders instead of primes :)
p_max = lcm (p[0],p[1],...,p[n]) - 1;
int remainders[p_max + 1] = {0};
for i = 0 to n:
    // first invalid a for this pair: 0 when s[i] is 0, otherwise p[i] - s[i]
    int rem = (s[i] == 0) ? 0 : s[i] - p[i];
    while rem >= -p_max:
        remainders[-rem] = 1;
        rem -= p[i];
for i = 0 to p_max:
    if !remainders[i]:
        return i;
return -1;
Explanation of the algorithm: first, we create an array remainders that will indicate whether a certain negative remainder exists in the whole set. What is a negative remainder? It's simple: notice that 6 = 2 mod 4 is equivalent to 6 = -2 mod 4. If remainders[i] == 1, it means that if we add i to one of the s[j], we will get a multiple of p[j] (which is 0 mod p[j], and that is what we want to avoid). The array is populated with all possible negative remainders, up to -p_max. Now all we have to do is search for the first i such that remainders[i] == 0 and return it, if it exists. Notice that the solution does not have to exist. In the problem text, you have indicated that you are searching for the minimum positive integer; I don't see why zero would not fit (if all s[i] are positive). However, if that is a strong requirement, just change the for loop to start from 1 instead of 0, and increment p_max.
The complexity of this algorithm is n + sum (p_max / p[i]) = n + p_max * sum (1 / p[i]), where i goes from 0 to n. Since all p[i]s are at least 2, that is asymptotically better than the brute force solution.
An example for better understanding: suppose that the input is (5,4), (5,1), (2,0). p_max is lcm(5,5,2) - 1 = 10 - 1 = 9, so we create array with 10 elements, initially filled with zeros. Now let's proceed pair by pair:
from the first pair, we have remainders[1] = 1 and remainders[6] = 1
second pair gives remainders[4] = 1 and remainders[9] = 1
last pair gives remainders[0] = 1, remainders[2] = 1, remainders[4] = 1, remainders[6] = 1 and remainders[8] = 1.
Therefore, first index with zero value in the array is 3, which is a desired solution.
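The sieve idea can be sketched in JavaScript as follows. The function name is made up, the search starts at 1 because the question asks for a positive a (start at 0 if zero is acceptable), and the array has one entry per value up to the LCM, so this assumes the LCM is small enough to fit in memory.

```javascript
const gcd = (x, y) => (y === 0 ? x : gcd(y, x % y));
const lcm = (x, y) => (x / gcd(x, y)) * y;

// pairs: array of [p, s] with p > 1 and 0 <= s < p.
// Returns the smallest positive a with (s + a) % p !== 0 for every pair, or null.
function smallestValidOffsetSieve(pairs) {
  const period = pairs.reduce((acc, [p]) => lcm(acc, p), 1);
  const invalid = new Uint8Array(period + 1); // invalid[a] === 1 means a is ruled out
  for (const [p, s] of pairs) {
    // p - s is the first positive a with (s + a) % p === 0; every p-th value after it too
    for (let a = p - s; a <= period; a += p) invalid[a] = 1;
  }
  for (let a = 1; a <= period; a++) {
    if (!invalid[a]) return a;
  }
  return null; // every value in a full period is ruled out
}

console.log(smallestValidOffsetSieve([[5, 4], [5, 1], [2, 0]])); // 3
```

This gives 3 for the example input (5,4), (5,1), (2,0), the same answer as the walkthrough above.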

(Any Language) Find all permutations of elements in a vector using swapping

I was asked this question in a Lab session today.
We can imagine a vector containing the elements 1 ... N - 1, with a length N. Is there an algorithmic (systematic) method of generating all permutations, or orders, of the elements in the vector? One proposed method was to swap random elements. Obviously this would work provided all previously generated permutations were stored for future reference; however, this is a very inefficient method, both space-wise and time-wise.
The reason for doing this by the way is to remove special elements (eg elements which are zero) from special positions in the vector, where such an element is not allowed. Therefore the random method isn't quite so ridiculous, but imagine the case where the number of elements is large and the number of possible permutations (which are such that there are no "special elements" in any of the "special positions") is low.
We tried to work through this problem for the case of N = 5:
x = [1, 2, 3, 4, 5]
First, swap elements 4 and 5:
x = [1, 2, 3, 5, 4]
Then swap 3 and 5:
x = [1, 2, 4, 5, 3]
Then 3 and 4:
x = [1, 2, 5, 4, 3]
Originally we thought using two indices, ix and jx, might be a possible solution. Something like:
ix = 0;
jx = 0;
for(;;)
{
    ++ ix;
    if(ix >= N)
    {
        ix = 0;
        ++ jx;
        if(jx >= N)
        {
            break; // We have got to an exit condition, but HAVEN'T got all permutations
        }
    }
    swap elements at positions ix and jx
    print out the elements
}
This works for the case where N = 3. However it doesn't work for higher N. We think that this sort of approach might be along the right lines. We were trying to extend to a method where 3 indexes are used, for some reason we think that might be the solution: Using a 3rd index to mark a position in the vector where the index ix starts or ends. But we got stuck, and decided to ask the SO community for advice.
One way to do this is, for the first element e:
    First recurse on the next element
    Then, for each element e2 after e:
        Swap e and e2
        Then recurse on the next element
        And undo the swap
Pseudo-code:
permutation(input, 0)

permutation(char[] array, int start)
    if (start == array.length)
        print array
    for (int i = start; i < array.length; i++)
        swap(array[start], array[i])
        permutation(array, start+1)
        swap(array[start], array[i])
With the main call of this function, it will try each character in the first position and then recurse. Simply looping over all the characters works here because we undo each swap afterwards, so after the recursive call returns, we're guaranteed to be back where we started.
And then, for each of those recursive calls, it tries each remaining character in the second position. And so on.
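A runnable version of the pseudo-code, sketched in JavaScript. It collects the permutations in an array instead of printing them; the function name and the filter at the end are only for illustration.

```javascript
// Generates all permutations of `items` by swapping in place and undoing each swap.
function permutations(items) {
  const arr = [...items]; // work on a copy so the input is left untouched
  const result = [];
  const permute = (start) => {
    if (start === arr.length) {
      result.push([...arr]); // snapshot of the current order
      return;
    }
    for (let i = start; i < arr.length; i++) {
      [arr[start], arr[i]] = [arr[i], arr[start]]; // put element i at position start
      permute(start + 1);
      [arr[start], arr[i]] = [arr[i], arr[start]]; // undo the swap
    }
  };
  permute(0);
  return result;
}

console.log(permutations([1, 2, 3]).map((p) => p.join("")));
// [ '123', '132', '213', '231', '321', '312' ]

// Keeping only the orders where a "special" element (0) is not in a "special" position (index 0)
console.log(permutations([0, 1, 2]).filter((p) => p[0] !== 0).length); // 4
```

Filtering after the fact is the simplest way to handle the forbidden positions; if few permutations are valid, you could instead skip a swap inside the loop whenever it would put a special element in a special position.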

Problem coming up with an array function

Let's say I have a non-decreasing sequence of integers: seq = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4 ... ], not guaranteed to have exactly the same number of each integer, but each new value is guaranteed to be exactly 1 greater than the previous one.
Is there a function F that can operate on this sequence whereby F(seq, x) would give me 1 wherever an integer in the sequence equals x, and 0 for all other integers?
For example:
t = [1, 1, 1, 1, 2, 2, 3, 3, 3, 4]
F(t, 2) = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
EDIT: I probably should have made it more clear. Is there a solution where I can do some algebraic operations on the entire array to get the desired result, without iterating over it?
So, I'm wondering if I can do something like: F(t, x) = t op x ?
In Python (t is a numpy.array) it could be:
(t * -1) % x or something...
EDIT2: I found out that the indicator function I(t[i] == x) is acceptable to use as an algebraic operation. Sorry, I did not know about indicator functions.
There's a very simple solution to this that doesn't require most of the restrictions you place upon the domain. Just create a new array of the same size, loop through and test for equality between the element in the array and the value you want to compare against. When they're the same, set the corresponding element in the new array to 1. Otherwise, set it to 0. The actual implementation depends on the language you're working with, but should be fairly simple.
If we do take into account your domain, you can introduce a couple of optimisations. If you start with an array of zeroes, you only need to fill in the ones. You know you don't need to start checking until the (n - 1)th element, where n is the value you're comparing against, because there must be at least one of the numbers 1 to n in increasing order. If you don't have to start at 1, you can still start at (n - start). Similarly, if you haven't come across it at array[n - 1], you can jump n - array[n - 1] more elements. You can repeat this, skipping most of the elements, as much as you need to until you either hit the right value or the end of the list (if it's not in there at all).
After you finish dealing with the value you want, there's no need to check the rest of the array, as you know it'll always be increasing. So you can stop early too.
A simple method (with C# code) is to simply iterate over the sequence and test it, returning either 1 or 0.
foreach (int element in sequence)
    if (element == myValue)
        yield return 1;
    else
        yield return 0;
(Written using LINQ)
sequence.Select(elem => elem == myValue ? 1 : 0);
A dichotomy (binary search) algorithm can quickly locate the range where t[x] = n, so finding that range takes sub-linear time.
Are you asking for a ready-made C++ or Java API, or are you asking for an algorithm? Or is this a homework question?
I see the simple algorithm of scanning the array from start to end and comparing each element: if it equals the value, put 1, else put 0. In any case, to fill the new array you have to access each of its elements at least once, so the overall approach will be O(n).
You can certainly reduce the number of comparisons by starting with a binary search. Once you find the required number, simply go forward and backward looking for the same number.
Here is a java method which returns a new array.
public static int[] sequence(int[] seq, int number)
{
    int[] newSequence = new int[seq.length];
    for ( int index = 0; index < seq.length; index++ )
    {
        if ( seq[index] == number )
        {
            newSequence[index] = 1;
        }
        else
        {
            newSequence[index] = 0;
        }
    }
    return newSequence;
}
I would initialize an array of zeroes, then do a binary search on the sequence to find the first element that fits your criteria, and only start setting 1's from there. As soon as you have a not equal condition, stop.
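A sketch of that idea in JavaScript (the helper names are made up): a binary search finds the first position holding the value, and only the matching run is overwritten with 1's.

```javascript
// Index of the first element >= value in a sorted array (array length if there is none)
const lowerBound = (sorted, value) => {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] < value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
};

const indicator = (sorted, x) => {
  const out = new Array(sorted.length).fill(0); // start with all zeroes
  const start = lowerBound(sorted, x);
  // Set 1's from the first match and stop at the first element that differs
  for (let i = start; i < sorted.length && sorted[i] === x; i++) out[i] = 1;
  return out;
};

console.log(indicator([1, 1, 1, 1, 2, 2, 3, 3, 3, 4], 2));
// [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
```

The search itself is O(log n), but creating the output array is still O(n), so the gain is in the number of comparisons rather than in the overall complexity.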
Here is a way to locate the range in O(log n) (building the output list is still O(n)):
>>> from bisect import bisect
>>> def f(t, n):
...     i = bisect(t, n-1)
...     j = bisect(t, n, lo=i) - i
...     return [0]*i + [1]*j + [0]*(len(t)-j-i)
...
>>> t = [1, 1, 1, 1, 2, 2, 3, 3, 3, 4]
>>> print(f(t, 2))
[0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
