Number of contiguous subarrays satisfying constraints - algorithm

Given an array A, find the number of contiguous subarrays that satisfy this condition:
There is no pair (i, j) in the subarray such that i < j and A[i] mod A[j] = M.
1 <= A[i] <= 100000
My approach: do it the naive way in O(n^2) time, which is too slow.
Can I reduce it to O(n log n)?

This takes O(N) time but requires O(N + M) space:
Scan the array and replace each element with A[i] = A[i] mod M.
Keep a counter array C of size N that tracks, for each element, how many elements before it respect the given condition (basically, are not equal to the current element):
C[i] = the number of elements A[j] such that A[j] != A[i], where j < i
Consider the array:
Original A array: 12, 25, 16, 14, 37, 18, 28, 17, 9, 37
New A array: 0, 1, 4, 2, 1, 6, 4, 5, 9, 1 (A[i] mod M)
Counter array: 0, 1, 2, 3, 2, 3, 3, 4, 5, 4
The counter array can be constructed incrementally based on previous values:
A[j] = A[i] and j
Otherwise C[i] = C[i-1]+1;
When looking for the last occurrence of A[i] before index i, we need to keep a dictionary (A[i], last-index-of-A[i]) for fast lookup.
Since we only keep values of A[i] mod M, an O(M) dictionary will do.
We now just sum up the counter array values:
Number of contiguous subarrays = Sum(C)
In this case we will have 27 contiguous subarrays that respect this condition.
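As a rough illustration, the counter-array idea can be sketched in Python. The rule for a repeated residue is cut off in the text above, so this sketch assumes C[i] = min(C[i-1] + 1, i - j - 1), where j is the last earlier index holding the same residue; that assumption reproduces the counter array in the example. M = 12 is inferred from the residues in the example, and the function name is made up for illustration.

```python
def count_by_distinct_residues(a, m):
    """Build the counter array C and return (C, sum(C)).

    C[i] is the number of elements directly before index i that can share
    a window with a[i] without any residue (mod m) repeating.
    """
    last_index = {}  # residue -> last index where it was seen
    counters = []    # the counter array C
    for i, value in enumerate(a):
        residue = value % m
        c = counters[-1] + 1 if counters else 0
        if residue in last_index:
            # Assumed rule: the window cannot reach back past the
            # previous occurrence of the same residue.
            c = min(c, i - last_index[residue] - 1)
        counters.append(c)
        last_index[residue] = i
    return counters, sum(counters)


counters, total = count_by_distinct_residues(
    [12, 25, 16, 14, 37, 18, 28, 17, 9, 37], 12)
print(counters)  # [0, 1, 2, 3, 2, 3, 3, 4, 5, 4]
print(total)     # 27
```

Note that, like the table above, this counts only windows of at least two elements; add N if single-element subarrays should be included.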

Basically, we need to find all pairs such that i < j and A[i] mod A[j] = M
If a mod b = m, then a mod d = m, where d is a divisor of b and d > m
Hash the numbers in the array. For each number a in the array, check whether a - m or any divisor of a - m is an element of the array at an index greater than the index of a. This enumerates all existing pairs (a, a - m) or (a, d), where d is a divisor of a - m with d > m, in O(n sqrt n).
Clearly, any contiguous sub-arrays satisfying the condition lie between any such pairs. If we aggregate the pairs in an interval tree, as we traverse the array, we can test if we are within a pair in O(log n) time. Once we detect we are overlapping a pair (an interval in the tree), we reset our window to (i + 1, j) (where (i,j) is the interval) and keep counting; we add to the total the largest segments achievable according to the formula, segment * (segment + 1) / 2, subtracted by the previous overlapping count.
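A minimal sketch of the window-reset idea, under simplifying assumptions: the conflicting pairs are found by brute force (O(n^2) worst case) rather than by divisor enumeration, and no interval tree is used because only the nearest conflicting index to the left of each j matters for the count. Single-element subarrays are counted as valid. The function name is hypothetical.

```python
def count_valid_subarrays(a, m):
    """Count contiguous subarrays with no pair i < j where a[i] % a[j] == m.

    Assumes every a[i] >= 1, as in the question.
    """
    count = 0
    start = 0  # leftmost index a valid window ending at j may begin at
    for j in range(len(a)):
        # Look for the nearest i < j (inside the current window) that
        # conflicts with j; indices before `start` are already excluded.
        for i in range(j - 1, start - 1, -1):
            if a[i] % a[j] == m:
                start = i + 1
                break
        count += j - start + 1  # valid subarrays ending at j
    return count


print(count_valid_subarrays([5, 3, 2], 1))  # 4: [5], [3], [2], [5, 3]
```

Replacing the inner loop with the divisor lookup described above is what would bring the cost down from quadratic.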

Related

Find maximum sum of subarray with length less than or equal to k in Python

Given n = 8, p = [2, 5, -7, 8, -6, 4, 1, -9], k = 5. We can select the subarray [2, 5, -7, 8] with sum = 8 and size 4, which is less than k = 5. Hence, the answer is 8. It can be shown that the answer cannot be greater than 8.
Solvable with a deque in O(n). Approximate code, not tested and definitely missing some edge cases, but presents the right idea:
deque = []
s = prefix sums of p, s[i] = p[0] + p[1] + ... + p[i]
for i = 0, n:
    if not deque.empty and i - deque.first > k + 1: # otherwise we would have subarrays longer than k (s[a] - s[b] gives the sum of p[b+1] + ... + p[a])
        deque.pop_first()
    while not deque.empty and s[i] < s[ deque.last ]: # otherwise our queue wouldn't be sorted anymore
        deque.pop_last()
    if s[i] - s[ deque.first ] > global_max:
        global_max = s[i] - s[ deque.first ]
    deque.push_last(i)
The idea is to use the deque to store the index of the minimum prefix sum for the last k positions. That one will contribute to the maximum possible subarray for the current position, and it will be the first in the deque. We keep the deque sorted to make sure that the first element is always the minimum.
Because each element can enter and leave the deque at most once, even if we use nested loops, the complexity is linear.
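For reference, here is a runnable sketch of the same deque idea in Python. It deviates from the approximate code above in a few places, all of which are my own assumptions: the prefix array gets a leading 0 so that subarrays starting at index 0 are covered, each candidate start index is pushed before the front of the deque is read (so the deque is never empty at that point), the subarray must be non-empty, and k >= 1. The function name is made up.

```python
from collections import deque


def max_sum_at_most_k(p, k):
    """Maximum sum of a non-empty contiguous subarray of p with length <= k."""
    n = len(p)
    # prefix[i] = p[0] + ... + p[i-1]; prefix[0] = 0 acts as a sentinel.
    prefix = [0] * (n + 1)
    for i, x in enumerate(p):
        prefix[i + 1] = prefix[i] + x

    best = float("-inf")
    window = deque()  # start indices; their prefix values increase front to back
    for i in range(1, n + 1):
        # Index i - 1 becomes a candidate start for subarrays ending at i.
        while window and prefix[window[-1]] >= prefix[i - 1]:
            window.pop()
        window.append(i - 1)
        # Drop starts that would make the subarray longer than k.
        while window[0] < i - k:
            window.popleft()
        # The front holds the minimum prefix value among allowed starts.
        best = max(best, prefix[i] - prefix[window[0]])
    return best


print(max_sum_at_most_k([2, 5, -7, 8, -6, 4, 1, -9], 5))  # 8
```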

Finding largest sum in an unsorted array using divide and conquer algorithm

I have a sequence of n real numbers stored in an array, A[1], A[2], …, A[n]. I am trying to implement a divide and conquer algorithm to find two numbers A[i] and A[j], where i < j, such that A[i] ≤ A[j] and their sum is the largest.
For example, {2, 5, 9, 3, -2, 7} will give the output 14 (5+9, not 16=9+7). Can anyone suggest some ideas on how to do it?
Thanks in advance.
This problem is not really suited to a divide and conquer approach. It's easy to observe that if (i, j) is a solution for this problem, then A[j] >= A[k] for every k > j, i.e. A[j] is the maximum in A[j..n].
Proof: if there exists a k > j with A[k] > A[j], then (j, k) is a better solution than (i, j), because A[j] <= A[k] and A[j] + A[k] > A[i] + A[j].
So we only need to consider the values of j that satisfy this criterion.
Algorithm (pseudo-code)
maxj = n
for (j = n - 1 down to 1):
    if (a[j] > a[maxj]) then:
        maxj = j
    else:
        check if (j, maxj) is a better solution
Complexity: O(n)
C++ implementation: http://ideone.com/ENp5WR (The implementation uses an integer array, but it should work the same for floats)
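A Python sketch of the same right-to-left scan, in case the link goes stale. It uses 0-based indices, and the function name and the (sum, i, j) return shape are my own choices, not taken from the linked C++ code.

```python
def best_ordered_pair(a):
    """Return (sum, i, j) maximizing a[i] + a[j] with i < j and a[i] <= a[j].

    Returns None when no such pair exists (e.g. a strictly decreasing array).
    """
    best = None
    max_j = len(a) - 1  # index of the maximum of the suffix scanned so far
    for j in range(len(a) - 2, -1, -1):
        if a[j] > a[max_j]:
            # a[j] is a new suffix maximum; it has no valid partner to its right.
            max_j = j
        else:
            candidate = (a[j] + a[max_j], j, max_j)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best


print(best_ordered_pair([2, 5, 9, 3, -2, 7]))  # (14, 1, 2)
print(best_ordered_pair([5, 4, 3, 2, 1]))      # None
```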
Declare two variables. While scanning the array, check whether the current number is bigger than either of the two values currently stored in the variables; if so, replace the smaller one, otherwise continue.
Here's a recursive solution in Python. I wouldn't exactly call it "divide and conquer" but then again, this problem isn't very suited to a divide and conquer approach.
def recurse(lst, pair):  # lst is the remaining list left to process
    if not lst:
        return  # nothing left to process
    for i in lst[1:]:  # for each element in lst starting from index 1
        curr_sum = lst[0] + i
        # update pair when the first value is not greater than the second
        # and curr_sum is greater than the max sum so far
        if lst[0] <= i and curr_sum > pair[0] + pair[1]:
            pair[0] = lst[0]
            pair[1] = i
    recurse(lst[1:], pair)  # recurse on the sub list from index 1 to the end
def find_pair(s):
    if len(s) < 2:
        return s[0]
    pair = [s[0], s[1]]  # initialises pair array
    recurse(s, pair)  # pair is a list, so it is mutated in place
    return pair
Sample output:
s = [2, 5, 9, 3, -2, 7]
find_pair(s) # ============> [5, 9]
I think you can use an O(n) divide-and-conquer algorithm, described as follows
(The merge step takes constant time)
Here is the outline of the algorithm:
Divide the problem into two halves: LHS and RHS
Each half should return the largest answer meeting the requirement in that half AND the largest element in that half
Merge and return the answer to the upper level: the answer is the maximum of LHS's answer, RHS's answer, and the sum of the largest elements of both halves (consider this only if RHS's largest element >= LHS's largest element)
Here is the demonstration of the algorithm using your example: {2, 5, 9, 3, -2, 7}
Divide into {2,5,9}, {3,-2,7}
Divide into {2,5}, {9}, {3,-2}, {7}
{2,5} return max(2,5, 5+2) = 7, largest element = 5
{9} return 9, largest element = 9
{3,-2} return max(3,-2) = 3, largest element = 3
{7} return 7, largest element = 7
{2,5,9} merged from {2,5} & {9}: return max(7,9,9+5) = 14, largest element = max(9,5) = 9
{3,-2,7} merged from {3,-2} & {7}: return max(3,7,7+3) = 10, largest element = max(7,3) = 7
{2,5,9,3,-2,7} merged from {2,5,9} and {3,-2,7}: return max(14,10) = 14, largest element = max(9,7) = 9
ans = 14
Special cases like {5,4,3,2,1}, which yield no answer, need extra handling, but this does not affect the core of the algorithm or its complexity.

Count number of swaps to sort first k-smallest element using a bubble sort like algorithm

Given an array a and an integer k, someone uses the following algorithm to get the first k smallest elements:
cnt = 0
for i in [1, k]:
    for j in [i + 1, n]:
        if a[i] > a[j]:
            swap(a[i], a[j])
            cnt = cnt + 1
The problem is: how to calculate the value of cnt (when we get the final k-sorted array), i.e. the number of swaps, in O(n log n) or better?
Or simply put: calculate the number of swaps needed to get the first k smallest numbers sorted using the above algorithm, in O(n log n) or less.
I am thinking about a binary search tree, but I get confused (How will the array change when i increases? How to calculate the number of swaps for a fixed i? ...).
This is a very good question: it involves inverse pairs, a stack and some proof techniques.
Note 1: All indices used below are 1-based, instead of the traditional 0-based.
Note 2: If you want to see the algorithm directly, please start reading from the bottom.
First we define Inverse Pairs as:
For a[i] and a[j], in which i < j holds, if we have a[i] > a[j], then a[i] and a[j] are called an Inverse Pair.
For example, in the following array:
3 2 1 5 4
a[1] and a[2] form an inverse pair, and a[2] and a[3] form another.
Before we start the analysis, let's define a common language: in the rest of the post, "inverse pair starting from i" means the total number of inverse pairs in which a[i] is the first (left) element.
For example, for a = {3, 1, 2}, inverse pair starting from 1 is 2, and inverse pair starting from 2 is 0.
Now let's look at some facts:
If we have i < j < k, and a[i] > a[k], a[j] > a[k], then swapping a[i] and a[j] (if they are an inverse pair) won't affect the total number of inverse pairs starting from j;
The total number of inverse pairs starting from i may change after a swap (e.g. suppose we have a = {5, 3, 4}; before a[1] is swapped with a[2], the total number of inverse pairs starting from 1 is 2, but after the swap the array becomes a = {3, 5, 4}, and the number of inverse pairs starting from 1 becomes 1);
Given an array A and 2 numbers, a and b, as candidates for the head element of A, if we can form more inverse pairs with a than with b, we have a > b;
Let's denote the total number of inverse pairs starting from i as ip[i]. Then we have: if k is the smallest number that satisfies ip[i] > ip[i + k], then a[i] > a[i + k] while a[i] < a[i + 1 .. i + k - 1] must be true. In words, if ip[i + k] is the first number smaller than ip[i], a[i + k] is also the first number smaller than a[i];
Proof of point 1:
By the definition of an inverse pair, for all a[k], k > j, that form an inverse pair with a[j], a[k] < a[j] must hold. Since a[i] and a[j] are an inverse pair and i < j, we have a[i] > a[j]. Therefore, we have a[i] > a[j] > a[k], which indicates the inverse-pair relationships are not broken.
Proof of point 3:
Left out since it is quite obvious.
Proof of point 4:
First, it's easy to see that when i < j and a[i] > a[j], we have ip[i] >= ip[j] + 1 > ip[j]. Then its contrapositive is also true, i.e. when i < j and ip[i] <= ip[j], we have a[i] <= a[j].
Now back to the point. Since k is the min number to satisfy ip[i] > ip[i + k], then we have ip[i] <= ip[i + 1 .. i + k - 1], which indicates a[i] <= a[i + 1.. i + k - 1] by the lemma we just proved, which also indicates there's no inverse pairs in the region [i + 1, i + k - 1]. Therefore, ip[i] is the same as the number of inverse pairs starting from i + k, but involving a[i]. Given ip[i + k] < ip[i], we know a[i + k] has less inverse pair than a[i] in the region of [i + k + 1, n], which indicates a[i + k] < a[i] (by Point 3).
You can write down some sequences and try out the 4 facts mentioned above and convince yourself or disprove them :P
Now it's about the algorithm.
A naive implementation will take O(nk) to compute the result, and the worst case will be O(n^2) when k = n.
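For reference, the naive procedure from the question can be written as runnable Python. This is just a baseline sketch (0-based indices, hypothetical function name) that is handy for checking any faster method against small inputs.

```python
def count_swaps_naive(a, k):
    """Simulate the selection procedure from the question and count swaps.

    Works on a copy of a; runs in O(n * k).
    """
    a = list(a)
    n = len(a)
    cnt = 0
    for i in range(min(k, n)):
        for j in range(i + 1, n):
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
                cnt += 1
    return cnt


print(count_swaps_naive([2, 5, 3, 4, 1, 6], 2))  # 3
```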
But how about we make use of the facts above:
First we compute ip[i] using Fenwick Tree (see Note 1 below), which takes O(n log n) to construct and O(n log n) to get all ip[i] calculated.
Next, we need to make use of the facts. Since a swap of 2 numbers only affects the current position's inverse pair count but not the values after it (points 1 and 2), we don't need to worry about the value changes. Also, since the nearest smaller number to the right is at the same index in ip and in a, we only need to find the first ip[j] that is smaller than ip[i] in [i + 1, n]. If we denote the number of swaps made at position i as f[i], we have f[i] = f[j] + 1.
But how do we find this "first smaller number" fast? Use a stack! Here is a post which asks a very similar question: Given an array A, compute B s.t. B[i] stores the nearest element to the left of A[i] which is smaller than A[i]
In short, we are able to do this in O(n).
But wait, the post says "to the left" but in our case it's "to the right". The solution is simple: we scan backward in our case, and then everything is the same :D
Therefore, in summary, the total time complexity of the algorithm is O(n log n) + O(n) = O(n log n).
Finally, let's walk through an example (a simplified version of #make_lover's example in the comment):
a = {2, 5, 3, 4, 1, 6}, k = 2
First, let's get the inverse pairs:
ip = {1, 3, 1, 1, 0, 0}
To calculate f[i], we go backward (since we need to use the stack technique):
f[6] = 0, since it's the last one
f[5] = 0, since we could not find any number that is smaller than 0
f[4] = f[5] + 1 = 1, since ip[5] is the first smaller number to the right
f[3] = f[5] + 1 = 1, since ip[5] is the first smaller number to the right
f[2] = f[3] + 1 = 2, since ip[3] is the first smaller number to the right
f[1] = f[5] + 1 = 1, since ip[5] is the first smaller number to the right
Therefore, ans = f[1] + f[2] = 3
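The ip and f values of this example can be reproduced with the sketch below. It only mirrors the steps as described in this answer (a Fenwick tree for ip, then a backward stack scan for f) and is checked against this one example; comparing it with a direct simulation of the swaps on other inputs is advisable before relying on it. Indices are 0-based and both function names are made up.

```python
def inversions_from_each_index(a):
    """ip[i] = number of j > i with a[j] < a[i], using a Fenwick tree."""
    ranks = {v: r + 1 for r, v in enumerate(sorted(set(a)))}  # 1-based ranks
    tree = [0] * (len(ranks) + 1)
    ip = [0] * len(a)
    for i in range(len(a) - 1, -1, -1):
        # Query: how many values already inserted (to the right) have a
        # strictly smaller rank than a[i]?
        r = ranks[a[i]] - 1
        total = 0
        while r > 0:
            total += tree[r]
            r -= r & -r
        ip[i] = total
        # Update: insert a[i] into the tree.
        r = ranks[a[i]]
        while r < len(tree):
            tree[r] += 1
            r += r & -r
    return ip


def chain_lengths(ip):
    """f[i] = f[j] + 1 for the nearest j > i with ip[j] < ip[i], else 0."""
    f = [0] * len(ip)
    stack = []  # indices whose ip values strictly increase bottom to top
    for i in range(len(ip) - 1, -1, -1):
        while stack and ip[stack[-1]] >= ip[i]:
            stack.pop()
        f[i] = f[stack[-1]] + 1 if stack else 0
        stack.append(i)
    return f


a, k = [2, 5, 3, 4, 1, 6], 2
ip = inversions_from_each_index(a)
f = chain_lengths(ip)
print(ip)          # [1, 3, 1, 1, 0, 0]
print(f)           # [1, 2, 1, 1, 0, 0]
print(sum(f[:k]))  # 3
```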
Note 1: Using a Fenwick Tree (Binary Indexed Tree), the inverse pairs can be computed in O(N log N); here is a post on this topic, please have a look :)
Update
Aug/20/2014: There was a critical error in my previous post (thanks to #make_lover), here is the latest update.

count no. of pairs such that absolute difference is less than K

Given an array A of size N, how do I count the number of pairs (A[i], A[j]) such that the absolute difference between them is less than or equal to K, where K is any positive natural number? (i, j <= N and i != j)
My approach:
Sort the array.
Create another array that stores the absolute difference between two consecutive numbers.
Am I heading in the right direction? If yes, then how do I proceed further?
Here is an O(n log n) algorithm:
1. Sort the input.
2. Traverse the sorted array in ascending order.
3. For A[i], find the largest index j such that A[j] <= A[i] + k using binary search.
4. count = count + j - i
5. Repeat steps 3 to 4 for all i.
Time complexity:
Sorting: O(n log n)
Binary search: O(log n) per element
Overall: O(n log n)
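The steps above can be sketched in Python with the standard `bisect` module. The function name is made up, and the sketch assumes k >= 0 and that each unordered pair of positions is counted once.

```python
from bisect import bisect_right


def count_close_pairs(values, k):
    """Count pairs of positions whose values differ by at most k."""
    a = sorted(values)
    count = 0
    for i, x in enumerate(a):
        # Largest index j with a[j] <= x + k; j >= i because k >= 0.
        j = bisect_right(a, x + k) - 1
        count += j - i
    return count


print(count_close_pairs([0, 1, 2, 4, 8, 16], 9))  # 11
```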
This is O(n^2):
Sort the array
For each item_i in array,
    For each item_j in array such that j > i
        If item_j - item_i <= k, print (item_j, item_i)
        Else proceed with the next item_i
Your approach is partially correct. You first sort the array. Then keep two pointers i and j.
1. Initialize i = 0, j = 1.
2. Check if A[j] - A[i] <= K.
    - If yes, then increment j,
    - else
        - increase the count of pairs by C(j-i,2).
        - increment i.
        - if i == j, then increment j.
3. Do this till pointer j goes past the end of the array. Then just add C(j-1,2) to the count and stop.
By i and j, you are basically maintaining a window within which the difference between elements is <= K.
EDIT: This is the basic idea, you will have to check for boundary conditions. Also you will have to keep track of the past interval that was added to the count. You will need to subtract the overlap with the current interval to avoid double counting.
Complexity: O(NlogN), for the sort operation, linear for the array traversal
Once your array is sorted, you can compute the sum in O(N) time.
Here's some code. The O(N) algorithm is pair_sums, and pair_sums_slow is the obviously correct, but O(N^2) algorithm. I run through some test cases at the end to make sure that the two algorithms return the same results.
def pair_sums(A, k):
    A.sort()
    counts = 0
    j = 0
    for i in range(len(A)):
        # advance j to the first index whose value is more than k above A[i]
        while j < len(A) and A[j] - A[i] <= k:
            j += 1
        counts += j - i - 1
    return counts
def pair_sums_slow(A, k):
    counts = 0
    for i in range(len(A)):
        for j in range(i + 1, len(A)):
            # abs() so the check is also right when A is not sorted
            if abs(A[j] - A[i]) <= k:
                counts += 1
    return counts
cases = [
    ([0, 1, 2, 3, 4, 5], 10),
    ([0, 0, 0, 0, 0], 1),
    ([0, 1, 2, 4, 8, 16], 9),
    ([0, -1, -2, 1, 2], 2)
]
for A, k in cases:
    want = pair_sums_slow(A, k)
    got = pair_sums(A, k)
    if want != got:
        print(A, k, want, got)
The idea behind pair_sums is that for each i, we find the smallest j such that A[j] - A[i] > K (or j=N). Then j-i-1 is the number of pairs with i as the first value.
Because the array is sorted, j only ever increases as i increases, so the overall complexity is linear since although there's nested loops the inner operation j+=1 can occur at most N times.

Find a single number in a list when other numbers occur more than twice

The problem is extended from "Finding a single number in a list".
If I extend the problem to this:
What would be the best algorithm for finding a number that occurs only once in a list in which all other numbers occur exactly k times?
Does anyone have a good answer?
For example, A = { 1, 2, 3, 4, 2, 3, 1, 2, 1, 3 }; in this case, k = 3. How can I get the single number "4" in O(n) time and O(1) space?
This works if every element in the array is less than n and greater than 0.
Let the array be a. Traverse the array and, for each a[i], add n to a[(a[i]) % n].
Now traverse the array again; the position i at which a[i] is less than 2*n and greater than n (assuming 1-based indexing) is the answer.
This method won't work if at least one element is greater than n. In that case you have to use the method suggested by Jayram.
EDIT:
To retrieve the original array, just apply mod n to every element in the array.
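A small Python sketch of this index-marking trick, under the assumption that every value v satisfies 0 <= v < n with 0-based indexing (so the count of v accumulates in slot v). The function name is hypothetical. It restores the array afterwards, as the EDIT suggests.

```python
def single_by_index_marking(a):
    """Find the value that occurs exactly once, using O(1) extra space.

    Assumes 0 <= a[i] < len(a). The list is modified during the scan and
    restored before returning.
    """
    n = len(a)
    # Each occurrence of value v adds n to slot v; a[i] % n recovers the
    # original value even after slot i has been incremented.
    for i in range(n):
        a[a[i] % n] += n
    answer = None
    for value in range(n):
        if a[value] // n == 1:  # slot was incremented exactly once
            answer = value
            break
    # Restore the original contents.
    for i in range(n):
        a[i] %= n
    return answer


data = [1, 2, 3, 4, 2, 3, 1, 2, 1, 3]
print(single_by_index_marking(data))  # 4
print(data)  # [1, 2, 3, 4, 2, 3, 1, 2, 1, 3]
```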
This can be solved within your constraints if the numbers other than the lonely number each occur an even number of times (i.e. 2, 4, 6, 8, ...), by XOR-ing all the numbers together.
But for other values of k, doing it in O(1) space is just teasing me.
If you can relax your constraints, you could use these approaches:
Sort the numbers and keep a counter for the current number. If its count is greater than 1, go to the next number, and so on. Space O(1), time O(n log n).
Use O(n) extra memory to count the occurrences of each number. Time O(n), space O(n).
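For the even-count case mentioned at the start of this answer, the XOR trick is a one-liner in Python. A minimal sketch, assuming integer inputs in which every value except one appears an even number of times; the function name is made up.

```python
from functools import reduce
from operator import xor


def single_number_xor(nums):
    """XOR of all values: equal values cancel in pairs, the single one remains."""
    return reduce(xor, nums, 0)


print(single_number_xor([7, 3, 7, 3, 9]))  # 9
```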
I just want to extend #banarun's answer.
Take the input as a map. For example, if a[0] = 1, then store it in myMap with 0 as the index and 1 as the value.
While reading the input, find the maximum number M. Then find a prime P greater than M.
Now iterate through the map and, for every key i of myMap, add P to myMap(myMap(i) % P); if myMap(myMap(i) % P) is not initialized, set it to P. Now iterate through myMap again; the position at which myMap[i] is >= P and < 2*P is your answer. Basically, the idea is to remove the overflow and overwrite problems from the algorithm banarun suggested.
Here is a mechanism which may not be as good as the others but which is instructive and gets to the core of why the XOR answer is as good as it is when k = 2.
1. Represent each number in base k. Suppose there are at most r digits in the representation
2. Add up the right-most ('r'th) digits of all the numbers mod k, then the 'r - 1'st digits (mod k), and so on
3. The final representation of r digits that you have is the answer.
For example, if the array is
A = {1, 2, 3, 4, 2, 3, 1, 2, 1, 3, 5, 4, 4}
Representation in base 3 is
A = {01, 02, 10, 11, 02, 10, 01, 02, 01, 10, 12, 11, 11}
r = 2
Sum of 'r'th place = 2
Sum of the 'r-1'th place = 1
Hence answer = {12} in base 3 which is 5.
This is an answer which will be O(n * r). Note that r is proportional to log n.
Why is the XOR answer in O(n) ? Because the processor provides an XOR operation which is performed in O(1) time rather than the O(r) factor that we have above.
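The digit-sum approach can be sketched in Python as follows. This assumes non-negative integers and k >= 2, keeps one counter per base-k digit (so O(r) extra space rather than strictly O(1)), and uses a made-up function name.

```python
def single_number_base_k(nums, k):
    """Find the value occurring once when all others occur exactly k times.

    Sums each base-k digit position modulo k; digits of values that occur
    k times cancel out, leaving the digits of the single value.
    """
    digit_sums = []  # digit_sums[d]: sum of d-th digits (least significant first) mod k
    for x in nums:
        d = 0
        while x > 0:
            if d == len(digit_sums):
                digit_sums.append(0)
            digit_sums[d] = (digit_sums[d] + x % k) % k
            x //= k
            d += 1
    # Reassemble the number from its base-k digits, most significant first.
    result = 0
    for d in reversed(range(len(digit_sums))):
        result = result * k + digit_sums[d]
    return result


print(single_number_base_k([1, 2, 3, 4, 2, 3, 1, 2, 1, 3, 5, 4, 4], 3))  # 5
```

With k = 2 this does exactly what XOR does, one bit at a time.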
Based on banarun's solution (with small fixes):
Algorithm conditions:
for each i, arr[i] < N (the size of the array)
for each i, arr[i] >= 0 (non-negative)
The Algorithm:
int[] arr = { 1, 2, 3, 4, 2, 3, 1, 2, 1, 3 };
for (int i = 0; i < arr.Length; i++)
{
    arr[(arr[i])%(arr.Length)] += arr.Length;
    if(arr[i] < arr.Length)
        arr[i] = -1;
}
for (int i = 0; i < arr.Length; i++)
{
    if (arr[i] - 3 * arr.Length <0 && arr[i]!=-1)
        Console.WriteLine("single number = "+i);
}
This solution has a time complexity of O(N) and a space complexity of O(1).
Note:
Again, this algorithm works only if all numbers are non-negative and all numbers are less than N.
