Counting subarrays with sum in range [L, R] - algorithm

I am solving a competitive programming problem, which is stated as follows:
Given n < 10^5 integers a1, a2, a3, ..., an and two values L and R, how many
subarrays are there whose sum of elements lies in the range [L, R]?
Example:
Input:
n = 4, L = 2, R = 4
1 2 3 4
Output: 4
(The four subarrays are [4], [3], [1, 2] and [2], with sums 4, 3, 3 and 2.)
One solution I have is brute force, but O(n^2) is too slow. What data structures or algorithms should I use to solve this problem efficiently?

Compute prefix sums (p[0] = 0, p[1] = a1, p[2] = a1 + a2, ..., p[n] = sum of all numbers).
For a fixed prefix sum p[i], you need to count the prefix sums p[j] such that j < i and p[i] - R <= p[j] <= p[i] - L. This takes O(log n) per prefix sum with a treap or another balanced binary search tree, so O(n log n) overall.
Pseudocode:
treap.add(0)
sum = 0
ans = 0
for i from 1 to n:
    sum += a[i]
    # left: keys < sum - R, right: keys >= sum - R
    left, right = treap.split(sum - R)
    # middle: keys <= sum - L, right: keys > sum - L
    middle, right = right.split(sum - L)
    ans += middle.size()
    treap = merge(left, middle, right)
    treap.add(sum)
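The treap pseudocode above needs a split/merge tree, which Python does not provide out of the box. As a rough sketch of the same prefix-sum counting idea, the version below swaps the treap for a Fenwick (binary indexed) tree over the sorted distinct prefix sums; this works because the whole array is known up front. The name count_subarrays_in_range and the helper names are made up for illustration.

```python
from bisect import bisect_left, bisect_right


def count_subarrays_in_range(a, lo, hi):
    """Count subarrays of a whose sum lies in [lo, hi], in O(n log n)."""
    prefix = [0]
    for value in a:
        prefix.append(prefix[-1] + value)

    # Coordinate compression: rank of v is bisect_left(values, v) + 1.
    values = sorted(set(prefix))
    tree = [0] * (len(values) + 1)

    def add(rank):
        # Record one more prefix sum with this 1-based rank.
        while rank <= len(values):
            tree[rank] += 1
            rank += rank & -rank

    def count_up_to(rank):
        # Number of recorded prefix sums whose rank is <= rank.
        total = 0
        while rank > 0:
            total += tree[rank]
            rank -= rank & -rank
        return total

    answer = 0
    add(bisect_left(values, 0) + 1)  # p[0] = 0
    for s in prefix[1:]:
        # Earlier prefix sums p with s - hi <= p <= s - lo.
        at_most = bisect_right(values, s - lo)  # distinct values <= s - lo
        below = bisect_left(values, s - hi)     # distinct values <  s - hi
        answer += count_up_to(at_most) - count_up_to(below)
        add(bisect_left(values, s) + 1)
    return answer


print(count_subarrays_in_range([1, 2, 3, 4], 2, 4))  # 4, as in the question
```

Unlike the two-pointer approach, this does not require the elements to be positive.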

We can do it in linear time if the array contains positive numbers only.
First build the array p of prefix sums from left to right (p[0] = 0), so that the sum of the elements after position Y up to position X is p[X] - p[Y].
1. Take three pointers X, Y and Z and initialize them to 0.
2. At every step, increase X by 1.
3. While p[X] - p[Y] is greater than R, keep increasing Y.
4. While Z < X and p[X] - p[Z] is greater than or equal to L, keep increasing Z.
5. The valid start positions are now Y, ..., Z - 1, so add Z - Y to the result.
6. If X is less than the length of the array, go to step 2.
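The steps above can be sketched in Python as follows. This is only an illustration under two assumptions: every element is strictly positive and L <= R. It reads "the sum between Y and X" as p[X] - p[Y], so Z stops at the first index whose sum drops below L and each step adds Z - Y. The function name count_subarrays_positive is hypothetical.

```python
def count_subarrays_positive(a, lo, hi):
    """Count subarrays with sum in [lo, hi]; assumes all a[i] > 0 and lo <= hi."""
    # p[k] is the sum of the first k elements.
    p = [0]
    for value in a:
        p.append(p[-1] + value)

    y = z = 0
    result = 0
    for x in range(1, len(p)):
        # Smallest y with p[x] - p[y] <= hi.
        while y < x and p[x] - p[y] > hi:
            y += 1
        # Smallest z with p[x] - p[z] < lo (or z == x if there is none).
        while z < x and p[x] - p[z] >= lo:
            z += 1
        # Starts y, ..., z - 1 give sums inside [lo, hi].
        result += z - y
    return result


print(count_subarrays_positive([1, 2, 3, 4], 2, 4))  # 4
```

Both y and z only ever move forward, which is what makes the loop linear overall.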

Related

Count number of subsequences of A such that every element of the subsequence is divisible by its index (starts from 1)

B is a subsequence of A if and only if we can turn A into B by removing zero or more elements.
A = [1,2,3,4]
B = [1,4] is a subsequence of A (just remove 2 and 3).
B = [4,1] is not a subsequence of A.
Count all subsequences B of A that satisfy this condition: B[i] % i = 0 for every index i of B.
Note that i starts from 1, not 0.
Example :
Input :
5
2 2 1 22 14
Output:
13
All of these 13 subsequences satisfy the condition B[i] % i = 0.
{2},{2,2},{2,22},{2,14},{2},{2,22},{2,14},{1},{1,22},{1,14},{22},{22,14},{14}
My attempt :
The only solution that I could come up with has O(n^2) complexity.
Assuming the maximum element in A is C, the following algorithm has time complexity O(n * sqrt(C)):
1. For every element x in A, find all divisors of x.
2. For every i from 1 to n, find every j such that A[j] is a multiple of i, using the result of step 1.
3. For every i from 1 to n and every j such that A[j] is a multiple of i (using the result of step 2), find the number of subsequences B that have i elements and whose last element is A[j] (dynamic programming).
def find_factors(x):
    """Returns all factors of x"""
    for i in range(1, int(x ** 0.5) + 1):
        if x % i == 0:
            yield i
            if i != x // i:
                yield x // i

def solve(a):
    """Returns the answer for a"""
    n = len(a)
    # b[i] contains every j such that a[j] is a multiple of i+1.
    b = [[] for i in range(n)]
    for i, x in enumerate(a):
        for factor in find_factors(x):
            if factor <= n:
                b[factor - 1].append(i)
    # There are dp[i][j] subsequences of A of length (i+1) ending at index b[i][j]
    dp = [[] for i in range(n)]
    dp[0] = [1] * n
    for i in range(1, n):
        k = x = 0
        for j in b[i]:
            while k < len(b[i - 1]) and b[i - 1][k] < j:
                x += dp[i - 1][k]
                k += 1
            dp[i].append(x)
    return sum(sum(dpi) for dpi in dp)
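For every divisor d of A[i] with 1 < d <= i+1, A[i] can serve as the d-th element of each subsequence of length d-1 counted so far; for d = 1, A[i] starts a new subsequence on its own. Divisors are processed in decreasing order so that A[i] is never appended to a subsequence that already ends with A[i].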
JavaScript code:
function getDivisors(n, max){
  let m = 1;
  const left = [];
  const right = [];
  while (m*m <= n && m <= max){
    if (n % m == 0){
      left.push(m);
      const l = n / m;
      if (l != m && l <= max)
        right.push(l);
    }
    m += 1;
  }
  // Divisors of n that are <= max, in decreasing order
  return right.concat(left.reverse());
}
function f(A){
  // dp[d] = number of valid subsequences of length d found so far
  const dp = [1, ...new Array(A.length).fill(0)];
  let result = 0;
  for (let i=0; i<A.length; i++){
    for (const d of getDivisors(A[i], i+1)){
      result += dp[d-1];
      dp[d] += dp[d-1];
    }
  }
  return result;
}
var A = [2, 2, 1, 22, 14];
console.log(JSON.stringify(A));
console.log(f(A));
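For readers following the Python answer above, here is a sketch of the same one-dimensional DP in Python. It assumes all elements are positive integers; the names divisors_desc and count_divisible_subsequences are made up for this illustration.

```python
from math import isqrt


def divisors_desc(x, limit):
    """Divisors of x that are <= limit, largest first."""
    small, large = [], []
    for m in range(1, isqrt(x) + 1):
        if m > limit:
            break
        if x % m == 0:
            small.append(m)
            q = x // m
            if q != m and q <= limit:
                large.append(q)
    # large is already descending because q shrinks as m grows.
    return large + small[::-1]


def count_divisible_subsequences(a):
    """Count subsequences B of a with B[i] % i == 0 (i is 1-based)."""
    # dp[d] = number of valid subsequences of length d seen so far;
    # dp[0] = 1 stands for the empty subsequence.
    dp = [1] + [0] * len(a)
    total = 0
    for i, x in enumerate(a):
        # x can sit at position d only if d divides x and d <= i + 1.
        for d in divisors_desc(x, i + 1):
            total += dp[d - 1]
            dp[d] += dp[d - 1]
    return total


print(count_divisible_subsequences([2, 2, 1, 22, 14]))  # 13
```

Iterating the divisors from largest to smallest matters: dp[d - 1] must still hold the count from before the current element when dp[d] is updated.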
I believe that, in the general case, we cannot find an algorithm with complexity less than O(n^2).
First, an intuitive explanation:
Let's denote the elements of the array by a1, a2, a3, ..., a_n.
If the element a1 appears in a subsequence, it must be element no. 1.
If the element a2 appears in a subsequence, it can be element no. 1 or 2.
If the element a3 appears in a subsequence, it can be element no. 1, 2 or 3.
...
If the element a_n appears in a subsequence, it can be element no. 1, 2, 3, ..., n.
So to take all the possibilities into account, we have to perform the following tests:
Check if a1 is divisible by 1 (trivial, of course)
Check if a2 is divisible by 1 or 2
Check if a3 is divisible by 1, 2 or 3
...
Check if a_n is divisible by 1, 2, 3, ..., n
All in all we have to perform 1 + 2 + 3 + ... + n = n(n + 1) / 2 tests, which gives a complexity of O(n^2).
Note that the above is somewhat inaccurate, because not all the tests are strictly necessary. For example, if a_i is divisible by 2 and 3 then it must be divisible by 6. Nevertheless, I think this gives a good intuition.
Now for a more formal argument:
Define an array like so:
a1 = 1
a2 = 1× 2
a3 = 1× 2 × 3
...
a_n = 1 × 2 × 3 × ... × n
By the definition, every subsequence is valid.
Now let (m, p) be such that m <= n and p <= n, and change a_m to a_m / p. We can now choose one of two paths:
If we restrict p to be prime, then each tuple (m, p) represents a mandatory test, because the corresponding change in the value of a_m changes the number of valid subsequences. But that requires prime factorization of each number between 1 and n. By the known methods, I don't think we can get a complexity below O(n^2) here.
If we omit the above restriction, then we clearly perform n(n - 1) / 2 tests, which gives a complexity of O(n^2).

Number of pairs in an array that satisfy a bitwise equation

I found this problem in a contest. The question is:
You are given an array of N non-negative integers (A1, A2, A3,..., An) and an integer M. Your task is to find the number of unordered pairs of array elements (X,Y) that satisfy the following bitwise equation:
2 * set_bits(X|Y) = M + set_bits(X ⊕ Y)
Note:
set_bits(n) represents the number of ones in the binary representation of an integer n.
X|Y represents the bitwise OR of integer X and Y.
X ⊕ Y represents the bitwise XOR of integer X and Y.
The unordered pair of array elements is pair (Ai, Aj) where 1 ≤ i < j ≤ N.
Print the number of unordered pairs of array elements that satisfy the above bitwise equation.
Sample Input 1:
N=4 M=2
arr = [3, 0, 4, 5]
Sample Output: 2
2 pairs are (3,0) and (0,5)
Sample Input 2:
N=8 M=2
arr = [3, 0, 4, 5, 6, 8, 1, 8]
Sample Output: 9
Is there any other way except brute force to solve this equation?
A solution with time complexity O(n) exists if the time complexity of set_bits is O(1).
First, we are going to rephrase the condition (the bitwise equation) a bit. Assume a pair of elements (X, Y) is given. Let c_01 represent the number of digits where X is 0 but Y is 1, c_10 represent the number of digits where X is 1 and Y is 0, and c_11 represent the number of digits where both X and Y are 1. For example, when X=5 and Y=1 (X=101, Y=001), c_01 = 0, c_10 = 1, c_11 = 1. Now, the condition can be rewritten as
2 * (c_01 + c_10 + c_11) = M + (c_01 + c_10)
because set_bits(X|Y) is equal to c_01 + c_10 + c_11 and set_bits(X^Y) is equal to c_01 + c_10.
We can reorder the equation into
c_01 + c_10 + 2*c_11 = M
by moving the term on the right to the left side. Now, realize that set_bits(X) = c_10 + c_11. Applying this information on the equation we get
c_01 + c_11 = M - set_bits(X)
Now, also realize that set_bits(Y) = c_01 + c_11. The equation becomes
set_bits(Y) = M - set_bits(X)
or
set_bits(X) + set_bits(Y) = M
The problem has turned into counting the number of pairs such that the number of set bits in the first element plus the number of set bits in the second element is equal to M. This can be done in linear time assuming you can compute set_bits in constant time.
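As a small sketch of that final counting step, the function below groups the elements by their number of set bits and then pairs up the groups whose bit counts add up to M. The name count_pairs is made up for illustration, and bin(x).count("1") stands in for set_bits.

```python
from collections import Counter


def count_pairs(arr, m):
    """Count unordered pairs (X, Y) with set_bits(X) + set_bits(Y) == m."""
    # How many elements have each possible number of set bits.
    by_bits = Counter(bin(x).count("1") for x in arr)

    total = 0
    for bits, count in by_bits.items():
        other = m - bits
        if other < bits:
            continue  # this combination is handled from the smaller side
        if other == bits:
            # Choose two different elements from the same group.
            total += count * (count - 1) // 2
        else:
            total += count * by_bits.get(other, 0)
    return total


print(count_pairs([3, 0, 4, 5], 2))               # 2
print(count_pairs([3, 0, 4, 5, 6, 8, 1, 8], 2))   # 9
```

Since there are only as many distinct bit counts as the word size, the loop over groups is effectively constant, and the whole thing is dominated by the single pass over the array.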

Finding median in merged array of two sorted arrays

Assume we have 2 sorted arrays of integers with sizes n and m. What is the best way to find the median of all m + n numbers?
It's easy to do this in O(log(n) * log(m)) time, but I want to solve this problem in O(log(n) + log(m)) time. Are there any suggestions for how to do this?
Explanation
The key point of this problem is to recursively discard part of A or B at each step by comparing the middle candidates (aMid and bMid) of the remaining parts of A and B:
if (aMid < bMid) Keep [aMid + 1 ... n] and [bLeft ... m]
else Keep [bMid + 1 ... m] and [aLeft ... n]
// where n and m are the lengths of arrays A and B
The code is as follows; its time complexity is O(log(m + n)):
public double findMedianSortedArrays(int[] A, int[] B) {
    int m = A.length, n = B.length;
    int l = (m + n + 1) / 2;
    int r = (m + n + 2) / 2;
    return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
}

public double getkth(int[] A, int aStart, int[] B, int bStart, int k) {
    if (aStart > A.length - 1) return B[bStart + k - 1];
    if (bStart > B.length - 1) return A[aStart + k - 1];
    if (k == 1) return Math.min(A[aStart], B[bStart]);

    int aMid = Integer.MAX_VALUE, bMid = Integer.MAX_VALUE;
    if (aStart + k/2 - 1 < A.length) aMid = A[aStart + k/2 - 1];
    if (bStart + k/2 - 1 < B.length) bMid = B[bStart + k/2 - 1];

    if (aMid < bMid)
        return getkth(A, aStart + k / 2, B, bStart, k - k / 2); // Check: aRight + bLeft
    else
        return getkth(A, aStart, B, bStart + k / 2, k - k / 2); // Check: bRight + aLeft
}
Hope it helps! Let me know if you need more explanation on any part.
Here's a very good solution I found in Java on Stack Overflow. It finds the Kth and (K+1)th smallest items in the two arrays, where K is the center of the merged array.
If you have a function for finding the Kth item of two arrays, then finding the median of the two is easy:
Calculate the average of the Kth and (K+1)th items of X and Y.
But then you'll need a way to find the Kth item of two lists (remember we're one-indexing now):
1. If X contains zero items, then the Kth smallest item of X and Y is the Kth smallest item of Y (and likewise with X and Y swapped).
2. Otherwise, if K == 1, then the smallest item of X and Y is the smaller of the first item of X and the first item of Y.
3. Otherwise:
i. Let A be min(length(X), K / 2)
ii. Let B be min(length(Y), K / 2)
iii. If X[A] > Y[B], recurse from step 1 with X, Y' (all elements of Y after the first B) and K' = K - B; otherwise recurse with X' (all elements of X after the first A), Y and K' = K - A.
If I find the time tomorrow, I will verify in Python that this algorithm works as stated and provide example source code; it may have some off-by-one errors as-is.
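A possible Python rendering of the Kth-item procedure is sketched below. It is not the Java code that answer refers to; it is an iterative variant that uses K == 1 as the base case and offsets i and j instead of slicing, and the names kth_smallest and median_of_two are made up for illustration. It assumes both lists are sorted, that at least one is non-empty, and that 1 <= k <= len(x) + len(y).

```python
def kth_smallest(x, y, k):
    """Return the k-th smallest (1-based) item of sorted lists x and y combined."""
    i = j = 0  # how many leading items of x and y have been discarded
    while True:
        if i == len(x):
            return y[j + k - 1]
        if j == len(y):
            return x[i + k - 1]
        if k == 1:
            return min(x[i], y[j])
        # Look k // 2 items ahead in each list (or as far as it goes).
        a = min(len(x) - i, k // 2)
        b = min(len(y) - j, k // 2)
        if x[i + a - 1] > y[j + b - 1]:
            # The first b remaining items of y all rank below k: drop them.
            j += b
            k -= b
        else:
            # The first a remaining items of x all rank below k: drop them.
            i += a
            k -= a


def median_of_two(x, y):
    """Median of the merged lists, averaging the two middle items if needed."""
    total = len(x) + len(y)
    low = kth_smallest(x, y, (total + 1) // 2)
    high = kth_smallest(x, y, (total + 2) // 2)
    return (low + high) / 2


print(median_of_two([1, 3], [2]))     # 2.0
print(median_of_two([1, 2], [3, 4]))  # 2.5
```

Each iteration removes about k / 2 items from consideration, so the loop runs O(log k) times.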
Take the median element in list A and call it a. Compare a to the center elements in list B; let's call them b1 and b2 (if B has odd length, then exactly where you split B depends on your definition of the median of an even-length list, but the procedure is almost identical regardless). If b1 ≤ a ≤ b2, then a is the median of the merged array. This can be done in constant time since it requires exactly two comparisons.
If a is greater than b2 then we add the top half of A to the top of B and repeat. B will no longer be sorted, but it doesn't matter. If a is less than b1 then we add the bottom half of A to the bottom of B and repeat. These will iterate log(n) times at most (if the median is found sooner then stop, of course).
It is possible that this will not find the median. If this is the case then the median is in B. If so, perform the same algorithm with A and B reversed. This will require log(m) iterations. In total you will have performed at most 2*(log(n)+log(m)) iterations of a constant time operation, so you have solved the problem in order log(n)+log(m) time.
This is essentially the same answer as was given by iehrlich, but written out more explicitly.
Yes, this can be done. Given two arrays, A and B, in the worst-case scenario you have to first perform a binary search in A, and then, if it fails, binary search in B looking for the median. On each step of a binary search, you check if the current element is actually a median of a merged A+B array. Such check takes constant time.
Let's see why such check is constant. For simplicity, let's assume that |A| + |B| is an odd number, and that all numbers in both arrays are different. You can remove these restrictions later by applying the usual median definition approach (i.e., how to calculate the median of an array containing duplicates, or of an array with even length). Anyway, given that, we know for sure, that in the merged array there will be (|A| + |B| - 1) / 2 elements to the right and to the left of an actual median. In the process of a binary search in A, we know the index of current element x in array A (let it be i). Now, if x satisfies the condition B[j] < x < B[j+1], where i + j == (|A| + |B| - 1) / 2, then x is your median.
The overall complexity is O(log(max(|A|, |B|))) time and O(1) memory.
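The constant-time check can be sketched as follows, under the same simplifying assumptions as in the answer: |A| + |B| is odd and all numbers are distinct. Indices here are 0-based, so the condition becomes b[j - 1] < a[i] < b[j] with i + j equal to the number of elements that must lie to the left of the median. The function names are hypothetical.

```python
def find_median_in(a, b):
    """Binary-search sorted list a for the median of a + b.

    Returns the median if it is an element of a, otherwise None.
    Assumes len(a) + len(b) is odd and all values are distinct.
    """
    half = (len(a) + len(b) - 1) // 2  # elements on each side of the median
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        i = (lo + hi) // 2
        j = half - i  # how many elements of b must be smaller than a[i]
        if j < 0:
            hi = i - 1          # a[i] already has too many smaller elements
        elif j > len(b):
            lo = i + 1          # b cannot supply enough smaller elements
        elif j > 0 and b[j - 1] > a[i]:
            lo = i + 1          # fewer than j elements of b are smaller
        elif j < len(b) and b[j] < a[i]:
            hi = i - 1          # more than j elements of b are smaller
        else:
            return a[i]
    return None


def median_distinct_odd(a, b):
    """Try a first; if the median is not there, it must be in b."""
    found = find_median_in(a, b)
    return found if found is not None else find_median_in(b, a)


print(median_distinct_odd([1, 5, 9], [2, 3]))  # 3
```

Each probe does a constant number of comparisons, and the two searches together take O(log|A| + log|B|) steps.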

Given k sorted numbers, what is the minimum cost to turn them into consecutive numbers?

Suppose, we are given a sorted list of k numbers. Now, we want to convert this sorted list into a list having consecutive numbers. The only operation allowed is that we can increase/decrease a number by one. Performing every such operation will result in increasing the total cost by one.
Now, how to minimize the total cost while converting the list as mentioned?
One idea that I have is to get the median of the sorted list and arrange the numbers around the median. After that just add the absolute difference between the corresponding numbers in the newly created list and the original list. But, this is just an intuitive method. I don't have any proof of it.
P.S.:
Here's an example-
Sorted list: -96, -75, -53, -24.
We can convert this list into a consecutive list by various methods.
The optimal one is: -61, -60, -59, -58
Cost: 90
This is a sub-part of a problem from Topcoder.
Let's assume that the solution is in increasing order and m, M are the minimum and maximum value of the sorted list. The other case will be handled the same way.
Each solution is defined by the number assigned to the first element. If this number is very small, then increasing it by one will reduce the cost. We can continue increasing this number until the cost starts to grow. From that point on, the cost keeps growing. So the optimum is a local minimum and we can find it using binary search. The range we are going to search is [m - n, M + n], where n is the number of elements:
l = [-96, -75, -53, -24]

# Cost if initial value is x
def cost(l, x):
    return sum(abs(i - v) for i, v in enumerate(l, x))

def find(l):
    a, b = l[0] - len(l), l[-1] + len(l)
    while a < b:
        m = (a + b) // 2
        if cost(l, m + 1) >= cost(l, m) <= cost(l, m - 1): # Local minimum
            return m
        if cost(l, m + 1) < cost(l, m):
            a = m + 1
        else:
            b = m - 1
    return b
Testing:
>>> initial = find(l)
>>> list(range(initial, initial + len(l)))
[-60, -59, -58, -57]
>>> cost(l, initial)
90
Here is a simple solution:
Let's assume that the target numbers are x, x + 1, ..., x + n - 1. Then the cost is the sum over i = 0 ... n - 1 of abs(a[i] - (x + i)). Let's call it f(x).
f(x) is piecewise linear and convex, and it approaches infinity as x approaches +infinity or -infinity. This means that its minimum is reached at one of the breakpoints where the slope changes.
The breakpoints are a[0], a[1] - 1, a[2] - 2, ..., a[n - 1] - (n - 1). So we can just try all of them and pick the best.
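A direct sketch of "try every candidate and pick the best" is shown below. It is the straightforward O(n^2) version, kept short for clarity, and the name min_cost_consecutive is made up for illustration. It only considers increasing targets x, x + 1, ..., x + n - 1.

```python
def min_cost_consecutive(a):
    """Return (x, cost) for the cheapest targets x, x + 1, ..., x + n - 1."""

    def cost(x):
        # Total number of +1/-1 steps to move a[i] onto x + i.
        return sum(abs(value - (x + i)) for i, value in enumerate(a))

    # Candidate starting values: a[i] - i for every i.
    candidates = [value - i for i, value in enumerate(a)]
    best = min(candidates, key=cost)
    return best, cost(best)


start, total = min_cost_consecutive([-96, -75, -53, -24])
print(total)  # 90, matching the cost quoted in the question
```

Several starting values can share the minimum cost, so the returned x may differ from the one found by the binary search while the cost stays the same.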

Sum of continuous sequences

Given an array A with N elements, I want to find the sum of the minimum elements of all possible contiguous subsequences of A. I know that if N is small we can enumerate all possible subsequences, but since N is up to 10^5, what is the best way to find this sum?
Example: Let N = 3 and A = [1, 2, 3]. Then the answer is 10, because the possible contiguous subsequences are {(1), (2), (3), (1,2), (1,2,3), (2,3)}, so the sum of minimum elements = 1 + 2 + 3 + 1 + 1 + 2 = 10.
Let's fix one element a[i]. We want to know the position L of the rightmost element smaller than a[i] located to the left of i (L = -1 if there is none). We also need to know the position R of the leftmost element smaller than a[i] located to the right of i (R = n if there is none).
If we know L and R, we should add (i - L) * (R - i) * a[i] to the answer, because a[i] is the minimum of exactly (i - L) * (R - i) contiguous subsequences.
It is possible to precompute L and R for all i in linear time using a stack. Pseudocode:
s = new Stack
L = new int[n]
fill(L, -1)
for i <- 0 ... n - 1:
    while !s.isEmpty() && s.top().first > a[i]:
        s.pop()
    if !s.isEmpty():
        L[i] = s.top().second
    s.push(pair(a[i], i))
We can reverse the array and run the same algorithm to find R.
How to deal with equal elements? Let's assume that a[i] is a pair <a[i], i>. All elements are distinct now.
The time complexity is O(n).
Here is the full pseudocode (I assume here that int can hold any integer value; in real code you should choose a suitable type to avoid overflow. I also assume that all elements are distinct):
int[] getLeftSmallerElementPositions(int[] a):
    n = length(a)
    s = new Stack
    L = new int[n]
    fill(L, -1)
    for i <- 0 ... n - 1:
        while !s.isEmpty() && s.top().first > a[i]:
            s.pop()
        if !s.isEmpty():
            L[i] = s.top().second
        s.push(pair(a[i], i))
    return L

int[] getRightSmallerElementPositions(int[] a):
    n = length(a)
    R = getLeftSmallerElementPositions(reversed(a))
    # Map positions in the reversed array back to the original array
    for i <- 0 ... n - 1:
        R[i] = n - 1 - R[i]
    return reversed(R)

int findSum(int[] a):
    n = length(a)
    L = getLeftSmallerElementPositions(a)
    R = getRightSmallerElementPositions(a)
    int res = 0
    for i <- 0 ... n - 1:
        res += (i - L[i]) * (R[i] - i) * a[i]
    return res
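The pseudocode above translates fairly directly into Python. The sketch below is one possible rendering, not the answer's exact code: it computes both boundaries in a single pass and breaks ties between equal elements by treating an equal element on the left as smaller, which is one way to realize the pair <a[i], i> idea. The function name is made up for illustration.

```python
def sum_of_subarray_minimums(a):
    """Sum of min(a[l..r]) over all contiguous ranges, in O(n)."""
    n = len(a)
    left = [-1] * n   # nearest index to the left with a value <= a[i]
    right = [n] * n   # nearest index to the right with a value <  a[i]
    stack = []        # indices whose values are non-decreasing

    for i, value in enumerate(a):
        # Anything strictly larger than value ends its reach at i.
        while stack and a[stack[-1]] > value:
            right[stack.pop()] = i
        if stack:
            left[i] = stack[-1]
        stack.append(i)

    # a[i] is the (leftmost) minimum of (i - left[i]) * (right[i] - i) ranges.
    return sum((i - left[i]) * (right[i] - i) * value
               for i, value in enumerate(a))


print(sum_of_subarray_minimums([1, 2, 3]))  # 10, as in the question
```

Python integers do not overflow, so the remark about choosing a wide enough type only applies when porting this to a fixed-width language.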
If the list is sorted in ascending order, you can consider all contiguous subsequences of size 1, then 2, then 3, up to N. The algorithm is initially somewhat inefficient, but an optimized version is below. Here's some pseudocode.
let A = {1, 2, 3}
let total_sum = 0
for set_size <- 1 to N
    total_sum += sum(A[1:N-(set_size-1)])
First, sets with one element:{{1}, {2}, {3}}: sum each of the elements.
Then, sets of two element {{1, 2}, {2, 3}}: sum each element but the last.
Then, sets of three elements {{1, 2, 3}}: sum each element but the last two.
But this algorithm is inefficient. To optimize to O(n), multiply each ith element by N-i and sum (indexing from zero here). The intuition is that the first element is the minimum of N sets, the second element is the minimum of N-1 sets, etc.
I know it's not a Python question, but sometimes code helps:
import numpy as np
A = [1, 2, 3]
# This is [3, 2, 1]
scale = list(range(len(A), 0, -1))
# Take the element-wise product of the vectors, and sum
sum(a*b for (a,b) in zip(A, scale))
# Or just use the dot product
np.dot(A, scale)

Resources