Fenwick trees to determine which interval a point falls in - algorithm

Let a0,...,an-1 be a sequence of lengths. We can construct the intervals [0,a0], (a0,a0+a1], (a0+a1,a0+a1+a2], ... I store the sequence a0,...,an-1 in a Fenwick tree.
I ask the question: given a number m, how can I efficiently (log n time) find into which interval m falls?
For example, given the lengths a: 3, 5, 2, 7, 9, 4.
The Fenwick Tree stores 3, 8, 2, 17, 9, 13.
The intervals are [0,3],(3,8],(8,10],(10,17],(17,26],(26,30].
Given the number 9, the algorithm should return the 3rd index of the Fenwick Tree (2 if 0-based arrays are used, 3 if 1-based arrays are used). Given the number 26, the algorithm should return the 5th index of the Fenwick Tree (4 if 0-based arrays are used or 5 if 1-based arrays are used).
Possibly another data structure might be more suited to this operation. I am using Fenwick Trees because of their seeming simplicity and efficiency.

We can get an O(log n)-time search operation. The trick is to integrate the binary search with the prefix sum operation.
def get_total(tree, i):
    # Prefix sum of the first i elements (standard Fenwick query).
    total = 0
    while i > 0:
        total += tree[i - 1]
        i -= i & (-i)
    return total

def search(tree, total):
    # Return the index of the interval containing `total` by descending the
    # Fenwick tree: the binary search is integrated with the prefix-sum structure.
    j = 1
    while j < len(tree):
        j <<= 1
    j >>= 1
    i = -1
    while j > 0:
        if i + j < len(tree) and total > tree[i + j]:
            total -= tree[i + j]
            i += j
        j >>= 1
    return i + 1

tree = [3, 8, 2, 17, 9, 13]
print('Intervals')
for i in range(len(tree)):
    print(get_total(tree, i), get_total(tree, i + 1))
print('Searches')
for total in range(31):
    print(total, search(tree, total))
Output is
Intervals
0 3
3 8
8 10
10 17
17 26
26 30
Searches
0 0
1 0
2 0
3 0
4 1
5 1
6 1
7 1
8 1
9 2
10 2
11 3
12 3
13 3
14 3
15 3
16 3
17 3
18 4
19 4
20 4
21 4
22 4
23 4
24 4
25 4
26 4
27 5
28 5
29 5
30 5
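For completeness, here is a minimal sketch of how the Fenwick array used above could be built from the original lengths with standard point updates (the update helper is my own naming, not part of the answer's code):
def update(tree, i, delta):
    # Standard Fenwick point update: add delta to element i (0-based storage
    # of a 1-based Fenwick tree, matching get_total above).
    i += 1
    while i <= len(tree):
        tree[i - 1] += delta
        i += i & (-i)

lengths = [3, 5, 2, 7, 9, 4]
tree = [0] * len(lengths)
for i, a in enumerate(lengths):
    update(tree, i, a)
print(tree)  # [3, 8, 2, 17, 9, 13]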

If the intervals don't change frequently, you can use a simple binary search in the accumulated array. In Python you can use the bisect module for this. Each query will be O(log n):
import bisect

A = [3, 5, 2, 7, 9, 4]
for i in range(1, len(A)):
    A[i] += A[i-1]

print(bisect.bisect_left(A, 9))   # 2
print(bisect.bisect_left(A, 26))  # 4
If the intervals change, you can use the same idea, but each array lookup will be O(log n), making the query operation O(log² n) overall.
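For reference, a rough sketch of that O(log² n) variant on top of the Fenwick tree from the first answer, reusing its get_total (search_log2 is a name introduced here, not part of either answer):
def search_log2(tree, m):
    # Plain binary search over indices; each probe is one O(log n)
    # Fenwick prefix-sum query, so one search costs O(log^2 n).
    lo, hi = 0, len(tree) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if get_total(tree, mid + 1) >= m:
            hi = mid
        else:
            lo = mid + 1
    return lo

print(search_log2([3, 8, 2, 17, 9, 13], 9))   # 2
print(search_log2([3, 8, 2, 17, 9, 13], 26))  # 4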

Related

How to solve M times prefix sum with better time complexity

The problem is to find the prefix sum of array of length N by repeating the process M times. e.g.
Example N=3
M=4
array = 1 2 3
output = 1 6 21
Explanation:
Step 1 prefix Sum = 1 3 6
Step 2 prefix sum = 1 4 10
Step 3 prefix sum = 1 5 15
Step 4(M) prefix sum = 1 6 21
Example 2:
N=5
M=3
array = 1 2 3 4 5
output = 1 5 15 35 70
I was not able to solve the problem and kept getting time limit exceeded. I used dynamic programming to solve it in O(NM) time. I looked around and found the following general mathematical solution, but I am still not able to apply it because my math isn't good enough to understand it. Can someone solve it in a better time complexity?
https://math.stackexchange.com/questions/234304/sum-of-the-sum-of-the-sum-of-the-first-n-natural-numbers
Hint: 3, 4, 5 and 6, 10, 15 are sections of diagonals on Pascal's Triangle.
JavaScript code:
function f(n, m) {
    const result = [1];
    for (let i = 1; i < n; i++)
        result.push(result[i-1] * (m + i + 1) / i);
    return result;
}
console.log(JSON.stringify(f(3, 4)));
console.log(JSON.stringify(f(5, 3)));
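As a hedged Python sketch of the same idea for an arbitrary input array (not only 1..N): after M rounds of prefix sums, output[i] = sum over j <= i of a[j] * C(M-1+i-j, i-j), which is the Pascal's-triangle diagonal the hint refers to. Assuming M >= 1:
from math import comb

def prefix_sum_m_times(a, m):
    # O(N^2) regardless of M; coeff[d] = C(m - 1 + d, d), assuming m >= 1.
    coeff = [comb(m - 1 + d, d) for d in range(len(a))]
    return [sum(a[j] * coeff[i - j] for j in range(i + 1)) for i in range(len(a))]

print(prefix_sum_m_times([1, 2, 3], 4))        # [1, 6, 21]
print(prefix_sum_m_times([1, 2, 3, 4, 5], 3))  # [1, 5, 15, 35, 70]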

What is the meaning of "exclusive" and "inclusive" when describing number ranges?

Simple question but, I see exclusive and inclusive when referring to number ranges.
For example, this is a line from an algorithms book:
The following function prints the powers of 2 from 1 through n (inclusive).
What is meant by this? What makes a number range inclusive or exclusive?
In Computer Science, inclusive/exclusive doesn't apply to algorithms, but to a number range (more specifically, to the endpoint of the range):
1 through 10 (inclusive)
1 2 3 4 5 6 7 8 9 10
1 through 10 (exclusive)
1 2 3 4 5 6 7 8 9
In mathematics, the 2 ranges above would be:
[1, 10]
[1, 10)
You can remember it easily:
Inclusive - Including the last number
Exclusive - Excluding the last number
The following function prints the powers of 2 from 1 through n (inclusive).
This means that the function will compute 2^i for i = 1, 2, ..., n; in other words, i can take values from 1 up to and including n. That is, n is included in "inclusive".
If, on the other hand, your book had said:
The following function prints the powers of 2 from 1 through n (exclusive).
This would mean that i = 1, 2, ..., n-1; i.e. i can take values up to, but not including, n, so i = n-1 is the highest value it could have. That is, n is excluded in "exclusive".
In simple terms, inclusive means the range includes n, while exclusive means the range stops just short of n.
Note that each endpoint of a range can be marked inclusive or exclusive independently:
# 1 (inclusive) through 5 (inclusive)
1 <= x <= 5 == [1, 2, 3, 4, 5]
# 1 (inclusive) through 5 (exclusive)
1 <= x < 5 == [1, 2, 3, 4]
# 1 (exclusive) through 5 (inclusive)
1 < x <= 5 == [2, 3, 4, 5]
# 1 (exclusive) through 5 (exclusive)
1 < x < 5 == [2, 3, 4]
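A concrete programming example: Python's built-in range() follows the inclusive-exclusive convention, with the start included and the stop excluded.
print(list(range(1, 5)))  # [1, 2, 3, 4]    -> 1 (inclusive) through 5 (exclusive)
print(list(range(1, 6)))  # [1, 2, 3, 4, 5] -> to get 1 through 5 inclusive, pass 6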
The value of n inclusive of 2 and 5, written [2,5], includes both numbers; in programming terms: n >= 2 && n <= 5.
The value of n exclusive of the upper bound, written [2,5), includes only the first; in programming terms: n >= 2 && n < 5.

Minimum number of moves to empty all buckets, which all have same capacity but different amount of water initially. I can only move left to right

Problem definition
I have n buckets with the same capacity m, one next to the other. I can pour water from one bucket to the one to its right. The goal is to empty them all into another container, but only the rightmost bucket can be emptied into it. Each bucket has a certain initial amount of water w, where 0 <= w <= m and w is an integer. You can't do partial moves: for example, 6 6 -> 3 9, where you pour only 3, would not be allowed. If you pour, you have to pour as much as you can, so a legal move would be 6 6 -> 2 10.
What is the minimum number of moves I have to make to empty all of the buckets? The maximum amount of buckets is 1000 and the maximum capacity is 100.
Examples
Example 1
3 buckets of capacity 10 with the following amounts of water: 4 0 6
The answer would be 4 0 6 -> 0 4 6 -> 0 0 10 -> 0 0 0 which is three moves.
Example 2
3 buckets capacity 10, 8 9 3
8 9 3 -> 8 2 10 -> 0 10 10 -> 0 10 0 -> 0 0 10 -> 0 0 0 = 5 moves total
I first tried doing it with different types of algorithms (greedy, dynamic programming, backtracking, etc.) but none seemed to work. I thought I had found a pretty intuitive solution, but the program that checks these answers tells me it's wrong, so I might be mistaken. That said, this program has rejected correct answers before, so I'm not really sure.
Here is my solution:
Calculate the running sum of the buckets up to and including each bucket, take the ceiling of each sum divided by the capacity of the buckets, and then add all of those values together.
For example: 6 6 6 6 6 -> 6 12 18 24 30
ceil(6/10) ceil(12/10) ceil(18/10) ceil(24/10) ceil(30/10) = 1 + 2 + 2 + 3 + 3 = 11
that is the right answer: 6 6 6 6 6 -> 6 2 10 6 6 -> 0 8 10 6 6 -> 0 8 10 2 10 -> 0 8 2 10 10 -> 0 0 10 10 10 -> 0 0 10 10 0 -> 0 0 10 0 10 -> 0 0 10 0 0 -> 0 0 0 10 0 -> 0 0 0 0 10 -> 0 0 0 0 0 = 11 steps
The logic is that if there are L liters of water at or before a certain position, then at least ceil(L/Capacity) moves must pass through that position. So far I have tried around 30 test cases and they have all worked. Every time I thought I had found a counterexample, I realized I was wrong after trying it out a few times by hand. The problem is that although I am pretty sure this is the right answer, I have no idea how to prove something like this, or I might simply be wrong.
Can someone tell me if this answer is correct?
Here are my findings on the problem
First, let's review the game rules:
Rule 1: You can only dispose of the rightmost bucket, all of it at once, when it is full.
Rule 2: You can pour as many liters as you want (up to 10 - bucket[i+1]) from any bucket[i] into bucket[i+1].
Let's consider the following special cases:
Shifting: in the case of 0 10 0 0 0 you shift the 2nd bucket 3 times before disposal.
Merge: in the case of 2 4 you can merge the 1st bucket into the 2nd to get 0 6.
Perfect Merge: in the case of 6 4 you merge to 0 10, filling the 2nd bucket; an optimal perfect merge leaves 0 in the current bucket.
Conclusion: I suggest the following strategy, Right-to-Left Sequential Disposal: starting from the rightmost non-empty bucket, perform a Perfect Merge into it and then shift it to disposal.
Note: this is not an optimal algorithm for this problem.
void solution (int bucket[], const int size)
{
    int moves = 0;
    for (int i = size - 1; i >= 0; i--)
    {
        // find a non-empty bucket
        if (bucket[i] > 0)
        {
            // bucket needs to be filled
            if (bucket[i] < 10)
                for (int j = i - 1; j >= 0; j--)
                {
                    // -------------------
                    // fill bucket[i] from previous
                    // buckets and count moves
                    // -------------------
                    if (bucket[i] == 10)
                        break;
                }
            bucket[i] = 0; // (size-i) shifts + 1 dispose
            moves = moves + (size - i) + 1;
        }
    }
}
This is not a solution, but Python code that finds the minimum number of moves from a given configuration, by brute force. Every sequence of legal moves is tried and the shortest length is reported.
B = [6, 6, 6, 6, 6]   # Buckets state
M = sum(B) * len(B)   # Gross upper bound
B.append(-sum(B))     # Append the container

def Try(N):
    global M
    # Check if we have found a better solution
    if B[-1] == 0 and N < M:
        M = N
    # Try every possible move from i to i+1
    for i in range(len(B) - 1):
        # Amount transferable
        D = min(B[i], 10 - B[i + 1])
        if D > 0:
            # Transfer
            B[i] -= D
            B[i + 1] += D
            # Recurse
            Try(N + 1)
            # Restore
            B[i] += D
            B[i + 1] -= D

Try(0)
print(M, "moves")
>>>
11 moves
A few points that can influence the design of the right algorithm:
1. Rightmost full buckets are directly disposable. Just add n*(n+1)/2 to the total steps for the n rightmost full buckets.
2. If the rightmost bucket is not full, check the previous bucket: if it is full or is able to fill the rightmost one, do so; otherwise try to fill that bucket, recursively, until either condition is met or the first bucket is reached. Then pour each bucket into the next until it is full or the other is empty, bottom up, and count the steps.
3. Do 1 and 2 until only one or no bucket is left.
4. If one bucket is left, empty it.
Examples:
given 4 0 6
1. fails
2. the last bucket is not full; try to pour the second-last into it, but that one is empty, so pour the 4 into bucket 1 and then bucket 1 into bucket 2. Hence after this step: 4 0 6 => 0 4 6 => 0 0 10
3. 1 bucket left, so go to 4
4. empty the bucket: 0 0 10 => 0 0 0
given 8 9 3
iteration 1 : 8 9 3 => 8 2 10
iteration 2 : 8 2 10 => 8 2 0 and 8 2 0 => 0 10 0
iteration 3 : 0 10 0 => 0 0 10 => 0 0 0
given 6 6 6 6 6
iteration 1 : 6 6 6 6 6 => 6 6 6 2 10
iteration 2 : 6 6 6 2 10 => 6 6 6 2 0 && 6 6 6 2 0 => 6 2 10 2 0 => 6 2 2 10 0
iteration 3 : 6 2 2 10 0 => 6 2 2 0 10 => 6 2 2 0 0 &&
6 2 2 0 0 => 0 8 2 0 0 => 0 0 10 0 0
iteration 4 : 0 0 10 0 0 => 0 0 0 10 0 => 0 0 0 0 10 => 0 0 0 0 0
Time complexity: a straightforward implementation would be O(n^2), because in each iteration at least one rightmost bucket is emptied using O(n) computation.
There is a wonderfully simple solution.
The optimal number of moves is known to be Sum(0<=i<N: Pi\C), where \ denotes integer division rounded up (ceiling division) and the Pi are the prefix sums of B.
Example:
6 6 6 6 6
6 12 18 24 30
1 2 2 3 3 => 11
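A minimal sketch of evaluating that formula (min_moves is my own name for it; ceiling division is done by negation):
def min_moves(buckets, capacity):
    # Sum of ceil(P_i / C) over the prefix sums P_i of the buckets.
    moves, prefix = 0, 0
    for w in buckets:
        prefix += w
        moves += -(-prefix // capacity)  # ceiling division
    return moves

print(min_moves([6, 6, 6, 6, 6], 10))  # 11
print(min_moves([4, 0, 6], 10))        # 3
print(min_moves([8, 9, 3], 10))        # 5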
It suffices to always choose a move that decreases the optimal number by one unit. This move can be found in linear time with respect to N. (Hint: find some Pi\C that a move decreases.) In the traces below, x>y marks the candidate move being evaluated, the second row shows the resulting prefix sums, and the third row their ceilings, with a * marking a term that decreased.
2>10 6 6 6
2 12 18 24 30
1 2 2 3 3 => 11
6 2>10 6 6
6 8 18 24 30
1 1* 2 3 3 => 10
6 6 2>10 6
6 12 14 24 30
1 2 2 3 3 => 11
6 6 6 2>10
6 12 18 20 30
1 2 2 2* 3 => 10
6 6 6 6 0>
6 12 18 24 24
1 2 2 3 3 => 11
The total complexity is O(N·M), where M is the optimal number of moves.
B = [6, 6, 6, 6, 6]
C = 10

# Show the current state
print(B)

# Append the container
B.append(-sum(B))

# Loop until all buckets are empty
while B[-1] != 0:
    # Emulate the quotient by excess (ceiling division)
    Prefix = C - 1
    # Try all buckets
    for i in range(len(B) - 1):
        # Amount transferable
        A = min(B[i], C - B[i + 1])
        # Evaluate the move from i to i+1
        Prefix += B[i]
        if (Prefix - A) // C < Prefix // C:
            # The total count will decrease, accept this move
            B[i] -= A
            B[i + 1] += A
            break
    # Show the current state
    print(B[:-1])
>>>
[6, 6, 6, 6, 6]
[6, 2, 10, 6, 6]
[0, 8, 10, 6, 6]
[0, 8, 10, 2, 10]
[0, 8, 2, 10, 10]
[0, 0, 10, 10, 10]
[0, 0, 10, 10, 0]
[0, 0, 10, 0, 10]
[0, 0, 0, 10, 10]
[0, 0, 0, 10, 0]
[0, 0, 0, 0, 10]
[0, 0, 0, 0, 0]
The proof of termination of the algorithm rests on the validity of the optimal-number formula, which remains to be demonstrated.

Find two subarrays with equal weighted average

We are given an array A of integers. I want to find 2 contiguous subarrays of the largest length (both subarrays must be equal in length) that have the same weighted average. The weights are the positions within the subarray. For example:
A = 4 1 1 1 1 9 2 1 1 1 1 1 1 9
Subarrays: (1 1 1 1 9) and (1 1 1 1 9)
I've tried to find the weighted average of all subarrays by DP and then sorting them columnwise to find 2 with the same length. But I can't proceed further and my approach seems too vague/brute-force. I would appreciate any help. Thanks in advance.
The first step should be to sort the array. Any pairs of equal values can then be identified and factored out. The remaining numbers will all be different, like this:
2, 3, 5, 9, 14, 19 ... etc
The next step would be to compare pairs to their center:
2 + 5 == 2 * 3 ?
3 + 9 == 2 * 5 ?
5 + 14 == 2 * 9 ?
9 + 19 == 2 * 14 ?
The next step is to compare nested pairs, meaning if you have A B C D, you compare A+D to B+C. So for the above example it would be:
2+9 == 3+5 ?
3+15 == 5+9 ?
5+19 == 9+14 ?
Next you would compare triples to the two inside values:
2 + 3 + 9 == 3 * 5 ?
2 + 5 + 9 == 3 * 3 ?
3 + 5 + 14 == 3 * 9 ?
3 + 9 + 14 == 3 * 5 ?
5 + 9 + 19 == 3 * 14 ?
5 + 14 + 19 == 3 * 9 ?
Then you would compare pairs of triples:
2 + 3 + 19 == 5 + 9 + 14 ?
2 + 5 + 19 == 3 + 9 + 14 ?
2 + 9 + 19 == 3 + 5 + 14 ?
and so on. There are different ways to do the ordering. One way is to create an initial bracket: for example, given A B C D E F G H, the initial bracket is ABGH versus CDEF, i.e. the outside compared to the center. Then switch values according to the comparison. For example, if ABGH > CDEF, then you can try all switches where the left value is greater than the right value. In this case G and H are greater than E and F, so the possible switches are:
G <-> E
G <-> F
H <-> E
H <-> F
GH <-> EF
First, as the lengths of the two subarrays must be equal, you can consider each length from 1 to n, step by step.
For length i, you can calculate the weighted sum of every subarray of that length in O(n) total. Then sort the sums to determine whether there's an equal pair.
Because you sort n times, the time would be O(n^2 log n) while the space is O(n).
Maybe I just repeated the solution you mentioned in the question? But I don't think it can be optimized much further...
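A rough sketch of that approach (names are mine; it uses a dict to detect an equal pair instead of sorting, a minor variation, and it does not forbid overlapping windows). The weighted sum of a window is updated in O(1) when sliding: subtract the window's plain sum and add length times the new element.
def longest_equal_weighted_average(A):
    n = len(A)
    for L in range(n, 0, -1):                      # try the largest length first
        w = sum((k + 1) * A[k] for k in range(L))  # weighted sum of first window
        s = sum(A[:L])                             # plain sum of first window
        seen = {w: 0}
        for start in range(1, n - L + 1):
            w += -s + L * A[start + L - 1]         # slide the weighted sum
            s += -A[start - 1] + A[start + L - 1]  # slide the plain sum
            if w in seen:
                return L, seen[w], start           # length and the two start indices
            seen[w] = start
    return None

# (5, 7, 8) for the question's example: two equal length-5 windows of all ones.
print(longest_equal_weighted_average([4, 1, 1, 1, 1, 9, 2, 1, 1, 1, 1, 1, 1, 9]))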

Link list algorithm to find pairs adding up to 10

Can you suggest an algorithm that finds all pairs of nodes in a linked list that add up to 10?
I came up with the following.
Algorithm: Compare each node, starting with the second node, with every node from the head up to (but not including) the current node, and report all pairs that sum to 10.
I think this algorithm should work, but it's certainly not the most efficient one, having a complexity of O(n²).
Can anyone hint at a more efficient solution (perhaps one that takes linear time)? Additional or temporary nodes can be used by such a solution.
If their range is limited (say between -100 and 100), it's easy.
Create an array quant[-100..100] then just cycle through your linked list, executing:
quant[value] = quant[value] + 1
Then the following loop will do the trick.
for i = -100 to 100:
    j = 10 - i
    for k = 1 to quant[i] * quant[j]:
        output i, " ", j
Even if their range isn't limited, you can have a more efficient method than what you proposed, by sorting the values first and then just keeping counts rather than individual values (same as the above solution).
This is achieved by running two pointers, one at the start of the list and one at the end. When the numbers at those pointers add up to 10, output them and move the end pointer down and the start pointer up.
When they're greater than 10, move the end pointer down. When they're less, move the start pointer up.
This relies on the sorted nature. Less than 10 means you need to make the sum higher (move the start pointer up). Greater than 10 means you need to make the sum lower (move the end pointer down). Since there are no duplicates in the list (because of the counts), being equal to 10 means you move both pointers.
Stop when the pointers pass each other.
There's one more tricky bit and that's when the pointers are equal and the value sums to 10 (this can only happen when the value is 5, obviously).
You don't output the number of pairs based on the product of the counts; rather, it's based on the product of (count - 1) with itself. That's because a value 5 with a count of 1 doesn't actually sum to 10 (since there's only one 5).
So, for the list:
2 3 1 3 5 7 10 -1 11
you get:
Index   a   b   c   d   e   f   g   h
Value  -1   1   2   3   5   7  10  11
Count   1   1   1   2   1   1   1   1
You start pointer p1 at a and p2 at h. Since -1 + 11 = 10, you output those two numbers (as above, you do it N times where N is the product of the counts). That's one copy of (-1,11). Then you move p1 to b and p2 to g.
1 + 10 > 10 so leave p1 at b, move p2 down to f.
1 + 7 < 10 so move p1 to c, leave p2 at f.
2 + 7 < 10 so move p1 to d, leave p2 at f.
3 + 7 = 10, output two copies of (3,7) since the count of d is 2, move p1 to e, p2 to e.
5 + 5 = 10 but p1 = p2 so the product is 0 times 0 or 0. Output nothing, move p1 to f, p2 to d.
Loop ends since p1 > p2.
Hence the overall output was:
(-1,11)
( 3, 7)
( 3, 7)
which is correct.
Here's some test code. You'll notice that I've forced 7 (the midpoint) to a specific value for testing. Obviously, you wouldn't do this.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define SZSRC 30
#define SZSORTED 20
#define SUM 14

int main (void) {
    int i, s, e, prod;
    int srcData[SZSRC];
    int sortedVal[SZSORTED];
    int sortedCnt[SZSORTED];

    // Make some random data.
    srand (time (0));
    for (i = 0; i < SZSRC; i++) {
        srcData[i] = rand() % SZSORTED;
        printf ("srcData[%2d] = %5d\n", i, srcData[i]);
    }

    // Convert to value/size array.
    for (i = 0; i < SZSORTED; i++) {
        sortedVal[i] = i;
        sortedCnt[i] = 0;
    }
    for (i = 0; i < SZSRC; i++)
        sortedCnt[srcData[i]]++;

    // Force 7+7 to specific count for testing.
    sortedCnt[7] = 2;
    for (i = 0; i < SZSORTED; i++)
        if (sortedCnt[i] != 0)
            printf ("Sorted [%3d], count = %3d\n", i, sortedCnt[i]);

    // Start and end pointers.
    s = 0;
    e = SZSORTED - 1;

    // Loop until they overlap.
    while (s <= e) {
        // Equal to desired value?
        if (sortedVal[s] + sortedVal[e] == SUM) {
            // Get product (note special case at midpoint).
            prod = (s == e)
                ? (sortedCnt[s] - 1) * (sortedCnt[e] - 1)
                : sortedCnt[s] * sortedCnt[e];
            // Output the right count.
            for (i = 0; i < prod; i++)
                printf ("(%3d,%3d)\n", sortedVal[s], sortedVal[e]);
            // Move both pointers and continue.
            s++;
            e--;
            continue;
        }
        // Less than desired, move start pointer.
        if (sortedVal[s] + sortedVal[e] < SUM) {
            s++;
            continue;
        }
        // Greater than desired, move end pointer.
        e--;
    }
    return 0;
}
You'll see that the code above is all O(n) since I'm not sorting in this version, just intelligently using the values as indexes.
If the minimum is below zero (or very high to the point where it would waste too much memory), you can just use a minVal to adjust the indexes (another O(n) scan to find the minimum value and then just use i-minVal instead of i for array indexes).
And, even if the range from low to high is too expensive on memory, you can use a sparse array. You'll have to sort it, O(n log n), and search it when updating counts, also O(n log n), but that's still better than the original O(n²). The searching is O(n log n) because a single binary search is O(log n) and you have to do one for each value.
And here's the output from a test run, which shows you the various stages of calculation.
srcData[ 0] = 13
srcData[ 1] = 16
srcData[ 2] = 9
srcData[ 3] = 14
srcData[ 4] = 0
srcData[ 5] = 8
srcData[ 6] = 9
srcData[ 7] = 8
srcData[ 8] = 5
srcData[ 9] = 9
srcData[10] = 12
srcData[11] = 18
srcData[12] = 3
srcData[13] = 14
srcData[14] = 7
srcData[15] = 16
srcData[16] = 12
srcData[17] = 8
srcData[18] = 17
srcData[19] = 11
srcData[20] = 13
srcData[21] = 3
srcData[22] = 16
srcData[23] = 9
srcData[24] = 10
srcData[25] = 3
srcData[26] = 16
srcData[27] = 9
srcData[28] = 13
srcData[29] = 5
Sorted [ 0], count = 1
Sorted [ 3], count = 3
Sorted [ 5], count = 2
Sorted [ 7], count = 2
Sorted [ 8], count = 3
Sorted [ 9], count = 5
Sorted [ 10], count = 1
Sorted [ 11], count = 1
Sorted [ 12], count = 2
Sorted [ 13], count = 3
Sorted [ 14], count = 2
Sorted [ 16], count = 4
Sorted [ 17], count = 1
Sorted [ 18], count = 1
( 0, 14)
( 0, 14)
( 3, 11)
( 3, 11)
( 3, 11)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 7, 7)
Create a hash set (HashSet in Java) (could use a sparse array if your numbers are well-bounded, i.e. you know they fall into +/- 100)
For each node, first check if 10-n is in the set. If so, you have found a pair. Either way, then add n to the set and continue.
So for example you have
1 - 6 - 3 - 4 - 9
1 - is 9 in the set? Nope
6 - 4? No.
3 - 7? No.
4 - 6? Yup! Print (6,4)
9 - 1? Yup! Print (9,1)
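A small Python sketch of this approach (the node values are walked as a plain list here for brevity; a real linked-list traversal would be the same single pass):
def pairs_summing_to(values, target=10):
    # One pass: for each value, check whether its complement was seen earlier.
    seen = set()
    pairs = []
    for n in values:
        if target - n in seen:
            pairs.append((target - n, n))
        seen.add(n)
    return pairs

print(pairs_summing_to([1, 6, 3, 4, 9]))  # [(6, 4), (1, 9)]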
This is a restricted form of the subset-sum problem (which is NP-complete in general), but restricted to pairs it can be solved efficiently.
If you were to first sort the list, it would reduce the number of pairs that need to be evaluated.
