Minimum range of 3 sets - algorithm

We have three sets S1, S2, S3. I need to find x,y,z such that
x ∈ S1
y ∈ S2
z ∈ S3
let min denote the minimum value out of x,y,z
let max denote the maximum value out of x,y,z
The range denoted by max-min should be the MINIMUM possible value

Of course, the full brute-force solution described by IVlad is simpler and therefore easier and faster to write, but its complexity is O(n^3).
Given your algorithm tag, I would like to post a more complex algorithm that has O(n^2) worst-case and O(n log n) average complexity (I'm fairly sure about this, but I'm too lazy to write a proof).
Algorithm description
Consider some abstract (X, Y, Z) tuple. We want to find a tuple that has the minimal distance between its maximum and minimum elements. The key observation is that the distance is determined by the maximum and minimum elements alone; the value of the element between them doesn't matter, as long as it actually lies between the maximum and the minimum.
So, here is the approach. We allocate an additional set (let's call it S) and combine every initial set (X, Y, Z) into it. We also need the ability to look up the initial set of every element in the set we've just created (so, if we point to some element of S, say S[10], and ask "Where did this guy come from?", our application should answer something like "He comes from Y").
After that, let's sort our new set S by its keys (this would be O(n log n), or O(n) in certain cases).
Determining the minimal distance
Now the interesting part comes. What we want to do is to compute an artificial value, let's call it the minimal distance and mark it d[x], where x is some element of S. This value is the minimal max - min distance that can be achieved using the elements that are predecessors / successors of the current element in the sequence.
Consider the following example - this is our S set (the first line shows indexes, the second values, and the letters X, Y and Z refer to the initial sets):
Index:   0   1   2   3   4   5   6   7
Value:   1   2   4   5   8  10  11  12
Set:     Y   Z   Y   X   Y   Y   X   Z
Let's say we want to compute the minimal distance for the element with index 4. That minimal distance corresponds to the best (x, y, z) tuple that can be built using the selected element.
In our case (S[4]), we can say that our (x, y, z) tuple would definitely look like (something, 8, something), because it has to contain the element we're computing the distance for (pretty obvious, hehe).
Now, we have to fill the gaps. We know that the elements we're looking for should come from X and Z, and we want those elements to be the best possible in terms of max - min distance. There is an easy way to select them.
We make a bidirectional run (run left, then run right from the current element), looking for the first element not from Y in each direction. In this case we would find the nearest elements from X and Z in both directions (4 elements total).
This finding method is exactly what we need: if we take the first element from X encountered while running (left or right, it doesn't matter), that element suits us better than any element from X that follows it, in terms of distance. This holds because our set S is sorted.
In the example above (computing the distance for the element with index 4), we would mark the elements with indexes 6 and 7 as suitable on the right side, and the elements with indexes 1 and 3 on the left side.
Now, we have to test the 4 cases that can occur and take the one with the minimal distance. In our particular case we have the following elements returned by the previous routine:
Set:     Z   X   Y   X   Z
Value:   2   5   8  11  12
We should test every (X, Y, Z) tuple that can be built from these elements, take the tuple with the minimal distance and save that distance for our element. In this example, the tuple (11, 8, 12) has the best distance, 4. So we store d[4] = 4 (4 here is the element index).
Yielding the result
Now that we know how to find the distance, let's do it for every element in our S set (this takes O(n^2) in the worst case and less on average, something like O(n log n)).
Once we have that distance value for every element in our set, just select the element with the minimal distance and run the distance-computing algorithm described above for it once more, but this time save the (-, -, -) tuple itself. That tuple is the answer.
Pseudocode
Here comes the pseudocode. I tried to make it easy to read, but the implementation would be more involved, because you'll need to code the set lookups ("determine the set for an element"). Also note that the determine-distance and determine-tuple routines are basically the same, except that the latter yields the actual tuple.
COMBINE(X, Y, Z) -> S
SORT(S)
FOREACH (v in S)
    DETERMINE_DISTANCE(v, S) -> d[v]
DETERMINE_TUPLE(MIN(d[v]))
P.S. I'm pretty sure that this method could easily be extended to (-, -, -, ..., -) tuple seeking, still resulting in good algorithmic complexity.
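For illustration, here is a minimal Python sketch of the approach just described (combine, sort, then scan outwards from each element for the nearest neighbours from the other two sets). The function name and the list-based representation are my own, and the bidirectional scan is the naive linear one, which gives the O(n^2) worst case mentioned above:

def min_range_triple(s1, s2, s3):
    # Combine the three sets into one sorted, labelled list S.
    S = sorted([(v, 0) for v in s1] + [(v, 1) for v in s2] + [(v, 2) for v in s3])
    best = None
    for i, (v, src) in enumerate(S):
        # For each of the two other sets, find the nearest element to the
        # left and to the right of position i (4 candidates in total).
        candidates = {src: [v]}
        for other in {0, 1, 2} - {src}:
            found = []
            for j in range(i - 1, -1, -1):        # nearest on the left
                if S[j][1] == other:
                    found.append(S[j][0])
                    break
            for j in range(i + 1, len(S)):        # nearest on the right
                if S[j][1] == other:
                    found.append(S[j][0])
                    break
            if not found:
                return None                       # that set is empty, no triple exists
            candidates[other] = found
        # Test every (x, y, z) tuple that can be built from the candidates.
        for a in candidates[0]:
            for b in candidates[1]:
                for c in candidates[2]:
                    span = max(a, b, c) - min(a, b, c)
                    if best is None or span < best[0]:
                        best = (span, (a, b, c))
    return best    # (range, (x, y, z))

# Example: min_range_triple([5, 11], [1, 4, 8, 10], [2, 12]) -> (3, (5, 4, 2))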

min = infinity (really large number in practice, like 1000000000)
solution = (-, -, -)
for each x ∈ S1
    for each y ∈ S2
        for each z ∈ S3
            t = max(x, y, z) - min(x, y, z)
            if t < min
                min = t
                solution = (x, y, z)

Related

Maximize minimum distance between arrays

Let's say that you are given n sorted arrays of numbers and you need to pick one number from each array such that the minimum distance between the n chosen elements is maximized.
Example:
arrays:
[0, 500]
[100, 350]
[200]
2<=n<=10 and every array could have ~10^3-10^4 elements.
In this example the optimal choice to maximize the minimum distance is to pick 500, 350, 200 or 0, 200, 350, where the minimum distance is 150, which is the maximum possible over all combinations.
I am looking for an algorithm to solve this. I know that I could binary-search the maximum minimum distance, but I can't see how to decide whether there is a solution with minimum distance at least d, which the binary search needs. I am thinking maybe dynamic programming could help, but I haven't managed to find a solution with DP.
Of course, generating all combinations of n elements is not efficient. I have already tried backtracking, but it is slow since it tries every combination.
n ≤ 10 suggests that we can take an exponential dependence on n. Here's an O(2^n m n)-time algorithm, where m is the total size of the arrays.
The dynamic programming approach I have in mind is, for each subset of
arrays, calculate all of the pairs (maximum number, minimum distance) on
the efficient frontier, where we have to choose one number from each of
the arrays in the subset. By efficient frontier I mean that if we have
two pairs (a, b) ≠ (c, d) with a ≤ c and b ≥ d, then (c, d) is not on
the efficient frontier. We'll want to keep these frontiers sorted for
fast merges.
The base case with the empty subset is easy: there's one pair, (minimum
distance = ∞, maximum number = −∞).
For every nonempty subset of arrays in some order that extends the
inclusion order, we compute a frontier for each array in the subset,
representing the subset of solutions where that array contributes the
maximum number. Then we merge these frontiers. (Naively this costs us
another factor of log n, which maybe isn't worth the hassle to avoid
given that n ≤ 10, but we can avoid it by merging the arrays once at the
beginning to enable future merges to use bucketing.)
Constructing a new frontier from a subset's frontier and another array also involves a merge. We initialize an iterator at the start of the frontier (i.e., least maximum number) and an iterator at the start of the array (i.e., least number). While neither iterator is past the end,
Emit a candidate pair (min(minimum distance, array number − maximum
number), array number).
If the min was less than or equal to minimum distance, increment the
frontier iterator. If the min was less than or equal to array number
− maximum number, increment the array iterator.
Cull the candidate pairs to leave only the efficient frontier. There is
an elegant way to do this in code that is more trouble to explain.
I am going to give an algorithm that, for a given distance d, will output whether it is possible to make a selection where the distance between any pair of chosen numbers is at least d. Then, you can binary-search for the maximum d for which the algorithm outputs "YES", in order to find the answer to your problem.
Assume the minimum distance d is given. Here is the algorithm:
for every permutation p of size n do:
    last := -infinity
    ok := true
    for p_i in p do:
        x := the smallest element greater than or equal to last + d in the p_i-th array (can be found efficiently with binary search)
        if no such x was found then
            ok := false
            break
        end
        last := x
    done
    if ok then
        return "YES"
    end
done
return "NO"
So, we brute-force the order of arrays. Then, for every possible order, we use a greedy method to choose elements from each array, following the order. For example, take the example you gave:
arrays:
[0, 500]
[100, 350]
[200]
and assume d = 150. For the permutation 1 3 2, we first take 0 from the 1st array, then we find the smallest element in the 3rd array that is greater than or equal to 0+150 (it is 200), then we find the smallest element in the 2nd array which is greater than or equal to 200+150 (it is 350). Since we could find an element from every array, the algorithm outputs "YES". But for d = 200 for instance, the algorithm would output "NO" because none of the possible orderings would result in a successful selection.
The complexity of the above algorithm is O(n! * n * log(m)), where m is the maximum number of elements in an array. I believe it would be sufficient, since n is very small. (For m = 10^4, 10! * 10 * 13 ≈ 5*10^8, which can be computed in under a second on a modern CPU.)
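A minimal Python sketch of this decision procedure combined with the outer binary search over d (function names and the integer search bounds are my own; the arrays are sorted inside the helper):

from bisect import bisect_left
from itertools import permutations

def feasible(arrays, d):
    # Is there one element per array with every pair at distance >= d?
    for order in permutations(range(len(arrays))):
        last = float('-inf')
        ok = True
        for idx in order:
            arr = arrays[idx]                      # sorted ascending
            pos = bisect_left(arr, last + d)       # smallest element >= last + d
            if pos == len(arr):
                ok = False
                break
            last = arr[pos]
        if ok:
            return True
    return False

def max_min_distance(arrays):
    arrays = [sorted(a) for a in arrays]
    lo = 0
    hi = max(max(a) for a in arrays) - min(min(a) for a in arrays)
    while lo < hi:                                 # binary search the largest feasible d
        mid = (lo + hi + 1) // 2
        if feasible(arrays, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# max_min_distance([[0, 500], [100, 350], [200]]) -> 150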
Let's look at an example with the optimal choices marked x (horizontal arrays A, B, C, D):
A x
B b x b
C x c
D d x
Our recurrence based on range could be: let f(low, excluded) represent the maximum closest distance between two chosen elements (one from each of the arrays not in excluded), where low is the lowest chosen element. Then:
(1)
f(low, excluded) when |excluded| = n-1:
    max(low)
    for low in the only permitted array

(2)
f(low, excluded):
    max(
        min(
            a - low,
            f(a, excluded')
        )
    )
    for a ≥ low, a not in excluded'
    where excluded' = excluded ∪ {low's array}
We can limit a. For one thing the maximum we can achieve is
(3)
m = (highest - low) / (n - |excluded| - 1)
which means a need not go higher than low + m.
Secondly, we can store results for all f(a, excluded'), keyed by excluded' (we have 2^10 possible keys), each in a decorated binary tree ordered by a. The decoration will be the highest result achievable in the right subtree, meaning we can find the max for all f(v, excluded'), v ≥ a in logarithmic time.
The latter establishes a dominance relationship, and clearly we are interested in both a larger a and a larger f(a, excluded') so as to maximise the min function in (2). Picking an a in the middle, we can use a binary search. If we have:
a - low < max(v, excluded'), v ≥ a
where max(v, excluded') is the lookup
for a in the decorated tree
then we look to the right, since max(v, excluded') indicates there's a better answer on the right, where a - low is also larger.
And if we have:
a - low ≥ max(v, excluded'), v ≥ a
then we record this candidate and look to the left, since to the right the answer is capped at max(v, excluded'), while a - low cannot decrease.
In order to conduct the binary search on the range [low, low + m] (see (3)), rather than merging and labelling all the arrays at the outset, we can keep them separate and compare the candidates closest to mid from each array we are currently permitted to choose a from. (The trees hold the mixed results, keyed by subset.) (The flow of this part is not completely clear to me.)
The worst case with this method, given that n = C is constant, seems to be
O(C * array_length * 2^C * C * log(array_length) * log(C * array_length))
C * array_length is the iteration on low
Each low can be paired with 2^C inclusions
C * log(array_length) is the separated binary-search
And log(C * array_length) is the tree lookup
Simplifying:
= O(array_length * log^2(array_length))
although in practice, there could be many dead-end branches that exit early where a full selection wouldn't be possible.
In case it wasn't clear, the iteration is over a fixed lowest element in the selection. In other words, we want the best f(low, excluded) over all different lows (and excludeds). For a bottom-up version, we would iterate from the highest value down, so our results for a get stored as we iterate.

How to find number of steps to transform (a,b) to (x,y)

Given 2 numbers a=1 and b=1.
At each steps, you can do one of the following:
a+=b;
b+=a;
If it's possible to transform a into x and b into y, find the minimum steps needed
x and y can be arbitrarily large (more than 10^15)
My approach so far was just to do a recursive backtrack which will be around O(2^min(x,y)) in complexity (too large). DP won't do either since the states can be more than 10^15.
Any idea? Is there any number theory that is needed to solve this?
P.s. This is not a homework.
Given that you reached some (x, y), the only way to get there is if you added the smaller value into what is now the larger value. Say x > y; then the only possible previous state is (x - y, y).
Also note that the number of steps to get to x,y is the same to get to y,x.
So the solution you are looking for is something like
steps(x, y):
    if x < y: return steps(y, x)
    if y == 1: return x - 1
    if y == 0: throw error  # you can't reach this combination
    return floor(x / y) + steps(y, x % y)
In other words, find the depth of the node in the Calkin–Wilf tree. The node (x, y) exists iff gcd(x, y) = 1. You can modify the gcd algorithm to give the number of operations as a byproduct (sum all of the quotients computed along the way and subtract one).
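A small Python sketch of that modified Euclidean algorithm (the function name is mine; it returns None for unreachable pairs, i.e. when gcd(x, y) != 1):

def steps(x, y):
    # Sum the quotients produced by the Euclidean algorithm on (x, y);
    # that sum minus one is the number of a += b / b += a moves from (1, 1).
    total = 0
    while y:
        total += x // y
        x, y = y, x % y
    if x != 1:          # x now holds the gcd of the original pair
        return None     # unreachable from (1, 1)
    return total - 1

# steps(5, 3) -> 3    e.g. (1,1) -> (2,1) -> (2,3) -> (5,3)
# steps(4, 2) -> None (gcd is 2, so (4, 2) can never be reached)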

Data structure to support a particular query on a set of 2D points

I have a set of 2D points, and I want to be able to make the following query with arguments x_min and n: what are the n points with largest y which have x > x_min?
To rephrase in Ruby:
class PointsThing
  def initialize(points)
    @points = points
  end

  def query(x_min, n)
    @points.select { |point| point.x > x_min }.sort_by { |point| -point.y }.take(n)
  end
end
Ideally, my class would also support an insert and delete operation.
I can't think of a data structure for this which would enable the query to run in less than O(|@points|) time. Does anyone know of one?
Sort the points by x descending. For each point in order, insert it into a purely functional red-black tree ordered by y descending. Keep all of the intermediate trees in an array.
To look up a particular x_min, use binary search to find the intermediate tree where exactly the points with x > x_min have been inserted. Traverse this tree to find the first n points.
The preprocessing cost is O(p log p) in time and space, where p is the number of points. The query time is O(log p + n), where n is the number of points to be returned in the query.
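A Python sketch of this idea (names and representation are my own). For brevity it uses a plain path-copying BST instead of a red-black tree, so the logarithmic bounds above only hold if the tree happens to stay balanced; with rebalancing added, the stated bounds apply:

class Node:
    __slots__ = ('y', 'point', 'left', 'right')
    def __init__(self, y, point, left=None, right=None):
        self.y, self.point, self.left, self.right = y, point, left, right

def insert(root, y, point):
    # Path-copying insert: returns a new root; the old tree remains valid.
    if root is None:
        return Node(y, point)
    if y < root.y:
        return Node(root.y, root.point, insert(root.left, y, point), root.right)
    return Node(root.y, root.point, root.left, insert(root.right, y, point))

def top_n(root, n, out):
    # Collect up to n points in decreasing y order (right subtree first).
    if root is None or len(out) >= n:
        return
    top_n(root.right, n, out)
    if len(out) < n:
        out.append(root.point)
        top_n(root.left, n, out)

class PointsThing:
    def __init__(self, points):                   # points: list of (x, y) pairs
        pts = sorted(points, key=lambda p: -p[0])  # x descending
        self.xs_desc = [p[0] for p in pts]
        self.roots = [None]
        root = None
        for x, y in pts:
            root = insert(root, y, (x, y))
            self.roots.append(root)               # roots[i]: first i points inserted

    def query(self, x_min, n):
        lo, hi = 0, len(self.xs_desc)             # count points with x > x_min
        while lo < hi:
            mid = (lo + hi) // 2
            if self.xs_desc[mid] > x_min:
                lo = mid + 1
            else:
                hi = mid
        out = []
        top_n(self.roots[lo], n, out)
        return out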
If your data are not sorted, then you have no choice but to check every point since you cannot know if there exists another point for which y is greater than that of all other points and for which x > x_min. In short: you can't know if another point should be included if you don't check them all.
In that case, I would assume that it would be impossible to check in sublinear time as you ask for, since you have to check them all. Best case for searching all would be linear.
If your data are sorted, then your best case will be constant time (all n points are those with the greatest y), and worst case would be linear (all n points are those with least y). Average case would be closer to constant I think if your x and x_min are both roughly random within a specific range.
If you want this to scale (that is, you could have large values of n), you will want to keep your resultant set sorted as well since you will need to check new potential points against it and to drop the lowest value when you insert (if size > n). Using a tree, this can be log time.
So, to do the entire thing, worst case is for unsorted points, in which case you're looking at nlog(n) time. Sorted points is better, in which case you're looking at average case of log(n) time (again, assuming roughly randomly distributed values for x and x_min), which yes is sub-linear.
In case it isn't at first obvious why sorted points allow constant-time search, I will go over that here quickly.
If the n points with the greatest y values all had x > x_min (the best case) then you are just grabbing what you need off the top, so that case is obvious.
For the average case, assuming roughly randomly distributed x and x_min, the odds that x > x_min are basically half. For any two random numbers a and b, a > b is just as likely to be true as b > a. This is the same thing with x and x_min; x > x_min is equally as likely to be true as x_min > x, meaning 0.5 probability. This means that, for your points, on average every second point checked will meet your x > x_min requirement, so on average you will check 2n points to find the n highest points that meet your criteria. So the best case was c time, average is 2c which is still constant.
Note, however, that for values of n approaching the size of the set this hides the fact that you are going through the entire set, essentially bringing it right back up to linear time. So my assertion that it is constant time does not hold true if you assume random values of n within the range of the size of your set.
If this is not a purely academic question and is prompted by some actual need, then it depends on the situation.
(edit)
I just realized that my constant-time assertions were assuming a data structure where you have direct access to the highest value and can go sequentially to lower values. If the data structure that those are provided to you in does not fit that description, then obviously that will not be the case.
Some precomputation would help in this case.
First, partition the set of points, taking x_min as the pivot element.
Then, for the set of points lying to the right of x_min, build a max-heap based on the y coordinates.
Now run your query: perform n extract-max operations on the built max-heap.
The running time of your query would be log X + log(X-1) + ... + log(X-(n-1)), where:
log X: for the first extract-max operation;
log(X-1): for the second extract-max operation, and so on;
X: the size of the original max-heap.
Even in the worst case, when n << X, the time taken would be O(n log X).
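A small Python illustration of this per-query approach using heapq (negating y to simulate a max-heap; the names are mine):

import heapq

def query(points, x_min, n):
    # Partition: keep only the points to the right of x_min, then heapify on -y.
    heap = [(-y, x, y) for (x, y) in points if x > x_min]
    heapq.heapify(heap)                     # O(X), X = number of qualifying points
    out = []
    for _ in range(min(n, len(heap))):      # n extract-max operations, O(n log X)
        _, x, y = heapq.heappop(heap)
        out.append((x, y))
    return out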
Notation
Let P be the set of points.
Let top_y(n, x_min) describe the query to collect the n points from P with the largest y-coordinates among those with x-coordinate greater than or equal to x_min.
Let x_0 be the minimum of the x coordinates in your point set. Partition the x axis to the right of x_0 into a set of left-closed, right-open intervals I_i determined by the x coordinates of the point set P, such that min(I_i) is the i-th smallest x coordinate in P. Define the coordinate rank r(x) of x as the index of the interval that x is an element of, or 0 if x < x_0.
Note that r(x) can be computed in O(log #({I_i})) using a binary search tree.
Simple Solution
Sort your point set by decreasing y-coordinate and save it as array A; this takes O(#P log #P) time and O(#P) space.
Process each query top_y(n, x_min) by traversing this array in order, skipping over items A_i with A_i.x < x_min and counting all other entries until the counter reaches n or you reach the end of A. This processing takes O(n) time and O(1) space.
Note that this may already be sufficient: queries top_y(n_0, a_0) with a_0 < min { p.x | p ∈ P } and n_0 = c * #P (c constant) require step 1 anyway, and for n << #P and infrequent queries, any further optimization may not be worth the effort.
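A few lines of Python for the simple solution (names are mine; points are (x, y) pairs):

def preprocess(points):
    # A: the point set sorted by decreasing y-coordinate, O(#P log #P).
    return sorted(points, key=lambda p: -p[1])

def top_y(A, n, x_min):
    # Walk A in order, skipping points with x < x_min, until n points are collected.
    out = []
    for p in A:
        if p[0] >= x_min:
            out.append(p)
            if len(out) == n:
                break
    return out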
Observation
Consider the sequences s_i and s_(i+1) of points with x-coordinates greater than or equal to min(I_i) and min(I_(i+1)) respectively, ordered by decreasing y-coordinate. s_(i+1) is a strict subsequence of s_i.
If p_1 ∈ s_(i+1) and p_2.x >= p_1.x, then p_2 ∈ s_(i+1).
Refined Solution
A refined data structure allows for O(n) + O(log #P) query processing time.
Annotate the array A from the simple solution with a 'successor dispatch' for precisely those elements A_i with A_(i+1).x < A_i.x; This dispatch data would consist of an array disp:[r(A_(i+1).x) + 1 .. r(A_i.x)] of A-indexes of the next element in A whose x-coordinate ranks at least as high as the index into disp. The given dispatch indices suffice for processing the query, since ...
... disp[j] = disp[r(A_(i+1).x) + 1] for each j <= r(A_(i+1).x).
... for any x_min with r(x_min) > r(A_i.x), the algorithm wouldn't be here
The proper index to access disp is r(x_min) which remains constant throughout a query and thus takes O(log #P) to compute once per query while the index selection itself is O(1) at each A element.
disp can be precomputed. No two disp entries across all disp arrays are identical (Proof skipped, but it's easy [;-)] to see given the construction). Therefore the construction of disp arrays can be performed stack-based in a single sweep through the point set sorted in A. As there are #P entries, the disp structure takes O(#P) space and O(#P) time to construct, being dominated by space and time requirements for y-sorting. So in a certain sense, this structure comes for free.
Time requirements for query top_y(n,x_min)
Computing r(x_min): O(log #P);
Passage through A: O(n);

Topological Sorting of a directed acyclic graph

How would you output all the possible topological sorts for a directed acyclic graph? For example, given a graph where V points to W and X, W points to Y and Z, and X points to Z:
V --> W --> Y
W --> Z
V --> X --> Z
How do you topologically sort this graph to produce all possible results? I was able to use a breadth-first search to get V, W, X, Y, Z and a depth-first search to get V, W, Y, Z, X, but I wasn't able to output any other orderings.
An algorithm for generating all topological sorts for a given DAG (aka generating all linear extensions of a partial order) is given in the paper "Generating Linear Extensions Fast" by Pruesse and Ruskey. The algorithm has an amortized running time that is linear in the output (e.g.: if it outputs M topological sorts, it runs in time O(M)).
Note that in general you can't really have anything that has a runtime that's efficient with respect to the size of the input since the size of the output can be exponentially larger than the input. For example, a completely disconnected DAG of N nodes has N! possible topological sorts.
It might be possible to count the number of orderings faster, but the only way to actually generate all orderings that I can think of is with a full brute-force recursion. (I say "brute force", but this is still much better than the brutest-possible brute force approach of testing every possible permutation :) )
Basically, at every step there is a set S of vertices remaining (i.e. which have not been added to the order yet), and a subset X of these can be safely added in the next step. This subset X is exactly the set of vertices that have no in-edges from vertices in S.
For a given partial solution L consisting of some number of vertices that are already in the order, the set S of remaining vertices, and the set X of vertices in S that have no in-edges from other vertices in S, the call Generate(L, X, S) will generate all valid topological orders beginning with L.
Generate(L, X, S):
    If X is empty:
        Either L is already a complete solution, in which case it contains all n vertices and S is also empty, or the original graph contains a cycle.
        If S is empty:
            Output L as a solution.
        Otherwise:
            Report that a cycle exists. (In fact, all vertices in S participate in some cycle, though there may be more than one.)
    Otherwise:
        For each x in X:
            Let L' be L with x added to the end.
            Let X' be X\{x} plus any vertices whose only in-edge among vertices in S came from x.
            Let S' = S\{x}.
            Generate(L', X', S')
To kick things off, find the set X of all vertices having no in-edges and call Generate((), X, V). Because every x chosen in the "For each" loop is different, every partial solution L' generated by the iterations of this loop must also be distinct, so no solution is generated more than once by any call to Generate(), including the top-level call.
In practice, forming X' can be done more efficiently than the above pseudocode suggests: When we choose x, we can delete all out-edges from x, but also add them to a temporary list of edges, and by tracking the total number of in-edges for each vertex (e.g. in an array indexed by vertex number) we can efficiently detect which vertices now have 0 in-edges and should thus be added to X'. Then at the end of the loop iteration, all the edges that we deleted can be restored from the temporary list.
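Here is a Python sketch of Generate() with the in-degree bookkeeping described in the previous paragraph (the representation and names are my own; the graph is given as a vertex list and a list of directed edges):

from collections import defaultdict

def all_topological_sorts(vertices, edges):
    out_edges = defaultdict(list)
    indegree = {v: 0 for v in vertices}
    for u, v in edges:
        out_edges[u].append(v)
        indegree[v] += 1
    results = []
    L = []                                          # current partial order
    X = {v for v in vertices if indegree[v] == 0}   # vertices that can go next
    def generate():
        if not X:
            if len(L) == len(vertices):
                results.append(list(L))             # complete solution
            # else: the remaining vertices contain a cycle
            return
        for x in list(X):
            X.discard(x)
            L.append(x)
            freed = []
            for w in out_edges[x]:                  # "delete" x's out-edges
                indegree[w] -= 1
                if indegree[w] == 0:
                    freed.append(w)
                    X.add(w)
            generate()
            for w in freed:                         # restore the state
                X.discard(w)
            for w in out_edges[x]:
                indegree[w] += 1
            L.pop()
            X.add(x)
    generate()
    return results

# all_topological_sorts(['V','W','X','Y','Z'],
#                       [('V','W'), ('V','X'), ('W','Y'), ('W','Z'), ('X','Z')])
# yields 5 orderings, including V, W, Y, X, Z.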
So this approach is flawed! I'm unsure whether it can be salvaged, so I'll leave it up a little while; if anyone can spot how to fix it, either grab what you can and post a new answer, or edit mine.
Specifically, I used the below algorithm on the example from the comment and it will not output the example given, so it is clearly flawed.
The way I've learned to do a topological sort is the following:
Create a list of all the elements with no arrows pointing into it
Create a dictionary of element -> number, where element here is any element in the original collection that has an arrow into it, and the number is how many elements point to it.
Create a dictionary of element -> list, where element here is any element in the original collection that has an arrow out of it, and the list is all the elements those arrows point to
In your example, the two dictionaries and the list would be like this:
D1      D2          List
W: 1    V: W, X     V
Y: 1    W: Y, Z
Z: 2    X: Z
X: 1
Then, start a loop where on each iteration you do the following:
Output all elements of the list; these currently have no arrows pointing into them. Make a temporary copy of the list, and clear the list, preparing it for the following iteration.
Loop through the temporary copy, and find each element (if it exists) in the dictionary that is element -> list
For each element in those lists, decrement the corresponding number in the element -> number dictionary by 1 (removing 1 arrow). Once a number for an element here reaches 0, add that element to the list (it has no arrows left)
If the list is non-empty, redo the iteration loop
If you reach this point, and the dictionary with element -> number still has any elements left in it with a number above 0 (if you want to, you can remove the elements as you go in the above iteration once their numbers reach zero to make this part easier), then you have a cycle, since the above loop should not terminate until all arrows have been removed.
For your example, each iteration would output the following:
V
W, X (2nd iteration output both W and X)
Y, Z
If you want to know how I arrived at this solution, simply go through my iteration description step by step using the above dictionaries and list as the starting point.
Now, to specifically answer your question, how to output all combinations. The only places where "combinations" comes into play is per iteration. Basically, all the elements that you output in the first step of the iteration (the ones you made a temporary copy of) are considered "equivalent" and any internal ordering between these would have no impact on the topological sort.
So, do this:
In the first point in the iteration, place those elements into a list, and add that to another list, giving you a list of lists
This lists of lists will now contain each iteration as one element, and one element will be yet another list with the elements output in that iteration
Now, combine all permutations of the first list with all the permutations of the second list with all the permutations of the third list, and so on
This means taking this output:
V
W, X
Y, Z
Which gives you 1 * 2 * 2 = 4 permutations in total and you would combine all permutations of the 1st iteration (which is 1) with all the permutations of the 2nd iteration (which is 2, W, X and X, W) with all the permutations of the 3rd iteration (which is 2, Y, Z and Z, Y).
The final list of permutations that are valid topological sorts would be this:
V, W, X, Y, Z
V, X, W, Y, Z
V, W, X, Z, Y
V, X, W, Z, Y
Here is the example from the comment:
A and B with no in-edges. Both A and B have an edge to C, but only A has an edge to D. Neither C nor D has any out-edges.
Which gives:
A --> C
A --> D
B --> C
Dictionaries and list:
D1      D2        List
C: 2    A: C, D   A
D: 1    B: C      B
Iterations would output:
A, B
D, C
All permutations (2 * 2 = 4):
A, B, D, C
A, B, C, D
B, A, D, C
B, A, C, D

Finding the number of possible arithmetic series of 3 among a given set of numbers

Given a set of integers, the problem consists of finding the number of possible arithmetic series of length 3. The set of integers may or may not be sorted.
I could implement a simple brute-force algorithm taking O(n^3) time, but time efficiency is important and the set of integers can be as large as 10^5, so brute force obviously won't work. Can anyone suggest an algorithm/pseudocode/code in C++?
An example: there are 4 numbers 5, 2, 7, 8. Clearly there is only one such possibility - (2, 5, 8), in which the common difference is 3, so our answer is 1.
EDIT: I forgot to mention one important property - each number of the given set is between 1 and 30000 (inclusive).
You can do it in O(N^2) as follows: create a hash set of your integers so that you can check the presence or absence of an element in O(1). After that, make two nested loops over all pairs of set elements {X, Y}. This is done in O(N^2).
For each pair {X, Y}, assume that X < Y, and calculate two numbers:
Z1 = X - (Y-X)
Z2 = Y + (Y-X)
A triple {X, Y, Zi} forms an arithmetic sequence if Zi != X && Zi != Y && set.contains(Zi).
Check both triples {X, Y, Z1} and {X, Y, Z2}. You can do it in O(1) using a hash set, for a total running time of the algorithm of O(N^2).
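A short Python version of this idea, adapted (my own adaptation) so that each progression is counted exactly once by only extending each pair {X, Y} upwards via Z2; duplicates in the input are collapsed, which matches the "set of integers" formulation:

def count_ap3(values):
    s = set(values)
    vals = sorted(s)
    count = 0
    for i, x in enumerate(vals):          # all pairs x < y
        for y in vals[i + 1:]:
            if y + (y - x) in s:          # Z2 = Y + (Y - X) completes the progression
                count += 1
    return count

# count_ap3([5, 2, 7, 8]) -> 1   (the progression 2, 5, 8)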
An alternative solution that is O(N + B log B) (where B is the maximum value of the integers - in your case 30,000) is to consider the histogram H, where H[x] is the number of times x is present in the sequence.
This histogram can be computed in O(N) time.
You are seeking elements a,b,c such that b-a=c-b. This is equivalent to 2b=a+c.
So the idea is to compute a second histogram G[x] for a+c and then loop through all elements b and add H[b]*G[2b] to the total. This takes time O(B).
(G[x] is the number of times in the sequence there are a pair of values a,b such that x=a+b.)
The only difficulty is computing G[x], but this can be done using the Fast Fourier Transform to convolve H[x] with itself in time O(BlogB).
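A sketch of this FFT variant in Python with numpy (my own code; it assumes the input values are distinct, as in a set, and excludes the degenerate pair a = c = b when reading off the convolution):

import numpy as np

def count_ap3_fft(values, max_val=30000):
    s = set(values)
    H = np.zeros(max_val + 1)
    for v in s:
        H[v] = 1.0
    size = 1
    while size < 2 * (max_val + 1):       # room for sums up to 2 * max_val
        size *= 2
    f = np.fft.rfft(H, size)
    # conv[x] = number of ordered pairs (a, c) from the set with a + c = x
    conv = np.rint(np.fft.irfft(f * f, size)).astype(np.int64)
    total = 0
    for b in s:
        total += (conv[2 * b] - 1) // 2   # drop the (b, b) pair, halve to get unordered pairs
    return total

# count_ap3_fft([5, 2, 7, 8]) -> 1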

Resources