How to solve the equation sum{max(a_i, x)}=y with variable x? Is there any algorithm with O(n) time complexity?

I am trying to find an algorithm to solve the following equation:
∑ max(ai, x) = y
in which the ai are constants and x is the variable.
I can find an algorithm with O(n log n) time complexity as follows:
First of all, sort the ai in O(n log n) time, and arrange intervals
(−∞, a0), (a0, a1), …, (ai, ai+1), …, (an−1, an), (an, ∞)
Then, for each interval, assume that x belongs to this interval and solve the equation. We obtain a candidate x̂ and test whether x̂ belongs to the corresponding interval. If it does, we assign x̂ to x and return x; otherwise, we try the next interval until we find the solution.
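Concretely, that approach looks roughly like this (a rough Python sketch, assuming y ≥ sum of the a_i so that a solution exists):

def solve_sorted(a, y):
    # Sort-based O(n log n) approach: test each interval for a consistent candidate.
    # Assumes a is non-empty and y >= sum(a).
    a = sorted(a)
    n = len(a)
    suffix = 0.0                       # sum of the a_i that stay as-is (those larger than x)
    for i in range(n, 0, -1):          # try the intervals from the top down
        x = (y - suffix) / i           # candidate if exactly the first i sorted elements are <= x
        lo = a[i - 1]
        hi = a[i] if i < n else float("inf")
        if lo <= x < hi:               # candidate lies in its own interval: done
            return x
        suffix += a[i - 1]
    # no interval matched: all a_i exceed x, which requires y == sum(a); any x <= min(a) works
    return a[0]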
The above method is an O(n log n) algorithm due to the sort. Given the structure of this equation-solving problem, I expect an algorithm with O(n) time complexity to exist. Is there any reference for this problem?

First of all, this only has a solution if the sum of all a_i is smaller than y. You should check this first, because the algorithm below depends on this property.
Assume that we have chosen some pivot p from all a_i and want to calculate the x that corresponds to the interval [p, q), where q is the next larger a_i. This is:
x = (y − ∑{a_i : a_i > p}) / |{a_i : a_i ≤ p}|
If you move p to the next larger a_i, x changes as follows:
x' = (y − ∑{a_i : a_i > p'}) / (n + 1) = (n ⋅ x + p') / (n + 1),
where p' is the new pivot and n is the old number of a_i that are smaller or equal to p. Under the assumption that the sum of all a_i is smaller than y, this clearly leads to a decrease of x. Similarly, if we choose a smaller p, x is increased.
Coming back to the first equation, we can observe the following: if x is smaller than p, we should choose a smaller p. If x is greater than the smallest of the a_i greater than p, we should choose a larger p. In every other case, we have found the right x.
This can be utilized in a quick select procedure. #MvG's comment brought me onto this track. All credits for the quick select idea go to him. Here is some pseudo code (modified version from Wikipedia):
findX(list, y)
    left := 0
    right := length(list) - 1
    sumGreater := 0     // the sum of all a_i greater than the current interval
    numSmaller := 0     // the number of all a_i smaller than the current interval
    minGreater := inf   // the minimum of all a_i greater than the current interval
    loop
        if left = right
            return (y - sumGreater) / (numSmaller + 1)
        pivotIndex := medianOfMedians(list, left, right)
        // the partition function will also sum the elements larger than the pivot,
        // count the elements smaller than the pivot, and find the minimum of the
        // larger elements
        (pivotIndex, partialSumGreater, partialNumSmaller, partialMinGreater)
            := partition(list, left, right, pivotIndex)
        x := (y - sumGreater - partialSumGreater) / (numSmaller + partialNumSmaller + 1)
        if (x >= list[pivotIndex] && x < min(partialMinGreater, minGreater))
            return x
        else if x < list[pivotIndex]
            right := pivotIndex - 1
            minGreater := list[pivotIndex]
            sumGreater += partialSumGreater + list[pivotIndex]
        else
            left := pivotIndex + 1
            numSmaller += partialNumSmaller + 1
The key idea is that the partitioning function gathers some additional statistics. This does not change the time complexity of the partitioning function because it requires O(n) additional operations, leaving a total time complexity of O(n) for the partitioning function. The medianOfMedians function is also linear in time. The remaining operations in the loop are constant time. Assuming that the median of medians yields good pivots, the total time of the entire algorithm is approximately O(n + n/2 + n/4 + n/8 ...) = O(n).
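As an illustration, here is a rough Python sketch of the same idea. It uses a random pivot instead of median of medians (so expected rather than guaranteed linear time) and partitions into new lists instead of in place; the names are mine:

import random

def find_x(a, y):
    # Solve sum(max(a_i, x)) = y for x. Sketch only; assumes a is non-empty.
    if y < sum(a):
        raise ValueError("no solution: need y >= sum(a_i)")
    items = list(a)
    sum_greater = 0.0            # sum of the a_i already known to lie above the answer's interval
    num_smaller = 0              # number of a_i already known to lie at or below it
    min_greater = float("inf")   # smallest a_i known to lie above it
    while items:
        pivot = random.choice(items)
        below = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        above = [v for v in items if v > pivot]
        # candidate x for the interval [pivot, next larger a_i)
        cand = (y - sum_greater - sum(above)) / (num_smaller + len(below) + len(equal))
        upper = min(above) if above else min_greater
        if pivot <= cand < upper:
            return cand
        if cand < pivot:
            # answer lies in a lower interval: pivot and everything above it stay fixed
            sum_greater += sum(above) + sum(equal)
            min_greater = pivot
            items = below
        else:
            # answer lies in a higher interval: pivot and everything below it are <= x
            num_smaller += len(below) + len(equal)
            items = above
    if num_smaller == 0:
        return min(a)            # y == sum(a_i): any x <= min(a_i) solves the equation
    return (y - sum_greater) / num_smaller

For example, find_x([1, 2, 5], 11) returns 3.0, since max(1, 3) + max(2, 3) + max(5, 3) = 11.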

Since comments might get deleted, I'm turning my own comments into a coherent answer. Contrary to the original question, I'm using indices 1 through n, avoiding the a0 originally used. So this is consistent one-based indexing using inclusive indices.
Assume for the moment that bi are the coefficients from your input, but in sorted order, so bi ≤ bi+1. As you essentially already wrote, if bi ≤ x ≤ bi+1 then the result is i ⋅ x + bi+1 + ⋯ + bn since the first i terms will use the x and the other terms will use the bj. Solving for x you get x = (y − bi+1 − ⋯ - bn) / i and putting that back into your inequality you have i ⋅ bi ≤ y − bi+1 − ⋯ − bn ≤ i ⋅ bi+1. Concentrating on one of the inequalities, you want the largest i such that
i ⋅ bi ≤ y − bi+1 − ⋯ − bn       (subsequently called “the inequality”)
But in order to make this work on unsorted ai, you'd need something similar to the median of medians. That is an algorithm which achieves O(n) guaranteed worst-case behavior for the problem of selecting a median, where the typical quickselect would take O(n²) in the worst case although it usually does quite well in practice.
Actually your problem is not that different from quickselect. You can pick a pivot coefficient, and split the remainder into larger and smaller values. Then you evaluate the inequality for the pivot element. If it is satisfied, you recurse into the list of larger elements, otherwise you recurse into the list of smaller elements, until at some point you have two adjacent elements, one which satisfies the inequality and one which does not.
This is O(n²) in the worst case, since you might need O(n) recursive calls, each of them taking O(n) time to process its input. Just like the O(n²) quickselect itself is suboptimal. The median-of-medians approach shows that the median-selection problem can indeed be solved in O(n). So we either need to find a similar solution here, or reformulate this problem in terms of finding the median, or write some algorithm which makes use of the median in a reasonable way.
Actually Nico Schertler found a way to achieve that last option: Take the algorithm I outlined above, but choose the pivot element to be the median. That way you can guarantee that each recursive call will process at most half as much input as the previous call. Since the median of medians itself is O(n) this can be done without exceeding the O(n) bound for each recursive call.
So in pseudocode it's like this (using inclusive indices throughout):
# f: Process whole problem with coefficients a_1 through a_n
f(y, a, n) := begin
    if y < (sum of a_i for i from 1 through n):   # O(n)
        throw Error "Cannot satisfy equation"     # Or omit check and risk division by zero
    return g(a, 1, n, y)                          # O(n)
end

# g: Recursively process part of the problem, namely a_l through a_r
# Precondition: we know inequality holds for i = l - 1 and fails for i = r + 1
# a: the array as provided to f; will get modified in place
# l: left index (inclusive)
# r: right index (inclusive)
# y: (original y) - (sum of a_j for j from r + 1 through n)
g(a, l, r, y) := begin                            # process a_l through a_r                 O(r-l)
    if r < l:                                     # inequality holds in r but fails in l    O(1)
        return y / r                              # compute x for the case of i = r         O(1)
    m = median(a, l, r)                           # computed using median of medians        O(r-l)
    i = floor((l + r) / 2)                        # index of median, with same tie breaks   O(1)
    partition(a, l, r, m)                         # so a_l…a_(i-1) ≤ a_i=m ≤ a_(i+1)…a_r    O(r-l)
    rhs = y - (sum of a_j for j from i + 1 to r)  # O((r-l)/2)
    if i * a_i ≤ rhs:                             # condition holds, check larger i
        return g(a, i + 1, r, y)                  # recurse in right half of list           O((r-l)/2)
    else:                                         # condition fails, check smaller i
        return g(a, l, i - 1, rhs - m)            # recurse in left half of list            O((r-l)/2)
end

Related

How do we find the logsumexp of all subsets (or some approximation to this)?

I have a set of numbers n_1, n_2, ..., n_k. I need to find the sum (or mean, it's the same) of the logsumexp of all possible subsets of this set of numbers. Is there a way to approximate this or compute it exactly?
Note: the logsumexp of a, b, c is log(e^a + e^b + e^c) (exp, followed by sum, followed by log).
I don’t know if this will be accurate enough, but log sum exp is sort of a smooth analog of max, so one possibility would be to sort so that n1 ≥ n2 ≥ … ≥ nk and return ∑i (2^(k−i) / (2^k − 1)) ni, which under-approximates the true mean by an error term between 0 and log k.
You could also use a sample mean of the log sum exp of {ni} ∪ (a random subset of {ni+1, ni+2, …, nk}) instead of ni in the sum. By using enough samples, you can make the approximation as good as you like (though obviously at some point it’s cheaper to evaluate with brute force).
(I’m assuming that the empty set is omitted.)
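For illustration, here is a small Python sketch of this weighting, next to the exponential brute force for checking it on tiny inputs (function names are mine):

import math
from itertools import combinations

def approx_mean_logsumexp(nums):
    # Mean of the subset maxima: the i-th largest value is the max of exactly 2^(k-i) nonempty subsets.
    # By the argument above, this under-approximates the true mean by between 0 and log k.
    k = len(nums)
    ns = sorted(nums, reverse=True)
    total = 2 ** k - 1
    return sum((2 ** (k - i) / total) * ns[i - 1] for i in range(1, k + 1))

def exact_mean_logsumexp(nums):
    # Brute force over all nonempty subsets; exponential, only for sanity checks on small k.
    subsets = [c for r in range(1, len(nums) + 1) for c in combinations(nums, r)]
    lse = lambda s: math.log(sum(math.exp(v) for v in s))
    return sum(lse(s) for s in subsets) / len(subsets)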
On a different tack, here’s a deterministic scheme whose additive error is at most ε. As in my other answer, we sort n1 ≥ … ≥ nk, define an approximation f(i) of the mean log sum exp over nonempty subsets whose minimum index is i, and evaluate ∑i (2^(k−i) / (2^k − 1)) f(i). If each f(i) is within ε, then so is the result.
Fix i, and for 1 ≤ j ≤ k−i define d_j = n_(i+j) − n_i. Note that d_j ≤ 0. We define f(i) = n_i + g(i), where g(i) will approximate the mean of log(1 + sum of exp) over subsets of {d_j}.
I don’t want to write the next part formally, since I believe that would be harder to understand. The idea is that, with log sum exp being commutative and associative, we speed up the brute-force algorithm that initializes the list of results as [0] and then, for each j, doubles the size of the list by appending the log sum exp with d_j of each of its current elements. I claim that, if we round each intermediate result down to the nearest multiple of δ = ε/k, then the result will under-approximate by at most ε. By switching from the list to its histogram, we can do each step in time proportional to the number of distinct entries. Finally, we set g(i) to the histogram average.
To analyze the running time: in the worst case we have d_j = 0 for all j, making the largest possible result log k. This means that the list can have at most (log k)/δ + 1 = (k log k)/ε + 1 entries, making the total running time O(k³ (log k) / ε). (This can undoubtedly be improved slightly with a faster convolution algorithm and/or by rounding the d_j and taking advantage of that.)
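Here is a rough, unoptimized Python sketch of that histogram scheme as I described it (assumes ε > 0; intermediate values are stored as integer indices of multiples of δ = ε/k):

import math
from collections import defaultdict

def mean_logsumexp_histogram(nums, eps):
    # Deterministic approximation of the mean logsumexp over nonempty subsets,
    # with additive under-approximation at most eps (each of the <= k steps loses at most eps/k).
    k = len(nums)
    ns = sorted(nums, reverse=True)          # n1 >= n2 >= ... >= nk
    delta = eps / k
    total = 2 ** k - 1                       # number of nonempty subsets
    result = 0.0
    for i in range(k):                       # subsets whose largest element is ns[i]
        hist = defaultdict(int)
        hist[0] = 1                          # bucket 0 holds the value 0 = log(1), for the empty tail
        for j in range(i + 1, k):
            d = ns[j] - ns[i]                # d <= 0
            new_hist = defaultdict(int)
            for b, c in hist.items():
                new_hist[b] += c                                   # subsets that skip ns[j]
                w = math.log(math.exp(b * delta) + math.exp(d))    # log-sum-exp with d
                new_hist[math.floor(w / delta)] += c               # round down to a bucket
            hist = new_hist
        count = sum(hist.values())           # = 2^(k-1-i) subsets with maximum ns[i]
        g = sum(b * delta * c for b, c in hist.items()) / count
        result += (count / total) * (ns[i] + g)
    return result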

Find The number of sides of triangle

I am given an array A of integers. I have to choose any two integers from A and a third integer from the range [L,R], such that all three integers form a valid triangle.
I have to find the number of integers in the range [L,R] which can be used to form a valid triangle by choosing any two values from array A.
I know that if I know two sides, then the third side must be in the range a-b < x < a+b,
where a and b are any two integers from A.
How can I find the number of valid integers in [L,R] in O(N) time, where N is the size of A?
L and R can be very large, up to 10^20.
To my understanding, complete enumeration will yield an efficient solution by the following argument. The maximum number of possible solutions occurs in the case where any selection of elements of A yields a consistent choice of side lengths for a triangle. Such an input, for instance, would be an input of N times the value 1. However, the number of triples chosen from A can be bounded by
(n choose 3) = n!/(3!(n-3)!)
= n!/(6(n-3)!)
= (1/6)*(n-2)*(n-1)*n
<= n^3
(where choose is meant to denote the binomial coefficient) which is a polynomial number of choices. Any choice can be checked for validity in constant time, as only 3 values are involved.
Now the contest has ended, so here is my way to solve it in the contest.
The problem asks how many numbers x in [L,R] can form a triangle with some pair (a_i, a_j) in A.
The naive method is to brute-force over all pairs, which is O(N^2 * (R-L+1)).
But indeed, we do not need to test all O(N^2) pairs; we only need to test O(N) pairs IF A is sorted, namely all adjacent pairs (a_(i-1), a_i) for i > 0.
Why? Because in sorted A:
If there is some pair (a_j, a_i) with j < i-1 (so a_j <= a_(i-1)) which can form a triangle with x,
then (a_(i-1), a_i) must form a triangle with x too:
a_(i-1) + a_i >= a_j + a_i > x
x + a_(i-1) >= x + a_j > a_i
x + a_i > a_(i-1), since a_i >= a_(i-1) and x > 0
Therefore checking all adjacent pairs (a_(i-1), a_i) is sufficient; this is the first core idea for solving this problem.
So now we have an O(N lg N + N*(R-L+1)) algorithm.
For the original contest problem this is still too slow, as R-L+1 can be as large as 10^18, so we need another trick.
Note that by the triangle inequality, for a pair (a_(i-1), a_i) we can directly find the range of x which can form a triangle with this pair:
a_(i-1) + a_i > x > a_i - a_(i-1) (by the first two inequalities above)
For example, (4,5) can form a triangle with all 1 < x < 9.
So instead of iterating over all x in [L,R], we find the range of x for each pair, which can be done in O(N) overall since we know the range for one pair in O(1). Beware that x must also fall in the range [L,R].
After that we have O(N) ranges / segments of x; we take their union, and the size of the resulting set is the desired answer.
Taking the union of O(N) segments can be done easily in O(N) as well; I leave that to you as homework :)
Combining both tricks, the algorithm is O(N lg N + N + N) = O(N lg N).
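Here is a rough Python sketch combining both tricks (the function name is mine; A contains integers):

def count_valid_third_sides(A, L, R):
    # For each adjacent pair of sorted A the valid third sides are the integers x with
    # a[i] - a[i-1] < x < a[i-1] + a[i]; clip each range to [L, R], then count their union.
    a = sorted(A)                                # O(N log N)
    segments = []
    for i in range(1, len(a)):
        lo = max(L, a[i] - a[i - 1] + 1)         # smallest integer strictly above the difference
        hi = min(R, a[i - 1] + a[i] - 1)         # largest integer strictly below the sum
        if lo <= hi:
            segments.append((lo, hi))
    segments.sort()                              # union of inclusive integer ranges
    count = 0
    cur_lo = cur_hi = None
    for lo, hi in segments:
        if cur_hi is None or lo > cur_hi + 1:    # disjoint from the current merged range
            if cur_hi is not None:
                count += cur_hi - cur_lo + 1
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        count += cur_hi - cur_lo + 1
    return count

For example, count_valid_third_sides([4, 5], 1, 10) returns 7, matching the integers 2..8 from the (4,5) example above.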

How can I find a faster algorithm for this special case of Longest Common Sub-sequence (LCS)?

I know the LCS problem needs time ~O(mn), where m and n are the lengths of the two sequences X and Y respectively. But my problem is a little bit easier, so I expect a faster algorithm than ~O(mn).
Here is my problem:
Input:
a positive integer Q, and two sequences X = x1, x2, x3, ..., xn and Y = y1, y2, y3, ..., yn, both of length n.
Output:
True, if the length of the LCS of X and Y is at least n - Q;
False, otherwise.
The well-known algorithm costs O(n^2) here, but actually we can do better than that, because whenever we have eliminated as many as Q elements in either sequence without finding a common element, the result must be False. Someone said there should be an algorithm as good as O(Q*n), but I cannot figure it out.
UPDATE:
Already found an answer!
I was told I can just calculate a diagonal band of the table c[i,j], because |i-j| > Q means there are already more than Q unmatched elements in one of the sequences. So we only need to calculate c[i,j] when |i-j| <= Q.
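A minimal Python sketch of that banded DP (cells outside the band are treated as unreachable, which is safe for the yes/no question; assumes both sequences have length n):

def lcs_at_least(X, Y, Q):
    # True iff LCS(X, Y) >= n - Q, filling only the cells with |i - j| <= Q.
    # Any alignment through a cell with |i - j| > Q already leaves more than Q
    # elements of one sequence unmatched, so such cells can be treated as -infinity.
    n = len(X)
    c = {}
    def get(i, j):
        if i == 0 or j == 0:
            return 0
        return c.get((i, j), float("-inf"))
    for i in range(1, n + 1):
        for j in range(max(1, i - Q), min(n, i + Q) + 1):
            if X[i - 1] == Y[j - 1]:
                c[(i, j)] = get(i - 1, j - 1) + 1
            else:
                c[(i, j)] = max(get(i - 1, j), get(i, j - 1))
    return get(n, n) >= n - Q

This touches O(n*Q) cells, which gives the O(Q*n) bound mentioned above.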
Here is one possible way to do it:
1. Let's assume that f(prefix_len, deleted_cnt) is the leftmost position in Y such that prefix_len elements of X were already processed and exactly deleted_cnt of them were deleted. Obviously, there are only O(N * Q) states because deleted_cnt cannot exceed Q.
2. The base case is f(0, 0) = 0 (nothing was processed, thus nothing was deleted).
3. Transitions:
a) Remove the current element: f(i + 1, j + 1) = min(f(i + 1, j + 1), f(i, j)).
b) Match the current element with the leftmost possible element from Y that is equal to it and located after f(i, j) (let's assume that it has index pos): f(i + 1, j) = min(f(i + 1, j), pos).
4. So the only question remaining is how to get the leftmost matching element located to the right of a given position. Let's precompute the following pairs: (position in Y, element of X) -> the leftmost occurrence in Y, to the right of this position, of an element equal to this element of X, and put them into a hash table. It looks like O(n^2), but it is not: for a fixed position in Y, we never need to go further to the right of it than by Q + 1 positions. Why? If we go further, we skip more than Q elements! So we can use this fact to examine only O(N * Q) pairs and get the desired time complexity. When we have this hash table, finding pos during step 3 is just one hash table lookup. Here is pseudo code for this step:
map = EmptyHashMap()
for i = 0 ... n - 1:
    for j = i + 1 ... min(n - 1, i + q + 1):
        map[(i, Y[j])] = min(map[(i, Y[j])], j)
Unfortunately, this solution uses hash tables so it has O(N * Q) time complexity on average, not in the worst case, but it should be feasible.
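Here is a rough Python sketch of this solution (my names; f[d] stores how many elements of Y have been consumed, and both sequences are assumed to have length n):

def lcs_at_least_leftmost(X, Y, Q):
    # True iff LCS(X, Y) >= n - Q, using the f(prefix_len, deleted_cnt) formulation above.
    n = len(X)
    INF = float("inf")
    # next_occ[(p - 1, ch)] = leftmost index j >= p with Y[j] == ch, looking at most Q + 1 positions ahead
    next_occ = {}
    for i in range(-1, n):
        for j in range(i + 1, min(n, i + Q + 2)):
            key = (i, Y[j])
            if key not in next_occ:
                next_occ[key] = j
    # f[d] = minimum number of Y elements consumed after processing a prefix of X with d deletions
    f = [0] + [INF] * Q
    for i in range(n):
        g = [INF] * (Q + 1)
        for d in range(Q + 1):
            if f[d] == INF:
                continue
            if d + 1 <= Q:                        # transition a): delete X[i]
                g[d + 1] = min(g[d + 1], f[d])
            pos = next_occ.get((f[d] - 1, X[i]))  # transition b): match X[i] as far left as possible
            if pos is not None:
                g[d] = min(g[d], pos + 1)
        f = g
    return any(v != INF for v in f)

Both the precomputation and the DP are O(N * Q), matching the analysis above.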
You can also view this as an edit-distance problem: the cost of making the strings equal must not be greater than Q; if it is greater than Q, then the answer must be False. (EDIT DISTANCE PROBLEM)
Suppose the size of string x is m and the size of string y is n. Then we create a two-dimensional array d[0..m][0..n], where d[i][j] denotes the edit distance between the i-length prefix of x and the j-length prefix of y.
The computation of array d is done using dynamic programming, which uses the following recurrence:
d[i][0] = i , for i <= m
d[0][j] = j , for j <= n
d[i][j] = d[i - 1][j - 1], if x[i] == y[j],
d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + 1), otherwise.
The answer for the LCS, if m > n, is m - dp[m][m - n].
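For reference, a plain Python version of that recurrence (the full O(m*n) table, without the banded optimization discussed above):

def edit_distance(x, y):
    # Standard edit-distance DP following the recurrence above.
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[m][n]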

What is the numerical complexity of computing the empirical cdf of an array?

It's all in the title. Suppose $X$ is an array of n floats. The empirical CDF is the function (of t):
Fn(t) = (1/n) sum{1{Xi <= t} : i=1,...,n}
This has to be computed for t_1 < t_2 < ... < t_m (i.e. for m different, sorted values of t). My question is: what is the numerical complexity of computing this? I think O(n log(n)) + O(m log(n)) [sort the array, then perform m binary searches, one for each value of t]
but I may be naive. Can anyone confirm?
Edit:
Sorry for the mess. While writing the question, I realized that I was imposing some constraints that are not in the original problem. I respond to Yves's question below.
The Xi are not sorted.
The t_j are sorted and equi-spaced.
m is smaller than n, but not by orders of magnitude: typically m ~ n/4.
The given expression, a sum of N 0/1 terms, is clearly O(N).
UPDATE:
If the Xi are presorted, the function is trivially CDFi = CDF(Xi) = i/N, and the computation is in a way O(0)!
If the Xi are unsorted, you'll need to sort first in O(N.Log(N)), unless the range of the variable allows a faster sorting such as Counting sort.
If you only need to evaluate at a small number of points, say K, then you can consider using the naïve summation, as K.N can beat N.Log(N).
UPDATE: (second change by the OP)
Else, sort the Xi if necessary and sort the tj if necessary. Then a single linear pass will suffice. Total complexity will be one of:
O(n.Log(n) + m.Log(m))
O(n.Log(n) + m)
O(n + m.Log(m))
O(n + m).
If m < Log(n) and the Xi are unsorted, use the naïve formula. Complexity O(m.n).
Possibly there could be better options when m>n.
UPDATE: final specs: Xi unsorted, Tj sorted, m < n.
The solution I would choose is as follows:
1) Sort the Xi.
2) "Merge" the sorted Xi and Tj. This means, progress simultaneously in the X and T lists, keeping two running indexes; make sure to always increment the index that causes the shortest move; use CDF(Tj)=i/n. This is a linear process. (Very close to a merge in mergesort.)
Global complexity is O(n.Log(n)), the merging term O(n) being absorbed in the former.
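In Python, that merge step might look like this (a rough sketch; the Tj are assumed sorted ascending):

def ecdf_at_points(X, T):
    # Sort X once, then sweep through the sorted query points in a single merge-like pass.
    Xs = sorted(X)              # O(n log n)
    n = len(Xs)
    result = []
    i = 0                       # number of X values known to be <= the current t
    for t in T:
        while i < n and Xs[i] <= t:
            i += 1
        result.append(i / n)    # CDF(t) = (number of X <= t) / n
    return result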
UPDATE: uniform sampling.
When the Tj values are equi-spaced, let Tj = T0 + D.j, you can use a histogram approach.
Allocate an array of m+1 counters, initially 0. For every Xi, compute a bin index as Floor((Xi - T0) / D). Clamp negative values to 0 and values larger than m to m. Increment that bin. In the end, every bin will tell you how many X values are in the range [Tj, Tj+1).
Compute the prefix sum of the counters. They will now tell you how many X values are smaller than Tj+1, and CDF(j) = Counter[j]/n.
[Caution, this is an unchecked sketch, can be wrong in details.]
Total computation will take n bin incrementations followed by a prefix sum on m elements, i.e. O(n) operations.
# Input data
X = [0.125, 6, 3.25, 9, 1.4375, 6, 3.125, 7]
n = len(X)
# Sampling points (1 to 6)
T0 = 1
DT = 1
m = 6
# Initialize the counters: O(m)
C = [0] * m
# Accumulate the histogram: O(n)
for x in X:
    i = max(0, int((x - T0) / DT))
    if i < m:
        C[i] += 1
# Compute the prefix sum: O(m)
for i in range(m - 1):
    C[i + 1] += C[i]
# Reduce: O(m)
for i in range(m):
    C[i] /= float(n)
# Display
print("T=", C)

Output:
T= [0.25, 0.25, 0.5, 0.5, 0.5, 0.75]
A CDF Fn(t) is always a non-decreasing function in [0..1]. Therefore I assume your notation is saying to count the number of elements Xi <= t and return that count divided by n.
Thus if t is very large, you have n/n = 1. For very small, it's 0/n = 0 as we'd expect.
This is a poor definition of an empirical CDF. See, for example, Law, Averill M., Simulation & Modeling, 4th ed., p. 301 for some more advanced ideas.
The simplest efficient way to compute your function (given that m, the number of Fn(t) values you need, is unknown) is first to sort the inputs Xi. This requires O(n log n) time, but needs to be done only once no matter how many t values you're processing.
Let's call the sorted values Yi. To find the count of Yi values <= t is the same as finding i such that Yi <= t < Yi+1. This can be done by binary search in O(log n) time for a given value of t. Divide by n and you have the Fn(t) value required. Of course you can repeat this m times to get the job done in O(m log n) time.
However you say your special case is m presorted values of t_j. You can find all the i values with a single pass over the Yi and simultaneously over the t_j, in the fashion of the merge operation in mergesort. With this you find all the answers in O(m + n) time.
Putting this together with the sorting cost, you have O(m + n + n log n) = O(m + n log n).
Note this is always faster than using the binary search lookup m times, O(n log n + m log n) = O((m + n) log n).
The only case you'd want to skip the presorting is when m < O(log n). This is because with no presorting, processing all the t_j needs O(mn) time - you must touch all n elements to count the number <= t_j. Consequently, if m < O(log n), then skipping the presort leads to less than O(n log n), i.e. asymptotically faster than the presort method.

Maximum Subrectangle for very special matrices

While working on an image processing task I have come across the following problem: There are n points in the unit square with coordinates $x_i$ and $y_i$, each assigned a positive or negative weight $w_i$. Find a rectangle such that the sum of all the weights of the points lying within the rectangle is positive and maximal.
By defining a proper grid, the problem can be rephrased as finding a submatrix in an n-by-n matrix A whose sum of elements is maximal. This is also known as the "maximal subrectangle problem" and has been discussed on SO before. While a brute force approach has a run-time of O(n^5), there is a kind of tricky solution with a run-time of O(n^3). It utilizes a solution for the corresponding one-dimensional problem, called "maximal subarray problem", with an O(n) run-time.
I have implemented both algorithms in R and can solve 100s of points in a few seconds. But with thousands of points it will be much too slow, probably even when outsourcing the loops to some Fortran or C code.
Now look at the matrix A. Assuming (without loss of generality) that all points have distinct x- and y-coordinates, A has a special form: in each row and column of A there is exactly one non-zero element. For matrices with this special property I assume there should be an algorithm performing the task in O(n^2) time, or even better.
Here is an example with the optimal rectangle added:
set.seed(723)
N <- 50; w <- rnorm(N)
x <- runif(N); y <- runif(N)
clr <- ifelse (w >= 0, "blue", "red")
plot(x, y, pch = 20, col = clr, xlim = c(0, 1), ylim = c(0, 1))
rect(0.075, 0.45, 0.31, 0.95, border="gray")
You see that there can be red, i.e. negative, points in the optimal rectangle. It also shows that it will not suffice to solve the one-dimensional cases for the x- and y-coordinates separately.
I will translate the standard solution into Fortran, but I would surely like to have a more efficient algorithm at hand.
These guys (found from the wiki page) claim to have a simpler sub-cubic solution for the 2-dimensional case. It may be the one you're already aware of.
See the accepted answer for "Maximum sum subrectangle in a sparse matrix". For an nxn matrix with m non-zero elements, the solution there takes O(nm log n) time. So, for you, since you have exactly n non-zero elements, this would give O(n^2 log n) time. Probably you'll be able to handle cases with n being 50 times larger or more, vs. the standard O(n^3) solution.
The best I can do is O(n^2 log n).
If we look at the (n+1 choose 2) calls made by Kadane's 2D algorithm to Kadane's 1D algorithm on an input of your type, all but O(n) successive pairs are on 1D arrays that differ only in one element. I'm going to present a divide-and-conquer variant of Kadane's 1D; by caching the outcomes of each recursive call, only the O(log n) calls that involve the changed array element have to be recomputed, reducing the (amortized) running time of the inner loop from Theta(n) to Theta(log n).
def maxsubarray(arr, a, b):
    # this function returns a 4-tuple
    # element 0 is the max over intervals of the form [i, j)
    # element 1 is the max over intervals of the form [i, b)
    # element 2 is the max over intervals of the form [a, j)
    # element 3 is the max over intervals of the form [a, b), i.e., sum(arr[a:b])
    n = b - a
    if n == 0:
        return (0, 0, 0, 0)
    elif n == 1:
        x = arr[a]
        y = max(x, 0)
        return (y, y, y, x)
    else:
        m = a + n // 2
        l = maxsubarray(arr, a, m)
        r = maxsubarray(arr, m, b)
        return (max(l[0], r[0], l[1] + r[2]),
                max(r[1], l[1] + r[3]),
                max(l[2], l[3] + r[2]),
                l[3] + r[3])
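For instance, on the classic example array the overall maximum (element 0 of the returned tuple) comes out as expected:

arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
best, best_suffix, best_prefix, total = maxsubarray(arr, 0, len(arr))
print(best)   # 6, from the subarray [4, -1, 2, 1]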
