Maximizing the overall sum of K disjoint and contiguous subsets of size L among N positive numbers - algorithm

I'm trying to find an algorithm to find K disjoint, contiguous subsets of size L of an array x of real numbers that maximize the sum of the elements.
Spelling out the details, X is a set of N positive real numbers:
X = {x[1], x[2], ..., x[N]} where x[j] >= 0 for all j = 1, ..., N.
A contiguous subset of length L called S[i] is defined as L consecutive members of X starting at position n[i] and ending at position n[i]+L-1:
S[i] = {x[j] | j=n[i],n[i]+1,...,n[i]+L-1} = {x[n[i]],x[n[i]+1],...,x[n[i]+L-1]}.
Two such subsets S[i] and S[j] are called pairwise disjoint (non-overlapping) if |n[i]-n[j]| >= L. In other words, they share no members of X.
Define the summation of the members of each subset:
SUM[i] = x[n[i]]+x[n[i]+1]+...+x[n[i]+L-1];
The goal is to find K contiguous, disjoint (non-overlapping) subsets S[1],S[2],...,S[K] of length L such that SUM[1]+SUM[2]+...+SUM[K] is maximized.

This can be solved by dynamic programming. Let M[i] be the best solution when considering only the first i elements of x. Then:
M[i] = 0 for i < L
M[i] = max(M[i-1], M[i-L] + x[i-L+1] + x[i-L+2] + ... + x[i])
The solution to your problem is M[N].
When you code it, you can incrementally compute the sum (or simply pre-compute all the sums) leading to an O(N) solution in both space and time.
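For concreteness, here is a minimal Python sketch of this recurrence (the names are mine; it uses a pre-computed prefix-sum array and 0-based indexing):

def best_sum_disjoint_windows(x, L):
    # Maximum total over any number of disjoint length-L windows of x.
    n = len(x)
    prefix = [0] * (n + 1)                   # prefix[i] = x[0] + ... + x[i-1]
    for i in range(n):
        prefix[i + 1] = prefix[i] + x[i]
    M = [0] * (n + 1)                        # M[i] = best value on the first i elements
    for i in range(L, n + 1):
        window = prefix[i] - prefix[i - L]   # sum of x[i-L .. i-1]
        M[i] = max(M[i - 1], M[i - L] + window)
    return M[n]

print(best_sum_disjoint_windows([1, 2, 3, 4], 2))   # 10: windows [1,2] and [3,4]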
If you have to find exactly K subsets, you can extend this, by defining M[i, k] to be the optimal solution with k subsets on the first i elements. Then:
M[i, k] = 0 for i < k * L or k = 0.
M[i, k] = max(M[i-1, k], M[i-L, k-1] + x[i-L+1] + ... + x[i])
The solution to your problem is M[N, K].
This is a 2d dynamic programming solution, and has time and space complexity of O(NK) (assuming you use the same trick as above for avoiding re-computing the sum).
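And a matching sketch of the exactly-K variant (again, the names are mine and indexing is 0-based). It marks infeasible states with -infinity so the "exactly k subsets" reading is explicit; with nonnegative inputs this yields the same final value as initialising them to 0, as in the recurrence above:

def best_sum_k_disjoint_windows(x, L, K):
    # Maximum total over exactly K disjoint length-L windows of x.
    n = len(x)
    if K * L > n:
        return None                          # K windows cannot fit
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + x[i]
    NEG = float("-inf")
    # M[k][i]: best total using exactly k windows within the first i elements.
    M = [[0] * (n + 1)] + [[NEG] * (n + 1) for _ in range(K)]
    for k in range(1, K + 1):
        for i in range(k * L, n + 1):
            window = prefix[i] - prefix[i - L]
            M[k][i] = max(M[k][i - 1], M[k - 1][i - L] + window)
    return M[K][n]

print(best_sum_k_disjoint_windows([1, 2, 3, 4], 2, 1))   # 7: the single window [3,4]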

Related

Get highest score in this game: choosing and removing elements in an array

Given an array arr of n integers, what is the highest score that a player can reach, playing the following game?
Choose an index 0 < i < n-1 in the array
Add arr[i-1] * arr[i+1] points to the score (initially the score is 0)
Shrink the array by removing element i (for all j >= i: arr[j] = arr[j+1]); then n = n - 1
Repeat steps 1-3 until n == 2. The two remaining elements are the first and the last, because those can never be removed.
What is the highest score you can get ?
Example
arr = [1 2 3 4]
Choose i=2, get: 2*4 = 8 points, remove 3
Remaining: arr = [1 2 4]
Choose i=1, get 1*4 = 4 points, remove 2
Remaining: arr = [1 4].
The sum of points is 8 + 4 = 12, which is the highest possible score on this example.
I think it is related to Dynamic programming but I'm not sure how to solve it.
This problem has a dynamic programming approach similar to the matrix-chain multiplication problem. You can find further explanation in the book "Introduction to Algorithms", 3rd edition (Cormen, page 370).
Let's find the optimal substructure property and then use it to construct an optimal solution to the problem from optimal solutions to subproblems.
Notation: C[i..j], where i ≤ j, stands for the elements C[i], C[i+1], ..., C[j].
Definition: A removal sequence for C[i..j] is a permutation of i+1, i+2, ..., j-1.
A removal sequence for C[i..j] is optimal if the score achieved by removing the elements of C[i..j] in that order is maximum among all possible removal sequences for C[i..j].
1. Characterize the structure of an optimal solution
If the problem is nontrivial, i.e. i + 1 < j, then any solution has a last removed element, whose index k lies in the range i < k < j. Such a k splits the problem into C[i..k] and C[k..j]. That is, for some value k, we first remove the non-extremal elements of C[i..k] and C[k..j], and then we remove element k. Since removing the non-extremal elements of C[i..k] does not affect the score obtained by removing the non-extremal elements of C[k..j], and the analogous statement holds in the other direction, the two subproblems are independent. Then, for a given removal sequence in which the k-th element is last, the score of C[i..j] equals the sum of the scores of C[i..k] and C[k..j], plus the score of removing the k-th element (C[i] * C[j]).
The optimal substructure of this problem is as follows. Suppose there is an optimal removal sequence O for C[i..j] that ends at the k-th element; then the ordering of the elements removed from C[i..k] must be optimal too. We can prove this by contradiction: if there were a removal sequence for C[i..k] that scored higher than the removal subsequence extracted from O for C[i..k], then we could produce another removal sequence for C[i..j] with a higher score than the optimal removal sequence (a contradiction). A similar observation holds for the ordering of the elements removed from C[k..j] in the optimal removal sequence for C[i..j]: it must be optimal too.
We can build an optimal solution for nontrivial instances of the problem by splitting the problem into two subproblems, finding optimal solutions to the subproblem instances, and then combining these optimal subproblem solutions.
2. Recursively define the value of an optimal solution.
For this problem our subproblems are the maximum scores obtained in C[i..j] for 1 ≤ i ≤ j ≤ N. Let S[i, j] be the maximum score obtained in C[i..j]; for the full problem, the highest score under the given rules is S[1, N].
We can define S[i, j] recursively as follows:
If j ≤ i + 1, then S[i, j] = 0.
If i + 1 < j, then S[i, j] = max over i < k < j of { S[i, k] + S[k, j] + C[i] * C[j] }.
We are guaranteed to find the correct place to split because we consider all possible split points, so the optimal one is certainly among those examined.
3. Compute the value of an optimal solution
You can use your favorite method to compute S:
top-down approach (recursive)
bottom-up approach (iterative)
I would use bottom-up for computing the solution since it would be < 5 lines long in almost any programming language.
Example in C++11:
for (int l = 2; l <= N; ++l)                     // intervals of increasing length (l = j - i)
    for (int i = 1, j = i + l; j <= N; ++i, ++j)
        for (int k = i + 1; k < j; ++k)
            S[i][j] = std::max(S[i][j], S[i][k] + S[k][j] + C[i] * C[j]);
4. Time Complexity and Space Complexity
There are C(n, 2) + n = Θ(n²) subproblems, and each subproblem does work that takes Θ(l) time, where l is the length of the subproblem's interval, so the math yields a running time of Θ(n³) for the algorithm (it's easy to spot the O(n³) part: three nested loops :-)). The algorithm also requires Θ(n²) space to store the S table.
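For completeness, here is a small self-contained Python version of the same recurrence (0-indexed; the function name is mine), checked against the example from the question:

def highest_score(arr):
    # s[i][j] = best score for removing all elements strictly between i and j (0-indexed).
    n = len(arr)
    s = [[0] * n for _ in range(n)]
    for length in range(2, n):                 # length = j - i, from small to large
        for i in range(n - length):
            j = i + length
            s[i][j] = max(s[i][k] + s[k][j] + arr[i] * arr[j]
                          for k in range(i + 1, j))
    return s[0][n - 1]

print(highest_score([1, 2, 3, 4]))   # 12, matching the example above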

Finding best algorithm for sum of a section of an array's values

Given an array of n integers in the locations A[1], A[2], …, A[n], describe an O(n^2) time algorithm to
compute the sum A[i] + A[i+1] + … + A[j] for all i, j, 1 ≤ i < j ≤ n.
I've tried multiple ways of solving this problem, but none of them ran in O(n^2) time.
So for an array containing {1,2,3,4}
You would output:
1+2 = 3
1+2+3 = 6
1+2+3+4 = 10
2+3 = 5
2+3+4 = 9
3+4 = 7
The answer does not need to be in a specific language, pseudocode is preferred.
Good preparation is everything.
You could create an array of integrals:
I[0..n] = (0, I[0] + A[1], I[1] + A[2], ..., I[n-1]+A[n]);
This will cost you O(n) * O(1) (looping over all elements and doing one addition);
Now you can calculate each Sum(A, i, j) with just a single subtraction, I[j] - I[i-1], so each query is O(1).
Looping over all combinations of i and j with 1 <= (i,j) <= n has O(n^2).
So you end up with O(n) * O(1) + O(n^2) * O(1) = O(n^2) .
Edit:
Your array A starts at index 1; the solution above is adapted to this, which also resolves the little quirk with I[i-1].
So the integral array I starts at index 0 and is one element larger than A.
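Here is a minimal Python sketch of the above (0-indexed here, so the subtraction becomes I[j+1] - I[i]; the function name is mine):

def all_range_sums_prefix(a):
    # Integral/prefix array: I[i] = a[0] + ... + a[i-1], so sum(a[i..j]) = I[j+1] - I[i].
    n = len(a)
    I = [0] * (n + 1)
    for i in range(n):                  # O(n) preparation
        I[i + 1] = I[i] + a[i]
    for i in range(n):                  # O(n^2) pairs, O(1) work per pair
        for j in range(i + 1, n):
            print(I[j + 1] - I[i])

all_range_sums_prefix([1, 2, 3, 4])     # prints 3, 6, 10, 5, 9, 7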
Edit:
First, you may have thought of the most naive idea:
Naive idea
Create a function that for given values of i and of j will return the sum A[i] + ... + A[j].
function sumRange(A, i, j):
    sum = 0
    for k = i to j:
        sum = sum + A[k]
    return sum
Then generate all pairs of i and j (with i < j) and call the above function for each pair:
for i = 1 to n
    for j = i+1 to n
        output sumRange(A, i, j)
This is not O(n²): the two loops over i and j already represent O(n²) iterations, and then the function performs yet another loop, making the whole thing O(n³).
Better idea
The above can be improved. Look at the repetition it performs: the sum that was calculated for given values of i and j can be reused when j increases by 1, without starting from scratch and summing the values between i and (now) j-1 all over again, only to add one more value to the result.
We should just remember what the previous sum was, and add A[j] to it.
So without a separate function:
for i = 1 to n
    sum = A[i]
    for j = i+1 to n
        sum = sum + A[j]
        output sum
Note how the sum is not reset to 0 once it is output. It is preserved, so that when j is incremented, only one value needs to be added to it.
Now it is O(n²). Note also how it does not require an extra array for storage. It only needs the memory for a few variables (i, j, sum), so its space complexity is O(1).
As the number of sums you need to output is O(n²), there is no way to improve this time complexity any further.
NB: I assume here that single array values do not constitute a "sum". As you stated in your question, i < j, and also in your example you only showed sums of at least two array values. The above can be easily adapted to also include single value "sums" if ever that were needed.
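For reference, the same improved loop in Python (0-indexed; the function name is mine), reproducing the sums from the example:

def all_range_sums_running(a):
    # Same O(n^2) output with O(1) extra space: extend the previous sum by one element.
    for i in range(len(a)):
        total = a[i]
        for j in range(i + 1, len(a)):
            total += a[j]
            print(total)

all_range_sums_running([1, 2, 3, 4])    # prints 3, 6, 10, 5, 9, 7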

Finding median in merged array of two sorted arrays

Assume we have 2 sorted arrays of integers with sizes of n and m. What is the best way to find median of all m + n numbers?
It's easy to do this with log(n) * log(m) complexity. But I want to solve this problem in log(n) + log(m) time. Is there any suggestion for how to do that?
Explanation
The key point of this problem is to recursively discard half of A or B at each step, by comparing the medians of the remaining parts of A and B:
if (aMid < bMid) keep [aMid + 1 ... n] and [bLeft ... m]
else             keep [bMid + 1 ... m] and [aLeft ... n]
// where n and m are the lengths of arrays A and B
As in the following, the time complexity is O(log(m + n)):
public double findMedianSortedArrays(int[] A, int[] B) {
    int m = A.length, n = B.length;
    int l = (m + n + 1) / 2;
    int r = (m + n + 2) / 2;
    return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
}

public double getkth(int[] A, int aStart, int[] B, int bStart, int k) {
    if (aStart > A.length - 1) return B[bStart + k - 1];   // A is exhausted
    if (bStart > B.length - 1) return A[aStart + k - 1];   // B is exhausted
    if (k == 1) return Math.min(A[aStart], B[bStart]);

    int aMid = Integer.MAX_VALUE, bMid = Integer.MAX_VALUE;
    if (aStart + k/2 - 1 < A.length) aMid = A[aStart + k/2 - 1];
    if (bStart + k/2 - 1 < B.length) bMid = B[bStart + k/2 - 1];

    if (aMid < bMid)
        return getkth(A, aStart + k / 2, B, bStart, k - k / 2); // Check: aRight + bLeft
    else
        return getkth(A, aStart, B, bStart + k / 2, k - k / 2); // Check: bRight + aLeft
}
Hope it helps! Let me know if you need more explanation on any part.
Here's a very good solution I found on Stack Overflow, originally in Java. It's a method of finding the Kth and (K+1)th smallest items in the two arrays, where K is the center of the merged array.
If you have a function for finding the Kth item of two arrays then finding the median of the two is easy;
Calculate the weighted average of the Kth and Kth+1 items of X and Y
But then you'll need a way to find the Kth item of two lists (note: K is one-indexed here, while the arrays are zero-indexed);
If X contains zero items then the Kth smallest item of X and Y is the Kth smallest item of Y
Otherwise if K == 1 then the smallest item of X and Y is the smaller of the smallest items of X and Y (min(X[0], Y[0]))
Otherwise;
i. Let A be min(length(X), K / 2)
ii. Let B be min(length(Y), K / 2)
iii. If X[A] > Y[B], then recurse from step 1 with X, Y' (all elements of Y from B to the end of Y) and K' = K - B; otherwise recurse with X' (all elements of X from A to the end of X), Y, and K' = K - A
If I find the time tomorrow, I will verify that this algorithm works in Python as stated and provide the example source code; it may have some off-by-one errors as-is.
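In that spirit, here is a rough Python sketch of the steps above (0-based array indexing with a one-indexed K, as noted; the function names are mine, and I added a symmetric base case for an empty Y):

def kth_smallest(x, y, k):
    # k-th smallest (1-indexed) element of the merged x and y; both sorted ascending.
    if not x:
        return y[k - 1]
    if not y:                                  # symmetric base case, added for safety
        return x[k - 1]
    if k == 1:
        return min(x[0], y[0])
    a = min(len(x), k // 2)                    # prefix of x we may discard
    b = min(len(y), k // 2)                    # prefix of y we may discard
    if x[a - 1] > y[b - 1]:
        return kth_smallest(x, y[b:], k - b)   # first b items of y can't be the answer
    else:
        return kth_smallest(x[a:], y, k - a)   # first a items of x can't be the answer

def median(x, y):
    n = len(x) + len(y)
    return (kth_smallest(x, y, (n + 1) // 2) + kth_smallest(x, y, (n + 2) // 2)) / 2.0

print(median([1, 3], [2]))                     # 2.0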
Take the median element of list A and call it a. Compare a to the central elements of list B; let's call them b1 and b2 (if B has odd length, then exactly where you split B depends on your definition of the median of an even-length list, but the procedure is almost identical regardless). If b1 ≤ a ≤ b2, then a is the median of the merged array. This can be done in constant time since it requires exactly two comparisons.
If a is greater than b2, then we add the top half of A to the top of B and repeat. B will no longer be sorted, but that doesn't matter. If a is less than b1, then we add the bottom half of A to the bottom of B and repeat. This iterates at most log(n) times (stopping sooner if the median is found, of course).
It is possible that this will not find the median. If this is the case then the median is in B. If so, perform the same algorithm with A and B reversed. This will require log(m) iterations. In total you will have performed at most 2*(log(n)+log(m)) iterations of a constant time operation, so you have solved the problem in order log(n)+log(m) time.
This is essentially the same answer as was given by iehrlich, but written out more explicitly.
Yes, this can be done. Given two arrays, A and B, in the worst-case scenario you have to first perform a binary search in A, and then, if it fails, binary search in B looking for the median. On each step of a binary search, you check if the current element is actually a median of a merged A+B array. Such check takes constant time.
Let's see why such check is constant. For simplicity, let's assume that |A| + |B| is an odd number, and that all numbers in both arrays are different. You can remove these restrictions later by applying the usual median definition approach (i.e., how to calculate the median of an array containing duplicates, or of an array with even length). Anyway, given that, we know for sure, that in the merged array there will be (|A| + |B| - 1) / 2 elements to the right and to the left of an actual median. In the process of a binary search in A, we know the index of current element x in array A (let it be i). Now, if x satisfies the condition B[j] < x < B[j+1], where i + j == (|A| + |B| - 1) / 2, then x is your median.
The overall complexity is O(log(max(|A|, |B|)) time and O(1) memory.
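As an illustration, here is a minimal Python sketch of just the constant-time check described above, under the same simplifying assumptions (odd total length, distinct values, both arrays sorted; 0-based indices and the function name are mine):

def is_median_candidate(a, b, i):
    # Constant-time check: is a[i] the median of the merged arrays?
    half = (len(a) + len(b) - 1) // 2   # elements that must sit left of the median
    j = half - i                        # how many of them must come from b
    if j < 0 or j > len(b):
        return False
    left_ok = j == 0 or b[j - 1] < a[i]
    right_ok = j == len(b) or a[i] < b[j]
    return left_ok and right_ok

print(is_median_candidate([1, 3, 5], [2, 4], 1))   # True: 3 is the median of 1..5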

Find longest sequences with sufficient average score

I have a long list of scores between 0 and 1. How do I efficiently find all contiguous sublists longer than x elements such that the average score in each sublist is not less than y?
E.g., how do I find all contiguous sublists longer than 300 elements such that the average score of these sublists is not less than 0.8?
I'm mainly interested in the LONGEST sublists that fulfill these criteria, not actually all sublists. So I'm looking for all longest sublists.
If you want only the longest such sublists, this can be solved in O(n log n) time by transforming the problem slightly and then binary-searching over maximum solution lengths.
Let the input list of scores be x[1], ..., x[n]. Let's transform this list by subtracting y from each element, to form the list z[1], ..., z[n], whose elements may be positive or negative. Notice that any sublist x[i .. j] has average score at least y if and only if the sum of elements in the corresponding sublist in z (i.e., z[i] + z[i+1] + ... + z[j]) is at least 0. So, if we had a way to compute the maximum sum T of any sublist in z[] efficiently (spoiler: we do), this would, as a side effect, tell us if there is any sublist in x[] that has average score at least y: if T >= 0 then there is at least 1 such sublist, while if T < 0 then there is no sublist in x[] (not even a single-element sublist) that has average score at least y. But this doesn't yet give us all the information we need to answer your original question, since nothing forces the maximum-sum sublist in z to have maximum length: it could well be that a longer sublist exists that has lower overall average, while still having average at least y.
This can be addressed by generalising the problem of finding the sublist with maximum sum: instead of asking for a sublist with maximum sum overall, we will now ask for a sublist having maximum sum among all sublists having length at least some given k. I'll now describe an algorithm that, given a list of numbers z[1], ..., z[n], each of which can be positive or negative, and any positive integer k, will compute the maximum sum of any sublist of z[] having length at least k, as well as the location of a particular sublist that achieves this sum, and has longest possible length among all sublists having this sum. It's a slight generalisation of Kadane's algorithm.
FindMaxSumLongerThan(z[], k):
    v = 0                  # Sum of the rightmost k numbers in the current sublist
    For i from 1 to k:
        v = v + z[i]
    best = v
    bestStart = 1
    bestEnd = k
    # Now for each i, with k+1 <= i <= n, find the biggest sum ending at position i.
    tail = -1              # Will contain the maximum sum among all lists ending at i-k
    tailLen = 0            # The length of the longest list having the above sum
    For i from k+1 to n:
        If tail >= 0:
            tail = tail + z[i-k]
            tailLen = tailLen + 1
        Else:
            tail = z[i-k]
            tailLen = 1
        If tail >= 0:
            nonnegTail = tail
            nonnegTailLen = tailLen
        Else:
            nonnegTail = 0
            nonnegTailLen = 0
        v = v + z[i] - z[i-k]          # Slide the window right 1 position
        If v + nonnegTail > best:
            best = v + nonnegTail
            bestStart = i - k - nonnegTailLen + 1
            bestEnd = i
The above algorithm takes O(n) time and O(1) space, returning the maximum sum in best and the beginning and ending positions of some sublist that achieves that sum in bestStart and bestEnd, respectively.
How is the above useful? For a given input list x[], suppose we first transform x[] into z[] by subtracting y from each element as described above; this will be the z[] passed into every call to FindMaxSumLongerThan(). We can view the value of best that results from calling the function with z[] and a given minimum sublist length k as a mathematical function of k: best(k). Since FindMaxSumLongerThan() finds the maximum sum of any sublist of z[] having length at least k, best(k) is a nonincreasing function of k. (Say we set k=5 and found that the maximum sum of any sublist is 42; then we are guaranteed to find a total of at least 42 if we try again with k=4 or k=3.) That means we can binary-search on k to find the largest k such that best(k) >= 0: that k is then the length of the longest sublist of x[] that has average value at least y. The resulting bestStart and bestEnd identify one particular sublist having this property; it's easy to modify the algorithm to find all (at most n -- one per rightmost position) of these sublists without increasing the time complexity.
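Here is a sketch of that binary search in Python, assuming find_max_sum_longer_than is a faithful port of the pseudocode above that returns (best, bestStart, bestEnd); the driver takes it as a parameter to make that assumption explicit:

def longest_sublist_with_average(x, y, find_max_sum_longer_than):
    # Binary search over the minimum length k; best(k) is nonincreasing in k.
    z = [v - y for v in x]              # average >= y  <=>  z-sublist sum >= 0
    lo, hi, answer = 1, len(x), None
    while lo <= hi:
        k = (lo + hi) // 2
        best, start, end = find_max_sum_longer_than(z, k)
        if best >= 0:
            answer = (start, end)       # a sublist of length >= k qualifies
            lo = k + 1                  # so try even longer ones
        else:
            hi = k - 1
    return answer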
I think the general solution is always O(N^2). I will demonstrate code in Python along with some optimizations that can improve performance by several orders of magnitude.
Let's generate some data:
from random import random
scores_list = [random() for i in range(10000)]
scores_len = len(scores_list)
Let's say these are our target values:
# Your average
avg = 0.55
# Your min length
min_len = 10
Here is a naive brute-force solution:
res = []
for i in range(scores_len - min_len):
    for j in range(i + min_len, scores_len):
        l = scores_list[i:j]
        if sum(l) / (j - i) >= avg:
            res.append(l)
That will run very slowly, because it has to perform on the order of 10000^2 (10^8) iterations, each of which slices and sums a sublist.
Here is how we can do better. It is still quadratic in the worst case, but there are some tricks which allow it to run much, much faster:
res = []
i = 0
while i < scores_len - min_len:
    j = i + min_len
    di = scores_len
    dj = 0
    current_sum = sum(scores_list[i:j])
    while j < scores_len:
        current_sum += sum(scores_list[j-dj:j])
        current_avg = current_sum / (j - i)
        if current_avg >= avg:
            res.append(scores_list[i:j])
            dj = 1
            di = 1
        else:
            dj = max(1, int((avg * (j - i) - current_sum) / (1 - avg)))
            di = min(di, max(1, int(((j - i) * avg - current_sum) / avg)))
        j += dj
    i += di
For a uniform distribution (which we have here) and for the given target values, it performs fewer than 10^6 operations (about 7 * 10^5), which is two orders of magnitude less than the brute-force solution.
So basically, if you have only a few target sublists, it will perform very well. And if you have a lot of them, this algorithm will be about the same as the brute-force one.

The expected number of inversions--From Introduction to Algorithms by Cormen

Let A[1 .. n] be an array of n distinct numbers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A. (See Problem 2-4 for more on inversions.) Suppose that each element of A is chosen randomly, independently, and uniformly from the range 1 through n. Use indicator random variables to compute the expected number of inversions.
The problem is from exercise 5.2-5 in Introduction to Algorithms by Cormen. Here is my recursive solution:
Suppose x(i) is the number of inversions in a[1..i], and E(i) is the expected value of x(i), then E(i+1) can be computed as following:
Imagine we have i+1 positions in which to place all the numbers. If we place i+1 in the first position, then x(i+1) = i + x(i); if we place i+1 in the second position, then x(i+1) = i-1 + x(i), ..., so E(i+1) = 1/(i+1) * sum(k) + E(i), where k ranges over [0, i]. Finally we get E(i+1) = i/2 + E(i).
Because we know that E(2) = 0.5, recursively we get: E(n) = (n-1 + n-2 + ... + 2)/2 + 0.5 = n(n-1)/4.
Although the deduction above seems right, I am still not completely sure of it, so I am sharing it here.
If something is wrong, please correct me.
All the solutions seem to be correct, but the problem says that we should use indicator random variables. So here is my solution using the same:
Let Eij be the event that i < j and A[i] > A[j].
Let Xij = I{Eij} = { 1 if (i, j) is an inversion of A,
                     0 if (i, j) is not an inversion of A }
Let X = Σ(i=1 to n)Σ(j=1 to n)(Xij) = No. of inversions of A.
E[X] = E[Σ(i=1 to n)Σ(j=1 to n)(Xij)]
= Σ(i=1 to n)Σ(j=1 to n)(E[Xij])
= Σ(i=1 to n)Σ(j=1 to n)(P(Eij))
= Σ(i=1 to n)Σ(j=i + 1 to n)(P(Eij)) (as we must have i < j)
= Σ(i=1 to n)Σ(j=i + 1 to n)(1/2) (we can choose the two numbers in C(n, 2) ways and arrange them as required, so P(Eij) = C(n, 2) / (n(n-1)) = 1/2)
= Σ(i=1 to n)((n - i)/2)
= n(n - 1)/4
Another solution is even simpler, IMO, although it does not use "indicator random variables".
Since all of the numbers are distinct, every pair of elements is either an inversion (i < j with A[i] > A[j]) or a non-inversion (i < j with A[i] < A[j]). Put another way, every pair of numbers is either in order or out of order.
So for any given permutation, the total number of inversions plus non-inversions is just the total number of pairs, or n*(n-1)/2.
By symmetry of "less than" and "greater than", the expected number of inversions equals the expected number of non-inversions.
Since the expectation of their sum is n*(n-1)/2 (constant for all permutations), and they are equal, they are each half of that or n*(n-1)/4.
[Update 1]
Apparently my "symmetry of 'less than' and 'greater than'" statement requires some elaboration.
For any array of numbers A in the range 1 through n, define ~A as the array you get when you subtract each number from n+1. For example, if A is [2,3,1], then ~A is [2,1,3].
Now, observe that for any pair of numbers in A that are in order, the corresponding elements of ~A are out of order. (Easy to show because negating two numbers exchanges their ordering.) This mapping explicitly shows the symmetry (duality) between less-than and greater-than in this context.
So, for any A, the number of inversions equals the number of non-inversions in ~A. But for every possible A, there corresponds exactly one ~A; when the numbers are chosen uniformly, both A and ~A are equally likely. Therefore the expected number of inversions in A equals the expected number of inversions in ~A, because these expectations are being calculated over the exact same space.
Therefore the expected number of inversions in A equals the expected number of non-inversions. The sum of these expectations is the expectation of the sum, which is the constant n*(n-1)/2, or the total number of pairs.
[Update 2]
A simpler symmetry: For any array A of n elements, define ~A as the same elements but in reverse order. Associate the element at position i in A with the element at position n+1-i in ~A. (That is, associate each element with itself in the reversed array.)
Now any inversion in A is associated with a non-inversion in ~A, just as with the construction in Update 1 above. So the same argument applies: The number of inversions in A equals the number of inversions in ~A; both A and ~A are equally likely sequences; etc.
The point of the intuition here is that the "less than" and "greater than" operators are just mirror images of each other, which you can see either by negating the arguments (as in Update 1) or by swapping them (as in Update 2). So the expected number of inversions and non-inversions is the same, since you cannot tell whether you are looking at any particular array through a mirror or not.
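A quick Python check of this duality on the example from Update 1 (the helper names are mine):

def inversions(a):
    return sum(a[i] > a[j] for i in range(len(a)) for j in range(i + 1, len(a)))

def non_inversions(a):
    return sum(a[i] < a[j] for i in range(len(a)) for j in range(i + 1, len(a)))

a = [2, 3, 1]
na = [len(a) + 1 - v for v in a]           # the ~A of Update 1; here [2, 1, 3]
print(inversions(a), non_inversions(na))   # prints "2 2": the counts agree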
Even simpler (similar to Aman's answer above, but perhaps clearer) ...
Let Xij be a random variable with Xij=1 if A[i] > A[j] and Xij=0 otherwise.
Let X=sum(Xij) over i, j where i < j
Number of pairs (i, j)*: n(n-1)/2
Probability that Xij = 1, i.e. Pr(Xij = 1): 1/2
By linearity of expectation**: E(X) = E(sum(Xij))
= sum(E(Xij))
= sum(Pr(Xij=1))
= n(n-1)/2 * 1/2
= n(n-1)/4
* I think of this as the size of the upper triangle of a square matrix.
** All sums here are over i, j, where i < j.
I think it's right, but the proper way to prove it is to use conditional expectation:
for all X and Y we have: E[X] = E[E[X|Y]]
then in your case :
E(i+1) = E[x(i+1)] = E[E[x(i+1) | x(i)]] = E[SUM(k)/(1+i) + x(i)] = i/2 + E[x(i)] = i/2 + E(i)
About the second statement, if
E(n) = n(n-1)/4
then E(n+1) = (n+1)n/4 = (n-1)n/4 + 2n/4 = (n-1)n/4 + n/2 = E(n) + n/2.
So n(n-1)/4 satisfies the recursion relation for all n >= 2, and it matches the base case n = 2.
So E(n) = n(n-1)/4.
Hope I understood your problem and it helps
Using indicator random variables:
Let X = random variable which is equal to the number of inversions.
Let Xij = 1 if A[i] and A[j] form an inversion pair, and Xij = 0 otherwise.
Number of inversion pairs = Sum over 1 <= i < j <= n of (Xij)
Now P[Xij = 1] = P[A[i] > A[j]] = (n choose 2) / (2! * n choose 2) = 1/2
E[X] = E[sum over all ij pairs such that i < j of Xij] = sum over all ij pairs such that i < j of E[Xij] = n(n - 1) / 4
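As an empirical sanity check of the n(n-1)/4 result, here is a small Monte Carlo estimate in Python (it samples uniformly random permutations of 1..n, matching the distinct-numbers assumption used above; the parameters are arbitrary):

import random

def count_inversions(a):
    return sum(a[i] > a[j] for i in range(len(a)) for j in range(i + 1, len(a)))

n, trials = 8, 20000
total = 0
for _ in range(trials):
    a = list(range(1, n + 1))
    random.shuffle(a)                   # a uniformly random permutation of 1..n
    total += count_inversions(a)

print(total / trials)                   # empirically close to n*(n-1)/4
print(n * (n - 1) / 4)                  # 14.0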
