Problem: For an ordered set of edges E of a complete graph Kn, given an edge Ei, find the edge's vertices (v, w)_Ei.
Note: This is likely not a problem specific to graph theory, although it was chosen to express the problem solely because of familiarity. Apologies for any incorrect notation introduced.
Suppose that, from a complete graph K5 with vertices 1, 2, 3, 4, 5, we construct an ordered set E of the graph's edges, totalling 10 edges. The set E is known to always be ordered as follows:
Ei = (v, w), where 0 < v < n and v < w <= n
E1 = (1, 2)
E2 = (1, 3)
E3 = (1, 4)
E4 = (1, 5)
E5 = (2, 3)
E6 = (2, 4)
E7 = (2, 5)
E8 = (3, 4)
E9 = (3, 5)
E10 = (4, 5)
For any given Ei, we must now find the vertices (v, w)_Ei using i alone. For example, given 6 we should obtain (2, 4).
Update:
Another, perhaps simpler way of expressing this problem is:
n = 5
i = 0
for v = 1 to n - 1
    for w = v + 1 to n
        i++
        print "E" + i + " = " + v + ", " + w
print "E6 = " + findV(6) + ", " + findW(6)
How is this done?
To solve the problem in closed form, we need the formula for the sum of the first k numbers: 1 + 2 + ... + k = (k + 1) * k / 2. This gives us a mapping from an edge (i, j) to its index:
from math import ceil, sqrt

def edge_to_index(edge, n):
    i, j = edge
    return n * (i - 1) + j - i * (i + 1) // 2
We can derive the inverse mapping:
def index_to_edge(k, n):
    # Invert the mapping by solving the quadratic for i, then recover j.
    b = 1.0 - 2 * n
    i = int(ceil((-b - sqrt(b**2 - 8 * k)) / 2))
    j = k - n * (i - 1) + i * (i + 1) // 2
    return (i, j)
A test:
n = 5
print("Edge to index and index to edge:")
for i in range(1, n + 1):
    for j in range(i + 1, n + 1):
        k = edge_to_index((i, j), n)
        print((i, j), "->", k, "->", index_to_edge(k, n))
The output:
Edge to index and index to edge:
(1, 2) -> 1 -> (1, 2)
(1, 3) -> 2 -> (1, 3)
(1, 4) -> 3 -> (1, 4)
(1, 5) -> 4 -> (1, 5)
(2, 3) -> 5 -> (2, 3)
(2, 4) -> 6 -> (2, 4)
(2, 5) -> 7 -> (2, 5)
(3, 4) -> 8 -> (3, 4)
(3, 5) -> 9 -> (3, 5)
(4, 5) -> 10 -> (4, 5)
Let me restate the question I think you're asking so that if this is totally off-topic, you can let me know:
Given an integer k and the series (1, 2), (1, 3), ..., (1, k), (2, 3), (2, 4), ..., (2, k), (3, 4), ..., (k - 1, k) and an index n, return the value of the nth term of this series.
Here's a simple algorithm to solve this problem that is probably not asymptotically optimal. Notice that the first (k - 1) of the pairs start with 1, the next (k - 2) start with 2, the next (k - 3) with 3, etc. To determine the first element of the pair, keep adding these block sizes (k - 1) + (k - 2) + ... for as long as the running total stays strictly below your index. The number of terms you were able to add, plus one, gives you your first number:
E1 = (1, 2)
E2 = (1, 3)
E3 = (1, 4)
E4 = (1, 5)
E5 = (2, 3)
E6 = (2, 4)
E7 = (2, 5)
E8 = (3, 4)
E9 = (3, 5)
E10 = (4, 5)
Here, k = 5. To find the first number of term 8, we first add k - 1 = 4, which is less than eight. We then add k - 2 = 3 to get 7, which is still less than eight. However, adding k - 3 = 2 would give us nine, which is greater than eight, and so we stop. We added two numbers together, and so the first number must be a 3.
Once we know what the first number is, getting the second number is easy. While finding the first number, we were essentially listing the running totals at which the first number changes. In the above case, that series is 0, 4, 7; adding one to each gives 1, 5, 8, which are indeed the indices of the first pairs starting with 1, 2, and 3, respectively. So once you know the first number, you also know the index at which pairs with that first number begin, and subtracting that starting index from your index tells you, zero-indexed, how many steps you've taken past that first pair. Since the second value of that first pair is one plus the first number, the second value you want is the first number, plus one, plus the number of steps your index lies beyond the first pair starting with that number. In our case, we are looking at index 8 and the first pair starting with a three is at position 8, so the second number is 3 + 1 + 0 = 4, and our pair is (3, 4).
This algorithm is actually pretty fast. Given any k, it takes at most k steps to complete, and so runs in O(k). Compare this to the naive approach of scanning everything, which takes O(k^2).
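For reference, here is a minimal Python sketch of that walk-and-subtract procedure; the function name and the 1-based convention are my own choices, not from the question:

def index_to_pair(index, k):
    # 1-based index into the series (1,2), (1,3), ..., (k-1,k).
    first = 1
    start = 1            # index of the first pair that begins with `first`
    block = k - 1        # how many pairs begin with `first`
    while index >= start + block:
        start += block
        block -= 1
        first += 1
    second = first + 1 + (index - start)
    return (first, second)

print(index_to_pair(8, 5))   # (3, 4)
print(index_to_pair(6, 5))   # (2, 4)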
To make my life easier, I'm going to do my math 0-based, not 1-based as in your question.
First, we derive a formula for the index of the term (v,v+1) (the first that starts with v). This is just the arithmetic sum of n-1 + n-2 + ... + n-v, which is v(2n-v-1)/2.
So to find v given the index i, just solve the equation v(2n-v-1)/2 <= i for the largest integral v. Binary search would work well, or you could solve the quadratic using the quadratic formula and round down (maybe, have to think if that ends up working).
Finding the W is easy given V:
findW(i):
    v = findV(i)
    i_v = v(2n - v - 1)/2
    return v + (i - i_v) + 1
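A quick Python sketch of this approach (0-based, as above), using binary search for findV as suggested; the function names are mine:

def find_v(i, n):
    # Largest v with v * (2n - v - 1) / 2 <= i, found by binary search.
    lo, hi = 0, n - 2
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid * (2 * n - mid - 1) // 2 <= i:
            lo = mid
        else:
            hi = mid - 1
    return lo

def find_w(i, n):
    v = find_v(i, n)
    i_v = v * (2 * n - v - 1) // 2
    return v + (i - i_v) + 1

# 0-based check against the table above: index 5 (i.e. E6) gives (1, 3), which is (2, 4) 1-based.
print(find_v(5, 5), find_w(5, 5))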
Well, the simple way is to loop through and subtract values corresponding to the first vertex, as follows (in python):
def unpackindex(i, n):
    for v in range(1, n):
        if v + i <= n:
            return (v, v + i)
        i -= n - v
    raise IndexError("bad index")
If you're looking for a closed-form formula, rather than an algorithm, you will need to do a square root at some point, so it is likely to be messy and somewhat slow (though not as slow as the above loop, for large enough n...). For moderate values of n, you might want to consider a precomputed lookup table, if performance is important.
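For the lookup-table idea, a minimal sketch (my own code, assuming the 1-based ordering from the question):

def build_edge_table(n):
    # Enumerate the edges once, in the same order as the question.
    return [(v, w) for v in range(1, n) for w in range(v + 1, n + 1)]

edges = build_edge_table(5)
print(edges[6 - 1])   # 1-based index 6 -> (2, 4)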
Related
There are shuffle algorithms like Fisher-Yates. They take an array and return one with the elements in random order. This runs in O(n).
What I'm trying to do is to implement a prioritized left-shuffle algorithm. What does that mean?
Prioritized: It does not take an array of values. It takes an array of value-probability pairs. E.g. [ (1, 60), (2, 10), (3, 10), (4, 20) ]. Value 1 has 60%, value 2 has 10%, ...
left-shuffle: The higher the probability of a value, the higher its chances to be far on the left of the array.
Let's take this example [ (1, 10), (2, 10), (3, 60), (4, 20) ]. The most probable result should be [ 3, 4, 1, 2 ] or [ 3, 4, 2, 1 ].
I tried implementing this, but I haven't found any solution in O(n).
O(n^2) pseudocode based on Fisher-Yates:
sum = 100 #100%
for i = 0 to n-2:
    r = random value between 0 and sum
    localsum = 0
    for j = i to n-1:
        localsum = localsum + pair[j].Probability
        if localsum >= r + 1:
            swap(i, j)
            break
    sum = sum - pair[i].Probability
What could probably improve this a bit: sorting the elements in decreasing order of probability right at the beginning, to minimize the number of swaps and the iterations of the inner loop.
Is there a better solution (maybe even in O(n))?
Update of my first answer:
I've found a paper that introduces 'Roulette-wheel selection via stochastic acceptance' with O(1) selection. This brings the whole algorithm down to O(n) and it is simple to implement:
from random import randint
from random import random
import time
data = [ (1, 10), (2, 10), (3, 60), (4, 20) ]
def swap(i, j, array):
    array[j], array[i] = array[i], array[j]

def roulette_wheel_selection(data, start, max_weight_limit):
    while True:
        r = random()
        r_index = randint(start, len(data) - 1)
        if r <= data[r_index][1] / max_weight_limit:
            return r_index

def shuffle(data, max_weight):
    data = data.copy()
    n = len(data)
    for i in range(n-1):
        r_index = roulette_wheel_selection(data, i, max_weight)
        swap(i, r_index, data)
    return data

def performance_test(iterations, data):
    start = time.time()
    max_weight = max([item[1] for item in data])
    for i in range(iterations):
        shuffle(data, max_weight)
    end = time.time()
    print(len(data), ': ', end - start)
    return end - start

performance_test(1000, data)

data2 = []
for i in range(10):
    data2 += data
performance_test(1000, data2)

data3 = []
for i in range(100):
    data3 += data
performance_test(1000, data3)

data4 = []
for i in range(1000):
    data4 += data
performance_test(1000, data4)
Performance Output
4 : 0.09153580665588379
40 : 0.6010794639587402
400 : 5.142168045043945
4000 : 50.09365963935852
So it's linear time in n (the data size). Compared with my first answer, I changed the constant from the "updated sum" to the "maximum weight of all data items". Of course, the runtime still depends on that max_weight constant; if someone has a strategy to update max_weight properly as items are placed, performance would improve further.
There’s a way to do this in time O(n log n) using augmented binary search trees. The idea is the following. Take the items you want to shuffle and add them to a binary search tree, each annotated with its associated weight. Then, for each node in the BST, calculate the total weight of all the nodes in the subtree rooted at that node. For example, the annotated weight of the root will be 1 (the sum of all the weights, which is 1 because it's a probability distribution), the annotation at the root's left child will be the total weight of the left subtree, and the annotation at the right child will be the total weight of the right subtree.
With this structure in place, you can in time O(log n) select a random element from the tree, distributed according to your weights. The algorithm works like this. Pick a random number x, uniformly, in the range from 0 to the total weight left in the tree (initially 1, but as items are picked this will decrease). Then, start at the tree root. Let L be the weight of the tree’s left subtree and w be the weight of the root. Recursively use this procedure to select a node:
If x < L, move left and recursively select a node from there.
If L ≤ x < L + w, return the root.
If L + w ≤ x, set x := x - L - w and recursively select a node from the right subtree.
This technique is sometimes called roulette wheel selection, in case you want to learn more about it.
Once you’ve selected an item from the BST, you can then delete that item from the BST to ensure you don’t pick it again. There are techniques that ensure that, after removing the node from the tree, you can fix up the weight sums of the remaining nodes in the tree in time O(log n) so that they correctly reflect the weights of the remaining items. Do a search for augmented binary search tree for details about how to do this. Overall, this means that you’ll spend O(log n) work sampling and removing a single item, which summed across all n items gives an O(n log n)-time algorithm for generating your shuffle.
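As an illustration of the same O(log n) select-and-remove idea (using a Fenwick tree over the weights rather than an explicit BST; the class and all names below are my own, and weights are assumed positive):

import random

class WeightedSampler:
    # Fenwick (binary indexed) tree over the weights: weighted pick-and-remove in O(log n) each.
    def __init__(self, weights):
        self.n = len(weights)
        self.weights = list(weights)
        self.tree = [0.0] * (self.n + 1)
        for i, w in enumerate(weights, 1):
            self._add(i, w)

    def _add(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def total(self):
        s, i = 0.0, self.n
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s

    def pop(self):
        # Find the item whose cumulative-weight interval contains a uniform draw x.
        x = random.random() * self.total()
        pos, bit = 0, 1 << self.n.bit_length()
        while bit:
            nxt = pos + bit
            if nxt <= self.n and self.tree[nxt] <= x:
                x -= self.tree[nxt]
                pos = nxt
            bit >>= 1
        self._add(pos + 1, -self.weights[pos])   # remove the chosen item's weight
        return pos                               # 0-based index of the chosen item

def weighted_shuffle(items, weights):
    sampler = WeightedSampler(weights)
    return [items[sampler.pop()] for _ in range(len(items))]

print(weighted_shuffle([1, 2, 3, 4], [10, 10, 60, 20]))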
I’m not sure whether it’s possible to improve upon this. There is another algorithm for sampling from a discrete distribution called Vose’s alias method which gives O(1)-time queries, but it doesn’t nicely handle changes to the underlying distribution, which is something you need for your use case.
I've found a paper that introduces 'Roulette-wheel selection via stochastic acceptance' with O(1) selection. This brings the whole algorithm down to O(n) and it is simple to implement:
from random import randint
from random import random
data = [ (1, 10), (2, 10), (3, 60), (4, 20) ]
def swap(i, j, array):
    array[j], array[i] = array[i], array[j]

def roulette_wheel_selection(data, start, sum):
    while True:
        r = random()
        r_index = randint(start, len(data) - 1)
        if r <= data[r_index][1] / sum:
            return r_index

def shuffle(data):
    data = data.copy()
    n = len(data)
    sum = 100.0
    for i in range(n-1):
        r_index = roulette_wheel_selection(data, i, sum)
        swap(i, r_index, data)
        sum = sum - data[i][1]
    return data

for i in range(10):
    print(shuffle(data))
Output
[(3, 60), (4, 20), (2, 10), (1, 10)]
[(3, 60), (1, 10), (4, 20), (2, 10)]
[(3, 60), (1, 10), (4, 20), (2, 10)]
[(3, 60), (4, 20), (1, 10), (2, 10)]
[(3, 60), (4, 20), (2, 10), (1, 10)]
[(3, 60), (4, 20), (2, 10), (1, 10)]
[(3, 60), (4, 20), (2, 10), (1, 10)]
[(4, 20), (3, 60), (1, 10), (2, 10)]
[(3, 60), (2, 10), (4, 20), (1, 10)]
[(4, 20), (3, 60), (2, 10), (1, 10)]
Notice: For the best performance, roulette_wheel_selection should use the maximum remaining weight p_max, recomputed at every iteration, instead of sum. I use sum because it is easy to compute and update.
The 'Roulette-wheel selection via stochastic acceptance' answer by @StefanFenn technically answers my question.
But it has a disadvantage:
The maximum in the algorithm is calculated only once; calculating it more often would lead to performance worse than O(n). If the priorities are something like [100,000,000, 1, 2, 3], the algorithm will probably need just one iteration of the while loop in roulette_wheel_selection when it picks 100,000,000, but millions of iterations per pick once 100,000,000 has been taken.
So I want to show you a very short O(n*log(n)) solution I've found that does not depend on how large the priorities themselves are (C# code):
var n = elements.Count;
Enumerable.Range(0, n)
    .OrderByDescending(k => Math.Pow(_rng.NextDouble(), 1.0 / elements[k].Priority))
    .Select(i => elements[i].Value);
Description: Based on the collection of n elements with priorities, we create a new collection with the values 0, 1, ..., n-1. For each of them, we call Math.Pow to compute a key and order the values descending by that key (because we want the values with higher priorities on the left, not the right). Now we've got the indices 0, 1, ..., n-1 in a prioritized/weighted random order. In the last step, we select the values based on the order of these indices.
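For completeness, a rough Python sketch of the same key trick (my own illustration, not the answer's code):

import random

def prioritized_left_shuffle(pairs):
    # pairs: list of (value, priority) with priority > 0; higher priority tends to land further left.
    keyed = [(random.random() ** (1.0 / priority), value) for value, priority in pairs]
    return [value for _key, value in sorted(keyed, reverse=True)]

print(prioritized_left_shuffle([(1, 10), (2, 10), (3, 60), (4, 20)]))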
Problem
Given a list of integers, identify sequences where successive numbers exactly N indexes apart have a value equal to N multiplied by the previous number in the sequence.
Rules:
N must be greater than 1
Sequences with less than 3 entries should be ignored
Sequences returned must always be the longest possible for a given value of N
Sequences of all zeros do not count
My Solution
Iterate the list of numbers of length M
At each iteration:
1.a hold the current number and current index in current_number and current_index respectively.
1.b Calculate the maximum possible number of successive number sequences the current_number can fit in, and hold this number in nested_iteration_count.
1.c Start the nested iteration with a loop count of nested_iteration_count and N at the minimum possible value of N = 2
1.c.1 Check if a sequence exists. If it exists, store the sequence in an array
1.c.2 Increment N by 1 and repeat the loop until the inner loop iterations are complete.
Repeat outer loop for next number
Example
Consider the following list of integers:
Number:  2  10   4   3   8   6   9   9  18  27
Index:   0   1   2   3   4   5   6   7   8   9
The following sequences are found:
2, 4, 8
3, 9, 27
This algorithm obviously has O(n^2) complexity. Is it possible to improve on this?
A quick Python implementation using @user3386109's optimization.
The first stage checks whether a progression with multiplier N continues at the i-th item.
The second stage, retrieving the longest sequence for every N, could be made more concise.
res contains the longest progression for each N as {N: (count, ending_index)}, e.g. {2: (3, 4), 3: (3, 9)}.
import math

lst = [2,10,4,3,8,6,9,9,18,27]
l = len(lst)
mp = {}
mn = min(lst)
mx = max(lst)
nmax = int(math.sqrt(mx / mn))

for i in range(2, l):
    for n in range(2, min(i, (l - 1)//2, nmax) + 1):
        if lst[i - n] * n == lst[i]:
            t = (i-n, n)
            le = mp[t] if t in mp else 1
            mp[(i, n)] = le + 1

res = {}
for x in mp:
    n = x[1]
    le = mp[x]
    ending = x[0]
    if n in res:
        if res[n][0] < le:
            res[n] = (le, ending)
    else:
        res[n] = (le, ending)

print(mp)
print(res)
{(2, 2): 2, (4, 2): 3, (5, 2): 2, (6, 3): 2, (8, 2): 2, (8, 3): 2, (9, 3): 3}
{2: (3, 4), 3: (3, 9)}
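To recover the actual sequences from res, a small helper (my own addition, assuming the lst and res structures above):

def extract_sequences(lst, res):
    # res maps N -> (length, ending_index); step backwards by N to rebuild each sequence.
    out = []
    for n, (length, end) in res.items():
        out.append([lst[end - n * k] for k in range(length - 1, -1, -1)])
    return out

print(extract_sequences(lst, res))   # e.g. [[2, 4, 8], [3, 9, 27]]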
I am exploring how a Dynamic Programming design approach relates to the underlying combinatorial properties of problems.
For this, I am looking at the canonical instance of the coin change problem: Let S = [d_1, d_2, ..., d_m] and n > 0 be a requested amount. In how many ways can we add up to n using nothing but the elements in S?
If we follow a Dynamic Programming approach to design an algorithm for this problem that would allow for a solution with polynomial complexity, we would start by looking at the problem and how it is related to smaller and simpler sub-problems. This would yield a recursive relation describing an inductive step representing the problem in terms of the solutions to its related subproblems. We can then implement either a memoization technique or a tabulation technique to efficiently implement this recursive relation in a top-down or a bottom-up manner, respectively.
A recursive relation to solve this instance of the problem could be the following (Python 3.6 syntax and 0-based indexing):
def C(S, m, n):
    if n < 0:
        return 0
    if n == 0:
        return 1
    if m <= 0:
        return 0
    count_wout_high_coin = C(S, m - 1, n)
    count_with_high_coin = C(S, m, n - S[m - 1])
    return count_wout_high_coin + count_with_high_coin
This recursive relation yields the correct number of solutions when order is disregarded. However, this relation:
def C(S, n):
    if n < 0:
        return 0
    if n == 0:
        return 1
    return sum([C(S, n - coin) for coin in S])
yields the correct number of solutions when order is regarded.
I am interested in capturing more subtle combinatorial patterns through a recursion relation that can be further optimized via memoization/tabulation.
For example, this relation:
def C(S, m, n, p):
    if n < 0:
        return 0
    if n == 0 and not p:
        return 1
    if n == 0 and p:
        return 0
    if m == 0:
        return 0
    return C(S, m - 1, n, p) + C(S, m, n - S[m - 1], not p)
yields a solution disregarding order, but counting only solutions with an even number of summands. The same relation can be modified to regard order while still counting only solutions with an even number of summands:
def C(S, n, p):
    if n < 0:
        return 0
    if n == 0 and not p:
        return 1
    if n == 0 and p:
        return 0
    return sum([C(S, n - coin, not p) for coin in S])
However, what if we have more than 1 person among which we want to split the coins? Say I want to split n among 2 persons s.t. each person gets the same number of coins, regardless of the total sum each gets. From the 14 solutions, only 7 include an even number of coins so that I can split them evenly. But I want to exclude redundant assignments of coins to each person. For example, 1 + 2 + 2 + 1 and 1 + 2 + 1 + 2 are different solutions when order matters, BUT they represent the same split of coins to two persons, i.e. person B would get 1 + 2 = 2 + 1. I am having a hard time coming up with a recursion to count splits in a non-redundant manner.
(Before I elaborate on a possible answer, let me just point out that counting the splits of the coin exchange, for even n, by sum rather than coin-count would be more or less trivial since we can count the number of ways to exchange n / 2 and multiply it by itself :)
Now, if you'd like to count splits of the coin exchange according to coin count, and exclude redundant assignments of coins to each person (for example, where splitting 1 + 2 + 2 + 1 into two equal size parts is only either (1,1) | (2,2), (2,2) | (1,1) or (1,2) | (1,2) and element order in each part does not matter), we could rely on your first enumeration of partitions where order is disregarded.
However, we would need to know the multiset of elements in each partition (or an aggregate of similar ones) in order to count the possibilities of dividing them in two. For example, to count the ways to split 1 + 2 + 2 + 1, we would first count how many of each coin we have:
def partitions_with_even_number_of_parts_as_multiset(n, coins):
    results = []
    def C(m, n, s, p):
        if n < 0 or m <= 0:
            return
        if n == 0:
            if not p:
                results.append(s)
            return
        C(m - 1, n, s, p)
        _s = s[:]
        _s[m - 1] += 1
        C(m, n - coins[m - 1], _s, not p)
    C(len(coins), n, [0] * len(coins), False)
    return results
Output:
=> partitions_with_even_number_of_parts_as_multiset(6, [1,2,6])
=> [[6, 0, 0], [2, 2, 0]]
(the second entry, [2, 2, 0], represents two 1's and two 2's)
Now since we are counting the ways to choose half of these, we need to find the coefficient of x^2 in the polynomial multiplication
(x^2 + x + 1) * (x^2 + x + 1) = ... 3x^2 ...
which represents the three ways to choose two from the multiset count [2,2]:
2,0 => 1,1
0,2 => 2,2
1,1 => 1,2
In Python, we can use numpy.polymul to multiply polynomial coefficients, then look up the appropriate coefficient in the result.
For example:
import numpy

def count_split_partitions_by_multiset_count(multiset):
    coefficients = (multiset[0] + 1) * [1]
    for i in range(1, len(multiset)):
        coefficients = numpy.polymul(coefficients, (multiset[i] + 1) * [1])
    return coefficients[sum(multiset) // 2]
Output:
=> count_split_partitions_by_multiset_count([2,2,0])
=> 3
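Putting the two functions together (my own glue code, assuming the definitions above and the splitting convention described earlier):

def count_even_splits(n, coins):
    # Sum the split counts over every partition of n that uses an even number of coins.
    return sum(count_split_partitions_by_multiset_count(m)
               for m in partitions_with_even_number_of_parts_as_multiset(n, coins))

print(count_even_splits(6, [1, 2, 6]))   # 1 (six 1's split 3/3) + 3 (for 1+1+2+2) = 4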
Here is a table implementation and a little elaboration on algrid's beautiful answer. This produces an answer for f(500, [1, 2, 6, 12, 24, 48, 60]) in about 2 seconds.
The simple declaration of C(n, k, S) = sum(C(n - s_i, k - 1, S[i:])) means adding up all the ways to get to the current sum, n, using k coins. Then if we split n into all the ways it can be partitioned in two, we can just add all the ways each of those parts can be made from the same number, k, of coins.
The beauty of fixing the subset of coins we choose from to a diminishing list means that any arbitrary combination of coins will only be counted once - it will be counted in the calculation where the leftmost coin in the combination is the first coin in our diminishing subset (assuming we order them in the same way). For example, the arbitrary subset [6, 24, 48], taken from [1, 2, 6, 12, 24, 48, 60], would only be counted in the summation for the subset [6, 12, 24, 48, 60] since the next subset, [12, 24, 48, 60] would not include 6 and the previous subset [2, 6, 12, 24, 48, 60] has at least one 2 coin.
Python code:
import time

def f(n, coins):
    t0 = time.time()
    min_coins = min(coins)
    m = [[[0] * len(coins) for k in xrange(n / min_coins + 1)] for _n in xrange(n + 1)]
    # Initialize base case
    for i in xrange(len(coins)):
        m[0][0][i] = 1
    for i in xrange(len(coins)):
        for _i in xrange(i + 1):
            for _n in xrange(coins[_i], n + 1):
                for k in xrange(1, _n / min_coins + 1):
                    m[_n][k][i] += m[_n - coins[_i]][k - 1][_i]
    result = 0
    for a in xrange(1, n + 1):
        b = n - a
        for k in xrange(1, n / min_coins + 1):
            result = result + m[a][k][len(coins) - 1] * m[b][k][len(coins) - 1]
    total_time = time.time() - t0
    return (result, total_time)

print f(500, [1, 2, 6, 12, 24, 48, 60])
Say there's a matrix with N rows and M columns.
You start the traversal at the bottom left, with current points P = 0 and some space S > 0. Each coordinate in the matrix is either empty or contains points. If the points have size X and value V, you can choose whether or not to pick them up when you reach that coordinate.
For traversing the matrix, we can only go up by one row and choose from one of the three columns (i.e. (i + 1, j − 1), (i + 1, j), or (i + 1, j + 1))
Picking up the points increases P by V and decreases S by X.
I'm trying to write a dynamic programming algorithm that would traverse this and return the best path resulting in the largest number of points.
I figure the subproblems are:
L(N, j) = Null
L(i, 0) = max(L(i + 1, 0), L(i + 1, 1))
L(i, j) = max(L(i + 1, j − 1), L(i + 1, j), L(i + 1, j + 1))
L(i, M) = max(L(i + 1, M − 1), L(i + 1, M))
Would that work? How would I go about introducing this to an algorithm?
You can do it in two ways:
1. A recursive function that covers all the boundary/edge conditions.
For example, the function would look like this:
function L(i, j, p, s):
    if (i < 0 or j < 0) return -1      # edge case
    if (s < 0) return -1               # out of space
    if (p < 0) return 0                # no possible outcome
    if (s == 0) return p               # the end
    else return max{ L(i + 1, j - 1, p + v, s - x),
                     L(i + 1, j,     p + v, s - x),
                     L(i + 1, j + 1, p + v, s - x) }
2. Iterative (bottom-up), tracking the (s < 0) condition, where you maintain a table of the best P and remaining S values at each step.
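As an illustration only, here is a memoized Python sketch of the recursive idea that also allows skipping the points at a cell; the grid representation and all names are my assumptions, not the answer's:

from functools import lru_cache

def best_points(grid, start_col, space):
    # grid[i][j] is None (empty) or a (value, size) pair; row 0 is the bottom row.
    n_rows, n_cols = len(grid), len(grid[0])

    @lru_cache(maxsize=None)
    def go(i, j, s):
        if j < 0 or j >= n_cols:
            return float('-inf')              # stepped off the side of the matrix
        if i >= n_rows:
            return 0                          # walked past the top row: nothing more to collect
        best = float('-inf')
        for nj in (j - 1, j, j + 1):          # the three allowed moves to the row above
            best = max(best, go(i + 1, nj, s))                  # option 1: don't pick up here
            if grid[i][j] is not None:
                v, x = grid[i][j]
                if x <= s:                                      # option 2: pick up, if space allows
                    best = max(best, v + go(i + 1, nj, s - x))
        return best

    return go(0, start_col, space)

# Tiny example: 2 rows, 2 columns, start at the bottom-left with space 5.
grid = [[(3, 2), None],      # bottom row
        [None, (4, 3)]]      # top row
print(best_points(grid, 0, 5))   # 7: pick (3, size 2), move up-right, pick (4, size 3)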
Given a set of n points on a 2-d plane of the form (x,y), the aim is to find the number of pairs of all points (xi,yi) and (xj, yj) such that the line joining the two points has a negative slope.
Assume that no two xi's have the same value, and that all points lie within [-100, 100] or some other bounded range.
What you are asking about is equivalent to counting the inversions in the array of y-values you obtain when you sort the points with respect to x. You can afford this sorting - it is O(n log n).
I remind you that an inversion is a pair of indices i > j with a[i] < a[j]. The equivalence I am speaking of is easy to prove.
Imagine you have 6 points (4, 4), (2, 1), (6, 6), (3, 3), (5, 2), (1, 5). After you sort them with respect to x you obtain: (1, 5), (2, 1), (3, 3), (4, 4), (5, 2), (6, 6). You can see that the negative slopes are formed by <(1, 5), (2, 1)>, <(1, 5), (3, 3)>, <(1, 5), (4, 4)>, <(1, 5), (5, 2)>, <(3, 3), (5, 2)>, and <(4, 4), (5, 2)>: exactly the pairs whose ys are in inversion.
The number of inversions can be counted in O(n log n) by augmenting the merge sort algorithm: you only need to increase the inversion counter every time you take a value from the right subarray (the one containing the larger indices), and you increase it by the number of values from the left subarray that are still unprocessed.
Here is an example of counting the number of inversions.
Initial array 5 1 3 4 2 6 inv := 0 // Total inversions: 6
merge step 1: <5 1 3> <4 2 6> inv = 0
merge step 2: <5> <1 3> | <4> <2 6> inv = 0
merge step 3: <5> [<1> <3>] | <4> [<2> <6>] inv = 0
merge step 4: <5> <1 3> | <4> <2 6> inv = 0 // both pairs were already sorted
merge step 5: <1 3 5> | <2 4 6> inv = 3 // we add one for 1, 3 and 2
merge step 6: <1 2 3 4 5 6> inv = 6 // we add 2 (3 and 5) for 2 and 1 for 4
After the merge sort finishes, the number of inversions inv is exactly the number of negative-slope pairs you are looking for.
In the example this is inv = 6. (The remaining n * (n - 1) / 2 - inv = 6 * 5 / 2 - 6 = 9 pairs are the non-inversions, i.e. the positive-slope pairs.)
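A short Python sketch of the merge-sort-based inversion count described above (my own illustration):

def count_inversions(a):
    # Returns (sorted copy of a, number of inversions), via merge sort.
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_left = count_inversions(a[:mid])
    right, inv_right = count_inversions(a[mid:])
    merged, inv = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i   # every still-unprocessed left value is larger
    merged += left[i:] + right[j:]
    return merged, inv

points = [(4, 4), (2, 1), (6, 6), (3, 3), (5, 2), (1, 5)]
ys = [y for x, y in sorted(points)]     # sort by x, keep the ys
print(count_inversions(ys)[1])          # 6 negative-slope pairs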