Calculating final market distribution - competitive programming - algorithm

I came across the following question while practicing competitive programming. I worked through it by hand and sketched an approach, but my answer is wrong and I cannot see how to scale the approach.
Question:
N coffee chains are competing for market share in a fierce advertising battle. Each day, a percentage of customers will be convinced to switch from one chain to another.
The current market share and the daily probability of customer switching are given. If the advertising runs forever, what will be the final distribution of market share?
Assumptions: Total market share is 1.0; the probability that a customer switches is independent of other customers and days.
Example: 2 coffee chains, A and B. Market share of A: 0.4; market share of B: 0.6.
Each day, there is a 0.2 probability that a customer switches from A to B, and a 0.1 probability that a customer switches from B to A.
input: market_share = [0.4, 0.6],
switch_prob = [[.8, .2], [.1, .9]]
output: [0.3333 0.6667]
Everything up to here is part of the question; I did not make up the example or the assumptions, they were given with it.
My attempt: In my understanding, the switch probabilities indicate the probability of switching from A to B.
Hence,
market_share_of_A = current_market_share - lost_customers + gained_customers and
market_share_of_B = (1 - market_share_of_A)
iter_1:
lost_customers = 0.4 * 0.8 * 0.2 = 0.064
gained_customers = 0.6 * 0.2 * 0.1 = 0.012
market_share_of_A = 0.4 - 0.064 + 0.012 = 0.348
market_share_of_B = 1 - 0.348 = 0.652
iter_2:
lost_customers = 0.348 * 0.1 * 0.2 = 0.00696
gained_customers = 0.652 * 0.9 * 0.1 = 0.05868
market_share_of_A = 0.348 - 0.00696 + 0.05868 = 0.39972
market_share_of_B = 1 - 0.39972 = 0.60028
my answer: [0.39972, 0.60028]
As stated earlier, the expected answer is [0.3333 0.6667].
I do not understand where I went wrong. If something is wrong, it has to be my understanding of the question. Please share your thoughts.
The example demonstrates the easy case with only two competitors. What if there are more? Let us say three: A, B, C. I think the input has to provide switch probabilities in the form [[0.1, 0.3, 0.6]..], because A can lose its customers to B as well as C, and there would be many instances of that. Now I will have to compute the market share of at least two companies; the third one will be (1 - sum_of_the_others). While computing B's market share, I will have to compute its lost customers as well as its gained customers, and the formula would be (current - lost + gained). Gained will be the sum of gain_from_A and gain_from_C. Is this correct?

Following on from my comment, this problem can be expressed as a matrix equation.
The elements of the "transition" matrix, T(i, j) (dimensions N x N) are defined as follows:
i = j (diagonal): the probability of a customer staying with chain i
i != j (off-diagonal): the probability of a customer of chain j transferring to chain i
What is the physical meaning of this matrix? Let the market share state be represented by a vector P(i) of size N, whose i-th value is the market share of chain i. The vector P' = T * P is the next share state after each day.
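To make the daily update concrete, here is a minimal Python sketch of P' = T * P for the two-chain example. The function names `next_day` and `iterate_shares` are made up for illustration, and T is assumed to be laid out as defined above (each column sums to 1), which is the transpose of switch_prob in the question.

```python
def next_day(T, P):
    """One day of switching: returns T * P for a column-stochastic T."""
    n = len(P)
    return [sum(T[i][j] * P[j] for j in range(n)) for i in range(n)]


def iterate_shares(T, P, days):
    """Apply the daily update repeatedly (plain iteration, no solver)."""
    for _ in range(days):
        P = next_day(T, P)
    return P


# T[i][j]: probability that a customer of chain j ends the day with chain i.
T = [[0.8, 0.1],
     [0.2, 0.9]]

day1 = next_day(T, [0.4, 0.6])          # A keeps 0.4*0.8 and gains 0.6*0.1
limit = iterate_shares(T, [0.4, 0.6], 200)
print(day1, limit)
```

After one day A holds 0.4 * 0.8 + 0.6 * 0.1 = 0.38, and repeated application drifts towards the expected [0.3333, 0.6667].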
With that in mind, the equilibrium equation is given by T * P = P, i.e. the final state is invariant under transition T:
| T(1, 1)  T(1, 2)  T(1, 3)  ...  T(1, N) |   | P(1) |   | P(1) |
| T(2, 1)  T(2, 2)  ...                   |   | P(2) |   | P(2) |
| T(3, 1)  ...                            |   | P(3) |   | P(3) |
|    .                               .    | * |  .   | = |  .   |
|    .                               .    |   |  .   |   |  .   |
|    .                               .    |   |  .   |   |  .   |
| T(N, 1)                         T(N, N) |   | P(N) |   | P(N) |
However, this equation alone does not determine P: it only fixes the ratios between the elements of P, i.e. P up to an overall scale factor (the technical name for this situation escapes me - as MBo suggests it is due to degeneracy). There is an additional constraint that the shares add up to 1:
P(1) + P(2) + ... + P(N) = 1
We can choose an arbitrary share value (say, the Nth one) and replace it with this expression. Multiplying out, the first row of the equation is:
T(1, 1) P(1) + T(1, 2) P(2) + ... + T(1, N) (1 - [P(1) + P(2) + ... + P(N - 1)]) = P(1)
--> [T(1, 1) - T(1, N) - 1] P(1) + [T(1, 2) - T(1, N)] P(2) + ... + [T(1, N - 1) - T(1, N)] P(N - 1) = -T(1, N)
The equivalent equation for the second row is:
[T(2, 1) - T(2, N)] P(1) + [T(2, 2) - T(2, N) - 1] P(2) + ... = -T(2, N)
To summarize the general pattern, we define:
A matrix S(i, j) (dimensions [N - 1] x [N - 1]):
- S(i, i) = T(i, i) - T(i, N) - 1
- S(i, j) = T(i, j) - T(i, N) (i != j)
A vector Q(i) of size N - 1 containing the first N - 1 elements of P(i)
A vector R(i) of size N - 1, such that R(i) = -T(i, N)
The equation then becomes S * Q = R:
| S(1, 1)    S(1, 2)  S(1, 3)  ...  S(1, N-1)   |   | Q(1)   |   | R(1)   |
| S(2, 1)    S(2, 2)  ...                       |   | Q(2)   |   | R(2)   |
| S(3, 1)    ...                                |   | Q(3)   |   | R(3)   |
|    .                                  .       | * |   .    | = |   .    |
|    .                                  .       |   |   .    |   |   .    |
|    .                                  .       |   |   .    |   |   .    |
| S(N-1, 1)                       S(N-1, N-1)   |   | Q(N-1) |   | R(N-1) |
Solving the above equation gives Q, which gives the first N - 1 share values (and of course the last one too from the constraint). Methods for doing so include Gaussian elimination and LU decomposition, both of which are more efficient than the naive route of directly computing Q = inv(S) * R.
Note that you can flip the signs in S and R for slightly more convenient evaluation.
The toy example given above turns out to be quite trivial:
| 0.8  0.1 |   | P1 |   | P1 |
|          | * |    | = |    |
| 0.2  0.9 |   | P2 |   | P2 |
--> S = | -0.3 |, R = | -0.1 |
--> Q1 = P1 = -0.1 / -0.3 = 0.3333
P2 = 1 - P1 = 0.6667
An example for N = 3:
    | 0.1  0.2  0.3 |           | -1.2  -0.1 |       | -0.3 |
T = | 0.4  0.7  0.3 |  -->  S = |            | , R = |      |
    | 0.5  0.1  0.4 |           |  0.1  -0.6 |       | -0.3 |

         | 0.205479 |
-->  Q = |          | , P3 = 0.260274
         | 0.534247 |
Please forgive the Robinson Crusoe style formatting - I'll try to write these in LaTeX later for readability.
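For anyone who wants to try the reduction numerically, below is a rough Python sketch that builds S and R from T and solves S * Q = R by Gaussian elimination with partial pivoting. The name `stationary_shares` is hypothetical, T is assumed to be column-stochastic as defined above, and no care is taken for degenerate inputs where S is singular.

```python
def stationary_shares(T):
    """Solve T * P = P with sum(P) = 1 via the reduced system S * Q = R."""
    n = len(T)
    m = n - 1
    # Augmented matrix [S | R]:
    #   S(i, j) = T(i, j) - T(i, N) - (1 if i == j else 0),  R(i) = -T(i, N)
    aug = []
    for i in range(m):
        row = [T[i][j] - T[i][n - 1] - (1.0 if i == j else 0.0) for j in range(m)]
        row.append(-T[i][n - 1])
        aug.append(row)
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution gives Q, the first N - 1 shares.
    q = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(aug[i][j] * q[j] for j in range(i + 1, m))
        q[i] = (aug[i][m] - tail) / aug[i][i]
    # The last share follows from the constraint that shares sum to 1.
    return q + [1.0 - sum(q)]


two = stationary_shares([[0.8, 0.1],
                         [0.2, 0.9]])
three = stationary_shares([[0.1, 0.2, 0.3],
                           [0.4, 0.7, 0.3],
                           [0.5, 0.1, 0.4]])
print(two, three)
```

Both worked examples above can be checked this way; the N = 3 case should reproduce Q and P3 to about six decimal places.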

Related

Logic behind including / excluding current element in recursive approach

I'm studying DP these days, and I keep running into examples, like subset sum or the coin change problem shown in this question, whose solutions make recursive calls both including and excluding the current element. I have genuine difficulty understanding the real reason for this approach; I cannot see the underlying logic. I don't want to just memorize it or tell myself "hmm, okay, keep in mind that there is such an approach".
class Util
{
    // Function to find the total number of distinct ways to get
    // change of N from unlimited supply of coins in set S
    public static int count(int[] S, int n, int N)
    {
        // if total is 0, return 1 (solution found)
        if (N == 0) {
            return 1;
        }
        // return 0 (solution do not exist) if total become negative or
        // no elements are left
        if (N < 0 || n < 0) {
            return 0;
        }
        // Case 1. include current coin S[n] in solution and recurse
        // with remaining change (N - S[n]) with same number of coins
        int incl = count(S, n, N - S[n]);
        // Case 2. exclude current coin S[n] from solution and recurse
        // for remaining coins (n - 1)
        int excl = count(S, n - 1, N);
        // return total ways by including or excluding current coin
        return incl + excl;
    }
    // Coin Change Problem
    public static void main(String[] args)
    {
        // n coins of given denominations
        int[] S = { 1, 2, 3 };
        // Total Change required
        int N = 4;
        System.out.print("Total number of ways to get desired change is "
                + count(S, S.length - 1, N));
    }
}
I don't want to skim over this part, since recurrence formulas really play a leading role in dynamic programming.
At each recursion you want to explore both cases:
one more coin of type n is used
you are done with coin type n and proceed to the next coin type
The remaining task is handled in both cases by a recursive call.
By the way, this solution has nothing to do with dynamic programming.
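One way to see why the two cases are simply added: every way of making change either uses at least one coin of type n or uses none, so the two groups are disjoint and together cover everything. Below is a small Python sketch of the same recursion; `count_ways` is an illustrative name, and the `lru_cache` memoization is my addition (it is what would turn the plain recursion into a dynamic programming solution), not part of the Java code in the question.

```python
from functools import lru_cache


def count_ways(coins, total):
    """Number of ways to make `total` from an unlimited supply of `coins`."""
    coins = tuple(coins)

    @lru_cache(maxsize=None)
    def count(n, remaining):
        if remaining == 0:
            return 1                      # exact change reached: one valid way
        if remaining < 0 or n < 0:
            return 0                      # overshot, or no coin types left
        incl = count(n, remaining - coins[n])   # use one more coin of type n
        excl = count(n - 1, remaining)          # done with coin type n
        return incl + excl                # disjoint cases, so the counts add

    return count(len(coins) - 1, total)


print(count_ways([1, 2, 3], 4))
```

For coins {1, 2, 3} and a total of 4 the four ways are 1+1+1+1, 1+1+2, 2+2 and 1+3.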
In the common powerset problem, given (1 2 3) we are asked to generate ((1 2 3) (1 2) (1 3) (1) (2 3) (2) (3) ()). We can use this with and without technique to generate the result.
+---+           +---------------------------+       +--------------------------------------------+
|   +-with----> | ((1 2 3) (1 2) (1 3) (1)) |       |                                            |
| 1 |           |                           +-----> | ((1 2 3) (1 2) (1 3) (1) (2 3) (2) (3) ()) |
|   +-without-> | ((2 3) (2) (3) ())        |       |                                            |
+-^-+           +---------------------------+       +--------------------------------------------+
  |
  +-------------------------------------------+
                                              |
+---+           +-------------+       +-------+------------+
|   +-with----> | ((2 3) (2)) |       |                    |
| 2 |           |             +-----> | ((2 3) (2) (3) ()) |
|   +-without-> | ((3) ())    |       |                    |
+-^-+           +-------------+       +--------------------+
  |
  +--------------------------------+
                                   |
+---+           +-----+       +----+-----+
|   +-with----> | (3) |       |          |
| 3 |           |     +-----> | ((3) ()) |
|   +-without-> | ()  |       |          |
+-^-+           +-----+       +----------+
  |
  |
+-+-+
|() |
|   | <- base case
+---+
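The diagram can be read as a short recursive function. The following Python sketch (the name `powerset` and the tuple representation are just choices for illustration) computes the "without" result by recursing on the rest, then builds the "with" result by prepending the current element to each of those subsets.

```python
def powerset(items):
    """All subsets of `items`, built with the with/without technique."""
    if not items:
        return [()]                       # base case: only the empty subset
    first, rest = items[0], items[1:]
    without = powerset(rest)              # subsets that skip `first`
    with_first = [(first,) + subset for subset in without]
    return with_first + without           # same order as in the diagram


subsets = powerset((1, 2, 3))
print(subsets)
```

The coin change recursion has the same shape, except that it adds up counts instead of concatenating lists.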

Solving a constrained system of linear equations

I have a system of equations of the form y=Ax+b, where y, x and b are n×1 vectors and A is an n×n (symmetric) matrix.
Here is the wrinkle: not all of x is unknown. Certain rows of x are specified, and the corresponding rows of y are unknown. Below is an example:
| 10 | | 5 -2 1 | | * | | -1 |
| * | = | -2 2 0 | | 1 | + | 1 |
| 1 | | 1 0 1 | | * | | 2 |
where * designates unknown quantities.
I have built a solver for problems such as the above in Fortran, but I wanted to know if there is a decent, robust solver out there as part of LAPACK or MKL for these types of problems.
My solver is based on a permutation vector called pivot = [1,3,2], which rearranges the x and y vectors into known and unknown parts
| 10 |   |  5  1 -2 | | * |   | -1 |
|  1 | = |  1  1  0 | | * | + |  2 |
|  * |   | -2  0  2 | | 1 |   |  1 |
and then solves using a block matrix solution & LU decomposition
! solves an n×n system of equations where k values are known from the 'x' vector
function solve_linear_system(A,b,x_known,y_known,pivot,n,k) result(x)
    use, intrinsic :: iso_c_binding, only: c_int, c_double
    use lu
    integer(c_int),intent(in) :: n, k, pivot(n)
    real(c_double),intent(in) :: A(n,n), b(n), x_known(k), y_known(n-k)
    real(c_double) :: x(n), y(n), r(n-k), A1(n-k,n-k), A3(n-k,k), b1(n-k)
    integer(c_int) :: i, j, u, code, d, indx(n-k)
    u = n-k
    !store known `x` and `y` values
    x(pivot(u+1:n)) = x_known
    y(pivot(1:u)) = y_known
    !define block matrices
    ! |y_known| = | A1  A3 | |    *    | + |b1|
    ! |   *   | = | A3` A2 | | x_known | + |b2|
    A1 = A(pivot(1:u), pivot(1:u))
    A3 = A(pivot(1:u), pivot(u+1:n))
    b1 = b(pivot(1:u))
    !define new rhs vector
    r = y_known -matmul(A3, x_known)-b1
    ! solve `A1*x=r` with LU decomposition from NR book for 'x'
    call ludcmp(A1,u,indx,d,code)
    call lubksb(A1,u,indx,r)
    ! store unknown 'x' values (stored into 'r' by 'lubksb')
    x(pivot(1:u)) = r
end function
For the example above the solution is
| 10.0 | | 3.5 |
y = | -4.0 | x = | 1.0 |
| 1.0 | | -4.5 |
PS. The linear systems have typically n<=20 equations.
The problem with only unknowns is a linear least squares problem.
Your a priori knowledge can be introduced with equality constraints (fixing some variables), transforming it into a linear equality-constrained least squares problem.
There is indeed an algorithm within LAPACK solving the latter, called xGGLSE.
Here is some overview.
(It also seems you need to multiply b by -1 in your case to be compatible with the definition.)
Edit: On further inspection, I missed the unknowns within y. Ouch. This is bad.
First, I would rewrite your system into the form AX = b, where A and b are known and X collects all the unknowns (here x1, y2 and x3). In your example, and provided that I didn't make any mistakes, it would give:
    5 0 1        x1            13
A = 2 1 0    X = y2   and  b =  3
    1 0 1        x3            -1
Then you can use plenty of methods from various libraries, like LAPACK or BLAS, depending on the properties of your matrix A (positive-definite, ...). As a starting point, I would suggest a simple method with a direct inversion of the matrix A, especially if your matrix is small. There are also many iterative approaches (Jacobi, gradient methods, Gauss-Seidel, ...) that you can use for bigger cases.
Edit: An idea to solve it in 2 steps
First step: You can rewrite your system as 2 subsystems that have X and Y as unknowns, where each dimension equals the number of unknowns in the corresponding vector.
The first subsystem, in X, will be AX = b, which can be solved by direct or iterative methods.
Second step: The second system, in Y, can be solved directly once you know X, because Y will be expressed in the form Y = A'X + b'.
I think this approach is more general.
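As a rough illustration of the two-step idea, here is a Python sketch that first solves the reduced system for the unknown entries of x and then evaluates y = A x + b for everything else. The helpers `solve_linear` and `solve_mixed` and the dictionary-based interface are assumptions made for this example (they are not LAPACK calls), and indices are 0-based here.

```python
def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[p] = aug[p], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def solve_mixed(A, b, x_known, y_known):
    """x_known: {index: x value}; y_known: {index: y value} for the other rows."""
    n = len(A)
    free = [i for i in range(n) if i not in x_known]    # rows with unknown x
    fixed = [i for i in range(n) if i in x_known]       # rows with known x
    # Step 1: A1 * x_free = y_known - A3 * x_known - b1
    A1 = [[A[i][j] for j in free] for i in free]
    r = [y_known[i] - b[i] - sum(A[i][j] * x_known[j] for j in fixed)
         for i in free]
    x = [0.0] * n
    for i in fixed:
        x[i] = x_known[i]
    for i, value in zip(free, solve_linear(A1, r)):
        x[i] = value
    # Step 2: with x complete, every y follows directly from y = A x + b.
    y = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
    return x, y


A = [[5.0, -2.0, 1.0], [-2.0, 2.0, 0.0], [1.0, 0.0, 1.0]]
b = [-1.0, 1.0, 2.0]
x, y = solve_mixed(A, b, x_known={1: 1.0}, y_known={0: 10.0, 2: 1.0})
print(x, y)
```

For the example in the question this gives x = (3.5, 1, -4.5) and y = (10, -4, 1).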

Kernel density estimation julia

I am trying to implement a kernel density estimation. However my code does not provide the answer it should. It is also written in julia but the code should be self explanatory.
Here is the algorithm:
f(x) = 1 / (n * h) * sum_{i=1..n} K((x - X_i) / h)
where
K(u) = 0.5 if |u| <= 1, and 0 otherwise
So the algorithm tests whether the distance between x and an observation X_i, scaled by some constant factor (the binwidth), is at most one. If so, it adds 0.5 / (n * h) to the value at x, where n = number of observations.
Here is my implementation:
#Kernel density function.
#Purpose: estimate the probability density function (pdf)
#of given observations
##param data: observations for which the pdf should be estimated
##return: returns an array with the estimated densities
function kernelDensity(data)
|
| #Uniform kernel function.
| ##param x: Current x value
| ##param X_i: x value of observation i
| ##param width: binwidth
| ##return: Returns 1 if the absolute distance from
| #x(current) to x(observation) weighted by the binwidth
| #is less then 1. Else it returns 0.
|
| function uniformKernel(x, observation, width)
| | u = ( x - observation ) / width
| | abs ( u ) <= 1 ? 1 : 0
| end
|
| #number of observations in the data set
| n = length(data)
|
| #binwidth (set arbitraily to 0.1
| h = 0.1
|
| #vector that stored the pdf
| res = zeros( Real, n )
|
| #counter variable for the loop
| counter = 0
|
| #lower and upper limit of the x axis
| start = floor(minimum(data))
| stop = ceil (maximum(data))
|
| #main loop
| ##linspace: divides the space from start to stop in n
| #equally spaced intervalls
| for x in linspace(start, stop, n)
| | counter += 1
| | for observation in data
| | |
| | | #count all observations for which the kernel
| | | #returns 1 and mult by 0.5 because the
| | | #kernel computed the absolute difference which can be
| | | #either positive or negative
| | | res[counter] += 0.5 * uniformKernel(x, observation, h)
| | end
| | #devide by n times h
| | res[counter] /= n * h
| end
| #return results
| res
end
#run function
##rand: generates 10 uniform random numbers between 0 and 1
kernelDensity(rand(10))
and this is being returned:
> 0.0
> 1.5
> 2.5
> 1.0
> 1.5
> 1.0
> 0.0
> 0.5
> 0.5
> 0.0
the sum of which is: 8.5 (The cumulative distribution function. Should be 1.)
So there are two bugs:
The values are not properly scaled. Each number should be around one tenth of its current value. In fact, if the number of observations increases by a factor of 10^n, n = 1, 2, ..., then the cdf also increases by a factor of 10^n.
For example:
> kernelDensity(rand(1000))
> 953.53
They don't sum up to 10 (or one if it were not for the scaling error). The error becomes more evident as the sample size increases: there are approx. 5% of the observations not being included.
I believe that I implemented the formula 1:1, hence I really don't understand where the error is.
I'm not an expert on KDEs, so take all of this with a grain of salt, but a very similar (but much faster!) implementation of your code would be:
function kernelDensity{T<:AbstractFloat}(data::Vector{T}, h::T)
    n = length(data)      # number of observations
    res = zeros(T, n)     # start from zero; similar() would leave res uninitialized
    lb = minimum(data); ub = maximum(data)
    for (i,x) in enumerate(linspace(lb, ub, n))
        for obs in data
            res[i] += abs((obs-x)/h) <= 1. ? 0.5 : 0.
        end
        res[i] /= (n*h)
    end
    sum(res)
end
If I'm not mistaken, the density estimate should integrate to 1, that is, we would expect kernelDensity(rand(100), 0.1)/100 to get at least close to 1. In the implementation above I'm getting there, give or take 5%, but then again we don't know that 0.1 is the optimal bandwidth (using h=0.135 instead I'm getting there to within 0.1%), and the uniform kernel is known to only be about 93% "efficient".
In any case, there's a very good Kernel Density package in Julia available here, so you probably should just do Pkg.add("KernelDensity") instead of trying to code your own Epanechnikov kernel :)
To point out the mistake: you have n bins B_i of size 2h covering [0,1], so a random point X lands in an expected 2nh bins. You divide by 2nh.
For n points, the expected value of your function is therefore n.
Actually, you have some bins of size < 2h (for example, if start = 0, half of the first bin is outside of [0,1]); factoring this in gives the bias.
Edit: Btw, the bias is easy to calculate if you assume that the bins have random locations in [0,1]. Then the bins are on average missing h/2 = 5% of their size.
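To illustrate the scaling point, here is a small Python sketch (not Julia, and not the code from the question) that evaluates a uniform-kernel estimate on a grid and compares the plain sum of the values with a Riemann-sum approximation of the integral. The names `uniform_kde` and `riemann_integral`, the three data points and the grid size are all made up for the example; the grid is deliberately extended by 2h beyond the data so that no kernel window is cut off at the edges.

```python
def uniform_kde(data, h, grid):
    """Uniform-kernel density estimate evaluated at each point of `grid`."""
    n = len(data)
    return [
        sum(0.5 for xi in data if abs((x - xi) / h) <= 1.0) / (n * h)
        for x in grid
    ]


def riemann_integral(values, dx):
    """Approximate the integral of grid values with spacing dx."""
    return sum(values) * dx


data = [0.1, 0.5, 0.9]          # hypothetical observations
h = 0.1                         # bandwidth
steps = 2000
lo, hi = min(data) - 2 * h, max(data) + 2 * h
dx = (hi - lo) / steps
grid = [lo + i * dx for i in range(steps + 1)]

density = uniform_kde(data, h, grid)
area = riemann_integral(density, dx)
print(sum(density), area)
```

The plain sum grows with the number of grid points, while the sum multiplied by the grid spacing stays close to 1, which is the quantity that should be compared against 1.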

Number of n-element permutations with exactly k inversions

I am trying to efficiently solve SPOJ Problem 64: Permutations.
Let A = [a1,a2,...,an] be a permutation of integers 1,2,...,n. A pair
of indices (i,j), 1<=i<=j<=n, is an inversion of the permutation A if
ai>aj. We are given integers n>0 and k>=0. What is the number of
n-element permutations containing exactly k inversions?
For instance, the number of 4-element permutations with exactly 1
inversion equals 3.
To make the given example easier to see, here are the three 4-element permutations with exactly 1 inversion:
(1, 2, 4, 3)
(1, 3, 2, 4)
(2, 1, 3, 4)
In the first permutation, 4 > 3 and the index of 4 is less than the index of 3. This is a single inversion. Since the permutation has exactly one inversion, it is one of the permutations that we are trying to count.
For any given sequence of n elements, the number of permutations is factorial(n). Thus, if I use the brute-force O(n^2) way of counting the inversions of each permutation and then check whether the count equals k, the solution to this problem would have time complexity O(n! * n^2).
Previous Research
A subproblem of this problem was previously asked here on StackOverflow. An O(n log n) solution using merge sort was given which counts the number of inversions in a single permutation. However, if I use that solution to count the number of inversions for each permutation, I would still get a time complexity of O(n! * n log n) which is still very high in my opinion.
This exact question was also asked previously on Stack Overflow but it received no answers.
My goal is to avoid the factorial complexity that comes from iterating through all permutations. Ideally I would like a mathematical formula that yields the answer to this for any n and k but I am unsure if one even exists.
If there is no math formula to solve this (which I kind of doubt) then I have also seen people giving hints that an efficient dynamic programming solution is possible. Using DP or another approach, I would really like to formulate a solution which is more efficient than O(n! * n log n), but I am unsure of where to start.
Any hints, comments, or suggestions are welcome.
EDIT: I have answered the problem below with a DP approach to computing Mahonian numbers.
The solution needs some explanations.
Let's denote the number of permutations with n items having exactly k inversions
by I(n, k)
Now I(n, 0) is always 1. For any n there exists one and only one permutation which has 0
inversions, i.e., the sequence sorted in increasing order
Now I(0, k) is 0 for every k > 0, since there is no sequence to have inversions in
Now to find the I(n, k) let's take an example of sequence containing 4 elements
{1,2,3,4}
for n = 4 below are the permutations enumerated and grouped by number of inversions
|___k=0___|___k=1___|___k=2___|___k=3___|___k=4___|___k=5___|___k=6___|
| 1234 | 1243 | 1342 | 1432 | 2431 | 3421 | 4321 |
| | 1324 | 1423 | 2341 | 3241 | 4231 | |
| | 2134 | 2143 | 2413 | 3412 | 4312 | |
| | | 2314 | 3142 | 4132 | | |
| | | 3124 | 3214 | 4213 | | |
| | | | 4123 | | | |
| | | | | | | |
|I(4,0)=1 |I(4,1)=3 |I(4,2)=5 |I(4,3)=6 |I(4,4)=5 |I(4,5)=3 |I(4,6)=1 |
| | | | | | | |
Now to find the number of permutation with n = 5 and for every possible k
we can derive recurrence I(5, k) from I(4, k) by inserting the nth (largest)
element(5) somewhere in each permutation in the previous permutations,
so that the resulting number of inversions is k
for example, I(5,4) is nothing but the number of permutations of the sequence {1,2,3,4,5}
which has exactly 4 inversions each.
Let's observe I(4, k) now above until column k = 4 the number of inversions is <= 4
Now lets place the element 5 as shown below
|___k=0___|___k=1___|___k=2___|___k=3___|___k=4___|___k=5___|___k=6___|
| |5|1234 | 1|5|243 | 13|5|42 | 143|5|2 | 2431|5| | 3421 | 4321 |
| | 1|5|324 | 14|5|23 | 234|5|1 | 3241|5| | 4231 | |
| | 2|5|134 | 21|5|43 | 241|5|3 | 3412|5| | 4312 | |
| | | 23|5|14 | 314|5|2 | 4132|5| | | |
| | | 31|5|24 | 321|5|4 | 4213|5| | | |
| | | | 412|5|3 | | | |
| | | | | | | |
| 1 | 3 | 5 | 6 | 5 | | |
| | | | | | | |
Each of the above permutations that contains 5 has exactly 4 inversions.
So the total number of permutations with 4 inversions is I(5,4) = I(4,4) + I(4,3) + I(4,2) + I(4,1) + I(4,0)
= 5 + 6 + 5 + 3 + 1 = 20
Similarly for I(5,5) from I(4,k)
|___k=0___|___k=1___|___k=2___|___k=3___|___k=4___|___k=5___|___k=6___|
| 1234 | |5|1243 | 1|5|342 | 14|5|32 | 243|5|1 | 3421|5| | 4321 |
| | |5|1324 | 1|5|423 | 23|5|41 | 324|5|1 | 4231|5| | |
| | |5|2134 | 2|5|143 | 24|5|13 | 341|5|2 | 4312|5| | |
| | | 2|5|314 | 31|5|42 | 413|5|2 | | |
| | | 3|5|124 | 32|5|14 | 421|5|3 | | |
| | | | 41|5|23 | | | |
| | | | | | | |
| | 3 | 5 | 6 | 5 | 3 | |
| | | | | | | |
So the total number of permutations with 5 inversions is I(5,5) = I(4,5) + I(4,4) + I(4,3) + I(4,2) + I(4,1)
= 3 + 5 + 6 + 5 + 3 = 22
So I(n, k) = sum of I(n-1, k-i) over all i with 0 <= i < n and k-i >= 0
Also, k can go up to n*(n-1)/2; this occurs when the sequence is sorted in decreasing order
https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/insertion/pages/ar01s04s01.html
http://www.algorithmist.com/index.php/SPOJ_PERMUT1
#include <stdio.h>
int dp[100][100];
int inversions(int n, int k)
{
    if (dp[n][k] != -1) return dp[n][k];
    if (k == 0) return dp[n][k] = 1;
    if (n == 0) return dp[n][k] = 0;
    int j = 0, val = 0;
    for (j = 0; j < n && k-j >= 0; j++)
        val += inversions(n-1, k-j);
    return dp[n][k] = val;
}
int main()
{
    int t;
    scanf("%d", &t);
    while (t--) {
        int n, k, i, j;
        scanf("%d%d", &n, &k);
        for (i = 1; i <= n; i++)
            for (j = 0; j <= k; j++)
                dp[i][j] = -1;
        printf("%d\n", inversions(n, k));
    }
    return 0;
}
It's one day later and I have managed to solve the problem using dynamic programming. I submitted it and my code was accepted by SPOJ, so I figure I'll share my knowledge here for anyone who is interested in the future.
After looking in the Wikipedia page which discusses inversion in discrete mathematics, I found an interesting recommendation at the bottom of the page.
Numbers of permutations of n elements with k inversions; Mahonian
numbers: A008302
I clicked on the link to OEIS and it showed me an infinite sequence of integers called the Triangle of Mahonian numbers.
1, 1, 1, 1, 2, 2, 1, 1, 3, 5, 6, 5, 3, 1, 1, 4, 9, 15, 20, 22, 20, 15,
9, 4, 1, 1, 5, 14, 29, 49, 71, 90, 101, 101, 90, 71, 49, 29, 14, 5, 1,
1, 6, 20, 49, 98, 169, 259, 359, 455, 531, 573, 573, 531, 455, 359,
259, 169, 98, 49, 20, 6, 1 . . .
I was curious about what these numbers were since they seemed familiar to me. Then I realized that I had seen the subsequence 1, 3, 5, 6, 5, 3, 1 before. In fact, this was the answer to the problem for several pairs of (n, k), namely (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6). I looked at what was on both sides of this subsequence and was amazed to see that it was all valid (i.e. greater than 0 permutations) answers for n < 4 and n > 4.
The formula for the sequence was given as:
coefficients in expansion of Product_{i=0..n-1} (1+x+...+x^i)
This was easy enough for me to understand and verify. I could basically take any n and plug into the formula. Then the coefficient for the x^k term would be the answer for (n, k).
I will show an example for n = 3.
(x^0)(x^0 + x^1)(x^0 + x^1 + x^2)
= (1)(1 + x)(1 + x + x^2)
= (1 + x)(1 + x + x^2)
= 1 + x + x + x^2 + x^2 + x^3
= 1 + 2x + 2x^2 + x^3
The final expansion was 1 + 2x + 2x^2 + x^3 and the coefficients of the x^k terms were 1, 2, 2, and 1 for k = 0, 1, 2, 3 respectively. This just happens to be all valid numbers of inversions for 3-element permutations.
1, 2, 2, 1 is the 3rd row of the Mahonian numbers when they are laid out in a table as follows:
1
1 1
1 2 2 1
1 3 5 6 5 3 1
etc.
So basically computing my answer came down to simply calculating the nth Mahonian row and taking the kth element with k starting at 0 and printing 0 if the index was out of range. This was a simple case of bottom-up dynamic programming since each ith row could be used to easily compute the i+1st row.
Given below is the Python solution I used which ran in only 0.02 seconds. The maximum time limit for this problem was 3 seconds for their given test cases and I was getting a timeout error before so I think this optimization is rather good.
def mahonian_row(n):
    '''Generates coefficients in expansion of
    Product_{i=0..n-1} (1+x+...+x^i)
    **Requires that n is a positive integer'''
    # Allocate space for resulting list of coefficients?
    # Initialize them all to zero?
    #max_zero_holder = [0] * int(1 + (n * 0.5) * (n - 1))
    # Current max power of x i.e. x^0, x^0 + x^1, x^0 + x^1 + x^2, etc.
    # i + 1 is current row number we are computing
    i = 1
    # Preallocate result
    # Initialize to answer for n = 1
    result = [1]
    while i < n:
        # Copy previous row of n into prev
        prev = result[:]
        # Get space to hold (i+1)st row
        result = [0] * int(1 + ((i + 1) * 0.5) * (i))
        # Initialize multiplier for this row
        m = [1] * (i + 1)
        # Multiply
        for j in range(len(m)):
            for k in range(len(prev)):
                result[k+j] += m[j] * prev[k]
        # Result now equals mahonian_row(i+1)
        # Possibly should be memoized?
        i = i + 1
    return result
def main():
    t = int(raw_input())
    for _ in xrange(t):
        n, k = (int(s) for s in raw_input().split())
        row = mahonian_row(n)
        if k < 0 or k > len(row) - 1:
            print 0
        else:
            print row[k]
if __name__ == '__main__':
    main()
I have no idea of the time complexity but I am absolutely certain this code can be improved through memoization since there are 10 given test cases and the computations for previous test cases can be used to "cheat" on future test cases. I will make that optimization in the future, but hopefully this answer in its current state will help anyone attempting this problem in the future since it avoids the naive factorial-complexity approach of generating and iterating through all permutations.
If there is a dynamic programming solution, there is probably a way to do it step by step, using the results for permutations of length n to help with the results for permutations of length n+1.
Given a permutation of length n - values 1-n, you can get a permutation of length n+1 by adding value (n+1) at n+1 possible positions. (n+1) is larger than any of 1-n so the number of inversions you create when you do this depends on where you add it - add it at the last position and you create no inversions, add it at the last but one position and you create one inversion, and so on - look back at the n=4 cases with one inversion to check this.
So consider the n+1 places where you can add (n+1). If you add it at place j, counting from the right with the last position as position 0, it creates j new inversions, so the number of permutations of length n+1 with K inversions obtained this way is the number of permutations of length n with K-j inversions.
So if at each step you keep the number of permutations with K inversions for all possible K, you can compute the number of permutations with K inversions for length n+1 from the counts for length n.
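A short Python sketch of this update step, under the assumption that we keep one list of counts per length (the function name `inversion_counts` is invented for the example): inserting the new largest value at a position j places from the right adds j inversions, so every old count is spread over the next j = 0..length slots.

```python
def inversion_counts(n):
    """counts[k] = number of permutations of n items with exactly k inversions."""
    counts = [1]                              # length 1: a single permutation, 0 inversions
    for length in range(1, n):                # grow from `length` to `length + 1` items
        new_counts = [0] * (len(counts) + length)
        for k, ways in enumerate(counts):
            for j in range(length + 1):       # j new inversions from the insert position
                new_counts[k + j] += ways
        counts = new_counts
    return counts


row4 = inversion_counts(4)
row5 = inversion_counts(5)
print(row4, row5)
```

The row for n = 4 can be compared against the counts 1, 3, 5, 6, 5, 3, 1 quoted in the question and answers.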
A major problem in computing these coefficients is the degree of the resulting product. The polynomial (1+x)(1+x+x^2)...(1+x+x^2+...+x^n) has degree n*(n+1)/2, which puts a restrictive computational limit on the process. If the results for the product up to n-1 are reused to compute the product up to n, we need to store (n-1)*n/2 + 1 integers. It is possible to use a recursive process instead, which is much slower, and it is still limited by the range of the integer type. The following is some rough and ready recursive code for this problem. The function mahonian(r,c) returns the c-th coefficient of the r-th product. It is extremely slow for products beyond 100 or so; running it makes clear that plain recursion is not the answer.
unsigned int numbertheory::mahonian(unsigned int r, unsigned int c)
{
    unsigned int result=0;
    unsigned int k;
    if(r==0 && c==0)
        return 1;
    if( r==0 && c!=0)
        return 0;
    for(k=0; k <= r; k++)
        if(r > 0 && c >=k)
            result = result + mahonian(r-1,c-k);
    return result;
}
As a matter of interest, I have included the following, which is a C++ version of Sashank's approach and is a lot faster than my recursive example. Note that I use the Armadillo library.
uvec numbertheory::mahonian_row(uword n){
    uword i = 2;
    uvec current;
    current.ones(i);
    uword current_size;
    uvec prev;
    uword prev_size;
    if(n==0){
        current.ones(1);
        return current;
    }
    while (i <= n){ // increment through the rows
        prev_size=current.size(); // reset prev size to current size
        prev.set_size(prev_size); // set size of prev vector
        prev= current; //copy contents of current to prev vector
        current_size =1+ (i*(i+1)/2); // reset current_size
        current.zeros(current_size); // reset current vector with zeros
        for(uword j=0;j<i+1; j++) //increment through current vector
            for(uword k=0; k < prev_size;k++)
                current(k+j) += prev(k);
        i++; //increment to next row
    }
    return current; //return current vector
}
uword numbertheory::mahonian_fast(uword n, uword c) {
    // This function returns the coefficient of order c of row n of
    // the Mahonian numbers
    // check for input errors
    if(c >= 1+ (n*(n+1)/2)) {
        cout << "Error. Invalid input parameters" << endl;
        return 0; // no such coefficient; avoid indexing out of range
    }
    uvec mahonian = mahonian_row(n);
    return mahonian(c);
}
We can make use of dynamic programming to solve this problem. We have n places to fill with the numbers from 1 to n: _ _ _ _ _ _ _ (take n=7). At the very first place we can create at most n-1 inversions and at least 0; similarly, at the second place we can create at most n-2 inversions and at least 0. In general, we can create at most n-i inversions at the ith index, irrespective of which numbers we placed before.
Our recursive formula will look like:
f(n,k) = f(n-1,k) + f(n-1,k-1) + f(n-1,k-2) + ... + f(n-1,max(0,k-(n-1)))
(the terms correspond to creating no inversion, one inversion, two inversions, ..., n-1 inversions)
We can create 0 inversions by placing the smallest of the remaining numbers from the set (1,n),
1 inversion by placing the second smallest, and so on.
The base conditions for our recursive formula are:
if( n==0 && k==0 ) return 1 (valid permutation)
if( n==0 && k!=0 ) return 0 (invalid permutation)
If we draw the recursion tree we will see subproblems repeated multiple times, hence use memoization to reduce the number of distinct subproblems to O(n*k).
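A possible top-down version of this recurrence in Python, using `functools.lru_cache` for the memoization (the function name `f` follows the formula above; the choice of `lru_cache` is mine). Each of the O(n*k) cached states still sums up to n terms, so this is a sketch of the idea rather than the fastest possible implementation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(n, k):
    """Number of permutations of n elements with exactly k inversions."""
    if n == 0:
        return 1 if k == 0 else 0         # base conditions from the text
    # The element placed first creates j inversions, 0 <= j <= min(k, n - 1).
    return sum(f(n - 1, k - j) for j in range(min(k, n - 1) + 1))


print(f(4, 1), f(5, 4), f(5, 5))
```

For example, f(4, 1) reproduces the three permutations listed in the question.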

Determining if two line segments intersect? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
How do you detect where two line segments intersect?
Can someone provide an algorithm or C code for determining if two line segments intersect?
That really depends on how the lines are represented. I'm going to assume that you have them represented in the parametric form
x0(t) = u0 + t v0
x1(t) = u1 + t v1
Here, the x's, u's, and v's are vectors (further denoted in bold) in ℝ^2 and t ∈ [0, 1].
These two segments intersect if there's some point that's on both of them. Thus if there is some point p so that there's a t where
p = x0(t) = u0 + t v0
and an s such that
p = x1(s) = u1 + s v1
and moreover both s, t ∈ [0, 1], then the two segments intersect. Otherwise, they do not.
If we combine the two equalities, we get
u0 + t v0 = u1 + s v1
Or, equivalently,
u0 - u1 = s v1 - t v0
Write the vectors in components as
u0 = (x00, y00)
u1 = (x10, y10)
v0 = (x01, y01)
v1 = (x11, y11)
If we rewrite the above expression in matrix form, we now have that
| x00 - x10 |   | x11 |     | x01 |
| y00 - y10 | = | y11 | s - | y01 | t
This is in turn equivalent to the matrix expression
| x00 - x10 |   | x11  x01 | |  s |
| y00 - y10 | = | y11  y01 | | -t |
Now, we have two cases to consider. First, if this left-hand side is the zero vector, then there's a trivial solution - just set s = t = 0 and the segments intersect. Otherwise, there's a unique solution only if the right-hand matrix is invertible. If we let
        | x11  x01 |
d = det(| y11  y01 |) = x11 y01 - x01 y11
then the inverse of the matrix
| x11  x01 |
| y11  y01 |
is given by
      |  y01  -x01 |
(1/d) | -y11   x11 |
Note that this matrix isn't defined if the determinant is zero. In that case the lines are parallel; they do not cross at a single point, although collinear segments may still overlap, which needs a separate check.
If the matrix is invertible, then we can solve the above linear system by left-multiplying by this matrix:
|  s |         |  y01  -x01 | | x00 - x10 |
| -t | = (1/d) | -y11   x11 | | y00 - y10 |
               |  (x00 - x10) y01 - (y00 - y10) x01 |
       = (1/d) | -(x00 - x10) y11 + (y00 - y10) x11 |
So this means that
s = (1/d) ((x00 - x10) y01 - (y00 - y10) x01)
t = -(1/d) (-(x00 - x10) y11 + (y00 - y10) x11)
If both of these values are in the range [0, 1], then the two line segments intersect and you can compute the intersection point. Otherwise, they do not intersect. Additionally, if d is zero then the two lines are parallel, which may or may not be of interest to you. Coding this up in C shouldn't be too bad; you just need to make sure to be careful not to divide by zero.
If anyone can double-check the math, that would be great.
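The question asked for C, but as a quick way to check the formulas here is a Python sketch of the same computation. The function name `segment_intersection`, the endpoint-based interface and the `eps` tolerance are assumptions for this example; parallel and collinear segments (d close to zero) are simply reported as "no single intersection point" and would need a separate overlap test.

```python
def segment_intersection(p0, p1, q0, q1, eps=1e-12):
    """Return the intersection point of segments p0-p1 and q0-q1, or None."""
    # Segment 0: u0 = p0, v0 = p1 - p0.  Segment 1: u1 = q0, v1 = q1 - q0.
    x00, y00 = p0
    x01, y01 = p1[0] - p0[0], p1[1] - p0[1]
    x10, y10 = q0
    x11, y11 = q1[0] - q0[0], q1[1] - q0[1]

    d = x11 * y01 - x01 * y11             # determinant from the derivation
    if abs(d) < eps:
        return None                       # parallel or collinear: not handled here

    dx, dy = x00 - x10, y00 - y10
    s = (dx * y01 - dy * x01) / d         # parameter along segment 1
    t = (dx * y11 - dy * x11) / d         # parameter along segment 0
    if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
        return (x00 + t * x01, y00 + t * y01)
    return None


crossing = segment_intersection((0, 0), (2, 2), (0, 2), (2, 0))
too_short = segment_intersection((0, 0), (0.5, 0.5), (0, 2), (2, 0))
parallel = segment_intersection((0, 0), (1, 0), (0, 1), (1, 1))
print(crossing, too_short, parallel)
```

The two diagonals of the square with corners (0, 0) and (2, 2) meet at (1, 1), which matches s = t = 0.5 in the formulas above.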
You could build an equation for two lines, find the point of intersection and then check if it belongs to those segments.

Resources