Complexity of sampling from mixture model - complexity-theory

I have a model where state j among M states is chosen with probability p_j. The probabilities can be arbitrary real numbers in [0, 1] (with no special structure), and together they specify a mixture model over the M states. I can access p_j for any j in constant time.
I want to make a large number (N) of random samples. The most obvious algorithm is
1) Compute the cumulative probability distribution P_j = p_1 + p_2 + ... + p_j for all j. O(M)
2) For each sample, choose a random float x in [0, 1]. O(N)
3) For each sample, find (by binary search) the j such that P_{j-1} < x <= P_j, taking P_0 = 0. O(N log M)
So the asymptotic complexity is O(M + N log M), dominated by the N log M term when N is large. The factor of N is obviously unavoidable, but I am wondering about log M. Is it possible to beat this factor in a realistic implementation?
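For concreteness, here is a minimal Python sketch of this baseline (the helper names build_cdf and sample are just illustrative):

import bisect
import random

def build_cdf(probs):
    # prefix sums P_j = p_1 + ... + p_j
    cdf = []
    total = 0.0
    for p in probs:
        total += p
        cdf.append(total)
    return cdf

def sample(cdf, n):
    # each draw is a binary search over the cumulative probabilities, O(log M)
    m = len(cdf)
    return [min(bisect.bisect_left(cdf, random.random()), m - 1) for _ in range(n)]

cdf = build_cdf([0.2, 0.5, 0.3])   # three states
print(sample(cdf, 10))             # ten draws of 0-based state indices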

I think you can do better using something like the following algorithm, or any other reasonable Multinomial distribution sampler,
// Normalize p_j by the total mass P_M = p_1 + ... + p_M
for j = 1 to M
    p_hat[j] = p[j] / P_M

// Place the draws from the mixture model in this array
draws = [];

// Sample until we have N iid samples
mass_left = 1.0;
for ( j = 1, remaining = N; j <= M && remaining > 0; j++ )
{
    // p_hat[j] is the probability of sampling item j, and there are
    // `remaining` samples still to draw. Those are just `remaining`
    // Bernoulli trials, so draw from a
    // Binomial(remaining, p_hat[j] / mass_left) distribution to get
    // the number of samples assigned to component j.
    //
    // Dividing by the remaining probability mass ensures that
    // *something* is sampled at the end, because
    // p_hat[M] / mass_left = 1.0 when j = M.
    items = Binomial.sample( remaining, p_hat[j] / mass_left );
    remaining -= items;
    mass_left -= p_hat[j];
    for ( k = 0; k < items; k++ )
        draws.push( sample_from_mixture_component( j ) );
}
This should take close to O(M + N) time, but it does depend on how efficient your binomial-distribution and mixture-model component samplers are.
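For reference, a runnable sketch of the same idea in Python using NumPy's binomial sampler (all names here are illustrative; the per-component "sample" is just the component index, as in the question):

import random
import numpy as np

def sample_mixture(p, n, rng=None):
    # p: (possibly unnormalized) component probabilities; n: number of iid draws
    rng = rng or np.random.default_rng()
    p_hat = np.asarray(p, dtype=float)
    p_hat = p_hat / p_hat.sum()

    draws = []
    mass_left = 1.0
    remaining = n
    for j, pj in enumerate(p_hat):
        if remaining == 0:
            break
        # conditional probability of component j, given that components < j were not chosen
        p_cond = 1.0 if j == len(p_hat) - 1 else min(1.0, pj / mass_left)
        items = int(rng.binomial(remaining, p_cond))
        draws.extend([j] * items)   # replace j with a draw from component j if needed
        remaining -= items
        mass_left -= pj
    random.shuffle(draws)           # otherwise the draws come out grouped by component
    return draws

print(sample_mixture([0.2, 0.5, 0.3], 10))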

Related

Algorithm for partial symmetry search

I'm trying to calculate, as a percentage, how symmetric a given matrix is.
The "traditional" way of checking symmetry would be: the input is an arbitrary square matrix M of size N×N, and the algorithm's output must be true (= symmetric) if M[i,j] = M[j,i] for all i≠j, false otherwise.
What would be an adequate way of calculating a percentage, rather than just saying symmetric or asymmetric? Maybe counting the positions where M[i,j] ≠ M[j,i] and dividing by the overall number of (i,j) pairs?
So, for example, if I have the following matrices:

A = 1 1 1    B = 1 1 1
    2 2 2        2 2 2
    1 1 2        3 4 5

then I need to know that A is "more symmetric" than B, even though both are not symmetric.
You should first define a per-cell symmetry distance metric. It should be zero if the two symmetric cells are the same, and some positive number if they aren't.
For example:
s(i,j):= (m(i,j)==m(j,i) ? 0:1) // returns 0/1 if the symmetric cell is/isn't the same
or
s(i,j):= |m(i,j)-m(j,i)| // returns the absolute difference between the two cells
Then just sum the distances for all the cells:
int SymmetricDistance(matrix) {
    int dist = 0;
    for (int i = 0; i < matrix.Width; i++)      // check that the matrix is square first
        for (int j = i; j < matrix.Width; j++)
            dist = dist + s(i, j);
    return dist;
}
now you can say matrix A is "more symmetric" than matrix B iff
SymmetricDistance(A) < SymmetricDistance(B)
I agree with @Sten Petrov's answer in general. However, if you are looking specifically for a percentage of symmetry:
First, find the total number of pairs of elements which could be symmetric in an NxN matrix.
You can find this by splitting the matrix along the diagonal and counting the elements on and below it. Going from an (N-1)×(N-1) matrix to an N×N matrix adds N such pairs (the new row up to and including the diagonal), so the total number of pairs is the sum of the numbers from 1 to N. Rather than looping, just use the closed-form sum:
Total Possible = N * (N + 1) / 2
The matrix is perfectly symmetrical iff all the pairs are symmetrical. Therefore, percentage of symmetry can be defined as the fraction of symmetric pairs to total possible pairs.
Symmetry = Symmetric Pairs / Total Pairs
Pseudo-code:
int matchingPairs = 0;
int N = matrix.Width;
int possiblePairs = N * (N + 1) / 2;
for (int i = 0; i < N; ++i) {
    for (int j = 0; j <= i; ++j) {
        matchingPairs += (matrix[i][j] == matrix[j][i]) ? 1 : 0;
    }
}
float percentSymmetric = (float) matchingPairs / possiblePairs;
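As a quick sanity check, here is the same computation sketched in Python and applied to the two matrices from the question (assuming, as in the pseudo-code above, that diagonal elements count as trivially matching):

def percent_symmetric(m):
    n = len(m)
    possible = n * (n + 1) // 2
    matching = sum(1 for i in range(n) for j in range(i + 1) if m[i][j] == m[j][i])
    return matching / possible

A = [[1, 1, 1], [2, 2, 2], [1, 1, 2]]
B = [[1, 1, 1], [2, 2, 2], [3, 4, 5]]
print(percent_symmetric(A))   # 4/6 ~ 0.67
print(percent_symmetric(B))   # 3/6 = 0.50, so A is "more symmetric" than B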

Boolean matrix multiplication algorithm

This is my first question on Stack Overflow. I've been solving some exercises from "Algorithm Design" by Goodrich and Tamassia. However, I'm quite clueless about this problem: I'm unsure where to start and how to proceed. Any advice would be great. Here's the problem:
Boolean matrices are matrices such that each entry is 0 or 1, and matrix multiplication is performed by using AND for * and OR for +. Suppose we are given two NxN random Boolean matrices A and B, so that the probability that any entry
in either is 1, is 1/k. Show that if k is a constant, then there is an algorithm for multiplying A and B whose expected running time is O(n^2). What if k is n?
Matrix multiplication using the standard iterative approach is O(n^3), because you have to iterate over n rows and n columns, and for each element do a vector multiply of one of the rows and one of the columns, which takes n multiplies and n-1 additions.
Pseudo-code to multiply matrix a by matrix b and store the result in matrix c:
for(i = 0; i < n; i++)
{
    for(j = 0; j < n; j++)
    {
        int sum = 0;
        for(m = 0; m < n; m++)
        {
            sum += a[i][m] * b[m][j];
        }
        c[i][j] = sum;
    }
}
For a boolean matrix, as specified in the problem, AND is used in place of multiplication and OR in place of addition, so it becomes this:
for(i = 0; i < n; i++)
{
    for(j = 0; j < n; j++)
    {
        boolean value = false;
        for(m = 0; m < n; m++)
        {
            value = value || (a[i][m] && b[m][j]);
            if(value)
                break; // early out
        }
        c[i][j] = value;
    }
}
The thing to notice here is that once our boolean "sum" is true, we can stop calculating and early out of the innermost loop, because ORing any subsequent values with true is going to produce true, so we can immediately know that the final result will be true.
For any constant k, the number of operations before we can do this early out (assuming the values are random) is going to depend on k and will not increase with n. At each iteration there will be a (1/k)^2 chance that the loop will terminate, because we need two 1s for that to happen and the chance of each entry being a 1 is 1/k. The number of iterations before terminating follows a Geometric distribution where p is (1/k)^2, and the expected number of "trials" (iterations) before "success" (breaking out of the loop) doesn't depend on n (except as an upper bound for the number of trials), so the innermost loop runs in constant time (on average) for a given k, making the overall algorithm O(n^2). The Geometric distribution formula should give you some insight about what happens if k = n. Note that in the formula given on Wikipedia, k is the number of trials.
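To put a number on that expectation (my own arithmetic, using the geometric distribution cited above): with per-iteration success probability p = (1/k)^2, the expected number of inner-loop iterations before the early out is

E[iterations] = 1/p = k^2

capped at n, since the loop never runs more than n times. For constant k this is a constant, so each of the n^2 cells costs O(1) on average; plugging k = n into the same expression shows why the analysis changes in that case.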

Algorithm for matrix addition and multiplication

Let m, n be integers such that 0<= m,n< N.
Define:
Algorithm A: Computes m + n in time O(A(N))
Algorithm B: Computes m*n in time O(B(N))
Algorithm C: Computes m mod n in time O(C(N))
Using any combination of algorithms A, B and C describe an algorithm for N X N matrix addition and matrix multiplication with entries in Z/NZ. Also indicate the algorithm's run time big-O notation.
Attempt at solution:
For N X N addition in Z/NZ:
Let A and B be N X N matrices in Z/NZ with entries a_{ij} and b_{ij}, where i, j in {1, ..., N}, i being the row index and j the column index. Also, let A + B = C.
Step 1. Run Algorithm A to get a_{ij} + b_{ij} = c_{ij} in time O(A(N))
Step 2. Run Algorithm C to get c_{ij} mod N in time O(C(N))
Repeat Steps 1 and 2 for all i, j in {1, ..., N}.
This means that we have to repeat Steps 1 and 2 N^2 times, so the total run time is estimated by
N^2 [ O(A(N)) + O(C(N)) ] = O(N^2 A(N)) + O(N^2 C(N)) = O(N^2 A(N) + N^2 C(N)).
For the multiplication algorithm I just replaced Step 1 by Algorithm B and got the total run time O(N^2 B(N) + N^2 C(N)), just like above.
Please tell me if I am approaching this problem correctly, especially with the big-O notation.
Thanks.
Your algorithm for matrix multiplication is wrong and will yield a wrong answer, since (A*B)_{i,j} != A_{i,j} * B_{i,j} (except in special cases, such as the zero matrix).
I assume the goal of the question is not to implement an efficient matrix multiplication, since it's a hard and still studied problem, so I will answer for the naive implementation of matrix multiplication.
For any indices i,j:
(AB)_{i,j} = Sum_k( A_{i,k} * B_{k,j} ) =
= A_{i,1} * B_{1,j} + A_{i,2} * B_{2,j} + ... + A_{i,k} * B_{k,j} + ... + A_{i,n} * B_{n,j}
As you can see, for each pair i,j there are n multiplications and n-1 additions. Regarding the number of invocations of C: it depends on whether you need to invoke it after each addition or only once at the end (it really depends on how many bits you have available to represent each number), so for each pair i,j you might need anywhere from one to 2n-1 invocations.
This gives a total complexity of (assuming 2n-1 modulus operations for each (i,j) pair; if fewer are needed, as explained above, adjust accordingly):
O(n^3*A + n^3*B + n^3*C)
As a side note, a good sanity check that suggests the element-wise algorithm is too good to be true: matrix multiplication is known to require Omega(n^2 log n) in restricted circuit models (Raz, 2002), and the best known general algorithms run in roughly O(n^2.37).
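For concreteness, here is a minimal Python sketch of the naive O(n^3) modular matrix multiplication described above, with the abstract A/B/C operations written out as ordinary +, * and % (the function and variable names are illustrative):

def mat_mul_mod(X, Y, N):
    # multiply two n x n matrices X, Y with entries in Z/NZ
    n = len(X)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0
            for k in range(n):
                s = (s + X[i][k] * Y[k][j]) % N   # reduce after every addition
            result[i][j] = s
    return result

X = [[1, 2], [3, 4]]
Y = [[0, 1], [1, 0]]
print(mat_mul_mod(X, Y, 5))   # [[2, 1], [4, 3]]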
#include <stdio.h>

int main(void)
{
    int i, j;
    int a[3][3], b[3][3];

    printf("enter elements for matrix 1\n");
    for (i = 0; i < 3; i++)
    {
        for (j = 0; j < 3; j++)
        {
            scanf("%d", &a[i][j]);
        }
        printf("\n");
    }

    printf("enter elements for matrix 2\n");
    for (i = 0; i < 3; i++)
    {
        for (j = 0; j < 3; j++)
        {
            scanf("%d", &b[i][j]);
        }
        printf("\n");
    }

    printf("the sum of matrix 1 and 2\n");
    for (i = 0; i < 3; i++)
    {
        for (j = 0; j < 3; j++)
        {
            printf("%d\n", (a[i][j] + b[i][j]));
        }
        printf("\n");
    }
    return 0;
}

maximum sum of a subset of size K with sum less than M

Given:
array of integers
values K, M
Question:
Find the maximum sum that can be obtained from all K-element subsets of the given array, such that the sum is less than the value M.
Is there a non-dynamic-programming solution to this problem, or can this type of problem only be solved with a dp[i][j][k] formulation? Can you please explain the algorithm?
Many people have commented correctly that the answer below from years ago, which uses dynamic programming, incorrectly encodes solutions allowing an element of the array to appear in a "subset" multiple times. Luckily there is still hope for a DP based approach.
Let dp[i][j][k] = true if there exists a size k subset of the first i elements of the input array summing up to j
Our base case is dp[0][0][0] = true
Now, either a size-k subset of the first i + 1 elements uses a[i + 1], or it does not, giving the recurrence
dp[i + 1][j][k] = dp[i][j - a[i + 1]][k - 1] OR dp[i][j][k]
Put everything together:
given A[1...N]
initialize dp[0...N][0...M][0...K] to false
dp[0][0][0] = true
for i = 0 to N - 1:
    for j = 0 to M:
        for k = 0 to K:
            if dp[i][j][k]:
                dp[i + 1][j][k] = true
            if j >= A[i + 1] and k >= 1 and dp[i][j - A[i + 1]][k - 1]:
                dp[i + 1][j][k] = true
max_sum = 0
for j = 0 to M:
    if dp[N][j][K]:
        max_sum = j
return max_sum
giving O(NMK) time and space complexity.
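A direct Python translation of this pseudocode, as a sketch (0-based indexing; it assumes non-negative integer entries, a point discussed further below):

def max_k_subset_sum(A, K, M):
    N = len(A)
    # dp[i][j][k] = True if some k-element subset of A[:i] sums to exactly j
    dp = [[[False] * (K + 1) for _ in range(M + 1)] for _ in range(N + 1)]
    dp[0][0][0] = True
    for i in range(N):
        for j in range(M + 1):
            for k in range(K + 1):
                if dp[i][j][k]:
                    dp[i + 1][j][k] = True                      # skip A[i]
                if j >= A[i] and k >= 1 and dp[i][j - A[i]][k - 1]:
                    dp[i + 1][j][k] = True                      # take A[i]
    # largest achievable sum <= M (use M - 1 here if the bound must be strict)
    return max((j for j in range(M + 1) if dp[N][j][K]), default=None)

print(max_k_subset_sum([3, 34, 4, 12, 5, 2], K=3, M=20))   # 20 (3 + 12 + 5)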
Stepping back, we've made one assumption here implicitly which is that A[1...i] are all non-negative. With negative numbers, initializing the second dimension 0...M is not correct. Consider a size K subset made up of a size K - 1 subset with sum exceeding M and one other sufficiently negative element of A[] such that overall sum no longer exceeds M. Similarly, our size K - 1 subset could sum to some extremely negative number and then with a sufficiently positive element of A[] sum to M. In order for our algorithm to still work in both cases we would need to increase the second dimension from M to the difference between the sum of all positive elements in A[] and the sum of all negative elements (the sum of the absolute values of all elements in A[]).
As for whether a non dynamic programming solution exists, certainly there is the naive exponential time brute force solution and variations that optimize the constant factor in the exponent.
Beyond that? Well your problem is closely related to subset sum and the literature for the big name NP complete problems is rather extensive. And as a general principle algorithms can come in all shapes and sizes -- it's not impossible for me to imagine doing say, randomization, approximation, (just choose the error parameter to be sufficiently small!) plain old reductions to other NP complete problems (convert your problem into a giant boolean circuit and run a SAT solver). Yes these are different algorithms. Are they faster than a dynamic programming solution? Some of them, probably. Are they as simple to understand or implement, without say training beyond standard introduction to algorithms material? Probably not.
This is a variant of the knapsack / subset-sum problem, where, in terms of time (at the cost of space requirements that grow exponentially as the input size grows), dynamic programming is the most efficient method that CORRECTLY solves this problem. See Is this variant of the subset sum problem easier to solve? for a similar question to yours.
However, since your problem is not exactly the same, I'll provide an explanation anyways. Let dp[i][j] = true, if there is a subset of length i that sums to j and false if there isn't. The idea is that dp[][] will encode the sums of all possible subsets for every possible length. We can then simply find the largest j <= M such that dp[K][j] is true. Our base case dp[0][0] = true because we can always make a subset that sums to 0 by picking one of size 0.
The recurrence is also fairly straightforward. Suppose we've calculated the values of dp[][] using the first n values of the array. To find all possible subsets of the first n + 1 values of the array, we can simply take the (n+1)-th value and add it to all the subsets we've seen before. More concretely, we have the following code:
initialize dp[0..K][0..M] to false
dp[0][0] = true
for i = 0 to N:
    for s = 0 to K - 1:
        for j = M to 0:
            if dp[s][j] && A[i] + j < M:
                dp[s + 1][j + A[i]] = true
for j = M to 0:
    if dp[K][j]:
        print j
        break
We're looking for a subset of K elements for which the sum of the elements is a maximum, but less than M.
We can place bounds [X, Y] on the largest element in the subset as follows.
First we sort the N integers, values[0] ... values[N-1], so that values[0] is the smallest.
The lower bound X is the largest integer for which
values[X] + values[X-1] + .... + values[X-(K-1)] < M.
(If X is N-1, then we've found the answer.)
The upper bound Y is the largest integer less than N for which
values[0] + values[1] + ... + values[K-2] + values[Y] < M.
With this observation, we can now bound the second-highest term for each value of the highest term Z, where
X <= Z <= Y.
We can use exactly the same method, since the form of the problem is exactly the same. The reduced problem is finding a subset of K-1 elements, taken from values[0] ... values[Z-1], for which the sum of the elements is a maximum, but less than M - values[Z].
Once we've bound that value in the same way, we can put bounds on the third-largest value for each pair of the two highest values. And so on.
This gives us a tree structure to search, hopefully with much fewer combinations to search than N choose K.
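A minimal recursive sketch of this idea in Python (my own illustration; it applies only the simple lower-bound prune, not the full incremental bound bookkeeping described above):

def best_k_subset_below(values, k, budget):
    # values must be sorted ascending; returns the maximum sum < budget
    # over subsets of exactly k values, or None if no such subset exists
    def rec(hi, k, budget):
        if k == 0:
            return 0 if budget > 0 else None
        best = None
        # the largest chosen element is values[z] for some z in [k-1, hi]
        for z in range(hi, k - 2, -1):
            # prune: even the k-1 smallest values cannot keep the total under budget
            if values[z] + sum(values[:k - 1]) >= budget:
                continue
            sub = rec(z - 1, k - 1, budget - values[z])
            if sub is not None and (best is None or values[z] + sub > best):
                best = values[z] + sub
        return best
    return rec(len(values) - 1, k, budget)

vals = sorted([3, 34, 4, 12, 5, 2])
print(best_k_subset_below(vals, 3, 20))   # 19, e.g. 12 + 5 + 2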
Felix is correct that this is a special case of the knapsack problem. His dynamic programming algorithm takes O(K*M) size and O(K*K*M) amount of time. I believe his use of the variable N really should be K.
There are two books devoted to the knapsack problem. The latest one, by Kellerer, Pferschy and Pisinger [2004, Springer-Verlag, ISBN 3-540-40286-1], gives an improved dynamic programming algorithm on page 76, Figure 4.2, that takes O(K + M) space and O(KM) time, which is a huge reduction compared to the dynamic programming algorithm given by Felix. Note that there is a typo on the book's last line of the algorithm, where it should be c-bar := c-bar - w_(r(c-bar)).
My C# implementation is below. I cannot say that I have extensively tested it, and I welcome feedback on this. I used BitArray to implement the concept of the sets given in the algorithm in the book. In my code, c is the capacity (which in the original post was called M), and I used w instead of A as the array that holds the weights.
An example of its use is:
int[] optimal_indexes_for_ssp = new SubsetSumProblem(12, new List<int> { 1, 3, 5, 6 }).SolveSubsetSumProblem();
where the array optimal_indexes_for_ssp contains [0,2,3] corresponding to the elements 1, 5, 6.
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;

public class SubsetSumProblem
{
    private int[] w;
    private int c;

    public SubsetSumProblem(int c, IEnumerable<int> w)
    {
        if (c < 0) throw new ArgumentOutOfRangeException("Capacity for subset sum problem must be at least 0, but input was: " + c.ToString());
        int n = w.Count();
        this.w = new int[n];
        this.c = c;
        IEnumerator<int> pwi = w.GetEnumerator();
        pwi.MoveNext();
        for (int i = 0; i < n; i++, pwi.MoveNext())
            this.w[i] = pwi.Current;
    }

    public int[] SolveSubsetSumProblem()
    {
        int n = w.Length;
        int[] r = new int[c + 1];
        BitArray R = new BitArray(c + 1);
        R[0] = true;
        BitArray Rp = new BitArray(c + 1);
        for (int d = 0; d <= c; d++) r[d] = 0;
        for (int j = 0; j < n; j++)
        {
            Rp.SetAll(false);
            for (int k = 0; k <= c; k++)
                if (R[k] && k + w[j] <= c) Rp[k + w[j]] = true;
            for (int k = w[j]; k <= c; k++) // since Rp[k]=false for k<w[j]
                if (Rp[k])
                {
                    if (!R[k]) r[k] = j;
                    R[k] = true;
                }
        }
        int capacity_used = 0;
        for (int d = c; d >= 0; d--)
            if (R[d])
            {
                capacity_used = d;
                break;
            }
        List<int> result = new List<int>();
        while (capacity_used > 0)
        {
            result.Add(r[capacity_used]);
            capacity_used -= w[r[capacity_used]];
        }
        if (capacity_used < 0) throw new Exception("Subset sum program has an internal logic error");
        return result.ToArray();
    }
}

Probability of Outcomes Algorithm

I have a probability problem, which I need to simulate in a reasonable amount of time. In simplified form, I have 30 unfair coins each with a different known probability. I then want to ask things like "what is the probability that exactly 12 will be heads?", or "what is the probability that AT LEAST 5 will be tails?".
I know basic probability theory, so I know I can enumerate all (30 choose x) possibilities, but that's not particularly scalable. The worst case (30 choose 15) has over 150 million combinations. Is there a better way to approach this problem from a computational standpoint?
Any help is greatly appreciated, thanks! :-)
You can use a dynamic programming approach.
For example, to calculate the probability of 12 heads out of 30 coins, let P(n, k) be the probability that there's k heads from the first n coins.
Then P(n, k) = p_n * P(n - 1, k - 1) + (1 - p_n) * P(n - 1, k)
(here p_i is the probability the i'th coin is heads).
You can now use this relation in a dynamic programming algorithm. Have a vector of 13 probabilities (that represent P(n - 1, i) for i in 0..12). Build a new vector of 13 for P(n, i) using the above recurrence relation. Repeat until n = 30. Of course, you start with the vector (1, 0, 0, 0, ...) for n=0 (since with no coins, you're sure to get no heads).
The worst case using this algorithm is O(n^2) rather than exponential.
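A short Python sketch of this rolling-vector DP (my own illustration; here p is just a list of the per-coin heads probabilities):

def heads_distribution(p):
    # p[i] = probability that coin i comes up heads
    # returns q where q[k] = P(exactly k heads among all coins)
    q = [1.0]                                    # zero coins: certainly zero heads
    for pi in p:
        new_q = [0.0] * (len(q) + 1)
        for k, prob in enumerate(q):
            new_q[k] += (1 - pi) * prob          # this coin comes up tails
            new_q[k + 1] += pi * prob            # this coin comes up heads
        q = new_q
    return q

probs = [0.5] * 30                               # e.g. 30 fair coins; use the real probabilities here
dist = heads_distribution(probs)
print(dist[12])                                  # P(exactly 12 heads)
print(sum(dist[:26]))                            # P(at least 5 tails) = P(at most 25 heads)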
This is actually an interesting problem. I was inspired to write a blog post about it covering in detail fair vs unfair coin tosses all the way to the OP's situation of having a different probability for each coin. You need a technique called dynamic programming to solve this problem in polynomial time.
General Problem: Given C, a series of n coins p1 to pn where pi represents the probability of the i-th coin coming up heads, what is the probability of k heads coming up from tossing all the coins?
This means solving the following recurrence relation, where P(n,k,C,i) is the probability of getting k heads from coins i through n of C:
P(n,k,C,i) = p_i x P(n,k-1,C,i+1) + (1-p_i) x P(n,k,C,i+1)
A Java code snippet that does this is:
private static void runDynamic() {
    double[] coins = { 0.2, 0.3, 0.4 };
    long start = System.nanoTime();
    double[] probs = dynamic(coins);
    long end = System.nanoTime();
    for (int i = 0; i < probs.length; i++) {
        System.out.printf("%d : %,.4f%n", i, probs[i]);
    }
    System.out.printf("%nDynamic ran for %d coins in %,.3f ms%n%n",
            coins.length, (end - start) / 1000000d);
}

private static double[] dynamic(double... coins) {
    double[][] table = new double[coins.length + 2][];
    for (int i = 0; i < table.length; i++) {
        table[i] = new double[coins.length + 1];
    }
    table[1][coins.length] = 1.0d; // everything else is 0.0
    for (int i = 0; i <= coins.length; i++) {
        for (int j = coins.length - 1; j >= 0; j--) {
            table[i + 1][j] = coins[j] * table[i][j + 1] +
                              (1 - coins[j]) * table[i + 1][j + 1];
        }
    }
    double[] ret = new double[coins.length + 1];
    for (int i = 0; i < ret.length; i++) {
        ret[i] = table[i + 1][0];
    }
    return ret;
}
What this is doing is constructing a table that shows the probability that a sequence of coins from pi to pn contain k heads.
For a deeper introduction to binomial probability and a discussion on how to apply dynamic programming take a look at Coin Tosses, Binomials and Dynamic Programming.
Pseudocode:
procedure PROB(n, k, p)
/*
   input:  n - number of coins flipped
           k - number of heads
           p - list of probabilities for the n coins, where p[i] is the probability that coin i comes up heads
   output: probability of k heads in n flips
   assumptions: 1 <= i <= n, p[i] in [0,1], 0 <= k <= n,
                additions and multiplications of numbers in [0,1] are O(1)
*/
A = (k+1) x (n+1) matrix, initialized to 0
A[0][0] = 1                         // probability of no heads given no coins flipped = 100%
for i = 0 to k:                     // O(k)
    if i != 0 then A[i][i] = A[i-1][i-1] * p[i]
    for j = i + 1 to n - k + i:     // O(n - k) = O(n)
        if i != 0 then A[i][j] = p[j] * A[i-1][j-1] + (1 - p[j]) * A[i][j-1]
        otherwise A[i][j] = (1 - p[j]) * A[i][j-1]
return A[k][n]                      // probability of k heads given n flips
Worst case = O(kn)
