I am trying to use the Cholesky decomposition to generate a matrix of multivariate normal samples with this: Y = U + X*L
U is the mean vector: n x m
L from cholesky: m x m
X is a matrix with univariate normal vectors: n x m
After calculating the mean of the simulated matrix, I realized it was off. The reason is that the mean vector is very close to zero, so when adding it to X*L, the X*L term dominates U. Does anyone know how to work around this issue?
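For reference, a minimal numpy sketch of the construction described above; the covariance Sigma, the mean mu and the sample size are made-up illustrative values. Note that the sample mean of Y fluctuates around mu by roughly sqrt(diag(Sigma)/n), so with a mean close to zero the fluctuation can easily dominate it.

import numpy as np

rng = np.random.default_rng(0)

n, m = 100000, 3                          # n samples, m dimensions (illustrative)
Sigma = np.array([[2.0, 0.5, 0.1],
                  [0.5, 1.0, 0.3],
                  [0.1, 0.3, 1.5]])       # target covariance, m x m
mu = np.array([0.01, -0.02, 0.005])       # mean vector, close to zero as in the question

L = np.linalg.cholesky(Sigma)             # lower triangular, Sigma = L @ L.T
X = rng.standard_normal((n, m))           # i.i.d. standard normal entries
U = np.tile(mu, (n, 1))                   # n x m matrix whose rows are the mean vector
Y = U + X @ L.T                           # rows of Y ~ N(mu, Sigma)

print(Y.mean(axis=0))                     # fluctuates around mu by ~ sqrt(diag(Sigma)/n)
print(np.cov(Y, rowvar=False))            # close to Sigma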
One of the questions I've come across in my textbook is:
In Computer Graphics transformations are applied on many vertices on the screen. Translation, Rotations
and Scaling.
Assume you’re operating on a vertex with 3 values (X, Y, 1). X, Y being the X Y coordinates and 1 is always
constant
A Translation is done on X as X = X + X’ and on Y as Y = Y + Y’
X’ and Y’ being the values to translate by
A scaling is done on X as X = aX and on Y as Y = bY
a and b being the scaling factors
Propose the best way to store these linear equations and an optimal way to calculate them on each vertex
It is hinted that it involves matrix multiplication and Strassen. However, I'm not sure where to start. It doesn't involve complex code, and the problem states to come up with something simple to showcase my idea, but all Strassen implementations I've come across are definitely too large to call simple. What should my thought process be here?
Would my matrix look like this, with a 3x3 for each equation, or do I combine them all into one?
[ X X X']
[ Y Y Y']
[ 1 1 1 ]
What you're trying to find is a transformation matrix, which you can then use to transform some current (x, y) point into the next (nx, ny) point. In other words, we want
start = Vec([x, y, 1])
matrix = Matrix(...)
next = start * matrix // * is matrix multiplication
Now, if your next is supposed to look something like Vec([a * x + x', b * y + y', 1]), we can work our way backwards to figure out the matrix. First, look at just the x component. We're going to effectively take the dot product of our start vector and the topmost row of our matrix, yielding a * x + x'.
If we write it out more explicitly, we want a * x + 0 * y + x' * 1. Hopefully that makes it a bit easier to see that the vector we want to dot start with is Vec([a, 0, x']). We can repeat this for the remaining two rows of the matrix and obtain the following matrix:
matrix = Matrix(
    [[a, 0, x'],
     [0, b, y'],
     [0, 0, 1]])
Double check that this makes sense and seems reasonable to you. If we take our start vector and multiply it with this matrix, we'll get the translated vector next as Vec([a * x + x', b * y + y', 1]).
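As a quick sanity check, here is a small numpy sketch of this matrix, treating the vertex as a column vector so that each row of the matrix is dotted with (x, y, 1); the values of a, b, x' and y' are arbitrary:

import numpy as np

a, b = 2.0, 3.0          # scaling factors (arbitrary example values)
tx, ty = 5.0, -1.0       # translation amounts x' and y' (arbitrary example values)

matrix = np.array([[a,   0.0, tx],
                   [0.0, b,   ty],
                   [0.0, 0.0, 1.0]])

start = np.array([10.0, 20.0, 1.0])   # the vertex (x, y, 1)
print(matrix @ start)                 # [a*10 + tx, b*20 + ty, 1] = [25., 59., 1.]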
Now for the real beauty of this: the matrix itself doesn't care at all about what our inputs are; it's completely independent of them. So, we can repeatedly apply this matrix over and over again to step forward through more scalings and translations.
next_next_next = start * matrix * matrix * matrix
Knowing this, we can actually compute many steps ahead really quickly, using some mathematical tricks. Multiplying by the matrix n times is the same as multiplying by the matrix raised to the nth power. And fortunately, we have an efficient method for computing a matrix raised to a power: it's called exponentiation by squaring (it actually applies to regular numbers as well, but here we're concerned with multiplying matrices, and the logic still applies). In a nutshell, rather than multiplying by the number or matrix over and over again n times, we square it and multiply intermediate values by the original number / matrix at the right times, reaching the desired power in log(n) multiplications.
This is almost certainly what your professor wants you to realize. You can simulate n translations / scalings / rotations (yes, there are rotation matrices as well) in log(n) time.
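Here is a minimal sketch of exponentiation by squaring for matrices; numpy also ships the same thing as numpy.linalg.matrix_power, which the last line uses as a cross-check.

import numpy as np

def mat_pow(M, n):
    # Raise the square matrix M to the non-negative integer power n
    # using exponentiation by squaring: O(log n) matrix multiplications.
    result = np.eye(M.shape[0])
    base = M.copy()
    while n > 0:
        if n & 1:                    # if the current bit of n is set,
            result = result @ base   # fold this power of M into the result
        base = base @ base           # square
        n >>= 1
    return result

# Example: the scale-and-translate matrix from above, applied 10 times.
M = np.array([[2.0, 0.0, 5.0],
              [0.0, 3.0, -1.0],
              [0.0, 0.0, 1.0]])
print(np.allclose(mat_pow(M, 10), np.linalg.matrix_power(M, 10)))   # True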
Extra Mile
What's even cooler is that using some more advanced linear algebra, you can actually do it even faster. You can diagonalize your matrix (meaning you rewrite your matrix as P * D * P^-1, that is, the product of some matrix P with a matrix D whose only non-zero elements are along the main diagonal, multiplied by the inverse of P). You can then raise this diagonalized matrix to a power really quickly, because (P * D * P^-1) * (P * D * P^-1) simplifies to P * D * D * P^-1, and this generalizes to:
M^N = (P * D * P^-1)^N = (P * D^N * P^-1)
Since D only has non-zero elements along its diagonal, you can raise it to any power by just raising each individual diagonal element to that power, which is just ordinary scalar exponentiation across as many elements as your matrix is wide/tall. This is stupidly fast; you then do a single matrix multiplication on either side to arrive at M^N, and finally multiply your start vector with this for your end result.
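A short numpy sketch of the diagonalisation trick, assuming the matrix really is diagonalizable (the example matrix is arbitrary):

import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])            # arbitrary diagonalizable (here symmetric) matrix
N = 8

eigvals, P = np.linalg.eig(M)         # M = P @ diag(eigvals) @ P^-1
P_inv = np.linalg.inv(P)

# Raise D to the Nth power element-wise, then undo the change of basis.
M_pow = P @ np.diag(eigvals ** N) @ P_inv

print(np.allclose(M_pow, np.linalg.matrix_power(M, N)))   # True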
I am working on a little puzzle-game project. The basic idea is built around projecting multi-dimensional data down to 2D. My only problem is how to generate the randomized scenario data. Here is the problem:
I have multiple randomized vectors v_i and a target vector t, all 2D. Now I want to randomize scalar values c_i such that:
t = sum c_i v_i
Because there are more than two v_i, this is an underdetermined system. I also took care that a linear combination of the v_i is actually able to reach t.
How can I create (randomized) values for my c_i?
Edit: After finding this question I can additionally state that it is also possible for me to (slightly) change the v_i.
All values are doubles.
Let's say your v_i form a matrix V with 2 rows and n columns, each vector is a column. The coefficients c_i form a column vector c. Then the equation can be written in matrix form as
V×c = t
Now apply a Singular Value Decomposition to matrix V:
V = A×D×B
with A being an orthogonal 2×2 matrix, D is a 2×n matrix and B an orthogonal n×n matrix. The original equation now becomes
A×D×B×c = t
Multiply this equation by the inverse of A; since A is orthogonal, the inverse is the same as the transposed matrix AT:
D×B×c = AT×t
Let's introduce new symbols c'=B×c and t'=AT×t:
D×c' = t'
The solution of this equation is simple, because Matrix D looks like this:
u 0 0 0 ... // n columns
0 v 0 0 ...
The solution is
c1' = t1' / u
c2' = t2' / v
And because all the other columns of D are zero, the remaining components c3'...cn' can be chosen freely. This is the place where you can create random numbers for c3'...cn'. Having the vector c', you can calculate c as
c = BT×c'
with BT being the inverse/transposed of B.
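A numpy sketch of this recipe (the vectors are generated randomly here just to have something concrete; it assumes both singular values are non-zero, i.e. the v_i really do span the plane):

import numpy as np

rng = np.random.default_rng(42)

n = 5
V = rng.normal(size=(2, n))        # columns are the vectors v_i
t = rng.normal(size=2)             # target vector

# SVD: V = A @ D @ B with A 2x2 orthogonal, B n x n orthogonal, and
# D a 2 x n matrix whose only non-zero entries are the singular values s.
A, s, B = np.linalg.svd(V, full_matrices=True)

t_prime = A.T @ t
c_prime = rng.normal(size=n)       # the free components c3'...cn' chosen at random
c_prime[0] = t_prime[0] / s[0]     # forced by D @ c' = t'
c_prime[1] = t_prime[1] / s[1]

c = B.T @ c_prime                  # back to the original coordinates
print(np.allclose(V @ c, t))       # True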
Since the v_i are linearly dependent, there are non-trivial solutions to 0 = sum l_i v_i.
If you have n vectors you can find n-2 independent such solutions.
If you now have one solution to t = sum c_i v_i, you can add any multiple of l_i to c_i and you will still have a solution: c_i' = p l_i + c_i.
For each independent solution of the homogeneous problem, determine a random p_j and calculate
c_i'' = c_i + sum p_j l_i_j.
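In code, the l_i are a basis of the null space of the matrix V whose columns are the v_i. A numpy sketch (the particular solution comes from the pseudoinverse; all names are illustrative):

import numpy as np

rng = np.random.default_rng(7)

n = 5
V = rng.normal(size=(2, n))           # columns are the v_i
t = rng.normal(size=2)

c = np.linalg.pinv(V) @ t             # one particular solution of V @ c = t

# The l_i: right singular vectors for the zero singular values span the null space of V.
_, _, Vh = np.linalg.svd(V, full_matrices=True)
null_basis = Vh[2:]                   # shape (n-2, n), each row is one l_i

p = rng.normal(size=n - 2)            # random p_j
c_random = c + null_basis.T @ p       # c_i'' = c_i + sum_j p_j l_i_j

print(np.allclose(V @ c_random, t))   # still a solution: True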
Is there a way to generate an N x N random diagonalizable matrix in MATLAB? I tried the following:
N = 10;
A = diag(rand(N,N))
but it is giving me an N x 1 matrix. I also need the matrix to be symmetric.
Assuming that you are considering real-valued matrices: Every real symmetric matrix is diagonalizable. You can therefore randomly generate some matrix A, e.g. by using A = rand(N, N), and then symmetrize it, e.g. by
A = A + A'
For complex matrices, the condition for being unitarily diagonalizable is that the matrix is normal. If A is an arbitrary square random matrix, you can make it normal (in fact Hermitian) by
A = A * A'
All full-rank matrices are diagonalizable by SVD or eigen-decomposition.
If you want a random symmetric matrix...
N = 5
V = rand(N*(N+1)/2, 1)     % one random value per entry on or above the diagonal
M = triu(ones(N))          % mask of the upper triangle (including the diagonal)
M(M==1) = V                % fill the upper triangle with the random values
M = M + tril(M.', -1)      % mirror below the diagonal so that M is symmetric
@DavidEisenstat is right. I tried his example. Sorry for the false statement. Here's a true statement that is relevant specifically to your situation, but is not as general: random matrices are virtually guaranteed to be diagonalizable.
Thank you for the attention you're paying to my question :)
My question is about finding an (efficient enough) algorithm for finding orthogonal polynomials of a given weight function f.
I've tried to simply apply the Gram-Schmidt algorithm, but it is not efficient enough. Indeed, it requires O(n^2) integrals. But my goal is to use this algorithm in order to find Hankel determinants of a function f, and a "direct" computation, which consists in simply computing the matrix and taking its determinant, requires only 2*n - 1 integrals.
But I want to use the theorem stating that the Hankel determinant of order n of f is a product involving the leading coefficients of the first n orthogonal polynomials of f. The reason is that when n gets large (say about 20), the Hankel determinant gets really big, and my goal is to divide it by another big constant (for n = 20, the constant is of order 10^103). My idea is then to "dilute" the computation of the constant into the product of the leading coefficients.
I hope there is an O(n) algorithm to compute the first n orthogonal polynomials :) I've done some digging and found nothing in that direction for a general function f (f can be any smooth function, actually).
EDIT: I'll make precise here what objects I'm talking about.
1) A Hankel determinant of order n is the determinant of a square matrix which is constant on the skew diagonals. Thus for example
a b c
b c d
c d e
is a Hankel matrix of size 3 by 3.
2) If you have a function f : R -> R, you can associate to f its "kth moment" which is defined as (I'll write it in tex) f_k := \int_{\mathbb{R}} f(x) x^k dx
With this, you can create a Hankel matrix A_n(f) whose entries are (A_n(f))_{ij} = f_{i+j-2}, that is, something like
f_0 f_1 f_2
f_1 f_2 f_3
f_2 f_3 f_4
With this in mind, it is easy to define the Hankel determinant of f which is simply
H_n(f) := det(A_n(f)). (Of course, it is understood that f has sufficient decay at infinity; this means that all the moments are well defined. A typical choice for f could be the Gaussian f(x) = exp(-x^2), or any continuous function on a compact set of R...)
3) What I call orthogonal polynomials of f is a set of polynomials (p_n) such that
\int_{\mathbb{R}} f(x) p_j(x) p_k(x) dx is 1 if j = k and 0 otherwise.
(They are called like that since they form an orthonormal basis of the vector space of polynomials with respect to the scalar product
(p|q) = \int_{\mathbb{R}} f(x) p(x) q(x) dx.)
4) Now, it is basic linear algebra that from any basis of a vector space equipped with a scalar product, you can build an orthonormal basis thanks to the Gram-Schmidt algorithm. This is where the n^2 integrations come from. You start from the basis 1, x, x^2, ..., x^n. Then you need n(n-1) integrals for the family to be orthogonal, and you need n more in order to normalize them.
5) There is a theorem saying that if f : R -> R is a function having sufficient decay at infinity, then we have that its Hankel determinant H_n(f) is equal to
H_n(f) = \prod_{j = 0}^{n-1} \kappa_j^{-2}
where \kappa_j is the leading coefficient of the j+1th orthogonal polynomial of f.
Thank you for your answer!
(PS: I tagged octave because I work in Octave, so, with a bit of luck (but I doubt it), there is a built-in function or an existing package managing this kind of thing.)
Orthogonal polynomials obey a recurrence relation, which we can write as
P[n+1] = (X-a[n])*P[n] - b[n-1]*P[n-1]
P[0] = 1
P[1] = X-a[0]
and we can compute the a, b coefficients by
a[n] = <X*P[n]|P[n]> / c[n]
b[n-1] = c[n]/c[n-1]
where
c[n] = <P[n]|P[n]>
(Here < | > is your inner product).
However I cannot vouch for the stability of this process at large n.
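For what it's worth, here is a rough Python sketch of this recurrence using numerical quadrature; the Gaussian weight and the helper names are only for illustration, and the same structure ports directly to Octave. Each new polynomial needs only a constant number of integrals, so n polynomials cost O(n) integrals rather than O(n^2).

import numpy as np
from scipy.integrate import quad

def inner(p, q, weight):
    # <p|q> = integral of weight(x) * p(x) * q(x) over the real line
    val, _ = quad(lambda x: weight(x) * p(x) * q(x), -np.inf, np.inf)
    return val

def monic_orthogonal_polys(weight, n_max):
    # First n_max+1 monic orthogonal polynomials for `weight` via
    # P[n+1] = (x - a[n])*P[n] - b[n-1]*P[n-1].
    x = np.polynomial.Polynomial([0.0, 1.0])
    P = [np.polynomial.Polynomial([1.0])]                # P[0] = 1
    c = [inner(P[0], P[0], weight)]                      # c[n] = <P[n]|P[n]>
    P.append(x - inner(x * P[0], P[0], weight) / c[0])   # P[1] = x - a[0]
    for n in range(1, n_max):
        c.append(inner(P[n], P[n], weight))
        a_n = inner(x * P[n], P[n], weight) / c[n]
        b_nm1 = c[n] / c[n - 1]
        P.append((x - a_n) * P[n] - b_nm1 * P[n - 1])
    return P

# Example: Gaussian weight, whose orthogonal polynomials are (rescaled) Hermite polynomials.
weight = lambda x: np.exp(-x ** 2)
polys = monic_orthogonal_polys(weight, 4)
print(inner(polys[2], polys[3], weight))   # ~0: consecutive polynomials are orthogonal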
I am working on an algorithm to perform linear regression for one or more independent variables.
That is (if I have m real-world values and, in the case of two independent variables, a and b):
C + D*a1 + E* b1 = y1
C + D*a2 + E* b2 = y2
...
C + D*am + E* bm = ym
I would like to use the least squares solution to find best fitting straight line.
I will be using matrix notation, writing the system as X*Beta = y, so the least-squares solution satisfies the normal equations
X^T * X * Beta = X^T * y
where Beta is the vector [C, D, E] whose values give the best-fit line.
Question
What is the best way to solve this system? Should I compute the inverse of X^T * X directly, or should I use an LU factorization/decomposition of the matrix? What is the performance of each on a large amount of data (i.e. a big value of m, possibly on the order of 10^8)?
EDIT
If the answer is to use a Cholesky decomposition or a QR decomposition, are there any implementation hints or simple libraries to use?
I am coding in C/ C++.
Two straightforward approaches spring to mind for solving a dense overdetermined system Ax=b:
Form A^T A x = A^T b, then Cholesky-factorise A^T A = L L^T, then do two triangular back-solves. This usually gets you an answer precise to about sqrt(machine epsilon).
Compute the QR factorisation A = Q*R, where Q's columns are orthogonal and R is square and upper-triangular, using something like Householder reflections. Then solve Rx = Q^T b for x by back-substitution. This usually gets you an answer precise to about machine epsilon, twice the precision of the Cholesky method, but it takes about twice as long.
For sparse systems, I'd usually prefer the Cholesky method because it takes better advantage of sparsity.
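For concreteness, here is a small numpy/scipy sketch of both approaches on synthetic data (the model coefficients and noise level are made up). In C/C++, LAPACK provides the corresponding routines (dposv for the Cholesky route, dgels for the QR route), and Eigen has equivalents (LLT and HouseholderQR).

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)

# Synthetic data: m samples of y = C + D*a + E*b + noise, with [C, D, E] = [1, 2, 3].
m = 100000
a, b = rng.normal(size=m), rng.normal(size=m)
y = 1.0 + 2.0 * a + 3.0 * b + 0.01 * rng.normal(size=m)

A = np.column_stack([np.ones(m), a, b])     # design matrix, m x 3

# 1) Normal equations + Cholesky: solve (A^T A) beta = A^T y.
beta_chol = cho_solve(cho_factor(A.T @ A), A.T @ y)

# 2) QR: A = Q R, then solve R beta = Q^T y (R is small and triangular).
Q, R = np.linalg.qr(A)
beta_qr = np.linalg.solve(R, Q.T @ y)

print(beta_chol)   # both results are close to [1, 2, 3]
print(beta_qr)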
Your X^TX matrix should have a Cholesky decomposition. I'd look into this decomposition before LU. It is faster: http://en.wikipedia.org/wiki/Cholesky_decomposition