Let A, B be n x n real matrices and let b belong to R^n. Describe a fast algorithm to compute A^-2*B*A^-3*b. How many computations will the algorithm make?
This is an exam question I have for numerical analysis. I tried brute-forcing an algorithm, but I believe that the answer is more mathematical.
We haven't yet talked about Big O notation, so the question asks strictly for the number of operations the algorithm performs. How would you go about answering this question?
I would just work the problem from right to left, using a linear solver when dealing with the inverse of A, and matrix multiply when dealing with B:
x1 = linsolve(A, b)
x2 = linsolve(A, x1)
x3 = linsolve(A, x2)
y = B*x3
z1 = linsolve(A,y)
result = linsolve(A,z1)
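You can reduce the number of operations by computing the LU decomposition of A once (about 2n^3/3 operations) and keeping it in memory, so that each of the five solves costs only about 2n^2 operations, as does the product B*x3. Unless you are given more structure on A and B, quadratic cost for everything after that one-time factorization seems to be the best you can aim for.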
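As a rough illustration of the "factor once, solve five times" idea, here is a minimal pure-Python sketch. The helper names (lu_factor, lu_solve, matvec, apply_expression) are made up for this example and are not from any library; in practice you would call a tested LU routine rather than write your own.

```python
def lu_factor(A):
    """Packed LU factors of A with partial pivoting, plus the row permutation."""
    n = len(A)
    LU = [list(map(float, row)) for row in A]
    perm = list(range(n))
    for k in range(n):
        # Choose the largest entry in column k (rows k..n-1) as pivot.
        pivot = max(range(k, n), key=lambda r: abs(LU[r][k]))
        if LU[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        LU[k], LU[pivot] = LU[pivot], LU[k]
        perm[k], perm[pivot] = perm[pivot], perm[k]
        for r in range(k + 1, n):
            LU[r][k] /= LU[k][k]              # multiplier, stored in place of L
            for c in range(k + 1, n):
                LU[r][c] -= LU[r][k] * LU[k][c]
    return LU, perm


def lu_solve(LU, perm, b):
    """Solve A x = b using the factors returned by lu_factor (O(n^2) work)."""
    n = len(LU)
    y = [float(b[p]) for p in perm]           # apply the row permutation to b
    for i in range(n):                        # forward substitution, unit lower L
        y[i] -= sum(LU[i][k] * y[k] for k in range(i))
    for i in reversed(range(n)):              # back substitution with U
        y[i] = (y[i] - sum(LU[i][k] * y[k] for k in range(i + 1, n))) / LU[i][i]
    return y


def matvec(B, x):
    """Plain matrix-vector product B*x."""
    return [sum(bij * xj for bij, xj in zip(row, x)) for row in B]


def apply_expression(A, B, b):
    """Compute A^-2 * B * A^-3 * b, working from right to left."""
    LU, perm = lu_factor(A)                   # the only O(n^3) step
    x = b
    for _ in range(3):                        # A^-3 * b: three solves
        x = lu_solve(LU, perm, x)
    x = matvec(B, x)                          # B * (A^-3 * b)
    for _ in range(2):                        # A^-2 * (...): two more solves
        x = lu_solve(LU, perm, x)
    return x
```

No inverse and no matrix-matrix product is ever formed; only vectors are passed from one step to the next.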
I am trying to use the Cholesky decomposition to generate a matrix of multivariate normal samples with this: Y = U + X*L
U is the mean matrix (the mean vector repeated in every row): n x m
L from Cholesky: m x m
X is a matrix of univariate standard normal samples: n x m
After calculating the mean of the simulated matrix, I realized it was off. The reason is that the mean vector is very close to zero, so when adding it to X*L, the X*L term dominates U. Does anyone know how to work around this issue?
Thank you for the attention you're paying to my question :)
My question is about finding an (efficient enough) algorithm for computing the orthogonal polynomials of a given weight function f.
I've tried simply applying the Gram-Schmidt algorithm, but it is not efficient enough: it requires O(n^2) integrals. My goal is to use this algorithm to find Hankel determinants of a function f, and a "direct" computation, which consists of simply building the matrix and taking its determinant, requires only 2*n - 1 integrals.
But I want to use the theorem stating that the Hankel determinant of order n of f is given by a product over the leading coefficients of the first n orthogonal polynomials of f. The reason is that when n gets larger (say about 20), the Hankel determinant gets really big, and my goal is to divide it by another big constant (for n = 20, the constant is of order 10^103). My idea is then to "dilute" the computation of the constant in the product of the leading coefficients.
I hope there is an O(n) algorithm to compute the first n orthogonal polynomials :) I've done some digging and found nothing in that direction for a general function f (f can be any smooth function, actually).
EDIT: I'll clarify here what the objects I'm talking about are.
1) A Hankel determinant of order n is the determinant of a square matrix which is constant on the skew diagonals. Thus for example
a b c
b c d
c d e
is a Hankel matrix of size 3 by 3.
2) If you have a function f : R -> R, you can associate to f its "kth moment" which is defined as (I'll write it in tex) f_k := \int_{\mathbb{R}} f(x) x^k dx
With this, you can create a Hankel matrix A_n(f) whose entries are (A_n(f))_{ij} = f_{i+j-2}, that is, something like
f_0 f_1 f_2
f_1 f_2 f_3
f_2 f_3 f_4
With this in mind, it is easy to define the Hankel determinant of f which is simply
H_n(f) := det(A_n(f)). (Of course, it is understood that f has sufficient decay at infinity, meaning that all the moments are well defined. A typical choice for f could be the Gaussian f(x) = exp(-x^2), or any continuous function supported on a compact subset of R...)
3) What I call orthogonal polynomials of f is a set of polynomials (p_n) such that
\int_{\mathbb{R}} f(x) p_j(x) p_k(x) dx is 1 if j = k and 0 otherwise.
(They are called that because they form an orthonormal basis of the vector space of polynomials with respect to the scalar product
(p|q) = \int_{\mathbb{R}} f(x) p(x) q(x) dx.)
4) Now, it is basic linear algebra that from any basis of a vector space equipped with a scalar product, you can build an orthonormal basis thanks to the Gram-Schmidt algorithm. This is where the n^2 integrations come from. You start from the basis 1, x, x^2, ..., x^n. Then you need n(n-1) integrals to make the family orthogonal, and n more to normalize it.
5) There is a theorem saying that if f : R -> R is a function having sufficient decay at infinity, then we have that its Hankel determinant H_n(f) is equal to
H_n(f) = \prod_{j = 0}^{n-1} \kappa_j^{-2}
where \kappa_j is the leading coefficient of the (j+1)th orthogonal polynomial of f.
Thank you for your answer!
(PS: I tagged octave because I work in Octave so, with a bit of luck (but I doubt it), there is a built-in function or an existing package that handles this kind of thing.)
Orthogonal polynomials obey a recurrence relation, which we can write as
P[n+1] = (X-a[n])*P[n] - b[n-1]*P[n-1]
P[0] = 1
P[1] = X-a[0]
and we can compute the a, b coefficients by
a[n] = <X*P[n]|P[n]> / c[n]
b[n-1] = c[n]/c[n-1]
c[n] = <P[n]|P[n]>
(Here < | > is your inner product).
However I cannot vouch for the stability of this process at large n.
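For what it's worth, here is a minimal sketch of that recurrence in Python, under a few assumptions of mine: polynomials are stored as coefficient lists (lowest degree first), the inner product is evaluated from the moments f_k alone, and the coefficient multiplying P[n-1] is taken as c[n]/c[n-1], which is what orthogonality of P[n+1] and P[n-1] requires. The function names are made up for the example. Exact fractions are used only to keep the demonstration free of rounding; with floating-point moments the stability caveat above applies.

```python
from fractions import Fraction


def inner(p, q, moments):
    """<p|q> = sum_i sum_j p_i q_j f_{i+j}, using only the moments f_k."""
    return sum(pi * qj * moments[i + j]
               for i, pi in enumerate(p) for j, qj in enumerate(q))


def monic_orthogonal(moments, count):
    """First `count` monic orthogonal polynomials and their squared norms c[n].

    Needs the moments f_0 .. f_{2*count-2}, i.e. 2*count - 1 integrals.
    """
    polys = [[Fraction(1)]]                        # P[0] = 1
    norms = [inner(polys[0], polys[0], moments)]   # c[0] = <P[0]|P[0]>
    for n in range(count - 1):
        p = polys[n]
        xp = [Fraction(0)] + p                     # X * P[n]
        a = inner(xp, p, moments) / norms[n]       # a[n] = <X*P[n]|P[n]> / c[n]
        nxt = xp[:]
        for i, coef in enumerate(p):
            nxt[i] -= a * coef                     # subtract a[n] * P[n]
        if n > 0:
            beta = norms[n] / norms[n - 1]         # c[n] / c[n-1]
            for i, coef in enumerate(polys[n - 1]):
                nxt[i] -= beta * coef              # subtract beta * P[n-1]
        polys.append(nxt)
        norms.append(inner(nxt, nxt, moments))     # c[n+1]
    return polys, norms


# Example weight: f = 1 on [-1, 1], so f_k = 2/(k+1) for even k and 0 for odd k.
moments = [Fraction(2, k + 1) if k % 2 == 0 else Fraction(0) for k in range(5)]
polys, norms = monic_orthogonal(moments, 3)
# polys[2] holds the coefficients of x^2 - 1/3 (the monic Legendre polynomial).
```

Since the P[n] are monic, the orthonormal polynomial of degree n is P[n]/sqrt(c[n]), so its leading coefficient is c[n]^(-1/2) and the Hankel determinant of order n in the question is simply the product c[0]*c[1]*...*c[n-1]. That product can be accumulated (and divided by another large constant) one factor at a time.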
I am working on an algorithm to perform linear regression for one or more independent variables.
That is (if I have m real-world values, in the case of two independent variables a and b):
C + D*a1 + E*b1 = y1
C + D*a2 + E*b2 = y2
...
C + D*am + E*bm = ym
I would like to use the least squares solution to find the best fit.
I will be using the matrix notation Beta = (X^T X)^-1 X^T y,
where Beta is the vector [C, D, E] holding the best-fit coefficients.
What is the best way to solve this formula? Should I compute the inverse of X^T X,
or should I use the LU factorization/decomposition of the matrix? What is the performance of each on a large amount of data (i.e. a big value of m, which could be on the order of 10^8 ...)?
If the answer is to use the Cholesky decomposition or the QR decomposition, are there any implementation hints or simple libraries to use?
I am coding in C/C++.
Two straightforward approaches spring to mind for solving a dense overdetermined system Ax=b:
Form the normal equations A^T A x = A^T b, then Cholesky-factorise A^T A = L L^T, then do a forward and a back substitution. This usually gets you an answer precise to about sqrt(machine epsilon).
Compute the QR factorisation A = Q*R, where Q's columns are orthonormal and R is square and upper-triangular, using something like Householder reflections. Then solve Rx = Q^T b for x by back-substitution. This usually gets you an answer precise to about machine epsilon, i.e. roughly twice as many correct digits as the Cholesky method, but it takes about twice as long.
For sparse systems, I'd usually prefer the Cholesky method because it takes better advantage of sparsity.
Your X^TX matrix should have a Cholesky decomposition. I'd look into this decomposition before LU. It is faster: http://en.wikipedia.org/wiki/Cholesky_decomposition
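To make the normal-equations route concrete, here is a small sketch. It is written in Python rather than C/C++ purely for brevity, and the function names are made up for the example. The point it tries to show is that X^T X and X^T y can be accumulated in a single pass over the m samples, so for three unknowns only a 3 x 3 matrix is ever stored, whatever m is.

```python
import math


def cholesky(M):
    """Lower-triangular L with M = L * L^T (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def cholesky_solve(L, rhs):
    """Solve (L * L^T) x = rhs by forward then back substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):                         # forward: L z = rhs
        z[i] = (rhs[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):               # back: L^T x = z
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_linear(rows, ys):
    """Least-squares [C, D, E, ...] for y = C + D*a + E*b + ..., in one pass."""
    p = len(rows[0]) + 1                       # +1 for the constant term C
    G = [[0.0] * p for _ in range(p)]          # accumulates X^T X
    h = [0.0] * p                              # accumulates X^T y
    for row, y in zip(rows, ys):
        v = [1.0] + list(row)                  # one row of X
        for i in range(p):
            h[i] += v[i] * y
            for j in range(p):
                G[i][j] += v[i] * v[j]
    return cholesky_solve(cholesky(G), h)


# Small example: points lying exactly on y = 1 + 2*a + 3*b.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
ys = [1 + 2 * a + 3 * b for a, b in rows]
beta = fit_linear(rows, ys)
```

The precision caveat from the first answer still applies: forming X^T X squares the condition number, so if the columns are nearly dependent a QR-based solver is the safer choice.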
Using assorted matrix math, I've solved a system of equations resulting in coefficients for a polynomial of degree n-1:
Ax^(n-1) + Bx^(n-2) + ... + Z
I then evaluate the polynomial over a given x range; essentially, I'm rendering the polynomial curve. Now here's the catch. I've done this work in one coordinate system we'll call "data space". Now I need to present the same curve in another coordinate space. It is easy to transform input/output to and from the coordinate spaces, but the end user is only interested in the coefficients [A,B,....,Z] since they can reconstruct the polynomial on their own. How can I present a second set of coefficients [A',B',....,Z'] which represent the same shaped curve in a different coordinate system?
If it helps, I'm working in 2D space. Plain old x's and y's. I also feel like this may involve multiplying the coefficients by a transformation matrix. Would it somehow incorporate the scale/translation factor between the coordinate systems? Would it be the inverse of this matrix? I feel like I'm headed in the right direction...
Update: Coordinate systems are linearly related. Would have been useful info eh?
The problem statement is slightly unclear, so first I will clarify my own interpretation of it:
You have a polynomial function
f(x) = C_n x^n + C_{n-1} x^(n-1) + ... + C_0
[I changed A, B, ... Z into C_n, C_{n-1}, ..., C_0 to more easily work with linear algebra below.]
Then you also have a transformation such as: z = ax + b that you want to use to find coefficients for the same polynomial, but in terms of z:
f(z) = D_n z^n + D_{n-1} z^(n-1) + ... + D_0
This can be done pretty easily with some linear algebra. In particular, you can define an (n+1)×(n+1) matrix T which allows us to do the matrix multiplication
d = T * c ,
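where d is a column vector with top entry D_0 down to last entry D_n, column vector c is similar for the C_i coefficients, and matrix T has (i,j)-th [ith row, jth column] entry t_ij given by
t_ij = (j choose i) a^i b^(j-i),
where (j choose i) is the binomial coefficient, and t_ij = 0 when i > j. Also, unlike standard matrices, I'm thinking that i,j each range from 0 to n (usually you start at 1). Note the direction: with these entries, d = T * c holds when the old variable is written in terms of the new one as x = az + b. If you are given z = ax + b instead, first rewrite it as x = (1/a)z - b/a and use 1/a and -b/a in place of a and b.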
This is basically a nice way to write out the expansion and re-compression of the polynomial when you plug in z=ax+b by hand and use the binomial theorem.
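Here is a small sketch of that matrix in Python, assuming coefficient lists are stored lowest degree first (c[0] = C_0) and assuming the convention that the old variable is expressed through the new one as x = a*z + b, so that d holds the coefficients of f(a*z + b) as a polynomial in z. The function names are made up for the example.

```python
from math import comb


def substitution_matrix(n, a, b):
    """(n+1) x (n+1) matrix with T[i][j] = C(j, i) * a^i * b^(j-i), 0 for i > j."""
    return [[comb(j, i) * a**i * b**(j - i) if i <= j else 0
             for j in range(n + 1)]
            for i in range(n + 1)]


def transform_coefficients(c, a, b):
    """Return d = T * c, the coefficients of f(a*z + b) in powers of z."""
    n = len(c) - 1
    T = substitution_matrix(n, a, b)
    return [sum(T[i][j] * c[j] for j in range(n + 1)) for i in range(n + 1)]


def evaluate(coeffs, x):
    """Evaluate a polynomial given lowest-degree-first coefficients."""
    return sum(ck * x**k for k, ck in enumerate(coeffs))


# f(x) = 1 + 2x + 3x^2 with x = 2z + 1 gives 6 + 16z + 12z^2.
d = transform_coefficients([1, 2, 3], 2, 1)   # -> [6, 16, 12]
```

A quick sanity check is that evaluate(d, z) equals evaluate(c, a*z + b) for any z.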
If I understand your question correctly, there is no guarantee that the function will remain polynomial after you change coordinates. For example, let y=x^2, and the new coordinate system x'=y, y'=x. Now the equation becomes y' = sqrt(x'), which isn't polynomial.
Tyler's answer is the right answer if you have to compute this change of variable z = ax+b many times (I mean for many different polynomials). On the other hand, if you have to do it just once, it is much faster to combine the computation of the coefficients of the matrix with the final evaluation. The best way to do it is to symbolically evaluate your polynomial at the point (ax+b) by Horner's method:
you store the polynomial coefficients in a vector V (at the beginning, all coefficients are zero), and for i = n down to 0, you multiply it by (ax+b) and add C_i.
adding C_i means adding it to the constant term
multiplying by (ax+b) means multiplying all coefficients by b into a vector K1, multiplying all coefficients by a and shifting them away from the constant term into a vector K2, and putting K1+K2 back into V.
This will be easier to program, and faster to compute.
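The steps above might look like this in Python (a sketch only; the function name is made up, and coefficient lists are assumed to be stored lowest degree first, so c[0] is the constant term):

```python
def compose_affine(c, a, b):
    """Coefficients of f(a*x + b) in powers of x, by symbolic Horner evaluation."""
    v = [0] * len(c)                              # V starts as the zero polynomial
    for ci in reversed(c):                        # i = n down to 0
        k1 = [b * coef for coef in v]             # V * b
        # V * a, shifted one place away from the constant term. The dropped
        # top entry is always zero here because deg(V) < n before this step.
        k2 = [0] + [a * coef for coef in v[:-1]]
        v = [p + q for p, q in zip(k1, k2)]       # V <- V * (a*x + b)
        v[0] += ci                                # add C_i to the constant term
    return v


# f(x) = 1 + 2x + 3x^2 evaluated at (2x + 1) gives 6 + 16x + 12x^2.
result = compose_affine([1, 2, 3], 2, 1)          # -> [6, 16, 12]
```

This costs on the order of n^2 multiplications and never builds the matrix T explicitly.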
Note that changing y into w = cy+d is really easy. Finally, as mattiast points out, a general change of coordinates will not give you a polynomial.
Technical note: if you still want to compute matrix T (as defined by Tyler), you should compute it by using a weighted version of Pascal's rule (this is what the Horner computation does implicitly):
t_{i,j} = b t_{i,j-1} + a t_{i-1,j-1}
This way, you compute it simply, column after column, from left to right.
You have the equation:
y = Ax^(n-1) + Bx^(n-2) + ... + Z
This is in xy space, and you want it in some x'y' space. What you need are transformation functions f(x) = x' and g(y) = y' (or h(x') = x and j(y') = y). In the first case you need to solve for x and for y. Once you have x and y, you can substitute those results into your original equation and solve for y'.
Whether or not this is trivial depends on the complexity of the functions used to transform from one space to another. For example, equations such as:
5x = x' and 10y = y'
are extremely easy to solve for the result
y' = (10/5^(n-1))Ax'^(n-1) + (10/5^(n-2))Bx'^(n-2) + ... + 10Z
If the input spaces are linearly related, then yes, a matrix should be able to transform one set of coefficients to another. For example, if you had your polynomial in your "original" x-space:
ax^3 + bx^2 + cx + d
and you wanted to transform into a different w-space where w = px+q
then you want to find a', b', c', and d' such that
ax^3 + bx^2 + cx + d = a'w^3 + b'w^2 + c'w + d'
and with some algebra,
a'w^3 + b'w^2 + c'w + d' = a'p^3x^3 + 3a'p^2qx^2 + 3a'pq^2x + a'q^3 + b'p^2x^2 + 2b'pqx + b'q^2 + c'px + c'q + d'
a = a'p^3
b = 3a'p^2q + b'p^2
c = 3a'pq^2 + 2b'pq + c'p
d = a'q^3 + b'q^2 + c'q + d'
which can be rewritten as a matrix problem and solved.
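The four equations come from matching the coefficients of x^3, x^2, x and 1 on both sides. Because the system is triangular, it can be solved from the top down without a general matrix solver. A minimal sketch for the cubic case, assuming p is nonzero (the function name is made up for the example):

```python
def cubic_in_w(a, b, c, d, p, q):
    """Given a*x^3 + b*x^2 + c*x + d and w = p*x + q (p != 0), return
    (a', b', c', d') such that the same curve is a'*w^3 + b'*w^2 + c'*w + d'."""
    a2 = a / p**3                                     # from a = a'p^3
    b2 = (b - 3 * a2 * p**2 * q) / p**2               # from b = 3a'p^2q + b'p^2
    c2 = (c - 3 * a2 * p * q**2 - 2 * b2 * p * q) / p  # from c = 3a'pq^2 + 2b'pq + c'p
    d2 = d - a2 * q**3 - b2 * q**2 - c2 * q           # from d = a'q^3 + b'q^2 + c'q + d'
    return a2, b2, c2, d2


# (2x + 1)^3 = 8x^3 + 12x^2 + 6x + 1, so in w = 2x + 1 the curve is just w^3.
coeffs = cubic_in_w(8, 12, 6, 1, 2, 1)
```

For higher degrees the same top-down substitution works, with binomial coefficients in place of the 3 and 2 above.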