I know that the mse between two 2d matrices, A and B, of shapes pxq, can be calculated in matrix terms as follows:
1/n tr(AtB)
The nice thing about this equation is that the matrices At and B are conformal to matrix multiplication and also yields a square matrix which has a defined trace.
But if we have two 3d tensors A and B of shapes pxqxr, then I don't understand how to get the outer product between them to get a square matrix so that the mse camne written in terms of trace.
Can a 2D transpose convolution operation be represented as a matrix multiplication with the Toeplitz matrix, as can be done for a normal convolution?
I want to generalise some ideas from a dense network to a convolutional network. For normal convolutions, this is not a problem, as they can be represented as matrix multiplications with the Toeplitz matrix. But I couldn't find a clear mathematical formulation of transposed convolution, so I am not sure about this case.
I was looking for a mathematical answer, so should have probably asked somewhere else, anyways I think my latex write-up is correct and answers the question:
formula transposed convolution
Discr. convolution can be represented as multiplication of input with matrix M.
Where M is presented a special case of Toeplitz matrices - circulant matrices.
The questions is: is 2d convolution can also be represented as matrix multiplication?
p.s. By dicr. convolution I mean dicr. convolution with indexing discrete samples in modulus fashion, i.e. the discrete signal is repeating
....X[n-1]x[0]x[1]...x[N-1]x[0]...
Yes, it can, but it will generally be a rather big matrix. If your data set is on a grid of size NxM, then the convolution is a matrix operating on a vector of length N*M; the convolution matrix has N2M2 elements.
If your convolution kernel is small, then the matrix will typically a band matrix where the width of the band is at least N or M.
I have multiple estimates for a transformation matrix, from mapping two point clouds to each other via ICP (Iterative Closest Point).
How can I generate the average transformation matrix for all these matrices?
Each matrix consists of a rigid translation and a rotation only, no scale or skew.
Ideally I would also like to calculate a weighted average, but an unweighted one is fine for now.
Averaging the translation vectors is of course trivial, but the rotations are problematic. One approach I found is averaging the individual base vectors for the rotations, but I am not sure that will result in a new orthonormal base, and the approach seems a little ad-hoc.
Splitting the transformation in translation and rotation is a good start. Averaging the translation is trivial.
Averaging the rotation is not that easy. Most approaches will use quaternions. So you need to transform the rotation matrix to a quaternion.
The easiest way to approximate the average is a linear blending, followed by renormalization of the quaternion:
q* = w1 * q1 + w2 * q2 + ... + w2 * qn
normalize q*
However, this is only an approximation. The reason for that is that the combination of two rotations is not performed by adding the quaternions, but by multiplying them. If we convert quaternions to a logarithmic space, we can use a simple linear blend (because multiplication will become additions). Then transform the quaternion back to the original space. This is the idea of the Spherical Average (Buss 2001). If you're lucky, you find a library that supports log and exp of quaternions:
start with q* as above
do until convergence
for each input quaternion i (index)
diff = q[i] * inverse(q*)
u[i] = log(diff, base q*)
//Now perform the linear blend
adapt := zero quaternion
weights := 0
for each input quaternion i
adapt += weight[i] * u[i]
weights += weight[i]
adapt *= 1/weights
adaptInOriginalSpace = q* ^ adapt (^ is the power operator)
q* = adaptInOriginalSpace * q*
You can define a threshold for adaptInOriginalSpace. If it is a very very small rotation, you can break the loop. This algorithm is proven to preserve geodesic distances on a sphere.
http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation and http://en.wikipedia.org/wiki/Rotation_matrix#Quaternion will give you some elegant mathematics and a way to turn a rotation matrix into an angle of rotation round an axis of rotation. There will be two possible representations of each rotation, with different signs for both angle of rotation and axis of rotation.
You could convert everything and normalize them to have +ve angles of rotation, then work out the average angle of rotation and the average axis of rotation, renormalising this into a unit vector.
OTOH if your intention is to work out the most accurate possible estimate of the transformation, you need to write down some measure of the goodness of fit of any candidate transformation - a sum of squared errors is often mathematically convenient - and then solve an optimization problem to work out which transformation minimizes the sum of squared errors. This is at least easier to justify than taking an average of individually error-prone estimates, and may well be more accurate.
If you have an existing lerp method, then there is a trivial solution:
count = 1
average_transform = Matrix.Identity(4)
for new_transform in list_of_matrices:
factor = 1/count
average_transform = lerp(average_transform, new_transform, factor)
count += 1
This is only useful because lots more mathermatics packages have the ability to lerp matrices than to average lots of them.
Because I haven't come across this method elsewhere, here's an informal proof:
If there is one matrix, use just that matrix (factor will equal 1 for first matrix)
If there are two matrices, we need 50% of the second one (second factor is 50% so we lerp to half way between the existing first one and the new one)
If there are three matrices we need 33% of each, or 66% of the average of the first two and 33% of the third. The lerp factor of 0.3333 makes this happen.
And so on.
I haven't tested extensively with matrices, but I've used this successfully as a rolling average for other datatypes.
The singular value decomposition (SVD) can be used here.
Take the SVD of the sum of the rotation matricies, and then the average rotation matrix is simply given by Ravg = UV'.
"sdfgeoff" I can't comment in your answer because I'm new here, but you are the most correct, I think. Beutifull and elegant solution, by the way. Would be perfect if you use Spherical Linear Interpolation (SLERP) with quaternions, instead of Linear Interpolation (LERP) because quaternions that map rotations (quaternions with norm 1) define a sphere in 4D, and interpolating between then is in fact interpolate between two point in a sphere surface.
With my experience from point cloud registration, I wuold like to say that this will not work. ICP don't return random rotations in the likehood of the correct rotation. You need to use a beter algorith to register you point clouds (Global Registration algorithms, like FPFH, 4PCS, K4PCS, BSC, FGR, etc). Or a better initial guess for the transformation. ICP will only give you totally wrong rotations (when stuck in local minima) or almost perfect rotations, when initialized with good initial transformations.
Conclusion: averaging it will not work.
I would suggest taking a look at "Average" of multiple quaternions? for a more elaborate discussion on how to compute the average of rotations.
How can we find inverse of a matrix using Simplex method? Do we need to have square matrix only or inverse can be found of any matrix? Also specify about the upper bound on the matrix size?
The Matrix Inverse is required during simplex only over the Basis Matrix (Basis Inversion).
Base matrix is a square matrix of dimensions (mxm) where m is the total number of constraints.
This Matrix Inversion is carried out using either the Product form of Inverse or LU Decomposition.