How to compute the variances in Expectation Maximization with n dimensions? - algorithm

I have been reviewing Expectation Maximization (EM) in research papers such as this one:
http://pdf.aminer.org/000/221/588/fuzzy_k_means_clustering_with_crisp_regions.pdf
There are some points I have not been able to figure out. For example, what happens when each data point has many dimensions?
For example, I have the following dataset with 6 data points and 4 dimensions:
D1, D2, D3, D4
5, 19, 72, 5
6, 18, 14, 1
7, 22, 29, 4
3, 22, 51, 1
2, 21, 89, 2
1, 12, 28, 1
Does this mean that, for the expectation step, I need to compute 4 standard deviations (one for each dimension)?
Do I also have to compute the variances for each cluster, assuming k=3 (I do not know whether that is necessary based on the formula from the paper), or just the variances for each dimension (4 attributes)?

Usually, you use a covariance matrix, whose diagonal holds the variances.
But it really depends on your chosen model. The simplest model does not use variances at all.
A more complex model has a single variance value, the average variance over all dimensions.
Next, you can have a separate variance for each dimension independently; and last but not least, a full covariance matrix. That is probably the most flexible GMM in popular use.
Each of these can either be shared by all clusters or be estimated separately for each cluster. Depending on your implementation, there can be many more variants.
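As a rough illustration of the models listed above, here is a minimal Python sketch of the variance update for a single cluster, using the dataset from the question. The responsibilities (soft assignments of each point to the cluster) are made-up values for illustration, and the function names are hypothetical, not taken from the paper or from any library.

```python
# Sketch: variance estimates for ONE cluster under three covariance models.
# The responsibilities are invented for illustration; in real EM they come
# from the expectation step.

data = [
    [5, 19, 72, 5],
    [6, 18, 14, 1],
    [7, 22, 29, 4],
    [3, 22, 51, 1],
    [2, 21, 89, 2],
    [1, 12, 28, 1],
]

def weighted_mean(points, resp):
    """Responsibility-weighted mean, one value per dimension."""
    total = sum(resp)
    dims = len(points[0])
    return [sum(r * p[d] for p, r in zip(points, resp)) / total
            for d in range(dims)]

def diagonal_variances(points, resp):
    """Diagonal model: one variance per dimension for this cluster."""
    mu = weighted_mean(points, resp)
    total = sum(resp)
    return [sum(r * (p[d] - mu[d]) ** 2 for p, r in zip(points, resp)) / total
            for d in range(len(mu))]

def spherical_variance(points, resp):
    """Spherical model: a single variance, the average over all dimensions."""
    variances = diagonal_variances(points, resp)
    return sum(variances) / len(variances)

def full_covariance(points, resp):
    """Full model: a dims x dims covariance matrix for this cluster."""
    mu = weighted_mean(points, resp)
    total = sum(resp)
    dims = len(mu)
    return [[sum(r * (p[i] - mu[i]) * (p[j] - mu[j])
                 for p, r in zip(points, resp)) / total
             for j in range(dims)]
            for i in range(dims)]

resp_cluster0 = [0.9, 0.1, 0.2, 0.7, 0.8, 0.1]  # hypothetical responsibilities

diag = diagonal_variances(data, resp_cluster0)   # 4 numbers
sph = spherical_variance(data, resp_cluster0)    # 1 number
cov = full_covariance(data, resp_cluster0)       # 4 x 4 matrix
print(diag, sph)
```

With k=3 and per-cluster variances, you would repeat this once per cluster with that cluster's responsibilities, so the diagonal model gives 3 x 4 variances in total.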
From R's mclust documentation:
univariate mixture
"E" = equal variance (one-dimensional)
"V" = variable variance (one-dimensional)
multivariate mixture
"EII" = spherical, equal volume
"VII" = spherical, unequal volume
"EEI" = diagonal, equal volume and shape
"VEI" = diagonal, varying volume, equal shape
"EVI" = diagonal, equal volume, varying shape
"VVI" = diagonal, varying volume and shape
"EEE" = ellipsoidal, equal volume, shape, and orientation
"EEV" = ellipsoidal, equal volume and equal shape
"VEV" = ellipsoidal, equal shape
"VVV" = ellipsoidal, varying volume, shape, and orientation
single component
"X" = univariate normal
"XII" = spherical multivariate normal
"XXI" = diagonal multivariate normal
"XXX" = ellipsoidal multivariate normal


model matrix transforming only one vertex [duplicate]

I'm getting thoroughly confused over matrix definitions. I have a matrix class, which holds a float[16] which I assumed is row-major, based on the following observations:
float matrixA[16] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
float matrixB[4][4] = { { 0, 1, 2, 3 }, { 4, 5, 6, 7 }, { 8, 9, 10, 11 }, { 12, 13, 14, 15 } };
matrixA and matrixB both have the same linear layout in memory (i.e. all numbers are in order). According to http://en.wikipedia.org/wiki/Row-major_order this indicates a row-major layout.
matrixA[0] == matrixB[0][0];
matrixA[3] == matrixB[0][3];
matrixA[4] == matrixB[1][0];
matrixA[7] == matrixB[1][3];
Therefore, matrixB[0] = row 0, matrixB[1] = row 1, etc. Again, this indicates row-major layout.
My problem / confusion comes when I create a translation matrix which looks like:
1, 0, 0, transX
0, 1, 0, transY
0, 0, 1, transZ
0, 0, 0, 1
Which is laid out in memory as, { 1, 0, 0, transX, 0, 1, 0, transY, 0, 0, 1, transZ, 0, 0, 0, 1 }.
Then when I call glUniformMatrix4fv, I need to set the transpose flag to GL_FALSE, indicating that it's column-major, else transforms such as translate / scale etc don't get applied correctly:
If transpose is GL_FALSE, each matrix is assumed to be supplied in
column major order. If transpose is GL_TRUE, each matrix is assumed to
be supplied in row major order.
Why does my matrix, which appears to be row-major, need to be passed to OpenGL as column-major?
The matrix notation used in the OpenGL documentation does not describe the in-memory layout of OpenGL matrices.
I think it'll be easier if you drop/forget about the entire "row/column-major" thing. That's because, in addition to the row/column-major notation, the programmer can also decide how to lay out the matrix in memory (whether adjacent elements form rows or columns), which adds to the confusion.
OpenGL matrices have the same memory layout as DirectX matrices.
x.x x.y x.z 0
y.x y.y y.z 0
z.x z.y z.z 0
p.x p.y p.z 1
or
{ x.x x.y x.z 0 y.x y.y y.z 0 z.x z.y z.z 0 p.x p.y p.z 1 }
x, y, z are 3-component vectors describing the matrix coordinate system (the local coordinate system relative to the global coordinate system).
p is a 3-component vector describing the origin of matrix coordinate system.
Which means that the translation matrix should be laid out in memory like this:
{ 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, transX, transY, transZ, 1 }.
Leave it at that, and the rest should be easy.
---citation from old opengl faq--
9.005 Are OpenGL matrices column-major or row-major?
For programming purposes, OpenGL matrices are 16-value arrays with base vectors laid out contiguously in memory. The translation components occupy the 13th, 14th, and 15th elements of the 16-element matrix, where indices are numbered from 1 to 16 as described in section 2.11.2 of the OpenGL 2.1 Specification.
Column-major versus row-major is purely a notational convention. Note that post-multiplying with column-major matrices produces the same result as pre-multiplying with row-major matrices. The OpenGL Specification and the OpenGL Reference Manual both use column-major notation. You can use any notation, as long as it's clearly stated.
Sadly, the use of column-major format in the spec and blue book has resulted in endless confusion in the OpenGL programming community. Column-major notation suggests that matrices are not laid out in memory as a programmer would expect.
I'm going to update this 9-year-old answer.
A mathematical matrix is defined as an m x n matrix, where m is the number of rows and n is the number of columns. For the sake of completeness, rows are horizontal and columns are vertical. When denoting a matrix element in mathematical notation Mij, the first index (i) is the row index and the second (j) is the column index. When two matrices are multiplied, i.e. A(m x n) * B(m1 x n1), the number of columns of the first argument (A) must match the number of rows of the second (B), so n == m1, and the resulting matrix has the number of rows of the first argument (A) and the number of columns of the second (B). Clear so far, yes?
Now, regarding in-memory layout. You can store a matrix in two ways: row-major and column-major. Row-major means that effectively you have rows laid out one after another, linearly. So, elements go from left to right, row after row, kind of like English text. Column-major means that effectively you have columns laid out one after another, linearly. So elements start at the top left and go from top to bottom, column after column.
Example:
//matrix
|a11 a12 a13|
|a21 a22 a23|
|a31 a32 a33|
//row-major
[a11 a12 a13 a21 a22 a23 a31 a32 a33]
//column-major
[a11 a21 a31 a12 a22 a32 a13 a23 a33]
Now, here's the fun part!
There are two ways to store 3d transformation in a matrix.
As I mentioned before, a matrix in 3d essentially stores coordinate system basis vectors and position. So, you can store those vectors in rows or in columns of a matrix. When they're stored as columns, you multiply a matrix with a column vector. Like this.
//convention #1
|vx.x vy.x vz.x pos.x|   |p.x|   |res.x|
|vx.y vy.y vz.y pos.y|   |p.y|   |res.y|
|vx.z vy.z vz.z pos.z| x |p.z| = |res.z|
|   0    0    0     1|   |  1|   |res.w|
However, you can also store those vectors as rows, and then you'll be multiplying a row vector with a matrix:
//convention #2 (uncommon)
                  | vx.x  vx.y  vx.z 0|
                  | vy.x  vy.y  vy.z 0|
|p.x p.y p.z 1| x | vz.x  vz.y  vz.z 0| = |res.x res.y res.z res.w|
                  |pos.x pos.y pos.z 1|
So. Convention #1 often appears in mathematical texts. Convention #2 appeared in the DirectX SDK at some point. Both are valid.
As for the question: if you're using convention #1, then your matrices are column-major, and if you're using convention #2, then they're row-major. However, the memory layout is the same in both cases:
[vx.x vx.y vx.z 0 vy.x vy.y vy.z 0 vz.x vz.y vz.z 0 pos.x pos.y pos.z 1]
Which is why I said, 9 years ago, that it is easier to memorize which element is which.
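To make the layout point concrete, here is a small Python sketch (plain lists, no OpenGL involved) that flattens the convention #1 translation matrix both ways. The helper names and the translation values are made up for illustration.

```python
# Sketch: the same 4x4 translation matrix (convention #1, translation in the
# last column) flattened in row-major and in column-major order.

def flatten_row_major(m):
    """Walk along each row, one row after another."""
    return [m[r][c] for r in range(len(m)) for c in range(len(m[0]))]

def flatten_column_major(m):
    """Walk down each column, one column after another."""
    return [m[r][c] for c in range(len(m[0])) for r in range(len(m))]

tx, ty, tz = 10.0, 20.0, 30.0  # illustrative translation values
translation = [
    [1.0, 0.0, 0.0, tx],
    [0.0, 1.0, 0.0, ty],
    [0.0, 0.0, 1.0, tz],
    [0.0, 0.0, 0.0, 1.0],
]

row_major = flatten_row_major(translation)
column_major = flatten_column_major(translation)

# Column-major puts the translation in the 13th, 14th and 15th elements
# (zero-based indices 12, 13, 14), which is the layout quoted from the FAQ.
print(column_major[12:15])
# Row-major scatters it to zero-based indices 3, 7 and 11.
print(row_major[3], row_major[7], row_major[11])
```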
To summarize the answers by SigTerm and dsharlet: The usual way to transform a vector in GLSL is to right-multiply the transformation matrix by the vector:
mat4 T; vec4 v; vec4 v_transformed;
v_transformed = T*v;
In order for that to work, OpenGL expects the memory layout of T to be, as described by SigTerm,
{1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, transX, transY, transZ, 1 }
which is also called 'column major'. In your shader code (as indicated by your comments), however, you left-multiplied the transformation matrix by the vector:
v_transformed = v*T;
which only yields the correct result if T is transposed, i.e. has the layout
{ 1, 0, 0, transX, 0, 1, 0, transY, 0, 0, 1, transZ, 0, 0, 0, 1 }
(i.e. 'row major'). Since you already provided the correct layout to your shader, namely row major, it was not necessary to set the transpose flag of glUniformMatrix4fv.
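The relationship between T*v and v*T can be checked outside of GLSL. The following Python sketch uses plain lists and hypothetical helper names; it only demonstrates the arithmetic, not how a shader stores matrices.

```python
# Sketch: multiplying a matrix by a column vector (T * v) gives the same
# result as multiplying a row vector by the transposed matrix (v * T^T).

def mat_vec(m, v):
    """Column-vector convention: result = m * v."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def vec_mat(v, m):
    """Row-vector convention: result = v * m."""
    return [sum(v[r] * m[r][c] for r in range(4)) for c in range(4)]

def transpose(m):
    return [[m[c][r] for c in range(4)] for r in range(4)]

# Translation by (2, 3, 4), written with the translation in the last column.
T = [
    [1, 0, 0, 2],
    [0, 1, 0, 3],
    [0, 0, 1, 4],
    [0, 0, 0, 1],
]
v = [1, 1, 1, 1]

print(mat_vec(T, v))             # the point is translated
print(vec_mat(v, transpose(T)))  # same result with the transposed matrix
print(vec_mat(v, T))             # without the transpose, w is corrupted instead
```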
You are dealing with two separate issues.
First, your examples are dealing with the memory layout. Your [4][4] array is row major because you've used the convention established by C multi-dimensional arrays to match your linear array.
The second issue is a matter of convention for how you interpret matrices in your program. glUniformMatrix4fv is used to set a shader parameter. Whether your transform is computed for a row vector or column vector transform is a matter of how you use the matrix in your shader code. Because you say you need to use column vectors, I assume your shader code is using the matrix A and a column vector x to compute x' = A x.
I would argue that the documentation of glUniformMatrix is confusing. The description of the transpose parameter is a really roundabout way of just saying that the matrix is transposed or it isn't. OpenGL itself is just transporting that data to your shader; whether you want to transpose it or not is a matter of convention you should establish for your program.
This link has some good further discussion: http://steve.hollasch.net/cgindex/math/matrix/column-vec.html
I think that the existing answers here are very unhelpful, and I can see from the comments that people are left feeling confused after reading them, so here is another way of looking at this situation.
As a programmer, if I want to store an array in memory, I cannot store a rectangular grid of numbers, because computer memory doesn't work like that. I have to store the numbers in a linear sequence.
Let's say I have a 2x2 matrix and I initialize it in my code like this:
const matrix = [a, b, c, d];
I can successfully use this matrix in other parts of my code provided I know what each of the array elements represents.
The OpenGL specification defines what each index position represents, and this is all you need to know to construct an array and pass it to OpenGL and have it do what you expect.
The row or column major issue only comes into play when I want to write my matrix in a document that describes my code, because mathematicians write matrices as rectangular grids of numbers. However, this is just a convention, a way of writing things down, and has no impact on the code I write or the arrangement of numbers in memory on my computer. You could easily rewrite these mathematics papers using some other notation, and it would work just as well.
For the array above, I have two options for writing this array in my documentation as a rectangular grid:
|a b|  OR  |a c|
|c d|      |b d|
Whichever way I choose to write my documentation, this will have no impact on my code or the order of the numbers in memory on my computer, it's just documentation.
In order for people reading my documentation to know the order that I stored the values in the linear array in my program, I can specify that this is a column major or row major representation of the array as a matrix. If it is in column major order then I should traverse the columns to get the linear arrangement of numbers. If this is a row major representation then I should traverse the rows to get the linear arrangement of numbers.
In general, writing documentation in row major order makes life easier for programmers, because if I want to translate this matrix
|a b c|
|d e f|
|g h i|
into code, I can write it like this:
const matrix = [
  a, b, c,
  d, e, f,
  g, h, i
];
For example:
GLM stores matrix values as m[4][4], but it treats matrices as if they were in column-major order. For a 2-dimensional array m[x][y] in C, x normally represents a row and y represents a column, which means that the matrix represented by this array is in fact in row-major order. The trick is to treat m[x][y] as if x represents a column and y represents a row. It is like transposing the matrix without performing any additional operations to achieve that.

Applying weights to KNN dimensions

When doing KNN searches in ES/OS, it seems to be recommended to normalize the data in the knn vectors to prevent single dimensions from overpowering the final scoring.
In my current example I have a 3-dimensional vector where all values are normalized to values between 0 and 1:
[0.2, 0.3, 0.2]
From the perspective of Euclidean distance based scoring, this seems to give equal weight to all dimensions.
In my particular example I am using the l2 space type:
"method": {
  "name": "hnsw",
  "space_type": "l2",
  "engine": "nmslib"
}
However, if I want to give more weight to one of my dimensions (say by a factor of 2), would it be acceptable to single out that dimension and normalize between 0-2 instead of the base range of 0-1?
Example:
[0.2, 0.3, 1.2] // Third dimension is now between 0-2
The distance computation for this term would now be (2 * (xi - yi))^2 and lead to bigger diffs compared to the rest. As a result, the overall score would be more sensitive to differences in this particular term.
In OS the score is calculated as 1 / (1 + Distance Function), so the higher the value returned from the distance function, the lower the score will be.
Is there a method for deciding what the weighting range should be? Would setting the range too high make the dimension too dominant?
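To see what rescaling one dimension does to the score, here is a small Python sketch. It assumes the distance function is the squared Euclidean distance, which is an assumption about the engine rather than something stated above, and the vectors and helper names are made up for illustration.

```python
# Sketch: stretching one dimension by a factor s is the same as using a
# weighted squared L2 distance with weight s^2 on that dimension.
# ASSUMPTION: the "Distance Function" in the score is the squared L2 distance.

def squared_l2(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def weighted_squared_l2(x, y, weights):
    return sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights))

def scale(vec, factors):
    """Rescale each dimension before indexing or querying."""
    return [v * f for v, f in zip(vec, factors)]

def score(distance):
    return 1.0 / (1.0 + distance)

factors = [1.0, 1.0, 2.0]  # third dimension normalized to 0-2 instead of 0-1
x = [0.2, 0.3, 0.6]        # illustrative vectors in the original 0-1 range
y = [0.4, 0.3, 0.1]

d_plain = squared_l2(x, y)
d_scaled = squared_l2(scale(x, factors), scale(y, factors))
d_weighted = weighted_squared_l2(x, y, [f * f for f in factors])

print(d_plain, score(d_plain))
print(d_scaled, score(d_scaled))
print(d_weighted)  # matches d_scaled up to floating-point rounding
```

Under this assumption, a range of 0-2 multiplies that dimension's contribution by 4, not 2. If the goal is a weight of exactly 2 in the squared distance, the range would be 0 to sqrt(2).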

finding thermocline using depth and temperature

temp = {23,23,23,22,20,20,19,12,11,10,10 };
depth= {0,1,2,3, 8, 9, 10, 12, 18, 23, 29 };
I have two arrays, as shown above, and I need to find the thermocline using the following statement:
It is easy to see that the slope of the curve (i.e. dT/dh) is a maximum at the (very!) obvious thermocline.
Furthermore, because the curvature of the curve changes at the thermocline, a point of inflection, then, by definition, d2T/dh2 = 0.
Another way to look at it is that to maximise the slope, i.e. the 1st derivative,
the 2nd derivative must be equal to zero.
Please help!
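One simple way to apply the statement above to sampled data is to approximate dT/dh with finite differences between consecutive samples and pick the interval where the slope has the largest magnitude. The Python sketch below does that for the two arrays from the question; the function name is hypothetical, and no smoothing is applied, so noisy data would need more care.

```python
# Sketch: locate the thermocline as the depth interval with the steepest
# temperature change, using finite differences for dT/dh.

temp = [23, 23, 23, 22, 20, 20, 19, 12, 11, 10, 10]
depth = [0, 1, 2, 3, 8, 9, 10, 12, 18, 23, 29]

def find_thermocline(depth, temp):
    """Return (slope, upper_depth, lower_depth) for the steepest interval."""
    best = None
    for i in range(len(depth) - 1):
        # The samples are unevenly spaced, so divide by the actual depth step.
        slope = (temp[i + 1] - temp[i]) / (depth[i + 1] - depth[i])
        if best is None or abs(slope) > abs(best[0]):
            best = (slope, depth[i], depth[i + 1])
    return best

slope, top, bottom = find_thermocline(depth, temp)
print(slope, top, bottom)   # -3.5 10 12
print((top + bottom) / 2)   # midpoint of the interval: 11.0
```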

Efficient algorithm to fit a linear line along the upper boundary of data only

I'm currently trying to fit a straight line through a spread of scattered data in MATLAB. This is easy enough using the polyfit function, where I can easily obtain my y = mx + c equation. However, I now need to fit a line along the upper boundary of my data, i.e., the top few data points. I know this description is vague, so let's assume that my scattered data is in the shape of a cone, with its apex on the y-axis, spreading outwards and upwards in the +x and +y direction. I need to fit a best fit line on the 'upper edge of the cone', if you will.
I've developed an algorithm but it's extremely slow. It involves first fitting a line of best fit through ALL data, deleting all data points below this line of best fit, and iterating through until only 5% of the initial data points are left. The final best fit line will then reside close to the top edge of the cone. For 250 data points, this takes about 5s and with me dealing with more than a million data points, this algorithm is simply too inefficient.
I guess my question is: is there an algorithm to more efficiently achieve what I need? Or is there a way to sharpen up my code to eliminate unnecessary complexity?
Here is my code in MATLAB:
(As an example)
a = [4, 5, 1, 8, 1.6, 3, 8, 9.2]; %To be used as x-axis points
b = [45, 53, 12, 76, 25, 67, 75, 98]; %To be used as y-axis points
while prod(size(a)) > (0.05*prod(size(a))) %Iterative line fitting occurs until there are less than 5% of the data points left
    lobf = polyfit(a,b,1); %Line of Best Fit for current data points
    alen = length(a);
    for aindex = alen:-1:1 %For loop to delete all points below line of best fit
        ValLoBF = lobf(1)*a(aindex) + lobf(2)
        if ValLoBF > b(aindex) %if LoBF is above current point...
            a(aindex) = []; %delete x coordinate...
            b(aindex) = []; %and delete its corresponding y coordinate
        end
    end
end
Well first of all your example code seems to be running indefinitely ;)
Some optimizations for your code:
a = [4, 5, 1, 8, 1.6, 3, 8, 9.2]; %To be used as x-axis points
b = [45, 53, 12, 76, 25, 67, 75, 98]; %To be used as y-axis points
n_init_a = length(a);
while length(a) > 0.05*n_init_a %Iterative line fitting occurs until there are less than 5% of the data points left
    lobf = polyfit(a,b,1); % Line of Best Fit for current data points
    % Evaluate the fitted line at every x using element-wise vector arithmetic
    temp = lobf(1)*a + lobf(2); % Containing all polyfit values
    % Logical mask of the points below the line, computed once so a and b stay in sync
    below = b < temp;
    if ~any(below), break; end % Stop if nothing would be removed, otherwise the loop never ends
    a(below) = []; % Discard the x coordinates...
    b(below) = []; % ...and the matching y coordinates
end
Also you should try profiling your code by typing in the command window
profile viewer
and check what takes most time calculating your results. I suspect it is polyfit but that can't be sped up much probably.
What you are looking for is not line fitting. You are trying to find the convex hull of the points.
You should check out the function convhull. Once you find the hull, you can remove all of the points that aren't close to it, and fit each part independently to avoid the fact that the data is noisy.
Alternatively, you could render the points onto some pixel grid, and then do some kind of morphological operation, like imclose, and finish with Hough transform. Check out also this answer.
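As a rough sketch of the convex hull idea, the Python code below computes only the upper part of the hull (Andrew's monotone chain) for the example points from the question and then fits a least-squares line through the hull vertices. Fitting the hull vertices directly is a simplification of "keep the points close to the hull and fit those", and all function names are hypothetical.

```python
# Sketch: upper convex hull via monotone chain, then a least-squares line
# through the hull vertices. Sorting dominates, so the cost is O(n log n).

def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Vertices of the upper convex hull, ordered from left to right."""
    hull = []
    for p in sorted(set(points)):
        # Drop the last vertex while it lies on or below the new edge.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def fit_line(points):
    """Ordinary least-squares fit y = m*x + c."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x

a = [4, 5, 1, 8, 1.6, 3, 8, 9.2]       # x-axis points from the question
b = [45, 53, 12, 76, 25, 67, 75, 98]   # y-axis points from the question

hull = upper_hull(list(zip(a, b)))
m, c = fit_line(hull)
print(hull)   # [(1, 12), (3, 67), (9.2, 98)]
print(m, c)
```

Note that the leftmost point is always part of the upper hull even when it sits at the bottom of the cloud, as (1, 12) does here, so in practice you may want to drop or down-weight the end vertices before fitting.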

Calculating translation value and rotation angle of a rotated 2D image

I have two images: one is the original image and the other is the transformed image.
I have to find out by how many degrees the transformed image was rotated, using a 3x3 transformation matrix. I also need to find how far it was translated from the origin.
Both images are grayscale and held in matrix variables. They have the same size, [350 500].
I have found a few lecture notes like this.
Lecture notes say that I should use the following matrix formula for rotation:
For translation matrix the formula is given:
So far, so good. But there are two problems:
I cannot work out how to implement the formulas in MATLAB.
The formulas are arranged to find the x', y' values, but I already have the x, x', y, y' values. I need to find the rotation angle (theta) and tx and ty.
I want to know what x, x', y, y' correspond to in the matrix.
I have got the following code:
rotationMatrix = [ cos(theta) sin(theta) 0 ; ...
-sin(theta) cos(theta) 0 ; ...
0 0 1];
translationMatrix = [ 1 0 tx; ...
0 1 ty; ...
0 0 1];
But as you can see, tx, ty, theta variables are not defined before used. How can I calculate theta, tx and ty?
PS: It is forbidden to use Image Processing Toolbox functions.
This is essentially a homography recovery problem. Given co-ordinates in one image and the corresponding co-ordinates in the other image, you are trying to recover the combined translation and rotation matrix that was used to warp the points from the one image to the other.
You can essentially combine the rotation and translation into a single matrix by multiplying the two matrices together. Multiplying simply composes the two operations. You would thus get:
H = [cos(theta) -sin(theta) tx]
    [sin(theta)  cos(theta) ty]
    [    0           0       1]
The idea behind this is to find the parameters by minimizing the error through least squares between each pair of points.
Basically, what you want to find is the following relationship:
xi_after = H*xi_before
H is the combined rotation and translation matrix required to map the co-ordinates from the one image to the other. H is also a 3 x 3 matrix, and knowing that the lower right entry (row 3, column 3) is 1, it makes things easier. Also, assuming that your points are in the augmented co-ordinate system, we essentially want to find this relationship for each pair of co-ordinates from the first image (x_i, y_i) to the other (x_i', y_i'):
[p_i*x_i']   [h11 h12 h13]   [x_i]
[p_i*y_i'] = [h21 h22 h23] * [y_i]
[  p_i   ]   [h31 h32  1 ]   [ 1 ]
The scale of p_i is to account for homography scaling and vanishing points. Let's perform a matrix-vector multiplication of this equation. We can ignore the 3rd element as it isn't useful to us (for now):
p_i*x_i' = h11*x_i + h12*y_i + h13
p_i*y_i' = h21*x_i + h22*y_i + h23
Now let's take a look at the 3rd element. We know that p_i = h31*x_i + h32*y_i + 1. As such, substituting p_i into each of the equations, and rearranging to solve for x_i' and y_i', we thus get:
x_i' = h11*x_i + h12*y_i + h13 - h31*x_i*x_i' - h32*y_i*x_i'
y_i' = h21*x_i + h22*y_i + h23 - h31*x_i*y_i' - h32*y_i*y_i'
What you have here now are two equations for each unique pair of points. What we can do now is build an over-determined system of equations. Take each pair and build two equations out of them. You will then put it into matrix form, i.e.:
Ah = b
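A would be a matrix of coefficients built from each set of equations using the co-ordinates from the first image, b would hold the co-ordinates of the points in the second image, and h would be the parameters you are solving for. Ultimately, you are solving this linear system of equations reformulated in matrix form, where each pair of points contributes two rows:
[x_i y_i 1  0   0  0 -x_i*x_i' -y_i*x_i']       [x_i']
[ 0   0  0 x_i y_i 1 -x_i*y_i' -y_i*y_i'] * h = [y_i']
with h = [h11 h12 h13 h21 h22 h23 h31 h32]'.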
You would solve for the vector h which can be performed through least squares. In MATLAB, you can do this via:
h = A \ b;
A sidenote for you: If the movement between images is truly just a rotation and translation, then h31 and h32 will both be zero after we solve for the parameters. However, I always like to be thorough and so I will solve for h31 and h32 anyway.
NB: This method will only work if you have at least 4 unique pairs of points. Because there are 8 parameters to solve for, and there are 2 equations per point, A must have rank 8 for the system to have a unique least-squares solution (if you want to throw some linear algebra terminology into the loop). You will not be able to solve this problem if you have fewer than 4 points.
If you want some MATLAB code, let's assume that your points are stored in sourcePoints and targetPoints. sourcePoints are from the first image and targetPoints are for the second image. Obviously, there should be the same number of points between both images. It is assumed that both sourcePoints and targetPoints are stored as M x 2 matrices. The first columns contain your x co-ordinates while the second columns contain your y co-ordinates.
numPoints = size(sourcePoints, 1);
%// Cast data to double to be sure
sourcePoints = double(sourcePoints);
targetPoints = double(targetPoints);
%//Extract relevant data
xSource = sourcePoints(:,1);
ySource = sourcePoints(:,2);
xTarget = targetPoints(:,1);
yTarget = targetPoints(:,2);
%//Create helper vectors
vec0 = zeros(numPoints, 1);
vec1 = ones(numPoints, 1);
xSourcexTarget = -xSource.*xTarget;
ySourcexTarget = -ySource.*xTarget;
xSourceyTarget = -xSource.*yTarget;
ySourceyTarget = -ySource.*yTarget;
%//Build matrix
A = [xSource ySource vec1 vec0 vec0 vec0 xSourcexTarget ySourcexTarget; ...
vec0 vec0 vec0 xSource ySource vec1 xSourceyTarget ySourceyTarget];
%//Build RHS vector
b = [xTarget; yTarget];
%//Solve homography by least squares
h = A \ b;
%// Reshape to a 3 x 3 matrix (optional)
%// Must transpose as reshape is performed
%// in column major format
h(9) = 1; %// Add in that h33 is 1 before we reshape
hmatrix = reshape(h, 3, 3)';
Once you are finished, you have a combined rotation and translation matrix. If you want the x and y translations, simply pick off column 3, rows 1 and 2 in hmatrix. However, we can also work with the vector h itself, and so h13 would be element 3, and h23 would be element 6. If you want the angle of rotation, apply the appropriate inverse trigonometric function to the entries in rows 1, 2 and columns 1, 2; for example, atan2(h21, h11) uses the first column. For the h vector, these entries are elements 1, 2, 4 and 5. There will be a bit of inconsistency depending on which elements you choose, as this was solved by least squares. One way to get a good overall angle would perhaps be to find the angle from each of the 4 elements and then take some sort of average. Either way, this is a good starting point.
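As an illustration of that last step, here is a small Python sketch that pulls theta, tx and ty out of a 3 x 3 matrix of the form H shown earlier. It assumes the recovered matrix really is (close to) a pure rotation plus translation; the function name and the test values are made up. Combining the four rotation entries inside a single atan2 is one possible way of averaging them, not the only one.

```python
# Sketch: recover the rotation angle and translation from
#   H = [cos(theta) -sin(theta) tx; sin(theta) cos(theta) ty; 0 0 1]
import math

def decompose_rigid(h):
    """Return (theta in radians, tx, ty) from a 3x3 nested-list matrix."""
    # h[1][0] and -h[0][1] both estimate sin(theta); h[0][0] and h[1][1]
    # both estimate cos(theta). Summing each pair uses all four entries.
    theta = math.atan2(h[1][0] - h[0][1], h[0][0] + h[1][1])
    return theta, h[0][2], h[1][2]

# Build a matrix with known parameters to check the round trip.
theta_true = math.radians(30.0)
H = [
    [math.cos(theta_true), -math.sin(theta_true), 5.0],
    [math.sin(theta_true), math.cos(theta_true), -2.0],
    [0.0, 0.0, 1.0],
]

theta, tx, ty = decompose_rigid(H)
print(math.degrees(theta), tx, ty)
```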
References
I learned about homography a while ago through Leow Wee Kheng's Computer Vision course. What I have told you is based on his slides: http://www.comp.nus.edu.sg/~cs4243/lecture/camera.pdf. Take a look at slides 30-32 if you want to know where I pulled this material from. However, the MATLAB code I wrote myself :)
