I want to compute the principal axis form of an inertia tensor in a way that stays consistent as the inertia changes. Currently Eigen outputs the computed eigenvalues (Ia, Ib, Ic) from smallest to largest, which scrambles the order of the x, y and z inertia moments. This prevents me from mapping the eigenvalues directly onto the diagonal inertia tensor.
To start, I have a multibody system for which I calculate the inertia moments Ixx, Iyy, Izz and products Ixy, Ixz, Iyz around the center of mass. From here I construct a 3x3 inertia matrix. This inertia is in coordinate frame A and can have non-zero off-diagonal components. I can observe this inertia matrix change continuously depending on the movement of my bodies.
For example, initially my computed inertia looks something like this:
0.25 0 0
0 0.22 0
0 0 0.02
then the eigenvalues look something like this:
0.02
0.22
0.25
and the eigenvectors look something like this:
0 0 1
0 1 0
1 0 0
As you can see, the eigensolver sorts the eigenvalues in ascending order, which results in an Izz, Iyy, Ixx ordering rather than the desired Ixx, Iyy, Izz order.
As I move the bodies around, the inertia changes and so does the ordering of the eigenvalues (Ixx, Iyy and Izz can easily swap places). Two of the eigenvalues can stay constant while the third changes enough to alter the ordering. I would like to find the mapping that keeps the values consistent, i.e., always obtain the eigenvalues in Ixx, Iyy, Izz order.
Eigenvalues quantify inertia along the principal axes; their ordering is irrelevant as long as the correspondence with the eigenvectors is preserved. Don't be confused by the fact that they equal the diagonal Ixx, Iyy, Izz elements in your example: the eigenvalues have nothing to do with the x-y-z coordinates, and in particular they are orientation-agnostic. If you want to compare inertias of different body configurations, you have to take the eigenvectors into account as well, e.g., draw the inertia ellipsoids.
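If you nevertheless want a fixed x/y/z labelling, one possible heuristic is to assign each eigenpair to the coordinate axis its eigenvector is most aligned with. A minimal numpy sketch (the function name is mine and this is not part of the answer above; the assignment becomes ambiguous when an eigenvector lies near 45 degrees between two axes, so the labels can still jump there):
import numpy as np

def principal_moments_by_axis(inertia):
    # Pair each eigenvalue with the coordinate axis its eigenvector is most
    # aligned with, so the result reads (I'xx, I'yy, I'zz) instead of sorted order.
    evals, evecs = np.linalg.eigh(inertia)      # ascending eigenvalues, eigenvectors as columns
    order = np.argmax(np.abs(evecs), axis=1)    # for each axis, index of the best-aligned eigenvector
    return evals[order], evecs[:, order]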
Suppose I have an image and I want to find a subarray with shape 3x3 that contains the maximum sum compared to other subarrays.
How do I do that in python efficiently (run as fast as possible)? If you can provide a sample code that would be great.
My specific problem:
I want to extract the location of the center of the blob in this heatmap
I don't want to just take the maximum point, because that would make the coordinate imprecise. The true center of the blob could actually lie between two pixels, so it is better to take a weighted average over several points to obtain sub-pixel precision. For example, if there are two points (x1, y1) and (x2, y2) with values 200 and 100, then the average coordinate will be x = (200*x1 + 100*x2)/300 and y = (200*y1 + 100*y2)/300.
One of my solution is to do a convolution operation. But I think it's not efficient enough because it requires multiplication to the kernel (which contains only ones). I'm looking for a fast implementation so I cannot do looping myself because I'm not sure if it will be fast.
I want to apply this algorithm to 50 images every few milliseconds (the images come in as a batch). Concretely, think of these images as the output of a machine learning model that produces heatmaps. In order to obtain coordinates from these heatmaps, I need to do some kind of weighted average over the coordinates with high intensity. My idea is to do a weighted average over a 3x3 area of the image. I am also open to other approaches that are faster or more elegant.
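For concreteness, a minimal numpy sketch of the weighted 3x3 average described above (the function name, the (N, H, W) batch layout and the per-image loop are assumptions for illustration, not a tested solution):
import numpy as np

def blob_centers(heatmaps):
    # heatmaps: (N, H, W) batch; returns an (N, 2) array of sub-pixel (x, y) centers,
    # each an intensity-weighted average over the 3x3 window around the argmax.
    n, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(n, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    centers = np.empty((n, 2))
    for i, (y, x) in enumerate(zip(ys, xs)):
        y0, y1 = max(y - 1, 0), min(y + 2, h)
        x0, x1 = max(x - 1, 0), min(x + 2, w)
        win = heatmaps[i, y0:y1, x0:x1].astype(np.float64)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        total = win.sum()                       # assumed non-zero near a detected blob
        centers[i] = (xx * win).sum() / total, (yy * win).sum() / total
    return centers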
Looking for the "subarray of shape 3x3 with the maximum sum" is the same as looking for the maximum of an image after it has been filtered with an un-normalized 3x3 box filter. So it boils down to finding efficiently the maximum of an image, which you assume is a (perhaps "noisy") discrete sample of an underlying continuous and smooth signal - hence your desire to find a sub-pixel location.
You really need to split the problem in 2 parts:
Find the pixel location m = (xm, ym) of the maximum value of the image. This requires no more than a visit to every pixel of the image and one comparison per pixel, so it's O(N) and hence optimal as long as you are operating at the native image resolution. In OpenCV this is done with the minMaxLoc function.
Apply whatever model of the image you are using to find its (subpixel-interpolated) maximum in a neighborhood of m.
To clarify point (2): you write
I don't want to just get the maximum point because that would cause the coordinate to not be very precise. The true center of the blob could actually be between 2 pixels
While intuitively plausible, this assertion needs to be made more precise in order to be computable. That is, you need to express mathematically what assumptions you make about the image that lead you to search for a "true" maximum between pixel-sampled locations.
A simple example of such an assumption is quadratic smoothness. In this scenario you assume that, in a small (say, 3x3 or 5x5) neighborhood of the "true" maximum location, the image signal z is well approximated by a quadratic:
z = A00 dx^2 + A01 dx dy + A11 dy^2 + A02 dx + A12 dy + A22
where:
dx = x - xm; dy = y - ym
This assumption makes sense if the underlying signal is expected to be at least 3rd order continuous and differentiable, because of Taylor's theorem. Geometrically, it means that you assume (hope?) that the signal looks like a quadric (an elliptic paraboloid) near its maximum.
You can then evaluate the above equation for each of the pixels in a neighborhood of m, replacing the actual image values for z, and thus obtain a linear system in the unknown Aij, with as many equations as there are neighbor pixels (so even a 3x3 neighborhood will yield an over-constrained system). Solving the system in the least-squares sense gives you the "optimal" coefficients Aij. The theoretical maximum as predicted by this model is where the first partial derivatives vanish:
del z / del dx = 2 A00 dx + A01 dy + A02 = 0
del z / del dy = A01 dx + 2 A11 dy + A12 = 0
This is a linear system in the two unknowns (dx, dy), and solving it yields the estimated location of the maximum and, through the above equation for z, the predicted image value at the maximum.
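A minimal numpy sketch of this two-step fit (the function name and the assumption of a square, odd-sized patch centered on the pixel maximum m are mine, not the answer's):
import numpy as np

def subpixel_max(patch):
    # Fit z = A00*dx^2 + A01*dx*dy + A11*dy^2 + A02*dx + A12*dy + A22 to a
    # (2k+1)x(2k+1) patch centered on the pixel maximum; return the (dx, dy)
    # offset of the fitted maximum and the predicted value there.
    k = patch.shape[0] // 2
    dy, dx = np.mgrid[-k:k + 1, -k:k + 1]
    dx, dy, z = dx.ravel(), dy.ravel(), patch.astype(float).ravel()
    # One equation per neighbor pixel: an over-constrained linear system in the A_ij.
    M = np.column_stack([dx * dx, dx * dy, dy * dy, dx, dy, np.ones_like(z)])
    A00, A01, A11, A02, A12, A22 = np.linalg.lstsq(M, z, rcond=None)[0]
    # Stationary point: set both partial derivatives to zero.
    H = np.array([[2 * A00, A01], [A01, 2 * A11]])
    off = np.linalg.solve(H, -np.array([A02, A12]))
    zmax = (A00 * off[0]**2 + A01 * off[0] * off[1] + A11 * off[1]**2
            + A02 * off[0] + A12 * off[1] + A22)
    return off, zmax   # sub-pixel location is (xm + off[0], ym + off[1])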
In terms of computational cost, all such model estimations are extremely fast, compared with traversing an image of even moderate size.
I am sorry, I did not exactly understand the meaning of your last paragraph, so I stopped at the point where I got all the coordinates having the maximum value. I used cv2.filter2D for convolution on a thresholded image and then used np.amax and np.where to find the coordinates with the maximum value.
import cv2
import numpy as np
from timeit import default_timer as timer

img = cv2.imread('blob.png', 0)                              # read as grayscale
start = timer()
_, thresh = cv2.threshold(img, 240, 1, cv2.THRESH_BINARY)    # keep only very bright pixels as 1
mask = np.ones((3, 3), np.uint8)                             # un-normalized 3x3 box kernel
res = cv2.filter2D(thresh, -1, mask)                         # each pixel becomes the sum of its 3x3 neighborhood
result = np.where(res == np.amax(res))                       # coordinates of the maximum-sum window(s)
end = timer()
print(end - start)
I don't know whether it is as efficient as you want, but the measured time was 0.0013461999999435648 s.
P.S. The image you have provided had a white border which I had to crop out for this method.
One way is to sub-sample the image and find the neighborhood of the desired point. You can do this by looping not over all pixels but over, e.g., every 5th pixel (row = row + 5 and col = col + 5 in the loop). After finding the approximate location, take a specific neighborhood around that location and loop over all pixels of that crop to find the exact location.
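A rough numpy sketch of that coarse-to-fine search (the function name and window size are illustrative; a peak narrower than the coarse step can be missed):
import numpy as np

def coarse_to_fine_argmax(img, step=5):
    # Scan every `step`-th row/column first, then search exhaustively in a
    # window of +/- step pixels around the coarse hit.
    coarse = img[::step, ::step]
    cy, cx = np.unravel_index(np.argmax(coarse), coarse.shape)
    cy, cx = cy * step, cx * step
    y0, y1 = max(cy - step, 0), min(cy + step + 1, img.shape[0])
    x0, x1 = max(cx - step, 0), min(cx + step + 1, img.shape[1])
    wy, wx = np.unravel_index(np.argmax(img[y0:y1, x0:x1]), (y1 - y0, x1 - x0))
    return y0 + wy, x0 + wx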
Based on my knowledge of image processing, to get a reliable result that works for any one blob, follow these steps:
Make the image greyscale if it isn’t already (pixel values 0-255)
Normalise the image so that pixel intensities cover the full range of 0-255
Convert image to binary (a pixel is either 0 or 1) - this can be achieved by thresholding, such as applying the rule that any pixel less than or equal to 127 in intensity is given an intensity of 0 and anything else is given an intensity of 1
Find the weighted average of all the pixels that hold the value of “1”
or
Apply an erosion to the image until you are left with either 2 pixels or 1 pixel.
Case 1
If you have two pixels then you need to find the u and v co-ordinates of both pixels. The centre of the blob will be the halfway point between the u and v coordinates of the two pixels.
Case 2
If you have one pixel left then that pixel's co-ordinates are the centre point.
—————
You mentioned about achieving this quickly in Python:
Python is by design an interpreted language, so it executes line by line, making it less suitable for highly iterative tasks like image processing. However, you can make use of libraries like OpenCV (https://docs.opencv.org/2.4/index.html), which is written in C/C++, to mitigate this, apart from making the task at hand a lot easier for you.
OpenCV also provides functions for all the steps I listed above, so you should be able to put together a reliable solution fairly quickly, though I can't say for sure whether it will hit your target of 50 images every few milliseconds. Another factor to take into account is the size of the image you are processing; that will increase the processing load considerably.
UPDATE
I just found a good article that practically echoes my step-process:
https://www.learnopencv.com/find-center-of-blob-centroid-using-opencv-cpp-python/
More importantly it also denotes the formula for finding the centroid mathematically as:
c = (1/n) * sum_{i=1..n} x_i
but this is better written in the article than I can do so here.
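The centroid step boils down to image moments in OpenCV; a small sketch, assuming a single-channel image with at least one foreground pixel after thresholding ('blob.png' stands in for your image):
import cv2

img = cv2.imread('blob.png', cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)  # step 3 above
M = cv2.moments(binary)                                      # spatial moments of the binary blob
cx = M['m10'] / M['m00']                                     # weighted average of x coordinates
cy = M['m01'] / M['m00']                                     # weighted average of y coordinates
print(cx, cy)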
I have multiple estimates for a transformation matrix, from mapping two point clouds to each other via ICP (Iterative Closest Point).
How can I generate the average transformation matrix for all these matrices?
Each matrix consists of a rigid translation and a rotation only, no scale or skew.
Ideally I would also like to calculate a weighted average, but an unweighted one is fine for now.
Averaging the translation vectors is of course trivial, but the rotations are problematic. One approach I found is averaging the individual base vectors for the rotations, but I am not sure that will result in a new orthonormal base, and the approach seems a little ad-hoc.
Splitting the transformation in translation and rotation is a good start. Averaging the translation is trivial.
Averaging the rotation is not that easy. Most approaches will use quaternions. So you need to transform the rotation matrix to a quaternion.
The easiest way to approximate the average is a linear blending, followed by renormalization of the quaternion:
q* = w1 * q1 + w2 * q2 + ... + wn * qn
normalize q*
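In numpy this blend might look as follows (a sketch; the (w, x, y, z) component order and the hemisphere flip, which accounts for q and -q encoding the same rotation, are my assumptions, not stated above):
import numpy as np

def blend_quaternions(quats, weights):
    # Approximate weighted average: linear blend of unit quaternions, then renormalize.
    quats = np.asarray(quats, dtype=float)       # shape (n, 4)
    weights = np.asarray(weights, dtype=float)   # shape (n,)
    signs = np.where(quats @ quats[0] < 0, -1.0, 1.0)   # flip onto the hemisphere of the first quaternion
    q = (weights[:, None] * signs[:, None] * quats).sum(axis=0)
    return q / np.linalg.norm(q)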
However, this is only an approximation. The reason for that is that the combination of two rotations is not performed by adding the quaternions, but by multiplying them. If we convert quaternions to a logarithmic space, we can use a simple linear blend (because multiplication will become additions). Then transform the quaternion back to the original space. This is the idea of the Spherical Average (Buss 2001). If you're lucky, you find a library that supports log and exp of quaternions:
start with q* as above
do until convergence
    for each input quaternion i (index)
        diff = q[i] * inverse(q*)
        u[i] = log(diff, base q*)
    // Now perform the linear blend
    adapt := zero quaternion
    weights := 0
    for each input quaternion i
        adapt += weight[i] * u[i]
        weights += weight[i]
    adapt *= 1/weights
    adaptInOriginalSpace = q* ^ adapt   (^ is the power operator)
    q* = adaptInOriginalSpace * q*
You can define a threshold for adaptInOriginalSpace. If it is a very very small rotation, you can break the loop. This algorithm is proven to preserve geodesic distances on a sphere.
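If a library is acceptable (an assumption about your environment), scipy's Rotation class provides a proper rotation mean that can stand in for the iteration above; a sketch for a batch of 4x4 rigid transforms with the rotation in the upper-left 3x3 block:
import numpy as np
from scipy.spatial.transform import Rotation

def average_transform(matrices, weights=None):
    # Average translations directly and rotations via scipy's rotation mean.
    matrices = np.asarray(matrices, dtype=float)                 # shape (n, 4, 4)
    t = np.average(matrices[:, :3, 3], axis=0, weights=weights)
    R = Rotation.from_matrix(matrices[:, :3, :3]).mean(weights=weights).as_matrix()
    out = np.eye(4)
    out[:3, :3], out[:3, 3] = R, t
    return out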
http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation and http://en.wikipedia.org/wiki/Rotation_matrix#Quaternion will give you some elegant mathematics and a way to turn a rotation matrix into an angle of rotation round an axis of rotation. There will be two possible representations of each rotation, with different signs for both angle of rotation and axis of rotation.
You could convert everything and normalize them to have +ve angles of rotation, then work out the average angle of rotation and the average axis of rotation, renormalising this into a unit vector.
OTOH if your intention is to work out the most accurate possible estimate of the transformation, you need to write down some measure of the goodness of fit of any candidate transformation - a sum of squared errors is often mathematically convenient - and then solve an optimization problem to work out which transformation minimizes the sum of squared errors. This is at least easier to justify than taking an average of individually error-prone estimates, and may well be more accurate.
If you have an existing lerp method, then there is a trivial solution:
count = 1
average_transform = Matrix.Identity(4)
for new_transform in list_of_matrices:
    factor = 1/count
    average_transform = lerp(average_transform, new_transform, factor)
    count += 1
This is only useful because many more mathematics packages have the ability to lerp matrices than to average lots of them.
Because I haven't come across this method elsewhere, here's an informal proof:
If there is one matrix, use just that matrix (factor will equal 1 for first matrix)
If there are two matrices, we need 50% of the second one (second factor is 50% so we lerp to half way between the existing first one and the new one)
If there are three matrices we need 33% of each, or 66% of the average of the first two and 33% of the third. The lerp factor of 0.3333 makes this happen.
And so on.
I haven't tested extensively with matrices, but I've used this successfully as a rolling average for other datatypes.
The singular value decomposition (SVD) can be used here.
Take the SVD of the sum of the rotation matrices, and then the average rotation matrix is simply given by Ravg = UV'.
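A small numpy sketch of that projection (the determinant check, which guards against ending up with a reflection, is my addition):
import numpy as np

def average_rotation_svd(rotations):
    # Sum the rotation matrices and project the sum back onto SO(3): Ravg = U V'.
    U, _, Vt = np.linalg.svd(np.sum(rotations, axis=0))
    R = U @ Vt
    if np.linalg.det(R) < 0:        # flip one column if we ended up with a reflection
        U[:, -1] = -U[:, -1]
        R = U @ Vt
    return R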
"sdfgeoff" I can't comment in your answer because I'm new here, but you are the most correct, I think. Beutifull and elegant solution, by the way. Would be perfect if you use Spherical Linear Interpolation (SLERP) with quaternions, instead of Linear Interpolation (LERP) because quaternions that map rotations (quaternions with norm 1) define a sphere in 4D, and interpolating between then is in fact interpolate between two point in a sphere surface.
From my experience with point cloud registration, I would like to say that this will not work. ICP doesn't return random rotations scattered around the correct rotation. You need to use a better algorithm to register your point clouds (global registration algorithms such as FPFH, 4PCS, K4PCS, BSC, FGR, etc.), or a better initial guess for the transformation. ICP will give you either totally wrong rotations (when stuck in local minima) or almost perfect rotations when initialized with a good initial transformation.
Conclusion: averaging will not work.
I would suggest taking a look at "Average" of multiple quaternions? for a more elaborate discussion on how to compute the average of rotations.
I am trying to learn more about matrices. If I have a 4x4 matrix such as:
0.005 0.978 -0.20 60.62
-0.98 -0.027 0.15 -18.942
-0.15 0.20 0.96 -287.13
0 0 0 1
Which part of the matrix tells me the rotation that is applied to an object? I know that column 4 is the position of the object, and I suspect rows 1, 2 and 3 are the x, y and z rotation?
Thanks in advance.
The first three columns are directional vectors in the x, y, z directions, possibly including scaling of the object. If you imagine a cube, the first column's vector points in the direction of the positive-x-face of the cube, the second in the direction of the positive-y-face and the third in the direction of the positive-z-face.
Note that when object-scaling was applied to the matrix (which doesn't appear to be the case in your example), those direction vectors are not normalized.
But this isn't "rotation" in the euler-angle or quaternion-rotation sense. In fact calculating any angles from this matrix is pretty tricky.
Here are some links that explain how to do it, but this comes with a lot of problems and you should avoid it if it's not absolutely necessary:
http://www.euclideanspace.com/maths/geometry/rotations/conversions/matrixToEuler/index.htm
http://www.euclideanspace.com/maths/geometry/rotations/conversions/quaternionToEuler/index.htm
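As a quick illustration of reading the parts off directly (a numpy sketch assuming the column-vector convention of the example above, with the translation in the fourth column):
import numpy as np

def decompose_affine(m):
    # Split a 4x4 transform into translation, per-axis scale and a pure rotation.
    m = np.asarray(m, dtype=float)
    translation = m[:3, 3]                   # fourth column
    axes = m[:3, :3]                         # columns = transformed x, y, z directions
    scale = np.linalg.norm(axes, axis=0)     # column lengths; 1.0 each if there is no scaling
    rotation = axes / scale                  # normalized columns: rotation only
    return translation, scale, rotation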
So first of all I have an image like this (and of course I have all the point coordinates in 2D, so I can regenerate the lines and check where they cross each other):
(source: narod.ru)
But then I have another image of the same lines (I know they are the same) and the new coordinates of my points, as in this image:
(source: narod.ru)
So... now, having the point coordinates in the first image, how can I determine the plane rotation and Z depth in the second image (assuming the first one's center was at point (0,0,0) with no rotation)?
What you're trying to find is called a projection matrix. Determining precise inverse projection usually requires that you have firmly established coordinates in both source and destination vectors, which the images above aren't going to give you. You can approximate using pixel positions, however.
This thread will give you a basic walkthrough of the techniques you need to use.
Let me say this up front: this problem is hard. There is a reason Dan Story's linked question has not been answered. Let me provide an explanation for people who want to take a stab at it. I hope I'm wrong about how hard it is, though.
I will assume that the 2D screen coordinates and the projection/perspective matrix are known to you. You need to know at least this much (if you don't know the projection matrix, essentially you are using a different camera to look at the world). Let's call each pair of 2D screen coordinates (a_i, b_i), and I will assume the projection matrix is of the form
P = [ px 0 0 0 ]
[ 0 py 0 0 ]
[ 0 0 pz pw]
[ 0 0 s 0 ], s = +/-1
Almost any reasonable projection has this form. Working through the rendering pipeline, you find that
a_i = px x_i / (s z_i)
b_i = py y_i / (s z_i)
where (x_i, y_i, z_i) are the original 3D coordinates of the point.
Now, let's assume you know your shape in a set of canonical coordinates (whatever you want), so that the vertices are (x0_i, y0_i, z0_i). We can arrange these as columns of a matrix C. The actual coordinates of the shape are a rigid transformation of these coordinates. Let's similarly organize the actual coordinates as columns of a matrix V. Then these are related by
V = R C + v 1^T (*)
where 1^T is a row vector of ones with the right length, R is an orthogonal rotation matrix of the rigid transformation, and v is the offset vector of the transformation.
Now, you have an expression for each column of V from above: the first column is { s a_1 z_1 / px, s b_1 z_1 / py, z_1 } and so on.
You must solve the set of equations (*) for the set of scalars z_i and the rigid transformation defined by R and v.
Difficulties
The equation is nonlinear in the unknowns, involving quotients of R and z_i
We have assumed up to now that you know which 2D coordinates correspond to which vertices of the original shape (if your shape is a square, this is slightly less of a problem).
We assume there is even a solution at all; if there are errors in the 2D data, then it's hard to say how well equation (*) will be satisfied; the transformation will be nonrigid or nonlinear.
It's called (digital) photogrammetry. Start Googling.
If you are really interested in this kind of problems (which are common in computer vision, tracking objects with cameras etc.), the following book contains a detailed treatment:
Ma, Soatto, Kosecka, Sastry, An Invitation to 3-D Vision, Springer 2004.
Beware: this is an advanced engineering text, and uses many techniques which are mathematical in nature. Skim through the sample chapters featured on the book's web page to get an idea.
Finding the angle between two vectors is not hard using the cosine rule. However, because I am programming for a platform with very limited resources, I would like to avoid calculations such as sqrt and arccos. Even simple divisions should be limited as much as possible.
Fortunately, I do not need the angle per se, but only need some value that is proportional to said angle.
So I am looking for some computationally cheap algorithm to calculate a quantity that is related to the angle between two vectors. So far, I haven't found something that fits the bill, nor have I been able to come up with something myself.
If you don't need the actual Euclidean angle, but something you can use as a basis for angle comparisons, then switching to taxicab geometry may be a good choice, because you can drop trigonometry and its slowness while MAINTAINING THE ACCURACY (or at least with only a really minor loss of accuracy, see below).
In the main modern browser engines the speedup factor is between 1.44 and 15.2, and the accuracy is nearly the same as atan2. Calculating the diamond angle is on average 5.01 times faster than atan2, and using inline code in Firefox 18 the speedup reaches a factor of 15.2. Speed comparison: http://jsperf.com/diamond-angle-vs-atan2/2.
The code is very simple:
function DiamondAngle(y, x)
{
    if (y >= 0)
        return (x >= 0 ? y/(x+y) : 1-x/(-x+y));
    else
        return (x < 0 ? 2-y/(-x-y) : 3+x/(x-y));
}
The above code gives you an angle between 0 and 4, while atan2 gives you an angle between -PI and PI.
Note that the diamond angle is always positive and in the range 0-4, while atan2 also gives negative radians, so the diamond angle is in a sense more normalized. Another note is that atan2 gives a slightly more precise result, because its range length is 2*pi (i.e., 6.283185307179586), while for diamond angles it is 4. In practice this is not very significant: e.g., rad 2.3000000000000001 and 2.3000000000000002 both give diamond angle 1.4718731421442295, but if we lower the precision by dropping one zero, rad 2.300000000000001 and 2.300000000000002 give different diamond angles. This precision loss in diamond angles is so small that it has a significant influence only if the distances are huge. You can play with the conversions in http://jsbin.com/bewodonase/1/edit?output (old version: http://jsbin.com/idoyon/1).
The above code is enough for fast angle comparisons, but in many cases there is a need to convert a diamond angle to radians and vice versa. If you, for example, have some tolerance given as a radian angle, and then a loop of 100,000 iterations where this tolerance is compared to other angles, it's not wise to make the comparisons using atan2. Instead, before looping, convert the radian tolerance to a taxicab (diamond angle) tolerance and make the in-loop comparisons using the diamond tolerance; this way you don't have to use slow trigonometric functions in the speed-critical parts of the code (= in loops).
The code that makes this conversion is this:
function RadiansToDiamondAngle(rad)
{
    var P = {"x": Math.cos(rad), "y": Math.sin(rad)};
    return DiamondAngle(P.y, P.x);
}
As you notice, there are cos and sin in it. As you know, they are slow, but you don't have to make the conversion inside the loop; do it before the loop, and the speedup is huge.
And if for some reason you have to convert a diamond angle back to radians, e.g., after the loop, to return the minimum angle of the comparisons (or whatever) as radians, the code is as follows:
function DiamondAngleToRadians(dia)
{
    var P = DiamondAngleToPoint(dia);
    return Math.atan2(P.y, P.x);
}

function DiamondAngleToPoint(dia)
{
    return {"x": (dia < 2 ? 1-dia : dia-3),
            "y": (dia < 3 ? ((dia > 1) ? 2-dia : dia) : dia-4)};
}
Here you use atan2, which is slow, but the idea is to use it outside any loops. You cannot convert a diamond angle to radians by simply multiplying by some factor; instead, you find a point in taxicab geometry whose diamond angle with respect to the positive X axis is the diamond angle in question, and convert that point to radians using atan2.
This should be enough for fast angle comparisons.
Of course there are other atan2 speedup techniques (e.g., CORDIC and lookup tables), but AFAIK they all lose accuracy and still may be slower than atan2.
BACKGROUND: I have tested several techniques: dot products, inner products, law of cosine, unit circles, lookup tables etc. but nothing was sufficient in case where both speed and accuracy are important. Finally I found a page in http://www.freesteel.co.uk/wpblog/2009/06/encoding-2d-angles-without-trigonometry/ which had the desired functions and principles.
I first assumed that taxicab distances could also be used for accurate and fast distance comparisons, because a distance that is bigger in Euclidean space is also bigger in taxicab space. I realized that, contrary to Euclidean distances, the angle between the start and end points affects the taxicab distance. Only the lengths of vertical and horizontal vectors can be converted easily and quickly between Euclidean and taxicab; in every other case you have to take the angle into account, and then the process is too slow (?).
So as a conclusion I am thinking that in speed critical applications where is a loop or recursion of several comparisons of angles and/or distances, angles are faster to compare in taxicab space and distances in euclidean (squared, without using sqrt) space.
Have you tried a CORDIC algorithm? It's a general framework for solving polar ↔ rectangular problems with only add/subtract/bitshift + a table, essentially doing rotation by angles of the form atan(2^-n). You can trade off accuracy with execution time by altering the number of iterations.
In your case, take one vector as a fixed reference, and copy the other to a temporary vector, which you rotate using the cordic angles towards the first vector (roughly bisection) until you reach a desired angular accuracy.
(edit: use the sign of the dot product to determine at each step whether to rotate forward or backward. Although if multiplies are cheap enough to allow using the dot product, then don't bother with CORDIC; perhaps use a table of sin/cos pairs for rotation matrices of angles π/2^n to solve the problem with bisection.)
(edit: I like Eric Bainville's suggestion in the comments: rotate both vectors towards zero and keep track of the angle difference.)
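For reference, a rough Python sketch of vectoring-mode CORDIC (floating point for readability; a real fixed-point version would replace the multiplications by 2^-i with bit shifts, and both vectors are assumed to lie in the right half-plane):
import math

# atan(2^-i) lookup; on a constrained target this would be a precomputed
# fixed-point constant table.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(16)]

def cordic_angle(x, y):
    # Vectoring mode: rotate (x, y) onto the positive x axis, accumulating the
    # rotation; the accumulated value approximates atan2(y, x) for x > 0.
    angle = 0.0
    for i, a in enumerate(ATAN_TABLE):
        if y > 0:
            x, y = x + y * 2.0 ** -i, y - x * 2.0 ** -i
            angle += a
        else:
            x, y = x - y * 2.0 ** -i, y + x * 2.0 ** -i
            angle -= a
    return angle

# Angle between two vectors: cordic_angle(x1, y1) - cordic_angle(x2, y2)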
Back in the day of a few K of RAM and machines with limited mathematical capabilities I used lookup tables and linear interpolation. The basic idea is simple: create an array with as much resolution as you need (more elements reduce the error created by interpolation). Then interpolate between lookup values.
Here is an example in Processing (original link now dead).
You can do this with your other trig functions as well. On the 6502 processor this allowed for full 3D wire frame graphics to be computed with an order of magnitude speed increase.
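A minimal Python sketch of the idea (the table size and the use of math.sin to fill the table at startup are illustrative; on the original hardware the table would be precomputed constants):
import math

TABLE_SIZE = 256
# One extra entry so interpolation at the last slot never reads out of range.
SIN_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]

def sin_lut(angle):
    # sin(angle) via table lookup plus linear interpolation between neighbors.
    t = (angle / (2 * math.pi)) % 1.0 * TABLE_SIZE
    i = int(t)
    frac = t - i
    return SIN_TABLE[i] + frac * (SIN_TABLE[i + 1] - SIN_TABLE[i])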
Here on SO I still don't have the privilege to comment (though I have at math.se) so this is actually a reply to Timo's post on diamond angles.
The whole concept of diamond angles based on the L1 norm is most interesting, and if it were merely a matter of comparing which vector makes a greater or lesser angle with the positive X axis, it would be sufficient. However, the OP mentioned the angle between two generic vectors, and I presume the OP wants to compare it to some tolerance for finding smoothness/corner status or something like that; unfortunately, with only the formulae provided on jsperf.com or freesteel.co.uk (links above), it seems this is not possible using diamond angles.
Observe the following output from my Asymptote implementation of the formulae:
Vectors : 50,20 and -40,40
Angle diff found by acos : 113.199
Diff of angles found by atan2 : 113.199
Diamond minus diamond : 1.21429
Convert that to degrees : 105.255
Rotate same vectors by 30 deg.
Vectors : 33.3013,42.3205 and -54.641,14.641
Angle diff found by acos : 113.199
Diff of angles found by atan2 : 113.199
Diamond minus diamond : 1.22904
Convert that to degrees : 106.546
Rotate same vectors by 30 deg.
Vectors : 7.67949,53.3013 and -54.641,-14.641
Angle diff found by acos : 113.199
Diff of angles found by atan2 : 113.199
Diamond minus diamond : 1.33726
Convert that to degrees : 116.971
So the point is that you can't take diamond(alpha) - diamond(beta) and compare it to some tolerance the way you can with the output of atan2. If all you want to do is diamond(alpha) > diamond(beta), then I suppose diamond angles are fine.
The magnitude of the cross product is proportional to the sine of the angle between two vectors, and when the vectors are normalized and the angle is small it is very close to the actual angle in radians due to the small-angle approximation.
specifically:
I1*Q2 - I2*Q1 is proportional to the sine of the angle between (I1, Q1) and (I2, Q2).
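As a tiny sketch (the function name is mine; the small-angle reading assumes the inputs are already normalized):
def angle_proxy(x1, y1, x2, y2):
    # 2D cross product (z component): equals sin(theta) for unit vectors,
    # which is approximately theta for small angles; the sign gives the direction.
    return x1 * y2 - y1 * x2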
The solution would be trivial if the vectors were defined/stored using polar coordinates instead of cartesian coordinates (or, 'as well as' using cartesian coordinates).
The dot product of two vectors (x1, y1) and (x2, y2) is
x1 * x2 + y1 * y2
and is equivalent to the product of the lengths of the two vectors times the cosine of the angle between them.
So if you normalize the two vectors first (divide the coordinates by the length)
Where length of V1 L1 = sqrt(x1^2 + y1^2),
and length of V2 L2 = sqrt(x2^2 + y2^2),
Then normalized vectors are
(x1/L1, y1/L1), and (x2/L2, y2/L2),
And dot product of normalized vectors (which is the same as the cosine of angle between the vectors) would be
(x1*x2 + y1*y2)
-----------------
(L1*L2)
Of course this may be just as computationally expensive as calculating the cosine directly.
If you need to compute the square root, consider using the fast inverse square root (invsqrt) hack:
acos((x1*x2 + y1*y2) * invsqrt((x1*x1+y1*y1)*(x2*x2+y2*y2)));
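A small sketch of the above, plus a sqrt- and acos-free variant that is usable when you only need to compare angles (both function names are mine; the comparison version keeps the sign of the cosine so the ordering over [0, pi] is preserved):
import math

def cos_angle(x1, y1, x2, y2):
    # Cosine of the angle: dot product over the product of the lengths.
    return (x1 * x2 + y1 * y2) / math.sqrt((x1 * x1 + y1 * y1) * (x2 * x2 + y2 * y2))

def cos_angle_cmp(x1, y1, x2, y2):
    # sqrt/acos-free proxy ordered like cos_angle: cos(theta) * |cos(theta)|,
    # so a larger value always means a smaller angle in [0, pi].
    dot = x1 * x2 + y1 * y2
    return dot * abs(dot) / ((x1 * x1 + y1 * y1) * (x2 * x2 + y2 * y2))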
The dot product might work in your case. It's not proportional to the angle, but "related".