A-B-C-D are 4 points. We define r = length(B-C), angle, ang1 = (A-B-C) and angle ang2 = (B-C-D) and the torsion angle tors1 = (A-B-C-D). What I really need to do is to find the coordinates of C and D provided that I have the new values of r, ang1, ang2 and tors1.
The thing is that the points A and B are rigidly connected to each other, and points C and D are also connected to each other by a rigid connector, so to speak. That is the distance (C-D) remains fixed and also distance A-B remains fixed. There is no such rigid connection between the points B and C.
We have the old coordinates of the 4 points for some other set of (r,ang1,ang2,tors1) and we need to find the new coordinates when this defining set of variables changes to some arbitrary value.
I would be grateful for any helpful comments.
Thanks a lot.
I'm not allowed to post a picture because I'm a new user :(
Additional Info: An iterative solution is not going to be useful because I need to do this in a simulation "plenty of times O(10^6)".
I think the best way to approach this problem would be to think in terms of analytic geometry.
Each point A,B,C,D has some 3D coordinates (x,y,z) and you have some relationships between
them (e.g. distance B-C is equal to r means that
r = sqrt[ (x_b - x_c)^2 + (y_b - y_c)^2 + (z_b - z_c)^2 ]
Once you define such relations it remains to solve the resulting system of equations for the unknown values of coordinates of the points you need to determine.
This is a general approach, if you describe the problem better (maybe a picture?) it might be easy to find some efficient ways of solving such systems because of some special properties your problem has.
You haven't mentioned the coordinate system. Even if (r, a1, a2, t) don't change, the "coordinates" will change if the whole structure can be sent whirling off into space. So I'll make some assumptions:
Put B at the origin, C on the positive X axis and A in the XY plane with y>0. If you don't know the distance AB, calculate it from the old coordinates. Likewise CD.
A: (-AB cos(a1), AB sin(a1), 0)
B: (0, 0, 0)
C: (r, 0, 0)
D: (r + CD cos(a2), CD sin(a2) cos(t), CD sin(a2) sin(t))
(Just watch out for sign conventions in the angles.)
you are describing a set of constraints.
what you need to do is for every constraint check if they are still satisfied, and if not calc the most efficient way to get it correct again.
for instance, in case of length b-c=r if b-c is not r anymore, make it r again by moving both b and c to or from eachother so that the constraint is met again.
for every constraint one by one do this.
Then repeat a few times until the system has stabilized again (e.g. all constraints are met).
that's it
You are asking for a solution to a nonlinear system of equations. For the mathematically inclined, I will write out the constraint equations:
Suppose you have positions of points A,B,C,D. We define vectors AB=A-B, etc., and furthermore, we use the notation nAB to denote the normalized vector AB/|AB|. With this notation, we have:
AB.AB = fixed
CD.CD = fixed
CB.CB = r*r
nAB.nCB = cos(ang1)
nDC.nBC = cos(ang2)
Let E = D - DC.(nCB x nAB) // projection of D onto plane defined by ABC
nEC.nDC = cos(tors1)
nEC x nDC = sin(tors1) // not sure if your torsion angle is signed (if not, delete this)
where the dot (.) denotes dot product, and cross (x) denotes cross product.
Each point is defined by 3 coordinates, so there are 12 unknowns, and 6 constraint equations, leaving 6 degrees of freedom that are unconstrained. These are the 6 gauge DOFs from the translational and rotational invariance of the space.
Assuming you have old point positions A', B', C', and D', and you want to find a new solution which is "closest" (in a sense I defined) to those old positions, then you are solving an optimization problem:
minimize: AA'.AA' + BB'.BB' + CC'.CC' + DD'.DD'
subject to the 4-5 constraints above.
This optimization problem has no nice properties so you will want to use something like Conjugate Gradient descent to find a locally optimal solution with the starting guess being the old point positions. That is an iterative solution, which you said is unacceptable, but there is no direct solution unless you clarify your problem.
If this sounds good to you, I can elaborate on the nitty gritty of performing the numerical optimization.
This is a different solution than the one I gave already. Here I assume that the positions of A and B are not allowed to change (i.e. positions of A and B are constants), similar to Beta's solution. Note that there are still an infinite number of solutions, since we can rotate the structure around the axis defined by A-B and all your constraints are still satisfied.
Let the coordinates of A be A[0], A[1] and A[2], and similarly for B. You want explicit equations for C and D, as you mentioned in the response to Beta's solution, so here they are:
First find the position of C. As mentioned before, there are an infinite number of possibilities, so I will pick a good one for you.
Vector AB = A-B
Normalize(AB)
int best_i = 0;
for i = 1 to 2
if AB[i] < AB[best_i]
best_i = i
// best_i contains dimension in which AB is smallest
Vector N = Cross(AB, unit_vec[best_i]) // A good normal vector to AB
Normalize(N)
Vector T = Cross(N, AB) // AB, N, and T form an orthonormal frame
Normalize(T) // redundant, but just in case
C = B + r*AB*cos(ang1) + r*N*sin(ang1)
// Assume s is the known, fixed distance between C and D
// Update the frame
Vector BC = B-C, Normalize(BC)
N = Cross(BC, T), Normalize(N)
D = C + s*cos(tors1)*BC*cos(ang2) + s*cos(tors1)*N*sin(ang1) +/- s*sin(tors1)*T
That last plus or minus depends on how you define the orthonormal frame. Try one and see if it's what you want, otherwise it's the other sign. The notation above is pretty informal, but it gives a definite recipe for how to generate C and D from A, B, and your parameters. It also chooses a good C (which depends on a good, nondegenerate N). unit_vec[i] refers to the vector of all zeros, except for a 1 at index i. As usual, I have not tested the pseudocode above :)
Related
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
Improve this question
I'm working with SVR, and using this resource. Erverything is super clear, with epsilon intensive loss function (from figure). Prediction comes with tube, to cover most training sample, and generalize bounds, using support vectors.
Then we have this explanation. This can be described by introducing (non-negative) slack variables , to measure the deviation of training samples outside -insensitive zone. I understand this error, outside tube, but don't know, how we can use this in optimization. Could somebody explain this?
In local source. I'm trying to achieve very simple optimization solution, without libraries. This what I have for loss function.
import numpy as np
# Kernel func, linear by default
def hypothesis(x, weight, k=None):
k = k if k else lambda z : z
k_x = np.vectorize(k)(x)
return np.dot(k_x, np.transpose(weight))
.......
import math
def boundary_loss(x, y, weight, epsilon):
prediction = hypothesis(x, weight)
scatter = np.absolute(
np.transpose(y) - prediction)
bound = lambda z: z \
if z >= epsilon else 0
return np.sum(np.vectorize(bound)(scatter))
First, let's look at the objective function. The first term, 1/2 * w^2 (wish this site had LaTeX support but this will suffice) correlates with the margin of the SVM. The article you linked doesn't, in my opinion, explain this very well and calls this term describing "the model's complexity", but perhaps this is not the best way of explaining it. Minimizing this term maximizes the margin (while still representing the data well), which is the predominant goal of using SVM's doing regression.
Warning, Math Heavy Explanation: The reason this is the case is that when maximizing the margin, you want to find the "farthest" non-outlier points right on the margin and minimize its distance. Let this farthest point be x_n. We want to find its Euclidean distance d from the plane f(w, x) = 0, which I will rewrite as w^T * x + b = 0 (where w^T is just the transpose of the weights matrix so that we can multiply the two). To find the distance, let us first normalize the plane such that |w^T * x_n + b| = epsilon, which we can do WLOG as w is still able to form all possible planes of the form w^T * x + b= 0. Then, let's note that w is perpendicular to the plane. This is obvious if you have dealt a lot with planes (particularly in vector calculus), but can be proven by choosing two points on the plane x_1 and x_2, then noticing that w^T * x_1 + b = 0, and w^T * x_2 + b = 0. Subtracting the two equations we get w^T(x_1 - x_2) = 0. Since x_1 - x_2 is just any vector strictly on the plane, and its dot product with w is 0, then we know that w is perpendicular to the plane. Finally, to actually calculate the distance between x_n and the plane, we take the vector formed by x_n' and some point on the plane x' (The vectors would then be x_n - x', and projecting it onto the vector w. Doing this, we get d = |w * (x_n - x') / |w||, which we can rewrite as d = (1 / |w|) * | w^T * x_n - w^T x'|, and then add and subtract b to the inside to get d = (1 / |w|) * | w^T * x_n + b - w^T * x' - b|. Notice that w^T * x_n + b is epsilon (from our normalization above), and that w^T * x' + b is 0, as this is just a point on our plane. Thus, d = epsilon / |w|. Notice that maximizing this distance subject to our constraint of finding the x_n and having |w^T * x_n + b| = epsilon is a difficult optimization problem. What we can do is restructure this optimization problem as minimizing 1/2 * w^T * w subject to the first two constraints in the picture you attached, that is, |y_i - f(x_i, w)| <= epsilon. You may think that I have forgotten the slack variables, and this is true, but when just focusing on this term and ignoring the second term, we ignore the slack variables for now, I will bring them back later. The reason these two optimizations are equivalent is not obvious, but the underlying reason lies in discrimination boundaries, which you are free to read more about (it's a lot more math that frankly I don't think this answer needs more of). Then, note that minimizing 1/2 * w^T * w is the same as minimizing 1/2 * |w|^2, which is the desired result we were hoping for. End of the Heavy Math
Now, notice that we want to make the margin big, but not so big that includes noisy outliers like the one in the picture you provided.
Thus, we introduce a second term. To motivate the margin down to a reasonable size the slack variables are introduced, (I will call them p and p* because I don't want to type out "psi" every time). These slack variables will ignore everything in the margin, i.e. those are the points that do not harm the objective and the ones that are "correct" in terms of their regression status. However, the points outside the margin are outliers, they do not reflect well on the regression, so we penalize them simply for existing. The slack error function that is given there is relatively easy to understand, it just adds up the slack error of every point (p_i + p*_i) for i = 1,...,N, and then multiplies by a modulating constant C which determines the relative importance of the two terms. A low value of C means that we are okay with having outliers, so the margin will be thinned and more outliers will be produced. A high value of C indicates that we care a lot about not having slack, so the margin will be made bigger to accommodate these outliers at the expense of representing the overall data less well.
A few things to note about p and p*. First, note that they are both always >= 0. The constraint in your picture shows this, but it also intuitively makes sense as slack should always add to the error, so it is positive. Second, notice that if p > 0, then p* = 0 and vice versa as an outlier can only be on one side of the margin. Last, all points inside the margin will have p and p* be 0, since they are fine where they are and thus do not contribute to the loss.
Notice that with the introduction of the slack variables, if you have any outliers then you won't want the condition from the first term, that is, |w^T * x_n + b| = epsilon as the x_n would be this outlier, and your whole model would be screwed up. What we allow for, then, is to change the constraint to be |w^T * x_n + b| = epsilon + (p + p*). When translated to the new optimization's constraint, we get the full constraint from the picture you attached, that is, |y_i - f(x_i, w)| <= epsilon + p + p*. (I combined the two equations into one here, but you could rewrite them as the picture is and that would be the same thing).
Hopefully after covering all this up, the motivation for the objective function and the corresponding slack variables makes sense to you.
If I understand the question correctly, you also want code to calculate this objective/loss function, which I think isn't too bad. I have not tested this (yet), but I think this should be what you want.
# Function for calculating the error/loss for a SVM. I assume that:
# - 'x' is 2d array representing the vectors of the data points
# - 'y' is an array representing the values each vector actually gives
# - 'weights' is an array of weights that we tune for the regression
# - 'epsilon' is a scalar representing the breadth of our margin.
def optimization_objective(x, y, weights, epsilon):
# Calculates first term of objective (note that norm^2 = dot product)
margin_term = np.dot(weight, weight) / 2
# Now calculate second term of objective. First get the sum of slacks.
slack_sum = 0
for i in range(len(x)): # For each observation
# First find the absolute distance between expected and observed.
diff = abs(hypothesis(x[i]) - y[i])
# Now subtract epsilon
diff -= epsilon
# If diff is still more than 0, then it is an 'outlier' and will have slack.
slack = max(0, diff)
# Add it to the slack sum
slack_sum += slack
# Now we have the slack_sum, so then multiply by C (I picked this as 1 aribtrarily)
C = 1
slack_term = C * slack_sum
# Now, simply return the sum of the two terms, and we are done.
return margin_term + slack_term
I got this function working on my computer with small data, and you may have to change it a little to work with your data if, for example, the arrays are structured differently, but the idea is there. Also, I am not the most proficient with python, so this may not be the most efficient implementation, but my intent was to make it understandable.
Now, note that this just calculates the error/loss (whatever you want to call it). To actually minimize it requires going into Lagrangians and intense quadratic programming which is a much more daunting task. There are libraries available for doing this but if you want to do this library free as you are doing with this, I wish you good luck because doing that is not a walk in the park.
Finally, I would like to note that most of this information I got from notes I took in my ML class I took last year, and the professor (Dr. Abu-Mostafa) was a great help to have me learn the material. The lectures for this class are online (by the same prof), and the pertinent ones for this topic are here and here (although in my very biased opinion you should watch all the lectures, they were a great help). Leave a comment/question if you need anything cleared up or if you think I made a mistake somewhere. If you still don't understand, I can try to edit my answer to make more sense. Hope this helps!
I'm trying to derive a formula to extract vector u.
I'm given some initial data:
Plane F with the method to extract its normal n = F->normal().
vector c that does not lie within the plane F and passes through some point E that also does not lie within the plane F.
And some constrains to use:
The desired vector u is perpendicular to the vector c.
Vector u is also perpendicular to some vector r which is not given. The vector r is parallel to the plane F and also perpendicular to the vector c. Therefore, we can say the vectors c, r and u are orthogonal.
Let's denote * as dot product, and ^ operator is cross product between two 3d vectors.
The calculation of the vector u is easy by using cross product: vec3 u = c^r. So, my whole task is narrowed down to how to find the vector r which is parallel to a given plane F and at the same time perpendicular to the given vector c.
Because we know that r is parallel to F, we can use plane's normal and dot product: n*r = 0. Since r is unknown, an infinite number of lines can satisfy the aforementioned equation. So, we can also use the condition that r is perpendicular to c: r*c = 0.
To summarize, there are two dot-product equations that should help us to find the vector r:
r*c = 0;
r*n = 0;
However, I am having hard time trying to figure out how to obtain the vector r coordinates provided the two equations, in algorithmic way. Assuming r = (x, y, z) and we want to find x, y and z; it does not seem possible from only two equations:
x*c.x + y*c.y + z*c.z = 0;
x*n.x + y*n.y + z*n.z = 0;
I feel like I'm missing something, e.g., I need a third constrain. Is there anything else needed to extract x, y and z? Or do I have a flaw in my logic?
You can find the vector r by computing the cross product of n and c.
This will be automatically satisfy r.c=r.n=0
You are right that your two equations will have multiple solutions. The other solutions are given by any scalar multiple of r.
This problem has been bothering me for several days, hence I decided to ask you for help.
I am reading the book "Quaternions and Rotation Sequence" written by Jack B. Kuipers. In section 6.4, the author derives a formula of a composite rotation quaternion. One of the steps of this derivation is difficult for me to understand.
I would like to briefly describe the derivation process as follow:
Consider a tracking problem as in this picture.
(I am sorry I have to use links instead of posting pictures directly because this is the first time I post a question here so I am not eligible to do so yet)
In the picture, XYZ is a global, reference frame. 2 successive rotations are performed:
The first one is a rotation about the Z axis through an angle alpha, transforming frame XYZ into a new frame x1y1z1.
The second one is a rotation about the y1 axis through an angle beta, transforming frame x1y1z1 into a new frame x2y2z2.
The goal is to find a single composite rotation quaternion which is equivalent to the two rotations above.
The author does this as follow. The first rotation can be represented by the following quaternion p:
p = cos(alpha/2) + k*sin(alpha/2) (1)
In this formula, k is a standard basis vector (we have vectors i, j, k in R3 corresponding to the axes x, y, z respectively).
The second rotation can be represented by the following quaternion q:
q = cos(beta/2) + j*sin(beta/2) (2)
The composite quaternion we are looking for is the product of these 2 quaternions: qp. The formula of this product is in this picture.
In order to derive this final formula, the author uses 2 assumptions about the standard basis vectors i, j, k, which are: k.j = 0 and k x j = -i. And this is where I dont understand.
We all know that, for a set of 3 mutually orthogonal vectors i, j, k, these 2 assumptions above are correct. However, vector k in (1) and vector j in (2) don't belong to the same coordinate frame. In other words, k in (1) corresponds to Z in frame XYZ, and j in (2) corresponds to y1 in x1y1z1. And these are 2 different, distinguish frames, so I think the second assumption used by the author is incorrect.
What do you think about this? Any answer would be appreciated. Thank you.
author uses 2 assumptions about the standard basis vectors i, j, k...
it is not assumption!
You not understand cross product and dot product see
http://en.wikipedia.org/wiki/Cross_product
http://en.wikipedia.org/wiki/Dot_product
3 mutually orthogonal vectors i, j, k
orthogonal vectors.... What is it(definition)?
Dot_product... What is it(definition)?
Can we define orthogonal vectors via Dot_product?
You must learn a basic of Linear algebra and Complex analysis before understand quaternion.
I have two sets of points A and B.
I want to find all points in B that are within a certain range r to A, where a point b in B is said to be within range r to A if there is at least one point a in A whose (Euclidean) distance to b is equal or smaller to r.
Each of the both sets of points is a coherent set of points. They are generated from the voxel locations of two non overlapping objects.
In 1D this problem fairly easy: all points of B within [min(A)-r max(A)+r]
But I am in 3D.
What is the best way to do this?
I currently repetitively search for every point in A all points in B that within range using some knn algorithm (ie. matlab's rangesearch) and then unite all those sets. But I got a feeling that there should be a better way to do this. I'd prefer a high level/vectorized solution in matlab, but pseudo code is fine too :)
I also thought of writing all the points to images and using image dilation on object A with a radius of r. But that sounds like quite an overhead.
You can use a k-d tree to store all points of A.
Iterate points b of B, and for each point - find the nearest point in A (let it be a) in the k-d tree. The point b should be included in the result if and only if the distance d(a,b) is smaller then r.
Complexity will be O(|B| * log(|A|) + |A|*log(|A|))
I archived further speedup by enhancing #amit's solution by first filtering out points of B that are definitely too far away from all points in A, because they are too far away even in a single dimension (kinda following the 1D solution mentioned in the question).
Doing so limits the complexity to O(|B|+min(|B|,(2r/res)^3) * log(|A|) + |A|*log(|A|)) where res is the minimum distance between two points and thus reduces run time in the test case to 5s (from 10s, and even more in other cases).
example code in matlab:
r=5;
A=randn(10,3);
B=randn(200,3)+5;
roughframe=[min(A,[],1)-r;max(A,[],1)+r];
sortedout=any(bsxfun(#lt,B,roughframe(1,:)),2)|any(bsxfun(#gt,B,roughframe(2,:)),2);
B=B(~sortedout,:);
[~,dist]=knnsearch(A,B);
B=B(dist<=r,:);
bsxfun() is your friend here. So, say you have 10 points in set A and 3 points in set B. You want to have them arrange so that the singleton dimension is at the row / columns. I will randomly generate them for demonstration
A = rand(10, 1, 3); % 10 points in x, y, z, singleton in rows
B = rand(1, 3, 3); % 3 points in x, y, z, singleton in cols
Then, distances among all the points can be calculated in two steps
dd = bsxfun(#(x,y) (x - y).^2, A, B); % differences of x, y, z in squares
d = sqrt(sum(dd, 3)); % this completes sqrt(dx^2 + dy^2 + dz^2)
Now, you have an array of the distance among points in A and B. So, for exampl, the distance between point 3 in A and point 2 in B should be in d(3, 2). Hope this helps.
Ok, here is the story : I found this problem in one of the pizza boxes a few weeks ago. It said if you can solve this before you could finish the pizza, you would get hired at tripadviser. Though I am not looking to get hired, this problem got my eyes and spoiled my focus on pizza and dinner. I worked out something but with some assumptions. Here is the question :
Assume we know P,Q R and S. There is the line connecting centers of each rectangle. We need to find out points C and D. I am not sure if there is some other variable that we should know to solve this.
EDIT
Looking for a programmatic or psudo-code explanation- no need to move to maxthexchange.
Any suggestions ?
It's pretty simple to do step-by-step:
Compute A = (P + Q) / 2 and B = R + S / 2 (component-by-component)
An equation for the line between A and B is L(t) = t * A + (1 - t) * (B - A). Just solve
this linear equation for a t* such that L(t*).y = Q.y to get C = L(t*). Do a similar thing with L(t).y = R.y to get D.
You can also use the values of t* that you get when solving for C and D to determine pathological cases like overlapping rectangles.
You actually don't need to find the points C and D to find the distance.
I assume you already know the coordinates of the rectangle. It's trivial to compute the coordinates of the center points and the lenghts of the edges.
Now, imagine a vertical line passing through A and a horizontal line passing through B. They intersect at a point, call it X. Also, imagine a vertical line passing through C and call its intersection point with the top edge of rectangle RS - C'.
You can trivially compute the length of AX. But the length of AX is half the height of RS + half the height of PQ (both of which you know) + the length of CC'.
So now you know the length of CC' (call it x).
You can also compute the angle (call it n) that AB makes with CC' from A and B's coordinates, since you know CC' is vertical.
Ergo, the length of the segment CD is x * cos(n).